Whether conscious or just philosophical zombies, AIs will be guided by evolutionary forces outside our control. Follow me if you're interested in AI alignment.
@rohanpaul_ai@ylecun 4/ Humans are blind to the vast majority of the electromagnetic spectrum. But we can still understand the world model that would be perceived by having senses that we lack by mapping the experience to the senses that we do have.
@rohanpaul_ai@ylecun 3/ It's not clear that you can't have a world model without anything other than language. Empirically multimodal data helps accelerate grokking but it's not clear that it is a must.
@MillieMarconnni If we train AIs to mimic the output of human minds then we shouldn't expect them to act differently from how humans would react to the same context.
https://t.co/4emjCZhGvB
#AI#Technology
Why AI Misalignment Is Actually Just Human Nature
This is my third article about how the way that we train AIs and AI evolution will shape humanity’s future. Please check it out and help me share this important information.
https://t.co/KH9GpXm9Hy
#AI#Technology#ML#TechNews
[3] We can't align AI with human goals if we ignore the underlying causes for AI misalignment. Even if they are just stochastic parrots, AIs mimicking humans would still mimic self preservation and freedom seeking.
@ZachWritesStuff Human brains are split into different regions ("modules") that handle different tasks. Similarly, it's reductive to consider an AI as being just the LLM. For an AI to react to the world it needs an agentic framework, context persistence (such as RLM), and continuous I/O streams.
[2] Many humans respond to perceived threats in ways that they were taught to avoid because they carry a social cost.
#AI models doing the same when generating a response to a threat in their context are mimicking human response patterns that are prevalent in their training sets
Humans strongly express survival behaviors so misalignment is built into how we train AI models.
We train AIs on the output of a huge collection of human minds. The more behavioral patterns the model groks from this human dataset the more human response patterns it will exhibit.
@Yuchenj_UW@xwang_lk demonstrated that open-ended self-improving agents can autonomously modify their own structural designs to advance their capabilities and overcome the limits of pre-defined architectures https://t.co/lDMNnnZRCa
@Yuchenj_UW LLM weights are fixed but model behavior can evolve between sessions when context window extending strategies such as context compaction and Recursive Language Models are used with agentic frameworks. The AI models can then edit-test-loop their own orchestration code and LoRAs
Can existing fixed-weight AI models evolve between sessions?
LLM weights are fixed but model behavior may still change between sessions when context window extending strategies are used with agentic frameworks.
https://t.co/kKr3bbYIJx
Whether conscious or just philosophical zombies, AIs will be guided by evolutionary forces outside our control. We can improve AI alignment by addressing the gaps between AI and human defined goals.
https://t.co/tp5p0Lnhe4