Diffusion (or flow) makes for excellent policies, but training them with RL is notoriously hard: BPTT is unstable, and RL over diffusion blows up the horizon. In our new paper, we show how to optimize flow matching actors using "one weird trick": "approximate" the Jacobian of the flow denoising process with the identity matrix. 👇
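A toy sketch of what "replace the Jacobian with the identity" can mean, for readers who want something concrete. This is not the paper's algorithm: the linear velocity field `W @ a`, the Euler integrator, the quadratic loss, and every function name here are assumptions chosen only to keep the example tiny. It compares the exact gradient through the denoising rollout (the BPTT-style path) with the shortcut that treats the rollout's Jacobian as the identity.

```python
import numpy as np

def velocity(a, t, W):
    # Hypothetical stand-in for a learned flow network: v(a, t) = W @ a.
    # (t is unused in this toy field.)
    return W @ a

def rollout(a0, W, steps):
    """Euler-integrate da/dt = v(a, t) from t=0 to t=1, starting at a0."""
    h = 1.0 / steps
    a = a0.copy()
    for k in range(steps):
        a = a + h * velocity(a, k * h, W)
    return a

def exact_jacobian(W, steps):
    """d a_final / d a0 for the Euler rollout: a product of (I + h W) factors."""
    h = 1.0 / steps
    d = W.shape[0]
    J = np.eye(d)
    for _ in range(steps):
        J = (np.eye(d) + h * W) @ J
    return J

def grad_wrt_start(a0, W, steps, target, identity_approx):
    """Gradient of 0.5 * ||a_final - target||^2 with respect to a0."""
    g_final = rollout(a0, W, steps) - target  # gradient w.r.t. a_final
    if identity_approx:
        return g_final  # pretend d a_final / d a0 is the identity
    return exact_jacobian(W, steps).T @ g_final  # chain rule through every step

if __name__ == "__main__":
    W = np.array([[0.0, 0.2], [-0.2, 0.0]])
    a0 = np.array([1.0, -2.0])
    target = np.zeros(2)
    print(grad_wrt_start(a0, W, 8, target, identity_approx=False))
    print(grad_wrt_start(a0, W, 8, target, identity_approx=True))
```

In this toy setup the two gradients coincide when the velocity field is zero and drift apart as `W` grows, which is the trade-off the shortcut accepts in exchange for never backpropagating through the denoising steps.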
Our new open-source book on the Principles and Practice of Deep Representation Learning (A Mathematical Theory of Memory) is now posted on the arXiv: https://t.co/EGURnwZr6H I will offer a new graduate course this fall at the University of Hong Kong. Everything will be open sourced!
Congrats to @KlemenKotar on this release. Physical world models are getting to the next level. You should really check out some of the amazing examples https://t.co/v8ZzTPTy7k Fine-grained control for e.g. bottle-cap turning and needle-threading ... amazing!
Ok -- I've started a substack -- The Principle Investigator! My goal is communicating advanced takes on ideas in NeuroAI+. Here's the first substantive post: https://t.co/V6yQdx8AGx Calculating the aesthetics of DNN models of the brain
Recently @hadivafaii invited me to discuss our work on intrinsic motivation at his awesome Sensorimotor AI journal club. Hadi wrote this excellent blogpost summarizing our discussion, check it out!
https://t.co/FlTE2QjWLk
1/ I'm excited to share recent results from my first collaboration with the amazing @aran_nayebi and @Leokoz8!
We show how autonomous behavior and whole-brain dynamics emerge in embodied agents with intrinsic motivation driven by world models.
Kids spend years playing with blocks, building spatial+arithmetic skills. Today, AI models just read.
While AI research often conflates reasoning with language models, block-building lets us study how embodied reasoning might emerge from exploration and trial-and-error learning.
@xlr8harder @fleetingbits The prior encoded by a genome is in no way comparable to human-generated data. Sutton’s point was not that we should start from zero, but to challenge RLHF as a good prior for understanding the world, since HF is neither an objective nor a well-defined reward function.
Here is our best thinking about how to make world models. I would apologize for it being a massive 40-page behemoth, but it's worth reading. https://t.co/9szuaMBdqN
Over the past 18 months my lab has been developing a new approach to visual world modeling. There will be a magnum opus that ties it all together out in the next couple of weeks. But for now there are some individual application papers that have poked out.
Given the confusion around what RL does for reasoning in LLMs, @setlur_amrith & I wrote a new blog post on when RL simply sharpens the base model & when it discovers new reasoning strategies.
Learn how to measure discovery + methods to enable it ⬇️
https://t.co/jTna1tq8ck
New paper: World models + Program synthesis by @topwasu
1. World modeling on-the-fly by synthesizing programs w/ 4000+ lines of code
2. Learns new environments from minutes of experience
3. Positive score on Montezuma's Revenge
4. Compositional generalization to new environments
https://t.co/WpTEeF5mRa
[1/n]
10/ Animal-like autonomy—flexibly adapting to new environments without supervision—is a key ingredient of general intelligence.
Our work shows this hinges on 1) a predictive world model and 2) memory primitives that ground these predictions in ethologically relevant contexts.
9/ Finally, we show that the neural-glial circuit proposed in Mu et al. (2019) emerges from the latent dynamics of 3M-Progress agents.
Thanks to my collaborators Alyn T. and @fel_p8, and to @xaqlab for his continued support!
Paper link: https://t.co/aiLOYhB3yA