Excited to share Flow Matching Policy Gradients: expressive RL policies trained from rewards using flow matching. It’s an easy, drop-in replacement for Gaussian PPO on control tasks.
Coffee changes connectivity in the brain.
Increased functional connectivity of the higher visual & executive control networks were seen with coffee, but not caffeine alone.
📢Presenting SDE Matching🔥🔥🔥
🚀We extend diffusion models to construct a simulation-free framework for training Latent SDEs. It enables sampling from the exact posterior process marginals without any numerical simulations.
📜: https://t.co/mKUvZZ4Vzt
🧵1/8
Flow Matching (FM) is one of the hottest ideas in generative AI - and it’s everywhere at #ICML2025.
But what is it? And why is it so elegant? 🤔
This thread is an animated, intuitive intro into (Variational) Flow Matching - no dense math required.
Let's dive in! 🧵👇
Despite theoretically handling long contexts, existing recurrent models still fall short: they may fail to generalize past the training length. We show a simple and general fix which enables length generalization in up to 256k sequences, with no need to change the architectures!
Now in Nature! 🚀 Our method learns a controllable 3D model of any robot from vision, enabling single-camera closed-loop control at test time! This includes robots previously uncontrollable, soft, and bio-inspired, potentially lowering the barrier of entry to automation!
Paper: https://t.co/43qHC9sxZI
(1/n)
Sakana AI developed a new coding agent, ALE-Agent, trained to solve NP-hard optimization problems.
Our agent participated in a live coding competition, the challenging AtCoder Heuristic Contest, and ranked #21 out of 1,000 human participants!
Learn more: https://t.co/KOYxix8oy0
Your boss emails you a point in 128-billion-dimensional space. It's Llama 8B in bfloat16. They want it compressed.
What should you do 🤔... quantise to NF4? 🧵
Text-to-LoRA: Instant Transformer Adaption
https://t.co/FXTZUFd0Jo
Generative models can produce text, images, video. They should also be able to generate models! Here, we trained a Hypernetwork to generate new task-specific LoRAs by simply describing the task as a text prompt.
I've been experimenting with attention mechanisms to design locomotion controllers that adapt to different shapes, this is the same controller running on two different bodies https://t.co/56a5NleE2Y paper coming soon 👀
New Paper!
Darwin Godel Machine: Open-Ended Evolution of Self-Improving Agents
A longstanding goal of AI research has been the creation of AI that can learn indefinitely. One path toward that goal is an AI that improves itself by rewriting its own code, including any code responsible for learning.
That idea, known as a Gödel Machine, proposed by @SchmidhuberAI over two decades ago, is a hypothetical self-improving AI. It optimally solves problems by recursively rewriting its own code when it can mathematically prove a better strategy, making it a key concept in meta-learning or “learning to learn.”
While the theoretical Gödel Machine promised provably beneficial self-modifications, its realization relied on an impractical assumption: that the AI could mathematically prove that a proposed change in its own code would yield a net improvement before adopting it. Sakana AI, in collaboration with Jeff Clune’s lab at UBC, proposes something more feasible: a system that harnesses the principles of open-ended algorithms like Darwinian evolution to search for improvements that empirically improve performance.
We call the result the Darwin Gödel Machine. DGMs leverage foundation models to propose code improvements, and use recent innovations in open-ended algorithms to search for a growing library of diverse, high-quality AI agents.
Applied to practical tasks, we implemented Darwin Gödel Machine as a self-improving coding agent that rewrites its own code to improve performance on programming tasks. It creates various self-improvements, such as a patch validation step, better file viewing, enhanced editing tools, generating and ranking multiple solutions to choose the best one, and adding a history of what has been tried before (and why it failed) when making new changes (see the attached video).
We believe that Darwin Gödel Machines represent a concrete step towards AI systems that can autonomously gather their own stepping stones to learn and innovate forever!
We present Panda: a foundation model for nonlinear dynamics pretrained on 20,000 chaotic ODE discovered via evolutionary search. Panda zero-shot forecasts unseen ODE best-in-class, and can forecast PDE despite having never seen them during training (1/8)
https://t.co/AuWFmVal8o
Could a major opportunity to improve representation in deep learning be hiding in plain sight? Check out our new position paper: Questioning Representational Optimism in Deep Learning: The Fractured Entangled Representation Hypothesis. The idea stems from a little-known observation about networks trained to output a single image: when they are discovered through an unconventional open-ended search process, their representations are incredibly elegant and exhibit astonishing modular decomposition. In contrast, when SGD (successfully) learns to output the same image its underlying representation is fractured, entangled - an absolute mess!
This stark difference in the underlying representation of the same "good" output behavior carries deep lessons for deep learning. It shows you cannot judge a book by its cover - an LLM with all the right responses could similarly be a mess under the hood. But also, surprisingly, it shows us that it doesn't have to be this way! Without the unique examples in this paper that were discovered through open-ended search, we might assume neural representation has to be a mess. These results show that is clearly untrue. We can now imagine something better because we can actually see it is possible.
We give several reasons why this matters: generalization, creativity, and learning are all potentially impacted. The paper shows examples to back up these concerns, but in brief, there is a key insight: Representation is not only important for what you're able to do now, but for where you can go from there. The ability to imagine something new (and where your next step in weight space can bring you) depends entirely upon how you represent the world. Generalization, creativity, and learning itself depend upon this critical relationship. Notice the difference in appearance between the nearby images to the skull in weight space shown in the top-left and top-right image strips of the attached graphic. The difference in semantics is stark.
The insight that representation could be better opens up a lot of new paths and opportunities for investigation. It raises new urgency to understand the representation underlying foundation models and LLMs while exposing all kinds of novel avenues for potentially improving them, from making learning processes more open-ended to manipulating architectures and algorithms.
Don't mistake this paper as providing comfort for AI pessimists. By exposing a novel set of stark and explicit differences between conventional learning and something different, it can act as an accelerator of progress as opposed to a tool of pessimism. At the least, the discussion it provokes should be quite illuminating.
New Paper: Continuous Thought Machines 🧠
Neurons in brains use timing and synchronization in the way that they compute, but this is largely ignored in modern neural nets. We believe neural timing is key for the flexibility and adaptability of biological intelligence.
We propose a new neural architecture, “Continuous Thought Machines” (CTMs), which is built from the ground up to use neural dynamics as a core representation for intelligence. By using neural dynamics as a first-class representational citizen, CTMs naturally perform adaptive computation.
Many emergent, interesting behaviors arise as a result: CTMs solve mazes by observing a raw maze image and producing step-by-step instructions directly from its neural dynamics. When tasked with image recognition, the CTM naturally takes multiple steps to examine different parts of the image before making its decision. This step-by-step approach not only makes its behavior more interpretable but also improves accuracy: the longer it “thinks,” the more accurate its answers become.
We also found that this allows the CTM to decide to spend less time thinking on simpler images, thus saving energy. When identifying a gorilla, for example, the CTM’s attention moves from eyes to nose to mouth in a pattern remarkably similar to human visual attention.
I think this work underscores an important, yet often lost, synergy between neuroscience and AI. While modern AI is ostensibly brain-inspired, the two fields often operate in surprising isolation. By starting with such inspiration and iteratively following the emergent, interesting behaviors, we developed a model with unexpected capabilities, such as its surprisingly strong calibration in classification tasks, a feature that was not explicitly designed for.
When we initially asked, “why do this research?”, we hoped the journey of the CTM would provide compelling answers. By embracing light biological inspiration and pursuing the novel behaviors observed, we have arrived at a model with emergent capabilities that exceeded our initial designs. We are committed to continuing this exploration, borrowing further concepts to discover what new and exciting behaviors will emerge, pushing the boundaries of what AI can achieve.
Kinda cute that you can reduce KV cache by replacing it with a universal, transferable dictionary + old school sig. proc reconstruction algorithm.
We tested on non-reasoning models and was sota, but methinks it'll work even better on reasoning ones.
The ICML random coins landed favorably on this one, so you'll get to chat with @jon_ghoh about it.
Today we are announcing ARC-AGI-2, an unsaturated frontier AGI benchmark that challenges AI reasoning systems (same relative ease for humans).
Grand Prize: 85%, ~$0.42/task efficiency
Current Performance:
* Base LLMs: 0%
* Reasoning Systems: <4%
Excited to share our latest work: “Bio-Inspired Plastic Neural Networks for Zero-Shot Out-of-Distribution Generalization in Complex Animal-Inspired Robots” 🪲🦎
We show that Hebbian learning outperforms LSTM-based adaptation for real-world transfer. It even works without domain randomization!
It can handle:
✅ Uneven terrain
✅ Morphological damage
✅ Sim-to-real gaps
This was a fun experiment we ran while developing The AI Scientist-v2. With the permission of ICLR, we submitted an AI-generated paper to an ICLR workshop that passed the peer-review process.
We documented the entire process and our learnings in a blog: https://t.co/z6u1D9YEo2
As AI researchers, we also wrote our own (human) reviews documenting our own assessment and critiques of the AI-generated papers, and conducted code reviews on the computational experiments conceived by The AI Scientist-v2, which you might find interesting!
The AI-generated papers and our analysis of them are also published on our GitHub: https://t.co/k2CcnfGUy1
As we embrace artificial novelty search and open-ended discovery with AI, I believe computational creativity can enable frontier LLMs to produce even more novel and imaginative ideas (and if these ideas are related to AI / ML, can be tested with actual computational experiments conceived by AI). Perhaps one day, AI systems can produce groundbreaking scientific discoveries (or maybe, an accepted NeurIPS or ICLR paper 😛)