🌟We’re excited to announce H-GAP, a generalist model for humanoid control.
Trained on large MoCap-derived data, it can generate diverse, natural motions & transfer skills to new tasks without fine-tuning!
Paper: https://t.co/0m65xcO7EF
Website: https://t.co/xOL7cjWGBO
[1/N]
It has been an absolute privilege and pleasure to build up @UCL_DARK with @egrefen, @robertarail and @jparkerholder over the past eight years. Yesterday, the UK government announced not just one but two national academic fundamental AI research labs. I am extremely excited to announce that @UCL_DARK will be sunsetted and merge with @FLAIR_Ox, @whi_rl, @UCL_LASP and AIRL, to form the British Open-ended Learning and Discovery (BOLD) Lab — @BOLD_Lab_AI.
This is a huge moment for academic AI research in the UK. Backed with £30m by @UKRI_News and @EPSRC, it provides a unique opportunity to attract leading international academic talent to the UK, and equip them with the computational resources to do groundbreaking exploratory AI research (more on the computational resources soon). It also creates a mentorship network of academics, industry leaders and entrepreneurs to educate young talent on how to translate fundamental AI research into real world impact.
I want to thank all the students who made @UCL_DARK successful, in particular our PhD alumni @MinqiJiang, @_samvelyan, @zhengyaojiang, @_robertkirk, @akbirkhan, @LauraRuis, @YingchenX, @PaglieriDavide, and the work of our honorary faculty @egrefen, @robertarail and @jparkerholder who were generously contributing to mentorship and research in their free time.
We are excited to announce that we will sunset @UCL_DARK and merge into the British Open-Ended Learning & Discovery (BOLD) Lab. It's a unique opportunity to come together as a joint fundamental AI research group. This account will be inactive going forward. Follow @bold_lab_ai 🦋
OpenAI ran a hiring challenge, but the top candidate was one they couldn’t hire: our autonomous research agent, Aiden.
In Parameter Golf, Aiden ran for 22 days and outperformed all 1,016 other researchers: 🧵 (1/8)
Nice of @jennyzhangzt to share this paper, which I selfishly think was ahead of its time. The context was that I was leaving Meta to do another startup, and thought I would not be writing papers for years. Of course, @MinqiJiang had all the good ideas + did most of the writing 😅
Excited to co-found Recursive (@recursive_si) with an exceptional team in London and SF to create AI that experiments on how to safely improve itself, turning compute into knowledge that accumulates in an open-ended process of endless, automated scientific discoveries.
My PhD thesis is out 🥳🎓
How do LLMs, trained on trillions of tokens, reason?
Can they generalise beyond their training data or are they constrained by what they've seen before?
My takeaway: they can generalise beyond training in interesting ways, showing genuine reasoning
Apply to do research with me on emergence of agency/planning in LLMs, out-of-context reasoning, understanding generalization from data, or propose your own direction!
Very excited to be mentoring this spring 💫
Excited to announce our MIT Press book “Neuroevolution: Harnessing Creativity in AI Agent Design” by Sebastian Risi (@risi1979), Yujin Tang (@yujin_tang), Risto Miikkulainen, and myself.
We explore decades of work on evolving intelligent agents and show how neuroevolution can drive creativity in deep learning, RL, LLMs and AI Agents!
📖 Free open-access edition: https://t.co/1VraVue7Sk
In addition to our own works, this video features work by Jürgen Schmidhuber (@SchmidhuberAI), Seth Bling (@SethBling), Igor Karpov, Jacob Schrum, Yulu Gan (@yule_gan), Ken Stanley (@kenneth0stanley), Joel Lehman (@joelbot3000), Jeff Clune (@jeffclune), Nick Cheney (@CheneyLab), Richard Song (@XingyouSong), Chelsea Finn (@chelseabfinn), Julian Togelius (@togelius), Sam Earle (@Smearle_RH), Hod Lipson (@hodlipson), and Jean-Baptiste Mouret (@jb_mouret).
I’m beyond excited to announce our MIT Press book on Neuroevolution! An HTML version is now available for free on https://t.co/Q9uDN3w1GM, with a print edition coming out later in 2026.
Real intelligence is not static; it evolves. For decades, the field of neuroevolution has pursued this necessary adaptability. Our book chronicles its development, from early concepts to its modern integration with deep learning and reinforcement learning, exploring its potential for understanding the origins of intelligence and its real-world applications.
And the companion webpage is more than just a book site! It comes equipped with interactive demos, videos, exercises, and tutorials to allow everyone to experience neuroevolution in action. Check it out and let us know what you think!
It was a pleasure to work on this book over the last 4+ years with David (@hardmaru), Yujin (@yujin_tang), and Risto. We are incredibly proud of the result and look forward to celebrating! We hope to connect with many of you at NeurIPS.
We are very grateful to Melanie Mitchell (@MelMitchell1) who provided a fantastic foreword. To quote her: “The next big thing in AI is coming, and I suspect that neuroevolution will be a major part of it”. We think so too!
I had to share this stunning gif!
Do Continuous Thought Machines dream of electric sheep...?
This is a UMAP projection showing the neurons of a CTM firing while generating text (5 tokens, with time to think between).
Do you see the emergence of FAST and SLOW thoughts?
🎮 How can agents learn to generalize from limited offline data?
We introduce iMac (Imagined Autocurricula) - training agents entirely in world models with emergent curricula!
We are excited to share that “Continuous Thought Machines” has been accepted as a Spotlight at #NeurIPS2025! 🧠✨
The CTM is an AI that mimics biological brains by using neural dynamics & synchronization to think over time. It can solve complex mazes by building internal maps, gaze around images to classify them, and learn algorithms—all emergent from its core design.
This is just the beginning. A hint of what we're exploring next… (video attached!)
The team:
@LearningLukeD @ciaran_regan_ @risi1979 @jeffreyseely @YesThisIsLion
Proud to announce that Dr @LauraRuis defended her PhD thesis titled "Understanding and Evaluating Reasoning in Large Language Models" last week 🥳. Massive thanks to Noah Goodman and Emine Yilmaz for examining! As is customary, Laura received a personal mortarboard from @UCL_DARK. Details 👇
We’re excited to introduce ShinkaEvolve: An open-source framework that evolves programs for scientific discovery with unprecedented sample-efficiency.
Blog: https://t.co/zoZlH8jSXc
Code: https://t.co/TlYGSIk2Ek
Like AlphaEvolve and its variants, our framework leverages LLMs to find state-of-the-art solutions to complex problems, but using orders of magnitude fewer resources!
Many evolutionary AI systems are powerful but act like brute-force engines, burning thousands of samples to find good solutions. This makes discovery slow and expensive. We took inspiration from the efficiency of nature. ‘Shinka’ (進化) is Japanese for evolution, and we designed our system to be just as resourceful.
On the classic circle packing optimization problem, ShinkaEvolve discovered a new state-of-the-art solution using only 150 samples. This is a big leap in efficiency compared to previous methods that required thousands of evaluations.
We applied ShinkaEvolve to a diverse set of hard problems with real-world applications:
1/ AIME Math Reasoning: It evolved sophisticated agentic scaffolds that significantly outperform strong baselines, discovering an entire Pareto frontier of solutions trading performance for efficiency.
2/ Competitive Programming: On ALE-Bench (a benchmark for NP-Hard optimization problems), ShinkaEvolve took the best existing agent's solutions and improved them, turning a 5th place solution on one task into a 2nd place leaderboard rank in a competitive programming competition.
3/ LLM Training: We even turned ShinkaEvolve inward to improve LLMs themselves. It tackled the open challenge of designing load balancing losses for Mixture-of-Experts (MoE) models. It discovered a novel loss function that leads to better expert specialization and consistently improves model performance and perplexity.
ShinkaEvolve achieves its remarkable sample-efficiency through three key innovations that work together: (1) an adaptive parent sampling strategy to balance exploration and exploitation, (2) novelty-based rejection filtering to avoid redundant work, and (3) a bandit-based LLM ensemble that dynamically picks the best model for the job.
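To make the third ingredient concrete, here is a minimal sketch of how a bandit can pick among several LLMs based on how useful their past proposals were. It uses the textbook UCB1 rule as a stand-in; it is not taken from the ShinkaEvolve code, and the actual selection rule may differ. The class name, the model names, and the simulated reward (1.0 when a proposal improves on its parent, 0.0 otherwise) are all assumptions made for illustration.

```python
import math
import random


class UCB1ModelSelector:
    """UCB1 bandit over a fixed set of (hypothetical) model names."""

    def __init__(self, model_names, exploration=1.0):
        self.model_names = list(model_names)
        self.exploration = exploration
        self.counts = {name: 0 for name in self.model_names}
        self.total_reward = {name: 0.0 for name in self.model_names}

    def select(self):
        # Try every model once before relying on the confidence bound.
        for name in self.model_names:
            if self.counts[name] == 0:
                return name
        total = sum(self.counts.values())

        def score(name):
            mean = self.total_reward[name] / self.counts[name]
            bonus = self.exploration * math.sqrt(
                2.0 * math.log(total) / self.counts[name]
            )
            return mean + bonus

        # Pick the model with the highest optimistic estimate.
        return max(self.model_names, key=score)

    def update(self, name, reward):
        # Record the observed reward for the model that was queried.
        self.counts[name] += 1
        self.total_reward[name] += reward


def simulate(selector, success_prob, rounds, seed=0):
    """Simulated loop: reward is 1.0 if the (assumed) proposal helped."""
    rng = random.Random(seed)
    for _ in range(rounds):
        name = selector.select()
        reward = 1.0 if rng.random() < success_prob[name] else 0.0
        selector.update(name, reward)
    return selector.counts
```

In a real evolutionary loop, the call to `rng.random()` would be replaced by querying the chosen model for a program mutation and scoring the result; the bandit then shifts its queries toward whichever model has been paying off.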
By making ShinkaEvolve open-source and highly sample-efficient, our goal is to democratize access to advanced, open-ended discovery tools. Our vision for ShinkaEvolve is to be an easy-to-use companion tool to help scientists and engineers with their daily work. We believe that building more efficient, nature-inspired systems is key to unlocking the future of AI-driven scientific research. We are excited to see what the community builds with it!
Learn more in our technical report: https://t.co/yzag3wd4jL
What if you kept asking an LLM to "make it better"? In recent work at FAIR, we investigate how to use RL efficiently to fine-tune LLMs to iteratively self-improve on their previous solutions at inference time.
Training for iterated self-improvement can be costly. The naive approach to training for K self-improvement steps leads to K times the number of rollout steps per episode.
We introduce Exploratory Iteration (ExIt), an RL-based automatic curriculum method that bootstraps diverse training distributions of self-improvement tasks by upcycling the LLM's own responses at previous turns as the starting points for both self-improvement and *self-divergence.*
In order to decide what task to train on next, the curriculum prioritizes sampling of partial turn histories that led to higher return variance in its GRPO group (a learnability score that comes for free).
This automatic curriculum over the bootstrapped task space teaches the model how to perform iterated self-improvement while only ever training the model on single-step self-improvement tasks.
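The variance-based prioritization described above can be sketched in a few lines. This is an illustrative assumption of how such a curriculum might be wired up, not the ExIt implementation: the buffer layout, the function names, the proportional sampling rule, and the small `eps` floor are all hypothetical. The only idea taken from the thread is that the variance of returns within a group of rollouts serves as the learnability score.

```python
import random
import statistics


def learnability(group_returns):
    """Population variance of the returns in one group of rollouts."""
    return statistics.pvariance(group_returns)


def sampling_weights(buffer, eps=1e-6):
    """Normalized weights proportional to each entry's return variance.

    `buffer` is a list of (partial_history, group_returns) pairs.
    `eps` keeps zero-variance entries sampleable (an assumption).
    """
    scores = [learnability(returns) + eps for _, returns in buffer]
    total = sum(scores)
    return [s / total for s in scores]


def sample_start(buffer, rng):
    """Draw one partial history to use as the next starting point."""
    weights = sampling_weights(buffer)
    histories = [history for history, _ in buffer]
    return rng.choices(histories, weights=weights, k=1)[0]


# Toy buffer: three partial turn histories with their group returns.
buffer = [
    ("history_always_solved", [1.0, 1.0, 1.0, 1.0]),  # nothing left to learn
    ("history_half_solved", [0.0, 1.0, 0.0, 1.0]),    # highest variance
    ("history_rarely_solved", [0.0, 0.0, 0.0, 1.0]),
]
weights = sampling_weights(buffer)
next_task = sample_start(buffer, random.Random(0))
```

Entries whose rollouts all succeed or all fail get almost no weight, so training concentrates on starting points where the outcome is still uncertain.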
We look at ExIt's impact in both single-turn (contest math problems) and multi-turn (BFCLv3 multi-turn tasks) settings, as well as on MLE-bench, where the LLM is run in a search scaffold to produce solutions to real Kaggle competitions. Across these eval settings, we find ExIt produces models with greater capacity for inference-time self-improvement compared to GRPO. Notably, ExIt models can self-improve on test tasks for many more steps than the typical solution depth encountered during training, including a 22% improvement in MLE-bench performance compared to GRPO.
"Always reasoning" (ReAct) isn't optimal for LLM agents! 🧠
Our new paper identifies a "Goldilocks" effect: planning too frequently or not enough degrades performance. We show how to train agents to learn to dynamically allocate test-time compute when needed for best results. 👇