⭐New Paper Alert ⭐
How can your #RL agent quickly adapt to new distribution shifts? And without ANY tuning? 🤔
We suggest you get on the Fast TRAC 🏎️💨, our new Parameter-free Optimizer that surprisingly works. Why?
Website: https://t.co/xiLDcXv4B7
1/🧵
Mean Field Games provide a framework for modelling large populations.
ICML26 Spotlight: Introducing Recurrent Structural Policy Gradient for partially observable MFGs with common noise, benefitting from faster convergence than model-free RL, but remaining tractable, unlike DP.
Excited to present two ✨Highlight papers at #CVPR2026 this week!
DENALI: A Dataset Enabling Non-Line-of-Sight Spatial Reasoning with Low-Cost LiDARs
🔗https://t.co/GyR5QX3ZYI
Task-Driven Implicit Representations for Automated Design of LiDAR Systems
🔗https://t.co/5hOdC4ljVz
Hindsight Experience Replay has become the ubiquitous method for goal-conditioned reinforcement learning, but leaves open the question of which goal to relabel with.
In this work, accepted at ICML, we propose instead simply Learning Everything All at Once (LEO).
1/
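For context, here is a minimal sketch of the goal-relabeling step in standard Hindsight Experience Replay, showing the "final" and "future" strategies from the original HER paper. It is not the LEO method from this thread. The function name `relabel_episode`, the tuple layout, and the toy integer states are assumptions made for illustration; the sparse 0 / -1 reward follows the usual HER convention.

```python
import random


def relabel_episode(episode, strategy="future", k=4, rng=None):
    """Relabel one episode with hindsight goals (illustrative sketch).

    episode: list of (state, action, next_state) tuples, where a goal is
             simply a state the agent could have been asked to reach.
    strategy: "final"  -> use the last achieved state as the goal,
              "future" -> sample k goals from states achieved at or after
                          the current step of the same episode.
    Returns a list of (state, action, next_state, goal, reward) tuples.
    """
    if rng is None:
        rng = random.Random(0)
    horizon = len(episode)
    relabeled = []
    for t, (state, action, next_state) in enumerate(episode):
        if strategy == "final":
            goals = [episode[-1][2]]
        elif strategy == "future":
            goals = [episode[rng.randrange(t, horizon)][2] for _ in range(k)]
        else:
            raise ValueError(f"unknown strategy: {strategy}")
        for goal in goals:
            # Sparse reward: 0 when the relabeled goal is reached, -1 otherwise.
            reward = 0.0 if next_state == goal else -1.0
            relabeled.append((state, action, next_state, goal, reward))
    return relabeled


if __name__ == "__main__":
    # Toy chain environment: the agent walks right from state 0 to state 3.
    episode = [(0, "right", 1), (1, "right", 2), (2, "right", 3)]
    for transition in relabel_episode(episode, strategy="final"):
        print(transition)
```

The choice between "final", "future", or some other scheme is exactly the open question the thread refers to: each strategy yields a different distribution of goals in the replay buffer.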
really cool work from @ClarisseWibault on offline goal-conditioned RL!! You can use hierarchy to reuse experience across similar parts of the state space (no new hps and some great performance gains!)
CV has CNNs, NLP has transformers - what inductive bias does RL have? How can policies generalise to regions of the dataset suffering from poor transitions?
We motivate hierarchy by enabling distinct state-representations at different levels of the hierarchy @FLAIR_Ox @j_foerst
AI is now a major part of scientific research 🔬. But can it actually forecast scientific progress?
We tested 6 frontier models on 4,760 real breakthroughs under strict knowledge cutoffs.
They recognize science. They can't forecast it. 🧵⬇️
We are releasing OptimusKG, a modern multimodal knowledge graph. Learn more at: https://t.co/xO6hLNJlHo
Graph AI has enabled advances across science and medicine (e.g., see our review at https://t.co/frXYpz927P). Integrating KGs with LLMs improves generalization, planning, reasoning, and zero-shot learning, and can enable RL reward modeling for reasoning models.
Graph AI depends on high-quality underlying graphs. For example, in 2023, our lab (@payal_chandak, @KexinHuang5) introduced PrimeKG, which has now been downloaded 116K+ times and cited 450+ times.
However, current biomedical KGs suffer from several limitations:
⏱️ Out-of-date
⚠️ Lack of standardized schema
🛠️ Unavailable or irreproducible construction code
📊 No metadata about nodes and edges
🤷‍♂️ No provenance information
📋 No quality checks
We introduce OptimusKG, a modern open-source multimodal biomedical knowledge graph that integrates molecular, anatomical, clinical, and environmental data. OptimusKG has reproducible, modular, FAIR-compliant code; an ontology-grounded, interoperable schema; rich metadata; and tracked provenance.
✅ 190K nodes across 10 types
✅ 21.8M edges across 26 types (5.5x greater than PrimeKG)
✅ 110.3M properties across 150 keys
✅ 65 source datasets and 18 ontologies
A frontier multimodal research agent (@EdisonSci) assessed whether data points in OptimusKG are supported by evidence from the scientific literature, identifying support for 70.0% of sampled true edges; by contrast, 83.4% of sampled false edges had no supporting evidence. OptimusKG also captures experimental data that may precede synthesis in the scientific literature.
To learn more about OptimusKG and download the graph:
🌐 Website and documentation: https://t.co/xO6hLNJlHo
👨‍💻 Code: https://t.co/OzD4jRCNOz
📎 Pre-print: https://t.co/jY0HklYwRv
🐍 Python client: https://t.co/5HrRiQYN9W (run pip install optimuskg)
📩 Dataset: https://t.co/thK1LqkxQE
We are excited for the community to use and build upon this resource to enable innovation in graph AI.
More below! 👇🏽 1/6
Just had a single-author paper accepted to #RSS2026!
https://t.co/o9Kn69CWGj
Motivated by growing interest in differentiable world models and physics simulators, we ask whether there is a unified principle for combining sampling-based global “exploration” with gradient-based local “exploitation” in trajectory and policy optimization with differentiable dynamics.
By viewing control through the control-as-inference lens—recasting optimization as sampling from an unnormalized Boltzmann distribution defined by an energy function—Tempered Sequential Monte Carlo (TSMC) naturally integrates importance sampling with gradient-based Hamiltonian Monte Carlo.
The key idea behind TSMC is to define a tempering path that gradually transforms an easy-to-sample prior into a complex, multi-modal posterior—or equivalently, deforms a convex energy landscape into a nonconvex one (graduated non-convexity)!
We implement TSMC for both trajectory and policy optimization. On small- to medium-scale problems, it appears broadly applicable and compares favorably with state-of-the-art baselines.
Excited to explore whether TSMC can scale to large-scale planning with complex, high-dimensional dynamics!
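To make the tempering idea concrete, here is a minimal sketch on a 1D toy problem. It is not the paper's implementation: the double-well `energy`, the Gaussian prior, the linear schedule of `betas`, and all function names are assumptions chosen for illustration, and the move step uses plain random-walk Metropolis in place of gradient-based Hamiltonian Monte Carlo to keep the code short. Particles start from the prior (beta = 0) and are reweighted, resampled, and moved as the target is gradually deformed toward prior times exp(-energy) (beta = 1).

```python
import math
import random


def energy(x):
    """Toy nonconvex energy: a double well with minima at x = -1 and x = +1."""
    return 4.0 * (x * x - 1.0) ** 2


def log_prior(x, sigma=2.0):
    """Unnormalized log-density of the easy-to-sample Gaussian prior."""
    return -0.5 * (x / sigma) ** 2


def tempered_smc(n_particles=500, n_temps=10, n_mcmc=5, step=0.3, seed=0):
    """Sample approximately from prior(x) * exp(-energy(x)) via tempering."""
    rng = random.Random(seed)
    # beta = 0: the target is just the prior, which we can sample directly.
    xs = [rng.gauss(0.0, 2.0) for _ in range(n_particles)]
    betas = [k / n_temps for k in range(n_temps + 1)]
    for beta_prev, beta in zip(betas, betas[1:]):
        # Importance weights for moving from the beta_prev target to beta.
        log_w = [-(beta - beta_prev) * energy(x) for x in xs]
        max_log_w = max(log_w)
        weights = [math.exp(lw - max_log_w) for lw in log_w]
        # Multinomial resampling concentrates particles in low-energy regions.
        xs = rng.choices(xs, weights=weights, k=n_particles)
        # Move step: a few random-walk Metropolis updates at the current beta
        # (a gradient-based kernel such as HMC could be swapped in here).
        moved = []
        for x in xs:
            for _ in range(n_mcmc):
                proposal = x + rng.gauss(0.0, step)
                log_accept = (log_prior(proposal) - beta * energy(proposal)) - (
                    log_prior(x) - beta * energy(x)
                )
                if rng.random() < math.exp(min(0.0, log_accept)):
                    x = proposal
            moved.append(x)
        xs = moved
    return xs


if __name__ == "__main__":
    particles = tempered_smc()
    left = sum(1 for x in particles if x < 0)
    print(f"particles near x = -1: {left}, near x = +1: {len(particles) - left}")
```

Because the early targets stay close to the broad prior, particles can settle into both wells before the landscape sharpens, which is the intuition behind pairing global sampling with local refinement.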
Natural evolution's open-endedness leads to beautiful, complex emergent structures and self-organizing behavior 🌱✨. Replicating this in silico is famously hard 💻. Our paper points to a promising direction by evolving populations of competing neural cellular automata with lifelike behavior 🧬🤖 #Isambard
⚠️⚠️ Warning: flashing lights, rapid cuts, or strobe effects in this thread! 🚨🚨
1/n
Excited to announce I’ll be joining @EugeneVinitsky at @nyutandon this autumn for a PhD!
I will be working on the intersection of game theory, reinforcement learning, and autonomous vehicles.
Thanks to everyone who helped me get to this point, especially from @FLAIR_Ox :)
1/ We just released π0.7 — a steerable generalist robot model with emergent capabilities.
I want to share a bit of the backstory, because π0.7 taught me something surprising about where robot learning is heading. A thread on bittersweet lessons 🧵
Introducing ✨Infusion✨, our *new paper* made possible by the UK AISI Challenge Fund and Sovereign AI!
1/8🧵 TL;DR
Influence functions are commonly used to attribute model behavior to its training data. In this paper we explored the reverse: is it possible to use influence functions to craft training data that induces model behavior?
Huge thank you to my amazing collaborators for making this possible
@LauraRuis @_robertkirk @egrefen @j_foerst and of course
@AISecurityInst and @UKSovereignAI!
1/ As compute continues to grow and simulators continue to improve, it is becoming feasible to train RL agents for billions or trillions of timesteps. However, this is only useful if agents can continue learning over such long training horizons, which is far from given 👇
@natashamalpani In addition to LeRobot, you should check out @Neuracore_AI (from @stepjamUK)! It has pretty comprehensive and cool robot data infra tools
The coolest benchmark I've seen in 2026. Algorithmic Discovery Agents (ADAs) are the future! Also, it opens up so many directions (especially in optimizing AI for science)
1/ 🪩 Automating the discovery of new algorithms could unlock significant breakthroughs in ML research. But optimising agents for this research has been limited by too few tasks to learn from!
Introducing DiscoGen, a procedural generator of algorithm discovery tasks 🧵
Built with my own 2 hands, fully open source! Huge shoutout to Codex’s invisible hands @reach_vb for helping me pull this together :))
Having always worked on algorithms, it was quite daunting (and fun!) to build a simulator for once. See you on the open source side for more 👀