Why AI Can Now Make Discoveries - my conversation with @danintheory, Lead of the Foundations of Reinforcement Learning team at @OpenAI
00:00 Intro: AI's wild week in mathematics
01:21 What OpenAI's Foundations of RL team does
03:08 Dan's journey: from black holes and quantum gravity to frontier AI
07:04 Are AI systems becoming useful for real science?
08:21 The AI math moment: Erdős, OpenAI, DeepMind, and Anthropic
08:52 Why the OpenAI result was an act of exploration
10:25 OpenAI vs. DeepMind: informal reasoning vs. formal proof
12:13 RL 101: learning by doing, not just watching
15:10 Why reinforcement learning works
15:58 How RL breaks: sparse feedback and long-horizon tasks
17:03 RLHF: how human feedback shaped early language models
18:48 Move 37, self-play, and the search for novel strategies
22:16 Explore vs. exploit in scientific discovery
24:49 Why RL may now be "the cake," not the cherry on top
25:46 Why RL started working with large language models
27:29 Is RL "sucking supervision through a straw"?
28:47 Why language may be the grounding layer for intelligence
31:46 A contrarian take on the Bitter Lesson
32:41 What test-time compute actually is
34:50 How RL gives models the ability to think
35:40 Verifiable rewards, math, coding, and the messy real world
38:00 What physics can teach us about AI
42:08 Is there a thermodynamics of AI?
43:08 From Erdős problems to Einstein-level AI
45:16 Is AI already doing original science?
45:51 How far are we from AI automating AI research?
47:41 Why Dan is excited about the future of science
We're coming out of stealth.
We've built our first racks after a successful A0 tapeout, $1B+ in customer contracts, and $800M raised.
Early customer tests show us achieving SOTA throughput, latency, and power efficiency on inference workloads.
Our first racks ship this summer.
Sol is our new flagship and a step function better than GPT-5.5.
Terra delivers performance competitive with GPT-5.5 at 2x lower cost.
Luna is our most cost-efficient model, delivering strong capability at our lowest cost.
Together, the GPT-5.6 family gives people and developers more choice in how they balance intelligence, speed, and cost.
Today's AI models train once. We don't work that way. We learn continuously, forget what doesn't matter, and retain what does.
That gap is what @dan_biderman and @realJessyLin are closing at @EngramLab. AI that never stops learning, with memory that lives inside the model instead of bolted on as an afterthought.
In our latest Training Data episode we get into why memory is the next frontier: why the brain forgets on purpose, why RAG is a band-aid, and what becomes possible when a model is always training.
00:00 Introduction
00:59 Always Training Explained
01:51 Beyond Context Windows
03:29 Engram Product Overview
04:34 Adapters And Training Signals
05:32 Internalize Vs Externalize
06:49 Compute And Token Savings
08:19 Teams First Then Individuals
08:51 Memorization Vs Understanding
12:47 Dreams And Offline Digestion
14:08 Training Beats Curation
15:19 Why Everyone Needs A Model
21:44 Bitter Lesson And Architecture
24:44 RAG Killer And KV Cache
31:38 Future Of Memory And Models
Today, I’m excited to formally announce @mirendil with my amazing co-founders Harsh Mehta, Shayan Salehian, and Tara Rezaei!
We’re fortunate to work with @a16z and @kleinerperkins, who led our seed round of $200M, followed by a major investment from NVIDIA, among others.
Mirendil exists to accelerate science and technology, and through them, to help solve humanity's most pressing problems.
Self-accelerating AI R&D is the most direct path to delivering on AI's broader promise, which is why we believe the most important application of AI is AI itself. Get this loop right, and it compounds. It fundamentally changes the rate of progress itself across all domains.
We believe this capability should be democratized. It should be used to power all scientific efforts trying to innovate at the frontier. There are far more important problems—and broader ones—than any single lab can take on, so more groups should be able to pursue them.
This pulls concentration of power away from a few labs: businesses and science labs can own their AI and infrastructure, keep their margins, and control their own destiny instead of ceding it all to a single AI lab.
We’re a small team with a singular focus. Our founding team consists of 20 researchers and engineers from frontier institutions including Anthropic, xAI, Google DeepMind, and OpenAI, united by a passion for science and a drive to build the technologies that move it faster. If you want to build the system that builds systems, join us!
@HarshMeh1a, @shayan_, @tararezaeikh
Engram is making a personal AI that really learns and internalizes everything you care about. It’s a different kind of lab with an amazing team put together by @dan_biderman, @EyubogluSabri, @realJessyLin, and @jxmnop. Very excited for what they are building!
I am pleased and honored to announce that, on July 6, I'll be joining @OpenAI as leader of a new team called Strategic Futures. Our mandate will be to help the company's leadership shape frontier AI policy. There is a ton of work to do, and I'm excited to get started.
I’m excited to share that I’ll be joining OpenAI and look forward to working with the exceptional team there.
It was a difficult decision to move on. I’m incredibly proud of the amazing team at Google and everything we’ve built together. It has been an honor and a pleasure to work with all of you.