Why AI Can Now Make Discoveries - my conversation with @danintheory, Lead of the Foundations of Reinforcement Learning team at @OpenAI
00:00 Intro: AI's wild week in mathematics
01:21 What OpenAI's Foundations of RL team does
03:08 Dan's journey: from black holes and quantum gravity to frontier AI
07:04 Are AI systems becoming useful for real science?
08:21 The AI math moment: Erdős, OpenAI, DeepMind, and Anthropic
08:52 Why the OpenAI result was an act of exploration
10:25 OpenAI vs. DeepMind: informal reasoning vs. formal proof
12:13 RL 101: learning by doing, not just watching
15:10 Why reinforcement learning works
15:58 How RL breaks: sparse feedback and long-horizon tasks
17:03 RLHF: how human feedback shaped early language models
18:48 Move 37, self-play, and the search for novel strategies
22:16 Explore vs. exploit in scientific discovery
24:49 Why RL may now be "the cake," not the cherry on top
25:46 Why RL started working with large language models
27:29 Is RL "sucking supervision through a straw"?
28:47 Why language may be the grounding layer for intelligence
31:46 A contrarian take on the Bitter Lesson
32:41 What test-time compute actually is
34:50 How RL gives models the ability to think
35:40 Verifiable rewards, math, coding, and the messy real world
38:00 What physics can teach us about AI
42:08 Is there a thermodynamics of AI?
43:08 From Erdős problems to Einstein-level AI
45:16 Is AI already doing original science?
45:51 How far are we from AI automating AI research?
47:41 Why Dan is excited about the future of science
OK, bad stories about VCs are spreading on X right now, but VCs have horror stories about founders too
Like, that one time when a founder decided to take another term sheet with a higher valuation despite our obvious ability to add value, thought leadership and vendor discounts
Dan Roberts (OpenAI):
"I can make a very long-distance prediction for the next 6 months.
I think we'll see more math and science breakthroughs, and obviously we'll turn this on AI itself, and the models will get a lot more powerful, and that'll be fun. You could do science of AI and have it feel like doing physics."
Our first commercial TTS model was optimized for WER and SSIM because years of research had taught us these were the standard metrics. Our first customer feedback revealed huge blind spots in these metrics, particularly on naturalness, rhythm, emphasis, question intonation, etc. Now our internal eval monitors dozens of criteria on each model.
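To make the blind spot concrete, here is a minimal sketch of word error rate (WER) in Python. It assumes the usual TTS setup where synthesized audio is transcribed by an ASR system and the transcript is compared with the input text; the function name and the example sentences are hypothetical and are not taken from our eval. Because WER only sees the words, a flat reading and a properly rising question reading of the same sentence get the same score.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Standard Levenshtein distance over words, kept to two rows of the table.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    text = "are you coming today"
    # Hypothetical ASR transcripts of two renditions with different intonation:
    # the words are identical, so WER cannot tell them apart.
    flat_reading = "are you coming today"
    rising_reading = "are you coming today"
    print(word_error_rate(text, flat_reading), word_error_rate(text, rising_reading))
    # One substituted word out of four.
    print(word_error_rate(text, "are you going today"))
```

A metric like this catches intelligibility failures, which is why it pairs poorly with the perceptual criteria listed above and needs to be complemented by them.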
Vercel is partnering with and integrating Shopify.
Starting with @v0, you can now prompt a Next.js + Shopify store in seconds.
The old tradeoff was “easy monolith” or “costly headless”. No more. Easy @nextjs Shopify storefronts with no scale or sophistication ceiling.
Reasoning LLMs typically take 2-3 seconds to start emitting tokens. In a voice agent, that's 2-3 seconds of silence after the user finishes speaking.
The @MiniMax_AI team just shipped a community contribution to Gradbot with two models running in parallel. MiniMax-M2-her produces a short acknowledgement that starts streaming to TTS immediately, while MiniMax-M2.7 handles reasoning and tool calls in the background.
Thanks to @davidtaoweiji for this contribution. Check out our README for more details.
https://t.co/gxSTdrCiAm
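The two-model pattern described above can be sketched with `asyncio`. This is an illustration under stated assumptions, not Gradbot's actual code: `fast_ack`, `slow_reasoner`, `respond`, and the `speak` callback are hypothetical stand-ins, and the sleeps simulate model latency instead of calling any real model or TTS API.

```python
import asyncio
from typing import Callable


async def fast_ack(user_text: str) -> str:
    """Hypothetical stand-in for the low-latency model."""
    await asyncio.sleep(0.01)  # simulated short latency
    return "Sure, let me check."


async def slow_reasoner(user_text: str) -> str:
    """Hypothetical stand-in for the reasoning model (thinking, tool calls)."""
    await asyncio.sleep(0.05)  # simulated longer latency
    return f"Answer to: {user_text}"


async def respond(user_text: str, speak: Callable[[str], None]) -> None:
    # Start both models at once so the reasoning time overlaps with the
    # acknowledgement instead of being added on top of it.
    reasoning_task = asyncio.create_task(slow_reasoner(user_text))
    ack_task = asyncio.create_task(fast_ack(user_text))

    # Speak the acknowledgement as soon as it is ready, filling the silence.
    speak(await ack_task)
    # Then speak the full answer once the background reasoning finishes.
    speak(await reasoning_task)


if __name__ == "__main__":
    asyncio.run(respond("what's the weather", print))
```

In a real voice agent `speak` would stream audio to TTS rather than print text, and the acknowledgement would likely be streamed token by token; the sketch only shows the ordering and the overlap.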
KongBrain gives OpenClaw a memory upgrade. Built by @eyeJUK3, it uses SurrealDB to store conversations, skills and causal chains, so your AI agent remembers, adapts and compounds knowledge. Check out the GitHub repo to learn more. 👉 https://t.co/TNFWz92goX