This Stanford University paper just broke my brain.
They just built an AI agent framework that evolves from zero data (no human labels, no curated tasks, no demonstrations), and it somehow gets better than every existing self-play method.
It’s called Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning
And it’s insane what they pulled off.
Every “self-improving” agent you’ve seen so far has the same fatal flaw:
they can only generate tasks slightly harder than what they already know.
So they plateau. Immediately.
Agent0 breaks that ceiling.
Here’s the twist:
They spawn two agents from the same base LLM and make them compete.
• Curriculum Agent - generates harder and harder tasks
• Executor Agent - tries to solve them using reasoning + tools
Whenever the executor gets better, the curriculum agent is forced to raise the difficulty.
Whenever the tasks get harder, the executor is forced to evolve.
This creates a closed-loop, self-reinforcing curriculum spiral, and it all happens from scratch: no data, no humans, nothing.
Just two agents pushing each other into higher intelligence.
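To make the loop above concrete, here is a toy sketch in Python. It is not the paper's training procedure: the two agents are reduced to hypothetical scalars (`skill` and `difficulty`), the logistic success model is an assumption, and the update rules are stand-ins for the reinforcement learning the real framework would run on LLMs. It only shows how each side's progress can force the other to move.

```python
import math


def success_rate(skill: float, difficulty: float) -> float:
    """Chance the executor solves a task (assumed logistic in the skill gap)."""
    return 1.0 / (1.0 + math.exp(-(skill - difficulty)))


def co_evolve(rounds: int = 30, step: float = 0.5) -> list[dict]:
    """Run a toy curriculum/executor loop and return the per-round history."""
    skill, difficulty = 0.0, 0.0  # hypothetical scalar stand-ins for two LLMs
    history = []
    for _ in range(rounds):
        p = success_rate(skill, difficulty)
        # Curriculum side: raise difficulty when the executor wins too often.
        difficulty += step * (p - 0.5)
        # Executor side: the learning signal is largest when the outcome is uncertain.
        skill += step * p * (1.0 - p)
        history.append({"skill": skill, "difficulty": difficulty, "success": p})
    return history


if __name__ == "__main__":
    for i, row in enumerate(co_evolve(rounds=5), start=1):
        print(i, round(row["skill"], 3), round(row["difficulty"], 3))
```

In this toy version the difficulty never drops and the skill keeps climbing, which is the "spiral" in miniature. Nothing here says how fast a real system would improve.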
And then they add the cheat code:
A full Python tool interpreter inside the loop.
The executor learns to reason through problems with code.
The curriculum agent learns to create tasks that require tool use.
So both agents keep escalating.
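A rough idea of what "a Python interpreter inside the loop" can look like, as a minimal sketch. The function name `run_tool_calls` and the convention that the model wraps code in a fenced `python` block are assumptions for illustration, not the paper's interface. Plain `exec` is also not a sandbox, so a real system would need proper isolation.

```python
import contextlib
import io
import re

FENCE = "`" * 3  # built at runtime so this example does not nest code fences
CODE_BLOCK = re.compile(FENCE + r"python\n(.*?)" + FENCE, re.DOTALL)


def run_tool_calls(response: str) -> list[str]:
    """Execute each fenced python block in a model response; return what it printed."""
    outputs = []
    for code in CODE_BLOCK.findall(response):
        buffer = io.StringIO()
        try:
            with contextlib.redirect_stdout(buffer):
                exec(code, {})  # NOT sandboxed: illustration only
            outputs.append(buffer.getvalue().strip())
        except Exception as exc:
            # Errors are returned as text so the model could react to them.
            outputs.append(f"error: {exc!r}")
    return outputs


if __name__ == "__main__":
    reply = "Let me compute.\n" + FENCE + "python\nprint(sum(range(1, 11)))\n" + FENCE
    print(run_tool_calls(reply))
```

The captured output would be appended to the conversation so the executor can keep reasoning with the result in hand.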
The results?
→ +18% gain in math reasoning
→ +24% gain in general reasoning
→ Beats R-Zero, SPIRAL, Absolute Zero, even frameworks using external proprietary APIs
→ All from zero data, just self-evolving cycles
They even show the difficulty curve rising across iterations:
tasks start as basic geometry and end at constraint satisfaction, combinatorics, logic puzzles, and multi-step tool-reliant problems.
This is the closest thing we’ve seen to autonomous cognitive growth in LLMs.
Agent0 isn’t just “better RL.”
It’s a blueprint for agents that bootstrap their own intelligence.
The agent era just got unlocked.
Join us at #MSIgnite for BRK1700: Windows & Microsoft 365 Copilot — a closer look at secure AI and agent productivity.
📅 Wed, Nov. 19
🕘 9:00–9:45 AM PST
📍 Moscone West, Level 2, Room 2007
Don’t miss what’s next for Windows and Microsoft Copilot: https://t.co/tTqDiTsTsN
"AI isn't replacing radiologists" good article
Expectation: rapid progress in image recognition AI will delete radiology jobs (e.g. as famously predicted by Geoff Hinton now almost a decade ago). Reality: radiology is doing great and is growing.
There are a lot of imo naive predictions out there on the imminent impact of AI on the job market. E.g. about a year ago, I was asked by someone who should know better whether I thought there would still be any software engineers today. (Spoiler: I think we're going to make it.) This is happening too broadly.
The post goes into detail on why it's not that simple, using the example of radiology:
- the benchmarks are nowhere near broad enough to reflect actual, real scenarios.
- the job is a lot more multifaceted than just image recognition.
- deployment realities: regulatory, insurance and liability, diffusion and institutional inertia.
- Jevons paradox: if radiologists are sped up via AI as a tool, a lot more demand shows up.
I will say that radiology was imo not among the best examples to pick on in 2016 - it's too multifaceted, too high risk, too regulated. When looking for jobs that will change a lot due to AI on shorter time scales, I'd look in other places - jobs that look like repetition of one rote task, each task being relatively independent, closed (not requiring too much context), short (in time), forgiving (the cost of a mistake is low), and of course automatable given current (and digital) capability. Even then, I'd expect to see AI adopted as a tool at first, where jobs change and refactor (e.g. more monitoring or supervising than manual doing, etc.). Maybe coming up, we'll find a better and broader set of examples of how this is all playing out across the industry.
About 6 months ago, I was also asked to vote on whether we will have fewer or more software engineers in 5 years. Exercise left for the reader.
Full post (the whole The Works in Progress Newsletter is quite good):
https://t.co/ON3GwlI3mi
For those of you who are in Singapore, I am bringing back Parisalon. Before covid we met over curated Jeffersonian dinners to discuss issues like climate etc. Due to interest, the salon will return as opt-in and more open. If interested and you know me on WA, lmk.
https://t.co/40PemSYNOY
My positive review of Ricks' book on how Washington, Adams, Jefferson and Madison were influenced by the Greeks and Romans, while I wonder why this matters in our modern world. Thanks to the invitation from the non-fiction book club of Singapore.
@SusanLindnerIS and I sat down for an amazing conversation spanning Pull the Goalie, women's inclusion, and some personal stories from my life in tech #innovation. https://t.co/OTftNVb9ZR
IT’S HAPPENING 😱🚨
The women’s hockey rivalry hits its peak as Team Canada 🇨🇦 goes head-to-head against the USA 🇺🇸 for GOLD 🥇 at #Beijing2022
WATCH THE GAME on Feb. 16 at 11 PM ET on CBC TV, in the CBC Sports app or @cbcgem
🎵 @reuben_thedark