This week the most advanced AI model on the planet got switched off by a foreign government. British researchers were studying it. British companies were testing it. British hospitals were piloting it. Not any more.
This isn't an AI story. It's the story of every industry we used to lead.
Britain has some of the best AI talent in the world. DeepMind was built here. Our AI Safety Institute writes the rules other countries follow. We have the researchers, the universities, the standards.
What we don't have is the power stations to run the data centres, the planning system to build them, or the industrial base to make the chips. So the work happens here and the value lands somewhere else. We invent. Others build. Others decide. Then we read about it on Saturday morning.
Same story as the kit our soldiers don't have. Same story as the factories we used to have.
I spent nine months in government making this argument inside the room. I'll make it louder from outside.
The Dirty Little Secret of AI:
I wanted to see if I could train a full neural network on a real 1979 PDP-11. Spoiler alert - I did.
Allow me to explain transformers and attention when they're reduced to their most basic forms, all in 6K of program code...
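For readers who want to see what "most basic form" can mean, here is a minimal sketch of scaled dot-product attention in plain Python. It is not the PDP-11 program from the post, only an illustration of the standard formula; the function names `softmax` and `attention` and the tiny example vectors are assumptions chosen for this sketch.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    dim = len(keys[0])
    width = len(values[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dim).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys
        ]
        weights = softmax(scores)
        # Output is the weighted average of the value vectors.
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(width)
        ])
    return outputs


# Hypothetical toy input: one query, two keys, two values.
example = attention([[1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]], [[1.0, 2.0], [3.0, 4.0]])
print(example)
```

The query matches the first key more closely, so the output leans toward the first value vector without ignoring the second.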
1/4 LLMs solve research-grade math problems but struggle with basic calculations. We bridge this gap by turning them into computers.
We built a computer INSIDE a transformer that can run programs for millions of steps in seconds, solving even the hardest Sudokus with 100% accuracy.
🚨 Microsoft just quietly dropped a tool that turns ANY document into LLM-ready data in seconds.
It's called MarkItDown, a lightweight Python library that converts PDFs, Word, Excel, PowerPoint, images, audio, and YouTube URLs into clean Markdown your LLM can actually use.
No custom parsers. No brittle pipelines. No preprocessing hell.
Built by the AutoGen team and battle-tested across 87K GitHub stars.
The numbers don't lie:
→ pip install markitdown and you're converting files in under 60 seconds
→ 10+ file formats supported out of the box
→ Native MCP server for direct Claude Desktop integration
And it works everywhere:
→ Command line: markitdown file.pdf > doc.md
→ Python API: 3 lines of code
→ Docker
→ Azure Document Intelligence for enterprise OCR
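The "3 lines of code" claim can be sketched roughly as follows. The `MarkItDown` class, its `convert` method, and the `text_content` attribute come from the project's README; the wrapper function `to_markdown` and its optional `converter` parameter are assumptions added here so the sketch can be exercised without the package installed.

```python
def to_markdown(path, converter=None):
    """Convert the document at `path` to Markdown text.

    `converter` is a hypothetical injection point for this sketch: any
    object with a `.convert(path)` method whose result has `.text_content`.
    """
    if converter is None:
        # Imported lazily; requires `pip install markitdown`.
        from markitdown import MarkItDown
        converter = MarkItDown()
    result = converter.convert(path)
    return result.text_content


# Usage, assuming markitdown is installed and file.pdf exists:
#   print(to_markdown("file.pdf"))
```

Without the wrapper, the core really is three lines: import `MarkItDown`, call `convert`, read `text_content`.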
100% open source. MIT license.
This is the document preprocessing tool your RAG pipeline has been waiting for: LLM-ready output without the LLM-ready headache.
Link in the first comment 👇
We've added a new command to Claude Code called /insights
When you run it, Claude Code will read your message history from the past month. It'll summarize your projects, how you use Claude Code, and give suggestions on how to improve your workflow.
Introducing Claude Opus 4.6. Our smartest model got an upgrade.
Opus 4.6 plans more carefully, sustains agentic tasks for longer, operates reliably in massive codebases, and catches its own mistakes.
It’s also our first Opus-class model with 1M token context in beta.
We just open sourced the code-simplifier agent we use on the Claude Code team.
Try it: claude plugin install code-simplifier
Or from within a session:
/plugin marketplace update claude-plugins-official
/plugin install code-simplifier
Ask Claude to use the code simplifier agent at the end of a long coding session, or to clean up complex PRs. Let us know what you think!
Woke up to 690,000+ views on my Ralph article (what?)
So I'm going to double down and show you a real-world example of how I used Ralph to ship a feature this morning.
Also, I created a public repo that makes it super easy for you to get Ralph running: https://t.co/V2Qu7SMyxh
1. Created the PRD
https://t.co/SURYq8Xizc
Note how I used my create-prd markdown file from https://t.co/YXuNG6XKYz
2. Created the Ralph user stories
https://t.co/5Q7F6xLjUh
Note how I used my Ralph Skill to create the user stories
https://t.co/xdw8ur90m6
3. Started Ralph with max 25 iterations
./scripts/ralph/ralph.sh 25
Iteration 1
https://t.co/l38qgqyQeG
Iteration 2
https://t.co/xj05rXNqDO
Iteration 3
https://t.co/zOknKrGE4O
Iteration 4
https://t.co/vId1EtNGST
Iteration 5
https://t.co/KfGzjNq9AR
Iteration 6
https://t.co/fyMA7fnix9
Iteration 7
https://t.co/HGmrhbiaUc
Iteration 8
https://t.co/V4iww9gNUQ
Iteration 9
https://t.co/4eKssSyTW3
Iteration 10
https://t.co/GdIEliVL28
Iteration 11
https://t.co/jCOjNDbMdB
Iteration 12
https://t.co/a2I1g1EXwP
Iteration 13
https://t.co/xhQTR3R1lg
Iteration 14
https://t.co/F2TNGDxvao
After it finished, I tested manually and found a few edge-case bugs, which Amp quickly fixed.
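For anyone curious about the shape of the loop, here is a rough sketch of "run up to N iterations, stop early when the work is done". This is not the actual ralph.sh; `run_loop`, `fake_iteration`, and the one-story-per-iteration behaviour are assumptions made up for illustration, with the fake step standing in for a real call to the coding agent.

```python
def run_loop(run_iteration, max_iterations):
    """Call run_iteration(i) until it reports completion or the cap is hit.

    Returns the iteration number on which the work finished, or None if
    the cap was reached first.
    """
    for i in range(1, max_iterations + 1):
        if run_iteration(i):
            return i
    return None


# Hypothetical stand-in for the agent: each iteration completes one story.
remaining = ["story-%d" % n for n in range(1, 15)]


def fake_iteration(i):
    remaining.pop(0)
    return not remaining  # True once every story is done


finished_at = run_loop(fake_iteration, 25)
print(finished_at)
```

The cap only matters as a safety net: in this simulated run, as in the thread above, the loop ends well before iteration 25.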
This Stanford University paper just broke my brain.
They just built an AI agent framework that evolves from zero data (no human labels, no curated tasks, no demonstrations), and it somehow gets better than every existing self-play method.
It’s called Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning
And it’s insane what they pulled off.
Every “self-improving” agent you’ve seen so far has the same fatal flaw:
they can only generate tasks slightly harder than what they already know.
So they plateau. Immediately.
Agent0 breaks that ceiling.
Here’s the twist:
They spawn two agents from the same base LLM and make them compete.
• Curriculum Agent - generates harder and harder tasks
• Executor Agent - tries to solve them using reasoning + tools
Whenever the executor gets better, the curriculum agent is forced to raise the difficulty.
Whenever the tasks get harder, the executor is forced to evolve.
This creates a closed-loop, self-reinforcing curriculum spiral, and it all happens from scratch: no data, no humans, nothing.
Just two agents pushing each other into higher intelligence.
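To make the feedback loop concrete, here is a deliberately tiny toy model of it. This is not the paper's training method (which uses reinforcement learning on real LLMs); `co_evolve`, the scalar `skill` and `difficulty`, and the update constants are all assumptions invented for this sketch.

```python
def co_evolve(rounds, skill=1.0, difficulty=1.0):
    """Toy co-evolution: two numbers pushing each other upward.

    Returns a list of (skill, difficulty) pairs, one per round.
    """
    history = []
    for _ in range(rounds):
        # How often the executor copes with tasks at the current difficulty.
        success_rate = skill / (skill + difficulty)
        # Executor: practising on harder tasks yields a larger skill gain.
        skill += 0.2 * difficulty
        # Curriculum: once the executor succeeds at least half the time,
        # raise the bar.
        if success_rate >= 0.5:
            difficulty *= 1.25
        history.append((skill, difficulty))
    return history


for round_number, (s, d) in enumerate(co_evolve(8), start=1):
    print(round_number, round(s, 3), round(d, 3))
```

Neither number can stall in this toy: harder tasks speed up the executor, and a stronger executor triggers harder tasks.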
And then they add the cheat code:
A full Python tool interpreter inside the loop.
The executor learns to reason through problems with code.
The curriculum agent learns to create tasks that require tool use.
So both agents keep escalating.
The results?
→ +18% gain in math reasoning
→ +24% gain in general reasoning
→ Beats R-Zero, SPIRAL, Absolute Zero, even frameworks using external proprietary APIs
→ All from zero data, just self-evolving cycles
They even show the difficulty curve rising across iterations:
tasks start as basic geometry and end at constraint satisfaction, combinatorics, logic puzzles, and multi-step tool-reliant problems.
This is the closest thing we’ve seen to autonomous cognitive growth in LLMs.
Agent0 isn’t just “better RL.”
It’s a blueprint for agents that bootstrap their own intelligence.
The agent era just got unlocked.
Introducing Karpathy: An Agentic Machine Learning Engineer built with Google ADK, Claude Code, and our Claude Scientific Skills. It supports fully automated or highly interactive workflows, giving you complete control over how you build and refine machine learning systems. Karpathy is open source, free to use, and available today. You only need Claude Code and an OpenRouter API key to get started. We're excited to see how you all use it!
https://t.co/WutDj2Mqrg
And yes, it's named after @karpathy! 😁
When Grokipedia is good enough (long way to go), we will change the name to Encyclopedia Galactica.
It will be an open source distillation of all knowledge, including audio, images and video.
Join @xAI to help build the sci-fi version of the Library of Alexandria!