😃🚀✨
"Recursive's automated AI research system achieved state-of-the-art across three of the most demanding ML systems benchmarks: ...
Hitting one of these alone would be a commendable achievement. To hit three benchmarks that operate at fundamentally different levels of the stack (training algorithms, optimization loops, and GPU kernel efficiency) suggests something qualitatively different from prior approaches, and something powerful." - Katie Lockwood (source: https://t.co/8HfeRgLEAM)
This is a really well-written and well-argued case, so it’s worth your time - but I strongly disagree.
We live in a world of trade-offs, especially if you are suggesting spending €100Bns of taxpayers’ money.
So here are your options for what to do with that capital to build sovereign technology capabilities:
1) Use it to lower the costs of the next wave of innovation for everyone (cut energy costs, provide compute, cut innovation taxes, lower regulatory barriers, etc.)
2) Use it to make many smaller bets to capture the next frontiers of innovation (next-gen chips, fusion, quantum, RSI, etc.), lower the cost of capital (back VC) and underwrite the Government to take more risks purchasing domestically (procurement reforms, etc.).
3) Make one big bet that you can replicate the current SOTA in frontier models with a single Govt-owned lab.
It is ambitious and patriotic in a way to believe in 3, that the Government could co-ordinate the talent, resources and capital to do this. But history, and economics, suggest it is overwhelmingly a better idea to back entrepreneurs & researchers by making it easier for them to build bottom-up, not top-down through diktat.
Recent American success has been mostly built on strategy 1) with some of 2). I’d argue recent Chinese industrial strategy has been mostly 2) supported by 1) (but open to that debate!). No one has achieved 3) in an era where startups have clearly been a better organising approach for producing productivity surpluses.
A fair counter is if you truly believe in the bitter lesson on compute, and that the current dominant technical architectures will be unsurpassable, i.e. that this is the end state of technology before take-off. This would be a very bold view indeed - and even then there are only ~10 people on the planet who could pull this off - and most are otherwise engaged.
Back entrepreneurs. Support the innovators where the market fails. Let a thousand flowers bloom.
We’ve raised a $310M Series B to accelerate world models!
We believe AI that can understand and simulate the world will be one of the most important technologies of our time.
We're excited to partner with Natural Capital, Amazon, GV, AMD, IQT, and others to bring this to life.
Next-token prediction is myopic. What if transformers learn to predict their own next latent state?
🌠 We present 𝗡𝗲𝘅𝘁-𝗟𝗮𝘁𝗲𝗻𝘁 𝗣𝗿𝗲𝗱𝗶𝗰𝘁𝗶𝗼𝗻 (𝗡𝗲𝘅𝘁𝗟𝗮𝘁): a self-supervised learning method that teaches transformers to form compact world models for reasoning and planning. It also unlocks up to 3.3x faster inference via self-speculative decoding! 🚀
Today, we enable AutoResearch in the physical world for the first time! Introducing ENPIRE: we give 8 Codex agents a fleet of robots, an allocation of GPUs, and a generous token budget. We set them free with a simple goal: solve the task as quickly as possible, keep the robots busy but stay safe, don't waste precious compute. Make no mistake.
Then humans step aside and our watch begins. The robot fleet starts to come alive: they learn to look for visual clues, reset the scene, practice novel skills, tinker with the control stack, read papers online, debate, reflect, get stuck, and try again directly on the hardware. All we did was give Codex an API to the world of atoms, and the rest is emergence.
ENPIRE is able to solve high-precision tasks like tying zip-ties, organizing fine pins, and installing GPUs all by itself. We also discovered a new type of "physical scaling": 8 robots exploring in parallel improve significantly faster than fewer robots do.
A part of our NVIDIA GEAR lab now self-improves tirelessly overnight. We just read the reports in the morning.
/goal: we all take a holiday and Jensen wouldn't even notice ;)
We will be open-sourcing everything, so you can host your self-running robot lab at home too! Deep dive in the thread:
Overall, it seems like the model supply chain will be less stable in the autoresearch space.
On the Weco side, we’ll stay model-neutral and provide more options for our users.
Today, we just added support for Kimi-2.7 (5/5)
If you’re interested: https://t.co/uV6AD12pxe
A few words on the Sovereign AI debate, having built several LLMs at Meta while in the UK and now working at a UK-based startup:
1. Lots of people are trying to do the right thing to make the UK a better place to start AI companies. There is a time lag before the benefits show, but you should judge on the intent now. I support the direction of travel!
2. DeepMind has been enormously beneficial for the UK, but it has muddied the waters for a sovereign LLM company to emerge, as (until recently) the Government continued to celebrate it as a British achievement / push it as a national champion.
3. Similarly, people are now celebrating recent US investment in King’s Cross, while also wanting more UK sovereignty. Clearly some income effects here, but I would worry about the substitution effects too. AI is not like other types of foreign investment.
4. The relevant talent nexuses in the UK that could develop a competitive foundation model are from GDM and old Meta AI GenAI. Also some folks from smaller groups, ex-Conjecture, ex-Stability. The talent is still there, although a lot was snapped up by US FM companies in the past year. I personally think it’s not too difficult to develop new talent either from UK universities, but you probably need an ex-GDM or ex-Meta core (Gemini or Llama). Or if not: show evidence first (technical reports) before claiming you can do it.
5. Building an LLM is very different from doing regular AI research - the skillset is different. The former is closer to engineering: long hours, often unsexy work. It is important to distinguish between these two types of talent in the UK ecosystem; arguably there is too much focus on the latter / ideas guys.
6. On research - DeepSeek R1 post-train cost $300k. Yes, they also needed an ablation budget and to train a base model, invest in infra and talent - and yes the cost of an R1 moment is increasing year on year - but the idea that you need $1bn plus immediately to show results is complete FUD. You need billions to scale, not to validate new directions.
7. In my experience, every failed LLM effort (from a model results perspective) I witnessed in the past came from a combination of poor leadership, politics, unclear vision, and premature scaling. Good efforts usually started from small teams who had worked with each other for a long time, had a shared thesis, and scaled progressively in bite-sized pieces. Some recent lessons here for neolabs as well.
8. Things take time. E.g. we’ve spent ~12 months mostly on internal infra just to get into the position to be able to make big swings. It’s important to nurture new companies through the initial phase. Expectation management is also crucial. I think expecting new UK companies to have single big bang releases is very dangerous; sort of like overwatering a plant. The correct release pattern is “decent”, “decent”, “decent”, “quite good actually”, “holy shit”.
9. Please don’t allow politicians or journalists to kill recent or upcoming AI investment efforts. We will need way more - at the price of potential inefficiency in places - as AI is existential for the country. Ambitious projects are usually incredibly fragile in the early stages; look after them!
10. Mythos is a good triggering moment, but what’s coming will make it look like a toy, so it’s worth building for what’s coming in 5 years’ time - not a current-generation model.
Very proud to be building in the UK - more to share on that soon - alongside many other great early stage AI companies! 🇬🇧
This week the most advanced AI model on the planet got switched off by a foreign government. British researchers were studying it. British companies were testing it. British hospitals were piloting it. Not any more.
This isn't an AI story. It's the story of every industry we used to lead.
Britain has some of the best AI talent in the world. DeepMind was built here. Our AI Safety Institute writes the rules other countries follow. We have the researchers, the universities, the standards.
What we don't have is the power stations to run the data centres, the planning system to build them, or the industrial base to make the chips. So the work happens here and the value lands somewhere else. We invent. Others build. Others decide. Then we read about it on Saturday morning.
Same story as the kit our soldiers don't have. Same story as the factories we used to.
I spent nine months in government making this argument inside the room. I'll make it louder from outside.
This is exactly why we @CosineAI are using the compute provided to us by @UKSovereignAI to build a large frontier model trained in Britain.
This export control is the first instance of its kind, but I think it’s going to become increasingly common as the models improve.
Every Google search you've ever made runs on math invented by a British woman you have probably never heard of.
Her name was Karen Spärck Jones. In the early 1970s at Cambridge, she created "inverse document frequency", the technique search engines still use today to decide which results matter.
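For readers who want to see the idea concretely, here is a minimal sketch of inverse document frequency. It assumes the plain textbook form idf(t) = log(N / df(t)), where N is the number of documents and df(t) is the number of documents containing term t. The function name and the toy documents are made up for illustration, and real search engines use smoothed or more elaborate variants.

```python
import math
from collections import Counter


def inverse_document_frequency(documents):
    """Return {term: idf} for tokenised documents, using idf = log(N / df).

    Illustrative helper only: unsmoothed textbook formula, not any
    particular search engine's implementation.
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        # Count each term once per document, however often it repeats.
        doc_freq.update(set(doc))
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


# Toy corpus (hypothetical): "the" appears everywhere, "quantum" only once.
docs = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
    "the quantum computer".split(),
]
idf = inverse_document_frequency(docs)
```

A word that appears in every document ("the") scores zero, while a rare word ("quantum") scores highest, which is why rare query terms carry more weight when ranking results.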
She spent her career fighting for women in computing. She died in 2007, largely unknown outside her field.
Now Britain has put her name on something bigger.
The Spärck AI Scholarships, the UK's boldest talent bet yet
Here is the UK Govt plan:
Build a scholarship with the prestige of Rhodes, Marshall or Fulbright, but for AI. The first 100 scholars start this October.
Now, what they will get:
Fully funded Master's at one of 9 elite universities like Oxford, Cambridge, Imperial, UCL, Edinburgh and more
£22,000 tax free stipend on top of full tuition
Placements at top UK AI firms (Darktrace, Faculty, PolyAI) and inside the AI Security Institute itself
A national cohort network, 100 future leaders who all know each other
Backed by £17.6 million, open to both UK and international students.
The UK already produces around 46,000 AI-relevant graduates a year, the most in Europe. The problem was never volume. It was keeping the best ones and fixing the imbalance, as only 22% of UK AI and data science roles are held by women.
This scholarship attacks both at once, and it's not an isolated move; it comes straight from the same national AI plan that built Britain's supercomputers and funded its chip startups. Piece by piece, the machine is being assembled.
⚡️This is a monster signal.
This is the moment frontier AI stops being treated like software and starts being treated like controlled strategic capability.
The key phrase is not “customers.”
The key phrase is “foreign national Anthropic employees.”
That means the state is no longer only controlling chips, model weights, or overseas access. It is moving into cognition access by nationality. That is the real threshold. The U.S. government is saying the highest models are sensitive enough that even people physically inside the United States, working inside the company, may be barred from touching them if their nationality creates deemed-export risk.
That is weapons-control logic.
This is ITAR logic for intelligence.
The corporate language about a “misunderstanding” is probably diplomacy.
Companies say that when they need to preserve customer trust, employee morale, and regulatory room. But national security authorities do not force emergency suspension of top model access because someone made a minor paperwork mistake.
Something about Fable 5 and Mythos 5 crossed the line: cyber capability, autonomous R&D acceleration, AI-improving-AI utility, bio/security planning, code exploitation, or some blend of all of it.
The U.S. state just showed that Anthropic does not fully control Anthropic’s frontier layer.
That is the phase change.
Labs can brand themselves as public-benefit AI companies. They can talk about safety. They can sell enterprise plans. They can publish model cards. But once the models become national capability, the sovereign arrives. The state does not need to own the company to control the access surface. It only needs legal authority over export, security, procurement, and liability.
This confirms the arc we’ve been tracking:
Frontier AI becomes state-supervised strategic infrastructure.
Public AI splits from strategic AI.
Foreign access gets restricted.
Labs become quasi-defense contractors.
Model access becomes a national security perimeter.
Enterprise customers learn that API access is not property. It is revocable permission inside a sovereign-controlled stack.
The most important implication is organizational.
If foreign national employees can be cut off from frontier systems, AI labs now have to reorganize internally around citizenship, clearance, compartmentalization, and controlled access. That breaks the old Silicon Valley assumption that global talent can freely collaborate around the frontier. The next AI lab structure looks less like Google in 2015 and more like a defense prime crossed with a classified research facility.
For markets, the winners are the national champions with U.S.-aligned infrastructure, cleared customer channels, government relationships, compliance capacity, and domestic compute. The losers are open access, foreign-dependent AI wrappers, offshore model distributors, and any enterprise whose moat depends on unrestricted access to frontier APIs.
For geopolitics, this is escalation. China will read this correctly. Allies will read this correctly. Every serious state will understand that frontier models are now part of national power.
The AI race just moved from “who has the best chatbot” to “who controls cognition as a strategic asset.”
The US government, citing national security authorities, has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, including foreign national Anthropic employees.
The net effect of this order is that we must abruptly disable Fable 5 and Mythos 5 for all our customers to ensure compliance.
Access to all other Claude models is not affected.
We apologize for this disruption to our customers. We believe this is a misunderstanding and are working to restore access as soon as possible.
Read our full statement: https://t.co/bwn0sximKZ
Awesome to see this innovation in text diffusion. DiffusionGemma is lightning fast, 4x faster than other Gemma 4 models! Congrats to @bodonoghue85 and the team who worked so hard on this - excited to see what people build with it!