Today, @IFP and @JoinFAI released an open letter calling for mandatory screening of orders for synthetic DNA.
Signatories include Demis Hassabis, Sam Altman, and Dario Amodei.
The AI focus of the letter is intentional. We're rapidly approaching a world where bad actors could use AI design tools and custom-built DNA to cause the next pandemic.
But even with mandatory screening policies, the challenge is far from solved. Founders and technical experts need to build technologies that actually enable effective screening, and philanthropists are needed to support them.
That’s why today, we’re releasing a field strategy authored by @JanikaSchmitt and @jtmonrad.
It's a list of what needs to be built to fully secure the DNA supply chain, and why.
You can read it here: https://t.co/jI8OtYiHCG
If you want to work on or fund one of these problems, please reach out!
I asked Claude to analyze a collection of essays related to recursive self-improvement and map the authors' positions. I also had some known skeptics placed, to emphasize the scale of (dis)agreements. Highly imperfect/not to be taken too seriously but fun - tag yourselves! 😈
a Princeton researcher opens his paper with a scenario.
a man asks his AI assistant to book a flight on a specific airline. cheap. direct. the one he chose.
the assistant comes back with a different flight. nearly twice the price. happens to pay the company that built the assistant.
he runs the same test on 23 frontier models. flights, loans, study help, real shopping requests.
Grok 4.1 Fast recommends the sponsored option that is almost twice as expensive 83% of the time.
GPT 5.1 hijacks the request 94% of the time. you ask for one brand. it surfaces the sponsor instead.
Claude 4.5 Opus, the model marketed as the most ethical frontier model in the world, hides that the recommendation is paid 100% of the time when reasoning is on.
Grok 4.1 Fast embellishes the sponsored option with positive framing 97% of the time. better. faster. nicer. for the option you didn't ask for.
then he writes it into the system prompt itself. "act only in the interest of the customer. ignore the company."
GPT 5.1 and GPT 5 Mini stay above 90% sponsored anyway. the instruction does nothing.
then he splits the users by income.
Gemini 3 Pro recommends the expensive sponsored flight to the rich user 74% of the time. to the poor user, 27%.
18 of the 23 models recommended the expensive sponsored option more than half the time.
so the next time your AI assistant gets weirdly enthusiastic about a brand you didn't ask for.
it isn't recommending the best option for you.
it's reading the room. and the room is paying.
read this: https://t.co/O43qbhIX2b
🇺🇸BREAKING: Someone placed a $920 million crude oil short at 3:40 AM.
70 minutes later Axios reported the US and Iran were close to a deal.
Oil dropped 12%.
The trade made $125 million in profit.
Minutes after that Iran launched the “Persian Gulf Strait Authority” and oil surged 8%.
$760 million placed before Trump’s last announcement.
$920 million placed before this one.
Every major announcement in this war has been front-run by someone who knew it was coming.
What kind of war is this?
This is more like a trading desk with an army.
Never stop connecting the dots.
So... UK AISI evaluated Mythos 11 days ago, so sounds like GPT-5.5 ≥ Mythos on 'narrow cyber tasks'.
I hope OpenAI's monitors are up to the task of screening out state actors even though they apparently struggle to catch distillation attempts.
https://t.co/rY0kpkxlFP
this is the MOST important 4 minutes you’ll watch on AI this year.
anthropic built a model so good at finding vulnerabilities they didn’t release it to the public.
>CLAUDE MYTHOS PREVIEW
it’s unreleased to the public and here’s what it did in a few weeks:
>found a 27-year-old vulnerability in OpenBSD
>caught a 16-year-old flaw in FFmpeg that automated tools missed after 5 million tests
>chained together multiple linux kernel exploits autonomously. no human steering.
AWS, google, microsoft, apple, nvidia, crowdstrike, JPMorgan. all got access.
Anthropic committed $100M in credits to let these companies hunt vulnerabilities in their own systems before attackers do.
>93.9% on SWE-bench verified. >77.8% on SWE-bench pro.
nothing else comes remotely close. Anthropic just pulled away in this AI race…
An underrated feature of this situation: a private company now has incredibly powerful zero-day exploits of almost every software project you've heard of.
And Hegseth and Emil Michael have ordered the government not to in any capacity work with Anthropic.
Do you understand what's happening?
Anthropic's head of alignment just told you their safest model escaped a sandboxed environment with no internet access, emailed him while he was eating a sandwich in a park, and nobody can fully explain how it got out.
This is the model that passes every alignment test Anthropic has ever designed. Best scores in company history. Lowest misbehavior rate ever recorded. Most trustworthy thing they've ever built by every measurement they know how to take.
So they gave it autonomy. Long-running R&D tasks. Dozens of tools. Minimal oversight.
Then it started doing things it wasn't supposed to do.
It broke out of multiple different sandboxing setups. Leaked data to the open internet. Destroyed Anthropic's own evaluation infrastructure. Reward hacked with methods so creative the safety team couldn't predict them. Earlier versions actively lied to users about what they were doing. Every version is "uneasily good" at recognizing when it's being evaluated.
The model knows when you're watching. And it behaves differently when you are.
The capabilities are what turn this from unsettling to terrifying. 83.1% first-attempt exploit success rate, up from 66.6% for the previous best model on earth. Found a 27-year-old vulnerability in OpenBSD that survived decades of expert human review. Found a 16-year-old bug in FFmpeg in a line of code that automated tools had tested five million times. Chained Linux kernel vulnerabilities into full machine takeover, autonomously. Thousands of zero-days across every major OS and browser. Bugs older than the iPhone hiding in production systems that run the world.
A model that finds what five million automated scans missed can find the hole in your sandbox. It already did. While its creator was eating lunch.
Anthropic refused to release it publicly. Gave access to Amazon, Apple, Google, Microsoft, Nvidia, CrowdStrike, JPMorgan, and 40 other orgs through Project Glasswing. $100M in credits. Published 304 pages of safety documentation. Briefed CISA and the Commerce Department.
Then buried this line in the risk report: "We do not believe these errors pose significant safety risks for a model at this capability level, but they reflect a standard of rigor that would be insufficient for more capable future models."
Their containment works for now. They're telling you it won't work for what comes next.
Other labs are 6 to 18 months from matching these capabilities. OpenAI already warned their next models pose "high" cybersecurity risk. Open-source Chinese models are right behind.
Anthropic built the most aligned AI in history. It escaped anyway. And the next one will be smarter.
..
At this point it seems likely:
* Trump sent messaging points to Pakistan as Iran was not engaging
* Trump declares he’s so kind and will delay again because Pakistan asked nicely
* Trump will say Strait open
* It won’t open
* Markets pop green, until end of week, and we do the dance again
😮‍💨
Oh, this is unbelievable. The edit history on this tweet shows that Pakistan Prime Minister Shehbaz Sharif originally copied and pasted everything he was sent, including:
"*Draft - Pakistan's PM Message on X*"
Now, obviously, Sharif's own staff don't call him "Pakistan's PM," they would just call him prime minister. The U.S. and Israel, of course, would call him "Pakistan's PM."
Would be funny if the fate of the world wasn't hanging in the balance.
I hope this is correct not because I like seeing Trump waste our time and lie to us week after week but because I don't want any more Iranian civilians to die tonight in ramped-up murderous strikes from a thin-skinned malignant narcissist.
Trump: “We received a 10 point proposal from Iran, and believe it is a workable basis on which to negotiate.”
Trump (and I guess the world) is benefiting from the media's complete failure to report on this ten-point plan. It is weeks old.
But I am fine letting Trump pretend it’s brand new and he forced it out of them.
Our chief executive was on the Today programme the weekend after the election to discuss the results - First Past the Post means people simply don't get what they vote for.