We're expanding Glasswing today. To solve such a big/complex/urgent problem, we need Mythos-level capabilities in as many defenders' hands as possible. That's why we're working on safeguards to scale that safely ASAP.
11 of my reflections from the past 2 months of Glasswing 🧵:
We’re expanding Project Glasswing. We’ve extended access to Claude Mythos Preview to approximately 150 additional organizations, based in more than fifteen countries.
Read more about this expansion and our future plans for Project Glasswing: https://t.co/QrtHSBdRbh
Over the past few months, we've been holding dialogues with scholars, philosophers, clergy, and ethicists on the questions AI raises—starting with how good character forms.
Read more about how we’re widening the conversation on frontier AI: https://t.co/vKGiODEq6q
Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time.
A lot of people have been wondering about Mythos, Glasswing, and the vulns we / our partners are fixing. Today, I’m excited for us to start sharing more. (For context, I lead Glasswing @AnthropicAI.)
Two independent evaluations this week—from XBOW and the UK AISI—confirm what we've been seeing internally: Claude Mythos Preview is a step change in autonomous cybersecurity capabilities. We need to start preparing fast for a world of models with this level of capabilities.
The UK AI Security Institute tested the model we shipped at the launch of Project Glasswing and found Mythos Preview is the first model to solve both of their end-to-end cyber ranges, including one (Cooling Tower) which no model had ever cleared. But attackers (and defenders) have sophistication & cost constraints – Mythos is also the only model that clears every one of their tasks estimated over 8 hours under their deliberately low 2.5M-token cap.
XBOW tested it on their offensive security benchmarks, finding "token-for-token, unprecedented precision." It's the only model to succeed at subtle V8 sandbox work.
Other Glasswing partners shared similar stories. In a few weeks of testing, Mythos Preview has helped them find many thousands of (estimated) high + critical severity vulnerabilities, sometimes double what they'd normally find in a year.
I don't share this to boost Mythos. In fact, this is not about Mythos. It’s about preparing for the coming world of models being better, faster, cheaper, and more creative than some of the best human experts at dual-use capabilities. Clearly, we need them supporting defenders as widely as can be done safely – and especially the least-resourced ones.
Within a year, Mythos will probably look quite dumb (relative to other new models). And others may release openly available or unguardrailed models of Mythos-level capabilities.
We started Project Glasswing because capabilities like Mythos Preview's won't stay rare, or stay in careful hands. We are bringing it to defenders as fast as we responsibly can, while working to figure out, for example, the right safeguards and patching & disclosure processes.
Also, to be clear, compute has never been a limiter in our rollout.
Expect a fuller update on our Glasswing work in the coming days.
XBOW report: https://t.co/Mumtbf3kE3
UK AISI report: https://t.co/vBgqz0AeKJ
One of the things that made the Mythos release hard to interpret is that Anthropic held back details on most vulns they found, to give defenders time to patch.
1 month later, info from orgs with access to Mythos is starting to trickle out, e.g. this post from Mozilla today:
I've spent the past few weeks reading 100s of public data sources about AI development. I now believe that recursive self-improvement has a 60% chance of happening by the end of 2028. In other words, AI systems might soon be capable of building themselves.
We're expanding our collaboration with Amazon to secure up to 5 gigawatts of compute for training and deploying Claude. Capacity begins coming online this quarter, with nearly 1 gigawatt expected by the end of 2026.
Introducing Claude Design by Anthropic Labs: make prototypes, slides, and one-pagers by talking to Claude.
Powered by Claude Opus 4.7, our most capable vision model. Available in research preview on the Pro, Max, Team, and Enterprise plans, rolling out throughout the day.
we're looking for someone to lead external comms for anthropic's research teams. the role requires deep familiarity with both the broader media landscape and (imo, our most interesting) research in interp, alignment, and model welfare. my DMs open
https://t.co/vKAaVXIiuO
Join me this Wednesday in SF for an event celebrating the new book from NPR's Planet Money team. We'll talk about the impact of AI on society, how we think about the future at Anthropic, and maybe read some of my Import AI writing. More info: https://t.co/NiZBnk9bTM
I’m proud that so many of the world’s leading companies have joined us for Project Glasswing to confront the cyber threat posed by increasingly capable AI systems head-on.
https://t.co/pn3HSVsThP
We're hiring for a couple of important roles:
1) Communications lead: Seeking excellent writers with big ideas. Talk to me or @maxwellcyoung.
2) An operational wizard to scale the Policy and TAI orgs, working closely with me and Sarah Heck to run the orgs.
You can now enable Claude to use your computer to complete tasks.
It opens your apps, navigates your browser, fills in spreadsheets—anything you'd do sitting at your desk.
Research preview in Claude Cowork and Claude Code, macOS only.
Anthropic is expanding to Australia & New Zealand. We’ll soon open an office in Sydney—our fourth in Asia-Pacific after Tokyo, Bengaluru, and Seoul.
Read more: https://t.co/qJYOSfWOkf
Claude hit #1 on the iOS App Store in 14 countries
#1 - Australia
#1 - Austria
#1 - Belgium
#1 - Canada
#1 - France
#1 - Germany
#1 - Ireland
#1 - Italy
#1 - New Zealand
#1 - Norway
#1 - Singapore
#1 - Switzerland
#1 - United Kingdom
#1 - United States