Update: I joined @AnthropicAI to lead AI for State & Local Government! aka making government better with Claude :)
Last year, while running for office in SF, I met 20K+ voters.
One of those was @DanielaAmodei, President & co-founder of Anthropic 🧵
DM me if you’re interested in buying an incredible art piece five years in the making.
It’s a new kind of sculpture that animates reflected light.
See video in reply.
A lot of people have been wondering about Mythos, Glasswing, and the vulns we / our partners are fixing. Today, I’m excited for us to start sharing more. (For context, I lead Glasswing @AnthropicAI.)
Two independent evaluations this week—from XBOW and the UK AISI—confirm what we've been seeing internally: Claude Mythos Preview is a step change in autonomous cybersecurity capabilities. We need to start preparing fast for a world of models with this level of capabilities.
The UK AI Security Institute tested the model we shipped at the launch of Project Glasswing and found Mythos Preview is the first model to solve both of their end-to-end cyber ranges, including one (Cooling Tower) which no model had ever cleared. But attackers (and defenders) have sophistication & cost constraints – Mythos is also the only model that clears every one of their tasks estimated over 8 hours under their deliberately low 2.5M-token cap.
XBOW tested it on their offensive security benchmarks, finding "token-for-token, unprecedented precision." It's the only model to succeed at subtle V8 sandbox work.
Other Glasswing partners shared similar stories. In a few weeks of testing, Mythos Preview has helped them find many thousands of (estimated) high + critical severity vulnerabilities, sometimes double what they'd normally find in a year.
I don't share this to boost Mythos. In fact, this is not about Mythos. It’s about preparing for the coming world of models being better, faster, cheaper, and more creative than some of the best human experts at dual-use capabilities. Clearly, we need them supporting defenders as widely as can be done safely – and especially the least resourced ones.
Within a year, Mythos will probably look quite dumb (relative to other new models). And others may release openly available or unguardrailed models of Mythos-level capabilities.
We started Project Glasswing because capabilities like Mythos Preview's won't stay rare, or stay in careful hands. We are bringing it to defenders as fast as we responsibly can, while working to figure out, for example, the right safeguards and patching & disclosure processes.
Also, to be clear, compute has never been a limiter in our rollout.
Expect a fuller update on our Glasswing work in the coming days.
XBOW report: https://t.co/Mumtbf3kE3
UK AISI report: https://t.co/vBgqz0AeKJ
I’ve worked at Anthropic for three weeks and I can say it is both wonderful and *quite* different from any other place I’ve ever worked. Feeling immense gratitude to get to work on the things I do (and the concomitant obligation/responsibility).
@zmwang is a 🐐 at designing magical experiences.
I've known him for over a decade, and have seen him create everything from 100-person global feasts for students at a new university (Minerva Uni) to bachelor parties for friends.
Highly worth checking out his cookbook!
I'm writing a cookbook for connection and launching the first recipes → https://t.co/iW2nFzoi6q
For 15+ years I've been the friend people texted at 11 pm before a big life thing.
"I'm walking into my first board meeting tomorrow, how do we build trust in the first 10 minutes?"
"I'm designing my 35th, help me make it more than a bar night."
"I'm going on a trip with my parents, how do we actually connect?"
"We're designing our intercultural wedding, how do we make it magical and not formulaic?"
🍽️ They each wanted a "recipe" for how to design an experience. I'm sharing some of the meaningful ones that helped them (ingredients, steps, and a little science explaining why it works).
👨‍👩‍👧 What I learned is that people wanted to connect with the most important people in their lives, but were stuck. Holidays with parents, date night with their partner, a cousin's trip, a team offsite, a birthday, a baby shower. We default to the same script and wonder why the night felt okay but not great. It's usually not a lack of love in the room. It is a lack of inspiration about how.
🍚 In the same way you pull up a food recipe when you want to cook something great for someone, I believe we need inspiration from recipes for the experiences we host. Magical recipes to teach people how to host, facilitate, and create the "vibes".
I will share a new recipe every Monday (https://t.co/iW2nFzoi6q).
Comment 👇: What recipes would you want to read? Who do you want to deepen connection with in your life?
Imagine every pixel on your screen, streamed live directly from a model. No HTML, no layout engine, no code. Just exactly what you want to see.
@eddiejiao_obj, @drewocarr and I built a prototype to see how this could actually work, and set out to make it real. We're calling it Flipbook. (1/5)
Some news! Last week I joined @AnthropicAI to help state and local governments safely and effectively deploy AI capabilities.
Personally I see profound opportunity to transform how public services are delivered & accessed—and I feel a deep responsibility to help that go well.