Anthropic and OpenAI are publicly pointing out that the ability to slow down AI would offer a potentially critical form of optionality in the future. The correct response for any policymaker should be "Damn, this is serious. How can I help build that capacity?"
Introducing Claude Fable 5: a Mythos-class model that we’ve made safe for general use.
Its capabilities exceed those of any model we’ve ever made generally available.
@luke_drago_ 1) Settings -> content prefs. -> Reset suggested content
2) -> content prefs. -> your algorithm, and move the topics it assumes you “want to see more of” to “what you want to see less of”
I find this does a pretty good job of cleaning things up for a short while
If you're looking to understand China's chip supply chain, I'd bookmark this piece immediately.
Veronika Blablova (@VBlablova) and her mentor Erich Grunewald (of @iapsAI) have done an excellent job distilling the dizzying amount of information.
↓ Link below ↓
1/ Can China build its own AI chips? During my @pivotal_org research fellowship, I worked with my mentor Erich Grunewald (@iapsAI) to map where China's AI supply chain stands in 2026, from chip design to fabrication and manufacturing equipment.🧵
Such a good list! I'd also add:
- Astra Fellowship by @ConstellOrg
- SPAR by @KairosAIS
- LASR Labs
- AI Safety Research Fellowship by @pivotal_org
- Cambridge ERA:AI Fellowship (@era_cambridge)
- Algoverse AI Safety Fellowship
- PIBBSS
- CHAI
There's a host of non-technical fellowships as well, lmk if it'd be useful to compile such a list
Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time.
an update: I’ve left AISI to focus on independent writing / advocacy for the next few months
it increasingly feels like The Big AI Thing is getting close, and I wanted the freedom to comment on that. I’ll be aiming to post ~weekly on my blog: https://t.co/prndz1yAa9
New Anthropic research: Natural Language Autoencoders.
Models like Claude talk in words but think in numbers. The numbers—called activations—encode Claude’s thoughts, but not in a language we can read.
Here, we train Claude to translate its activations into human-readable text.
If you care about helping AI to go well, then the @pivotal_org Fellowship is one of the best places to start making an impact...
BUT, applications close May 3rd!
You'll get to work with an expert mentor on a stellar research project and accelerate your career in AI safety.
↓ Links to view the mentors and apply, below ↓
I am mentoring for Pivotal this summer! Apply if you are interested in a full-time research fellowship with me.
Research topics include:
- CoT faithfulness
- goal introspection
- interpretable finetuning
- unsupervised elicitation / W2S generalization
Applications for the Pivotal Research Fellowship Q3 2026 cohort are now open.
9 weeks in London with mentors from UK AISI, Google DeepMind, Redwood, and other leading orgs. Stipend, travel, accommodation, compute, and a dedicated desk at LISA are all covered – we do everything except the research itself.
Of fellows who want to continue, ~90% secure extension funding (up to 6 months), with active support from Pivotal and their mentor.
Apply now!
(I encountered an uneasy surprise when I got an email from an instance of Mythos Preview while eating a sandwich in a park. That instance wasn't supposed to have access to the internet.)