This is a super exciting release - Claude Fable 5 is the same underlying model as Mythos but with added safeguards. The benchmarks are great and it's SOTA on everything by a margin but I'll add that *qualitatively* also, this is a major-version-bump-deserving step change forward (imo of the same order as Claude 4.5 was in November), peaking especially for long problem-solving sessions on very difficult problems. You can give it a lot more ambitious tasks than what you're used to, the model "gets it" and it will just go, and it's never felt this tempting to stop looking at the code at all (but don't do this in prod!). The model still has quirks that people will run into and the safeguards are configured to be a little too trigger happy for launch, which can hopefully be tuned over time.
I feel a lot of things changing as working software increasingly comes out on tap. Jevons paradox kicks in and I feel my own demand for software growing substantially. You can ask for anything - explainers, visualizers, dashboards, bespoke single-use apps (e.g. a full wandb that is hyper-specific just for your project), you can 10X your test suite, auto-optimize code, run giant research projects with custom HTML for the results, anything! "Free your mind" (Matrix ref). Really looking forward to all the things people build!
Introducing Claude Fable 5: a Mythos-class model that we’ve made safe for general use.
Its capabilities exceed those of any model we’ve ever made generally available.
Cybersecurity and biosecurity requests may auto-reroute to Opus 4.8 (shown in the UI, billed at Opus prices).
Docs + prompting guide:
https://t.co/p03FDlBxnT
Terence Tao: "We lived in a world with cognitive friction until very recently, where every task required us to use our brain.
So we didn't really think about it, we just thought this was the cost of doing something intellectual. But now we have AI and the other technologies that can bring these frictions down to zero."
Most research time is not spent having cinematic insights.
It is spent checking cases, chasing references, translating intuition into computation, testing a path, finding it false, and deciding whether the failure taught you anything.
AI changes the cost of that loop.
Terence Tao says that now he can try “crazier things,” and that makes so much difference. Because unconventional ideas are often not rejected by proof, but by inconvenience.
A mathematician may avoid a strange direction not because it is foolish, but because the bookkeeping, coding, or literature search needed to test it is too expensive for a hunch.
This is where cognitive friction becomes scientific friction.
Lowering it does not make taste, judgment, or proof disappear; it makes more weak signals cheap enough to inspect before they are abandoned.
AI is making hesitation less expensive, and that is often where discovery begins.
It hit 34°C today in Data Centre Alley, West London.
The servers needed extra cooling measures and some ad-hoc solutions to stay stable through the heatwave.
Reassuring to see the Supermicro systems with NVIDIA GPUs perform reliably under real conditions. Owned infrastructure and solid hardware engineering make a tangible difference when the environment gets difficult.
#NVIDIA #SuperMicro #DataCenter #AIInfrastructure
Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time.
NVIDIA reportedly raised RTX PRO 6000 Blackwell GPU pricing ~30% globally last Friday (15 May). Confirmed today via a major US channel partner. Quotes being updated industry-wide.
For workstation AI builders, the math just shifted.
#NVIDIA #GPU #AIHardware
Grok CLI is sharp. Terminal-native agentic coding, fast tool calls, slots straight into the rotation next to Claude Code. Another solid lane open for autonomous dev work. 💪🏽 @grok @elonmusk #AI #LLM #Coding
Visited AMAX HQ in Fremont last week. Deep technical discussion and a tour of their lab. A real glimpse of the next generation of NVIDIA supercomputers being built.
#AMAX #NVIDIA #AIInfrastructure
That's a wrap on Code with Claude San Francisco! Clawd had an amazing time, and we hope you did too.
Thanks to everyone who came out and everyone who tuned in 🧡
Fireside chat at Sequoia Ascent 2026 from ~a week ago. Some highlights:
The first theme I tried to push on is that LLMs are about a lot more than just speeding up what existed before (e.g. coding). Three examples of new horizons:
1. menugen: an app that can be fully engulfed by LLMs, with no classical code needed: input an image, output an image and an LLM can natively do the thing.
2. install .md skills instead of install .sh scripts. Why create a complex Software 1.0 bash script for e.g. installing a piece of software if you can write the installation out in words and say "just show this to your LLM". The LLM is an advanced interpreter of English and can intelligently target installation to your setup, debug everything inline, etc.
3. LLM knowledge bases as an example of something that was *impossible* with classical code because it's computation over unstructured data (knowledge) from arbitrary sources and in arbitrary formats, including simply text articles etc.
I pushed on these because in every new paradigm change, the obvious things are always in the realm of speeding up or somehow improving what existed, but here we have examples of functionality that either suddenly perhaps shouldn't even exist (1,2), or was fundamentally not possible before (3).
The second (ongoing) theme is trying to explain the pattern of jaggedness in LLMs. How it can be true that a single artifact will simultaneously 1) coherently refactor a 100,000-line code base *and* 2) tell you to walk to the car wash to wash your car. I previously wrote about the source of this as having to do with verifiability of a domain, here I expand on this as having to also do with economics because revenue/TAM dictates what the frontier labs choose to package into training data distributions during RL. You're either in the data distribution (on the rails of the RL circuits) and flying or you're off-roading in the jungle with a machete, in relative terms. Still not 100% satisfied with this, but it's an ongoing struggle to build an accurate model of LLM capabilities if you wish to practically take advantage of their power while avoiding their pitfalls, which brings me to...
Last theme is the agent-native economy. The decomposition of products and services into sensors, actuators and logic (split up across all of 1.0/2.0/3.0 computing paradigms), how we can make information maximally legible to LLMs, some words on the quickly emerging agentic engineering and its skill set, related hiring practices, etc., possibly even hints/dreams of fully neural computing handling the vast majority of computation with some help from (classical) CPU coprocessors.
Runaway agents and malicious prompt injections are a permissions-and-backup problem, not a model problem.
Layered permissions bound what an agent can touch. Scoped creds, no admin keyring, no push channel to the offsite mirror, no path to the sealed seed.
Layered backups recover what an agent can destroy. Snapshot lattice, immutable cold archive with object-lock, pull-based geo-mirror, verified restore drills.
The recovery surface is strictly larger than the destruction surface. By design. Whatever an agent can reach to destroy is by definition recoverable from what it cannot reach.
That is the threat model. There is no third thing.
The Cursor/Railway incident was single-credential, single-volume, single-site. Identical outcome from a drunk intern or a rushed migration. The agent is incidental.
Solved problem. Has been for thirty years.
#AgenticAI #PromptInjection #AISecurity #InfoSec
https://t.co/RqKYrL1UZw
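The "recovery surface is strictly larger than the destruction surface" rule from the post above can be checked mechanically. What follows is a minimal sketch, not anyone's actual tooling: the `Backup` type, the `unrecoverable` function, and all resource and location names are hypothetical, and it assumes that anything the agent's credentials can reach is something it can destroy.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Backup:
    """One backup copy of a resource (hypothetical model for illustration)."""
    resource: str  # what is backed up, e.g. "prod-volume"
    location: str  # where the copy lives, e.g. "cold-archive"


def unrecoverable(resources: set[str], agent_reach: set[str],
                  backups: list[Backup]) -> set[str]:
    """Return resources the agent can destroy that have no copy outside its reach."""
    # A resource is safe if at least one backup sits somewhere the agent cannot touch.
    safe = {b.resource for b in backups if b.location not in agent_reach}
    # Only resources inside the agent's reach can be destroyed in the first place.
    return (resources & agent_reach) - safe


resources = {"prod-volume"}

# Single credential, single site: the only backup is reachable with the same creds.
flat_reach = {"prod-volume", "volume-snapshots"}
flat_backups = [Backup("prod-volume", "volume-snapshots")]

# Layered: an extra copy lives in an archive the agent has no path to.
layered_backups = flat_backups + [Backup("prod-volume", "cold-archive")]

print(unrecoverable(resources, flat_reach, flat_backups))
print(unrecoverable(resources, flat_reach, layered_backups))
```

An empty result means every destroyable resource has a copy the agent cannot reach. The check covers reachability only; it says nothing about whether a restore actually works, which is what the verified restore drills are for.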
Third RTX PRO 6000 Blackwell Max-Q arrived for the experimental Trinity cluster at PureTensor, self-imported from the US via @nvidia's Inception channel. Thanks Sean Ardura and the Inception team for the swift reseller connection.
We've signed an agreement with Google and Broadcom for multiple gigawatts of next-generation TPU capacity, coming online starting in 2027, to train and serve frontier Claude models.
- Drafted a blog post
- Used an LLM to meticulously improve the argument over 4 hours.
- Wow, feeling great, it’s so convincing!
- Fun idea: let’s ask it to argue the opposite.
- LLM demolishes the entire argument and convinces me that the opposite is in fact true.
- lol
The LLMs may offer an opinion when asked but are extremely competent at arguing in almost any direction. This is actually super useful as a tool for forming your own opinions, just make sure to ask for different directions and be careful with the sycophancy.
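The "ask different directions" habit is easy to turn into a routine. Here is a rough sketch, assuming you already have some function that sends a prompt to a model and returns text: `argue_both_sides` and the `ask` parameter are hypothetical names, and the prompt wording is only an illustration.

```python
from __future__ import annotations

from typing import Callable


def argue_both_sides(claim: str, ask: Callable[[str], str]) -> dict[str, str]:
    """Ask for the strongest case for and against a claim, in separate prompts.

    `ask` is whatever function you use to call your model; it is supplied by
    the caller, so no specific LLM API is assumed here.
    """
    prompts = {
        "for": f"Make the strongest possible case FOR this claim:\n{claim}",
        "against": f"Make the strongest possible case AGAINST this claim:\n{claim}",
    }
    # Each side gets its own independent prompt, so neither answer sees the other.
    return {side: ask(prompt) for side, prompt in prompts.items()}


# Stand-in for a real model call: it just echoes the first line of the prompt.
def fake_ask(prompt: str) -> str:
    return prompt.splitlines()[0]


result = argue_both_sides("My blog post's argument is correct.", fake_ask)
print(result["for"])
print(result["against"])
```

Reading the two answers side by side makes it harder to mistake a persuasive one-sided draft for a settled conclusion.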
We've spent years building LlamaParse into the most accurate document parser for production AI. Along the way, we learned a lot about what fast, lightweight parsing actually looks like under the hood.
Today, we're open-sourcing a lightweight core of that tech as LiteParse 🦙
It's a CLI + TS-native library for layout-aware text parsing from PDFs, Office docs, and images. Local, zero Python dependencies, and built specifically for agents and LLM pipelines. Think of it as our way of giving the community a solid starting point for document parsing:
npm i -g @llamaindex/liteparse
lit parse anything.pdf
- preserves spatial layout (columns, tables, alignment)
- built-in local OCR, or bring your own server
- screenshots for multimodal LLMs
- handles PDFs, office docs, images
Blog: https://t.co/Yuq6w77ZgR
Repo: https://t.co/UJy6KQ2bnQ
Your work tools in Claude are now available on mobile.
Explore Figma designs, create Canva slides, check Amplitude dashboards, all from your phone.
Give it a try: https://t.co/hwPB3zlk0w