Fable 5 coming back
- Claude Code v2.1.190 adds several new strings pointing toward Fable 5 returning soon.
- References to it being "purchased separately from your plan" were removed, while new weekly usage limit and included usage messages were added, suggesting a weekly allotment model.
- Fable 5 also reappeared in Amazon Bedrock.
- Sonnet 5 is in Early Access for select enterprise customers and may be serving as a stopgap while Fable 5 development continues.
INCREDIBLE RESOURCE
The MOST COMPLETE GUIDE for understanding LLMs from first principles is now available online to read for free
Covers the model mechanics
- Tokens / tokenizers
- Transformers
- Attention
- KV cache
- Prefill vs decode
- Decoding controls
- Model packages
- Chat templates
- Long context
- RAG
- Agents / tools
- Fine-tuning
- Multimodal models
Then connects that to running models locally
- What "local" really means
- Open-weight vs open-source
- Quantization
- VRAM math
- Hardware tiers
- File formats / load safety
- Runtimes / serving modes
- Model selection
- Privacy
- Failure modes
- Benchmarks
- Practical setup paths
You should read this, and if you can't right now, you definitely wanna bookmark it for later
Open-source AI FTW
You run Kernels, not models
The model is just a graph
The Inference Engine serves as a scheduler, optimizer, and executor
But the actual work? That happens in the Kernels
- MatMul Kernels
- Attention Kernels
- RMSNorm Kernels
- KV cache Kernels
- Quantized linear Kernels
- Sampling Kernels
- Fused “please don’t write this back to memory 9 times” Kernels
Same model, same GPU, same VRAM
Wildly different performance
Because one stack is using optimized fused Kernels that understand your hardware
And the other stack is playing hot potato with tensors through 47 tiny launches and pretending the GPU is the problem
Bad Kernels make people say:
“this model is slow”
Good Kernels make people say:
“wait how is this running locally?”
This is why Inference Engines and the Kernels implemented within them matter
The model is the recipe
The hardware is the kitchen
The Kernels are the knives, pans, burners, and the chef not cutting onions with a spoon
Most people benchmark models
The real ones benchmark the Kernels underneath
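To make the fused-versus-"47 tiny launches" point concrete, here is a toy sketch in plain Python. It is only an illustration under loud assumptions: real kernels are GPU code, the "launch" counter is a made-up stand-in for kernel launches, and the function names are hypothetical. Both versions compute the same RMSNorm; the unfused one writes a full intermediate list at every step, the fused one does not.

```python
import math

def rmsnorm_unfused(x, weight, eps=1e-6):
    """Four separate passes; each one materializes a full intermediate result."""
    launches = 0
    squared = [v * v for v in x]                   # pass 1: square every element
    launches += 1
    mean_sq = sum(squared) / len(x)                # pass 2: reduce to the mean square
    launches += 1
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    normed = [v * inv_rms for v in x]              # pass 3: normalize
    launches += 1
    out = [n * w for n, w in zip(normed, weight)]  # pass 4: apply the weight
    launches += 1
    return out, launches

def rmsnorm_fused(x, weight, eps=1e-6):
    """One logical pass: reduce, then normalize and weight with no stored intermediates."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    out = [v * inv_rms * w for v, w in zip(x, weight)]
    return out, 1  # counted as a single "launch" in this toy model

if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    w = [0.5, 1.0, 1.5, 2.0]
    print(rmsnorm_unfused(x, w))
    print(rmsnorm_fused(x, w))
```

Same math, same result, fewer round trips through memory. On a GPU those round trips and launch overheads are where much of the time can go, which is the gap the post is describing.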
Let me make Local AI easy for you
Give Codex CLI the article below & tell it:
- Infer the right Inference Engine from your hardware + article below
- Use uv+venv
- Pick the right kernels
- Tune flags, batching, KV cache, etc.
- Optimize for your hardware & chosen model
See? Easy
Programming is not about code, just like music is not about notation. It is the art & science of managing complexity through layers of abstraction. AI is simply a part of it.
Opus 4.5 was launched on Nov 24, 2025
Opus 4.6 was launched on Feb 5, 2026
Opus 4.7 was launched on Apr 16, 2026
Opus 4.8 was launched on May 28, 2026
Fable 5 was launched on June 9, 2026
what is Anthropic eating bro.
I just dropped a full video on vibe coding with loops.
A loop is a recursive goal you define once.
The agent works until a stop condition is met.
No more prompting and waiting and prompting again.
Right now I have loops running on Sentry errors in BridgeSpace.
I set the goal, walk away, and come back to fewer production errors than when I started.
This is the step beyond prompting.
And it is bringing us one step closer to fully autonomous software development.
Full video out now.
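For anyone who wants the loop idea in code, here is a minimal sketch: a goal defined once, a step that repeats, and a stop condition. Everything in it is an assumption for illustration. `run_loop` and `fake_agent_step` are hypothetical names, the step is a stand-in that pretends to fix one error per pass, and the `max_iterations` budget is an added safety guard, not something from the video or any particular tool.

```python
from typing import Callable, Tuple

def run_loop(goal: str,
             step: Callable[[str, int], int],
             remaining: int,
             max_iterations: int = 20) -> Tuple[int, int]:
    """Repeat `step` until no errors remain or the iteration budget runs out."""
    iterations = 0
    while remaining > 0 and iterations < max_iterations:
        remaining = step(goal, remaining)  # one agent pass toward the goal
        iterations += 1
    return remaining, iterations

def fake_agent_step(goal: str, remaining: int) -> int:
    """Hypothetical stand-in for an agent pass: pretends to fix one error."""
    return remaining - 1

if __name__ == "__main__":
    left, passes = run_loop("reduce production errors", fake_agent_step, 5)
    print(f"errors left: {left}, passes used: {passes}")
```

The stop condition here is "no errors remaining"; a real setup would swap in whatever signal it trusts, such as an error count from a monitoring tool, and keep some budget so the loop cannot run forever.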
Artifacts are now live in Claude Code.
Ask Claude to turn what it's working on into a page and send the link to your team. The page updates as the session keeps working.
Available today on Team and Enterprise plans.
New in Claude Code: Artifacts.
Interactive pages built from your session, like a PR walkthrough or a living project dashboard, shared with your team at a private link.
Available in beta on Team and Enterprise plans.
New in Claude Design: it stays on brand with your design system across projects, lets you edit directly on the canvas, syncs with Claude Code, and connects to more of the tools you already use.