This is a super exciting release - Claude Fable 5 is the same underlying model as Mythos but with added safeguards. The benchmarks are great and it's SOTA on everything by a margin but I'll add that *qualitatively* also, this is a major-version-bump-deserving step change forward (imo of the same order as Claude 4.5 was in November), peaking especially for long problem-solving sessions on very difficult problems. You can give it a lot more ambitious tasks than what you're used to, the model "gets it" and it will just go, and it's never felt this tempting to stop looking at the code at all (but don't do this in prod!). The model still has quirks that people will run into and the safeguards are configured to be a little too trigger happy for launch, which can hopefully be tuned over time.
I feel a lot of things changing as working software increasingly comes out on tap. Jevons paradox kicks in and I feel my own demand for software growing substantially. You can ask for anything - explainers, visualizers, dashboards, bespoke single-use apps (e.g. a full wandb that is hyper-specific just for your project), you can 10X your test suite, auto-optimize code, run giant research projects with custom HTML for the results, anything! "Free your mind" (Matrix ref). Really looking forward to all the things people build!
We believe AI can be a dedicated research partner to help discover the next breakthrough.
Enter Co-Scientist: our latest Gemini-based multi-agent system that can generate, debate and evolve novel hypotheses for complex scientific problems 🧵
Building apps has never been easier.
With Sites, Codex can turn your work, ideas, and plans into an interactive website or app your team can explore, use, and share with a URL.
Rolling out to Business and Enterprise plans, before expanding more broadly.
Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time.
15 AI related accounts you should follow on Twitter:
1. @karpathy
His tweets create the LLM narratives that you see on LinkedIn 2 months later.
2. @fchollet
posts thoughtful research on intelligence, benchmarks, and AI limitations. Keras creator + ARC-AGI
3. @ylecun
Yann LeCun is a deep learning pioneer & Meta Chief AI Scientist; big-picture research takes and critiques (and drama).
4. @AndrewYNg
Andrew Ng is an AI education legend; practical ML advice, courses, and real-world implementation. Creator of DeepLearning.AI
5. @rasbt
Sebastian Raschka posts on Practical ML/LLM implementations, "build from scratch" tutorials, and books.
6. @dair_ai
Weekly ML/AI paper threads and accessible research explainers (high-signal for staying current).
7. @lilianweng
Lilian Weng is ex-OpenAI, and her Lil'Log-style threads are good. In-depth LLM research breakdowns
8. @jeremyphoward
posts interesting takes on AI/crypto news, and works on democratizing practical deep learning and accessible education.
9. @simonw
Simon posts practical LLM tools, takes, experiments, prompting, and engineering breakdowns. Django co-creator
10. @_akhaliq
Curates the latest arXiv papers, model releases, and open-source AI drops.
11. @ID_AA_Carmack
AGI/low-level optimization takes that make you think about the problem.
12. @gwern
Really high-quality long-form AI research notes and essays.
13. @goodside
LLM evaluation, prompting research, and real capabilities testing
14. @drfeifei
Computer vision pioneer; human-centered AI and spatial intelligence research
15. @demishassabis
Been following his work for 9 years. Demis is my hope against Google usurping power with AI. Demis is Google DeepMind's CEO
Let me know who I missed, guys, and save this for later
Imagine every pixel on your screen, streamed live directly from a model. No HTML, no layout engine, no code. Just exactly what you want to see.
@eddiejiao_obj, @drewocarr and I built a prototype to see how this could actually work, and set out to make it real. We're calling it Flipbook. (1/5)
Andrej Karpathy: "90% of your AI coding bill is paying for context you didn't need to send"
Here are 10 things senior AI engineers stopped wasting tokens on:
1. Auto-context loading 50 files for a 30-line fix: $1.20/turn for tokens you'll never read. 80% input waste, every session
2. Running Opus on lint, format, and rename tasks: $0.60 for what Haiku nails at $0.02. 30x overpay on the cleanup tier
3. Tool call loops that re-send the full repo on every retry: 5x context cost per agentic flow. fixing these alone cuts 30-50% of bills
4. Sonnet as the default model: Kimi 2.6 matches its quality on most coding tasks at 1/6 the cost. defaulting to Sonnet in 2026 is leaving 60-70% on the table
5. Streaming responses on stable-prefix workflows: kills your prompt cache. you pay 10x for tokens that should have cost cents
6. "Just in case" file includes: 80,000-token prompts that should be 3,000. context bloat is the silent budget killer
7. Per-session knowledge rebuilding: 10 min writing a SKILL.md once vs paying agents to re-figure out your environment every run. $4 vs $0.30 per execution
8. Single-model setups: premium tier on every task is the most expensive mistake in AI coding right now
9. Asking 10 small questions one at a time: 10 separate input prefix charges vs one batched call. 70-90% savings on routine workflows
10. Buying Claude Pro + ChatGPT Plus + Cursor Pro: you seriously use one. the other two are habit, not utility
what actually compounds instead:
- context discipline (grep before fetching, always)
- prompt caching on every stable prefix
- multi-model routing (Kimi 2.6 default, Opus for the 10%)
- graduated skills via SKILL.md files
- profiling tool calls before optimizing prompts
- the routing mindset (right model for right task)
in 12 months, the gap between developers shipping on $200/month and $4,000/month budgets won't be skill
it'll be how well they route
study this.
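A minimal sketch of the routing idea above, in Python. Everything named here is a placeholder made up for illustration: the tier names, the task labels, and the per-token prices are assumptions, not real models or real price sheets. It only shows the shape of "right model for right task" plus the input-cost arithmetic behind context bloat.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    """A model tier with a HYPOTHETICAL input price (USD per million tokens)."""
    name: str
    usd_per_million_input_tokens: float


# Placeholder tiers: names and prices are invented for this sketch.
CHEAP = Tier("cheap-model", 0.50)
DEFAULT = Tier("default-model", 3.00)
PREMIUM = Tier("premium-model", 15.00)

# Hypothetical task labels; a real setup would define its own.
CLEANUP_TASKS = {"lint", "format", "rename"}
HARD_TASKS = {"architecture", "hard-debugging"}


def route(task_kind: str) -> Tier:
    """Pick a tier from the task label: cheap for cleanup, premium for hard work."""
    if task_kind in CLEANUP_TASKS:
        return CHEAP
    if task_kind in HARD_TASKS:
        return PREMIUM
    return DEFAULT


def input_cost(tier: Tier, input_tokens: int) -> float:
    """Input-side cost in USD for sending `input_tokens` to `tier`."""
    return tier.usd_per_million_input_tokens * input_tokens / 1_000_000


if __name__ == "__main__":
    # Compare a bloated prompt with a trimmed one on the same tier.
    for tokens in (80_000, 3_000):
        print(f"{tokens} tokens on {PREMIUM.name}: ${input_cost(PREMIUM, tokens):.3f}")
    # Compare tiers for a cleanup task with the same prompt size.
    print(route("lint").name, "vs", PREMIUM.name)
```

Under these assumed prices, trimming a prompt from 80,000 to 3,000 tokens cuts the input cost by the same ratio regardless of tier, and routing cleanup work to the cheap tier multiplies that saving. Real numbers depend on the actual provider pricing and on how much of the prefix is served from cache.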
New @openclaw beta is up! Mostly security hardening and fixes today.
There's also an early version of an Apple Watch app in there if you wanna go digging in the repo... https://t.co/hc8Qu6QC5h
I'm participating in Bike MS: ACC Texas MS 150 2025 to raise money for the National Multiple Sclerosis Society. The Society is the best investment to solve the challenges of MS and, ultimately, finding a cure. Cheer me on — donate today! https://t.co/hFLbNLQIrK via @DonorDrive
Apple TV 4K 2022 (Left) vs Apple TV 4K 2021 (Right)
The new Apple TV 4K is fanless for the first time ever w/ the A15 Bionic and 20% smaller in volume. Plus, no 'tv' on the top logo anymore. 👀
Full review drops today!
So...I don't know how many games it took me to finally see a no-hitter completed in person. Kind of fun.
No pressure on Verlander (0-6, 6.07 World Series resume) to pitch well in game 5. No pressure at all...
But a heck of a lot less than if the Astros were down 3-1.
Even the previews are a bit traumatizing. But that’s how it should be. The sheer insanity of it all is highlighted by the lack of a point of view or message.
At this German WWI cemetery, 44,000 German soldiers are buried. There is a mass grave containing nearly 25,000 bodies, about 7,000 of which are unknown. It’s located in a small village in Belgium near where poison gas was first deployed as a modern weapon of mass destruction.