New Fable 5 beats Opus 4.8 on real-world physics simulations
We gave both models the same three prompts and asked them to build self-contained HTML5 sims with real physics and no libraries:
1. Chaotic double pendulum
2. Galton board
3. Water in a spinning drum (WCSPH)
Generation cost
Fable 5: $3.35 on 68.7k tokens, time 14m 47s
Opus 4.8: $0.93 on 38.9k tokens, time 8m 10s
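To make the two cost lines easier to compare, here is a minimal sketch that normalizes them to dollars per 1k tokens and tokens per second. The figures are copied from the lines above. The derived metrics are our own arithmetic, and the sketch assumes the token counts and wall-clock times cover the same work for both models, which the post does not confirm.

```python
# Figures copied from the "Generation cost" lines; names below are illustrative.
runs = {
    "Fable 5": {"cost_usd": 3.35, "tokens": 68_700, "seconds": 14 * 60 + 47},
    "Opus 4.8": {"cost_usd": 0.93, "tokens": 38_900, "seconds": 8 * 60 + 10},
}


def per_run_metrics(run):
    """Normalize one run to cost per 1k tokens and throughput."""
    return {
        "usd_per_1k_tokens": run["cost_usd"] / run["tokens"] * 1000,
        "tokens_per_second": run["tokens"] / run["seconds"],
    }


metrics = {name: per_run_metrics(run) for name, run in runs.items()}

for name, m in metrics.items():
    print(
        f"{name}: ${m['usd_per_1k_tokens']:.3f} per 1k tokens, "
        f"{m['tokens_per_second']:.1f} tokens/s"
    )

# Ratio of total spend and total time, Fable 5 relative to Opus 4.8.
cost_ratio = runs["Fable 5"]["cost_usd"] / runs["Opus 4.8"]["cost_usd"]
time_ratio = runs["Fable 5"]["seconds"] / runs["Opus 4.8"]["seconds"]
print(f"cost ratio: {cost_ratio:.1f}x, time ratio: {time_ratio:.1f}x")
```

On these numbers Fable 5 cost roughly 3.6x as much and took roughly 1.8x as long, so any quality win has to be weighed against that.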
Fable clearly did better on the water simulation, producing a much more solid and continuous body of water. Opus left larger gaps near the walls, scattered particles around the scene, and struggled to keep the fluid stable.
Andrej Karpathy spent 2h showing how he actually uses AI day to day
he's a co-founder of OpenAI and led AI at Tesla, so when he shows how he works, it’s worth watching
and the whole session is just him telling the machine what he wants in simple terms, like he's briefing a coworker
watch what's actually happening the entire time:
> he describes the task in normal words
> it goes off and does the work
> he glances at the result and nudges it with one more sentence
that's the whole skill, and you've had it since you learned to talk
the only gap between that and a worker that runs on its own is handing that sentence a schedule and the tools to act
check his work, then build the version that keeps working when you stop
There has not previously been a treatment for pancreatic cancer this successful. Strikingly improved survival results (more than doubled) @NEJM and @ASCO today with daraxonrasib, which also became available via an FDA-approved early access program and began shipping to physicians this week @RevMedicines
https://t.co/e04jqJMPw0
The ramp up of cancer immunotherapy is remarkable. Now we're seeing vaccines achieve some cures or remissions in the most refractory cancers: pancreatic, melanoma, glioblastoma, renal, triple-negative breast cancer.
Check out the new Ground Truths (link in profile)
Creator and head of Claude Code:
"100% of my code is written by Claude Code. I have not edited a single line by hand since November. Every day I ship 10, 20, 30 PRs… I have five agents running while we’re recording this."
Google’s TurboQuant is cool. It cuts key-value cache memory and speeds things up, with no accuracy loss according to Google.
For agentic AI workflows, this could be big. Agents need lots of context for long conversations, plans, and tool use. TurboQuant makes that cheaper and faster.
RAG is good for pulling in fresh facts, but better context handling might matter more for real step-by-step agent work.
Introducing TurboQuant: Our new compression algorithm that reduces LLM key-value cache memory by at least 6x and delivers up to 8x speedup, all with zero accuracy loss, redefining AI efficiency. Read the blog to learn how it achieves these results: https://t.co/CDSQ8HpZoc
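To get a feel for what a 6x key-value cache reduction means for long contexts, here is a rough back-of-the-envelope sketch. It uses the standard KV cache size estimate (keys plus values, per layer, per token) and is not TurboQuant itself. The model shape below (32 layers, 8 KV heads, head dimension 128, 16-bit values, 128k-token context) is a hypothetical example, not a figure from the announcement.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bytes_per_value=2.0):
    """Estimate KV cache size for one sequence.

    The factor of 2 accounts for storing one key and one value
    vector per head, per layer, per token.
    """
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_value


# Hypothetical model shape, chosen only for illustration.
baseline = kv_cache_bytes(layers=32, kv_heads=8, head_dim=128, seq_len=128_000)

# Apply the "at least 6x" memory reduction quoted in the announcement.
compressed = baseline / 6

GIB = 2**30
print(f"baseline:   {baseline / GIB:.2f} GiB")
print(f"compressed: {compressed / GIB:.2f} GiB")
```

Under these assumed dimensions, a cache that would otherwise take well over 10 GiB per long sequence drops to a few GiB, which is the kind of saving that matters for agents holding long histories.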