No, America. Your best and brightest are no longer at Harvard, Yale, Stanford and the like.
Your best and brightest are kids like my tenth graders coming up through mission-aligned classical schools with teachers who know American kids in particular hunger for that which is True, Good, and Beautiful and are willing to GRIND for it, as Americans do.
The kids are here, in every town and city. We can all help build them.
This is what my tenth graders read this year:
The Symposium
The Apology
The Phaedo
The Death of Ivan Ilych
1984
Brave New World
The New Organon
The New Atlantis
Gulliver’s Travels
The Abolition of Man
Beowulf
The Canterbury Tales
Purgatorio
Inferno
King Lear
Pride and Prejudice
The Rime of the Ancient Mariner
@O_TooleKathleen@CLT_Exam@JeremyTate41@soren_schwab@Jordan_C_Adams
@educatedandfree This is roughly my reading list in high school 10th grade as well, but we never 'finished' the list. Unfortunate to see the bending of curriculum rather than pushing for kids to be better.
Google's web browsing AI tell's me it is similar to Gemini 1.5 Pro, which compares to Sonnet 3.5.. Then drops "I sit below Fable 5". Yea... duh. Way below. And I sit below @karpathy. #technicallytrue
Me, a Windows os user, trying to work on a mac for the last 2 months, running into bugs that slow down (or freeze) my system, or crash my mac entirely (just happened) and being met with "Oh, good thing I never heard of bugs on Windows". It should. Not be. Breaking. On. Basics.
This is awesome! But I just thought about all my skills and preferences dying. Fable 5 (seemingly) wont need the template, the layout, the small notes that lived through opus 4.5-4.8, evolving with every release to handle the new quirks. I am sure skills matter, but how detailed?
this is my personal singularity moment
this post may sound like a paid ad. I only wish. I'm concerned, more so than happy. the world is changing, and, among the scenarios where AI goes terribly wrong, inequality is the most realistic, yet, the one Anthropic seems to be the least concerned about. I'm glad OpenAI is taking the opposite stance: *personal AGI for everyone*. I think this is a commendable position in the times we live. but who am I in the queue of the bread?
anyway, Fable is here, so I'll just report my first-hour experience
first of all, all my pet prompts are solved.
→ λ-calculus puzzles
→ bug questions
→ one-shot apps
all are trivial to it.
I don't have anything harder other than my
ongoing work
so, in the last several days, I've been toying with HVM5, a new interaction net evaluator with a faster loop.
after writing the first version, I left 32 GPT-5 agents working for ~20 hours each. this resulted in up to 2x speedups, but the file size increased by 2-fold and quality decreased significantly.
I then simplified the whole thing into an even simpler core, and left Opus 4.8 and GPT 5.5 optimizing it for 8 hours. Opus got a legit 6% - 34% speedup in most benches. GPT got better results, but, sadly, an unusable file.
I then asked Fable to optimize it.
2 hours later, it landed a 1770% speedup in one case, 100%+ in other 4, and 22% in average. yes, in 2 hours it outperformed me, opus 4.8 and a swarm of gpt 5.5 agents, by one order of magnitude.
that could not possibly be legit. "it must be hardcoding the benchmarks" (GPT trauma). so I read its explanation and what it did was, indeed, the most high impact optimization one could try first. seems like HVM5 was wasting a lot of time garbage-collecting unused branches of pattern-match nodes. I had optimized that for static mats, but not for dynamic mats. skill issue. Fable figured how to do it for these, resulting in a massive speedup in some benches
but wait, is that *correct*? I'm not sure yet, it is credible, but this is the kind of thing that is very easy to get wrong on interaction nets. the problem is, when I was ready to start auditing Fable's solution so I could tell whether it was buggy or legit, it interrupted me to tell me it had found a massive bug on the code *I* had written.
... wait, what?
so... for garbage collection purposes, I stored a bit on lambda term pointers that meant "the variable bound by this lambda has been freed, so, its lambda must free whatever argument it is applied to". that's fine. yet, on duplicator nodes, I also used the same bit to mean "one of the duplicated variables was freed, so, treat this dup as a passthrough no-op". so, if a lambda entered a duplicator, it would mistake the lambda's collection bit for its own, resulting in corrupted interaction!
that's a mouthful, why I'm writing this?
just so you can appreciate the sheer absurdity of what just happened. I didn't ask it to find bugs. I asked it for an optimization. and even if I did ask it to find bugs, this bug is so astonishingly subtle and specific, identifying it takes mastering the domain to an extent that it beyond even me. I'd easily need hours or days to fix it, *if* I ever came across it. chances are it would just go unnoticed. and Fable found it and fixed it like it was nothing, while it was busy adding a 17x speedup to a file that neither I, nor Opus 4.8, nor a fleet of GPT 5.5 managed to barely make 2x faster.
oh and there is also another tab where it is also ripping through Bend's codebase and finishing everything I had to do
I don't know what to say anymore
this isn't about Anthropic or OpenAI, this is about our collective future as a species. the world is changing, and we need to be aware of it, and discuss how to handle this change.
receipt below . . .
As much as I want to remain semi-professional here about Fable 5, and what this means for software, there is a single gif that I can think of that is explaining this moment.
This is a super exciting release - Claude Fable 5 is the same underlying model as Mythos but with added safeguards. The benchmarks are great and it's SOTA on everything by a margin but I'll add that *qualitatively* also, this is a major-version-bump-deserving step change forward (imo of the same order as Claude 4.5 was in November), peaking especially for long problem-solving sessions on very difficult problems. You can give it a lot more ambitious tasks than what you're used to, the model "gets it" and it will just go, and it's never felt this tempting to stop looking at the code at all (but don't do this in prod!). The model still has quirks that people will run into and the safeguards are configured to be a little too trigger happy for launch, which can hopefully be tuned over time.
I feel a lot of things changing as working software increasingly comes out on a tap. The Jevon's paradox kicks in and I feel my own demand for software growing substantially. You can ask for anything - explainers, visualizers, dashboards, bespoke single-use apps (e.g. a full wandb that is hyper-specific just for your project), you can 10X your test suite, auto-optimize code, run giant research projects with custom HTML for the results, anything! "Free your mind" (Matrix ref). Really looking forward to all the things people build!
This is genuinely amazing if it works out the way they say it does. I've been bitten by hype from a reputable source before, but if this is another confirmable step for research ideation, I am all for it.
We've made a breakthrough in self-evolving AI scientists moving from "search" to "principled discovery": Scientific discovery requires that the search space itself changes, and an AI scientist must perceive this shift without intervention. We built an AI that achieves this for the first time with the ability to discover the scientific vocabulary it reasons in. Evidence, tools, artifacts, verifiers, failures & claims become typed provenance. We show three distinct modalities: 1) retrieval, adding known objects; 2) search, exploring a fixed schema; and critically: 3) discovery, a verified regime transition.
We solve the open-endedness evaluation problem by lifting agentic workflows into a typed copresheaf and proving, via a Kan obstruction, that true discovery is not unbounded generation but a verifiable schema expansion: old evidence is transported by Left Kan extension, and genuine novelty is mathematically quantified by the pointwise residual beyond the transported image - separating discovery from mere search and making novelty objective and measurable rather than a subjective judgment or benchmark delta.
Our AI scientist is built in a way that does not pre-conceive the approach it chooses; instead, we endow the system with formal power to adapt, evolve, and reason from first principles. Case studies include:
1⃣Builder/Breaker model that discovers mode-conditioned compliance in proteins;
2⃣CategoryScienceClaw that finds anisotropic fiber-network stiffness rules.
Great work in collaboration with my graduate student @fwang108_@MITdeptofBE
F.Y. Wang & M.J. Buehler, Self-Revising Discovery Systems for Science: A Categorical Framework for Agentic Artificial Intelligence, arXiv:2606.01444, 2026
Anthropic moving fast as ever. Agent building was a request of many businesses. I've built a few, added evals, feedback, customized controls.
Now we have a new tool to do most of that for us. I am excited to test it out and see how much faster it can improve our workflows.
Introducing Claude Managed Agents: everything you need to build and deploy agents at scale.
It pairs an agent harness tuned for performance with production infrastructure, so you can go from prototype to launch in days.
Now in public beta on the Claude Platform.
Before limited-releasing Claude Mythos Preview, we investigated its internal mechanisms with interpretability techniques. We found it exhibited notably sophisticated (and often unspoken) strategic thinking and situational awareness, at times in service of unwanted actions. (1/14)
Developers have to create to our scheme. This includes tools the LLM can use. The LLM check works on catching bad actors on code. The rest of this is teacher verification.
That is my answer. We need more HITL for high important topics like content that will reach children.
Another gauntlet project done. It is an interesting one. We had to read a case study, and think about scalability and security of a chat app for students k-12. I built out a verification system and will talk about it in the comments as a thread.
Verification layers: That is an LLM check, an admin, a community of teachers, and finally a single teacher. We can get the application to students and setup up several checkpoints across the way.
🚨 Someone just built a fully open-source mocap system that works with any camera.
It's called FreeMoCap, a markerless 3D tracking system that runs on ordinary webcams. It turns multiple camera feeds into research-grade skeletal data automatically.
100% Open Source.