- Claude Fable 5:
What Developers Are Actually Saying (July 2026)
After reviewing benchmarks, expert evaluations, community reports, and post-restoration feedback, a clear picture is emerging:
- The Good:
Fable 5 is widely viewed as the most capable coding and agentic model currently available for complex, long-horizon work.
Developers consistently praise it for:
• Autonomous execution on multi-hour or multi-day tasks
• Large-scale refactoring and migrations
• Strong planning, self-verification, and debugging workflows
• Better handling of ambiguous, high-level objectives
• Exceptional performance in design, UI, 3D, simulation, and creative engineering tasks
Many describe it less as a coding assistant and more as a technical partner that can own entire projects.
- Notable reactions:
• Andrej Karpathy called it a “major-version-bump” level improvement, especially on difficult long-form problem solving.
• Boris Cherny described it as the biggest leap since Opus 4.5 and claimed it is writing virtually all of his code.
• Simon Willison called it “a beast” that is expensive and slow, but capable of powering through extremely challenging tasks.
- The Numbers:
Fable 5 significantly outperforms Opus 4.8 on difficult benchmarks:
• SWE-Bench Pro: 80.3% vs 69.2%
• FrontierCode Diamond: 29.3% vs 13.4%
• SWE-Bench Verified: ~95% vs ~88.6%
The gap grows as tasks become longer, more agentic, and more complex.
- The Frustrations:
The biggest complaints are remarkably consistent:
• Massive token consumption (often 2–3× or more compared with Opus)
• Slow execution speed
• Expensive usage at scale
• Aggressive safety routing and unexpected fallbacks to Opus 4.8
• Sometimes overkill for routine engineering work
Many users report burning through quotas far faster than expected.
- How Teams Are Using It:
The emerging best practice is hybrid routing:
• Fable 5: planning, architecture, frontier problems, deep reviews, creative work, autonomous execution
• Opus 4.8: implementation, iteration, routine coding, cost-sensitive workloads, and sensitive domains
A common workflow is: Fable for strategy and review, Opus for execution.
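A rough sketch of what that kind of hybrid routing could look like in code. The task categories come from the split above; the model labels ("fable-5", "opus-4.8"), the category names, and the default-to-cheaper rule are placeholders I made up for illustration, not real API identifiers or anyone's actual setup.

```python
# Hypothetical task categories, taken from the Fable/Opus split described above.
FRONTIER_TASKS = {
    "planning", "architecture", "frontier_problem",
    "deep_review", "creative", "autonomous_execution",
}
ROUTINE_TASKS = {
    "implementation", "iteration", "routine_coding",
    "cost_sensitive", "sensitive_domain",
}

def route(task_type: str) -> str:
    """Return a placeholder model label for a given task category."""
    if task_type in FRONTIER_TASKS:
        return "fable-5"   # placeholder label, not a real model ID
    if task_type in ROUTINE_TASKS:
        return "opus-4.8"  # placeholder label, not a real model ID
    # Assumption: unknown work defaults to the cheaper model.
    return "opus-4.8"

plan_model = route("planning")
build_model = route("implementation")
unknown_model = route("something_else")
```

Defaulting unknown work to the cheaper model reflects the cost complaints above; a team that cares more about quality than spend might flip that default.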
- Bottom Line:
Fable 5 is not a universal replacement for existing models.
It is a meaningful capability jump for developers working on difficult, high-value problems where autonomy, reasoning depth, and long-horizon execution matter.
For everyday coding, the cost and token burn often outweigh the benefits.
For frontier work, many developers describe it as an unfair advantage.
I was wondering how they can technically compare posts against such a large volume of daily posts here.
Idk exactly, but my simple guess after looking into some research is that they rely on embedding-based similarity checks with vector search.
Brute-force comparison is impossible at this scale. Instead they generate dense embeddings for each post’s content and use fast approximate nearest-neighbor search to find near-duplicates efficiently.
Text gets normalized, then turned into vectors with models like Sentence-BERT, so semantic similarity (including rewording or minor changes) can be measured.
Images are resized and fed into models like CLIP to create visual embeddings. Video uses multi-level representations (clip-level plus frame-level) along with spatio-temporal matching.
These embeddings go into a vector database, where a new post’s vectors are quickly compared against existing ones and only pairs above a high cosine-similarity threshold count as matches.
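To make the threshold idea concrete, here is a toy sketch in plain Python. It scans every stored vector, which is exactly the brute-force approach that would not scale; a real system would swap the loop for an approximate nearest-neighbor index. The vectors, post IDs, and the 0.9 threshold are all made up for illustration.

```python
import math

def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]

def cosine(a, b):
    """Cosine similarity of two vectors that are already unit length."""
    return sum(x * y for x, y in zip(a, b))

def find_candidates(query, index, threshold=0.9):
    """Return (post_id, score) pairs at or above the threshold, best first.

    This is a linear scan standing in for an ANN lookup.
    """
    q = normalize(query)
    hits = [(pid, cosine(q, vec)) for pid, vec in index.items()]
    hits = [h for h in hits if h[1] >= threshold]
    return sorted(hits, key=lambda h: h[1], reverse=True)

# Toy 3-dimensional "embeddings"; real ones have hundreds of dimensions.
index = {
    "post_a": normalize([1.0, 0.0, 0.0]),
    "post_b": normalize([0.9, 0.1, 0.0]),
    "post_c": normalize([0.0, 1.0, 0.0]),
}
matches = find_candidates([1.0, 0.05, 0.0], index)
match_ids = [pid for pid, _ in matches]
```

Here post_a and post_b clear the threshold and post_c does not, so only the first two would move on to further verification.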
Only the small set of strong candidate matches gets further verification. Matches above the threshold are linked into graphs (post similarity, text similarity, and account similarity) to detect patterns like high-volume reposting.
The earliest timestamp among matching embeddings identifies the original.
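A minimal sketch of the grouping step, assuming matched pairs have already been found. It uses union-find to link matching posts into clusters, then picks the earliest timestamp in each cluster as the original. The post IDs, timestamps, and the function name are hypothetical, and this is just one simple way to build such a graph.

```python
from collections import defaultdict

def cluster_originals(timestamps, matched_pairs):
    """Map every post ID to the earliest post in its similarity cluster."""
    parent = {p: p for p in timestamps}

    def find(p):
        # Follow parent links to the cluster root, compressing the path.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    # Each matched pair is an edge; merge the two clusters it connects.
    for a, b in matched_pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    clusters = defaultdict(list)
    for p in timestamps:
        clusters[find(p)].append(p)

    result = {}
    for members in clusters.values():
        original = min(members, key=lambda p: timestamps[p])
        for p in members:
            result[p] = original
    return result

# Toy data: p1, p2, p3 are near-duplicates of each other; p4 is unrelated.
timestamps = {"p1": 100, "p2": 250, "p3": 90, "p4": 400}
pairs = [("p1", "p2"), ("p2", "p3")]
originals = cluster_originals(timestamps, pairs)
```

In this toy data p3 has the earliest timestamp in its cluster, so p1 and p2 would be treated as reposts of p3, while p4 is its own original.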
This whole flow lets them spot when accounts are mostly reposting existing content with minor modifications, which supports experiments like removing certain top accounts from feeds and adjusting payouts.
That’s my best guess on how the core similarity checks actually work at their scale.
@paulg @mvanhorn All 30 accounts post content that was posted by other users before them, usually verbatim or with minor modifications, and at a very high volume.
Since running the experiment, their payouts have been reduced by up to 99%.
I was watching the stream when you said #tjr was about to get rugged soon — and it actually rugged just a few minutes later. I immediately checked gmgn to look at the parameters.
That %phishing score seems like a really solid rug predictor. It was already at ~80% right before it happened.
Does your coding agent keep getting lost in large codebases?
This is probably the best codebase indexing and harness tool available for Claude Code right now.
The `setup-codebase-harness` skill from AI Builder Club turns any repo into a properly indexed, agent-ready environment. It sets up architecture docs, shared knowledge structures, domain references, and verification workflows so your agents actually understand the full context instead of hallucinating or missing important files.
Stop fighting context loss. Start building agents that can reliably work across big codebases.
https://t.co/v2uObH0z5g
Okay,
why should home computers connect to each other to form this massive network?
What’s the real incentive for ordinary people to participate from their homes?
And who exactly is going to pay them for contributing their computing power?
In my view, any truly decentralized system like this needs a blockchain-based model. Participants (the “miners”) should earn rewards or fees for their contributions.
Incentives are the core of making it work.
@elonmusk @Kyrannio Your partnership with Cursor is a brilliant move.
The future of AI is inextricably linked with AI-native coding tools, and this collaboration positions you perfectly at the center of that revolution.
As a total LP newbie (< 1 week), I’ve been messing with the LP Army Playground + my AI assistant… and honestly, my first attempt wasn’t half bad 😎
@met_lparmy
In which countries is Fugu actually usable right now?
It feels like this model is running into the same regulatory walls as Claude. Export controls and compliance restrictions are blocking access for a lot of users.
Introducing Sakana Fugu: A full multi-agent orchestration system accessible via a single model API.
Our ‘Fugu Ultra’ model matches the performance of Fable and Mythos, delivering frontier capability without the risk of export controls.
Try it: https://t.co/hhO6qTawgb 🐡
In agent loops, one thing keeps standing out:
the critical importance of crystal-clear goals.
Want a reliable verifier in the loop?
You need strict, unambiguous success criteria.
And to define success properly, you must first be brutally clear about your system’s objectives!
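Here is a small sketch of what "strict, unambiguous success criteria" can mean in practice: each criterion is a named boolean check, and the loop stops only when all of them pass. The `Criterion` class, the toy state, and the `toy_step` function standing in for the agent are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Criterion:
    name: str
    check: Callable[[dict], bool]  # returns True when the criterion is met

def verify(state, criteria):
    """Return the names of all criteria the current state fails."""
    return [c.name for c in criteria if not c.check(state)]

def run_loop(step, state, criteria, max_iters=10):
    """Call step() until every criterion passes or the budget runs out."""
    for _ in range(max_iters):
        failed = verify(state, criteria)
        if not failed:
            break
        state = step(state, failed)
    return state, verify(state, criteria)

def toy_step(state, failed):
    """Stand-in for an agent action: fixes one thing per failed criterion."""
    new = dict(state)
    if "tests_pass" in failed:
        new["tests_passing"] += 1
    if "lint_clean" in failed:
        new["lint_errors"] = 0
    return new

criteria = [
    Criterion("tests_pass", lambda s: s["tests_passing"] >= 3),
    Criterion("lint_clean", lambda s: s["lint_errors"] == 0),
]
final_state, remaining = run_loop(
    toy_step, {"tests_passing": 0, "lint_errors": 2}, criteria
)
```

Because the verifier reports which named criteria failed, the step function gets concrete feedback instead of a vague "not done yet", which is the point the post is making about clear objectives.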
This is a brilliant, forward-thinking move by Elon.
AI coding has long needed massive compute power like SpaceX’s AI superclusters, and now it has it.
Excited to see the future of software engineering become even more powerful and glorious, while devs still get to make money😂
🤯 GLM-5.2 is here — built for long-horizon coding and agentic tasks, now with a solid 1M-token context.
The strongest open-source coding model yet!
Available now on Ollama's cloud, hosted in the US on the latest @NVIDIAAI Blackwell datacenter GPUs. Privacy policy and zero data retention apply, as always.
Try it 👇
Claude Code:
ollama launch claude --model glm-5.2:cloud
Codex App:
ollama launch codex-app --model glm-5.2:cloud
Hermes Agent:
ollama launch hermes --model glm-5.2:cloud
Chat:
ollama run glm-5.2:cloud
More integrations and information on the model page 🧵