1 Agent vs Agent Team. It seems like we can achieve frontier or even above-frontier performance by using AI agent teams with different models. I ran a few experiments and the results are interesting.
@theo This reminds me of those annoying kids who hold a grudge for ages and wait for the perfect time to get back at you with some obviously stupid and biased move that everyone can clearly see was premeditated out of spite.
The US government, citing national security authorities, has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, including foreign national Anthropic employees.
The net effect of this order is that we must abruptly disable Fable 5 and Mythos 5 for all our customers to ensure compliance.
Access to all other Claude models is not affected.
We apologize for this disruption to our customers. We believe this is a misunderstanding and are working to restore access as soon as possible.
Read our full statement: https://t.co/bwn0sximKZ
Fable 5 on $20/month. Just ask Claude Code to install the codex plugin and instruct it to edit CLAUDE.md so the coding work gets directed to codex, and Fable only does planning and reviewing until the goal is reached. You can make the most of Fable 5 on even the cheapest plan.
Claude Fable 5 is out, and it's a new breakthrough in what AI can achieve across all domains. However, there are concerns about how Anthropic is choosing to roll it out, and boy is it expensive.
Early testing confirms Claude Fable 5 is a completely new model, built from the ground up with a new architecture. I am no longer absolutely right, I am now completely right.
I think it’s possible. If we say 4.7 was at least 1x worse than 4.6 and 4.6 1.5x worse than 4.5, then 4.8 would be what, about 1.6x better than 4.5? Sounds about right. Then again, compared to OpenAI models the numbers here wouldn’t make sense, but who knows, with 40hr runs it could be noise from context compaction performance.
@theo If this bench runs over 40hrs, I guess the quality of the context compaction mechanism would play a huge role, no? To the point where it could make reasoning quality more diffuse compared to a single window with no compaction.
@amiklas Taste is subjective though, isn't it? So for a training set you'll invariably have to assign it a direction. I guess we could end up with people gravitating towards different models depending on which one better aligns with their own personal taste.
@theo Nintendo has been going strong this year, Pokopia was amazing if you’re into that type of stuff. Also really enjoyed Pragmata and the new 007 is looking nice too. Pretty good year in gaming so far imo, which definitely hasn’t happened much recently.
I’d say this is changing a bit though. With Codex for example, context compaction has become really efficient. If you give the model somewhere to keep track of the key data points/progress, I’ve found that since around 5.4 it can get through 2-3 context compactions or more without going dumb. This is why I asked for your thoughts initially. Wouldn’t you agree?
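For anyone wondering what "somewhere to keep track of the key data points/progress" could look like, here's a rough sketch. It assumes a plain JSON file on disk that the agent rereads after each compaction. The `ProgressNotes` class, its method names and the file layout are all made up for illustration, not how Codex or any other tool does it.

```python
import json
from pathlib import Path


class ProgressNotes:
    """Hypothetical scratchpad that lives outside the context window."""

    def __init__(self, path):
        self.path = Path(path)
        if self.path.exists():
            # Pick up where a previous run (or pre-compaction context) left off.
            self.notes = json.loads(self.path.read_text(encoding="utf-8"))
        else:
            self.notes = {"goal": "", "done": [], "facts": {}}

    def _save(self):
        # Write after every change so nothing depends on the live context.
        self.path.write_text(json.dumps(self.notes, indent=2), encoding="utf-8")

    def set_goal(self, goal):
        self.notes["goal"] = goal
        self._save()

    def mark_done(self, step):
        if step not in self.notes["done"]:
            self.notes["done"].append(step)
            self._save()

    def remember(self, key, value):
        self.notes["facts"][key] = value
        self._save()

    def summary(self):
        # Short text to re-inject into the prompt after a compaction.
        lines = [f"Goal: {self.notes['goal']}"]
        lines += [f"Done: {step}" for step in self.notes["done"]]
        lines += [f"Fact: {k} = {v}" for k, v in self.notes["facts"].items()]
        return "\n".join(lines)
```

The idea is that the model only needs `summary()` back in its prompt after a compaction, so the key facts survive even if the compacted context drops them.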