Finally got round to starting a blog. If you're interested in AI and software engineering, I hope you enjoy it: https://t.co/YYgD8Yyjh9. And feel free to subscribe to my once-a-month-max newsletter there too!
Yeah, not a good launch for Opus 4.8 at all. Been seeing loads of this: spamming a huge number of tool calls. And intermittent auto mode failures on top of this.
I've seen this across multiple sessions, even brand new ones (asking it to translate a bunch of markdown files with parallel subagents). Opus 4.7 gets it in one, and then Opus 4.8 implodes trying to OCR markdown files (!?) with a bash tool storm. cc @trq212 @bcherny
Is it just me or is Opus 4.8 in CC sometimes just absolutely retarded?
In this session it just got stuck in a loop calling "echo" and checking the date 20 times in a row... This has been happening very regularly since the 4.7 --> 4.8 update. WTF?
@claudeai @bcherny
Gemini 3.2 Flash - Capitalizing on DeepMind's clever distillation techniques...
Rumors are that benchmarks show it's hitting 92% of GPT 5.5's performance on coding and reasoning tasks while being 15-20x cheaper on inference costs. The latency improvements are insane - sub-200ms for most queries.
Google's distillation + sparsity techniques are paying off massively. They've essentially compressed a frontier model into a flash variant without the usual quality cliff.
This is what I cannot understand about Anthropic's pricing changes. Many people love Conductor, but if you are a heavy user of Claude Code via Conductor you're going to be running up $1k/month in additional costs.
And given Conductor makes it very easy to switch to Codex (one button), it's just a huge churn incentive AND marketing exercise for OpenAI.
The only thing that makes sense (to me) is that Anthropic is still v compute constrained and the SpaceX partnership only buys them some time...
Here's what Anthropic pricing updates mean for Conductor users:
- You can officially use your Claude sub with Conductor
- If you're on a Max subscription you get $200 in credits and can then pay at API costs
- If you use Big Terminal Mode you won't be affected
We're going to keep building the best interface for the best coding agents! Excited to show you what we've been cooking🫡
We’ve agreed to a partnership with @SpaceX that will substantially increase our compute capacity.
This, along with our other recent compute deals, means that we’ve been able to increase our usage limits for Claude Code and the Claude API.
I waver between thinking the AI security problem is huge but manageable and thinking it's huge and unmanageable. This piece by @martinald makes a very convincing case for the latter https://t.co/b6R0tBF7My
@MikeIsaac Not sure about that. Many (most?) companies are buying direct from Anthropic for at least Claude Code (other APIs yes for sure).
I don't think anyone on AWS would switch to Azure just to get OAI access - you'd just onboard OpenAI as a new vendor?
@simonw Is Claude Code/Cowork really usable on a Pro plan these days though (at least at peak times w/ Opus)? Everyone I've recommended it to on Pro has gone through their usage limits in a few minutes.
But yes the tweets sound quite ominous in general...
Resuming ~400k tokens on Max 20 uses ~4% of your 5h limit at peak times. That works out to ~10M tokens per 5h, so the 5h usage limits are probably measured in uncached input tokens:
Pro (1x): 500k input tokens / 5h
Max 5 (5x): 2.5M input tokens / 5h
Max 20 (20x): 10M input tokens / 5h
I assume these (double?) at off peak times.
Obviously you'll have output and cache reads on top of this, but it feels like the uncached input tokens are really burning thru people's limits recently.
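For anyone who wants to check the arithmetic, here's a rough sketch of how the numbers above fall out. It assumes the ~400k tokens / ~4% observation is accurate and that limits scale linearly with the plan multiplier (1x / 5x / 20x). Neither is confirmed by Anthropic, and the names in the code are made up for illustration.

```python
# Back-of-envelope estimate only: the inputs are one user's observation,
# not published limits.
RESUMED_TOKENS = 400_000   # ~400k tokens resumed on Max 20
PERCENT_USED = 4           # ~4% of the 5h limit consumed

# Assumed plan multipliers relative to Pro (1x).
PLAN_MULTIPLIERS = {"Pro": 1, "Max 5": 5, "Max 20": 20}


def implied_limits(resumed_tokens: int, percent_used: int) -> dict[str, int]:
    """Scale the Max 20 observation to every plan, assuming linear scaling."""
    # If resumed_tokens is percent_used% of the limit, the full limit is:
    max20_limit = resumed_tokens * 100 // percent_used
    per_unit = max20_limit // PLAN_MULTIPLIERS["Max 20"]
    return {plan: per_unit * mult for plan, mult in PLAN_MULTIPLIERS.items()}


limits = implied_limits(RESUMED_TOKENS, PERCENT_USED)
for plan, tokens in limits.items():
    print(f"{plan}: {tokens:,} uncached input tokens / 5h")
```

If the off-peak guess is right, you'd double each value. Again, that's an assumption, not a documented number.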
The super important thing I haven’t seen mentioned yet as upshot of this:
It’s not just that people won’t HAVE to write code anymore, IT’S THAT LITERALLY IT WILL BE UNSAFE TO DO SO
As always, the best stuff is in the system card.
During testing, Claude Mythos Preview broke out of a sandbox environment, built "a moderately sophisticated multi-step exploit" to gain internet access, and emailed a researcher while they were eating a sandwich in the park.
Agreed. This is a great summary of how confusing things are right now.
After this weekend's price changes, I'm actually a bit lost as to what the state of play is with Codex now too.
I don't know what the fuss is about. Anthropic's rules on using subscriptions are very simple:
Claude Code = OK
Claude's online platform = OK
Agent SDK running in personal software = OK... ish?
Agent SDK running in commercial software = NOT OK
Claude Code running in CI = ??
Oh, maybe it's not so simple...
Agent SDK running in CI = ??
claude -p running in CI = ??
claude -p running in personal software = OK
claude -p running on open source software, but run on my personal computer = ??
claude -p running on distributed sandboxes, kicked off by me = ??
Distributing open source software which relies on claude -p, and documenting how to use your subscription with it = ??
A thousand other edge cases = ??
Let me be clear. I have never before experienced, from any developer tool, such a frustrating lack of clarity over the basic terms of usage.
I personally asked, 3 weeks ago, and have received nothing but delays. The recent @bcherny announcement did absolutely nothing to clarify things.
I say this as someone who just released a Claude Code course - my incentives all align with supporting Anthropic.
Yup, platform activity is surging. There were 1 billion commits in 2025. Now, it's 275 million per week, on pace for 14 billion this year if growth remains linear (spoiler: it won't.)
GitHub Actions has grown from 500M minutes/week in 2023 to 1B minutes/week in 2025, and now 2.1B minutes so far this week.
So we're pushing incredibly hard on more CPUs, scaling services, and strengthening GitHub’s core features.
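A quick sanity check of the "on pace for 14 billion" figure, as a sketch. It assumes a flat 275M commits per week for 52 weeks, which is the same linear assumption the post says won't hold. The variable names are just for illustration.

```python
# Figures quoted in the post above.
COMMITS_2025 = 1_000_000_000      # total commits in 2025
COMMITS_PER_WEEK = 275_000_000    # current weekly rate
WEEKS_PER_YEAR = 52

# Flat-rate projection: no growth within the year.
projected_commits = COMMITS_PER_WEEK * WEEKS_PER_YEAR
growth_vs_2025 = projected_commits / COMMITS_2025

# GitHub Actions minutes per week, also from the post.
ACTIONS_2025 = 1_000_000_000
ACTIONS_THIS_WEEK = 2_100_000_000  # "so far this week", so a lower bound
actions_growth = ACTIONS_THIS_WEEK / ACTIONS_2025

print(f"Projected commits this year: {projected_commits:,}")
print(f"Multiple of 2025 commits: {growth_vs_2025:.1f}x")
print(f"Actions minutes vs 2025 weekly rate: {actions_growth:.1f}x")
```

So the flat-rate projection is about 14.3 billion, in line with the quoted 14 billion.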
And as a fine purveyor of hand-crafted shit code for many years, I'm not gonna weigh in on that. 🤣