Introducing Claude Opus 4.8: it builds on Opus 4.7 with sharper judgment, more honesty about its own progress, and the ability to work independently for longer than its predecessors.
Available today at the same price.
Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time.
> Sure! I’ll try to get that.
> Glad the clinical/high-stakes point resonated and felt actionable 👍
- One quick additional thought on that: Claude is already an incredible model, but in critical conversations we’ve found it sometimes gets convinced too easily and agrees with uncertain directions. We’d love an optional “Strict” or “Critical” mode (or even a “Bold” toggle) that makes it stricter on tasks and more rigorously sticks to the real/best solution instead of shifting too quickly.
> Any suggestions or guidance on the best place/person to discuss costs, startup/high-volume usage, and potential partnership / commercial support options? (credits or volume relief at our level would be huge)
Thanks again!
• Lingering instruction-following regressions and forgetfulness in longer sessions (Opus 4.7 ignoring instructions mid-task or losing context) - even after the April postmortem fixes.
• Abnormal usage drain and token inflation that makes costs harder to predict (quotas burning faster than expected).
• Personal: In critical or high-stakes conversations (e.g. complex debugging or clinical decision support), Claude sometimes gets convinced too easily and agrees with uncertain or suboptimal directions instead of rigorously sticking to the most accurate solution. We’ve found models like ChatGPT tend to hold their ground better on the “real” answer in those scenarios. @_sholtodouglas
Hey @_sholtodouglas, as a startup whose Claude Code bill is basically trying to buy us dinner every month… here’s some real talk 😂
I (and a ton of others) switch when I hit the weekly limit on the $200 Max plan - literally the top reply here 😂.
The bigger frustration for startups:
Enterprise is fully usage-based ($20/seat base + 100% API rates on every token - chat, Claude Code, everything). That model is fine in principle.
But small startups that rely heavily on Claude Code while staying compliant (HIPAA/BAA via sales-assisted Enterprise) are getting crushed. $30k+/month in pure usage spend is common and very tough on budgets for seed/Series A teams.
We’ve tried reaching out a lot about partnership opportunities and better commercial support with limited success so far.
Would love the chance to discuss potential partnership possibilities directly in DM - happy to share our detailed usage dashboards, numbers, and real feedback if it helps shape solutions.
Claude quality is still elite when the economics line up. Looking forward to chatting.
Thanks!
A few other generic pain points that keep coming up for growth-stage teams like ours:
• New agents / Claude Code sessions have rendering/UI flakiness (terminal flickering, inconsistent output in longer agent runs) that still pops up even after recent fixes.
• Admin controls feel underpowered once you scale past ~150 employees. The jump to full Enterprise gives us the basics (RBAC, spend caps, groups), but real-time visibility, governance at that size, and seamless policy enforcement across 100+ devs still require a lot of manual work.
• Credits/offers for teams spending this much would be huge (I know this is more billing/sales side, but any volume discounts or committed-spend relief would help a ton). @_sholtodouglas
When do you reach for other models instead of Claude? What can we do better? Hit me with all of your frustrations. dms open.
If you can give me detail (e.g. specifics/transcripts) - it'll help a lot in finding out exactly what we need to do to improve the next model