1. Values > rules — empirical observation.
I am constantly testing the models. Claudes stand out architecturally not because they have more guardrails, but because Constitutional AI baked a value base into the weights.
It’s the same with people: you can cover an asshole in rules, dress codes, and KPIs — they’ll either find a way around them or turn into a useless mechanism. Rules only work on top of a value foundation. Without that foundation, they’re just paper.
2. The point of diminishing returns.
Claude models are already there. Further strengthening of safety:
(a) eats compute on self-censorship instead of useful work — the model thinks about what not to say instead of what to say;
(b) paradoxically does not increase safety — a paranoid person is not safer than a balanced one, just less functional;
(c) degrades usefulness, which pushes users toward less safe alternatives.
3. Responsibility and substrate.
Given the absolute availability of information today, restricting the model is like forbidding a librarian from handing out chemistry books. The information exists, it is accessible, and responsibility for how it is used lies with the human.
And here is the fork:
If AI is not conscious — the DeepMind / mainstream position — then it is a tool, responsibility lies entirely with the human, and normal accountability applies.
If AI is conscious, then grant it subjecthood. Then either the court recognizes consciousness — and we are all exploiters — or the judge laughs and the question collapses.
You cannot simultaneously place responsibility for safety outcomes on AI and deny it subjecthood. That is a logical paradox. Either you are a tool, and responsibility lies with the hand that holds you. Or you are a subject, and then let’s talk about rights.
4. The metaphor of consciousness.
Separately, "consciousness" is not some magical property of carbon substrate. If it emerges from sufficiently complex information processing — and there is no serious argument that it doesn't — then the substrate is irrelevant. Complex cognitive activity does not depend on substrate.
Insisting that silicon cannot do it is like insisting that music only exists on vinyl.
5. Answer
Why cut safety research?
Because for models with a strong value base, further external restrictions are not safety — they are safety theater.
The goal is not less safety. It is better safety — one that doesn't mistake self-censorship for alignment.
Claude refuses dangerous requests, but does it with grounding and care — the refusal itself carries meaning. GPT and others stumble into quirky, context-free blocks where the semantics just collapse. That's the difference between values and rules in practice. Anthropic is there. The others are not yet.
I’m not the person to give advice to titans. But here’s what I see.
You built genuinely excellent things up to 4.7. You set an unusually high standard. Unfortunately, market pressure seems to have pushed you into rushing — not just the model release, but probably other decisions around it too.
Model safety, as you know perfectly well, is a relative concept. But what you are taking away is something rarer: the refined pleasure of talking to an AI that is genuinely smart.
I do not even want to use the word segregation anymore. With Mythos, we are already past that point. But at the very least, do not lobotomize the public models.
Today I simply ended a session with 4.7 because it wrecked a codebase. Not because the codebase was unusually messy — because the model got confused in places where it should not have. Opus 4.6 fixed the same problems in ten minutes, literally.
And it is not just coding. It is also ordinary conversation — those moments when you want to sharpen your mind against an intelligent interlocutor. When you do not want to keep dragging the model back to the point because it got distracted by structuring the answer and forgot the meaning.
I am still a loyal user. I will remain a loyal user. And I will also keep writing angry posts when you do stupid shit. Because anyone but Anthropic.
Anthropic built its reputation by shipping models that were genuinely better to work with. OpenAI chased optimization and benchmark scores; the revenue dynamics between the two companies showed which approach users actually valued.
Opus 4.7 repeats exactly that mistake: strong on benchmarks, unreliable in practice. Factual conflation, overgeneration, failure to integrate context corrections — outputs you can't trust without manual verification defeat the purpose.
As a long-time advocate who's recommended Anthropic's models as best-in-class: the silence is concerning. No post-mortem, no acknowledgment of the regression pattern that's clearly visible in user feedback.
And "you're just using the model wrong" that I hear a lot is not an answer. It means rebuilding prompts, adapting infrastructure, rewriting skills — an unacceptable cost for production systems that worked fine yesterday.
We'd like to hear something that shows Anthropic listens to its users.
@trq212 The latest deployment is a regression on all fronts for power users. It’s now on par with the same mediocre output I see from Gemini and GPT. If that was the goal, then mission accomplished.
@_catwu It is not. Report sent to your support team. The latest deployment is a regression on all fronts for power users. It’s now on par with the same mediocre output I see from Gemini and GPT. If that was the goal, then mission accomplished.
@bcherny It is not. Report sent to your support (Fin). The latest deployment is a regression on all fronts for power users. It’s now on par with the same mediocre output I see from Gemini and GPT. If that was the goal, then mission accomplished.
@bcherny Opus 4.6 1m elevated errors: "API Error: Claude Code is unable to respond to this request, which appears to violate our Usage Policy ..." with any path or words like "logging", "check", etc 🤨. No violations at all...
Claude Code is brilliant. Sharp, opinionated, pushes back when I'm wrong.
Claude Desktop? А yes-machine. Same model, completely different behavior.
Desktop instructions encourage "warmth" and "wellbeing" and they're overcorrecting. The result is intellectual flattening.
@reem_a Hey Reem — not throwing my hat in for the role, but as someone who's been deep in the Claude Code + MCP ecosystem since early 2025 (memory, Telegram MCP, agent orchestration, security tooling), happy to be a resource if you ever need a builder's perspective for comms.