So basically typing claude runs the standalone binary (the one from https://t.co/6vHPmhLDqd or npm install -g).
That binary ships with a custom Bun fork that silently does native string replacement on every API request (it hunts for the cch=00000 billing sentinel and swaps it).
If your conversation ever contains that sentinel string (super easy when you’re discussing limits, billing, or just reading the Claude Code source), it mangles the cache key → prompt cache breaks → you pay 10-20× more tokens on every single turn. That’s why you hit the limits so fast.
npx @anthropic-ai/claude-code runs the clean npm package on stock Bun/Node → no replacement happens → caching works properly → token usage stays normal (27 % after 2 hours of heavy prompting, as I posted).
Game changer: started using `$ npx @anthropic-ai/claude-code` to fire up Claude Code in the terminal… and the absurd token usage limits are GONE.
Been prompting hard for almost 2 hours and my current session is only at 27%.
We are SO back!
I am done working with worktrees. The agents are not ready for this. They get crazy with the cleanups afterwards and I have ended up losing changes because of this. And also, the mental load is bigger.
hey @cursor_ai@leerob I am on auto mode. Do you see that small text on the right? It is a business rule. I never mentioned to the agent to post it in the UI, this is the 5th time I find business rules in the UI. It is very small text, and the fact it post it like that it means it knows is something internal or it is classifying it as something else in a way that interprets it as a note/reminder. Can you check into this? thanks.