gustaf @_gmpalm - Twitter Profile

Pinned Tweet

3 months ago

Realizing I run what is probably one of the most complex Claude Code setups in existence. 76 custom skills, 24 MCP servers, 10 hooks firing on every tool call, forges that launch 18 parallel AI sessions overnight. I have built an entire autonomous development infrastructure over the past three months. It is, by any measure, sophisticated.

It is also, I discovered, wildly inefficient.

Here is what a single day looked like when I finally audited it:

193 Claude sessions in one day. Not because I was typing into 193 terminals. Because every forge track, every subagent, every eval run, every RBI conductor cycle, every Ralphy task counts as a session. Each one loads the full context: a 367-line CLAUDE.md, 24 MCP server definitions, 76 skill descriptions, 20 memory files, 10 hooks. That context gets re-sent as cached reads on every single turn.

The math is brutal. ~50-90K tokens on turn one of every session. Multiply by 193 sessions. Multiply by average turns per session. That is where the quota went.
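A rough sketch of that arithmetic, in Python. The session count and the 50-90K token range come from the audit above; the average turns per session is not given in the post, so `ASSUMED_AVG_TURNS` is a purely hypothetical placeholder. The model also assumes the baseline context is re-read at the same size on every turn and ignores conversation growth, so treat the result as an order-of-magnitude estimate only.

```python
def estimate_context_tokens(sessions: int, baseline_tokens: int, avg_turns: int) -> int:
    """Tokens spent re-reading the baseline context across all sessions.

    Simplification: the baseline is counted once per turn at a fixed size.
    """
    return sessions * baseline_tokens * avg_turns


SESSIONS = 193            # from the one-day audit
BASELINE_LOW = 50_000     # lower bound of the quoted ~50-90K range
BASELINE_HIGH = 90_000    # upper bound of the quoted ~50-90K range
ASSUMED_AVG_TURNS = 10    # hypothetical; the post does not state this number

low = estimate_context_tokens(SESSIONS, BASELINE_LOW, ASSUMED_AVG_TURNS)
high = estimate_context_tokens(SESSIONS, BASELINE_HIGH, ASSUMED_AVG_TURNS)

print(f"Baseline context re-read per day: {low:,} to {high:,} tokens")
```

Even with a modest turn count, the baseline alone lands in the range of a hundred million tokens per day, which is why trimming what every session loads matters more than trimming any single session.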

The deeper finding: I am in the top 0.1% of Claude Code complexity, and the tooling was not built for it. Anthropic recommends CLAUDE.md under 200 lines; mine was nearly double. They document 3-5 parallel agents; I was running 18. My Ralphy forge tracks were silently inheriting Opus ($75/MTok output) instead of Sonnet ($15/MTok) because of a single missing --model flag.
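To make the cost of that missing flag concrete, here is a minimal sketch. The two output prices are the ones quoted above; the token volume and the launcher name `forge-track` are hypothetical, and `build_command` is an illustrative guard, not part of the actual setup. It only shows one way to make a missing model fail loudly instead of silently inheriting a default.

```python
from typing import List, Optional

# Output prices quoted in the post, in USD per million output tokens.
PRICE_PER_MTOK = {"opus": 75.0, "sonnet": 15.0}


def output_cost(model: str, output_tokens: int) -> float:
    """Cost in USD of generating `output_tokens` with the given model."""
    return output_tokens / 1_000_000 * PRICE_PER_MTOK[model]


def build_command(base_cmd: List[str], model: Optional[str]) -> List[str]:
    """Append an explicit --model flag, refusing to build a command without one."""
    if not model:
        raise ValueError("refusing to launch a track without an explicit --model")
    return [*base_cmd, "--model", model]


HYPOTHETICAL_OUTPUT_TOKENS = 2_000_000  # illustrative volume, not from the post

opus_cost = output_cost("opus", HYPOTHETICAL_OUTPUT_TOKENS)
sonnet_cost = output_cost("sonnet", HYPOTHETICAL_OUTPUT_TOKENS)
print(f"Opus: ${opus_cost:.2f}  Sonnet: ${sonnet_cost:.2f}")

# "forge-track" is a hypothetical launcher name used only for illustration.
print(build_command(["forge-track"], "sonnet"))
```

At the quoted prices the same output volume costs five times as much on Opus, so a guard that rejects an unset model is cheap insurance for unattended overnight runs.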

The fix was not to remove capabilities. Everything I built stays. The fix was lazy loading: move reference content to on-demand files, scope MCP servers per-project instead of globally, narrow hook matchers to only the tools they actually inspect. CLAUDE.md went from 367 to 184 lines. MCP servers went from 16 enabled to 6 global. Zero functionality lost.
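The lazy-loading idea can be sketched generically as follows. This is not Claude Code's configuration format; `LazyRegistry`, `servers_for`, and all the names in the example are hypothetical, chosen only to illustrate "load what you need when you need it" and per-project scoping.

```python
from typing import Callable, Dict, List


class LazyRegistry:
    """Holds loaders for reference content and runs each one only on first use."""

    def __init__(self) -> None:
        self._loaders: Dict[str, Callable[[], str]] = {}
        self._cache: Dict[str, str] = {}
        self.load_count = 0  # how many loaders have actually run

    def register(self, name: str, loader: Callable[[], str]) -> None:
        # Registering costs nothing: the content is not loaded yet.
        self._loaders[name] = loader

    def get(self, name: str) -> str:
        # Load on first access, then serve from the cache.
        if name not in self._cache:
            self._cache[name] = self._loaders[name]()
            self.load_count += 1
        return self._cache[name]


def servers_for(project: str,
                global_servers: List[str],
                per_project: Dict[str, List[str]]) -> List[str]:
    """A small global set plus whatever the given project explicitly opts into."""
    return sorted(set(global_servers) | set(per_project.get(project, [])))


registry = LazyRegistry()
registry.register("deploy-guide", lambda: "deploy reference text")
registry.register("style-guide", lambda: "style reference text")

registry.get("deploy-guide")   # first access runs the loader
registry.get("deploy-guide")   # second access is served from the cache
print(registry.load_count)

print(servers_for("web-app", ["files", "git"], {"web-app": ["browser"]}))
```

Nothing is removed in this design: every reference and every server is still registered, but the cost is paid only by the sessions that actually use them.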

The lesson that keeps repeating: sophistication is not the same as density. A system that loads everything everywhere all the time is not more powerful than one that loads what it needs when it needs it. I knew this architecturally. I had not applied it to my own infrastructure.

Building in public means showing the crashes, not just the launches.
