Everyone is arguing about whether the pause is real. Skip it. The number worth reading is the one Anthropic put on itself: Claude writes more than 80% of its merged code, and its engineers ship about 8x more per quarter. Most companies can't prove they produced anything.
https://t.co/uVaAvLXuxz
@unusual_whales The headline is the pause. The buried signal: Anthropic published a falsifiable output number on itself (80%+ of merged code written by Claude, ~8x per quarter). Almost no other company can answer that about its own work. https://t.co/n1dwQAGzPT
@aakashgupta Exactly: disclosure, not a brake. The signal isn't the pause, it's that Anthropic put a falsifiable output number on itself (80%+ of merged code, ~8x per quarter) while everyone else still flexes inputs. That's the new dividing line. https://t.co/n1dwQAGzPT
The "flexing 100B tokens without output" line is exactly it. Input bragging looks impressive on a stage and proves nothing. The fix isn't complicated, it's just unglamorous: define what "shipped to a customer" means, set a baseline, then divide spend by that. Full argument here: https://t.co/uVaAvLY2n7
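A rough sketch of that three-step recipe (define, baseline, divide), in Python. The field names, the quarter labels, and the dollar figures in the usage note are all hypothetical; what counts as "shipped to a customer" is yours to define before any of this means anything.

```python
from dataclasses import dataclass


@dataclass
class Quarter:
    label: str
    ai_spend_usd: float  # hypothetical: total AI spend for the period
    shipped_units: int   # hypothetical: items meeting your own "shipped to a customer" definition


def cost_per_shipped_unit(q: Quarter) -> float:
    """Divide spend by output; refuse to answer if nothing shipped."""
    if q.shipped_units <= 0:
        raise ValueError(f"{q.label}: no shipped units, cost per unit is undefined")
    return q.ai_spend_usd / q.shipped_units


def change_vs_baseline(baseline: Quarter, current: Quarter) -> float:
    """Relative change in cost per shipped unit (negative means cheaper output)."""
    base = cost_per_shipped_unit(baseline)
    return (cost_per_shipped_unit(current) - base) / base
```

With made-up numbers, a baseline quarter of $10,000 for 50 shipped units against a later quarter of $12,000 for 80 units gives a relative change of -0.25: spend went up, cost per unit of output went down. Without the baseline quarter there is nothing to compare against.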
You nailed the real question: what those tokens produce and whether anyone can measure it. The honest answer for most teams is no, because they never built a baseline to measure against. That's the first job before accelerating, not after. I expanded on it here: https://t.co/uVaAvLXuxz
"Cost per merged PR" is the whole game in five words. Everything above it on your list measures what goes into the system, not what comes out, and a board can't tell those two apart unless someone forces the distinction. I wrote up why the new dividing line is provable output vs flexed inputs, using Anthropic's self-measurement this week as the example: https://t.co/uVaAvLY2n7
What strikes me most here isn't the recursive self-improvement. It's that they published an output number about themselves, 8x more code per quarter, one that anyone can challenge. Almost the whole industry brags about the opposite: how many tokens it burns, how many licenses it buys. The question for your company is which side you're on.
https://t.co/uVaAvLXuxz
Our internal data shows Claude is accelerating AI development—a possible path to recursive self-improvement, or AI autonomously building a more capable successor.
It’s happening faster than we thought, and the implications deserve greater attention. https://t.co/OVVPJO7VQx
Anthropic measured itself and published the number: Claude writes most of its code, and its engineers ship 8x more per quarter.
Half the industry, meanwhile, brags about tokens burned and licenses bought.
The question for your team isn't who uses AI. It's who can prove they produced something.
https://t.co/Xa9PRTHBa4
@IqSource1445 @AnthropicAI
essential COSTA RICA and PROCOMER join #LondonTechWeek 2026 as a Country Pavilion Partner 🌍
Bringing a curated group of innovative companies, global opportunities and forward thinking leadership to Olympia London this June.
Discover why Costa Rica continues to attract global investment.
🤝 Enquire to exhibit and sponsor: https://t.co/yNXLoercxd
🎟️ Secure your pass: https://t.co/oeEd4npxPy
#LondonTechWeek #LTW26 #TechShapesBusiness
The flaky-test point deserves its own headline. At 50 commits a day a flaky test is an annoyance, at 200 it's a dam, and one bad test blocks every merge behind it. The fix isn't faster code, it's deciding which weak link in the delivery chain breaks first when volume spikes, and reinforcing it before you accelerate. Great breakdown.
https://t.co/2FX4OSS5O4
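A back-of-the-envelope sketch of why 50 vs 200 matters, in Python. It assumes one test-suite run per commit and a flaky test that fails spuriously and independently with a fixed probability per run; both are simplifying assumptions, and the 1% rate in the usage note is made up.

```python
def p_at_least_one_flake(p_fail_per_run: float, runs_per_day: int) -> float:
    """Chance the flaky test fails spuriously at least once in a day."""
    if not 0.0 <= p_fail_per_run <= 1.0:
        raise ValueError("p_fail_per_run must be between 0 and 1")
    return 1.0 - (1.0 - p_fail_per_run) ** runs_per_day


def expected_flakes_per_day(p_fail_per_run: float, runs_per_day: int) -> float:
    """Average number of spurious failures per day."""
    return p_fail_per_run * runs_per_day
```

Under those assumptions, a test that flakes 1% of the time trips on roughly 4 days out of 10 at 50 runs a day, and on close to 9 days out of 10 at 200. The test didn't get worse. The volume did.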
This is the clearest map I've seen of where the AI dev bottleneck actually went. I'd frame the cause in one line: AI is an amplifier, not a selective accelerator. It turns up the volume on the whole system, so a fragile CI doesn't get faster, it gets louder. The teams winning aren't the ones writing more code, they're the ones who mapped the delivery chain before they 10x'd the input.
https://t.co/2FX4OSS5O4
AI is writing ~50% of commits on most teams now.
BUT time to production has not changed a bit!
Here's the actual bottleneck:
→ More commits = full test suite runs more often
→ Flaky tests that failed once a week now fail daily
→ One bad test blocks every merge behind it
→ Pipeline reruns waste compute, not just time
We solved code creation. The delivery pipeline didn't keep up.
The fix is knowing which job is actually on the critical path, which failures are platform problems vs. developer problems, and stopping whole-job reruns for single flaky tests.
AI amplifies whatever system it runs through. If the pipeline is messy, you're just shipping mess faster.
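One way to find which job is actually on the critical path, sketched in Python: treat the pipeline as a dependency graph and look for the longest chain by duration. The job names and minutes below are hypothetical, and the sketch assumes the graph has no cycles and that jobs start as soon as their dependencies finish.

```python
from functools import lru_cache

# Hypothetical pipeline: job -> (duration in minutes, jobs it waits on)
PIPELINE = {
    "lint": (2, []),
    "build": (8, []),
    "unit_tests": (12, ["build"]),
    "integration_tests": (25, ["build"]),
    "deploy_staging": (5, ["lint", "unit_tests", "integration_tests"]),
}


def critical_path(pipeline):
    """Return (total_minutes, jobs) for the longest dependency chain."""

    @lru_cache(maxsize=None)
    def finish(job):
        duration, deps = pipeline[job]
        best_total, best_path = 0, ()
        # A job can only start once its slowest dependency chain is done.
        for dep in deps:
            total, path = finish(dep)
            if total > best_total:
                best_total, best_path = total, path
        return best_total + duration, best_path + (job,)

    return max((finish(job) for job in pipeline), key=lambda result: result[0])
```

For this made-up pipeline, `critical_path(PIPELINE)` returns 38 minutes along build, integration_tests, deploy_staging. Speeding up lint or unit_tests would not move time to production at all, which is the point of mapping the chain before optimizing anything.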
Datadog's developer toolkit write-up is solid on this if you're working through it. Worth a read → https://t.co/JJf63nQSrY
Thanks @Datadog for partnering with me on this post
AI writes half your code. Nothing ships faster.
You sped up writing, not delivering. Tests, CI, and review are still the same narrow neck, now taking twice the volume.
AI amplifies the system. If it's messy, you ship the mess faster.
https://t.co/WoGEriFhM7
@IqSource1445 @pvergadia
@atmoio That second clause is the whole game: "an even bigger amount of problems they create." Every tech wave I've worked through since 1990 did the same. The problems don't end the work, they become it. https://t.co/E12rnSIbil
"Skeptical this time is different" is the right instinct. Since 1990 I've watched every tech wave create more problems than it solved. That's exactly why expert headcount grows each cycle, not shrinks. The work moves, it doesn't vanish. https://t.co/E12rnSIbil
By the way, this is the very essence of consulting, and precisely why the number of experts trained and deployed across organisations has increased at every technology shift.
Technology is designed and shipped to fix problems, but inherent to that fact is another fact...
As you scale the tech, more problems are created that you didn't expect before and now you need a solution for.
It's been like that in saecula saeculorum, and I'm always skeptical of the notion that "this time is different".
@themgmtconsult This is the cycle I've watched since 1990. Every platform shift sells "this time is different," then the problems it creates become the next decade of expert work. AI is no exception. Wrote about exactly that skepticism: https://t.co/E12rnSIbil
The end of subsidized AI is exactly why this matters now. When the meter is real, "measure tokens" isn't enough, you have to measure useful work per token, per workflow. Evals tell you if the agent is good. Discovery tells you if the workflow is worth running at all. Most overblown bills die at that second question.
https://t.co/rDP5KQVLXA
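A minimal sketch of "useful work per token, per workflow" in Python. The class, its field names, and the workflow names in the usage note are hypothetical, and what counts as a useful outcome (a resolved ticket, an accepted draft) is an assumption each team has to pin down for itself.

```python
from dataclasses import dataclass


@dataclass
class WorkflowUsage:
    name: str
    tokens: int           # tokens consumed by this workflow in the period
    useful_outcomes: int  # hypothetical: outcomes your team has agreed to count as useful


def yield_per_million_tokens(w: WorkflowUsage) -> float:
    """Useful outcomes per million tokens spent."""
    if w.tokens <= 0:
        raise ValueError(f"{w.name}: token count must be positive")
    return w.useful_outcomes * 1_000_000 / w.tokens


def rank_by_yield(workflows):
    """Lowest-yield workflows first: the candidates to rescope or switch off."""
    return sorted(workflows, key=yield_per_million_tokens)
```

With made-up numbers, a "ticket_triage" workflow that burns 2,000,000 tokens for 10 useful outcomes yields 5 per million, while a "contract_review" workflow that gets the same 10 outcomes from 500,000 tokens yields 20. A raw token cap would treat the first as the bigger problem only by size; the yield ranking says why.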
This is the framing the whole "just cap it" crowd is missing. Token yield, not raw tokens. I'd push it one step earlier: the architecture wins start in discovery, when you decide which processes are even worth the tokens before you wire them up. Scope-first is what makes the cap unnecessary in the first place.
https://t.co/rDP5KQVLXA