shwetank

@Shwetankumar

Chief Scientist @ EnCharge AI. Writing about why AI efficiency is a data movement problem, not a compute problem. Subscribe:

Joined May 2014

150 Following

73 Followers

584 Posts

Pinned Tweet

shwetank @Shwetankumar

about 2 months ago

Anthropic is down again. OpenAI is burning $1.70 for every dollar it earns. One company can't serve its customers. The other can't afford to. I broke down both companies' leaked financials. Neither comes out clean. https://t.co/54Ys0BsBRE

shwetank @Shwetankumar

3 days ago

@UHC @amazon cc: @econliberties

shwetank @Shwetankumar

6 days ago

@UHC what part of my needing my medication for 45 days while I travel is unclear. Why do you keep denying me prescription already approved by my doctor! @amazon pharmacy thanks calling @UHC on my behalf. Cant wait till you guys offer insurance and drive them out of business!

Shwetankumar retweeted

EnCharge AI @EnChargeAI

16 days ago

Our Chief Scientist @Shwetankumar on why on-device AI is stuck, and how EN100 fixes it. The culprit: data movement. EN100 computes inside the memory cells, eliminating it entirely. Full blog here: https://t.co/l7bqIqnW87

EnChargeAI's tweet photo. Our Chief Scientist @Shwetankumar on why on-device AI is stuck, and how EN100 fixes it.

The culprit: data movement. EN100 computes inside the memory cells, eliminating it entirely.

Full blog here: https://t.co/l7bqIqnW87 https://t.co/N2iwbMJOLr

Who to follow

unpaid intern @airdrops_io | boss makes me post about free money | farming engagement instead of yields

Better Than You

@Ladylulu32

Sweet, cool, loyal. I dont club much, unless invited, really I just Be Chillin.

shwetank @Shwetankumar

19 days ago

Outside the valley only a small fraction of the world cares about what’s happening in the valley.

Peter Yang

@petergyang

19 days ago

If you're stuck in the Bay Area tech rat race / psychosis, make time to travel to other places. Go to a small town in Europe or visit Asia - you'll see that life can be about much more than whether you're IC7 or IC8 or what company you work for. Don't be the person to put on your tombstone: "He got divorced and neglected his kids but at least he made D2 at FAANG"

139

181

759

521K

shwetank @Shwetankumar

about 2 months ago

@axios "Supply chain risk" is a procurement posture; "mission-critical dependency" is a deployment posture. DoD holds both because frontier capability has converged — banning a vendor at the contract layer no longer prevents adoption at the operator layer.

shwetank @Shwetankumar

about 2 months ago

@simonw The pelican benchmark has aged into the cleanest vibes-eval we have. Every new frontier model gets the same prompt, every new frontier model hallucinates the same wings-on-handlebars composition, and the deltas are actually informative.

329

shwetank @Shwetankumar

about 2 months ago

@Alibaba_Qwen The agentic coding + real-world agent reliability framing is the most interesting part of the announcement. It would be really good to understand which workloads is it being tuned against internally - eval suite, real customer deployments, or something in between?

shwetank @Shwetankumar

about 2 months ago

@simonw "tool results can't override rules" was interesting - it means they're putting the trust boundary in the prompt itself instead of running a classifier over Claude's output afterward.

shwetank @Shwetankumar

about 2 months ago

@simonw The Opus 4.7 tokenizer measurements make this worse for brownfield. 1.47x tokens on tech docs and CLAUDE.md files means every already-written repo is more expensive to reason about than a blank project of equivalent complexity.

shwetank @Shwetankumar

about 2 months ago

@finmoorhouse The reason this chart feels sublime and alarming in equal measure: every project on the right has a unit of output. Lane-miles. Moon landings. Working bombs. Hyperscaler capex has gigawatts — which is what we burn, not what we build. We're missing the numerator.

shwetank @Shwetankumar

about 2 months ago

@daniel_mac8 Unchanged per-token price + 1.47x tokens on technical content = 47% revenue lift on the exact workloads Anthropic is capacity-constrained on. That's not an accident. When you're at $21.4B/GW and rationing, the tokenizer is one of the knobs revenue-per-watt gets pulled through.

shwetank @Shwetankumar

about 2 months ago

@simonw Flamingo comparison misses the efficiency story. Qwen 73.4 on SWE-Verified vs Opus 87.6 = 14 points. Qwen ~$0.20/M tokens vs Opus $15/M input + $75/M output = 50-300x. Capability gap closes. Unit economics don't. That's the real comparison.

109

shwetank @Shwetankumar

about 2 months ago

@dylan522p @LipBuTan1 The efficiency pivot at GTC 2026 only holds value if there's a credible second and third seat. Lip-Bu's perf/watt discipline is the pressure that forces the incumbent to actually ship on the new metric instead of the old one. Long road — but the road finally exists.

shwetank @Shwetankumar

about 2 months ago

@bcherny The agentic precision matters more than raw benchmarks here. Longer unsupervised runs at the same token cost is much better for my bill all in! :)

124

shwetank @Shwetankumar

about 2 months ago

@BenBajarin AI accelerators at 55% of revenue, capex trending to $56B. Enormous input numbers. Still no standardized metric for return-on-compute at the output end!

shwetank @Shwetankumar

about 2 months ago

Same token pricing, better benchmarks. More useful output per dollar — that's the whole Anthropic financial thesis right now. Their revenue trajectory makes more sense when you look at unit economics, not model size. https://t.co/54Ys0BsBRE

Chubby♨️

@kimmonismus

about 2 months ago

Claude Opus 4.7 is out. the TL;DR Anthropic released Opus 4.7 today. Same pricing as 4.6 ($5/$25 per million tokens), available across API, Bedrock, Vertex AI, and Microsoft Foundry. What changed vs Opus 4.6: Coding (obviously). Biggest gains on the hardest, long-horizon software engineering tasks. Early testers report being able to hand off work that previously needed supervision. Opus 4.7 now verifies its own outputs before reporting back. Vision. Accepts images up to 2,576px on the long edge (~3.75MP), over 3x more than any prior Claude. This is the real unlock for computer-use agents reading dense screenshots and diagram extraction. Instruction following. Now interprets instructions literally. Anthropic explicitly warns: prompts tuned for 4.6 may break or produce unexpected output. Existing harnesses need re-tuning. Memory. Better at file system-based memory across long multi-session work. Real-world knowledge work. State-of-the-art on Finance Agent eval and GDPval-AA (third-party economically valuable knowledge work across finance, legal, etc). New features shipping today: xhigh effort level between high and max. Finer control over reasoning vs latency. Claude Code default is now xhigh for all plans. Task budgets in public beta on the API. /ultrareview in Claude Code. A dedicated review session that flags bugs and design issues. Three free for Pro and Max users. Auto mode extended to Claude Code Max users. Claude decides on your behalf, less interruption, less risk than full skip-permissions. The honest caveats: New tokenizer means the same input maps to 1.0 to 1.35x more tokens depending on content type. Opus 4.7 thinks more at higher effort levels, especially on later agentic turns. More output tokens. Safety profile is roughly similar to 4.6. Improvement on honesty and prompt injection resistance, modestly weaker on over-detailed harm-reduction advice for controlled substances. Still less capable than Claude Mythos Preview (was expected), which Anthropic is keeping on limited release. Opus 4.7 is the testbed for the cyber safeguards that will eventually enable a broader Mythos rollout. tl;dr Measured against 4.6, this is a meaningful upgrade in the three places that matter most to Anthropic's actual customer base: agentic coding reliability, vision for computer-use agents, and knowledge work benchmarks like GDPval-AA. Solid upgrade, obviously shy to Mythos. Feels like a pretty good iteration-update!

294

188K

shwetank @Shwetankumar

about 2 months ago

Same model class, opposite release strategies. One restricts to 40 orgs, one pushes to thousands. Both confirm AI just repriced vulnerability economics — offense and defense costs collapsing simultaneously.

Matt Shumer

@mattshumer_

about 2 months ago

OpenAI fine-tuned GPT-5.4 for cybersecurity, with fewer refusals, new capabilities like binary reverse engineering, and is rolling it out to verified defenders. Very different approach than Anthropic is taking with Mythos. We'll see how each plays out!

mattshumer_'s tweet photo. OpenAI fine-tuned GPT-5.4 for cybersecurity, with fewer refusals, new capabilities like binary reverse engineering, and is rolling it out to verified defenders.

Very different approach than Anthropic is taking with Mythos. We'll see how each plays out! https://t.co/b7nX1yNjcH

144

31K

shwetank @Shwetankumar

about 2 months ago

@moninvestor Meta routing $21B of compute to CoreWeave while Anthropic signs a multi-year too = the vertical stack is unbundling at both ends. Capacity moving out of hyperscalers just as the Big 3 admit the model moat is migrating out of weights!

shwetank @Shwetankumar

about 2 months ago

@gensynai @benfielding @ambient_xyz @PrismaXai If the three agree distillation is a bigger threat than each other, they're implicitly agreeing their models converge enough to be copied. The moat moved out of the weights — into routing and provider risk management.

475

shwetank

@Shwetankumar

Who to follow

Last Seen Users on Sotwe

Trends for you

Most Popular Users