Databricks says annualized revenue rose 80%+ YoY to $6.9B, up from $5.4B in Q4; CEO Ali Ghodsi says higher AI agent usage is increasing costs, lowering margins (@jordannovet / CNBC)
Tokens Per Task is the other thing to consider as agentic harnesses co-evolve with models. @GavinSBaker
Value = tasks (or task outcomes).
Delivering outcomes with the fewest generated tokens in the agentic loop will be the bigger picture optimization.
Costs from GDPVal-AA benchmark runs, not just in the reference harness but also in custom harnesses, might be very interesting to track.
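A rough way to picture the tokens-per-task point: cost per task is generated tokens times price per token, so a harness that finishes the same task with fewer tokens is cheaper at the same model price. The sketch below is illustrative only; the function name, token counts, and price are made-up assumptions, not GDPVal-AA figures.

```python
def cost_per_task(tokens_generated: int, usd_per_million_tokens: float) -> float:
    """Cost of one completed task, given tokens generated in the agentic loop."""
    return tokens_generated / 1_000_000 * usd_per_million_tokens


# Hypothetical numbers: same model price, two different harnesses.
PRICE = 10.0  # assumed USD per million generated tokens

reference_cost = cost_per_task(400_000, PRICE)  # assumed reference harness
custom_cost = cost_per_task(150_000, PRICE)     # assumed leaner custom harness

# Fraction of per-task cost saved by the leaner harness.
savings = 1 - custom_cost / reference_cost

print(f"reference: ${reference_cost:.2f}, custom: ${custom_cost:.2f}, "
      f"savings: {savings:.1%}")
```

If value is measured in task outcomes, this per-task figure (rather than price per token alone) is the number to compare across harnesses.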
"If token throughput per watt rises faster than price per token falls, revenue per gigawatt can net expand. This is compounded by models getting smarter, which unlocks higher value tasks for end users [justifying increased costs]"
Well said @ShanuMathew93 and @downingARK
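The arithmetic behind that quote can be sketched in a few lines. Assuming a fully utilized gigawatt and that every generated token is sold (both simplifying assumptions, as are the function name and the sample multipliers), revenue per gigawatt scales with throughput per watt times price per token:

```python
def revenue_per_gw_multiplier(throughput_per_watt_growth: float,
                              price_per_token_change: float) -> float:
    """Change in revenue per gigawatt, as a multiplier.

    Revenue per GW is proportional to (tokens per second per watt) times
    (price per token), so the two multipliers simply compound.
    """
    return throughput_per_watt_growth * price_per_token_change


# Hypothetical case 1: throughput per watt triples, price per token halves.
expanding = revenue_per_gw_multiplier(3.0, 0.5)

# Hypothetical case 2: throughput per watt doubles, price falls by 60%.
shrinking = revenue_per_gw_multiplier(2.0, 0.4)

print(f"case 1: {expanding:.2f}x, case 2: {shrinking:.2f}x")
```

Revenue per gigawatt expands only when the first multiplier outpaces the drop in the second, which is the condition the quote describes.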
$MU $DRAM Where analysts & market got it wrong. They model HBM like it has been around for decades.
HBM is a new architecture built to feed AI models context at speed. It did not generate any meaningful revenue 3 years ago.
It is not where data sits. It is where AI thinks.
Really excited to open source a new project: Omnigent, a meta-harness for AI agents.
It lets you build multi-agent coding and custom agents, sitting above Claude Code, Codex, Pi, and agent SDKs to let you compose them. It also adds live collaboration and rich control policies.
Token capital may become real firm capital, but it will not automatically preserve human agency, broad value distribution, or institutional control. Those have to be designed, measured, audited, and governed.
Once systems become more autonomous, faster, copyable, and coordinated, the likely emergent phenomenon is group agents that exceed human oversight capacity. The “learning loop” can become an automated bureaucracy moving far faster than its managers.
Okay, this is seriously cool.
A team from @GoogleDeepMind, including DeepMind Cofounder Shane Legg, published a paper "From AGI to ASI".
In the paper, they include instructions for an AI agent to read along with you.
You can open the paper in Codex's in-app browser and have GPT-5.5 read it with you and explain all the concepts.
This is the future. AI agents will be part of the target audience, and help us to understand anything we want.
Game theory from here is super interesting:
Original Mags (Google, Amazon, Microsoft, Meta) now have a serious non-zero opportunity to tank the frontier labs.
Go to the government, kneecap the labs’ motion of putting the latest models out in the wild, become the trusted gatekeeper between the labs and the public at large (including internationally) by having the labs go through their clouds (AWS, GCP, Azure) and implement strict KYC to seal the deal.
The frontier labs should have seen this coming years ago and implemented a robust KYC for just this moment. The fact they didn’t is kind of concerning.
Why did they not do it?
Best guess is because it would have changed the run-rate revenues (downward) which would have then changed funding dynamics - lower valuations, more dilution, less secondary.
A valuation reset may happen now anyways, except the labs may end up with less control and more restrictions at the end of it. At the same time, everyone is already clamoring about token prices of the old models from the labs anyways…
This couldn’t be a better setup for open source and neoclouds. Big question is can they meet the moment?
There are too few of them and their progress seems sporadic at best.
A Heartfelt Thanks to Our TYE Global Championship 2026 Sponsors
As we welcome the world's brightest young entrepreneurs to Seattle for the TYE Global Championship on June 13, we extend our sincere gratitude to the incredible sponsors and supporters who make this event possible.
@GavinSBaker @polynoamial And test time compute resides mostly in the top of the memory hierarchy (SRAM, HBM, LPDDR). The balance of logic and memory in a rack will inexorably shift towards memory bandwidth and capacity.
Super important post from @polynoamial and the investor TLDR is: all current estimates for compute demand might be low.
“We likely don't know what the capability ceiling is for modern LLMs because it's too expensive to measure.
Frequently when I discuss this, people ask why we don't just evaluate with a harness that pushes test-time compute until performance plateaus. The problem is that, empirically, the plateau is very far out. Sometimes we may not observe a plateau at all within practical budgets.
Notice that for the stronger models the performance improvement over time is stronger. It seems likely that as models become stronger they become more effective at operating over longer horizons. The point of plateau is pushed out, and may even disappear.”
If test-time compute performance improvement over time *effectively* scales at some ratio with training…
Yes - And I have a (simplistic) thesis that at the end of the day all LLMs need is High Bandwidth Memory - and the logic coupled to the HBM is secondary - you can optimize the code that runs on the logic to make best use of what you get. So Anthropic is going for every accelerator they can get - including TPU, Trainium, Maia, and probably pricing in a way that aligns with the token throughput they can get out of each.
👀 Agentic AI is driving memory demand. As inference workloads surge, how will next-gen AI servers tackle rising compute costs and capacity limits? 💡 #TrendForce has raised its global memory market outlook: https://t.co/SJQ2upkDIs 🔗