This is very interesting. Coinbase seems to have lowered their token spend ($$) to about half, by
1) routing to cheap inference like GLM 5.2 and Kimi 2.7 that are still pretty performant
2) Smart routing + caching
They still use the same number of tokens as before. Start of a trend?
Open source is taking share from the frontier models on the OpenRouter platform. This has ignited a mini-AI panic.
To us, it seems overblown. First, OR is not a representative sample.
But, more importantly, frontier token pricing on OpenRouter has surged this year.
@fernavid
@GavinSBaker Three big reasons:
1) China's willingness to reduce imports (i.e. draw down inventories)
2) China's ability to flex down demand (EVs + mass transit)
3) Dark transits were larger than expected
That’s my postmortem
The charts highlight something that many investors still underestimate: AI is rapidly becoming a scale business rather than a pure technology business.
At the frontier, Anthropic remains the intelligence leader. Claude Fable 5 scores 60 on the Artificial Analysis Intelligence Index, ahead of Opus 4.8 at 56 and GPT-5.5 at 55. However, the second chart shows that this final increment of intelligence comes at a very steep cost. Fable 5 costs roughly $3.25 per benchmark task versus $1.78 for Opus 4.8 and $0.99 for GPT-5.5.
In other words, Anthropic appears to be operating on the far right side of the efficiency curve where each additional point of intelligence requires disproportionately more compute, more reasoning tokens, and more inference cost. The company currently owns the performance crown, but it is paying heavily for it.
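A rough way to see that steepness is to express each model's score and cost as a multiple of the cheapest of the three. This is only a sketch using the figures quoted above; picking GPT-5.5 as the baseline and the helper name `relative_to` are illustrative choices, not something from the charts.

```python
# Index scores and per-task costs (USD) exactly as quoted in the post above.
MODELS = {
    "Claude Fable 5": (60, 3.25),
    "Opus 4.8": (56, 1.78),
    "GPT-5.5": (55, 0.99),
}

def relative_to(models, baseline):
    """Express each model's score and cost as a multiple of the baseline model's."""
    base_score, base_cost = models[baseline]
    return {
        name: (score / base_score, cost / base_cost)
        for name, (score, cost) in models.items()
    }

# Baseline choice (GPT-5.5) is an assumption made for illustration.
RELATIVE = relative_to(MODELS, "GPT-5.5")
for name, (score_x, cost_x) in RELATIVE.items():
    print(f"{name}: {score_x:.2f}x the score at {cost_x:.2f}x the cost")
```

On these numbers, Fable 5 buys roughly 9% more index points than GPT-5.5 for a bit over three times the per-task cost.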
The more interesting story is happening below the frontier. DeepSeek V4 Pro Max delivers 44 intelligence points for around $0.06 per task. MiMo V2.5 Pro delivers 42 points for roughly the same cost. Gemini Flash reaches 50 points while remaining dramatically cheaper than the frontier models. The intelligence gap between the best and the “good enough” models remains meaningful, but the cost gap is becoming enormous.
This is exactly what happens when a technology industry starts to mature. The market initially competes on absolute performance. Eventually it begins competing on performance per dollar.
The scatter plot makes this particularly clear. The most attractive area is not necessarily the top-right corner where intelligence is highest. It is the upper-left area where intelligence remains high while costs collapse. Many Chinese models now sit squarely in that region.
This is why the narrative that “the cheapest model wins” is just as wrong as the narrative that “the smartest model wins.” Neither is sufficient. The winners are likely to be the models that sit on the Pareto frontier, delivering the highest intelligence for a given cost level. Today that frontier includes Anthropic at the very high end, Google in the middle, and a growing group of Chinese labs including DeepSeek, Qwen, MiniMax, and Xiaomi at the efficiency end.
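For readers who want to check the Pareto claim against the quoted numbers, here is a minimal sketch. It assumes MiMo V2.5 Pro costs exactly $0.06 per task (the post only says "roughly the same" as DeepSeek) and leaves out Gemini Flash, since its score is quoted but its cost is not. The function name `pareto_frontier` is illustrative.

```python
# (name, intelligence index score, cost per benchmark task in USD), as quoted in the post.
# ASSUMPTION: MiMo's cost is set to 0.06 because the post says "roughly the same cost".
# Gemini Flash is omitted: the post gives its score (50) but no per-task cost.
MODELS = [
    ("Claude Fable 5", 60, 3.25),
    ("Opus 4.8", 56, 1.78),
    ("GPT-5.5", 55, 0.99),
    ("DeepSeek V4 Pro Max", 44, 0.06),
    ("MiMo V2.5 Pro", 42, 0.06),
]

def pareto_frontier(models):
    """Keep models that no other model beats on both score and cost.

    A model is dominated when another has score >= its score and
    cost <= its cost, with at least one of the two strictly better.
    """
    frontier = []
    for name, score, cost in models:
        dominated = any(
            s >= score and c <= cost and (s > score or c < cost)
            for _, s, c in models
        )
        if not dominated:
            frontier.append(name)
    return frontier

FRONTIER = pareto_frontier(MODELS)
print(FRONTIER)
```

On the quoted figures only MiMo V2.5 Pro drops off the frontier, and only because of the assumed cost tie with DeepSeek; a slightly lower MiMo cost would put it back on.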
For investors, this has an important implication. Falling inference costs are not necessarily bearish for AI infrastructure. Historically, lower computing costs have almost always led to more consumption rather than less. We saw it with storage, bandwidth, cloud computing, and semiconductors. As intelligence becomes cheaper, companies simply run more agents, automate more workflows, and expand AI usage into new domains.
The charts therefore reinforce a view that we have held for some time: AI is not becoming less valuable because models are getting cheaper. AI is becoming more ubiquitous because models are getting cheaper.
Anthropic currently leads the intelligence race. Chinese labs are increasingly leading the efficiency race. Google appears to be competing effectively across both dimensions. The result is an industry that is simultaneously moving up the intelligence curve and down the cost curve. That combination is exactly what drives mass adoption.
The most bullish takeaway is that the industry is no longer trading off intelligence against cost. It is increasingly improving both at the same time. Historically, that is the hallmark of a technology entering its exponential adoption phase rather than approaching saturation.
The dark fleets transiting along the Omani coast are real. Read this FT article from several days ago:
"The US first established a system of providing air cover to ships attempting to pass the strait via the Omani route some two weeks ago, according to three people with knowledge of the transits. Vessels wanting to pass through the waterway are instructed to request permission from US Central Command, which then provides the coordinates of the Omani route and issues instructions for ships to turn off their GPS signalling systems and all electronics. Vessels were also advised to transit under cover of darkness, two of the people said."
Dan Smoot, chief executive of Vantor, which tracks ships by satellite imaging, said at the WSJ CEO Council summit in London on Wednesday that there was a “tremendous amount” of shipping activity through the Strait of Hormuz that was “outside the news right now”.
https://t.co/lWk4AyIEkL
The biggest story of the year is the way the economy is being reshaped right in front of our eyes.
The heartland, after being obliterated by globalization, is making an economic comeback thanks to the AI build-out, abundant energy, and tax policies that favor domestically sourced materials over foreign ones.
For the past decade, the middle of the country didn’t produce freight, it received it.
That has flipped this year.
https://t.co/11as8UrHQ6
ai pilled companies spending $90k / employee per year 🤯
per employee, not per engineer.
where does this go in a year!? can't wait to find out...
from ramp's dataset (70k companies).
As I wrote this, I saw X go into meltdown over tokens.
You've seen the headlines: “Uber blows yearly AI budget in just one quarter.” “Meta employee burns 281 billion tokens in April.”
But, the problem isn't spending. Spending works. Since 2023, the top quartile of our AI spenders doubled their revenue. The bottom quartile? Flat.
It's blind spending. We don’t know which spend worked.
A sales team has qualified leads. A support team has resolved conversations. These are units you can measure against. All a token tells you is the meter ran, not whether the work was worth it or not.
Finance says, “halve the budget,” engineering says, “double it,” and you don’t know who’s right because there is no shared language of value. There’s no attribution, and no attribution means no allocation.
For example, right now, all work, no matter the size or shape, defaults to frontier models. But meeting summaries and calendar updates don’t require GPT-5.5 Pro.
In isolation this seems trivial, but re-route just 10% of a $10M AI bill from frontier to GPT-4-level intelligence and you’ve saved nearly one million dollars. This sounds like a made-up stat, but it’s not. It truly is that much cheaper.
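The arithmetic behind that figure can be sketched as follows. The 5% price ratio between the cheaper tier and the frontier tier is a hypothetical placeholder, not a number from the post, and the sketch assumes the re-routed work consumes the same number of tokens on either tier.

```python
def rerouting_savings(total_bill, rerouted_share, cheap_to_frontier_price_ratio):
    """Dollars saved by moving a share of spend to a cheaper model tier.

    Assumes token volume is unchanged, so the re-routed spend simply
    shrinks by the price ratio between the two tiers.
    """
    rerouted_spend = total_bill * rerouted_share
    return rerouted_spend * (1 - cheap_to_frontier_price_ratio)

# $10M bill and 10% share come from the post; the 0.05 ratio is an ASSUMPTION.
savings = rerouting_savings(10_000_000, 0.10, 0.05)
print(f"Saved: ${savings:,.0f} of ${10_000_000 * 0.10:,.0f} re-routed")
```

The saving only falls short of the full $1M by whatever the cheaper tier still costs, which is why "nearly one million" holds for any small price ratio.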
This is the future of finance: not blindly rubber-stamping or rejecting AI spend, but allocating it with the same rigor companies apply to headcount.
$MSFT AI CEO @mustafasuleyman: "Anthropic is extremely expensive, and I think many people are urgently looking for alternatives. ..We pay a lot of money to Anthropic — so our goal is to reduce and ultimately eliminate that cost....We’re more focused on the Anthropic-style which is enterprise [use cases], developers, and coding" [BBG + FT]
CHART OF THE DAY: Perhaps the most important story in global markets / geopolitics right now.
China's oil imports plunged to ~6.6m b/d in May, according to @Vortexa data, down ~38% vs 2025 average (or ~4m b/d).
I wrote this @Opinion column in early May: https://t.co/XK71uh81m1
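As a quick consistency check on the two figures in the chart note (~38% and ~4m b/d), the implied 2025 average can be backed out from the May number. This is plain arithmetic on the quoted values; the function name is illustrative and nothing here comes from Vortexa's data itself.

```python
def implied_baseline(current_mbd, decline_fraction):
    """Back out the earlier level from the current level and the fractional decline."""
    return current_mbd / (1 - decline_fraction)

# 6.6m b/d and 38% are the figures quoted in the post.
baseline = implied_baseline(6.6, 0.38)
drop = baseline - 6.6
print(f"implied 2025 average: ~{baseline:.1f}m b/d, drop: ~{drop:.1f}m b/d")
```

The implied 2025 average is about 10.6m b/d and the implied drop about 4.0m b/d, so the percentage and the absolute figure agree.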
May I present to you the modern day version of NINJA loans.
The "no income, no job, no assets" wonders of finance.
They are alive and well.
40%+ of private credit middle market borrowers have negative earnings after interest, tax, working capital and capex.
Many of them are "asset-light" companies (which is a euphemism for "there is nothing to repossess if they default")
See this and weep (and ponder whether you get paid enough to take this measure of risk)
(source: FSB, link below)
"During Warsh’s term (if he lasts as long as prior Fed chairs), he will preside over the dreaded crossover point: in 2031 entitlements, interest and other mandatory outlays are projected to permanently exceed Federal tax revenues for the first time."
-JPM Cembalest
AN “UNKNOWN” COMPANY ACCIDENTALLY SPENT $500 MILLION IN A SINGLE MONTH ON ANTHROPIC’S AI TOOLS AFTER FORGETTING TO SET LIMITS FOR ITS EMPLOYEES