This is one of the craziest AI launches of 2026 and it came out of basically nowhere (Save this).
A company called Subquadratic just shipped SubQ, and the benchmarks are almost hard to believe.
To understand why this is such a big deal, you have to understand the fundamental problem that has defined AI for the last decade.
Nearly every major large language model is built on the transformer architecture, and transformers use a mechanism called standard attention that checks every single token in a sequence against every other token.
Double the context length and compute doesn't double, it quadruples. Triple it and compute goes up nine times.
This quadratic scaling is why frontier models have been stuck at roughly 1 million tokens, why running them at those lengths gets expensive fast, and why the AI labs have essentially been printing money charging you more the longer you need the model to think.
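To make the quadratic math concrete, here is a minimal sketch that counts token-to-token comparisons in standard attention. The 128,000-token baseline is an arbitrary assumption for illustration, and the count ignores real-world details like heads, layers, and kernel optimizations.

```python
def dense_attention_pairs(n_tokens: int) -> int:
    """Token-to-token comparisons when every token checks every other token."""
    return n_tokens * n_tokens


BASE_CONTEXT = 128_000  # hypothetical baseline context length, for illustration only

for factor in (1, 2, 3):
    pairs = dense_attention_pairs(BASE_CONTEXT * factor)
    ratio = pairs / dense_attention_pairs(BASE_CONTEXT)
    print(f"{factor}x context -> {ratio:.0f}x attention compute")
```

Under this simple count, 2x the context gives 4x the comparisons and 3x gives 9x, matching the numbers above.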
The industry has known about this problem since 2017 and scaled transformers anyway. SubQ is built from the ground up to solve it.
Instead of processing every possible token relationship, SubQ's sparse attention architecture identifies which relationships actually matter and ignores the rest, meaning compute is spent where it counts and nowhere else.
The result is that compute scales linearly with context length instead of quadratically, and the implications of that one architectural shift are enormous.
At 12 million tokens, SubQ reduces attention compute by nearly 1,000x compared to standard frontier models, and at 1 million tokens, it runs 52x faster than FlashAttention.
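Here is a rough sketch of why sparse attention can scale linearly. It assumes a toy model in which each token attends to a fixed budget of K other tokens. How SubQ actually selects relationships is not described here, and K = 12,000 is a hypothetical value chosen only so the toy ratio lands near the 1,000x figure at 12 million tokens.

```python
def dense_pairs(n_tokens: int) -> int:
    """Standard attention: every token compared against every token."""
    return n_tokens * n_tokens


def sparse_pairs(n_tokens: int, budget: int) -> int:
    """Toy sparse attention: each token compared against at most `budget` tokens."""
    return n_tokens * min(budget, n_tokens)


K = 12_000  # hypothetical per-token budget, NOT a published SubQ parameter

for n in (128_000, 1_000_000, 12_000_000):
    reduction = dense_pairs(n) / sparse_pairs(n, K)
    print(f"{n:>12,} tokens: ~{reduction:,.0f}x fewer comparisons")
```

With a fixed budget, doubling the context doubles the sparse comparison count, so the savings over dense attention grow with context length (n / K in this toy model).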
And it does all of this while posting frontier-level accuracy, scoring 95% on the RULER 128K long-context benchmark versus Claude Opus 4.6's 94.8%, and 81.8 on SWE-Bench Verified coding tasks, besting Opus 4.6 (80.8) and DeepSeek 4.0 Pro.
The cost comparison is where it gets genuinely insane.
SubQ runs at under $1.50 per million tokens, less than 5% of what Claude Opus charges.
On the RULER benchmark, running the test with SubQ cost $8, while running the same test with Claude Opus cost $2,600. That's a more than 300x cost reduction at equivalent or better accuracy.
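A quick sanity check on that ratio, using only the two RULER dollar figures quoted above (variable names are illustrative):

```python
subq_ruler_cost_usd = 8       # quoted cost of the RULER run on SubQ
opus_ruler_cost_usd = 2_600   # quoted cost of the same run on Claude Opus

cost_reduction = opus_ruler_cost_usd / subq_ruler_cost_usd
print(f"Cost reduction: {cost_reduction:.0f}x")  # prints "Cost reduction: 325x"
```

The quoted figures work out to 325x, in line with the 300x claim.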
Subquadratic launched with $29 million in funding. SubQ is available today for early access via API, and SubQ Code, a coding agent built on the architecture, ships alongside it.
The transformer has been the unchallenged foundation of every major AI system since 2017.
SubQ is the first serious evidence that something structurally better might have just arrived.
Introducing SubQ - a major breakthrough in LLM intelligence.
It is the first model built on a fully sub-quadratic sparse-attention architecture (SSA),
and the first frontier model with a 12 million token context window, which is:
- 52x faster than FlashAttention at 1M tokens
- Less than 5% the cost of Opus
Transformer-based LLMs waste compute by processing every possible relationship between words (standard attention).
Only a small fraction actually matter.
@subquadratic finds and focuses only on the ones that do.
That's nearly 1,000x less compute and a new way for LLMs to scale.