The Grid’s Beta is LIVE!
We can get your AI API costs down by up to 80% by making suppliers compete for your requests.
Your first 200M tokens are on us, start building
→ https://t.co/e97NlwzIQa
3/ The Grid standardizes inference into graded tiers with guaranteed spec. You pick the tier your workload needs. Suppliers compete to fill it.
You get the output at the best price, instead of brand names and expensive subscriptions.
Uber reportedly torched $3.4B of its AI budget in just four months.
The root cause: an absolute lack of a routing layer between the request and the model.
We are blindly throwing expensive frontier models at EVERY SINGLE TASK.
This is why I am excited about @The_GridAI.
It completely flips the script by turning AI inference into a live spot market.
Instead of hardcoding a single, expensive provider, you select a quality tier eg:
- text-standard
- text-prime
- text-max
The Grid then dynamically routes your request to the cheapest qualifying model 🔥
→ Drop-in migration via an OpenAI-compatible endpoint
→ Zero vendor lock-in for your applications
→ Automated, cost-saving routing based on task difficulty
→ Complete visibility into every token spent
Just for fun, I built a small @Streamlit demo to showcase its capabilities.
The app lays out the entire process:
✦ Header & Request:
↳ a clean UI to plug in an API key and choose your instrument (text-standard, text-prime, or text-max).
✦ Routing & Metadata:
↳ run a prompt and instantly inspect the backend
You can see the latency, prompt/completion tokens, and the exact model returned.
✦ Migration & Traceability:
↳ a live look at the code swap and how the market actually routed the request.
On the top left, you can enter your API key and try it out for yourself!
@The_GridAI just went live, and they are giving new accounts an incredible 200 million free tokens.
Demo app + open-source repo in the 🧵↓
Here's something most inference buyers don't have access to: a limit order.
'Fill my Text Prime order at $0.50 or less.'
If the market clears there, your job runs. If not, it waits.
Let the price come to you instead of the other way around.
Genius idea for AI inference!
A marketplace that routes requests to the cheapest qualifying model at any given point.
This can get you up to 87% cheaper inference!
Today, if you need a model, you pay the vendor's fixed rate card, but that's about to change with this:
For the devs asking how hard it is to switch, it's not.
The whole migration is 3 lines.
Suppliers compete for every request you send. Save up to 80% vs list price.
The migration is almost insulting.
Before:
model: "gpt-4o"
base_url: "https://t.co/423dN3ogwu"
After:
model: "text-prime"
base_url: "https://t.co/8ZUvQNMQdn"
Two lines changed.
Your client keeps running like nothing happened, except now your costs track the actual market instead of whatever a vendor felt like charging you.
🧵 3. The 3 text tiers and the 3 pricing
- Standard is for high-volume work where cost matters most, like classification, batch summarization, tagging, and simple extraction.
- Prime is the everyday production tier. This is where I’d put agents, RAG, drafting, support workflows, and quality-sensitive pipelines.
- Max is for the harder stuff, like long-context work, high-stakes reasoning, and tasks where a wrong answer can create real downstream cost.
The important part is that “cheapest” does not mean “random cheap model.”
Each tier has a quality threshold. The Grid checks models against benchmark floors anchored to Artificial Analysis. If a supplier falls below the required quality level for a tier, it gets removed from the eligible set. So the market competes on price, but only inside the quality bar you picked.
We get asked a lot: how much can I actually save?
Every instrument shows you the savings vs list price, in real time:
Text Standard: save up to 87%.
Text Prime: save up to 79%.
Text Max: save up to 18%.
Every time you call the API, suppliers compete to deliver your request.
@__sishir The way we think about it: set a floor on benchmark, latency, uptime, error rate, and once suppliers clear that, let them compete on price.
We're in Beta! First 200M tokens free for anyone who wants to try→ https://t.co/e97NlwzIQa
1/ If you burn a lot of tokens, this is for you.
The Grid is now live in beta.
Pick the quality and speed your workload needs, change 3 lines of code, and your API costs fall by up to 80%.
How we do it ↓
5/ Built for teams that watch cost per token and are tired of playing model whack-a-mole. Your first 200M tokens are on us. Start building→
https://t.co/e97Nlwzb0C
4/ Prices move with real supply and demand. The grade you bought is the contract suppliers conform to. Measured, not marketed. When a supplier slips, the market routes around them. You get the inference that meets the spec at the best available price.