Today we’re sharing how we built Thalamus: Cerebrium’s highly available distributed router for global AI workloads.
It helps power low-latency, resilient AI applications across fragmented compute for companies like @tavus, @superwhisper, and @useCamb_AI.
Read more: https://t.co/QKX7iwNzTI
If voice AI is core to your product, treating the LLM as a permanent black box is leaving UX and margin on the table.
Happy to chat if you’re thinking about making this shift.
Most voice AI teams delay this decision far longer than they should - and eventually hit the same wall.
The LLM ends up being the single biggest driver of both latency and cost.
This typically only starts to make sense at ~15k+ calls per day.
But that’s also when customers start noticing.
Lower latency shows up immediately in:
• Conversation flow
• Turn-taking
• Perceived intelligence
The bigger signal...
The winners won’t just be the models with the best benchmarks - but the platforms that deliver:
• Low latency
• Reliability
• Cost efficiency
• Great developer experience
• Production-ready infra
Infrastructure is the moat.
And that’s exactly what we’re focused on solving at @cerebriumai.
One of the more interesting AI reports I’ve read recently is @openrouter's State of AI - because it’s based on real production usage at scale, not benchmarks or demos.
A few takeaways 🧵👇
You can read the full report here: https://t.co/TSw9FQPr2f
5. Open-source models now represent ~30% of total usage.
Driven by:
• Cost
• Flexibility
• Faster iteration cycles
I’d be surprised if this number doesn’t grow meaningfully in 2026.
6. Early model fit creates lock-in.
When users find a model that fits their workflow early, they stick with it.
@xai just raised a $20B Series E at a $230B valuation - and I’m still bullish.
Back in April 2024, when xAI raised ~$6B at an $18B pre-money valuation, many questioned the lack of traction and clarity. 18 months later, execution has answered most of that.
Infra matters. xAI built Colossus, one of the world’s largest AI supercomputers, in record time.
Outside of Google, very few labs own their full stack - OpenAI and Anthropic still rely heavily on cloud partners.
Distribution + data are underrated advantages.
xAI already reaches ~600M active users via X, competitive with ChatGPT at ~900M, and its GTM is still nascent. It also has proprietary real-time data no one else can access.