“The future of AI is going to be local models running on extraordinary desktop hardware.” - @Jason
This line from the recent @theallinpod hit hard.
For years AI meant sending everything to the cloud, paying per request, and hoping for the best on latency and privacy.
That era is ending.
@Apple Silicon, AI PCs, and high-memory desktops are shifting the game. Local inference brings lower latency, zero marginal cost, real privacy, and actual user control.
At @RunAnywhereAI, we're building for exactly this future: AI apps that run close to the user, understand private context, and work across devices without shipping your data to a third-party cloud.
The next wave won't just be won by the biggest models. It will be won by the models that run where the user is.
What workloads do you think go local first?
#localai #inference #runanywhere #edgeai
Microsoft just canceled internal Claude Code licenses because the bills got out of control.
Months ago, it was the hero tool. Thousands of employees using it. Teams told to build faster, ship faster, prototype faster.
Then the invoice showed up.
Usage exploded. Tokens exploded. Bills exploded. Finance noticed. Licenses started disappearing.
This isn't a Claude problem. It's the next 24 months of every AI rollout.
Everyone wants agents. Everyone wants AI in every workflow. Nobody is asking the actual question:
What happens when every employee uses AI all day, every day?
Cloud AI is cheap when usage is low. Agentic AI breaks that math. The more your deployment succeeds, the less you can predict next month's bill.
You're not paying for a model anymore. You're paying for every token, every request, every loop, every retry, every agent talking to another agent.
And you're handing the most critical part of your stack to a vendor whose pricing, limits, and policies you don't control.
The future is AI everywhere. That doesn't have to mean AI in someone else's cloud.
This is why we're building @RunAnywhere.
Run inference on your own hardware. Keep sensitive data in-house. Kill the unpredictable usage bill. Own the infrastructure instead of renting it back from a hyperscaler.
AI shouldn't grow into a line item that scales faster than your revenue.
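Here's a rough sketch of that math in Python. Every number in it is a made-up placeholder (headcount, tasks per day, loop steps, retry rate, tokens per step, price per million tokens), not real pricing or real usage from any vendor or company. The names `AgentWorkload`, `monthly_tokens`, and `monthly_cost` are hypothetical too. The point is only the shape: the bill is a product of factors, and a successful rollout grows several of them at once.

```python
from dataclasses import dataclass


@dataclass
class AgentWorkload:
    """Hypothetical description of an agentic deployment (all values are assumptions)."""
    employees: int          # people using the agent
    tasks_per_day: int      # tasks each person hands to the agent per workday
    steps_per_task: int     # agent loop iterations (model calls) per task
    retry_rate: float       # fraction of steps that get retried once
    tokens_per_step: int    # input + output tokens per model call
    workdays: int = 22      # workdays per month


def monthly_tokens(w: AgentWorkload) -> float:
    """Total tokens per month: every loop step and every retry is billed."""
    steps = w.employees * w.tasks_per_day * w.steps_per_task * w.workdays
    return steps * (1 + w.retry_rate) * w.tokens_per_step


def monthly_cost(w: AgentWorkload, usd_per_million_tokens: float) -> float:
    """Monthly bill at a flat per-token price (a simplifying assumption)."""
    return monthly_tokens(w) / 1_000_000 * usd_per_million_tokens


if __name__ == "__main__":
    price = 10.0  # placeholder USD per million tokens, not a real quote

    pilot = AgentWorkload(employees=50, tasks_per_day=5, steps_per_task=4,
                          retry_rate=0.0, tokens_per_step=2_000)
    rollout = AgentWorkload(employees=5_000, tasks_per_day=20, steps_per_task=10,
                            retry_rate=0.25, tokens_per_step=2_000)

    for name, w in [("pilot", pilot), ("rollout", rollout)]:
        print(f"{name}: ${monthly_cost(w, price):,.0f} per month")
```

Swap in your own numbers. Headcount, tasks, steps, and retries all multiply together, so small growth in each one compounds in the total.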
@Snorlxz 10. @ShubhamMal72313 from RunAnywhereAI is building natively optimised inference and infrastructure for SLMs, and will be at the Delhi meetup.
just tried this out and it one-shotted* this video: "before the agent does anything"
*i generated the narrative using chatgpt and used that as a prompt. featuring: @e2b @runanywhereai @composio @mem0ai @firecrawl @browser_use @agentmail @covenantlabsai
some thoughts:
- i clearly tried to cram too much into 30 seconds: the characters talk very fast and some content got lost, which breaks the logic
- character consistency is strong, i uploaded a single screenshot from my prior video as reference
- voice consistency was not automatic. you can hear the unicorn switch from a female to a male voice partway through
- the agent gives you an editor with generated scenes broken up but i don't see a way to regenerate a single section in the UI (which would be nice)
- it is definitely a much better experience to have the agent stitch videos together than doing it yourself (i was using canva). was trying @flymy_ai's media agent api for it this weekend which also works well and with other models
Ambrosia is an excellent on-device AI journaling application that runs fully locally for complete privacy, speed, and reliability. It enables seamless offline tracking and AI-powered insights directly on the device.
Integration was handled using the @RunAnywhereAI SDKs. Thanks to @_amankishore for building and sharing this.
David Friedberg speaking straight FACTS on @ChrisWillx's latest pod:
“This whole thing of data centers needs to be stopped. I actually don't think that data centers are going to have much to do with the benefits we're going to realize. So much of AI is going to sit at the edge. It's going to sit in embedded devices. It's going to sit on your desktop computer. It's going to sit on your iPhone. It's going to be ubiquitous. It's going to be everywhere.”
Thx @friedberg
That’s exactly why we’re building @RunAnywhereai: the on-device AI platform that actually lets models run blazing fast on the hardware people already own.
(And yes, that’s me in the photo with the legend himself at the All-In holiday party. Vision officially locked in.)
Edge AI isn’t coming. It’s here.
#runanywhere #modernwisdom #allinpod
Launching Inference Radar: our new weekly newsletter that tracks the top 130+ inference repositories, monitoring every commit, release, code change, and emerging trend across the ecosystem, then distills it all into one clear briefing.