Runpod Flash is GA ⚡️
Flash is a Python SDK that lets you define infrastructure and deploy AI workloads directly from your terminal.
Check it out: https://t.co/RV19HIhxhg
Blog: https://t.co/JDMr3w9MW0
The bottleneck isn't GPUs anymore.
It's the three to five clouds most AI teams stitch together to get a model to production.
So as of today, we'll start calling Runpod the AI Developer Cloud.
Read Zhen's take on this through the link below.
https://t.co/gBaKNhZtNL
Today @MeckaAI is announcing $60M in funding to become the data and deployment layer for physical AI
This raise will allow us to scale our data infrastructure, invest into new verticals, and deploy robots into the real world
"The AI market looks nothing like the narrative."
Our CTO, Brennen Smith, went on TFiR to talk about what we're actually seeing in production.
Some of it will confirm what you've heard. A lot of it won't.
Curious what the data actually shows?
Find the full interview here: https://t.co/3OjyDogxTA
We just launched Multi-Instance GPU (MIG) on Runpod Serverless.
It partitions the RTX 6000 Pro into isolated 24 GB instances, each with dedicated memory and compute.
So if your workload fits in 24 GB, now you can pay for 24 GB.
Read the blog post to get started: https://t.co/p50r1UeeDM
Most teams running vLLM are using default settings.
That's a 2-3x cost penalty with no performance upside.
We benchmarked the configurations that actually matter, vLLM and SGLang, and turned them into a playbook.
Settings, benchmarks, and copy-paste templates, all that stuff.
Get the full playbook here: https://t.co/NueWqxm5NM
introducing https://t.co/45y9MyH2cQ
- inference server based on @huggingface transformers.js
- OpenAI-compatible
- runs on mac, linux & win via cuda, coreml, dml, webgpu, wasm, cpu
- tested llms from @liquidai, @Alibaba_Qwen & @GoogleDeepMind
- embeddings & speech-to-text
- works with @NousResearch Hermes
- built in ts
- open source as MIT
- the first ever project from @runpod labs
https://t.co/SzB0IZ0bdZ
It is crazy that now using the @googlegemma family of models, autoresearch, Codex or Claude Code, @elves_skill, @UnslothAI, and $20 to spend on @runpod, you can now get an industrial quality model that can be deployed on CPU done over a weekend with minimal observation even needed
I actually trained two such models this weekend, one using Gemma 4 and the other using T5 Gemma 2
Not that long ago, these models would’ve been $10k-$50k and would’ve taken several people months
The fact that the price and time have dropped so dramatically opens up all sorts of new use-cases that previously wouldn’t have made sense economically
Wild times
Runpod just shipped Cost Centers. Now you get native spend tracking by team, project, or department, right in the console.
Label any resource (Pods, Serverless, Network Volumes, Instant Clusters) and get per-label spend breakdowns on your monthly invoice. Unlabeled resources auto-group by creator.
If you're running GPU workloads across multiple teams or projects and tired of spreadsheet reconciliation, this is a big one.
Now in public beta for Runpod Teams.
We put a chat interface in the Runpod console.
23 tools across the full REST API.
If you can do it in the dashboard, you can ask for it in chat.
Find it in the console ;)
Runpod is now natively supported in @transformerlab!
Add your API key and get experiment tracking, automatic checkpointing with failure recovery, persistent artifact storage, and interactive sessions (Jupyter, VSCode, vLLM) on your Runpod GPUs.
Get started 👇
🚀 Support for Runpod is live on Transformer Lab for Teams.
Add your Runpod API key and start running workloads on Transformer Lab for Teams using Runpod instances.
What you can do:
⚡ Queue workloads to run automatically or reserve an on-demand instance with Jupyter, VSCode and vLLM on dedicated Runpod GPUs
🧪 Submit training and eval jobs with built-in experiment tracking
🔄 Automate checkpointing and failure recovery. If an instance drops, your job restarts from the last saved checkpoint
💾 Store artifacts persistently, so model weights and eval results are accessible after the Runpod instance terminates
🔗 Supports SLURM and SkyPilot so teams that use Runpod alongside on-prem clusters can manage everything from a unified interface
Get started here: https://t.co/6pwxngVtso