Control plane for agents & engineers to provision compute and run training & inference across NVIDIA, AMD, and other chips — on clouds, Kubernetes, and on-prem.
Infrastructure orchestration is becoming an agent skill.
When agents run experiments, training, and evals autonomously, they can now provision compute, schedule workloads, and track state, all inside the loop.
New post on what this means for platform teams, providers, and the orchestration layer:
https://t.co/T2jS9FLCOo
dstack 0.20.22 is out 🚀
Tenstorrent Blackhole support has landed.
Run dev, training, and inference workloads on Blackhole PCIe cards and systems.
Thanks to the @tenstorrent team and @artem_aero for collaborating on this.
https://t.co/fX5qHA7pCS
Most GPU workflows come with too much infra overhead.
Spin up a VM, install dependencies, copy code over, expose ports, watch logs, and remember to shut everything down after.
With JarvisLabs now available as a @dstackai backend, most of that goes away.
You define the machine and the workload in a YAML file. Run dstack apply. That's it.
dstack manages the infrastructure lifecycle. JarvisLabs provides the GPUs.
We wrote a short tutorial covering the full setup, from connecting your account to running a nanochat training job on H100s.
Works for training runs, evals, benchmarks, inference services, and GPU dev environments.
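A minimal sketch of what such a YAML file might look like, assuming a dstack task configuration. The run name, the train.py script, and the file name below are hypothetical placeholders, not taken from the tutorial:

```yaml
# train.dstack.yml (hypothetical file name)
type: task
# Hypothetical run name
name: nanochat-train
python: "3.12"
commands:
  # Placeholder command; substitute your own training entry point
  - python train.py
resources:
  # Request a single H100; adjust to what your backend offers
  gpu: H100:1
```

Submitting it would then be `dstack apply -f train.dstack.yml`.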
Training models or serving inference on AMD GPUs?
We’ve refreshed the AMD accelerator example in the dstack docs, covering on-prem fleets, cloud GPU provisioning, dev environments, training jobs, and production-grade inference.
https://t.co/WffI8cKY7t
@tenstorrent @artem_aero The release also adds backend options specific to @vast_ai for more precise GPU offer filtering.
Full release notes:
https://t.co/fX5qHA7pCS
dstack 0.20.21 is out 🚀
⚡ Kubernetes: multiple clusters via kubeconfig contexts, each exposed as a backend region
⚡ New @jarvislabsai backend: H100/H200 on demand now; RTX PRO 6000 to follow
https://t.co/ORAtgzq7HJ
The community asked us for an example of how to use @radixark Miles with dstack for RL training.
Since Miles uses Ray and dstack can run Ray, using Miles with dstack is quite straightforward.
Here’s a new example of running Miles on a multi-node cluster provisioned and managed by dstack:
https://t.co/TOxsg8aBLr
JarvisLabs is now a @dstackai backend.
That means: if you've been using dstack to orchestrate runs across GPU clouds, you can now point it at our infrastructure with one YAML block.
type: jarvislabs
creds:
  api_key: ...
One YAML. One command. GPU running in a few seconds.
L4, H100, and CPU VMs, with more on the way.
dstack handles the instance lifecycle. We provide the GPU.
If you've been waiting for it, give it a spin.
Needed to test a small training run on @AMD MI300X.
Passed my @digitalocean API key to @dstackai, asked Codex to submit the run, done.
Suspiciously easy.
Read how to deploy PD-disaggregated inference on multi-node @AMD GPU clusters using @lmsysorg SGLang and dstack.
One config for an autoscaling endpoint across cloud or on-prem clusters, with @lightseekorg SMG routing and Mooncake KV transfer over RoCE.
https://t.co/CNrpK5shOd
High-performance LLM inference on @AMD GPU clusters is now much easier to deploy than many people think.
At @dstackai, we’re building an open-source orchestration layer for running inference across GPU vendors, clouds, and on-prem clusters.
It’s exciting to see what becomes possible with projects like @lmsysorg SGLang and SMG.
https://t.co/hCV5YrBEDO