Crusoe is bringing compute resources to the DGX Cloud Lepton marketplace to meet AI demand. 🚀
Learn more and join early access! https://t.co/0qDZkEVEEi @NVIDIAAI
DGX Cloud Lepton is a new layer that standardizes AI inference across multiple cloud providers, offering a unified interface and automatic workload routing.
📣 Announcing a unified AI platform connecting developers to thousands of GPUs worldwide: NVIDIA DGX Cloud Lepton (Early Access).
Build, train, and deploy AI apps at scale—faster and easier than ever.
Learn more & join for early access: https://t.co/Ij5MnWrSDF
7/ NVIDIA and Hugging Face offer DGX Cloud Lepton for instant global GPU access.
Train, fine-tune, and deploy models at scale with ease. Fast, flexible, and collaborative.
🚨 NVIDIA launches DGX Cloud Lepton to commoditize inference compute across clouds, threatening neocloud margins.
DGX Cloud Lepton is a new layer abstracting inference compute across multiple neoclouds. It gives users a consistent interface while automatically routing workloads across providers.
→ The goal is to make inference compute a commodity, similar to what Uber did for taxi services. This strips differentiation from neoclouds and creates pricing pressure, reducing their margins.
→ Lepton’s real innovation is turning multi-cloud inference into a seamless, interoperable platform. It raises performance per dollar for users, while keeping NVIDIA’s margins untouched.
@NVIDIAAIDev
Nvidia nears a deal to buy Lepton AI, a GPU reseller, for several hundred million dollars. 💰 This move expands Nvidia's cloud and enterprise software push.
Read more: https://t.co/PeIhJrC9Ai
#Nvidia
We've achieved >99.5% uptime for large-scale GPU clusters through a great collaboration between @LeptonAI and @digitalocean. This is much better than industry-standard SLAs, which hover around 98%. It's done via proactive monitoring solutions like our open source GPUD, the cloud-native platform, and close collaboration between the engineering teams. Learn more at https://t.co/iFRxwAVwAD, and shoot a message to [email protected] if you need high-performance, cloud-native, production-grade AI infra!
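For a sense of scale, the two uptime figures quoted above can be converted into allowed downtime per year. This is a minimal sketch, not part of the original post: the function name `downtime_hours_per_year` is hypothetical, and it assumes a 365-day year and that the percentages are measured over the whole year.

```python
# Assumption: a non-leap year of 365 days, measured in hours.
HOURS_PER_YEAR = 365 * 24


def downtime_hours_per_year(uptime_percent: float) -> float:
    """Return the maximum hours of downtime per year implied by an uptime percentage."""
    return HOURS_PER_YEAR * (1 - uptime_percent / 100)


# The 98% and 99.5% figures are the ones quoted in the post above.
for pct in (98.0, 99.5):
    hours = downtime_hours_per_year(pct)
    print(f"{pct}% uptime -> up to {hours:.1f} hours of downtime per year")
```

Under these assumptions, 98% uptime allows roughly 175 hours of downtime a year, while 99.5% allows roughly 44 hours.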
Talk to Llama 3.2-3B 🦙🗣️⚡️
Powered by @LeptonAI (blazing fast LLM inference, ASR, and TTS all in one!) and @Gradio's ergonomic WebRTC Streaming ⚡️
Building this took me about 30 minutes despite never using Lepton before.
After partnering with #DigitalOcean, @LeptonAI has achieved more than 99.9% uptime and quick turnaround times for collaboration between teams, and its CEO, Yangqing Jia, is realizing his goal of growing 10x over the next year. 🚀
Watch to learn how ⤵
https://t.co/t3Qs3ZLWws
We are so proud to announce our extended partnership with FastGPU @fast_gpu via AI OG innovators, the mighty LeptonAI @LeptonAI.
Now you can deploy on-demand RTX 4090s with Enterprise AI Infrastructure IN SECONDS with Exabits on FastGPU. Just pay for what you use, as you go.
~a thread~
Introducing Samba-CoE v0.3, our latest Composition of Experts (CoE) model that surpasses DBRX by @DbrxMosaicAI and Grok-1 314B by @xAIGrokInu on the OpenLLM Leaderboard @huggingface! 🏆 Samba-CoE-v0.3 is now available on @LeptonAI @jiayq, try now: https://t.co/8IzdQoJpZh. #AI
.@LeptonAI surpasses all other providers in throughput (P50 & P90) for both Llama-2-70B and Mixtral on a small service load for short-input, long-output prompts. A P50 of 130 tokens/s is the fastest throughput we've observed among all model offerings by all providers.
View this scenario live:
https://t.co/vHsamRqSA9