✨ bitsandbytes now has native ZeroGPU support since the new multi-backend refactor.
This shows how central @PyTorch has become in the AI landscape:
🔗 Extend torch the right way (custom ops, modes, etc.) and your software plugs seamlessly into a thriving ecosystem
Recently finished writing a new blogpost about @PyTorch compilation in ZeroGPU Spaces.
Worth reading if you're interested in learning about:
- PyTorch ahead-of-time compilation
- ZeroGPU internals
https://t.co/Jk3IwBlqEE
Wrote an FA3 attention processor for @Alibaba_Qwen Image using the 🤗 Kernels library. The process is so enjoyable!
Stuff cooking, stuff coming 🥠
https://t.co/qVtQkwoB6o
Felt frustrated when using `torch.compile` as it takes forever? 🤬
You SHOULD switch to regional compilation & see if it is just as beneficial as using full compilation.
Let the numbers (Flux.1-Dev) convince you 🫡
🚀 ZeroGPU v2 update
We switched to Nvidia H200 last week
It means that @huggingface Spaces are now equipped with:
- 🧠 70GB VRAM
- ⚡ 2.5x more FLOPS
🔓This will hopefully unlock unseen use cases
💰 It also makes the Pro plan a seriously cheap CUDA compute option
ComfyUI → @huggingface Spaces → serverless ZeroGPU ✨😌
We wrote a tutorial on how to turn any ComfyUI workflow into an easy-to-use Gradio app and (optionally) host it for free with ZeroGPU 💥
https://t.co/1Ij5nqA5rS
⚡We've just rolled out a major update on ZeroGPU!
Major improvements:
- 2x faster GPU coldstarts
- More efficient CPU memory usage (meaning more slots for the community)
- ZeroGPU initialization now displays a progress bar
- Greatly improved PyTorch compatibility
GPU-Poor no more: super excited to officially release ZeroGPU in beta today. Congrats @victormustar & team for the release!
In the past few months, the open-source AI community has been thriving. Not only Meta but also Apple, NVIDIA, Bytedance, Snowflake, Databricks, Microsoft, Google, and more have released open models and datasets on Hugging Face. The Hub now hosts over 1M models, which have been downloaded over a billion times. More than that, many are starting to be better than proprietary APIs.
This movement has been supported not only by big tech but also by a thriving open-source AI community that includes academic labs, startups, and independent hobbyists. For example, more than 35,000 variation models of Llama have been shared on Hugging Face since Meta’s first version a year ago—including more than 7,000 based on Llama-3—ranging from quantized and merged models to specialized models in biology and Mandarin, to name a few. More than 4 million AI builders are now using Hugging Face.
However, the open-source community doesn’t have the same resources to train and demo these models that big tech has at its disposal, which is why ChatGPT remains the most used AI application today.
@huggingface is fighting this by launching ZeroGPU, a shared infrastructure for indie and academic AI builders to run AI demos on Spaces, giving them the freedom to pursue their work without the financial burden of compute costs. Spaces have been the most popular way to build AI demos, with over 300,000 AI demos created so far on CPU or paid GPU (and a thousand more every day). To foster the continued development of the AI ecosystem, Hugging Face is committing $10M of free GPUs with the launch today of ZeroGPU.
Technically speaking, ZeroGPU leverages Hugging Face's experience in hosting and serving more than 100 Petabytes monthly from the Hugging Face Hub. ZeroGPU allows Spaces to run on multiple GPUs by making Spaces efficiently hold and release GPUs as needed (as opposed to a classical GPU Space that holds exactly one GPU at any time). This architecture is also more energy-efficient since GPUs are shared rather than duplicated. ZeroGPU uses @nvidia A100 GPU devices under the hood.
You can learn more about ZeroGPU here: https://t.co/1mxUxXmElv
More than 1,300 ZeroGPU spaces have been built since we started giving early access to AI builders on May 1, 2024: https://t.co/XvJ2MkcK7R
You can explore some examples from @victormustar: https://t.co/b8SUcRelJf
You can find the article from @kyliebytes: https://t.co/87uN1vnMu8
🤗🤗🤗
@yikesawjeez @Xianbao_QIAN @huggingface You don't need to ask for grants with ZeroGPU, that makes a big difference (I just approved your HF profile on https://t.co/oKuw0pBG61, you should now have access)
@realmrfakename @Xianbao_QIAN @huggingface We’ll update the name, it’s an error and it should display “Zero Nvidia A100” (just try checking the output of nvidia-smi inside a function decorated with @spaces.GPU!). Have fun with ZeroGPU
@huggingface ZeroGPU Spaces are now running on A100!
https://t.co/zqK9h8sNgT
By adding a simple annotation, your Spaces with grants are able to run:
- on multiple GPUs
- on on-demand GPUs, released as needed
Come join the org and start making awesome demos on many A100s!
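For anyone wondering what that annotation looks like, here is a minimal sketch. It assumes the `spaces` package and its `spaces.GPU` decorator as available on Hugging Face Spaces; the local stand-in class and the `gpu_report` function name are hypothetical, added only so the snippet also runs off Spaces. It performs the nvidia-smi check suggested in an earlier reply.

```python
import shutil
import subprocess

try:
    import spaces  # assumption: the `spaces` package available on Hugging Face Spaces
except ImportError:
    class spaces:  # hypothetical local stand-in, NOT the real library
        @staticmethod
        def GPU(func):
            # No-op decorator: just return the function unchanged.
            return func


@spaces.GPU  # on ZeroGPU, a GPU is held only while this function runs
def gpu_report() -> str:
    """Return the GPU name reported by nvidia-smi, or a note if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return "nvidia-smi not found (no NVIDIA driver visible here)"
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    # Fall back to stderr so a failing nvidia-smi still yields a readable message.
    return result.stdout.strip() or result.stderr.strip()


if __name__ == "__main__":
    print(gpu_report())
```

Code outside the decorated function should not expect a GPU to be attached, so the usual pattern is to keep all CUDA work inside it.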
Model downloads from @huggingface Spaces should be way faster overall
This Llama-13B Space (https://t.co/JiWbn2EuOB) now boots in 30s instead of 15min. That's a 30x speedup
It is running on new Space hardware 🤫 but you can expect decent speed-ups on regular Spaces too