Voice is the most important modality after text.
Speech-LLMs are the future.
Here is a chronological list of the important open-source milestones in the Speech-LLM space 📖.
A thread 🧵 ...
We designed Orpheus to be easily fine-tuned. On Hugging Face 🤗 there are over 200 fine-tunes of our model. People have made Orpheus:
- Speak dozens of languages 🌎
- Create personalised voice interfaces 🔊
- Recreate historical/fictional characters 📜
Today, we’re launching Orpheus, an open-source TTS model that exceeds the capabilities of both open and closed-source models such as ElevenLabs and OpenAI! (1/6)
Deploying and vibe checking Orpheus TTS, an open-source model for generating speech.
Our implementation supports up to 48 concurrent real-time users per H100 GPU!
People told us they want Orpheus TTS in production.
So we partnered with @baseten as our preferred inference provider!
Baseten runs Orpheus with:
• Low latency (<200 ms TTFB)
• High throughput (up to 48 real-time streams per H100)
• Secure, worldwide infra
Building an AI app with $1/hr AI voice, memory, tool-calling, and phone? That's what you use Gabber for
But time to value shouldn't be hours, it should be seconds
Build a sample AI companion in 20 seconds, try it, tune it, then go live with it (last part coming soon)
Thanks for the OrpheusTTS release!
Great models, easy to fine-tune, and you can even livestream using vLLM or SGLang on a single RTX 3090 with FP8 quantization. INT4 should be even faster. 😃