Today, we’re excited to introduce Miso One, the most emotive voice model in the world.
Miso One is an 8-billion-parameter text-to-speech model for highly expressive speech generation. It emotes like a human and responds faster than a human, with just 110 milliseconds of latency.
We’ve open-sourced the model weights, with API access coming soon.
Hear how Miso One sounds in the thread below.
We might have a new top open-source music generator.
> Unified acoustic-token pipeline
> Full-song generation
> Text + lyric control for style, mood, vocals
https://t.co/zoZiWhLArB
> 1. Trillion-parameter MoE architecture: 1 trillion total parameters, with about 37 billion activated during inference. Inference is reported to be 35 times faster, with energy consumption reduced by 40%.
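To put the two parameter counts above in proportion, here is a minimal sketch that computes the share of the model that is active during inference. It uses only the two figures quoted in the post; the function name `active_fraction` is made up for this illustration.

```python
# Figures quoted in the post above.
TOTAL_PARAMS = 1_000_000_000_000   # 1 trillion total parameters
ACTIVE_PARAMS = 37_000_000_000     # about 37 billion activated during inference


def active_fraction(active: int, total: int) -> float:
    """Share of the model's parameters that are active during inference."""
    return active / total


fraction = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
print(f"Active share: {fraction:.1%}")
```

In other words, under these numbers only a few percent of the weights do work at inference time, which is the usual argument for why an MoE model can be cheaper to run than a dense model of the same total size.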
Upgrading your RAM is now unnecessary.
Introducing our new ComfyUI Dynamic VRAM optimization. Running local models is now possible on even the most memory-constrained hardware.
Read more here:
https://t.co/H1ZeYLlfzf
🚨 OpenAI charges $0.006/minute for transcription. Google charges $0.024/minute. AWS charges $0.024/minute.
Someone just open-sourced a tool that transcribes for $0. And it's faster than all of them.
It's called Insanely Fast Whisper. And that's not hype. That's the benchmark.
150 minutes of audio. 98 seconds to transcribe. On your own machine. No API key. No cloud. No per-minute billing.
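For scale, here is a rough sketch of what the same 150 minutes would cost at the API prices quoted above. It assumes all three rates are per minute of audio and ignores free tiers, volume discounts, and the cost of your own hardware; the names `RATES_PER_MINUTE` and `api_cost` are made up for this example.

```python
AUDIO_MINUTES = 150

# Per-minute prices as quoted in the post (assumed to be USD per audio minute).
RATES_PER_MINUTE = {"OpenAI": 0.006, "Google": 0.024, "AWS": 0.024}


def api_cost(minutes: float, rate_per_minute: float) -> float:
    """Cost in USD of transcribing `minutes` of audio at a flat per-minute rate."""
    return round(minutes * rate_per_minute, 2)


costs = {name: api_cost(AUDIO_MINUTES, rate) for name, rate in RATES_PER_MINUTE.items()}
for name, cost in costs.items():
    print(f"{name}: ${cost:.2f} for {AUDIO_MINUTES} minutes")
```

The per-job amounts are small; the difference only adds up at volume, since every extra hour of audio is billed again.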
Here's what the numbers look like:
→ Whisper Large v3 + Flash Attention 2: 150 min of audio in 98 seconds
→ Distil Whisper + Flash Attention 2: 150 min in 78 seconds
→ Standard Whisper without optimization: 31 minutes for the same job
→ That's a 19x speedup (31 minutes down to 98 seconds). Same model. Same accuracy. Just faster.
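The speedup figure can be checked directly from the timings listed above. This is a small sketch using only those numbers; `real_time_factor` and `speedup` are helper names invented here.

```python
AUDIO_SECONDS = 150 * 60  # 150 minutes of audio

# Wall-clock transcription times from the list above, in seconds.
RUNS = {
    "Whisper Large v3 + Flash Attention 2": 98,
    "Distil Whisper + Flash Attention 2": 78,
    "Standard Whisper without optimization": 31 * 60,
}


def real_time_factor(audio_s: float, wall_s: float) -> float:
    """Seconds of audio transcribed per second of wall-clock time."""
    return audio_s / wall_s


def speedup(baseline_s: float, optimized_s: float) -> float:
    """How many times faster the optimized run is than the baseline."""
    return baseline_s / optimized_s


baseline = RUNS["Standard Whisper without optimization"]
for name, wall in RUNS.items():
    print(f"{name}: {real_time_factor(AUDIO_SECONDS, wall):.1f}x real time, "
          f"{speedup(baseline, wall):.1f}x vs. baseline")
```

The 19x figure compares the two Whisper Large v3 runs (1860 seconds versus 98 seconds). The Distil Whisper run is faster still, but it is a different model.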
Here's what it does:
→ One command to transcribe any audio file or URL
→ Speaker diarization — knows WHO said WHAT
→ Transcription AND translation to other languages
→ Runs on NVIDIA GPUs and Mac (Apple Silicon)
→ Flash Attention 2 for maximum speed
→ Clean JSON output with timestamps
→ Works with every Whisper model variant
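As an illustration of working with the timestamped JSON, here is a hedged sketch that turns a transcript into readable lines. The JSON shape (a top-level `text` plus a `chunks` list whose items carry a `[start, end]` `timestamp` in seconds and a `text`) is an assumption based on the Hugging Face ASR pipeline format the tool builds on, and the sample data is invented. Check the actual output file before relying on it.

```python
import json

# Hypothetical sample, shaped like the assumed output format (not real tool output).
SAMPLE_OUTPUT = """
{
  "text": " Welcome to the show. Thanks for having me.",
  "chunks": [
    {"timestamp": [0.0, 1.5], "text": " Welcome to the show."},
    {"timestamp": [1.5, 3.2], "text": " Thanks for having me."}
  ]
}
"""


def format_timestamp(seconds: float) -> str:
    """Render seconds as HH:MM:SS.mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def to_lines(transcript: dict) -> list[str]:
    """One '[start -> end] text' line per chunk; a missing end time is shown as '?'."""
    lines = []
    for chunk in transcript["chunks"]:
        start, end = chunk["timestamp"]
        end_str = format_timestamp(end) if end is not None else "?"
        lines.append(f"[{format_timestamp(start)} -> {end_str}] {chunk['text'].strip()}")
    return lines


lines = to_lines(json.loads(SAMPLE_OUTPUT))
for line in lines:
    print(line)
```

The same loop is a reasonable starting point for exporting subtitles or searching a transcript by time range.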
Here's the wildest part:
https://t.co/WfJGCpSz09 charges $100/year. Rev charges $1.50/minute. Descript charges $24/month. Enterprise transcription contracts cost thousands.
Podcasters, journalists, researchers, lawyers, content creators — anyone still paying for transcription is lighting money on fire.
8.8K GitHub stars. 633 forks. MIT License.
100% Open Source.
(Link in the comments)