Today, we're excited to introduce Miso One, the most emotive voice model in the world.
Miso One is an 8-billion-parameter text-to-speech model for highly expressive speech generation. It emotes like a human and responds faster than a human, with just 110 milliseconds of latency.
We've open-sourced the model weights, with API access coming soon.
Hear how Miso One sounds in the thread below.
RT @433: 🚨 [...] 🥹
- Sir Matt Busby Player of the Year
- United Players' Player of the Year
- FWA Footballer of the Year
- …