🚀Introducing Emilia-Large: 200K+ Hours of Open-Source Speech Data!
We’re excited to release Emilia-Large, the largest TTS pretraining dataset! It offers 200K+ hours of multilingual speech data, is fully open-source, and is ready to use for #TTS and #SpeechLM.
🚀🚀🚀 MaskGCT (Masked Generative Codec Transformer), a zero-shot TTS model, is now open-sourced in Amphion. It is trained on Emilia and needs only 5 seconds of speech to clone a voice.
Paper: https://t.co/OdoQ3niCeY
HF: https://t.co/2mCZA9GLzD
Discord: https://t.co/FvmcJ5pm6z
Watch the MaskGCT demo
@mohamed17381489 @realamphion @xutan_tx Hi, since the codec is designed for the TTS task, we currently achieve VC only by replacing the speaker embedding in the codec decoder. We will improve it soon!
Amphion now supports FACodec, the core component of NaturalSpeech3, and the pretrained checkpoints are released.
Paper: https://t.co/bVbwpcTXBo
Checkpoints: https://t.co/CpF3zDArVZ
Demo: https://t.co/uFWnNEb309
Code: https://t.co/wUvqBoCHiD
@xutan_tx @yuancwang