Announcing MOSS-TTS-PNY 1.7B v0.1
Finetune of the MOSS-TTS-Local model with a fixed speaker embedding, plus a custom iSTFTNet3 vocoder that intercepts the MOSS tokenizer's features and outputs 48 kHz audio.
Runs 1.8x realtime on a single RTX 5090 w/PyTorch:
(see replies)
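
For anyone wanting to check a speed figure like this on their own hardware, here is a minimal sketch of how a realtime factor can be computed. It assumes "1.8x realtime" means 1.8 seconds of audio produced per second of wall-clock time. The function name and the example numbers are hypothetical and not taken from the model's code.

```python
SAMPLE_RATE_HZ = 48_000  # output sample rate stated in the post


def realtime_factor(num_samples: int, wall_seconds: float,
                    sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Seconds of audio produced per second of wall-clock time."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    audio_seconds = num_samples / sample_rate_hz
    return audio_seconds / wall_seconds


# Hypothetical numbers: 18 s of 48 kHz audio generated in 10 s of wall-clock time.
rtf = realtime_factor(num_samples=18 * SAMPLE_RATE_HZ, wall_seconds=10.0)
print(f"{rtf:.1f}x realtime")
```

When timing GPU inference in PyTorch, the wall-clock measurement should be taken after `torch.cuda.synchronize()`, since CUDA kernels launch asynchronously.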