Releasing Brontes: A modified Wave U-Net architecture for audio super-resolution. This one is trained to operate on NeuCodec outputs. I'm releasing a general 30M checkpoint on a variety of speech.
See links in replies.
All on MI300X thanks to @HotAisle @AIatAMD
It works! I made it a decoder-only TTS model. Text then mel with the head.
This one is just 52M param, trained from scratch on LJSpeech (20 hours).
The audio quality is shitty because I'm inverting the mel spectrogram with Griffin-Lim.
I'm running an experiment. With AR transformers for speech, do we need a tokenizer, or can we get away with predicting the mel spectrogram directly?
This unconditional transformer predicts a latent which then goes into a 1D causal conv that predicts the next 4 mel frames.
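A rough sketch of what that head could look like, in plain Python so it runs anywhere. The post only says the latent goes into a 1D causal conv that predicts the next 4 mel frames; everything else here is an assumption for illustration: the tiny sizes (`N_MELS`, `LATENT_DIM`, `KERNEL`), the random weights, and the choice to emit the 4 frames as 4 groups of output channels per latent step. The helper names are hypothetical too.

```python
import random

FRAMES_PER_STEP = 4  # from the post: each latent predicts the next 4 mel frames
N_MELS = 8           # assumption: toy size (real mel spectrograms often use 80 bins)
LATENT_DIM = 6       # assumption: toy transformer output width
KERNEL = 3           # assumption: causal conv kernel size


def causal_conv1d(x, weight, bias):
    """Causal 1D convolution over time.

    x:      list of T input vectors, each of length C_in
    weight: nested list shaped [C_out][K][C_in]
    bias:   list of length C_out
    Output step t only sees x[t-K+1 .. t]; the past is zero-padded on the left.
    """
    k = len(weight[0])
    c_in = len(x[0])
    padded = [[0.0] * c_in for _ in range(k - 1)] + x
    out = []
    for t in range(len(x)):
        window = padded[t:t + k]  # window[j] is x[t - k + 1 + j]
        row = []
        for o in range(len(weight)):
            acc = bias[o]
            for j in range(k):
                for c in range(c_in):
                    acc += weight[o][j][c] * window[j][c]
            row.append(acc)
        out.append(row)
    return out


def latents_to_mel(latents, weight, bias):
    """Map T latent steps to T * FRAMES_PER_STEP mel frames.

    Assumption: the conv has FRAMES_PER_STEP * N_MELS output channels,
    which are split into FRAMES_PER_STEP consecutive frames per step.
    """
    frames = []
    for row in causal_conv1d(latents, weight, bias):
        for f in range(FRAMES_PER_STEP):
            frames.append(row[f * N_MELS:(f + 1) * N_MELS])
    return frames


def random_params(seed=0):
    """Random stand-in weights; a trained model would learn these."""
    rng = random.Random(seed)
    c_out = FRAMES_PER_STEP * N_MELS
    weight = [[[rng.uniform(-0.1, 0.1) for _ in range(LATENT_DIM)]
               for _ in range(KERNEL)]
              for _ in range(c_out)]
    bias = [0.0] * c_out
    return weight, bias


if __name__ == "__main__":
    rng = random.Random(1)
    latents = [[rng.uniform(-1, 1) for _ in range(LATENT_DIM)] for _ in range(5)]
    weight, bias = random_params()
    mel = latents_to_mel(latents, weight, bias)
    print(len(mel), "mel frames of", len(mel[0]), "bins")
```

The point of the left padding is that changing a later latent never changes earlier mel frames, which is what lets the head sit on top of an autoregressive transformer at generation time.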
It is funny to see the academic TTS research community gradually discover techniques we horsefuckers already were using circa 2021 via indigenous ways of knowing.
Wait...
If Mythos is so good at AI research that they crippled it for the public, since xAI gave Anthropic a metric fuckton of GPUs, do they get to use uncensored Mythos to improve Grok?
A lot of people diss Marco Pierre White for being a Knorr shill, but I think stock pots/cubes are legit a good product and time saver.
Also, he made accessible, non-pretentious cooking videos. In one he straight up said if you don't want to chop garlic, from a tube is fine.
And as I am otherwise a 39-year-old single, unmarried male, seeking to find my female partner in life to marry, and create a family with, when I imagine myself as your possible father: I don't even think I would be angry or upset at you or with the situation of my own daughter *being* an @Aella_Girl in the world.
I find you attractive. In fact, your unabashed honesty in talking about, researching, and sharing about your life and your working world is all part of this. It signals high intelligence, openness, and vulnerability to me. All traits I find very attractive.
But as I imagine the nonexistent possible world in which I happened to have (via my wife) birthed a literal @Aella_Girl-you, as you are now, I still know there's a part of me that would be perturbed.