an audio model that can switch languages, accents, tone, gender, emotions in a single instance
this is silk mulberry 1.5, our most cost-efficient model!
by @rumik_ai research lab
An Indian AI lab just dropped the world's first real-time voice model that allows you to handcraft the voice of the speaker.
You choose the speaker's age, gender, emotional register, timbre, pitch, accent, and code-switching behavior.
The model is also VERY fast and affordable to run.
@lets_dig_deeper Just experienced Silk Mulberry 1.5. It sounds buttery smooth and genuinely natural!
I've tried dozens of voice agents, but most feel robotic and artificial. This one actually feels alive.
DM'd you my detailed feedback. Would love your thoughts! 🔥
this is something we've been working on for a while now, especially @null_hawk. we also have insights from pre-training in the report, and hopefully we'll put out more on interpretability soon!
launching silk mulberry 1.5
one of the fastest multilingual voice models in the world
it matches the best voice models on quality benchmarks (MOS, mean opinion score)
all this at more than 95% lower cost: ₹0.40/min (~$0.0046/min)
try now 👇
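
A quick sanity check of the pricing quoted in the launch post, as a minimal sketch. Only the three quoted figures (₹0.40/min, ~$0.0046/min, "more than 95% lower") come from the post; the function names are hypothetical, and the baseline it derives is just the arithmetic consequence of the claim, not a price the post states.

```python
# Figures quoted in the launch post.
PRICE_INR_PER_MIN = 0.40
PRICE_USD_PER_MIN = 0.0046
CLAIMED_SAVING = 0.95  # "more than 95% lower cost"


def implied_exchange_rate(inr: float, usd: float) -> float:
    """INR per USD implied by quoting the same price in both currencies."""
    return inr / usd


def implied_min_baseline(price: float, saving: float) -> float:
    """Smallest comparison price for which `price` is `saving` cheaper.

    If price = baseline * (1 - saving), then baseline = price / (1 - saving).
    """
    return price / (1.0 - saving)


def cost_for_minutes(minutes: float, price_per_min: float = PRICE_INR_PER_MIN) -> float:
    """Total cost of generating `minutes` of audio at a flat per-minute price."""
    return minutes * price_per_min


if __name__ == "__main__":
    rate = implied_exchange_rate(PRICE_INR_PER_MIN, PRICE_USD_PER_MIN)
    baseline = implied_min_baseline(PRICE_INR_PER_MIN, CLAIMED_SAVING)
    print(f"implied exchange rate: {rate:.2f} INR/USD")
    print(f"implied comparison price: at least {baseline:.2f} INR/min")
    print(f"1000 minutes: {cost_for_minutes(1000):.2f} INR")
```

Read this way, the "95% lower" claim only holds against models priced at roughly ₹8/min or more, which is the comparison a reader would need to verify.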
Someone writing small cheques into American AI labs and doing PR tours about how they “backed the future” probably shouldn’t be the loudest voice lecturing everyone on what India must do after the Fable news.
For those of us actually building models from scratch across modalities, the bottlenecks are not a breaking headline or a geopolitical event. We live them every day: data, compute, talent, inference, distribution, and relentless execution.
You don’t wake up one morning, see a model get pulled down internationally, and suddenly discover the importance of sovereign AI.
At @rumik_ai we’ve always believed in owning our stack and building foundational capabilities ourselves. Soon, we’ll be open-sourcing India’s first expressive TTS model with deep code-switching support across Hindi, Hinglish, and multiple Devanagari-script languages. Not because it’s fashionable, but because we genuinely believe India can build world-class AI infrastructure and models.
What the ecosystem needs isn’t more hindsight experts chasing engagement after every headline. It needs patient builders, conviction, long-term capital, and VCs who help create enduring AI companies instead of pretending to be the smartest AI researchers on Twitter.
India doesn’t have a talent problem. It has a conviction problem.
i've been observing mixed results for non-transformer backbones tbh (might be a skill issue), but another angle on training behavior that i found in this paper is to observe the singular value spectrum of the momentum buffers during training.
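
a rough sketch of what tracking the singular value spectrum of a momentum buffer can look like, using NumPy on a toy problem. this is not the paper's method: the EMA-style momentum update, the stable-rank summary, the synthetic rank-1 gradients, and all function names here are assumptions made for illustration.

```python
import numpy as np


def update_momentum(buf, grad, beta=0.9):
    """EMA-style momentum update: buf <- beta * buf + (1 - beta) * grad."""
    return beta * buf + (1.0 - beta) * grad


def singular_spectrum(buf):
    """Singular values of a 2-D momentum buffer, largest first."""
    return np.linalg.svd(buf, compute_uv=False)


def stable_rank(sv):
    """Sum of squared singular values over the largest one squared.

    Close to 1 when a single direction dominates the buffer; approaches
    the full rank when energy is spread evenly across directions.
    """
    return float(np.sum(sv ** 2) / (sv[0] ** 2))


def demo(steps=50, shape=(32, 16), seed=0):
    """Toy run: gradients are a fixed rank-1 signal plus small noise."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal((shape[0], 1))
    v = rng.standard_normal((1, shape[1]))
    buf = np.zeros(shape)
    history = []
    for _ in range(steps):
        grad = u @ v + 0.01 * rng.standard_normal(shape)
        buf = update_momentum(buf, grad)
        # Log one scalar per step; the full spectrum can be stored instead.
        history.append(stable_rank(singular_spectrum(buf)))
    return buf, history


if __name__ == "__main__":
    buf, history = demo()
    print("top singular values:", singular_spectrum(buf)[:4])
    print("final stable rank:", history[-1])
```

in a real run the same two functions would be applied to each 2-D optimizer state tensor every few hundred steps, and the logged spectra compared across backbones.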