pulsatingGenius @neuralwarlock - Twitter Profile

pulsatingGenius

@Neuralwarlock

3 days ago

@ickma2311 This is good stuff

0

2

0

341

Neuralwarlock retweeted

rumik

@rumik_ai

3 days ago

mulberry can: - respond in 162ms, the fastest ttfb on the market - switch between two languages mid-sentence, naturally - speak in different languages with their actual native accents, not a translated voice

7

46

8

9

5K

pulsatingGenius

@Neuralwarlock

3 days ago

@null_hawk Let's fucking go man

0

2

0

49

Neuralwarlock retweeted

Rohan

@lets_dig_deeper

3 days ago

launching silk mulberry 1.5 one of the fastest multilingual voice models in the world it matches the best voice models in quality benchmarks (MOS) all this at more than 95% lower cost ₹0.40/min (~$0.0046/min) try now 👇

113

1K

151

464

186K

pulsatingGenius

@Neuralwarlock

4 days ago

@wandering_mush haha i should have been more lucid but i wrote this half asleep so some parts came out a bit looser than they should have. thanks for the follow lol. i started posting this stuff when like 20 people followed me. i m mostly just learning and posting as i go.

0

1

0

29

pulsatingGenius

@Neuralwarlock

4 days ago

also i m not claiming gemma 4 12b proves encoder free is already the best pure performance choice or that google dropped the encoders only because it performs better. It may very well be a deployment/latency/memory tradeoff. my point is more modest: gemma 4 12b shows that you can strip the input side encoder down much more than the usual multimodal pipeline and still get a useful, capable model. that is the data point i care about. So I m using it as evidence that this direction works at all, not as proof that the tradeoff is already solved everywhere.

2

1

0

37

pulsatingGenius

@Neuralwarlock

4 days ago

@wandering_mush yeah I get u r not arguing against the premise. i just wanted to clarify the gemma bit: I wasnt saying the whole gemma 4 family scales toward encoder free as the models get bigger. i was only pointing to gemma 4 12b specifically, which is the one I mentioned.

0

1

0

27

pulsatingGenius

@Neuralwarlock

4 days ago

i have no fucking idea why people pretend like ai or llms are the pinnacle of human achievement or some boundary we have reached. why do people act like theres nothing after this? if they are so fucking capable where is the unified theory for all four fundamental forces? why isnt quantum physics solved? why is the three body problem still a nightmare? why are people still dying from random viruses? for fucks sake we still dont even fully know whats under half the ocean. space travel is absurdly expensive. i dont see cars flying over my head. half the Millennium Prize Problems are still unsolved. we dont understand aging. of course ai is impressive but acting like its the final chapter of science is insane. there are still entire fields of reality we barely understand. we havent even finished understanding intelligence itself yet people are already talking as if ai is the final destination of science.

0

1

0

91

pulsatingGenius

@Neuralwarlock

4 days ago

the voice design stuff is pretty cool.

0

5

0

328

pulsatingGenius

@Neuralwarlock

4 days ago

All of this is just trying to motivate something which should be intrinsic. there is nothing u need for research except utter, undeniable curiosity. it should be natural and should come from within. if u have been able to save your inner child then u r a researcher. It doesnt get deeper than that

0

3

0

100

pulsatingGenius

@Neuralwarlock

6 days ago

@RBehiel would really love a detailed video on physics of diffusion and time dependent vector fields

0

3

0

213

pulsatingGenius

@Neuralwarlock

7 days ago

@maharshii Maybe there is no point

1

0

127

Neuralwarlock retweeted

vatsal bharti

@bhartivatsal

7 days ago

Someone writing small cheques into American AI labs, doing PR tours about how they “backed the future” probably shouldn’t be the loudest voice lecturing everyone on what India must do after the Fable news. For those of us actually building models from scratch across modalities, the bottlenecks are not a breaking headline or a geopolitical event. We live them every day, data, compute, talent, inference, distribution, and relentless execution. You don’t wake up one morning, see a model get pulled down internationally, and suddenly discover the importance of sovereign AI. At @rumik_ai we’ve always believed in owning our stack and building foundational capabilities ourselves. Soon, we’ll be open-sourcing India’s first expressive TTS model with deep code-switching support across Hindi, Hinglish, and multiple Devanagari languages. Not because it’s fashionable, but because we genuinely believe India can build world-class AI infrastructure and models. What the ecosystem needs isn’t more hindsight experts chasing engagement after every headline. It needs patient builders, conviction, long-term capital, and VCs who help create enduring AI companies instead of pretending to be the smartest AI researchers on Twitter. India doesn’t have a talent problem. It has a conviction problem.

5

31

4

5

6K

pulsatingGenius

@Neuralwarlock

7 days ago

@bunmaskachaiii I think you mean symbols not language. We can think without words but it’s hard to think without some form of representation.

0

2

0

43

pulsatingGenius

@Neuralwarlock

8 days ago

Genuine question: your combined filter weights the DNSMOS, WER, and SR/VAD ranks equally. But Table 3 shows the SR/VAD signal is your weakest filter (5.20 avg rank vs 3.40 for combined) and filtering harder on it makes things worse (VAD-50% = 6.20). Why give equal weight to SR? Isn't Silero VAD unreliable on exactly the wild YouTube-type audio that's most of your pool?
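
For anyone following the question, here is a minimal sketch of what an equal-weight rank combination could look like, and how the SR/VAD term could be down-weighted instead. This is an illustration only, not the Raon-OpenTTS authors' implementation: the function names, the `weights` parameter, the tie-breaking rule, and the sample scores are all assumptions made up for this example.

```python
def rank(values, higher_is_better):
    """Return 1-based ranks (1 = best). Ties keep input order (a simplification)."""
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=higher_is_better)
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def combined_rank(dnsmos, wer, speech_ratio, weights=(1.0, 1.0, 1.0)):
    """Weighted average of per-signal ranks; lower combined rank = better clip.

    weights = (1, 1, 1) is the equal weighting the question is about.
    The weights argument itself is hypothetical, added for illustration.
    """
    r_dnsmos = rank(dnsmos, higher_is_better=True)      # quality score
    r_wer = rank(wer, higher_is_better=False)           # transcription error
    r_sr = rank(speech_ratio, higher_is_better=True)    # VAD speech ratio
    w_dnsmos, w_wer, w_sr = weights
    total = w_dnsmos + w_wer + w_sr
    return [(w_dnsmos * a + w_wer * b + w_sr * c) / total
            for a, b, c in zip(r_dnsmos, r_wer, r_sr)]


def keep_top(scores, fraction):
    """Indices of the best `fraction` of clips by combined rank."""
    k = max(1, int(len(scores) * fraction))
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return sorted(order[:k])


# Made-up scores for four clips, purely for illustration.
dnsmos = [3.9, 3.1, 3.6, 2.8]
wer = [0.05, 0.20, 0.08, 0.30]
speech_ratio = [0.40, 0.95, 0.90, 0.85]

equal = combined_rank(dnsmos, wer, speech_ratio)
downweighted = combined_rank(dnsmos, wer, speech_ratio, weights=(1.0, 1.0, 0.5))
kept = keep_top(equal, 0.5)
```

With equal weights, a noisy speech-ratio rank moves a clip's combined rank as much as its DNSMOS or WER rank does, which is the concern raised above. Lowering the third weight reduces that influence without dropping the signal.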

Dongmin Park @dongmin_park11

25 days ago

Raon-OpenTTS paper is finally out! We fully open-sourced 615K hours of TTS data and a 1B model competitive with Qwen3-TTS-1B and Voxtral-TTS-4B. Like DCLM and DataComp, our work closed the gap towards SOTA closed-data models in TTS, which will help push the TTS community forward!

https://t.co/zW3aT8CSMC

9

207

27

160

15K

1

17

1

3

3K

pulsatingGenius

@Neuralwarlock

8 days ago

@eigenron I'm sure people like Hitler, Napoleon, Alexander the Great and countless revolutionaries weren’t just searching for “better explanations”. they explicitly wanted to reshape the world and did.

1

3

0

66