NVIDIA's Nemotron 3.5 ASR Streaming Multilingual is now available through FluidAudio, optimized for Apple Silicon, so apps can run ~40-language real-time ASR entirely on device, no cloud required.
Apps shipping with it today include
@SpokenlyApp, @ALTIC_DEV, @SnaplyAI, and others.
FluidAudio:
https://t.co/VWaAPfLRh5
Original model:
https://t.co/87wRTlO5dm
@NVIDIAAIDev
LeafTok narrates books on-device using FluidAudio for TTS.
Zero network calls during playback. Voices ship with the app.
The trade-off vs cloud TTS: bigger app, fully offline.
Impressive. Very nice.
Recently contributed a patch for Parakeet V2/V3 to FluidAudio that makes it match this speed, on-device, i.e. ~300x speed factor. Can transcribe an hour of audio in ~12 seconds on an iPhone. And tdt-ctc-110m is ~33% faster than that (1 hr in ~9s).
https://t.co/zcRQmvxO1f
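The speed figures above can be sanity-checked with a little arithmetic. This is a minimal sketch: the function names are made up for illustration, and it assumes "speed factor" means seconds of audio processed per second of wall-clock time, and that "~33% faster" refers to throughput rather than elapsed time.

```python
# Hypothetical helpers for checking the quoted numbers; not part of FluidAudio.

def speed_factor(audio_seconds: float, processing_seconds: float) -> float:
    """Seconds of audio transcribed per second of wall-clock time."""
    return audio_seconds / processing_seconds


def processing_time(audio_seconds: float, factor: float) -> float:
    """Wall-clock seconds needed at a given speed factor."""
    return audio_seconds / factor


ONE_HOUR = 3600  # seconds of audio

# Parakeet V2/V3 at ~300x: one hour of audio takes ~12 s.
parakeet_seconds = processing_time(ONE_HOUR, 300)

# tdt-ctc-110m at ~9 s per hour of audio works out to ~400x.
tdt_ctc_factor = speed_factor(ONE_HOUR, 9)

# Relative throughput gain: 400 / 300 - 1, i.e. about a third more audio per second.
throughput_gain = tdt_ctc_factor / 300 - 1

# The same gap expressed as elapsed time: 9 s is 25% less than 12 s.
time_saved = 1 - 9 / parakeet_seconds

print(parakeet_seconds, tdt_ctc_factor, round(throughput_gain, 3), time_saved)
```

So the two claims are consistent: ~33% more throughput and ~25% less wall-clock time describe the same 12 s vs 9 s difference.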
Today's a big day for Nemotron models.
Along with Ultra, we also shipped Nemotron Speech 3.5, which now supports 40 languages and is insanely fast with ultra-low latency!
I collaborated with @Alex_tra_memory, @fluidinference and @ALTIC_DEV to port the model to CoreML to bring the latest Nemotron model to any MacBook using FluidVoice!
Give it a try and lmk what you think!
Link below ⬇️
Nemotron ASR Multilingual running on an iPhone 17 Pro in CoreML.
Many thanks to @fluidinference for the CoreML model and to @NVIDIAAI @NVIDIAAIDev for the model itself.
Supertonic3 running on an iPhone 17 Pro using ANE on CoreML. It’s blazing fast with low RAM consumption and background capable. 2 mins worth of audio generated in 3 secs.
Many thanks to @fluidinference for the port.
Audivize now supports NVIDIA Nemotron 3.5 ASR Multilingual via @fluidinference, adding support for 40 language-locales all on-device.
Demo: https://t.co/RwWiEsPBe6
Model: https://t.co/aBzlrWGWkO
@NVIDIAAIDev @NVIDIAAI #NemotronSpeech #VoiceAI
@hamza_q_ haha yeah, i think what happened was these coding agents collected enough training data from their users. the agents don't struggle as much with the more complicated OS environment
gave claude on windows another try, they have def improved the agent to operate much better on windows now. fewer environment and system troubles, much smoother operation
just finished chapter 1 of wdndev's llm_interview_note book. i had asked an LLM to help quiz me on the concepts. the first chapter is impressive and does well in showing us why transformers were needed. i don't think i even understood half of the content. onto chapter 2, the main piece.
i am already starting to doubt if i can even become an ML engineer
https://t.co/XLPEPJMamG
this is like the best resource i have ever read about LLMs, the og is chinese but with deepseek you can easily translate or ask it to explain to you.
its fluid clarity
https://t.co/RMxAVsQG4I
a lot of metalearning is related to understanding what's the fundamental unit a certain field uses to measure or benchmark "progress". like how you can't have an economy without an agreed upon currency
Upsampling with weight matrices is the deduction of additional information from limited information, e.g. a 6*6 input matrix is expanded into an 8*8 upsampled matrix, with each new value derived from the positions and values of the individual input coordinates
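One way to make the 6*6 to 8*8 idea concrete is bilinear interpolation written as matrix multiplication. This is only a sketch under assumptions: the post does not say which upsampling method it has in mind, so bilinear weights with aligned corners are swapped in here, and every function name is invented for illustration.

```python
# Bilinear upsampling expressed as W_rows @ X @ W_cols^T, in plain Python.

def interp_weights(n_out: int, n_in: int) -> list[list[float]]:
    """Build an (n_out x n_in) matrix whose rows blend the two nearest inputs."""
    weights = [[0.0] * n_in for _ in range(n_out)]
    for i in range(n_out):
        # Position of output index i on the input grid (corners line up).
        pos = i * (n_in - 1) / (n_out - 1)
        lo = int(pos)
        hi = min(lo + 1, n_in - 1)
        frac = pos - lo
        weights[i][lo] += 1.0 - frac
        weights[i][hi] += frac
    return weights


def matmul(a: list[list[float]], b: list[list[float]]) -> list[list[float]]:
    """Plain matrix product of two nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m: list[list[float]]) -> list[list[float]]:
    return [list(row) for row in zip(*m)]


def upsample(grid: list[list[float]], n_out: int) -> list[list[float]]:
    """Upsample a 2D grid to n_out x n_out using the weight matrices."""
    w_rows = interp_weights(n_out, len(grid))
    w_cols = interp_weights(n_out, len(grid[0]))
    return matmul(matmul(w_rows, grid), transpose(w_cols))


# A 6*6 input whose value encodes its own coordinates (6 * row + col).
grid = [[float(r * 6 + c) for c in range(6)] for r in range(6)]
upsampled = upsample(grid, 8)
print(len(upsampled), len(upsampled[0]))
```

Each row of the weight matrix sums to 1, so every new value is a weighted average of its neighbours: no information is created, it is only inferred from the position and values of the existing entries.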