A 3-point gap in aggregate WER can hide a 13-point gap on the audio that actually breaks production.
Heavy noise WER:
- Pulse 18.29%
- Assembly AI 25.61%
- Deepgram Nova 3 31.29%
Aggregate WER averages ten different noise conditions into a single number. The per-condition breakdown shows where a model actually breaks.
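A rough sketch of why the breakdown matters, assuming the aggregate is an unweighted mean over conditions (the post doesn't say how it is weighted). The transcripts and condition names below are made-up toy data, not benchmark numbers, and `wer_by_condition` is a hypothetical helper:

```python
from collections import defaultdict

def edit_distance(ref, hyp):
    """Word-level Levenshtein distance: substitutions, insertions, deletions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution or match
        prev = curr
    return prev[-1]

def wer_by_condition(samples):
    """samples: iterable of (condition, reference, hypothesis) strings.

    Returns (per_condition_wer, aggregate_wer). The aggregate here is an
    unweighted mean across conditions, which is an assumption.
    References are assumed to be non-empty.
    """
    errors, words = defaultdict(int), defaultdict(int)
    for condition, ref, hyp in samples:
        ref_words, hyp_words = ref.split(), hyp.split()
        errors[condition] += edit_distance(ref_words, hyp_words)
        words[condition] += len(ref_words)
    per_condition = {c: errors[c] / words[c] for c in errors}
    aggregate = sum(per_condition.values()) / len(per_condition)
    return per_condition, aggregate

# Toy data for illustration only
samples = [
    ("clean", "turn on the lights", "turn on the lights"),
    ("heavy_noise", "turn on the lights", "turn of the light"),
]
per_condition, aggregate = wer_by_condition(samples)
print(per_condition)  # {'clean': 0.0, 'heavy_noise': 0.5}
print(aggregate)      # 0.25
```

In this toy case the aggregate reads 25% while the heavy-noise condition alone sits at 50%, which is the kind of gap a single averaged number hides.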
@diptanu and sandboxes is like the best founder/product/market fit I've seen.
Diptanu's entire career has led to this moment, and he's more than ready to take over the world.
> cloned my voice. made it argue with a pro voice about my recent post.
> i don't sound like i think i sound.
> https://t.co/pGtjz3oboR
> turns out i sound like a linkedin influencer.
btw the things that nobody asks about but should
- paste pdfs, it handles tables
- pick from 38 voices, american/british/indian/hindi
- custom pronunciation for names the ai butchers
all at https://t.co/pGtjz3nDzj
Wednesday someone on the team asked if lightning v3 was good enough to do a real podcast.
By friday we had https://t.co/jXiaal3FcX. paste a url, two ai voices talk about it, flip the 3d toggle and watch them lip-sync the whole thing in your browser.
It's Free.
i've started sponsoring young indian builders / devs / security researchers who are doing genuinely cool work. only sponsored @amanvarshney01 and @ni5arga so far, but i want to scale this up over time
there are so many young indian devs doing insanely cool shit with basically zero support. i want to help with that
more of this soon
perks of working at a voice AI company: you end up building the random tools you wish existed. XD
made a thing that drops perfect animated captions into your after effects and premiere pro in one click. even gets Hinglish right (yeah, really).
dropping it soon. @smallest_AI
Pulse Pro is the #1 hosted STT API on the CodeSOTA leaderboard, and #3 overall across all models, hosted and open-source.
5.42% mean WER on the HF Open ASR Leaderboard's 8-dataset suite.
A hosted API matching open-source frontier accuracy is rare. Doing it while shipping on-prem and air-gapped deployment is the position that matters for enterprise.
Check out the full leaderboard 🧵
codex tip that sounds obvious but somehow isn't:
if you want it to do something, just ask it
if it keeps messing up the same thing, ask it to remember what to avoid next time
tell it what good looks like and what context it should use
you'd be surprised how often this works
and when it doesn't, just ask codex what it needs from you to get there