Aditya Bhat

9 days ago

@smallest_AI @harshitajain561 This is so cool!!

Aditya16037 retweeted

14 days ago

Pulse hits 3.2% word error rate on Coval's STT benchmark!

Coval is the eval platform built for voice AI agents. Their STT test uses diverse speakers, accents, and real-world conditions, not clean read speech.

Ahead of:
- Deepgram Nova-3 (4.2%)
- AssemblyAI Universal Streaming (4.2%)
- Speechmatics Enhanced (4.2%)

Check out the full docs in 🧵

Aditya16037 retweeted

15 days ago

A 3-point gap in aggregate WER can hide a 13-point gap on the audio that actually breaks production.

Heavy noise WER:
- Pulse 18.29%
- AssemblyAI 25.61%
- Deepgram Nova-3 31.29%

Aggregate WER averages ten different noise conditions into a single number. The per-condition breakdown shows where a model actually breaks.
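A rough sketch of the aggregate-versus-per-condition point, assuming WER is computed as word-level edit distance divided by reference length and that the aggregate is a plain mean over conditions. The function names and the two toy transcripts are made up for illustration and are not taken from any benchmark.

```python
from statistics import mean


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words.

    Assumes a non-empty reference.
    """
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] holds the edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution (or match)
            ))
        prev = curr
    return prev[-1] / len(ref)


def per_condition_wer(samples):
    """Map {condition: [(reference, hypothesis), ...]} to {condition: mean WER}."""
    return {cond: mean(wer(r, h) for r, h in pairs)
            for cond, pairs in samples.items()}


def aggregate_wer(by_condition):
    """Unweighted mean across conditions (an assumption about the aggregate)."""
    return mean(by_condition.values())


# Hypothetical toy data: one clean utterance, one heavy-noise utterance.
toy_samples = {
    "clean": [("turn on the lights", "turn on the lights")],
    "heavy_noise": [("turn on the lights", "turn of the light")],
}

by_condition = per_condition_wer(toy_samples)
print(by_condition)                 # per-condition view
print(aggregate_wer(by_condition))  # single averaged number
```

In this toy case the averaged number sits halfway between a perfect condition and a badly broken one, which is the effect the tweet describes.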

16 days ago

@Devanshpawan1 Yessir!! The launch is a success sir:)

Aditya16037 retweeted

16 days ago

Pulse Pro is the #1 hosted STT API on the CodeSOTA leaderboard, and #3 overall across all models, hosted and open-source 👀

5.42% mean WER on the HF Open ASR Leaderboard's 8-dataset suite.

A hosted API matching open-source frontier accuracy is rare. Doing it while shipping on-prem and air-gapped deployment is the position that matters for enterprise.

Check out the full leaderboard 🧵

21 days ago

On WildASR, Pulse hits 9.63% word error rate. Deepgram Nova-3 hits 28.17%. Nearly 3x the WER, on the same audio.

WildASR is the benchmark that tests STT on real production conditions, not clean studio recordings. Their dataset covers far-field mics (sound captured from across a room: conference speakerphones, kiosks, drive-thrus), reverberation (echo in real rooms), phone codec compression, clipping, and background noise gaps.

These are the conditions voice agents and contact centers deal with every call. Clean datasets predict almost nothing about how a model behaves here.

Check out the full benchmarks in our Docs.

Aditya16037 retweeted

Muskan Jain

@Muskanjain0401

25 days ago

every business has a calling button that mostly never works. so i built RingIt. powered by @smallest_AI.
→ pick your industry
→ 4 quick questions
→ voice agent in 3 minutes
→ tweak it by chatting
→ see everything on the dashboard

Aditya16037 retweeted

Sudarshan Kamath

@kamath_sutra

27 days ago

Can someone explain to him that this isn’t quality time?

27 days ago

@smallest_AI @kamath_sutra @SierraPlatform This is what matters!

27 days ago

Pulse performing where it really matters! P95 latency of 196ms at scale:)

27 days ago

Pulse by Smallest AI is now #1 on @SierraPlatform’s μBench for P95 latency!

Real-time voice systems need speed that holds up consistently at scale. P95 is where the real user experience shows up.

Proud to see Pulse leading on the metric that makes conversations actually feel instantaneous.

Huge shoutout to Sierra for building and open-sourcing μBench.
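For readers unfamiliar with the metric, here is a minimal sketch of how a P95 latency could be computed from raw samples, assuming the nearest-rank definition of a percentile. μBench may use a different method; the helper name and the latency values below are invented for illustration.

```python
import math


def percentile_nearest_rank(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Hypothetical request latencies in milliseconds: mostly fast, two slow outliers.
latencies_ms = [100] * 18 + [300, 900]

print(percentile_nearest_rank(latencies_ms, 50))  # median
print(percentile_nearest_rank(latencies_ms, 95))  # P95
```

With these made-up numbers the median stays at the fast value while P95 lands on one of the slow requests, which is why tail percentiles are often preferred over averages for real-time systems.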

29 days ago

Bro is a closer😭😂 Let humans be humans guys:)

Sudarshan Kamath

@kamath_sutra

about 1 month ago

She deserves her moment. So do your customers. Let humans be humans. Try Smallest AI Voice Agents!

Aditya16037 retweeted

Krishna

@Krishna__Bansal

30 days ago

@kamath_sutra tried Pulse STT with my project (thanks to @aigrantsindia), worked really well, it even captured my mumble https://t.co/WJfaqF4oEc

Aditya16037 retweeted

Abhishek (key/value)

@StalwartCoder

about 1 month ago

sneak peek of world's fastest speech-to-text model

about 1 month ago

@StalwartCoder Sheeesh

about 1 month ago

Exciteddd!

Devanshpawan

@Devanshpawan1

about 1 month ago

Dropping soon 🫳 @itsnotraunaq

Aditya16037 retweeted

about 1 month ago

4x cost reduction in TTS inference with @tenstorrent! 11 NVIDIA L40S ran 550 simultaneous audio streams at ~$100K. Now, 27 Tenstorrent P100 chips do the same at ~$27K. First production-grade TTS to match the cost of text tokens without degradation in audio quality. Hear it straight from the team that built it: @AkshatMandloi10 and @ranjith_m_s in the video below.
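The arithmetic behind the headline figure can be checked in a few lines. This sketch uses only the approximate totals quoted in the tweet (~$100K, ~$27K, 550 streams); the variable names are illustrative, and what exactly those totals cover is not stated in the source.

```python
# Approximate figures as quoted in the tweet.
l40s_total_usd = 100_000        # ~$100K for 11 NVIDIA L40S
tenstorrent_total_usd = 27_000  # ~$27K for 27 Tenstorrent P100 chips
concurrent_streams = 550        # simultaneous audio streams in both setups

cost_ratio = l40s_total_usd / tenstorrent_total_usd
usd_per_stream_l40s = l40s_total_usd / concurrent_streams
usd_per_stream_tenstorrent = tenstorrent_total_usd / concurrent_streams

print(f"cost ratio: {cost_ratio:.2f}x")
print(f"per stream: ${usd_per_stream_l40s:.2f} vs ${usd_per_stream_tenstorrent:.2f}")
```

On these rounded inputs the ratio comes out a little under 4x, consistent with the tweet's rounded headline.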

Aditya16037 retweeted