Introducing Local Whisper
An on-device speech-to-text, dictation, and text-to-speech app for macOS, iOS, and Android.
It runs local speech-to-text models, uses MLX on Apple Silicon, includes a Flutter-based mobile app with native keyboards, and cleans up text using on-device or local private LLMs.
It also includes local text-to-speech through Kokoro-82M, so selected text can be read aloud without sending it to a hosted service.
I built Local Whisper because I wanted reliable dictation without relying on hosted speech APIs, subscriptions, accounts, or usage limits.
Here’s what Local Whisper looks like today:
macOS:
• Quick global hotkey dictation: double-tap Right Option from any app and speak.
• Cleaned text is copied directly to your clipboard or pasted at your cursor.
• Text-to-speech: select text in any app and have it read aloud locally with Kokoro-82M.
• Menu bar access, overlay status, transcription history, replacements, shortcuts, and service controls.
• Local audio enhancement: voice activity detection, silence trimming, noise reduction, and normalization.
• CLI for scripting tasks: whisper text aloud, listen through the mic, transcribe audio files, export history, and monitor service health.
• Parakeet-TDT v3 runs locally by default, powered by MLX.
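To make the local audio enhancement step above a bit more concrete, here is a minimal sketch of two of its stages, silence trimming and peak normalization, in plain Python. This is not Local Whisper's actual code: the function names, the 160-sample frame size, the 0.01 RMS threshold, and the 0.9 target peak are all assumptions picked for illustration, and real voice activity detection is usually more involved than a fixed energy threshold.

```python
from typing import List


def trim_silence(samples: List[float], threshold: float = 0.01,
                 frame: int = 160) -> List[float]:
    """Drop leading and trailing frames whose RMS energy is below `threshold`.

    `threshold` and `frame` are illustrative values, not tuned settings.
    """
    def rms(chunk: List[float]) -> float:
        return (sum(s * s for s in chunk) / len(chunk)) ** 0.5

    # Split the signal into fixed-size frames (the last one may be shorter).
    frames = [samples[i:i + frame] for i in range(0, len(samples), frame)]
    voiced = [i for i, f in enumerate(frames) if rms(f) >= threshold]
    if not voiced:
        return []  # nothing above the threshold: treat it all as silence
    start = voiced[0] * frame
    end = (voiced[-1] + 1) * frame
    return samples[start:end]


def normalize_peak(samples: List[float], target: float = 0.9) -> List[float]:
    """Scale the signal so its largest absolute sample equals `target`."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # avoid dividing by zero on pure silence
    gain = target / peak
    return [s * gain for s in samples]


# Usage: 320 silent samples, 160 voiced samples, 320 silent samples.
audio = [0.0] * 320 + [0.5, -0.25] * 80 + [0.0] * 320
cleaned = normalize_peak(trim_silence(audio))
```

Trimming before normalizing keeps the gain calculation focused on the part of the recording that actually contains speech.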
Mobile app + keyboard:
• The mobile app is built using Flutter for both iOS and Android.
• Both platforms feature a recorder app and native keyboard integrations.
• Record within the app, manage model packs and transcription history locally, and transcribe directly into other apps via the keyboard.
• The keyboard includes specialized modes, punctuation shortcuts, and Local Whisper-specific actions.
• iOS transcription runs locally through WhisperKit/Core ML.
• Android transcription uses locally captured audio with sherpa-onnx model packs.
Transcription engines:
• Parakeet-TDT v3 and Qwen3-ASR operate in-process via MLX.
• WhisperKit is supported through a local server configuration.
• Designed for fully local usage: no hosted quotas, no accounts, no transcript uploads, and no usage tracking.
Grammar cleanup (optional):
• Supports Apple Intelligence, Ollama, LM Studio, or transcription-only modes.
• Cleanup uses on-device or locally hosted private LLM models.
• Dictation commands convert spoken phrases like “new line,” “hmm,” and “scratch that” into actionable text edits before cleanup.
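The dictation command step above can be sketched roughly like this. It is only an illustration under stated assumptions, not the app's real logic: `apply_dictation_commands` and `FILLERS` are hypothetical names, "scratch that" is assumed to delete back to the previous sentence end or line break, and filler words such as "hmm" are assumed to be dropped.

```python
import re

# Hypothetical filler list; the real set of recognized fillers is not known.
FILLERS = {"hmm", "um", "uh"}


def _norm(word: str) -> str:
    """Lowercase a token and strip punctuation so commands match loosely."""
    return re.sub(r"[^\w]", "", word).lower()


def apply_dictation_commands(transcript: str) -> str:
    """Apply spoken commands to a raw transcript before any grammar cleanup."""
    words = transcript.split()
    out = []
    i = 0
    while i < len(words):
        cur = _norm(words[i])
        nxt = _norm(words[i + 1]) if i + 1 < len(words) else ""
        if cur == "new" and nxt == "line":
            out.append("\n")  # insert a line break
            i += 2
        elif cur == "scratch" and nxt == "that":
            # Assumption: remove words back to the last sentence end or newline.
            while out and out[-1] != "\n" and not out[-1].endswith((".", "!", "?")):
                out.pop()
            i += 2
        elif cur in FILLERS:
            i += 1  # drop the filler word
        else:
            out.append(words[i])
            i += 1
    text = " ".join(out)
    # Remove the spaces that joining leaves around line breaks.
    return re.sub(r" ?\n ?", "\n", text).strip()


raw = ("Send the report today. hmm, send it Monday scratch that "
       "send it Friday new line Thanks")
edited = apply_dictation_commands(raw)
```

A matcher this simple would also trigger on "new line" used in ordinary speech, so a real implementation would likely need context or pause detection to tell commands apart from content.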
Privacy and control:
• Everything runs locally.
• No hosted speech API sits in the loop.
• No account creation, no telemetry, no analytics, no transcript uploads, and no usage limitations.
The goal is simple: fast, local dictation and text-to-speech that feel natural to use daily, without anxiety over usage limits or concerns about sharing private speech data.
Available for macOS, iOS, and Android:
https://t.co/1cL1zY4KFh
@Dimillian Hey,
Can we have a straightforward way to share a generated doc from ChatGPT to Codex? Can these “two similar but separate platforms” have some kind of link to each other?
This is even more frustrating in the mobile app.
tbh, the nicest thing would be having a single synced system.
not a bunch of different products, each with its own knowledge, memory, and understanding of you. that just creates duplicated context and fragmented user knowledge everywhere.
just one thing that knows everything, and then u simply choose where it runs and which model powers it.
It doesn’t matter whether you use /goal, a memory system, or any other task context management approach. Model compaction in the middle of a task is one of the worst things that can happen to quality, especially when it comes to the small details that actually matter.
@GoogleAIStudio I have a feeling that sometime after the release, once the hype goes down, you’re gonna 1. change its name, and 2. merge it into the Gemini app itself :))
@Dimillian My desktop follow-up mode is set to steer but the Codex mobile defaults to queue. Is it possible to change the default to steer or simply sync it with the desktop?
These days, good tests matter more than ever. They’re how humans and agents make sure the massive amount of generated code actually works at every step.