@stolinski I found the accuracy drop from streaming transcription vs batch generation was too noticeable to justify building a fully streamed typing UX
Dictation that actually feels instant ⚡️
Runs on-device with Nvidia Parakeet streaming — no lag, no cloud.
Transcribes as you speak. Keeps up with your thoughts.
No uploads. No delays. Just fast, local speech → text.
So Codex + voice pilled that I mapped 3 mouse shortcuts for dictation in Codex:
Normal: GPT-4o Transcribe, unmatched accuracy
Quick adds: local Parakeet, unmatched speed + auto-submit (hits enter when I stop)
Long: auto-formats dictations into a readable prompt/spec
@mattpocockuk What makes it so good?
Last time I looked into it, it seemed like it’s mostly a way to enforce result types and pattern matching/tagged errors.
But that can be done in ~200 LOC, so I must be missing something 🤔
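For context, here is a rough sketch of what "result types + tagged errors" can look like in plain TypeScript. Every name in it (`Result`, `ok`, `err`, `parsePercent`, `explain`) is hypothetical and written for illustration; it is not the API of any particular library.

```typescript
// Hypothetical, hand-rolled Result type; not taken from any library.
type Result<T, E> =
  | { readonly _tag: "Ok"; readonly value: T }
  | { readonly _tag: "Err"; readonly error: E };

const ok = <T>(value: T): Result<T, never> => ({ _tag: "Ok", value });
const err = <E>(error: E): Result<never, E> => ({ _tag: "Err", error });

// Tagged errors: a discriminated union keyed on `_tag`.
type ParseError = { readonly _tag: "ParseError"; readonly input: string };
type RangeIssue = { readonly _tag: "RangeIssue"; readonly value: number };
type AppError = ParseError | RangeIssue;

const parseError = (input: string): ParseError => ({ _tag: "ParseError", input });
const rangeIssue = (value: number): RangeIssue => ({ _tag: "RangeIssue", value });

// Returns a Result instead of throwing, so callers must handle both branches.
function parsePercent(input: string): Result<number, AppError> {
  const n = Number(input);
  if (input.trim() === "" || Number.isNaN(n)) return err(parseError(input));
  if (n < 0 || n > 100) return err(rangeIssue(n));
  return ok(n);
}

// "Pattern matching" via a switch on the tag; the compiler checks that
// every error variant is covered.
function explain(r: Result<number, AppError>): string {
  if (r._tag === "Ok") return `ok: ${r.value}`;
  const e = r.error;
  switch (e._tag) {
    case "ParseError":
      return `could not parse "${e.input}"`;
    case "RangeIssue":
      return `${e.value} is out of range`;
  }
}
```

Calling `explain(parsePercent("150"))` takes the `RangeIssue` branch, and adding a new variant to `AppError` without handling it in `explain` becomes a compile error under strict settings.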
@FracSlap 💯 Nova-3 is the move for this.
Solid speaker diarization, $200 in free credits, and it plugs into most dictation + meeting recording tools that support BYOK.
Quick guide to get the free credits + API key:
https://t.co/KI3DRnFuKt
@FracSlap All value accrues to model providers unless apps create lock-in. Providers will probably ship this feature in their apps in <1yr.
If you want dictation + meeting recording, try Utter: data on disk, live transcripts, speaker labels, mid-meeting Q&A.
https://t.co/h1RfGEPKRF
A simple meeting copilot:
• See who said what (live)
• Take notes in one place
• Ask questions about the conversation instantly
No fluff. Just works.
Free with your API key.
@AlexMultiFamily @FracSlap From experience building something similar: speaker detection accuracy mostly comes down to model choice.
Local models aren’t good enough yet. For reliable detection you need top hosted models like Deepgram Nova-3 or ElevenLabs Scribe.
https://t.co/h1RfGEPKRF
@FracSlap 💯 local-first wins. Data on disk > data locked in apps.
Built this into Utter for dictation + meetings: live transcripts, speaker labels, ask questions mid-meeting, save everything to disk.
Use BYOK or local models for free. Your data stays yours.
https://t.co/h1RfGEPKRF
@ingoa_dev Thanks for sharing, but it’s honestly an unrealistic failure mode; the script is very simple.
By default the app has you log in manually when refreshing cookies. If you add your credentials in the app settings, it can automate that step.
OpenASO is live 🚀
A free, open-source alternative for App Store Optimization.
Track keywords, research competitors, analyze reviews, translate/respond to users, and export data for AI analysis.
https://t.co/mWGLL9qpT4
🧵 What you can do with it:
@ingoa_dev Thanks
Web login is there so the Apple Ads web cookie can be used to fetch keyword popularity, and the ASC key is for replying to reviews. Both are optional, and all data stays on device.
@PKodmad Yeah, ASO can feel like a black hole because it’s so tedious.
I just released a free tool to automate most of it, give it a try.
It can research competitors, keyword rankings, reviews, and screenshots, then help turn that data into ASO recommendations.
https://t.co/qTi6dLoZAA
@dev_alexandrum Everything is stored locally on your device. If iCloud is enabled, Apple syncs that app data across devices signed into the same Apple ID. There’s no account or login system on our end, and cross-device transcript history comes from iCloud syncing, not our servers.
@interplato @stefanoscalia It comes from public App Store APIs/search endpoints plus publicly accessible metadata/reviews/screenshots.
OpenASO basically handles gathering + structuring all that data for you locally, then MCP/skills analyze it.
The repo probably explains it better than I can in a tweet 😅
Give https://t.co/fVnqq0AYQU a try, this was one of the main workflows I built it around.
You can create prompting modes (prompt + model + shortcut) that rewrite messy dictation into clean specs/prompts before pasting.
Free w/ local models or your own API keys, plus a hosted free trial.
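A rough sketch of how a prompting mode (prompt + model + shortcut) could be modeled. The field names, shortcuts, and model identifiers below are placeholders I made up for illustration, not the app's actual config or API.

```typescript
// Hypothetical shape for a "prompting mode"; names and values are
// illustrative and not taken from the app.
interface PromptingMode {
  name: string;
  prompt: string;   // rewrite instruction applied to the raw dictation
  model: string;    // placeholder model identifier
  shortcut: string; // key combo that triggers this mode
}

const modes: PromptingMode[] = [
  {
    name: "spec",
    prompt: "Rewrite this dictation as a clear, structured spec.",
    model: "example-hosted-model",
    shortcut: "Ctrl+Alt+S",
  },
  {
    name: "quick",
    prompt: "Fix punctuation and remove filler words.",
    model: "example-local-model",
    shortcut: "Ctrl+Alt+Q",
  },
];

// Look up the mode bound to the shortcut that was pressed.
function findMode(shortcut: string): PromptingMode | undefined {
  return modes.find((m) => m.shortcut === shortcut);
}

interface RewriteRequest {
  model: string;
  messages: { role: "system" | "user"; content: string }[];
}

// Pair the mode's prompt with the trimmed dictation, ready to send to
// whichever model backend is configured (sending is out of scope here).
function buildRewriteRequest(mode: PromptingMode, dictation: string): RewriteRequest {
  return {
    model: mode.model,
    messages: [
      { role: "system", content: mode.prompt },
      { role: "user", content: dictation.trim() },
    ],
  };
}
```

The rewritten text that comes back would then be pasted in place of the raw transcript.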
@interplato @stefanoscalia Fair point. I should probably do a side-by-side.
A lot of the underlying ASO data is already public/free to access. OpenASO stays free because the collection happens locally, so there aren’t ongoing server costs.
The MCP + skills part grounds analysis in real App Store data.
@gabriel1 I do the same.
Lately, I’ve started using a post-processing prompt tuned to my own dictation patterns to see if the model performs better when it has less ambiguity to resolve.
Because my dictations are saved to disk, I can have Codex analyze recent prompts and auto-generate a better post-processing prompt, personalized to my prompting/speaking style.
My rough voice input gets cleaned before it hits the coding agent. Tiny workflow win that compounds.