Best way to kick off a week with the @LiveKit robotics team: win the @southpkcommons Embodied AI Hackathon the weekend before.
Over 48 hours, we really cooked something special. Distributed low-latency inference (VLMs, ACT, MolmoAct 2), teleop, voice agent orchestration, all powered by @LiveKit.
After nearly 18 years I can stop working on Model S and X. We put so much love into these products, but will continue to pour that into the future products. Thanks to everyone who believed in and supported these cars through the years. We strived for the best and will never stop. Saying goodbye to something great and making room for something even greater!
Ship a voice agent on any website with a single script tag.
The widget supports voice, video, screen share, and text chat. Configure branding, capabilities, and per-visitor context from the LiveKit Cloud dashboard. Works on Shopify, Webflow, WordPress, or any custom site.
New Voice AI Model from @resembleai's Research Team: DramaBox! 🎭
A Voice AI model SHOULD give you two things: an Oscar-worthy performance and a verifiable signature to prove it's yours.
DramaBox is the first model that does both.
Open Source, available today!
Your outbound phone agent has 1-2 seconds to figure out if it's talking to a person, a voicemail, or an IVR.
We shipped Answering Machine Detection (AMD) in LiveKit Agents to do that for you so your agent knows when to keep talking, leave a message, use the keypad, or hang up.
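A rough sketch of the decision the post describes: once detection says who picked up, the agent picks one of four actions. Every name here (`CallParty`, `Action`, `next_action`) is hypothetical and is not the LiveKit Agents API, and the mapping of each party to an action is my assumption from the order of the sentence above.

```python
from enum import Enum


class CallParty(Enum):
    """Who (or what) answered the outbound call. Hypothetical labels."""
    HUMAN = "human"
    VOICEMAIL = "voicemail"
    IVR = "ivr"
    UNKNOWN = "unknown"


class Action(Enum):
    """The four outcomes named in the post."""
    KEEP_TALKING = "keep_talking"
    LEAVE_MESSAGE = "leave_message"
    USE_KEYPAD = "use_keypad"
    HANG_UP = "hang_up"


# Assumed mapping: person -> talk, voicemail -> message, IVR -> keypad.
_ACTIONS = {
    CallParty.HUMAN: Action.KEEP_TALKING,
    CallParty.VOICEMAIL: Action.LEAVE_MESSAGE,
    CallParty.IVR: Action.USE_KEYPAD,
}


def next_action(party: CallParty) -> Action:
    """Return the action for a detection result; hang up if unrecognized."""
    return _ACTIONS.get(party, Action.HANG_UP)
```

Falling back to hanging up on an unknown result is a conservative choice for the sketch; a real agent might retry detection instead.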
The killer feature on @livekit's new custom voices is automatic fallback to the same voice clone on another TTS provider if the primary TTS provider request fails.
The fun feature is how easy it is to hear the differences in a clone across different TTS providers.
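For anyone curious what provider fallback looks like in general, here is a minimal sketch of the pattern, not LiveKit's implementation. The providers are stand-in functions (`flaky_primary`, `steady_backup` are made up for the demo), and `synthesize_with_fallback` is a hypothetical helper name.

```python
from typing import Callable, Sequence

# A provider is modeled as a plain function: text in, audio bytes out.
Synthesize = Callable[[str], bytes]


def synthesize_with_fallback(
    text: str, providers: Sequence[tuple[str, Synthesize]]
) -> tuple[str, bytes]:
    """Try each provider in order; return the first one that succeeds."""
    errors = []
    for name, synth in providers:
        try:
            return name, synth(text)
        except Exception as exc:  # any provider failure triggers fallback
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all TTS providers failed: " + "; ".join(errors))


def flaky_primary(text: str) -> bytes:
    """Stand-in for a primary provider whose request fails."""
    raise ConnectionError("primary provider timed out")


def steady_backup(text: str) -> bytes:
    """Stand-in for a second provider; returns fake audio bytes."""
    return text.encode("utf-8")


used, audio = synthesize_with_fallback(
    "hello", [("primary", flaky_primary), ("backup", steady_backup)]
)
```

Here `used` ends up as `"backup"` because the primary raised. The same voice clone would have to exist on both providers for the switch to be inaudible, which is the part the post credits to custom voices.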
Incredible work by the Inworld team!
They fused an LLM backbone into their TTS, making it possible to prompt the TTS to express a full range of emotions.
Emotion steering hints like "[speak warmly, engaged, with a reflective tone]" give the model enough information to synthesize the right expressions into speech.
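A tiny sketch of building a hint in that bracketed form. `with_emotion_hint` is a hypothetical helper, and placing the hint in front of the text is my assumption; check the provider's docs for where the hint actually belongs.

```python
def with_emotion_hint(text: str, *directions: str) -> str:
    """Prefix text with a bracketed steering hint (assumed placement)."""
    cleaned = [d.strip() for d in directions if d.strip()]
    if not cleaned:
        return text  # no directions given, leave the text untouched
    return f"[speak {', '.join(cleaned)}] {text}"


line = with_emotion_hint(
    "Tell me more about that.", "warmly", "engaged", "with a reflective tone"
)
```

`line` comes out as `[speak warmly, engaged, with a reflective tone] Tell me more about that.`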
Introducing Realtime TTS-2, a new generation of voice model built for realtime conversation.
It is the first voice model that hears the conversation, takes natural-language voice direction, holds one voice identity across over 100 languages, and speaks like a person who is paying attention.
The result is voice AI that feels as good as it sounds.
Try it out: https://t.co/80xL7AJveV
Learn More: https://t.co/PLUiAEFizP
Smallest AI is now live on @livekit
Lightning TTS and Pulse STT, both available natively in the LiveKit Agents framework.
→ Lightning TTS: SoTA conversational speech, ~100ms latency, 15 languages with mid-sentence code-switching, voice cloning from 10 seconds of audio
→ Pulse STT: 64ms TTFT, 32 languages, word-level timestamps, PII/PCI redaction
Plug it in, ship a voice agent.
Docs and cookbook in the thread.
We shipped structured data collection for voice agents.
Use data collection mode in Agent Builder, or Tasks and TaskGroups in our Agents SDKs.
Every session ends with a clean JSON payload for your CRM, form, or database.
Great for lead qualification, patient intake, and surveys.
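To make "clean JSON payload" concrete, here is a sketch of what a lead-qualification record could look like at the end of a session. The field names, the `LeadQualification` class, and the payload shape are all invented for illustration; this does not use or describe the Tasks or TaskGroups API.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class LeadQualification:
    """Hypothetical fields an agent might collect during a call."""
    name: Optional[str] = None
    company: Optional[str] = None
    budget_usd: Optional[int] = None

    def missing_fields(self) -> list[str]:
        """Names of fields the agent has not collected yet."""
        return [key for key, value in asdict(self).items() if value is None]


def to_payload(record: LeadQualification) -> str:
    """Serialize the record as JSON, flagging whether it is complete."""
    body = {"complete": not record.missing_fields(), "fields": asdict(record)}
    return json.dumps(body, sort_keys=True)


lead = LeadQualification(name="Ada", company="Acme")
payload = to_payload(lead)
```

With `budget_usd` still unset, `lead.missing_fields()` returns `["budget_usd"]` and the payload carries `"complete": false`, which is the kind of signal a CRM import could use to skip or flag partial records.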
Running a company:
2020: can you survive a pandemic?
2021: still here? we’re going to give all of your competitors $100m series A rounds.
2022: wow, you made it? okay, all engineers cost $600,000/year now.
2023: nice job! okay, SVB failed and we’re going to take away your bank account.
2024: a survivor I see. but can you pivot from ai to crypto to defense tech back to ai-enabled defense tech in a 12 month period to stay relevant?
2025: unfortunately all of your competitors have raised $2b series B rounds. oh and only 500 engineers are relevant and they cost $100m/yr each.
2026: well, well, well. you’re still in business? let’s deploy the thunderclap of godlike LLMs from the heavens so all of your customers can rebuild your app in 2 hours. can you survive?