Even AI agents need @AndrewYNg for advice. So ours called him.
He happened to be recording a course - this course, which is live today ✨
Check it out here: https://t.co/wbnA01Y5Kf
#VoiceWithVB #VoiceAI #BuiltWithVB
New paper: Multi-Faceted Interactivity Alignment in Full-Duplex Speech Models
We use RL to post-train speech models (Moshi and PersonaPlex) to talk more like a human: to know when to respond, when to wait, and when to nod along with “yeah”s and “okay”s when listening.
Introducing Full-Duplex-Bench-v3
Voice agents aren't just chatbots: they need to ACT by using tools 🔨. To feel natural, they must handle your "ums" and mid-sentence corrections, and call APIs with low latency.
🧵
Ke Hu, Ehsan Hosseini-Asl, Chen Chen, Edresson Casanova, Subhankar Ghosh, Piotr Żelasko, Zhehuai Chen, Jason Li, Jagadeesh Balam, Boris Ginsburg, "SALM-Duplex: Efficient and Direct Duplex Modeling for Speech-to-Speech Language Model," https://t.co/TpuT1beBoP
Full-duplex speech LLMs could be the next AGI interface, not because voice is “cooler,” but because human intelligence operates through continuous, parallel, real-time multimodal signals. Text chat is turn-based and discrete; real interaction is overlapping, expressive, and alive.
Introducing NVIDIA PersonaPlex
🏷️ With great control comes great naturalness.
PersonaPlex explores the limits of naturalness and controllability in full-duplex conversational AI, and demonstrates that controllability is key to fully leveraging naturalness.
By prompting both voice and role, we can finetune a full-duplex model on real human conversations together with synthetic data in a single stage. This allows the model to learn natural tone and speaking style from real speech while covering a wide range of roles and behaviors from synthetic data, without getting confused. The result is human-level conversational naturalness with emergent generalization to new personas.
🎧 Demos and details
https://t.co/M46zeTjhTG
🤗 Model weights
https://t.co/CbSP3deztZ
💻 Code
https://t.co/MxCZevEkJ7
Paper
https://t.co/TmhBOG1aGg
Grateful to all co-authors and collaborators. Looking forward to presenting at #ICASSP2026!
💡 Don't just read this list, memorize it instantly!
I turned these antonyms into interactive flashcards. Practice "articulate" and others right now:
https://t.co/lNR5fF9u4w
#EnglishLearning #Vocabulary #Flashcards #StudyTips