Happy to share that our research work "DualTurn: Learning Turn-Taking from Dual-Channel Generative Speech Pretraining" has been accepted at Interspeech 2026!
Preprint - https://t.co/02GU5pDMmn
#Interspeech2026 #VoiceAI #FullDuplex #TurnTaking
The idea is to bring full-duplex behavior into a modular pipeline, bridging traditional half-duplex ASR-LLM-TTS-VAD systems and genuinely full-duplex conversation, without giving up the LLM capabilities production systems depend on.
@chiragbarjatya Thanks for the motivation! I didn’t get past the second line of your tweet 😅, but the idea of a Diwali milestone stuck with me. From 98 kg down to 89.xx kg. Still a journey ahead, but I’m happy to have hit my Diwali target! 🥂 I’ll enter Diwali with an 89 kg weekly average.
@monali_dambre I started using two bank accounts. Acc1 is where my salary, investments, and major payments go. Acc2 is for day-to-day spending; every month I transfer some money from Acc1 to Acc2. Acc2 is with @TheJupiterApp, which is great for tracking UPI, card, and bank transfers.
We at @Skit_ai are thrilled to announce the release of our latest Multi-Modal LLMs for Speech Understanding on @huggingface, along with a comprehensive GitHub repository containing the code to train these models and run inference with them!
https://t.co/HLtrZdDS4c
https://t.co/3p2b2D6vgZ
Because the model is simple to train, new perception or generation tasks can easily be added, such as multi-speaker transcription, speech environment classification, and speech translation.
👏 We are proud to announce that our team (@shangethr, Kriti A. (ex https://t.co/xX3RBgYTFd), and Swaraj Dalmia), in collaboration with Prof. Eng Siong Chng and Tarun Gupta of #NTUSpeechLab, has had its paper "Improving Spoken Language Identification with Map-Mix" accepted at #ICASSP23
We’re excited to release another open-source dataset, this one for phone-number capture from conversational human speech. Read more in this LinkedIn post: https://t.co/BDPVAJr6wp