Simon Rouard @simonrouard - Twitter Profile

Pinned Tweet

almost 2 years ago

Very happy to announce that my paper “Audio Conditioning for Music Generation via Discrete Bottleneck Features” done with @honualx @adiyossLC @jadecopet and Axel Roebel has been accepted at ISMIR24. Paper: https://t.co/2KwG6Bk1jH Sample: https://t.co/Dkom70Eoie Code: soon

2

98

23

29

6K

Simon Rouard @simonrouard

about 1 month ago

Today at 10:30 at @iclr_conf I’ll be presenting CALM (Continuous Audio Language Models), the architecture behind Pocket TTS, @kyutai_labs’s 100M-parameter TTS that runs on CPU. Come chat with me if you want to build an audio LM without tokens. Paper: https://t.co/IR2vaZ0wxH

0

36

2

9

1K

Simon Rouard @simonrouard

5 months ago

@kyutai_labs Thank you @nmboffi! We noticed that Lagrangian self-distillation worked much better than consistency for our TTS task.

0

1

0

1

345

Simon Rouard @simonrouard

5 months ago

Super happy that our work on Continuous Audio Language Models (https://t.co/8KmdWlymUB) led us to build an outstanding 100M TTS with voice cloning ability that runs on any laptop CPU.

kyutai @kyutai_labs

5 months ago

We’re excited to introduce Pocket TTS: a 100M-parameter text-to-speech model with high-quality voice cloning that runs on your laptop—no GPU required. Open-source, lightweight, and incredibly fast. 🧵👇

91

4K

471

4K

235K

0

29

7

1

2K

simonrouard retweeted

Gradium

@GradiumAI

6 months ago

Gradium is out of stealth to solve voice. We raised $70M and after only 3 months we’re releasing our transcription and synthesis products to power the next generation of voice AI.

81

1K

157

508

471K

simonrouard retweeted

kyutai @kyutai_labs

8 months ago

1/2 We’re releasing an in-depth tutorial on neural audio codecs, the secret sauce that makes it possible for audio LLMs to not sound like a horror movie:

12

436

55

295

47K

simonrouard retweeted

arXiv Sound @ArxivSound

9 months ago

Rouard Simon, Orsini Manu, Roebel Axel, Zeghidour Neil, Défossez Alexandre, "Continuous Audio Language Models," https://t.co/LatjRsbPgt

0

21

5

11

845

simonrouard retweeted

kyutai @kyutai_labs

12 months ago

Kyutai Speech-To-Text is now open-source! It’s streaming, supports batched inference, and runs blazingly fast: perfect for interactive applications. Check out the details here: https://t.co/bQMP56XaKC

33

617

116

421

66K

simonrouard retweeted

kyutai @kyutai_labs

about 1 year ago

Talk to https://t.co/1ZcGtCwvgx 🔊, the most modular voice AI around. Empower any text LLM with voice, instantly, by wrapping it with our new speech-to-text and text-to-speech. Any personality, any voice. Interruptible, smart turn-taking. We’ll open-source everything within the next few weeks.

121

2K

260

2K

285K

simonrouard retweeted

kyutai @kyutai_labs

over 1 year ago

Meet Hibiki, our simultaneous speech-to-speech translation model, currently supporting 🇫🇷➡️🇬🇧. Hibiki produces spoken and text translations of the input speech in real-time, while preserving the speaker’s voice and optimally adapting its pace based on the semantic content of the source speech. Based on objective and human evaluations, Hibiki outperforms previous systems for quality, naturalness and speaker similarity and approaches human interpreters. 🧵

21

473

106

307

167K

simonrouard retweeted

arXiv Sound @ArxivSound

over 1 year ago

"MusicGen-Stem: Multi-stem music generation and edition through autoregressive modeling," Simon Rouard, Robin San Roman, Yossi Adi, Axel Roebel, https://t.co/tGwJyDyRIH

1

17

4

7

952

simonrouard retweeted

kyutai @kyutai_labs

over 1 year ago

Meet Helium-1 preview, our 2B multi-lingual LLM, targeting edge and mobile devices, released under a CC-BY license. Start building with it today! https://t.co/X4Dbx2T1cJ

10

375

89

159

58K

Simon Rouard @simonrouard

over 1 year ago

I am presenting our paper MusicGen-Style “Audio Conditioning for Music Generation via Discrete Bottleneck Features” at @ISMIRConf this afternoon. The code as well as the weights of the model are available on https://t.co/tSvrr446v3. You can now play with it!

1

104

10

45

5K

simonrouard retweeted

Nicolas DUFOUR @nico_dufour

over 1 year ago

It starts now at poster 227!

0

28

3

2

3K

Simon Rouard @simonrouard

almost 2 years ago

The code and weights of the model will be released soon. Stay tuned!

0

160

Simon Rouard @simonrouard

almost 2 years ago

We can also combine text and style conditioning to generate music, but we noticed that the model tends to ignore the text prompt. We therefore introduce a double classifier-free guidance. This guidance could be applied to other multi-conditioned generative models.

1

0

1

193

simonrouard retweeted

arXiv Sound @ArxivSound

almost 2 years ago

"Audio Conditioning for Music Generation via Discrete Bottleneck Features," Simon Rouard, Yossi Adi, Jade Copet, Axel Roebel, Alexandre Défossez, https://t.co/Z01vzcESpi

0

21

2

6

1K

simonrouard retweeted

Jean-Marie Lemercier @jm_lemercier

almost 2 years ago

#ICML2024 paper “An Independence-promoting Loss for Music Generation with Language Models” We promote independence between EnCodec codebooks using a kernel trick and improve music generation quality 🎶 Paper 📜 https://t.co/Uyb1sIusze Audio/Code 🔊 https://t.co/MFmdsOIrxo

1

40

9

17

8K
