Hubert Siuzdak

CEO @Multimodal. Agentic AI for regulated financial workflows. Posting what we learn building agents that handle real loan files in production.

about 2 months ago

@JulienBlanchon thank you! yes



about 2 months ago

I'm joining ElevenLabs, together with the founding team of Papla Media, to continue our shared mission of advancing voice interfaces. Excited for what's ahead.

HubertSiuzdak's tweet photo: https://t.co/OK7p6UYurw


about 2 months ago

@matiii @ElevenLabs @dabkowski_piotr didn't expect this photo 😅 Mati, Piotr - really appreciate the opportunity. Excited to build together 🇵🇱


11 months ago

Cool work and a nice read. But why did we rebrand masked language models as diffusion language models 😭

Jinjie Ni

@NiJinjie

11 months ago

Token crisis: solved. ✅

We pre-trained diffusion language models (DLMs) vs. autoregressive (AR) models from scratch — up to 8B params, 480B tokens, 480 epochs.

Findings:
> DLMs beat AR when tokens are limited, with >3× data potential.
> A 1B DLM trained on just 1B tokens hits 56% HellaSwag & 33% MMLU — no tricks, no cherry-picks.
> No saturation: more repeats = more gains.

🚨 ”https://t.co/jmUcE1kywW”
We also dissected the serious methodological flaws in our parallel work “Diffusion Beats Autoregressive in Data-Constrained Settings” — let’s raise the bar for open review!

🔗 Blog & details:
https://t.co/sEQvYUxElj

18 🧵s ahead:


11 months ago

Papla Voicebot is genuinely fun to talk to! Our Papla P1 engine, when running locally, responds in 50-150 ms and it’s also pretty expressive. The main bottleneck right now is Gemini 2.0 Flash latency: the first chunk we pass to TTS typically arrives in ~600 ms. We're exploring a few ideas to make it even snappier, including running a local open-source LLM.
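A rough sketch of the latency budget described above, using only the two figures quoted in the post. It assumes the LLM and TTS stages run strictly one after the other and ignores speech recognition, network, and playback time, none of which the post quantifies. The function and constant names are illustrative, not part of any Papla API.

```python
# Figures quoted in the post; treating the stages as strictly sequential is an assumption.
LLM_FIRST_CHUNK_MS = 600   # "~600 ms" until the first LLM chunk reaches TTS
TTS_RANGE_MS = (50, 150)   # "50-150 ms" for Papla P1 running locally


def time_to_first_audio_ms(llm_ms: float, tts_ms: float) -> float:
    """Add the two stage latencies, since TTS cannot start before its first text chunk."""
    return llm_ms + tts_ms


best = time_to_first_audio_ms(LLM_FIRST_CHUNK_MS, TTS_RANGE_MS[0])
worst = time_to_first_audio_ms(LLM_FIRST_CHUNK_MS, TTS_RANGE_MS[1])

# Fraction of the total wait that is spent on the LLM's first chunk.
llm_share_best = LLM_FIRST_CHUNK_MS / best
llm_share_worst = LLM_FIRST_CHUNK_MS / worst

print(f"time to first audio: {best}-{worst} ms")
print(f"LLM share of that: {llm_share_worst:.0%}-{llm_share_best:.0%}")
```

Under these assumptions the LLM accounts for roughly 80% or more of the wait before the first audio, which is consistent with calling it the main bottleneck.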

11 months ago

Here's @HubertSiuzdak having a quick chat with our AI voicebot. It listens, responds in real time, and even laughs - just like a real person.


11 months ago

Ok but doing modeling @runwayml is way funnier, feel free to steal


11 months ago

Or we just say we're doing modeling now

Elon Musk

@elonmusk

11 months ago

This false nomenclature of “researcher” and “engineer”, which is a thinly-masked way of describing a two-tier engineering system, is being deleted from @xAI today. There are only engineers. Researcher is a relic term from academia.


HubertSiuzdak retweeted

12 months ago

Building the future of AI voice… Allergy season meets beta testing and apparently, our voicebot has opinions now. 🤧🤖 Still more emotionally aware than some customer service lines. Big things are coming from Papla Media 💛 #ConversationalAI


about 1 year ago

We raised $360k in a pre-seed round - grateful to everyone who believed in us early. We've got some great things coming, and can't wait to share them soon!

about 1 year ago

We're happy to share that Papla Media raised a $360k pre-seed round. https://t.co/uRa8yI685c


HubertSiuzdak retweeted

about 1 year ago

🚀 New TTS Playground Now Live at Papla Media!

We just launched a powerful new playground on the Papla Media platform designed to make working with AI voices faster and easier than ever. Now you can:
📝 Type text and hear it instantly in ultra-realistic voices
🎙️ Try multiple voices before deciding which one to use in your app
🔊 Prototype and download audio
🧬 Generate speech from your cloned voices directly via the UI
⚡ Move and experiment fast

Whether you’re building tools, testing ideas, or just playing with voice, this space is for you.


HubertSiuzdak retweeted

about 1 year ago

🚀 Introducing the New Voice Tab

We just made it easier (and more fun) to explore, test, and create voices like never before.
🎧 Browse and preview voices across styles, accents, and tones
🧬 Clone your own voice in a few clicks from just 10 seconds of audio.

Perfect for creators, developers, brands, and anyone building with voice
👉 Jump in, explore the voices, and try cloning your own: https://t.co/0ZFohrqPwl


about 1 year ago

@laion_ai yeah, you can steer emotions/styles with different audio prompts!


about 1 year ago

is this the state of the art in instant voice cloning? What should we compare it to?

about 1 year ago

🎙️ Voice Cloning with Papla P1 Just Got Real

With only 10 seconds of audio, P1 can create a highly realistic voice clone that captures tone, accent, and personality. This means fast, scalable voice personalization that sounds authentic and deeply human.
🔹 Just 10 seconds of voice input
🔹 Natural rhythm and emotional nuance
🔹 Ready for content, games, apps, and more

From hyper-personalized experiences to next-gen audio production, Papla P1 voice cloning opens up endless possibilities.
🔊 Let us know what you think in the comments, we’d love to hear your thoughts.


HubertSiuzdak retweeted

about 1 year ago

We're live on the TTS Arena! Check out Papla P1 and see how it compares → https://t.co/6WF60dOVj3


about 1 year ago

Today we're launching our platform with Papla P1 - our ultra-realistic text-to-speech model, available via API! Create truly conversational voices - and with real-time streaming, it's perfect for live, interactive apps.

about 1 year ago

Introducing Papla P1 and our real-time API for developers! 🚀 Papla P1 is our advanced text-to-speech model, now available through a developer-friendly platform. Easily generate realistic speech, clone voices, and build natural-sounding conversations into your apps.


over 1 year ago

Excited to be at NeurIPS 2024 in beautiful Vancouver! Come check out our poster on SNAC at the Saturday Audio Workshop. Big thanks to Luca @lucalanze and Florian for the collab. DM me if you’d like to chat about audio generative models, conversational AI, or startups!

HubertSiuzdak's tweet photo: https://t.co/KBRnoXatlM

arXiv Sound @ArxivSound

over 1 year ago

"SNAC: Multi-Scale Neural Audio Codec," Hubert Siuzdak, Florian Grötschla, Luca A. Lanzendörfer, https://t.co/GQrDlEaHeQ
