Today, voice models have no problem generating “angry” or “sad” expressions.
But ask for:
→ bored + fast
→ joy + shy
→ disappointment + confident
…and most systems collapse into stereotypes.
Our latest research blog post explores why this happens, and how disentangling emotion from voice at the data layer improves expressive control. Read more below!
@JRBlake 17 of my friends and I are running the @nycmarathon, aiming to raise over $54,000 for #TeamJBF. Our story is pretty cool: https://t.co/gb46JL3KxY