Eleven v3 is out of alpha and ready for commercial use.
Since alpha, we've improved stability and accuracy:
- Stability: more reliable model and higher user preference scores
- Accuracy: 68% fewer errors on numbers, symbols, and technical notation
Today we’re introducing Scribe v2: the most accurate transcription model ever released.
While Scribe v2 Realtime is optimized for ultra-low latency and agent use cases, Scribe v2 is built for batch transcription, subtitling, and captioning at scale.
Today, we launch the ElevenLabs OSS Engineers Fund - a program that provides sustained support to the open-source projects that help power our work.
Over the next six months, we are contributing $22,000 to projects our engineers rely on.
We just released Scribe v2 Realtime, an industry-leading live speech transcription model 🎉 If you have feedback, my DMs are open
https://t.co/6ASZOC1FPT
@yacineMTB AI + human is still better than just one of them in isolation. A few translators still have work to do. But yeah, technology has forced many of them to find other work.
@omooretweets Cool! I used to do this manually with ChatGPT, Midjourney and PowerPoint, which took a little under 30 minutes for one story. Now it's easier!
Introducing Eleven Music. The highest quality AI music model.
- Complete control over genre, style, and structure
- Multilingual, including English, Spanish, German, Japanese and more
- Edit the sound and lyrics of individual sections or the whole song
@theo Have you considered using an acoustic imaging camera device (e.g. FLIR) to pinpoint exactly where the annoying sound comes from? Maybe rent one. You might need a model that is capable of filtering, to focus on the specific frequencies in the annoying sound.
I just published the world's fastest EBU-compliant audio loudness meter as a Python package. The underlying C++ code was developed and generously open-sourced by Nomono, and has been battle-tested in production for a long time. It is just a `pip install loudness` away!
I am starting to like Rust + pyo3 + maturin for creating fast Python packages. And the fact that the builds are ABI3-compatible means they are compatible with a wide range of Python versions, including future versions, like 3.14! This means less maintenance is needed.
audiomentations has surpassed 2k stars! I celebrated it by reimplementing Mp3Compression and Limiter in Rust, leading to some nice performance improvements. Also, Python 3.13 is supported now!
The new Limiter implementation is 30% faster, and you can find it here: https://t.co/pwcOmpWdA4
The main reason why it is faster is that the delay compensation is done directly in Rust instead of relying on np.pad and slicing.
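For anyone wondering what "delay compensation with np.pad and slicing" looks like, here is a minimal sketch. It is not the audiomentations code: `delay_signal` is a hypothetical stand-in for any processor (such as a lookahead limiter) that outputs its input shifted by a fixed number of samples, and `process_with_delay_compensation` is an illustrative name too.

```python
import numpy as np


def delay_signal(x: np.ndarray, delay: int) -> np.ndarray:
    """Hypothetical stand-in for a processor with latency.

    Returns the input shifted right by `delay` samples, same length as the input.
    """
    if delay == 0:
        return x.copy()
    return np.concatenate([np.zeros(delay, dtype=x.dtype), x[:-delay]])


def process_with_delay_compensation(x: np.ndarray, delay: int) -> np.ndarray:
    # Pad the end with zeros so the tail of the signal is not pushed out
    # of the buffer once the processor delays it.
    padded = np.pad(x, (0, delay), mode="constant")
    processed = delay_signal(padded, delay)
    # Drop the first `delay` samples so the output lines up with the input again.
    return processed[delay:]


samples = np.arange(1, 9, dtype=np.float32)
aligned = process_with_delay_compensation(samples, delay=3)
```

The padding and the slice each allocate a new array in NumPy, which is the kind of overhead that goes away when the shift is handled inside the native code.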
Sometimes, when arranging vocal music, it is useful to do a quick visual sanity check of the distribution of notes with respect to the vocal range of each group. I just made a quick Python CLI tool for doing that.
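A rough idea of what such a check can look like, as a sketch and not the actual tool: notes are given as MIDI note numbers, and the ranges in `VOCAL_RANGES` are assumed values that you would adjust for your own singers. The function names are made up for illustration.

```python
from collections import Counter

# Assumed vocal ranges as inclusive MIDI note numbers (adjust per group).
VOCAL_RANGES = {
    "soprano": (60, 81),
    "alto": (55, 74),
    "tenor": (48, 69),
    "bass": (40, 64),
}


def summarize_part(notes, low, high):
    """Count how many notes fall below, inside, and above the range."""
    below = sum(1 for n in notes if n < low)
    above = sum(1 for n in notes if n > high)
    return {"below": below, "in_range": len(notes) - below - above, "above": above}


def text_histogram(notes, low, high):
    """One line per pitch, highest first. '!' marks pitches outside the range."""
    if not notes:
        return ""
    counts = Counter(notes)
    top = max(max(notes), high)
    bottom = min(min(notes), low)
    lines = []
    for pitch in range(top, bottom - 1, -1):
        marker = " " if low <= pitch <= high else "!"
        lines.append(f"{pitch:3d} {marker} {'#' * counts.get(pitch, 0)}")
    return "\n".join(lines)


alto_notes = [55, 57, 57, 60, 62, 62, 62, 76, 53]
low, high = VOCAL_RANGES["alto"]
summary = summarize_part(alto_notes, low, high)
histogram = text_histogram(alto_notes, low, high)
```

Printing `histogram` gives a quick vertical view of where the notes pile up, with the out-of-range pitches flagged.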
What makes software development (and engineering) *really* hard instead is:
- Building the right thing (and knowing what this is)
- Coding yourself into a corner (common for juniors - and now also for AI!)
- Architecture
- Tech maturity & risks that come with it
- Testing
- Maintenance
- Migrations
- Real-world edge cases
- Non-functional requirements: e.g. latency, performance, cost of operations, security
- Compliance
- Tech debt
... and most importantly: people!! (collaboration, conflicts, ownership etc etc)