Introducing Magenta RealTime 2 (MRT2): the live music model you can play as an instrument.
MRT2 offers MIDI and prompt controls, and runs natively on a MacBook with <200ms latency.
Open weights. Open source inference engine. Suite of apps and plugins.
Hear what it can do and try it out for yourself below 🧵
Had one of those "Oh, I'm living in the future" moments yesterday.
Flying on the airplane. No internet.
Playing music with hand gestures controlling a realtime LLM running locally on my laptop
Person next to me thinking I'm crazy... all the while I'm having a great time 😅
🎉Jazzed after Day 1 of @BerkleeCollege's AI Music Summit. Many fellow speakers on #TeamHuman: small models, attribution, iteration, live perf., user-owned data, selective adoption. Stellar lineup of creators, educators, devs, CEOs & lawyers worldwide. Shows by OOD human artists
Have been doing some stem remixing work with Stable Audio 3.
📦 Medium model
🔉 init_audio holding the original audio file
😶‍🌫️ init_noise_level between 0.4-0.5 seems to be the sweet spot
🪄 Empty prompts
We've got a new model coming out next week! We've been having a lot of fun playing with it, and I hope you will too♥️
We'll be celebrating by presenting at the AI Music Summit at Berklee and helping teams at the hackathon afterwards build some wild new musical instruments 🎸
Using Stable Audio 3 to generate variations of an existing loop.
Unconditional generation (no prompt), renoising the latents to 0.5, and just using different seeds seems to generate a nice neighbourhood around the original. Generally keeps the harmonic context and feel.
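A rough sketch of the renoise-and-reseed idea, not Stable Audio 3's actual code. It assumes a simple linear blend between the encoded loop and Gaussian noise; the model's real noising schedule may differ. The `renoise` helper and the toy latent values are hypothetical stand-ins.

```python
import random

def renoise(latents, noise_level, seed):
    """Blend latents with seeded Gaussian noise.

    Assumption: a linear mix, (1 - level) * latent + level * noise.
    The schedule the real model uses is not stated in the post.
    """
    rng = random.Random(seed)
    return [
        (1.0 - noise_level) * x + noise_level * rng.gauss(0.0, 1.0)
        for x in latents
    ]

# Toy stand-in for the encoded loop (real latents would be far larger).
original = [0.2, -1.0, 0.5, 0.8]

# Same loop, same noise level, different seeds: each result is a different
# starting point for denoising, so each yields a different variation.
variations = {seed: renoise(original, 0.5, seed) for seed in (0, 1, 2)}
```

At level 0.5 half of each starting value still comes from the original loop, which may be why the harmonic context tends to survive.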
Experiment: Painting sound effects with @StabilityAI Stable Audio 3
1. Free-form drawing on a spectrogram-like canvas. Time on the x-axis, pitch on the y.
2. Synthesise the drawing to audio. Strokes control bandpass-filtering of a white noise source.
3. Use that audio as input for an SA3 audio-to-audio pass, combined with a text prompt.
In Claude Desktop with a handful of MCP tools.
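Step 2 could look roughly like this. It is a sketch, not the code behind the experiment. It assumes a stroke has already been reduced to one centre frequency per time frame, and it uses a standard biquad band-pass filter (RBJ cookbook coefficients) on white noise. The `synthesize_stroke` function, the sample rate, the frame length and the Q value are all illustrative choices.

```python
import math
import random

def synthesize_stroke(stroke_hz, sample_rate=16000, frame_len=160, q=8.0, seed=0):
    """Render one drawn stroke, given one centre frequency (Hz) per frame."""
    rng = random.Random(seed)
    x1 = x2 = y1 = y2 = 0.0  # filter state carried across frames
    out = []
    for f0 in stroke_hz:
        # Band-pass biquad coefficients (0 dB peak gain), normalised by a0.
        w0 = 2.0 * math.pi * f0 / sample_rate
        alpha = math.sin(w0) / (2.0 * q)
        a0 = 1.0 + alpha
        b0, b2 = alpha / a0, -alpha / a0  # b1 is zero for this filter
        a1, a2 = -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0
        for _ in range(frame_len):
            x0 = rng.uniform(-1.0, 1.0)  # white noise source
            y0 = b0 * x0 + b2 * x2 - a1 * y1 - a2 * y2
            x2, x1 = x1, x0
            y2, y1 = y1, y0
            out.append(y0)
    return out

# A rising stroke: 50 frames sweeping upward from 300 Hz.
stroke = [300.0 + 18.0 * i for i in range(50)]
audio = synthesize_stroke(stroke)
```

A higher `q` narrows the band and makes the stroke sound more pitched; a lower `q` keeps it closer to filtered hiss.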
The call for the NeurIPS 2026 Creative AI Track is out!
In its fourth year, NeurIPS 2026 Creative AI Track invites research papers and artworks that explore emerging applications, methods, and critiques of artificial intelligence and machine learning in art, design, and creative practice.
Focusing on the theme of Agency, this year’s track asks how agency emerges, is exercised, is negotiated, and is contested through creative practice with AI. Agency may belong to an artist, a collaborator, a model, an audience, a platform, a community, or even a larger social and technical system, and may be asserted, delegated, shared, resisted, constrained, or redistributed.
Important dates:
June 30: Submission Portal Opens
August 3 (Anywhere on Earth): Submission Deadline
September 18: Decision
October 23: Final Camera-Ready Submission
For more information, visit: https://t.co/ju2vjjMKfI
The moment it feels like playing a live instrument, we surpass the early days of neural synthesis -- are we close?
Novack and crew are gods -- they hath finetuned a Stable Audio Open Small into a live music diffusion model for thy pleasure
Can we transform offline audio diffusion into real-time streaming interactive instruments?
Yes!
Presenting Live Music Diffusion Models: a new paradigm for taking your favorite open models into live performance, right on your own laptop! 🎵🎵
🧵
Stable Audio 3, explained in 5 figures.
It’s a family of open-weight models for generating instrumental music and sound effects.
The models are fast, support editing, and are trained on licensed and Creative Commons audio.
👾 https://t.co/e8qhZpVv2w
🏋️‍♂️ https://t.co/aRLGCXGBNr
I’m promoting our new conversational music recommendation dataset, Reddit2Deezer, the largest real-world, grounded CMR dataset (200k–600k conversations). The tracks and albums are mapped to the Deezer API, which enables straightforward access to audio previews and rich metadata.
Say hello to Project LYDIA Phase II!
Developed in partnership with our friends at Roland Future Design Lab @RolandGlobal, we're proud to announce the next step in our journey towards neural hardware.
Article: https://t.co/kmKsghZ3Ld
In the spirit of celebrating non-AI creative work: it’s the anniversary of this music video, in which I held my breath for 4.5 minutes in order to perform the entire song underwater in one shot. No FX, no splicing of takes. It took me 3 months of training to get my breath hold up to a stationary 5 minutes in preparation for the shoot. (Don’t worry, the drowning is acting; I wasn’t running out of air.)
It’s a slow-paced video. But the focus on a gradual buildup of surreal dread is meant to activate mirror neurons in the watcher, the same ones that turn on when you see someone yawn. I wanted you, over the course of 4 minutes, to slowly drown with me. To experience the feeling that birthed this song in the first place.
(Full thing on YT, it’s called Mabúl)
We're launching the agentic robotics app store today. Let's democratize AI robotics for all!
300+ apps shipped. 10,000 robots in the wild. It used to take a robotics engineer weeks to build apps; now anyone can do it in hours with ML intern or your favorite neighborhood agent!
My favorite Reachy Mini app was built by Joel, a 78-year-old marketing exec who'd never coded in his life. Personally, I built an office receptionist in two hours last week.
More info to start building here: https://t.co/mQ6YM05lgg