Introducing Agora-1, a world model that's learned to simulate multi-agent experiences. It's so fun.
Today we're launching a playable research preview, where you can relive your childhood and enjoy a multiplayer simulation of GoldenEye.
So excited about this new capability!
Starchild-1 is the first ever real-time multimodal world model.
One of the hardest problems in AI-generated storytelling is that production-time is much slower than consumption-time.
What's even worse, there are at least two (audio and video), sometimes three (plus lip-sync), different dimensions you have to sequence back-to-back-to-back if you want to see and hear a video with a character that looks believable.
But even if you get this far, there's still the subjective eval that must pass: does the character feel believable?
Are they conveying real emotion?
Are they gesticulating for dramatic effect?
Thanks to Odyssey and team, we bring you a multimodal real-time solution where production outpaces consumption on all three of these dimensions.
You've never seen a real-time generated talking head like this.
No tricks
No frame stitching
No multi-model hops
One model streaming audio and video in real time.
What's even more interesting, the team and I didn't set out to solve Talking Heads. We wanted to solve real-time streaming of a World Model, which means this model can do a lot more than Talking Heads. It just so happens to do them like this...
Meet our new friend, Starchild-1 ❤️
A world model understands and simulates the world. Starchild-1 has learned to generate not just the visuals of the world, but the sounds of it too!
@olivercameron @odysseyml I've never seen Talking Heads this good.
Real-time prompts & outputs for:
- Hand gestures
- Facial Emotions
- Emotional prosody
- Head position
🤩
TOMORROW, 9:30AM PT
We go live with an @odysseyml announcement (or two) I've been looking forward to for some time.
I think these releases will make where world models are headed, and how powerful they can become, much more obvious.
So hyped. GPUs scaling as we speak.
What made solving self-driving so crazy is that learning to drive really meant learning how the world works and interacts from visual observations of roadways.
Imagine all the pretrained knowledge we humans have to know not to bump into things or how to negotiate with other humans. It’s why we can learn to drive with just 30 hours behind the wheel. To make a self-driving car, we had to find a way to give machines a similar base intelligence.
After an insane decade of self-driving research, the problem was cracked. Models successfully learned the narrow world of driving from millions of visual observations, and thus became superhuman drivers. Self-driving cars now regularly demonstrate this emergent, really interesting understanding of the world—things like slowing down for occlusion, anticipating a pedestrian stepping out, or that driving a certain way would change another car’s behavior.
Now I work on general world models, which push the concept of models learning the world much, much further.
A general world model learns from visual observations of the entire world (universe?), not solely roadways. It’s difficult to comprehend just how diverse these observations are. It’s every ocean, glacier, mountain, forest, city, building, simulation, and scenario you can imagine.
Modeling this diversity of pixels, sounds, and actions is dramatically harder, but what I believe will eventually emerge from general world models is a profound understanding of the world—of physics, cause-and-effect, and human behavior—at a depth we can’t really comprehend today.
What knowledge or scientific breakthroughs or sci-fi applications will that intelligence enable? What will it contribute to our understanding of the universe and our place in it? What lives might it even be able to save? We don’t know yet, and we're so early, but I’m on a very rewarding journey with incredible humans at @odysseyml to find out.
Our latest general world model—Odyssey-2 Max—is a humble step on this journey. Our scaling efforts have materially advanced the state of the art in physics-accuracy for world models, and in real-world use it demonstrates a fascinating understanding of the world. I couldn’t be prouder of this team for what they've delivered.
We have much to do, and much we will share soon, but there’s no problem I can think of that’s more meaningful to work on than this.
@olivercameron @Astro_ChrisW It's a special time in human history. Until now, only the past could teach us. Now with World Models, we'll have the fortune to simulate the past, present and future, and learn from them all. A truly noble mission to be part of. Thanks for showing us the way!
@odysseyml "We are living in a computer-programmed reality, and the only clue we have to it is when some variable is changed, and some alteration in reality occurs."
- Philip K Dick
Today at Odyssey, we’re introducing Odyssey-2 Max, our largest and most capable world model yet.
The future beyond LLMs should now be clear.
It looks and feels like the holodeck in Star Trek. The simulation engine at Amaya in DEVS. The world loader in The Matrix. All of these depend on this foundational technology, streaming simulated pixels in real-time.
We look forward to helping the world explore this new and exciting path. We treat the opportunity to do so very seriously and with the highest standards of craftsmanship and support we can muster. We're here to hold it gently, and let the world show us how it should be used.
Come explore and be inspired.
Imagination is all you need. 🌎🤖
It’s time to go beyond language models.
Introducing Odyssey-2 Max, our most powerful world model yet. It materially advances the SOTA in physical accuracy.
This is a big step toward models that simulate and interact with the world in real time.
A new intelligence entirely!