Ryan Po

about 2 months ago

Super cool work from the Odyssey team! Great to see more momentum in this direction (and happy to see the MultiGen framework getting adopted 😉). Multi-player/agent systems that scale to arbitrary numbers of agents might not be solvable with brute-force scaling alone. Can’t wait to see where the field goes with this direction!

Odyssey @odysseyml

about 2 months ago

Introducing Agora-1, a multi-agent world model. Multiple participants—human or AI—can now interact inside the same world simulation, all in real-time. Try our playable research preview today, with Agora-1 simulating a multiplayer GoldenEye deathmatch!

Po_lhr retweeted

Hansheng Chen @HanshengCh

about 2 months ago

High-fidelity generation is hitting a scaling crisis as DiT compute grows with image resolution and video length. But do we need high-resolution denoising at every step? We introduce Spectral Progressive Diffusion, a plug-and-play framework for efficient image and video generation that directly exploits the spectral autoregression property of diffusion to grow resolution during denoising. [1/7]

Po_lhr retweeted

about 2 months ago

New paper: AsymFlow🔥

JiT x0-prediction is not enough for pixel generation. Better keep velocity in a low-rank subspace:

- 1.57 FID on ImageNet (best pixel flow model)
- Finetunes FLUX.2 klein into pixel space, beats the original on HPSv3/DPG/GenEval (#1 overall on HPSv3)

1/7 https://t.co/FSz46hrJHj

3 months ago

Had a lot of fun building this during spring break. Pretty surreal to see a multiplayer generative game actually running in the browser (it even works on mobile). Go try it!

3 months ago

Our previous intern released an extremely impressive re-implemented demo of our paper on multiplayer diffusion game engines. https://t.co/UHUEVfkK8h

I think this might be the first time you can play a fully-functional multiplayer generative game online with other people. 🤯 https://t.co/ctnpmNnhC8

3 months ago

@natanielruizg Thanks Nataniel! We're going to be running this demo for a couple more days, so grab your friends and try it out!

3 months ago

A couple of weeks ago, we introduced MultiGen, our work on real-time multiplayer world models. After spending way too many hours playing it with friends internally, we knew we had to share it. Today, we're excited to collab with @modal to let you experience it for yourselves. Grab your squad and play the live demo here 👇

3 months ago

We built a real-time multiplayer game generated entirely by a neural network—and now you can actually play it. In collaboration with @modal, we just launched the live demo for MultiGen, our diffusion-based multiplayer game engine. Grab some friends and try it here 👇

3 months ago

@GordonWetzstein @modal Super excited about releasing this! We've been having so much fun playing MultiGen with our friends, and now everyone can try it from their browser (and phones).

Po_lhr retweeted

3 months ago

High-resolution image and video generation is hitting a wall because attention in DiTs scales quadratically with token count. But does every pixel need to be in full resolution? Introducing Foveated Diffusion: a new approach for efficient diffusion-based generation that allocates compute where it matters most. 1/7🧵
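As a rough back-of-envelope illustration of the quadratic claim above (a sketch, not taken from the Foveated Diffusion work itself): assuming a hypothetical patch size of 16 pixels per token side, doubling the image side length quadruples the token count and multiplies the number of pairwise attention interactions by 16. The function names are illustrative only.

```python
def token_count(height: int, width: int, patch: int = 16) -> int:
    """Number of tokens when the image is cut into patch x patch tiles.

    The patch size of 16 is an assumption made for this illustration.
    """
    return (height // patch) * (width // patch)


def attention_pairs(tokens: int) -> int:
    """Full self-attention compares every token with every token."""
    return tokens * tokens


low = token_count(512, 512)      # tokens at 512 x 512
high = token_count(1024, 1024)   # tokens at 1024 x 1024
ratio = attention_pairs(high) // attention_pairs(low)
print(low, high, ratio)
```

Under these assumptions, 4x the tokens costs 16x the pairwise interactions, which is the pressure that motivates spending full resolution only where it matters.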

Po_lhr retweeted

Eric Chan

@ericryanchan

4 months ago

Today, we announce our team’s progress in pursuing a different type of foundation model for robotics: the Direct Video Action Model (DVA), which is our best attempt to take robotics and turn it into a generative modeling problem we can scale. Technical blog: https://t.co/GMsxnC5wbJ

4 months ago

doomguy finds out he's AI generated

4 months ago

Excited to show some surprising inventions on generative multiplayer games we made at Google with Stanford. We call the work MultiGen.

I've always been inspired by early studios like id Software with Doom or Blizzard with Warcraft bringing networked video games to the next level. We are at the point in history where we can make strides like them, but for generative games. It's a strange feeling to be in the age of generative video games while still discovering how exactly to train the models and design the tools that make them useful.

All of the tools that have been invented for classic game engines need to be redesigned for generative games. For example, level and world design is not entirely possible with existing technology. We introduce editable memory to diffusion game engines, which allows for the design of new levels via a minimap. But we can easily imagine how this can be expanded with different creation tools. The end goal of this research direction is to allow game designers to guide the generation process of their world, at the granularity that they prefer.

Editable memory also allows us to add multiplayer to Generative Doom. We were amazed when we saw GameNGen some years ago, and now you can play it live with friends in real-time, on your couch or even online. Shared representations like our editable memory seem like the future for this type of experience.

Models are, in some cases, expensive and approximate encoders but great interpolators and extrapolators. Leveraging their strengths lets you have completely new experiences that can be realized now and not in the distant future.

This work was started at my previous team and continued in collaboration with Stanford. Congratulations to all for the discoveries.

4 months ago

It was a huge pleasure working with Nataniel and team on this project. Starting from his previous project (Unbounded), Nataniel’s vision for generative games is sure to shape the way we view entertainment in the coming years.

4 months ago

🔗 Project page: https://t.co/GroDn9JVOS
📄 Paper: https://t.co/11XpUSXj7G

This would not have been possible without our incredible collaborators, @JunhaoZHANG19, Amir Hertz, @GordonWetzstein, @neal_wadhwa, and @natanielruizg

4 months ago

🎮 Real-time multiplayer world model
👥 Arbitrary number of players
🧠 Generated entirely by a neural network

MultiGen is a real-time multiplayer diffusion game engine that supports an arbitrary number of players through a shared memory-based world model, rather than limiting interaction to just 2 players.

While single-player world models can already be entertaining, things really change once multiple people can step into the same generated world together. Here’s a 30-minute timelapse of 4-player gameplay running in real time.

4 months ago

One nice consequence of external memory is that it turns level design into a native part of the system. The world is defined explicitly through a top-down map layout, so users can build or modify the environment before inference starts, while the model generates first-person observations that stay aligned with that structure.
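To illustrate the idea of an explicit, editable top-down layout, here is a minimal sketch. It is not MultiGen's actual memory format, which this thread does not describe: the `MapMemory` class, the `#`/`.` cell encoding, and the `distance_to_wall` query are all hypothetical, and the query only stands in for whatever map-derived conditioning a model would consume. The sketch shows a map that can be edited before inference and then read consistently by any number of players.

```python
from dataclasses import dataclass

WALL, FLOOR = "#", "."  # hypothetical cell encoding


@dataclass
class MapMemory:
    """Hypothetical top-down layout shared by all players."""
    rows: list

    @classmethod
    def from_text(cls, text: str) -> "MapMemory":
        return cls([list(line) for line in text.strip().splitlines()])

    def set_cell(self, x: int, y: int, value: str) -> None:
        """Edit the layout, e.g. while designing a level before inference."""
        self.rows[y][x] = value

    def is_wall(self, x: int, y: int) -> bool:
        return self.rows[y][x] == WALL

    def distance_to_wall(self, x: int, y: int, dx: int, dy: int) -> int:
        """Free cells ahead of (x, y) along (dx, dy) before hitting a wall.

        Assumes the map is enclosed by walls, so the loop terminates.
        """
        steps = 0
        while not self.is_wall(x + dx, y + dy):
            x, y = x + dx, y + dy
            steps += 1
        return steps


level = MapMemory.from_text("""
#####
#...#
#...#
#####
""")

before = level.distance_to_wall(1, 1, 1, 0)  # player at (1, 1) facing east
level.set_cell(3, 1, WALL)                   # designer adds a wall
after = level.distance_to_wall(1, 1, 1, 0)
print(before, after)
```

Because every player reads the same structure, an edit made once is visible from every viewpoint, which is the property the post attributes to external memory.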

Po_lhr retweeted

4 months ago

Video world models today have a very limited context length. Mode Seeking meets Mean Seeking (MMM) unlocks long-context, persistent video world models through a unified representation. 1/8 🧵

Po_lhr retweeted

7 months ago

today we are releasing new research at Google. we tackle the previously unsolved task of editing motion in an existing video. it's called MotionV2V. with it you can move objects in videos, move the camera, and other unprecedented edits in user-provided video

Po_lhr retweeted

Ceyuan Yang

@CeyuanY

10 months ago

Long video generation usually means the context keeps growing during chunk-wise or frame-wise rollout. Since scaling context may require context selection, we introduce the idea of MoE into long-context modeling and propose Mixture of Contexts. All previous context/memory is considered, while the chosen parts are selected in a data-driven manner. You can easily enjoy 7x compute savings.
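The selection idea can be sketched with generic top-k routing. This is an assumption for illustration, not the routing rule from the Mixture of Contexts paper, which the post does not spell out: each past chunk is summarized by the mean of its key vectors, scored against the current query by dot product, and only the `k` best chunks are kept for attention. All names below are hypothetical.

```python
def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def select_chunks(query, chunks, k):
    """Return indices of the k chunks whose mean key best matches the query.

    Every chunk is scored (all context is considered), but only the
    selected ones would be attended to. Indices come back in temporal order.
    """
    scores = [dot(query, mean_vector(chunk)) for chunk in chunks]
    ranked = sorted(range(len(chunks)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


# Three past chunks, each holding two 2-d key vectors (toy data).
chunks = [
    [[1.0, 0.0], [0.8, 0.2]],
    [[0.0, 1.0], [0.1, 0.9]],
    [[0.5, 0.5], [0.7, 0.3]],
]
query = [1.0, 0.0]
selected = select_chunks(query, chunks, k=2)
print(selected)
```

Attending to `k` chunks instead of all of them is where the compute savings would come from in a scheme like this; the actual savings depend on the real method and settings.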
