Runjia Li

9 days ago

Gamma-World: Generative Multi-Agent World Modeling Beyond Two Players

8

103

14

54

31K

Google DeepMind @GoogleDeepMind

18 days ago

Impressive!!

Jianyuan

@jianyuan_wang

19 days ago

Introducing VGGT-Ω: scaling feed-forward reconstruction across static and dynamic scenes, and studying whether the learned geometric representations transfer beyond reconstruction.

14

839

144

239

769K

0

1

0

73

RunjiaLi retweeted

RuiningLi

@RayLi234

22 days ago

🚀 Introducing Articraft, a coding agent for articulated 3D asset creation. Articraft writes code, executes it, receives validation feedback, and refines the result into simulation-ready 3D assets with parts, joints, and motion. We’re also releasing Articraft-10K: 10,000+ articulated objects across 250 categories, unlocking large-scale interactive scenes for robotics simulation and physical AI. 🔗 Project page: https://t.co/FWutv61yx7 💻 Code: https://t.co/CpCYdBzMlv

23

744

107

785

182K

RunjiaLi retweeted

Stan Szymanowicz

@StanSzymanowicz

22 days ago

We made an interactive client-server viewer for LagerNVS with @JonathonLuiten! You can now interactively explore scenes from just a photo capture - no optimization, no 3D Gaussians, just load your images, run the model on a cloud GPU and stream the renders to your local browser. Check out the video below for some spaces I recently captured in Oxford, London and beyond!

5

175

28

169

16K

RunjiaLi retweeted

25 days ago

We’re reimagining a 50-year-old interface - the mouse pointer - with AI. 🖱️ These experimental demos show how people can intuitively direct Gemini on their screens using motion, speech, and natural shorthand to get things done 🧵

461

9K

1K

3K

2M

RunjiaLi retweeted

Xuanchi Ren

@xuanchi13

about 2 months ago

We scaled up Lyra to generate explorable 3D worlds! 🚀 Introducing Lyra 2.0 — turning a single image into a 3D world you can walk through, look back, and even drop a robot into 🤖 Code and Model available today! 🌐 Website: https://t.co/plBxCoWkNn (1/N)

29

877

123

381

1M

RunjiaLi retweeted

Alexander Pondaven @alexpondaven

2 months ago

Introducing ActionParty: the first video world model that controls up to 7 players simultaneously on the same screen across 46 game environments. We tackle the action binding problem in video diffusion, ensuring each player's action is applied to the right subject. 🧵

6

52

10

17

9K

RunjiaLi retweeted

Wei Yu @GnosisYu

2 months ago

Dropping an exciting new demo of MosaicMem! 👀🔥 A friend brought up a great question: why not combine long-horizon navigation video generation, promptable world events, and scene concatenation? Fair point — so we gave it a shot. 🎬✨ For more technical details, check this thread 🧵👇 https://t.co/qyQYwmHsE6 #WorldModel #GenerativeAI #VideoGeneration #InteractiveAI #Genie3 #EmbodiedAI #GameAI

0

107

21

72

9K

3 months ago

@Snapchat The work was done in joint collaboration with @WilliMenapace during my internship @Snap. Many thanks to @moayedhajiali, @ashmrz10, Chaoyang Wang, Arpit Sahni, @isskoro, Aliaksandr Siarohin, @JakabTomas, @han_junlin, @SergeyTulyakov, @philiptorr

0

4

0

367

3 months ago

🎉EgoEdit @Snapchat has been accepted to CVPR 2026! 🏆👻 We are bringing high-quality, real-time editing to egocentric videos. Our massive 100k video dataset and benchmark are ALREADY PUBLIC! 🔓🚀 🏠 Project Page: https://t.co/cEUZRxdLDf 🤗 Dataset: https://t.co/qCFRTY8cYG

6 months ago

EgoEdit: Dataset, Real-Time Streaming Model, and Benchmark for Egocentric Video Editing

4

84

14

37

30K

5

105

8

60

22K

3 months ago

@Snapchat Many thanks to my coauthors! And thanks to @_akhaliq for posting our paper!

0

1

0

220

RunjiaLi retweeted

3 months ago

Mode Seeking meets Mean Seeking for Fast Long Video Generation paper: https://t.co/TFznQW57cC

5

120

18

53

20K

RunjiaLi retweeted

5 months ago

WorldWarp: Propagating 3D Geometry with Asynchronous Video Diffusion https://t.co/7CvB39UGvG

6

84

15

65

14K

RunjiaLi retweeted

Junlin Han @han_junlin

8 months ago

Excited to share our new work: “Learning to See Before Seeing”! 🧠➡️👀 We investigate an interesting phenomenon: how do LLMs, trained only on text, learn about the visual world? Project page: https://t.co/9mQt3qnckL


7

159

26

57

26K

11 months ago

🎉 VMem is officially accepted to ICCV 2025! Excited to chat with everyone in Hawaii about making video generation consistent and interactive with our Surfel-Indexed View Memory 🏝️🎥 Also, huge thanks to my insanely helpful coauthors!

Chuanxia Zheng @ChuanxiaZ

12 months ago

VMem: Consistent Interactive Video Scene Generation with Surfel-Indexed View Memory

7

373

56

220

84K

5

60

7

9

15K

RunjiaLi retweeted

Tomas Jakab @JakabTomas

12 months ago

Excited to share VMem: a novel memory mechanism for consistent video scene generation 🎞️✨ VMem evolves its understanding of scene geometry to retrieve the most relevant past frames, enabling long-term consistency 🌐 https://t.co/AHBj6j1ecE 🤗 https://t.co/FbUbJHWW4F 1/ 🧵

4

58

11

27

16K

RunjiaLi retweeted

12 months ago

After two amazing years with @Oxford_VGG, I will be joining @NTUsg as a Nanyang Assistant Professor in Fall 2025! I’ll be leading the Physical Vision Group (https://t.co/byLxP7FE4a) — and we're hiring for next year!🚀 If you're passionate about vision or AI, get in touch!

24

240

29

63

43K

RunjiaLi retweeted