Nick Stracke ✈️ CVPR @rmsnorm - Twitter Profile

Pinned Tweet

about 2 months ago

Video diffusion models learn motion indirectly through pixels. But motion itself is much lower-dimensional. We introduce 64× temporally compressed motion embeddings that directly capture scene dynamics. This enables efficient planning -> 10,000× faster than video models. 🧵👇

10

327

48

271

44K

rmsnorm retweeted

jo.schb ✈️CVPR @jo_schb

1 day ago

⚠️ Standard first stages are not sufficient for safety-critical applications! The most extreme weather events are often the hardest to decode. One latent → many plausible reconstructions. Deterministic decoders hide that uncertainty. Meet FREUD 🧵👇

2

11

7

1

755

Nick Stracke ✈️ CVPR

@rmsnorm

4 days ago

Check out our work on how to scale NVS on internet-scale data! We provide fixes to the unsupervised NVS pipeline (RayZer) and also obtain more interpretable pose estimations while simplifying the overall setup.

Stefan Baumann✈️CVPR🔜SF

@StefanABaumann

4 days ago

The internet is full of video. So why can't novel view synthesis just scale on it? Real-world video is simultaneously unposed, messy, and dynamic, breaking self-supervised NVS. We fixed that. RayDer learns static-scene NVS from dynamic internet video, scaling like an LLM. A🧵

7

152

34

100

14K

0

8

1

2

715

Nick Stracke ✈️ CVPR

@rmsnorm

about 1 month ago

💡 Training with differently noised patches increases overall image gen performance, as the model learns a better underlying representation. This holds even for plain Euler sampling, but their sampler increases the gap even more!

jo.schb ✈️CVPR @jo_schb

about 1 month ago

Diffusion models treat every part of an image equally. → Same number of steps. Same compute. But images aren’t uniform. 🤔 Some regions are easy, others are hard. So why force the model to treat them the same? 🧵

11

584

81

471

76K

0

28

0

12

4K

Nick Stracke ✈️ CVPR

@rmsnorm

about 1 month ago

@jonchu dm

0

1

0

162

Nick Stracke ✈️ CVPR

@rmsnorm

about 2 months ago

@nurvai_ai You need periodic regrounding, and that's also what we do for LIBERO. You usually also have a translation error from converting tracks into actions that a robot can actually execute, which you also have to compensate for.

0

2

0

112

rmsnorm retweeted

Simo Ryu

@cloneofsimo

about 2 months ago

Cool stuff

0

36

5

19

8K

rmsnorm retweeted

Nick Stracke ✈️ CVPR

@rmsnorm

about 2 months ago

Stop predicting motion step-by-step. Model the whole motion in a compact representation for efficient planning. 📄 Paper: https://t.co/S51t6kxMqY 💻 Models: https://t.co/MohDxwjpz0 Joint work with @KoljaBauer, @StefanABaumann, @itsbautistam, Josh Susskind, and Björn Ommer.

1

30

6

23

3K

Nick Stracke ✈️ CVPR

@rmsnorm

about 2 months ago

@itsbautistam @KoljaBauer @CVPR Thanks Miguel!

0

1

0

88

rmsnorm retweeted

Miguel Angel Bautista

@itsbautistam

about 2 months ago

Amazing work led by @rmsnorm @KoljaBauer and our collaborators at LMU, to be presented at @CVPR! Personally, I find this question of "what's the right level of abstraction for planning in physical space?" to be very intriguing. Pixels over time are very low SNR (i.e., the argument behind JEPA), but motion/trajectories carry a lot of information while being extremely compressible. I believe there's a lot more to uncover from this direction. Very glad to be part of this one!

1

14

3

2

2K

rmsnorm retweeted

Brian Roemmele

@BrianRoemmele

about 2 months ago

A massive step forward for AI video!

0

27

3

6

5K

Nick Stracke ✈️ CVPR

@rmsnorm

about 2 months ago

@Frid45 Thanks! 🇪🇺 Supporting European research, I see 👀

0

4

0

418

rmsnorm retweeted

atharva ☆

@k7agar

about 2 months ago

I have been saying

0

22

1

17

5K

rmsnorm retweeted

Nick Stracke ✈️ CVPR

@rmsnorm

about 2 months ago

@KoljaBauer @StefanABaumann @itsbautistam 1️⃣https://t.co/2wWMnibvRa Also, shoutout to two other recent works that explore how to use point tracks for world modeling. 👇...

1

20

4

1

2K

Nick Stracke ✈️ CVPR

@rmsnorm

about 2 months ago

@KoljaBauer @StefanABaumann @itsbautistam 2️⃣https://t.co/OyDxtn4kzg

Stefan Baumann✈️CVPR🔜SF

@StefanABaumann

about 2 months ago

You don't imagine the future by mentally rendering a movie. You trace how things move -- abstractly, sparsely, step by step. We built a model that does exactly this. It predicts motion, not pixels -- and it's 3,000× faster than video world models. Myriad, accepted at @CVPR 2026

4

350

54

248

26K

0

15

2

1

1K

Nick Stracke ✈️ CVPR

@rmsnorm

about 2 months ago

@KoljaBauer @StefanABaumann @itsbautistam 1️⃣https://t.co/2wWMnibvRa Also, shoutout to two other recent works that explore how to use point tracks for world modeling. 👇...

Neerja Thakkar

@neerjathakkar

2 months ago

What’s the right representation for a world model? 3D, pixels, or something else? Excited to release our new paper “Forecasting Motion in the Wild” where we propose point tracks as tokens for generating complex non-rigid motion and behavior. From @GoogleDeepmind @Berkeley_AI @TTIC_Connect

7

464

74

278

79K

1

20

4

1

2K

rmsnorm retweeted

Kolja Bauer @KoljaBauer

about 2 months ago

Do we really need pixel generation to model motion? 🤔 We show how directly representing motion in a compact space enables efficient, scalable planning. 10,000× faster than video models, enabling planning and reasoning in open-world and robotics settings. Check it out ⬇️

1

29

4

17

4K
