Daohan Lu @fred_lu_443 - Twitter Profile

Pinned Tweet

4 months ago

Despite impressive visuals, current video world models that only generate a single agent’s perspective aren’t modeling a complete world. The complex behaviors that arise from real or virtual worlds do not happen in a vacuum. They arise from interactions among many agents. Instead of modeling a one-agent island, let’s try modeling a multi-agent planet. This led to our project, Solaris [1/9]

1

77

14

16

6K

fred_lu_443 retweeted

Xichen Pan

@xichen_pan

19 days ago

Modern text-to-image models are increasingly powered by large pretrained LLMs. But there is a curious mismatch: the LLM typically encodes the prompt only once, while the evolving noisy latent states are handled entirely by a newly trained generative backbone. Can pretrained multimodal prior participate in the denoising process? Introducing RepFusion. (1/12) 📄 https://t.co/WbkTtg5M79 🌐 https://t.co/iDHggosNJX

2

130

35

75

26K

fred_lu_443 retweeted

Ying Wang✈️ ICML @yingwww_

4 months ago

What is a good latent space for world modeling and planning? 🤔 Inspired by the perceptual straightening hypothesis in human vision, we introduce temporal straightening to improve representation learning for latent planning. 📑: https://t.co/CCmcEIJGM6

29

789

132

558

242K

fred_lu_443 retweeted

Saining Xie

@sainingxie

4 months ago

i’m joining forces with @ylecun and an incredible group of people to start AMI Labs @amilabs. AMI isn’t a conventional lab. we don’t intend to become one. a lot to say about why this moment matters, but for now we’re heads down building. join us: https://t.co/zXj1IyBYDc

154

3K

163

476

503K

fred_lu_443 retweeted

Peter Tong

@TongPetersb

4 months ago

Train Beyond Language. We bet on the visual world as the critical next step alongside and beyond language modeling. So, we studied building foundation models from scratch with vision. We share our exploration: visual representations, data, world modeling, architecture, and scaling behavior! [1/9]

35

1K

220

719

218K

Daohan Lu @fred_lu_443

4 months ago

[10/9] Solaris is our first foray into multi-agent world modeling. While we saw some interesting and surprising findings, it is more so a solid foundation to enable deeper and further experimentation. I’m particularly excited about the complex multi-agent behaviors and phenomena that emerge from learning on open & dynamic multiplayer environments!

0

3

0

145

Daohan Lu @fred_lu_443

4 months ago

[9/9] For more examples, source code, docs, and paper, check out links below! 🔗https://t.co/CndLctcqYc 📄https://t.co/eu5ANfWUfu This project spanned everything from Java'ing Javascripts (sorry) to docking Dockers and figuring Figmas. I feel truly fortunate to have a group of talented collaborators with diverse collective expertise: @georgysavva, @ojmichel4, Suppakit Waiwitlikhit, Timothy Meehan, Dhairya Mishra, @SrivatsPoddar, @sainingxie

1

5

0

1

231

fred_lu_443 retweeted

Shusheng Yang

@shushengyang

8 months ago

Working on Cambrian-S has been a genuinely meaningful learning experience. ❤️ I am grateful to all my amazing collaborators throughout this long journey, especially @jihanyang13, @_ellisbrown, @PinzhiHuang, Zihao Yang, Yue Yu, @TongPetersb, @ZihanZheng71803, Yifan Xu, Muhan Wang, and @fred_lu_443 (also our amazing director‼️). ☺️Thanks to @sainingxie for continuously encouraging us to explore the unknown, pursue crazy ideas, and play the infinite game! 🥰And thanks to all supervisors @sainingxie, @drfeifei, @ylecun for guiding us through the maze. 🌕Mission never ends. Let’s keep building supersensing for superintelligence. 🧵[n/n]

0

10

1

565

Daohan Lu @fred_lu_443

8 months ago

@ma_nanye Broke but not broken

0

10

fred_lu_443 retweeted

Edwin Huang

@PinzhiHuang

8 months ago

@sainingxie told us to ONLY work on "crazy ideas." Almost a year ago, we started Cambrian-S because "Supersensing" sounded super crazy. This crazy idea kept me awake and caffeinated for months. Today, all that work is live: Cambrian-S is here. So grateful to have built this alongside this incredible team. Please take a look here. Hope you find this idea crazy as well! Website: https://t.co/r4nZiqcMGE Github: https://t.co/dm882JkGzv arXiv: https://t.co/Ya32Zhbvrf

1

29

4

3

3K

Daohan Lu @fred_lu_443

8 months ago

Behind Cambrian-S are the passionate researchers that drive it. This video is a presentation, but more so representation. I shot the short as an ode to the very humans behind, and these unique, surprising spaces and memories that are we. Please enjoy! May the experiment go on--

Saining Xie

@sainingxie

8 months ago

Introducing Cambrian-S: it’s a position, a dataset, a benchmark, and a model, but above all, it represents our first steps toward exploring spatial supersensing in video. 🧶

30

686

102

373

259K

0

24

7

4

6K

Daohan Lu @fred_lu_443

9 months ago

Nice work! Pixel regression is not the only way to compress images for generation!

Boyang Zheng

@boyangzheng_

9 months ago

Introducing Representation Autoencoders (RAE)! We revisit the latent space of Diffusion Transformers, replacing VAE with RAE: pretrained representation encoders (DINOv2, SigLIP2) paired with trained ViT decoders. (1/n)

6

474

53

275

50K

0

4

0

296

fred_lu_443 retweeted

Yucen Lily Li @yucenlily

12 months ago

In our new ICML paper, we show that popular families of OOD detection procedures, such as feature and logit based methods, are fundamentally misspecified, answering a different question than “is this point from a different distribution?” https://t.co/gcks5PFyPX [1/7]

4

236

49

174

48K
