Rui Li @leedaray - Twitter Profile

Pinned Tweet

about 1 year ago

Introducing LaRI (https://t.co/4dfmwXKZHw), a📸single-view,🚀single-feed-forward method to model🙈unseen 3D geometry using layered point maps. It ✅seamlessly extends depth estimation ✅unifies object- & scene-level reasoning ✅builds training & eval datasets Details👇

1

143

23

99

9K

leedaray retweeted

Kwang Moo Yi @kwangmoo_yi

about 22 hours ago

Zang et al., "World Tracing: Generative Pixel-Aligned Geometry Beyond the Visible" A Diffusion Transformer that estimates multiple layers of depth to further estimate occluded parts as well.

1

21

6

12

2K

leedaray retweeted

Xingang Pan @XingangP

1 day ago

Natural images often already implicitly contain depth information — hidden in bokeh effects. Can we leverage the rich depth cue widely exist in natural images for depth estimation? We explore this in our recent project BokehDepth (ICML 2026). - Stage 1: A generative model produces calibrated bokeh stacks from the input image. - Stage 2: The bokeh stacks are integrated into a depth prediction model to estimate depth. We believe it highlights bokeh effects as an important and effective complementary cue for monocular depth estimation. 🌐 Project page: https://t.co/vcfTSHFZn6 📄 arXiv: https://t.co/gqGgesXIrO 👨‍💻 Code: https://t.co/KNs46UGXVu

3

60

14

30

6K

leedaray retweeted

Edgar Sucar @SucarEdgar

7 days ago

Come check out V-DPM @CVPR [Poster 25] 11:45 - 13:45 4D video reconstruction in the wild: code and models available 🤖 @EldarIsTyping @Oxford_VGG

2

222

25

148

17K

Who to follow

Mark Boss

@markb_boss

I’m the Co-Head of 3D & Image at Stability AI with research interests in the intersection of machine learning and computer graphics

Ro6er+

@tw_rwerner

The revolution can not be stopped.

Bernhard Jaeger

@bern_jaeger

Co-founder of KE:SAI, a non-profit open science AI research lab. https://t.co/JdxBCIZedy

leedaray retweeted

Xingang Pan @XingangP

5 days ago

Transformers have succeeded in modeling phenomena traditionally associated with computer graphics, such as 3D visual effects (e.g., RayZer) and rendering processes (e.g., RenderFormer). A natural question is whether they can also tackle the challenging task of cloth simulation. We introduce 👕𝗖𝗹𝗼𝘁𝗵𝗧𝗿𝗮𝗻𝘀𝗳𝗼𝗿𝗺𝗲𝗿, a Transformer-based method that reformulates cloth simulation as autoregressive next-state prediction in a learned latent space. It handles diverse scenarios under a single model, with 4-9x lower error than prior SOTAs: • Body-driven garments • Robotic manipulation • General cloth–object collisions We believe it highlights the potential of Transformer-based autoregressive models as a powerful alternative to conventional simulation approaches. This work is mainly led by my student Yu Zhang @yucrazing 🌐 Project page: https://t.co/50cCU6i7GT 📄 arXiv: https://t.co/z6yBNiANpy

2

78

11

41

9K

leedaray retweeted

Jon Barron

@jon_barron

7 months ago

The current paper submission and review process seems unlikely to survive LLMs. One alternative would be to build a new process around talks: "submission" is making and giving a 30 minute live talk, and "review" is three experts watching, evaluating, and asking questions.

35

268

25

39

67K

leedaray retweeted

Prune Truong @prunetruong

8 months ago

🎺Meet VIST3A — Text-to-3D by Stitching a Multi-view Reconstruction Network to a Video Generator. ➡️ Paper: https://t.co/sFqbbUiGOO ➡️ Website: https://t.co/QWMLwXyVcB Collaboration between ETH & Google with Hyojun Go, @DNarnhofer, Goutam Bhat, @fedassa, and Konrad Schindler.

2

88

11

41

17K

leedaray retweeted

Haofei Xu

@haofeixu

8 months ago

🚀Excited to share our recent work on test-time scaling for feed-forward Gaussian splatting: we learn a recurrent model ReSplat that is able to iteratively improve the reconstruction quality in a feed-forward manner! https://t.co/OB38xC7PjC

5

311

49

174

18K

leedaray retweeted

Nikhil Keetha

@Nik__V__

8 months ago

Interesting ICLR submissions 🤩 Depth Anything 3 - My TLDR: Init multi view transformer of VGGT with later layer DINO weights and use teacher model trained on synthetic data only for pseudo labelling real world datasets https://t.co/tPXAqH61B8 Trace Anything - My TLDR: VGGT like model predicting N view geometry and motion as a trajectory field represented using splines and control points https://t.co/OUcLJeMgjN The field is evolving very fast!

Nik__V__'s tweet photo. Interesting ICLR submissions 🤩

Depth Anything 3 - My TLDR: Init multi view transformer of VGGT with later layer DINO weights and use teacher model trained on synthetic data only for pseudo labelling real world datasets

https://t.co/tPXAqH61B8

Trace Anything - My TLDR: VGGT like model predicting N view geometry and motion as a trajectory field represented using splines and control points

https://t.co/OUcLJeMgjN

The field is evolving very fast!

5

385

40

214

25K

leedaray retweeted

Leo@Yuhao @LeoLau_yuhao

9 months ago

Thanks, AK, for sharing our work!

0

26

5

8

14K

Rui Li @leedaray

9 months ago

@MinChonChiSF Made a typo, it’s 2700 accepted papers

0

1

0

731

Rui Li @leedaray

9 months ago

🚀 The #ICCV2025 Award Candidate Papers are out! 🚀 From 2,701 submissions, only 13 were selected, spanning 3D vision, generative models, foundation models, and more. Key highlights at a glance 👇

2

141

18

98

17K

Rui Li @leedaray

9 months ago

(13/13) Spatially-Varying Autofocus TL; DR: A method for per-pixel autofocus that creates freeform depth-of-field and all-in-focus images. 📃Paper: https://t.co/JYYpAOcnPL 🏗️Project: https://t.co/V8SpJHuvRj

0

5

0

4

1K

Rui Li @leedaray

9 months ago

(12/13) Automated Model Evaluation for Object Detection via Prediction Consistency and Reliability TL; DR: A ground-truth-free method (PCR) that evaluates object detectors via prediction consistency and confidence reliability. 📃Paper: https://t.co/3oftIkN1CZ

1

8

1

7

1K

Rui Li

@leedaray

Who to follow

Last Seen Users on Sotwe

Trends for you

Most Popular Users