Jiahao Lu

about 2 months ago

🚀 Introducing CoMoVi! From a start image & text prompt, it simultaneously generates realistic human videos and corresponding 3D motion sequences. ✨ No reference videos needed to extract skeletons anymore! 🧠 By co-generating motion and video, CoMoVi directly inherits the massive generalization power of video generation models, making it adaptable to diverse text prompts! 🌍 This co-generation approach also makes CoMoVi look like a human-centric World Action Model (WAM), simulating not just the visual world, but the physical state of human actions within it. arxiv: https://t.co/bpVyAirgls HF page: https://t.co/W0kdOwwyKF Project page: https://t.co/371GDgJKVo Code: https://t.co/qSJKb5vaqG

FFzzf08 retweeted

2 months ago

🎉🎉🎉 MotionCrafter was selected as a CVPR 2026 Highlight 🔥 paper!

FFzzf08 retweeted

3 months ago

🎥 Demo Video for MotionCrafter (CVPR 2026) How much do video diffusion models know about the 4D world? Watch the demo to find the answer👇 https://t.co/hnt51lrxTh #CVPR2026 #ComputerVision #3DVision

Jiahao Lu @FFzzf08

3 months ago

Video diffusion models have strong priors about our physical world, which makes them a good tool for reconstructing the 4D world.

Alexandre Morgand @Almorgand

3 months ago

🎥 Demo Video for MotionCrafter (CVPR 2026) How much do video diffusion models know about the 4D world? Watch the demo to find the answer👇 https://t.co/hnt51lrxTh #CVPR2026 #ComputerVision #3DVision

FFzzf08 retweeted

⚡AI Search⚡

@aisearchio

3 months ago

What a crazy week in AI! 🚀 LTX 2.3 GPT 5.4 FireRed Edit 1.1 Kiwi Edit HY WU Qwen 3.5 small Cuda Agent CubeComposer Helios Spatial T2I Spectrum Utonia & more! Watch the full recap: https://t.co/iH01KoagIH

FFzzf08 retweeted

3 months ago

"Track4World: Feedforward World‑centric Dense 3D Tracking of All Pixels" TL;DR: feed‑forward model that predicts pixel‑level 2D and 3D dense flows for holistic world‑centric 3D tracking from monocular video, outperforming prior flow and tracking baselines.

FFzzf08 retweeted

Jiahao Lu @FFzzf08

3 months ago

Why Track4World? 1️⃣ Dense world-centric tracking 2️⃣ Supports DA3/Pi3/MoGe 3️⃣ Efficient 3D correlation 4️⃣ 2D-to-3D supervision bypasses 3D GT scarcity! #ComputerVision #3DTracking #SceneFlow #OpticalFlow

FFzzf08 retweeted

Chuanxia Zheng @ChuanxiaZ

3 months ago

Excited to share Track4World, a feedforward method for 3D tracking of all pixels in the world-centric coordinate system. The code has been released, and you are welcome to try it! Homepage: https://t.co/OIiaEl8KJP Code: https://t.co/VgLAkCLPCZ Paper: https://t.co/bCPEVFaQUW

FFzzf08 retweeted

4 months ago

How much do video diffusion models know about the 4D world? By introducing a 4D VAE, we jointly estimate geometry and motion from videos using a large-scale pretrained VDM. - paper: https://t.co/pOqK39MO10 - page: https://t.co/nfDqzlvDV5 - code: https://t.co/QMgcfht363

FFzzf08 retweeted

AI Bites | YouTube Channel

4 months ago

#CVPR2026 Accepted by CVPR 2026🎉 Final rating: 6 5 5 Thanks to the reviewers and ACs for their recognition!

FFzzf08 retweeted

Wildminder

@wildmindai

3 months ago

Track4World: feedforward world-centric dense 3D tracking.
- Tracks every pixel in 3D.
- 16-frame sequences in 3.4 s with 14 GB VRAM.
- Depth Anything v3 as backbone.
https://t.co/5PTAemjBwj

FFzzf08 retweeted

AIQUEST

@AiquestAcademy

3 months ago

Track4World: what if you could track every single pixel's 3D movement in a video, accurately and instantly? this new model turns any regular video into a detailed 3D scene, figuring out the precise 3D path of everything moving in the frame, fast. it's like rebuilding the entire world from a single clip! 🤯 code and demo are available.

FFzzf08 retweeted

@ai_bites

5 months ago

CoMoVi is a co-generative framework that couples two video diffusion models (VDMs) to generate 3D human motions and videos synchronously within a single diffusion denoising loop. The generation of 3D human motions and 2D human videos is intrinsically coupled: 3D motions provide the structural prior for plausibility and consistency in videos, while pre-trained video models offer strong generalization capabilities for motions, which necessitates coupling their generation processes. CoMoVi is based on this insight. Paper Title: CoMoVi: Co-Generation of 3D Human Motions and Realistic Videos Project: https://t.co/Q5fwWcRjkP Link: https://t.co/VGgztlYHPA

FFzzf08 retweeted

5 months ago

Excited to share our recent work, UniSH, which unifies dynamic 3D scene reconstruction and SMPL estimation within a single framework. (The input video is at the top left.) Code has been released! https://t.co/T2CrYpTZxn Project page: https://t.co/4OZq3QW9Th Paper: https://t.co/Pnebo5LuDY

FFzzf08 retweeted

Ying Shan

@yshan2u

5 months ago

🚀🚀We’re building a new Applied Research Team in Tencent IEG for Game AI, with a research culture similar to ARC Lab. This newly formed team focuses on research-driven Game AI, operating at the intersection of fundamental research and large-scale game environments. Our goal is to develop principled models that can understand, simulate, and act within complex virtual worlds, while remaining grounded enough to eventually shape real games.
Our research directions include (but are not limited to):
🎮 Interactive & Dynamic World Modeling: learning, simulating, and reasoning about evolving game worlds
🤖 NPC World-to-Action Modeling: connecting world understanding to decision and action, with strong ties to Embodied AI and agent behavior
🌍 Game Scene Generation: generative modeling of diverse, controllable, and scalable game scenes
We are looking for researchers with the following minimum qualifications:
✨ A recent Ph.D. in related fields
✨ 5+ top conference or journal papers
✨ 1000+ GitHub stars
🌟 Evidence of a “make it work” mindset
We are also open to strong graduate students for intern positions. Feel free to DM me or contact: [email protected].

FFzzf08 retweeted