Shaowei Liu @stevenpg8 - Twitter Profile

Pinned Tweet

about 2 months ago

📢MoRight: Motion Control Done Right "What if your video model actually understood cause and effect?" Existing motion-controlled video models entangle camera and object motion, and treat everything as kinematic displacement. MoRight changes both. 🔥 Motion Causality: MoRight decomposes motion into actions & consequences. Give an action → MoRight predicts consequences (aka motion simulation). Give a desired outcome → MoRight recovers the driving action (aka motion planning). Not merely displacing pixels. 🎬 Disentangled Control: MoRight separates camera and object motion, allowing users to independently control each of them. No entanglement. Project Page: https://t.co/IVIgJopCCI Paper: https://t.co/t0kQqJfXQE

4

235

34

114

31K

stevenpg8 retweeted

tingwu.wang

@TingwuWang

about 1 month ago

What is missing to bring real-time motion research into AAA games and real-world robotics? We present MotionBricks, a step toward bridging this gap with two key components:
- a single generative latent motion backbone covering 350,000+ motion skills, running at 15,000 FPS with 2 ms latency and substantially improved quality and reliability.
- a unified smart primitive interface for locomotion and object/scene interaction, with fine-grained control over generated behaviors.
Webpage: https://t.co/aJE5skUuWD Code: https://t.co/r56D3TJ8CW Paper: https://t.co/CtOHXnHZMv (ACM TOG / SIGGRAPH 2026)

27

1K

150

928

151K

stevenpg8 retweeted

Xuanchi Ren

@xuanchi13

about 2 months ago

We scaled up Lyra to generate explorable 3D worlds! 🚀 Introducing Lyra 2.0 — turning a single image into a 3D world you can walk through, look back, and even drop a robot into 🤖 Code and Model available today! 🌐 Website: https://t.co/plBxCoWkNn (1/N)

29

874

122

381

1M

Shaowei Liu @stevenpg8

about 2 months ago

@Yahskapar Thanks, Akshay.

0

35

stevenpg8 retweeted

Robots Digest 🤖

@robotsdigest

about 2 months ago

MoRight tackles a core problem in controllable video generation: disentangling camera motion from object motion. Unlike prior methods that rely on dense future-frame tracks, MoRight uses only first-frame reprojected trajectories + camera poses, yet achieves comparable quality and better motion control.

1

26

4

9

1K

stevenpg8 retweeted

Jun Gao

@JunGao33210520

about 2 months ago

When doing motion-conditioned video generation, we think the key is in understanding motion causality (what action will lead to what outcome) and camera-motion decomposition (camera changes shouldn't entangle with object dynamics). MoRight is the first step towards this goal. We support three functionalities:
1. "Simulation": Users provide the action (e.g., moving hands), and the video model generates the consequences (e.g., cups moving and pouring water).
2. "Planning": Users specify the outcome (e.g., balls moving), and the video model generates the action that drives the outcome (e.g., moving legs).
3. "Disentanglement": Users can independently control the camera change and object dynamics, all along with the motion causality mentioned above.
Come and check it at https://t.co/FHScanEtse P.S. Shaowei is on the job market; you shouldn't miss him!

0

46

7

22

7K

Shaowei Liu @stevenpg8

about 2 months ago

The project is a joint effort with @xuanchi13 @TianchangS @HuanLing6 @_saurabhg @ShenlongWang @FidlerSanja @JunGao33210520 @nvidia and @UofIllinois, thanks everyone! Special thanks to Spatial Intelligence Lab for hosting me as an intern! Check more results at 👇https://t.co/IVIgJoqasg

0

4

0

339

Shaowei Liu @stevenpg8

about 2 months ago

Our method is straightforward! Dual-stream architecture: one stream for object motion on the canonical frame, one for camera motion. Object motion is transferred across views via cross-view self-attention. Data: We curate a systematic data pipeline to decompose the motion sequences in our training videos into active and passive motion, allowing the model to learn to capture the motion causality.

1

3

0

2

354

Shaowei Liu @stevenpg8

6 months ago

Visit https://t.co/4Eqc4p1RoR and try the cool demo at https://t.co/yix9oHF1Bv Don’t miss our poster session tmr! Poster: Thur Dec. 3, 11-2pm #4607 Joint work with @david_yao14255, @_saurabhg, @ShenlongWang and @siebelschool Come and unlock more fun demos at our poster tmr👇

0

1

0

140

Shaowei Liu @stevenpg8

6 months ago

Come and visit our #NeurIPS2025 work tmr at #4607: VisualSync: Multi-Camera Synchronization via Cross-View Object Motion Website: https://t.co/FprhFRs1Dd Paper: https://t.co/zvuBRbVKqS Code: https://t.co/yix9oHF1Bv Poster: 11-2pm PST #4607 Multi-view (>=30) video sync in the wild

1

15

2

1K

Shaowei Liu @stevenpg8

6 months ago

Our method is generic, robust and scalable: we test on ego-exo settings, sports game recordings, multi-person activities, animals. The synched outputs benefit dynamic reconstruction, novel view synthesis, and multi-view data engines. Check demos and applications on our website

1

0

154

Shaowei Liu @stevenpg8

6 months ago

Then our method is straightforward: we need cross-view temporal correspondences and camera pose to compute epipolar lines. We estimate pairwise video offsets by minimizing epipolar violations over matched correspondences. Later we perform global optimization to align all videos.

0

1

0

92
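The pairwise-then-global procedure described in the tweet above can be sketched roughly as follows. This is a minimal illustration, not the VisualSync implementation: it assumes a single known fundamental matrix `F` per camera pair (i.e. static cameras, whereas the tweet covers moving cameras with per-frame poses), one matched 2D point track per video, and integer frame offsets. The function names `epipolar_distance`, `pairwise_offset`, and `global_offsets` are hypothetical.

```python
import numpy as np

def epipolar_distance(F, x1, x2):
    """Pixel distance from each point in x2 to the epipolar line F @ x1."""
    h1 = np.hstack([x1, np.ones((len(x1), 1))])  # homogeneous coords, (N, 3)
    h2 = np.hstack([x2, np.ones((len(x2), 1))])
    lines = h1 @ F.T                             # row i is F @ h1[i]
    num = np.abs(np.sum(h2 * lines, axis=1))     # |x2^T F x1|
    den = np.linalg.norm(lines[:, :2], axis=1) + 1e-12
    return num / den

def pairwise_offset(F, track_a, track_b, max_shift):
    """Integer shift d minimizing the mean epipolar distance between
    track_a[t] and track_b[t + d] (assumed brute-force search)."""
    best_d, best_err = 0, np.inf
    n_a, n_b = len(track_a), len(track_b)
    for d in range(-max_shift, max_shift + 1):
        lo, hi = max(0, -d), min(n_a, n_b - d)   # overlapping frame range
        if hi - lo < 2:
            continue
        err = epipolar_distance(F, track_a[lo:hi], track_b[lo + d:hi + d]).mean()
        if err < best_err:
            best_d, best_err = d, err
    return best_d, best_err

def global_offsets(n_videos, pairwise):
    """Least-squares per-video offsets t (with t[0] = 0) from pairwise
    estimates (i, j, d_ij) where d_ij is approximately t[j] - t[i]."""
    A = np.zeros((len(pairwise) + 1, n_videos))
    b = np.zeros(len(pairwise) + 1)
    for row, (i, j, d) in enumerate(pairwise):
        A[row, i], A[row, j], b[row] = -1.0, 1.0, d
    A[-1, 0] = 1.0  # fix video 0 as the time reference
    t, *_ = np.linalg.lstsq(A, b, rcond=None)
    return t
```

A real pipeline would presumably use many correspondences, a robust loss, and per-frame fundamental matrices derived from the estimated camera poses; the least-squares step here only shows how noisy pairwise offsets can be reconciled into one consistent timeline.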

Shaowei Liu @stevenpg8

6 months ago

In a dynamic scene with a moving camera, the epipolar geometry still holds: In synced videos (left), the green correspondence from one view always aligns with the red epipolar lines in the other view. In unsynced videos (right), deviations from the red lines indicate the sync error.

1

0

90
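For reference, the textbook epipolar constraint behind this picture can be written as below. The notation is generic and assumed here, not taken from the paper: \tilde{x}_1 and \tilde{x}_2 are homogeneous image coordinates of the same 3D point in the two views at the same instant, and F_t is the fundamental matrix implied by the two camera poses at time t. The second expression is the point-to-line distance that grows when the two frames are not simultaneous.

```latex
\tilde{x}_2^{\top} F_t \, \tilde{x}_1 = 0,
\qquad
d\!\left(\tilde{x}_2,\, F_t \tilde{x}_1\right)
  = \frac{\left|\tilde{x}_2^{\top} F_t \, \tilde{x}_1\right|}
         {\sqrt{(F_t \tilde{x}_1)_1^{2} + (F_t \tilde{x}_1)_2^{2}}}
```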

Shaowei Liu @stevenpg8

8 months ago

@LeaMue27 @camenduru @chuan_guo92603 @_JianWang_ Thanks so much Lea, your amazing pioneering work inspires us a lot

0

61

Shaowei Liu @stevenpg8

8 months ago

Glad to introduce our #ICCV2025 work Ponimator: Unfolding Interactive Pose for Versatile Human-human Interaction Animation Website: https://t.co/IJp88qa6jb Paper: https://t.co/jgWRFauvAf Code: https://t.co/GfyaLiUXRY Poster: Wed 10.22, 10:30am #1109 @chuan_guo92603 @_JianWang_

6

135

19

66

10K

Shaowei Liu @stevenpg8

8 months ago

Visit https://t.co/PjGSeJagGe and try the cool demo at https://t.co/oE1FWyb5mP Don't miss our poster session on Wed Oct. 22 10:30 #1109 Joint work with @chuan_guo92603, Bing Zhou, @_JianWang_, @Snapchat and @IllinoisCDS Warm hug from the most famous statue at Waikiki beach👇

0

1

0

223

Shaowei Liu @stevenpg8

8 months ago

Our framework enables diverse applications, including creating vivid interactive videos between people from two single images, like below👇

1

3

0

235
