We made a @gradio demo for AllTracker!
AllTracker is the current state-of-the-art for general-purpose point tracking. The demo gives a good sense of the accuracy---try your own videos and see for yourself!
🔗 Demo: https://t.co/VniUAh9ZQe
💻 Code: https://t.co/hYgFbQhnMv
Very fun: Our paper for Insect Science is one of the journal's most-viewed articles of 2025! Yay for point tracking + vibrational signal analysis (with @JPolajnar)!
BTW the underlying point tracker here is basically "PIPs+++" -- a simple method tuned well can do a lot!
The Project Aria team is excited to be part of the Third Joint Egocentric Vision (EgoVis) Workshop at #CVPR2026! 👓✨
Don’t miss the latest Project Aria updates as we dive into the future of egocentric perception. Check out the full program of invited speakers below!
📅 June 3, 2026
📍 Colorado Convention Center | Room 704/706
🔗 https://t.co/g1FA6Cx4RU
⚡️ @liuziwei7, @mapo1, @doughty_hazel
Dynamic 3D from monocular video 🎥
Static pre-scan + video = richer reconstruction ✨
As image-to-3D improves, static pre-scans become increasingly practical! Kudos to @tnarek99 and team! https://t.co/NQZxoOF25E
Excited to present DRoPS, a novel method for dynamic 3D reconstruction from monocular videos, accompanied by a static pre-scan of the dynamic object.
🌐 Project: https://t.co/rwIaAINSR4
📄 Paper: https://t.co/drNwUgetmK
🧵 [1/7]
New #NVIDIA Paper
We introduce Motive, a motion-centric, gradient-based data attribution method that traces which training videos help or hurt video generation.
By isolating temporal dynamics from static appearance, Motive identifies which training videos shape motion in video generation.
🔗 https://t.co/TbKXjQMN3H
1/10
Introducing ShapeR, a method for robust conditional 3D shape generation from casually captured sequences.
ShapeR leverages a rectified flow transformer conditioned on per-object multimodal data to turn casual image sequences into full metric scene reconstructions.
Project Page: https://t.co/ffH2zVKd8c
Paper: https://t.co/82yWRrXnA1
Links to code and huggingface below ⬇️
📢 New Paper
PointSt3R: Point Tracking through 3D Grounded Correspondence
Can point tracking be reformulated purely as pairwise frame correspondence?
We fine-tune MASt3R with dynamic correspondences and a visibility loss, and achieve competitive point tracking results.
1/3
Generative Point Tracking with Flow Matching
My latest project with @AdamWHarley @CSProfKGD @DerekRenderling @chrisjpal
Project page: https://t.co/cs4zFEuLYU
Paper: https://t.co/sa9NdFlOgP
Code: https://t.co/F4Ug3JWkRX
Yup! But I wouldn't say "single shot" -- AllTracker, inheriting from CoTracker and PIPs and RAFT, is an iterative approach. In practice, we iterate 4x to get the final answer (and this works out to 11 FPS at 576x1024).
AllTracker. Hi-res, dense point tracking across long video ranges in a single shot. Tracks every pixel. Fast, lightweight (16M params). Sparse tracking is obsolete.
https://t.co/Hbtp5psDJZ
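To make the "iterative, not single shot" point concrete, here is a toy sketch of iterative refinement. It is not AllTracker's update rule: the `refine` function, the `step` value, and the scalar "target" are all hypothetical stand-ins for a learned update operating on dense tracks. The only numbers taken from the thread are the 4 iterations and the 11 FPS figure.

```python
def refine(estimate, target, iterations=4, step=0.5):
    """Toy iterative refinement: each pass corrects part of the remaining error.

    Hypothetical stand-in for a learned update; a real tracker predicts the
    correction from image features instead of reading it off a known target.
    """
    history = [estimate]
    for _ in range(iterations):
        residual = target - estimate          # remaining error
        estimate = estimate + step * residual  # partial correction
        history.append(estimate)
    return history


history = refine(estimate=0.0, target=8.0)
print(history)  # [0.0, 4.0, 6.0, 7.0, 7.5]

# Per-frame time budget implied by the quoted throughput (all 4 iterations included).
fps = 11
ms_per_frame = 1000 / fps
print(round(ms_per_frame, 1))  # 90.9
```

Each pass shrinks the remaining error, which is why a handful of iterations can be enough in a setup like this.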
The visualizer in the AllTracker repo is now GPU-based (much faster), with new options to improve clarity.
Very satisfying to see 3D shape "pop out" from the tracking.
https://t.co/hYgFbQhnMv
@cam_sentinel @_akhaliq @Gradio Real-time version of this should be doable very soon. On-device is possible, but depends on the device of course. The model shown here is only 16M parameters (66 MB), but needs a GPU.
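A quick sanity check on the parameter count versus file size. This assumes the weights are stored as 32-bit floats, which the post does not state; `model_size_mb` is a hypothetical helper, and "16M" is a rounded figure, so the result only roughly matches the quoted size.

```python
def model_size_mb(num_params, bytes_per_param=4):
    """Approximate checkpoint size in megabytes (1 MB = 1e6 bytes)."""
    return num_params * bytes_per_param / 1e6


print(model_size_mb(16_000_000))     # float32 (assumed): 64.0
print(model_size_mb(16_000_000, 2))  # float16, for comparison: 32.0
```

At 4 bytes per parameter, 16M parameters come to about 64 MB, in the same ballpark as the size quoted above.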
@georgtrof @Gradio Well, it's a great failure case for me to stare at, so thanks for sharing! I think point tracking should eventually be the backbone of pose estimation.
@georgtrof @Gradio I was thinking static ROI, yes (e.g., just crop the video with ffmpeg). And yes, blur and self-occlusion make it harder for the model.
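For the static-ROI suggestion, a minimal sketch of building the ffmpeg call from Python. It uses ffmpeg's `crop` video filter (`crop=width:height:x:y`); the file names and the ROI numbers are placeholders, and `crop_command` is a hypothetical helper, not part of any repo mentioned here.

```python
import shlex


def crop_command(src, dst, width, height, x, y):
    """Build an ffmpeg command that crops the same fixed region from every frame.

    (x, y) is the top-left corner of the region, in pixels.
    """
    return [
        "ffmpeg", "-i", src,
        "-vf", f"crop={width}:{height}:{x}:{y}",
        dst,
    ]


cmd = crop_command("input.mp4", "roi.mp4", 512, 512, 100, 50)
print(shlex.join(cmd))
# ffmpeg -i input.mp4 -vf crop=512:512:100:50 roi.mp4
```

Passing `cmd` to `subprocess.run(cmd, check=True)` would run it, provided ffmpeg is installed and on the PATH.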
@mkocab_ @andrew_n_carr @andrew_n_carr, I bet you are right that you can classify real/fake from here, but @mkocab_ is right that the visually apparent "cracking" artifact in the background is pretty common overall. This comes from subpixel errors, and maybe subpixel ambiguity in general.
Tricky sample for AllTracker.
It has never seen water at training time, so I'm glad it knows to quickly discard the tracks there...
The bird that travels right-to-left (behind the wings most of the time) could be handled better.