It’s been a while since I posted here, but I’m very excited to share what our team at @nvidia has been building over the past year!
After a year of active development, we’re getting ready to release SIL-Wheel to the world: a one-stop shop platform for data-centric workflows in large-scale video model training.
Built by researchers, for researchers, SIL-Wheel brings together search, curation, annotation, evaluation, and analysis for large video datasets in one centralized framework.
Want a sneak peek before the official release? Come by the NeXD26 Workshop @CVPR tomorrow at 10:30!🚀
🍺 LagerNVS (CVPR 2026) 🍺
LagerNVS is a generalizable, feed-forward, real-time Novel View Synthesis network which
- performs rendering in real time,
- generalizes to in-the-wild data,
- works with and without known source cameras,
- sets a new state-of-the-art among deterministic methods,
- can be paired with a diffusion decoder for generative extrapolation.
LagerNVS shows that 3D biases are useful for Novel View Synthesis but explicit 3D representations are not required to achieve them. We use 3D biases in (1) architecture design and (2) pre-training:
(1) In NVS with explicit 3D representations (3DGS, NeRF) reconstruction is typically difficult and slow, but rendering is much faster and simpler. We mimic this process in the network design: we use a large (1B params) encoder and a small, lightweight decoder (ViT-B). This allows increasing the network capacity while still achieving real-time rendering.
(2) The encoder, initialized from VGGT, was pre-trained with 3D reconstruction objectives, making the initial features 3D aware.
Both substantially improve performance.
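A back-of-the-envelope sketch of why the asymmetric design in (1) helps, assuming the large encoder runs once per set of source images and the small decoder runs once per rendered view. The cost numbers and the `per_view_cost` helper are hypothetical and only illustrate the amortization, not measured LagerNVS timings:

```python
def per_view_cost(encoder_cost, decoder_cost, num_views):
    """Average cost per rendered view when the encoder runs once per scene."""
    if num_views <= 0:
        raise ValueError("num_views must be positive")
    # Encoder cost is shared by all rendered views; decoder cost is paid per view.
    return encoder_cost / num_views + decoder_cost


# Hypothetical relative costs: a heavy encoder and a lightweight decoder.
ENCODER_COST = 100.0
DECODER_COST = 1.0

for n in (1, 10, 1000):
    print(n, per_view_cost(ENCODER_COST, DECODER_COST, n))
```

As more views are rendered from the same source images, the per-view cost approaches the decoder cost alone, which is why encoder capacity can grow without slowing rendering.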
Project page: https://t.co/yDVzZkScKN
Code: https://t.co/i7hGcgjAWW
Paper: https://t.co/WQvt4NgL7n
Models: https://t.co/QmaYx9HwkB
Work done with @jianyuan_wang, @MinghaoChen23, Christian Rupprecht, and Andrea Vedaldi
🥳Excited to present AutoPartGen, our new work on part-level 3D generation, at #NeurIPS2025 in Mexico City!
Come chat about the next frontier of 3D generation at my poster!
📅 Wed, December 3, 6:30PM-9:30PM
📍 Hilton Mexico City Reforma (Foyer)
Paper: https://t.co/YNllTMrNxY
Project page: https://t.co/toAXZq7GIi
📢 Lyra: Generative 3D Scene Reconstruction via Video Diffusion Model Self-Distillation
Got only one or a few images and wondering if recovering the 3D environment is a reconstruction or generation problem? Why not do it with a generative reconstruction model!
We show that a camera-conditioned video diffusion model can be transformed into a generative reconstruction model that directly outputs a high-quality 3D Gaussian Splatting representation through self-distillation, without requiring real-world training data.
Check out our results in the video (wait for dynamic scenes in the second half!):
Project Page: https://t.co/pKtry0BdOL
Code and Models: https://t.co/p4zVBrMKU5
Paper: https://t.co/ZuMM1LCP82
New paper out - accepted at @ICCVConference.
We introduce MoSiC, a self-supervised learning framework that learns temporally consistent representations from video using motion cues.
Key idea: leverage long-range point tracks to enforce dense feature coherence across time.🧵
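The key idea can be illustrated with a generic track-based consistency loss. This is only a sketch under stated assumptions, not MoSiC's actual objective: the `track_coherence_loss` function, the `features[t][y][x]` layout, and the use of cosine distance are all hypothetical choices made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def track_coherence_loss(features, tracks):
    """Mean cosine distance between features at consecutive points of each track.

    features[t][y][x] is the feature vector at pixel (y, x) of frame t.
    Each track is a list of (t, y, x) positions of the same physical point.
    """
    total, count = 0.0, 0
    for track in tracks:
        for (t0, y0, x0), (t1, y1, x1) in zip(track, track[1:]):
            total += 1.0 - cosine_similarity(features[t0][y0][x0], features[t1][y1][x1])
            count += 1
    return total / count if count else 0.0


# Toy example: two frames, a 1x2 feature grid, 2-D features.
# The point at x=0 in frame 0 moves to x=1 in frame 1 and keeps its feature.
features = [
    [[[1.0, 0.0], [0.0, 1.0]]],  # frame 0
    [[[0.0, 1.0], [1.0, 0.0]]],  # frame 1
]
moving_track = [(0, 0, 0), (1, 0, 1)]  # follows the point: features agree
static_track = [(0, 0, 0), (1, 0, 0)]  # ignores the motion: features disagree

print(track_coherence_loss(features, [moving_track]))
print(track_coherence_loss(features, [static_track]))
```

Following the track gives a low loss, while comparing the same pixel location across frames does not, which is the intuition behind using point tracks rather than fixed positions.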
Introducing VGGT (CVPR'25), a feedforward Transformer that directly infers all key 3D attributes from one, a few, or hundreds of images, in seconds! No expensive optimization needed, yet delivers SOTA results for:
✅ Camera Pose Estimation
✅ Multi-view Depth Estimation
✅ Dense Point Cloud Reconstruction
✅ Point Tracking
Project Page: https://t.co/Qoc1ipqozq
Code & Weights: https://t.co/1GkCpRATkE
Thrilled to see our work accepted to @iclr_conf !! If you're curious about incorporating deformation priors into dynamic reconstruction models, be sure to check out our paper: https://t.co/BBEbKUDJJW
A fantastic paper for @sara_oblak to kick off her academic publishing journey!
We are releasing uCO3D! Built to supercharge 3D GenAI and digital-twin models, this evolution of CO3D features more and higher-quality object videos from 1k categories, 3D Gaussian Splats, and streamlined OSS tools.
💻Data&code: https://t.co/k4tX63qjub
📄Paper: https://t.co/j5YTIEKfXZ
Hi everyone,
I am graduating early next year with a PhD from @Inria in AI. I am looking for an industrial research position and would appreciate your support.
Thank you in advance for any connections, advice, or opportunities you can offer.
My profile: https://t.co/WJSSeDTQ9R
Can networks discover crucial geometric measurements like height/radius from binary task success signals? Check out our #ICML2023 paper on geometric eigen-length learning: https://t.co/7GY912tz1q.
Come by our poster at Exhibit Hall 1 #640 today at 10:30 am!
@gzrer1 @paschalidoud_1 @Stanford The M parts of the object are not pre-segmented, as there are no part labels; they are learned in an unsupervised fashion! The losses responsible for learning meaningful parts are the coverage, overlapping, and control losses (equations 21, 22, 23 of the main paper).
📢Our new #CVPR2023 work, PartNeRF, introduces a part-aware generative model for editable 3D shape synthesis that does not require any 3D supervision or part-level annotations.
Project: https://t.co/Cia6gJnXqx
This work was done during @ktertikas's internship at @Stanford.
My first computer graphics research project in undergrad was to build an interactive system for artists to create complex shapes from a single image. Now, this whole process is automated with diffusion models!
Check out our NeRDi at #CVPR2023 !
Poster: https://t.co/pqeaFWq3Lv
📢Our #ICCV2023 workshop on AI for 3D Content Creation organized with @geopavlakos, @amlankar95, @KaichunMo and @davrempe from, Paul Guerrero, @SiyuTang3 and Leo Guibas has a fantastic list of speakers!
Workshop Website: https://t.co/hf5VzHPNoy
Paper Submission Deadline: July 17
Students in Greece: The Archimedes Scholarship https://t.co/VOmw1FwoVp is a unique opportunity to do a PhD in Greece on foundational problems in AI. If you are interested in geometric problems and/or neuromorphic methods in vision and learning, contact me for more information.