📢 SceneFactor code is released!
SceneFactor is a factored latent diffusion for controllable, large-scale scene synthesis and editing!
w/ @QTDSMQ, @shubhtuls, @angelaqdai
Check out the code here: https://t.co/FIMiRSTFIs. We present SceneFactor at #CVPR2025 on Fri 13, -10:30 PDT. Don't forget to drop by!
📢 Animating the Uncaptured 📢
We animate 3D humanoid meshes using video diffusion priors given a text prompt.
https://t.co/EpFW86gaRw
https://t.co/suMQs8oQCL
Realistic motion generation for 3D characters - without motion capture!
Great work by @marcbenedi, @angelaqdai
📢 ExCap3D: Multilevel Captioning of Objects in 3D Scenes
@chandan__yes generates consistent object and part-level descriptions of objects in 3D scenes, and introduces a new dataset with 190k captions for 34k ScanNet++ objects.
Project: https://t.co/6tWzlYsx5F
w/ @david_roz_
📢 ScanNet++ v2 Benchmark Release!
Test your state-of-the-art models on:
🔹 Novel View Synthesis
🔹 3D Semantic & Instance Segmentation
Shoutout to @chandan__yes and @liuyuehcheng for their incredible work!
Check it out: https://t.co/SKCGM23hA0
Excited to announce ScanNet++ v2!
@chandan__yes and @liuyuehcheng have been working tirelessly to bring:
🔹 1006 high-fidelity 3D scans
🔹 + DSLR & iPhone captures
🔹 + rich semantics
Elevating 3D scene understanding to the next level!
w/ @MattNiessner
https://t.co/QayR1S8KZZ
📢📢 GAF: Gaussian Avatar Reconstruction from Monocular Videos via Multi-view Diffusion 📢📢
We reconstruct animatable Gaussian head avatars from monocular videos captured by commodity devices such as smartphones.
Key idea: distill reconstruction constraints from a multi-view head diffusion model to complete unobserved regions.
https://t.co/prz5HnGoWq
https://t.co/XkWBKScwb2
Great work by @jiapeng_tang, @davidedavoli, @TobiasKirschst1, @liamschoneveld
📢 DNF: Generating 4D animations with dictionary-based neural fields!
@xinyi092298 presents a new dictionary-based neural field for unconditional 4D generation of deforming shapes -- generating motions with high-quality shape and temporal consistency.
https://t.co/yAZi2k0PjB
📢 SceneFactor: Generating & editing 3D indoor scenes from text!
@ABokhovkin presents a factored latent diffusion for controllable, large-scale scene synthesis -- decomposed into high-level semantic generation + geometric refinement
w/ @QTDSMQ, @shubhtuls
https://t.co/WGTw70cKIo
📢📢 GaussianSpeech: Audio-Driven Gaussian Avatars 📢📢
We synthesize photorealistic and 3D-consistent talking human head avatars driven directly from spoken audio.
More specifically, we introduce an efficient 3DGS-based representation, combined with an audio-conditioned transformer, to generate realistic, high-quality animations. Our method generalizes easily to a variety of in-the-wild scenarios such as songs and foreign languages.
We further introduce a new multi-view dataset of native English speakers with a total recording time of ~3.5 hours.
Project: https://t.co/4TpbW3rn11
Video: https://t.co/WidbmXZtEY
Great work by @shivangi2201, @ASevastopolsky, @TobiasKirschst1, @JustusThies, @angelaqdai
How can we generate high-fidelity, complex 3D scenes?
@QTDSMQ's LT3SD decomposes 3D scenes into latent tree representations, with diffusion on the latent trees enabling seamless infinite 3D scene synthesis!
w/ @craigleili, @MattNiessner
https://t.co/wv9bIhkkYi
Excited to present DiffCAD coming to #SIGGRAPH2024!
@DaoyiGao introduces the first probabilistic single-view CAD retrieval & alignment.
We train only on synthetic -> generalize robustly to real images!
Check out the code: https://t.co/hBCoN0Hx3w
w/ @david_roz_, @StefanLeuteneg1
Excited to present GenZI at #CVPR2024!
@craigleili introduces GenZI, the first zero-shot approach to creating realistic 3D human-scene interactions by leveraging interaction priors from large VLMs.
Code and data on our website!
https://t.co/hUhMgUoU70
https://t.co/rnn1G5HOuu
AutoInst: Automatic Instance-Based Segmentation of LiDAR 3D Scans
A method for unsupervised instance segmentation of 3D outdoor LiDAR scenes.
Project: https://t.co/m8DJanWH2T
Video: https://t.co/Z9OyZbskdJ
Paper: https://t.co/rrmvQdjmWV
Check out @chrdiller's CG-HOI :)
We generate realistic 3D human-object interactions, from object geometry and text description.
A key ingredient is explicit modeling of contact, during training and as guidance during inference.
https://t.co/Cl5Jw9oFBO
https://t.co/FVIFqEpjHi
Diffusion models are awesome!
Check out our survey on Diffusion Models for Visual Computing! We give an introduction to diffusion models and highlight how they are used by state-of-the-art methods in graphics and vision.
https://t.co/FqaqF7tMPM