MeshFlow brings high-quality mesh with interactive speed. It's time to make mesh generation flow!
Arxiv: https://t.co/LvaAoq1GoH
Web: https://t.co/y6DsntORbz
Code: https://t.co/l1GHR9zjCb
Excited to share MeshFlow — a new approach that can generate meshes with a fraction of seconds, while achieving state-of-the-art generation quality.
Secret sources? Instead of autoregressive models, use equivariant flow-matching!
Excited to share 𝐄𝐫𝐫𝐨𝐫-𝐂𝐨𝐧𝐝𝐢𝐭𝐢𝐨𝐧𝐞𝐝 𝐍𝐞𝐮𝐫𝐚𝐥 𝐒𝐨𝐥𝐯𝐞𝐫𝐬 (𝐄𝐍𝐒), a PDE framework that recurrently corrects its own prediction by reading the PDE residual field rather than minimizing it!
Website: https://t.co/P9TJ3LETFH
There are three reasons why it's called "Functional Attention."
1⃣We make functions, instead of tokens, as first-class citizens for attention module.
2⃣It was inspired by the geometry processing concepts of functional maps.
3⃣It works!
What if attention wasn't about matching tokens, but operating in function space?
Glad to share our #ICML2026 paper:
📄 Functional Attention: From Pairwise Affinities to Functional Correspondences
w/ @Jiefang_Xiao@GaoMaolin @stevenygd Daniel Cremers
📄 https://t.co/rhn9NtwrBm
The #ECCV2026 reviewer discussion period has started! Reviewers should carefully read the authors’ rebuttal, consider the other reviews, and actively participate in the discussion BEFORE finalizing their reviews.
What’s New at #ECCV2026 Malmo 🇸🇪? Please read the important policy updates (especially with regard to ECCV 2024) on our “What’s new?” page.
Notably, this year, we introduced “Contribution Types”, a mechanism for tagging submissions &reviewers to facilitate fair evaluation.
1/2
Combining the benefits of RL and SFT with on-policy distillation, a promising approach for training small models for domain performance and continual learning.
We've released all code and models for FlashDepth! It produces depth maps from a 2k, streaming video in real-time. This was a really fun course project inspired by discussions with @mohsaied and @stevenygd and we look forward to presenting it at #ICCV2025.
GitHub: https://t.co/IHPYJEZFIj
Project page: https://t.co/dGHdXJKLKB
For those at CVPR, Aditya will be presenting this poster tomorrow at 10:30 (Exhibit hall D, Poster #34). Come hear about why neural field derivatives are noisy, and how we resurrect image processing ideas for neural fields!
Real-time video generation is finally real — without sacrificing quality.
Introducing Self-Forcing, a new paradigm for training autoregressive diffusion models.
The key to high quality?
Simulate the inference process during training by unrolling transformers with KV caching.
Most video models struggle to feel like real worlds.
They forget what’s just out of view, slow down as videos get longer, or breaks causality.
We think State Space Models are a natural fit for models with:
🧠 long-term memory across hundreds of frames
⚡ constant-speed generation, even for long rollouts
⏩ fully causal dynamics, fit for real-time interaction
Which multimodal LLMs are ready to take on 3D editing tasks in Blender?
We present BlenderGym — the first benchmark to systematically evaluate them. We also show that the right inference strategy can make all the difference!
Check out our #CVPR2025 Highlight paper to learn more! 👇
Which multimodal LLM should you be using to edit graphics in Blender?
Today, we’re releasing our #CVPR2025 Highlight🌟 work, #BlenderGym 🏋️♀️, the first agentic 3D graphics editing benchmark that will tell you exactly how multimodal LLMs compare in their Blender-editing skills.
What'd we find? 🧵👇
@jon_barron Are we making some implicit assumptions that one 3D model is comparable to only one image or one word? If we can make each 3D model worth thousands of images/words, then this data gap can be perhaps smaller.
How can we use wide-FOV cameras for reconstruction?
We propose self-calibration Gaussian Splatting that jointly optimizes camera parameters, lens distortion, and 3D Gaussian representations to directly reconstruct from a set of wide-angle captures.
page: https://t.co/OQ4C20VsvU