We’re looking for research interns at @NianticSpatial to build real-world scalable 3D scene understanding (publication).
If you’re into 3D foundation models, semantic understanding, feed-forward recon, and/or 3D perception at scale, please apply / share 👇
🎉PlaceIt3D has been accepted to #ICCV2025.
New task: placing 3D objects in real scenes given text prompts.
We present:
📊 A benchmark and a large-scale dataset
🪄 PlaceWizard, our 3D LLM-based model
Project webpage: https://t.co/3z24v89c0d
Thanks to my amazing co-authors!
🚀 Introducing PlaceIt3D — ICCV 2025, Honolulu 🌺
What feels natural for us — “put the lamp next to the sofa” or “move this character where it can’t be seen from the doorway” — is still one of the hardest challenges in AI.
👉 Full paper and resources: https://t.co/oibeoHLApq
A Scene is Worth a Thousand Features: Feed-Forward Camera Localization from a Collection of Image Features
@axelbarrosotw, Tommaso Cavallari, @viprad, @eric_brachmann
tl;dr: query+posed mapping images->DUSt3R base->query 3D coordinates
https://t.co/nBY72kWXf5
RL is not all you need, nor attention nor Bayesianism nor free energy minimisation, nor an age of first person experience. Such statements are propaganda.
You need thousands of people working hard on data pipelines, scaling infrastructure, HPC, apps with feedback to drive benchmarks and data, tons of research and engineering on generative models, data mixtures, ablations, RL/self-training, etc. And we will probably need lots of people working hard to figure out safety, causal world models, awareness, models that create abstractions comparable to infinity and zero and use these to predict the existence of things like black holes and suggest experiments to verify such hypotheses, or come up with novel engineering designs to generate energy more efficiently, robotics, etc.
It takes thousands of people and many ideas. In the end some simple ideas might become obvious but such obviousness only happens in retrospect. Yes, there is a bitter lesson but if we had followed it, we’d still be doing linear regression with RL. Let’s not oversimplify, but rather honour the research and engineering of thousands of people.
Also, people keep rewriting history. When our language understanding startup (darkbluelabs) was acquired by Google about 10 years ago, we joined DeepMind, where the AGI documents were all about concepts, RL, episodic memories and made it clear that there was no room for language. To be honest, back then such a position wasn’t so crazy. Now it seems silly, but only because of the benefit of hindsight.
There’s no 1 or 10 heroes in the history of AI. There are many 1000s of hard-working students, profs, engineers, operations and support people, product folks, managers, even hedge funds among others. Let’s honour the whole community and not just CEOs or the philosophers of Bayes, RL, deep learning, etc. I look forward to learning from the next generation and seeing what they will achieve. To them: Don’t buy the existing narratives blindly, innovate. Remember that just like mathematics, AI will advance one grave at a time.
🔍Looking for a multi-view depth method that just works?
We're excited to share MVSAnywhere, which we will present at #CVPR2025. MVSAnywhere produces sharp depths, generalizes and is robust to all kinds of scenes, and it's scale-agnostic.
More info:
https://t.co/esbaLbsRF4
Splats are great, but wouldn’t it be wonderful if we could transform them into the stuff of dreams? Introducing our #CVPR2025 paper ✨Morpheus✨!
Morpheus lets you change the shape and appearance of 3D Gaussian splats with a few words!
https://t.co/16O5v18IRd
(1/6)
We’re excited to unveil CoCreatAR, a groundbreaking system that connects remote developers with real-world locations in real time.
By enabling live collaboration with on-site partners through video, audio, and 3D scans (Gaussian splats), CoCreatAR ensures AR elements fit seamlessly into dynamic environments – solving issues like outdated or incomplete maps. This innovation paves the way for far more accurate and immersive AR experiences. Incredible work by the team who made this a reality.
Learn more here: https://t.co/zLtnH4IenP
Niantic is looking for research interns for next summer in London. Interested in the next generation of reconstruction, mapping and visual relocalization? Apply! (And send me a DM for good measure). Let's explore what the bitter-sweet lesson can do for 3D vision. Link below.
We'll be at poster 105 this afternoon! In the meantime, you can try an interactive DEMO of a lightweight mobile version of HandDGP directly on ANY device with a browser! https://t.co/FUuol80YAp #ECCV2024
👋Would you like more accurate 3D camera-space hand mesh predictions? Check our new #ECCV2024 work: "HandDGP: Camera-Space Hand Mesh Prediction with Differentiable Global Positioning". Project page: https://t.co/7PUrmVAn32 (1/5)
#MapFreeReloc workshop in progress. The room is relatively small but maybe you can still fit in to hear Jakob Engel tell you about spatial AI. #ECCV2024