Excited to introduce #TruckDrive 🚛 at #CVPR2026: a new driving dataset built specifically for long-range truck autonomy, where safe braking and anticipatory planning demand perception hundreds of meters ahead, far beyond what existing robotaxi datasets cover.
📦 TruckDrive includes:
🔹 475K samples, with 165K densely annotated frames
🔹 Benchmarks for end-to-end driving, tracking, planning, depth estimation, and detection, with 2D detection up to 1,000m and 3D detection up to 400m 📏🎯
🛰️ A purpose-built long-range sensor suite:
🔸 7 long-range FMCW LiDARs (range + radial velocity)
🔸 3 high-res short-range LiDARs
🔸 11× 8MP surround cameras for short and long range 📷
🔸 10× 4D FMCW radars 📡
⚠️ Key finding: current state-of-the-art models break down at long range 📉, with 31% to 99% drops on 3D perception tasks beyond 150m.
TruckDrive exposes a long-range generalization gap that current architectures and training signals are not yet closing, making it a benchmark for the next generation of long-range highway autonomy research 🚚
🔗 Project and Data: https://t.co/fNzDCbGQRQ
Fun work together with @torc_robotics led by Filippo Ghilotti, Edoardo Palladin, Samuel Brucker, Adam Sigal, and Mario Bijelic.
Are we done with object detection? What about tiny objects beyond 200 meters? 🔎
Telescope 🔭 addresses long-range perception by explicitly tackling extreme scale imbalance ⚖️ in images. It hinges on a learnable hyperbolic foveation transform from a low-resolution image, magnifying distant regions 🔍 while compressing nearby ones - effectively normalizing object scales with minimal computational overhead. Objects are detected in the transformed (Riemannian) space using a novel bounding box parameterization and are then mapped back to the original image.
Project: https://t.co/mBuQGd7KnB
ScenarioControl 🚗🛣️ - Scenario Generation from a single Dashcam Image 📸 or Text Prompt 💬!! Excited to introduce a new vision-language control mechanism for learned driving scenario generation. Given a single dashcam image or a scene prompt, we generate a full scene layout 🧩 and temporally consistent rollouts, including map 🗺️, agents 🚗, and ego video 🛣️
ScenarioControl enables direct, fine-grained control over layout and traffic while preserving realism. It operates in a vectorized latent space with a new cross-global control mechanism that fuses vision-language inputs with scene structure. It interfaces seamlessly with generative video models!
Project: https://t.co/3gEvcdk1lE
Super fun project by Lili Gao, @Yanbo_Xu_, William Koch, Samuele Ruffino, @Luke22R, Behdad Chalaki, Dmitriy Rivkin, Julian Ost, @rogg1111, and Mario Bijelic.
Chop the gradients ✂️! We found that truncating decoder gradients in latent video diffusion to a fixed window allows us to finetune on videos with pixel-wise perceptual losses without running out of memory. Pixel losses have been essential for image generation and reconstruction, but until now, they haven't scaled to long-duration, high-resolution video diffusion due to recursive activation accumulation in causal decoders, leading to OOM during training 💥📉.
Project: https://t.co/IMMbKM0s3j
Video diffusion models can do a lot more 🚀 when you can backprop through the decoder! Post-process neural rendered scenes, super-resolve videos, harmonize lighting in controlled synthetic driving scenes, and inpaint videos, all in a single step ⚡ with a quick finetune from a standard diffusion model.
⏰ Remember: the #ICCP2026 paper deadline is coming up this Friday! We welcome submissions from all areas of computational imaging.
Details can be found on our website: https://t.co/A8TbcF0olb
WorldFlow3D: Unbounded 3D World Generation 🌍 by Flow Through Hierarchical Distributions, without VAEs!
We reformulate 3D generation as flowing through sequentially finer 3D distributions, cutting training time by more than half ⏱️ compared to existing approaches! Vectorized map layouts provide full scene controllability 🗺️, and a novel flow-field alignment process enables causally coherent, spatially unbounded generation 🌍. This generative method generalizes across both real and synthetic data distributions!
Project: https://t.co/D6v2dPVYxN
Project led by @amogh7joshi and Julian Ost — will be super fun to build on this! 🔥
We're Optimal Intellect, a research lab from the team behind CVXPY. Today we're introducing Moreau: a GPU-native solver that's orders of magnitude faster than the best existing tools.
ICCP 2026 is coming to @Princeton, July 13-15! Paper submissions are open, with a deadline of April 10. Accepted papers will be published in the ICCP Proceedings or an IEEE PAMI Special Issue.
Take a sneak peek at already confirmed speakers on our website: https://t.co/oqIZFULKKF!
#ICCP2026
Are we really done with autonomous driving 🚚? Remember the massive winter storm in the US last week ❄️!
We’re excited to share a large adverse weather driving dataset which includes small, distant road hazards, pushing perception beyond clear-weather and in-domain assumptions!
https://t.co/B7wL6ORpxb
The data covers multiple imaging modalities, spanning LiDAR, RGB, gated imaging, stereo, polarization, and depth, and is captured across diverse weather, lighting, and range conditions. This includes rare adverse events like heavy rain (~5×/year) and dense fog (~12×/year in North America & Europe) that are typically underrepresented in standard driving benchmarks.
What’s included:
• Seeing Through Fog – labeled adverse weather dataset captured in over 10,000km of driving.
• Gated2Depth / Gated2Gated – gated imaging for dense depth estimation.
• Pixel-Accurate Depth Benchmark – ultra-high-resolution depth ground truth.
• Long-Range Stereo (Gated Stereo) – large-scale sequential dataset with LiDAR and stereo (RGB, RCCB, gated).
• Fogchamber Benchmark – long-range fog/rain depth benchmark.
• Too Tiny To See – lost cargo benchmark with captures on snowy Lapland roads.
• ScatterNeRF – scene reconstruction under atmospheric scattering.
• Polarization Wavefront LiDAR – polarimetric LiDAR data.
Exciting work coming out of a collaboration in the AI-SEE Project, @torc_robotics, @MercedesBenz, and @Princeton.
Physically grounded video generation at #CES2026 without hallucinations! This week, we demo our end-to-end neural rendering developed at @torc_robotics, which allows us to simulate camera/lidar/radar data for edge cases such as crash scenarios, road debris, or unforeseen crash sites.
We show our stack together with our friends from @ForetellixHQ for end-to-end testing of driving models with reconstructed, generated, or conventional mesh assets.
Join us at the Foretellix booth (LVCC West Hall, Booth 4767) from Tuesday to Friday in Las Vegas to see the demo!
Starting the new year without human labeling 🎉!! Multimodal lidar-camera data is a gold mine of dense 3D geometry hiding in plain sight. For supervised pretraining and validation at scale at @torc_robotics, we rely on fully automated pseudo-labeling pipelines. They exploit geometric priors from temporally accumulated LiDAR maps, and an iterative update rule enforces joint geometric–semantic consistency while detecting moving objects via inconsistencies.
We achieve 3D semantic labels and 3D bounding boxes with human-like quality at the 200m+ range required for highway driving.
Paper: https://t.co/1Z16u5D2oU
Exciting work at @torc_robotics with Filippo Ghilotti, Samuel Brucker, Nahku Saidy, Matteo Matteucci, and Mario Bijelic.
Felix Heide's goal was big: help computers see.
He did not know he was starting on a path to develop a new way of thinking about optics.
“The question for me was always how can we use algorithms to sense and understand the world?” said @_FelixHeide_.
https://t.co/5jOL7XKLIW
Excited to share our #NeurIPS2025 work on learning motion hierarchies! We introduce a general hierarchical graph learning method that learns structured, interpretable motion directly from data, no prior structure or assumptions needed!!!
Project and Paper: https://t.co/aIhw10TfCX
Amazing work led by William Koch, @ChengZh1005, and @MotaLee5! See us in San Diego for #NeurIPS2025!
@Meta Here’s the key idea:
❌ Smooth Phase = Brittle. High image quality, but cannot adapt to obstructions not seen during optimization.
✅ Random Phase = Robust. Creates more speckle, but the light inherently diffracts around obstructions, self-healing the image.
Holography with Eyelashes in the Way!! Future holographic glasses struggle with eyelashes that can cast image-destroying shadows. In a collaboration with @Meta, we train a model to generate "eyelash-proof" holograms at 80 FPS to fix this.
Project: https://t.co/kLgb7IQeep
Collaboration with @bkv2chu, @opueyociutad, @_EthanTseng_, @FlorianSchiffe4, Grace Kuo, @nathanmatsuda, Albert Redo-Sanchez, @douglaslanman, and Oliver Cossairt.
Imagine AR as immersive as Vision Pro and as light as Meta Ray Bans.
That's the promise of holographic displays.
The flaw? Eyelashes can cast image-destroying shadows.
In our #SIGGRAPHAsia2025 paper, we train an AI to generate "eyelash-proof" holograms at 80 FPS to fix this.
Excited to present editable Neural Atlas Graphs at #NeurIPS 2025 (Spotlight)! We introduce a learned atlas representation in which each dynamic object is a 2D planar layer (the atlas). All time-dependent appearance and fine motion are captured directly within this 2D layer using a learned planar flow field and a view-dependent field. Neural Atlas Graphs allow for texture-editable neural representations at high resolution!
Project page: https://t.co/LjcWgJHqxO
Amazing project led by @jaypschneider, with Pratik Bisht, @_ilya_c, Andreas Kolb, and Michael Moeller, in collaboration with Universität Siegen, @Princeton, the Lamarr Institute, and @torc_robotics.
At the event, @_FelixHeide_, an expert in imaging and computer vision, won the Dean for Research Distinguished Innovation Award, which recognizes a faculty member and their team.
Read the story: https://t.co/ievJwb6xYo