The most important AI research center in Europe isn’t in London, Paris, or Berlin. It’s in a university town of 90,000 people you’ve probably never heard of.
Tübingen, in southwestern Germany, is home to Cyber Valley, an AI consortium set up in 2016 with €165M from the state of Baden-Württemberg, the Max Planck Society, and partners including Bosch, Mercedes-Benz, and Amazon. The region publishes more machine learning and computer vision research at top conferences than anywhere else in Europe. Aleph Alpha came out of here.
With @FastCodeAI Labs now registered in Baden-Württemberg, we’re applying to join the Cyber Valley Startup Network.
Thanks to Paul-David at Cyber Valley for the time.
Work done with Simon Gerstenecker and Andreas Geiger.
We have fully open-sourced the benchmark alongside our custom toolbox.
🔗 Code, Data & Toolbox: https://t.co/QgMiVNIlaL
📄 Read the Paper: https://t.co/dtELevCWRU
Why did the elephant cross the road? To expose how fragile your model is.
There's a relatively quiet but serious problem in autonomous driving research: most models are trained and evaluated on the same scenarios.
Analyzing seven state-of-the-art models, we find that success rates drop by 22.8% on average, highlighting fundamental robustness concerns in current approaches. The benchmark is fully open, and includes a toolbox for researchers to define their own generalization tests.
MDPO: Overcoming the Training-Inference Divide of Masked Diffusion Language Models
"We propose a novel Masked Diffusion Policy Optimization (MDPO) to exploit the Markov property diffusion possesses and explicitly train the model under the same progressive refining schedule used at inference. MDPO matches the performance of the previous state-of-the-art (SOTA) method with 60x fewer gradient updates, while achieving average improvements of 9.6% on MATH500 and 54.2% on Countdown over SOTA when trained within the same number of weight updates. Additionally, we improve the remasking strategy of MDLMs as a plug-in inference replacement to overcome the limitation that the model cannot refine tokens flexibly. This simple yet effective training-free strategy, what we refer to as RCR, consistently improves performance and yields additional gains when combined with MDPO"
One concern I have as an AI researcher when publishing code is that it could be put to dual-use applications.
To solve this, we propose Civil Software Licenses. They prevent dual-use while being minimal in the restrictions they impose:
https://t.co/gikJHAAMkd
We have released the code for our work, CaRL: Learning Scalable Planning Policies with Simple Rewards.
The repository contains the first public code base for training RL agents with the CARLA leaderboard 2.0 and nuPlan.
https://t.co/oV3Qi9OVZo
We now know RL agents can zero-shot crush driving benchmarks. Can we put them on a car and replace the planning stack? We're hiring a postdoc at NYU to find out!
Email me if interested and please help us get the word out.
#CVPR2025 Paper Picks #3
🚗 SimLingo: Vision-Language-Action for autonomous driving by @KatrinRenz et al. @wayve_ai
Autonomous driving meets language grounding.
SimLingo drives and understands — using only cameras.
No LiDAR. No diffusion. Just vision, language, and action.
📢Excited to present our poster "SimLingo" tomorrow at #CVPR2025. Drop by to talk about vision-language-action models, language-action grounding, or anything else :)
📍Saturday, 10:30 - 12:30 Poster #130
Joint work with Long Chen @ElaheArani @SinavskiOleg @wayve_ai
Only a few days to go until #CVPR2025 kicks off 🤩
This year, we’re excited to share our research paper #SIMLINGO — a foundation model that brings together vision, language, and action to power more generalizable, interpretable embodied agents. 🚗🗣️👀
Come find us at booth 1429 to see it in action and chat with the team!
In the meantime, check out the paper and demo here https://t.co/ygfNxH5fYH
1 year since we launched LINGO-2 @wayve_ai 🧠
With LINGO-2, our AI is trained to both make decisions *and* communicate them. The first closed-loop vision-language-action driving model (VLAM) tested on public roads, LINGO-2 has been game-changing for exploring the connection between language and driving.
Want to catch the latest with LINGO? We'll be presenting our new research paper SIMLINGO at #CVPR2025 https://t.co/QQgyxtHBZB
LINGO-2 blog 👉 https://t.co/4uLitOFBhI
#LINGO2 #EmbodiedAI #Simulation
Just found out that DriveLM ranked #9 among the Most Influential ECCV Papers (2024-09 version). Its thorough benchmarking of driving with VLMs has earned it popularity! https://t.co/pG1JvYzHcM