Computer vision researcher at Frontier AI and Robotics Lab, Amazon
Ex-Postdoc @Oxford_VGG.
PhD at the Max Planck Institute for Informatics.
Opinions are my own.
Introducing V-DPM, for 4D reconstruction of in-the-wild videos. We build on top of VGGT, using Dynamic Point Maps for jointly representing 3D and motion.
Joint work with: @EldarIsTyping, @LaiZihang, and Andrea Vedaldi. @Oxford_VGG.
Check out the demo and code 👇
This afternoon we will be presenting CoWTracker by the amazing Zihang Lai, who unfortunately could not be at @CVPR in person. Stop by poster 594 (15:30 - 17:30) to learn about the new SOTA in dense point tracking.
Project page: https://t.co/u2SxdFNbuQ
@SucarEdgar @Oxford_VGG
@andreasklinger @nothing Fundamentally, it's yet another Android smartphone. There is no need for the next Apple; what's needed is a truly open platform.
My brother is a senior designer at Figma. He is insanely cracked. I sent him this image and asked him what it would take to build it today. I will never forget his answer… "We can't, we don't know how to do it."
@ahealertweets @tonguei66408226 @GoodwinMJ Also Celtic people: "First it were Angles and Saxons, now it is the Vikings and French. We've become a minority in our own country!"
papers are kind of like movies: the first one is usually the best, and the sequels tend to get more complicated but not really more exciting. But that totally doesn’t apply to the DepthAnything series. @bingyikang's team somehow keeps making things simpler and more scalable each time.
in this new version, they basically show that a strong representation encoder plus a depth-ray prediction objective is enough (you see the RAE vibes too, right?) to get solid, general spatial perception across a bunch of tasks.
people often say they hate computer vision because it’s messy--too many tasks, too many data types, too many moving parts. but that’s exactly why I love it. I think the biggest AI breakthroughs are going to come quietly from vision and then suddenly leapfrog everything else, changing how AI interacts with the real world and with us.
pretty soon we’ll realize vision is not a big list of tasks--it’s a perspective. a perspective about modeling continuous sensory data, building layered representations of the world, and inching toward human-like intelligence. and tbh we’re watching this happen every day, behind all the hype, as all these different "tasks" slowly start to merge.
Europe Builds. Others Profit.
3D Gaussian Splatting (3DGS) is the perfect case study. It reflects both Europe’s brilliance and its chronic inability to turn that brilliance into business.
Almost everything that made 3DGS possible was born in Europe. From the early breakthroughs in point-based rasterization in Switzerland to the cumulative research from Austria, Greece, and Germany, executed in France, Europe built the foundation. No other continent can match that level of scientific collaboration and intellectual strength.
The LichtFeld Studio bounty later confirmed it: the biggest performance leaps came straight out of European labs. The science was here. The innovation was here. The talent was here.
But the business was not.
When 3DGS exploded, my inbox filled with messages from US-based companies, not from Europe. In the United States, Luma AI and Polycam turned the paper into products within weeks. They did not wait for funding programs or EU consortia. They simply built.
Then came China, which not only caught up in research but quickly outpaced everyone in commercialization. XGRID, DJI, and many others built thriving businesses around what Europe invented. Today, most 3DGS papers come from Chinese institutions rather than European ones.
Meanwhile, the usual giants such as Meta, NVIDIA, Google, Netflix, and Tesla continue to iterate, integrate, and push forward. A thriving ecosystem of startups like World Labs leverages this technology to create new products and markets. The innovation cycle in the United States and China is fast, relentless, and market-driven.
Europe, in contrast, remains bureaucratic and slow. We fund excellence and celebrate publications, but we rarely ship, even though some small startups are trying to change the status quo. Our researchers create the breakthroughs; others create the successful products.
Until Europe finds a way to bridge the gap between laboratories and markets, it will remain the world’s research and development department: brilliant, underpaid, and underleveraged.
Research is Europe’s comfort zone. Execution must become its strength.
Video: One of my dynamic 3D Gaussian implementations based on the paper "Representing Long Volumetric Video with Temporal Gaussian Hierarchy."
We are seeking a full-time Postdoctoral Research Assistant in Computer Vision to join the Visual Geometry Group (University of Oxford) to work on 3D and Spatial AI with Professor Andrea Vedaldi. The post is funded by ERC and is fixed-term for two years with a possible extension.