Heading to Sydney for #RSS2026? 🇦🇺 Join the 2nd iteration of our Neuromorphic Robotics Systems (#NeuRoSys) workshop! 🤖🧠⚡
We're looking for work on #EventCameras, neuromorphic hardware, bio-inspired control, and real-time/real-robot applications.
The human-perceived RGB is image 1 and the Tesla AI photon count reconstruction is image 2.
This is why Tesla FSD can see so well at night or through extreme glare.
We are back. After one year of quiet building.
Introducing GENE-26.5, our first robotic brain that takes a major step toward human-level capability.
For years, robotics has struggled to learn from the world's largest and most valuable data source: Humans.
Solving it means rethinking the whole stack from the ground up:
- A robotics-native foundation model.
- A 1:1 human-like robotic hand.
- A noninvasive data collection glove for motion, force, and touch.
- A simulator that turns weeks of experiments into minutes.
GENE-26.5 is trained across language, vision, proprioception, tactile, and action. We designed a set of tasks to test how far we can go with this new paradigm.
Fully autonomous, 1x speed, one model, same weights. (Enjoy with sound on)
We are approaching the endgame for robotics.
And this is just a beginning.
📢📢📢 Introducing 𝐏𝐨𝐰𝐞𝐫 𝐅𝐨𝐚𝐦
A 3D representation that can be ray traced or rasterized in real time, with NO COMPROMISE in quality.
- Project: https://t.co/LkmVQjkIt2
- arXiv: https://t.co/TtMbyKrvrp
Rasterized at 3DGS-class FPS
Ray traced at Radiant Foam speeds
Yes, absolutely: factor graphs are energy-based world models, and very general in terms of structure and computation patterns. See the last chapter of the SLAM Handbook for my view, including general learning within factor graphs. Free PDF at: https://t.co/EdTa9zbNg7
Video models as Physics simulators.
[1/] In our latest work, WinDiNet, we finetuned a pre-trained video model into a differentiable physics engine. 1000x faster than traditional CFD solvers.
Project page: https://t.co/LAx7t00y3e
Abs: https://t.co/OdcgbKeQEG
In my recent blog post, I argue that "vision" is only well-defined as part of perception-action loops, and that the conventional view of computer vision, mapping imagery to intermediate representations (3D, flow, segmentation...), is about to go away.
https://t.co/aFmE9CHHau
Excited to present ACE-SLAM, the first neural SLAM to use Scene Coordinate Regression as an implicit map representation.
Efficient (real-time from live stream), compressive (neural maps <1MB) and robust to dynamic scenes.
With @marwan_ptr and @AjdDavison
https://t.co/tMsD5hTkB3
A reminder that accurate motion estimation with sparse visual SLAM has been in the domain of industry for many years now, and what you might often see in academic papers as the "state of the art" is fairly meaningless. (From @pesarlin.bsky.social)
Over the past year, my lab has been working on fleshing out theory/applications of the Platonic Representation Hypothesis.
Today I want to share two new works on this topic:
Eliciting higher alignment: https://t.co/KY4fjNeCBd
Unpaired rep learning: https://t.co/vJTMoyJj5J
1/9
Genie 3 feels like a watershed moment for world models: we can now generate multi-minute, real-time interactive simulations of any imaginable world. This could be the key missing piece for embodied AGI… and it can also create beautiful beaches with my dog, playable real time