Introducing EgoVerse: an ecosystem for robot learning from egocentric human data.
Built and tested by 4 research labs + 3 industry partners, EgoVerse enables both science and scaling
1300+ hrs, 240 scenes, 2000+ tasks, and growing
Dataset design, findings, and ecosystem 🧵
New essay on robot learning from human data.
I like @karpathy’s idea that LLMs are “ghosts” distilled from human knowledge. In robotics, we are attempting something similar: to summon a sensorimotor ghost.
Our current ritual is teleoperation. It produces data, but strips away the reflexes, priors, and social interactions that make human behavior rich.
My bet: robot learning will scale less with more robots, and more with better models of humans. Right now we lack both the systems and algorithms to model humans well.
If we succeed, the result won’t just be better robots. It may be the first learned theory of how humans act in the physical world. Robots would simply be the first place we deploy it.
Can large-scale sim data enable real-world generalization?🤔
In our new work, we introduce a generalizable domain adaptation setting, where policies must handle real-world situations never seen in the real training data. (1/n)
In evaluation, our method delivers up to 30% higher success rates than the co-training baseline and generalizes to scenarios seen only in simulation, marking a step toward scalable robot learning without large real-world datasets. (4/n)
What if one unified method could help robots learn from human videos across many tasks and many robots?
Meet ImMimic: Cross-Domain Imitation from Human Videos via Mapping and Interpolation (CoRL 2025 Oral Presentation🏆) @ICatGT
Check it out here: https://t.co/mrBAjewrlg
Introducing EgoMimic - just wear a pair of Project Aria @meta_aria smart glasses 👓 to scale up your imitation learning datasets!
Check out what our robot can do.
A thread below👇
In a large-scale simulation study, we show that NOD-TAMP can solve challenging tasks with a handful of demos (4 demos vs. 500 for BC) and achieves strong generalization across diverse shapes, spatial layouts, and task goals. (5/N)
Together, NOD-TAMP flexibly integrates the adaptation of recorded trajectories with traditional motion planning to generalize across drastically different scene layouts. Here we show the full process of skill planning and adaptation for the mug sorting task. (5/N)
NOD-TAMP reasons about the pre- and post-conditions of each skill in NOD space and plans skill sequences to reach different goals. For instance, it can decide whether to pick up a mug by the rim or the handle to hang it on a rack, and whether to use tools to manipulate hard-to-reach objects. (4/N)
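To make the idea of chaining skills through pre- and post-conditions concrete, here is a minimal sketch. It is not NOD-TAMP's planner: the thread says NOD-TAMP reasons in NOD space, while this toy uses plain symbolic facts and breadth-first search. The skill names, the facts, and the rule that hanging requires a rim grasp are all hypothetical, chosen only to mirror the mug example.

```python
from collections import deque
from typing import NamedTuple


class Skill(NamedTuple):
    """A toy skill: applicable when `pre` holds; adds and deletes facts."""
    name: str
    pre: frozenset
    add: frozenset
    delete: frozenset


def plan(start, goal, skills):
    """Breadth-first search for a shortest skill sequence reaching `goal`."""
    start, goal = frozenset(start), frozenset(goal)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if goal <= state:  # every goal fact holds
            return path
        for skill in skills:
            if skill.pre <= state:  # preconditions satisfied
                nxt = (state - skill.delete) | skill.add
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, path + [skill.name]))
    return None  # goal unreachable with these skills


# Hypothetical skills: here, hanging requires a rim grasp.
SKILLS = [
    Skill("pick_by_rim", frozenset({"hand_empty"}),
          frozenset({"holding_rim"}), frozenset({"hand_empty"})),
    Skill("pick_by_handle", frozenset({"hand_empty"}),
          frozenset({"holding_handle"}), frozenset({"hand_empty"})),
    Skill("hang_on_rack", frozenset({"holding_rim"}),
          frozenset({"mug_on_rack", "hand_empty"}), frozenset({"holding_rim"})),
    Skill("place_in_bin", frozenset({"holding_handle"}),
          frozenset({"mug_in_bin", "hand_empty"}), frozenset({"holding_handle"})),
]

if __name__ == "__main__":
    print(plan({"hand_empty"}, {"mug_on_rack"}, SKILLS))
```

The grasp choice falls out of the goal: only the rim grasp satisfies the precondition of the hanging skill, so the search picks it without being told to.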
For skill adaptation, our key insight is to use learned neural object descriptors (NOD) to transform skill trajectories from one task instance to others, thus being able to apply the demoed skills to manipulate unseen object shapes at novel poses. (3/N)
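As a rough illustration of transferring a demonstrated trajectory to a new task instance, the sketch below re-expresses each end-effector pose relative to the demo object and replays it relative to the new object. This is a deliberate simplification and an assumption on our part: it uses rigid 2D poses `(x, y, theta)` only, whereas the learned neural object descriptors described above are what allow adaptation to unseen shapes. All function names here are hypothetical.

```python
import math


def compose(a, b):
    """Compose two SE(2) poses: apply b in the frame of a."""
    ax, ay, at = a
    bx, by, bt = b
    return (ax + math.cos(at) * bx - math.sin(at) * by,
            ay + math.sin(at) * bx + math.cos(at) * by,
            at + bt)


def invert(p):
    """Inverse of an SE(2) pose (x, y, theta)."""
    x, y, t = p
    return (-math.cos(t) * x - math.sin(t) * y,
            math.sin(t) * x - math.cos(t) * y,
            -t)


def transfer(demo_traj, demo_obj, new_obj):
    """Replay a world-frame trajectory relative to a new object pose."""
    to_obj_frame = invert(demo_obj)
    out = []
    for ee_pose in demo_traj:
        relative = compose(to_obj_frame, ee_pose)  # pose in demo object frame
        out.append(compose(new_obj, relative))     # same pose at new object
    return out


if __name__ == "__main__":
    demo = [(0.5, 0.0, 0.0), (0.2, 0.0, 0.0)]
    print(transfer(demo, (0.0, 0.0, 0.0), (1.0, 2.0, math.pi / 2)))
```

The approach direction rotates with the object: a demo that approaches along the object's x-axis still approaches along that axis after the object is moved and turned.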
NOD-TAMP is a bi-level planner that reasons about (1) what skills to use given a high-level task goal and (2) how to co-adapt each skill and compose them to form a long-horizon trajectory plan. (2/N)
Can we teach a robot hundreds of tasks with only dozens of demos?
Introducing NOD-TAMP: A framework that chains together manipulation skills from as few as one demo per skill to compositionally generalize across long-horizon tasks with unseen objects and scenes. (1/N)
Since we are entering the "BC is all you need" phase of Robot Learning😜, Robomimic (https://t.co/jm2STNoHLu) lets you play with SOTA algorithms (BC-Transformer, DiffusionPolicy, etc.) on challenging tasks. It is also easy to integrate with physical robots!
How to represent granular materials for robot manipulation?
Introducing our #CoRL2023 project: Neural Field Dynamics Model for Granular Object Piles Manipulation, a field-based dynamics model for manipulating piles of granular objects.
🌐 https://t.co/6KPwV32iqO
👇 Thread
If you're at #ICRA2023, come chat with us about our poster on "Generalizable Pose Estimation using Implicit Scene Representations"! Pod 11 at 3pm BST
Read more about our paper: https://t.co/hLuRAnBY53