Play2Perfect learns contact-rich, precise assembly skills like screwing.
The best part is watching the rollouts slowed down: the policy makes tiny corrections and recoveries that are essential for fast, dexterous manipulation.
Check out Tyler’s thread for more details 👇
Let humans be humans, and let robots be robots. Our paper at #ICRA2026, X-Diffusion views human data as noisy counterparts of robot data.
Building on the Ambient Diffusion framework, we selectively integrate human data in the high-noise timesteps of diffusion policy training 👇
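One way to picture "human data only in the high-noise timesteps" is as a per-sample loss mask. This is a minimal sketch, not the paper's implementation: the fixed cutoff `t_min`, the `is_human` flags, and both function names are assumptions made for illustration, and X-Diffusion may decide which timesteps admit human data differently.

```python
def loss_mask(timesteps, is_human, t_min):
    """Return 1.0 where a sample contributes to the denoising loss, else 0.0.

    Robot samples are used at every timestep. Human samples are used only
    when the diffusion timestep is at or above the (hypothetical) cutoff t_min.
    """
    return [
        1.0 if (not human) or t >= t_min else 0.0
        for t, human in zip(timesteps, is_human)
    ]


def masked_mean(per_sample_loss, mask):
    """Average the per-sample losses over the samples the mask keeps."""
    kept = sum(mask)
    if kept == 0:
        return 0.0
    return sum(l * m for l, m in zip(per_sample_loss, mask)) / kept


# Two robot samples and two human samples, at low and high noise levels.
timesteps = [10, 900, 10, 900]
is_human = [False, False, True, True]
mask = loss_mask(timesteps, is_human, t_min=500)
loss = masked_mean([1.0, 2.0, 100.0, 3.0], mask)
```

Here the low-noise human sample is dropped from the loss, so its mismatch with robot motion never shapes the fine-grained denoising steps.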
🤖 How can we teach dexterous robots to perform precise, contact-rich assembly?
Introducing Play2Perfect: first learn to play with objects, then perfect the policy for tight insertion, multi-part assembly, and screwing.
Sound on! 🔊
🧵👇
Teaching a robot shouldn't require humans to act like robots.
Human demonstrations contain valuable signal for robot manipulation, but they aren’t directly transferable to robots.
X-Diffusion learns from noisy human demonstrations while staying within the robot’s capabilities.
Human videos 👩 are noisy and break standard imitation learning.
X-Diffusion selectively trains on noisy human data to learn skills while maintaining robot feasibility 🦾
Come check out our poster at #ICRA26
Interactive Session 6 − ThI2I Hall C - 3-4:30 p.m.
🎮 Can we learn interactive world models from letting robots “play”?
➡️ Introducing ✨PlayWorld: a framework for training high-fidelity video world models from large-scale autonomous play experience that enables:
→ Accurate dynamics prediction
→ Reliable policy evaluation
→ RL fine-tuning entirely inside the world model
🌐https://t.co/Kpd2DoveXc
🤖 Can a single robot policy manipulate diverse tools without ever seeing them before?
Introducing SimToolReal 🔨 : a generalist dexterous manipulation policy that transfers zero-shot sim→real to unseen tools + unseen tasks
All videos are 1x speed (60 Hz control) 🧵👇
Using OT to define rewards for imitating video demos is popular, but it breaks down when demos are temporally misaligned—a frequent challenge in practice.
We present ORCA at #ICML2025, which defines rewards by aligning sequences rather than matching individual frames via OT.
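A toy contrast between frame-level matching and sequence-level alignment, as a rough sketch. Neither function is ORCA or OT: `nearest_frame_cost` is a simplified, order-agnostic stand-in for frame matching, and `dtw_cost` uses classic dynamic time warping as a stand-in for an order-respecting alignment. The 1-D "frames" and both function names are assumptions for illustration.

```python
def nearest_frame_cost(rollout, demo):
    """Order-agnostic cost: each rollout frame is matched to its closest demo frame."""
    return sum(min(abs(r - d) for d in demo) for r in rollout)


def dtw_cost(rollout, demo):
    """Order-respecting cost: dynamic time warping with a monotone alignment."""
    n, m = len(rollout), len(demo)
    inf = float("inf")
    table = [[inf] * (m + 1) for _ in range(n + 1)]
    table[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(rollout[i - 1] - demo[j - 1])
            table[i][j] = step + min(
                table[i - 1][j],      # rollout advances, demo frame repeats
                table[i][j - 1],      # demo advances, rollout frame repeats
                table[i - 1][j - 1],  # both advance
            )
    return table[n][m]


demo = [0, 1, 2]
slow_rollout = [0, 0, 1, 1, 2, 2]   # same task, executed at half speed
reversed_rollout = [2, 1, 0]        # same frames, wrong order

slow_aligned = dtw_cost(slow_rollout, demo)             # 0.0
reversed_framewise = nearest_frame_cost(reversed_rollout, demo)  # 0
reversed_aligned = dtw_cost(reversed_rollout, demo)     # 4.0
```

The order-agnostic cost cannot tell the reversed rollout from a correct one, while the alignment-based cost penalizes it and still gives the slower rollout a perfect score.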
X-Sim, Best Paper (Runner-Up) at the EgoAct Workshop @ RSS.
Real2sim2real with purely image-based simulation training and rewards extracted from human videos.
Huge thanks to my co-lead @prithwish_dan, collaborators Angela Chao, Edward Duan & Maximus Pace, and co-advisors @weichiuma and @sanjibac!
Thrilled that our X-Sim paper received Best Paper (Runner-Up) at the EgoAct Workshop @RoboticsSciSys — winning a cool pair of @Meta Ray-Bans! 😎
Teleoperation is slow, expensive, and difficult to scale. So how can we train our robots instead?
Introducing X-Sim: a real-to-sim-to-real framework that trains image-based policies 1) entirely in simulation 2) using rewards from human videos.
https://t.co/5yt2iTFYF4
Say ahoy to 𝚂𝙰𝙸𝙻𝙾𝚁⛵: a new paradigm of *learning to search* from demonstrations, enabling test-time reasoning about how to recover from mistakes w/o any additional human feedback! 𝚂𝙰𝙸𝙻𝙾𝚁 ⛵ outperforms Diffusion Policies trained via behavioral cloning on 5-10x more data!
A core challenge we face when training robots from human videos (e.g., MotionTrack) is that the human basically has to move like a robot for it to work...
@kushalk_ and @prithwish_dan have gone deep on this over the past year—first of many exciting papers that will solve this!
We’re incredibly proud to see our work featured by the Cornell Chronicle!
This work was co-led with the amazing @prithwish_dan and advised by @sanjibac
Read the full story here -
https://t.co/q1BtLXMA0A
We would love to train our robots with human videos. But humans move very differently from robots! How do we bridge this divide?
Check out our work at #ICRA2025 “One-Shot Imitation under Mismatched Execution” on specifying tasks to robots via a prompt human video 🧵
🚀 Come check out the *four* papers at #CoRL2024 from @PortalCornell! Each explores a unique angle on how robots can learn effectively with humans:
- Learning by asking (APRICOT)
- Learning by watching (RHyME, Time Your Rewards)
- Learning to collaborate (MOSAIC)
Details: 🧵👇
How can we enable LLMs to actively clarify ambiguous task specifications by gathering information from humans?
Check out APRICOT at #CoRL2024! APRICOT combines LLMs, which propose diverse questions, with Bayesian Active Learning, which selects the most informative one to ask.
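To make "selects the most informative one" concrete, here is a generic expected-information-gain sketch over a small discrete set of hypotheses. It is not APRICOT's implementation: the candidate questions, the assumption that each hypothesis answers deterministically, and all function names are hypothetical. In the setup described above, the candidate questions would come from an LLM.

```python
import math
from collections import defaultdict


def entropy(probs):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def expected_information_gain(prior, answers):
    """Expected drop in entropy over hypotheses after hearing the answer.

    prior[i] is the probability of hypothesis i; answers[i] is the answer
    that hypothesis i would give (assumed deterministic in this sketch).
    """
    answer_mass = defaultdict(float)
    for p, a in zip(prior, answers):
        answer_mass[a] += p
    expected_posterior_entropy = 0.0
    for a, mass in answer_mass.items():
        if mass == 0:
            continue
        posterior = [p / mass for p, ans in zip(prior, answers) if ans == a]
        expected_posterior_entropy += mass * entropy(posterior)
    return entropy(prior) - expected_posterior_entropy


def select_question(prior, question_answers):
    """Pick the question with the highest expected information gain."""
    return max(
        question_answers,
        key=lambda q: expected_information_gain(prior, question_answers[q]),
    )


# Four equally likely user preferences and three candidate questions.
prior = [0.25, 0.25, 0.25, 0.25]
question_answers = {
    "splits_evenly": ["yes", "yes", "no", "no"],
    "isolates_one": ["yes", "no", "no", "no"],
    "uninformative": ["yes", "yes", "yes", "yes"],
}
best = select_question(prior, question_answers)
```

With a uniform prior, the question that splits the hypotheses in half is chosen, since it is expected to remove a full bit of uncertainty.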
Congratulations to graduating senior Sean O'Brien, who will be attending the University at Buffalo to run cross country and track! Sean is planning on majoring in Engineering! Good luck Sean!
Congratulations to graduating senior Sheridan Dillon, who will be attending Fairfield University to play baseball! Sheridan is planning on majoring in Business! Good luck Sheridan!