Excited to release τ0-WM: an open-source unified video-action world model for robotic manipulation.
It's a 5B-parameter robotic foundation model trained on 27.3K hours of real-robot teleoperation, UMI-style demonstrations, and egocentric interaction videos.
Generalist robots don’t fail due to a lack of generality.
They fail due to a lack of proficiency where it matters.
We introduce SOP, enabling generalist policies to improve from real-world experience across distributed robot fleets, without sacrificing generality.
🧵 https://t.co/2xnBpeqNui
We present HIL-SERL, a reinforcement learning framework for training general-purpose vision-based robotic manipulation policies directly in the real world. It effectively addresses a wide range of challenging manipulation tasks: dynamic manipulation, dual-arm coordination, and contact-rich/flexible object manipulation. It achieves a success rate of 100% across all tasks within just 1 to 2.5 hours of training.