How to use simulation data for real-world robot manipulation? We present sim-and-real co-training, a simple recipe for manipulation. We demonstrate that sim data can significantly enhance real-world performance, even with notable differences between the sim and the real. (1/n)
Humanoids need data. Lots and lots of data.
Introducing HumanoidMimicGen: a method that automatically generates 1000s of humanoid loco-manipulation demonstrations from a single teleoperated demonstration.
We are back. After one year of quiet building.
Introducing GENE-26.5, our first robotic brain that takes a major step toward human-level capability.
For years, robotics has struggled to learn from the world’s largest and most valuable data source: Humans.
Solving it means rethinking the whole stack from the ground up:
- A robotics-native foundation model.
- A 1:1 human-like robotic hand.
- A noninvasive data collection glove for motion, force, and touch.
- A simulator that turns weeks of experiments into minutes.
GENE-26.5 is trained across language, vision, proprioception, tactile, and action. We designed a set of tasks to test how far we can go with this new paradigm.
Fully autonomous, 1x speed, one model, same weights. (Enjoy with sound on)
We are approaching the endgame for robotics.
And this is just the beginning.
🤖Co-training is everywhere (sim↔real [e.g. GR00T, LBM], human↔robot [e.g. PI, EgoScale], even non-robot data [e.g. PI, LBM]).
But why does it work? How can we improve it further?
Taking sim-and-real imitation learning in diffusion/flow-based models as the test bed, we performed a rigorous mechanistic analysis, drawing on theoretical insights and multi-layered experiments.
😮Key insight: it’s all about representations.
- Alignment → enables transfer
- Discernibility → enables adaptation
⚖️Both are necessary: more aligned representations are better, but the model must still be able to discern the domains. We term this structured representation alignment.
⬇️Let’s take a deep dive into that:
Paper: https://t.co/RWCAxdBC0j
Website: https://t.co/BwgbwCkevA
Discrete or continuous tokens? Or even tokenizer-free? The visual modeling debate rages on, but for now, let me introduce L24SQ, a provably optimal, regularizer-free quantizer with a large codebook (~200k), achieving SoTA reconstruction-compression tradeoff and generative power!
Sim-to-real learning for humanoid robots is a full-stack problem. Today, Amazon FAR is releasing a full-stack solution: Holosoma.
To accelerate research, we are open-sourcing a complete codebase covering multiple simulation backends, training, retargeting, and real-world inference.
Introducing Lux, the most powerful and fastest Computer Use model, built by OpenAGI Foundation @agiopen_org
Lux outperforms Google Gemini CUA, OpenAI Operator and Anthropic Claude on a benchmark of 300 real-world tasks.
Try our developer-friendly SDK to build powerful, real-world applications.
🧵
Btw, the code is finally released: https://t.co/2I7lnARYov
Hopefully, you'll find it helpful in your own research. Let us know if you encounter any issues or build on or extend the work!
I'm still excited about real-world RL, and residual RL as a policy-agnostic method for it.
Humanoid motion tracking performance is greatly determined by retargeting quality!
Introducing 𝗢𝗺𝗻𝗶𝗥𝗲𝘁𝗮𝗿𝗴𝗲𝘁🎯, generating high-quality interaction-preserving data from human motions for learning complex humanoid skills with 𝗺𝗶𝗻𝗶𝗺𝗮𝗹 RL:
- 5 rewards,
- 4 DR terms,
- Proprio. ONLY,
- NO history/curriculum.
Ready for agile, human-like 🤖? (Best with 🎧)
🔗 https://t.co/rT2CRb9msm 🎥
1/9
Check out our work on residual RL for finetuning BC policies! We demonstrated real-world RL with a 29-DoF bimanual dexterous robot and improved the pre-trained base policy efficiently.
How can we enable finetuning of humanoid manipulation policies, directly in the real world?
In our new paper, Residual Off-Policy RL for Finetuning BC Policies, we demonstrate real-world RL on a bimanual humanoid with 5-fingered hands (29 DoF) and improve pre-trained policies with ~15-75 minutes of robot interaction.
By learning residual corrections on frozen BC policies using sample-efficient off-policy RL, we achieve significant improvements in sample efficiency, enabling policy finetuning directly on the hardware — to our knowledge, one of the first examples of this on a humanoid with bimanual dexterous hands.
(If you know of other examples, let me know!)
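The residual idea above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the names `residual_action`, `base_policy`, and `residual_policy`, the `scale` factor, the action bounds, and the choice to condition the residual on the base action are all assumptions made for the example.

```python
def residual_action(base_policy, residual_policy, obs, scale=0.1, low=-1.0, high=1.0):
    """Combine a frozen base action with a small learned correction.

    Hypothetical helper: `scale` and the [low, high] clipping range are
    illustrative assumptions, not values from the paper.
    """
    base = base_policy(obs)             # frozen BC policy, never updated
    delta = residual_policy(obs, base)  # correction that RL would train
    # Add the scaled correction and clip each dimension to the action bounds.
    return [min(high, max(low, b + scale * d)) for b, d in zip(base, delta)]


# Toy stand-ins for the two policies (placeholders, not real networks).
def toy_base(obs):
    return [0.5, -0.9]


def toy_residual(obs, base_action):
    return [1.0, -2.0]


action = residual_action(toy_base, toy_residual, obs=None)
```

Only the residual would receive gradient updates in a setup like this; the base policy stays fixed, so a small `scale` keeps early exploration close to the behavior the BC policy already produces.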
Intelligent humanoids should have the ability to quickly adapt to new tasks by observing humans.
Why is such adaptability important?
🌍 Real-world diversity is hard to fully capture in advance
🧠 Adaptability is central to natural intelligence
We present MimicDroid 👇
🌐 https://t.co/J8XpND9j1j
Meet Casper👻, a friendly robot sidekick who shadows your day, decodes your intents on the fly, and lends a hand while you stay in control!
Instead of passively receiving commands, what if a robot could actively sense what you need in the background and step in when confident? (1/n)
(1/n) Many large-scale robot datasets have been released, but their quality varies. How can we curate them to improve policy performance? We present SCIZOR, a self-supervised framework that filters low-quality data, boosting policy performance in both simulation and the real world.
Excited to be in Atlanta for #ICRA2025 next week and present our work DexMimicGen on Thursday! Please come check out our presentation at 5:00 pm @ 411 and stop by our poster board!
How can we scale up humanoid data acquisition with minimal human effort?
Introducing DexMimicGen, a large-scale automated data generation system that synthesizes trajectories from a few human demonstrations for humanoid robots with dexterous hands. (1/n)
Competitive Pokémon has all the makings of a great RL research problem:
- Stochasticity? ✅
- Imperfect information? ✅
- Generalization? ✅
- More fun than half-cheetah? ✅✅✅
And today it gets better…
Awesome work from @adamwei_ diving deep into sim-and-real co-training. Our concurrent work (https://t.co/u1nuWWP8z2) shares similar findings on a large range of tasks. We are seeing promising progress on leveraging simulation data on a larger scale!
Learning from both sim+real data could scale robot imitation learning. But what are the scaling laws & principles of sim+real cotraining?
We study this in the first focused analysis of sim+real cotraining spanning 250+ policies & 40k+ evals https://t.co/hMc56Hj2K9 (1/6)
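One common way to set up sim+real co-training is to mix the two sources within each training batch. The sketch below assumes that setup; the function name `sample_cotraining_batch` and the `sim_ratio` parameter are hypothetical, and the threads above do not specify this exact sampling scheme or any particular ratio.

```python
import random


def sample_cotraining_batch(real_data, sim_data, batch_size, sim_ratio, rng=None):
    """Draw one mixed batch; each slot comes from sim with probability sim_ratio.

    Hypothetical helper for illustration only.
    """
    if not 0.0 <= sim_ratio <= 1.0:
        raise ValueError("sim_ratio must be in [0, 1]")
    rng = rng or random.Random()
    batch = []
    for _ in range(batch_size):
        # Pick the source domain first, then a sample uniformly from it, so a
        # small real dataset is not drowned out by a much larger sim dataset.
        source = sim_data if rng.random() < sim_ratio else real_data
        batch.append(rng.choice(source))
    return batch


# Toy datasets: a few real demos and many simulated ones (placeholders).
real = [("real", i) for i in range(10)]
sim = [("sim", i) for i in range(1000)]
batch = sample_cotraining_batch(real, sim, batch_size=8, sim_ratio=0.5,
                                rng=random.Random(0))
```

Sampling the domain before the example decouples the mixing ratio from the dataset sizes, which makes `sim_ratio` a single knob to sweep when studying how much simulation data helps.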
We also find camera alignment to be critical for successful co-training with task-aware digital cousin data.
Training policies on severely misaligned simulation data results in a significant drop in performance compared to policies co-trained with properly aligned digital cousin data. On the Panda arm CounterToSinkPnP task, the cotraining success rate dropped from 67% to 56%, while in the GR-1 humanoid CupPnP task, it declined from 95% to 70%. However, the aligned camera does not need to be strictly identical to the real-world camera. (10/n)