VLA models often forget their pretrained knowledge during action training, hurting generalization.
🤖Our framework unifies action & VLM training to preserve strong pretrained representations & maintain versatility, boosting generalization & robustness.
https://t.co/Bt66OA51uk
Traditional robots are like player pianos—they only play notes they’re given.
Memo is more like a Master Chef who has watched hundreds of cooks in every type of kitchen. Our diverse dataset provides Memo with an “intuition” for grasping entirely new objects. This is a small step towards truly helpful robots that work in any home 🤖👨🍳
Introducing GEN-0, our latest 10B+ foundation model for robots
⏱️ built on Harmonic Reasoning, new architecture that can think & act seamlessly
📈 strong scaling laws: more pretraining & model size = better
🌍 unprecedented corpus of 270,000+ hrs of dexterous data
Read more 👇
Our framework provides a path toward building generalist policies by preserving rich representations.
Work led by me with @akshaygopalk , in collaboration with @XuanlinLi2 , @BoAi0110 , @hiskov and @haosu_twitr .
VLA models often forget their pretrained knowledge during action training, hurting generalization.
🤖Our framework unifies action & VLM training to preserve strong pretrained representations & maintain versatility, boosting generalization & robustness.
https://t.co/Bt66OA51uk
Result: This all translates to robust real-world performance.
Our model demonstrates a more robust understanding of tasks in the real world, successfully completing goal-conditioned actions even in the presence of distracting objects that confuse baseline models.