In my first week at @GeneralistAI, I trained a robot to pour liquids using GEN-1 🤖💧
I wanted to challenge the robot with a non-rigid manipulation task, so liquid felt like the perfect choice. The task involved:
- unscrewing the bottle cap
- pouring liquid into espresso glasses
- rebalancing uneven pours
Best of all, the robot was able to complete the task fully autonomously 3 times in a row (out of 3)! Pour-fect 😉
Excited for the journey ahead and grateful to be building alongside such an incredible team!
Today marks the end of my first full week at @GeneralistAI.
Last Monday, I was given a challenge: use our GEN-1 model to teach a robot a task of my choosing, using the same no-code platform our customers use.
I picked the ball-and-vase magic trick. It was one of my favorites as a kid, and it felt like the right mix of fun and surprisingly hard.
A few days later, GEN-1 pulled it off. I left Friday having watched the robot nail it 14 times in a row. What’s wild is that even 4 months ago, if you had told me you could go from idea to on-robot skill in a couple of days, I probably wouldn’t have believed you.
Really excited to be building with an incredible team. Can’t wait to see what week two brings 🤖
GEN-1 plays the 🐚 shell game, trained on just 1 hr of robot data. It also generalizes to unseen objects, like @BerkayAntmen's car keys.
Physical AI models should be capable of benchmark tasks like this one. It's interesting for all the reasons @RhodaAI calls out: it requires visual memory, and the model must track the cups from the very start, at high frame rates.
Interestingly, GEN-1 appears to exhibit a degree of "active perception." It's subtle: the hands sometimes appear to "follow" the cups, as if the model uses its own movements to help attend to where it thinks the object should be.
Read more about GEN-1 in our blog post in the comments below ↓
Every day for the past 2 weeks, we've been sharing something new from GEN-1, our latest milestone in scaling robot learning. This has never been done before.
Going from ideas to skills in days (or faster) is what physical AI models should deliver.
More coming. Stay tuned.
Read more about it in our blog post in the comments below ↓
GEN-1 plays with a fidget toy.
Other tasks GEN-1 can do:
https://t.co/j2vm6S45UN
Read more about GEN-1, our latest foundation model for the physical world:
https://t.co/Sg2PoaS7hd
GEN-1 removes thumbtacks and papers from a corkboard.
Orange you glad our robot can stack?
I love this one, because deploying a robot and then “integrating with customer APIs” is not fun.
But it's a lot simpler when your robot can push the same buttons as a human 👆 or use a touch screen.
GEN-1 has memory too: it learns to “remember” which sock it put away, then taps.
As a PhD student, I worked on methods to generalize across motion, objects, lighting, backgrounds, etc. Each is hard, and there’s always a long tail. (E.g. my robot still can’t handle new camera views—hence the “Don’t move! Thx :)” sign.)
But if you have enough data to train from scratch, many of these generalization problems start to disappear. You get them almost “for free.”
What’s powerful is the clarity you get when you start from the goal, not from a favored method: data is a key bottleneck to remove on the path to physical AGI. And working backwards from first principles, there may be ways to solve the data problem that aren’t actually that expensive.
That unlocks the freedom to train from scratch with simple methods, not crutches.
This is GEN-1 putting paper bills into a wallet.
Paper has always been deceptively hard for robots.
Thin, deformable, and unforgiving—it bends, folds, and slips.
Not precise enough, you miss. Too much force, you crumple.
Easy for humans. But for robots, it’s a full-stack challenge.