Happy to share what I’ve been working on since joining Genesis! GENE-26.5 is a one-of-a-kind, robotics-native multimodal foundation model that learns from diverse, in-the-wild data across modalities and outputs actions enabling a 54-DoF robot system to perform the most dexterous, long-horizon manipulation tasks to date—approaching human-level capability.
This is the result of innovations across the full stack—data collection and processing, robot systems, model architecture, training strategies, and scalable evaluation infrastructure.
Hillclimbing from scratch is painful for sure, but we learned a ton by going through it. So proud of our team 💫
Check out our technical paper: https://t.co/QRw2rkcKQV
I'll be at @CVPR (briefly), speaking at the Sense of Space workshop tomorrow @ 9:15 about how robots may be slowing down robotics.
I spent the past year thinking more about the role of human data, simulation, and dexterous manipulation; happy to connect if you're doing the same!
In recent years, academic and industry work in generative modeling has drifted so far apart that they are playing totally different games, and techniques that work in academia may not transfer to industry problems.
The divide isn't just about scale -- the different tasks in academia vs industry lead to different fundamental challenges.
Academic work focuses on class-conditional ImageNet generation. This has a very weak conditioning signal (single categorical label) and the problem is very data-constrained, with all SOTA methods training for hundreds of epochs. The main challenge in this regime is combatting overfitting.
Industry work on image or video generation usually has a much richer conditioning signal (e.g. very long captions, input images, etc) and is almost always underfitting since data can be scaled to absurd degrees. Overfitting (at least for pretraining) isn't a concern; instead we want to fit the complex data distribution *as fast as possible*.
We hope that GPIC is approachable on the academic budgets people are already expending on ImageNet, but will lead to problems more similar to the industry-scale challenges in generative modeling.
If you are serious about robotics (aka Physical AI), hire students from the Robotics, Systems and Control master’s at @ETH. In my opinion, it is THE strongest robotics program in the world:
•control,
•optimization,
•perception,
•locomotion,
•aerial robotics,
•learning,
•systems engineering, robot dynamics
If the student has GPA > 5.75, you can probably skip the technical interview.
TLDR - We proposed CoP, a physics-grounded tactile representation that is richer than binary contact and more transferable than raw touch. Strong sim-to-real (s2r) RL performance on two challenging tasks: peg-in-hole and ball balancing.
Very exciting work, see Michael's thread for more details!!!
Touch is important for dexterous manipulation, but hard to use in sim-to-real RL.
Binary contact transfers well, but loses rich contact information; raw tactile readings preserve information, but create a large sim-to-real gap.
When @mpan31415 first showed me the paper, I remember sitting at my desk and spending a whole afternoon working through the formulas.
A really elegant physical modeling and sensor calibration method that effectively reduces raw tactile signals to a 6-dimensional representation. Also interesting insights on model training, showing how the representation can be used to infer object properties.
Congrats to Michael and Toru!!
We have been using simulation for evaluation to accelerate model iteration. We are now exploring how to leverage simulation for learning from experience and enabling a continual self-improvement loop.
Simulation is key infra in our internal stack!!
Simulation is on the critical path to general-purpose robots.
Reality is too slow to scale robot interactions. Simulation lets us scale at the speed of software: evaluation today, reinforcement learning tomorrow.
Bridging the Sim2Real gap requires owning the full stack: system identification, control, compiler, physics, assets, and rendering.
Proud of the cross-functional collab across our global team on Genesis World 1.0 - and excited to open source it to help accelerate progress across the field.
More details: https://t.co/EFUFMihvEa
The blog post goes into a ton of detail about how we’re using simulation today and our vision for the future, so definitely check it out. Two things I especially want to highlight:
Eval and Alignment
Simulation is one of the best ways to bootstrap generalization. As pretrained policies become stronger, evaluating them on real robots becomes increasingly slow and difficult — especially for multitask policies with enormous visual, behavioral, and semantic diversity. Genesis World dramatically accelerates the pretraining iteration loop by providing faithful simulation results that strongly correlate with real-world performance.
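One way to make "strongly correlate" concrete is to compare per-checkpoint success rates measured in simulation against those measured on real robots. The sketch below computes a plain Pearson correlation; the post does not say which metric Genesis uses, so the choice of Pearson, the function name `pearson_r`, and the example numbers are all assumptions for illustration only.

```python
import math
from typing import Sequence


def pearson_r(sim: Sequence[float], real: Sequence[float]) -> float:
    """Pearson correlation between paired sim and real success rates.

    Hypothetical helper: each index is one policy checkpoint evaluated
    both in simulation and on hardware.
    """
    if len(sim) != len(real) or len(sim) < 2:
        raise ValueError("need two equally sized samples with at least 2 points")
    n = len(sim)
    mean_s = sum(sim) / n
    mean_r = sum(real) / n
    cov = sum((s - mean_s) * (r - mean_r) for s, r in zip(sim, real))
    var_s = sum((s - mean_s) ** 2 for s in sim)
    var_r = sum((r - mean_r) ** 2 for r in real)
    if var_s == 0 or var_r == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_s * var_r)


# Placeholder success rates, NOT measured results from Genesis World.
sim_success = [0.20, 0.45, 0.60, 0.80]
real_success = [0.15, 0.40, 0.50, 0.75]
print(f"sim-vs-real Pearson r = {pearson_r(sim_success, real_success):.3f}")
```

A value near 1 would mean that ranking checkpoints in simulation gives roughly the same ordering as ranking them on hardware, which is what makes simulation usable as a fast proxy in the iteration loop.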
Scaling Autonomous Experience
This is an active research effort at Genesis. Coding agents work so well today because environment creation and outcome measurement are trivial in the digital world. The same playbook should apply to robotics: models exploring and interacting across millions of iterations and thousands of tasks in parallel. Eventually, the knowledge gained from post-training can be distilled back into pretraining, enabling even stronger policies.
Xian always likes to say, “Only at Genesis,” and I usually reply, “In Genesis I trust.” Now I’m starting to see every layer of the stack coming together, and I couldn’t be more excited about the work ahead.
We are back again :) After three weeks of quiet building.
Introducing Genesis World 1.0, our latest simulation platform, the second release in our full-stack suite. Open-sourced.
Robotics is still bottlenecked by the 1× speed of the physical world. Every model, checkpoint, and data recipe eventually needs to be tested on physical hardware, slowly, expensively, and with limited coverage.
One hour in reality can become 100 days in simulation. That is how robotics model iteration moves from a wall-clock bottleneck to a compute problem.
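The ratio behind that claim is simple arithmetic, sketched below. The helper names and the idea of reaching the ratio through parallel environments running at some real-time factor are assumptions for illustration, not a description of how Genesis World is configured.

```python
import math

HOURS_PER_DAY = 24


def sim_speedup(real_hours: float, sim_days: float) -> float:
    """Simulated experience divided by wall-clock time, as a plain ratio."""
    return sim_days * HOURS_PER_DAY / real_hours


def parallel_envs_needed(speedup: float, realtime_factor: float = 1.0) -> int:
    """Environments needed to hit `speedup` if each one runs at
    `realtime_factor` times real time (hypothetical parameter)."""
    return math.ceil(speedup / realtime_factor)


speedup = sim_speedup(real_hours=1, sim_days=100)
print(speedup)                             # 2400.0
print(parallel_envs_needed(speedup))       # 2400 envs at 1x real time
print(parallel_envs_needed(speedup, 2.0))  # 1200 envs at 2x real time
```

So "one hour becomes 100 days" corresponds to an aggregate 2400x throughput over the physical world, which can come from parallelism, faster-than-real-time stepping, or both.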
To make this work, simulation has to be both fast and trustworthy.
Over the past year, we rebuilt the entire stack: a GPU-accelerated cross-platform compiler, penetration-free multi-physics contact solvers, unified rigid and deformable physics, and a photo-realistic renderer purpose-built for physical AI applications.
We built Nyx, a high-performance path-traced rendering engine for robotics applications.
Genesis World 1.0 achieves near-real-time performance thanks to our latest penetration-free IPC solver, which supports various types of deformables in addition to rigid bodies. It supports contact-rich, dexterous manipulation simulation across different embodiments: Unitree, Sharpa, Wuji, the Genesis hand, and various types of grippers.
Under the hood is Quadrants, our effort to push forward cross-platform GPU-accelerated computation. Quadrants started as a fork of Taichi, and we rebuilt most of the critical parts to optimize for simulation workloads, giving 10x faster launch time and up to 4.6x the runtime performance of the initial Genesis release.
Together, they bring us to an unprecedentedly low sim-to-real gap, enabling zero-shot real-to-sim model evaluation and much faster iteration of GENE.
All available today.
Genesis World 1.0: https://t.co/aknCM3eqws
Quadrants: https://t.co/uXqPNI4cb6
Nyx: https://t.co/R8j0djqGnV
One of the biggest lessons I’ve learned over the past year is how central operations are to robot learning — and how much great operations at scale really matter. Super excited to have @PallyKumar85 joining the team. The energy and expertise he brings will be invaluable as we continue to scale!
Today I'm joining @gs_ai_ as Head of Operations.
Data is what makes a general-purpose robot possible. Manufacturing is what makes it real. My job is the operating system that scales both, in lockstep.
Unitree Unveils: GD01, A Manned Transformable Mecha, from $650,000 👏
The world's first production-ready manned mecha. It can transform. It's a civilian vehicle. It weighs ~500kg with you inside.
Please everyone be sure to use the robot in a Friendly and Safe manner.