@NVIDIA is working on one of the hardest problems in Physical AI so you don’t have to: generalist robotic pick-and-place.
We are excited to introduce GraspGenX at #CVPR2026—a foundation model for robotic grasping that works out of the box for unknown robots, novel objects, and unseen environments.
Unlike Vision-Language-Action (VLA) models or dedicated grasp networks that require expensive, embodiment-specific training, GraspGenX is cross-embodiment and works zero-shot. You simply pass a "robot prompt" alongside an image of the object to generate actions.
🚀 Key Highlights:
1) Scaling: Trained on over 2 billion 6-DoF grasp rollouts entirely in physics simulation, a dataset size practically impossible to collect via real-world teleoperation.
2) Zero-Shot Transfer: Works out of the box for several common robot grippers widely used across the research community and industry.
3) Built for the Agentic Era: Features native MCP support, client-server architecture, and skills.md, allowing seamless integration into LLM/Agentic robotics workflows.
4) Full Pipeline Integration: Pair it with other open foundation models (like SAM3) and advanced motion solvers like cuRoboV2 for full deployment in entirely unknown environments.
If you are currently executing pick-and-place with a VLA or WAM, you can use GraspGenX to generate sim-verified trajectory data and inject it into your pipeline. No need to waste precious real-world engineering hours on data collection for standard manipulation tasks.
🌐Website: https://t.co/a7acm4Pw7N
💻Code: https://t.co/eYUYxCb7Jp
📄Paper: https://t.co/pDOVp0VJLL
📍CVPR Booth: Poster 619 on Jun 6 1:45 session at ExHall F
This work was led by the incredible @BeiningH (Princeton), in collaboration with a phenomenal team at NVIDIA: @erwincoumans, @yu_wei_chao, @balakumar_, @clembow, and Stan Birchfield.
#CVPR2026
Tired of re-training a new grasping model every time you have a new gripper?
Check out GraspGen-X. Website: https://t.co/jZ0sJ7bJVl
Come by poster 619 during the Jun 6 1:45 session at ExHall F.
This week at #CVPR2026, NVIDIA Research is presenting three papers across physical AI that offer groundbreaking solutions for training at scale across diverse applications:
→ GraspGen-X: the first foundation model for zero-shot grasping, trained on billions of simulated grasps
→ LCDrive: a model that replaces expensive text-based reasoning with compact latent representations
→ NitroGen: a generalized gameplay AI foundation model that harnesses NVIDIA Isaac GR00T to help train embodied agents
Learn more: https://t.co/H748YkAWS9