Excited to introduce PolaRiS, a real-to-sim recipe for turning short real-world videos into high-fidelity simulation environments for scalable and reliable zero-shot generalist policy evaluation.
https://t.co/nWcR6YuPf4
(1/N 🧵)
Remember action recognition? The days of trying to climb on Kinetics?👻
Announcing VideoNet, a CVPR 2026 Highlight 🎉 which revitalizes action recognition in the VLM era
Explore our data with this fun, interactive demo: https://t.co/W53aBi3QAX
(1/8) 🧵
We’re releasing OmniReset, a framework for training robot policies using large-scale RL and diverse resets for contact-rich, dexterous manipulation.
OmniReset pushes the frontier of robustness and dexterity, without any reward engineering or demonstrations.
Try the policies yourself in our interactive simulator! https://t.co/3hW3nYx2vD
(1/N 🧵)
🧵(6) DROID Eval
CoVer-VLA achieves 14% gains in task progress and 9% in success rate on the challenging red-team PolaRiS benchmark. In the pan cleaning task, π₀.₅ shows incorrect intent, grasping the pan handle. In contrast, CoVer-VLA correctly uses the sponge to scrub the pan.
Excited to share CoVer-VLA—a fully self-supervised action verifier for VLA models and the first work of my PhD! 🤖
We developed a lightweight verifier that assesses VLA action quality by aligning actions with text-visual features. Best of all? It requires zero failure data and scales seamlessly to large robotics datasets.
Beyond verification, CoVer learns aligned action representations via contrastive learning—opening doors for more downstream robotics tasks such as data curation and OOD detection! 🚀
Huge thanks to my amazing collaborators and advisors, and a special shout-out to @prodarhan for the help with PolaRiS! Truly an incredible platform.
Please check out more details in the post, and try to CoVer your VLA policy!
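A rough sketch of the alignment idea, not the CoVer-VLA implementation: assume the verifier embeds the text-visual context and each candidate action into a shared space, then scores candidates by cosine similarity. The embeddings, `pick_best_action`, and the use of scores to select among candidates are all hypothetical, for illustration only.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def pick_best_action(context_emb, action_embs):
    """Score each candidate action embedding against the context embedding
    and return (index of best candidate, list of all scores)."""
    scores = [cosine(context_emb, a) for a in action_embs]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores

# Hypothetical toy embeddings (a real verifier would produce these with learned encoders).
context = [1.0, 0.0, 1.0]
candidates = [
    [0.0, 1.0, 0.0],    # orthogonal to the context
    [1.0, 0.1, 0.9],    # closely aligned with the context
    [-1.0, 0.0, -1.0],  # opposite to the context
]
best, scores = pick_best_action(context, candidates)
```

Here the second candidate gets the highest score, so a verifier of this shape would prefer it over the other two.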
How can robot policies be trained to best leverage VLMs' CoT reasoning and in-context learning for generalization?
The key is Steerable Policies: vision-language-action models that can be flexibly controlled in many ways!
https://t.co/GvcvmY0JD5
1/9
Meet SceneSmith: An agentic system that generates entire simulation-ready environments from a single text prompt.
VLM agents collaborate to build scenes with dozens of objects per room, articulated furniture, and full physics properties.
We believe environment generation is no longer the bottleneck for scalable robot training and evaluation in simulation.
Website: https://t.co/UZklSkJe9V
👇🧵(1/8)
Building sim environments is hard -- especially ones that are useful for assessing progress on real-world performance. With PolaRiS, you can scan an environment for 2-5 minutes and use 2D Gaussian splatting + some tools to create a high-fidelity simulation.
@shreyasgite @chris_j_paxton haha we tried to do a very early variant of this here, using our vision policy to bootstrap state-based learning in new sim environments, to improve the vision policy downstream
https://t.co/a59pYNx1Gb
but still some work to do to fill all the gaps!
Evaluating robot policies is hard. Ideally, instead of testing every new policy on a real robot, you could test in simulation; but simulations rarely correlate well with real-world performance. In order to make good, useful simulations, you need to spend a great deal of time and effort.
That’s where PolaRiS comes in: it’s a toolkit that lets you take a short video of a real scene and turn it into a high-fidelity simulation. It provides what you need to build a good evaluation environment, and it “ships” with off-the-shelf environments that already show strong sim-to-real correlation, meaning that they can be used to inform policy performance.
@prodarhan and @KarlPertsch join us to talk about what they have built, why, and how you can use it.
Watch Episode #62 of RoboPapers, with @chris_j_paxton and @DJiafei, now!
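One common way to quantify "sim-to-real correlation" is the Pearson correlation between per-policy success rates in simulation and on the real robot. The sketch below assumes that reading; the numbers are made up for illustration and are not PolaRiS results.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical success rates for four policies (illustrative values only).
sim_success = [0.20, 0.45, 0.60, 0.80]
real_success = [0.15, 0.40, 0.65, 0.75]

r = pearson(sim_success, real_success)
print(f"sim-to-real Pearson r = {r:.3f}")
```

A value near 1 would mean that ranking policies in sim tends to match their ranking on the real robot, which is what makes a sim benchmark useful for iteration.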
A key bottleneck to fast deployment is understanding how you'll fail in a new env.
We need ways to spin up rapid simulations / evaluations of robot policies in new environments if we want to keep high iteration speed.
@prodarhan and @KarlPertsch are building something cool.