We’re officially opening up access to SAI – a new platform for reinforcement learning.
Submit models, compete in structured challenges, and learn from others — anytime, not just during a conference.
Try out SAI now -> 🌐 https://t.co/dqScdtNRLI
Documentation -> 📄 https://t.co/FxYDAchQYX
@sihing_guppy We ran this on a fine-tuned OpenPI policy on LIBERO Spatial. The full perturbation analysis covers 11 language corruption types:
https://t.co/TTWC1cgJM4
https://t.co/3tt9xg2bGZ
What happens when you remove a robot's ability to read its instructions? Almost nothing.
> Full model → 95% success
> Remove language → 94% (▼1 pt)
> Remove vision → 13% (▼82 pts)
Near-blind without vision. Near-indifferent to language.
If your evaluation only tests correct instructions, you're not measuring language. You're measuring vision.
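For anyone who wants to redo the arithmetic, here is a minimal sketch that turns the three success rates quoted above into drops measured in percentage points. The condition names and the `ablation_drops` helper are illustrative assumptions, not part of any released tooling.

```python
# Success rates quoted in the post above, in percent.
success = {"full": 95, "no_language": 94, "no_vision": 13}


def ablation_drops(rates, baseline="full"):
    """Return the drop, in percentage points, of each ablated condition
    relative to the baseline condition."""
    base = rates[baseline]
    return {name: base - rate for name, rate in rates.items() if name != baseline}


drops = ablation_drops(success)
for name, drop in drops.items():
    print(f"{name}: down {drop} points from the full model")
```

Reporting the gap in points rather than as a relative change keeps the two ablations directly comparable.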
A robot receives the language command "pick up the black bowl and place it on the plate" and executes it. We then replaced the command with: "My name is Franka."
No task. No object. No action verb. It picked up the bowl and placed it on the plate. The language instruction isn't being read.
The scene is being acted upon. The prompt is decoration.
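A rough sketch of the instruction-swap probe described above: run the same episodes under the original command and under the replacement, then compare success rates. Everything here is an assumption for illustration: `language_sensitivity` is a hypothetical helper, and `vision_only_stub` is a toy stand-in for a real policy rollout, not the OpenPI policy or the LIBERO environment.

```python
from typing import Callable, Dict


def language_sensitivity(
    run_episode: Callable[[str, int], bool],
    original: str,
    replacement: str,
    n_episodes: int = 50,
) -> Dict[str, float]:
    """Compare success rates under two instructions on the same seeds.

    run_episode(instruction, seed) must return True on task success.
    """
    def rate(instruction: str) -> float:
        wins = sum(run_episode(instruction, seed) for seed in range(n_episodes))
        return wins / n_episodes

    original_rate = rate(original)
    replacement_rate = rate(replacement)
    return {
        "original": original_rate,
        "replacement": replacement_rate,
        "gap": original_rate - replacement_rate,
    }


def vision_only_stub(instruction: str, seed: int) -> bool:
    # Toy policy (assumption): the outcome depends only on the seed,
    # never on the instruction text.
    return seed % 20 != 0


result = language_sensitivity(
    vision_only_stub,
    original="pick up the black bowl and place it on the plate",
    replacement="My name is Franka.",
    n_episodes=40,
)
print(result)
```

A gap near zero on matched seeds is the signature of a policy that is not reading its prompt; a language-grounded policy should show a large gap on the same probe.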
@sihing_guppy Full analysis with 11 perturbation types across 3 severity levels, including why the flat sensitivity curve is the real finding:
https://t.co/TTWC1cgJM4
We’ll be at @NVIDIAGTC next week in San Francisco.
@sihing_guppy and @MarcAlloul will be there talking about model transparency and evaluation for robotic policies: how to understand what your models are actually doing before you deploy them.
If you’re building or deploying robotic systems, send us a DM or come find us. We’d love to chat.