my sim2real dexterous in-hand manipulation work taught me one annoying thing: there's no shortcut to measuring the sim2real gap. you run rollouts on the real robot.
reset the object, run the policy, watch, mark pass/fail, reset. dozens per checkpoint. it eats days. 🧵
automating that loop isn't free: you need a reset policy + a success classifier per scene, and you only get binary success.
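rough sketch of what the reset / run / classify loop looks like once it's automated, assuming you already have a reset policy and a success classifier for the scene. every name here (`auto_eval`, `reset_scene`, `run_policy`, `classify_success`) is hypothetical and stubbed so the snippet runs; it's not any specific system's api. the wilson interval is my addition, to show how much a binary-only signal can tell you from a few dozen rollouts.

```python
import math
import random
from typing import Callable, Tuple


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial success rate (z=1.96 ~ 95%)."""
    if trials == 0:
        return (0.0, 1.0)
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return (max(0.0, center - half), min(1.0, center + half))


def auto_eval(
    reset_scene: Callable[[], None],          # hypothetical: learned/scripted reset policy
    run_policy: Callable[[], float],          # hypothetical: one rollout, returns final observation
    classify_success: Callable[[float], bool],  # hypothetical: per-scene success classifier
    n_rollouts: int = 50,
) -> Tuple[float, Tuple[float, float]]:
    """Run n_rollouts of reset -> rollout -> classify; return success rate and its interval."""
    successes = 0
    for _ in range(n_rollouts):
        reset_scene()
        final_obs = run_policy()
        successes += int(classify_success(final_obs))  # binary signal only
    return successes / n_rollouts, wilson_interval(successes, n_rollouts)


if __name__ == "__main__":
    # Stand-ins for the real robot: the "observation" is a fake final pose error.
    rng = random.Random(0)
    rate, (lo, hi) = auto_eval(
        reset_scene=lambda: None,
        run_policy=lambda: rng.random(),
        classify_success=lambda err: err < 0.7,
    )
    print(f"success rate {rate:.2f}, 95% interval [{lo:.2f}, {hi:.2f}]")
```

for scale: 35/50 successes gives a 95% interval of roughly 0.56 to 0.81, so even dozens of rollouts per checkpoint leave wide error bars.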
but the direction is right. what slows robot learning isn't always training. sometimes we just can't measure fast enough. pointing automation at eval itself is a lever almost nobody has pulled.