Teaching a robot to navigate should be as simple as recording a video with your phone.
In LoTIS, we study how an RGB video can guide a robot without camera calibration, poses, or robot-specific training.
Page, code, demo:
https://t.co/r82gOIYu2i
#RSS2026 #ICRA2026
1/9 🧵
@JasonJZLiu I see, thanks :) I assume the same, but I guess smaller joint torque errors might not always mean smaller EE errors? Either way, nice work!
@giffmana The hard part might be getting the ghost registered in the right place in the view.
Reminds me of some work we did recently on localizing a previous recording directly in the current view, maybe there’s some overlap: https://t.co/9jVXRWjp6A
@TobiasFischer11 Thanks for the insights :) Might try to apply this to a related work of ours (https://t.co/9jVXRWjp6A). Was it easy to get this to train stably?
@TobiasFischer11 Interesting! Given that, and the findings in Tab. 4 which suggest that weight sharing gives better perf, do you think training your model with a larger encoder would get even better performance? I.e., at the same compute as prior methods?
Try the model yourself on your own data at https://t.co/8UAS757bpy and let us know what you think :)
Find us at #RSS2026 in Sydney in July or the MM-SpatialAI workshop @ #ICRA2026 in Vienna next week!
Thanks to my amazing colleagues for making this project possible :)
9/9
Results
The largest gains appear when the setting becomes less like “follow the video as recorded”:
off-trajectory starts, camera mismatch, and especially backward traversal.
In sim and real-world experiments, LoTIS substantially outperforms prior baselines.
8/9
@maharshii Interesting, this is related to your earlier tweet, right? Does this also mean that, whether dynamic=True or False, the int/float inputs will not be compiled into the kernel and you can change them at runtime?