Two papers accepted @ismar 2025! Thanks to all collaborators! 😀
- A Study of Multimodal Pen+Gaze Interaction Techniques for Shape Point Translation in XR
- Facilitating the Exploration of Linearly Aligned Objects in Controller-Free 3D Environment with Gaze and Microgestures
Excited to finally announce that we have a bunch of upcoming research papers on eye and hand HCI for XR! 👁️🖐️
📝1 paper at #UIST2025
📝3 conference + 3 workshop papers at #ISMAR2025
Stay tuned for details.
Thanks to the amazing students at XI lab + collaborators🙏
Carnegie Mellon’s Robotics Institute runs a course on robot learning...
(For FREE 📌)
16-831 covers the full modern stack… the stuff actually being deployed right now:
Imitation learning. Behavior cloning. Reinforcement learning. Learning from human videos. Sim-to-real transfer. Vision-Language-Action models.
Not theory for its own sake.
Every topic is anchored to a real robotics problem: how do you get a robot to generalize to environments it’s never seen before?
All lecture slides are public.
This is THE Robotics Institute. The place that produced the researchers now leading the frontier labs.
Free. No login.
📌 [https://t.co/3hMerLAU3Z]
Follow for more robotics resources like this!
——
Weekly robotics and AI insights.
Subscribe free: https://t.co/9Nm01QUcw3
I call it UnRealityKit.
It's a UE5 + RealityKit bridge for Apple Vision Pro.
All those things you can't do in Unreal Engine for AVP due to Metal: eye tracking, standard gesture controls, room lighting & reflections, volumes, fully utilizing the M2/M5 chip... all possible now.
Given the strong community adoption and real-world deployment of MolmoAct2 on YAM, we're introducing zero-shot evaluation of MolmoAct2 Bimanual YAM in simulation. Now you can test out our models without a real-world robot and build on them!
Code: https://t.co/ooIbp8BRf9
Simulation built on ManiSkill!
Gaussian splats are incredible—until you want them to move. Normally, that means crazy 4D capture setups.
Not anymore. Here is how anyone can rig and animate static splats using Mixamo. Blender playback can get brutal, so big props to @playcanvas: their brilliant SuperSplat sequence import saves the day.
🧵 Watch the full tutorial breakdown in the comments!
🔥 #ICRA2026 Best Paper Finalist
The era of "robot VLA = single-arm gripper" is ending.
Introducing Dexora — the first open-source Vision-Language-Action system for dual-arm, dual-hand, 36-DoF dexterous manipulation.
🦾 Dual Arms
🖐️ Dual Hands
🎯 36 DoF Control
🌍 Open Source
Trained on:
• 100K simulated trajectories
• 10K real-world demonstrations
Dexora achieves:
✓ 90%+ success on basic manipulation
✓ Strong dexterous manipulation performance
✓ Cross-embodiment generalization
Our key hypothesis:
Train on the hardest embodiment. Transfer to simpler robots later.
Instead of scaling up gripper policies, we train directly in the most expressive action space and project downward to simpler embodiments.
This may be a practical path toward universal robot controllers.
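A rough way to picture "project downward": take the full dual-arm, dual-hand action and keep only what a simpler robot can execute. The sketch below is an illustration, not Dexora's method. The 36-DoF layout (6 joints per arm, 12 per hand), the `project_to_single_arm_gripper` helper, and the mean-flexion gripper mapping are all assumptions made up for this example.

```python
from typing import List, Sequence

# Hypothetical layout of a 36-DoF action vector (NOT taken from the Dexora paper):
#   [0:6]   left arm joints     [6:12]  right arm joints
#   [12:24] left hand joints    [24:36] right hand joints
ARM_DOF = 6
HAND_DOF = 12
FULL_DOF = 2 * ARM_DOF + 2 * HAND_DOF  # 36


def project_to_single_arm_gripper(action: Sequence[float], side: str = "right") -> List[float]:
    """Map a 36-DoF action to 6 arm joints plus 1 gripper closure value in [0, 1]."""
    if len(action) != FULL_DOF:
        raise ValueError(f"expected {FULL_DOF} values, got {len(action)}")
    if side not in ("left", "right"):
        raise ValueError("side must be 'left' or 'right'")

    arm_start = 0 if side == "left" else ARM_DOF
    hand_start = 2 * ARM_DOF + (0 if side == "left" else HAND_DOF)

    # Arm joints pass through unchanged.
    arm = list(action[arm_start:arm_start + ARM_DOF])

    # Collapse the finger joints into one scalar: mean flexion, clipped to [0, 1].
    hand = action[hand_start:hand_start + HAND_DOF]
    closure = min(1.0, max(0.0, sum(hand) / HAND_DOF))
    return arm + [closure]
```

Calling `project_to_single_arm_gripper(action, side="left")` on a 36-value list returns 7 values. A real projection would also have to handle differing kinematics and joint limits, which this sketch ignores.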
🎥 Demos: https://t.co/Qkvzl8d5Dl
📄 Paper: https://t.co/InWVHE9k8S
supervision just hit 40,000 GitHub stars!
it now powers over 6.5k open-source computer vision projects, including all my demos like basketball AI
link: https://t.co/xXMRaS4ejS
Robotics has no GitHub. That's genuinely insane in 2026.
Every team rebuilds the same parts from scratch. Same grippers. Same servo protocols. Same 2am debugging loops.
Software solved this 15 years ago. You don't rewrite a web server, you fork one.
Robotics never got that moment. So we're building it at @tnkrdotai.
GitHub for robots: 3D models, build guides, and full assemblies. Forkable, versioned, reusable.
👇 watch what that looks like
Train AI robots without writing a single line of code. 🤖
We just launched LeLab, the official graphical user interface for LeRobot built by @rabault_nicolas. It completely removes the command line from the robot learning workflow, taking you from raw hardware to autonomous movement visually.
If you've ever wanted to get into AI robotics but were held back by complex terminal setups, this is for you.
- Zero-Terminal Setup: Smart calibration with automatic USB port detection.
- Easy Data Collection: Teleoperate your robot and record a dataset.
- One-Click GPU Training: Don't have a massive local GPU? Scale your training instantly with Hugging Face Jobs right inside the app.
Just plug in your SO-ARM101 and start teaching your robot. We put together a complete, step-by-step video guide showing exactly how to get started and train your first policy.
Docs: https://t.co/PrUEIeaXKW
GitHub: https://t.co/SFuOiN8rjN
SAM 3D Body is a CVPR 2026 award candidate paper from @AIatMeta
model recovers a full 3D human body mesh from a single RGB image
you can run it automatically, or guide the reconstruction with masks and 2D keypoints
thx to @NielsRogge for awesome demo idea
Hugging Face just released LeRobot Humanoid
An open-source, low-cost (~$2.5k), 3D-printed humanoid built for robot learning and not just demos.
What’s cool is it’s a full stack release:
• hardware + CAD
• runtime & calibration
• sim environments
• identification tools
• training zoo for locomotion
Designed so anyone can build, break, repair, simulate, and train on a real humanoid.
New tutorial | Football player tracking using Ultralytics Platform ⚽
Learn how to train Ultralytics YOLO26 on a football player detection dataset and build a real-time player tracking pipeline for sports analytics.
Watch here ➡️ https://t.co/zEJ2tVP7um
#Ultralytics #YOLO26 #SportsAnalytics
Adapted my tennis AI pipeline for pickleball 🎾
- Audio-based shot detection
- Top-down court visualization
- Shot attribution
- Ball + player + pose tracking (RF-DETR + ViTPose)
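For a feel of how audio-based shot detection can work, here is a minimal sketch that flags frames whose short-time energy jumps well above the background level. It is an assumption about one plausible approach, not the pipeline from this post. The `detect_shots` name and the default frame size, threshold ratio, and minimum gap are hypothetical values chosen for illustration.

```python
from typing import List, Sequence


def detect_shots(
    samples: Sequence[float],
    sample_rate: int,
    frame_ms: float = 10.0,        # hypothetical frame size
    threshold_ratio: float = 4.0,  # hypothetical: energy must exceed 4x the noise floor
    min_gap_s: float = 0.25,       # hypothetical: ignore hits closer together than this
) -> List[float]:
    """Return timestamps (seconds) of frames whose energy spikes above the noise floor."""
    frame_len = max(1, int(sample_rate * frame_ms / 1000))

    # Mean squared amplitude per non-overlapping frame.
    energies = [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]
    if not energies:
        return []

    # Use the median frame energy as a crude noise-floor estimate.
    noise_floor = sorted(energies)[len(energies) // 2]
    threshold = threshold_ratio * max(noise_floor, 1e-12)

    shots: List[float] = []
    for idx, energy in enumerate(energies):
        t = idx * frame_len / sample_rate
        # Keep only the first loud frame of each burst.
        if energy > threshold and (not shots or t - shots[-1] >= min_gap_s):
            shots.append(t)
    return shots
```

The timestamps can then be matched against ball and player tracks for shot attribution. Real court audio has crowd noise and echoes, so a learned classifier or spectral features would likely be needed on top of this.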
there's no catch; SAM3 is open source and really good
one of the things it does really well is object tracking, even in crazy complex scenes like basketball
probably my favorite computer vision model ever
I am working on porting SAM models and their harness to Apple silicon.
Already seeing a 1.25x inference speedup on MLX with the sam2.1-small model. Quantized versions coming soon.
Repo: https://t.co/y4RnHhL5C1
Model: https://t.co/mxKoAhlIRm
Articraft for scene reconstruction 🏠
Last year we built LiteReality (https://t.co/GWoIV32GAc): an agentic pipeline for graphics-ready indoor recon, but objects weren't articulated.
once @Mattzh1314 showed how well Articraft 🦾 works with images, we combined them for scene recon 🔥