VLA-JEPA just dropped in LeRobot
What makes this model special is that it does not just learn what action to take from a given observation, it also leverages a JEPA world model to learn action-relevant dynamics.
During training, the VLA leverages V-JEPA2 by conditioning its predictor. This clever trick adds a world modeling objective to the training, which also allows pretraining on human videos.
At inference, the world model is dropped entirely, keeping only a standard VLA architecture: Qwen backbone and action head.
The demo here was fine-tuned on only 13 examples, which shows strong pretraining capability, and it runs in real time on @NVIDIARobotics DGX Spark!
VLA-JEPA is the first world model to be ported to LeRobot, and I feel like it won't be the last
@Thom_Wolf @ClementDelangue
Cleaning your home with a Cable-Driven Parallel Robot and LeRobot.
Meet Stringman, an open-source project by Nathaniel Nifong (@VMises76153). The goal is to completely automate picking up household laundry, toys, and trash.
It operates as an overhead Cable-Driven Parallel Robot (CDPR) using 4 wall-mounted motors and fishing line. Offloading the motor weight from the end-effector keeps the moving gripper assembly lightweight, incredibly cheap, and quiet while maximizing payload capacity. Most of the parts are 3D printed.
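To make the cable geometry concrete, here is a minimal sketch of the inverse kinematics for an idealized CDPR: each line length is just the straight-line distance from its anchor to the gripper. This is not Stringman's code. The room size, anchor coordinates, and function names are hypothetical, and it assumes a point-sized gripper with taut, non-stretching lines.

```python
import math

# Hypothetical anchor positions in metres: four motors near the ceiling
# corners of a 4 m x 3 m room, mounted 2.5 m above the floor.
ANCHORS = [
    (0.0, 0.0, 2.5),
    (4.0, 0.0, 2.5),
    (4.0, 3.0, 2.5),
    (0.0, 3.0, 2.5),
]


def cable_lengths(gripper, anchors=ANCHORS):
    """Length of each taut line from its anchor to a point-sized gripper."""
    return [math.dist(anchor, gripper) for anchor in anchors]


def spool_deltas(start, goal, anchors=ANCHORS):
    """Line each motor pays out (+) or reels in (-) to move start -> goal."""
    before = cable_lengths(start, anchors)
    after = cable_lengths(goal, anchors)
    return [new - old for old, new in zip(before, after)]


if __name__ == "__main__":
    centre = (2.0, 1.5, 1.0)
    # At the room centre all four lines have the same length.
    print([round(length, 3) for length in cable_lengths(centre)])
    # Lowering the gripper straight down means every motor pays out line.
    print(spool_deltas(centre, (2.0, 1.5, 0.5)))
```

A real controller would also have to handle line sag, spool diameter, and tension limits, none of which are modelled here.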
The technical stack:
- Model: Multitask Diffusion Transformer (DiT)
- Data: Learned via teleoperation and trained on just 400 local human demonstration examples
- Framework: Built and trained locally using LeRobot on a consumer GPU
This is a fantastic showcase of how accessible, community-driven hardware and open-source AI can rethink home automation constraints without needing a complex humanoid base.
LeLab now keeps itself up to date! 🦾
We shipped an in-app popup that alerts you the moment a newer version is on GitHub - plus some upgrades based on user feedback:
- Import any external/Hub model to run & rollout, no retraining
- Per-camera codec & backend control when recording
- Warning when your GPU is present but CUDA isn't being used
- Real camera names on Windows & Linux
- Smoother calibration & teleop (cameras stay off until you turn them on)
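For the curious, the "GPU present but CUDA not used" warning can be approximated in a few lines. This is a rough sketch, not LeLab's actual implementation: the function names are hypothetical, and it assumes that finding `nvidia-smi` on the PATH is a good enough sign that an NVIDIA GPU is installed.

```python
import shutil


def cuda_warning(gpu_present, cuda_available):
    """Return a warning string when a GPU exists but CUDA is not usable."""
    if gpu_present and not cuda_available:
        return "A GPU was detected, but CUDA is not being used."
    return None


def detect():
    """Best-effort detection; both checks are assumptions of this sketch."""
    # Assumption: an NVIDIA driver install puts nvidia-smi on the PATH.
    gpu_present = shutil.which("nvidia-smi") is not None
    try:
        import torch  # optional dependency, only needed for the CUDA check
        cuda_available = torch.cuda.is_available()
    except ImportError:
        cuda_available = False
    return gpu_present, cuda_available


if __name__ == "__main__":
    message = cuda_warning(*detect())
    if message:
        print(message)
```

Keeping the decision in `cuda_warning` separate from the detection makes the logic easy to check on a machine with no GPU at all.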
🛠️ Update now with:
uv tool install --force git+https://t.co/UjyT3dBkyA
GitHub: https://t.co/SFuOiN8rjN
Docs: https://t.co/PrUEIeaXKW
ETH has semester-long courses.
Stanford? Lecture halls. MIT? 826-page textbooks.
Hugging Face's robotics? 10 minutes…
>pip install lerobot
That's it.
This command gets you started:
Classical foundations. Imitation learning. Reinforcement learning. Foundation models. Real datasets. Real robots.
The gap between academic and accessible is closing faster than most people realize.
And that matters more than any single course.
[https://t.co/QVOjxYFPhd]
---
Weekly robotics and AI insights.
Subscribe free: https://t.co/9Nm01QUcw3
One question follows us to every robotics panel: Aren't you worried that open-sourcing robotics will just make it easier for ill-intentioned actors to deploy dangerous systems?
It's a fair question.
We think the answer is the opposite of what most people expect: The safest future for robotics is an open one. Here's why:
https://t.co/ymwjRH3Xi0
Many of you were interested in the technical details of the model, so here is a follow-up thread that digs deeper into VLA-JEPA.
1. Architecture
2. Training
3. Recipe for the demo
4. TLDR
🧵 below
TLDR: VLA-JEPA has a promising architecture that leverages V-JEPA2 and enables pretraining on human videos. It shows greater robustness to perturbations at inference, yet runs quite fast on consumer-grade GPUs.
Thanks to everyone involved in the implementation, especially the authors of VLA-JEPA: Jingwen Sun & @zhang_weny92997
Paper: https://t.co/nc3vc8pP3k
Docs: https://t.co/7diSKkvCyG
Any further questions? We will answer them in the comments!
For the demo, we collected 13 examples of a screwing task available here: https://t.co/br0pGozegK
We then started training from the https://t.co/eXjCWgxhQT pretrained weights, for 5k steps with a small batch size of 16.
This task is a good mix of picking + dexterity, but still simple enough not to require too many examples.
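As a quick sanity check on that recipe, the sketch below works out how many training samples 5k steps at batch size 16 actually draw. The step count and batch size come from the thread. The frames-per-example figure is purely hypothetical, since the dataset's frame count is not stated here.

```python
def samples_drawn(steps, batch_size):
    """Total training samples consumed over the whole run."""
    return steps * batch_size


def passes_over_data(steps, batch_size, num_frames):
    """Average number of times each frame is seen, given a frame count."""
    return samples_drawn(steps, batch_size) / num_frames


if __name__ == "__main__":
    total = samples_drawn(steps=5_000, batch_size=16)
    print(total)  # 80000

    # Hypothetical: 13 examples of 500 frames each (not from the thread).
    hypothetical_frames = 13 * 500
    print(passes_over_data(5_000, 16, hypothetical_frames))
```

So the run draws 80,000 samples in total; how many passes over the data that represents depends on how long each of the 13 examples really is.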
Hi @LeRobotHF team! Great to see VLA-JEPA landing in the framework. Huge thanks for the shoutout and for building such an accessible home for robotics research. Excited to keep pushing this direction together!
The latest haptic glove from LeRobot's @nepyope looks alive
Homunculus F³ drops next month.
Repost if you're hyped for new #lerobot hardware
#HF #lerobot #haptics
Kicking off a mini-series of videos with the @LeRobotHF SO-100 robot arm 🦾, showing the fun and accessible side of robotics, one experiment at a time! Here, I'm teleoperating the SO-100 to do pick-and-place w/ a conventional leader-follower setup. More to come soon!
Train AI robots without writing a single line of code.
We just launched LeLab, the official graphical user interface for LeRobot built by @rabault_nicolas. It completely removes the command line from the robot learning workflow, taking you from raw hardware to autonomous movement visually.
If you've ever wanted to get into AI robotics but were held back by complex terminal setups, this is for you.
- Zero-Terminal Setup: Smart calibration with automatic USB port detection.
- Easy Data Collection: Teleoperate your robot and record a dataset.
- One-Click GPU Training: Don't have a massive local GPU? Scale your training instantly with Hugging Face Jobs right inside the app.
Just plug in your SO-ARM101 and start teaching your robot. We put together a complete, step-by-step video guide showing exactly how to get started and train your first policy.
Docs: https://t.co/PrUEIeaXKW
GitHub: https://t.co/SFuOiN8rjN