The most comprehensive breakdown of Unitree you’ll find anywhere in the world.
Actuators, manufacturing, economics of deployment, hands, suppliers, etc.
Go read!
A shared protocol for physical AI where robots, data and human interaction are aligned, enabling intelligence to carry forward across deployments
@PrismaXai specifies the characteristics of high-quality robotics data, including how it should be captured and what makes it useful
You can't copy a human video onto a robot.
The poses drift, the contacts are guessed, the whole clip is physically wrong.
SUGAR treats that mess as a rough draft, not a script.
And that reframe is the actual thesis of the paper.
This is the wall that's been blocking "learn from human video" for real manipulation.
Most methods either avoid objects entirely.
Or they replay the exact recorded clip at test time.
SUGAR throws the reference away after training.
At inference it needs only the robot's own state and the object's pose.
The skill now lives in the policy, not in a clip it copies.
The detail: raw video gives physically impossible start states, like a hand clipping through a box.
So training never resets from those broken frames.
It seeds from a growing pool of sim-validated states instead.
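A minimal sketch of the "reset from a growing pool of validated states" idea, not the SUGAR implementation. The class name, the `hand_object_distance` field, the capacity, and the random-eviction rule are all assumptions made for illustration.

```python
import random


class ResetStatePool:
    """Pool of simulator states that training episodes may reset from."""

    def __init__(self, seed_states, capacity=1000, rng=None):
        self.capacity = capacity
        self.rng = rng or random.Random(0)
        self.states = list(seed_states)[:capacity]

    def sample(self):
        # Every reset draws from states that already passed validation.
        return self.rng.choice(self.states)

    def add_if_valid(self, state, is_valid):
        # States visited during rollouts grow the pool only if they pass the check.
        if not is_valid(state):
            return False
        if len(self.states) >= self.capacity:
            # Assumed eviction rule: overwrite a random entry once the pool is full.
            self.states[self.rng.randrange(len(self.states))] = state
        else:
            self.states.append(state)
        return True


def no_penetration(state):
    # Hypothetical validity check: the hand must not be inside the object.
    return state["hand_object_distance"] >= 0.0


pool = ResetStatePool([{"hand_object_distance": 0.05}])
rejected = pool.add_if_valid({"hand_object_distance": -0.02}, no_penetration)
accepted = pool.add_if_valid({"hand_object_distance": 0.01}, no_penetration)
```

The clipping frame (negative distance) never enters the pool, so no episode can start from it.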
Why it matters: 100 cheap videos per task, no teleop, no per-task reward tuning.
Success climbs as you add clips, 20 to 50 to 100.
On a Unitree G1 it runs zero-shot, recovers from failures, and holds up under shoves.
To be honest:
real-world poses still come from MoCap, and the policy is state-based with no vision yet.
The skills also stay coarse for now. The authors flag all three.
Still a clean "video as scalable data, not as a script" story.
Nice work to the SUGAR team.
🪜 What if humanoids could climb ladders and work on them straight out of simulation?
Meet LadderMan: a perceptive system for zero-shot sim-to-real ladder climbing and on-ladder manipulation.
Watch the humanoid climb, stabilize, and manipulate—all in one system. 🤖👇
For as long as we have built machines, we have borrowed from biology, and it has rarely let us down.
A humanoid can wear our shape, or it can inherit how our body moves through the world.
TAGA is a small but sharp example of the second kind.
When you cross a gap or hop across stepping stones, you don't scan the whole ground.
You look right where your next foot has to land.
That selective looking is called active gaze.
It's one of the oldest tricks in animal locomotion.
Most humanoids skip it.
They either stare forward with a depth camera and miss the ground under their feet.
Or they process a full local height map that's costly and doesn't scale.
TAGA learns to aim instead.
A small policy predicts where to look, then crops just that patch of terrain for the controller.
The crop uses differentiable sampling, so where to look is learned end to end through RL.
No gaze labels at all. The robot discovers the strategy on its own.
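To make "differentiable sampling" concrete, here is a rough sketch of a bilinear crop in plain Python. It is not TAGA's code. The function names, the patch size, and the grid spacing are assumptions. The point is that the crop center is a continuous value, so the patch changes smoothly as the gaze moves, and an autodiff framework could pass gradients through the same arithmetic.

```python
import math


def bilinear(height_map, x, y):
    """Sample a 2D grid (at least 2x2) at continuous (x, y); coordinates are clamped."""
    rows, cols = len(height_map), len(height_map[0])
    x = min(max(x, 0.0), cols - 1.0)
    y = min(max(y, 0.0), rows - 1.0)
    x0 = min(int(math.floor(x)), cols - 2)
    y0 = min(int(math.floor(y)), rows - 2)
    tx, ty = x - x0, y - y0
    top = (1 - tx) * height_map[y0][x0] + tx * height_map[y0][x0 + 1]
    bottom = (1 - tx) * height_map[y0 + 1][x0] + tx * height_map[y0 + 1][x0 + 1]
    return (1 - ty) * top + ty * bottom


def crop_patch(height_map, cx, cy, size=3, spacing=1.0):
    """Crop a size x size patch centered on the continuous gaze point (cx, cy)."""
    half = (size - 1) / 2.0
    return [
        [bilinear(height_map, cx + (j - half) * spacing, cy + (i - half) * spacing)
         for j in range(size)]
        for i in range(size)
    ]


# Toy terrain: height equals the x index, so the effect of moving the gaze is easy to read.
ramp = [[float(x) for x in range(5)] for _ in range(5)]
patch_a = crop_patch(ramp, 2.0, 2.0)
patch_b = crop_patch(ramp, 2.25, 2.0)  # gaze nudged a quarter cell to the right
```

Nudging the center by 0.25 cells shifts every sampled height by 0.25 here. A hard integer crop would not change at all for such a small move, which is why it gives no useful gradient.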
You can see it in the motion.
During a gap crossing, the gaze slides forward from the support foot to the far edge.
It shifts right before the robot commits.
Same anticipation you'd see in a person.
On a Jetson you can't process everything.
Learning to look only where it counts is what makes large terrain feasible on real hardware.
It also lands the headline number. A 1.2 m gap crossing on a Unitree G1.
That's the largest reported for a perceptive humanoid, about 50% past the previous best.
The goal was never to build a human shell. It's to make the machine carry a little of what makes us move the way we do.
Great work from the MarmotLab (NUS) and Zhejiang University team.
finally got manipulation to start working directly from human motion with RL
most important lesson: mimicking affordance is key. without it no manipulation is observed at all
also, mjx warp saved me 🙏, 9x faster throughput than pure mjx (10k sps -> 90k sps)
There are baby steps in the development of markerless motion capture and then there are quantum leaps!!!
This is the first-ever test of our m3 model, enabled by the timesync functionality of the latest Z-Cam model. m3 requires perfect synchronisation, which is delivered by the pixel-level synchronisation of the PixelLink feature on the Z-Cam E2 M4S Mark II.
This gives our algorithms the ultimate synced frame-by-frame ingest. PixelLink also dramatically increases the quality of our real-time solve out of the box right now. 💪🏻
The m3 model won’t be in production before the summer, but it delivers the most extraordinary results we have ever seen, even with close-up, occluded contact.
Visit https://t.co/k28EIZrBT4 for a demo of m3 or to see the data below.
#motioncapture #markerlessmotioncapture #ai #artificialintelligence
Long-horizon humanoid loco-manipulation has typically relied on retargeting human motion capture data to the robot. This requires human demonstrations for every new scenario and often forces the robot into human-like movement patterns.
The MotionDisco framework takes a different approach with no human motion data or retargeting. An LLM-guided evolutionary search proposes contact plans, which a kinodynamic optimizer then evaluates.
The optimizer returns feedback that guides the next proposals. The resulting trajectories are used to train an RL policy, which is deployed zero-shot on a real humanoid robot.
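The propose, evaluate, feed back loop described above can be sketched generically. This is not MotionDisco's code: the toy proposer stands in for the LLM, the toy cost stands in for the kinodynamic optimizer, and every name here (`evolve_contact_plans`, `CONTACTS`, `TARGET`) is hypothetical.

```python
import random

CONTACTS = ["left_foot", "right_foot", "left_hand", "right_hand"]
TARGET = ["left_foot", "right_hand", "right_foot"]  # toy stand-in for a feasible plan


def evolve_contact_plans(propose, evaluate, generations=5, keep=2):
    """Run the loop: propose plans, score them, pass the elites and their feedback back."""
    parents, feedback, best = [], [], None
    for _ in range(generations):
        scored = []
        for plan in propose(parents, feedback):
            cost, note = evaluate(plan)
            scored.append((cost, plan, note))
        scored.sort(key=lambda item: item[0])
        elite = scored[:keep]
        parents = [plan for _, plan, _ in elite]
        feedback = [note for _, _, note in elite]
        if elite and (best is None or elite[0][0] < best[0]):
            best = (elite[0][0], elite[0][1])
    return best


def make_toy_propose(rng, population=6):
    def propose(parents, feedback):
        # A real proposer would read `feedback`; this toy version only mutates parents.
        if not parents:
            return [[rng.choice(CONTACTS) for _ in TARGET] for _ in range(population)]
        children = [list(p) for p in parents]  # carry the elites forward unchanged
        while len(children) < population:
            child = list(rng.choice(parents))
            child[rng.randrange(len(child))] = rng.choice(CONTACTS)
            children.append(child)
        return children
    return propose


def toy_evaluate(plan):
    # Stand-in for the optimizer: the cost counts the steps that miss the toy target.
    bad = [i for i, (a, b) in enumerate(zip(plan, TARGET)) if a != b]
    return float(len(bad)), f"infeasible steps: {bad}"


best_cost, best_plan = evolve_contact_plans(make_toy_propose(random.Random(0)), toy_evaluate)
```

The search only ever sees a cost and a feedback note per plan, so no human motion data enters the loop.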
GRAIL scales humanoid loco-manipulation without teleoperation or motion capture. It generates robot-compatible demonstrations from 3D assets and video priors, creating over 20K whole-body motion sequences spanning manipulation, sitting, and terrain traversal.
Just in time: the #ICRA2026 Best Paper Award winner in Robot Learning uses 3D camera pose to improve policy learning.
Pretty straightforward: robots live in a 3D world.
Image credit: @CSProfKGD
.@Inspire_robots' RH56H1 Series Dexterous Hand is an ultra-compact and precise robotic hand for humanoid applications.
It enables safe human-robot collaboration and delivers high‑precision operational performance.
In 2025, INSPIRE-ROBOTS delivered 10,000 units of its dexterous hands, reflecting a strong commitment to real‑world scale.
Learn more:
https://t.co/de3TdpyENC
#humanoid #humanoidrobot #humanoidrobotics #humanoids
We spent the last 3 months working with a Unitree G1
Humanoids are not fit for real world deployment YET
The real problems no one talks about:
1️⃣ They overheat after 15 min of work & need to rest for 45 min
2️⃣ They're weak; G1 can't carry a full tray or spin a valve
3️⃣ Battery doesn't last longer than 1-2 hours
4️⃣ Hand dexterity is insufficient (hard to grasp a bottle)
5️⃣ Teleop lag makes precision work unreliable
6️⃣ Hardware + deployment costs aren't justified by the ROI
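A quick back-of-the-envelope on point 1, using only the 15 min work / 45 min rest figures from the list. The 8-hour shift is an assumption, and battery swaps and charging are ignored.

```python
def duty_cycle(work_min, rest_min):
    """Fraction of wall-clock time the robot is actually working."""
    return work_min / (work_min + rest_min)


def working_minutes(shift_min, work_min, rest_min):
    """Minutes of work in a shift made of repeated work/rest cycles."""
    full_cycles, remainder = divmod(shift_min, work_min + rest_min)
    return full_cycles * work_min + min(remainder, work_min)


cycle = duty_cycle(15, 45)                 # 0.25
per_shift = working_minutes(8 * 60, 15, 45)  # 120 minutes in an assumed 8-hour shift
```

That is a 25% duty cycle, or about 2 hours of actual work in an 8-hour shift, before battery limits are counted.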
Humanoids today can’t replace full-time labor 👷
They can’t be deployed without additional labor
But here's what matters 👇
Hardware is improving very fast 🦿
In 12–24 months the picture will look different
H2 & Walker S2 are a much better fit for industrial tasks
They still won’t replace full-time labor 👷
But they can do the most dangerous tasks humans do
And minimize workplace fatalities ⚰️
Despite the challenges, I'm bullish on humanoids 🦾
They will do everything humans can do
They will be more flexible than specialized robots
They will be mass-produced at low cost
But we are still very early
Build your humanoid in 7 days! 🦾
By combining NVIDIA Isaac Sim with @ROBOTIS DYNAMIXEL-Q (QDD) actuators, you get high backdrivability and precise torque control that seamlessly bridges the Sim2Real gap.
See how we fast-tracked physical deployment:
https://t.co/mnY0ipBQCd