C's Robotics Paper Notes

@RoboReading

Paper notes recorded by @chongzitazhang, who is interested in robot learning

busy with paperwork recently

Joined March 2024

1 Following

3.3K Followers

512 Posts

C's Robotics Paper Notes @RoboReading

3 months ago

ULTRA: Unified Multimodal Control for Autonomous Humanoid Whole-Body Loco-Manipulation
https://t.co/HOxC0WXIL6

1. RL-based retargeting
2. multi-modal command student distillation and finetuning, so that it can switch between goal reaching vs. reference tracking, and mocap vs. depth

C's Robotics Paper Notes @RoboReading

3 months ago

HoMMI: Learning Whole-Body Mobile Manipulation from Human Demonstrations
https://t.co/mYuPNafG1a

for cross-embodiment transfer:
1. use look-at points in 3D space instead of direct head states
2. mask out arms
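
The look-at idea can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function names, the x-forward / y-left / z-up frame, the yaw/pitch convention, and the fixed gaze distance are all assumptions made here. The point is that a 3D look-at target does not depend on head height, so a robot whose head sits elsewhere can recompute its own gaze angles from the same target.

```python
import math

def look_at_point(head_pos, yaw, pitch, distance):
    """Project a gaze ray from the head and return the 3D point it looks at.

    Assumed convention: x forward, y left, z up; positive pitch looks up.
    """
    direction = (
        math.cos(pitch) * math.cos(yaw),
        math.cos(pitch) * math.sin(yaw),
        math.sin(pitch),
    )
    return tuple(p + distance * d for p, d in zip(head_pos, direction))

def gaze_angles(head_pos, target):
    """Recover (yaw, pitch) that make a head at head_pos look at target."""
    dx, dy, dz = (t - p for t, p in zip(target, head_pos))
    yaw = math.atan2(dy, dx)
    pitch = math.atan2(dz, math.hypot(dx, dy))
    return yaw, pitch

# Hypothetical numbers: a human head at 1.7 m, a robot head at 1.2 m.
target = look_at_point((0.0, 0.0, 1.7), yaw=0.0, pitch=0.0, distance=1.0)
robot_yaw, robot_pitch = gaze_angles((0.0, 0.0, 1.2), target)
```

Here the human looks straight ahead, while the shorter robot has to pitch its head up to look at the same point, which is exactly the information a raw head state would get wrong.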

C's Robotics Paper Notes @RoboReading

3 months ago

Watch Your Step: Learning Semantically-Guided Locomotion in Cluttered Environment
https://t.co/gwrC59fcnG

this work shows you can train a policy and use a semantic map to avoid stepping on valuable things.

this is why I believe locomotion should all be mapping-based.
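
One way to picture a semantic map feeding into foot placement is a per-cell penalty lookup. The sketch below is only an assumption about what such a term could look like, not the paper's method: the grid layout, the labels, the penalty values, and the function names are all hypothetical.

```python
import math

def foot_cell(x, y, resolution):
    """Map a foot position in metres to integer grid indices."""
    return math.floor(x / resolution), math.floor(y / resolution)

def step_penalty(semantic_map, foot_xy, resolution, penalties, default_label="floor"):
    """Return the penalty for stepping on the cell under the foot.

    semantic_map: dict mapping (i, j) grid cells to a label string.
    penalties:    dict mapping label strings to a non-negative cost.
    Cells missing from the map are treated as default_label (an assumption).
    """
    cell = foot_cell(foot_xy[0], foot_xy[1], resolution)
    label = semantic_map.get(cell, default_label)
    return penalties.get(label, 0.0)

# Hypothetical 0.5 m grid with one "valuable" cell next to plain floor.
semantic_map = {(0, 0): "floor", (1, 0): "valuable"}
penalties = {"floor": 0.0, "valuable": 1.0}
cost = step_penalty(semantic_map, (0.6, 0.1), 0.5, penalties)
```

A term like this could be subtracted from a locomotion reward or used to score candidate footholds; which of the two the paper does is not stated in the note.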

C's Robotics Paper Notes @RoboReading

3 months ago

Training is multi-stage. They first train a mimic policy and distill it into a base policy. Then they use AIP (AMP for interactions) to make the policy generalize instead of memorizing kinematic references. DF needs mocap, so they also distill this into vision policies.

C's Robotics Paper Notes @RoboReading

3 months ago

https://t.co/zxbzsfLN0W
LESSMIMIC: Long-Horizon Humanoid Interaction with Unified Distance Field Representations

use a distance field (DF) as the representation for HOI. each link's traj can be defined by DF + grad of DF + vel_norm + vel_tangent.
(tbc)
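
To make the four quantities concrete, here is a minimal sketch for a single link near a sphere. It is an illustration under assumptions, not the paper's code: the sphere stands in for an arbitrary object, and "vel_norm" / "vel_tangent" are read here as the link velocity components along and perpendicular to the DF gradient. All function names are hypothetical.

```python
import math

def sphere_sdf(p, center, radius):
    """Signed distance from point p to a sphere, plus the unit gradient.

    The gradient is undefined at the sphere center; p != center is assumed.
    """
    diff = [pi - ci for pi, ci in zip(p, center)]
    dist = math.sqrt(sum(d * d for d in diff))
    grad = [d / dist for d in diff]
    return dist - radius, grad

def df_features(p, v, center, radius):
    """Return (distance, gradient, normal speed, tangential velocity)."""
    d, n = sphere_sdf(p, center, radius)
    v_normal = sum(vi * ni for vi, ni in zip(v, n))           # speed along the gradient
    v_tangent = [vi - v_normal * ni for vi, ni in zip(v, n)]  # remainder, parallel to the surface
    return d, n, v_normal, v_tangent

# Hypothetical link at (2, 0, 0) moving with velocity (1, 1, 0) near a unit sphere.
d, n, v_n, v_t = df_features((2.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 0.0, 0.0), 1.0)
```

With these inputs the link is 1 m from the surface, moving away from it at 1 m/s while sliding sideways at 1 m/s. The features are expressed relative to the object surface rather than in world coordinates, which is presumably what lets one representation cover many objects.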

C's Robotics Paper Notes @RoboReading

3 months ago

https://t.co/vFLZ4jbuCt
Pro-HOI: Perceptive Root-guided Humanoid-Object Interaction

trained with mimic + contact commands, but deployed with a planner that replaces the reference.

C's Robotics Paper Notes @RoboReading

3 months ago

https://t.co/jLTggspdpA

The GrandTour dataset, with tons of odometry methods benchmarked in tons of environments.

C's Robotics Paper Notes @RoboReading

4 months ago

https://t.co/kPMg3oDKjN

Learning Humanoid End-Effector Control for Open-Vocabulary Visual Loco-Manipulation

For humanoid EE goal reaching, on top of WBC pose tracking, a neural model is trained to further correct small biases near the goal.

C's Robotics Paper Notes @RoboReading

4 months ago

VIGOR: Visual Goal-In-Context Inference for Unified Humanoid Fall Safety
https://t.co/AqGBstnSHN

visual humanoid fall recovery

1) flat terrain human sparse reference
2) terrain-aware ref adjustment
3) keypoint tracking teacher
4) distilled to visual student without keypoint obs

C's Robotics Paper Notes @RoboReading

4 months ago

https://t.co/Gz7NSpUw2X

meshmimic: Geometry-Aware Humanoid Motion Learning through 3D Scene Reconstruction

kinda like CRISP + OmniRetarget, but:
1) uses human edges and excludes depth edges in the image to extract contacts, which is smart;
2) uses polygonal primitives for clean scenes

C's Robotics Paper Notes @RoboReading

4 months ago

Kids today are not citing wococo :(

C's Robotics Paper Notes @RoboReading

4 months ago

https://t.co/k4YUDH3vSS

Humanoid Agile Object Interaction Control via Dynamics-Aware World Model

predict object states from proprioceptive obs, use them to transform the object template pointcloud, and embed the transformed pcl as policy input.
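
The "transform the template" step is a plain rigid transform of a fixed point set by the predicted pose. A minimal sketch, assuming the predicted object state is available as a 3x3 rotation matrix plus a translation (the note does not say which parametrization is used, and the function name is hypothetical):

```python
def transform_points(points, rotation, translation):
    """Apply p' = R @ p + t to every point of a template point cloud.

    points:      iterable of (x, y, z) tuples in the object frame.
    rotation:    3x3 rotation matrix as nested lists (assumed pose format).
    translation: (x, y, z) object position.
    """
    out = []
    for p in points:
        rotated = [sum(rotation[i][j] * p[j] for j in range(3)) for i in range(3)]
        out.append(tuple(r + t for r, t in zip(rotated, translation)))
    return out

# Hypothetical predicted pose: 90 degree yaw, object at (1, 2, 3).
yaw_90 = [[0.0, -1.0, 0.0],
          [1.0,  0.0, 0.0],
          [0.0,  0.0, 1.0]]
template = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
posed = transform_points(template, yaw_90, (1.0, 2.0, 3.0))
```

The posed points would then go through whatever point-cloud encoder produces the embedding; that part is not described in the note and is left out here.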

C's Robotics Paper Notes @RoboReading

4 months ago

@MagnusAnde16653 in the link?

C's Robotics Paper Notes @RoboReading

4 months ago

https://t.co/Am7kYha9Q0 reward engineering + state machine engineering + nice progress-based rewards for reference-free humanoid agile climbing learning.

C's Robotics Paper Notes @RoboReading

4 months ago

https://t.co/LUJmdDluuJ
Recurrent-Depth VLA

Another way of doing VLM->diffusion, or think-act model, or whatever it is called; much faster, with adaptive computation while reasoning in latent space.

C's Robotics Paper Notes @RoboReading

4 months ago

https://t.co/lZVNdPNcvu

When predicting policy feasibility, you can also use a reconstruction head to detect OOD inputs.
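
A common way to turn a reconstruction head into an OOD detector is to threshold its reconstruction error. The sketch below assumes that recipe; the mean-plus-k-sigma threshold, the value of k, and the function names are choices made here for illustration, not details from the paper.

```python
import statistics

def mse(x, x_hat):
    """Mean squared error between an input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

def calibrate_threshold(in_dist_errors, k=3.0):
    """Pick a threshold from errors measured on in-distribution data."""
    return statistics.mean(in_dist_errors) + k * statistics.pstdev(in_dist_errors)

def is_ood(x, x_hat, threshold):
    """Flag an input as OOD when its reconstruction error exceeds the threshold."""
    return mse(x, x_hat) > threshold

# Hypothetical calibration errors collected on training-like inputs.
threshold = calibrate_threshold([0.01, 0.02, 0.03])
flag_good = is_ood([1.0, 1.0], [1.0, 1.1], threshold)  # small error
flag_bad = is_ood([1.0, 1.0], [0.0, 0.0], threshold)   # large error
```

The appeal is that the reconstruction head shares the encoder with the feasibility predictor, so the OOD signal comes almost for free at inference time.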

C's Robotics Paper Notes @RoboReading

4 months ago

https://t.co/IwwhlfewQJ

humanoid badminton.

motion tracking -> distillation into goal-conditioned policies -> AMP and physics-based finetuning.

C's Robotics Paper Notes @RoboReading

4 months ago

https://t.co/GnK2HJiSoK

C's Robotics Paper Notes @RoboReading

4 months ago

HUSKY: Humanoid Skateboarding System via Physics-Aware Whole-Body Control

1) models skateboard dynamics in simulation
2) phase-based design for multiple skills
3) human reference and kinematics-based guidance
4) sys-id sim2real

No perception yet

C's Robotics Paper Notes @RoboReading

4 months ago

https://t.co/3oMFka4ETh
A Systematic Study of Data Modalities and Strategies for Co-training Large Behavior Models for Robot Manipulation

Another paper from TRI on how to train your VLA
