@gavrielstate @RotekSong @xbpeng4 Yes, the reward function needs to be improved in the future. Learning to avoid getting hit is difficult for the character since there are many different reward weights to be balanced. Perhaps we should consider other techniques, e.g., transfer learning and policy fine-tuning.
We've released the code for ASE, along with pre-trained models and awesome gladiator motion data from @reallusion!
https://t.co/0BU3Q6253h
We will also be presenting this work at SIGGRAPH today, 2:15pm in East Building, Ballroom A/B.
Can we train a *single* policy that can control many different robots to walk? The idea behind GenLoco is to learn to control many different quadrupeds, including new ones not seen in training.
https://t.co/XDYODDXykR
Code https://t.co/Q4cEs5OcQ3
Video https://t.co/xyHbeD52Ve