Learning-based Humanoid Control for Human-like Locomotion
Goal of this project
Human-like locomotion is a complex control problem, as it requires both robustness and stylized motion. Reinforcement learning (RL) has been widely adopted to address model uncertainties and environmental discrepancies, achieving robust locomotion. However, the black-box nature of neural networks hampers the debugging process, and stylized motions often require tedious reward shaping. In this regard, motion priors can be an effective alternative to mitigate these challenges. Specifically, AMP (Adversarial Motion Priors) provides style rewards based on reference motion data. However, most motion datasets do not include toe joints, leading to unnecessary toe movements or ineffective toe usage during locomotion. Therefore, the goal of this project is to develop a method that enables relatively small but functionally important toe joints to be utilized naturally and effectively, even when they are absent from the motion prior data.
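For context, the style reward in the original AMP formulation (Peng et al., 2021) is computed from a discriminator D that scores state transitions. It is reproduced below as a reference point only: this page does not state which variant is used in this project, so the symbols and constants are those of the paper, not confirmed project settings.

```latex
% Least-squares discriminator objective (reference transitions -> +1, policy transitions -> -1);
% the gradient-penalty term used in the paper is omitted here for brevity.
\min_{D}\;
  \mathbb{E}_{(s_t, s_{t+1}) \sim \mathcal{M}}\!\left[\big(D(s_t, s_{t+1}) - 1\big)^2\right]
+ \mathbb{E}_{(s_t, s_{t+1}) \sim \pi}\!\left[\big(D(s_t, s_{t+1}) + 1\big)^2\right]

% Style reward given to the policy for a transition (s_t, s_{t+1})
r^{\text{style}}_t = \max\!\left[\,0,\; 1 - 0.25\,\big(D(s_t, s_{t+1}) - 1\big)^2\right]
```

Here \mathcal{M} denotes the reference motion dataset and \pi the policy. Because D only sees the features present in the reference data, joints that are missing from it (such as toes) receive no style signal, which is the gap this project targets.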
KAPEX
Unitree G1
Algorithm validation (G1)
0) Vanilla policy
1) AMP policy
2) Retargeted motion prior (about 20 s)
3) AMP policy: task reward (weight 0.7) + style reward (weight 0.3)
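The weighting in 3) can be sketched as a simple convex combination of the two reward terms. This is a minimal illustration, not the project's implementation: the function names `style_reward` and `combined_reward` are hypothetical, and the style term assumes the least-squares form from the AMP paper (Peng et al., 2021), which this page does not confirm.

```python
import numpy as np

# Weights taken from the page: task reward 0.7, style reward 0.3.
W_TASK = 0.7
W_STYLE = 0.3


def style_reward(disc_output):
    """Style reward from discriminator outputs.

    Assumption: the AMP paper's form, r = max(0, 1 - 0.25 * (d - 1)^2),
    where d is the discriminator output for a state transition.
    """
    d = np.asarray(disc_output, dtype=np.float64)
    return np.maximum(0.0, 1.0 - 0.25 * (d - 1.0) ** 2)


def combined_reward(task_reward, disc_output, w_task=W_TASK, w_style=W_STYLE):
    """Weighted sum of the task reward and the discriminator-based style reward."""
    r_task = np.asarray(task_reward, dtype=np.float64)
    return w_task * r_task + w_style * style_reward(disc_output)


if __name__ == "__main__":
    # Three example transitions: motion judged reference-like (d = 1),
    # ambiguous (d = 0), and clearly policy-like (d = -1).
    task = np.array([1.0, 0.5, 1.0])
    disc = np.array([1.0, 0.0, -1.0])
    print(combined_reward(task, disc))
```

With these weights the task term dominates, so the style term shapes how the commanded velocity is tracked without overriding the command itself.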
Forward
Backward
Left
Right
Rotate in place
Forward turn
KAPEX (humanoid with toe joints)
Forward turn
Forward
Advantages
A relatively short reference motion, paired with a generative model, is enough to learn the style, which makes the approach data-efficient
The style reward reduces the manual reward tuning needed to obtain human-like motion
Disadvantages
Toe movements are not represented in the motion data, so additional task rewards are required to exploit the active toe joints effectively
Training with multiple motion priors is limited by mode collapse
Future work
Task rewards for active toe joints
Diffusion policy for multi-motion (running, jumping, ...)
References
Peng, Xue Bin, et al. "AMP: Adversarial motion priors for stylized physics-based character control." ACM Transactions on Graphics (ToG) 40.4 (2021): 1-20.
Goodfellow, Ian, et al. "Generative adversarial networks." Communications of the ACM 63.11 (2020): 139-144.
Araujo, Joao Pedro, et al. "Retargeting matters: General motion retargeting for humanoid motion tracking." arXiv preprint arXiv:2510.02252 (2025).