Abstract
Humans master complex motor skills such as walking and running through a sophisticated blend of learning and adaptation. Replicating this level of skill acquisition in musculoskeletal humanoid systems with traditional Reinforcement Learning (RL) methods is challenging due to their intricate control dynamics and over-actuation. Inspired by human developmental learning, we address these challenges with a double-curriculum approach: a three-stage task curriculum (balance, walk, run) combined with a morphology curriculum of up to three stages (4-year-old, 12-year-old, adult) that mimics physical growth. This combination enables the agent to efficiently learn robust gaits that adapt to varying velocities and perturbations. Extensive analysis and ablation studies show that our method outperforms state-of-the-art exploration techniques for musculoskeletal systems. Our approach is agnostic to the underlying RL algorithm and requires no reward tuning, demonstrations, or knowledge of the specific muscular architecture, marking a notable advance in the field.
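The double curriculum described above can be sketched as a nested stage schedule. This is a minimal illustration, not the paper's implementation: the stage names come from the abstract, but the nesting order (tasks inside each morphology), the `train_stage` callback, and all function names are assumptions.

```python
# Hypothetical stage lists taken from the abstract; the actual environments,
# transition criteria, and RL algorithm are not specified there.
MORPHOLOGY_STAGES = ["4-year-old", "12-year-old", "adult"]
TASK_STAGES = ["balance", "walk", "run"]


def double_curriculum_schedule(morph_stages, task_stages):
    """Yield (morphology, task) pairs: for each body model, progress
    through the full task curriculum before "growing" the morphology.
    The nesting order is an assumption for illustration."""
    for morph in morph_stages:
        for task in task_stages:
            yield morph, task


def train(policy, schedule, train_stage):
    """Fine-tune a single policy across all curriculum stages.

    `train_stage(policy, morph, task)` is a placeholder for one stage of
    training with any underlying RL algorithm, returning the updated policy.
    """
    for morph, task in schedule:
        policy = train_stage(policy, morph, task)
    return policy
```

Because the same policy is threaded through every stage, skills learned on the simpler bodies and tasks initialize learning on the harder ones, which is the intuition behind the curriculum.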
A single policy performs balance, walking, and running. It also generalizes to perturbations, obstacles, and varying target velocities within an episode, even though none of these conditions were encountered during training.
Demonstrated conditions: balance; walk; run; increasing velocity; increasing and decreasing velocity; walk and run with perturbations along the x-axis; walk and run with perturbations along the z-axis; stairs; gaps.