ANYmal-Parkour
Learning Agile Navigation for Quadrupedal Robots
Science Robotics March 2024: link
Pre-print: link
Performing agile navigation with four-legged robots is a challenging task due to the highly dynamic motions, contacts with various parts of the robot, and the limited field of view of the perception sensors. In this paper, we propose a fully-learned approach to train such robots and conquer scenarios that are reminiscent of parkour challenges. The method involves training advanced locomotion skills for several types of obstacles, such as walking, jumping, climbing, and crouching, and then using a high-level policy to select and control those skills across the terrain. Thanks to our hierarchical formulation, the navigation policy is aware of the capabilities of each skill, and it will adapt its behavior depending on the scenario at hand. Additionally, a perception module is trained to reconstruct obstacles from highly occluded and noisy sensory data and endows the pipeline with scene understanding. Compared to previous attempts, our method can plan a path for challenging scenarios without expert demonstrations, offline computation, a priori knowledge of the environment, or taking contacts explicitly into account. While these modules are trained from simulated data only, our real-world experiments demonstrate successful transfer on hardware, where the robot navigates and crosses consecutive challenging obstacles at speeds of up to two meters per second.
The approach consists of three learning-based modules, trained purely in simulation (a rough sketch of how they interact follows the list below):
Perception Module
Reconstructs the environment from noisy and highly occluded point cloud measurements.
Locomotion Module
Contains a catalog of advanced locomotion skills that can overcome challenging obstacles.
Navigation Module
Guides the robot through the scene towards the goal by selecting which skill to activate and providing intermediate commands.
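As a rough sketch of how these three modules could be chained at run time (class and method names are illustrative assumptions, not the released implementation):

```python
# Illustrative sketch of the three-module pipeline. At every control step the
# perception latent and the robot state feed the navigation policy, which picks
# a skill and a position-based command; the chosen locomotion skill turns that
# command into joint-level actions. Names and interfaces are assumptions.

class ParkourPipeline:
    def __init__(self, perception, navigation, skill_policies):
        self.perception = perception          # point cloud -> latent scene code
        self.navigation = navigation          # latent + state + goal -> skill, command
        self.skill_policies = skill_policies  # e.g. {"walk": ..., "crawl": ...,
                                              #       "climb_up": ..., "climb_down": ...,
                                              #       "jump": ...}

    def step(self, point_cloud, robot_state, goal):
        # 1) Perception: compress noisy, occluded measurements into a latent code.
        scene_latent = self.perception.encode(point_cloud, robot_state)

        # 2) Navigation: select a skill and an intermediate position/heading/time command.
        skill_name, command = self.navigation.act(scene_latent, robot_state, goal)

        # 3) Locomotion: the selected skill outputs joint-level targets.
        joint_targets = self.skill_policies[skill_name].act(robot_state, command)
        return joint_targets
```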
Locomotion Module
The locomotion module is trained with the position-based command formulation described in arxiv.org/abs/2209.12827. Rather than tracking a velocity command, the policy is given a local target position and orientation to reach in a given amount of time. This allows the network to modulate the movement of the robot more freely to overcome challenging obstacles.
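For intuition, here is a minimal sketch of what such a position-based command could look like; the field names and units are assumptions for illustration, not the exact interface used in the paper:

```python
# Sketch of a position-based locomotion command (field names are assumptions).
# The skill policy is rewarded for reaching the target pose before the timer
# expires, rather than for tracking an instantaneous velocity command.

from dataclasses import dataclass

@dataclass
class PositionCommand:
    target_xy: tuple[float, float]  # target base position in the robot frame [m]
    target_heading: float           # desired yaw at the target [rad]
    time_left: float                # remaining time to reach the target [s]

# Example: reach a point 1.5 m ahead, facing forward, within 3 seconds.
command = PositionCommand(target_xy=(1.5, 0.0), target_heading=0.0, time_left=3.0)
```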
We train five separate skills to climb up and down high obstacles, jump over gaps, crawl under obstacles, and walk over rough terrain:
Walk
Crawl
Climb Up
Climb Down
Jump
Navigation Module
The navigation module guides the robot in the scene to reach a target location in a given time using the latent representation of the perception module. It is trained in a hierarchical fashion with the rest of the pipeline: the outer loop consists of the navigation policy, and the inner loop runs the locomotion module. At every time step, it chooses which low-level skill to activate and sends intermediate position, heading, and time commands. Thanks to this formulation, it is fully aware of the capabilities and limitations of each skill, and it will adapt its output depending on the scenario at hand.
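One way to picture the navigation policy's hybrid action space is sketched below; the network layout, dimensions, and names are assumptions for illustration rather than the released architecture:

```python
# Sketch of a navigation policy with a hybrid action space (assumed layout):
# a shared trunk processes the perception latent, robot state, and goal; a
# discrete head picks one of the five skills and a continuous head emits the
# intermediate position/heading/time command.

import torch
import torch.nn as nn

SKILLS = ["walk", "crawl", "climb_up", "climb_down", "jump"]

class NavigationPolicy(nn.Module):
    def __init__(self, latent_dim, state_dim, goal_dim, hidden_dim=256):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(latent_dim + state_dim + goal_dim, hidden_dim), nn.ELU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ELU(),
        )
        self.skill_head = nn.Linear(hidden_dim, len(SKILLS))  # skill logits
        self.command_head = nn.Linear(hidden_dim, 4)          # x, y, heading, time

    def forward(self, scene_latent, robot_state, goal):
        h = self.trunk(torch.cat([scene_latent, robot_state, goal], dim=-1))
        return self.skill_head(h), self.command_head(h)

# Usage: pick the most likely skill and its command for one observation.
policy = NavigationPolicy(latent_dim=64, state_dim=48, goal_dim=3)
latent, state, goal = torch.zeros(1, 64), torch.zeros(1, 48), torch.zeros(1, 3)
logits, command = policy(latent, state, goal)
skill = SKILLS[logits.argmax(dim=-1).item()]
```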
Perception Module
The perception module is trained to reconstruct the environment and endows the pipeline with scene understanding. Thanks to this formulation, it can cope with the highly occluded measurements and noisy state estimates that result from the dynamic motions of the base. Compared to classical approaches such as elevation mapping, it is able to extrapolate beyond the visible areas and can reconstruct the world in 3D.
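A minimal sketch of such a reconstruction-style perception module, under the assumption of a simple encoder-decoder supervised with ground-truth geometry from simulation (architecture, losses, and names are illustrative, not the paper's exact design):

```python
# Sketch of a reconstruction-style perception module (assumed design): an
# encoder compresses noisy, occluded measurements into a latent code, and a
# decoder is trained to reconstruct the full local 3D scene (here a local
# occupancy volume) from that code. Downstream, the navigation policy only
# consumes the latent.

import torch
import torch.nn as nn

class SceneAutoencoder(nn.Module):
    def __init__(self, input_dim, latent_dim=64, grid_cells=16 * 16 * 8):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 256), nn.ELU(),
            nn.Linear(256, latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ELU(),
            nn.Linear(256, grid_cells),  # logits of a local occupancy volume
        )

    def forward(self, measurement):
        latent = self.encoder(measurement)
        return latent, self.decoder(latent)

# Training signal: in simulation the true local geometry is known, so the
# decoder can be supervised even for regions the sensors never observed.
model = SceneAutoencoder(input_dim=512)
measurement = torch.randn(1, 512)                       # flattened noisy depth features
target_occupancy = torch.randint(0, 2, (1, 2048)).float()
latent, logits = model(measurement)
loss = nn.functional.binary_cross_entropy_with_logits(logits, target_occupancy)
```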
Reconstruction comparison for two example scenes: raw measurement vs. classical elevation map vs. our reconstruction.