Barkour: Benchmarking Animal-level Agility with Quadruped Robots

Ken Caluwaerts*, Atil Iscen*, Wenhao Yu*, J. Chase Kew*, Tingnan Zhang*, Daniel Freeman¹, Kuang-Huei Lee¹, Lisa Lee¹, Stefano Saliceti¹, Vincent Zhuang¹, Nathan Batchelor, Steven Bohez, Federico Casarini, Jose Enrique Chen, Omar Cortes, Erwin Coumans^, Adil Dostmohamed, Gabriel Dulac-Arnold, Alejandro Escontrela^, Erik Frey, Roland Hafner, Deepali Jain, Bauyrjan Jyenis, Yuheng Kuang, Edward Lee, Linda Luu, Ofir Nachum, Ken Oslund, Jason Powell, Diego Reyes, Francesco Romano, Fereshteh Sadeghi, Ron Sloat, Baruch Tabanpour, Daniel Zheng, Michael Neunert, Raia Hadsell, Nicolas Heess, Francesco Nori, Jeff Seto, Carolina Parada, Vikas Sindhwani, Vincent Vanhoucke, and Jie Tan
(*equal contribution    ¹core contributor    ^work done while at Google)

Google DeepMind


Animals have evolved various agile locomotion strategies, such as sprinting, leaping, and jumping. There is growing interest in developing legged robots that move like their biological counterparts and show various agile skills to navigate complex environments quickly. Despite this interest, the field lacks systematic benchmarks for measuring the agility of control policies and hardware.

We introduce the Barkour benchmark, an obstacle course to quantify agility for legged robots. Inspired by dog agility competitions, it consists of diverse obstacles and a time-based scoring mechanism. This encourages researchers to develop controllers that not only move fast, but do so in a controllable and versatile way.
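To make the idea of a time-based score concrete, here is a minimal Python sketch. This page does not give the scoring rules, so every number below (the allotted time, the per-obstacle penalty, the per-second overtime penalty) and every name (`ScoringConfig`, `agility_score`) is an illustrative assumption rather than the benchmark's definition. The sketch only shows the general shape: start from a perfect score, subtract for failed or skipped obstacles and for time over a target, and clamp the result to the range 0 to 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoringConfig:
    """Hypothetical scoring parameters; none of these values come from this page."""
    allotted_time_s: float = 10.0          # assumed target time for a full run
    obstacle_penalty: float = 0.1          # assumed deduction per failed or skipped obstacle
    overtime_penalty_per_s: float = 0.01   # assumed deduction per second over the target


def agility_score(run_time_s: float, obstacles_failed: int,
                  cfg: ScoringConfig = ScoringConfig()) -> float:
    """Return a score in [0, 1]; 1.0 means every obstacle cleared within the target time."""
    if run_time_s < 0 or obstacles_failed < 0:
        raise ValueError("run_time_s and obstacles_failed must be non-negative")
    # Only time beyond the allotted time is penalized.
    overtime_s = max(0.0, run_time_s - cfg.allotted_time_s)
    score = (1.0
             - cfg.obstacle_penalty * obstacles_failed
             - cfg.overtime_penalty_per_s * overtime_s)
    # Clamp so that very slow or very unreliable runs bottom out at zero.
    return min(1.0, max(0.0, score))
```

Under these assumed parameters, a run that finishes inside the target time with no failures scores 1.0, while a fast run that skips obstacles is still penalized. That combination is what rewards controllers that are both quick and controllable.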

To set strong baselines, we present two methods for tackling the benchmark. In the first approach, we train specialist locomotion skills using on-policy reinforcement learning methods and combine them with a high-level navigation controller. In the second approach, we distill the specialist skills into a Transformer-based generalist locomotion policy, named Locomotion-Transformer, that can handle various terrains and adjust the robot's gait based on the perceived environment and robot states. Using a custom-built quadruped robot, we demonstrate that our method can complete the course at half the speed of a dog. 

We hope that our work represents a step towards creating controllers that enable robots to reach animal-level agility.


Open-Sourced Assets

Barkour course MuJoCo model

MuJoCo-XLA training example: https://github.com/google/brax/tree/main/brax/experimental/barkour 

Robot CAD model (OnShape): original model, simplified model for URDF export


The Barkour Benchmark for Robot Agility

Barkour course design, composed of four different obstacles: pause tables (at the start and end), weave poles, an A-frame, and a broad jump.

Policy Learning

Training and deployment pipelines for the Locomotion Transformer architecture.

We learn individual skills using RL in simulation, and then use simulation rollouts from individual skills to create a dataset and train a causal Transformer-based generalist policy. 

Left: At deployment time, a high-level navigation controller guides the real robot (sim-to-real) through the obstacle course by sending commands to the Locomotion Transformer policy. We compare with an expert-designed policy switching mechanism that selects specialist policies based on the robot’s position.

Right: Measuring robustness of the different policies across a large number of runs on the Barkour benchmark.
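The expert-designed switching baseline mentioned in the caption above selects a specialist policy from the robot's position. The sketch below shows one simple way such a selector could look. The zone coordinates, the skill names, and the names `Zone`, `PositionSwitcher`, and `make_stub_policy` are all hypothetical and introduced only for illustration; the page does not describe the actual implementation. The stub policies return a label instead of motor commands so that the example stays self-contained.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# A policy maps an observation to an output; the stubs here return a skill label.
Policy = Callable[[Sequence[float]], str]


@dataclass(frozen=True)
class Zone:
    """Axis-aligned rectangle on the course floor, in meters (hypothetical layout)."""
    name: str
    x_min: float
    x_max: float
    y_min: float
    y_max: float

    def contains(self, x: float, y: float) -> bool:
        return self.x_min <= x < self.x_max and self.y_min <= y < self.y_max


class PositionSwitcher:
    """Returns the policy of the first zone containing the robot, else a default."""

    def __init__(self, zone_policies: List[Tuple[Zone, Policy]], default: Policy):
        self._zone_policies = list(zone_policies)
        self._default = default

    def select(self, x: float, y: float) -> Policy:
        for zone, policy in self._zone_policies:
            if zone.contains(x, y):
                return policy
        return self._default


def make_stub_policy(skill_name: str) -> Policy:
    """Stand-in for a trained specialist; a real policy would output motor targets."""
    def policy(observation: Sequence[float]) -> str:
        return skill_name
    return policy


# Hypothetical course layout: coordinates are invented for this example.
switcher = PositionSwitcher(
    [
        (Zone("weave_poles", 0.0, 2.0, 0.0, 5.0), make_stub_policy("slalom")),
        (Zone("a_frame", 2.0, 5.0, 0.0, 5.0), make_stub_policy("slope")),
        (Zone("broad_jump", 2.0, 5.0, 5.0, 7.0), make_stub_policy("jump")),
    ],
    default=make_stub_policy("walk"),
)
```

Calling `switcher.select(x, y)` at each control step returns the specialist to run. A hand-written table like this has to be updated whenever the course layout changes, which is one motivation for a single generalist policy that adapts to the perceived terrain.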