Learning and Adapting Agility Skills
by Transferring Experience

under review

Overview

Legged robots have enormous potential in their range of capabilities, from navigating unstructured terrains to high-speed running. However, designing robust controllers for highly agile dynamic motions remains a substantial challenge for roboticists. Reinforcement learning (RL) offers a promising data-driven approach for automatically training such controllers, but exploration in these high-dimensional, underactuated systems remains a significant hurdle for enabling legged robots to learn performant, naturalistic, and versatile agility skills. We propose a framework for training complex robotic skills by transferring experience from existing controllers to jumpstart learning new tasks. To leverage controllers we can acquire in practice, we design this framework to be flexible in terms of their source: the controllers may have been optimized for a different objective under different dynamics, or may require different knowledge of the surroundings, and thus may be highly suboptimal for the target task. We show that our method enables learning complex agile jumping behaviors, navigating to goal locations while walking on hind legs, and adapting to new environments. We also demonstrate that the agile behaviors learned in this way are graceful and safe enough to deploy in the real world.

Bipedal Navigation

In this task, the robot must stand up on its hind legs and navigate to the red square marked on the ground.

Jumping

In this task, the robot must clear randomly placed hurdles on the ground. For real-world evaluation, we use three fixed hurdle placements, with the first and second hurdles placed 2.4, 2.8, or 3.6 meters apart, and run 8 trials per placement. To facilitate the real-world experiments, we only mark each hurdle's location with tape on the ground, but we terminate the episode if any part of the robot's body comes within a centimeter above the tape. We define success as the robot clearing both hurdles. Below we show the videos of these trials, separated into failures and successes.
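As a rough illustration of the termination rule and success criterion described above, here is a minimal sketch. It is not the evaluation code used in the experiments. It assumes the robot's body is summarized per frame as a set of (x, z) points, that the taped strip has a hypothetical half-width of 5 cm, and that "clearing" a hurdle means never coming within a centimeter of the ground over the strip and ending the trial past the last strip. The names `touches_hurdle` and `trial_succeeds` are made up for this example.

```python
from typing import Iterable, Sequence, Tuple

# (x, z): forward position and height above the ground, in meters (assumed representation)
Point = Tuple[float, float]

CLEARANCE_M = 0.01  # "within a centimeter" from the text


def touches_hurdle(body_points: Iterable[Point], hurdle_x: float,
                   half_width: float = 0.05) -> bool:
    """True if any body point over the taped strip is within CLEARANCE_M of the ground.

    half_width is a hypothetical strip half-width; the source does not give one.
    """
    return any(
        abs(x - hurdle_x) <= half_width and z <= CLEARANCE_M
        for x, z in body_points
    )


def trial_succeeds(frames: Sequence[Sequence[Point]],
                   hurdle_xs: Sequence[float],
                   half_width: float = 0.05) -> bool:
    """Assumed success rule: no frame triggers termination, and in the final
    frame every body point lies beyond the last hurdle strip."""
    if not frames or not hurdle_xs:
        return False
    for points in frames:
        if any(touches_hurdle(points, hx, half_width) for hx in hurdle_xs):
            return False  # the episode would be terminated here
    last_edge = max(hurdle_xs) + half_width
    return all(x > last_edge for x, _ in frames[-1])
```

Under these assumptions, a trial for the 2.4 meter placement would pass hurdle positions such as `[1.0, 3.4]` (the 1.0 meter offset of the first hurdle is arbitrary) together with the tracked body points for each frame.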

Failures

Successes

Failures (None)

Successes

Failures

Successes