We investigate the use of prior knowledge of human and animal movement to learn reusable locomotion skills for real legged robots. Our approach builds upon previous work on imitating human or dog Motion Capture (MoCap) data to learn a movement skill module. Once learned, this skill module can be reused for complex downstream tasks. Importantly, due to the prior imposed by the MoCap data, our approach does not require extensive reward engineering to produce sensible and natural-looking behavior at the time of reuse. This makes it easy to create well-regularized, task-oriented controllers that are suitable for deployment on real robots. We demonstrate how our skill module can be used for imitation, and train controllable walking and ball dribbling policies for both the ANYmal quadruped and OP3 humanoid. These policies are then deployed on hardware via zero-shot simulation-to-reality transfer.
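As a rough illustration of what the reuse step could look like in code, the sketch below assumes a frozen low-level skill decoder pre-trained by MoCap imitation and exposed through a latent command interface, with a separate task policy trained to output latent commands for the downstream task. All names (SkillDecoder, TaskPolicy, the checkpoint path, and the layer sizes) are hypothetical and not taken from the paper.

# Hypothetical sketch of skill-module reuse (illustrative names only).
import torch
import torch.nn as nn

class SkillDecoder(nn.Module):
    """Frozen low-level module: maps a latent command + proprioception to joint targets."""
    def __init__(self, latent_dim, proprio_dim, num_joints):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + proprio_dim, 256), nn.ELU(),
            nn.Linear(256, num_joints),
        )

    def forward(self, latent, proprio):
        return self.net(torch.cat([latent, proprio], dim=-1))

class TaskPolicy(nn.Module):
    """High-level policy trained on the downstream task: outputs latent commands."""
    def __init__(self, obs_dim, latent_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 256), nn.ELU(),
            nn.Linear(256, latent_dim),
        )

    def forward(self, task_obs):
        return self.net(task_obs)

decoder = SkillDecoder(latent_dim=32, proprio_dim=60, num_joints=12)
decoder.load_state_dict(torch.load("skill_module.pt"))  # weights from MoCap imitation (hypothetical path)
for p in decoder.parameters():
    p.requires_grad = False                              # keep the skill prior fixed during reuse

policy = TaskPolicy(obs_dim=90, latent_dim=32)           # only this part is trained on the new task

def act(task_obs, proprio):
    """Deployment-time action: task policy picks a latent, frozen decoder produces joint targets."""
    with torch.no_grad():
        z = policy(task_obs)
        return decoder(z, proprio)

Because only the small task policy is optimized while the decoder stays fixed, the MoCap prior continues to regularize the motions produced for the new task.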
The inset shows the original dog MoCap reference clip.
The inset shows the original human MoCap reference clip.
The inset shows the original dog MoCap reference clip.
The robot follows a fixed velocity command until it reaches the end of the workspace, after which it turns on the spot until it faces the center and walks forward again.
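A minimal sketch of the kind of command logic this demonstration could use, assuming a hypothetical walking-controller interface that accepts (forward, lateral, yaw) velocity commands; the workspace radius, speeds, and function names are illustrative assumptions, not values from the paper.

# Hypothetical command loop for the fixed-velocity demo (illustrative constants).
import numpy as np

WORKSPACE_RADIUS = 2.0   # metres, assumed workspace size
FORWARD_SPEED = 0.5      # m/s, assumed forward command
TURN_RATE = 0.5          # rad/s, assumed turn-in-place command

def velocity_command(robot_xy, robot_yaw):
    """Walk forward until the workspace edge, then turn on the spot to face the centre."""
    to_centre = -robot_xy
    heading_error = np.arctan2(to_centre[1], to_centre[0]) - robot_yaw
    heading_error = np.arctan2(np.sin(heading_error), np.cos(heading_error))  # wrap to [-pi, pi]

    if np.linalg.norm(robot_xy) > WORKSPACE_RADIUS and abs(heading_error) > 0.2:
        return (0.0, 0.0, np.sign(heading_error) * TURN_RATE)  # turn in place toward the centre
    return (FORWARD_SPEED, 0.0, 0.0)                           # walk straight ahead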
The robot receives forward, lateral and yaw velocity commands from a user via an off-camera handheld joystick.
The robot receives forward, lateral and yaw velocity commands from a user via an off-camera handheld joystick.
A tracking controller generates velocity commands which are then executed by the walking controller in order to follow a slalom trajectory. Midway through the trial we introduce an obstacle that the robot has to walk over, in order to test robustness.
The robot is required to dribble the ball towards a shifting target location as indicated by the center of the red disc. The agent has learned to use both front and hind legs to control the ball.
The robot is required to dribble the ball towards a shifting target location as indicated by the center of the red disc. The controller is able to control the ball quite well, even though little effort went into modelling the contact dynamics of the ball.
The robot is required to dribble the ball towards a shifting target location as indicated by the center of the red disc. The agent has learned to strafe around the ball in order to kick it in the right direction.
Sampling latent commands from the prior for the trained skill module results in temporally-extended behavior, with the robot maintaining balance and walking around randomly.
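One way such latent sampling could be implemented is an autoregressive draw from the skill module's Gaussian prior at every control step, so that consecutive latent commands are correlated in time; the AR(1) form and its coefficient below are assumptions for illustration, not taken from the paper.

# Hypothetical AR(1) sampling of latent commands from a unit-Gaussian prior.
import numpy as np

def sample_latent_trajectory(num_steps, latent_dim, alpha=0.95, rng=None):
    """Temporally correlated latents: z_t = alpha * z_{t-1} + sqrt(1 - alpha^2) * eps_t."""
    rng = rng or np.random.default_rng()
    z = rng.standard_normal(latent_dim)
    latents = []
    for _ in range(num_steps):
        eps = rng.standard_normal(latent_dim)
        z = alpha * z + np.sqrt(1.0 - alpha**2) * eps  # stationary marginal stays N(0, I)
        latents.append(z.copy())
    return np.stack(latents)

# Each latent in the trajectory would be fed to the frozen skill decoder at its control step,
# producing the temporally-extended random walking behavior described above.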
We use procedurally generated terrain to improve the robustness of the walking controller to small obstacles and slopes. The target velocities are randomly sampled according to the process described in the text.
Optimizing only for the task reward results in erratic and inefficient behavior that, while effective at solving the task, is not suited for deployment on hardware.