Human Motion Control of Quadrupedal Robots using Deep Reinforcement Learning

Sunwoo Kim¹, Maks Sorokin², Jehee Lee¹, Sehoon Ha²

¹ Seoul National University ² Georgia Institute of Technology

Robotics: Science and Systems 2022

Primary Contact: sunwoo@mrl.snu.ac.kr

Abstract

A motion-based control interface promises flexible robot operations in dangerous environments by combining user intuitions with the robot's motor capabilities. However, designing a motion interface for non-humanoid robots, such as quadrupeds or hexapods, is not straightforward because different dynamics and control strategies govern their movements. We propose a novel motion control system that allows a human user to perform various motor tasks seamlessly on a quadrupedal robot. We first retarget the captured human motion into the corresponding robot motion with proper semantics using supervised learning and post-processing techniques. Then we apply motion imitation learning with curriculum learning to develop a control policy that can track the given retargeted reference. We further improve the performance of both motion retargeting and motion imitation by training a set of experts. As we demonstrate, a user can execute various motor tasks using our system, including standing, sitting, tilting, manipulating, walking, and turning, on simulated and real quadrupeds. We also conduct a set of studies to analyze the performance gain induced by each component.

Demo Videos

Pushing tasks (Live mode, x1.5 speed)

In this task, the user aims to pull the tennis ball (left) and push the dominoes (right). The user leans backwards and then stretches his right arm to make the robot push the dominoes on a tall box. After finishing the task, the user and the robot express "thank you for watching" by bowing.

Push the box to the target (Live mode, x1.5 speed)

In this task, the user aims to push the box to the target position (X mark) using the robot. The box is initially located out of the robot's reach, so the user first commands the robot to approach until it is close enough to reach the box. The user is able to complete the task by combining visual observation of the robot's state with his intuition about how to complete the task.

Composite tasks (Replay mode)

In the composite task, the robot has to complete three different subtasks: first, hit a tennis ball located higher than the robot's standing height; second, hit a bone hanging high in the air; last, dodge a tennis ball. The user interactively controls the robot with his own motion to complete the task.

Anger expression after failing to reach the box (Live mode)

The robot fails to reach the box. The user expresses his anger by hitting the ground.

System Overview

Ablation Studies

Consistency ablation

Ablation study for the post-processing stage of motion retargeting. The white robot shows the reference motion and the black robot shows the physically simulated motion. The video shows the effect of applying consistency correction to the output of the motion retargeting network.

Curriculum learning ablation

Ablation study for curriculum learning. The white robot shows the reference motion and the black robot shows the physically simulated motion. The video shows the effect of the double curriculum learning introduced in our study.
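For readers unfamiliar with curriculum learning, the general idea can be sketched as a scheduler that raises task difficulty once the policy succeeds often enough. This is a generic illustration only, not the double curriculum used in our study: the class name `SuccessRateCurriculum`, the difficulty levels, the promotion threshold, and the window size are all hypothetical.

```python
from collections import deque


class SuccessRateCurriculum:
    """Generic success-rate curriculum (illustrative, not the paper's method)."""

    def __init__(self, levels, promote_at=0.8, window=20):
        self.levels = list(levels)            # difficulty values, easiest first
        self.promote_at = promote_at          # success rate needed to advance
        self.history = deque(maxlen=window)   # most recent episode outcomes
        self.level = 0                        # index into self.levels

    @property
    def difficulty(self):
        return self.levels[self.level]

    def record(self, success):
        """Store one episode outcome and advance a level if warranted."""
        self.history.append(bool(success))
        window_full = len(self.history) == self.history.maxlen
        rate = sum(self.history) / len(self.history)
        is_last = self.level == len(self.levels) - 1
        if window_full and rate >= self.promote_at and not is_last:
            self.level += 1
            self.history.clear()  # re-measure success at the new difficulty
```

In a training loop, `difficulty` would scale whatever makes the task harder (for example, how aggressive the reference motion is), and `record` would be called once per episode.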

Domain randomization ablation

Ablation study for domain randomization. We demonstrate its effect by performing the sitting task with each policy.
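As background, domain randomization generally means resampling physical parameters for each training episode so that the learned policy tolerates the mismatch between simulation and the real robot. The following is a minimal sketch of that idea under assumed settings: the parameter names and ranges are hypothetical and are not the values used in our experiments.

```python
import random
from dataclasses import dataclass


@dataclass
class PhysicsParams:
    ground_friction: float
    base_mass_scale: float
    motor_strength_scale: float
    control_latency_s: float


# Hypothetical (low, high) ranges, chosen only for illustration.
RANGES = {
    "ground_friction": (0.5, 1.25),
    "base_mass_scale": (0.8, 1.2),
    "motor_strength_scale": (0.8, 1.2),
    "control_latency_s": (0.0, 0.04),
}


def sample_physics_params(rng):
    """Draw one set of physics parameters uniformly from RANGES."""
    values = {name: rng.uniform(lo, hi) for name, (lo, hi) in RANGES.items()}
    return PhysicsParams(**values)


# Usage: resample at the start of every training episode.
rng = random.Random(0)
params = sample_physics_params(rng)
```

A policy trained without this resampling sees a single fixed set of parameters, which is the kind of contrast an ablation like this one compares.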

Bibtex

@INPROCEEDINGS{kim2022human,
    AUTHOR    = {Sunwoo Kim and Maks Sorokin and Jehee Lee and Sehoon Ha},
    TITLE     = {Human Motion Control of Quadrupedal Robots using Deep Reinforcement Learning},
    BOOKTITLE = {Proceedings of Robotics: Science and Systems},
    YEAR      = {2022},
    ADDRESS   = {New York, USA},
    MONTH     = {June}
}

Supplementary Video