Unsupervised Skill Discovery via Recurrent Skill Training

Zheyuan Jiang*, Jingyue Gao*, Jianyu Chen

Abstract

The ability to discover diverse skills without external reward functions is beneficial in reinforcement learning research. Previous unsupervised skill discovery approaches mainly focus on maximizing information-theoretic rewards for different skills in parallel. Although these approaches have produced impressive results, we find that such a parallel training procedure inherently discourages exploration, which leads to poor state coverage and restricts the diversity of the learned skills. In this paper, we propose a novel framework, Recurrent Skill Training (ReST), to address this issue. Instead of training all skills in parallel, ReST trains different skills one after another recurrently. We conduct experiments on a number of challenging 2D navigation and robotic locomotion environments. Theoretical and empirical results show that our proposed approach outperforms previous parallel training approaches in terms of state coverage and skill diversity.
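
To make the recurrent schedule concrete, below is a minimal runnable sketch on a toy discrete state space. The tabular policy, the count-based discriminator, and the DIAYN-style intrinsic reward r(s, z) = log q(z|s) - log p(z) are illustrative stand-ins chosen for this sketch, not the implementation used in the paper; only the outer training loop, which cycles through skills one after another instead of sampling them i.i.d. at every rollout, reflects the recurrent schedule described above.

import numpy as np

NUM_SKILLS, NUM_STATES, NUM_CYCLES, STEPS = 4, 16, 25, 50
rng = np.random.default_rng(0)

# Per-skill preferences over states; the softmax of a row is that skill's
# state-visitation distribution (a tabular stand-in for a policy).
prefs = rng.normal(scale=0.1, size=(NUM_SKILLS, NUM_STATES))
# Count-based discriminator: normalizing visit counts over z gives q(z | s).
counts = np.ones((NUM_SKILLS, NUM_STATES))

def visit_dist(z):
    p = np.exp(prefs[z] - prefs[z].max())
    return p / p.sum()

def train_skill(z, lr=0.5):
    """One training step for skill z against the shared discriminator."""
    states = rng.choice(NUM_STATES, size=STEPS, p=visit_dist(z))
    np.add.at(counts[z], states, 1.0)              # discriminator update
    q_z_given_s = counts[z] / counts.sum(axis=0)   # q(z | s)
    # DIAYN-style intrinsic reward: r(s, z) = log q(z|s) - log p(z).
    reward = np.log(q_z_given_s) - np.log(1.0 / NUM_SKILLS)
    # Crude REINFORCE-style update: reinforce visited states by their reward.
    np.add.at(prefs[z], states, lr * reward[states])

# Recurrent schedule: skills are trained one after another, cycling
# through the whole set repeatedly, rather than training all skills
# in parallel as in prior information-theoretic approaches.
for _ in range(NUM_CYCLES):
    for z in range(NUM_SKILLS):
        train_skill(z)

for z in range(NUM_SKILLS):
    print(f"skill {z} concentrates on state {visit_dist(z).argmax()}")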

Skills Discovered by ReST

ReST discovers a wide range of dynamic skills on various MuJoCo tasks. Representative skills for each environment are listed below.

HalfCheetah

Rolling Forward, Trying to Sit, Rolling Backward, Flipped Running Forward, Running, Flipped Running Backward, Running Backward, Running Backward Faster

Hopper

Shaking Forward, Back Flip, Lying Down & Rubbing Forward, Normal Hopping, Hopping Backward, Kung Fu Hopper, Small Step Hopping, Small Step Backward

Walker

Walking Backward, Dashing, Trotting, Ninja Running, Balancing on One Foot, Digging Backward, Shaking, Stamping Backward

Humanoid

Skills #1–#8