Unsupervised Skill Discovery via Recurrent Skill Training
Zheyuan Jiang*, Jingyue Gao*, Jianyu Chen
Abstract
Being able to discover diverse skills without external reward functions is beneficial in reinforcement learning research. Previous unsupervised skill discovery approaches mainly focus on maximizing information-theoretic rewards for different skills in parallel. Although these approaches have achieved impressive results, we find that such a parallel training procedure inherently discourages exploration, which leads to poor state coverage and restricts the diversity of learned skills. In this paper, we propose a novel framework, called Recurrent Skill Training (ReST), to address this issue. Instead of training all the skills in parallel, ReST trains the skills one after another recurrently. We conduct experiments on a number of challenging 2D navigation environments and robotic locomotion environments. Theoretical and empirical results show that our proposed approach outperforms previous parallel training approaches in terms of state coverage and skill diversity.
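The core idea above, cycling through skills one at a time rather than updating them all in parallel, can be sketched in a few lines. The snippet below is a minimal toy illustration, not the paper's implementation: the environment (a 1-D random walk), the count-based stand-in for the information-theoretic reward, and the function names are all hypothetical, chosen only to show the recurrent training order.

```python
import random

NUM_SKILLS = 4       # number of skills being discovered
ROUNDS = 3           # recurrent passes over the whole skill set
STEPS_PER_TURN = 50  # environment steps per skill per turn

def intrinsic_reward(state, skill, visit_counts):
    # Hypothetical stand-in for an information-theoretic skill reward:
    # a state is rewarding for this skill if the *other* skills have
    # rarely visited it, encouraging skills to occupy distinct states.
    others = sum(visit_counts[k].get(state, 0)
                 for k in range(NUM_SKILLS) if k != skill)
    return 1.0 / (1.0 + others)

def train_skill(skill, visit_counts, rng):
    # Stand-in "policy update": take random steps on a 1-D chain and
    # accumulate the intrinsic reward for the states this skill visits.
    state, total = 0, 0.0
    for _ in range(STEPS_PER_TURN):
        state += rng.choice((-1, 1))
        visit_counts[skill][state] = visit_counts[skill].get(state, 0) + 1
        total += intrinsic_reward(state, skill, visit_counts)
    return total

def recurrent_skill_training(seed=0):
    rng = random.Random(seed)
    visit_counts = [dict() for _ in range(NUM_SKILLS)]
    returns = []
    for _ in range(ROUNDS):
        # Key difference from parallel methods: skills are updated one
        # after another, so each skill trains against the up-to-date
        # behavior of all the others instead of a stale snapshot.
        for skill in range(NUM_SKILLS):
            returns.append(train_skill(skill, visit_counts, rng))
    return visit_counts, returns
```

In a parallel scheme the inner loop would gather data for all skills before any update; here each skill's turn immediately reshapes the reward landscape seen by the next skill, which is the mechanism the paper argues improves state coverage.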
Skills Discovered by ReST
ReST can discover many dynamic skills on various MuJoCo tasks.
HalfCheetah
Rolling Forward
Trying to Sit
Rolling Backward
Flipped Running Forward
Running
Flipped Running Backward
Running Backward
Running Backward Faster
Hopper
Shaking Forward
Back Flip
Lying Down & Rubbing Forward
Normal Hopping
Hopping Backward
Kung Fu Hopper
Small Step Hopping
Small Step Backward
Walker
Walking Backward
Dashing
Trotting
Ninja Running
Balancing on One Foot
Digging Backward
Shaking
Stamping Backward
Humanoid
Humanoid Skill #1
Humanoid Skill #2
Humanoid Skill #3
Humanoid Skill #4
Humanoid Skill #5
Humanoid Skill #6
Humanoid Skill #7
Humanoid Skill #8