Diversity is All You Need:

Learning Diverse Skills without a Reward Function

Benjamin Eysenbach, Abhishek Gupta, Julian Ibarz, and Sergey Levine

Abstract: Intelligent creatures can explore their environments and learn useful skills without supervision. In this paper, we propose a method for learning useful skills without a reward function. We maximize an information theoretic objective using a maximum entropy policy. On a variety of simulated robotic tasks, we show that this simple exploration objective results in the unsupervised emergence of diverse skills, such as walking and jumping. In a number of reinforcement learning benchmark environments, our method is able to solve the benchmark task despite never receiving the true task reward. In these environments, some of the learned skills correspond to solving the task, and each skill that solves the task does so in a distinct manner. Our results suggest that unsupervised discovery of skills can serve as an effective pretraining mechanism for overcoming challenges of exploration and data efficiency in reinforcement learning.
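The abstract does not spell out the objective. As a minimal sketch, assume a learned discriminator q(z|s) that predicts the active skill z from the state s, and skills sampled from a uniform prior p(z). Under those assumptions, a per-step pseudo-reward can be written as log q(z|s) - log p(z): a skill is rewarded for visiting states from which it is easy to identify. The function name and arguments below are hypothetical, chosen for illustration, and are not taken from the released code.

```python
import math


def skill_pseudo_reward(discriminator_probs, skill, num_skills):
    """Illustrative pseudo-reward: log q(z|s) - log p(z).

    discriminator_probs: assumed output of a skill discriminator for the
        current state, as a list of probabilities over skills.
    skill: index of the skill currently being executed.
    num_skills: number of skills; the prior p(z) is assumed uniform.
    """
    log_q = math.log(discriminator_probs[skill])
    log_p = -math.log(num_skills)  # log(1 / num_skills)
    return log_q - log_p
```

With this form, the reward is positive when the discriminator identifies the skill better than chance and negative when it does worse than chance.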

Paper: arXiv preprint

Code: github repository

Skills Learned with No Reward

In the videos below, we show skills that our agent has learned with no reward. For more videos, please see the links below:

Harnessing Skills to Maximize a Reward

While our skills are learned without a reward function, they can be used as a task-independent initialization that accelerates learning.

Half Cheetah

Hopper

Harnessing Skills for Imitation

We can use the learned skills to imitate an expert trajectory. In the videos below, we show an expert demonstration on the left, and our agent's imitation on the right. The first three pairs show that half cheetah can successfully imitate an expert standing on its nose, standing upright, and flipping onto its back. The pair in the bottom right shows a failure case, where half cheetah has failed to imitate an expert doing a handstand.
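One way to choose which skill to run for imitation can be sketched as follows. It assumes a discriminator that maps each state to a probability distribution over skills, and it picks the skill under which the expert's states are most likely. The `discriminator` callable and the function name are hypothetical stand-ins, not the interface of the released code.

```python
import math


def infer_skill(expert_states, discriminator, num_skills):
    """Return the skill index that best explains the expert's states.

    discriminator: assumed callable mapping a state to a list of
        probabilities over skills (hypothetical interface).
    """
    scores = []
    for z in range(num_skills):
        # Sum log-probabilities over the trajectory; clamp to avoid log(0).
        score = sum(
            math.log(max(discriminator(s)[z], 1e-12)) for s in expert_states
        )
        scores.append(score)
    return max(range(num_skills), key=lambda z: scores[z])
```

The selected skill would then be executed by the pretrained policy. If no learned skill visits states resembling the expert's, all scores are low and the imitation can fail, as in the handstand example.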

Code

Code to replicate our experiments will be made available soon. Please feel free to contact us for help extending our method or applying it to novel tasks. Please consider citing our paper if you use our code (bibtex below).

Citation

@article{eysenbach2018,

    author  = "Eysenbach, Benjamin and Gupta, Abhishek and Ibarz, Julian and Levine, Sergey",

    title   = "Diversity is All You Need: Learning Diverse Skills without a Reward Function",

    year    = "2018"