Diversity is All You Need:
Learning Diverse Skills without a Reward Function
Benjamin Eysenbach, Abhishek Gupta, Julian Ibarz, and Sergey Levine
Abstract: Intelligent creatures can explore their environments and learn useful skills without supervision. In this paper, we propose a method for learning useful skills without a reward function. We maximize an information-theoretic objective using a maximum entropy policy. On a variety of simulated robotic tasks, we show that this simple exploration objective results in the unsupervised emergence of diverse skills, such as walking and jumping. In a number of reinforcement learning benchmark environments, our method is able to solve the benchmark task despite never receiving the true task reward. In these environments, some of the learned skills correspond to solving the task, and each skill that solves the task does so in a distinct manner. Our results suggest that unsupervised discovery of skills can serve as an effective pretraining mechanism for overcoming challenges of exploration and data efficiency in reinforcement learning.
Paper: arXiv preprint
Code: github repository
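To make the objective in the abstract concrete: in the paper's formulation, a skill z is sampled from a fixed prior p(z), and the policy is rewarded with log q(z|s) - log p(z), where q is a learned discriminator that guesses the skill from the visited state. The sketch below computes only this pseudo-reward. It is illustrative and not the released code: the function names, the uniform prior, and the toy logits are assumptions made for the example, and the maximum entropy policy update is omitted.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_norm for x in logits]


def pseudo_reward(skill_log_probs, skill, num_skills):
    """Return log q(z|s) - log p(z), assuming a uniform prior p(z) = 1/num_skills.

    skill_log_probs is a hypothetical discriminator output: one
    log-probability per skill for the current state.
    """
    log_prior = -math.log(num_skills)
    return skill_log_probs[skill] - log_prior


# A discriminator that cannot tell the skills apart yields zero reward.
uninformative = log_softmax([0.0, 0.0, 0.0, 0.0])
r_uninformative = pseudo_reward(uninformative, skill=2, num_skills=4)

# A state that clearly identifies skill 2 yields a positive reward,
# bounded above by log(num_skills).
confident = log_softmax([0.0, 0.0, 5.0, 0.0])
r_distinct = pseudo_reward(confident, skill=2, num_skills=4)

print(r_uninformative, r_distinct)
```

Under this reward, a skill is paid only for visiting states that distinguish it from the other skills, which is what pushes the skills apart without any task reward.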
Harnessing Skills to Maximize a Reward
Although our skills are learned without a reward function, they can serve as a task-independent initialization that accelerates learning once a task reward is provided.
Harnessing Skills for Imitation
We can use the learned skills to imitate an expert trajectory. In the videos below, we show an expert demonstration on the left, and our agent's imitation on the right. The first three pairs show that half cheetah can successfully imitate an expert standing on its nose, standing upright, and flipping onto its back. The pair in the bottom right shows a failure case, where half cheetah has failed to imitate an expert doing a handstand.
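One way to choose which skill to replay for a demonstration, following the paper's description, is to ask the learned discriminator which skill most likely produced the expert's states and pick the maximizer by enumeration. The sketch below is a minimal illustration under stated assumptions: `infer_skill` and `toy_log_q` are hypothetical names, the states are one-dimensional, and the toy discriminator stands in for a trained network.

```python
import math


def infer_skill(expert_states, log_q):
    """Return the skill index with the highest total log-likelihood.

    log_q(state) must return a list of log q(z|state), one entry per skill.
    """
    num_skills = len(log_q(expert_states[0]))
    totals = [0.0] * num_skills
    for state in expert_states:
        for z, log_prob in enumerate(log_q(state)):
            totals[z] += log_prob
    return max(range(num_skills), key=totals.__getitem__)


def toy_log_q(state, centers=(-1.0, 0.0, 1.0)):
    """Hypothetical discriminator: skill z prefers states near centers[z]."""
    logits = [-(state - c) ** 2 for c in centers]
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_norm for x in logits]


# Expert states cluster around 1.0, so the skill centered there is selected.
expert_states = [0.8, 1.1, 0.9, 1.3]
best_skill = infer_skill(expert_states, toy_log_q)
print(best_skill)
```

Summing log-probabilities over the trajectory is equivalent to maximizing the product of per-state likelihoods. When no learned skill visits states like the expert's, the selected skill can still be a poor match, which is consistent with the failure case described above.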
Code to replicate our experiments will be made available soon. Please feel free to contact us for help extending our method or applying it to novel tasks. Please consider citing our paper if you use our code (bibtex below).
author = "Eysenbach, Benjamin and Gupta, Abhishek and Ibarz, Julian and Levine, Sergey",
title = "Diversity is All You Need: Learning Diverse Skills without a Reward Function",
year = "2018"