Stefano Ermon

Generative Adversarial Imitation Learning

Abstract

Consider learning a policy from example expert behavior, without interaction with the expert or access to a reward or cost signal. One approach is to recover the expert’s cost function with inverse reinforcement learning, then compute an optimal policy for that cost function. This approach is indirect and can be slow. In this talk, I will discuss a new generative modeling framework for directly extracting a policy from data, drawing an analogy between imitation learning and generative adversarial networks. I will derive a model-free imitation learning algorithm that obtains significant performance gains over existing methods in imitating complex behaviors in large, high-dimensional environments. Our approach can also be used to infer the latent structure of human demonstrations in an unsupervised way. As an example, I will show a driving application where a model learned from demonstrations is able to both produce different driving styles and accurately anticipate human actions using raw visual inputs.
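For readers who want the core idea in symbols: a minimal sketch of the saddle-point objective behind this approach, as formulated in the GAIL paper (Ho & Ermon, NIPS 2016), is

\[
\min_{\pi}\;\max_{D}\;\; \mathbb{E}_{\pi}\big[\log D(s,a)\big] \;+\; \mathbb{E}_{\pi_E}\big[\log\big(1 - D(s,a)\big)\big] \;-\; \lambda H(\pi),
\]

where \(\pi_E\) is the expert policy, the discriminator \(D\) tries to distinguish state-action pairs generated by the learner's policy \(\pi\) from the expert's, \(H(\pi)\) is the causal entropy of \(\pi\), and \(\lambda \ge 0\) weights the entropy regularizer. Training alternates gradient steps on \(D\) with policy-gradient steps on \(\pi\), so no model of the environment dynamics is needed (hence "model-free").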

Biography

Stefano Ermon is currently an Assistant Professor in the Department of Computer Science at Stanford University, where he is affiliated with the Artificial Intelligence Laboratory. He completed his PhD in computer science at Cornell in 2015. His research interests include techniques for scalable and accurate inference in graphical models, large-scale combinatorial optimization, and robust decision making under uncertainty; his work is motivated by a range of applications, in particular ones in the emerging field of computational sustainability. Stefano's research has won several awards, including three Best Paper Awards and a World Bank Big Data Innovation Challenge, and was selected by Scientific American as one of the 10 World Changing Ideas of 2016. He is a recipient of the Sony Faculty Innovation Award and an NSF CAREER Award.