Schedule

Tentative Workshop Schedule

8:45--9:00: Welcome and Introduction

9:00--10:00: Invited Talk: Stefano Ermon

10:00--10:30: Coffee Break

10:30--11:00: Spotlights (papers 1-8)

11:00--11:45: Invited Talk: Sham Kakade, "Prediction with a Short Memory"

11:45--12:15: Spotlights (papers 9-16)

12:15--14:00: Lunch

14:00--14:40: Invited Talk: Yisong Yue, "Inference + Imitation"

14:40--15:30: Poster Session

15:30--16:00: Coffee Break and Posters

16:00--16:55: Summary and Closing

16:55--end: Social mixer: World Cup Final

Invited Speakers

Stefano Ermon, Stanford

Title: Variational Rejection Sampling

Abstract: Learning latent variable models is challenging because evaluating and optimizing the marginal likelihood requires posterior inference. Variational and likelihood-free (adversarial) methods are often used in practice to sidestep this computational difficulty. Variational methods employ a bound on the data log-likelihood based on a simple approximate posterior. While easy to evaluate and optimize, this approximation can be poor when the variational posterior is far from the true one. In the first part of the talk, I will introduce a new approach that combines rejection sampling with variational learning. Our approach allows us to trade off accuracy for computation, interpolating between existing variational learning methods and exact inference. Using a new gradient estimator, we achieve average improvements of 3.71 nats and 0.21 nats over state-of-the-art single-sample and multi-sample alternatives on the MNIST dataset. Finally, I will provide a comparison with likelihood-free methods based on adversarial training, showing that although they produce realistic samples, they perform extremely poorly in terms of likelihood metrics.
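
For context, the bound referred to here is the standard evidence lower bound (ELBO), sketched below together with an illustrative rejection-resampled proposal that shows how one can interpolate between plain variational inference and exact inference. The symbols a_T and r_T are notation assumed for this sketch, not taken from the talk.

    % Standard evidence lower bound (ELBO) that variational methods optimize:
    \log p_\theta(x) \;\ge\;
      \mathbb{E}_{q_\phi(z \mid x)}\!\left[ \log \frac{p_\theta(x, z)}{q_\phi(z \mid x)} \right]
      \;=\; \mathcal{L}(\theta, \phi)

    % Illustrative resampled proposal: draw z ~ q_phi(z|x) and accept with
    % probability a_T(z|x); the induced proposal is
    r_T(z \mid x) \;\propto\; q_\phi(z \mid x) \, a_T(z \mid x)

    % The threshold T trades computation for accuracy: a permissive a_T
    % recovers q_phi (cheap, looser bound), while a stricter a_T pushes
    % r_T toward the true posterior (more rejections, tighter bound).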

Biography: Stefano Ermon is an Assistant Professor in the Department of Computer Science at Stanford University, where he is affiliated with the Artificial Intelligence Laboratory, and a fellow of the Woods Institute for the Environment. His research is centered on techniques for probabilistic modeling of data, inference, and optimization, and is motivated by a range of applications, in particular ones in the emerging field of computational sustainability. He has won several awards, including four Best Paper Awards (AAAI, UAI, and CP), an NSF CAREER Award, an ONR Young Investigator Award, a Sony Faculty Innovation Award, an AWS Machine Learning Award, a Hellman Faculty Fellowship, and the IJCAI Computers and Thought Award. Stefano earned his Ph.D. in Computer Science at Cornell University in 2015.


Sham Kakade, University of Washington

Title: Prediction with a Short Memory

Abstract: We consider the problem of predicting the next observation given a sequence of past observations, and examine the extent to which accurate prediction requires complex algorithms that explicitly leverage long-range dependencies. Perhaps surprisingly, our positive results show that for a broad class of sequences, there is an algorithm that predicts well on average and bases its predictions only on the most recent few observations, together with a set of simple summary statistics of the past observations. This result applies both to sequences generated by HMMs and to those generated by a general class of distributions with long-range dependencies.

This is joint work with Vatsal Sharan, Percy Liang, and Greg Valiant.
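
As a concrete illustration of the kind of short-memory predictor the result concerns, here is a minimal Python sketch of an l-th order Markov predictor that conditions only on the last few observations. The class name and interface are assumptions made for this sketch; the talk's algorithm additionally maintains simple summary statistics of the full past, which this toy version omits.

    from collections import Counter, defaultdict

    class ShortMemoryPredictor:
        """Toy short-memory predictor: conditions only on the last
        `window` observations (hypothetical name and interface)."""

        def __init__(self, window=3):
            self.window = window
            self.counts = defaultdict(Counter)  # context -> next-symbol counts

        def _context(self, history):
            return tuple(history[-self.window:]) if self.window else ()

        def update(self, history, next_obs):
            self.counts[self._context(history)][next_obs] += 1

        def predict(self, history):
            dist = self.counts[self._context(history)]
            total = sum(dist.values())
            return {s: c / total for s, c in dist.items()} if total else {}

    # Usage: train on a sequence, then query the next-symbol distribution.
    seq = "abcabcabd"
    model = ShortMemoryPredictor(window=2)
    for i in range(len(seq)):
        model.update(seq[:i], seq[i])
    print(model.predict("ab"))  # {'c': 0.666..., 'd': 0.333...}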

Biography: Sham Kakade is a Washington Research Foundation Data Science Chair, with a joint appointment in the Computer Science & Engineering and Statistics departments at the University of Washington. He works on both theoretical and applied questions in machine learning and artificial intelligence, focusing on designing (and implementing) statistically and computationally efficient algorithms. Among his contributions are the establishment of a number of principled approaches and frameworks in reinforcement learning (including the natural policy gradient and conservative policy iteration) and the co-development of spectral algorithms for statistical estimation (including provably efficient estimation for mixture models, topic models, hidden Markov models, and overlapping communities in social networks). His more recent contributions concern faster algorithms for nonconvex optimization and for stochastic optimization (and stochastic approximation). He is the recipient of the IBM Goldberg Best Paper Award (2007) for contributions to fast nearest-neighbor search and of the INFORMS Revenue Management and Pricing Section Best Paper Prize (2014). He was program chair for COLT 2011.

Sham completed his Ph.D. at the Gatsby Computational Neuroscience Unit at University College London, advised by Peter Dayan, and did a postdoc at the University of Pennsylvania under the supervision of Michael Kearns. He has held positions at Microsoft Research New England; the Department of Statistics at the Wharton School, University of Pennsylvania; and the Toyota Technological Institute at Chicago.


Yisong Yue, Caltech

Title: Inference + Imitation

Abstract: Imitation learning pertains to learning to mimic desired behavior in sequential decision making. Inference in complex graphical models often involves (sequentially) solving an optimization procedure in order to compute desired estimates of a distribution of interest. In this talk, I will describe two threads of recent research. First, I will show how graphical models can be integrated into conventional deep imitation learning settings to train richer policy classes that encode probabilistic semantics. Second, I will show how deep policies can be trained to augment conventional amortized variational inference.
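
For readers less familiar with the second thread's starting point, below is a minimal sketch of amortized variational inference in PyTorch; the architecture and sizes are assumptions made for this sketch, not taken from the talk. A network maps each observation directly to the parameters of an approximate posterior q(z|x), and the talk's proposal is to train deep policies that refine such amortized estimates.

    import torch
    import torch.nn as nn

    class AmortizedEncoder(nn.Module):
        """Sketch of amortized inference: one network maps x to the
        parameters (mean, log-variance) of a Gaussian q(z|x), replacing
        per-datapoint optimization. Sizes are illustrative."""

        def __init__(self, x_dim=784, z_dim=8, hidden=128):
            super().__init__()
            self.body = nn.Sequential(nn.Linear(x_dim, hidden), nn.ReLU())
            self.mu = nn.Linear(hidden, z_dim)
            self.log_var = nn.Linear(hidden, z_dim)

        def forward(self, x):
            h = self.body(x)
            return self.mu(h), self.log_var(h)

    def sample_posterior(mu, log_var):
        # Reparameterized sample z = mu + sigma * eps, eps ~ N(0, I),
        # so gradients flow through mu and log_var during training.
        eps = torch.randn_like(mu)
        return mu + torch.exp(0.5 * log_var) * eps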

Biography: Yisong Yue is an assistant professor in the Computing and Mathematical Sciences Department at the California Institute of Technology. He was previously a research scientist at Disney Research. Before that, he was a postdoctoral researcher in the Machine Learning Department and the iLab at Carnegie Mellon University. He received a Ph.D. from Cornell University and a B.S. from the University of Illinois at Urbana-Champaign.

Yisong's research interests lie primarily in the theory and application of statistical machine learning. He is particularly interested in developing novel methods for interactive machine learning and structured prediction. In the past, his research has been applied to information retrieval, recommender systems, text classification, learning from rich user interfaces, analyzing implicit human feedback, data-driven animation, behavior analysis, sports analytics, policy learning in robotics, and adaptive planning & allocation problems.