Sanjeev Arora

Do GANs actually learn the distribution? Some theory and empirics

Abstract

The Generative Adversarial Nets (GANs) framework (Goodfellow et al. '14) for learning distributions differs from older ideas such as autoencoders and deep Boltzmann machines in that it scores the generated distribution using a discriminator net instead of a perplexity-like calculation. It appears to work well in practice; e.g., the generated images look better than those produced by older techniques. But how well do these nets learn the target distribution?

Paper 1 (ICML'17) shows that GAN training may not have good generalization properties; e.g., training may appear successful while the trained distribution is far from the target distribution in standard metrics. We show theoretically that this can happen even when the two-player game between discriminator and generator is near equilibrium and the generator appears to have "won" (with respect to natural training objectives).

Paper 2 (arXiv, June 26) empirically tests whether this lack of generalization occurs in real-life training. The paper introduces a new quantitative test for the diversity of a distribution, based on the famous birthday paradox. This test reveals that distributions learnt by some leading GAN techniques have fairly small support (i.e., they suffer from mode collapse), which implies that they are far from the target distribution.
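The intuition behind a birthday-paradox test can be illustrated with a minimal sketch; it is not the procedure from Paper 2. It assumes a hypothetical discrete sampler whose duplicates can be detected by exact equality, whereas for images, deciding what counts as a duplicate needs a separate similarity judgment that is not modelled here. The helper names (`has_collision`, `collision_rate`, `make_uniform_sampler`) and the support sizes are illustrative assumptions. The underlying fact is the birthday paradox: for a uniform distribution with support size N, a batch of about sqrt(N) samples contains a duplicate with constant probability, so frequent duplicates in small batches indicate small support.

```python
import random


def has_collision(samples):
    """Return True if any value occurs more than once in the batch."""
    return len(set(samples)) < len(samples)


def collision_rate(sampler, batch_size, trials, rng):
    """Fraction of trials in which a batch of `batch_size` draws has a duplicate."""
    hits = 0
    for _ in range(trials):
        batch = [sampler(rng) for _ in range(batch_size)]
        if has_collision(batch):
            hits += 1
    return hits / trials


def make_uniform_sampler(support_size):
    """Hypothetical stand-in for a generator: uniform over `support_size` items."""
    def sampler(rng):
        return rng.randrange(support_size)
    return sampler


rng = random.Random(0)  # fixed seed so the run is repeatable

# Small support (assumed 400 items): duplicates are common in batches of 20.
small = collision_rate(make_uniform_sampler(400), batch_size=20, trials=2000, rng=rng)

# Large support (assumed 1,000,000 items): duplicates in batches of 20 are rare.
large = collision_rate(make_uniform_sampler(1_000_000), batch_size=20, trials=2000, rng=rng)

print(f"collision rate, support 400:       {small:.3f}")
print(f"collision rate, support 1,000,000: {large:.3f}")
```

Read in reverse, if batches of size s show duplicates with noticeable probability, the support is plausibly on the order of s^2; this is a heuristic reading under the uniformity assumption made in the sketch.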

Paper 1: "Equilibrium and Generalization in GANs" by Arora, Ge, Liang, Ma, Zhang. (ICML 2017)

Paper 2: "Do GANs actually learn the distribution? An empirical study." by Arora and Zhang (https://arxiv.org/abs/1706.08224)

Biography

Sanjeev Arora is currently the Charles C. Fitzmorris Professor of Computer Science at Princeton University. His research interests include computational complexity theory, uses of randomness in computation, probabilistically checkable proofs, computing approximate solutions to NP-hard problems, and geometric embeddings of metric spaces. He received a B.S. in Mathematics with Computer Science from MIT in 1990 and a Ph.D. in Computer Science from the University of California, Berkeley in 1994 under Umesh Vazirani. Earlier, in 1986, he topped the prestigious IIT JEE but transferred to MIT after 2 years at IIT Kanpur. He was a visiting scholar at the Institute for Advanced Study in 2002-03. His Ph.D. thesis on probabilistically checkable proofs received the ACM Doctoral Dissertation Award in 1995. He was awarded the Gödel Prize in 2001 for his work on the PCP theorem and again in 2010 for the discovery (concurrently with Joseph S. B. Mitchell) of a polynomial time approximation scheme for the Euclidean travelling salesman problem. In 2008 he was inducted as a Fellow of the Association for Computing Machinery. In 2011 he was awarded the ACM Infosys Foundation Award, given to mid-career researchers in Computer Science. He was awarded the 2012 Fulkerson Prize for his work on improving the approximation ratio for graph separators and related problems (jointly with Satish Rao and Umesh Vazirani). He is a coauthor (with Boaz Barak) of the book Computational Complexity: A Modern Approach and is a founder, and on the Executive Board, of Princeton's Center for Computational Intractability. He and his coauthors have argued that certain financial products are associated with computational asymmetry, which under certain conditions may lead to market instability.

Website: Sanjeev Arora (Princeton)
