Adversarial Inversion: Self-supervision with Adversarial Priors

Hsiao-Yu Fish Tung Adam Harley William Seto Katerina Fragkiadaki

Carnegie Mellon University

{htung, aharley, wseto, katef}@cs.cmu.edu

Abstract

We propose adversarial inversion, a weakly supervised neural network model that combines self-supervision with adversarial constraints. Given visual input, our model first generates a set of desirable intermediate latent variables, which we call "imaginations", e.g., 3D pose and camera viewpoint. A differentiable renderer then projects these imaginations back to reconstruct the input, while discriminator networks, using corresponding reference repositories, constrain the imaginations to reside in the right "domain", e.g., 3D human poses, camera viewpoints, or 3D depth maps, depending on the task. Our model is trained to minimize reconstruction and adversarial losses. Because adversarial inversion does not require paired annotations, it can be trained with or without the paired supervision used by standard supervised models, and can instead exploit large numbers of unlabelled images. We empirically show that adversarial inversion outperforms previous state-of-the-art supervised models on 3D human pose estimation and on 3D scene depth estimation from per-frame motion. Further, we show interesting results on biased image editing.
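The training objective described above can be sketched in a few lines: a generator produces an "imagination" from the input, a differentiable renderer projects it back, and a discriminator scores whether the imagination resembles samples from a reference repository. The toy code below is a minimal illustrative sketch, not the paper's implementation — the linear "networks", shapes, loss weight, and function names are all hypothetical placeholders.

```python
import numpy as np

# Toy sketch of the adversarial-inversion objective (illustrative only):
# generator -> latent "imagination" -> differentiable renderer -> reconstruction,
# with a discriminator constraining the imagination's domain.
rng = np.random.default_rng(0)

W_gen = rng.normal(size=(8, 4))     # hypothetical generator: 8-d input -> 4-d imagination
W_render = rng.normal(size=(4, 8))  # hypothetical renderer: imagination -> input space
w_disc = rng.normal(size=4)         # hypothetical linear discriminator over imaginations

def generator(x):
    return x @ W_gen

def renderer(z):
    return z @ W_render

def discriminator(z):
    # probability that an imagination z looks like a reference-repository sample
    return 1.0 / (1.0 + np.exp(-(z @ w_disc)))

def adversarial_inversion_loss(x, lam=0.1):
    z = generator(x)                                  # the "imagination"
    x_hat = renderer(z)                               # reconstruction of the input
    rec = np.mean((x - x_hat) ** 2)                   # self-supervised reconstruction loss
    adv = -np.mean(np.log(discriminator(z) + 1e-8))   # adversarial term: fool the discriminator
    return rec + lam * adv                            # combined objective from the abstract

x = rng.normal(size=(16, 8))  # a batch of unlabelled inputs (no paired annotations needed)
loss = adversarial_inversion_loss(x)
print(np.isfinite(loss) and loss > 0)
```

In practice both terms are minimized with respect to the generator while the discriminator is trained adversarially against it; the unlabelled batch above highlights that no input/label pairs are required.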

Paper link

coming soon

Experimental Results

1. 3D landmarks prediction

2. Structure-from-motion (SFM)

3. Biased image super-resolution (gender, age, and face transformation)

4. Image inpainting

1. 3D landmarks prediction

Predicted 3D human poses on the MPII dataset

2. Structure-from-motion (SFM)

Predicted depth and optical (geometric) flow with and without adversarial priors

3. Biased image super-resolution

Gender transformation

Age transformation

Face transformation

4. Image Inpainting

Lip injection