Boundless: Generative Adversarial Networks for Image Extension

Piotr Teterwak Aaron Sarna Dilip Krishnan Aaron Maschinot David Belanger Ce Liu William T. Freeman

{pteterwak, sarna, dilipkay, amaschinot, dbelanger, celiu, wfreeman}@google.com

Google Research

Presented at ICCV 2019

Image Samples

Extending images from masks covering (a) 25%, (b) 50%, and (c) 75% of the image width, using multiple algorithms. From left to right: DeepFill [1], PConv [2], Photoshop Content-Aware Fill, our model without conditioning, our full model, and ground truth.

Abstract

Generic image extension models that work on diverse datasets and preserve both high-level semantics and low-level image structures and textures have broad applications in image editing, computational photography and computer graphics. While image inpainting has been extensively studied in the literature, we find that directly applying state-of-the-art inpainting methods to image extension is challenging: they tend to generate blurry or repetitive pixels with inconsistent semantics. We introduce semantic conditioning to the discriminator of a generative adversarial network (GAN) and achieve impressive qualitative and quantitative results on image extension, with coherent semantics and visually pleasing colors and textures in the extended regions. We also show promising results in extreme extensions, including panorama generation. Ours is the first work to tackle the challenging problem of learning image extension in the general setting of many different image classes.
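One way to picture the semantic conditioning of the discriminator is as a projection-style conditional discriminator, in which a semantic embedding of the image (e.g. from a pretrained classifier) modulates the discriminator's score through an inner product with its features. The sketch below is illustrative only; the function and parameter names are assumptions, not the paper's actual implementation.

```python
import numpy as np

def discriminator_logit(features, semantic_embedding, w_out, v_proj):
    """Projection-style conditional discriminator score (illustrative sketch).

    features:           discriminator's penultimate activations, shape (d,)
    semantic_embedding: condition vector from a pretrained network, shape (k,)
    w_out:              weights for the unconditional score, shape (d,)
    v_proj:             matrix projecting the condition into feature space,
                        shape (k, d)
    """
    unconditional = features @ w_out                      # standard real/fake score
    conditional = (semantic_embedding @ v_proj) @ features  # condition-feature inner product
    return unconditional + conditional

# Toy numbers showing the two terms combining:
phi = np.ones(4)                  # discriminator features
c = np.array([1.0, 2.0])          # semantic condition
logit = discriminator_logit(phi, c, np.arange(4.0), np.ones((2, 4)))
# unconditional = 6.0, conditional = 12.0, so logit = 18.0
```

The conditional term rewards outputs whose discriminator features align with the semantic embedding of the real image, which is what pushes the extended region toward consistent semantics.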

Applied to Video


We apply our model to each frame of a video independently. Videos are a natural domain in which to test our model, since consecutive frames are closely related, and the small variations between them can produce interesting, plausible outcomes. This lets us verify that the model has neither memorized a fixed completion for closely related images nor collapsed under small natural variations in the input. Because there is no temporal model, the output exhibits substantial temporal jitter.
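The per-frame procedure amounts to mapping the trained single-image model over the frame sequence. In this sketch, `extend_fn` is a hypothetical stand-in for the extension model; nothing ties the frames together, which is the source of the jitter noted above.

```python
def extend_video(frames, extend_fn):
    """Extend each frame independently with a single-image model.

    There is no temporal model: each frame is processed on its own,
    so the extended regions may jitter from frame to frame.
    """
    return [extend_fn(frame) for frame in frames]

# e.g. extended_frames = extend_video(frames, lambda f: model(f, mask))
```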

Model Architecture
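The masking setup used throughout (masking the rightmost 25%, 50%, or 75% of the image width, as in the samples above) can be sketched as follows. The helper name and input convention are assumptions for illustration, not the exact preprocessing code.

```python
import numpy as np

def extension_mask(height, width, fraction):
    """Binary mask marking the rightmost `fraction` of columns to generate.

    Known pixels are 0 and the region to extend is 1; `fraction`
    corresponds to the 25%/50%/75% settings in the image samples.
    """
    mask = np.zeros((height, width, 1), dtype=np.float32)
    start = int(round(width * (1.0 - fraction)))
    mask[:, start:, :] = 1.0
    return mask

# The generator receives the masked image together with the mask, e.g.:
# masked_image = image * (1.0 - extension_mask(h, w, 0.5))
```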

Pretrained Models

References

1. Jiahui Yu, Zhe Lin, Jimei Yang, Xiaohui Shen, Xin Lu, and Thomas S. Huang. Generative image inpainting with contextual attention. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 5505–5514, 2018.

2. Guilin Liu, Fitsum A. Reda, Kevin J. Shih, Ting-Chun Wang, Andrew Tao, and Bryan Catanzaro. Image inpainting for irregular holes using partial convolutions. In The European Conference on Computer Vision (ECCV), 2018.