Everything is in the Latent Space: Attribute Editing and Attribute Style Manipulation by StyleGAN Latent Space Exploration

Rishubh Parihar, Ankit Dhiman, Tejan Karmali, R. Venkatesh Babu

Video Analytics Lab, Indian Institute of Science, Bengaluru

ACM Multimedia 2022

arXiv

Code (Coming soon)


Abstract: Unconstrained image generation with high realism is now possible using recent Generative Adversarial Networks (GANs). However, it is quite challenging to generate images with a given set of attributes. Recent methods use style-based GAN models to perform image editing by leveraging the semantic hierarchy present in the layers of the generator. We present Few-shot Latent-based Attribute Manipulation and Editing (FLAME), a simple yet effective framework to perform highly controlled image editing by latent space manipulation. Specifically, we estimate linear directions in the latent space (of a pre-trained StyleGAN) that control semantic attributes in the generated image. In contrast to previous methods that rely on either large-scale attribute-labeled datasets or attribute classifiers, FLAME uses minimal supervision of a few curated image pairs to estimate disentangled edit directions. FLAME can perform both individual and sequential edits with high precision on a diverse set of images while preserving identity. Further, we propose a novel task of Attribute Style Manipulation to generate diverse styles for attributes such as eyeglasses and hair. We first encode a set of synthetic images of the same identity but with different attribute styles in the latent space to estimate an attribute style manifold. Sampling a new latent from this manifold results in a new attribute style in the generated image. We propose a novel sampling method to sample latents from the manifold, enabling us to generate a diverse set of attribute styles beyond the styles present in the training set. FLAME can generate diverse attribute styles in a disentangled manner. We illustrate the superior performance of FLAME against previous image editing methods through extensive qualitative and quantitative comparisons. FLAME generalizes well to out-of-distribution images from the art domain as well as to other datasets such as cars and churches.

Examples of attribute edits performed on real face images and paintings. Additionally, FLAME can generate diverse attribute styles, such as eyeglass and hair styles, and works on other datasets such as cars.

Overview

I) Synthetic pair creation: we create pairs of a negative image I_n and a positive image I_p for attribute a_j by blending with a part-wise segmentation mask.

II) Direction estimation: the attribute direction is estimated by taking the difference between the latent codes of I_p and I_n.

III) Reconstruction: I_p and I_n are reconstructed by passing them through the pre-trained encoder followed by the StyleGAN generator. Note that the reconstructed positive image G_s(w_p) looks natural even though the original positive image I_p is unnatural.
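The three steps above can be illustrated with a minimal NumPy sketch. It is not the authors' released code: the function names (`blend_with_mask`, `estimate_direction`, `apply_edit`) are hypothetical, the latent codes are random stand-ins for the output of a pre-trained encoder, and averaging the per-pair differences and normalising the result to unit length are assumptions about how a few pairs might be combined into one direction.

```python
import numpy as np


def blend_with_mask(negative, donor, mask):
    """Paste the masked region of `donor` onto `negative` (synthetic positive image).

    negative, donor: float arrays of shape (H, W, 3); mask: array of shape (H, W) in [0, 1].
    """
    mask = mask[..., None].astype(np.float32)  # broadcast the mask over colour channels
    return mask * donor + (1.0 - mask) * negative


def estimate_direction(w_pos, w_neg):
    """Average per-pair latent differences (assumption) and normalise to unit length.

    w_pos, w_neg: arrays of shape (num_pairs, latent_dim).
    """
    direction = (w_pos - w_neg).mean(axis=0)
    return direction / np.linalg.norm(direction)


def apply_edit(w, direction, strength):
    """Move a latent code along the attribute direction."""
    return w + strength * direction


# Toy data: random latents stand in for encoder outputs of 8 image pairs.
rng = np.random.default_rng(0)
true_dir = rng.normal(size=512)
true_dir /= np.linalg.norm(true_dir)
w_neg = rng.normal(size=(8, 512))
w_pos = w_neg + 3.0 * true_dir + 0.01 * rng.normal(size=(8, 512))

direction = estimate_direction(w_pos, w_neg)
w_edited = apply_edit(w_neg[0], direction, strength=2.0)
```

In a real pipeline, `w_edited` would be passed to the StyleGAN generator to render the edited image, and `strength` would control how strongly the attribute is added (or removed, for negative values).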

Key Results

Generalization to Out-of-Domain Painting Images

FLAME generalizes well to out-of-distribution face paintings from MetFaces, producing highly realistic edits of paintings.

Attribute Style Manipulation

FLAME can generate multiple styles for a given attribute, such as diverse eyeglass styles and hairstyles.
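To give a feel for what sampling a new attribute style from encoded style latents could look like, here is a small sketch. It deliberately swaps in a simple stand-in, random convex combinations of the encoded latents with Dirichlet weights, and is not the sampling method proposed in the paper; the function name `sample_style_latent`, the `concentration` parameter, and the random latents are all hypothetical.

```python
import numpy as np


def sample_style_latent(style_latents, rng, concentration=1.0):
    """Draw a random convex combination of encoded style latents.

    Stand-in sampler (assumption), not the paper's method.
    style_latents: array of shape (num_styles, latent_dim).
    """
    # Dirichlet weights are non-negative and sum to 1, so the sample stays
    # inside the convex hull of the encoded styles.
    weights = rng.dirichlet(np.full(len(style_latents), concentration))
    return weights @ style_latents


# Toy data: 5 random latents stand in for encodings of one identity
# wearing 5 different eyeglass styles.
rng = np.random.default_rng(0)
style_latents = rng.normal(size=(5, 512))
new_style = sample_style_latent(style_latents, rng)
```

A convex combination can only interpolate between the encoded styles, which is one reason a dedicated sampling scheme is needed to reach styles beyond those in the training set.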

Citation

If you find our work helpful in your research, please cite:

@InProceedings{parihar2022,

title={Everything is in the Latent Space: Image Manipulation by Latent Space Editing},

author={Parihar, Rishubh and Dhiman, Ankit and Karmali, Tejan and Babu, R. Venkatesh},

booktitle={ACM Conference on Multimedia},

year={2022}}

License

This project is licensed under the MIT License.