Summary

Conclusion

In this work, we explore the latent-space robustness of generative models through the lens of adversarial attacks. To the best of our knowledge, this avenue has yet to be explored in the literature. We raise the question of whether the latent space of GANs is as smooth as previously claimed. To examine this claim, we perform an adversarial attack on StyleGAN. Concretely, given a sampled latent vector, we perform gradient ascent over a well-defined loss function to arrive at adversarial latent codes that result in noisy reconstructions. We experiment with several loss functions to analyze their suitability as proxies for the quality of generated images. Through a noise discriminator, we verify the existence of adversarial holes in the latent space. Our code is available at: https://github.com/anirudh-chakravarthy/VLR-Project.
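To make the attack concrete, the sketch below shows one way such a latent-space gradient ascent could look in PyTorch. This is an illustrative sketch rather than the exact implementation in the repository above; `generator`, `noise_loss`, `steps`, and `lr` are assumed placeholder names, and the optimizer choice is an assumption.

```python
import torch

# Hypothetical sketch of the latent-space attack, assuming a PyTorch
# StyleGAN-style `generator` that maps a latent code to an image and a
# `noise_loss` that scores how noisy/degraded the generated image is.
def latent_attack(generator, w_init, noise_loss, steps=100, lr=0.01):
    """Gradient ascent on a latent code to locate an adversarial hole."""
    w_adv = w_init.clone().detach().requires_grad_(True)
    optimizer = torch.optim.Adam([w_adv], lr=lr)
    for _ in range(steps):
        image = generator(w_adv)       # reconstruction from the current code
        loss = -noise_loss(image)      # negate: the optimizer minimizes, we ascend
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return w_adv.detach()
```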

Future Work

In the future, we aim to explore methods that force the adversarial latent codes to remain in-domain. Specifically, we hope to train an adversarial mapper and a latent-code discriminator (as in Experiment 5) and analyze their effects on the latent space. Furthermore, we wish to develop a strategy that samples latent codes from the adversarial holes during training (similar to adversarial training, but on the latent space) so as to eliminate these holes. Finally, we wish to explore strategies to fine-tune an existing generative model so that these adversarial holes are refined to generate meaningful images.
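One possible form of such latent-space adversarial training is sketched below. This is purely hypothetical: `generator`, `discriminator`, `sample_latents`, `latent_attack`, and `g_opt` are assumed placeholder names, and the generator loss is a generic adversarial term rather than the exact objective we would use.

```python
import torch

def adversarial_finetune_step(generator, discriminator, latent_attack,
                              sample_latents, g_opt, batch_size=8):
    """One hypothetical fine-tuning step that 'fills in' adversarial holes.

    `latent_attack` could be the gradient-ascent attack sketched earlier
    (with its loss already bound), and `discriminator` could be the noise
    discriminator used to detect degraded generations.
    """
    # 1. Sample clean latent codes and push them into adversarial holes.
    w = sample_latents(batch_size)
    w_adv = latent_attack(generator, w)

    # 2. Update the generator so that images produced from adversarial codes
    #    are judged realistic (non-noisy) by the discriminator.
    fake = generator(w_adv)
    g_loss = -discriminator(fake).mean()   # generic adversarial generator loss
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
    return g_loss.item()
```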