Improving GANs for Long-Tailed Data through Group Spectral Regularization

Harsh Rangwani, Naman Jaswani, Tejan Karmali, Varun Jampani, R. Venkatesh Babu

Video Analytics Lab, IISc; Google Research

European Conference on Computer Vision (ECCV 2022)

Paper

Code

Video

State-of-the-art (SOTA) Generative Adversarial Networks suffer from mode collapse on tail classes when trained on long-tailed datasets. Adding our group Spectral Regularizer (gSR) mitigates mode collapse and yields diverse generated images.

With our proposed group Spectral Regularizer (gSR), we effectively alleviate mode collapse and improve the performance of SOTA GANs (with LeCam, ADA, and DiffAug) when trained on long-tailed data (CIFAR-10 LT).

Abstract

Deep long-tailed learning aims to train useful deep networks on practical, real-world imbalanced distributions, wherein most labels of the tail classes are associated with a few samples. There has been a large body of work to train discriminative models for visual recognition on long-tailed distributions. In contrast, we aim to train conditional Generative Adversarial Networks, a class of image generation models, on long-tailed distributions. We find that, similar to recognition, state-of-the-art methods for image generation also suffer from performance degradation on tail classes. The performance degradation is mainly due to class-specific mode collapse for tail classes, which we observe to be correlated with the spectral explosion of the conditioning parameter matrix. We propose a novel group Spectral Regularizer (gSR) that prevents the spectral explosion, alleviating mode collapse, which results in diverse and plausible image generation even for tail classes. We find that gSR effectively combines with existing augmentation and regularization techniques, leading to state-of-the-art image generation performance on long-tailed data. Extensive experiments demonstrate the efficacy of our regularizer on long-tailed datasets with different degrees of imbalance.

Novel Observation

We find that class-specific mode collapse is correlated with the explosion of the spectral norm of the class-specific parameters in the generator network. We develop gSR, an inexpensive regularization method that mitigates this mode collapse.

Overview of Proposed Technique

Steps for computing the group Spectral Regularizer (gSR) term:

Extract the class-specific parameters for each class (e.g., cBN in BigGAN).
Group the parameters for each class into a matrix.
Estimate the spectral norm of each matrix using inexpensive power iteration.
Add the spectral norm of each matrix to the loss as a regularization term.
The additional loss term keeps the spectral norm from exploding, which we observe to be correlated with mode collapse.
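
The steps above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the released implementation: the function names (`group_into_matrix`, `spectral_norm`, `gsr_penalty`), the `n_groups` and `weight` parameters, and the toy parameter vectors are all assumptions made for this example. A real training setup would compute the same quantity with a tensor library so that gradients flow back to the generator.

```python
import math
import random


def group_into_matrix(params, n_groups):
    """Reshape a flat per-class parameter vector into n_groups rows (hypothetical helper)."""
    if len(params) % n_groups != 0:
        raise ValueError("parameter length must be divisible by n_groups")
    n_cols = len(params) // n_groups
    return [params[i * n_cols:(i + 1) * n_cols] for i in range(n_groups)]


def spectral_norm(matrix, n_iters=50, seed=0):
    """Estimate the largest singular value of `matrix` with power iteration."""
    rng = random.Random(seed)
    n_rows, n_cols = len(matrix), len(matrix[0])
    v = [rng.gauss(0.0, 1.0) for _ in range(n_cols)]
    sigma = 0.0
    for _ in range(n_iters):
        # u <- M v, then normalize
        u = [sum(row[j] * v[j] for j in range(n_cols)) for row in matrix]
        u_norm = math.sqrt(sum(x * x for x in u))
        if u_norm == 0.0:
            return 0.0
        u = [x / u_norm for x in u]
        # v <- M^T u; its length converges to the largest singular value
        v = [sum(matrix[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
        sigma = math.sqrt(sum(x * x for x in v))
        if sigma == 0.0:
            return 0.0
        v = [x / sigma for x in v]
    return sigma


def gsr_penalty(class_params, n_groups, weight=1.0):
    """Sum the estimated spectral norms over classes, scaled by an assumed weight."""
    total = 0.0
    for params in class_params.values():
        total += spectral_norm(group_into_matrix(params, n_groups))
    return weight * total


# Toy per-class parameter vectors (made-up values, not from any trained model).
toy_params = {"head": [3.0, 0.0, 0.0, 4.0], "tail": [1.0, 2.0, 2.0, 4.0]}
penalty = gsr_penalty(toy_params, n_groups=2, weight=0.5)
print(penalty)  # close to 4.5: the two spectral norms are 4 and 5
```

The returned value would be added to the generator loss. Power iteration needs only matrix-vector products, which is why the estimate stays cheap compared with a full singular value decomposition.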

Results

Qualitative comparison of the image generation of BigGAN (FID 61.63) and BigGAN + gSR (FID 16.56) on a class-conditioned LSUN dataset. gSR leads to significantly better and more diverse generations, even for the tail classes (the bottom rows).


Qualitative comparison of the training of BigGAN and BigGAN + gSR on a long-tailed dataset. With gSR, training remains stable and FID improves over iterations, whereas the baseline begins to experience mode collapse as iterations increase.

Citation

If you find our work helpful in your research, please cite:

@InProceedings{rangwani2022gsr,
  title={Improving GANs for Long-Tailed Data through Group Spectral Regularization},
  author={Rangwani, Harsh and Jaswani, Naman and Karmali, Tejan and Jampani, Varun and Babu, R. Venkatesh},
  booktitle={European Conference on Computer Vision},
  year={2022},
}

Licence

This project is licensed under the MIT License.

Contact

If you have any queries, please get in touch via email: harshr@iisc.ac.in