Prior Work

Image Generation Based Augmentation

Previous works such as [1, 2] use CycleGAN to enrich a domain with few examples by translating images from a domain with many examples. In contrast, [3, 4] generate entirely new images with GANs, which is similar to our approach.

Image Transformation Based Augmentation

The works [5, 6, 7] augment existing images with a variety of transformations. By combining images, these methods help the model learn the image manifold better, which leads to better generalization.
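To make the idea of combining images concrete, the sketch below blends two images and their label vectors with a convex combination. This is only an illustration under stated assumptions: the function name `blend_examples`, the mixing weight `lam`, and the toy data are hypothetical, and the transformations actually used in [5, 6, 7] may differ.

```python
def blend_examples(image_a, label_a, image_b, label_b, lam):
    """Return a convex combination of two images and of their label vectors.

    Images are flat lists of pixel intensities; labels are multi-hot lists.
    `lam` is an assumed mixing weight in [0, 1], not a value from the cited works.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if len(image_a) != len(image_b) or len(label_a) != len(label_b):
        raise ValueError("inputs must have matching sizes")
    # Blend pixels and labels with the same weight so the target stays
    # consistent with the mixed input.
    image = [lam * a + (1.0 - lam) * b for a, b in zip(image_a, image_b)]
    label = [lam * a + (1.0 - lam) * b for a, b in zip(label_a, label_b)]
    return image, label


# Hypothetical toy data: two 2x2 grayscale images flattened to 4 pixels,
# with multi-hot labels over the classes (cat, table).
cat = [0.0, 0.2, 0.4, 0.6]
cat_on_table = [1.0, 0.8, 0.6, 0.4]
mixed_image, mixed_label = blend_examples(
    cat, [1.0, 0.0], cat_on_table, [1.0, 1.0], lam=0.5
)
```

With equal weights, the blended sample lies halfway between the two inputs in pixel space, and its soft label keeps full weight on the class both images share while halving the class present in only one of them.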

Challenges with Prior Methods

The focus of [1, 2] is mainly domain adaptation: shifting to another domain requires training a new CycleGAN each time. Furthermore, [3, 4] do not perform well on multi-label classification because they do not learn relationships between labels and can only handle a single class at a time. In addition, their latent space is not disentangled, which makes it difficult to change specific parts of an image. Finally, they do not account for context in the image: not all images of cats can be treated as the same, as Figure 1 illustrates.

Figure 1: Different contexts in cat images. Panels: a group of cats; a group of three cats; an angry cat; a cat on a table.
