CycleGAN for Facial Expression Recognition
By Michelle Lin and Fatemeh Ghezloo
Related Work & Approach
Current research in Facial Expression Recognition focuses on recognizing seven basic and universal emotions from human faces. The imbalanced distribution among emotion classes leads to low accuracy for classes with fewer samples. Many methods have been proposed to deal with this issue, such as undersampling, synthesizing minority samples, and creating a box around minority samples [5].
Undersampling is a popular method for dealing with class-imbalance problems: only a subset of the majority class is used, but its main deficiency is that many majority-class examples are discarded. A study by Nitesh Chawla et al. [4] suggests that combining undersampling of the majority class with oversampling of the minority class can achieve better performance than undersampling alone.
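As a rough illustration of this rebalancing idea (not the exact SMOTE procedure from [4], which synthesizes new minority samples by interpolating between nearest neighbours), the sketch below randomly undersamples the majority class and duplicates minority samples until both reach a common size; the function name and the choice of target size are our own.

```python
import numpy as np

def rebalance(X, y, majority_label, minority_label, rng=None):
    """Toy sketch: undersample the majority class and oversample
    (duplicate) the minority class to a shared target size.
    SMOTE would instead create new minority samples by interpolation;
    plain duplication is used here only to keep the example short."""
    rng = rng or np.random.default_rng(0)
    maj_idx = np.flatnonzero(y == majority_label)
    min_idx = np.flatnonzero(y == minority_label)
    target = (len(maj_idx) + len(min_idx)) // 2  # arbitrary meeting point

    maj_keep = rng.choice(maj_idx, size=min(target, len(maj_idx)), replace=False)
    min_keep = rng.choice(min_idx, size=target, replace=True)  # duplicates allowed

    keep = np.concatenate([maj_keep, min_keep])
    rng.shuffle(keep)
    return X[keep], y[keep]
```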
Another way to generate synthesized data for minority classes is to use an image-to-image translation GAN. Xinyue Zhu et al. use CycleGAN to generate images in a target class, using images in the neutral class as reference [1]. The generated images were then used to augment and balance the dataset, which increased accuracy for that emotion as well as for the dataset as a whole. This is the study we replicated in our project.
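The CycleGAN setup behind this pairs two generators (neutral→emotion and emotion→neutral) with two discriminators and adds a cycle-consistency term so that a translated face can be mapped back to its original. The sketch below follows the standard CycleGAN formulation with least-squares adversarial losses and an L1 cycle term; the module names and the λ value are placeholders of ours, not taken from [1].

```python
import torch
import torch.nn.functional as F

def cyclegan_generator_loss(G_n2e, G_e2n, D_e, D_n,
                            real_neutral, real_emotion, lambda_cyc=10.0):
    """One generator update objective in the standard CycleGAN formulation.
    G_n2e / G_e2n are the neutral->emotion and emotion->neutral generators,
    D_e / D_n the corresponding discriminators (names are ours)."""
    fake_emotion = G_n2e(real_neutral)
    fake_neutral = G_e2n(real_emotion)

    # Adversarial terms: generators try to make discriminators score fakes as real (1).
    pred_fake_e = D_e(fake_emotion)
    pred_fake_n = D_n(fake_neutral)
    adv = (F.mse_loss(pred_fake_e, torch.ones_like(pred_fake_e))
           + F.mse_loss(pred_fake_n, torch.ones_like(pred_fake_n)))

    # Cycle consistency: translating there and back should reconstruct the input.
    cyc = (F.l1_loss(G_e2n(fake_emotion), real_neutral)
           + F.l1_loss(G_n2e(fake_neutral), real_emotion))

    return adv + lambda_cyc * cyc
```

The λ weight on the cycle term controls how strictly the generated face must preserve the identity of the input neutral face while changing only the expression.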
We used the Facial Expression Recognition (FER2013) dataset to evaluate our results. We first trained a CNN on the dataset, then augmented it with images generated by our CycleGAN, retrained, and compared the accuracy. In addition to overall accuracy, we also examined per-class accuracy to see whether the generated images actually improved the accuracy of our target class.
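To make the before/after comparison concrete, per-class accuracy (i.e., recall on each emotion) can be computed along the following lines; this helper is our own sketch and assumes integer emotion labels 0–6 as used in FER2013.

```python
import numpy as np

def per_class_accuracy(y_true, y_pred, num_classes=7):
    """Overall and per-class accuracy for the seven FER2013 emotion labels.
    Comparing these values before and after adding CycleGAN-generated
    images shows whether the target (minority) class actually improved."""
    accs = {}
    for c in range(num_classes):
        mask = (y_true == c)
        accs[c] = float((y_pred[mask] == c).mean()) if mask.any() else float("nan")
    overall = float((y_true == y_pred).mean())
    return overall, accs
```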