Facial Image Analysis by CNN with Heterogeneous Learning

Abstract

Recognition of facial attributes such as facial points, gender, and age has been used in marketing strategies and client services on social networks. In general, recognizing these attributes requires independent handcrafted features and classifiers for each task. Heterogeneous learning can train a single classifier to perform multiple tasks. This learning method trains regression and recognition tasks simultaneously, thereby reducing both training and testing time. However, differences between the training errors of the tasks negatively affect the training process for specific tasks. To address this problem, we propose weighted heterogeneous learning, which uses a weighted error function for a deep convolutional neural network. Our method outperformed the conventional method in facial attribute recognition, especially for regression tasks such as facial point detection, age estimation, and smile ratio estimation.

Weighted Heterogeneous Learning

Conventional heterogeneous learning computes a training error for each task. Because the label ranges of regression tasks and recognition tasks differ, the magnitudes of these training errors differ as well. The proposed method stabilizes the training error by weighting each task, which improves the performance of heterogeneous learning.

In conventional heterogeneous learning, the training error differs from task to task, and the training error of the recognition task varies suddenly, as shown in the left figure. In addition, the training errors of the proposed method are lower overall for each task than those of conventional heterogeneous learning. The proposed method brings the training errors of the tasks to a similar level and suppresses the variation in training error. It achieves this stable training by weighting the error function.
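The exact form of the weighted error function is not given here. As a minimal sketch, assume one scalar weight per task, a squared error for regression tasks, and a cross-entropy error for recognition tasks. The function names, the weights, and the sample values below are illustrative assumptions, not the authors' implementation.

```python
import math

def squared_error(pred, target):
    """Mean squared error for a regression task (e.g. age, smile ratio)."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def cross_entropy(probs, label):
    """Cross-entropy error for a recognition task, given class probabilities."""
    return -math.log(max(probs[label], 1e-12))

def weighted_total_error(task_errors, task_weights):
    """Weighted sum of per-task errors (weights are hypothetical hyperparameters)."""
    return sum(task_weights[name] * err for name, err in task_errors.items())

# Illustrative per-task errors for one sample. The age label is in years,
# so its squared error is far larger than the errors of the other tasks.
errors = {
    "age": squared_error([34.0], [30.0]),
    "smile": squared_error([0.6], [0.5]),
    "gender": cross_entropy([0.8, 0.2], 0),
}

# Conventional case: every task contributes with weight 1.
unweighted = weighted_total_error(errors, {name: 1.0 for name in errors})

# Weighted case: a small (assumed) weight scales down the wide-range task.
weights = {"age": 0.01, "smile": 1.0, "gender": 1.0}
weighted = weighted_total_error(errors, weights)

# Share of the total error that comes from the age task in each case.
age_share_unweighted = errors["age"] / unweighted
age_share_weighted = weights["age"] * errors["age"] / weighted
```

With these sample values the age task accounts for nearly all of the unweighted total, while the weighted total is spread more evenly across tasks. In this sketch the weights are fixed by hand; how the weights are actually chosen is not stated in this text.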

Conventional Heterogeneous Learning

Weighted Heterogeneous Learning

The left figure shows examples of facial image analysis using conventional heterogeneous learning and the proposed method. The first and third columns show results of conventional heterogeneous learning, and the second and fourth columns show results of the proposed method. The text to the right of each image gives the results of subtasks such as gender, age, race, and smile ratio. The green points are the facial points detected by each method. Red text marks inaccurate recognition or estimation. As the left figure shows, the proposed method is robust to faces with large pose variation, lighting changes, and severe occlusion. Additionally, our method can run a real-time demo on a CPU machine.

Recognition results

Demo result

Bibtex

@inproceedings{Fukui2016,
author = {Hiroshi Fukui and Takayoshi Yamashita and Yuu Kato and Ryo Matsui and Takanori Ogata and Yuji Yamauchi and Hironobu Fujiyoshi},
booktitle = {Workshop on Facial Informatics on Asian Conference on Computer Vision},
title = {{Multiple Facial Attributes Estimation based on Weighted Heterogeneous Learning}},
year = {2016}
}


@inproceedings{Fukui2017,
author = {Hiroshi Fukui and Takayoshi Yamashita and Yuu Kato and Ryo Matsui and Yuji Yamauchi and Hironobu Fujiyoshi},
booktitle = {International Workshop on Advanced Image Technology},
title = {{Facial Image Analysis by CNN with Heterogeneous Learning}},
year = {2017}
}