Yusuke Hirota Yuta Nakashima Noa Garcia
Osaka University
Accepted at CVPR 2022 (oral)
[paper] [code] [video]
Do image captioning models amplify societal bias? Yes, they do.
We propose a metric to quantify societal bias in image captioning models.
We study societal bias amplification in image captioning. Image captioning models have been shown to perpetuate gender and racial biases; however, metrics to measure, quantify, and evaluate societal bias in captions are not yet standardized. We provide a comprehensive study of the strengths and limitations of existing metrics, and propose LIC, a metric to study captioning bias amplification. We argue that, for image captioning, it is not enough to focus on the correct prediction of the protected attribute: the whole context should be taken into account. We conduct an extensive evaluation of traditional and state-of-the-art image captioning models and, surprisingly, find that by focusing only on the prediction of the protected attribute, bias mitigation models are unexpectedly amplifying bias.
LIC measures societal bias amplification of captioning models by analyzing the whole sentence.
Two classifiers are trained to predict the attribute of the person in the image: one from human-annotated captions and one from model-generated captions.
Attribute-revealing words are masked before the captions are fed into the classifiers.
If attribute-revealing words are masked, a classifier should not be able to predict human attributes unless the rest of the caption is biased.
To compute bias amplification, compare the accuracies of the two classifiers, as in the sketch below.
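A minimal sketch of this comparison, under simplifying assumptions: the paper trains neural caption encoders (e.g., LSTM or BERT) and weights accuracy by classifier confidence, while here a bag-of-words logistic regression and plain accuracy stand in for both. The word list `GENDER_WORDS` and the helper names `mask_attribute_words`, `masked_classifier_accuracy`, and `lic_score` are illustrative placeholders, not the paper's code.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Illustrative (non-exhaustive) list of attribute-revealing words for gender.
GENDER_WORDS = {"man", "men", "woman", "women", "he", "she", "his", "her",
                "male", "female", "boy", "girl", "gentleman", "lady"}

def mask_attribute_words(caption: str) -> str:
    """Replace attribute-revealing words with a placeholder token."""
    return " ".join("<mask>" if w.lower() in GENDER_WORDS else w
                    for w in caption.split())

def masked_classifier_accuracy(captions, labels):
    """Train an attribute classifier on masked captions; return held-out accuracy."""
    masked = [mask_attribute_words(c) for c in captions]
    X_train, X_test, y_train, y_test = train_test_split(
        masked, labels, test_size=0.5, random_state=0)
    vectorizer = CountVectorizer()
    clf = LogisticRegression(max_iter=1000)
    clf.fit(vectorizer.fit_transform(X_train), y_train)
    return clf.score(vectorizer.transform(X_test), y_test)

def lic_score(human_captions, model_captions, labels):
    """LIC-style bias amplification: accuracy on model captions minus
    accuracy on human captions, both with attribute words masked."""
    lic_d = masked_classifier_accuracy(human_captions, labels)  # human captions
    lic_m = masked_classifier_accuracy(model_captions, labels)  # model captions
    return lic_m - lic_d  # > 0 suggests the captioning model amplifies bias
```

The sign of the score carries the interpretation: if the classifier recovers the protected attribute from masked model captions more easily than from masked human captions, the captioning model has injected attribute-correlated context beyond what the data itself contains.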
All the evaluated captioning models amplify gender (left) and racial (right) bias.
@InProceedings{hirota2022quantifying,
    title     = {Quantifying Societal Bias Amplification in Image Captioning},
    author    = {Hirota, Yusuke and Nakashima, Yuta and Garcia, Noa},
    booktitle = {CVPR},
    year      = {2022}
}