First International Workshop on Bias Estimation in Face Analytics

In conjunction with ECCV 2018

NEWS: thanks to our sponsors, the workshop will have a monetary prize associated with the Best Result Award

Many publicly available face analytics datasets are responsible for great progress in face recognition. These datasets serve as a source of large amounts of training data as well as a means of assessing the performance of competing state-of-the-art algorithms. Performance saturation on such datasets has led the community to believe that the face recognition and attribute estimation problems are close to being solved, with various commercial offerings stemming from models trained on such data.

However, such datasets present significant biases in terms of both subjects and image quality, creating a significant gap between their distribution and data coming from the real world. For example, many of the publicly available datasets underrepresent certain ethnic communities and overrepresent others. Most datasets are heavily skewed in age distribution. Many variations have been observed to affect face recognition, including pose, low resolution, occlusion, age, expression, decorations and disguise. Systems based on a skewed training dataset are bound to produce skewed results. This mismatch is evidenced by the significant drop in performance of state-of-the-art models trained on those datasets when they are applied to images presenting lower resolution, poor illumination, or particular gender and/or ethnicity groups [1, 2]. It has been shown that such biases may seriously affect performance in challenging situations where the outcome is critical either for the subject or for a community. Research evaluations are often unaware of these issues and focus instead on saturating performance on skewed datasets.

In order to progress toward fair face recognition and attribute estimation truly in the wild, we propose a new challenge which focuses on a well-balanced dataset across multiple factors: age, gender, ethnicity, pose and resolution.

The goal of this workshop is to:

· Evaluate the current state of face analytics algorithms (face recognition, attribute estimation) to assess their inherent bias on a diverse dataset that is as unbiased as possible

· Facilitate the creation of bias-aware and (as much as possible) bias-free models that can achieve and maintain consistently high performance across multiple groups

To this end, we are creating a test dataset balanced across multiple dimensions (age, gender, ethnicity, resolution, pose) and will run a competition in which groups will be allowed to train models on whichever data they deem appropriate, but will be evaluated on our balanced test set. This dataset is not intended for training, but only for evaluation. A comparative study of the results is expected to suggest directions for improving fair face recognition and attribute prediction, including training dataset collection techniques as well as model architectures, algorithms and evaluation protocols.

During the workshop we will present the state of the art achieved on the dataset, together with some baselines which we will run using commercial engines and publicly available state-of-the-art models.

More broadly, the hope is that the workshop will make researchers aware of such bias and stimulate discussion and ideas in the community on how to tackle this very important but largely neglected issue.

[1] Joy Buolamwini and Timnit Gebru. Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification, in Conference on Fairness, Accountability, and Transparency, 2018

[2] https://www.technologyreview.com/s/601786/are-face-recognition-systems-accurate-depends-on-your-race/

NOTE ON THE COMPETITION DATASET

The BEFA dataset is intended for test purposes only. Competitors should train their models independently of the test dataset, on whichever training data they deem appropriate. The BEFA dataset is expected to contain fewer than 50K images, so running an existing model on it should be very efficient for participants. The test protocol will evaluate the performance of attribute prediction (age, gender, ethnicity) across all intersections of the dataset's five subdivisions (age, gender, ethnicity, pose, resolution).
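As a rough illustration of what evaluating "across all intersections" could look like, the Python sketch below groups test images by their five subdivision labels and reports accuracy for one predicted attribute in each group. It is only a sketch under stated assumptions: the record layout, the label values, and the function names `accuracy_by_intersection`, `accuracy_gap` and `make_record` are hypothetical and are not part of the official BEFA protocol, whose details have not yet been published.

```python
from collections import defaultdict

# The five subdivisions named in the text. The label values used in the
# sample data below are invented for illustration only.
FACTORS = ("age", "gender", "ethnicity", "pose", "resolution")


def accuracy_by_intersection(records, attribute):
    """Return {intersection: accuracy} for one predicted attribute.

    Each record is a dict holding a ground-truth label for every entry in
    FACTORS plus a "pred" dict with the model's predicted attribute labels
    (an assumed layout, not an official submission format).
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for rec in records:
        # One cell per combination of the five subdivision labels.
        cell = tuple(rec[f] for f in FACTORS)
        totals[cell] += 1
        hits[cell] += int(rec["pred"][attribute] == rec[attribute])
    return {cell: hits[cell] / totals[cell] for cell in totals}


def accuracy_gap(per_cell):
    """Difference between the best and worst intersection accuracy."""
    values = list(per_cell.values())
    return max(values) - min(values)


def make_record(age, gender, ethnicity, pose, resolution, pred_gender):
    """Build one hypothetical test record with a gender prediction."""
    return {
        "age": age,
        "gender": gender,
        "ethnicity": ethnicity,
        "pose": pose,
        "resolution": resolution,
        "pred": {"gender": pred_gender},
    }


# Four made-up images falling into two intersections.
sample = [
    make_record("20-29", "female", "group_a", "frontal", "high", "female"),
    make_record("20-29", "female", "group_a", "frontal", "high", "male"),
    make_record("20-29", "male", "group_a", "frontal", "high", "male"),
    make_record("20-29", "male", "group_a", "frontal", "high", "male"),
]

per_cell = accuracy_by_intersection(sample, "gender")
for cell, acc in sorted(per_cell.items()):
    print(cell, acc)
print("gap between best and worst intersection:", accuracy_gap(per_cell))
```

Reporting one number per intersection, rather than a single overall accuracy, is what makes a disparity visible: a model can score well on average while performing poorly on one particular combination of groups.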

DATES (full details on competition coming soon)

Competition Registration: June 30, 2018

To register, send an email with the subject "BEFA Competition Registration" to ratha@us.ibm.com with your team name and the details of a reference contact (name, affiliation, email address).

Test Dataset released: ~~July 7, 2018~~ July 20, 2018

Results submission: ~~August 7, 2018~~ August 20, 2018

Paper submission: ~~August 15, 2018~~ August 23, 2018

Paper review and decision: ~~August 22, 2018~~ August 28, 2018

Camera ready papers due: August 30, 2018

Results announced: ~~August 22, 2018~~ September 10, 2018