Competition Overview


Questions of fairness and inclusivity in machine learning have attracted heightened attention in recent years and have rapidly grown into a research area in their own right.

To provide additional empirical grounding and a venue for head-to-head comparison of new methods, the InclusiveImages competition encourages researchers to develop modeling techniques that reduce the biases that may be encoded in large data sets. In particular, this competition is focused on the challenge of geographic skew encountered when the geographic distribution of training images does not fully represent levels of diversity encountered at test or inference time.

How the Competition Works

Concretely, researchers in this competition will train on Open Images [2], a large, multilabel, publicly available image classification dataset that has been found to exhibit geographic skew. They will be evaluated on InclusiveImages, an image classification dataset collected with explicit inclusion goals and designed as a stress test of a model's ability to generalize to images from geographic areas under-represented in the training data.

In addition to the Open Images training set, competitors will have access to a large, open-source data set of textual information that may provide additional context and help a model generalize to other geographic distributions. Competitors will be instructed to assume a geographic shift between training and evaluation data, but will not be given all the details of that shift. This mimics the real-world situation in which a model is deployed in an environment markedly different from the one in which it was trained, as is often the case when local distributions differ from the global one.

How to Address Location Representation

Competitors should assume that locations over-represented during training may not have the same level of representation at test time, and that their models will be explicitly stress-tested on images from some locations that are under-represented during training. Competitors will be able to validate their submissions on a validation set that has this property, and will then be tested on a final evaluation set that exhibits this property in a different way.
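
One way to approximate this kind of stress test locally is to score a model separately for each location group and look at the weakest group rather than the overall average. The following is only a minimal sketch under stated assumptions: the region tags (`region_a`, `region_b`), the example labels, and the choice of a per-image F1 score are all hypothetical illustrations, not the competition's data format or official metric, and the competition itself does not disclose the details of the shift.

```python
from collections import defaultdict


def f1_for_image(true_labels, predicted_labels):
    """F1 between the true and predicted label sets of one multilabel image."""
    true_set, pred_set = set(true_labels), set(predicted_labels)
    if not true_set and not pred_set:
        return 1.0  # nothing to predict and nothing predicted
    overlap = len(true_set & pred_set)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_set)
    recall = overlap / len(true_set)
    return 2 * precision * recall / (precision + recall)


def mean_f1_by_group(examples):
    """Average per-image F1 within each group.

    `examples` is an iterable of (group, true_labels, predicted_labels).
    The group tag is an assumption: it stands in for whatever location
    information a competitor has for their own held-out images.
    """
    scores = defaultdict(list)
    for group, true_labels, predicted_labels in examples:
        scores[group].append(f1_for_image(true_labels, predicted_labels))
    return {group: sum(vals) / len(vals) for group, vals in scores.items()}


# Hypothetical held-out examples tagged with made-up region names.
examples = [
    ("region_a", ["bride", "dress"], ["bride", "dress"]),
    ("region_a", ["market"], ["market", "food"]),
    ("region_b", ["bride", "dress"], ["person"]),
    ("region_b", ["market"], ["market"]),
]

per_group = mean_f1_by_group(examples)
worst_group = min(per_group, key=per_group.get)
print(per_group, "weakest group:", worst_group)
```

A large gap between the best and worst group in such a breakdown suggests the model leans on patterns specific to the over-represented locations, which is the failure mode the evaluation is designed to expose.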