Cross-Genre Gender Prediction in Italian

Shared Task at EVALITA 2018

GxG (Gender X-Genre) is an author profiling task on gender prediction for Italian texts, with a specific focus on cross-genre performance, organised as part of EVALITA 2018.

State-of-the-art accuracy for gender prediction on English Twitter data, the most common language and platform used for this task, is approximately 80-85% (Rangel et al., 2015; Alvarez-Carmona et al., 2015; Rangel et al., 2017; Basile et al., 2017), as obtained at the yearly PAN evaluation campaigns. The 2016 PAN evaluation campaign introduced a cross-genre setting for gender prediction on English, Spanish, and Dutch, in which the best scores averaged less than 60% accuracy (Rangel et al., 2016). These scores were achieved by training models on tweets and testing them on data from a different source within the social media domain, namely blogs. To further explore the cross-genre issue, Medvedeva et al. (2017) ran additional experiments on PAN data from previous years, using the model that had achieved the best results at the cross-genre PAN 2016 challenge (Busger op Vollenbroek et al., 2016). The resulting picture of cross-genre accuracy is mixed, and ultimately shows that models are not yet general enough to capture gender accurately across different datasets.

This is evidence that we have not yet found dataset-independent features that capture the ways in which females and males might write differently. (It might also lead us to wonder whether this is a valid assumption at all.)

This task is designed to address this issue. Indeed, if we can make gender prediction stable across very different genres, then we are more likely to have captured deeper gender-specific traits rather than dataset characteristics. As a by-product, this task will yield a variety of models for gender prediction in Italian, and will shed light on which genres favour or discourage gender expression, judged by whether they are easier or harder to model.
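
To make the cross-genre setting concrete, the following is a minimal sketch of the evaluation protocol described above: train on one genre, test on every other genre, and report accuracy for each pair. It is not the official GxG evaluation code. The function names, the "F"/"M" label encoding, the majority-class baseline, and the toy texts are all assumptions introduced for illustration only.

```python
from collections import Counter


def train_majority_baseline(examples):
    """Return a classifier that always predicts the most frequent training label.

    A stand-in (assumption) for a real gender prediction model.
    """
    label_counts = Counter(label for _, label in examples)
    majority_label = label_counts.most_common(1)[0][0]
    return lambda text: majority_label


def accuracy(classifier, examples):
    """Fraction of (text, label) pairs for which the prediction matches the label."""
    correct = sum(1 for text, label in examples if classifier(text) == label)
    return correct / len(examples)


def cross_genre_scores(datasets, train_fn):
    """Train on each genre and test on every *other* genre.

    Returns a dict mapping (train_genre, test_genre) to accuracy.
    """
    scores = {}
    for train_genre, train_examples in datasets.items():
        classifier = train_fn(train_examples)
        for test_genre, test_examples in datasets.items():
            if test_genre != train_genre:
                scores[(train_genre, test_genre)] = accuracy(classifier, test_examples)
    return scores


# Hypothetical toy data: placeholder texts with made-up labels, not task data.
toy_datasets = {
    "twitter": [("testo a", "F"), ("testo b", "F"), ("testo c", "M")],
    "blogs": [("testo d", "M"), ("testo e", "M"), ("testo f", "F"), ("testo g", "M")],
}

scores = cross_genre_scores(toy_datasets, train_majority_baseline)
for (train_genre, test_genre), score in sorted(scores.items()):
    print(f"train={train_genre} test={test_genre} accuracy={score:.2f}")
```

Any function that maps a list of (text, label) pairs to a classifier can be passed in place of `train_majority_baseline`, so the same loop could compare how well different models transfer between genres.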