Data & Evaluation

Data description

The DIMEMEX dataset consists of around 3,000 memes, compiled from public Facebook groups rooted in Mexico and manually annotated as hate speech, inappropriate content, or harmless content.

Hate speech, inappropriate content, and harmless content are the classes considered in Task 1 (Detection of Hate Speech, Inappropriate, and Harmless Memes). During the labeling process, classism, sexism, racism, and others were treated as sub-phenomena of the hate speech class; these are the subclasses used in Task 2 (Finer-grained Detection of Hate Speech in Memes).
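The relationship between the two label sets can be sketched as a simple mapping from the finer-grained subclasses to the Task 1 classes. This is only an illustration: the label strings and the helper name `to_task1_label` are hypothetical and are not the official identifiers used in the dataset files.

```python
# Hypothetical label strings; the official dataset may encode labels differently.
TASK1_LABELS = ("hate_speech", "inappropriate", "harmless")
HATE_SPEECH_SUBCLASSES = ("classism", "sexism", "racism", "others")


def to_task1_label(label: str) -> str:
    """Collapse a label to its Task 1 class.

    Hate speech subclasses map to "hate_speech"; Task 1 classes map to themselves.
    """
    if label in HATE_SPEECH_SUBCLASSES:
        return "hate_speech"
    if label in TASK1_LABELS:
        return label
    raise ValueError(f"Unknown label: {label!r}")


if __name__ == "__main__":
    for name in ("racism", "inappropriate", "harmless"):
        print(name, "->", to_task1_label(name))
```

A mapping like this makes it possible to derive Task 1 predictions from a model trained on the finer-grained labels, if a participant chooses to approach both tasks with one system.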

Sample memes from each category

Warning: These samples may be offensive to some readers. They do not represent the perspectives of the authors.

Category: Harmless

Category: Inappropriate

Category: Hate Speech
Subcategory: Classism

Category: Hate Speech
Subcategory: Racism

Category: Hate Speech
Subcategory: Sexism

Category: Hate Speech
Subcategory: Others

Evaluation details

Both subtasks rely on the DIMEMEX dataset, so participants can join either task or both.

Submissions will be evaluated on the test partition, mainly using the macro F1 and weighted macro F1 measures.
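For reference, the two measures can be sketched in plain Python as below. This is a minimal illustration, not the official scoring script: it assumes that the weighted variant averages per-class F1 scores by class support (the number of true instances per class), as is common practice, and the label strings in the example are hypothetical.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return a dict mapping each label to its F1 score."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by class support in y_true (assumed definition)."""
    scores = per_class_f1(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[label] * support[label] for label in scores) / len(y_true)


if __name__ == "__main__":
    # Toy example with hypothetical label strings.
    y_true = ["harmless", "harmless", "inappropriate", "hate_speech"]
    y_pred = ["harmless", "inappropriate", "inappropriate", "hate_speech"]
    print(f"macro F1:    {macro_f1(y_true, y_pred):.4f}")
    print(f"weighted F1: {weighted_f1(y_true, y_pred):.4f}")
```

Macro F1 treats every class equally regardless of size, so it rewards systems that perform well on the less frequent classes, while the support-weighted variant reflects the class distribution of the test partition.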

Both tasks will be run on the CodaLab platform. Baseline results will also be released for both tasks, covering each available input setting (text, image, text-image).