Data & Evaluation
Data description
The DIMEMEX dataset consists of around 3,000 memes compiled from public Facebook groups rooted in Mexico and manually annotated as hate speech, inappropriate content, or harmless content.
Hate speech, inappropriate content, and harmless content are the classes considered for Task 1, Detection of Hate Speech, Inappropriate, and Harmless Memes. During the labeling process, classism, sexism, racism, and others were treated as sub-phenomena of the hate speech class; these are the subclasses used in Task 2, Finer-grained detection of Hate Speech in Memes.
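The two-level label scheme described above can be sketched as a small data structure. This is only an illustration: the label strings, the constant names, and the `is_valid_annotation` helper are hypothetical, and the official data files may encode the labels differently. The sketch also assumes that every hate speech meme carries exactly one subclass and that memes in the other classes carry none.

```python
# Hypothetical label strings; the released files may use other encodings.
TASK1_CLASSES = ("hate_speech", "inappropriate", "harmless")

# Subclasses apply only to the hate speech class (Task 2).
HATE_SPEECH_SUBCLASSES = ("classism", "sexism", "racism", "others")


def is_valid_annotation(category, subcategory=None):
    """Check that a (category, subcategory) pair respects the hierarchy.

    Assumption: hate speech memes always have a subclass, and memes in
    the other two classes never do.
    """
    if category not in TASK1_CLASSES:
        return False
    if category == "hate_speech":
        return subcategory in HATE_SPEECH_SUBCLASSES
    return subcategory is None
```

For example, `is_valid_annotation("hate_speech", "racism")` holds under these assumptions, while `is_valid_annotation("harmless", "racism")` does not.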
Sample memes from each category
Warning: These samples may be offensive to some readers; they do not represent the perspectives of the authors.
Category: Harmless
Category: Inappropriate
Category: Hate Speech
Subcategory: Classism
Category: Hate Speech
Subcategory: Racism
Category: Hate Speech
Subcategory: Sexism
Category: Hate Speech
Subcategory: Others
Evaluation details
Both subtasks rely on the DIMEMEX dataset, so participants may join either or both tasks.
Submissions will be evaluated on the test partition, primarily with the macro F1 and weighted macro F1 measures.
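To make the two measures concrete, here is a minimal sketch of how macro F1 and weighted F1 are commonly computed. It is not the official scoring script: the function name `f1_scores` and the label strings are assumptions, and "weighted" is taken to mean that each class's F1 is weighted by its number of true instances (the same convention as scikit-learn's `f1_score` with `average="weighted"`).

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (macro_f1, weighted_f1) for two equal-length label sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one example is required")

    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)  # number of true instances per class
    per_class = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        per_class[label] = 2 * tp / denom if denom else 0.0

    # Macro: every class counts equally, regardless of its size.
    macro = sum(per_class.values()) / len(labels)
    # Weighted: each class counts in proportion to its true instances.
    weighted = sum(per_class[c] * support[c] for c in labels) / len(y_true)
    return macro, weighted


# Toy example with hypothetical labels: the single inappropriate meme is missed.
y_true = ["harmless"] * 4 + ["inappropriate", "hate_speech"]
y_pred = ["harmless"] * 5 + ["hate_speech"]
macro_f1, weighted_f1 = f1_scores(y_true, y_pred)
print(f"macro F1 = {macro_f1:.4f}, weighted F1 = {weighted_f1:.4f}")
# prints: macro F1 = 0.6296, weighted F1 = 0.7593
```

In the toy example, missing the only inappropriate meme pulls the macro score down much more than the weighted score, which is why macro F1 is informative when classes are imbalanced.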
Both tasks will be run on the CodaLab platform. Baseline results will also be released for both tasks, using either one or both of the available modalities (text only, image only, or text and image combined).