Home

This is the website for OffensEval, a series of shared tasks on offensive language identification organized at the International Workshop on Semantic Evaluation (SemEval). OffensEval models offensive content using the hierarchical annotation scheme described in Zampieri et al. (2019), which focuses on the type and target of offensive content.

You can find information about previous editions of OffensEval:

OffensEval 2020: Multilingual Offensive Language Identification in Social Media (SemEval-2020 Task 12) [webpage] [report]
OffensEval 2019: Identifying and Categorizing Offensive Language in Social Media (SemEval-2019 Task 6) [webpage] [report]

You can also find information about the English datasets used at OffensEval:

OLID: Offensive Language Identification Dataset used in OffensEval 2019 [webpage] [paper]
SOLID: Semi-Supervised Offensive Language Identification Dataset used in OffensEval 2020 [webpage] [paper]

Finally, on this page, you will find links to datasets in other languages. OffensEval 2020 featured Arabic, Danish, Greek, and Turkish datasets.

Please also explore our more recent publications:

TBO: Target-Based Offensive Language Identification [dataset] [paper]
OffensEval 2023: Offensive language identification in the age of Large Language Models [paper]
