The shared task will feature two subtasks:
Violent event identification. Determine whether a given tweet is associated with a violent incident (binary classification).
Violent event category recognition. Recognize the crime category (see above) to which a given tweet belongs (multi-class classification).
Both subtasks will rely on the DA-VINCIS corpus, and participants can approach either or both subtasks. The challenge will be run on the CodaLab platform. Baseline performances will be released for both subtasks.
The DA-VINCIS corpus has been compiled from Twitter by retrieving tweets associated with violent incidents. Of an initial set of 12,000 tweets, 5,000 were retained after filtering and manually annotated using a cloud-based platform. Each tweet in the dataset has been labeled by at least 3 annotators. The following categories were considered in the labeling process:
Accident: Unforeseen event or action that results in involuntary damage to people or things.
Homicide: Deprivation of life.
None of the above: Selected when there is no crime reported in the tweet. Please note that tweets under this category were also retrieved using keywords associated with violent events.
Theft: Seizure or willful destruction of someone else's property without the right to do so and without the consent of the person who can legally dispose of it.
Kidnapping: Deprivation of liberty.
All categories will be used for subtask 2, while only two classes (none-of-the-above vs. the rest) will be considered for subtask 1.
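The relationship between the two label sets can be sketched as follows. This is a minimal illustration, not the organizers' code: the label strings in CATEGORIES and the helper to_binary_label are hypothetical names chosen for the example, and the actual label encoding in the released corpus files may differ.

```python
# Category names follow the list above; the exact strings used in the
# released corpus files are an assumption made for this sketch.
CATEGORIES = ["accident", "homicide", "theft", "kidnapping", "none-of-the-above"]


def to_binary_label(category: str) -> int:
    """Collapse a multi-class category into a binary label.

    Returns 0 for "none-of-the-above" and 1 for any violent-event category.
    """
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category!r}")
    return 0 if category == "none-of-the-above" else 1


# Example: three hypothetical tweet labels mapped to the binary setting.
multi_class_labels = ["theft", "none-of-the-above", "homicide"]
binary_labels = [to_binary_label(c) for c in multi_class_labels]
print(binary_labels)  # [1, 0, 1]
```

Under this mapping, a system trained for category recognition can also produce a prediction for the identification subtask by collapsing its output, although a dedicated binary classifier may perform differently.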
DATA ALREADY AVAILABLE