Task & Data

Task objective

The objective of this task is to build models that determine, from its text alone, whether a news item obtained from Twitter describes a violent incident.

Description of subtasks

The shared task will feature two subtasks:

Violent event identification. Determine whether a given tweet is associated with a violent incident or not (binary classification).
Violent event category recognition. Recognize the crime category (listed under Data description) to which a given tweet belongs (multi-class classification).

Both subtasks will rely on the DA-VINCIS corpus, and participants can take part in either or both subtasks. The challenge will be run on the CodaLab platform. Baseline performances will be released for both subtasks.

Data description

The DA-VINCIS corpus was compiled from Twitter by retrieving tweets associated with violent incidents. From an initial set of 12,000 tweets, 5,000 were selected and manually annotated using a cloud-based platform. Each tweet in the dataset was labeled by at least 3 annotators. The following categories were considered in the labeling process:

Accident: Chance event or action that results in involuntary damage to people or things.
Homicide: Deprivation of life.
None of the above: Selected when there is no crime reported in the tweet. Please note that tweets under this category were also retrieved using keywords associated with violent events.
Theft: Seizure or willful destruction of someone else's property without the right to do so and without the consent of the person who can legally dispose of it.
Kidnapping: Deprivation of liberty.

All five categories will be used for subtask 2, while only two classes (none-of-the-above vs. the rest) will be considered for subtask 1.
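The relation between the two label sets can be illustrated with a minimal sketch. It assumes the category names appear exactly as in the list above; the label strings actually used in the released files may differ, and the names CATEGORIES, NON_VIOLENT, and to_binary are hypothetical and not part of any official task code.

```python
# Category names copied from the data description; the exact strings used
# in the released dataset are an assumption.
CATEGORIES = ["Accident", "Homicide", "Theft", "Kidnapping", "None of the above"]
NON_VIOLENT = "None of the above"


def to_binary(category: str) -> int:
    """Return 0 for 'None of the above' and 1 for every other category."""
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category!r}")
    return 0 if category == NON_VIOLENT else 1


# Hypothetical category labels for three tweets.
labels = ["Theft", "None of the above", "Homicide"]
binary = [to_binary(c) for c in labels]
```

Under these assumptions, the multi-class labels are the targets for the category recognition subtask, and the collapsed values in binary are the targets for the identification subtask.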

DATA ALREADY AVAILABLE

Dataset
