CogALex-V Shared Task on the Corpus-Based Identification of Semantic Relations

Discovering whether words are semantically related – and, if so, which kind of relation holds between them – is an important task in Natural Language Processing (NLP) with a wide range of applications, such as Automatic Thesaurus Creation, Ontology Learning, Paraphrase Generation, etc.  Semantic relations also play a central role in human lexical retrieval and may thus shed light on the organization of the mental lexicon.  Corpus-based approaches to semantic relation identification promise an efficient and scalable solution to the NLP task.  At the same time, they may provide a cognitively plausible model for human acquisition of semantic relations.

As part of the 5th CogALex workshop at COLING 2016, we propose a shared task on the corpus-based identification of semantic relations in the form of a “friendly competition”.  Its aim is not to find the team with the best system, but to test different distributional models and other corpus-based approaches on a hard semantic task, and thus gain a better understanding of their respective strengths and weaknesses.  For this reason, both training and test data will be made available.  Participants are expected to submit a short paper (4 pages) describing their approach and evaluation results (using the official scoring scripts), together with the output produced by their system on the test data.

The task is split into two subtasks, which should both be tackled by participating systems.

The input file is the same as for subtask 1. Participating systems are expected to return a TAB-separated file like the example below (but without the header row), in which each word pair is annotated with one relation label from the list above (a gold-standard file with the correct answers is provided in the same format). Word pairs must appear in exactly the same order as in the input file. This subtask is evaluated in terms of precision, recall and F1-score for each of the four semantic relations; the overall score is the weighted average of the four F1-scores.
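As an informal illustration of this evaluation scheme, the sketch below computes per-relation precision, recall and F1 from paired gold and predicted label lists, then averages the F1-scores weighted by gold-standard support. The relation label names (SYN, ANT, HYPER, PART_OF) and the support-weighted averaging are assumptions made for this sketch; the official scoring scripts released with the test data remain the authoritative implementation.

```python
from collections import Counter


def evaluate(gold, pred, labels=("SYN", "ANT", "HYPER", "PART_OF")):
    """Per-label precision/recall/F1 and a support-weighted average F1.

    Label names are illustrative only; substitute the labels used in
    the official data release.
    """
    support = Counter(gold)  # how often each gold label occurs
    scores = {}
    for lab in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == lab and p == lab)
        fp = sum(1 for g, p in zip(gold, pred) if g != lab and p == lab)
        fn = sum(1 for g, p in zip(gold, pred) if g == lab and p != lab)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        scores[lab] = (prec, rec, f1)
    total = sum(support[lab] for lab in labels)
    # Weighted average of the four F1-scores (assumed: weighted by support)
    weighted_f1 = sum(support[lab] * f1 for lab, (_, _, f1) in scores.items()) / total
    return scores, weighted_f1
```

Called with the gold labels from the provided answer file and the labels from a system's output file (read in the same order), this returns one (precision, recall, F1) triple per relation plus the overall weighted score.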

Rules for the Shared Task

We encourage participants to make use of the metadata provided with EVALution for the performance analysis of their systems. However, these data should not be used for training the systems. The metadata contain useful information such as frequency, possible and most frequent POS tags, possible and most common capitalization types and inflections, semantic domain, etc. More details can be found in the README.txt file in the data package.

Submission procedure & formatting instructions

Dataset and Evaluation

We provide a dataset extracted from EVALution 1.0 (Santus et al. 2015), which was developed from WordNet (Fellbaum 1998) and ConceptNet (Liu and Singh 2004), and which was further filtered by native speakers in a CrowdFlower task. Our dataset is split into a training set (released on 8 September 2016) and a test set (released on 26 September 2016). Official evaluation scripts (computing precision, recall and F1-score) will be released together with the test data.

EVALution is a heterogeneous and unbalanced dataset that replicates real-world difficulties. Words in the dataset are not POS-tagged and fall into different frequency ranges. Moreover, words may be used in different senses (even with different POS) within the dataset, so that they may hold different relations depending on the respective sense. In the CrowdFlower task, relations were judged according to the paraphrases (e.g. “X is a kind of Y”) shown in the subtask description.

Please cite the task description paper if you evaluate a system on this dataset:

Santus, Enrico; Gladkova, Anna; Evert, Stefan; Lenci, Alessandro (2016). The CogALex-V shared task on the corpus-based identification of semantic relations. In Proceedings of the 5th Workshop on Cognitive Aspects of the Lexicon (CogALex-V), pages 69–79, Osaka, Japan. [PDF]

Important Dates

Organizers

References