CogALex 2020 - Shared Task

Monolingual and Multilingual Identification

of Paradigmatic Semantic Relations

Discovering whether words are semantically related – and, if so, which kind of relation holds between them – is an important task in Natural Language Processing (NLP) with a wide range of applications, such as Automatic Thesaurus Creation, Ontology Learning, Paraphrase Generation, etc. (Santus, 2016). Semantic relations also play a central role in human lexical retrieval and may be important for the organization of the mental lexicon (Murphy, 2003; McRae et al., 2012).

Following the success of the shared task on the corpus-based identification of semantic relations at CogALex-V (Santus et al., 2016), we propose a second edition of the task, this time with an added multilingual component. Our special focus will be on paradigmatic semantic relations, such as synonymy, antonymy and hypernymy, which are notoriously difficult to distinguish with the classical word embedding models used in NLP (Schulte im Walde, 2020).

Participating teams are expected to submit a short paper (4 pages) describing their approach and the evaluation results (using the official scoring scripts), together with the output produced by their system on the test data.

Task 1: Monolingual Identification of Semantic Relations

We provide training and validation data for the following languages: English, German and Mandarin Chinese. The data format is the same for all languages: a word pair and the semantic relation holding between the words. The semantic relations in the data are exemplified in the table below:

Word1        Word2      Relation Label   Description
avoid        dodge      SYN (synonym)    Word1 can be used with the same meaning as Word2
outstanding  mediocre   ANT (antonym)    Word1 can be used as the opposite of Word2
caffeine     stimulant  HYP (hypernym)   Word1 is a more specific word, i.e., a kind of Word2
clown        starry     RANDOM           Word1 and Word2 are unrelated

Given the TAB-separated input file with word pairs, participating systems must add a third column specifying which relation holds between the two words. The output should be another TAB-separated file, without the header row, in which each word pair is followed by the label assigned by the system (gold standard files with the correct answers are provided in the same format). The word pairs must appear in exactly the same order as in the input file.
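As an illustration, the expected input/output handling can be sketched in Python; the `predict_relation` function here is a hypothetical stand-in for a participant's actual classifier:

```python
import csv

def predict_relation(word1, word2):
    """Hypothetical placeholder for a participant's classifier."""
    return "RANDOM"

def write_predictions(input_path, output_path):
    # Read the TAB-separated word pairs, skip the header row, and write
    # each pair followed by the predicted label, preserving the order.
    with open(input_path, newline="", encoding="utf-8") as fin, \
         open(output_path, "w", newline="", encoding="utf-8") as fout:
        reader = csv.reader(fin, delimiter="\t")
        writer = csv.writer(fout, delimiter="\t")
        next(reader)  # drop the header row (absent from the output file)
        for word1, word2, *rest in reader:
            writer.writerow([word1, word2, predict_relation(word1, word2)])
```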

Each participating team can join the competition for one or more languages. Once the final test sets are released, the performance of the participating systems will be evaluated separately for each language. For each language track, the final score will be assessed as the weighted average of the F1-scores for the three semantic relations SYN, ANT and HYP (the score for the RANDOM relation will not be considered).
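The evaluation metric can be sketched as follows. This is a minimal re-implementation, not the official scoring script, and it assumes that each relation's F1-score is weighted by its support in the gold standard:

```python
from collections import Counter

def weighted_f1(gold, pred, labels=("SYN", "ANT", "HYP")):
    """Weighted average of per-label F1, ignoring RANDOM.

    A minimal sketch: labels are assumed to be weighted by their
    support in the gold standard; the official script is authoritative.
    """
    support = Counter(g for g in gold if g in labels)
    total = sum(support.values())
    score = 0.0
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        score += support[label] / total * f1
    return score
```

Note that RANDOM pairs still matter indirectly: misclassifying a RANDOM pair as SYN, for instance, lowers the precision of SYN.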

Task 2: Multilingual Identification of Semantic Relations

Given the relatively small size of the datasets, participants are also encouraged to propose new methods for transferring knowledge from one language to another. Examples include approaches based on cross-lingual word embeddings (Ruder et al., 2019), multilingual training (Glavas and Vulic, 2018) and meta-learning (Yu et al., 2020).

The participants will have to use the released training and validation data for English, German and Mandarin Chinese to test the transfer learning capabilities of their systems. Finally, we will release a test set for a surprise language and ask the participants to send their results for the new language. The data format and the evaluation metric will be the same as in Task 1.
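The zero-shot transfer idea can be sketched as follows. This toy example assumes that word embeddings for all languages have already been mapped into one shared cross-lingual space (e.g. with the methods surveyed by Ruder et al., 2019), which is simulated here by giving translation equivalents near-identical random vectors; a classifier trained on English pairs is then applied unchanged to a German pair:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
DIM = 50

# Toy stand-in for a shared cross-lingual embedding space.
shared = {}
def embed(word):
    if word not in shared:
        shared[word] = rng.standard_normal(DIM)
    return shared[word]

# Simulate aligned embeddings: the German words land next to
# their English translation equivalents in the shared space.
shared["vermeiden"] = embed("avoid") + 0.01 * rng.standard_normal(DIM)
shared["ausweichen"] = embed("dodge") + 0.01 * rng.standard_normal(DIM)

def featurize(pairs):
    return np.array([np.concatenate([embed(a), embed(b)]) for a, b in pairs])

en_pairs = [("avoid", "dodge"), ("outstanding", "mediocre")]
en_labels = ["SYN", "ANT"]
clf = LogisticRegression(max_iter=1000).fit(featurize(en_pairs), en_labels)

# Zero-shot: classify a German pair with the English-trained model.
prediction = clf.predict(featurize([("vermeiden", "ausweichen")]))
```

Real submissions would of course replace the random vectors with pretrained, properly aligned embeddings, or use multilingual contextual models instead.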

Extra Rules

  1. Participating systems are allowed to make use of existing knowledge bases and semantic resources, with the exception of WordNet and ConceptNet.

  2. There are no restrictions on the corpus data used.

  3. Only a single final system for each participating team can be reported as the official score in the submitted paper. Participants are encouraged to carry out additional post-hoc experiments on the test data, which may also be included in the paper.

  4. Participants must submit a complete paper with evaluation scores computed by the official scoring scripts, as well as the output of their final system on the test data. We will check that the system output is consistent with the scores given in the paper.

Data Release

The development data released to the participants include:

  1. Training and validation datasets for English, German and Mandarin Chinese;

  2. An evaluation script to assess the performance of a system, given an output file in the format specified above;

  3. A script to run a baseline, consisting of a Support Vector Machine that uses the concatenation of the embeddings of the two words in each pair as features.
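The baseline described in item 3 can be sketched with scikit-learn. The random embedding table below is purely a stand-in for the pretrained word embeddings the real baseline would load:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
DIM = 50

# Stand-in embedding table: the real baseline would use pretrained
# word embeddings for the task language instead of random vectors.
vocab = {}
def embed(word):
    if word not in vocab:
        vocab[word] = rng.standard_normal(DIM)
    return vocab[word]

def featurize(pairs):
    # Baseline feature: concatenation of the two word vectors.
    return np.array([np.concatenate([embed(w1), embed(w2)])
                     for w1, w2 in pairs])

train_pairs = [("avoid", "dodge"), ("outstanding", "mediocre"),
               ("caffeine", "stimulant"), ("clown", "starry")]
train_labels = ["SYN", "ANT", "HYP", "RANDOM"]

clf = SVC(kernel="linear")
clf.fit(featurize(train_pairs), train_labels)
```

Note that this feature scheme is order-sensitive, which matters for the asymmetric HYP relation: swapping Word1 and Word2 yields a different feature vector.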


Please submit the prediction outputs via email. You are requested to upload a zipped folder containing the output files and a README file with a short system description.

System Results

1) English

System             Owner               Overall Precision  Overall Recall  Overall F1
Text2TCS           Lennart Wachowiak   0.602              0.455           0.517
Transformer-based  Saurav Karmakar     0.563              0.355           0.428
HSemID             Jean-Pierre Colson  0.400              0.276           0.320

2) German

System             Owner               Overall Precision  Overall Recall  Overall F1
Text2TCS           Lennart Wachowiak   0.592              0.435           0.500
HSemID             Jean-Pierre Colson  0.395              0.258           0.312

3) Chinese

System             Owner               Overall Precision  Overall Recall  Overall F1
Text2TCS           Lennart Wachowiak   0.904              0.860           0.881
HSemID             Jean-Pierre Colson  0.501              0.331           0.377

4) Italian

Note: all instances containing the typo "mainconia" are excluded from evaluation.

System             Owner               Overall Precision  Overall Recall  Overall F1
Text2TCS           Lennart Wachowiak   0.557              0.429           0.477
HSemID             Jean-Pierre Colson  0.365              0.296           0.325

Important Dates

  • August 1, 2020: Task Description, Release of Development Data

  • September 1, 2020: Release of Test Data

  • September 24, 2020: Deadline for Results Submission

  • October 1, 2020: Announcement of the Winners

  • October 15, 2020: Deadline for Shared Task Papers Submission

Shared Task Organizers

  • Emmanuele Chersoni (The Hong Kong Polytechnic University)

  • Luca Iacoponi (Amazon)

  • Rong Xiang (The Hong Kong Polytechnic University)

  • Enrico Santus (Bayer)

Main Contact


Glavas and Vulic, 2018. Discriminating between lexico-semantic relations with the specialization tensor model. Proceedings of NAACL.

McRae et al., 2012. Semantic and associative relations: Examining a tenuous dichotomy. The Adolescent Brain: Learning, Reasoning and Decision-Making, edited by Reyna et al., pp. 39-66, APA.

Murphy, 2003. Semantic relations and the lexicon: antonymy, synonymy and other paradigms. Cambridge University Press.

Ruder et al., 2019. A survey of crosslingual word embedding models. Journal of Artificial Intelligence Research, vol. 65, pp. 569-631.

Santus, 2016. Making sense: From word distribution to meaning. PhD Thesis, The Hong Kong Polytechnic University.

Santus et al., 2016. The CogALex-V shared task on the corpus-based identification of semantic relations. Proceedings of the COLING Workshop on the Cognitive Aspects of the Lexicon.

Schulte im Walde, 2020. Distinguishing between paradigmatic semantic relations across word classes: human ratings and distributional similarity. Journal of Language Modelling, vol. 8, n. 1, pp. 53-101.

Yu et al., 2020. Hypernymy detection for low-resource languages via meta learning. Proceedings of ACL.