Complex Word Identification (CWI) Shared Task 2018

Text simplification systems aim to facilitate reading comprehension for different target readerships, such as foreign language learners and native speakers with low literacy levels or various kinds of reading impairments. Identifying which words are considered difficult for a given target population is an important step in building better-performing lexical simplification systems. This step is known as complex word identification (CWI).

Following the success of the first CWI shared task at SemEval 2016, we are organizing the Second CWI Shared Task co-located with the BEA Workshop 2018 at NAACL-HLT in New Orleans, USA.

The Second CWI Shared Task features a multilingual dataset, and participants can choose to take part in one or more of the following tracks:

  • English monolingual CWI
  • German monolingual CWI
  • Spanish monolingual CWI
  • Multilingual CWI with a French test set (the English, Spanish, and German datasets can be used for training)

For more information, please check the Call for Participation.

The CWI Shared Task 2018 has finished. We would like to thank all teams for participating. The rankings for the binary classification and probabilistic classification tasks are available. The datasets, including the gold standard, are available here.

The shared task report and the system description papers will be published in the BEA Workshop 2018 proceedings.

Organizers: Sanja Štajner (University of Mannheim), Chris Biemann (University of Hamburg), Shervin Malmasi (Harvard Medical School), Gustavo Paetzold (University of Sheffield), Lucia Specia (University of Sheffield), Anaïs Tack (Université Catholique de Louvain and KU Leuven), Seid Muhie Yimam (University of Hamburg), Marcos Zampieri (University of Wolverhampton)

Contact: Sanja Štajner - sanja(at)informatik(dot)uni-mannheim(dot)de