We invite you to participate in the 2021 shared task on Text Normalization for Swiss German, which will be held at SwissText 2021.
Written Swiss German is not standardized: it varies across authors and their dialects, and its use is almost exclusively confined to communication on social media or via text messaging. Corpora therefore tend to contain many distinct surface forms for the same word, which can make their analysis challenging. It is thus desirable to be able to normalize these variants to a single common surface form.
We collected Swiss German utterances from social media, and two annotators mapped every token to a corresponding form in Standard German (see examples below). The task is to build models that can perform such a mapping automatically. This differs from translation: because word order is preserved, the resulting normalized utterance will in general not be grammatically correct Standard German.
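To make the token-level setup concrete, here is a minimal sketch of a lookup baseline in Python: each Swiss German token is mapped to the Standard German form it was most often paired with in training, and unseen tokens are left unchanged. This is not the organizers' baseline. The function names, the token pairs (made up for illustration, not taken from the task data), and the use of token-level accuracy as a measure are all assumptions.

```python
from collections import Counter, defaultdict


def train_lookup(pairs):
    """Build a token -> most frequent normalization table.

    `pairs` is an iterable of (swiss_german_token, standard_german_token).
    """
    counts = defaultdict(Counter)
    for source, target in pairs:
        counts[source.lower()][target] += 1
    # Keep only the most frequent target form for each source token.
    return {src: c.most_common(1)[0][0] for src, c in counts.items()}


def normalize(tokens, lookup):
    """Map each token independently; word order is preserved.

    Tokens not seen in training are copied through unchanged.
    """
    return [lookup.get(tok.lower(), tok) for tok in tokens]


def token_accuracy(predicted, gold):
    """Fraction of positions where prediction and gold form agree."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


# Illustrative pairs only (hypothetical, not from the shared task data).
training_pairs = [
    ("isch", "ist"),
    ("nöd", "nicht"),
    ("ned", "nicht"),
    ("ha", "habe"),
    ("ha", "habe"),
    ("ha", "hat"),
]

lookup = train_lookup(training_pairs)
prediction = normalize(["I", "ha", "nöd"], lookup)
```

Because every token is mapped on its own, the output always has the same length and order as the input, which is what distinguishes this setup from translation. A lookup table like this cannot resolve ambiguous tokens from context, so it is best seen as a lower bound.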
A similar effort has previously been undertaken for text messages by the SMS4Science project. There is also a related shared task on lexical normalization of other languages, run as part of the WNUT2021 workshop.
Please fill out the registration form to participate. You will receive the data only after registration.
We provide code for checking and scoring your submissions in this repository.
March 01: Shared Task Start
May 01: Release of Test Data
May 04: Experimental Results Due
May 05: Publication of Evaluation Results
May 15: System Descriptions Due
June 01: Notification of Acceptance
June 10: Camera-ready System Description Due
June 14-16: SwissText 2021 Conference
Contact: vode@zhaw.ch
Organization:
Pius von Däniken, ZHAW InIT
Manuela Hürlimann, ZHAW InIT
Mark Cieliebak, ZHAW InIT