The 2017 shared task on Native Language Identification (NLI) will take place at the BEA-12 workshop.
NLI is the task of identifying the native language
(L1) of a writer based solely on a sample of their writing. The task is
typically framed as a classification problem where the set of L1s is
known a priori. Most work has focused on identifying the native language
of writers learning English as a second language.

Two previous shared tasks on NLI have been organized, in which the task was to identify the native language of non-native speakers of English based on essays and spoken responses they provided during a standardized assessment of academic English proficiency. The first shared task was based on the essays only and was also held with the BEA workshop in 2013. It was a success: with 29 teams competing, it was one of the largest shared tasks that year. Three years later, the Computational Paralinguistics Challenge at Interspeech 2016 hosted a sub-challenge on identifying the native language based solely on the spoken responses.
This year's shared task combines the inputs from the two previous tasks.
There will be three tracks: NLI on the essay only, NLI on the spoken
response only (based on a transcription of the response, not the audio), and NLI using both responses from a test taker. We feel
this will make for a more challenging shared task while building on the
methods and results from the previous two shared tasks. The training
and development data for the shared task will be available in March
2017.

Separately from the three input-based tracks, there will be two tracks with respect to training data, one open and one closed. In the closed track, you can only use the labeled data we provide to train your system. In the open track, you can use any data you want. We allow and encourage submissions to both tracks.

Shared Task Report and System Papers

Links to the shared task report and all of the system description papers will be posted here once they are online.

Cite the Shared Task Report:
@InProceedings{nli2017,
  author    = {Malmasi, Shervin and Evanini, Keelan and Cahill, Aoife and Tetreault, Joel and Pugh, Robert and Hamill, Christopher and Napolitano, Diane and Qian, Yao},
  title     = {{A Report on the 2017 Native Language Identification Shared Task}},
  booktitle = {Proceedings of the 12th Workshop on Building Educational Applications Using NLP},
  month     = {September},
  year      = {2017},
  address   = {Copenhagen, Denmark},
  publisher = {Association for Computational Linguistics}
}

Key Dates
Contact

You can reach the organizers via nlisharedtask@gmail.com.

The organizing committee for the shared task is: