The first edition of a shared task on Native Language Identification (NLI) will take place at the BEA-8 workshop. The shared task will be organized by Joel Tetreault, Aoife Cahill, and Daniel Blanchard. NLI is the task of identifying the native language (L1) of a writer based solely on a sample of their writing. The task is typically framed as a classification problem in which the set of candidate L1s is known a priori. Most work has focused on identifying the native language of writers learning English as a second language. To date, this topic has motivated several ACL and EMNLP papers, as well as a master’s thesis.
NLI can be useful for a number of applications. In educational settings, NLI can be used to provide more targeted feedback to language learners about their errors. It is well known that language learners make different errors depending on their L1s. A writing tutor system that can detect the learner’s native language can tailor its feedback on an error and contrast it with common properties of that language. In addition, native language is often used as a feature in authorship profiling, which is frequently used in forensic linguistics.