Scope and Objective

Eurasia, the largest continental area, comprises all of Europe and Asia. It is home to more than 2,500 languages belonging to seven language families. Despite this rich diversity, many language communities in Eurasia are under-represented, low-resource, endangered and systematically oppressed politically. As a result, many of these languages, such as Kurdish, Gilaki, Santali, Kashmiri, Laz, and Abkhaz, are low-resource, and many others, such as Shabaki, Talysh, Domari, Korbet and Bawm, are endangered and have been the subject of very few studies. One interesting characteristic of these languages is the influence of the communal languages on their lexicon through borrowed words or cognates. Such an influence can also be observed to some extent in the syntax of the languages, despite the typological differences that come from belonging to different language families. Because they rely on a lingua franca, many of these linguistic communities face standardization issues, particularly in their written forms. In many cases, speakers of an under-represented language therefore write it in the script of another language.

This workshop focuses on the development of language technology resources and tools for the indigenous, endangered and lesser-resourced languages of Eurasia.

In a media-centric world where language technology allows people to break cultural and language barriers, it is important that speakers of endangered and indigenous languages are empowered to use these technologies to share their knowledge and culture with the world. To help bridge this gap, the workshop aims to increase visibility and promote research for lesser-resourced and under-represented language communities in Europe and Asia. Through collaboration between NLP researchers, language experts and linguists working on endangered languages in these communities, we aim to create language technology resources that will help to preserve and revive these languages for future generations. Furthermore, the workshop aims to promote the emergence of new methods that benefit three groups: linguists, for instance through the automation of analysis and validation processes; field linguists, through the facilitation of data collection and analysis; and computational linguists, through new techniques for linguistic analysis, such as supervised or weakly supervised methods for analyzing sparsely written or undocumented languages.

The main objective of the workshop is to create basic resources and develop tools for Eurasian languages. Topics of interest include, but are not limited to, the following:

Identify, Describe, and Share your LRs!