Arabic is a challenging language for the field of computational linguistics. This is due to many factors, including its complex and rich morphology, its high degree of ambiguity, and the presence of a number of dialects that vary quite widely. Arabic is also a language with important geopolitical connections. It is spoken by over 400 million people in countries with varying degrees of prosperity and stability. It is the primary language of the latest world refugee problem affecting the Middle East and Europe. The opportunities made possible by working on this language and its dialects should not be underestimated in their consequences for the Arab World, the Mediterranean Region, and the rest of the World.
There has been a lot of progress in the last 20 years in the area of Arabic Natural Language Processing (NLP). Many Arabic NLP (or Arabic NLP-related) workshops and conferences have taken place, both in the Arab World and in association with international conferences. Examples include the following:
The First, Second, Third, Fourth, and Fifth Arabic Natural Language Processing Workshops at EMNLP 2014, ACL 2015, EACL 2017, ACL 2019, and COLING 2020, respectively.
The First, Second, Third, and Fourth Workshops on Arabic Corpora and Processing Tools at LREC 2014, LREC 2016, LREC 2018, and LREC 2020, respectively.
The conference on Arabic Language Resources and Tools (MEDAR-2009, NEMLAR-2004).
The workshop on Computational Approaches to Semitic Languages (LREC 2010, EACL 2009, ACL 2007, ACL 2005, ACL 2002, ACL 1998).
The workshop on Computational Approaches to Arabic Script-based Languages (MTSummit XII 2009, LSA 2007, COLING 2004).
The International Symposium on Computer and Arabic Language (ISCAL 2009, ISCAL 2007).
This workshop follows in the footsteps of these efforts to provide a forum for researchers to share and discuss their ongoing work. This workshop is timely given the continued rise in research projects focusing on Arabic NLP.
We invite submissions on topics of natural language processing that include, but are not limited to, the following:
Basic core technologies: morphological analysis, disambiguation, tokenization, POS tagging, named entity detection, chunking, parsing, semantic role labeling, Arabic dialect modeling, etc.
Applications: machine translation, speech recognition, speech synthesis, optical character recognition, pedagogy, assistive technologies, social media analytics, sentiment analysis, summarization, dialogue systems, etc.
Resources: lexicons, dictionaries, annotated and unannotated corpora, etc.
Submissions may include work in progress, as well as finished work that has not been previously published. Submissions must have a clear focus on specific issues pertaining to the Arabic language, whether it is Standard Arabic, dialectal, or mixed. Papers on other languages that share problems faced by Arabic NLP researchers, such as Semitic languages or languages using Arabic script, are welcome. Additionally, papers on efforts using Arabic resources but targeting other languages are also welcome. Descriptions of commercial systems are welcome, but authors should be willing to discuss the details of their work.
We are delighted to share that we received sponsorship from Google Research.
The funds will be used to cover registration and travel fees for a select number of students.