The Sixth
Arabic Natural Language Processing Workshop

(WANLP 2021)

Location: EACL 2021 (Virtual)

Workshop Proceedings
are Publicly Accessible
on the ACL Anthology


April 19, 2021


wanlp2021.arabic-nlp.net

Opening Remarks

WANLP-2021-Opening.pdf

Closing Remarks

WANLP-2021-Closing.pdf

Workshop Description

Arabic is a challenging language for the field of computational linguistics. This is due to many factors, including its complex and rich morphology, its high degree of ambiguity, and the presence of a number of dialects that vary quite widely. Arabic is also a language with important geopolitical connections. It is spoken by over 400 million people in countries with varying degrees of prosperity and stability. It is the primary language of the latest world refugee problem affecting the Middle East and Europe. The opportunities made possible by working on this language and its dialects should not be underestimated in their consequences for the Arab World, the Mediterranean Region, and the rest of the World.

There has been a lot of progress in the last 20 years in the area of Arabic Natural Language Processing (NLP). Many Arabic NLP (or Arabic NLP-related) workshops and conferences have taken place, both in the Arab World and in association with international conferences. Examples include the following:

    • The First, Second, Third, Fourth and Fifth Arabic Natural Language Processing Workshops at EMNLP 2014, ACL 2015, EACL 2017, ACL 2019, and COLING 2020 respectively.

    • The First, Second, Third, and Fourth Workshops on Arabic Corpora and Processing Tools at LREC 2014, LREC 2016, LREC 2018, and LREC 2020 respectively.

    • The conference on Arabic Language Resources and Tools (MEDAR-2009, NEMLAR-2004).

    • The workshop on Computational Approaches to Semitic Languages (LREC 2010, EACL 2009, ACL 2007, ACL 2005, ACL 2002, ACL 1998).

    • The workshop on Computational Approaches to Arabic Script-based Languages (MTSummit XII 2009, LSA 2007, COLING 2004).

    • The International Symposium on Computer and Arabic Language (ISCAL 2009, ISCAL 2007).

This workshop follows in the footsteps of these efforts to provide a forum for researchers to share and discuss their ongoing work. This workshop is timely given the continued rise in research projects focusing on Arabic NLP.

We invite submissions on topics of natural language processing that include, but are not limited to, the following:

    • Basic core technologies: morphological analysis, disambiguation, tokenization, POS tagging, named entity detection, chunking, parsing, semantic role labeling, Arabic dialect modeling, etc.

    • Applications: machine translation, speech recognition, speech synthesis, optical character recognition, pedagogy, assistive technologies, social media analytics, sentiment analysis, summarization, dialogue systems, etc.

    • Resources: lexicons, dictionaries, annotated and unannotated corpora, etc.

Submissions may include work in progress as well as finished work that has not been previously published. Submissions must have a clear focus on specific issues pertaining to the Arabic language, whether it is standard Arabic, dialectal, or mixed. Papers on other languages sharing problems faced by Arabic NLP researchers, such as Semitic languages or languages using Arabic script, are welcome. Additionally, papers on efforts using Arabic resources but targeting other languages are also welcome. Descriptions of commercial systems are welcome, but authors should be willing to discuss the details of their work.


Workshop Program

Best Paper Award

ArCOV19-Rumors: Arabic COVID-19 Twitter Dataset for Misinformation Detection

by Fatima Haouari, Maram Hasanain, Reem Suwaileh and Tamer Elsayed


Presentation Instructions

  • Main Workshop Oral Presentations (Session 1a, 1b, 2a, 2b): 12 min talks + 3 min Q&A. PowerPoint or PDF slides are ok. You will be asked to share your screen. Please be present for the whole session and on time. We will keep the talks on schedule, to allow conference attendees to go from one workshop to another.

  • Main Workshop Posters (Poster Session I): You present a 2-min poster booster and also have a poster (landscape mode) in Gathertown (information was sent by email). Please be on time.

  • Shared Task Posters (Poster Session II): You will have a poster (landscape mode) in Gathertown (information was sent by email). Please be on time.

Program Summary

Sponsors

We are delighted to share that we received sponsorship from Google, The Alan Turing Institute and DSTL.
The funds were used to cover EACL fees and ACL Membership for nine students.

Invited Speaker

Speaker: Prof. Hend Al-Khalifa

Title: Where Cognitive NLP meets Arabic NLP

Short Biography: Hend S. Al-Khalifa is a full Professor with the Information Technology Department, King Saud University, and the head of the iWAN research group. She has contributed over 150 research papers in workshops, international conferences and journals and has been a principal investigator and co-investigator on over 10 research grants. She has also served as a program committee member of many national and international NLP conferences including NAACL-HLT, WANLP, ACLing, and OSACT. Moreover, she has served as the Co-Chair of the Workshop on Open-Source Arabic Corpora and Processing Tools since 2014. Prof. Al-Khalifa's research interests include Arabic NLP, Semantic Web, HCI and computers for people with special needs.


AlKhalifa-WANLP 2021 - Keynote.pdf

Accepted Papers

Main Workshop - Long Papers

  1. QADI: Arabic Dialect Identification in the Wild. Authors: Ahmed Abdelali, Hamdy Mubarak, Younes Samih, Sabit Hassan and Kareem Darwish

  2. DiaLex: A Benchmark for Evaluating Multidialectal Arabic Word Embeddings. Authors: Muhammad Abdul-Mageed, Shady Elbassuoni, Jad Doughman, AbdelRahim Elmadany, El Moatez Billah Nagoudi, Yorgo Zoughby, Ahmad Shaher, Iskander Gaba, Ahmed Helal and Mohammed El-Razzaz

  3. Benchmarking Transformer-based Language Models for Arabic Sentiment and Sarcasm Detection. Authors: Ibrahim Abu Farha and Walid Magdy

  4. What does BERT learn from Arabic machine reading comprehension datasets? Authors: Eman Albilali, Nora Altwairesh and Manar Hosny

  5. Kawarith: an Arabic Twitter Corpus for Crisis Events. Authors: Alaa Alharbi and Mark Lee

  6. Arabic Compact Language Modelling for Resource Limited Devices. Authors: Zaid Alyafeai and Irfan Ahmad

  7. Arabic Emoji Sentiment Lexicon (Arab-ESL): A Comparison between Arabic and European Emoji Sentiment Lexicons. Authors: Shatha Ali A. Hakami, Robert Hendley and Phillip Smith

  8. ArCOV19-Rumors: Arabic COVID-19 Twitter Dataset for Misinformation Detection. Authors: Fatima Haouari, Maram Hasanain, Reem Suwaileh and Tamer Elsayed

  9. ArCOV-19: The First Arabic COVID-19 Twitter Dataset with Propagation Networks. Authors: Fatima Haouari, Maram Hasanain, Reem Suwaileh and Tamer Elsayed

  10. The Interplay of Variant, Size, and Task Type in Arabic Pre-trained Language Models. Authors: Go Inoue, Bashar Alhafni, Nurpeiis Baimukan, Houda Bouamor and Nizar Habash

  11. Automatic Difficulty Classification of Arabic Sentences. Authors: Nouran Khallaf and Serge Sharoff

  12. Dynamic Ensembles in Named Entity Recognition for Historical Arabic Texts. Authors: Muhammad Majadly and Tomer Sagi

  13. Arabic Offensive Language on Twitter: Analysis and Experiments. Authors: Hamdy Mubarak, Ammar Rashed, Kareem Darwish, Younes Samih and Ahmed Abdelali

  14. Adult Content Detection on Arabic Twitter: Analysis and Experiments. Authors: Hamdy Mubarak, Sabit Hassan and Ahmed Abdelali

  15. UL2C: Mapping User Locations to Countries on Arabic Twitter. Authors: Hamdy Mubarak and Sabit Hassan

  16. Let-Mi: An Arabic Levantine Twitter Dataset for Misogynistic Language. Authors: Hala Mulki and Bilal Ghanem

  17. Empathetic BERT2BERT Conversational Model: Learning Arabic Language Generation with Little Data. Authors: Tarek Naous, Wissam Antoun, Reem Mahmoud and Hazem Hajj

  18. ALUE: Arabic Language Understanding Evaluation. Authors: Haitham Seelawi, Ibraheem Tuffaha, Mahmoud Gzawi, Wael Farhan, Bashar Talafha, Riham Badawi, Zyad Sober, Oday Al-Dweik, Abed Alhakim Freihat and Hussein AL-NATSHEH


Main Workshop - Short/Demo Papers

  1. Quranic Verses Semantic Relatedness Using AraBERT. Authors: Abdullah Alsaleh, Eric Atwell and Abdulrahman Altahhan

  2. AraELECTRA: Pre-Training Text Discriminators for Arabic Language Understanding. Authors: Wissam Antoun, Fady Baly and Hazem Hajj

  3. AraGPT2: Pre-Trained Transformer for Arabic Language Generation. Authors: Wissam Antoun, Fady Baly and Hazem Hajj

  4. QuranTree.jl: A Julia package for Quranic Arabic Corpus. Authors: Al-Ahmadgaid Asaad

  5. Automatic Romanization of Arabic Bibliographic Records. Authors: Fadhl Eryani and Nizar Habash

  6. SERAG: Semantic Entity Retrieval from Arabic knowledge Graphs. Authors: Saher Esmeir

  7. Introducing A large Tunisian Arabizi Dialectal Dataset for Sentiment Analysis. Authors: Chayma Fourati, Hatem Haddad, Abir Messaoudi, Moez BenHajhmida, Aymen Ben Elhaj Mabrouk and Malek Naski

  8. AraFacts: The First Large Arabic Dataset of Naturally Occurring Claims. Authors: Zien Sheikh Ali, Watheq Mansour, Tamer Elsayed and Abdulaziz Al‑Ali

  9. Improving Cross-Lingual Transfer for Event Argument Extraction with Language-Universal Sentence Structures. Authors: Minh Van Nguyen and Thien Huu Nguyen

NADI Shared Task


  1. NADI 2021: The Second Nuanced Arabic Dialect Identification Shared Task. Authors: Muhammad Abdul-Mageed, Chiyu Zhang, AbdelRahim Elmadany, Houda Bouamor and Nizar Habash

  2. Adapting MARBERT for Improved Arabic Dialect Identification: Submission to the NADI 2021 Shared Task. Authors: Badr AlKhamissi, Mohamed Gabr, Muhammad ElNokrashy and Khaled Essam

  3. Country-level Arabic Dialect Identification Using Small Datasets with Integrated Machine Learning Techniques and Deep Learning Models. Authors: Maha J. Althobaiti

  4. BERT-based Multi-Task Model for Country and Province Level MSA and Dialectal Arabic Identification. Authors: Abdellah El Mekki, Abdelkader El Mahdaouy, Kabil Essefar, Nabil El Mamoun, Ismail Berrada and Ahmed Khoumsi

  5. Country-level Arabic dialect identification using RNNs with and without linguistic features. Authors: Elsayed Issa, Mohammed AlShakhori, Reda Al-Bahrani and Gus Hahn-Powell

  6. Arabic Dialect Identification based on a Weighted Concatenation of TF-IDF Features. Authors: Mohamed Lichouri, Mourad Abbas, Khaled Lounnas, Besma Benaziz and Aicha Zitouni

  7. Machine Learning-Based Approach for Arabic Dialect Identification. Authors: Hamada Nayel, Ahmed Hassan, Mahmoud Sobhi and Ahmed El-Sawy

  8. Dialect Identification in Nuanced Arabic Tweets Using Farasa Segmentation and AraBERT. Authors: Anshul Wadhawan


ArSarcasm Shared Task


  1. Overview of the WANLP 2021 Shared Task on Sarcasm and Sentiment Detection in Arabic. Authors: Ibrahim Abu Farha, Wajdi Zaghouani and Walid Magdy

  2. WANLP 2021 Shared-Task: Towards Irony and Sentiment detection in Arabic tweets using Multi-headed-LSTM-CNN-GRU and MaRBERT. Authors: Reem Abdel-Salam

  3. Sarcasm and Sentiment Detection In Arabic Tweets Using BERT-based Models and Data Augmentation. Authors: Abeer Abuzayed and Hend Al-Khalifa

  4. Multi-task Learning Using a Combination of Contextualised and Static Word Embeddings for Arabic Sarcasm Detection and Sentiment Analysis. Authors: Abdullah I. Alharbi and Mark Lee

  5. ArSarcasm Shared Task: An Ensemble BERT Model for Sarcasm Detection in Arabic Tweets. Authors: Laila Bashmal and Daliyah AlZeer

  6. Sarcasm and Sentiment Detection in Arabic: investigating the interest of character-level features. Authors: Dhaou Ghoul and Gaël Lejeune

  7. Deep Multi-Task Model for Sarcasm Detection and Sentiment Analysis in Arabic Language. Authors: Abdelkader El Mahdaouy, Abdellah El Mekki, Kabil Essefar, Nabil El Mamoun, Ismail Berrada and Ahmed Khoumsi

  8. A contextual word embedding for Arabic sarcasm detection with random forests. Authors: Hazem Elgabry, Shimaa Attia, Ahmed Abdel-Rahman, Ahmed Abdel-Ate and Sandra Girgis

  9. SarcasmDet at Sarcasm Detection Task 2021 in Arabic using AraBERT Pretrained Model. Authors: Dalya Faraj and Malak Abdullah

  10. Sarcasm and sentiment detection in Arabic language A hybrid approach combining embeddings and rule-based features. Authors: Kamel Gaanoun and Imade Benelallam

  11. Combining Context-Free and Contextualized Word Representations for Arabic Sarcasm Detection and Sentiment Identification. Authors: Amey Hengle, Atharva Kshirsagar, Shaily Desai and Manisha Marathe

  12. Leveraging Offensive Language for Sarcasm and Sentiment Detection in Arabic. Authors: Fatemah Husain and Ozlem Uzuner

  13. The IDC System for Sentiment Classification and Sarcasm Detection in Arabic. Authors: Abraham Israeli, Yotam Nahum, Shai Fine and Kfir Bar

  14. Preprocessing Solutions for Detection of Sarcasm and Sentiment for Arabic. Authors: Mohamed Lichouri, Mourad Abbas, Besma Benaziz, Aicha Zitouni and Khaled Lounnas

  15. iCompass at Shared Task on Sarcasm and Sentiment Detection in Arabic. Authors: Malek Naski, Abir Messaoudi, Hatem Haddad, Moez BenHajhmida, Chayma Fourati and Mohamed Aymen Ben Elhaj Mabrouk

  16. Machine Learning-Based Model for Sentiment and Sarcasm Detection. Authors: Hamada Nayel, Eslam Amer, Aya Allam and Hanya Abdallah

  17. DeepBlueAI at WANLP-EACL2021 task 2: A Deep Ensemble-based Method for Sarcasm and Sentiment Detection in Arabic. Authors: Bingyan Song, Chunguang Pan, Shengguang Wang and Zhipeng Luo

  18. AraBERT and Farasa Segmentation Based Approach For Sarcasm and Sentiment Detection in Arabic Tweets. Authors: Anshul Wadhawan


Workshop Organizers

General Chair:

      • Nizar Habash, New York University Abu Dhabi, UAE. Email: nizar.habash AT nyu.edu

Program Chairs:

      • Houda Bouamor, Carnegie Mellon University in Qatar. Email: hbouamor AT qatar.cmu.edu

      • Hazem Hajj, American University of Beirut, Lebanon. Email: hh63 AT aub.edu.lb

      • Walid Magdy, University of Edinburgh, Scotland. Email: wmagdy AT inf.ed.ac.uk

      • Wajdi Zaghouani, Hamad Bin Khalifa University, Qatar. Email: wzaghouani AT hbku.edu.qa

Publication Chairs:

      • Fethi Bougares, University of Le Mans, France. Email: fethi.bougares AT univ-lemans.fr

      • Nadi Tomeh, LIPN, Université Paris 13, Sorbonne Paris Cité. Email: tomeh AT lipn.fr

Publicity Chairs:

      • Ibrahim Abu Farha, University of Edinburgh, Scotland. Email: i.abufarha AT ed.ac.uk

      • Samia Touileb, University of Oslo, Norway. Email: samiat AT ifi.uio.no

Ex-General Chairs / Advisors:

      • Wassim El-Hajj, American University of Beirut, Lebanon. Email: we07 AT aub.edu.lb

      • Imed Zitouni, Google, USA. Email: imed.zitouni AT gmail.com

Advisory Committee:

      • Muhammad Abdul-Mageed, UBC, Canada. Email: muhammad.mageed AT ubc.ca

      • Ahmed Ali, Qatar Computing Research Institute, Qatar. Email: amali AT qf.org.qa

      • Hend Alkhalifa, King Saud University, Saudi Arabia. Email: hend.alkhalifa AT gmail.com

      • Houda Bouamor, Carnegie Mellon University in Qatar. Email: hbouamor AT qatar.cmu.edu

      • Fethi Bougares, Le Mans University, France. Email: Fethi.bougares AT gmail.com

      • Khalid Choukri, ELDA, European Language Resource Association, France. Email: choukri AT elda.org

      • Kareem Darwish, Qatar Computing Research Institute, Qatar. Email: kdarwish AT hbku.edu.qa

      • Mona Diab, George Washington University, USA. Email: mtdiab AT gmail.com

      • Mahmoud El-Haj, Lancaster University, UK. Email: m.el-haj AT lancaster.ac.uk

      • Samhaa El-Beltagy, Nile University, Egypt. Email: samhaaelbeltagy AT gmail.com

      • Wassim El-Hajj, American University of Beirut, Lebanon. Email: we07 AT aub.edu.lb

      • Nizar Habash, New York University Abu Dhabi, UAE. Email: nizar.habash AT nyu.edu

      • Lamia Hadrich Belguith, University of Sfax, Tunisia. Email: lamia.belguith AT gmail.com

      • Hazem Hajj, American University of Beirut, Lebanon. Email: hh63 AT aub.edu.lb

      • Walid Magdy, University of Edinburgh, Scotland. Email: wmagdy AT inf.ed.ac.uk

      • Khaled Shaalan, The British University in Dubai, UAE. Email: khaled.shaalan AT buid.ac.ae

      • Kamel Smaili, University of Lorraine, France. Email: kamel.smaili AT loria.fr

      • Nadi Tomeh, University Paris 13, France. Email: tomeh AT lipn.fr

      • Wajdi Zaghouani, Hamad Bin Khalifa University, Qatar. Email: wajdiz AT gmail.com

      • Imed Zitouni, Google, USA. Email: imed.zitouni AT gmail.com

Program Committee Members

The following is the list of PC members, all of whom participated in the review process of WANLP 2020. A large percentage of them confirmed their willingness to review papers for WANLP 2021.

  1. Mourad Abbas, CRSTDLA, Algeria

  2. Ahmed Abdelali, Qatar Computing Research Institute, HBKU, Qatar

  3. Muhammad Abdul-Mageed, UBC Canada

  4. Ibrahim Abu Farha, University of Edinburgh, Scotland

  5. Bayan AbuShawar, Al Ain University, UAE

  6. Haithem Afli, Cork Institute of Technology, Ireland

  7. Hend Al-Khalifa, King Saud University, KSA

  8. Hussein AL-NATSHEH, Mawdoo3 Limited, Jordan

  9. Almoataz B. Al-Said, Cairo University, Egypt

  10. Nora Al-Twairesh, King Saud University, KSA

  11. Bashar Alhafni, New York University Abu Dhabi, UAE

  12. Ahmed Ali, Qatar Computing Research Institute, HBKU, Qatar

  13. Wissam Antoun, American University of Beirut, Lebanon

  14. Gilbert Badaro, American University of Beirut, Lebanon

  15. Riadh Belkebir, New York University Abu Dhabi, UAE

  16. Abdelmajid Ben-Hamadou, Sfax University, Tunisia

  17. Houda Bouamor, Carnegie Mellon University in Qatar

  18. Fethi Bougares, Le Mans University, France

  19. Karim Bouzoubaa, Mohammad V University, Morocco

  20. Violetta Cavalli-Sforza, Al Akhawayn University, Morocco

  21. Aloulou Chafik, Université de Sfax, Tunisia

  22. Khalid Choukri, ELDA, European Language Resource Association, France

  23. Shammur Absar Chowdhury, Qatar Computing Research Institute, HBKU, Qatar

  24. Kareem Darwish, Qatar Computing Research Institute, HBKU, Qatar

  25. Mona Diab, George Washington University, USA

  26. Samhaa R. El-Beltagy, Nile University, Egypt

  27. Wassim El-Hajj, American University of Beirut, Lebanon

  28. Shady Elbassuoni, American University of Beirut, Lebanon

  29. Obeida ElJundi, American University of Beirut, Lebanon

  30. AbdelRahim Elmadany, UBC, Canada

  31. Tamer Elsayed, Qatar University, Qatar

  32. Sahar Ghannay, LIMSI, France

  33. Nada Ghneim, Higher Institute for Applied Sciences and Technology, Syria

  34. Nizar Habash, New York University Abu Dhabi, UAE

  35. Bassam Haddad, University of Petra, Jordan

  36. Lamia Hadrich-Belguith, University of Sfax, Tunisia

  37. Hazem Hajj, American University of Beirut, Lebanon

  38. Salima Harrat, École Normale Supérieure (Bouzaréah), Algeria

  39. Maram Hasanain, Qatar University, Qatar

  40. Go Inoue, New York University Abu Dhabi, UAE

  41. Mustafa Jarrar, Bir Zeit University, Palestine

  42. Ganesh Jawahar, The University of British Columbia, Canada

  43. Salam Khalifa, New York University Abu Dhabi, UAE

  44. Walid Magdy, University of Edinburgh, Scotland

  45. Azzeddine Mazroui, University Mohamed I, Morocco

  46. Seif Mechti, University of Sfax, Tunisia

  47. Salima Medhaffar, Le Mans University, France

  48. Karima Meftouh, Badji Mokhtar University, Algeria

  49. Hamdy Mubarak, Qatar Computing Research Institute, HBKU, Qatar

  50. El Moatez Billah Nagoudi, The University of British Columbia, Canada

  51. Preslav Nakov, Qatar Computing Research Institute, HBKU, Qatar

  52. Alexis Nasr, University of Marseille, France

  53. Hamada Nayel, Benha University, Egypt

  54. Younes Samih, Heinrich Heine Universität Düsseldorf, Germany

  55. Khaled Shaalan, The British University in Dubai, UAE

  56. Khaled Shaban, Qatar University, Qatar

  57. Kamel Smaili, University of Lorraine, France

  58. Peter Sullivan, The University of British Columbia, Canada

  59. Reem Suwaileh, Qatar University, Qatar

  60. Nadi Tomeh, University Paris 13, France

  61. Samia Touileb, University of Oslo, Norway

  62. Omar Trigui, University of Sousse, Tunisia

  63. Wajdi Zaghouani, Hamad Bin Khalifa University, Qatar

  64. Nasser Zalmout, Amazon Inc., USA

  65. Taha Zerrouki, University of Bouira, Algeria

  66. Chiyu Zhang, UBC Canada

  67. Imed Zitouni, Google, USA

Important Dates

  • Jan 18, 2021: Workshop Paper Due Date (Extended to Feb 1, 2021, then to Feb 5, 2021)

  • Feb 18: Notification of Acceptance (Pushed to Feb 22, 2021)

  • Mar 1: Camera-ready papers due (strict!)

  • April 19: Workshop Date (one day)

    All deadlines are 11.59 pm UTC -12h (“anywhere on Earth”).


Paper Submission Instructions

Types of Papers: We invite research papers (long and short), demo papers, and shared task description papers.

Paper Length: Long research papers may consist of up to 8 pages of content, plus unlimited references; final versions of long papers will be given one additional page of content (up to 9 pages) so that reviewers’ comments can be taken into account. Short research papers, demo papers and shared task description papers may consist of up to 4 pages of content, plus unlimited references; final versions of short papers will be given one additional page of content (up to 5 pages) so that reviewers’ comments can be taken into account.

Submission Format: Please follow the EACL official Latex template

Submission Website: https://www.softconf.com/eacl2021/WANLP2021/

Blind Reviewing Policy: The workshop follows a blind reviewing policy for the main workshop tracks (Long, Short, and Demo). The authors should omit their names and affiliations from the paper and avoid self-references that reveal their identity. Papers that do not conform to these requirements will be rejected without review.

Multiple Submission Policy: Papers that have been or will be submitted to other meetings or publications must indicate this at submission time. Authors must inform organizers immediately once a paper is to be withdrawn from the workshop for any reason. Attempting to publish the same paper, or a paper with a major overlap (50%), elsewhere may lead to rejection of the paper even after an acceptance notification has gone out.

Anonymity and Supplementary Material: As the reviewing will be blind, papers must not include authors' names and affiliations. Furthermore, self-references that reveal the author's identity, e.g., "We previously showed (Smith, 1991) ..." must be avoided. Instead, use citations such as "Smith previously showed (Smith, 1991) ..." Papers that do not conform to these requirements will be rejected without review.

Papers should not refer, for further detail, to documents that are not available to the reviewers. For example, do not omit or redact important citation information to preserve anonymity. Instead, use third person or named reference to this work, as described above (“Smith showed” rather than “we showed”).

Papers may be accompanied by a resource (software and/or data) described in the paper. Papers that are submitted with accompanying software/data may receive additional credit toward the overall evaluation, and the potential impact of the software and data will be taken into account when making the acceptance/rejection decisions.

Anonymity Period: We follow the guidelines of EACL 2021 for Anonymity Period with the period being 1 month before the submission deadline up to the date of acceptance, rejection or withdrawal (January 1, 2021 to February 22, 2021).

Shared Tasks

Shared Task 1: NADI 2021

The Nuanced Arabic Dialect Identification (NADI) shared task targets province-level dialects, and as such is the first to focus on naturally occurring fine-grained dialects at the sub-country level. The first NADI shared task (NADI 2020) was held at WANLP 2020. The NADI 2021 shared task will continue to focus on fine-grained dialects with new datasets and efforts to distinguish both MSA and dialects according to their geographical origin. The data covers a total of 100 provinces from all 21 Arab countries and comes from the Twitter domain. Evaluation and task setup follow the NADI 2020 shared task. The subtasks involved include:

Subtask 1:

  • Subtask 1.1: Country-level MSA identification: A total of 21,000 tweets, covering 21 Arab countries.

  • Subtask 1.2: Country-level DA identification: A total of 21,000 tweets, covering 21 Arab countries.

Subtask 2: Similar to Subtask 1 but focusing on the Province level

Participants will be provided with an additional 10M unlabeled tweets that can be used in developing their systems for either or both of the tasks. The evaluation metrics will include precision/recall/f1-score/accuracy. Macro Averaged F-score will be the official metric.
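As a rough illustration of the official metric, the following is a minimal Python sketch of a macro-averaged F-score: F1 is computed separately for each class and the per-class scores are averaged with equal weight, so rare countries or provinces count as much as frequent ones. The function name `macro_f1`, the toy labels, and the choice to average over every label seen in either the gold or the predicted list are assumptions made for this example; this is not the official NADI scorer, which may handle edge cases differently.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 scores (illustrative, not the official scorer)."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    if not gold:
        raise ValueError("need at least one example")

    # Count true positives, false positives and false negatives per label.
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # p was predicted but is wrong
            fn[g] += 1  # g was the correct label and was missed

    # Assumption: average over every label seen in gold or pred.
    labels = set(gold) | set(pred)
    scores = []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy country-level labels, invented for this example.
    gold = ["Egypt", "Egypt", "Iraq", "Iraq", "Oman", "Oman"]
    pred = ["Egypt", "Iraq", "Iraq", "Iraq", "Oman", "Egypt"]
    print(f"macro-F1 = {macro_f1(gold, pred):.4f}")  # macro-F1 = 0.6556
```

In the toy example the per-class F1 scores are 0.5 (Egypt), 0.8 (Iraq) and about 0.667 (Oman), so the macro average is about 0.656 even though accuracy is 4/6. With many imbalanced classes the two numbers can diverge much more.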

Shared task page: https://sites.google.com/view/second-nadi-shared-task

Organizers:

Muhammad Abdul-Mageed, Chiyu Zhang (The University of British Columbia, Canada), Nizar Habash (New York University Abu Dhabi), and Houda Bouamor (Carnegie Mellon University, Qatar).

Shared Task 2: Sarcasm and Sentiment Detection In Arabic

Sarcasm detection is the process of identifying whether a piece of text is sarcastic or not. Sarcasm detection has received attention in other languages, but work on Arabic still lags behind. The shared task on Sarcasm and Sentiment Detection in Arabic will focus on analyzing tweets and identifying their sentiment and whether a tweet is sarcastic or not.

There are two subtasks in this shared task:

  • Subtask 1 (Sarcasm Detection): Identifying whether a tweet is sarcastic or not. This is a binary classification task.

  • Subtask 2 (Sentiment Analysis): Identifying the sentiment of a tweet and assigning one of three labels (Positive, Negative, Neutral). This is a multi-class classification task.

Shared task page: https://sites.google.com/view/ar-sarcasm-sentiment-detection/

Organizers:

Ibrahim Abu Farha (The University of Edinburgh, UK), Wajdi Zaghouani (Hamad Bin Khalifa University, Qatar) and Walid Magdy (The University of Edinburgh, UK)