The Sixth
Arabic Natural Language Processing Workshop
(WANLP 2021)
Location: EACL 2021 (Virtual)
Workshop Proceedings
are Publicly Accessible
on the ACL Anthology
April 19, 2021
wanlp2021.arabic-nlp.net
Opening Remarks
Closing Remarks
Workshop Description
Arabic is a challenging language for the field of computational linguistics. This is due to many factors, including its complex and rich morphology, its high degree of ambiguity, and the presence of a number of dialects that vary quite widely. Arabic is also a language with important geopolitical connections. It is spoken by over 400 million people in countries with varying degrees of prosperity and stability. It is the primary language of the latest world refugee problem affecting the Middle East and Europe. The opportunities made possible by working on this language and its dialects should not be underestimated in their consequences for the Arab World, the Mediterranean Region, and the rest of the World.
There has been a lot of progress in the last 20 years in the area of Arabic Natural Language Processing (NLP). Many Arabic NLP (or Arabic NLP-related) workshops and conferences have taken place, both in the Arab World and in association with international conferences. Examples include the following:
The First, Second, Third, Fourth, and Fifth Arabic Natural Language Processing Workshops at EMNLP 2014, ACL 2015, EACL 2017, ACL 2019, and COLING 2020, respectively.
The First, Second, Third, and Fourth Workshops on Arabic Corpora and Processing Tools at LREC 2014, LREC 2016, LREC 2018, and LREC 2020 respectively.
The conference on Arabic Language Resources and Tools (MEDAR-2009, NEMLAR-2004).
The workshop on Computational Approaches to Semitic Languages (LREC 2010, EACL 2009, ACL 2007, ACL 2005, ACL 2002, ACL 1998).
The workshop on Computational Approaches to Arabic Script-based Languages (MTSummit XII 2009, LSA 2007, COLING 2004).
The International Symposium on Computer and Arabic Language (ISCAL 2009, ISCAL 2007)
This workshop follows in the footsteps of these efforts to provide a forum for researchers to share and discuss their ongoing work. This workshop is timely given the continued rise in research projects focusing on Arabic NLP.
We invite submissions on topics of natural language processing that include, but are not limited to, the following:
Basic core technologies: morphological analysis, disambiguation, tokenization, POS tagging, named entity detection, chunking, parsing, semantic role labeling, Arabic dialect modeling, etc.
Applications: machine translation, speech recognition, speech synthesis, optical character recognition, pedagogy, assistive technologies, social media analytics, sentiment analysis, summarizations, dialogue systems, etc.
Resources: lexicons, dictionaries, annotated and unannotated corpora, etc.
Submissions may include work in progress as well as finished work that has not been previously published. Submissions must have a clear focus on specific issues pertaining to the Arabic language, whether it is standard Arabic, dialectal, or mixed. Papers on other languages sharing problems faced by Arabic NLP researchers, such as Semitic languages or languages using Arabic script, are welcome. Additionally, papers on efforts using Arabic resources but targeting other languages are also welcome. Descriptions of commercial systems are welcome, but authors should be willing to discuss the details of their work.
Workshop Program
Best Paper Award
ArCOV19-Rumors: Arabic COVID-19 Twitter Dataset for Misinformation Detection
by Fatima Haouari, Maram Hasanain, Reem Suwaileh and Tamer Elsayed
Presentation Instructions
Main Workshop Oral Presentations (Session 1a, 1b, 2a, 2b): 12-min talks + 3-min Q&A. PowerPoint or PDF slides are fine. You will be asked to share your screen. Please be on time and present for the whole session. We will keep the talks on schedule to allow conference attendees to move from one workshop to another.
Main Workshop Posters (Poster Session I): You will present a 2-min poster boaster and also have a poster (landscape mode) in Gathertown (information was sent by email). Please be on time.
Shared Task Posters (Poster Session II): You will have a poster (landscape mode) in Gathertown (information was sent by email). Please be on time.
Program Summary
Sponsors
We are delighted to share that we received sponsorship from Google, The Alan Turing Institute and DSTL.
The funds were used to cover EACL fees and ACL Membership for nine students.
Invited Speaker
Speaker: Prof. Hend Al-Khalifa
Title: Where Cognitive NLP meets Arabic NLP
Short Biography: Hend S. Al-Khalifa is a full Professor with the Information Technology Department, King Saud University and the head of iWAN research group. She has contributed over 150 research papers in workshops, international conferences and journals and has been a principal investigator and co-investigator on over 10 research grants. She also served as a program committee member of many national and international NLP conferences including NAACL-HLT, WANLP, ACLing, and OSACT. Moreover, she served as the Co-Chair of the Workshop on Open-Source Arabic Corpora and Processing Tools since 2014. Prof. Al-Khalifa's research interests include Arabic NLP, Semantic Web, HCI and computers for people with special needs.
Accepted Papers
Main Workshop - Long Papers
QADI: Arabic Dialect Identification in the Wild. Authors: Ahmed Abdelali, Hamdy Mubarak, Younes Samih, Sabit Hassan and Kareem Darwish
DiaLex: A Benchmark for Evaluating Multidialectal Arabic Word Embeddings. Authors: Muhammad Abdul-Mageed, Shady Elbassuoni, Jad Doughman, AbdelRahim Elmadany, El Moatez Billah Nagoudi, Yorgo Zoughby, Ahmad Shaher, Iskander Gaba, Ahmed Helal and Mohammed El-Razzaz
Benchmarking Transformer-based Language Models for Arabic Sentiment and Sarcasm Detection. Authors: Ibrahim Abu Farha and Walid Magdy
What does BERT learn from Arabic machine reading comprehension datasets? Authors: Eman Albilali, Nora Altwairesh and Manar Hosny
Kawarith: an Arabic Twitter Corpus for Crisis Events. Authors: Alaa Alharbi and Mark Lee
Arabic Compact Language Modelling for Resource Limited Devices. Authors: Zaid Alyafeai and Irfan Ahmad
Arabic Emoji Sentiment Lexicon (Arab-ESL): A Comparison between Arabic and European Emoji Sentiment Lexicons. Authors: Shatha Ali A. Hakami, Robert Hendley and Phillip Smith
ArCOV19-Rumors: Arabic COVID-19 Twitter Dataset for Misinformation Detection. Authors: Fatima Haouari, Maram Hasanain, Reem Suwaileh and Tamer Elsayed
ArCOV-19: The First Arabic COVID-19 Twitter Dataset with Propagation Networks. Authors: Fatima Haouari, Maram Hasanain, Reem Suwaileh and Tamer Elsayed
The Interplay of Variant, Size, and Task Type in Arabic Pre-trained Language Models. Authors: Go Inoue, Bashar Alhafni, Nurpeiis Baimukan, Houda Bouamor and Nizar Habash
Automatic Difficulty Classification of Arabic Sentences. Authors: Nouran Khallaf and Serge Sharoff
Dynamic Ensembles in Named Entity Recognition for Historical Arabic Texts. Authors: Muhammad Majadly and Tomer Sagi
Arabic Offensive Language on Twitter: Analysis and Experiments. Authors: Hamdy Mubarak, Ammar Rashed, Kareem Darwish, Younes Samih and Ahmed Abdelali
Adult Content Detection on Arabic Twitter: Analysis and Experiments. Authors: Hamdy Mubarak, Sabit Hassan and Ahmed Abdelali
UL2C: Mapping User Locations to Countries on Arabic Twitter. Authors: Hamdy Mubarak and Sabit Hassan
Let-Mi: An Arabic Levantine Twitter Dataset for Misogynistic Language. Authors: Hala Mulki and Bilal Ghanem
Empathetic BERT2BERT Conversational Model: Learning Arabic Language Generation with Little Data. Authors: Tarek Naous, Wissam Antoun, Reem Mahmoud and Hazem Hajj
ALUE: Arabic Language Understanding Evaluation. Authors: Haitham Seelawi, Ibraheem Tuffaha, Mahmoud Gzawi, Wael Farhan, Bashar Talafha, Riham Badawi, Zyad Sober, Oday Al-Dweik, Abed Alhakim Freihat and Hussein AL-NATSHEH
Main Workshop - Short/Demo Papers
Quranic Verses Semantic Relatedness Using AraBERT. Authors: Abdullah Alsaleh, Eric Atwell and Abdulrahman Altahhan
AraELECTRA: Pre-Training Text Discriminators for Arabic Language Understanding. Authors: Wissam Antoun, Fady Baly and Hazem Hajj
AraGPT2: Pre-Trained Transformer for Arabic Language Generation. Authors: Wissam Antoun, Fady Baly and Hazem Hajj
QuranTree.jl: A Julia package for Quranic Arabic Corpus. Authors: Al-Ahmadgaid Asaad
Automatic Romanization of Arabic Bibliographic Records. Authors: Fadhl Eryani and Nizar Habash
SERAG: Semantic Entity Retrieval from Arabic knowledge Graphs. Authors: Saher Esmeir
Introducing A large Tunisian Arabizi Dialectal Dataset for Sentiment Analysis. Authors: Chayma Fourati, Hatem Haddad, Abir Messaoudi, Moez BenHajhmida, Aymen Ben Elhaj Mabrouk and Malek Naski
AraFacts: The First Large Arabic Dataset of Naturally Occurring Claims. Authors: Zien Sheikh Ali, Watheq Mansour, Tamer Elsayed and Abdulaziz Al-Ali
Improving Cross-Lingual Transfer for Event Argument Extraction with Language-Universal Sentence Structures. Authors: Minh Van Nguyen and Thien Huu Nguyen
NADI Shared Task
NADI 2021: The Second Nuanced Arabic Dialect Identification Shared Task. Authors: Muhammad Abdul-Mageed, Chiyu Zhang, AbdelRahim Elmadany, Houda Bouamor and Nizar Habash
Adapting MARBERT for Improved Arabic Dialect Identification: Submission to the NADI 2021 Shared Task. Authors: Badr AlKhamissi, Mohamed Gabr, Muhammad ElNokrashy and Khaled Essam
Country-level Arabic Dialect Identification Using Small Datasets with Integrated Machine Learning Techniques and Deep Learning Models. Authors: Maha J. Althobaiti
BERT-based Multi-Task Model for Country and Province Level MSA and Dialectal Arabic Identification. Authors: Abdellah El Mekki, Abdelkader El Mahdaouy, Kabil Essefar, Nabil El Mamoun, Ismail Berrada and Ahmed Khoumsi
Country-level Arabic dialect identification using RNNs with and without linguistic features. Authors: Elsayed Issa, Mohammed AlShakhori, Reda Al-Bahrani and Gus Hahn-Powell
Arabic Dialect Identification based on a Weighted Concatenation of TF-IDF Features. Authors: Mohamed Lichouri, Mourad Abbas, Khaled Lounnas, Besma Benaziz and Aicha Zitouni
Machine Learning-Based Approach for Arabic Dialect Identification. Authors: Hamada Nayel, Ahmed Hassan, Mahmoud Sobhi and Ahmed El-Sawy
Dialect Identification in Nuanced Arabic Tweets Using Farasa Segmentation and AraBERT. Authors: Anshul Wadhawan
ArSarcasm Shared Task
Overview of the WANLP 2021 Shared Task on Sarcasm and Sentiment Detection in Arabic. Authors: Ibrahim Abu Farha, Wajdi Zaghouani and Walid Magdy
WANLP 2021 Shared-Task: Towards Irony and Sentiment detection in Arabic tweets using Multi-headed-LSTM-CNN-GRU and MaRBERT. Authors: Reem Abdel-Salam
Sarcasm and Sentiment Detection In Arabic Tweets Using BERT-based Models and Data Augmentation. Authors: Abeer Abuzayed and Hend Al-Khalifa
Multi-task Learning Using a Combination of Contextualised and Static Word Embeddings for Arabic Sarcasm Detection and Sentiment Analysis. Authors: Abdullah I. Alharbi and Mark Lee
ArSarcasm Shared Task: An Ensemble BERT Model for Sarcasm Detection in Arabic Tweets. Authors: Laila Bashmal and Daliyah AlZeer
Sarcasm and Sentiment Detection in Arabic: investigating the interest of character-level features. Authors: Dhaou Ghoul and Gaël Lejeune
Deep Multi-Task Model for Sarcasm Detection and Sentiment Analysis in Arabic Language. Authors: Abdelkader El Mahdaouy, Abdellah El Mekki, Kabil Essefar, Nabil El Mamoun, Ismail Berrada and Ahmed Khoumsi
A contextual word embedding for Arabic sarcasm detection with random forests. Authors: Hazem Elgabry, Shimaa Attia, Ahmed Abdel-Rahman, Ahmed Abdel-Ate and Sandra Girgis
SarcasmDet at Sarcasm Detection Task 2021 in Arabic using AraBERT Pretrained Model. Authors: Dalya Faraj and Malak Abdullah
Sarcasm and sentiment detection in Arabic language A hybrid approach combining embeddings and rule-based features. Authors: Kamel Gaanoun and Imade Benelallam
Combining Context-Free and Contextualized Word Representations for Arabic Sarcasm Detection and Sentiment Identification. Authors: Amey Hengle, Atharva Kshirsagar, Shaily Desai and Manisha Marathe
Leveraging Offensive Language for Sarcasm and Sentiment Detection in Arabic. Authors: Fatemah Husain and Ozlem Uzuner
The IDC System for Sentiment Classification and Sarcasm Detection in Arabic. Authors: Abraham Israeli, Yotam Nahum, Shai Fine and Kfir Bar
Preprocessing Solutions for Detection of Sarcasm and Sentiment for Arabic. Authors: Mohamed Lichouri, Mourad Abbas, Besma Benaziz, Aicha Zitouni and Khaled Lounnas
iCompass at Shared Task on Sarcasm and Sentiment Detection in Arabic. Authors: Malek Naski, Abir Messaoudi, Hatem Haddad, Moez BenHajhmida, Chayma Fourati and Mohamed Aymen Ben Elhaj Mabrouk
Machine Learning-Based Model for Sentiment and Sarcasm Detection. Authors: Hamada Nayel, Eslam Amer, Aya Allam and Hanya Abdallah
DeepBlueAI at WANLP-EACL2021 task 2: A Deep Ensemble-based Method for Sarcasm and Sentiment Detection in Arabic. Authors: Bingyan Song, Chunguang Pan, Shengguang Wang and Zhipeng Luo
AraBERT and Farasa Segmentation Based Approach For Sarcasm and Sentiment Detection in Arabic Tweets. Authors: Anshul Wadhawan
Workshop Organizers
General Chair:
Nizar Habash, New York University Abu Dhabi, UAE. Email: nizar.habash AT nyu.edu
Program Chairs:
Houda Bouamor, Carnegie Mellon University in Qatar. Email: hbouamor AT qatar.cmu.edu
Hazem Hajj, American University of Beirut, Lebanon. Email: hh63 AT aub.edu.lb
Walid Magdy, University of Edinburgh, Scotland. Email: wmagdy AT inf.ed.ac.uk
Wajdi Zaghouani, Hamad Bin Khalifa University, Qatar. Email: wzaghouani AT hbku.edu.qa
Publication Chairs:
Fethi Bougares, University of Le Mans, France. Email: fethi.bougares AT univ-lemans.fr
Nadi Tomeh, LIPN, Université Paris 13, Sorbonne Paris Cité. Email: tomeh AT lipn.fr
Publicity Chairs:
Ibrahim Abu Farha, University of Edinburgh, Scotland. Email: i.abufarha AT ed.ac.uk
Samia Touileb, University of Oslo, Norway. Email: samiat AT ifi.uio.no
Ex-General Chairs / Advisors:
Wassim El-Hajj, American University of Beirut, Lebanon. Email: we07 AT aub.edu.lb
Imed Zitouni, Google, USA. Email: imed.zitouni AT gmail.com
Advisory Committee:
Muhammad Abdul-Mageed, UBC, Canada. Email: muhammad.mageed AT ubc.ca
Ahmed Ali, Qatar Computing Research Institute, Qatar. Email: amali AT qf.org.qa
Hend Alkhalifa, King Saud University, Saudi Arabia. Email: hend.alkhalifa AT gmail.com
Houda Bouamor, Carnegie Mellon University in Qatar. Email: hbouamor AT qatar.cmu.edu
Fethi Bougares, Le Mans University, France. Email: Fethi.bougares AT gmail.com
Khalid Choukri, ELDA, European Language Resource Association, France. Email: choukri AT elda.org
Kareem Darwish, Qatar Computing Research Institute, Qatar. Email: kdarwish AT hbku.edu.qa
Mona Diab, George Washington University, USA. Email: mtdiab AT gmail.com
Mahmoud El-Haj, Lancaster University, UK. Email: m.el-haj AT lancaster.ac.uk
Samhaa El-Beltagy, Nile University, Egypt. Email: samhaaelbeltagy AT gmail.com
Wassim El-Hajj, American University of Beirut, Lebanon. Email: we07 AT aub.edu.lb
Nizar Habash, New York University Abu Dhabi, UAE. Email: nizar.habash AT nyu.edu
Lamia Hadrich Belguith, University of Sfax, Tunisia. Email: lamia.belguith AT gmail.com
Hazem Hajj, American University of Beirut, Lebanon. Email: hh63 AT aub.edu.lb
Walid Magdy, University of Edinburgh, Scotland. Email: wmagdy AT inf.ed.ac.uk
Khaled Shaalan, The British University in Dubai, UAE. Email: khaled.shaalan AT buid.ac.ae
Kamel Smaili, University of Lorraine, France. Email: kamel.smaili AT loria.fr
Nadi Tomeh, University Paris 13, France. Email: tomeh AT lipn.fr
Wajdi Zaghouani, Hamad Bin Khalifa University, Qatar. Email: wajdiz AT gmail.com
Imed Zitouni, Google, USA. Email: imed.zitouni AT gmail.com
Program Committee Members
The following is the list of PC members, all of whom participated in the review process of WANLP 2020. A large percentage of them confirmed their willingness to review papers for WANLP 2021.
Mourad Abbas, CRSTDLA, Algeria
Ahmed Abdelali, Qatar Computing Research Institute, HBKU, Qatar
Muhammad Abdul-Mageed, UBC, Canada
Ibrahim Abu Farha, University of Edinburgh, Scotland
Bayan AbuShawar, Al Ain University, UAE
Haithem Afli, Cork Institute of Technology, Ireland
Hend Al-Khalifa, King Saud University, KSA
Hussein AL-NATSHEH, Mawdoo3 Limited, Jordan
Almoataz B. Al-Said, Cairo University, Egypt
Nora Al-Twairesh, King Saud University, KSA
Bashar Alhafni, New York University Abu Dhabi, UAE
Ahmed Ali, Qatar Computing Research Institute, HBKU, Qatar
Wissam Antoun, American University of Beirut, Lebanon
Gilbert Badaro, American University of Beirut, Lebanon
Riadh Belkebir, New York University Abu Dhabi, UAE
Abdelmajid Ben-Hamadou, Sfax University, Tunisia
Houda Bouamor, Carnegie Mellon University in Qatar
Fethi Bougares, Le Mans University, France
Karim Bouzoubaa, Mohammad V University, Morocco
Violetta Cavalli-Sforza, Al Akhawayn University, Morocco
Aloulou Chafik, Université de Sfax, Tunisia
Khalid Choukri, ELDA, European Language Resource Association, France
Shammur Absar Chowdhury, Qatar Computing Research Institute, HBKU, Qatar
Kareem Darwish, Qatar Computing Research Institute, HBKU, Qatar
Mona Diab, George Washington University, USA
Samhaa R. El-Beltagy, Nile University, Egypt
Wassim El-Hajj, American University of Beirut, Lebanon
Shady Elbassuoni, American University of Beirut, Lebanon
Obeida ElJundi, American University of Beirut, Lebanon
AbdelRahim Elmadany, UBC, Canada
Tamer Elsayed, Qatar University, Qatar
Sahar Ghannay, LIMSI, France
Nada Ghneim, Higher Institute for Applied Sciences and Technology, Syria
Nizar Habash, New York University Abu Dhabi, UAE
Bassam Haddad, University of Petra, Jordan
Lamia Hadrich-Belguith, University of Sfax, Tunisia
Hazem Hajj, American University of Beirut, Lebanon
Salima Harrat, École Normale Supérieure (Bouzaréah), Algeria
Maram Hasanain, Qatar University, Qatar
Go Inoue, New York University Abu Dhabi, UAE
Mustafa Jarrar, Bir Zeit University, Palestine
Ganesh Jawahar, The University of British Columbia, Canada
Salam Khalifa, New York University Abu Dhabi, UAE
Walid Magdy, University of Edinburgh, Scotland
Azzeddine Mazroui, University Mohamed I, Morocco
Seif Mechti, University of Sfax, Tunisia
Salima Medhaffar, Le Mans University, France
Karima Meftouh, Badji Mokhtar University, Algeria
Hamdy Mubarak, Qatar Computing Research Institute, HBKU, Qatar
El Moatez Billah Nagoudi, The University of British Columbia, Canada
Preslav Nakov, Qatar Computing Research Institute, HBKU, Qatar
Alexis Nasr, University of Marseille, France
Hamada Nayel, Benha University, Egypt
Younes Samih, Heinrich Heine Universität Düsseldorf, Germany
Khaled Shaalan, The British University in Dubai, UAE
Khaled Shaban, Qatar University, Qatar
Kamel Smaili, University of Lorraine, France
Peter Sullivan, The University of British Columbia, Canada
Reem Suwaileh, Qatar University, Qatar
Nadi Tomeh, University Paris 13, France
Samia Touileb, University of Oslo, Norway
Omar Trigui, University of Sousse, Tunisia
Wajdi Zaghouani, Hamad Bin Khalifa University, Qatar
Nasser Zalmout, Amazon Inc., USA
Taha Zerrouki, University of Bouira, Algeria
Chiyu Zhang, UBC, Canada
Imed Zitouni, Google, USA
Important Dates
Jan 18, 2021: Workshop Paper Due Date (Extended to Feb 1, 2021)
Feb 18: Notification of Acceptance (Pushed to Feb 22, 2021)
Mar 1: Camera-ready papers due (strict!)
April 19: Workshop Date (one day)
All deadlines are 11:59 pm UTC-12 (“anywhere on Earth”).
Paper Submission Instructions
Types of Papers: We invite research papers (long and short), demo papers, and shared task description papers.
Paper Length: Long research papers may consist of up to 8 pages of content, plus unlimited references; final versions of long papers will be given one additional page of content (up to 9 pages) so that reviewers’ comments can be taken into account. Short research papers, demo papers and shared task description papers may consist of up to 4 pages of content, plus unlimited references; final versions of short papers will be given one additional page of content (up to 5 pages) so that reviewers’ comments can be taken into account.
Submission Format: Please follow the official EACL LaTeX template.
Submission Website: https://www.softconf.com/eacl2021/WANLP2021/
Blind Reviewing Policy: The workshop follows a blind reviewing policy for the main workshop tracks (Long, Short, and Demo). The authors should omit their names and affiliations from the paper and avoid self-references that reveal their identity. Papers that do not conform to these requirements will be rejected without review.
Multiple Submission Policy: Papers that have been or will be submitted to other meetings or publications must indicate this at submission time. Authors must inform organizers immediately once a paper is to be withdrawn from the workshop for any reason. Attempting to publish the same paper, or one with major overlap (50%), elsewhere may lead to rejection of the paper even after an acceptance notification has gone out.
Anonymity and Supplementary Material: As the reviewing will be blind, papers must not include authors' names and affiliations. Furthermore, self-references that reveal the author's identity, e.g., "We previously showed (Smith, 1991) ..." must be avoided. Instead, use citations such as "Smith previously showed (Smith, 1991) ..." Papers that do not conform to these requirements will be rejected without review.
Papers should not refer, for further detail, to documents that are not available to the reviewers. For example, do not omit or redact important citation information to preserve anonymity. Instead, use third person or named reference to this work, as described above (“Smith showed” rather than “we showed”).
Papers may be accompanied by a resource (software and/or data) described in the paper. Papers that are submitted with accompanying software/data may receive additional credit toward the overall evaluation, and the potential impact of the software and data will be taken into account when making the acceptance/rejection decisions.
Anonymity Period: We follow the guidelines of EACL 2021 for Anonymity Period with the period being 1 month before the submission deadline up to the date of acceptance, rejection or withdrawal (January 1, 2021 to February 22, 2021).
Shared Tasks
Shared Task 1: NADI 2021
The Nuanced Arabic Dialect Identification (NADI) shared task targets province-level dialects, and as such is the first to focus on naturally occurring fine-grained dialects at the sub-country level. The first NADI shared task (NADI 2020) was held at WANLP 2020. The NADI 2021 shared task will continue to focus on fine-grained dialects, with new datasets and efforts to distinguish both MSA and dialects according to their geographical origin. The data covers a total of 100 provinces from all 21 Arab countries and comes from the Twitter domain. Evaluation and task setup follow the NADI 2020 shared task. The subtasks are:
Subtask 1:
Subtask 1.1: Country-level MSA identification: A total of 21,000 tweets, covering 21 Arab countries.
Subtask 1.2: Country-level DA identification: A total of 21,000 tweets, covering 21 Arab countries.
Subtask 2: Similar to Subtask 1 but focusing on the Province level
Participants will be provided with an additional 10M unlabeled tweets that can be used in developing their systems for either or both of the tasks. The evaluation metrics will include precision, recall, F1-score, and accuracy. The macro-averaged F-score will be the official metric.
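For participants unfamiliar with the official metric, the following is a minimal sketch of how a macro-averaged F1-score can be computed: F1 is calculated separately for each label and then averaged with equal weight, so rare countries or provinces count as much as frequent ones. The function name `macro_f1`, the choice to average over every label seen in either the gold or predicted lists, and the toy labels are assumptions made for illustration; this is not the official scoring script.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Unweighted mean of per-label F1 over all labels seen in gold or pred."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1          # correct prediction for label g
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[g] += 1          # g was missed

    labels = set(gold) | set(pred)
    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


# Toy example with hypothetical country labels.
gold = ["Egypt", "Egypt", "Iraq", "Oman"]
pred = ["Egypt", "Iraq", "Iraq", "Oman"]
print(round(macro_f1(gold, pred), 4))  # 0.7778
```

In the toy example, Egypt and Iraq each score 2/3 and Oman scores 1, giving a mean of 7/9. Accuracy on the same data would be 3/4, which illustrates that the two metrics can rank systems differently.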
Shared task page: https://sites.google.com/view/second-nadi-shared-task
Organizers:
Muhammad Abdul-Mageed, Chiyu Zhang (The University of British Columbia, Canada), Nizar Habash (New York University Abu Dhabi), and Houda Bouamor (Carnegie Mellon University, Qatar).
Shared Task 2: Sarcasm and Sentiment Detection In Arabic
Sarcasm detection is the process of identifying whether a piece of text is sarcastic or not. Sarcasm detection has received attention in other languages, but work on Arabic still lags behind. The shared task on Sarcasm and Sentiment Detection in Arabic will focus on analyzing tweets and identifying their sentiment and whether a tweet is sarcastic or not.
There are two subtasks in this shared task:
Subtask 1 (Sarcasm Detection): Identifying whether a tweet is sarcastic or not. This is a binary classification task.
Subtask 2 (Sentiment Analysis): Identifying the sentiment of a tweet and assigning one of three labels (Positive, Negative, Neutral). This is a multiclass classification task.
Shared task page: https://sites.google.com/view/ar-sarcasm-sentiment-detection/
Organizers:
Ibrahim Abu Farha (The University of Edinburgh, UK), Wajdi Zaghouani (Hamad Bin Khalifa University, Qatar) and Walid Magdy (The University of Edinburgh, UK)