In recent years, social media platforms have revolutionized the way people communicate, particularly in the Arab World. Adoption of platforms such as Facebook and Twitter is not only high in Arabic-speaking countries; these platforms have also been credited with playing a major role in the Arab Spring. Social media can provide a window into the lives of users, but doing so for Arab users requires: a) effective processing and analysis of Arabic text, including Modern Standard Arabic (MSA) and the different Arabic dialects; and b) an understanding of the general phenomena that characterize social media in the Arab World. The former calls for Natural Language Processing (NLP) methods and tools tailored to Arabic social media text for tasks such as text normalization and segmentation, named entity extraction, sentiment analysis, and dialect identification. The latter requires a proper understanding of the specificities of Arabic social media, such as the prevalence of specific genres (e.g. religious content), the relative use of formal vs. informal language, and the prevalence of bilingualism and code switching.
The Qatar National Research Fund (QNRF) has been funding several research projects in this field, such as the ARAP project on Author Profiling for Cyber-Security. In this context, the workshop encourages researchers to contribute recent work that showcases advances in Arabic NLP as they relate to social media and highlights specific phenomena that characterize Arabic social media. The workshop aims to break down barriers to using technology to address social science research questions in the Arabic-speaking world. The Arabic-speaking region extends from the Middle East to North Africa, covering a wide area with distinct geographies, ethnicities, governments and more. These features add unique and interesting complexities to social and sociological studies of the region.
Workshop topics and themes include, but are not limited to, the following:
- Challenges of Arabic social media processing
- Mining Arabic social data
- Opinion mining and social media analytics of the Arabic language
- Credibility of online Arabic content
- Arabic social media and health behaviors
- Annotated corpora and resources for the Arabic language in social media
- Methods and Algorithms
- Dialectal Arabic processing in social media
- Sentiment analysis
- Arabic dialect modeling
- Detection of hate speech on Arabic social media
- Market analysis and surveys using Arabic social media
- Demographics of Arabic users: gender, age, country, religion, etc.
- Code switching between MSA and dialects
- Core technologies for Arabic in social media, such as morphological analysis, disambiguation, tokenization, POS tagging, named entity detection, chunking, parsing, and semantic role labeling
Workshop Format / Abstract Submission
Participants are asked to submit abstracts that address one of the topics above from the perspective of use cases, tools, resources, or preliminary experimental results. Abstracts will be reviewed by the workshop organizers, and authors of selected abstracts will be offered a time slot for a short presentation (10 minutes each) to present their ideas. Submitted abstracts will be published online on this website.
To submit your abstract (1-2 pages long), please send it by email before the deadline, Sept 22 2019, to: email@example.com
Submitted abstracts should follow the SocInfo format, in Word or LaTeX.
In the second part of the workshop, we will organize breakout discussions based on the themes captured in the abstracts. This will allow problem owners and solution providers to have deeper discussions that may lead to collaborations, joint white papers, etc.
Deadline for abstract submission: Sept 22 2019
Acceptance notification: Sept 30 2019
Workshop day: Nov 18 2019
Workshop Schedule (Johara Room, Hilton Doha)
8:30-8:45 Opening Remarks
8:45-9:45 Social Media Analysis -- 8 minutes each presentation + 2 minutes for questions
- GCC Verified and Unverified Twitter Users: An Investigation of Variance by Gender, Language and Content. Hamda K. Al Boinin
- Twitter Analysis of the role of women in MENA Region. Hanan Dorri
- Pre and Posthumous Twitter: The Case of Jamal Khashoggi. Marc Owen Jones and George Mikros
- Collecting and Cleaning Social Media Data in Arabic. Amna Al-Ansari
- An Analysis of the Emojis used by Twitter Users in Qatar. Souad Aqeel
- Sentiment Preservation in Translated Arabic User Generated Content. Pintu Lohar, Muhannad Albayk Jaam, Haithem Afli, Grace Tang, and Andy Way
10:00-11:00 Corpus and Applications -- 8 minutes each presentation + 2 minutes for questions
- Propaganda Accounts in the Arab World. Kareem Darwish and Preslav Nakov
- An Arabic Twitter Corpus for Irony Detection. Ines Abbes and Wajdi Zaghouani
- Taxonomy of Arabic Offensive Speech. Hamdy Mubarak, Younis Samih, Ahmed Abdelali, Kareem Darwish
- Detecting Deception in Seven Truth Seven Lies translated into Arabic. Francisco Rangel, Paolo Rosso, Anis Charfi, and Wajdi Zaghouani
- ARAP-Tweet 2.0: A Fine-Grained Multi-Dialectal Arabic Corpus Annotated with Age, Gender, and Dialect Information. Anis Charfi, Wajdi Zaghouani, Syed Hassan Mehdi, and Esraa Mohamed
- Using machine learning algorithms for suicidal profile detection in social networks. Atika Mbarek, Salma Jamoussi, and Abdelmajid Ben Hamadou
11:15-12:20 Breakout discussions (3-4 groups)
12:20-13:00 Group presentations from each breakout discussion and closing remarks