Note: most papers are linked; those that are not should be easy to find on Google Scholar (https://scholar.google.com).
Papers marked with a present emoji (🎁) can be chosen for presentation. If you want to present one of these papers, please email Owen (owen.rambow@stonybrook.edu).
The following list is a work in progress.
The basic text is:
Nizar Y. Habash, Introduction to Arabic Natural Language Processing (Morgan & Claypool Synthesis Lectures on Human Language Technologies)
2010, 187 pages, (https://doi.org/10.2200/S00277ED1V01Y201008HLT010)
It is available electronically from the Stony Brook Library here.
Wolfdietrich Fischer, 2006. Grammatik des klassischen Arabisch.
🎁 Hamdah Alghamdi and Eleni Petraki, 2018. Arabizi in Saudi Arabia: A Deviant Form of Language or Simply a Form of Expression? Soc. Sci. 2018, 7(9), 155.
🎁 Aula Khatteb Abu-Liel, Zohar Eviatar & Bracha Nir (2019). Writing between languages: the case of Arabizi, Writing Systems Research, 11:2. (Search in library for free download.)
Habash, Nizar; Eryani, Fadhl; Khalifa, Salam; Rambow, Owen; Abdulrahim, Dana; Erdmann, Alexander; Faraj, Reem; Zaghouani, Wajdi; Bouamor, Houda; Zalmout, Nasser; Hassan, Sara; Al-Shargi, Faisal; Alkhereyf, Sakhar; Abdulkareem, Basma; Eskander, Ramy; Salameh, Mohammad; Saddiki, Hind, 2018. Unified guidelines and resources for Arabic dialect orthography. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC-2018).
🎁 Vergyri, Dimitra and Kirchhoff, Katrin, 2004. Automatic Diacritization of Arabic for Acoustic Modeling in Speech Recognition. Proceedings of the Coling Workshop on Computational Approaches to Arabic Script-based Languages.
🎁 Rani Nelken and Stuart M. Shieber. Arabic diacritization using weighted finite-state transducers. In Proceedings of the 2005 ACL Workshop on Computational Approaches to Semitic Languages, pages 79-86, Ann Arbor, Michigan, June 2005.
Habash, Nizar and Rambow, Owen, 2007. Arabic Diacritization through Full Morphological Tagging. In Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics (NAACL HLT 2007); Companion Volume, Short Papers.
🎁 Mubarak, Hamdy and Abdelali, Ahmed and Darwish, Kareem and Eldesouki, Mohamed and Samih, Younes and Sajjad, Hassan, 2019. A System for Diacritizing Four Varieties of Arabic. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP): System Demonstrations.
🎁 Mubarak, Hamdy and Abdelali, Ahmed and Sajjad, Hassan and Samih, Younes and Darwish, Kareem, 2019. Highly Effective Arabic Diacritization using Sequence to Sequence Modeling. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers).
Al-Badrashiny, Mohamed; Eskander, Ramy; Habash, Nizar; and Rambow, Owen, 2014. Automatic Transliteration of Romanized Dialectal Arabic. In Proceedings of the Eighteenth Conference on Computational Natural Language Learning.
Eskander, Ramy; Al-Badrashiny, Mohamed; Habash, Nizar; and Rambow, Owen, 2014. Foreign Words and the Automatic Processing of Arabic Social Media Text Written in Roman Script. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP).
🎁 Ori Terner and Kfir Bar and Nachum Dershowitz, 2020. Transliteration of Judeo-Arabic Texts into Arabic Script Using Recurrent Neural Networks. arXiv:2004.11405.
🎁 Shazal, Ali and Usman, Aiza and Habash, Nizar, 2020. A Unified Model for Arabizi Detection and Transliteration using Sequence-to-Sequence Models. Proceedings of the Fifth Arabic Natural Language Processing Workshop at ACL.
🎁 Masmoudi, Abir; Khmekhem, Mariem Ellouze; Khrouf, Mourad; Belguith, Lamia Hadrich, 2020. Transliteration of Arabizi into Arabic Script for Tunisian Dialect. ACM Transactions on Asian and Low-Resource Language Information Processing.
Eskander, Ramy; Habash, Nizar; Rambow, Owen; and Tomeh, Nadi, 2013. Processing Spontaneous Orthography. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.
🎁 McCarthy, John J., "A prosodic theory of nonconcatenative morphology" (1981). Linguistic Inquiry. 26.
Habash, Nizar and Rambow, Owen, 2006. Morphological Analysis for Arabic Dialects. In Proceedings of the Joint Conference of the International Committee on Computational Linguistics and the Association for Computational Linguistics (ACL-Coling’06). Sydney, Australia.
Mohamed Altantawy, Nizar Habash, Owen Rambow and Ibrahim Saleh, 2010. Morphological Analysis and Generation of Arabic Nouns: A Morphemic Functional Approach. In Proceedings of LREC 2010.
Habash, N. and Rambow, O. (2005). Arabic Tokenization, Part-of-Speech Tagging and Morphological Disambiguation in One Fell Swoop. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05), pages 573–580, Ann Arbor, Michigan.
Pasha, Arfath; Al-Badrashiny, Mohamed; Kholy, Ahmed; Eskander, Ramy; Diab, Mona; Habash, Nizar; and Rambow, Owen, 2014. Madamira: A fast, comprehensive tool for morphological analysis and disambiguation of Arabic. In Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC).
Nasser Zalmout and Nizar Habash. 2017. Don’t throw those morphological analyzers away just yet: Neural morphological disambiguation for Arabic. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 715–724. Yash presents on 3/29.
?Zalmout, Nasser and Habash, Nizar, 2020. Joint Diacritization, Lemmatization, Normalization, and Fine-Grained Morphological Tagging. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL2020). Yash presents on 3/29.
🎁 Zalmout, Nasser and Erdmann, Alexander and Habash, Nizar, 2018. Noise-Robust Morphological Disambiguation for Dialectal Arabic. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers).
🎁 Khalifa, Salam and Zalmout, Nasser and Habash, Nizar, 2020. Morphological Analysis and Disambiguation for Gulf Arabic: The Interplay between Resources and Methods. Proceedings of the 12th Language Resources and Evaluation Conference (LREC2020).
🎁 Alkuhlani, Sarah and Habash, Nizar, 2012. Identifying Broken Plurals, Irregular Gender, and Rationality in Arabic Text. Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics.
Habash, Nizar; Roth, Ryan; Rambow, Owen; Eskander, Ramy; and Tomeh, Nadi, 2013. Morphological Analysis and Disambiguation for Dialectal Arabic. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.
Eskander, Ramy; Habash, Nizar; Rambow, Owen; and Pasha, Arfath, 2016. Creating Resources for Dialectal Arabic from a Single Annotation: A Case Study on Egyptian and Levantine. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, Osaka, Japan.
Resource: Stanford Arabic parser
Chiang, David; Diab, Mona; Habash, Nizar; Rambow, Owen; and Sharif, Safiullah, 2006. Parsing Arabic Dialects. In Proceedings of the 11th Conference of the European Chapter of the Association for Computational Linguistics. Trento, Italy.
Habash, Nizar; Gabbard, Ryan; Rambow, Owen; Kulick, Seth; and Marcus, Mitch, 2007. Determining Case in Arabic: Learning Complex Linguistic Behavior Requires Complex Linguistic Features. In Proceedings of the Conference on Empirical Methods in Natural Language Processing and the Conference on Computational Natural Language Learning (EMNLP-CoNLL 2007).
🎁 Spence Green and Christopher D. Manning. 2010. Better Arabic Parsing: Baselines, Evaluations, and Analysis. In COLING.
Marton, Yuval; Habash, Nizar; and Rambow, Owen, 2013. Dependency parsing of Modern Standard Arabic with lexical and inflectional features. Computational Linguistics 39 (1), 161-194.
🎁 Anas Shahrour, Salam Khalifa, and Nizar Habash. 2015. Improving Arabic diacritization through syntactic analysis. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, EMNLP 2015, Lisbon, Portugal, September 17-21, 2015, pages 1309–1315.
🎁 Anas Shahrour, Salam Khalifa, Dima Taji, and Nizar Habash. 2016. CamelParser: A system for Arabic syntactic analysis and morphological disambiguation. In Proceedings of the International Conference on Computational Linguistics (COLING), pages 228–232.
🎁 Dima Taji, Nizar Habash, and Daniel Zeman. 2017. Universal dependencies for Arabic. In Proceedings of the Workshop for Arabic Natural Language Processing (WANLP), Valencia, Spain.
🎁 Kankanampati, Yash and Le Roux, Joseph and Tomeh, Nadi and Taji, Dima and Habash, Nizar, 2020. Multitask Easy-First Dependency Parsing: Exploiting Complementarities of Different Dependency Representations. Proceedings of the 28th International Conference on Computational Linguistics.
Cairene and Sanaani: Watson JCE (2002, 2007) Phonology and morphology of Arabic (the phonology of the world’s languages). USA: Oxford University Press.
Algerian (Tlemcen): William Marçais, Le dialecte arabe parlé à Tlemcen, 1902 https://books.google.com/books?id=PB4UAAAAYAAJ&printsec=frontcover&source=gbs_ge_summary_r&cad=0#v=onepage&q&f=false
Bouamor, Houda; Habash, Nizar; Salameh, Mohammad; Zaghouani, Wajdi; Rambow, Owen; Abdulrahim, Dana; Obeid, Ossama; Khalifa, Salam; Eryani, Fadhl; Erdmann, Alexander; Oflazer, Kemal, 2018. The MADAR Arabic dialect corpus and lexicon. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC-2018).
Alshargi, Faisal; Dibas, Shahd; Alkhereyf, Sakhar; Faraj, Reem; Abdulkareem, Basmah; Yagi, Sane; Kacha, Ouafaa; Habash, Nizar; Rambow, Owen, 2019. Morphologically Annotated Corpora for Seven Arabic Dialects: Taizi, Sanaani, Najdi, Jordanian, Syrian, Iraqi and Moroccan. In Proceedings of the Fourth Arabic Natural Language Processing Workshop at ACL2019, pp. 137-147.
Habash, Nizar; Rambow, Owen; Diab, Mona; and Kanjawi-Faraj, Reem, 2008. Guidelines for Annotation of Arabic Dialectness. In Proceedings of the LREC Workshop on HLT & NLP within the Arabic world. Marrakesh, Morocco.
Maamouri, Mohamed; Bies, Ann; Buckwalter, Tim; Diab, Mona; Habash, Nizar; Rambow, Owen; Tabessi, Dalila. Developing and Using a Pilot Dialectal Arabic Treebank. In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC2006). Genoa, Italy.
🎁 Ahmed Hamdi, Rahma Boujelbane, Nizar Habash, Alexis Nasr. The Effects of Factorizing Root and Pattern Mapping in Bidirectional Tunisian - Standard Arabic Machine Translation. MT Summit 2013, Sep 2013, France.
🎁 Janet C. E. Watson, 1999: The Directionality of Emphasis Spread in Arabic. Linguistic Inquiry.
Ahyad, H., & Becker, M. (2020). Vowel unpredictability in Hijazi Arabic monosyllabic verbs. Glossa: A Journal of General Linguistics, 5(1), 32.
🎁 Samih, Younes and Attia, Mohammed and Eldesouki, Mohamed and Abdelali, Ahmed and Mubarak, Hamdy and Kallmeyer, Laura and Darwish, Kareem, 2017. A Neural Architecture for Dialectal Arabic Segmentation. Proceedings of the Third Arabic Natural Language Processing Workshop at ACL.
🎁 Biadsy, Fadi and Hirschberg, Julia and Habash, Nizar, 2009. Spoken Arabic Dialect Identification Using Phonotactic Modeling. Proceedings of the EACL 2009 Workshop on Computational Approaches to Semitic Languages.
Farber, Benjamin; Freitag, Dayne; Habash, Nizar; and Rambow, Owen, 2008. Improving NER in Arabic Using a Morphological Tagger. In Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC2008). Marrakesh, Morocco.
Eskander, Ramy and Rambow, Owen, 2015. SLSA: A Sentiment Lexicon for Standard Arabic. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP2015). http://aclweb.org/anthology/D15-1304