AfricaNLP 2023 Workshop
African NLP in the Era of Large Language Models.
(Co-located with ICLR 2023, 5th May 2023)
About the Workshop
Over 1 billion people live in Africa, and its residents speak more than 2,000 languages. But those languages are among the least represented in NLP research, and work on African languages is often sidelined at major venues. In 2022, the wave of large language models built through collaborative networks and large investments in compute reached the shores of African languages, with the release of large multilingual models such as BLOOM and, for machine translation, NLLB-200. While those models have been open-sourced, their impact on the community of African NLP researchers is yet to be assessed and deserves wider discussion. This has inspired the theme for the 2023 workshop: African NLP in the Era of Large Language Models.
The workshop has several aims:
to invite a variety of speakers from industry, research networks and academia to get their perspectives on the development of large language models and how African languages have and have not been represented in this work
to provide a venue to discuss the benefits and potential harms of these language models for the speakers of African languages and African researchers
to enable positive interaction between academic, industry, and independent researchers around this theme and encourage collaboration and engagement for the benefit of the African continent
to foster further relationships between the African linguistics and NLP communities, since linguistic input about African languages is key to the evaluation and development of African models
to showcase work being done by the African NLP community and provide a platform to share this expertise with a global audience interested in NLP techniques for low-resource languages
to promote multidisciplinarity within the African NLP community with the goal of creating a holistic participatory NLP community that will produce NLP research and technologies that value fairness, ethics, decolonial theory, and data sovereignty
to provide a platform for the groups involved with the various projects to meet, interact, share and forge closer collaboration
to provide a platform for junior researchers to present papers, solutions, and begin interacting with the wider NLP community
to present an opportunity for more experienced researchers to further publicize their work and inspire younger researchers through keynotes and invited talks
This workshop follows the successful previous editions in 2020, 2021, and 2022. It will be hybrid and co-located with ICLR 2023. No paper will be automatically desk-rejected :).
Submission Deadline: 5th February, 2023 (AoE)
Acceptance Notifications: 3rd March, 2023 (AoE)
Camera-ready: 15th April, 2023 (AoE)
Workshop date: 5th May, 2023 in Kigali, Rwanda & Virtual
Perez Ogayo is a master's student at Carnegie Mellon University's Language Technologies Institute (LTI). Prior to her studies at Carnegie Mellon, she received her BSc in Computer Science from African Leadership University-Rwanda. Perez's research pursuits lie in the realm of multilingual and low-resource natural language processing (NLP), where she focuses on machine translation, speech synthesis and recognition, and NLP for endangered languages. Additionally, she is interested in the efficient deployment of NLP models on smaller devices, as she recognizes the importance of accessibility and sustainability in the field. Alongside her studies at Carnegie Mellon, Perez currently serves as a researcher at Masakhane, where she works on the Luo, Swahili, and Suba languages.
Paul Azunre holds a Ph.D. in Computer Science from MIT and has served as a principal investigator on several DARPA research programs. He founded Algorine Inc., a research lab dedicated to advancing AI/ML and identifying scenarios where they can have a significant social impact. Paul also co-founded Ghana NLP, an open-source initiative focused on using NLP and transfer learning with Ghanaian and other low-resource languages. He also serves as Director of Research at Dun & Bradstreet, a company helping businesses manage supply chain risk and other business analytics challenges. He is the author of the recently published book, "Transfer Learning for NLP" by Manning Publications.
Elizabeth Salesky is a Ph.D. student at Johns Hopkins University, advised by Philipp Koehn and Matt Post. Her research primarily focuses on language representations for machine translation and multilinguality, including how to create models that are more data-efficient and robust to variation across languages and data sources.
Dr. Seid Muhie Yimam
Dr. Seid Muhie Yimam is currently a technical lead at HCDS and a research associate at the Language Technology Group, under the supervision of Prof. Chris Biemann. At HCDS, he will mostly work on leading and consulting on digital humanities research that involves big data processing of textual content. He will continue teaching NLP and data science courses in-house while supervising students on interdisciplinary AI and data science research topics. He is currently participating in the development of a research data and knowledge management project, an interdisciplinary project spanning knowledge management, AI, and library science. The project is envisioned to automatically ingest metadata from research reports and projects from diverse sources and to present the outcomes using appealing visualization components.
He has been working as a postdoctoral researcher at the Language Technology Group, UHH, since January 2020. He received his Ph.D. degree from the Universität Hamburg, with a specialization in the integration of adaptive machine learning models into annotation tools and NLP applications. From January 2020 to March 2022, he worked on multiple research topics, including social media NLP (hate speech detection, fake news identification, and sentiment analysis) and low-resource language NLP research, mostly for the Ethiopian language of Amharic, including named entity recognition, semantic models, hate speech detection, and sentiment analysis. He has been teaching NLP courses and supervising Master’s projects and theses in the group.
Laurent Besacier is a principal scientist and Natural Language Processing (NLP) research team lead at Naver Labs Europe. Before that, he became a professor at the University Grenoble Alpes (UGA) in 2009 where he led the GETALP group (natural language and speech processing). Laurent is still affiliated with UGA.
His main research expertise and interests lie in the field of natural language processing, automatic speech recognition, machine translation, under-resourced languages, machine-assisted language documentation and the evaluation of NLP systems.
Asmelash Teka Hadgu
Asmelash Teka Hadgu is the co-founder and CTO of Lesan and a fellow at the Distributed AI Research Institute (DAIR). At Lesan, he has built state-of-the-art machine translation systems to and from Amharic, Tigrinya, and English. Prior to Lesan, Asmelash did his Ph.D. at the Leibniz University Hannover, where his research focused on applied machine learning for applications in scholarly communication, crisis communication, and natural language processing in low-resource settings. Currently, as part of the Lesan-DAIR partnership, he is working on language technologies for Ge’ez-based languages such as Tigrinya and Amharic.
Audace Niyonkuru is the Chief Executive Officer of Digital Umuganda, an AI and open data company focused on democratising access to information in African languages by creating open and publicly available datasets to spur AI research and innovation on the continent. He is also a member of the United Nations Internet Governance Forum Multistakeholder Advisory Group.
Call for Sponsors
The AfricaNLP Workshop has been an essential gathering for the AfricaNLP community (Masakhane) for many years. It serves as a platform for African scholars, practitioners, and students to showcase their work and for junior students to launch their research careers. Thanks to the generosity of sponsors like you in the past, we have been able to provide over 115 scholarships to African researchers to attend our workshop. Read about their experience.
By supporting us, your company will receive valuable promotion through branding opportunities and access to our recruitment database. Additionally, you will have the opportunity to lead a session for our attendees. The specific benefits and sponsorship tiers are available in the sponsorship guide.
David Ifeoluwa Adelani
Research Fellow, UCL
Bonaventure F. P. Dossou
Ph.D. Student, Mila & McGill
Ph.D. Student, UPorto
Atnafu Lambebo Tonja
Ph.D. Student, IPN
Research Scientist, Meta AI
Postdoc, RIKEN Center for AIP
Ph.D. Student, DeustoTech
Ph.D. Student, Insight Centre, University of Galway
Tajuddeen Rabiu Gwadabe
Project Manager, Masakhane Research Foundation
Ph.D. Student, University of Amsterdam
Everlyn Asiko Chimoto
Ph.D. Student, University of Cape Town, AIMS
Contacts & Slack Workplace
You're invited to join the Masakhane community Slack (channel #africanlp-iclr2023-support). Meet other participants and find collaborators, mentors, and advice there. Organizers will be available on Slack to answer questions regarding submissions, format, topics, etc. If you have any doubt about whether you can contribute to this workshop (e.g. if you have never written a paper, if you are new to NLP, if you do not have any collaborators, if you do not know LaTeX, etc.), please join the Slack and contact us there as well.
To contact the workshop organizers please send an email to: africanlp-ICLR2023@googlegroups.com