SUMEval-2: The 2nd Workshop on Scaling Up Multilingual & Multi-Cultural Evaluation
COLING 2025
Massively Multilingual Language Models (MMLMs) such as mBERT, XLM-R, and XY-LENT support around 100 of the world's languages, and generative models such as GPT-4 and BLOOM are drawing growing attention from the NLP community and the public. However, most existing multilingual NLP benchmarks cover only a handful of cultures and languages. The languages present in evaluation benchmarks are usually high-resource and largely belong to the Indo-European language family, and the cultures they represent are correspondingly skewed toward Western society. As a result, current evaluation is unreliable and does not give a full picture of MMLM performance across the linguistic and cultural landscape. Although efforts are underway to create benchmarks that cover a wider variety of tasks, cultures, languages, and language families, it is unlikely that we will ever be able to build benchmarks covering all languages and cultures. This has spurred recent interest in alternative strategies for evaluating MMLMs, including performance prediction and machine translation of test data.
Workshop Date: January 20th, 2025
Location: Abu Dhabi, in person
Find the full (tentative) program here.
In addition to regular papers submitted to the workshop, we will also accept papers submitted elsewhere and papers with COLING and ARR reviews. Papers submitted elsewhere will not be included in the proceedings, but participants will have a chance to present them during the workshop. Please get in touch if you have any questions.
Archival submission through Softconf.
Submitted manuscripts must be 8 pages long, with unlimited pages for references and appendices. We follow the ARR submission guidelines; for more information about templates, guidelines, and instructions, see the ARR CFP guidelines. We encourage authors to include a broader impact and ethical concerns statement, following the ARR Ethics Policy of the main conference.
All submissions will be double-blind peer-reviewed (with author names and affiliations removed) by the program committee and judged on their relevance to the workshop themes.
Please note that at least one author of each accepted paper must register for the workshop and present the paper in person.
The deadline for Archival Submissions has been extended to November 08, 2024! Submit through Softconf.
We are pleased to announce that, thanks to Microsoft's generous sponsorship, there will be cash awards for two best papers (one in the archival track and one in the non-archival track).
Archival Submission: November 08, 2024 (extended from October 18th, 2024)
Non-archival Submission: November 25th, 2024
Notification of Acceptance: December 3rd, 2024
Camera-Ready Papers Due: December 10th, 2024
Workshop Date: January 20th, 2025
Location: Abu Dhabi, in person
All deadlines are 11:59 PM (Anywhere on Earth)
This workshop is an extension of the SumEval 2022 workshop, with a wider scope that covers multicultural evaluation in addition to multilingual evaluation. Topics of interest include, but are not limited to:
Under-represented cultures and languages
Studies on scaling up multilingual and multicultural evaluation
Human evaluation of multilingual and multicultural aspects of models
Automated evaluation metrics for multilingual and multicultural evaluation
Studies on fairness and other aspects of evaluation
Studies on cultural representation in multilingual models
Datasets, benchmarks, or libraries for evaluating multilingual models
Probing and analysis of multilingual models across cultures and languages
Comparison of evaluation strategies for languages, cultures, and domains
Submission types:
We welcome various types of work on these (and related) topics, including:
dataset and other resource papers;
novel models and techniques;
empirical studies;
position papers to propose promising new tasks or directions that the field should pursue;
literature review of a subfield;
approaches to interdisciplinary collaboration;
user study designs and user surveys.
to be announced
Hellina Hailu Nigatu, UC Berkeley
Monojit Choudhury, MBZUAI
Oana Ignat, Santa Clara University
Sebastian Ruder, Cohere
Sunayana Sitaram, Microsoft
Vishrav Chaudhary, Microsoft
Agrima Seth, University of Michigan
Angana Borah, University of Michigan
Hailay Kidu Teklehaymanot, L3S Research Center
Joan Nwatu, University of Michigan
Muhammed Farid Adilazuarda, MBZUAI
Sougata Saha, MBZUAI
Tolúlọpẹ́ Ògúnrẹ̀mí, Stanford University
For questions and sponsorship inquiries, feel free to contact us at sumeval2025@gmail.com.
Sponsor