We invite contributions on topics including, but not limited to, the following:
Data collection, annotation, and curation for under-resourced languages (crowdsourcing, participatory methods, gamification, unsupervised or weakly supervised methods)
Learning with limited supervision (zero- or few-shot, PEFT, RAG with linguistic resources)
Multilingual alignment, representation learning, and language embeddings, including rare languages
Speech, multimodal, and cross-modal technologies for under-resourced languages (speech recognition, synthesis, speech-to-text, speech translation, multimodal resources)
Basic text processing (normalization, orthography, transliteration, tokenization/segmentation, morphological and syntactic processing) in and for low-resource settings
Low-resource machine translation (pivoting, alignment, synthetic data)
Evaluation frameworks, benchmarks, and metrics designed or adapted for underrepresented languages
Adaptation, domain adaptation, and robustness to domain shift in low-resource contexts
Responsible approaches, ethical issues, community engagement, data sovereignty, and language revitalization
Deployment, tools, and practical systems for underserved languages (e.g., mobile apps, dictionary or translation apps, linguistic tools)
Case studies of successes and negative results (with lessons learned)
Interoperability, standardization, and metadata practices for datasets in low-resource scenarios
Special Themes
Language modeling for intra-language variation, dialects, accents, and regional variants of less-resourced languages
Many less-resourced languages display rich internal diversity, including dialects, accents, and regional or social varieties. This special theme focuses on developing language models and speech technologies that capture and respect intra-language variation rather than reduce it to a single “standard.” We welcome work on dialect identification and adaptation, accent-robust speech systems, normalization vs. diversity-preserving modeling, and cross-dialect transfer in low-data scenarios. Approaches combining linguistic insights, community participation, and ethical awareness are especially encouraged. The aim is to build technologies that reflect and sustain the true linguistic richness of under-resourced languages.
Ultra-Low-Resource Language Adaptation
This special theme focuses on methods that enable effective language and speech technology development under extreme data scarcity. We invite research on transfer learning, cross-lingual adaptation, multilingual pretraining, and self-supervised or few-shot approaches tailored to ultra-low-resource settings. Work on evaluation, data augmentation (including synthetic data), and leveraging typological or linguistic knowledge is also welcome. The goal is to advance techniques that extend modern language technologies to the most underrepresented languages, ensuring inclusivity in the digital age.
Community-Led Project Showcase
To help ground research in community needs, we invite brief (5–10 min) presentations by language community members, NGOs, or practitioners describing real-world challenges or resource needs. Position papers or research posters are appropriate formats for this category.