🔗 Join the MADASR 2.0 Google Group to connect with participants and stay updated!
Recent advances in automatic speech recognition (ASR) have been driven by self-supervised learning (SSL) models such as wav2vec2, and large-scale multilingual systems like Whisper and Massively Multilingual ASR. Despite this progress, ASR for low-resource languages remains a major challenge, especially in linguistically diverse regions like India. With over 100 spoken languages and 22 constitutionally recognized ones, many Indian languages still lack adequate annotated data to train reliable ASR models.
Interestingly, many Indian languages share commonalities in script, grammar, and phonetics, making them well-suited for transfer learning and domain adaptation from high-resource counterparts. Multilingualism is also a norm in India, where speakers often code-switch between dialects and languages—using native dialects in informal contexts and standardized forms in formal communication. This underscores the need for scalable, multilingual dialect-aware ASR systems that reflect real-world usage. While multilingual ASR is gaining global attention, relatively little work has addressed the rich intra-language dialectal diversity within Indian low-resource languages.
Challenge:
Following the grand success of the MADASR 1.0 challenge, which focused on monolingual ASR methods for two Indian languages (Bengali and Bhojpuri), MADASR 2.0 shifts the focus to building multilingual ASR models. Brief details of the MADASR 2.0 challenge are given below.
This challenge focuses on building robust and scalable multilingual ASR pipelines using dialect-rich data from the RESPIN initiative. Participants will work with speech data spanning 33 dialects across 8 Indian languages—Bengali, Bhojpuri, Chhattisgarhi, Kannada, Magahi, Maithili, Marathi, and Telugu. For each language, two subsets are available: a smaller set of approximately 30 hours and a larger set of around 150 hours, both balanced across dialects and domains. Depending on the track, participants will use either the smaller subset (Tracks 1 & 3) or the full dataset (Tracks 2 & 4). The challenge encourages the exploration of innovative acoustic modelling techniques, including pretraining, finetuning, and other approaches tailored to dialect-diverse, low-resource scenarios.
Track 1: Train multilingual ASR from scratch using only the 30-hour per language RESPIN subset (low-resource baseline).
Track 2: Train ASR using only the full 150-hour per language RESPIN dataset (in-corpus high-resource).
Track 3: Train ASR using the 30-hour per language RESPIN subset along with any publicly available corpora or pretrained models.
Track 4: Train ASR using the full 150-hour per language RESPIN dataset along with any publicly available corpora or pretrained models.
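Across all four tracks, submitted systems are compared by how accurately they transcribe held-out speech. A standard metric for this in ASR research is the word error rate (WER): the word-level edit distance between the reference transcript and the system hypothesis, normalized by the reference length. The sketch below is an illustrative implementation of WER, not the official challenge scoring script; consult the challenge evaluation details for the exact metric and normalization used.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length.

    Illustrative only -- the official challenge scorer may apply
    text normalization (punctuation, casing, script handling) first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    # Dynamic-programming table: d[i][j] = edit distance between
    # the first i reference words and the first j hypothesis words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # deleting all i reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # inserting all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,        # deletion
                d[i][j - 1] + 1,        # insertion
                d[i - 1][j - 1] + sub,  # match or substitution
            )
    return d[len(ref)][len(hyp)] / max(len(ref), 1)
```

For example, `wer("the cat sat", "the cat sit")` gives one substitution over three reference words, i.e. about 0.33.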
If you are interested in participating in this challenge, please fill out the Registration Form.
Challenge Timeline (Tentative):
Registration opens - April 10, 2025  ✅
Dataset (train+dev) sharing - April 19, 2025  ✅
Baselines release - April 22, 2025   ✅
Dataset (test) sharing - May 25, 2025
Challenge Submission opens - May 31, 2025
Final challenge submission - June 21, 2025
Challenge results declaration - June 22, 2025
Challenge paper submission deadline - June 25, 2025
Note: Anywhere on Earth (AoE) time is used for all deadlines.