To encourage broad participation from research communities worldwide and to bring out novel approaches, the challenge is organized into 4 tracks. It aims to develop effective, scalable pipelines for multilingual ASR using dialect-rich data from the RESPIN initiative. The RESPIN dataset includes 33 dialects across 8 Indian languages: Bengali, Bhojpuri, Chhattisgarhi, Kannada, Magahi, Maithili, Marathi, and Telugu. Each language has five 30-hour subsets, balanced across domains and dialects, and a larger 150-hour set is also available. Depending on the track, participants can use a 30-hour subset (Tracks 1 & 3) or the full 150-hour dataset (Tracks 2 & 4). The challenge encourages exploration of the trade-offs between acoustic and textual data in low-resource ASR, including the impact of pretrained models and external corpora. The 4 tracks are summarized below:
Track 1: Train multilingual ASR from scratch using only the 30-hour-per-language RESPIN subset (~240 hours of data in total for all 8 languages) (low-resource baseline).
Track 2: Train ASR using only the full 150-hour-per-language RESPIN dataset (~1200 hours of data in total for all 8 languages) (in-corpus high-resource).
Track 3: Train ASR using the 30-hour-per-language RESPIN subset along with any publicly available corpora or pretrained models (~240 hours of data for all 8 languages + any external resources).
Track 4: Train ASR using the full 150-hour-per-language RESPIN dataset along with any publicly available corpora or pretrained models (~1200 hours of data for all 8 languages + any external resources).
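
The track definitions above can be sketched as a small configuration, which also reproduces the approximate totals (8 languages x 30 hours = 240 hours; 8 languages x 150 hours = 1200 hours). This is only an illustrative sketch: the names `Track`, `TRACKS`, and `LANGUAGES` are hypothetical and are not part of any official challenge tooling.

```python
from dataclasses import dataclass

# The eight RESPIN languages named in the challenge description.
LANGUAGES = [
    "Bengali", "Bhojpuri", "Chhattisgarhi", "Kannada",
    "Magahi", "Maithili", "Marathi", "Telugu",
]


@dataclass(frozen=True)
class Track:
    """Hypothetical container for one track's data constraints."""
    number: int
    hours_per_language: int           # RESPIN hours allowed per language
    external_resources_allowed: bool  # public corpora / pretrained models

    @property
    def total_hours(self) -> int:
        # Approximate RESPIN hours summed over all languages.
        return self.hours_per_language * len(LANGUAGES)


# Values taken from the track descriptions above.
TRACKS = {
    1: Track(1, 30, False),
    2: Track(2, 150, False),
    3: Track(3, 30, True),
    4: Track(4, 150, True),
}

if __name__ == "__main__":
    for track in TRACKS.values():
        external = "allowed" if track.external_resources_allowed else "not allowed"
        print(
            f"Track {track.number}: ~{track.total_hours} h RESPIN data, "
            f"external resources {external}"
        )
```

A pipeline could consult such a table to decide, for example, whether loading a pretrained model is permitted for the chosen track.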