Qn 1) How can I access the dataset?
Ans) The audio files of the corpus are available at https://ee.iisc.ac.in/madasr23dataset/ and the transcripts and processed data are available in the challenge GitHub repository: https://github.com/bloodraven66/RESPIN_ASRU_Challenge_2023/
Qn 2) Will there be separate evaluations for Bengali and Bhojpuri?
Ans) Yes, the test set evaluation will be separate. Participants may build monolingual or multilingual speech recognizers, but there will be a separate leaderboard for each language.
Qn 3) I see that in the train and dev data, there is speaker and dialect information about each utterance. Will this also be provided for test data?
Ans) Yes. Language, dialect, speaker, and sentence IDs will be provided for test-set utterances.
Qn 4) Is using pretrained models (such as wav2vec) allowed in Track 1, or only in Tracks 3 and 4?
Ans) No, it is not allowed in Track 1. Using pretrained models involves leveraging external acoustic features, so it is allowed only in Tracks 3 and 4.
Qn 5) How will the evaluation be performed?
Ans) Jiwer (https://pypi.org/project/jiwer/) will be used to compute test set level WER/CER for Bengali and Bhojpuri separately.
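For participants who want to sanity-check their scores locally, the following is a minimal, dependency-free sketch of a test-set-level WER/CER: total edit operations divided by total reference tokens over all utterances. It is not the organizers' evaluation script. The function names (edit_distance, corpus_error_rate, wer, cer) are hypothetical, and it assumes words are split on whitespace and characters are Unicode code points. The official numbers come from jiwer and may differ if its default text handling differs from these assumptions.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def corpus_error_rate(references, hypotheses, tokenize):
    """Total edits divided by total reference tokens over the whole set."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have equal length")
    edits = 0
    total = 0
    for ref, hyp in zip(references, hypotheses):
        ref_tokens = tokenize(ref)
        edits += edit_distance(ref_tokens, tokenize(hyp))
        total += len(ref_tokens)
    if total == 0:
        raise ValueError("references contain no tokens")
    return edits / total


def wer(references, hypotheses):
    # Assumption: words are whitespace-separated.
    return corpus_error_rate(references, hypotheses, str.split)


def cer(references, hypotheses):
    # Assumption: a character is one Unicode code point (spaces included).
    return corpus_error_rate(references, hypotheses, list)


# Placeholder sentences; score each language's test set separately.
refs = ["the cat sat", "hello world"]
hyps = ["the cat sit", "hello world"]
print(wer(refs, hyps), cer(refs, hyps))
```

Calling wer and cer once on the Bengali utterances and once on the Bhojpuri utterances mirrors the separate per-language evaluation described above.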
Qn 6) Can the additional text provided in the GitHub repository be used in Tracks 1 and 3?
Ans) Yes, it can be used for all 4 tracks.
Qn 7) Regarding the hypothesis files, is it fine for them to contain [extra symbols] in the text? Will these symbols be ignored during evaluation, or should we preprocess the files before submission and remove any [extra symbols]?
Ans) We will not preprocess any text in the submitted hypothesis files, so any extra symbols left in them will be scored as part of the hypothesis.
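Since hypothesis text is scored as submitted, participants may want to clean it themselves first. The following is only a sketch under an assumption: that the extra symbols are square-bracketed tags (the tags "[noise]" and "[laugh]" below are made-up examples, and clean_hypothesis is a hypothetical helper name). Adapt the pattern to whatever symbols your system actually emits.

```python
import re

# Assumption: extra symbols appear as square-bracketed tags such as "[noise]".
_BRACKETED = re.compile(r"\[[^\]]*\]")


def clean_hypothesis(text):
    """Drop bracketed tags, then collapse runs of whitespace."""
    without_tags = _BRACKETED.sub(" ", text)
    return " ".join(without_tags.split())


print(clean_hypothesis("hello [noise] world  [laugh]"))
```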
Qn 8) If we use a multilingual model (acoustic model only) trained only on the challenge data, would our submission still be considered under Track 1?
Ans) Yes, you may use the challenge data to build multilingual models for Tracks 1 and 2.
Qn 9) Since the submission deadline for ASRU normal papers has been extended, we would like to inquire whether the deadline for the challenge submission will also be adjusted accordingly.
Ans) The deadline has been extended to July 10.
Qn 10) Is there any requirement regarding the paper submission?
Ans) Your submission will be treated as a regular paper and will go through the normal review process.
Note: If you need any clarification on models/datasets allowed for specific tracks, contact the organizers.