English ASR Challenge

Speech Lab, IIT Madras announces an Automatic Speech Recognition (ASR) Challenge for Indian English. The Indian English challenge is the second in a planned series of ASR challenges. The details of the first challenge can be found here. These challenges are part of the National Language Translation Mission funded by MeitY. They aim to help and encourage the advancement of ASR in Indian languages. We plan to hold a series of challenges of increasing difficulty in different Indian languages, and to release appropriate data with each challenge. In the first few challenges, we will release everything, including source code, so that start-ups, universities and research labs without previous experience in ASR can also participate and become familiar with it.

Challenge overview

Recent advancements in speech technology have shown that ASR systems can perform on par with humans. Building a good ASR system requires large amounts of training data and high-end computational resources.

However, when it comes to Indian languages, not everyone has access to these resources, especially academic institutions and startups. As part of this challenge, we will be releasing speech data in Indian English. Everyone who participates in this challenge will then be free to use this data for research purposes.

Data Set and Baseline recipes

The data set comprises Indian English read speech and lecture speech, along with the corresponding transcriptions. It covers genres such as politics, sports and entertainment. The read speech was collected by Speech Lab IITM: text was crawled from newspapers, and volunteers were asked to read it aloud. The lecture speech was obtained from the Computer Science and Electrical lectures of NPTEL. The read speech corpus is named IITM, whereas the lecture speech corpus is referred to as NPTEL. The following data sets will be released for this challenge:

Train set - 280 hours --- IITM (80 hours) + NPTEL (200 hours)
Development set IITM - 6 hours --- IITM
Development set NPTEL - 5 hours --- NPTEL
Evaluation set IITM - 6 hours --- IITM
Evaluation set NPTEL - 5 hours --- NPTEL
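
As a quick sanity check on the figures above, the split sizes can be tabulated in a few lines of Python. This is only an illustrative sketch: the names SPLITS, total_hours and corpus_share are made up for this example and are not part of the released recipes.

```python
# Hours per corpus for each split, copied from the list above.
# The dictionary layout and function names are illustrative assumptions.
SPLITS = {
    "train": {"IITM": 80, "NPTEL": 200},
    "dev": {"IITM": 6, "NPTEL": 5},
    "eval": {"IITM": 6, "NPTEL": 5},
}


def total_hours(split):
    """Return the total number of hours in a split."""
    return sum(SPLITS[split].values())


def corpus_share(split, corpus):
    """Return the fraction of a split that comes from one corpus."""
    return SPLITS[split][corpus] / total_hours(split)


if __name__ == "__main__":
    for name in SPLITS:
        nptel = corpus_share(name, "NPTEL")
        print(f"{name}: {total_hours(name)} hours, {nptel:.0%} NPTEL")
```

The train set is dominated by lecture speech (200 of 280 hours), while the development and evaluation sets are split almost evenly between read and lecture speech.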

A lexicon, baseline results and recipes to replicate the baseline experiments will also be made available.

How to Participate

Enroll yourself by registering on this link: Register Now!
Registering at the above link gives access to the user license and lets you download the training and test data for the English challenge.

Challenge

The participants are expected to submit their results on the evaluation data.
The evaluation data will be made available only when the submission portal is opened, i.e., ~~28th January 2021~~ 3rd February 2021.
The links to download evaluation sets will be mailed to all the registered participants.
The challenge will have two streams:
- Closed English-ASR Challenge: Only the training data distributed as part of the challenge can be used to train the models (both acoustic and language models). Please do not use dev set data.
- Open English-ASR Challenge: You can use any external/additional data to train the acoustic and language models.
The participants can choose to submit their results to both streams or any one among them.

Submit results: Use the submission portal to submit your results.

The submission portal will open on ~~28th of January 2021~~ 3rd of February 2021 and close at midnight on ~~1st of February 2021~~ 7th of February 2021 (midnight anywhere in the world, i.e., 12pm UTC on ~~1st of February 2021~~ 7th of February 2021).
Submissions should include the ASR output produced by the system and a brief description of the system.
The format of the decode files to be submitted will be shared soon.
Participating teams can submit a maximum of 10 submissions per team.
Results will be displayed on a leader board throughout the period that the submission site is open.
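
This page does not say which metric the leader board uses. ASR output is commonly scored by word error rate (WER), so the following is a minimal sketch of that metric for checking your own system on the development sets, assuming plain whitespace-separated transcripts. The function name word_error_rate is hypothetical, and the official scoring and text normalisation may differ.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: both strings are already normalised and split on whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    ref = "the cat sat on the mat"
    hyp = "the cat sat on mat"
    print(f"WER: {word_error_rate(ref, hyp):.3f}")
```

WER can exceed 1.0 when the hypothesis contains many inserted words, so it is an error rate rather than a bounded accuracy.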

Baseline results, source code and recipes

Baseline results and scripts can be found here

Important Dates

Release of training data (280 hours), development data (IITM - 6 hours and NPTEL - 5 hours), lexicon and baseline system: December 15, 2020
Evaluation data (IITM - 6 hours and NPTEL - 5 hours) release and opening of submission site: ~~January 28, 2021~~ February 3, 2021
Closing of submission site: ~~February 1, 2021~~ February 7, 2021 (midnight anywhere in the world, i.e., 12pm UTC on ~~February 1, 2021~~ February 7, 2021)
Announcement of results: ~~February 3, 2021~~ February 8, 2021
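
Because the closing time is given in UTC, participants may want it in their local time. The sketch below takes the 12pm UTC figure exactly as stated on this page and converts it to Indian Standard Time, a fixed offset of UTC+05:30. The names DEADLINE_UTC, deadline_in and time_remaining are illustrative only.

```python
from datetime import datetime, timedelta, timezone

# Closing time as stated on this page: 12pm UTC on February 7, 2021.
DEADLINE_UTC = datetime(2021, 2, 7, 12, 0, tzinfo=timezone.utc)

# Indian Standard Time is a fixed offset of UTC+05:30 with no daylight saving.
IST = timezone(timedelta(hours=5, minutes=30), name="IST")


def deadline_in(tz):
    """Return the submission deadline expressed in the given time zone."""
    return DEADLINE_UTC.astimezone(tz)


def time_remaining(now):
    """Return the time left before the deadline; `now` must be timezone-aware."""
    return DEADLINE_UTC - now


if __name__ == "__main__":
    print(deadline_in(IST).strftime("%Y-%m-%d %H:%M %Z"))  # 2021-02-07 17:30 IST
    print(time_remaining(datetime.now(timezone.utc)))
```

For other time zones, replace IST with another fixed-offset timezone object.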

About Speech Lab IITM

Speech Lab, IIT Madras is headed by Prof. S. Umesh and is part of the Dept. of Electrical Engg. Our focus is on building state-of-the-art speech recognition systems, especially in Indian languages. Our research interests are in low-resource modelling, multilingual speech recognition and speaker normalisation.
