Challenge Rules


The data released as part of this challenge may be used freely for academic purposes. Permission for any commercial use of the data must be sought by writing to contact@gramvaani.org.

Registered participants who find that they cannot submit a system must submit a withdrawal clause stating that they will use the data for research purposes only.

Submitted systems are expected to beat the baseline system in terms of WER/CER (word/character error rate); however, innovative systems that come close to the baseline may also be considered.
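For participants who want to sanity-check their numbers locally, the sketch below shows one common way to compute WER and CER with a plain edit distance. It is only an illustration: the function names are hypothetical, and the official scoring may apply text normalization or treat whitespace differently. In particular, dropping spaces before computing CER is an assumption made here, not a rule of the challenge.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences
    (substitutions, insertions and deletions all cost 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,         # deletion
                            curr[j - 1] + 1,     # insertion
                            prev[j - 1] + cost)) # match / substitution
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate. Assumption: spaces are dropped before scoring."""
    ref_chars = list(reference.replace(" ", ""))
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    hyp_chars = list(hypothesis.replace(" ", ""))
    return edit_distance(ref_chars, hyp_chars) / len(ref_chars)
```

Both functions score a single utterance. A corpus-level figure is usually obtained by summing the edit counts and the reference lengths over all utterances before dividing, rather than by averaging per-utterance rates.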

Only the audio for the blind test set (5 hours) will be released. Participants are expected to run their systems on the blind test set and submit the ASR hypotheses for evaluation.

Participants will need to share their final ASR model, or an API to their model, along with the paper, so that the hypotheses for the blind test set can be reproduced.

There are 3 types of challenges:

1) Closed Challenge - Participants can use only the Gram Vaani 100 hours Train dataset and the Gram Vaani 5 hours Development dataset for training models (both acoustic and language models).

No pretrained models or additional data may be used (only the 100 hr data).

2) Self Supervised Closed Challenge - Participants can use only the Gram Vaani 1000 hours dataset, the Gram Vaani 100 hours Train dataset and the Gram Vaani 5 hours Development dataset for training models (both acoustic and language models).

No pretrained models may be used. You can, however, use the 1000 hr data we are sharing to develop a model, but nothing other than that (the 100 hr data for training and the 1000 hr data for pretrained model development).

3) Open Challenge - Participants can use any external/additional dataset for training models (both acoustic and language models).

Open to all external data, models, etc.