ASR Strong Baseline
We provide the following baseline, a pre-trained Wav2Vec 2.0 model, and strongly recommend that participants use it for transfer learning:
https://huggingface.co/Edresson/wav2vec2-large-xlsr-coraa-portuguese
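As a starting point, the checkpoint above can be loaded with the Hugging Face transformers library. The snippet below is a minimal sketch, not the official baseline recipe. It assumes that torch, torchaudio and transformers are installed, that the repository ships a processor compatible with Wav2Vec2Processor, and that the model expects 16 kHz mono audio, as is usual for Wav2Vec 2.0 XLSR checkpoints. The names transcribe and SAMPLE_RATE are illustrative.

```python
MODEL_ID = "Edresson/wav2vec2-large-xlsr-coraa-portuguese"
SAMPLE_RATE = 16_000  # assumption: usual input rate for Wav2Vec 2.0 XLSR models


def transcribe(wav_path: str) -> str:
    """Greedy CTC transcription of one audio file (illustrative helper)."""
    # Heavy dependencies are imported lazily so the module can be imported
    # without them; they are only needed when the function is called.
    import torch
    import torchaudio
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    # Download (or load from the local cache) the processor and the model.
    processor = Wav2Vec2Processor.from_pretrained(MODEL_ID)
    model = Wav2Vec2ForCTC.from_pretrained(MODEL_ID)
    model.eval()

    # Load audio as (channels, samples), downmix to mono, resample if needed.
    waveform, sr = torchaudio.load(wav_path)
    waveform = waveform.mean(dim=0)
    if sr != SAMPLE_RATE:
        waveform = torchaudio.functional.resample(waveform, sr, SAMPLE_RATE)

    inputs = processor(
        waveform.numpy(), sampling_rate=SAMPLE_RATE, return_tensors="pt"
    )
    with torch.no_grad():
        logits = model(inputs.input_values).logits

    # Greedy decoding: most likely token per frame; the processor collapses
    # repeated tokens and removes CTC blanks.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

Calling transcribe("sample.wav") on a local recording (the file name is hypothetical) returns the predicted transcription as a string. For transfer learning, the same Wav2Vec2ForCTC.from_pretrained(MODEL_ID) call gives the initial weights to fine-tune.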
Step-by-step instructions for training and testing the model:
Details of our baseline can be found in the following pre-print: https://arxiv.org/abs/2110.15731
CORAA: a large corpus of spontaneous and prepared speech manually validated for speech recognition in Brazilian Portuguese
Arnaldo Candido Junior, Edresson Casanova, Anderson Soares, Frederico Santos de Oliveira, Lucas Oliveira, Ricardo Corso Fernandes Junior, Daniel Peixoto Pinto da Silva, Fernando Gorgulho Fayet, Bruno Baldissera Carlotto, Lucas Rafael Stefanel Gris, Sandra Maria Aluísio