This website is intended for researchers and students alike. It shares the results we obtained while improving the performance of an ASR system on non-native (accented) speech. The dataset we used is the open-source Mozilla Common Voice dataset. The following section describes the train, dev and test splits we used for our system.
The train set consists of ~31000 utterances covering accents from the US, England, Australia, Canada, Scotland, Ireland and Wales. See it here.
The dev set consists of ~1150 utterances covering accents from the US, England, Australia and Canada. See it here.
The test set consists of ~1120 utterances covering accents from the US, England, Australia and Canada. See it here.
The Indian test set consists of 1200 utterances and is used as one of the unseen accents. See it here.
The New Zealand test set consists of ~530 utterances and is also used as one of the unseen accents. See it here.
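A split like the one described above could be sketched as follows. This is only an illustration, not the script we used: the column name `accent`, the lowercase accent labels, the helper `group_by_accent` and the tiny inline TSV are all assumptions made for the example, and the real Common Voice metadata may use different field names depending on the release.

```python
import csv
import io

# Assumed accent labels; check them against the metadata of your release.
SEEN_ACCENTS = {"us", "england", "australia", "canada",
                "scotland", "ireland", "wales"}
UNSEEN_ACCENTS = {"indian", "newzealand"}


def group_by_accent(rows, accent_field="accent"):
    """Split metadata rows into seen-accent and unseen-accent lists.

    Rows whose accent is missing or not in either set are dropped.
    """
    seen, unseen = [], []
    for row in rows:
        accent = (row.get(accent_field) or "").strip().lower()
        if accent in SEEN_ACCENTS:
            seen.append(row)
        elif accent in UNSEEN_ACCENTS:
            unseen.append(row)
    return seen, unseen


# Hypothetical metadata standing in for a Common Voice TSV file.
SAMPLE_TSV = (
    "path\tsentence\taccent\n"
    "clip_0001.mp3\thello world\tus\n"
    "clip_0002.mp3\tgood morning\tindian\n"
    "clip_0003.mp3\tthank you\twales\n"
    "clip_0004.mp3\tsee you soon\t\n"
)

rows = list(csv.DictReader(io.StringIO(SAMPLE_TSV), delimiter="\t"))
seen_rows, unseen_rows = group_by_accent(rows)
```

Here `seen_rows` would feed the train, dev and test sets, while `unseen_rows` would be held out entirely for the unseen-accent test sets.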
One of our insights came from a PCA plot of the utterance-level accent bottleneck features (embeddings) for the dev set and the unrelated accent sets. The plot is shown below.
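For readers who want to produce a similar projection from their own embeddings, here is a minimal sketch of PCA via singular value decomposition in NumPy. It is not our plotting code: the function `pca_project`, the embedding dimension of 16, and the random arrays standing in for dev and unseen-accent embeddings are all assumptions for illustration, and fitting PCA on both sets jointly is a choice made for this example.

```python
import numpy as np


def pca_project(embeddings, n_components=2):
    """Project embeddings onto their top principal components.

    Returns the projected points and the fraction of variance
    explained by each kept component.
    """
    X = np.asarray(embeddings, dtype=float)
    centered = X - X.mean(axis=0)  # PCA requires mean-centered data
    # Rows of vt are the principal directions, sorted by singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    projected = centered @ vt[:n_components].T
    explained = singular_values ** 2 / np.sum(singular_values ** 2)
    return projected, explained[:n_components]


# Synthetic stand-ins for utterance-level embeddings (hypothetical data).
rng = np.random.default_rng(0)
dev = rng.normal(0.0, 1.0, size=(50, 16))
unseen = rng.normal(2.0, 1.0, size=(30, 16))

# Fit on both sets together so they share one 2-D coordinate system.
points, ratio = pca_project(np.vstack([dev, unseen]))
dev_2d, unseen_2d = points[:50], points[50:]
```

The two arrays `dev_2d` and `unseen_2d` can then be drawn as separate scatter series, one colour per accent set, to see how far the unseen accents sit from the dev set.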
Some of the utterances are particularly interesting. You can listen to them to find out more.