Many of the datasets that enable state-of-the-art speech recognition systems (LibriSpeech, SpeechCommands, …) provide sufficient labeled or unlabeled data for major languages (English, French, German, …). However, data remain scarce for languages such as Ewè, which is spoken by more than 10 million people in West Africa. The Yodi Project aims to provide the state-of-the-art datasets needed for machine learning development in Africa, particularly West Africa.