[Saudi Dialects Speech Corpus]

This is a database of speech varieties across Saudi dialects. At the current stage, it includes Najdi (Riyadh), Hijazi (Jeddah), and Jizani, and it will expand to other dialects.

The data comprise: i) audio recordings of a range of speech styles, including both scripted and unscripted speech; ii) recordings of speech in interaction (e.g. in dialogue pairs) as well as in monologue; and iii) recordings from a sociolinguistically stratified sample of speakers (e.g. grouped by age and gender).
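As an illustration only, the three dimensions described above (speech style, interaction type, and speaker strata) could be captured in a per-recording metadata record. The sketch below is an assumption, not the project's actual schema: the `Recording` class, its field names, the age bands, and the sample entries are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Recording:
    """Hypothetical metadata for one audio recording (not the project's schema)."""
    dialect: str      # e.g. "Najdi", "Hijazi", "Jizani"
    style: str        # "scripted" or "unscripted"
    interaction: str  # "monologue" or "dialogue"
    age_group: str    # assumed age band, e.g. "18-30"
    gender: str       # e.g. "female", "male"


def count_by_stratum(recordings):
    """Count recordings per (dialect, age_group, gender) stratum."""
    return Counter((r.dialect, r.age_group, r.gender) for r in recordings)


# Made-up sample entries, for illustration only.
sample = [
    Recording("Najdi", "scripted", "monologue", "18-30", "female"),
    Recording("Najdi", "unscripted", "dialogue", "18-30", "female"),
    Recording("Hijazi", "unscripted", "dialogue", "31-50", "male"),
]
counts = count_by_stratum(sample)
```

Counting by stratum in this way is one simple check of whether each dialect, age group, and gender combination is covered evenly.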

The project has several strategically planned, sustainable goals, including:

Building a Saudi database of Saudi dialects.
Supporting sustainable research objectives.
Preserving the linguistic diversity and associated intangible cultural heritage embodied within the spoken Arabic dialects of Saudi Arabia.
Providing well-studied data for research analysis.
Adapting the data for ML, NLP, and AI projects.

Current stage: the project is receiving offers for where to deposit the data.

Next project

There is always a next! Loading....
