My research focuses on developing effective speech technology for languages with limited digital resources. It draws on transfer learning, multilingual models, and data augmentation techniques to overcome the “data scarcity” problem.
Most speech technology is developed for major world languages with abundant digital resources. However, thousands of languages have limited or no technological support. My work addresses four challenges:
Limited training data: Building effective speech systems with minimal labeled data
Cross-lingual transfer: Leveraging knowledge from resource-rich languages
Evaluation methodologies: Creating appropriate metrics for low-resource scenarios
Cultural preservation: Supporting language vitality through technology
This project encompasses several initiatives:
Text-to-speech synthesis for languages with limited data
Speech recognition systems that leverage cross-lingual knowledge transfer
Phonological feature mapping for related languages
Methodologies for evaluating speech technology in low-resource contexts