Inclusive and Fair Speech Technologies

This special session will take place on Tuesday, September 20, at 10 am (KST) in room R116-118. See the conference website for up-to-date information.

Short Summary

Speech technologies are now widely used and power a very large range of applications. Automatic speech recognition systems have dramatically improved over the past decade thanks to advances in deep learning and large-scale data collection. The speech technology community's relentless focus on minimizing word error rate has thus produced a productivity tool that works well for some categories of the population, namely for those of us whose speech patterns match its training data: typically, college-educated first-language speakers of a standardized dialect, with little or no speech disability.

For some groups of people, however, speech technology works less well, perhaps because their speech patterns differ significantly from the standardized dialect (e.g., because of a regional accent), because of intra-group heterogeneity (e.g., speakers of regional African American dialects, second-language learners, and differences in demographic factors such as age, gender, or race), or because the speech pattern of each individual in the group exhibits large variability (e.g., people with severe speech disabilities).

The goal of this special session is (1) to discuss these biases and propose methods for making speech technologies more useful to heterogeneous populations and (2) to increase academic and industry collaborations to reach these goals.

Such methods include:

  • analysis of performance biases among different social/linguistic groups in speech technology (illustrated by the sketch after this list),

  • new methods to mitigate these differences,

  • new approaches for data collection, curation, and coding,

  • new algorithmic training criteria,

  • new methods for envisioning speech technology task descriptions and design criteria.
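
As a concrete illustration of the first item above, the following minimal Python sketch (not drawn from the session materials) computes word error rate (WER) separately per speaker group and reports the gap between the best- and worst-served groups; the group labels and transcripts are purely hypothetical placeholders.

```python
# Minimal sketch: per-group word error rate (WER) as a simple bias analysis.
# All group labels and transcripts below are hypothetical examples.

from collections import defaultdict


def word_errors(reference: str, hypothesis: str) -> int:
    """Word-level Levenshtein distance between a reference and an ASR hypothesis."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            dp[i][j] = min(substitution, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    return dp[len(ref)][len(hyp)]


def per_group_wer(samples):
    """samples: iterable of (group_label, reference, hypothesis) triples."""
    errors, words = defaultdict(int), defaultdict(int)
    for group, ref, hyp in samples:
        errors[group] += word_errors(ref, hyp)
        words[group] += max(len(ref.split()), 1)
    return {group: errors[group] / words[group] for group in errors}


if __name__ == "__main__":
    # Hypothetical evaluation triples: (speaker group, reference, ASR output)
    data = [
        ("group_a", "turn the lights on", "turn the lights on"),
        ("group_a", "play some music", "play some music"),
        ("group_b", "turn the lights on", "turn the light on"),
        ("group_b", "play some music", "play some musics"),
    ]
    wers = per_group_wer(data)
    gap = max(wers.values()) - min(wers.values())
    print(wers, f"absolute WER gap: {gap:.3f}")
```

Reporting such per-group metrics alongside the aggregate WER is one simple way to make performance disparities visible before attempting any mitigation.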

Moreover, the special session aims to foster cross-disciplinary collaboration between fairness and personalization research, which has the potential to improve both customer experiences and algorithmic fairness. The special session will bring together experts from both fields to advance cross-disciplinary study at the intersection of fairness and personalization, e.g., fairness-aware personalization.

The session also promotes collaboration between academia and industry to identify the key challenges and opportunities in fairness research and to shed light on future research directions.


Organizers

In alphabetical order:

  • Prof. Laurent Besacier, Naver Labs Europe, France, Principal Scientist,

  • Dr. Keith Burghardt, USC Information Sciences Institute, USA, Computer Scientist,

  • Dr. Alice Coucke, Sonos Inc., France, Head of Machine Learning Research,

  • Prof. Mark Allan Hasegawa-Johnson, University of Illinois, USA, Professor of Electrical and Computer Engineering,

  • Dr. Peng Liu, Amazon Alexa, USA, Senior Machine Learning Scientist,

  • Anirudh Mani, Amazon Alexa, USA, Applied Scientist,

  • Prof. Mahadeva Prasanna, IIT Dharwad, India, Professor, Dept of Electrical Engineering,

  • Prof. Priyankoo Sarmah, IIT Guwahati, India, Professor, Dept of Humanities and Social Sciences,

  • Dr. Odette Scharenborg, Delft University of Technology, the Netherlands, Associate professor,

  • Dr. Tao Zhang, Amazon Alexa, USA, Senior Manager.