Automatic identification of gender from speech

Author(s): Sarah Ita Levitan, Taniya Mishra and Srinivas Bangalore


Identifying the gender of a speaker from speech has a variety of applications, ranging from speech analytics to personalizing human-machine interactions. Previous work on gender identification has relied on the statistical properties of the speaker's pitch features; in this paper, we explore the impact of using spectral features in conjunction with pitch features. We present a novel approach that leverages pitch feature trajectories to identify the speaker's gender from as little speech as possible. We also investigate cross-lingual robustness by testing how well a model trained on English speakers identifies the gender of German speakers. Finally, we present a model for gender detection in German that outperforms state-of-the-art results on a benchmark data set.