Landmark-based pronunciation error identification on L2 Mandarin Chinese

Author(s): Xuesong Yang, Xiang Kong, Mark Hasegawa-Johnson and Yanlu Xie

Abstract

This paper explores a novel approach to identifying pronunciation errors made by second-language (L2) learners, based on the landmark theory of human speech perception. Earlier work on methods for selecting distinctive features and on the likelihood-based "goodness of pronunciation" (GOP) measure has made progress for several L2 languages, e.g. Dutch and English. However, the performance gains are limited by error-prone automatic speech recognition (ASR) systems and by features that discriminate poorly. Landmark theory, which exploits quantal nonlinear relationships between articulation and acoustics, provides a basis for selecting distinctive feature positions that are suitable for identifying pronunciation errors. By leveraging this English acoustic landmark theory, we propose to select salient Mandarin Chinese phonetic landmarks for the 16 phonemes most frequently mispronounced by Japanese (L1) learners, and to extract the corresponding features, including mel-frequency cepstral coefficients (MFCC) and formants. Both cross-validation and evaluation are performed for each phoneme individually using a support vector machine with a linear kernel (LinearSVM). Experiments show that our landmark-based approaches achieve significantly higher kappa and F1 scores than GOP-based methods, which calculate a duration-normalized confidence score for each phoneme.
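The two evaluation metrics named in the abstract, kappa and F1, can be illustrated with a minimal Python sketch. It assumes binary per-phoneme labels (1 for mispronounced, 0 for correct) and reads "kappa" as Cohen's kappa; the label encoding, the function names, and the toy data are illustrative assumptions, not taken from the paper.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between two labelings, corrected for chance."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("label sequences must be non-empty and of equal length")
    # Observed agreement: fraction of items on which the two labelings match.
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement: computed from the marginal label frequencies.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        # Kappa is undefined when both labelings use a single identical class;
        # returning 1.0 here is a convention chosen for this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def f1_score(y_true, y_pred, positive=1):
    """F1 for the positive class (assumed here to mean 'mispronounced')."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels for one phoneme (hypothetical data, not from the paper).
human_labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
system_labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
kappa = cohen_kappa(human_labels, system_labels)
f1 = f1_score(human_labels, system_labels)
```

Kappa discounts the agreement expected by chance, which matters when correct pronunciations far outnumber errors and raw accuracy would look deceptively high.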
