On the automatic comparison and cloning of native and non-native speech prosody.

Author(s): Daniel Hirst


It is notoriously difficult to evaluate prosody objectively, since there is little consensus as to what constitutes correct prosody for a given utterance. This presentation describes an automatic procedure that compares a non-native speaker’s production with 10 instances of the same utterance, taken from the OMProDat database and read by native speakers. The pitch and relative syllable durations of the native and non-native versions are normalised and compared, and the native speaker’s version that correlates most closely with that of the non-native speaker is chosen as a model. The normalised pitch and syllable durations of that native speaker’s recording can then be cloned and transferred to the L2 utterance. The original and re-synthesised versions of the learner’s utterance can then be used to provide both visual and auditory feedback to the language learner.
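
The selection and cloning steps described above can be illustrated with a minimal sketch. It is not the author's implementation, and several details are assumptions made purely for illustration: each utterance is represented by one pitch value (Hz) and one duration (s) per syllable, pitch is normalised by z-scoring, durations are expressed as proportions of the total utterance duration, the two Pearson correlations are averaged with equal weight, and cloning rescales the native contour to the learner's own pitch mean, pitch spread and total duration. The function names (`best_model`, `clone_prosody`, and so on) are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def zscore(values):
    """Normalise a sequence to zero mean and unit standard deviation."""
    m, s = mean(values), pstdev(values)
    if s == 0:
        raise ValueError("a constant sequence cannot be z-scored")
    return [(v - m) / s for v in values]


def relative(durations):
    """Express syllable durations as proportions of the whole utterance."""
    total = sum(durations)
    return [d / total for d in durations]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same number of syllables")
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)


def best_model(learner, natives):
    """Pick the native reading most closely correlated with the learner's.

    learner: (pitch_per_syllable, duration_per_syllable)
    natives: dict mapping a speaker label to the same kind of pair
    Assumption: pitch and duration correlations are weighted equally.
    """
    learner_pitch = zscore(learner[0])
    learner_durs = relative(learner[1])
    scores = {}
    for name, (pitch, durs) in natives.items():
        r_pitch = pearson(learner_pitch, zscore(pitch))
        r_dur = pearson(learner_durs, relative(durs))
        scores[name] = (r_pitch + r_dur) / 2
    best = max(scores, key=scores.get)
    return best, scores


def clone_prosody(learner, model):
    """Map the model's normalised prosody onto the learner's own range.

    Assumption: the cloned pitch keeps the learner's mean and spread, and
    the cloned durations keep the learner's total utterance duration.
    """
    l_mean, l_sd = mean(learner[0]), pstdev(learner[0])
    pitch = [l_mean + z * l_sd for z in zscore(model[0])]
    total = sum(learner[1])
    durations = [p * total for p in relative(model[1])]
    return pitch, durations
```

Calling `best_model(learner, natives)` with the 10 native readings returns the label of the closest reading together with all the scores, and `clone_prosody(learner, natives[label])` gives target values that a resynthesis tool could impose on the learner's recording. Since Pearson correlation is unaffected by linear rescaling, the normalisation does not change the ranking of the native readings; it matters for the cloning step, where values from different speakers must be placed on a common scale.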