GenProso

GenProso is a module for the automatic prediction of prosodic parameters (duration and F0) in text-to-speech (TTS) applications. It is currently used with TexAFon, the text processing tool of GLiCom, which supplies the necessary linguistic information about the input text. It was conceived as a research tool for exploring the use of parametric prosodic models for prosody prediction in TTS.

GenProso is written in Python. Its linguistic and phonetic knowledge is implemented in the form of:

- Python procedures, responsible for the prediction process;
- F0 and duration models, stored in tabular format as text files. These are usually obtained from the analysis of annotated corpora, but can be manually edited by the user for specific research purposes. Depending on the reference corpus used to build them, they can be representative of the speech of one or several speakers, or of a particular speech type.
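
To illustrate how a model stored as a tabular text file could drive prediction, here is a minimal sketch. It does not reproduce GenProso's actual file format or code: the tab-separated layout, the column names (`phone`, `stressed`, `duration_ms`), the numeric values and the function names are all assumptions made up for this example.

```python
import csv
import io

# Hypothetical model table. The columns and values are invented for
# illustration and are not taken from a real GenProso model file.
SAMPLE_MODEL = (
    "phone\tstressed\tduration_ms\n"
    "a\t1\t90\n"
    "a\t0\t70\n"
    "s\t0\t85\n"
)


def load_duration_model(stream):
    """Read a tab-separated table into a dict keyed by (phone, stressed)."""
    reader = csv.DictReader(stream, delimiter="\t")
    model = {}
    for row in reader:
        key = (row["phone"], row["stressed"] == "1")
        model[key] = float(row["duration_ms"])
    return model


def predict_duration(model, phone, stressed, default=80.0):
    """Look up a duration; fall back to a default for unseen contexts."""
    return model.get((phone, stressed), default)


# In practice the stream would come from open(path, encoding="utf-8").
model = load_duration_model(io.StringIO(SAMPLE_MODEL))
stressed_a = predict_duration(model, "a", True)
unseen = predict_duration(model, "z", False)
```

Because the table is plain text, a value can be changed by hand in a text editor and the prediction changes on the next load, which is the kind of manual editing described above.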

GenProso needs the following as input (provided by TexAFon):

- the phonetic transcription of the input utterance;
- its segmentation into prosodic units (syllables, stress groups and intonation groups);
- the F0 and duration models that will be used for prediction.
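
The input listed above can be pictured as a nested structure of prosodic units. The sketch below is only one possible representation, not the interface between TexAFon and GenProso: the class names, fields and the example grouping are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Syllable:
    phones: List[str]          # phonetic transcription of the syllable
    stressed: bool = False


@dataclass
class StressGroup:
    syllables: List[Syllable] = field(default_factory=list)


@dataclass
class IntonationGroup:
    stress_groups: List[StressGroup] = field(default_factory=list)

    def phones(self):
        """Flatten the group back into a single phone sequence."""
        return [
            phone
            for group in self.stress_groups
            for syllable in group.syllables
            for phone in syllable.phones
        ]


# Illustrative segmentation of "la casa"; the grouping is an assumption.
utterance = IntonationGroup(
    stress_groups=[
        StressGroup(
            syllables=[
                Syllable(["l", "a"]),
                Syllable(["k", "a"], stressed=True),
                Syllable(["s", "a"]),
            ]
        )
    ]
)
```

Keeping the transcription inside the segmentation means that each phone carries its prosodic context (syllable, stress group, intonation group), which is what a table-based model lookup would need.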