Multilingual Modelling of Pitch Contours

Garrido (1996) proposed an F0 modelling framework strongly inspired by the IPO approach. The proposal was later partly redefined in Garrido (2001, 2010).

In this model, F0 contours are treated as the combination of two kinds of patterns:

    • Local, predicting the evolution of F0 at Stress Group (SG) level; an SG consists of one stressed syllable and all the following unstressed syllables up to the next stressed one (or the end of the Intonation Group, IG).

    • Global, representing the evolution of the contours at Intonation Group (IG) level.
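
The SG definition above can be illustrated with a minimal sketch. The syllable representation (a text and a stress flag) and the function name are hypothetical, and the treatment of unstressed syllables that precede the first stress of the IG (here they form a group of their own) is an assumption, since the definition does not cover that case.

```python
from typing import List, Tuple

# Hypothetical representation: (syllable text, is_stressed)
Syllable = Tuple[str, bool]


def split_into_stress_groups(syllables: List[Syllable]) -> List[List[Syllable]]:
    """Group the syllables of one IG into Stress Groups (SGs)."""
    groups: List[List[Syllable]] = []
    for syllable in syllables:
        _, stressed = syllable
        # A stressed syllable opens a new SG. The first syllable of the IG
        # also opens a group, so that initial unstressed syllables are kept
        # (assumption: the source does not say how these are handled).
        if stressed or not groups:
            groups.append([])
        groups[-1].append(syllable)
    return groups


# Illustrative IG with two stressed syllables (upper case marks stress).
ig = [("la", False), ("CA", True), ("sa", False), ("GRAN", True), ("de", False)]
stress_groups = split_into_stress_groups(ig)
sg_texts = [[text for text, _ in group] for group in stress_groups]
```

For this input, the syllables are grouped as "la", "CA sa" and "GRAN de".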

To define these patterns, the model proposes a methodology for the representation and analysis of F0 contours which includes the following steps:

    • Stylisation of the F0 contours;

    • Annotation of the stylised contours;

    • Definition of the local and global patterns (modelling) from the annotated contours.

F0 stylisation

F0 contours are represented in this model as series of relevant inflection points, obtained after a stylisation procedure.
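
The stylisation procedure itself is not detailed here. As an illustration only, the sketch below reduces a sampled F0 track to its first sample, its last sample and the turning points between them. This simple turning-point heuristic, the function name and the `min_change_hz` threshold are assumptions, not the procedure used in the model.

```python
from typing import List, Sequence, Tuple


def stylise(times: Sequence[float], f0: Sequence[float],
            min_change_hz: float = 5.0) -> List[Tuple[float, float]]:
    """Reduce an F0 track (at least two samples) to a few (time, F0) points."""
    points = [(times[0], f0[0])]
    for i in range(1, len(f0) - 1):
        change_before = f0[i] - f0[i - 1]
        change_after = f0[i + 1] - f0[i]
        # A turning point is a sample where the direction of F0 movement flips.
        is_turning_point = change_before * change_after < 0
        # Ignore turning points that barely differ from the last kept point
        # (hypothetical threshold, in Hz).
        if is_turning_point and abs(f0[i] - points[-1][1]) >= min_change_hz:
            points.append((times[i], f0[i]))
    points.append((times[-1], f0[-1]))
    return points


# Illustrative track: a rise to 140 Hz, a fall to 110 Hz, a small final rise.
track_times = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
track_f0 = [100.0, 120.0, 140.0, 130.0, 110.0, 115.0]
stylised = stylise(track_times, track_f0)
```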

F0 annotation

During annotation, each inflection point of the stylised contour receives one of the following labels, representing its relative F0 height within its container IG:

    • P+ (Extra high peak)

    • P (Peak)

    • V (Valley)

    • V- (Extra low valley)
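
The criteria used to assign these labels are not given here. The sketch below shows one way such a labelling could work, under two loudly stated assumptions: a point counts as a peak when it is at least as high as the mean of its neighbours, and it is 'extra' when it departs from the mean peak (or valley) height of the IG by more than a hypothetical ratio, `extra_ratio`.

```python
from statistics import mean
from typing import List, Sequence


def label_points(f0_values: Sequence[float], extra_ratio: float = 0.10) -> List[str]:
    """Label the inflection points of one IG (at least two points) as P+, P, V or V-."""
    n = len(f0_values)
    # Step 1 (assumption): peak or valley, by comparison with the neighbours.
    is_peak = []
    for i, value in enumerate(f0_values):
        neighbours = [f0_values[j] for j in (i - 1, i + 1) if 0 <= j < n]
        is_peak.append(value >= mean(neighbours))

    peaks = [v for v, p in zip(f0_values, is_peak) if p]
    valleys = [v for v, p in zip(f0_values, is_peak) if not p]

    # Step 2 (assumption): 'extra' labels for points far from the IG mean level.
    labels = []
    for value, peak in zip(f0_values, is_peak):
        if peak:
            labels.append("P+" if value > mean(peaks) * (1 + extra_ratio) else "P")
        else:
            labels.append("V-" if value < mean(valleys) * (1 - extra_ratio) else "V")
    return labels


# Illustrative F0 values (Hz) for the inflection points of one IG.
ig_values = [100, 180, 110, 140, 90, 135, 70]
ig_labels = label_points(ig_values)
```

With these values, the second point is labelled P+ and the last one V-, while the remaining points receive plain P or V labels.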

F0 modelling

Local patterns are defined as recurrent series of labels anchored at specific places of the syllables that make up SGs. The position of each point is defined with respect to the nucleus of its container syllable. Three different positions are considered:

    • I ('initial', close to the beginning of the syllable nucleus),

    • M ('middle', close to the centre of the nucleus),

    • F ('final', close to the end of the nucleus).
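
As an illustration of this anchoring, the sketch below assigns a point to I, M or F according to the nearest of three landmarks of the nucleus (its start, centre and end), and represents a local pattern as a list of anchored labels. The nearest-landmark rule, the class and the field names are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


def anchor_position(point_time: float, nucleus_start: float, nucleus_end: float) -> str:
    """Return 'I', 'M' or 'F' for a point, given the nucleus boundaries (seconds)."""
    landmarks = {
        "I": nucleus_start,
        "M": (nucleus_start + nucleus_end) / 2,
        "F": nucleus_end,
    }
    # Assumption: the point is anchored to the closest landmark.
    return min(landmarks, key=lambda name: abs(point_time - landmarks[name]))


@dataclass(frozen=True)
class AnchoredPoint:
    syllable_index: int  # position of the container syllable within the SG
    position: str        # 'I', 'M' or 'F'
    label: str           # 'P+', 'P', 'V' or 'V-'


# Illustrative local pattern for a two-syllable SG: a valley at the start of
# the stressed nucleus and a peak at the end of the following nucleus.
local_pattern: List[AnchoredPoint] = [
    AnchoredPoint(0, anchor_position(0.11, 0.10, 0.20), "V"),
    AnchoredPoint(1, anchor_position(0.38, 0.30, 0.39), "P"),
]
pattern_summary = [(p.syllable_index, p.position, p.label) for p in local_pattern]
```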

Global patterns are represented in the model as 'reference lines' showing the evolution of the P and V F0 levels along the IG. For each IG, then, two reference lines are considered, one for the P and one for the V level. These patterns are speaker-dependent, and model the F0 height and range of each speaker.
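
How the reference lines are estimated is not specified here. One plausible sketch, assuming that each line is a least-squares straight line fitted over time to the points labelled P (or V) in the IG, is given below. The function names and the exclusion of P+ and V- points from the fit are assumptions.

```python
from typing import Dict, List, Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def reference_lines(points: List[Tuple[float, float, str]]) -> Dict[str, Tuple[float, float]]:
    """points: (time in s, F0 in Hz, label). Returns one line per level.

    Each level needs at least two points at distinct times.
    """
    lines = {}
    for level in ("P", "V"):
        # Assumption: only plain P and V points define the reference lines.
        selected = [(t, f) for t, f, label in points if label == level]
        times, values = zip(*selected)
        lines[level] = fit_line(times, values)
    return lines


# Illustrative annotated IG with declining peaks and valleys.
annotated_ig = [
    (0.0, 100.0, "V"), (0.2, 180.0, "P"), (0.5, 95.0, "V"),
    (0.8, 160.0, "P"), (1.0, 90.0, "V"), (1.2, 140.0, "P"),
]
lines = reference_lines(annotated_ig)
v_slope, v_intercept = lines["V"]
p_slope, p_intercept = lines["P"]
```

In this example both lines have a negative slope, and the vertical distance between them at a given time gives a rough measure of the F0 range at that point of the IG.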