MelAn

MelAn is a tool for the automatic stylisation, annotation and modelling of F0 contours. It consists of a set of Praat and R scripts that run on Windows, Mac OS X or Linux, and it was designed for the automatic processing and analysis of large corpora.

The tool automatically applies the framework and methodology for the analysis of F0 contours proposed in Garrido (1996, 2001). The procedure aims to obtain a symbolic representation of an F0 contour that captures its perceptually relevant features: it should be possible to build from the symbolic representation a 'synthetic' contour that is perceptually almost identical to the original one:

'Y cada vez la tendremos más', female speaker (original contour)

'Y cada vez la tendremos más', female speaker ('synthetic' contour)


A more detailed description of this tool can be found in Garrido (2010).

MelAn is available for public download.

Input

The tool expects as input:

    • A 'wav' file containing the sound

    • A Praat 'TextGrid' file containing a set of tiers with:

        • the orthographic representation of the input utterance

        • its corresponding phonetic transcription (in IPA or SAMPA), aligned with the speech wave

        • the segmentation into prosodic units (syllables, stress groups, intonation groups and breath groups)

The prosodic unit segmentation can be added automatically with SegProso to a TextGrid that already contains the orthographic representation of the input utterance and its corresponding phonetic transcription.

Stylisation

The stylisation phase detects the relevant F0 inflection points in the original F0 contour to obtain a stylised contour that is perceptually close to it. This task uses the stylisation procedure included with Praat, with its parameters tuned so that the stylised contour is as perceptually similar to the original as possible. The script also converts the resulting stylised contour to a point tier and adds it to the input TextGrid.
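The idea of keeping only the relevant inflection points can be illustrated with a minimal Python sketch. It is not the Praat procedure or the MelAn script: it is a simplified greedy reduction that assumes the contour is a time-ordered list of (time, F0 in Hz) points with strictly increasing times. The function names, the 100 Hz reference and the 2-semitone threshold are hypothetical choices made for this illustration only.

```python
import math


def hz_to_semitones(f0_hz, ref_hz=100.0):
    """Convert Hz to semitones relative to ref_hz (the reference is an assumption)."""
    return 12.0 * math.log2(f0_hz / ref_hz)


def stylise(points, resolution_st=2.0):
    """Greedy stylisation sketch (illustrative, not Praat's implementation).

    points: list of (time_s, f0_hz) tuples with strictly increasing times.
    Repeatedly drops the interior point that deviates least from the straight
    line joining its two kept neighbours, until every remaining interior point
    deviates by at least resolution_st semitones.
    """
    times = [t for t, _ in points]
    st = [hz_to_semitones(f) for _, f in points]
    kept = list(range(len(points)))  # indices of points still in the contour

    while len(kept) > 2:
        best_pos, best_dev = None, None
        for pos in range(1, len(kept) - 1):
            i, j, k = kept[pos - 1], kept[pos], kept[pos + 1]
            # F0 (in semitones) predicted at time j by the line from i to k
            frac = (times[j] - times[i]) / (times[k] - times[i])
            interpolated = st[i] + frac * (st[k] - st[i])
            deviation = abs(st[j] - interpolated)
            if best_dev is None or deviation < best_dev:
                best_pos, best_dev = pos, deviation
        if best_dev >= resolution_st:
            break  # every remaining interior point is a relevant inflection
        del kept[best_pos]

    return [points[i] for i in kept]


contour = [(0.0, 100.0), (0.1, 100.0), (0.2, 100.0), (0.3, 200.0), (0.4, 100.0)]
stylised = stylise(contour)
```

In this toy contour the point at 0.1 s lies on the line between its neighbours and is dropped, while the points that define the rise and fall around 0.3 s are kept.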

Annotation

The output of this phase is a chain of P+, P, V and V- labels associated with specific inflection points of the contour (not all inflection points receive a label at the end of the process), which are added to the input TextGrid in a new point tier.

Modelling

Following the descriptive framework proposed in Garrido (1996, 2001), two types of patterns, local and global, are extracted during this phase. The output of this process is:

    • A 'local' file containing the ordered list of local patterns that make up the contour of the utterance

    • A 'global' file containing a set of values describing the individual P and V regression lines, the F0 range and the F0 reset for each intonation group (IG) of the utterance

Local patterns are also stored in an interval tier in the output TextGrid.
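As a rough illustration of the kind of values a 'global' description might hold, the following Python sketch fits separate least-squares regression lines through the P and V points of one intonation group. It is a sketch under stated assumptions, not the MelAn R code: the input format, the function names and the grouping of P+ with P and of V- with V are hypothetical, and the F0 range is computed here simply as maximum minus minimum in Hz. The F0 reset is left out because its definition is not given in this description.

```python
def fit_line(points):
    """Least-squares line f0 = slope * time + intercept through (time, f0) points.

    Assumes at least two points with distinct times.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    mean_t = sum(t for t, _ in points) / n
    mean_f = sum(f for _, f in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (f - mean_f) for t, f in points)
    slope = sxy / sxx
    return slope, mean_f - slope * mean_t


def describe_group(labelled_points):
    """Summarise one intonation group given (time_s, f0_hz, label) tuples.

    Assumption for this sketch: labels starting with 'P' feed the P line
    and labels starting with 'V' feed the V line.
    """
    peaks = [(t, f) for t, f, label in labelled_points if label.startswith("P")]
    valleys = [(t, f) for t, f, label in labelled_points if label.startswith("V")]
    f0_values = [f for _, f, _ in labelled_points]
    return {
        "p_line": fit_line(peaks),      # (slope in Hz/s, intercept in Hz)
        "v_line": fit_line(valleys),    # (slope in Hz/s, intercept in Hz)
        "f0_range_hz": max(f0_values) - min(f0_values),  # assumed definition
    }


# Invented example values, for illustration only
group = [(0.0, 200.0, "P+"), (0.2, 120.0, "V"), (0.8, 108.0, "V-"), (1.0, 180.0, "P")]
summary = describe_group(group)
```

Each line is returned as a (slope, intercept) pair, so a declining top line and a declining bottom line both show up as negative slopes.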