Forced Alignment with DARLA

  1. DARLA (Dartmouth Linguistic Automation) is available at http://darla.dartmouth.edu/. It can be used to generate a phonemic transcription of a recording and/or to extract vowel formant measurements from a recording.

  2. For very large datasets, the service "Completely Automated Vowel Extraction" may be used. For most projects, however, some manual correction of the data is useful; this is supported by the service "Semi-Automated Alignment and Extraction".

  3. The second option, the semi-automated service, requires an audio file as well as one of the following:

(a) a Praat TextGrid with sentence or breath group boundaries,

(b) a text file (ending in .txt) with a transcription or the text being read, or

(c) a Praat TextGrid with existing phonemic transcriptions.

If you have an audio file and have not yet worked on it, option (b) is most likely the best choice. If the recording consists of read speech, simply save the text as a text file. Note that any repetitions or hesitations must either be removed from the recording or be written out in the text file; otherwise the alignment will be erroneous. If the recording consists of spontaneous speech, you first need to transcribe what the speaker said, either by hand or with a speech recognition service (several commercial services are available; the Munich Phonetics lab also provides one for academic use).
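Saving the text as a plain text file can be scripted. The following is a minimal sketch in Python, not part of DARLA itself: the function names `clean_transcript` and `save_transcript` are hypothetical, and the decision to replace typographic quotes and dashes with plain characters is an assumption about what is safest for an aligner, not a documented DARLA requirement. Repetitions and hesitations still have to be added to the text by hand.

```python
import re
from pathlib import Path

# Typographic characters mapped to plain equivalents. This mapping is an
# assumption for illustration, not a documented DARLA requirement.
REPLACEMENTS = {
    "\u2018": "'", "\u2019": "'",   # curly single quotes
    "\u201c": '"', "\u201d": '"',   # curly double quotes
    "\u2013": "-", "\u2014": "-",   # en and em dashes
    "\u00a0": " ",                  # non-breaking space
}


def clean_transcript(text: str) -> str:
    """Replace typographic characters and collapse all whitespace runs."""
    for old, new in REPLACEMENTS.items():
        text = text.replace(old, new)
    # Line breaks and tabs count as whitespace and become single spaces.
    return re.sub(r"\s+", " ", text).strip()


def save_transcript(text: str, path: str) -> Path:
    """Write the cleaned text as UTF-8 to `path`, forcing a .txt extension."""
    out = Path(path).with_suffix(".txt")
    out.write_text(clean_transcript(text) + "\n", encoding="utf-8")
    return out
```

For example, `save_transcript(story, "speaker01.doc")` would write the cleaned text to `speaker01.txt`, which can then be uploaded together with the audio file.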

  4. We now focus on option (b), "Audio transcriptions in a plain textfile". This service requires the sound file (.wav or .mp3 format) as input, together with the text file. The next three options (filtering of stop words, unstressed vowels and high-bandwidth vowels) matter only if you want to use the aggregate results directly for vowel quality measurements (which is not recommended, as vowel formants should first be normalised, e.g. with NORM). Finally, you need to provide an email address, to which a link for downloading the results is sent once they are ready.
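To illustrate what normalisation does to the formant values, here is a minimal sketch of Lobanov normalisation (a per-speaker z-score, one of the methods offered by NORM) in Python. It is not the NORM implementation. The function name `lobanov` and the keys `speaker`, `F1` and `F2` are hypothetical and will need to be adapted to the column names in your results file. The sketch uses the sample standard deviation, which is an assumption and may differ from what a given tool uses.

```python
from collections import defaultdict
from statistics import mean, stdev


def lobanov(tokens):
    """Return copies of the vowel tokens with per-speaker z-scored formants.

    `tokens` is a list of dicts with the (hypothetical) keys "speaker",
    "F1" and "F2". Each speaker needs at least two tokens with differing
    values per formant; otherwise the standard deviation is undefined or zero.
    """
    by_speaker = defaultdict(list)
    for tok in tokens:
        by_speaker[tok["speaker"]].append(tok)

    normalised = []
    for toks in by_speaker.values():
        # Mean and sample standard deviation of each formant for this speaker.
        stats = {}
        for formant in ("F1", "F2"):
            values = [t[formant] for t in toks]
            stats[formant] = (mean(values), stdev(values))
        for t in toks:
            row = dict(t)  # copy, so the input is left unchanged
            for formant in ("F1", "F2"):
                m, s = stats[formant]
                row[formant + "_z"] = (t[formant] - m) / s
            normalised.append(row)
    return normalised
```

The normalised values are added under the new keys `F1_z` and `F2_z`, and the raw measurements in hertz are kept alongside them, so both can be inspected side by side.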