This homework is intended to provide practice in:

1) Identifying manner of articulation from a speech waveform [see class handout];

2) Examining cues to stop consonant voicing in a speech waveform;

3) Making measurements of the timepoints and durations of segmental events and characteristics in a waveform.

You will need to download and use the cross-platform sound editing software Audacity. (If you are running macOS Catalina and would like to make your own recordings in Audacity, which is not required for this assignment, see this note.)

You will need to download these 10 sound files; save them locally on your computer:

manner1, manner2, manner3, manner4, manner5, manner6, voicing1, voicing2, voicing3, and voicing4.

Part 1

After launching Audacity, open the six sound files manner1 – manner6 (choose ‘Open’ under the ‘File’ menu). Each of these files is a speech waveform. To see them all at once, you will have to resize the window of each file and arrange the windows on your screen. The tracks can be made bigger on the screen by selecting the magnifying tool in the top left corner of each window and clicking on the waveform, or by selecting ‘Zoom In’ under the ‘View’ menu. You can also zoom in using ⌘1 (Mac) or CTRL1 (Win), and zoom out again using ⌘3 (Mac) or CTRL3 (Win).

These six files are the phrases (not in this order): a chat, a nap, at, a bat, a sat, and a pat.

Speech waveforms record the sound pressure fluctuations that hit a microphone, much like those that hit your eardrum. A speech waveform includes information as to whether a speech sound is voiced or voiceless.

As a reminder, a voiced segment will have a regular (periodic) repeating pattern that looks like closely and evenly spaced spikes, one for each vibration of the vocal folds.
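The contrast between a regular (periodic) pattern and an irregular one can also be seen numerically. The following is a minimal Python sketch for intuition only; it is not a required step and not a method from the class handout. It builds two synthetic signals (a 120 Hz sine wave standing in for voicing, and random noise standing in for an aperiodic sound) and compares how often each one crosses zero. The sample rate, the 120 Hz frequency, and the function name zero_crossing_rate are all assumptions chosen for illustration.

```python
import math
import random


def zero_crossing_rate(samples):
    """Return the fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1
        for prev, cur in zip(samples, samples[1:])
        if (prev < 0) != (cur < 0)
    )
    return crossings / (len(samples) - 1)


SAMPLE_RATE = 16000  # samples per second (assumed for this sketch)
F0 = 120             # vibration rate in Hz (assumed for this sketch)
n = SAMPLE_RATE // 10  # 100 ms of signal

# Periodic stand-in for a voiced segment: a pure sine wave.
voiced_like = [math.sin(2 * math.pi * F0 * i / SAMPLE_RATE) for i in range(n)]

# Aperiodic stand-in: uniform random noise (seeded so runs are repeatable).
rng = random.Random(0)
noise_like = [rng.uniform(-1.0, 1.0) for _ in range(n)]

voiced_zcr = zero_crossing_rate(voiced_like)
noise_zcr = zero_crossing_rate(noise_like)

print(f"periodic signal zero-crossing rate: {voiced_zcr:.3f}")
print(f"noise signal zero-crossing rate:    {noise_zcr:.3f}")
```

The periodic signal changes sign only twice per cycle, so its rate is low; the noise changes sign far more often. Real speech is messier than either synthetic signal, so treat this only as an aid to reading the waveforms by eye.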

A speech waveform also includes information about manner of articulation.

As a reminder, vowels are usually the loudest or highest amplitude segments and are voiced. Nasals, liquids and glides are also voiced (periodic) but are of a lesser amplitude. Nasals also have a simpler waveform shape than vowels, more like a smoother sawtooth shape. Fricatives have a turbulent or irregular pattern of vibration. Stop closures are either basically silent for voiceless stops or may have very low amplitude voicing for voiced stops. Most syllable-initial voiceless stops will also be aspirated in English. Think about how aspiration might be indicated in a waveform based on what we’ve learned in class.


Turn in the answers to Questions 1-11 (with associated subparts) below as a file (Word or PDF) to the class assignment dropbox. You have the link to upload this in an email from Professor Byrd. Please make sure the filename starts with your last name, e.g. SmithSally_MeasuringWaveforms.docx.


1a-f• Your first task is to identify, without listening to them, which phrase each speech waveform (manner1 – manner6) represents. It may be helpful to phonetically transcribe the phrases above first to identify the sequence of segment types that is expected.

a. manner1 is the phrase _____________

b. manner2 is the phrase _____________

c. manner3 is the phrase _____________

d. manner4 is the phrase _____________

e. manner5 is the phrase _____________

f. manner6 is the phrase _____________

After you have done this, listen to each one (by clicking on the ‘Play’ button) and see if you were right. Correct and understand your mistakes.

Record, Play and Edit - the basics of Audacity

Selecting Audio - the basics

If the Selection tool is not already selected (it is the default setting), choose it (the I-beam icon) from the Tools Toolbar.

The easiest way to select a region of audio is to click the left mouse button anywhere inside an audio track, drag (in either direction) to the other edge of your selection, then release the mouse. Alternatively, you can click in the audio and then shift-click in a second location to extend the selection to that location.

Your second task involves segmenting the waveforms to answer the following questions.

Please make and record your measurements as accurately as possible. [Hint: To measure accurately, you will want to zoom in using ⌘1(Mac) or CTRL1(Win); you can zoom out again if you want using ⌘3(Mac) or CTRL3(Win)].

For these questions you will have to refer to the times in the panel at the bottom of your screen. IMPORTANT: Use the drop-down arrow to select "+milliseconds" for this time indicator. Single-clicking in the waveform displays the timepoint where you clicked. To select a portion of the waveform, click where you want the selection to begin, hold down the mouse, and drag horizontally in the waveform to the location where you want the selection to end (or shift-click at the desired end of the selection). The readout in the panel below the waveform will display the start and end times of your selection.

2a,b• What are the beginning and end timepoints for the nasal consonant in "a nap"?

3• At what time point do the lips close in "a bat"?

4• At what time point do the lips close in "a pat"?

5• At what time point do the lips open in "a pat"?

6• At what time point does the (second) vowel start in "a pat"?

Part 2

If you haven’t already, download the files voicing1 – voicing4 and open them in Audacity.

voicing1, voicing2, voicing3, and voicing4

The waveforms are of the words: pad, pat, bat, spat.

IMPORTANT: Use the drop-down arrow to select "+milliseconds" for the time indicator in the panel at the bottom of your screen. (As before, you can resize the windows to see them all at once and Zoom In (⌘1/CTRL1) or Out (⌘3/CTRL3) as necessary.) You will again be selecting portions of the waveform (see above). To determine the duration of your selection, you can subtract the start time from the end time (report to the nearest millisecond) or choose "Start and Length of Selection" for display, where "Length" will be your durational measurement.
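The subtraction described above can be sketched in Python. The timepoints below are made-up values, not measurements from the assignment files, and selection_duration_ms is a hypothetical helper name. The sketch assumes the start and end times are read off the display in seconds.

```python
def selection_duration_ms(start_s, end_s):
    """Return the duration of a selection in whole milliseconds.

    start_s and end_s are the selection's start and end times in seconds.
    """
    if end_s < start_s:
        raise ValueError("end time must not be earlier than start time")
    # Convert the difference from seconds to milliseconds, then round
    # to the nearest millisecond as the assignment requests.
    return round((end_s - start_s) * 1000)


# Hypothetical readout: selection starts at 0.412 s and ends at 0.497 s.
example_duration = selection_duration_ms(0.412, 0.497)
print(example_duration, "ms")
```

Choosing "Start and Length of Selection" in Audacity gives the same number directly, so the calculation is only needed when the display shows start and end times.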


Identify which waveform is which word by listening (press Play button). Answer the following questions. (To measure accurately, you may want to zoom in using ⌘1/CTRL1; you can zoom out again if you want using ⌘3/CTRL3):

7• Which is longer: the [æ] in pad or the [æ] in pat?

8• Which is longer: the [d] in pad or the [t] in pat?

These durational patterns are typical of English and many other languages. In fact, whether a final stop is perceived by English-speaking listeners as voiced or voiceless can be determined completely by the duration of the preceding vowel.


9• Is the [p] in "pat" (and "pad") differentiated from the [b] in "bat" by:

i) the presence versus absence of voicing during the closure?;

ii) Voice Onset Time (the time between the release of the closure and the onset of voicing for the following vowel)?; or

iii) by both i and ii?


10a-d• What is the Voice Onset Time measurement for the bilabial stop in each of the four words? This is a measurement of duration, reported in milliseconds. Note: Start your cursor at the very start of the stop release burst and drag it to the beginning of the first cycle of vowel voicing, and record the interval duration (subtract the start time from the end time of your selection to obtain the duration, or simply display and record "Length").

10a. VOT for [p] in "pad"

10b. VOT for [p] in "pat"

10c. VOT for [b] in "bat"

10d. VOT for [p] in "spat"


11• Select (drag over) the first stop closure plus the vowel in "spat" (i.e. leave out the initial fricative). Play the selection by clicking the Play button.

a. What word do you hear?

b. Why?