23. Speech
Humans employ many modes of signaling in communication, including visual, acoustic, tactile and olfactory signals. In this we do not differ from our mammalian, vertebrate or even invertebrate ancestors, which signaled in these ways for millions of years. The level of complexity that evolved in our speech, however, seems to be unique in nature.
Most of the sounds that humans use in speech are produced by the vocal folds. Non-vocal sounds include clicks produced with the lips or tongue. Vibration of the vocal folds produces complex sound waves with elaborate harmonic structure. The vocal folds can be stretched or relaxed, producing voice over a wide range of fundamental frequencies.
Figure 1. The spectrogram of the human voice reveals its rich harmonic content.
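The harmonic structure described above can be illustrated with a minimal sketch. It assumes an idealized voice-like tone built from five sinusoids at integer multiples of a 120 Hz fundamental, with amplitudes falling off as 1/k. These values, and the function names, are illustrative choices rather than measurements of a real voice.

```python
import math

def synthesize_harmonic_tone(f0, n_harmonics, sample_rate, n_samples):
    """Sum sinusoids at integer multiples of f0, with amplitude 1/k for harmonic k."""
    return [
        sum(math.sin(2 * math.pi * k * f0 * i / sample_rate) / k
            for k in range(1, n_harmonics + 1))
        for i in range(n_samples)
    ]

def component_amplitude(signal, freq, sample_rate):
    """Estimate the amplitude of the sinusoidal component at freq (single-bin DFT)."""
    n = len(signal)
    re = sum(x * math.cos(2 * math.pi * freq * i / sample_rate)
             for i, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq * i / sample_rate)
             for i, x in enumerate(signal))
    return 2 * math.hypot(re, im) / n

# Assumed values: 120 Hz fundamental, 8 kHz sampling, 0.1 s of signal (800 samples).
F0, SAMPLE_RATE, N_SAMPLES = 120.0, 8000, 800
tone = synthesize_harmonic_tone(F0, 5, SAMPLE_RATE, N_SAMPLES)

# Probe the harmonics and one frequency that lies between two of them.
for freq in (120.0, 180.0, 240.0, 360.0, 480.0, 600.0):
    print(f"{freq:6.1f} Hz: amplitude {component_amplitude(tone, freq, SAMPLE_RATE):.3f}")
```

In this idealized tone, energy appears only at multiples of the fundamental, while the probe at 180 Hz, which falls between two harmonics, is essentially zero. Changing `F0` shifts the whole series, which is a rough analogue of stretching or relaxing the vocal folds.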
At any given fundamental frequency, the voice can be changed in a myriad of ways that influence its meaning during speech or singing. After being produced at the vocal folds, sound is modified through cavity resonances in the pharynx, mouth and nasal cavity and through constriction or interruption of the flow at the soft palate, tongue, teeth or lips. A large number of muscles can move body parts in ways that alter the quality of voice and in this way become part of the vocal system. The vocal system could be the most complex and versatile motor system in our body.
Phonetics is the branch of linguistics that studies the sounds of human speech and equivalent aspects of sign languages. It is divided into articulatory, acoustic and auditory phonetics. Articulatory phonetics is the subfield that studies how the anatomy of the vocal apparatus is used to articulate speech. It addresses the biomechanics of positioning structures, the physics of airflow and the acoustics of the articulation.
The variety of vocal sounds produced for communication needed to be cataloged using a common notation system that could be applied to any language and dialect. The International Phonetic Association created the International Phonetic Alphabet (IPA) in the late 19th century as a standardized representation of the sounds of spoken language.
IPA symbols are composed of one or more elements of two basic types: letters and diacritics. The sound of the English letter ⟨t⟩, for example, may be transcribed in IPA with a single letter, [t], or with a letter plus diacritics, [t̺ʰ], depending on how precise one wishes to be. As of 2005, the system contains 107 letters and 52 diacritics. The letters chosen for the IPA come mostly from Latin and Greek, but many other letters and symbols were also included.
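The composition of a symbol from a letter plus diacritics can be seen in how such a transcription is encoded as text. The sketch below assumes that [t̺ʰ] is encoded as the Unicode sequence U+0074, U+033A, U+02B0, and it uses Unicode general categories as a rough stand-in for the IPA's own distinction between letters and diacritics. That mapping is a heuristic for illustration, not an official IPA classification, and the function name is hypothetical.

```python
import unicodedata

# Unicode categories treated here as "diacritic": combining marks (Mn) and
# modifier letters (Lm). This is a heuristic; for example, the IPA length
# mark is also category Lm although the IPA treats it as a suprasegmental.
DIACRITIC_CATEGORIES = {"Mn", "Lm"}

def describe_elements(transcription):
    """Return (character, code point, kind) for each element of a transcription."""
    elements = []
    for ch in transcription:
        kind = "diacritic" if unicodedata.category(ch) in DIACRITIC_CATEGORIES else "letter"
        elements.append((ch, f"U+{ord(ch):04X}", kind))
    return elements

# Assumed encoding of the narrow transcription of English <t>.
narrow_t = "t\u033a\u02b0"

for ch, code_point, kind in describe_elements(narrow_t):
    print(code_point, kind, unicodedata.name(ch))
```

Under these assumptions, the plain transcription [t] consists of a single letter, while the narrower one adds two diacritic elements to the same base letter.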
The IPA provides one letter for each distinctive sound (speech segment). This means that:
It does not normally use combinations of letters to represent single sounds, the way English does with ⟨sh⟩, ⟨th⟩ and ⟨ng⟩, or single letters to represent multiple sounds the way ⟨x⟩ represents /ks/ or /ɡz/ in English.
There are no letters that have context-dependent sound values, as do "hard" and "soft" ⟨c⟩ or ⟨g⟩ in several European languages.
It does not have separate letters for two sounds if no known language makes a distinction between them.
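The first of these principles can be checked on a few words. The sketch below pairs English spellings with broad IPA transcriptions and compares the number of letters with the number of speech segments. The three words and their transcriptions are illustrative examples chosen here (the one for "box" follows a Southern British pronunciation), and the segment counter is a simplification that only skips combining marks.

```python
import unicodedata

# Illustrative examples: English spelling paired with a broad IPA transcription.
EXAMPLES = {
    "ship": "ʃɪp",    # <sh> is one sound
    "thing": "θɪŋ",   # <th> and <ng> are one sound each
    "box": "bɒks",    # <x> stands for two sounds
}

def count_segments(ipa):
    """Count IPA letters, skipping combining marks (Unicode category Mn)."""
    return sum(1 for ch in ipa if unicodedata.category(ch) != "Mn")

for spelling, ipa in EXAMPLES.items():
    print(f"{spelling}: {len(spelling)} letters, {count_segments(ipa)} segments /{ipa}/")
```

In the IPA transcriptions, the number of letters matches the number of segments, whereas the English spellings use either more letters (digraphs) or fewer (⟨x⟩) than there are sounds.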
Figure 2. The official chart of the IPA as of 2015. An official chart is released after each revision of the IPA, summarizing its structure.
The IPA offers over 160 symbols for transcribing speech, but only a subset of these is used to transcribe any one language. It is possible to transcribe speech with various levels of precision. A precise phonetic transcription, in which sounds are described in a great deal of detail, is known as a narrow transcription. A coarser transcription is called a broad transcription. The English word "little", for example, may be transcribed broadly using the IPA as [ˈlɪtəl], and this broad (imprecise) transcription is a more or less accurate description of many pronunciations. A narrower transcription may focus on individual or dialectal details: [ˈɫɪɾɫ] in General American, [ˈlɪʔo] in Cockney, or [ˈɫɪːɫ] in Southern US English.
The sounds produced for vocal communication form two major groups: vowels and consonants.
Vowels are produced as air passes through the larynx and a vocal tract that is mostly open, so the air escapes without generating turbulent noise.
Consonants are produced with restriction or interruption of the airflow from the lungs.
Articulatory phonetics is the subfield of phonetics that studies how the vocal apparatus is moved to produce speech sounds. The International Phonetic Alphabet provides a standardized notation for the sounds of spoken languages. Vocal sounds are produced as vowels, with unrestricted airflow, or consonants, with blocked or constricted airflow.
Vocal folds, spectrogram, harmonic series, cavity resonance, phonetics, articulatory phonetics, International Phonetic Alphabet, International Phonetic Association, vowel, consonant.
Figure 1 by Dvortygirl, Mysid - FFT'd in baudline; original sound by Dvortygirl. This file was derived from: En-us-it's all Greek to me.ogg, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=2524720
Figure 2 by International Phonetic Association. http://www.internationalphoneticassociation.org/content/ipa-chart, available under a Creative Commons Attribution-Sharealike 3.0 Unported License. Copyright 2015.