Before you can understand how computers digitally represent sound, you first need to know a little about what sound is!
Sound is all about vibration. To make a sound, something has to vibrate — whether that’s the string of a guitar, the larynx (voice box) of a person, or the loudspeakers of your radio. Sound waves consist of vibrating particles, which knock into other particles, causing those particles to vibrate and knock into more particles, and so on; this is how sound waves travel away from their source. We hear sounds because the vibrations in the air cause our eardrums to vibrate, and these vibrations are converted into nerve signals that are sent to our brains. Similarly, microphones detect vibrations in the air and convert them into electrical signals.
When you imagine a sound wave, you might think of something that looks a bit like a water wave:
However, this isn’t really an accurate representation of a sound wave. The particles that vibrate as part of a sound wave move back and forth along the axis along which the sound wave is travelling. This movement creates areas where particles are more bunched up (areas of high pressure, or compressions) and areas where particles are more spread out (areas of low pressure, or rarefactions). This type of wave is called a longitudinal wave.
It’s not easy to represent a longitudinal wave visually; there are two common types of graph for doing so:
A graph like the one below, where the x-axis indicates space, with the top of the wave representing areas of compression, and the bottom of the wave representing areas of rarefaction
A graph where the x-axis indicates time, so the graph represents a sound wave passing a particular point in space; this is the more commonly used type of graph
From a graph showing how a sound wave changes over time, you can extract two important features that affect what a sound sounds like to us: the amplitude and the frequency. The amplitude is the height of the wave on the graph from the middle to its highest point. The amplitude determines a sound’s volume — sound waves of higher amplitude are louder. For example, the sound wave created by a theatre actor projecting their lines to the audience has a higher amplitude than the sound wave produced by someone in the audience whispering to their friend.
The time that passes between two successive wave peaks is the time period of the sound wave. To calculate the frequency, you divide 1 by this time period. The frequency of the sound wave tells you about the sound’s pitch: a sound with a higher frequency has a higher pitch. For example, the graph of the shrill sound of a whistle will look a lot more bunched up than the graph of the deep sound of a double bass. The sound represented by the green line in the graph below would have a higher pitch than the sound represented by the purple line.
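The relationship between time period and frequency can be sketched in a few lines of code. This is a minimal illustration of the formula frequency = 1 / time period; the example period values are made up for demonstration, not measurements of real sounds.

```python
def frequency(time_period_s: float) -> float:
    """Return the frequency in hertz (waves per second) of a wave
    whose peaks are time_period_s seconds apart."""
    return 1 / time_period_s

# A peak every 0.01 seconds means 100 waves per second: a 100 Hz tone.
low_pitch = frequency(0.01)

# A peak every 0.002 seconds means 500 waves per second: a higher-pitched 500 Hz tone.
high_pitch = frequency(0.002)
```

Notice that a shorter time period (a more "bunched up" graph) gives a larger frequency, which is exactly why the whistle's graph looks more compressed than the double bass's.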
Of course, most sound waves are not pure like the ones shown above, because most sounds are made up of combinations of lots of waves. Let’s look at the sound waves corresponding to whistling, speech, and music.
Whistling
As you can see, whistling produces a smooth, simple sound wave.
Speech
By contrast, speech is a combination of multiple sound waves of different frequencies and amplitudes that results in a more complicated wave.
Music
Music is similar to speech: most instruments don’t produce just one pure note, so a musical sound wave is also a combination of many waves.
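The idea that complex sounds are combinations of simple waves can be sketched numerically. This example adds sine waves together (superposition); the sample rate, frequencies, and amplitudes are illustrative choices, not values taken from real recordings.

```python
import math

SAMPLE_RATE = 8000  # samples per second (an example value for this sketch)

def sample_wave(components, t):
    """Evaluate a sound wave at time t (in seconds), where the wave is the
    sum of sine waves given as (frequency_hz, amplitude) pairs."""
    return sum(amplitude * math.sin(2 * math.pi * freq * t)
               for freq, amplitude in components)

# A whistle-like pure tone: a single sine wave.
pure_tone = [(440, 1.0)]

# A speech- or music-like complex tone: several frequencies and amplitudes
# added together, giving a more complicated wave shape.
complex_tone = [(440, 1.0), (880, 0.5), (1320, 0.25)]

pure_samples = [sample_wave(pure_tone, n / SAMPLE_RATE) for n in range(100)]
complex_samples = [sample_wave(complex_tone, n / SAMPLE_RATE) for n in range(100)]
```

If you plotted `pure_samples` you would see a smooth sine curve, while `complex_samples` traces a more jagged, irregular shape — the same contrast as between the whistling and speech graphs.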
With free software like Audacity and a microphone connected to your computer (maybe as part of a webcam), you can record different sounds and look at the shape of the sound wave.
Give it a try — record sounds and noises produced by items in your home.
Can you tell anything about the sounds by looking at the waves on screen?
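If you’d like to go the other way — creating a sound instead of recording one — here is a minimal sketch that generates one second of a pure 440 Hz tone and saves it as a WAV file, using only Python’s standard library. The file name, frequency, and amplitude are example choices; open the resulting file in Audacity and you should see the smooth sine shape of a pure tone.

```python
import math
import struct
import wave

SAMPLE_RATE = 44100   # samples per second (CD-quality)
FREQUENCY = 440       # pitch of the tone, in hertz
AMPLITUDE = 0.5       # fraction of the maximum volume

with wave.open("tone.wav", "wb") as wav_file:
    wav_file.setnchannels(1)          # mono
    wav_file.setsampwidth(2)          # 16-bit samples
    wav_file.setframerate(SAMPLE_RATE)
    for n in range(SAMPLE_RATE):      # one second of audio
        value = AMPLITUDE * math.sin(2 * math.pi * FREQUENCY * n / SAMPLE_RATE)
        # Scale the -1.0..1.0 value to a signed 16-bit integer sample.
        wav_file.writeframes(struct.pack("<h", int(value * 32767)))
```

Try changing `FREQUENCY` or `AMPLITUDE`, regenerating the file, and comparing the waveforms in Audacity — you should see the graph get more bunched up or taller, matching the pitch and volume changes you hear.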