Welcome to the Web Audio API lesson! I personally love this API; playing with it is a lot of fun, as you will discover. I hope you will like it as much as I do!
The audio and video elements are used for playing streamed content, but we do not have real control over the audio signal.
They come with a powerful API, as we saw during the previous course and the previous lessons of this course: we can build a custom user interface and make our own play, stop and pause buttons.
We can control the video from JavaScript, listen to events and manage playlists, etc.
However, we have no real control over the audio signal itself: fancy visualizations, the ones that dance with the music, are impossible to do, and so are sound effects such as reverberation or delay, building an equalizer, or controlling the stereo to put the signal on the left or on the right.
Furthermore, playing multiple sounds in sync is nearly impossible due to the streamed nature of the signal.
For video games, we need to play many different sounds very quickly, and we cannot wait for a stream to arrive before starting to play it.
Web Audio is the solution to all these needs: with Web Audio you will be able to get the output signal from the audio and video elements and process it with multiple effects.
You will be able to work with samples loaded in memory.
This will enable perfect syncing, accurate loops, you will be able to mix sounds, etc.
You can also generate music programmatically for creating synthetic sounds or virtual instruments.
That part will not be covered in this course, even though I give links to interesting libraries and demos that do it.
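To give you a first idea of what "samples loaded in memory" means in code, here is a minimal sketch, not the exact code of the demos that follow: it loads a short sound with fetch (the URL is just a placeholder), decodes it in memory, and plays it on demand.
// Minimal sketch: load a short sound in memory, decode it, play it on demand.
// The URL is a placeholder; replace it with a real sound file.
var audioCtx = new (window.AudioContext || window.webkitAudioContext)();
var decodedSound; // will hold the decoded sample

fetch('https://example.com/sounds/shoot.mp3')
  .then(function(response) { return response.arrayBuffer(); })
  .then(function(data) { return audioCtx.decodeAudioData(data); })
  .then(function(buffer) { decodedSound = buffer; });

// Called for example from a click handler: buffer source nodes are cheap,
// "fire and forget" objects, which is exactly what games need.
function playSample() {
  var src = audioCtx.createBufferSource();
  src.buffer = decodedSound;
  src.connect(audioCtx.destination);
  src.start(0);
}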
Let’s have a look at some applications.
The first thing I wanted to show you is just what we can do with the standard audio element.
So this is the standard audio element [music] that just plays a guitar riff coming from a server, but we can get control over the audio stream and do things like this [music].
As you can see, I control the stereo balance here, and we have a real-time waveform visualization and volume meters.
Another thing we can do is that we can load samples in memory.
This is an application I wrote for playing multitrack songs.
So we load MP3s and decode them in memory so that we can click anywhere in the song, and I can make loops like this [music].
As you can see, we can isolate the tracks, we can mix them in real time.
Another application that works with samples in memory is this small example, which you will learn how to write in the course: we loaded two different short sounds [sounds] in memory, and we can play them repeatedly [sounds], or we can add effects such as changing the pitch or the volume with random values and play them at random intervals [sounds].
We can see that the application to video games is straightforward.
Another thing you can do is use synthetic sounds; we will not cover the techniques, but you can use some libraries.
This is a library that works with synthetic sounds: you do not have to load a file to produce these sounds [sounds].
This is a library for making 8-bit sounds like the ones the very first computers and video games of the '80s used to produce.
You can also make very complex applications, like a vocoder [sounds] or a synthesizer musical instrument [sounds].
OK, you have got the idea.
These are the kinds of interesting things you can do, and you can also learn how to debug such applications.
I will make a video especially for that, but using Firefox, you can activate the Web Audio debug tab in the settings of the dev tools.
So I clicked here on Web Audio, and this added a new Web Audio tab here, and if I reload the page, I can see the graph corresponding to the routing of the signal.
Here we have got a source (this whole structure is called the audio graph), so we've got the source, and we've got a destination.
The source is the original sound.
In this case it is a MediaElementAudioSourceNode that corresponds to the audio element here.
The signal goes to another node that is provided by the Web Audio API and implemented natively in your browser: a StereoPanner, for balancing the sound between left and right.
Then it goes to an analyser here that will draw the blue waveform, and finally to the destination, which is the speakers.
I also routed the signal to another part of the graph just for displaying two different analysers corresponding to the left and right channels.
This is for the volume meters here [music].
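If you are curious, the routing I just described could be assembled roughly like this; this is a simplified sketch with my own variable names and a hypothetical "player" id, not the exact code of the demo.
// Simplified sketch of the routing described above
var audioCtx = new (window.AudioContext || window.webkitAudioContext)();
var source = audioCtx.createMediaElementSource(document.querySelector('#player'));

// main chain: source -> stereo panner -> analyser (waveform) -> speakers
var panner = audioCtx.createStereoPanner();
var waveformAnalyser = audioCtx.createAnalyser();
source.connect(panner);
panner.connect(waveformAnalyser);
waveformAnalyser.connect(audioCtx.destination);

// secondary branch: split the signal into left and right channels,
// each one feeding its own analyser for the two volume meters
var splitter = audioCtx.createChannelSplitter(2);
var leftAnalyser = audioCtx.createAnalyser();
var rightAnalyser = audioCtx.createAnalyser();
panner.connect(splitter);
splitter.connect(leftAnalyser, 0);   // output 0 = left channel
splitter.connect(rightAnalyser, 1);  // output 1 = right channel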
And if you click on a node, you can see that some nodes have parameters.
On the StereoPanner, which enables me to balance the sound to the left or to the right, you can see that if I change a value and click again, I can inspect the different properties of each node.
You will learn how to build this graph, how to assemble the different nodes, and what the most useful nodes are for adding effects, controlling the volume, controlling the stereo, making an equalizer, creating fancy visualizations, and so on.
Welcome to the Web Audio world: over the next few lessons, you will learn step by step how to build such an application.
In Module 2 of the HTML5 Coding Essentials course, you learned how to add an audio or video player to an HTML document, using the <audio> and <video> elements.
For example:
<audio src="https://mainline.i3s.unice.fr/mooc/LaSueur.mp3" controls></audio>
...renders like this in your document:
Under the hood, this HTML code:
initiates a network request to stream the content,
deals with decoding/streaming/buffering the incoming data,
renders audio controls,
updates the progress indicator, time, etc.
You also learned that it's possible to write a custom player: to make your own controls and use the JavaScript API of the <audio> and <video> elements; to call play() and pause(); to read/write properties such as currentTime; to listen to events (ended, error, timeupdate, etc.); and to manage a playlist, etc.
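For instance, a tiny custom control built with that API could look like the sketch below (the "player" and "playButton" ids are just examples):
// a custom play/pause button for an <audio id="player"> element
var player = document.querySelector('#player');
var playButton = document.querySelector('#playButton');

playButton.onclick = function() {
  if (player.paused) {
    player.play();
  } else {
    player.pause();
  }
};

// read properties and listen to events
player.addEventListener('timeupdate', function() {
  console.log('current time: ' + player.currentTime + ' s');
});
player.addEventListener('ended', function() {
  console.log('song finished, a playlist manager could load the next one here');
});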
However, there are many things we still cannot do, including:
Play multiple sounds or music in perfect sync,
Play non-streamed sounds (this is a requirement for games: sounds must be loaded in memory),
Process the signal before it is output to the speakers: add special effects, an equalizer, stereo balancing, reverb, etc.,
Create fancy visualizations that dance with the music (e.g. waveforms and frequency spectra).
The Web Audio API fills in these missing parts, and much more.
In this course, we do not cover the whole Web Audio API specification. Instead, we focus on the parts of the API that are useful for writing enhanced multimedia players (that work with streamed audio or video) and on the parts that are useful for games (i.e. the parts that work with small sound samples loaded in memory). The parts of the API that specialize in music synthesis and note scheduling will not be studied in this course.
Here's a screenshot from one example we will study: an audio player with animated waveform and volume meters that 'dance' with the music:
The canvas element uses a graphics context for drawing shapes and handling properties such as colors and line widths.
The Web Audio API takes a similar approach, using an AudioContext for all its operations.
Using this context, the first thing we do with this API is to build an "audio routing graph" made of "audio nodes" which are linked together (most of the time in this course we will simply call it the "audio graph"). Some node types are "audio sources", another built-in node represents the speakers, and many other types correspond to audio effects (delay, reverb, filter, stereo panner, etc.) or to audio analysis (useful for creating fancy visualizations of the real-time signal). Others, specialized in music synthesis, are not studied in this course.
The AudioContext also exposes various properties, such as sampleRate, currentTime (in seconds, from the start of AudioContext creation), destination, and the methods for creating each of the various audio nodes.
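As a quick illustration, here is how you could create an AudioContext and inspect some of these properties (the logged values will of course depend on your machine):
// create the AudioContext (the webkit prefix is for older Safari versions)
var audioCtx = new (window.AudioContext || window.webkitAudioContext)();

console.log(audioCtx.sampleRate);   // e.g. 44100 or 48000, depends on the audio hardware
console.log(audioCtx.currentTime);  // seconds elapsed since the context was created
console.log(audioCtx.destination);  // the AudioDestinationNode, i.e. the speakers

// the context is also a factory for all the audio nodes
var gain = audioCtx.createGain();         // a GainNode
var analyser = audioCtx.createAnalyser(); // an AnalyserNode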
The easiest way to understand this principle is to look at the pen below:
HTML code:
<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <title>WebAudio example of gain node</title>
  </head>
  <body>
    <audio src="https://mainline.i3s.unice.fr/mooc/drums.mp3" id="gainExample" controls loop crossorigin="anonymous"></audio>
    <br>
    <label for="gainSlider">Gain</label>
    <input type="range" min="0" max="1" step="0.01" value="1" id="gainSlider" />
  </body>
</html>
JS code:
// This line is a trick to initialize the AudioContext
// that will work on all recent browsers
var ctx = window.AudioContext || window.webkitAudioContext;
var audioContext;
var gainExample, gainSlider, gainNode;

window.onload = function() {
  // get the AudioContext
  audioContext = new ctx();
  // the audio element
  gainExample = document.querySelector('#gainExample');
  gainSlider = document.querySelector('#gainSlider');
  buildAudioGraph();
  // input listener on the gain slider
  gainSlider.oninput = function(evt) {
    gainNode.gain.value = evt.target.value;
  };
};

function buildAudioGraph() {
  // create source and gain node
  var gainMediaElementSource = audioContext.createMediaElementSource(gainExample);
  gainNode = audioContext.createGain();
  // connect nodes together
  gainMediaElementSource.connect(gainNode);
  gainNode.connect(audioContext.destination);
}
This example is detailed in the next lesson. For the moment, all you need to know is that it routes the signal from an <audio> element using a special node that bridges the "streamed audio" world to the Web Audio world, called a MediaElementAudioSourceNode. This node is then connected to a GainNode, which enables volume control, and the GainNode is in turn connected to the speakers.
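A small usage note: gain is an AudioParam, so instead of setting gainNode.gain.value directly, as the slider listener above does, you can also schedule smooth changes, which avoids audible clicks (the target value of 0.5 and the one-second ramp below are arbitrary):
// schedule a smooth volume change instead of an abrupt one
var now = audioContext.currentTime;
gainNode.gain.setValueAtTime(gainNode.gain.value, now);
gainNode.gain.linearRampToValueAtTime(0.5, now + 1); // reach 0.5 in one second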
Firefox used to have a very good WebAudio debugger built into its devtools, as shown in the video above. If you use Google Chrome, get the "Audion" extension: you can install it from the Chrome Web Store.
Once the extension is installed, open a Web page that contains some WebAudio code (this one, for example), open the Developer Tools, and locate the "Web Audio" tab. Then reload the target webpage so that all Web Audio activity can be monitored by the tool. You can click on the nodes of the WebAudio graph to see the values of their properties.
Note that the JSBin and CodePen examples should be opened in standalone mode (not in editor mode).
Audio nodes are linked via their inputs and outputs, forming a chain that starts with one or more sources, goes through one or more nodes, then ends up at a destination (although you don't have to provide a destination if you just want to visualize some audio data, for example).
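As an example of the "no destination needed" case, here is a small sketch that only analyses the signal (the "player" id is hypothetical); nothing is connected to the speakers, so the sound is not heard:
// analysis-only graph: source -> analyser, no destination connected
var audioCtx = new (window.AudioContext || window.webkitAudioContext)();
var source = audioCtx.createMediaElementSource(document.querySelector('#player'));
var analyser = audioCtx.createAnalyser();
source.connect(analyser);

// read the time-domain data (the waveform), typically inside a
// requestAnimationFrame loop, to draw a visualization
var data = new Uint8Array(analyser.fftSize);
analyser.getByteTimeDomainData(data);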
The AudioDestination node above corresponds to the speakers. In this example, the signal goes from left to right: from the MediaElementAudioSourceNode (we will see in the code that it's the audio stream from an <audio> element), to a Gain node (by adjusting its gain property we can set the volume of the sound that outputs from this node), then to the speakers.
HTML code extract:
<audio src="https://mainline.i3s.unice.fr/mooc/drums.mp3"
id="gainExample"
controls loop
crossorigin="anonymous">
</audio>
<br>
<label for="gainSlider">Gain</label>
<input type="range" min="0" max="1" step="0.01" value="1" id="gainSlider" />
JavaScript source code:
// This line is a trick to initialize the AudioContext
// that will work on all recent browsers
var ctx = window.AudioContext || window.webkitAudioContext;
var audioContext;
var player, gainSlider, gainNode;

window.onload = function() {
  // get the AudioContext
  audioContext = new ctx();
  // the audio element
  player = document.querySelector('#gainExample');
  player.onplay = () => {
    audioContext.resume();
  };
  gainSlider = document.querySelector('#gainSlider');
  buildAudioGraph();
  // input listener on the gain slider
  gainSlider.oninput = function(evt) {
    gainNode.gain.value = evt.target.value;
  };
};

function buildAudioGraph() {
  // create source and gain node
  var gainMediaElementSource = audioContext.createMediaElementSource(player);
  gainNode = audioContext.createGain();
  // connect nodes together
  gainMediaElementSource.connect(gainNode);
  gainNode.connect(audioContext.destination);
}
Here we applied a commonly used technique:
As soon as the page is loaded, we initialize the audio context (in the window.onload handler). We use a trick so that the code works on all browsers: Chrome, FF, Opera, Safari, Edge. The line var ctx = window.AudioContext || window.webkitAudioContext; is required for Safari, as it still needs the WebKit-prefixed version of the AudioContext constructor. We also call audioContext.resume() when the user clicks play, because browsers' autoplay policies create the AudioContext in a suspended state until a user gesture occurs.
Then we build the audio graph (the call to buildAudioGraph()).
The buildAudioGraph() function first creates the nodes, then connects them to build the audio graph. Notice the use of audioContext.destination for the speakers: this is a built-in node. Also notice the MediaElementSource node created from the <audio> element with id "gainExample".
Web Audio nodes are implemented natively in the browser. The Web Audio framework has been designed to handle a very large number of nodes. It's common to encounter applications with several dozen nodes; some, such as this Vocoder application, use hundreds of nodes (the picture below was taken while the WebAudio debugger was still included in Firefox; you should get similar results with the Audion extension).