Working with cues

Live Coding Video: accessing the content of a track

Hi! We will continue the example from the previous video, and this time we will look at what happens when we click on the "force load track 0" or "force load track 2" buttons.

You remember that track number 0 (the English subtitles) was not loaded: readyState=0 here says the track is not loaded.

Track 2 was not loaded either: it contains the English chapters of the video.

This time, I will explain how we can read the content of the file.

So if I click on "force load track 0", I see here the content of the WebVTT file.

I didn't read it as plain text. I used the track API to access each cue individually. Each one of these elements here is a cue, and I access its id, its start time, its end time and its content, which we call the text content of the cue.

If I click on "force load track 2", I see the chapter definitions here, so chapter 1 of the video goes from 0 to 26 seconds, and it corresponds to the introduction part of the video.

How did we do that? We just completed the readContent function that previously just showed the statuses of the different tracks.

So remember that when we clicked on a button, we forced the text track corresponding to the HTML track to be loaded in memory, and then we can read it.

A TextTrack object has different properties and the most important one is called cues.

The cues property is the list of every cue inside the VTT file, and each cue corresponds to a time segment, has an id and a text content.

If you do track.cues, you get the list of cues and you can iterate over them.

So for each cue, we are going to get its id: cue.id here.

It corresponds to the id of the cue number i.

In my example, I have got an index in the loop, I get the current cue, I get the id of this cue.

I can also get the start time, the end time and the text.

So cue.text corresponds exactly to this sentence highlighted here.

This is the only thing I wanted to show you for now, because next time we are going to do something really interesting with this content: we are going to display a clickable transcript on the side of the video.

And when we click on it, the video will jump to the corresponding position.

This is exactly what the edX video player does, the one you are watching right now.

A TextTrack object has different properties and methods

kind: equivalent to the kind attribute of HTML track elements. Its value is either "subtitles", "captions", "descriptions", "chapters", or "metadata". We will see examples of chapters, descriptions and metadata tracks in subsequent lessons.
label: the label of the track, equivalent of the label attribute of HTML track elements.
language: the language of the text track, equivalent to the srclang attribute of HTML track elements (be careful: it's not the same spelling!)
mode: explained earlier. Can have values equal to: "disabled"|"hidden"|"showing". Can force a track to be loaded (by setting the mode to "hidden" or "showing").
cues: get a list of cues as a TextTrackCueList object. This is the complete content of the WebVTT file!
activeCues: used in event listeners while the video is playing. Corresponds to the cues located in the current time segment. The start and end times of cues can overlap. In practice this rarely happens, but the property handles that case by returning a TextTrackCueList object that contains all the cues active at a given time.
addCue(cue): add a cue to the list of cues.
removeCue(cue): remove a cue from the list of cues.
getCueById(id): returns the cue with a given id.
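
As a minimal sketch of how mode and activeCues might be used together, one could listen to the cuechange event of a track and read the cues that are active at that moment. The helper names activeCueTexts and watchActiveCues, and the onChange callback, are hypothetical and only serve the illustration; mode, activeCues and the cuechange event are standard TextTrack members.

```javascript
// Hypothetical helper: returns the text of every cue currently active on a track.
// activeCues is an array-like TextTrackCueList, so Array.from can copy it.
function activeCueTexts(textTrack) {
  // activeCues is null while the track mode is "disabled"
  return Array.from(textTrack.activeCues || [], (cue) => cue.text);
}

// Hypothetical helper: calls onChange(texts) each time the set of active cues changes.
function watchActiveCues(textTrack, onChange) {
  // a "disabled" track fires no cue events, so switch it to "hidden" if needed
  if (textTrack.mode === "disabled") {
    textTrack.mode = "hidden";
  }
  textTrack.addEventListener("cuechange", () => {
    onChange(activeCueTexts(textTrack));
  });
}
```

With the video of the example below, this could be called as watchActiveCues(video.textTracks[0], (texts) => console.log(texts)). Since cues may overlap, the callback receives an array rather than a single string.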

A TextTrackCueList is a collection of cues, each of which has different properties and methods

id: the cue id as written in the line that starts cues in the WebVTT file.
startTime and endTime: define the time segment for the cue, in seconds, as a floating point value. It is not the formatted String we have in the WebVTT file (for example, 26.5 rather than 00:00:26.500).
text: the cue content.
getCueAsHTML(): a method that returns an HTML version of the cue content (a DocumentFragment) rather than plain text.
Others such as align, line, position, size, snapToLines, etc., that correspond to the position of the cue, as specified in the WebVTT file. See the W3Cx HTML5 Coding Essentials course about cue positioning.
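
Because startTime and endTime are plain numbers, displaying them the way they appear in the WebVTT file requires a small conversion. The following is only a sketch: formatVttTime is a hypothetical helper name, not part of the track API.

```javascript
// Hypothetical helper: converts a cue time in seconds (e.g. cue.startTime)
// to the "hh:mm:ss.ttt" notation used in WebVTT files.
function formatVttTime(seconds) {
  // work in whole milliseconds to avoid floating point surprises
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSeconds = Math.floor(totalMs / 1000);
  const s = totalSeconds % 60;
  const m = Math.floor(totalSeconds / 60) % 60;
  const h = Math.floor(totalSeconds / 3600);
  const pad = (n, width) => String(n).padStart(width, "0");
  return pad(h, 2) + ":" + pad(m, 2) + ":" + pad(s, 2) + "." + pad(ms, 3);
}

console.log(formatVttTime(26)); // prints "00:00:26.000"
```

For a cue, formatVttTime(cue.startTime) + " --> " + formatVttTime(cue.endTime) gives a line in the same notation as the timing line of the WebVTT file.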

Example that displays the content of a track

HTML code:

<!DOCTYPE html>
<html>
<head>
  <title>Using HTML views of tracks</title>
</head>
<body>
  <!-- the <source> elements of the video are not shown in this listing -->
  <video id="myVideo">
    <track label="English subtitles" kind="subtitles" srclang="en"
           src="https://mainline.i3s.unice.fr/mooc/elephants-dream-subtitles-en.vtt">
    <track label="Deutsch subtitles" kind="subtitles" srclang="de"
           src="https://mainline.i3s.unice.fr/mooc/elephants-dream-subtitles-de.vtt" default>
    <track label="English chapters" kind="chapters" srclang="en"
           src="https://mainline.i3s.unice.fr/mooc/elephants-dream-chapters-en.vtt">
  </video>
  <h3>HTML track descriptions</h3>
  <button id="buttonLoadFirstTrack" onclick="forceLoadTrack(0);" disabled>Force load track 0</button>
  <button id="buttonLoadThirdTrack" onclick="forceLoadTrack(2);" disabled>Force load track 2</button>
  <div id="trackStatusesDiv">
  </div>
</body>
</html>

CSS code:

#trackStatusesDiv {
  border: 1px solid;
  height: auto;
  padding: 20px;
}

JS code:

let video, htmlTracks;
let trackStatusesDiv;
let buttonLoadFirstTrack, buttonLoadThirdTrack;

window.onload = () => {
  // called when the page has been loaded
  video = document.querySelector("#myVideo");
  trackStatusesDiv = document.querySelector("#trackStatusesDiv");

  // enable the two buttons now that the page is ready
  buttonLoadFirstTrack = document.querySelector("#buttonLoadFirstTrack");
  buttonLoadFirstTrack.disabled = false;
  buttonLoadThirdTrack = document.querySelector("#buttonLoadThirdTrack");
  buttonLoadThirdTrack.disabled = false;

  // Get the tracks as HTML elements
  htmlTracks = document.querySelectorAll("track");

  // display their status in a div under the video
  displayTrackStatuses(htmlTracks);
};

function displayTrackStatuses(htmlTracks) {
  trackStatusesDiv.innerHTML = "";

  // display track info
  for(let i = 0; i < htmlTracks.length; i++) {
    let currentHtmlTrack = htmlTracks[i];
    // the TextTrack object associated with the HTML track element
    let currentTextTrack = currentHtmlTrack.track;

    let label = "<li>label = " + currentHtmlTrack.label + "</li>";
    let kind = "<li>kind = " + currentHtmlTrack.kind + "</li>";
    let lang = "<li>lang = " + currentHtmlTrack.srclang + "</li>";
    let readyState = "<li>readyState = " + currentHtmlTrack.readyState + "</li>";
    let mode = "<li>mode = " + currentTextTrack.mode + "</li>";

    trackStatusesDiv.innerHTML += "<li><b>Track:" + i + ":</b></li>" + "<ul>" + label + kind + lang + readyState + mode + "</ul>";
  }
}

function readContent(track) {
  // track is a TextTrack object!
  console.log("reading content of loaded track...");
  //displayTrackStatuses(htmlTracks);

  trackStatusesDiv.innerHTML = "";

  // Get the cue list for this track
  let cues = track.cues;

  // iterate on the cue list
  for(let i = 0; i < cues.length; i++) {
    // current cue
    let cue = cues[i];

    let id = cue.id + "<br>";
    let timeSegment = cue.startTime + " => " + cue.endTime + "<br>";
    let text = cue.text + "<P>";

    trackStatusesDiv.innerHTML += id + timeSegment + text;
  }
}

function getTrack(htmlTrack, callback) {
  // the TextTrack object associated with the HTML track element
  let textTrack = htmlTrack.track;

  if(htmlTrack.readyState === 2) {
    // readyState 2 means the track file has been loaded
    console.log("text track already loaded");
    textTrack.mode = "hidden";
    callback(textTrack);
  } else {
    // will force the track to be loaded
    console.log("Forcing the text track to be loaded");
    textTrack.mode = "hidden";

    // call the callback once the track file has been loaded
    htmlTrack.addEventListener('load', function(e) {
      callback(textTrack);
    });
  }
}

function forceLoadTrack(n) {
  getTrack(htmlTracks[n], readContent);
}

In the JavaScript code, we only changed the content of the readContent(track) function compared with the example from the previous lesson:

function readContent(track) {
  console.log("reading content of loaded track...");
  //displayTrackStatuses(htmlTracks);

  // instead of displaying the track statuses, we display
  // in the same div, the track content

  // first, empty the div
  trackStatusesDiv.innerHTML = "";

  // get the list of cues for that track
  var cues = track.cues;

  // iterate on them
  for(var i = 0; i < cues.length; i++) {
    // current cue
    var cue = cues[i];

    var id = cue.id + "<br>";
    var timeSegment = cue.startTime + " => " + cue.endTime + "<br>";
    var text = cue.text + "<P>";

    trackStatusesDiv.innerHTML += id + timeSegment + text;
  }
}

As you can see, the code is simple: you first get the cues for the given TextTrack (it must be loaded; this is the case since we took care of it earlier), then iterate over the list of cues, and use the id, startTime, endTime and text properties of each cue.
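
The loading step that getTrack takes care of could also be written with a Promise instead of a callback. This is only a sketch of one possible variant, not the code used in the example: loadTextTrack is a hypothetical name, and the handling of the error state is an assumption added for illustration.

```javascript
// Hypothetical promise-based variant of getTrack(htmlTrack, callback).
// htmlTrack is an HTML track element; the promise resolves with its TextTrack.
function loadTextTrack(htmlTrack) {
  return new Promise((resolve, reject) => {
    const textTrack = htmlTrack.track;
    const failed = () => reject(new Error("The track could not be loaded"));

    if (htmlTrack.readyState === 3) { // 3 means the loading failed
      failed();
      return;
    }

    const alreadyLoaded = htmlTrack.readyState === 2; // 2 means loaded
    if (!alreadyLoaded) {
      // register the listeners before triggering the load
      htmlTrack.addEventListener("load", () => resolve(textTrack), { once: true });
      htmlTrack.addEventListener("error", failed, { once: true });
    }

    // a "disabled" track is never loaded and exposes no cues
    if (textTrack.mode === "disabled") {
      textTrack.mode = "hidden";
    }

    if (alreadyLoaded) {
      resolve(textTrack);
    }
  });
}
```

With the functions of the example, a button handler could then call loadTextTrack(htmlTracks[0]).then(readContent). Unlike getTrack, this sketch leaves a track that is already "showing" untouched instead of switching it to "hidden".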

This technique will be used in one of the next lessons, and we will show you how to make a clickable transcript on the side of the video - something quite similar to what the edX video player does.
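
As a possible preparation for such a transcript, the cues can be copied into plain JavaScript objects, which are easier to sort, filter or pass around than the live cue list. This is a sketch under assumptions: cuesToEntries and the property names start and end are hypothetical choices, not part of the track API.

```javascript
// Hypothetical helper: copies the cues of a loaded TextTrack into plain objects.
function cuesToEntries(textTrack) {
  // cues is null while the track mode is "disabled"
  const cues = textTrack.cues || [];
  // the cue list is array-like, so Array.from can map over it
  return Array.from(cues, (cue) => ({
    id: cue.id,
    start: cue.startTime, // in seconds
    end: cue.endTime,     // in seconds
    text: cue.text,
  }));
}
```

With the functions of the example above, it could be used as getTrack(htmlTracks[0], (track) => console.log(cuesToEntries(track))).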
