HTML5’s Track Element: Add Text to a Media File Without Software


With so much audio and video consumed online, the internet has moved well beyond plain text. Believe it or not, videos show up in almost 40% of search results. For a user, multimedia may simply be an engaging way to be entertained, but for a developer, it’s a golden opportunity to deliver a wealth of information to users.
Before HTML5 introduced the track element, adding timed text tracks to media files without scripting or additional software was an arduous task for developers. In this blog post, I’m going to show you how to add timed text tracks to your media files. You’ll also see how the track element can benefit the SEO of a web page. At the end, I’ve included demo links so you can see the element in action.

Let’s begin!

Understanding the track Element

The track element allows you to add external timed text tracks to your media files in a simple, standardized way. The text content may be captions, subtitles, descriptions, chapters or metadata; visible kinds such as captions and subtitles are typically displayed near the bottom of the video area in the HTML5 video player.

The track element is an empty element, which means it must not have an end tag. It must be a child of a media element, that is, an audio or video element, as in the following example:

<video width="854" height="480" controls>
<source src="video.mp4" type="video/mp4">
<track src="">
</video>

Browser Support for the track Element

Luckily, almost all modern browsers support the track element:
– Chrome
– Safari 6+
– Firefox 31+
– Opera 15+
– Internet Explorer 10+

Attributes of the track Element

The track element accepts five attributes described below:

The src attribute:

This required attribute specifies the URL of the timed text file. Because browsers generally cannot load a track from a file on the local file system, you need to serve your media and track files from a web server. The value of the src attribute must be an absolute or relative URL.

For instance:

<track src="video_">
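
The post does not show the timed text file itself. Browsers expect it in the WebVTT format: a WEBVTT header followed by cues, each with a start time, an end time and the text to display. The sketch below generates such a file from a list of cues. The function names (formatTimestamp, buildWebVtt) and the sample cue text are hypothetical and only serve as illustration.

```javascript
// Format a time in seconds as a WebVTT timestamp (hh:mm:ss.ttt).
function formatTimestamp(totalSeconds) {
  const ms = Math.round(totalSeconds * 1000);
  const hours = Math.floor(ms / 3600000);
  const minutes = Math.floor((ms % 3600000) / 60000);
  const seconds = Math.floor((ms % 60000) / 1000);
  const millis = ms % 1000;
  const pad = (n, width) => String(n).padStart(width, "0");
  return `${pad(hours, 2)}:${pad(minutes, 2)}:${pad(seconds, 2)}.${pad(millis, 3)}`;
}

// Build the text of a WebVTT file from cues shaped like { start, end, text }.
// Blocks are separated by a blank line, as the format requires.
function buildWebVtt(cues) {
  const blocks = cues.map(
    (cue) => `${formatTimestamp(cue.start)} --> ${formatTimestamp(cue.end)}\n${cue.text}`
  );
  return ["WEBVTT", ...blocks].join("\n\n") + "\n";
}

// Hypothetical sample cues, printed so the file layout is visible.
console.log(
  buildWebVtt([
    { start: 0, end: 2.5, text: "Welcome to the demo." },
    { start: 2.5, end: 6, text: "This line appears next." },
  ])
);
```

Saved with a .vtt extension and served from your web server, a file like this is what the src attribute points to.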


The srclang attribute:

This attribute specifies the language of the text track data. srclang must be present if the kind attribute is set to “subtitles”. Its value must be a valid BCP 47 language tag; for example, en represents English and hi represents Hindi. The IANA Language Subtag Registry lists nearly 8,000 language subtags.

<track src="" kind="subtitles" srclang="es">

The example above declares that the language of the timed text file is Spanish.


The kind attribute:

This enumerated attribute specifies the type of text content being added to the media file. It takes one of the values described below:

– subtitles:
Subtitles are generally transcriptions or translations of the dialogue in the audio or video file. They are meant for users who can hear the audio but do not understand its language, so that they can follow the dialogue by reading it in a language they know. You must declare the language of the track by adding a suitable BCP 47 language tag to the srclang attribute:

<track src="" kind="subtitles" srclang="en">

– captions:
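Captions are transcriptions of the dialogue, sound effects and other relevant audio information. They are meant to be used when the sound is unclear, unavailable or inaudible, for example because the user is deaf or the video is muted. Below is a simple example: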

<track src="" kind="captions">

– descriptions:
Descriptions, as the name indicates, are textual descriptions of the video content. They are appropriate when the video component is unavailable or the user is blind. Timed text tracks specified as descriptions are meant to be synthesized as a separate audio track. Below is an example:

<track src="" kind="descriptions">

– chapters:
Chapter titles are meant to be used for navigating the video. Timed tracks marked as chapters are usually displayed as a (potentially nested) list in the user agent’s interface.

<track src="" kind="chapters">

– metadata:
This kind is intended for scripting or other non-visual information. It is not normally displayed in the video player. Timed tracks marked as metadata are intended for use from a script, typically JavaScript. Here’s an example:

<track src="" kind="metadata">
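
To make the scripting use case concrete, here is a minimal sketch of reading metadata cues from JavaScript. It assumes each cue’s text holds a JSON payload, which is a convention chosen for this illustration and not something the track element requires. The helper name readActiveMetadata is hypothetical; activeCues, mode and the cuechange event are part of the browser’s TextTrack API.

```javascript
// Parse the JSON payload of every currently active cue.
// `textTrack` is any object with an `activeCues` list whose items have a
// `text` field, which matches the shape of a browser TextTrack.
function readActiveMetadata(textTrack) {
  const payloads = [];
  for (const cue of Array.from(textTrack.activeCues || [])) {
    try {
      payloads.push(JSON.parse(cue.text));
    } catch (err) {
      // Skip cues whose text is not valid JSON (assumed payload format).
    }
  }
  return payloads;
}

// Browser-only wiring; skipped when no DOM is available (for example in Node).
if (typeof document !== "undefined") {
  const trackElement = document.querySelector('track[kind="metadata"]');
  if (trackElement) {
    const textTrack = trackElement.track;
    // "hidden" loads the cues and fires events without rendering anything.
    textTrack.mode = "hidden";
    textTrack.addEventListener("cuechange", () => {
      console.log(readActiveMetadata(textTrack));
    });
  }
}
```

Keeping the parsing in a plain function means it can be exercised with a simple stand-in object, without a video player.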


The label attribute:

The label attribute defines a user-readable title for the text track. User agents use it when listing caption, subtitle and audio description tracks in their interface. If the track element has a label attribute, its value must not be the empty string. When no usable label is provided, the user agent may automatically generate one, such as “untitled”.

<track src="" kind="subtitles" srclang="en" label="English_subtitles">

As you can see in the above example, “English_subtitles” is the value of the label attribute.
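
The rules covered so far for src, kind, srclang and label can be collected into a small checker. This is only a sketch: validateTrack and the file name in the usage example are hypothetical, and treating a missing kind as “subtitles” follows the HTML specification’s default, which this post does not cover.

```javascript
// The five kind values described in this post.
const TRACK_KINDS = ["subtitles", "captions", "descriptions", "chapters", "metadata"];

// Return a list of problems found in a track's attributes.
// `attrs` is a plain object such as { src, kind, srclang, label }.
function validateTrack(attrs) {
  const problems = [];
  // Assumption from the HTML spec: an omitted kind is treated as "subtitles".
  const kind = attrs.kind === undefined ? "subtitles" : attrs.kind;

  if (!attrs.src) {
    problems.push("src is required");
  }
  if (!TRACK_KINDS.includes(kind)) {
    problems.push(`unknown kind: ${kind}`);
  }
  if (kind === "subtitles" && !attrs.srclang) {
    problems.push("srclang is required when kind is subtitles");
  }
  if (attrs.label === "") {
    problems.push("label must not be the empty string");
  }
  return problems;
}

// Usage with a hypothetical file name: subtitles without srclang are flagged.
console.log(validateTrack({ src: "subtitles.vtt", kind: "subtitles" }));
```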


The default attribute:

The default attribute is a Boolean attribute that marks a track as the default track. It instructs the user agent to enable that track unless the user’s preferences indicate that another track would be more suitable. Needless to say, you can add default to only one track element within the same parent node. In the example below, the English subtitle track is specified as the default track.

<track kind="subtitles" src="" srclang="hi">
<track kind="subtitles" src="" srclang="en" default>
<track kind="subtitles" src="" srclang="es">
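
The selection behavior described above can be sketched in a few lines. This is a deliberately simplified model, not the algorithm real user agents implement: the function name pickInitialTrack and the single preferredLanguage parameter are assumptions made for illustration.

```javascript
// Choose which track to enable first.
// A track matching the user's preferred language wins; otherwise the track
// marked `default` is used; otherwise no track is enabled (null).
function pickInitialTrack(tracks, preferredLanguage) {
  if (preferredLanguage) {
    const preferred = tracks.find((t) => t.srclang === preferredLanguage);
    if (preferred) return preferred;
  }
  return tracks.find((t) => t.default === true) || null;
}

// The three tracks from the example above, as plain objects.
const tracks = [
  { kind: "subtitles", srclang: "hi" },
  { kind: "subtitles", srclang: "en", default: true },
  { kind: "subtitles", srclang: "es" },
];

console.log(pickInitialTrack(tracks).srclang);       // no preference: the default track
console.log(pickInitialTrack(tracks, "es").srclang); // a Spanish preference overrides it
```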


SEO Advantages of the track Element

Some of the key benefits of adding text tracks to your media files are:

Deep Linking:

When a user searches for a phrase that is mentioned within a video, search engines can return results pointing to that specific part of the video. Deep linking makes your videos faster and easier to discover, which in turn brings more traffic to your site.

Improves Your Online Presence:

Because search engines crawl text content, placing a transcript on your video web page can significantly strengthen your search presence.

User Experience & Accessibility:

Subtitles and captions in multimedia items not only improve the online experience, but also make your content accessible to people with disabilities.

Thumbnail in Search Results:

A video page can be displayed as a rich snippet with a video thumbnail in search engine result pages, which can increase click-through rate and organic traffic.

Increased Engagement:

While transcripts increase user engagement, captions increase a video’s completion rate by 40%. Both user experience and engagement are crucial, as Google rewards videos that have longer “watch time”.

See the track Element in Action

To see a demo of the track element, visit the following links:
– Track Example at HTML5 Demos
– IE’s Video Caption Demo Page