Alternate Audio Tracks

One of the features of Unified Origin is the ability to create all possible combinations of audio, video and subtitles on request. Similar to subtitles in multiple languages, you may also have alternate audio tracks:

Multiple languages (English, Spanish, French).

Multiple codecs (AAC, DTS, Dolby Digital, FLAC).

Multiple bitrates (Adaptive Bitrate).

Note

Adaptive Bitrate for audio is generally only useful for audio-only presentations (e.g. radio, podcasts or events). This works for both VOD and Live streams.

Adding audio in multiple languages 

Let's create a presentation with video available in four bitrates and audio content available in three languages (English, Italian and German).

tears-of-steel-avc1-1500k.ismv	AVC encoded video track at 1500 kbits/second
tears-of-steel-avc1-1000k.ismv	AVC encoded video track at 1000 kbits/second
tears-of-steel-avc1-750k.ismv	AVC encoded video track at 750 kbits/second
tears-of-steel-avc1-400k.ismv	AVC encoded video track at 400 kbits/second
tears-of-steel-aac-128k.isma	AAC encoded audio track at 128 kbits/second, English
tears-of-steel-aac-128k-it.isma	Italian audio dummy track for example purposes (not part of the VODPack)
tears-of-steel-aac-128k-de.isma	German audio dummy track for example purposes (not part of the VODPack)

Important

It is important that the metadata stored in the audio and video files about each track is accurate. For example, for the audio tracks it is vital that the 'language' is signaled correctly. Ideally the metadata in the tracks themselves is correct, but it can also be overridden as described in Selecting specific tracks from an input file.

Alternate audio tracks are simply added to the list of inputs on the command line when creating the server manifest file.

#!/bin/bash

mp4split -o presentation.ism \
  --hds.client_manifest_version=2 \
  --hls.client_manifest_version=4 \
  tears-of-steel-avc1-1500k.ismv \
  tears-of-steel-avc1-1000k.ismv \
  tears-of-steel-avc1-750k.ismv \
  tears-of-steel-avc1-400k.ismv \
  tears-of-steel-aac-128k.isma \
  tears-of-steel-aac-128k-it.isma \
  tears-of-steel-aac-128k-de.isma
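Because every track is passed as a plain file argument, a mistyped or missing file name is an easy mistake to make. The following is a minimal pre-flight sketch, not part of mp4split itself: the helper name missing_inputs is hypothetical, and it only checks that each input is a readable regular file before you run the command above.

```shell
#!/bin/bash

# Hypothetical helper: print every argument that is not a readable
# regular file. Prints nothing when all inputs are present.
missing_inputs() {
  for f in "$@"; do
    if [ ! -f "$f" ] || [ ! -r "$f" ]; then
      printf '%s\n' "$f"
    fi
  done
}

# Check the inputs used in the example above.
missing=$(missing_inputs \
  tears-of-steel-avc1-1500k.ismv \
  tears-of-steel-avc1-1000k.ismv \
  tears-of-steel-avc1-750k.ismv \
  tears-of-steel-avc1-400k.ismv \
  tears-of-steel-aac-128k.isma \
  tears-of-steel-aac-128k-it.isma \
  tears-of-steel-aac-128k-de.isma)

if [ -n "$missing" ]; then
  echo "Missing inputs:"
  echo "$missing"
else
  echo "All inputs present"
fi
```

This check says nothing about the language metadata inside the files; it only guards against absent inputs.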

Adding audio using multiple codecs 

Let's create a presentation with video available in four bitrates and audio content available in the formats AAC, DTS and Dolby Digital. Explicit variant sets are specified for HLS, as recommended in our best practices (see Should Fix: Configure --variant_set when offering stereo and multi-channel audio). This affects HLS only.

tears-of-steel-avc1-1500k.ismv	AVC encoded video track at 1500 kbits/second
tears-of-steel-avc1-1000k.ismv	AVC encoded video track at 1000 kbits/second
tears-of-steel-avc1-750k.ismv	AVC encoded video track at 750 kbits/second
tears-of-steel-avc1-400k.ismv	AVC encoded video track at 400 kbits/second
tears-of-steel-aac-128k.isma	AAC encoded audio track at 128 kbits/second, English
tears-of-steel-ac3.isma	The audio track in Dolby Digital
tears-of-steel-dts-384k.isma	The audio track in DTS

#!/bin/bash

mp4split -o presentation.ism \
  --hds.client_manifest_version=2 \
  --hls.client_manifest_version=4 \
  --variant_set='type!="audio"||Channels<=2' \
  --variant_set='type!="audio"||Channels>2' \
  tears-of-steel-avc1-1500k.ismv \
  tears-of-steel-avc1-1000k.ismv \
  tears-of-steel-avc1-750k.ismv \
  tears-of-steel-avc1-400k.ismv \
  tears-of-steel-aac-128k.isma \
  tears-of-steel-ac3.isma \
  tears-of-steel-dts-384k.isma
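To make it easier to see which tracks each of the two --variant_set expressions admits, the sketch below applies the same two conditions to a small track list using awk. It does not call mp4split. The channel counts (2 for AAC, 6 for Dolby Digital and DTS) are assumptions made for illustration, not values read from the files, and the function names are hypothetical.

```shell
#!/bin/bash

# Hypothetical track list: file name, track type, channel count.
# The channel counts are assumed for illustration only.
tracks='tears-of-steel-avc1-1500k.ismv video 0
tears-of-steel-aac-128k.isma audio 2
tears-of-steel-ac3.isma audio 6
tears-of-steel-dts-384k.isma audio 6'

# Tracks admitted by: type!="audio" || Channels<=2
stereo_set() {
  printf '%s\n' "$tracks" | awk '$2 != "audio" || $3 <= 2 { print $1 }'
}

# Tracks admitted by: type!="audio" || Channels>2
multichannel_set() {
  printf '%s\n' "$tracks" | awk '$2 != "audio" || $3 > 2 { print $1 }'
}

echo "Stereo variant set:"
stereo_set
echo "Multi-channel variant set:"
multichannel_set
```

Under these assumptions the video track appears in both sets, the AAC track only in the first, and the Dolby Digital and DTS tracks only in the second.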

Using FLAC

FLAC-encoded audio must first be wrapped in an MP4 file (which in turn can be converted to, for instance, CMAF). Using FFmpeg, this looks as follows:

#!/bin/bash

ffmpeg -y -i input.flac -vn -c:a copy -strict -2 output.mp4

The resulting MP4 can be used 'as is' by Unified Origin and made available through a URL to an MPEG-DASH player (see for instance MSE-Toolbox):

https://example.com/output.mp4/.mpd

Or the FLAC audio MP4 can be added to the server manifest, similar to the examples above:

#!/bin/bash

mp4split -o presentation.ism \
  output.mp4

In the case of HLS, fragmented MP4 (fMP4) should be used for the delivery format, not TS (as stated in the Apple HLS Authoring Specification, 2.1):

#!/bin/bash

mp4split -o output.cmfa \
  output.mp4

mp4split -o presentation.ism \
  --hls.fmp4 \
  output.cmfa

The URL to use then becomes the following:

https://example.com/presentation.ism/.m3u8

A FLAC audio sample can be found, for instance, in the 2L HiRes test bench.

Adding director's commentary 

Let's create a presentation with video available in four bitrates and audio content available in three languages (English, Italian and German). In addition, add an audio track with the director's commentary (available in English only).

tears-of-steel-avc1-1500k.ismv	AVC encoded video track at 1500 kbits/second
tears-of-steel-avc1-1000k.ismv	AVC encoded video track at 1000 kbits/second
tears-of-steel-avc1-750k.ismv	AVC encoded video track at 750 kbits/second
tears-of-steel-avc1-400k.ismv	AVC encoded video track at 400 kbits/second
tears-of-steel-aac-128k.isma	AAC encoded audio track at 128 kbits/second, English
tears-of-steel-aac-128k-it.isma	Italian audio dummy track for example purposes (not part of the VODPack)
tears-of-steel-aac-128k-de.isma	German audio dummy track for example purposes (not part of the VODPack)
tears-of-steel-aac-128k-commentary.isma	Commentary audio dummy track for example purposes (not part of the VODPack)

#!/bin/bash
mp4split -o presentation.ism \
  --hds.client_manifest_version=2 \
  --hls.client_manifest_version=4 \
  tears-of-steel-avc1-1500k.ismv \
  tears-of-steel-avc1-1000k.ismv \
  tears-of-steel-avc1-750k.ismv \
  tears-of-steel-avc1-400k.ismv \
  tears-of-steel-aac-128k.isma \
  tears-of-steel-aac-128k-it.isma \
  tears-of-steel-aac-128k-de.isma \
  tears-of-steel-aac-128k-commentary.isma --track_role=commentary

Alternate audio for MPEG-DASH 

Representations are arranged into Adaptation Sets. To allow seamless switching between Representations in an Adaptation Set, Representations are grouped in the same Adaptation Set if, and only if, they have identical values for the following properties:

the language as described by the @lang attribute.

the Role element.

the @codecs attribute.

the @audioSamplingRate attribute.
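The grouping rule above can be illustrated with a small sketch that buckets audio Representations by the four properties. This is not how Unified Origin is implemented; the representation ids, languages, roles and codec strings below are hypothetical sample data, and the function name is made up for this example.

```shell
#!/bin/bash

# Hypothetical audio representations:
# id, @lang, Role, @codecs, @audioSamplingRate.
# a1 and a2 are assumed to differ only in bitrate.
reps='a1 en main mp4a.40.2 48000
a2 en main mp4a.40.2 48000
a3 de main mp4a.40.2 48000
a4 en commentary mp4a.40.2 48000
a5 en main ac-3 48000'

# Print one line per Adaptation Set:
# "<lang> <role> <codecs> <rate>: <ids>"
group_adaptation_sets() {
  printf '%s\n' "$reps" | awk '
    {
      key = $2 " " $3 " " $4 " " $5
      if (!(key in ids)) order[++n] = key   # remember first-seen order
      ids[key] = ids[key] " " $1
    }
    END { for (i = 1; i <= n; i++) print order[i] ":" ids[order[i]] }'
}

group_adaptation_sets
```

With this sample data only a1 and a2 share an Adaptation Set; a change in language, role or codec each produces a separate set, giving four sets in total.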

Alternate audio for HTTP Live Streaming (HLS)

Important

Alternate audio for HLS requires at least version 4 of the protocol. Make sure to set this using the --hls.client_manifest_version option.

When using alternate audio, for instance different languages such as English and German, the language tracks must have the same bitrate.

This is required in HLS v4 to create correct groups of audio tracks, which in turn allows the player to offer the language selection option in the UI.

If you have more 'tracks', say two audio bitrates in two languages, you will need four audio tracks. The manifest will present two groups and the player will select the better quality while maintaining the language selection option in the UI.
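The "two bitrates in two languages means four tracks" arithmetic can be sketched by enumerating every bitrate and language combination. The file naming pattern, the title "example" and the function name below are hypothetical and only serve to show that each bitrate needs one track per language.

```shell
#!/bin/bash

# Hypothetical naming pattern: <title>-aac-<bitrate>k-<lang>.isma
# Prints one file name per bitrate/language combination.
list_audio_tracks() {
  title="$1"
  langs="$2"
  bitrates="$3"
  for rate in $bitrates; do
    for lang in $langs; do
      printf '%s-aac-%sk-%s.isma\n' "$title" "$rate" "$lang"
    done
  done
}

# Two languages at two bitrates: four tracks, two per bitrate.
list_audio_tracks example "en de" "64 128"
```

The output lists four files: the two 64k tracks form one group and the two 128k tracks form the other, so each group offers both languages.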

HLS Alternate Audio for older devices

In some situations your client may be bound to an older client manifest version without support for alternate audio tracks. Let's assume you have packaged a video asset with English, Spanish and German audio. Alternate audio tracks can then be requested in HLS by adding the following parameters to your request URL:

URL to the media presentation	Description
http://localhost/video/video.ism/video.m3u8?tracks=audio_eng,video_eng	Selects audio in English and video in English.
http://localhost/video/video.ism/video.m3u8?tracks=audio_spa,video_spa	Selects audio in Spanish and video in Spanish.
http://localhost/video/video.ism/video.m3u8?tracks=audio_ger,video_ger	Selects audio in German and video in German.
http://localhost/video/video.ism/video.m3u8?tracks=audio_spa,video_eng	Selects audio in Spanish and video in English.
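
The URLs in the table all follow the same pattern, so they can be generated rather than typed by hand. A minimal sketch, assuming the same base URL and track names as in the table; the function name tracks_url is hypothetical.

```shell
#!/bin/bash

# Build a playlist URL that selects one audio and one video track.
# Arguments: base URL, audio language code, video language code.
tracks_url() {
  base="$1"
  audio="$2"
  video="$3"
  printf '%s?tracks=audio_%s,video_%s\n' "$base" "$audio" "$video"
}

# Spanish audio with English video, as in the last row of the table.
tracks_url http://localhost/video/video.ism/video.m3u8 spa eng
```

This prints http://localhost/video/video.ism/video.m3u8?tracks=audio_spa,video_eng.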

Alternate audio for HTTP Dynamic Streaming (HDS)

Important

Alternate audio for HDS requires at least version 2 of the protocol. Make sure to set this using the --hds.client_manifest_version option.

Please note that HDS does not support identical audio tracks in different bitrates.

Alternate audio for HTTP Smooth Streaming (HSS)

Important

It's not possible to mix mono and stereo audio with HTTP Smooth Streaming. Audio tracks should either be all mono or all stereo.

Please note that HSS does not support identical audio tracks in different bitrates.
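
The mono/stereo restriction can be checked before packaging. The sketch below assumes you already know the channel count of each audio track (for example from your encoder settings); it does not inspect any files, and the function name same_channel_layout is hypothetical.

```shell
#!/bin/bash

# Succeeds (exit status 0) when all given channel counts are identical,
# e.g. all 1 (mono) or all 2 (stereo). Fails on the first mismatch.
same_channel_layout() {
  first="$1"
  for count in "$@"; do
    [ "$count" = "$first" ] || return 1
  done
  return 0
}

if same_channel_layout 2 2 2; then
  echo "uniform channel layout"
fi
if ! same_channel_layout 2 1 2; then
  echo "mixed mono and stereo: not suitable for HSS"
fi
```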
