Alternate Audio Tracks

One of the features of Unified Origin is the ability to create all possible combinations of audio, video and subtitles on request. Similar to subtitles in multiple languages, you may also have alternate audio tracks:

  • Multiple languages (English, Spanish, French).

  • Multiple codecs (AAC, DTS, Dolby Digital, FLAC).

  • Multiple bitrates (Adaptive Bitrate).

Note

Adaptive Bitrate for audio is generally only useful for audio-only presentations (e.g. radio, podcasts, events). This works for both VOD and Live streams.

Adding audio in multiple languages

Let's create a presentation with video available in four bitrates and audio content available in three languages (English, Italian and German).

tears-of-steel-avc1-1500k.ismv

AVC encoded video track at 1500 kbits/second

tears-of-steel-avc1-1000k.ismv

AVC encoded video track at 1000 kbits/second

tears-of-steel-avc1-750k.ismv

AVC encoded video track at 750 kbits/second

tears-of-steel-avc1-400k.ismv

AVC encoded video track at 400 kbits/second

tears-of-steel-aac-128k.isma

AAC encoded audio track at 128 kbits/second, English

tears-of-steel-aac-128k-it.isma

Italian audio dummy track for example purposes (not part of the VODPack)

tears-of-steel-aac-128k-de.isma

German audio dummy track for example purposes (not part of the VODPack)

Important

It is important that the track metadata stored in the audio and video files is accurate. For audio tracks, for example, it is vital that the 'language' is correctly signaled. Preferably the metadata in the tracks themselves is correct, but it is also possible to override it, as described in 'Selecting specific tracks from an input file'.

Alternate audio tracks are simply added to the list of inputs on the command line when creating the server manifest file.

#!/bin/bash

mp4split -o presentation.ism \
  --hds.client_manifest_version=2 \
  --hls.client_manifest_version=4 \
  tears-of-steel-avc1-1500k.ismv \
  tears-of-steel-avc1-1000k.ismv \
  tears-of-steel-avc1-750k.ismv \
  tears-of-steel-avc1-400k.ismv \
  tears-of-steel-aac-128k.isma \
  tears-of-steel-aac-128k-it.isma \
  tears-of-steel-aac-128k-de.isma
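
The language signaling mentioned above can be checked before the server manifest is created. The following is a minimal sketch, assuming FFmpeg's ffprobe is installed. The helper names audio_language and check_language are hypothetical, and the expected language code is supplied by the caller:

```shell
#!/bin/bash

# Print the language tag of the first audio stream in a file.
# Assumption: ffprobe (part of FFmpeg) is available on the PATH.
audio_language() {
  ffprobe -v error -select_streams a:0 \
    -show_entries stream_tags=language \
    -of default=noprint_wrappers=1:nokey=1 "$1"
}

# Compare the signaled language with the expected one.
# Returns 0 on a match, 1 otherwise.
check_language() {
  local file=$1 expected=$2 actual
  actual=$(audio_language "$file")
  if [ "$actual" = "$expected" ]; then
    echo "OK: $file is tagged '$actual'"
  else
    echo "MISMATCH: $file is tagged '$actual', expected '$expected'"
    return 1
  fi
}
```

For example, check_language tears-of-steel-aac-128k-it.isma ita would report a mismatch if the Italian track were not tagged as such. The code 'ita' is an assumption here; the value actually stored depends on how the file was written.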

Adding audio using multiple codecs

Let's create a presentation with video available in four bitrates and audio content available in the formats AAC, DTS and Dolby Digital.

tears-of-steel-avc1-1500k.ismv

AVC encoded video track at 1500 kbits/second

tears-of-steel-avc1-1000k.ismv

AVC encoded video track at 1000 kbits/second

tears-of-steel-avc1-750k.ismv

AVC encoded video track at 750 kbits/second

tears-of-steel-avc1-400k.ismv

AVC encoded video track at 400 kbits/second

tears-of-steel-aac-128k.isma

AAC encoded audio track at 128 kbits/second, English

tears-of-steel-ac3.isma

The audio track in Dolby Digital

tears-of-steel-dts-384k.isma

The audio track in DTS

#!/bin/bash

mp4split -o presentation.ism \
  --hds.client_manifest_version=2 \
  --hls.client_manifest_version=4 \
  tears-of-steel-avc1-1500k.ismv \
  tears-of-steel-avc1-1000k.ismv \
  tears-of-steel-avc1-750k.ismv \
  tears-of-steel-avc1-400k.ismv \
  tears-of-steel-aac-128k.isma \
  tears-of-steel-ac3.isma \
  tears-of-steel-dts-384k.isma

Using FLAC

FLAC encoded audio must be wrapped in an MP4 file (which in turn can be converted to, for instance, CMAF). Using FFmpeg this looks as follows:

#!/bin/bash

ffmpeg -i input.flac -vn -c:a copy -strict -2 output.mp4 -y

The resulting MP4 can be used 'as is' by Unified Origin and made available through a URL to an MPEG-DASH player (see for instance MSE-Toolbox):

https://example.com/output.mp4/.mpd

Alternatively, the FLAC audio MP4 can be added to the server manifest, as demonstrated above:

#!/bin/bash

mp4split -o presentation.ism \
  output.mp4

In the case of HLS, fragmented MP4 (fMP4) should be used for the delivery format, not TS (as stated in the Apple HLS Authoring Specification, 2.1):

#!/bin/bash

mp4split -o output.cmfa \
  output.mp4

mp4split -o presentation.ism \
  --hls.fmp4 \
  output.cmfa

The URL to use then becomes the following:

https://example.com/presentation.ism/.m3u8

FLAC audio samples can be found, for instance, in the 2L HiRes test bench.

Adding director's commentary

Let's create a presentation with video available in four bitrates and audio content available in three languages (English, Italian and German). Add to that an additional audio track with the Director's commentary (available in English only).

tears-of-steel-avc1-1500k.ismv

AVC encoded video track at 1500 kbits/second

tears-of-steel-avc1-1000k.ismv

AVC encoded video track at 1000 kbits/second

tears-of-steel-avc1-750k.ismv

AVC encoded video track at 750 kbits/second

tears-of-steel-avc1-400k.ismv

AVC encoded video track at 400 kbits/second

tears-of-steel-aac-128k.isma

AAC encoded audio track at 128 kbits/second, English

tears-of-steel-aac-128k-it.isma

Italian audio dummy track for example purposes (not part of the VODPack)

tears-of-steel-aac-128k-de.isma

German audio dummy track for example purposes (not part of the VODPack)

tears-of-steel-aac-128k-commentary.isma

Commentary audio dummy track for example purposes (not part of the VODPack)

#!/bin/bash

mp4split -o presentation.ism \
  --hds.client_manifest_version=2 \
  --hls.client_manifest_version=4 \
  tears-of-steel-avc1-1500k.ismv \
  tears-of-steel-avc1-1000k.ismv \
  tears-of-steel-avc1-750k.ismv \
  tears-of-steel-avc1-400k.ismv \
  tears-of-steel-aac-128k.isma \
  tears-of-steel-aac-128k-it.isma \
  tears-of-steel-aac-128k-de.isma \
  tears-of-steel-aac-128k-commentary.isma --track_role=commentary

Alternate audio for MPEG-DASH

Representations are arranged into Adaptation Sets. To allow for seamless switching between Representations in an Adaptation Set, Representations are grouped in the same Adaptation Set if, and only if, they have identical values for the following properties:

  • the language as described by the @lang attribute.

  • the Role element.

  • the @codecs attribute.

  • the @audioSamplingRate attribute.
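
The grouping rule above can be illustrated with a small sketch that buckets tracks by the four properties. This is not how Unified Origin is implemented; the function name group_adaptation_sets, the track names and the attribute values are hypothetical sample data:

```shell
#!/bin/bash

# Print one line per Adaptation Set: the shared key, then its members.
# Input: one track per line as "name lang role codecs sampling_rate".
# Bitrate is deliberately not part of the key.
group_adaptation_sets() {
  awk '{
    key = $2 " " $3 " " $4 " " $5
    if (key in members) {
      members[key] = members[key] "," $1
    } else {
      members[key] = $1
    }
  }
  END {
    for (key in members) print key " -> " members[key]
  }' | LC_ALL=C sort
}

group_adaptation_sets <<'EOF'
aac-128k-en en main mp4a.40.2 48000
aac-64k-en en main mp4a.40.2 48000
aac-128k-it it main mp4a.40.2 48000
aac-128k-commentary en commentary mp4a.40.2 48000
ac3-en en main ac-3 48000
EOF
```

In this sample the two English AAC tracks end up in the same set, since they differ only in bitrate, while the Italian, commentary and Dolby Digital tracks each form a set of their own.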

Alternate audio for HTTP Live Streaming (HLS)

Important

Alternate audio for HLS requires at least version 4 of the protocol. Make sure to set this using the --hls.client_manifest_version option.

When using alternate audio, for instance different languages such as English and German, it is mandatory to have the language tracks in the same bitrate.

This is required in HLS v4 to create correct groups of audio tracks, which in turn allows the player to present the language selection option in the UI.

If you have more tracks, say two audio bitrates in two languages, you will need four audio tracks. The manifest will present two groups and the player will select the better quality while maintaining the language selection option in the UI.
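
To make the track count concrete, the sketch below lists the audio tracks needed when every language has to be available in every bitrate. The function name required_audio_tracks and the file naming scheme are hypothetical:

```shell
#!/bin/bash

# Print one hypothetical file name per required audio track:
# every language must be available in every bitrate.
required_audio_tracks() {
  local languages=$1 bitrates=$2 lang rate
  for rate in $bitrates; do
    for lang in $languages; do
      echo "audio-${rate}k-${lang}.isma"
    done
  done
}

# Two languages in two bitrates: four tracks, one group per bitrate.
required_audio_tracks "en de" "64 128"
```

Each bitrate corresponds to one group in the manifest, and each group contains the same set of languages.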

HLS Alternate Audio for older devices

In some situations your client may be bound to an older client manifest version without support for alternate audio tracks. Let's assume you have packaged a video asset with English, Spanish and German audio. Alternate audio tracks can then be requested in HLS by adding the 'tracks' query parameter to your request URL:

URL to the media presentation

Description

http://localhost/video/video.ism/video.m3u8?tracks=audio_eng,video_eng

Selects audio in English and video in English.

http://localhost/video/video.ism/video.m3u8?tracks=audio_spa,video_spa

Selects audio in Spanish and video in Spanish.

http://localhost/video/video.ism/video.m3u8?tracks=audio_ger,video_ger

Selects audio in German and video in German.

http://localhost/video/video.ism/video.m3u8?tracks=audio_spa,video_eng

Selects audio in Spanish and video in English.
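
The URLs above all follow the same pattern, which can be sketched as a small helper. The function name hls_tracks_url is hypothetical; the base URL and language codes are taken from the examples above:

```shell
#!/bin/bash

# Build a playlist URL that selects one audio and one video language.
hls_tracks_url() {
  local base=$1 audio=$2 video=$3
  echo "${base}?tracks=audio_${audio},video_${video}"
}

# Spanish audio with English video.
hls_tracks_url http://localhost/video/video.ism/video.m3u8 spa eng
```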

Alternate audio for HTTP Dynamic Streaming (HDS)

Important

Alternate audio for HDS requires at least version 2 of the protocol. Make sure to set this using the --hds.client_manifest_version option.

Please note that HDS does not support identical audio tracks in different bitrates.

Alternate audio for HTTP Smooth Streaming (HSS)

Important

It's not possible to mix mono and stereo audio with HTTP Smooth Streaming. Audio tracks should either be all mono or all stereo.
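
A pre-flight check for this restriction could look like the sketch below, assuming FFmpeg's ffprobe is installed. The helper names audio_channels and same_channel_layout are hypothetical, and comparing plain channel counts is a simplification of the mono/stereo rule:

```shell
#!/bin/bash

# Print the channel count of the first audio stream (1 = mono, 2 = stereo).
# Assumption: ffprobe (part of FFmpeg) is available on the PATH.
audio_channels() {
  ffprobe -v error -select_streams a:0 \
    -show_entries stream=channels \
    -of default=noprint_wrappers=1:nokey=1 "$1"
}

# Succeed only if all given files report the same channel count.
same_channel_layout() {
  local first current file
  first=$(audio_channels "$1")
  for file in "$@"; do
    current=$(audio_channels "$file")
    if [ "$current" != "$first" ]; then
      echo "Mismatch: $file has $current channel(s), expected $first"
      return 1
    fi
  done
  echo "All tracks have $first channel(s)"
}
```

It could be run against the audio inputs before they are passed to mp4split, for example same_channel_layout tears-of-steel-aac-128k.isma tears-of-steel-aac-128k-it.isma.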

Please note that HSS does not support identical audio tracks in different bitrates.