Adaptive Bitrate (ABR) Streaming

Adaptive bitrate (ABR) streaming uses a source video format that is encoded at multiple bitrates. The most commonly used video codecs are H.264/AVC and H.265/HEVC. The most commonly used audio codec is AAC. See the Factsheet for an overview of supported codecs and formats.

Since a presentation consists of many different assets (audio, video, subtitles), we use a manifest file that describes all of these assets.

Creation of a server manifest file

The server manifest is used by the webserver module. It holds information about the available streams, DRM options, etc. The webserver module uses the information in the server manifest file for creating the different client manifests / playlists and applying any selected encryption.

When you use an encoder (see the Factsheet for an overview of supported encoders) to produce VOD Smooth Streaming presentations, you always have to re-create the server manifest file, because USP stores additional information in it that may not be present in the server manifest file generated by the encoder.

It is not necessary to create the playlists (.ismc, .m3u8, .mpd, .f4m). All these files are created on-the-fly by the webserver module.

The server manifest file comes in two flavours. A server manifest filename ending with the extension .ism is used for VOD presentations. For LIVE presentations the extension .isml is used (note the additional character l for live).
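As a small illustration of this naming convention, the sketch below derives the server manifest filename from a presentation name and a mode. The helper name manifest_name and the mode keywords vod and live are hypothetical and not part of USP; only the .ism and .isml extensions come from the text above.

```shell
#!/bin/bash

# Hypothetical helper: print the server manifest filename for a presentation.
#   $1 = presentation name, $2 = "vod" or "live"
manifest_name() {
  case "$2" in
    vod)  printf '%s.ism\n'  "$1" ;;
    live) printf '%s.isml\n' "$1" ;;
    *)    echo "unknown mode: $2" >&2; return 1 ;;
  esac
}
```

For example, manifest_name tears-of-steel vod prints tears-of-steel.ism.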

Using fragmented-MP4 as source

Fragmented-MP4 is a variant of the ISO base media file format. It is similar to a progressive MP4, but already prepared for adaptive bitrate (ABR) playout. This is the preferred format.

There are many variations, e.g. MPEG DASH, Microsoft's Smooth Streaming (Protected Interoperable File Format (PIFF)), UltraViolet's Common File Format, and Adobe's F4F.

In its simplest form the command for creating a VOD server manifest file is:

#!/bin/bash

mp4split -o tears-of-steel.ism \
  tears-of-steel.ismv

You can also specify multiple input filenames. Say you have the audio track in tears-of-steel-64k.isma and two video files, tears-of-steel-400k.ismv and tears-of-steel-800k.ismv:

#!/bin/bash

mp4split -o tears-of-steel.ism \
   tears-of-steel-64k.isma \
   tears-of-steel-400k.ismv \
   tears-of-steel-800k.ismv
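When a presentation has many tracks, the input list can be assembled from the files in a directory. The following is a dry-run sketch only: it prints the mp4split command instead of running it. The function name build_manifest_cmd is hypothetical, and it assumes that all audio (.isma) and video (.ismv) files of one presentation live in the same directory.

```shell
#!/bin/bash

# Hypothetical dry-run helper: print the mp4split command that would
# create <name>.ism from every .isma and .ismv file found in <dir>.
#   $1 = directory holding the source files, $2 = presentation name
build_manifest_cmd() (
  dir=$1
  name=$2
  # Start the argument list with the command and the output option.
  set -- mp4split -o "$name.ism"
  for f in "$dir"/*.isma "$dir"/*.ismv; do
    # Skip patterns that did not match any file.
    if [ -e "$f" ]; then
      set -- "$@" "$f"
    fi
  done
  printf '%s\n' "$*"
)
```

Review the printed command before running it; filenames containing spaces would need quoting.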

See Input track selection and editing for more advanced options.

The video files do not necessarily have to be stored locally. The webserver is also able to fetch the media data from any HTTP server (e.g. S3, Azure). Just specify the input files using fully qualified URLs:

#!/bin/bash

mp4split -o tears-of-steel.ism \
  http://storage.server/tears-of-steel/tears-of-steel-64k.isma \
  http://storage.server/tears-of-steel/tears-of-steel-400k.ismv \
  http://storage.server/tears-of-steel/tears-of-steel-800k.ismv

See Remote Storage for more information.

Using progressive-MP4 as source

You can also use progressive-MP4 files as input. Progressive MP4 files have a single index and require more work and memory by the Origin to process. For very large files it is recommended to use fragmented-MP4 instead.

When all the tracks (bitrates) are contained in one MP4 file, no server manifest is needed. When the tracks (bitrates) are stored in individual MP4 files, a server manifest must be created to indicate which files belong together:

#!/bin/bash

mp4split -o tears-of-steel.ism \
  tears-of-steel-64k.mp4 \
  tears-of-steel-400k.mp4 \
  tears-of-steel-800k.mp4

Using HTTP Live Streaming (HLS) as source

New in version 1.7.2.

The Origin ingests HTTP Live Streaming presentations. The .m3u8 playlist is used for creating the timeline and determining the URLs of the media segments.

Create a server manifest file with the URLs of the media playlists as input:

#!/bin/bash

mp4split -o hls.ism \
  http://demo.unified-streaming.com/video/tears-of-steel/tears-of-steel.ism/tears-of-steel-audio_eng=127997.m3u8 \
  http://demo.unified-streaming.com/video/tears-of-steel/tears-of-steel.ism/tears-of-steel-video_eng=2997000.m3u8

MP4Split fetches the playlists and extracts all the information necessary to create the server manifest file.

Requirements

General requirements from HTTP Live Streaming section 3 Media segments, specifically:

  • Transport Stream segments MUST contain a single MPEG-2 program.
  • There MUST be a Program Association Table (PAT) and a Program Map Table (PMT) at the start of each segment.

Additionally,

  • A media segment that contains video MUST start with an Instantaneous Decoder Refresh (IDR) coded picture and enough information to completely initialize a video decoder.
  • Media segments MUST be decodable without information from other segments; the EXT-X-INDEPENDENT-SEGMENTS setting must be used. For example, the PES payload must not spill into the next segment.
  • The duration value of the EXTINF tag MUST be accurate.
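Some of these requirements can be spot-checked on a local copy of a media playlist before ingest. The sketch below is an assumption-laden illustration: the function names are hypothetical, and it only inspects the playlist text. It reports whether EXT-X-INDEPENDENT-SEGMENTS is declared and sums the EXTINF durations so the total can be compared with the known duration of the content; it cannot prove that the individual EXTINF values are accurate.

```shell
#!/bin/bash

# Returns 0 when the playlist declares EXT-X-INDEPENDENT-SEGMENTS.
#   $1 = path to a local .m3u8 file
has_independent_segments() {
  grep -q '^#EXT-X-INDEPENDENT-SEGMENTS' "$1"
}

# Prints the sum of all EXTINF durations in seconds (three decimals).
#   $1 = path to a local .m3u8 media playlist
total_extinf_duration() {
  awk -F'[:,]' '/^#EXTINF:/ { sum += $2 } END { printf "%.3f\n", sum }' "$1"
}
```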

Ingesting protocol version 1

  • Playlists specify media segments containing multiplexed audio and video.
  • For streams containing both audio and video, the PTS of the first audio Access Unit (AU) MUST be greater than or equal to the PTS of the first video AU.

Ingesting protocol version 4

  • Separate playlists for audio and video.
  • Use IFRAME playlists to ingest smaller duration media segments.

Technical details

HTTP Smooth Streaming (HSS) playout requires that fragments start with an IDR frame. This means we have to construct the timeline in the client manifest file based on the timestamps of the keyframes.

If the M3U8 playlist lists MPEG TS segments that always start with a keyframe (the EXT-X-INDEPENDENT-SEGMENTS setting must be used) then we can use that as video playlist.

If this is not the case (that is: the keyframes are positioned arbitrarily in the MPEG TS) then all segments need to be scanned to determine their timestamp.

For performance reasons, this cannot be done on-the-fly.

In these cases, the exact IDR boundaries can be signaled in a separate keyframe playlist (using the EXT-X-I-FRAME-STREAM-INF tag and EXT-X-BYTERANGE).

If the ingested HLS does not have a separate keyframe playlist, one can be easily generated from the original video M3U8 playlist.

For example (using Unified Packager):

#!/bin/bash

mp4split --package_hls -o iframe-playlist.m3u8 \
  --create_iframe_playlist \
  original.m3u8

mp4split -o playout.ism \
  iframe-playlist.m3u8

The above can also be used to fix inaccurate EXTINF values typically found in older HLS content (protocol versions < 4).
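To decide whether the packaging step is needed at all, a local copy of the master playlist can be searched for the keyframe playlist tag. This is a minimal sketch; the function name has_iframe_playlist is hypothetical and the check looks only at the playlist text.

```shell
#!/bin/bash

# Returns 0 when the master playlist already references a keyframe
# (I-frame) playlist through EXT-X-I-FRAME-STREAM-INF.
#   $1 = path to a local master .m3u8 file
has_iframe_playlist() {
  grep -q '^#EXT-X-I-FRAME-STREAM-INF:' "$1"
}
```

If the function returns non-zero for a stream whose segments do not always start with a keyframe, a keyframe playlist would have to be generated first.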

Creating the server manifest

For creating the (server) manifest file we access the master playlist.

The stream information (bitrate and codec information, audio attributes like samplerate and video attributes like width and height) is copied from the master playlist.

Creating the client manifest

For creating the (client) manifest file we access the following files:

  1. The master playlist.
  2. The (keyframe) media playlist.
  3. The first media segment listed in the media playlist.

The media playlist has a complete list of all the media segments and the timing information necessary to create the timeline. If EXT-X-I-FRAME-STREAM-INF playlists are available we use them; otherwise we fall back to the EXT-X-STREAM-INF playlists.

Summary

HLS ingest formats supported:

  • M3U8 master playlist protocol version 1 where segments always start with an IDR frame.
  • M3U8 master playlist protocol version 4 where segments always start with an IDR frame (possibly signaled with EXT-X-INDEPENDENT-SEGMENTS).
  • M3U8 master playlist protocol version 4 that includes EXT-X-I-FRAME-STREAM-INF for the video streams.

Audio segments do not have boundary limitations. We assume that we can segment these on any Access Unit (AU). The duration of an audio segment is selected to be an exact multiple of the timescale/samplerate.

Options for VOD and LIVE streaming

The webserver module supports the following options. If an option is preceded by iss., hls., hds. or mpd. then it is specific to that playout format.

--[iss|hls|hds|mpd].disable

Disables playout of a specific format. The default is to allow playout to all formats (HTTP Smooth Streaming, HTTP Live Streaming, HTTP Dynamic Streaming and MPEG-DASH).

--[iss|hls|hds|mpd].minimum_fragment_length

The default duration for media fragments is determined by the keyframe interval (GOP size) used in the encoded video.

For HTTP Smooth Streaming and MPEG-DASH using a closed GOP of two seconds works best. The GOP size is set by the encoder during the encoding process.

HLS is optimized for media segments with a duration of around eight seconds. If your video is encoded with a GOP length of two seconds, this option can be used to concatenate multiple GOPs into a single media segment. Setting minimum_fragment_length=8 instructs the Origin to concatenate four two-second GOPs into a single media fragment. An example:

#!/bin/bash

mp4split -o video.ism \
  --hls.client_manifest_version=3 \
  --hls.minimum_fragment_length=8 \
  video.ism
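The relation between GOP length and minimum fragment length can be worked out with a little arithmetic. The sketch below assumes that whole GOPs are concatenated until the fragment is at least as long as the requested minimum; this reading of the option and the function name gops_per_fragment are assumptions, not a description of the Origin's internals.

```shell
#!/bin/bash

# Hypothetical helper: number of whole GOPs needed to reach the minimum
# fragment length, i.e. the smallest n with n * gop >= min.
#   $1 = GOP duration in seconds, $2 = minimum fragment length in seconds
gops_per_fragment() {
  awk -v gop="$1" -v min="$2" 'BEGIN {
    n = int(min / gop)
    if (n * gop < min) n++
    if (n < 1) n = 1
    print n
  }'
}
```

With a two-second GOP and a minimum of eight seconds this gives four GOPs per fragment, matching the example above. With a three-second GOP it gives three GOPs, so the fragment would be nine seconds long.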

--[iss|hls|hds|mpd].base_path

Add the given path/URL to every URL used by the generated playlists and manifests. Defaults to empty, making all the requests relative to the manifest file.

MPEG-DASH

The following options are specific to MPEG-DASH streaming:

--mpd.min_buffer_time

The value of the @minBufferTime attribute in the MPD element (in seconds).

Note

The value of the minimum buffer time does not provide any instructions to the client on how long to buffer the media.

The value however describes how much buffer a client should have under ideal network conditions. As such, MBT is not describing the burstiness or jitter in the network, it is describing the burstiness or jitter in the content encoding. It is a property of the content.

DASH-IF Interoperability Points section 3.2.8 Bandwidth and Minimum Buffer Time.

--mpd.inline_drm

Specify the content protection in the MPD.

This has the advantage that a player can issue a license request quickly, without having to load the initialization segment or wait for it to become available.

--mpd.profile

The default is to select the profile that fits the presentation best, which is the approach that we strongly recommend.

This option forces the origin to signal a certain profile, regardless of whether the content actually matches it. We therefore advise against using it. With the DVB Profile as the only exception, the origin will not make the content compliant with the chosen profile.

The valid URNs for this option are the following:

Profile / Interoperability Point URN      Section
urn:mpeg:dash:profile:mp2t-main:2011      ISO/IEC 23009-1 section 8.4
urn:mpeg:dash:profile:isoff-live:2011     ISO/IEC 23009-1 section 8.6
urn:com:dashif:dash264                    DASH-IF DASH-AVC/264 section 6.3
urn:hbbtv:dash:profile:isoff-live:2012    HBB Profile
urn:dvb:dash:profile:dvb-dash:2014        DVB Profile (DVB Document A168 July 2014)
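If the profile is set from a script, checking the URN against the list above before passing it on catches typos early. This is a minimal sketch; the function name is_valid_mpd_profile is hypothetical, and the accepted values are copied from the list above.

```shell
#!/bin/bash

# Returns 0 when the argument is one of the profile URNs listed above.
#   $1 = candidate URN
is_valid_mpd_profile() {
  case "$1" in
    urn:mpeg:dash:profile:mp2t-main:2011|\
    urn:mpeg:dash:profile:isoff-live:2011|\
    urn:com:dashif:dash264|\
    urn:hbbtv:dash:profile:isoff-live:2012|\
    urn:dvb:dash:profile:dvb-dash:2014) return 0 ;;
    *) return 1 ;;
  esac
}
```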

DVB DASH (urn:dvb:dash:profile:dvb-dash:2014)

When this profile is selected, a presentation that meets the requirements imposed by the DVB Profile (DVB Document A168 July 2014) is written.

  • A SegmentTemplate is always used for describing the timeline.
  • The content is offered using Inband Storage for SPS/PPS. That is, the codec signaled in the sample entry is avc3 and each IDR frame starts with one or more SPS/PPS NAL units.

HLS - HTTP Live Streaming

The following options are specific to HTTP Live Streaming:

--hls.client_manifest_version

Output protocol version of the client manifest (.m3u8) file. Defaults to 1, specifying protocol version 1.

Version 3 adds floating point durations which may be more accurate.

Version 4 adds media groups (instead of multiplexed streams) and enables WebVTT support. It also adds EXT-X-I-FRAMES-ONLY variants to the master playlist.
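The version differences above can be summarised as a lookup from a required feature to the lowest protocol version that provides it. The feature keywords and the function name min_hls_version in this sketch are hypothetical labels chosen for illustration; only the version numbers are taken from the description above.

```shell
#!/bin/bash

# Hypothetical lookup: lowest client manifest version for a feature.
#   $1 = feature keyword (illustrative names, not USP options)
min_hls_version() {
  case "$1" in
    float-durations)                    echo 3 ;;
    media-groups|webvtt|iframe-variants) echo 4 ;;
    *)                                  echo 1 ;;
  esac
}
```

The result could then be passed as the value of --hls.client_manifest_version.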

--hls.no_audio_only

Do not include the audio only track in the variant m3u8 playlist file. (Defaults to false.)

--hls.no_discontinuities

Output MPEG-TS streams with no discontinuities. (Defaults to true.)

--hls.optimized

Output MPEG-TS streams with optimized PES placement. (Defaults to false.)

Optimized PES placement reduces the overhead introduced by the MPEG-TS container, especially for low bitrate video streams. The additional complexity introduced in the MPEG-TS stream is fully compatible with HLS, but to achieve maximum compatibility with legacy devices the default is false.

--hls.pass_sei

Wrap SEI messages into ID3 Tags.

--hls.subtitles_subformat

The subformat for tracks of type SUBTITLES. The default is to use WebVTT output.

When set to "SMPTETT" an additional SUBFORMAT attribute is added to the master playlist, signaling that the subtitles are formatted in SMPTE-TT XML. SMPTE-TT allows for more formatting capabilities. It also supports bitmaps. Note that you need a custom player that supports this format, since it is not part of the HLS draft.

--hls.no_elementary

By default we output elementary streams for "aac", "ac3" and "ec3" audio. If this option is specified then the elementary streams are muxed into an MPEG-TS stream instead.

HDS - HTTP Dynamic Streaming

The following options are specific to HTTP Dynamic Streaming:

--hds.client_manifest_version

Output version of the client manifest file (.f4m). (Defaults to 1.) Version 2 adds support for Late Binding Audio.

When the client manifest version is set to 2 and the dvr window length is set to 60 seconds or higher a <dvrInfo> tag is written in the manifest.

--hds.inline_drm

Inline the DRM meta data blob in the manifest file (instead of a URL).

--hds.no_onfi

The onFI message is added to any stream that has video or is an audio only stream (manifest version 1).

The messages are not added in an alternate audio stream (manifest version 2) since the OSMF framework stalls on alternate audio tracks that have SCRIPT tags. Note that this is okay, since the onFI messages are present in the main presentation and for Radio/Audio only presentations you use manifest version 1.

The timestamps in an onFI message are aligned to seconds, and a message is written once per second. The system-date sd (dd-mm-yyyy) element is only present when UTC timing is present. The system-time st (hh:mm:ss.sss) element is always present.

Setting this flag skips inserting any onFI message at all.

HSS - Smooth Streaming

The following options are specific to HTTP Smooth Streaming:

--iss.client_manifest_version

Output version of the client manifest file. Defaults to 20, specifying version 2.0. Version 2.2 client manifests add support for compressed timelines, which greatly reduces the size of the manifest files.
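The option value encodes the version without the dot. The sketch below converts such a value to its dotted form; it assumes that the same two-digit scheme that maps 20 to 2.0 also maps 22 to 2.2, which is an inference from the default and not stated explicitly. The function name format_manifest_version is hypothetical.

```shell
#!/bin/bash

# Hypothetical helper: turn a two-digit option value into "major.minor".
#   $1 = option value, e.g. 20
format_manifest_version() {
  printf '%d.%d\n' "$(( $1 / 10 ))" "$(( $1 % 10 ))"
}
```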