Object Storage Reducing Latency

Introduction

A webserver is stateless, so each request for a fragment triggers reading the server manifest and requesting (sample) ranges from the storage backend.

When the source content is stored in HTTP storage, this configuration adds communication overhead between the origin and the HTTP storage location.

Using a cache on the origin for requests that are normally made against the content on HTTP storage minimizes latency and improves throughput.

cdns --> shield-cache --> origin --> http storage (s3)

The origin makes multiple calls to the storage for the manifest and sample indexes in order to find and retrieve the samples it needs to create the output. When the source content is large, the content index will also be large. Without caching, significantly more requests are made, which increases the overall latency of the response.

The following setup describes how to lower latency. If you do not experience latency problems (for instance when storage and webserver(s) are close to each other) you do not need to use this setup.

Local cache overview

By utilizing a local cache populated with additional index files made from the stored content, it is possible to significantly reduce the number of requests. This leads to both a reduction in fragment latency and an increase in throughput. An additional benefit would be a reduction in CPU usage on the origin server.

Schematically, this looks like the following:

cdns -> shield-cache -> origin -> storage-cache -> (http) storage

The server manifest and index file are cached locally, pointing to the audio and video source placed in the storage:

cache       [http]     storage
              |
.ism          |
  --> .mp4 -- | -->   audio/video source

Prerequisites

Index files need to be created alongside the existing content assets. These are small files that contain the necessary metadata and reference the actual movie data in the original (fragmented) video.

The index MP4s must be put in the storage in the same bucket as the .ism and .cfmv/.cfma files, and must reference the media directly in the same bucket (no URI of any kind). As the references in the MP4 are relative, the origin resolves them against the content accessible on the same path.

An index MP4 should be created for each fragmented MP4 (audio, video or text).

In addition to mod_smooth_streaming you will need to enable the following modules:

  • mod_headers
  • mod_proxy
  • mod_proxy_http
  • mod_cache
  • mod_cache_disk
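
As a quick sanity check of the list above, the loaded modules can be compared against it. The following is a minimal sketch, not part of the product: missing_modules is a hypothetical helper name, and it assumes the listing comes from apachectl -M, which prints identifiers such as proxy_module.

```shell
#!/bin/bash
# Print every required module that is absent from an `apachectl -M` style
# listing read on stdin (lines such as " proxy_module (shared)").
missing_modules() {
  local listing required
  listing=$(cat)
  for required in headers proxy proxy_http cache cache_disk; do
    # -w matches the identifier as a whole word, so "proxy_module"
    # is not satisfied by a line that only lists "proxy_http_module".
    if ! printf '%s\n' "$listing" | grep -qw "${required}_module"; then
      echo "mod_${required}"
    fi
  done
}
```

A possible invocation is apachectl -M 2>/dev/null | missing_modules, which prints nothing when all five modules are loaded.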

A new virtual host will be created on the origin to proxy the S3 content through to the default virtual host and will selectively cache the index files locally.

Enabling subrequests and connection reuse

The Apache directive UspEnableSubreq on must be added to the VirtualHost, and <Proxy> sections with ProxySet directives must be configured for each storage. See the Storage Proxy Configuration documentation for details on how to set this up.

The setup is more efficient if the storage cache itself supports HTTP keepalive [1] and the origin is configured to use it. Connections between the origin and the storage cache can then be pooled and reused. This is configured as follows:

<Proxy "http://storage1.example.com/">
  ProxySet connectiontimeout=5 enablereuse=on keepalive=on retry=0 timeout=30 ttl=300
</Proxy>

[1] https://httpd.apache.org/docs/2.4/mod/core.html#keepalive

Creating the index MP4 files

To create the data reference (dref) MP4 files, use the --use_dref_no_subs option with mp4split as shown below:

#!/bin/bash

mp4split -o tears-of-steel-avc1-400k-index.mp4 \
  --use_dref_no_subs tears-of-steel-avc1-400k.cfmv

mp4split -o tears-of-steel-avc1-1000k-index.mp4 \
  --use_dref_no_subs tears-of-steel-avc1-1000k.cfmv

mp4split -o tears-of-steel-aac-128k-index.mp4 \
  --use_dref_no_subs tears-of-steel-aac-128k.cfma

Make sure you use a filename pattern that you can later match against in the Apache configuration. In this case, that is the -index.mp4 suffix.
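
When there are many tracks, the naming convention can be applied mechanically. The following is a minimal sketch under stated assumptions: index_name and print_index_commands are hypothetical helper names, filenames are assumed to contain no whitespace, and the commands are only printed so they can be reviewed before being run.

```shell
#!/bin/bash
# Derive the index filename for a fragmented MP4 source:
# tears-of-steel-avc1-400k.cfmv -> tears-of-steel-avc1-400k-index.mp4
index_name() {
  printf '%s-index.mp4\n' "${1%.*}"
}

# Print (do not run) one mp4split command per source file given as argument.
print_index_commands() {
  local src
  for src in "$@"; do
    printf 'mp4split -o %s --use_dref_no_subs %s\n' "$(index_name "$src")" "$src"
  done
}
```

For example, print_index_commands *.cfmv *.cfma lists one command per fragmented MP4 in the current directory.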

Note

Creating one dref MP4 for all tracks within a stream is possible as well, and may be even more efficient (simply add all tracks as input when creating the dref MP4).

Creating a new manifest

A new manifest is necessary to reference the new index MP4s. The origin can then fetch it from the local cache for subsequent requests:

#!/bin/bash

mp4split -o tears-of-steel.ism \
  tears-of-steel-avc1-400k-index.mp4 \
  tears-of-steel-avc1-1000k-index.mp4 \
  tears-of-steel-aac-128k-index.mp4

Apache configuration

You will need to add an additional Virtual Host to your origin configuration and a new listen port of 8081 to your main configuration.
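
Adding the listen port can be scripted so that it is safe to repeat. The following is a minimal sketch, assuming the port is declared as a plain "Listen 8081" line; ensure_listen_port is a hypothetical helper name, and the location of the main configuration file depends on your distribution.

```shell
#!/bin/bash
# Append "Listen <port>" to an Apache configuration file unless an
# identical directive is already present. Both arguments are required.
# Assumption: other spellings (e.g. "Listen 0.0.0.0:8081") are not detected.
ensure_listen_port() {
  local conf="$1" port="$2"
  if ! grep -Eq "^[[:space:]]*Listen[[:space:]]+${port}[[:space:]]*\$" "$conf"; then
    printf 'Listen %s\n' "$port" >> "$conf"
  fi
}
```

After calling it with the path of your main configuration file and 8081, running apachectl configtest before reloading Apache helps catch mistakes.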

The new Virtual Host acts as a reverse proxy that caches only the manifest and index files. The default Virtual Host can then use the cached files instead.

Below is an example default Virtual Host and the new Virtual Host available on port 8081. Please note this example is for CentOS.

The HTTP storage answers range requests with status code 206 (Partial Content), which mod_cache does not store. We therefore remove the Range header from requests for the manifest and index files only, so that the complete responses can enter the local Apache cache.

You can temporarily uncomment the debug logging on the caching Virtual Host to verify that your mod_cache setup is working correctly, that the right items have the correct status code, and that they are in turn being cached.

<VirtualHost *:80>
  Header set Access-Control-Allow-Origin "*"
  ServerAdmin webmaster@localhost
  ServerName origin-proxy
  DocumentRoot /var/www/origin

  <Location "/">
    UspHandleIsm on
    UspEnableSubreq on
    IsmProxyPass "http://localhost:8081/"
  </Location>

  <Proxy "http://localhost:8081/">
    ProxySet connectiontimeout=5 enablereuse=on keepalive=on retry=0 timeout=30 ttl=300
  </Proxy>

  ErrorLog /var/log/apache2/origin-error.log
  CustomLog /var/log/apache2/origin-access.log combined
  LogLevel warn
</VirtualHost>

<VirtualHost *:8081>
  ServerName origin-cache

  # The cache directory (which should exist)
  CacheRoot /var/cache/apache2
  CacheEnable disk /
  CacheDirLevels 5
  CacheDirLength 3
  CacheDefaultExpire 7200
  CacheIgnoreNoLastMod On
  CacheIgnoreCacheControl On
  CacheIgnoreQueryString On
  # The max size of your index files
  CacheMaxFileSize 1000000000

  # This allows for the full dref mp4 index to be cached locally
  CacheQuickHandler off

  # Unset range to cache the index mp4s and manifest
  <LocationMatch ".*\.[is]sm[l]?$">
    RequestHeader unset Range
  </LocationMatch>

  <LocationMatch ".*-index\.mp4$">
    RequestHeader unset Range
  </LocationMatch>

  ProxyPass / http://your-bucket.s3.amazonaws.com/ connectiontimeout=5 timeout=10 ttl=300 keepalive=on retry=0
  ProxyPassReverse / http://your-bucket.s3.amazonaws.com/

  ErrorLog /var/log/apache2/cache-error.log
  CustomLog /var/log/apache2/cache-access.log combined
  # Uncomment the next line temporarily to check your files are being cached
  # (Apache does not allow a comment on the same line as a directive)
  #LogLevel debug
  LogLevel warn
</VirtualHost>
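
To see which request paths have their Range header removed by the two LocationMatch patterns above, the patterns can be tried offline. The following is a minimal sketch: range_is_unset is a hypothetical helper name, and it uses grep -E (POSIX extended regular expressions) where Apache uses PCRE, on the assumption that these two particular patterns behave the same in both.

```shell
#!/bin/bash
# Succeed when the request path matches one of the two LocationMatch
# patterns of the caching Virtual Host, i.e. when Range would be unset.
range_is_unset() {
  printf '%s\n' "$1" | grep -Eq -e '.*\.[is]sm[l]?$' -e '.*-index\.mp4$'
}
```

For example, range_is_unset /video/tears-of-steel-avc1-400k-index.mp4 succeeds, while the same call with /video/tears-of-steel-avc1-400k.cfmv fails, so range requests for the media itself keep their Range header.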

Note

This guide does not cover S3 authorization with a secret key and access key. In this setup, it is recommended to use an S3 bucket policy that allows requests from the origin. This preserves the portability of the files and avoids the extra signing workflow that would otherwise be necessary.