Remote Storage Reducing Latency

Introduction

A webserver is stateless, so each request for a fragment triggers a read of the server manifest and requests for (sample) ranges from the storage backend.

When source content is kept in HTTP storage, this results in communication overhead between the origin and the HTTP storage location.

Using a cache on the origin for requests normally made against the content on HTTP storage minimizes latency and improves throughput.

cdns -> shield-cache -> origin -> (http) storage

The origin makes multiple calls to the storage for the manifest and the sample indexes to find and retrieve the samples it needs to create the output. When the source content is large, the content index is also large, and without caching this means significantly more requests, increasing the overall latency of the response.

The following setup describes how to lower latency. If you do not experience latency problems (for instance when storage and webserver(s) are close to each other) you do not need to use this setup.

Local cache overview

By utilizing a local cache populated with additional index files made from the stored content, it is possible to significantly reduce the number of requests. This leads to both a reduction in fragment latency and an increase in throughput. An additional benefit would be a reduction in CPU usage on the origin server.

Schematically this looks like the following:

cdns -> shield-cache -> origin -> storage-cache -> (http) storage

The server manifest and index file are cached locally, pointing to the audio and video source placed in the storage:

cache       [http]     storage
              |
.ism          |
  --> .mp4 -- | -->   audio/video source

Prerequisites

Index files need to be created alongside the existing content assets. These are small files that contain the necessary metadata and reference the actual movie data in the original (fragmented) video.

The index MP4 must be put in the storage in the same bucket as the .ism and .ismv/.isma files, and it must reference the media directly in that bucket (no URI of any kind). As the references in the MP4 are relative, the origin resolves them against the content accessible on the same path.

An index MP4 should be created for each fragmented MP4 (audio, video or text).

In addition to mod_smooth_streaming you will need to enable the following modules:

  • mod_headers
  • mod_proxy
  • mod_proxy_http
  • mod_cache
  • mod_cache_disk
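
As a quick check that the modules listed above are loaded, the following sketch filters the output of `httpd -M`. The helper name `missing_modules` is hypothetical, and the sketch assumes the modules appear under their usual identifiers (for example `cache_disk_module` for mod_cache_disk).

```shell
#!/bin/bash
# Module identifiers as printed by `httpd -M` (assumed standard names).
required="headers_module proxy_module proxy_http_module cache_module cache_disk_module"

# Reads `httpd -M` style output on stdin and prints every required
# module that does not appear in it.
missing_modules() {
  local loaded m
  loaded="$(cat)"
  for m in $required; do
    grep -qw -- "$m" <<<"$loaded" || echo "$m"
  done
}

# Usage (on the origin):
#   httpd -M 2>/dev/null | missing_modules
```

An empty result means all five modules are loaded.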

A new virtual host will be created on the origin to proxy the S3 content through to the default virtual host and will selectively cache the index files locally.

Creating the index MP4 files

To create the data reference MP4 files, use the --use_dref option with mp4split as below:

#!/bin/bash

mp4split -o example-400k-index.mp4 \
  --use_dref example-400k.ismv
mp4split -o example-800k-index.mp4 \
  --use_dref example-800k.ismv
mp4split -o example-64k-index.mp4 \
  --use_dref example-64k.isma

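Please make sure you use a filename pattern that you can later match against: the caching Virtual Host selects index files by name, and the example configuration in this guide matches on the -index.mp4 suffix.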
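
When there are many tracks, the same naming rule can be applied in a loop. The following is a minimal sketch: the helper `index_name` is hypothetical, and the loop only prints the mp4split commands (a dry run) so you can review them before running them.

```shell
#!/bin/bash
# Derive the index filename from a fragmented MP4 filename by
# replacing the extension with "-index.mp4".
index_name() {
  local src="$1"
  printf '%s-index.mp4\n' "${src%.*}"
}

# Print one mp4split command per source file given on the command line.
# Remove the leading "echo" to actually create the index files.
for src in "$@"; do
  echo mp4split -o "$(index_name "$src")" --use_dref "$src"
done
```

Saved under a name of your choice, the script would be called with the fragmented MP4 files as arguments, for example `example-400k.ismv example-800k.ismv example-64k.isma`.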

Creating a new manifest

A new manifest is necessary to reference the new index MP4s. The origin can then fetch them from the local cache for subsequent requests:

#!/bin/bash

mp4split -o example.ism \
  example-400k-index.mp4 \
  example-800k-index.mp4 \
  example-64k-index.mp4

Virtual Host configuration

You will need to add an additional Virtual Host to your origin configuration and a new listen port of 8081 to your main configuration.
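
Adding the listen port can be scripted. The sketch below appends a `Listen` line only when it is missing; the helper name `ensure_listen` is hypothetical, and the configuration path in the usage comment is the CentOS default, which may differ on your system. It only recognizes the plain `Listen <port>` form, not `Listen <address>:<port>`.

```shell
#!/bin/bash
# Append "Listen <port>" to an Apache config file unless a matching
# line is already present.
ensure_listen() {
  local conf="$1" port="$2"
  grep -Eq "^[[:space:]]*Listen[[:space:]]+${port}[[:space:]]*$" "$conf" \
    || printf 'Listen %s\n' "$port" >> "$conf"
}

# Usage (assumed CentOS path, run as root):
#   ensure_listen /etc/httpd/conf/httpd.conf 8081
```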

The new Virtual Host will act as a reverse proxy that caches only the manifest and index files. The default Virtual Host can then use the cached files instead.

Below is an example default Virtual Host and the new Virtual Host available on port 8081. Please note this example is for CentOS.

The HTTP storage answers range requests with a status code of 206 (Partial Content), which mod_cache does not store. We therefore remove the Range header from requests for the manifest and index files only, so that they are fetched in full and can enter the local Apache cache.

You can temporarily uncomment the debug logging on the caching Virtual Host to verify that your mod_cache setup is working correctly, that the right items return the correct status code, and that they are in turn being cached.

<VirtualHost *:80>
   Header set Access-Control-Allow-Origin "*"
   ServerAdmin webmaster@localhost
   ServerName example-origin.usp
   DocumentRoot /var/www/origin

   ErrorLog /var/log/httpd/origin-error.log
   CustomLog /var/log/httpd/origin-access.log combined

   <Location />
      UspHandleIsm on
      UspHandleF4f on
   </Location>

   <Directory /var/www/origin/proxy-test>
      Options Indexes FollowSymLinks MultiViews
      AllowOverride None
      # Apache 2.4 access control (mod_cache_disk is a 2.4 module)
      Require all granted
      IsmProxyPass http://localhost:8081/
   </Directory>
</VirtualHost>

<VirtualHost *:8081>
  ServerName origin-cache.usp

  # The cache directory (which should exist)
  CacheRoot /var/cache/httpdcache
  CacheEnable disk /
  CacheDirLevels 5
  CacheDirLength 3
  CacheDefaultExpire 7200
  CacheIgnoreNoLastMod On
  CacheIgnoreCacheControl On
  CacheIgnoreQueryString On
  # The max size of your index files
  CacheMaxFileSize 1000000000

  # This allows for the full dref mp4 index to be cached locally
  CacheQuickHandler off

  # Unset range to cache the index mp4s and manifest
  <LocationMatch ".*\.[is]sm[l]?$">
    RequestHeader unset Range
  </LocationMatch>

  <LocationMatch ".*-index\.mp4$">
    RequestHeader unset Range
  </LocationMatch>

  ProxyPass / http://s3-eu-west-1.amazonaws.com/your-bucket/ connectiontimeout=5 timeout=10 ttl=300 keepalive=on retry=0
  ProxyPassReverse / http://s3-eu-west-1.amazonaws.com/your-bucket/

  ErrorLog /var/log/httpd/cache-error.log
  CustomLog /var/log/httpd/cache-access.log combined
  # Uncomment temporarily to check that your files are being cached
  #LogLevel debug
</VirtualHost>
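
To see which request paths have their Range header removed by the caching Virtual Host, the two LocationMatch patterns can be tried out in bash. This is only an illustration: the function name `strips_range` and the sample paths are hypothetical, and the patterns are copied without the leading `.*`, which does not change what they match.

```shell
#!/bin/bash
# Patterns taken from the two LocationMatch blocks of the caching Virtual Host.
re_manifest='\.[is]sm[l]?$'
re_index='-index\.mp4$'

# Succeeds when the path matches either pattern, i.e. when the
# Range header would be unset for that request.
strips_range() {
  local path="$1"
  [[ $path =~ $re_manifest || $path =~ $re_index ]]
}

for p in /example.ism /example-400k-index.mp4 /example-400k.ismv; do
  if strips_range "$p"; then
    echo "$p: Range header removed"
  else
    echo "$p: Range header kept"
  fi
done
```

Requests for the media files themselves (.ismv, .isma) match neither pattern, so their Range header is passed on to the storage unchanged.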

Note

This guide does not cover S3 authorization with a secret key and access key. In that case it is recommended to use an S3 bucket policy that allows requests from the origin. This preserves the portability of the files and avoids the extra signature and signing workflow that would otherwise be necessary.