Workflow and Media

This page provides a general overview of the Remix AVOD workflow and describes what you need to consider regarding your source content.


Unified Remix creates a remixed MP4 file from a SMIL playlist. This remixed MP4 is ISOBMFF compliant. For more background, please see Unified Remix - VOD. In short, a complete workflow consists of the following four steps, of which the fourth is unique to Unified Remix AVOD:

  • Generate a SMIL playlist that is used as input for Unified Remix.

  • Use Unified Remix to create a remixed MP4 from the SMIL playlist.

  • Package the remixed MP4 into an actual stream using Unified Packager or Unified Origin.

  • Potentially use a third-party ad insertion service to dynamically insert ads.

For a Unified Remix AVOD use case, the SMIL Playlist lists one or more media sequences that are to be played consecutively and, if not yet present in the content itself, may include (additional) Timed Metadata as well. This Timed Metadata may be used to identify ad breaks.

The resulting remixed MP4 is a representation of the SMIL Playlist and represents all audio, video and Timed Metadata. The remixed MP4 does not contain actual media data, but instead references the original audio/video. Hence, a remixed MP4 is generally small in size.

When the media is not preconditioned (i.e., splice points do not match up with a keyframe in the video), Unified Remix may condition the media by closing the GOP at the splice point and inserting a keyframe. The transcoded GOP is then stored in the remixed MP4 file and no longer externally referenced. For more info about conditioning media, see Media Conditioning For Ad Insertion.

Generating a remixed MP4 with Unified Remix

The command line used to generate a remixed MP4 from a SMIL playlist is straightforward. All that is needed is to specify the playlist as input and the remixed MP4 as output:


unified_remix -o remixed.mp4 playlist.smil
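When several playlists need to be processed, the same command can be generated per playlist. The following is a minimal sketch only: the helper name remix_command and the convention of naming the output after the playlist are assumptions made for illustration, not part of Unified Remix. It prints the command instead of running it, so the result can be reviewed first.

```shell
# Hypothetical helper: print the unified_remix command for one playlist.
# Assumption: the output is named after the playlist, with .smil replaced by .mp4.
# Note: paths containing spaces are not quoted in the printed command.
remix_command() {
  playlist=$1
  printf 'unified_remix -o %s %s\n' "${playlist%.smil}.mp4" "$playlist"
}

# Example playlist names (placeholders)
for playlist in episode1.smil episode2.smil; do
  remix_command "$playlist"
done
```

Piping the printed lines to a shell would execute them, provided unified_remix is installed and the playlists exist.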

The path(s) to the content in the SMIL playlist can be relative or absolute. The remixed MP4 that Remix will create from the playlist will reference the content using the same paths. If you use a relative path in your SMIL, it must be a downward path and it must reference local content. To create a remixed MP4 that references remote content using relative paths, mount the remote content so that you process it as if it's local. You can do this using a tool like s3fs or similar.
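The rule for relative paths can be checked before running Remix. The following is a rough sketch, not something Unified Remix provides: the function name is_downward_relative_path is hypothetical, and the check assumes that "downward" means the path contains no '..' component.

```shell
# Hypothetical helper: succeeds only for a relative path that never steps upward.
# Assumption: "downward" is interpreted as "contains no '..' path component".
is_downward_relative_path() {
  case "$1" in
    ''|/*|*://*) return 1 ;;           # empty, absolute, or a URL with a scheme
    ..|../*|*/..|*/../*) return 1 ;;   # contains a '..' component
    *) return 0 ;;
  esac
}

# Example paths (placeholders)
for p in media/main.mp4 ../ads/ad1.mp4 /data/main.mp4; do
  if is_downward_relative_path "$p"; then
    echo "$p: downward relative path"
  else
    echo "$p: not a downward relative path"
  fi
done
```

An absolute path fails this check but is still allowed in a SMIL playlist; the check only covers the restriction that applies to relative paths.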

To learn more about the requirements and possibilities of the SMIL playlist format that is used as input for Unified Remix, please see the documentation below: SMIL Playlists - Timed Metadata - SCTE 35.


Unified Remix can ingest ISOBMFF subtitles (ISO/IEC 14496-30). This works best when the source uses fragmented WebVTT (text/wvtt) rather than TTML (subt/stpp), because the former can be dref-ed while the latter requires transcoding of the media data payload. For optimal alignment of the text track with the main content, we advise packaging it with the same fragment duration (or GOP size) as the video before using it as input for Unified Remix.

For example:


mp4split -o subs.cmft --fragment_duration=6006/1000 subs.vtt --track_language=eng
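The value 6006/1000 in the example above is a fraction of seconds, i.e. 6.006 seconds. To match the video, the fragment duration can be derived from the GOP size and frame rate. The sketch below is an illustration under assumptions: the helper name fragment_duration is hypothetical, and a GOP of 180 frames at 30000/1001 fps is an assumed example, not a value stated for any particular source.

```shell
# Hypothetical helper: express the duration of one GOP in milliseconds as N/1000.
# Usage: fragment_duration <frames_per_gop> <fps_numerator> <fps_denominator>
fragment_duration() {
  # Integer arithmetic: a duration that is not a whole number of
  # milliseconds is truncated.
  ms=$(( $1 * $3 * 1000 / $2 ))
  echo "${ms}/1000"
}

fragment_duration 180 30000 1001   # 180 frames at 29.97 fps, prints 6006/1000
fragment_duration 48 25 1          # 48 frames at 25 fps, prints 1920/1000
```

The printed value can then be passed to --fragment_duration when packaging the subtitles.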

Source Media Considerations

Source media supplied to Unified Remix can come in many forms: fragmented or progressive MP4, with either one or multiple tracks within a single file.

In scenarios where your source media contains multiple tracks per file, you should consider that Remix will process tracks based on the following attributes: type, role, language, bitrate. If two tracks in a file share the same values for all of these attributes, Remix will discard one of the two as a duplicate and process only one of them.

This can be problematic in scenarios where not all of these attributes are set to a value that best describes the content. An example of this could be an MP4 containing two audio tracks (one for regular audio and one for audio description), both encoded with the same codec (AAC), bitrate (128 kbps), language (English), and role (main). In this case, the audio description track could be distinguished from the main audio by changing its role to 'description'.
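Tracks that would collide on these four attributes can be spotted before running Remix. The sketch below only illustrates the idea and does not reproduce how Remix itself compares tracks: the helper name duplicate_track_keys and the input format (one line per track, "type role language bitrate") are assumptions, and the listing would have to be produced by inspecting the source file with a tool of your choice.

```shell
# Hypothetical helper: read one line per track ("type role language bitrate")
# from standard input and print each attribute combination that occurs more
# than once.
duplicate_track_keys() {
  sort | uniq -d
}

# Example listing (placeholder values): two audio tracks that differ only in
# content, not in their attributes, plus one video track.
printf '%s\n' \
  'audio main eng 128000' \
  'audio main eng 128000' \
  'video main und 2000000' | duplicate_track_keys
```

With the listing above, the audio combination is reported as a duplicate. Once the role of the audio description track is changed to 'description', the two lines differ and nothing is reported.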