Timed Metadata: SCTE 35-based content replacement

Introduction

When streaming live, metadata can be used to ‘mark’ a certain timestamp in the stream. Such a mark is also called a ‘cue’. These markers or cues are pushed to a publishing point as part of a separate track and are carried in SCTE 35 messages, which, like the contents of all other tracks, need to be packaged in fMP4 containers.

Note

Read more about timed metadata in our blog post: How to make your media streams smarter using timed metadata.

SCTE 35 messages can contain info about a program or other relevant data, but for the current document it is most important that they can cue splice points in a stream.

A splice point is a specific timestamp in a stream that corresponds to an IDR frame, which means that the splice point offers the opportunity to seamlessly switch the livestream to a different clip. Splice points can be used to cue:

  • (Ad) insertion opportunities
  • Start and endpoint of a program

When the timestamp of the cue that signals a splice point does not correspond to the start of a media segment in the stream (i.e., it does not correspond to any of the IDR frames that are present in the stream by default), the encoder that pushes the livestream to the publishing point needs to insert an additional IDR frame at the timestamp that the cue signals. If the --splice_media option is enabled, Origin will then splice frame-accurately, and the part of the media segment that contains the splice point will be merged with either the previous or the next segment so that the number of segments before and after splicing remains the same.

In addition to splicing the media segments if necessary, Origin will also signal the splice points in the Apple HLS and MPEG DASH client manifests. A third party service may then be used to insert a clip into the livestream at such a splice point, to create an ad insertion or ad replacement workflow for example.

Ad insertion

Splice points make replacing the part of the media stream that is marked as an ad insertion opportunity a relatively simple process since it only needs to involve manipulation of the client manifest. No further changes to the media are necessary and the media is shared across all the viewers.

The manifest and media are conditioned in such a way that you can use them directly for playout (with the original broadcast feed) or dynamically manipulate the manifest to replace parts of the content.

Note

In addition to Origin, using a third party service is necessary to create a full ad insertion or ad replacement workflow for livestreams. However, if the result does not need to be a livestream, but a VOD clip from the livestream, a workflow that incorporates Unified Capture and Unified Remix can be used to support ad insertion and ad replacement.

Live2VOD

By demarcating the start and endpoint of a program with splice points, it becomes easy to create a frame-accurate VOD clip for such a program, as the presence of IDR frames at these points eliminates the need for any transcoding. In a similar vein, all original advertisements can be cut from the VOD clip as well, by making sure that the original livestream contains splice points at the start and end of each of them.

Note

Unified Capture may be used with Origin to create a Live2VOD workflow: Capturing LIVE.

How to enable media splicing on SCTE 35 markers

--timed_metadata

New in version 1.9.0.

If your USP license includes support for timed metadata, you can enable Origin to pass it through using the --timed_metadata option when creating the server manifest. SCTE 35 messages that Origin ingests are then automatically signaled in the MPEG-DASH and Apple HLS client manifests.

--splice_media

New in version 1.9.0.

This option requires that your USP license includes support for timed metadata and that the --timed_metadata option is enabled on the publishing point of your livestream. When both conditions are met and your content contains SCTE 35 markers that signal splice points, Origin can be instructed to splice the MPEG-DASH and Apple HLS media segments on these splice points by enabling the --splice_media option when creating the Live server manifest.

Origin ingest requirements

Origin supports the ingest of SCTE 35 messages in the form of DASH event messages. This means that SCTE 35 messages need to be contained in an emsg box inside an fMP4 container. The SCTE 35 messages need to be stored in binary, with a schemeIdUri of urn:scte:scte35:2013:bin.
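To make the container requirement more concrete, the sketch below reads the fields of a single emsg box. It is an illustration only, not Origin's implementation: the function name parse_emsg is hypothetical, and the field order for versions 0 and 1 follows the DASH event message box layout as commonly documented for ISO/IEC 23009-1.

```python
import struct

SCTE35_BINARY_SCHEME = "urn:scte:scte35:2013:bin"


def parse_emsg(box: bytes) -> dict:
    """Read one complete emsg box (hypothetical helper, not part of Origin)."""
    size, box_type = struct.unpack_from(">I4s", box, 0)
    if box_type != b"emsg":
        raise ValueError("not an emsg box")
    version = box[8]
    pos = 12  # skip size (4), type (4), version (1) and flags (3)

    def read_cstring(offset: int):
        # Strings in the box are null-terminated UTF-8.
        end = box.index(b"\x00", offset)
        return box[offset:end].decode("utf-8"), end + 1

    if version == 0:
        scheme, pos = read_cstring(pos)
        value, pos = read_cstring(pos)
        timescale, time, duration, event_id = struct.unpack_from(">IIII", box, pos)
        pos += 16
        time_key = "presentation_time_delta"
    elif version == 1:
        timescale, time, duration, event_id = struct.unpack_from(">IQII", box, pos)
        pos += 20
        scheme, pos = read_cstring(pos)
        value, pos = read_cstring(pos)
        time_key = "presentation_time"
    else:
        raise ValueError(f"unsupported emsg version {version}")

    return {
        "version": version,
        "scheme_id_uri": scheme,
        "value": value,
        "timescale": timescale,
        time_key: time,
        "event_duration": duration,
        "id": event_id,
        "message_data": box[pos:size],  # the binary SCTE 35 section
    }
```

A pre-ingest check could, for example, verify that the scheme_id_uri returned by this function equals SCTE35_BINARY_SCHEME before the fragment is pushed to the publishing point.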

The timing of the cue must be sample accurate and it is an error to not have an IDR frame at the time of a cue. In other words, the encoder must ensure that an IDR frame is present at each timestamp that is signaled in a SCTE 35 message. Also, if an additional IDR frame needs to be inserted, the encoder should not shorten or lengthen any of the media segments but keep the original length intact.

In addition to the above, the regular ingest requirements should be followed, as documented in Encoding Requirements.

SCTE 35 cue events

Note

We follow the guidelines described in ANSI/SCTE 67 2017, in particular chapter 8.1: Starting a Break.

The SCTE 35 splice_insert() message is used to announce an opportunity to either splice out of the network into an ad (a ‘cue-out event’), or splice back into the network, out of an ad (a ‘cue-in event’).

A cue-out event is indicated by a SCTE 35 splice_insert() message with the out_of_network_indicator field set to 1. It needs to reach Origin at least half a media segment’s duration prior to the splice time, with a minimum of four seconds. Furthermore, the break_duration() must be present and signal the duration of the break.
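The lead-time rule above can be written as a one-line calculation. This is a minimal sketch; the function name minimum_lead_time is hypothetical and the rule is read as "the larger of four seconds and half a segment duration".

```python
def minimum_lead_time(segment_duration: float) -> float:
    """Seconds before the splice time by which the cue-out must reach Origin.

    Assumption: half a media segment's duration, but never less than 4 seconds.
    """
    return max(4.0, segment_duration / 2)
```

With 8-second segments the four-second floor applies; with 10-second segments the message needs to arrive at least 5 seconds ahead of the splice time.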

Note

A cue-out event should not overlap other cue-out events.

A SCTE 35 splice_insert() message that signals the return to the main content must have the out_of_network_indicator set to 0. When present, its splice_event_id must match an associated cue-out event. The time of the cue that signals the switch back to the main content must match the sum of the associated cue-out’s time and duration.

A cue event may cause Origin to splice a media segment. Since a media segment can only be spliced once, you cannot have cues with time ranges that refer to the same media segment. For example, when using 8-second media segments you cannot have a cue-out marker with a duration shorter than 8 seconds. This should not pose a problem because in practice the (ad) insertion opportunities are much longer than a single media segment.
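The constraints in this section lend themselves to a simple pre-flight check on the encoder side. The sketch below is an assumption-laden illustration: the CueOut type and the check_cue_outs and expected_cue_in_time functions are hypothetical names, and only the three rules stated above are checked (no overlapping cue-outs, a break at least one segment long, and a cue-in at cue-out time plus duration).

```python
from dataclasses import dataclass


@dataclass
class CueOut:
    """Hypothetical summary of one cue-out event, with times in seconds."""
    event_id: int
    time: float
    duration: float


def expected_cue_in_time(cue: CueOut) -> float:
    # The matching cue-in must be timed at the cue-out time plus its duration.
    return cue.time + cue.duration


def check_cue_outs(cues: list, segment_duration: float) -> list:
    """Return human-readable problems; an empty list means no rule is broken."""
    problems = []
    ordered = sorted(cues, key=lambda cue: cue.time)
    for cue in ordered:
        if cue.duration < segment_duration:
            problems.append(
                f"event {cue.event_id}: break of {cue.duration}s is shorter "
                f"than one {segment_duration}s segment"
            )
    for earlier, later in zip(ordered, ordered[1:]):
        if later.time < expected_cue_in_time(earlier):
            problems.append(
                f"event {later.event_id} overlaps event {earlier.event_id}"
            )
    return problems
```

For instance, a 4-second break that starts while a 24-second break is still running would be reported twice with 8-second segments: once as too short and once as overlapping.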

Origin playout

The events signaled in SCTE 35 messages are passed through to the client manifests when this feature is part of your USP license and you have enabled the --timed_metadata option for the particular publishing point. For HLS, the events are signaled using the EXT-X-DATERANGE tag and a combination of the EXT-X-CUE-OUT and EXT-X-CUE-IN tags, while for MPEG-DASH they are signaled in DASH Event Messages.

For HLS, the EXT-X-DATERANGE and EXT-X-CUE-OUT plus EXT-X-CUE-IN tags present different ways of signaling similar information. By adding them both, compatibility with a broader range of third-party services that make use of these tags is ensured. Some of these services rely on the EXT-X-DATERANGE tag being present (e.g., Yospace), while others expect EXT-X-CUE-OUT and EXT-X-CUE-IN tags (e.g., Google DFP and AWS Elemental MediaTailor).

Because Origin appends or prepends part of the spliced media segment to the previous or next media segment, no new media segments are introduced and discontinuities in the sequence numbering of the segments are avoided. Since only a part of the spliced media segment is merged, the duration of the media segments remains between 0.5 and 1.5 times the original segment duration.

HLS signaling of SCTE 35

In addition to SCTE 35 markers being added to the media playlists using the EXT-X-DATERANGE and EXT-X-CUE-OUT plus EXT-X-CUE-IN tags, the break_duration() of a cue-out event is signaled both in the PLANNED-DURATION attribute of the EXT-X-DATERANGE tag and as the value of the EXT-X-CUE-OUT tag.

An example from an actual livestream with SCTE 35 markers is shown below. The example is taken from HLS - Pure live (SCTE 35). This livestream contains cue-out markers that are exactly aligned with the media segment boundaries so that no media segments need to be spliced:

#EXTINF:4, no desc
scte35-audio=69000-video=700000-385202370.ts
#EXT-X-DATERANGE:ID="2002",START-DATE="2018-10-29T10:38:00Z",PLANNED-DURATION=24,SCTE35-OUT=0xFC302100000000000000FFF01005000007D27FEF7F7E0020F580C0000000000088B9661D
#EXT-X-CUE-OUT:24
#EXT-X-PROGRAM-DATE-TIME:2018-10-29T10:38:00Z
#EXTINF:4, no desc
scte35-audio=69000-video=700000-385202371.ts
#EXTINF:4, no desc
scte35-audio=69000-video=700000-385202372.ts
#EXTINF:4, no desc
scte35-audio=69000-video=700000-385202373.ts
#EXTINF:4, no desc
scte35-audio=69000-video=700000-385202374.ts
#EXTINF:4, no desc
scte35-audio=69000-video=700000-385202375.ts
#EXTINF:4, no desc
scte35-audio=69000-video=700000-385202376.ts
#EXT-X-CUE-IN
#EXT-X-PROGRAM-DATE-TIME:2018-10-29T10:38:24Z
#EXTINF:4, no desc

Note

As long as a cue-event has not finished, the signaling of the start of the event will remain part of the playlist, even if the start of the event is outside of the specified DVR window.
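As an illustration of how a downstream service might read this signaling, the sketch below walks a media playlist and compares the duration announced by EXT-X-CUE-OUT with the summed EXTINF durations up to the matching EXT-X-CUE-IN. The function name cue_breaks is hypothetical, and the parsing assumes the bare numeric EXT-X-CUE-OUT value shown in the example above.

```python
def cue_breaks(playlist: str) -> list:
    """Collect (planned, actual, segment count) for each completed break."""
    breaks = []
    current = None
    for raw_line in playlist.splitlines():
        line = raw_line.strip()
        if line.startswith("#EXT-X-CUE-OUT:"):
            planned = float(line.split(":", 1)[1])
            current = {"planned": planned, "actual": 0.0, "segments": 0}
        elif line.startswith("#EXT-X-CUE-IN"):
            if current is not None:
                breaks.append(current)
                current = None
        elif line.startswith("#EXTINF:") and current is not None:
            # "#EXTINF:4, no desc" -> duration is the part before the comma.
            current["actual"] += float(line.split(":", 1)[1].split(",", 1)[0])
            current["segments"] += 1
    return breaks


# Excerpt of the example playlist above (cue-out through cue-in).
example = "\n".join(
    ["#EXT-X-CUE-OUT:24", "#EXT-X-PROGRAM-DATE-TIME:2018-10-29T10:38:00Z"]
    + [
        line
        for number in range(385202371, 385202377)
        for line in (
            "#EXTINF:4, no desc",
            f"scte35-audio=69000-video=700000-{number}.ts",
        )
    ]
    + ["#EXT-X-CUE-IN"]
)
breaks = cue_breaks(example)
```

For the excerpt, the six 4-second segments between the two tags add up to the announced 24 seconds, so planned and actual durations agree.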

SCTE 35 based spliced media in HLS

The example below, taken from HLS - Pure live (SCTE 35, spliced), shows the signaling of SCTE 35 markers in a livestream when the cue-out markers do not align with the media segment boundaries. In this example, a 10-second segment needs to be spliced to fit a 24-second break. (Origin will only splice segments if the --splice_media option is present in the publishing point’s server manifest.)

The duration of the SCTE 35 cue-out event in the example below is 24 seconds and the last media segment in this time range is spliced at exactly 4 seconds and appended to the media segment before it to match the duration of 24 seconds (10 + (10 + 4) = 24 seconds). The remaining 6 seconds of the media segment that was spliced then becomes a media segment on its own, right after the cue-in that signals the end of the 24 second break:

#EXTINF:10, no desc
scte35-audio=69000-video=700000-154080970.ts?hls_minimum_fragment_length=10
#EXTINF:10, no desc
scte35-audio=69000-video=700000-154080971.ts?hls_minimum_fragment_length=10
#EXTINF:10, no desc
scte35-audio=69000-video=700000-154080972.ts?hls_minimum_fragment_length=10
#EXT-X-DATERANGE:ID="2004",START-DATE="2018-10-29T10:42:00Z",PLANNED-DURATION=24,SCTE35-OUT=0xFC302100000000000000FFF01005000007D47FEF7F7E0020F580C000000000004F1B1A5F
#EXT-X-CUE-OUT:24
#EXT-X-PROGRAM-DATE-TIME:2018-10-29T10:42:00Z
#EXTINF:10, no desc
scte35-audio=69000-video=700000-154080973.ts?hls_minimum_fragment_length=10
#EXTINF:14, no desc
scte35-audio=69000-video=700000-154080974.ts?hls_minimum_fragment_length=10
#EXT-X-CUE-IN
#EXT-X-PROGRAM-DATE-TIME:2018-10-29T10:42:24Z
#EXTINF:6, no desc
scte35-audio=69000-video=700000-154080975.ts?hls_minimum_fragment_length=10
#EXTINF:10, no desc
scte35-audio=69000-video=700000-154080976.ts?hls_minimum_fragment_length=10
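The segment durations in this playlist can be reproduced with a small calculation. The sketch below is not Origin's algorithm; it assumes a simple rule that is consistent with the example and with the 0.5 to 1.5 times bound mentioned earlier: the shorter part of the spliced segment is merged into its neighbor and the longer part stays a segment of its own. The function name splice_durations is hypothetical, and times are counted from the start of the first segment passed in.

```python
def splice_durations(nominal: float, count: int, splice_at: float) -> list:
    """Durations of `count` segments of `nominal` seconds after one splice.

    Assumed rule: merge the shorter part of the spliced segment with the
    adjacent segment on that side, keep the longer part as its own segment.
    """
    durations = [nominal] * count
    index = int(splice_at // nominal)
    if index >= count:
        raise ValueError("splice point lies beyond the given segments")
    left = splice_at - index * nominal   # part before the splice point
    right = nominal - left               # part after the splice point
    if left == 0:
        return durations                 # already on a segment boundary
    if left <= right:
        if index == 0:
            raise ValueError("no previous segment to append to")
        durations[index - 1] += left
        durations[index] = right
    else:
        if index + 1 >= count:
            raise ValueError("no next segment to prepend to")
        durations[index] = left
        durations[index + 1] += right
    return durations


# Four 10-second segments starting at the cue-out, cue-in 24 seconds later.
result = splice_durations(10, 4, 24)
```

With these inputs the third segment is split 4 + 6: the 4-second part is appended to the second segment and the 6-second remainder stands alone, giving the 10, 14, 6, 10 pattern of the playlist above.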

MPEG DASH signaling of SCTE 35

For an MPEG DASH client manifest, Origin does not create a new period for cue-out events, but signals these events in the same period as the main event. The advantage of this is that the presentation remains backwards compatible with any device previously capable of playing this stream (provided the player on such a device silently ignores the presence of the EventStream element).

The SCTE 35 markers are carried in DASH Event Messages. The @presentationTime is relative to the Period@start, and the @messageData contains a base64 encoded representation of the message_data field in the emsg.

The below example is taken from MPEG DASH - Pure live (SCTE 35). In this livestream the cue-out markers are exactly aligned with the media segment boundaries so that no media segments are spliced.

<EventStream
  schemeIdUri="urn:scte:scte35:2014:xml+bin"
  timescale="1">
  <Event
    presentationTime="1540809120"
    duration="24"
    id="1999">
    <Signal xmlns="http://www.scte.org/schemas/35/2016">
      <Binary>/DAhAAAAAAAAAP/wEAUAAAfPf+9/fgAg9YDAAAAAAAA/APOv</Binary>
    </Signal>
  </Event>
  <Event
    presentationTime="1540809240"
    duration="24"
    id="2000">
    <Signal xmlns="http://www.scte.org/schemas/35/2016">
      <Binary>/DAhAAAAAAAAAP/wEAUAAAfQf+9/fgAg9YDAAAAAAAA2Z7lO</Binary>
    </Signal>
  </Event>
</EventStream>
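The Binary element carries the complete SCTE 35 section in base64. The sketch below decodes the fields this document relies on, following the splice_info_section() and splice_insert() layouts of SCTE 35. It is a minimal illustration under stated limits: parse_splice_insert is a hypothetical name, only unencrypted, program-level splice_insert() messages are handled, and the CRC is not verified.

```python
import base64

TICKS_PER_SECOND = 90_000  # SCTE 35 times are expressed in 90 kHz ticks


def parse_splice_insert(section: bytes) -> dict:
    """Decode selected splice_insert() fields from a binary SCTE 35 section."""
    if section[0] != 0xFC:
        raise ValueError("not a splice_info_section (table_id must be 0xFC)")
    if section[4] & 0x80:
        raise ValueError("encrypted sections are not handled in this sketch")
    if section[13] != 0x05:
        raise ValueError("splice_command_type is not splice_insert (0x05)")

    pos = 14
    result = {"splice_event_id": int.from_bytes(section[pos:pos + 4], "big")}
    pos += 4
    result["cancelled"] = bool(section[pos] & 0x80)
    pos += 1
    if result["cancelled"]:
        return result

    flags = section[pos]
    pos += 1
    if not flags & 0x40:  # program_splice_flag
        raise ValueError("component-level splices are not handled here")
    result["out_of_network_indicator"] = bool(flags & 0x80)
    has_duration = bool(flags & 0x20)
    immediate = bool(flags & 0x10)

    result["pts_time"] = None
    if not immediate:
        if section[pos] & 0x80:  # time_specified_flag
            raw = int.from_bytes(section[pos:pos + 5], "big")
            result["pts_time"] = raw & 0x1FFFFFFFF  # low 33 bits
            pos += 5
        else:
            pos += 1

    result["break_duration"] = None
    if has_duration:
        raw = int.from_bytes(section[pos:pos + 5], "big")
        result["auto_return"] = bool(raw >> 39)
        result["break_duration"] = (raw & 0x1FFFFFFFF) / TICKS_PER_SECOND
    return result


# Binary payload of the first Event in the example above.
binary = "/DAhAAAAAAAAAP/wEAUAAAfPf+9/fgAg9YDAAAAAAAA/APOv"
cue = parse_splice_insert(base64.b64decode(binary))
```

For this payload the decoded splice_event_id is 1999 and the break duration is 24 seconds, which matches the id and duration attributes of the Event element that carries it.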

SCTE 35 based spliced media in DASH (index based)

Attention

At the moment, you must use --splice_media in combination with --mpd.minimum_fragment_length if you want to use DASH as an output format in combination with SCTE 35 markers that signal cue-out events within a segment (which will be the case in any real-life use case). This is because splicing MPEG-DASH media segments is currently only supported with number based indexing of segments (which is automatically enabled when using --mpd.minimum_fragment_length).

In the example MPEG DASH - Pure live (SCTE 35, spliced) livestream, the AdaptationSets in the MPD contain media segments with a duration of 8 seconds, because --mpd.minimum_fragment_length=8 is specified (the ingested fragments have a length of 1 second and are concatenated into segments with the 8-second length that is specified using the --mpd.minimum_fragment_length option).

As the cue-out events signaled by the SCTE 35 markers in this stream do not align with the stream’s media segment boundaries, the media segments containing the timestamps that represent the start and end of a cue-out event need to be spliced and Origin will append or prepend part of the spliced media segment to the previous or next media segment.

In this example, that results in the media segments before and after a splice being between 4 and 12 seconds long, instead of the stream’s original segment duration of 8 seconds. However, because the SegmentTemplate does not signal individual segments when $Number$ is used to index the segments, the contents of the SegmentTemplate element are no different than if splicing had not been enabled in the server manifest of the publishing point.

Insertion opportunities for MPEG DASH

When a third party service is used to insert content based on the cue-out events in the MPEG DASH client manifest that is generated by Origin, this service can create a multi-period presentation based on the original client manifest. When doing so, it should insert a new period at both the start and the end of the cue-out event.

The second of the new periods will represent a return to the main event, while the first will represent the content that is inserted (e.g., an ad). The timing information required for this can be calculated based on the timestamps in the EventStream element of the MPD, where the start of the event is represented by the @presentationTime and the end of the event equals the sum of @presentationTime and @duration.
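The period boundaries described above can be derived directly from an EventStream element. The following is a minimal sketch of that calculation, not a description of any particular third party service: break_windows is a hypothetical name, and the period_start parameter is an assumption used to place the event times on the presentation timeline.

```python
import xml.etree.ElementTree as ET


def break_windows(event_stream_xml: str, period_start: float = 0.0) -> list:
    """Return (id, start, end) in seconds for each Event in an EventStream."""
    root = ET.fromstring(event_stream_xml)
    timescale = int(root.get("timescale", "1"))
    windows = []
    for element in root.iter():
        # Compare local names so the sketch works with or without namespaces.
        if element.tag.rsplit("}", 1)[-1] != "Event":
            continue
        start = int(element.get("presentationTime", "0"))
        duration = int(element.get("duration", "0"))
        windows.append(
            (
                element.get("id"),
                period_start + start / timescale,
                period_start + (start + duration) / timescale,
            )
        )
    return windows


example = """
<EventStream schemeIdUri="urn:scte:scte35:2014:xml+bin" timescale="1">
  <Event presentationTime="1540809120" duration="24" id="1999"/>
  <Event presentationTime="1540809240" duration="24" id="2000"/>
</EventStream>
"""
windows = break_windows(example)
```

Each returned start marks where the period with the inserted content would begin, and each end marks where the period that returns to the main event would begin.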

Note

When the original livestream contains ads and SCTE 35 markers are inserted to allow for these ads to be replaced, the markers also have a use when the ads are not yet replaced, because a device’s player can be configured to use the events signaled in EventStream to fire beacons to gather metrics for the ads in the original livestream.