MPTF/ADR Minimal Control Model Proposal

From Web and TV IG

Brief Definition

Minimal Control - In this model, the manifest file and heuristics are handled by the user agent. A few parameters or interfaces may be implemented to provide hints to the adaptive bit rate process, but the user agent remains in control. This model has the advantage of a simple interface, at the expense of a limited ability to modify the adaptation behavior.

This proposal is responsive to LC Bug #13625 and LC Bug #12399.

Use Cases

The use cases below are categorized within different overall concerns, in order of descending priority.

Reporting

  • RP1 - An application developer would like to collect data on CDN (Content Delivery Network) health and performance by getting playback statistics from widely deployed players.
  • RP2 - An application developer would like to collect data on the quality of a user's experience by monitoring playback statistics and script callbacks (for quality changes).

Errors

  • ER1 - An application developer would like to provide appropriate responses to error conditions specific to adaptive playback.

Control

  • CT1 - An application developer would like to limit switches to, and playback of, high bandwidth streams.
  • CT2 - An application developer would like to enable an end-user to select only an "HD" level (or other high quality) stream.
  • CT3 - An application developer would like to enable an end-user to select a lower-quality stream to compensate for "hunting" in the heuristics.

Control Parameters

These parameters support use cases CT1, CT2, and CT3.

  • maximumBandwidth (input) - This parameter specifies the maximum bandwidth usage for downloading adaptive streaming content. The input is in bits/second.
    • [MW] I propose to add "The user agent should ensure that the network bandwidth used, averaged over the lifetime of the media element, does not exceed the specified value."
  • minimumBandwidth (input) - This parameter specifies the minimum throughput above which the UA should optimize for continuous playback and below which it should optimize for quality, even if this implies buffering. The input is in bits/second.
    • [MW] I propose to add "The user agent should optimise for quality when the average throughput falls below the specified value over a user-agent dependent timescale, recommended to be at least one minute."
  • startingPlaybackHint (input) - This parameter gives the UA a hint about the preferred startup optimization, whether it be for fast video startup or for high quality.
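
As an illustration of how a page might set these hints, here is a minimal TypeScript sketch. The `AdaptiveHints` interface, the `applyControlHints` helper, and the `"fastStartup"` / `"highQuality"` hint values are assumptions made for this example; the proposal names the three parameters but does not define their types or allowed values. Rejecting a minimum above the maximum is likewise an assumed policy, not something the proposal requires.

```typescript
// Assumed values for startingPlaybackHint; the proposal does not list them.
type StartupHint = "fastStartup" | "highQuality";

// Hypothetical shape for the three proposed control parameters.
interface AdaptiveHints {
  maximumBandwidth?: number; // bits/second
  minimumBandwidth?: number; // bits/second
  startingPlaybackHint?: StartupHint;
}

// Copies the given hints onto `target`, an object standing in for the media
// element. Throws on negative or non-finite rates, and (as an assumed sanity
// check) when the minimum is greater than the maximum.
function applyControlHints(target: AdaptiveHints, hints: AdaptiveHints): AdaptiveHints {
  const max = hints.maximumBandwidth;
  const min = hints.minimumBandwidth;
  for (const rate of [max, min]) {
    if (rate !== undefined && (!isFinite(rate) || rate < 0)) {
      throw new RangeError("bandwidth hints must be non-negative bits/second");
    }
  }
  if (max !== undefined && min !== undefined && min > max) {
    throw new RangeError("minimumBandwidth must not exceed maximumBandwidth");
  }
  if (max !== undefined) target.maximumBandwidth = max;
  if (min !== undefined) target.minimumBandwidth = min;
  if (hints.startingPlaybackHint !== undefined) {
    target.startingPlaybackHint = hints.startingPlaybackHint;
  }
  return target;
}

// CT1: cap usage at 2 Mbit/s and ask for a fast start.
const videoHints: AdaptiveHints = {};
applyControlHints(videoHints, { maximumBandwidth: 2000000, startingPlaybackHint: "fastStartup" });
```

Because these are hints, the user agent would still make the actual adaptation decisions; the helper only validates and records what the application asked for.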

Error Codes

These codes are proposed as HTML error codes to be added to the current error code list.

This supports use case ER1.

  • MEDIA_ERR_ADB_MALFORMED_SEGMENT - Indicates that a media segment was malformed and could not be rendered by the UA.
  • MEDIA_ERR_ADB_MISSING_SEGMENT - Indicates that the UA received a 404 when attempting to download a media segment.
  • MEDIA_ERR_ADB_MALFORMED_MANIFEST_FILE - Indicates that the manifest file was malformed and could not be parsed by the UA.
  • MEDIA_ERR_ADB_MISSING_MANIFEST_FILE - Indicates that a manifest could not be downloaded. In HLS particularly, the variant streams use "nested" M3U8 playlists, so a missing nested playlist would not cause a 404 for the @src itself.
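
To make use case ER1 concrete, the following TypeScript sketch shows one way an application might choose a response for each proposed code. The proposal does not assign numeric values, so the sketch keys on the names only. The `ErrorAction` type, the `chooseAction` helper, and the retry policy are hypothetical and only illustrate the kind of decision these codes would enable.

```typescript
// The four proposed error names (no numeric values are assumed).
type AdaptiveErrorName =
  | "MEDIA_ERR_ADB_MALFORMED_SEGMENT"
  | "MEDIA_ERR_ADB_MISSING_SEGMENT"
  | "MEDIA_ERR_ADB_MALFORMED_MANIFEST_FILE"
  | "MEDIA_ERR_ADB_MISSING_MANIFEST_FILE";

// Hypothetical application-level responses; not part of the proposal.
type ErrorAction = "reloadSource" | "showFatalError";

// Assumed policy: a missing resource may be transient, so reload the source
// while retries remain. A malformed resource is unlikely to improve on a
// retry, so report it to the user straight away.
function chooseAction(error: AdaptiveErrorName, retriesLeft: number): ErrorAction {
  const missing =
    error === "MEDIA_ERR_ADB_MISSING_SEGMENT" ||
    error === "MEDIA_ERR_ADB_MISSING_MANIFEST_FILE";
  return missing && retriesLeft > 0 ? "reloadSource" : "showFatalError";
}
```

Without codes of this kind, the application could not tell a missing resource from a malformed one, and so could not pick between the two responses.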

Feedback

This section describes how information is communicated back from the user agent to the client application. This approach uses callbacks to send asynchronous feedback to the client application. The suggested feedback values, obtained through <video> element accessors, are listed below.

These properties support use cases RP1 and RP2.

  • downloadRate - The "instantaneous" server-client bandwidth, expressed in bits/second as a moving average over a specific period (suggested recommendation of one minute or the duration of the content if the duration is less than a minute).
    • [MW] I don't believe this is useful. As discussed, if we are going to provide the script with the information needed to make real-time decisions, this should be done based on experience and detailed analysis, and it leads naturally to Model 2 and Model 3 solutions.
  • representationID - An identifier for the current stream (separately for each track) where the id is taken from the manifest in a way specific to the adaptive streaming system.
  • droppedFrames - The total number of frames dropped for this playback session.
  • decodedFrames - The total number of frames decoded for this playback session.
  • bufferLength - The current length of buffered video (in media time). This is the current "amount" of video presently in the UA's playback buffer.
    • [MW] It remains unclear to me how this is different from the existing timeranges attribute.
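
The moving average described for downloadRate, and a simple quality figure derived from the frame counters, can be sketched in TypeScript as follows. The `DownloadSample` and `FrameCounts` types and both helper functions are hypothetical. The sketch also assumes that droppedFrames is counted separately from decodedFrames, which the proposal does not state.

```typescript
// One download observation: `bits` received in an interval ending at `time`.
interface DownloadSample {
  time: number; // seconds since playback started
  bits: number;
}

// Moving average in bits/second over the last `windowSeconds`, shortened to
// the content duration when the content is shorter than the window.
function movingDownloadRate(
  samples: DownloadSample[],
  now: number,
  contentDuration: number,
  windowSeconds: number = 60
): number {
  const span = Math.min(windowSeconds, contentDuration);
  if (span <= 0) return 0;
  let bits = 0;
  for (const s of samples) {
    // Count only samples that fall inside the averaging window.
    if (s.time > now - span && s.time <= now) bits += s.bits;
  }
  return bits / span;
}

// The two proposed frame counters for a playback session.
interface FrameCounts {
  droppedFrames: number;
  decodedFrames: number;
}

// Share of frames that were dropped, under the assumption stated above
// that dropped frames are not included in decodedFrames.
function droppedFrameRatio(counts: FrameCounts): number {
  const total = counts.droppedFrames + counts.decodedFrames;
  return total === 0 ? 0 : counts.droppedFrames / total;
}
```

An application addressing RP1 or RP2 could sample values like these periodically and send them to its own reporting endpoint.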

JavaScript APIs

Script Callbacks

These script callbacks support use cases RP1 and RP2.

representationChanged(representationIdentifier) - This is called when the UA has selected a new representation. The callback parameter is an identifier for the now-playing representation.

  • [MW] I propose to change "has selected" to "begins rendering".

representationGroupChanged(position) - This is called when the group of representations has changed and representations are either added or removed (for example when new or modified quality levels or representations are introduced in the manifest). The position indicates when the change shall occur.

  • [MW] I don't understand what this means or how it differs from the change in available tracks discussed at the HTML f2f meeting.
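
A minimal TypeScript sketch of how an application might use representationChanged for use case RP2 follows. The proposal does not say how callbacks are registered, so a settable property on the element is assumed here. The `AdaptiveCallbacks` interface, the `trackQualityChanges` helper, and the representation identifiers are hypothetical.

```typescript
// Hypothetical stand-in for a media element exposing the proposed callback.
interface AdaptiveCallbacks {
  representationChanged?: (representationIdentifier: string) => void;
}

// One recorded quality change.
interface QualityChange {
  representation: string;
  at: number; // timestamp supplied by `clock`
}

// Installs a representationChanged callback that records every change, and
// returns the live log so the application can report it later.
function trackQualityChanges(
  target: AdaptiveCallbacks,
  clock: () => number = Date.now
): QualityChange[] {
  const log: QualityChange[] = [];
  target.representationChanged = (id: string) => {
    log.push({ representation: id, at: clock() });
  };
  return log;
}

// Simulate the user agent switching representations twice.
const player: AdaptiveCallbacks = {};
const qualityLog = trackQualityChanges(player, () => 0);
if (player.representationChanged) {
  player.representationChanged("video-720p");
  player.representationChanged("video-1080p");
}
```

Counting the entries in such a log over a session gives a rough measure of how often the heuristics switch, which relates to the "hunting" mentioned in CT3.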

References To Existing Implementations

The following section provides examples of how other systems manage adaptive bit rate media.

Flash

Flash video playback APIs provide similar capabilities to the proposed HTML5 additions:

  1. NetStreamInfo class [1]
    1. droppedFrames : Number
      1. [read-only] Returns the number of video frames dropped in the current NetStream playback session.
    2. maxBytesPerSecond : Number
      1. [read-only] Specifies the maximum rate at which the NetStream buffer is filled in bytes per second.
    3. playbackBytesPerSecond : Number
      1. [read-only] Returns the stream playback rate in bytes per second.
  2. NetStream class [2]
    1. bufferLength : Number
      1. [read-only] The number of seconds of data currently in the buffer.
    2. client : Object
      1. Specifies the object on which callback methods are invoked to handle streaming or F4V/FLV file data.
      2. An event info code of NetStream.Play.TransitionComplete is used to indicate bitrate or representation changes.
      3. Further details: [3]
  3. Maximum & Minimum Bandwidth and Startup Selection
    1. This is typically handled by application code and depends on the heuristics applied on top of the native APIs.
    2. Most video frameworks (such as Open Source Media Framework) abstract this into APIs that enable:
      1. the ability to set a starting index or bitrate
      2. setting the min bitrate
      3. setting the max bitrate

Silverlight

The following API within the Silverlight client allows an application developer to pass in a list of "allowable" tracks from a given manifest. This makes it possible to limit the set of tracks to those between a minimum and a maximum level, or to any arbitrary set of tracks. This would be analogous to a combination of minLevel() and maxLevel() parameters.


public bool SelectTracks( IList<TrackInfo> selectedTracks, bool flushBuffer )
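
The idea of building an "allowable" track list between a minimum and a maximum level can be sketched in TypeScript as below. This is not the Silverlight API: the `Track` type and the `allowableTracks` helper are hypothetical and only show the selection step an application would perform before handing a list to a call such as SelectTracks.

```typescript
// Hypothetical track description; not the Silverlight TrackInfo type.
interface Track {
  id: string;
  bitrate: number; // bits/second
}

// Returns the tracks whose bitrate lies within [minBitrate, maxBitrate].
function allowableTracks(tracks: Track[], minBitrate: number, maxBitrate: number): Track[] {
  return tracks.filter((t) => t.bitrate >= minBitrate && t.bitrate <= maxBitrate);
}

// Example manifest with three quality levels; keep only those up to 3 Mbit/s.
const manifestTracks: Track[] = [
  { id: "low", bitrate: 500000 },
  { id: "mid", bitrate: 2000000 },
  { id: "hd", bitrate: 6000000 },
];
const selectedTracks = allowableTracks(manifestTracks, 0, 3000000);
```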


This Silverlight API notifies the application when the playback track has changed. This would be analogous to the desired "representationChanged()" callback.

public event EventHandler<TrackChangedEventArgs> PlaybackTrackChanged