This document is licensed under a Creative Commons Attribution 3.0 License.
This specification defines a standard mechanism a user agent should use to expose the tracks in an MPEG-2 TS media container so that the tracks can be identified in a common way.
This document is merely a public working draft of a potential specification. It has no official standing of any kind and does not represent the support or consensus of any standards organisation.
HTML5 UAs [HTML5] may play back MPEG-2 TS media resources that contain a multiplex of video, audio, text and private data elementary streams. Television program providers and distributors use these streams to deliver services associated with the primary video and audio in the multiplexed stream. These services are collectively termed "TV Services". A consistent HTML presentation of these TV service tracks by UAs is essential so that a script can understand the specific type of service and interpret the track data, independent of the media resource provider. This specification defines requirements for how these MPEG-2 TS elementary streams should be translated by the HTML5 user agent into the equivalent HTML5 video, audio and text track elements.
Note that the Web page providing the user interface (e.g. program guide) is often not provided by the originator of the program content. For example, the guide may be provided by the television manufacturer or the cable or satellite TV provider, while the multiplexed streams are provided by hundreds of independent television program providers. Therefore, the Web page has no a priori knowledge of which streams are in the programs at any given time.
This specification defines the requirements for an HTML5 user agent to recognize and make the MPEG-2 TS program streams available to Web content in a consistent way that is independent of the program provider. Example TV Services are:
| TV Service | Description |
| --- | --- |
| Closed Captioning | Textual representation of the media resource audio dialogue intended for the hearing impaired. |
| Subtitles | Alternate language textual representation of the media resource audio dialogue. |
| Content Advisories | Content rating information used by parental control applications. |
| Synchronized Content | Signaling messages to control the execution of a client application in a manner synchronized with the media resource playback. |
| Client ad insertion | Signaling messages that convey advertisement insertion opportunities to a client application. |
| Audio translations | Alternate language representation of the primary audio track. |
| Audio descriptions | Audio descriptions of the video intended for the visually impaired. |
The requirements in this specification apply only to single program MPEG-2 transport streams; multi-program MPEG-2 transport streams are out of scope. Other specifications may define equivalent requirements for other media transport and container formats, such as the MPEG-4 base media file format and MPEG DASH. The following sections define requirements for how the user agent must recognize MPEG-2 TS video, audio and other data tracks and how the HTML5 elements representing those tracks must be created.
HTML5 VideoTrack, AudioTrack and TextTrack must be created as defined in [HTML5].
HTML5 VideoTrack, AudioTrack and TextTrack elements have additional attributes, beyond those referenced in this specification, that should be set by the user agent consistent with user preferences.
A user agent may be presented with previously processed in-band TextTracks, for example, when the viewer seeks back in the media resource, as controlled by the seekable time ranges attribute of the HTMLMediaElement. TextTrackCues are not removed from the TextTrack so the user agent must not create duplicate TextTrackCues in this case. How the user agent accomplishes this is implementation specific.
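The specification leaves the duplicate-avoidance mechanism to the implementation. One possible approach, shown here only as a minimal sketch, is to remember a key for every cue already stored and skip any cue whose key has been seen. The `CueLike` shape, the `CueStore` class and the choice of key (start time, end time and content) are all assumptions made for illustration; they are not defined by this specification or by [HTML5].

```typescript
// Hypothetical minimal cue shape; real TextTrackCue objects carry more fields.
interface CueLike {
  startTime: number;
  endTime: number;
  payload: string; // assumption: cue content reduced to a comparable string
}

// Build a key that identifies a cue by its timing and content.
function cueKey(cue: CueLike): string {
  return `${cue.startTime}|${cue.endTime}|${cue.payload}`;
}

class CueStore {
  private readonly seen = new Set<string>();
  readonly cues: CueLike[] = [];

  // Adds the cue unless an identical one was already stored.
  // Returns true when the cue was added, false when it was a duplicate.
  add(cue: CueLike): boolean {
    const key = cueKey(cue);
    if (this.seen.has(key)) {
      return false;
    }
    this.seen.add(key);
    this.cues.push(cue);
    return true;
  }
}
```

With this sketch, data that is processed a second time after a seek back yields cues with the same key, so `add` returns `false` and the stored list is unchanged.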
Recognition of specific types of video, audio and text tracks will, in general, be dependent on geographical region or service or content provider. In order that UA implementations are independent of region and provider, it is desirable that the UA recognize tracks in a generic manner and rely on a script to implement region, provider and application specific recognition of tracks. This requires script access to the MPEG program description in the program map table (PMT) as defined in [H.222.0] so that the tracks can be used correctly.
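To illustrate what script-side recognition could look like, the following sketch walks the elementary stream loop of a PMT section laid out as in [H.222.0]. It assumes the script has already obtained the raw bytes of one complete PMT section as a `Uint8Array`; how those bytes reach the script is not shown. The names `PmtStream` and `parsePmtStreams` are hypothetical, and the sketch does not verify the CRC_32 field.

```typescript
// One entry of the PMT elementary stream loop.
interface PmtStream {
  streamType: number;      // stream_type field
  pid: number;             // elementary_PID field (13 bits)
  descriptors: Uint8Array; // raw ES_info descriptor bytes
}

// Parses the elementary stream loop of a single PMT section.
// Assumption: `section` starts at table_id and ends with CRC_32.
function parsePmtStreams(section: Uint8Array): PmtStream[] {
  // 12 header bytes plus 4 CRC bytes is the smallest possible section.
  if (section.length < 16 || section[0] !== 0x02) {
    throw new Error("not a PMT section");
  }
  // section_length counts the bytes that follow it, including CRC_32.
  const sectionLength = ((section[1] & 0x0f) << 8) | section[2];
  const end = 3 + sectionLength - 4; // stop before CRC_32
  if (end > section.length) {
    throw new Error("truncated PMT section");
  }
  const programInfoLength = ((section[10] & 0x0f) << 8) | section[11];
  let pos = 12 + programInfoLength; // skip program-level descriptors
  const streams: PmtStream[] = [];
  while (pos + 5 <= end) {
    const streamType = section[pos];
    const pid = ((section[pos + 1] & 0x1f) << 8) | section[pos + 2];
    const esInfoLength = ((section[pos + 3] & 0x0f) << 8) | section[pos + 4];
    const descStart = pos + 5;
    const descEnd = descStart + esInfoLength;
    if (descEnd > end) {
      throw new Error("descriptor loop overruns section");
    }
    streams.push({
      streamType,
      pid,
      descriptors: section.subarray(descStart, descEnd),
    });
    pos = descEnd;
  }
  return streams;
}
```

The returned entries keep the order in which the streams appear in the PMT, which is the order a script would need in order to match them against the track lists.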
The UA must create a TextTrack in the media resource TextTrackList to make the PMT available to a script and set the TextTrack attributes using the following rules:
For each PMT received in the program stream, the UA must create a TextTrackCue only if the PMT differs from the PMT represented by the previously created TextTrackCue. This reflects the fact that the PMT is repeated in the stream at least once every 140 ms, while its content changes far less often.
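The change check can be sketched as a byte-wise comparison against the last PMT for which a cue was created. This is only one possible approach; the class name `PmtChangeDetector` and the decision to compare whole sections are assumptions made for illustration.

```typescript
// Returns true when two byte sequences are identical.
function sameBytes(a: Uint8Array, b: Uint8Array): boolean {
  if (a.length !== b.length) {
    return false;
  }
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) {
      return false;
    }
  }
  return true;
}

class PmtChangeDetector {
  // The PMT section represented by the most recently created cue.
  private last: Uint8Array | null = null;

  // Returns true if a new cue should be created for this PMT section.
  shouldCreateCue(section: Uint8Array): boolean {
    if (this.last !== null && sameBytes(this.last, section)) {
      return false; // same PMT as the previous cue: no new cue
    }
    this.last = section.slice(); // keep a private copy for later comparison
    return true;
  }
}
```

An implementation might instead compare only the version_number field of the PMT section, which is cheaper; comparing the full section is used here because it needs no knowledge of the section layout.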
For each new PMT, a UA must create a new TextTrackCue in the TextTrack as described in [HTML5] section "Text track model" with attributes set as follows:
Other media container formats, Ogg and WebM for example, can contain multiple tracks along with metadata describing those tracks. The track-description text track would be useful for making that metadata available to script.
For all MPEG-2 video stream types supported by the UA, the UA must create a new VideoTrack in the VideoTrackList of the media resource.
The HTML5 specification [HTML5] requires that the VideoTrackList must use the order defined by the media resource. VideoTracks must appear in the VideoTrackList in the same order as they appear in the PMT as defined in [H.222.0].
The UA must set the VideoTrack label attribute to a text string representing the packet ID (PID) of the equivalent MPEG-2 program stream. NOTE: Using the label to carry the media resource track identifier works for MPEG-2 TS because there is no specified way for the label to be set by the media resource. Other container formats do provide a value to be used for the label. It would be preferable for a new IDL attribute to be defined for the express purpose of specifying the media resource track identifier.
The UA must set the kind attribute of the first VideoTrack in the VideoTrackList (VideoTrackList[0]) to "main".
For all other VideoTrackList entries, the UA should set the VideoTrack kind attribute if it can determine the correct value.
The UA must set VideoTrack.language to the value of the ISO_639_language_code field [ISO639.2] in the ISO 639 descriptor, if present, associated with the video stream type in the PMT. If the UA cannot determine the VideoTrack kind and language attributes it must set them to the empty string.
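The video track rules above can be summarised in a short sketch that maps PMT entries to plain track descriptions. The types `PmtEntry` and `VideoTrackInfo`, the set of supported stream types and the decimal formatting of the PID are assumptions for illustration; the specification only requires "a text string representing the PID" and leaves the supported stream types to the UA. Determining a kind for non-first tracks is omitted, so those tracks fall back to the empty string.

```typescript
// Hypothetical description of one PMT stream entry.
interface PmtEntry {
  streamType: number;
  pid: number;
  language?: string; // ISO_639_language_code, if a descriptor was present
}

// Plain-data stand-in for the VideoTrack attributes discussed above.
interface VideoTrackInfo {
  label: string;
  kind: string;
  language: string;
}

// Assumption: this hypothetical UA decodes MPEG-1 video (0x01),
// MPEG-2 video (0x02) and H.264 (0x1b).
const SUPPORTED_VIDEO_TYPES = new Set<number>([0x01, 0x02, 0x1b]);

// Builds track descriptions in PMT order.
function buildVideoTracks(entries: PmtEntry[]): VideoTrackInfo[] {
  return entries
    .filter((e) => SUPPORTED_VIDEO_TYPES.has(e.streamType))
    .map((e, index) => ({
      label: String(e.pid),            // PID as a decimal string (assumption)
      kind: index === 0 ? "main" : "", // first video track is "main"
      language: e.language ?? "",      // empty string when unknown
    }));
}
```

Because `filter` and `map` preserve array order, the resulting list follows the order of the video streams in the PMT.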
For all MPEG-2 audio stream types supported by the user agent, the user agent must create a new AudioTrack in the AudioTrackList of the media resource.
The HTML5 specification [HTML5] requires that the AudioTrackList must use the order defined by the media resource. AudioTracks must appear in the AudioTrackList in the same order as they appear in the PMT as defined in [H.222.0].
For each AudioTrack created the UA must set the label attribute to a text string representing the PID of the equivalent MPEG-2 program stream.
The UA must set the kind attribute of the first AudioTrack in the AudioTrackList (AudioTrackList[0]) to "main".
For all other AudioTrackList entries, the UA should set the AudioTrack kind attribute if it can determine the correct value.
The UA must set AudioTrack.language to the value of the ISO_639_language_code field in the ISO 639 descriptor, if present, associated with the audio stream type in the PMT. If the UA cannot determine the AudioTrack kind and language attributes it must set them to the empty string.
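As a sketch of the language lookup, the function below scans the raw descriptor bytes of one PMT stream entry for an ISO 639 language descriptor (descriptor tag 0x0A in [H.222.0]) and returns its first three-character language code, or the empty string if none is found. The function name and the assumption that the descriptor bytes are available as a `Uint8Array` are illustrative only.

```typescript
// Descriptor tag of the ISO_639_language_descriptor.
const ISO_639_LANGUAGE_DESCRIPTOR_TAG = 0x0a;

// Returns the first ISO_639_language_code found in a descriptor loop,
// or "" when no ISO 639 language descriptor is present.
function languageFromDescriptors(descriptors: Uint8Array): string {
  let pos = 0;
  // Each descriptor is: tag (1 byte), length (1 byte), body (length bytes).
  while (pos + 2 <= descriptors.length) {
    const tag = descriptors[pos];
    const length = descriptors[pos + 1];
    const bodyStart = pos + 2;
    const bodyEnd = bodyStart + length;
    if (bodyEnd > descriptors.length) {
      break; // malformed loop: stop scanning
    }
    if (tag === ISO_639_LANGUAGE_DESCRIPTOR_TAG && length >= 3) {
      // The body starts with a three-byte language code.
      return String.fromCharCode(
        descriptors[bodyStart],
        descriptors[bodyStart + 1],
        descriptors[bodyStart + 2],
      );
    }
    pos = bodyEnd;
  }
  return "";
}
```

Returning the empty string for a missing or malformed descriptor matches the fallback required above when the language cannot be determined.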
For all MPEG-2 stream types that are not audio or video stream types, the UA must create a new TextTrack in the TextTrackList of the media resource.
The HTML5 specification [HTML5] requires that the TextTrackList must use the order defined by the media resource. TextTracks must appear in the TextTrackList in the same order as they appear in the PMT as defined in [H.222.0].
The UA should set the TextTrack kind attribute to one of the categories defined in [HTML5].
If the UA cannot determine the TextTrack kind attribute it must set it to "metadata".
If the UA sets the TextTrack.kind attribute to one of the categories defined in [HTML5], it should set the TextTrack language attribute if it can determine the appropriate value. If the UA cannot determine the TextTrack language attribute it must set it to the empty string.
The UA must:
The MPEG-2 TS packets with the PID corresponding to the TextTrack contain either PES packets or private data packets as defined in [H.222.0]. For each PES or private data packet in the program stream represented by the TextTrack, the UA must create a TextTrackCue in the TextTrack as described in [HTML5] section "Text track model" with attributes set as follows:
It is important to note that the above metadata TextTrack and TextTrackCue creation requirements, while minimizing region, provider and application knowledge in the UA, make the semantics of the metadata TextTrack and its TextTrackCues opaque to the UA. For example, if the UA does not recognize subtitle tracks but creates a generic metadata text track as defined above, the user agent behavior defined in [HTML5] for subtitle tracks will not occur, since the UA is not aware that this is a subtitle track. It is up to a script to identify the subtitle track and process the subtitle messages in the TextTrackCues in a manner appropriate for the subtitle format. One way to do this is for a script to receive the TextTrackCues, extract the subtitle messages and create a new subtitle TextTrack and TextTrackCues, in which case the UA behavior defined in [HTML5] would occur.
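The script-side approach described above might look like the following sketch. It is written against small stand-in interfaces so that it does not depend on a browser. `MetadataCue`, `SubtitleCue`, `SubtitleSink` and `republishSubtitles` are hypothetical names, the `data` field is an assumed way of exposing the cue payload, and the `decode` callback stands for whatever format-specific subtitle parser the script provides.

```typescript
// Assumed shape of a cue taken from the generic metadata TextTrack.
interface MetadataCue {
  startTime: number;
  endTime: number;
  data: Uint8Array; // assumption: raw packet payload exposed to script
}

// A decoded subtitle ready to be shown.
interface SubtitleCue {
  startTime: number;
  endTime: number;
  text: string;
}

// Destination for decoded subtitles, for example a wrapper around a
// script-created subtitle track.
interface SubtitleSink {
  addCue(cue: SubtitleCue): void;
}

// Decodes each metadata cue and forwards the result to the sink.
// `decode` returns null for payloads that carry no subtitle text.
// Returns the number of subtitle cues that were added.
function republishSubtitles(
  cues: MetadataCue[],
  decode: (data: Uint8Array) => string | null,
  sink: SubtitleSink,
): number {
  let added = 0;
  for (const cue of cues) {
    const text = decode(cue.data);
    if (text === null) {
      continue; // not a subtitle message: skip it
    }
    sink.addCue({ startTime: cue.startTime, endTime: cue.endTime, text });
    added++;
  }
  return added;
}
```

In a browser, the sink could wrap a track created with `addTextTrack("subtitles", ...)` on the media element and add `VTTCue` objects to it, so that the subtitle behavior defined in [HTML5] applies to the new track.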
Closed captioning is delivered as part of the MPEG-2 TS video stream and must be recognized and made available by the UA.
A user agent (UA) that recognizes closed captioning must:
Thanks are expressed by the editor to the following individuals for their input to and feedback on this specification to date (in alphabetical order).
Mukta Kar, Giuseppe Pascale, Ed Shrum, George Sarosi, Clarke Stevens, Mark Vickers and Eric Winkelman.
No informative references.