This Wiki page is edited by participants of the HTML Accessibility Task Force. It does not necessarily represent consensus and it may have incorrect information or information that is not supported by other Task Force participants, WAI, or W3C. It may also have some very useful information.

TextFormat Comparison Overview

From HTML accessibility task force Wiki

Summary of Reviewed/Proposed Time Stamp Formats

The Media Subgroup of the HTML5 Accessibility Task Force recently reviewed two proposed candidate formats for inclusion in the HTML5 standard: TTML (Timed-Text Markup Language) and WebSRT. The following is a high-level overview of the strengths and remaining issues identified by the group. A complete comparison of both candidates, mapped against the user requirements previously identified by the subgroup, can be found at http://www.w3.org/WAI/PF/HTML/wiki/TextFormat_Mapping_to_Requirements.

Ways have been identified for both candidate formats to meet all of the identified user requirements. However, meeting some requirements may rely on dependencies outside the time-stamp format itself (e.g., incorporating a speech-markup language), or may involve solutions that approach the problem from a different or competing perspective. Consider, for example, the synchronized display of cue text and media data: TTML can specify cues at frame intervals or in absolute time (ms), whereas WebSRT is synchronized to media time through the Web browser. Whether this has any practical implication has not yet been fully determined.
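To illustrate the difference in timing models, here is a minimal cue sketched in each format. The timings, text, and TTML namespace here are illustrative, not taken from the comparison above; consult each specification for the authoritative syntax. The first is an SRT-style cue, the syntax WebSRT builds on:

```
1
00:00:05,000 --> 00:00:09,500
Susan is eating a croissant.
```

The second is a TTML paragraph using begin/end attributes in clock time; TTML can alternatively express times in frames when a frame rate is declared:

```xml
<tt xmlns="http://www.w3.org/ns/ttml">
  <body>
    <div>
      <!-- begin/end given in clock time; frame-based times
           are also possible when a frame rate is declared -->
      <p begin="00:00:05.000" end="00:00:09.500">Susan is eating a croissant.</p>
    </div>
  </body>
</tt>
```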

TTML (Timed-Text Markup Language): http://www.w3.org/TR/ttaf1-dfxp

Strengths:

* Current W3C Recommendation.
* Adopted by several major commercial content producers, streaming-media and internet-communication providers; integrated into current commercial tool chains as well as free authoring tools.
* Is the basis for SMPTE-TT (http://store.smpte.org/SearchResults.asp?Search=smpte-tt&Extensive_Search=Y&Submit=Search).
* A profile of TTML is in active use on the internet today for both in-band captioning (e.g., MediaRoom, MPEG-4) and out-of-band captioning (e.g., BBC iPlayer, JW FLV Player, ccPlayer, CCforFlash, the Adobe FLVPlaybackCaptioning component, and other Flash-based players).

Issues:

* The XML root of TTML is seen as a problem by some implementers and authors. Arguments include the use of XSL-FO for styling, which browsers have not implemented, and concerns about the hierarchical nature of XML, which must be flattened for inclusion in a media container format.
* The full profile contains more features than some developers feel are necessary for a basic implementation. The basic profile (transformation profile) contains fewer features and may be sufficient; however, it may be necessary to create a new profile to satisfy all user requirements.

WebSRT: http://www.whatwg.org/specs/web-apps/current-work/websrt.html

Strengths:

* Active, ongoing development of the specification allows for inclusion of missing requirements.
* Based upon the SRT format, which has been adopted by major commercial content producers, streaming-media and internet-communication providers, and has been integrated into current commercial tool chains as well as free authoring tools.
* "Is not XML."
* Methods exist for encapsulating SRT into Ogg, Matroska (the basis of WebM), and MPEG-4; WebSRT would work identically.

Issues:

* Specification is currently at an early stage outside of W3C space, with little to no current implementation other than building on top of the existing SRT format.
* May require full development by the W3C; would add another text-display format to an already large pool of existing formats.
* Work required to support mixed languages inside a cue (e.g., Susan is eating a <span lang="fr">croissant</span>).
* Work required to support file-wide metadata; in particular, identification of the type of content (subtitles, descriptions, chapters).
* Work required to support hierarchical structure navigation other than through JavaScript.
* Work required to support a means to tell the player to stop the video and wait for the end of speech synthesis.
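For comparison, the mixed-language requirement noted above is already expressible in TTML through the standard xml:lang attribute on a nested span. A sketch (the timing values are illustrative):

```xml
<p begin="00:00:05.000" end="00:00:09.500">
  Susan is eating a <span xml:lang="fr">croissant</span>.
</p>
```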