Tt/TTWG Consensus Input

Consensus Input of the Task Force

The Timed Text Task Force [1] of the Web & TV IG [2] wishes to submit the following as input to the proposed revision of the Timed Text WG charter [3].

With the two independent W3C Timed Text specifications today (TTML & WebVTT) [4], Timed Text is often authored in one specification but rendered on a client supporting only the other specification. This requires a translation between the two specifications. With no defined mapping from one to the other, we risk mistranslating or dropping authored Timed Text information when it is rendered on the client. This results in a poor user experience and could also result in violations of government regulations on Timed Text display.
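The risk described above can be made concrete with a minimal sketch. It assumes a toy converter whose mapping table, function name (`ttml_to_webvtt`), and sample document are all hypothetical and chosen only for illustration; it is not a proposed mapping. The converter knows how to carry `tts:textAlign` over to the WebVTT `align` cue setting and nothing else, so any other authored attribute is dropped. It also assumes the TTML `begin` and `end` values are already clock times in the `hh:mm:ss.mmm` form that WebVTT uses.

```python
import xml.etree.ElementTree as ET

TT_NS = "http://www.w3.org/ns/ttml"
TTS_NS = "http://www.w3.org/ns/ttml#styling"

# Hypothetical, deliberately incomplete mapping: this toy converter only
# knows how to carry tts:textAlign over to the WebVTT "align" cue setting.
ALIGN_MAP = {"left": "left", "center": "center", "right": "right",
             "start": "start", "end": "end"}

SAMPLE_TTML = f"""<tt xmlns="{TT_NS}" xmlns:tts="{TTS_NS}">
  <body><div>
    <p begin="00:00:01.000" end="00:00:03.500"
       tts:textAlign="center" tts:color="yellow">Hello, world.</p>
  </div></body>
</tt>"""


def ttml_to_webvtt(ttml_text):
    """Return (webvtt_text, dropped), where dropped lists unmapped attributes."""
    root = ET.fromstring(ttml_text)
    lines = ["WEBVTT", ""]
    dropped = []
    for p in root.iter(f"{{{TT_NS}}}p"):
        settings = []
        for name, value in p.attrib.items():
            if name in ("begin", "end"):
                continue  # timing is handled below
            if name == f"{{{TTS_NS}}}textAlign" and value in ALIGN_MAP:
                settings.append(f"align:{ALIGN_MAP[value]}")
            else:
                dropped.append((name, value))  # no rule: information is lost
        # Assumes begin/end are already hh:mm:ss.mmm clock times.
        timing = f"{p.get('begin')} --> {p.get('end')}"
        lines.append(" ".join([timing] + settings))
        lines.append("".join(p.itertext()).strip())
        lines.append("")
    return "\n".join(lines), dropped


vtt, dropped = ttml_to_webvtt(SAMPLE_TTML)
print(vtt)
print("Dropped:", dropped)
```

In this sketch the authored `tts:color` ends up in the `dropped` list rather than in the output. WebVTT could express the color through cue classes and CSS, but without an agreed mapping each converter makes its own choice about whether and how to do so.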

One possible workaround is for authors to take a least-common-denominator approach and stay within a safe, translatable subset of the two specifications. But this means that end users miss out on the richer capabilities of both specifications. For example, in the US, captions could continue to be authored at a simple EIA-608 equivalent to avoid translation issues, which deprives end users of the richer experience possible today. Another issue is that supporting multiple formats creates operational burden and cost, especially when creative decisions are involved.

This is similar to dealing with multiple video formats, such as H.264 and WebM, and having to work with clients that support only one of these formats. What is different here is that timed text has richer semantics and is more difficult to translate than video.

The Web & TV IG supports the addition of WebVTT to the deliverables of the W3C Timed Text Working Group, and recommends that the W3C Timed Text Working Group investigate the following harmonization strategies:

  1. The needs of timed text consumers and timed text authors should be prioritized over the needs of specification writers and implementors.
    While the focus of W3C efforts is often on authoring Web specifications and implementing Web client code, Timed Text needs to be viewed as an end-to-end ecosystem where the Timed Text is created as part of media authoring, often outside the Web, and the Web is one of many clients. The W3C Timed Text specifications and strategies need to integrate into this larger context.
  2. The overriding goal should be to maximize the consistency between the authored Timed Text and the rendered Timed Text. In other words, the goal is to ensure that the client processor can maximize its ability to support the authorial intentions with respect to presentation and other processing semantics.
    Media content augmented by Timed Text is very vulnerable to even small mistranslations. For example, it's easy to think of movies where the loss of a single line of dialog would distort the understanding of the entire movie.
  3. The Web & TV IG recommends that the Timed Text WG work towards achieving the goal stated in (2). We outline three possible strategies towards this goal, but make no recommendation about which of them is most practical or preferred.
    1. Define fully specified mappings between the content specifications. By defining the two client specifications and how to map between the two, translation errors can be minimized or even eliminated. This could be implemented either at the authoring tool level (to generate the two formats) or at the video distribution level (to create the missing format).
    2. No translation via a single specification. The W3C could work with media standards groups on a strategy of driving a single Timed Text specification for the end-to-end media ecosystem. This could be based on TTML, WebVTT or some new merged effort. All authors and clients implementing the same specification would clearly eliminate translation errors.
    3. No translation via clients supporting both specifications. The W3C could advocate that all media clients support both client specifications, much like Web clients support both PNG and JPEG. Again, this would eliminate all translation errors.
  4. Any of the above strategies will require a minimal number of well-defined content profiles that clients must support. For example, defining a common, limited-resource content profile that works across national variants will be critical for both specifications.
  5. The Web & TV IG encourages the W3C Timed Text Working Group to engage with regional and international organizations dealing with closed caption and subtitle delivery on the Internet, especially when considering developing profiles of its specifications.
