This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 11207 - Make track element additions technology neutral
Summary: Make track element additions technology neutral
Status: RESOLVED FIXED
Alias: None
Product: HTML WG
Classification: Unclassified
Component: pre-LC1 HTML5 spec (editor: Ian Hickson)
Version: unspecified
Hardware: All Windows NT
Importance: P1 normal
Target Milestone: ---
Assignee: Ian 'Hixie' Hickson
QA Contact: HTML WG Bugzilla archive list
URL:
Whiteboard:
Keywords: a11y, a11ytf, media
Depends on:
Blocks:
 
Reported: 2010-11-03 11:14 UTC by Sean Hayes
Modified: 2011-01-22 18:08 UTC
CC List: 18 users


Description Sean Hayes 2010-11-03 11:14:18 UTC
Changes required to remove webSRT specifics.

************************* 
Changes to Section 4.8.9

========================= 
Remove: 

"If the element's track URL identifies a WebSRT resource, and the element's kind attribute is not in the metadata state, then the WebSRT file must be a WebSRT file using cue text."

========================= 
Replace:
"If the element's track URL identifies a WebSRT resource, then the charset attribute may be specified. If the attribute is set, its value must be a valid character encoding name, must be an ASCII case-insensitive match for the preferred MIME name for that encoding, and must match the character encoding of the WebSRT file. [IANACHARSET]"

with:

"If the element's track URL identifies a timed text resource, then the charset attribute may be specified. If the attribute is set, its value must be a valid character encoding name, must be an ASCII case-insensitive match for the preferred MIME name for that encoding, and must match the character encoding of the referenced file. [IANACHARSET]"
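
To illustrate the three conditions on the charset attribute, here is a minimal TypeScript sketch. The lookup table is a tiny illustrative excerpt rather than the IANA registry, and the function names are hypothetical; none of this comes from the spec text itself.

```typescript
// Illustrative excerpt only: lowercased encoding names mapped to their
// preferred MIME names. A real check would consult the full IANA registry.
const PREFERRED_MIME_NAME: Record<string, string> = {
  "utf-8": "UTF-8",
  "iso-8859-1": "ISO-8859-1",
  "latin1": "ISO-8859-1", // a valid alias, but not the preferred MIME name
};

// ASCII case-insensitive comparison: only A-Z are folded to a-z.
function asciiLowercase(s: string): string {
  return s.replace(/[A-Z]/g, (ch) => String.fromCharCode(ch.charCodeAt(0) + 0x20));
}

// Hypothetical helper. fileEncoding is assumed to be the preferred MIME name
// of the encoding the referenced file actually uses.
function charsetAttributeIsValid(value: string, fileEncoding: string): boolean {
  const preferred = PREFERRED_MIME_NAME[asciiLowercase(value)];
  if (preferred === undefined) return false; // not a known encoding name
  if (asciiLowercase(value) !== asciiLowercase(preferred)) return false; // alias, not preferred name
  return asciiLowercase(preferred) === asciiLowercase(fileEncoding); // must match the file
}
```

Under these assumptions, charset="utf-8" passes for a UTF-8 file, while charset="latin1" fails even for an ISO-8859-1 file because it is an alias rather than the preferred MIME name.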


************************* 
Changes to Section 4.8.10.8 


========================= 
Remove:

"(e.g., for timed tracks based on WebSRT, the rules for updating the display of WebSRT timed tracks)."


************************* 
Changes to Section 4.8.10.10 

========================= 
Replace:
"A writing direction, "

with:
"The primary writing direction, "


========================= 
Remove  (to be defined in WebSRT):
"If the writing direction is horizontal, then line position percentages are relative to the height of the video, and text position and size percentages are relative to the width of the video.

Otherwise, line position percentages are relative to the width of the video, and text position and size percentages are relative to the height of the video."

=========================
Remove:
"(e.g. the WebSRT parser if the Content Type metadata is text/srt) "

========================= 
Remove:
"(e.g., for WebSRT, the rules for updating the display of WebSRT timed tracks)."


========================= 
Remove  (to be defined in WebSRT):
"A snap-to-lines flag 
A boolean indicating whether the line's position is a line position (positioned to a multiple of the line dimensions of the first line of the cue), or whether it is a percentage of the dimension of the video."

========================= 
Remove  (to be defined in WebSRT):

"A line position 
Either a number giving the position of the lines of the cue, to be interpreted as defined by the writing direction and snap-to-lines flag of the cue, or the special value auto, which means the position is to depend on the other active tracks.

A text position 
A number giving the position of the text of the cue within each line, to be interpreted as a percentage of the video, as defined by the writing direction."

========================= 
Remove  (to be defined in WebSRT):

"An alignment 
An alignment for the text of each line of the cue, either start alignment (the text is aligned towards its start side), middle alignment (the text is aligned centered between its start and end sides), end alignment (the text is aligned towards its end side). Which sides are the start and end sides depends on the Unicode bidirectional algorithm and the writing direction. [BIDI]"

========================= 
Remove :

"(e.g., for timed tracks based on WebSRT, the rules for updating the display of WebSRT timed tracks)."

========================= 
Remove :
"(so e.g. for cues from a WebSRT file, that would be the order in which the cues were listed in the file)"


************************* 
Remove section "4.8.10.10.4 Guidelines for exposing cues in various formats as timed track cues" in its entirety.


************************* 
interface TimedTrackCue


========================= 
Replace

"[Constructor(in DOMString id, in double startTime, in double endTime, in DOMString text, in optional DOMString settings, in optional DOMString voice, in optional boolean pauseOnExit)]"

with:

[Constructor(in DOMString id, in double startTime, in double endTime, in DOMString text, in optional CueSettings settings, in optional DOMString voice, in optional boolean pauseOnExit)]


========================= 
Replace:
"
  readonly attribute DOMString direction;
  readonly attribute boolean snapToLines;
  readonly attribute long linePosition;
  readonly attribute long textPosition;
  readonly attribute long size;
  readonly attribute DOMString alignment;
"

With:

"
   readonly attribute CueSettings settings;
"

where CueSettings is to be defined as an abstract base type which can be subclassed for specific timed text formats.
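
One way to picture the proposed shape is the TypeScript sketch below. Only the name CueSettings and the six WebSRT attribute names come from the text above; the format member, the subclass names, and the describeSettings helper are assumptions made for illustration, and the TTML subclass is an empty placeholder.

```typescript
// Hypothetical sketch of CueSettings as an abstract base type.
abstract class CueSettings {
  // Assumed member: the timed text format these settings belong to.
  abstract readonly format: string;
}

// Assumed subclass holding the WebSRT-specific attributes that the proposal
// moves off TimedTrackCue.
class WebSRTCueSettings extends CueSettings {
  readonly format = "WebSRT";
  constructor(
    readonly direction: string,
    readonly snapToLines: boolean,
    readonly linePosition: number | "auto",
    readonly textPosition: number,
    readonly size: number,
    readonly alignment: "start" | "middle" | "end",
  ) {
    super();
  }
}

// Placeholder for another format; its fields are deliberately left undefined here.
class TTMLCueSettings extends CueSettings {
  readonly format = "TTML";
}

// Generic code sees only CueSettings; format-specific code narrows first.
function describeSettings(settings: CueSettings): string {
  if (settings instanceof WebSRTCueSettings) {
    return `${settings.format}: align=${settings.alignment}, size=${settings.size}`;
  }
  return `${settings.format}: no format-specific fields known to this caller`;
}
```

The design choice being sketched is that the cue interface exposes a single settings attribute, and any layout detail is reached only after checking which format-specific subclass is present.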


========================= 
Replace:
"The settings argument is a string in the format of WebSRT cue settings. If omitted, the empty string is assumed."

With:

"The settings argument is a subclass of the CueSettings type."

========================= 
Remove:

"cue . snapToLines 
Returns true if the timed track cue snap-to-lines flag is set, false otherwise.

cue . linePosition 
Returns the timed track cue line position. In the case of the value being auto, the appropriate default is returned.

cue . textPosition 
Returns the timed track cue text position.

cue . size 
Returns the timed track cue size.

cue . alignment 
Returns a string representing the timed track cue alignment, as follows:

If it is start alignment 
The string "start".

If it is middle alignment 
The string "middle".

If it is end alignment 
The string "end".

"

========================= 
Replace:

"Let cue's timed track cue text be the value of the text argument, and let the rules for its interpretation be the WebSRT cue text parsing rules, the WebSRT cue text rendering rules, and the WebSRT cue text DOM construction rules.

Let cue's timed track cue writing direction be horizontal.

Let cue's timed track cue snap-to-lines flag be true.

Let cue's timed track cue line position be auto.

Let cue's timed track cue text position be 50.

Let cue's timed track cue size be 100.

Let cue's timed track cue alignment be middle alignment.

Let input be the string given by the settings argument.

Let position be a pointer into input, initially pointing at the start of the string.

Parse the WebSRT settings for cue."

with
"Let cue's timed track cue text be the value of the text argument.

Let cue's settings be determined in a format-specific manner from the settings argument."
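
A rough TypeScript sketch of "format-specific" resolution follows. The default values are the ones listed in the replaced WebSRT text; the resolver registry, the function names, and the idea of passing overrides as an object are assumptions for illustration only.

```typescript
// Defaults taken from the replaced constructor steps (WebSRT-specific).
const WEBSRT_CUE_DEFAULTS = {
  writingDirection: "horizontal",
  snapToLines: true,
  linePosition: "auto",
  textPosition: 50,
  size: 100,
  alignment: "middle",
} as const;

// Hypothetical registry: each format supplies its own way of turning the
// settings argument into a settings object.
type SettingsResolver = (settingsArgument: unknown) => object;
const settingsResolvers = new Map<string, SettingsResolver>();

settingsResolvers.set("WebSRT", (arg) => ({
  ...WEBSRT_CUE_DEFAULTS,
  // Assumption: overrides arrive as a plain object; anything else is ignored.
  ...(typeof arg === "object" && arg !== null ? arg : {}),
}));

function resolveCueSettings(format: string, settingsArgument: unknown): object {
  const resolver = settingsResolvers.get(format);
  if (resolver === undefined) {
    throw new Error(`No settings resolver registered for format ${format}`);
  }
  return resolver(settingsArgument);
}
```

Other formats would register their own resolver without the generic constructor steps needing to change.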

========================= 
Remove:
"The snapToLines attribute must return true if the timed track cue snap-to-lines flag of the timed track cue that the TimedTrackCue object represents is set; or false otherwise.

The linePosition attribute must return the timed track cue line position of the timed track cue that the TimedTrackCue object represents, if that value is numeric. Otherwise, the value is the special value auto; if the timed track cue snap-to-lines flag of the timed track cue that the TimedTrackCue object represents is not set, the attribute must return the value 100; otherwise, it must return the value returned by the following algorithm:

Let cue be the timed track cue that the TimedTrackCue object represents.

If cue is not associated with a timed track, return -1 and abort these steps.

Let track be the timed track that the cue is associated with.

Let n be the number of timed tracks whose timed track mode is showing and that are in the media element's list of timed tracks before track.

Return n.

The textPosition attribute must return the timed track cue text position of the timed track cue that the TimedTrackCue object represents.

The size attribute must return the timed track cue size of the timed track cue that the TimedTrackCue object represents.

The alignment attribute must return the timed track cue alignment of the timed track cue that the TimedTrackCue object represents."
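
For reference, the linePosition steps quoted above can be sketched in TypeScript as follows. The type and field names are hypothetical, and the handling of a track missing from the media element's list is an assumption, since the quoted text does not cover that case.

```typescript
type SketchTrackMode = "disabled" | "hidden" | "showing";

interface SketchTrack {
  mode: SketchTrackMode;
}

interface SketchCue {
  snapToLines: boolean;
  linePosition: number | "auto";
  track: SketchTrack | null; // null when the cue is not associated with a track
}

// mediaElementTracks is the media element's list of timed tracks, in order.
function linePositionAttribute(cue: SketchCue, mediaElementTracks: SketchTrack[]): number {
  // A numeric line position is returned as-is.
  if (cue.linePosition !== "auto") return cue.linePosition;
  // Value is auto: without the snap-to-lines flag the attribute returns 100.
  if (!cue.snapToLines) return 100;
  // Cue not associated with a timed track: return -1.
  if (cue.track === null) return -1;
  const index = mediaElementTracks.indexOf(cue.track);
  // Assumption: a track absent from the list has no tracks before it.
  const before = index < 0 ? [] : mediaElementTracks.slice(0, index);
  // n = number of showing tracks that precede the cue's track.
  return before.filter((t) => t.mode === "showing").length;
}
```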

========================= 
Remove:
"(For example, for WebSRT, those rules are the WebSRT cue text parsing rules and the WebSRT cue text DOM construction rules.)"


***********************
MutableTrack

========================= 
Replace:
"Create a new timed track, and set its timed track kind to kind, its timed track label to label, its timed track language to language, its timed track readiness state to the timed track loaded state, its timed track mode to the timed track hidden mode, and its timed track list of cues to an empty list, associated with the rules for updating the display of WebSRT timed tracks as its rules for updating the timed track rendering."

with:

"Create a new timed track, and set its timed track kind to kind, its timed track label to label, its timed track language to language, its timed track readiness state to the timed track loaded state, its timed track mode to the timed track hidden mode, and its timed track list of cues to an empty list."
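
As a rough TypeScript sketch of the proposed MutableTrack wording, the function below creates a track with the listed initial state and no rendering rules attached. The interface, field names, and function name are hypothetical; only the initial values come from the text above.

```typescript
interface TimedTrackSketch {
  kind: string;
  label: string;
  language: string;
  readinessState: string;
  mode: string;
  cues: unknown[];
  // Left unset on purpose: the proposed text no longer associates the
  // WebSRT display rules with a newly created track.
  updateRendering?: () => void;
}

function createMutableTimedTrack(kind: string, label: string, language: string): TimedTrackSketch {
  return {
    kind,
    label,
    language,
    readinessState: "loaded", // timed track loaded state
    mode: "hidden",           // timed track hidden mode
    cues: [],                 // empty list of cues
  };
}
```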
Comment 1 Tab Atkins Jr. 2010-11-03 11:17:30 UTC
Why do we want to remove WebSRT specifics?  Why do we want to try and genericize the API, when there aren't currently plans to add additional timed text formats?
Comment 2 John Foliot 2010-11-03 16:38:52 UTC
(In reply to comment #1)
> Why do we want to remove WebSRT specifics? 

At this time the media sub-group of the Accessibility Task Force desire that the language be as neutral and technology agnostic as possible. It is unclear _at this time_ if WebSRT is sufficient for meeting all of the user requirements and author needs that we have identified. We are currently evaluating a number of time formats (WebSRT, TTML, etc.) to determine which, if any, best meets these needs, which is why we are asking that folks review the user requirements.

> Why do we want to try and
> genericize the API, when there aren't currently plans to add additional timed
> text formats?

Curious to know where this assertion is coming from, as AFAIK this has never been discussed within the W3C, and this is a topic that I have been following most closely. Implementers might be experimenting with WebSRT today (and there is usefulness in that), but at this time I do not believe a final decision has been made to standardize on a specific time format.
Comment 3 Maciej Stachowiak 2010-11-03 16:50:11 UTC
(In reply to comment #1)
> Why do we want to remove WebSRT specifics?  Why do we want to try and
> genericize the API, when there aren't currently plans to add additional timed
> text formats?

I believe Microsoft has expressed some interest in offering TTML as a timed text format. Apple isn't super enthusiastic about TTML, but we wouldn't rule out ever supporting TTML or perhaps another format.

I think it would be useful if at least the basic APIs and elements for referencing caption/subtitle tracks could work with other timed text formats, since many exist and it is plausible that they will someday be supported on the Web.
Comment 4 Tab Atkins Jr. 2010-11-03 16:56:39 UTC
(In reply to comment #2)
> (In reply to comment #1)
> > Why do we want to remove WebSRT specifics? 
> 
> At this time the media sub-group of the Accessibility Task Force desire that
> the language be as neutral and technology agnostic as possible. It is unclear
> _at this time_ if WebSRT is sufficient for meeting all of the user requirements
> and author needs that we have identified. We are currently evaluating a number
> of time formats (WebSRT, TTML, etc.) to determine which, if any, best meets
> these needs, which is why we are asking that folks review the user
> requirements.

There is an obvious possibility that WebSRT be the format chosen by the media accessibility subgroup.  If this occurs, then any effort spent on genericizing the API will be wasted.  (In fact, it will be a waste if any single format is chosen, as you instead want a specialized API if you have only a single format.)  As such, it seems premature to spend any effort on this right now.


> > Why do we want to try and
> > genericize the API, when there aren't currently plans to add additional timed
> > text formats?
> 
> Curious to know where this assertion is coming from, as AFAIK this has never
> been discussed within the W3C, and this is a topic that I have been following
> most closely. Implementers might be experimenting with WebSRT today (and there
> is usefulness in that), but at this time I do not believe a final decision has
> been made to standardize on a specific time format.

The fact that a final decision hasn't been made doesn't negate the fact that there aren't *currently* any plans to add an additional format.  So far, there has been no public announcement of any plans to add an additional format; there is only the a11y TF's investigation of formats, which has not yet resulted in any announcement.
Comment 5 David Singer 2010-11-03 17:02:25 UTC
More to the point, even if we were to select a timed text format as recommended or mandatory, that's a long way from making it the only one ever possible. The API *clearly* needs to be generic and not specific to a given format.

The W3C has TTML, and we would be remiss to cut it off.  But more importantly, we should make it possible to innovate in this area.
Comment 6 John Foliot 2010-11-03 17:37:10 UTC
(In reply to comment #4)
> 
> There is an obvious possibility that WebSRT be the format chosen by the media
> accessibility subgroup. 

The only thing that is obvious at this time is that WebSRT is being evaluated to see if it meets all of the User and Author requirements currently identified by the sub-team. Presuming that we will concur that WebSRT is *the* format is premature at this time. A quick check of what WebSRT can and cannot do at this time already suggests to me that it is incomplete in some ways.

> If this occurs, then any effort spent on genericizing
> the API will be wasted.  (In fact, it will be a waste if any single format is
> chosen, as you instead want a specialized API if you have only a single
> format.)  As such, it seems premature to spend any effort on this right now.

Right. So rather than "wasting our time" working on a "specialized API for a single (WebSRT) format" - which may or may not be a recommended format - we'd rather focus on a generic API at this time. You've just argued for our request <grin>.


> 
> The fact that a final decision hasn't been made doesn't negate the fact that
> there aren't *currently* any plans to add an additional format.  So far, there
> has been no public announcement of any plans to add an additional format; there
> is only the a11y TF's investigation of formats, which has not yet resulted in
> any announcement.

Putting the cart before the horse is rarely a good long-term solution. There have been no "announcements" one way or the other, and likely won't be until such time as we have finished the work we set out to do, which was to a) capture all user (and author) requirements, b) evaluate possible solutions against those requirements, c) make one or more recommendations.

It is well known that WebSRT has captured a certain fondness in some circles, and that work on that specification is both current and active, but if WebSRT cannot meet all the user/author requirements for ensuring accessibility then the media sub-team must say so, and provide proof of such. However, as part of the assessment, if 'holes' are discovered, and they can be addressed inside the WebSRT spec, then it would strengthen the case for adopting WebSRT as a baseline or recommended format.
 
Meanwhile, going back to our very first face-to-face discussion at Stanford on Nov. 1, 2009 on this topic (which pre-dates the media sub-team) and subsequent meetings at TPAC 2009 that same week, we emerged from those meetings noting that we would likely need to support more than one time format (at that time noting SRT, TTML, and SMIL-TEXT): http://lists.w3.org/Archives/Public/public-html/2009Nov/0163.html

To that end, I believe we have been both clear and consistent to date.
Comment 7 Frank Olivier 2010-11-03 17:44:22 UTC
I agree that we should not specify a specific text track format in the spec.

WebSRT is fine for basic captioning needs, but it does not solve all accessibility requirements. That is not a problem as such - the author may only have a need for basic captioning - but tying WebSRT into the spec limits the solutions that the spec provides.

We recommend that the HTML5 spec allow authors to add several tracks of the same type (where type=caption), in different formats.
Comment 8 Ms2ger 2010-11-17 09:46:16 UTC
Changes such as

Remove:

"(e.g., for timed tracks based on WebSRT, the rules for updating the display of
WebSRT timed tracks)."

are not helpful. They do not make the specification any more technology-neutral, as this is a non-normative note. A somewhat more useful suggestion would be to add "...and for those based on <insert random format>, the rules for updating the display of <insert random format> timed tracks", once the experts have decided what we must do to not be a11y-haters.
Comment 9 steve faulkner 2010-11-17 10:00:41 UTC
(In reply to comment #8)
> once the experts have decided what we must do to not be a11y-haters.

A good start would be to stop making flame-baiting comments such as the one above.
Comment 10 Michael Cooper 2010-11-23 16:59:17 UTC
Bug triage sub-team thinks this is an HTML A11Y TF priority; it is already in active discussion with the media sub-group.
Comment 11 Ian 'Hixie' Hickson 2010-11-30 20:51:25 UTC
EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document:
   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Partially Accepted
Change Description: see diff given below
Rationale: I did most of the changes. A CueSettings object doesn't make much sense, so I just removed the constructor instead. For the charset="" attribute I used slightly different phrasing for essentially the same effect (so that I could cross-reference it from the WebSRT section in versions of the spec that do still include WebSRT).
Comment 12 contributor 2010-11-30 20:51:50 UTC
Checked in as WHATWG revision r5688.
Check-in comment: Remove some text from W3C version as requested by a11y task force.
http://html5.org/tools/web-apps-tracker?from=5687&to=5688
Comment 13 Shelley Powers 2010-11-30 21:05:44 UTC
(In reply to comment #12)
> Checked in as WHATWG revision r5688.
> Check-in comment: Remove some text from W3C version as requested by a11y task
> force.
> http://html5.org/tools/web-apps-tracker?from=5687&to=5688

Where in the W3C is this change checked in?
Comment 14 Ms2ger 2010-12-01 07:52:43 UTC
(In reply to comment #13)
> (In reply to comment #12)
> > Checked in as WHATWG revision r5688.
> > Check-in comment: Remove some text from W3C version as requested by a11y task
> > force.
> > http://html5.org/tools/web-apps-tracker?from=5687&to=5688
> 
> Where in the W3C is this change checked in?

http://dev.w3.org/cvsweb/html5/spec/Overview.html.diff?r1=1.4555&r2=1.4556

Did you even look?
Comment 15 Shelley Powers 2010-12-01 13:19:43 UTC
(In reply to comment #14)
> (In reply to comment #13)
> > (In reply to comment #12)
> > > Checked in as WHATWG revision r5688.
> > > Check-in comment: Remove some text from W3C version as requested by a11y task
> > > force.
> > > http://html5.org/tools/web-apps-tracker?from=5687&to=5688
> > 
> > Where in the W3C is this change checked in?
> 
> http://dev.w3.org/cvsweb/html5/spec/Overview.html.diff?r1=1.4555&r2=1.4556
> 
> Did you even look?

Rudeness is not necessary. I'm surprised that it's tolerated in the W3C bugzilla database. 

This particular section differs from what is in the WhatWG document. What should be linked in this W3C bug maintenance system is recorded changes to _W3C documents_, as the WhatWG differences differ from the W3C differences.
Comment 16 Sean Hayes 2011-01-19 18:58:46 UTC
The continued presence of section 10.3.2 (marked as being destined for an as yet unnamed CSS editor) and the various references to it are preventing closing this bug.
Comment 17 Silvia Pfeiffer 2011-01-21 12:36:54 UTC
(In reply to comment #16)
> The continued presence of section 10.3.2 (marked as being destined for an as
> yet unnamed CSS editor) and the various references to it are preventing closing
> this bug.

If WebVTT is the baseline format for external synchronized text for audio/video, it may as well be specified as part of HTML5. However, I do like a denser and separate document like http://www.whatwg.org/specs/web-apps/current-work/webvtt.html since it really helps with the implementation to not have to look all over the HTML5 spec to understand how WebVTT works.

Maybe http://www.whatwg.org/specs/web-apps/current-work/webvtt.html can be turned into another separate spec of HTML5 and then it doesn't need to be in the HTML5 spec, but can just be referenced as an external spec. If it needs an editor, I'd be happy to offer my help.
Comment 18 Sean Hayes 2011-01-21 12:54:15 UTC
Section 10.3.2 is intended to be moved into a CSS module; so eventually it won't be in the HTML living specification anyway. But while it is being tracked in the whatwg it does not need to be in the W3C snapshot where it would hold up LC.
Comment 19 Cynthia Shelly 2011-01-22 18:08:50 UTC
Related to issue 9: http://www.w3.org/html/wg/tracker/issues/9