This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 19632 - [WebVTT] cue identifier string clarification
Summary: [WebVTT] cue identifier string clarification
Status: RESOLVED FIXED
Alias: None
Product: TextTracks CG
Classification: Unclassified
Component: WebVTT
Version: unspecified
Hardware: PC All
Importance: P2 normal
Target Milestone: ---
Assignee: Silvia Pfeiffer
QA Contact: This bug has no owner yet - up for the taking
Whiteboard: v1
Reported: 2012-10-19 11:26 UTC by Silvia Pfeiffer
Modified: 2014-02-23 03:08 UTC

Description Silvia Pfeiffer 2012-10-19 11:26:50 UTC
The WebVTT spec says about cue identifiers:

"A WebVTT cue identifier is any sequence of one or more characters not containing the substring "-->" (U+002D HYPHEN-MINUS, U+002D HYPHEN-MINUS, U+003E GREATER-THAN SIGN), nor containing any U+000A LINE FEED (LF) characters or U+000D CARRIAGE RETURN (CR) characters."


And further in the CSS extension section about ::cue() selection:

"For the purposes of ID selector matching, Lists of WebVTT Node Objects have the ID given by the cue's text track cue identifier, if any."

However, the CSS spec says:

"In CSS, identifiers (including element names, classes, and IDs in selectors) can contain only the characters [a-zA-Z0-9] and ISO 10646 characters U+00A0 and higher, plus the hyphen (-) and the underscore (_); they cannot start with a digit, two hyphens, or a hyphen followed by a digit. Identifiers can also contain escaped characters and any ISO 10646 character as a numeric code (see next item). For instance, the identifier "B&W?" may be written as "B\&W\?" or "B\26 W\3F"."

Thus, there are restrictions on CSS ID selectors that we don't have on cue identifiers, and consequently not every cue identifier can be interpreted as a CSS ID selector. For example, what happens with a cue identifier that starts with a number, or one that has spaces in it?
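
To make the mismatch concrete, here is a minimal Python sketch that applies both definitions quoted above to the same string. The function names are hypothetical, and the CSS pattern is a simplified reading of the CSS 2.1 prose (it deliberately ignores backslash escapes), so treat it as an illustration rather than a conformance check.

```python
import re

# Simplified, escape-free reading of the CSS 2.1 identifier rule quoted above:
# an optional single hyphen, then a letter, underscore, or U+00A0 and higher,
# then any number of letters, digits, hyphens, underscores, or U+00A0 and higher.
_BARE_CSS_IDENT = re.compile(
    r"-?[A-Za-z_\u00a0-\U0010ffff][A-Za-z0-9_\-\u00a0-\U0010ffff]*"
)


def is_valid_cue_identifier(text: str) -> bool:
    """WebVTT rule quoted above: non-empty, no "-->", no LF, no CR."""
    return bool(text) and "-->" not in text and "\n" not in text and "\r" not in text


def usable_unescaped_in_css(text: str) -> bool:
    """True if the string can follow '#' in a selector without any escaping."""
    return _BARE_CSS_IDENT.fullmatch(text) is not None


for candidate in ["intro", "52 my cue identifier", "a --> b"]:
    print(
        repr(candidate),
        is_valid_cue_identifier(candidate),
        usable_unescaped_in_css(candidate),
    )
```

A string such as "52 my cue identifier" passes the WebVTT check but fails the bare CSS check, which is exactly the gap this bug asks about.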
Comment 1 Simon Pieters 2012-10-19 11:30:57 UTC
You can target any id with an id selector using CSS escapes (as the CSS spec says in the part you quoted). http://mothereff.in/css-escapes
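
The escaping referred to in comment 1 can be sketched in Python as follows. The steps are modelled on the CSSOM "serialize an identifier" rules (the ones behind `CSS.escape()` in browsers); the function name `css_escape_identifier` is hypothetical, and this is an illustration under that assumption rather than a reference implementation.

```python
import string

_DIGITS = string.digits
_PLAIN = string.ascii_letters + string.digits + "-_"


def css_escape_identifier(ident: str) -> str:
    """Escape an arbitrary string so it can follow '#' in a CSS selector."""
    out = []
    for i, ch in enumerate(ident):
        cp = ord(ch)
        if cp == 0:
            # NULL cannot be represented; use the replacement character.
            out.append("\ufffd")
        elif 0x01 <= cp <= 0x1F or cp == 0x7F:
            # Control characters become a hex code point escape plus a space.
            out.append(f"\\{cp:x} ")
        elif i == 0 and ch in _DIGITS:
            # An identifier cannot start with a digit.
            out.append(f"\\{cp:x} ")
        elif i == 1 and ch in _DIGITS and ident[0] == "-":
            # Nor with a hyphen followed by a digit.
            out.append(f"\\{cp:x} ")
        elif i == 0 and ch == "-" and len(ident) == 1:
            # A lone hyphen is not an identifier on its own.
            out.append("\\-")
        elif cp >= 0x80 or ch in _PLAIN:
            out.append(ch)
        else:
            # Any other ASCII character (space, '&', '?', ...) gets a backslash.
            out.append("\\" + ch)
    return "".join(out)


print("::cue(#" + css_escape_identifier("52 my cue identifier") + ")")
```

For the identifier "B&W?" from the CSS spec quotation, this produces `B\&W\?`, the first of the two spellings the spec lists.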
Comment 2 Silvia Pfeiffer 2012-10-19 12:54:07 UTC
Shouldn't the spec then say that we have to escape the WebVTT cue identifier before using it as a CSS ID selector?
Comment 3 Ian 'Hixie' Hickson 2012-10-19 20:22:46 UTC
I don't see what this has to do with WebVTT. CSS identifier syntax is a detail for the CSS spec, not WebVTT.
Comment 4 Silvia Pfeiffer 2012-10-19 22:39:23 UTC
This falls in between WebVTT and CSS.

Since the WebVTT spec has a CSS extension section where it says how to apply ::cue(selector), and there is nothing about ::cue in the CSS spec, it seems that the WebVTT spec is the only place where this could be explained.

The "selector" in ::cue(selector) is expected to be a conformant CSS selector, so we should add a sentence referencing CSS escaped characters:

E.g. add after:

"For the purposes of ID selector matching, Lists of WebVTT Node Objects have the ID given by the cue's text track cue identifier, if any."

Something like:

"If necessary, CSS escaped characters [1] are used to make the cue identifier a valid CSS selector."

[1] http://www.w3.org/TR/CSS21/syndata.html#escaped-characters
Comment 5 Ian 'Hixie' Hickson 2012-10-19 22:46:21 UTC
That's essentially saying "Don't forget to not screw up your CSS".

Should we have similar comments everywhere saying "By the way, don't forget that if you are setting this string from JavaScript, you have to escape your quotes!" or some such?

This makes no sense to me.
Comment 6 Glenn Maynard 2012-10-19 22:53:22 UTC
That doesn't make sense to me.  This is just a generic CSS selector, using regular CSS syntax.  The syntactical details of selectors are defined by CSS.  All WebVTT is doing is exporting IDs (and attributes, and element types, and classes) to CSS.  WebVTT doesn't care about the syntax of CSS; it doesn't need to say "IDs might need to be escaped", any more than it needs to say "IDs are prefixed with a hash sign" or "class names are prefixed with a period".  HTML doesn't need to do these things, either.
Comment 7 Silvia Pfeiffer 2012-10-19 23:38:16 UTC
Let's take this example cue:

52 my cue identifier
00:00:52.000 --> 00:01:05.000
My cue

In CSS you have to write:
::cue(#\35 2\ my\ cue\ identifier)


You're telling me that browsers identify the element id "52 my cue identifier" with the CSS selector "#\35 2\ my\ cue\ identifier" - which resolves part of my concerns.

I'm also concerned about WebVTT authors who are not deeply knowledgeable about the Web and may try to use ::cue(52 my cue identifier), since that is what the WebVTT spec implies.

I'm closing this bug again, but I think this will be a common pitfall for authors and thus should at minimum go into an authoring spec for WebVTT.
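
The claim that `#\35 2\ my\ cue\ identifier` resolves to the ID "52 my cue identifier" can be checked with a small decoder. The Python sketch below is a simplified reading of the CSS escape rules (a backslash followed by up to six hex digits and one optional whitespace character, or a backslash followed by any single non-newline character); the name `css_unescape` is hypothetical and edge cases such as CRLF handling are not covered.

```python
import re

# Either a hex escape with one optional trailing whitespace character,
# or a backslash followed by any single character that is not a newline.
_ESCAPE = re.compile(r"\\(?:([0-9a-fA-F]{1,6})[ \t\n\r\f]?|([^\n\r\f]))")


def css_unescape(text: str) -> str:
    """Decode CSS backslash escapes in an identifier (simplified)."""

    def replace(match: re.Match) -> str:
        hex_digits = match.group(1)
        if hex_digits is not None:
            cp = int(hex_digits, 16)
            # Zero, surrogates, and out-of-range values map to U+FFFD.
            if cp == 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
                return "\ufffd"
            return chr(cp)
        return match.group(2)

    return _ESCAPE.sub(replace, text)


print(css_unescape(r"\35 2\ my\ cue\ identifier"))
```

Both spellings from the CSS spec quotation, `B\&W\?` and `B\26 W\3F`, decode to the same string "B&W?" under this sketch.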
Comment 8 Ian 'Hixie' Hickson 2012-10-22 18:29:33 UTC
that's it
00:00:52.000 --> 00:01:05.000
my cue

In JS you have to write:
track.cues.getCueById('that\'s it')

Do we have to warn about this too? If not, what's the difference? If yes, do we have to list every possible language and describe all their escaping syntaxes just to be sure that authors don't get confused?
Comment 9 Silvia Pfeiffer 2012-10-22 21:27:35 UTC
In an author specification it makes a lot of sense to point out non-obvious traps like these escapes, since they will invariably lead to author mistakes.

The one you are citing is obvious (because of the use of the same quotes for the whole string), but the one I cite is non-obvious because while you can write:
   track.cues.getCueById('52 my cue identifier'),
you can't write
   ::cue('52 my cue identifier').

Feel free to ignore it for the browser spec other than using it in an example.
Comment 10 Silvia Pfeiffer 2013-07-08 02:17:53 UTC
I want to add an example to the spec.
Comment 11 Philip Jägenstedt 2014-01-27 10:28:48 UTC
(In reply to Silvia Pfeiffer from comment #10)
> I want to add an example to the spec.

I'm with Simon and Hixie on this, I don't think this is worth an example, any more than the kind of escaping you have to do in JavaScript, or in fact in HTML if you have inline JavaScript...
Comment 12 Silvia Pfeiffer 2014-01-27 11:38:05 UTC
The HTML spec has several examples where escaping mechanisms are demonstrated, e.g.
<a href="?art&amp;copy">Art and Copy</a> <!-- the & has to be escaped, since &copy is a named character reference -->

or
<iframe seamless sandbox srcdoc="<p>Yeah, you can see it <a href=&quot;/gallery?mode=cover&amp;amp;page=1&quot;>in my gallery</a>."></iframe>

and in "Restrictions for contents of script elements" it says in a Note:
"The easiest and safest way to avoid the rather strange restrictions described in this section is to always escape "<!--" as "<\!--", "<script" as "<\script", and "</script" as "<\/script" ..."

I think this is similarly an error that somebody who is a captioner and not used to writing CSS would easily trip over and thus is worth at least a note or an example.
Comment 13 Philip Jägenstedt 2014-01-28 08:48:43 UTC
(In reply to Silvia Pfeiffer from comment #12)

In all of these cases the examples deal with escaping that is required because of how the HTML parser works. The equivalent for WebVTT would be an example using one of the WebVTT escapes, not an example of how CSS escaping works.

I think this kind of thing is more appropriate in a tutorial or devrel material, but will of course review a patch if one comes along...
Comment 14 Silvia Pfeiffer 2014-02-23 03:08:47 UTC
Improved on an existing example in:
https://github.com/w3c/webvtt/commit/2dd898265cb16881fa928b17a8b2b52a0770d94b