13995 2011-09-01 11:50:51 +0000 <track> Don't check Content-Type for <track> 2012-09-14 12:15:27 +0000 1 1 1 Unclassified HTML.next default unspecified Other other RESOLVED LATER http://www.whatwg.org/specs/web-apps/current-work/#sourcing-out-of-band-text-tracks P3 normal --- 14294 1 contributor dave.null annacc annevk eoconnor eric.carlson franko ian julian.reschke mike philipj plh public-html-admin public-html-wg-issue-tracking robin silviapfeiffer1 public-html-bugzilla oldest_to_newest 56175 0 contributor 2011-09-01 11:50:51 +0000 Specification: http://www.whatwg.org/specs/web-apps/current-work/multipage/the-video-element.html Multipage: http://www.whatwg.org/C#sourcing-out-of-band-text-tracks Complete: http://www.whatwg.org/c#sourcing-out-of-band-text-tracks Comment: Don't check Content-Type for <track> Posted from: 83.218.67.122 by philipj@opera.com User agent: Opera/9.80 (X11; Linux x86_64; U; Edition Next; en) Presto/2.9.186 Version/12.00 56176 1 philipj 2011-09-01 11:55:21 +0000 "The tasks queued by the fetching algorithm on the networking task source to process the data as it is being fetched must examine the resource's Content Type metadata, once it is available, if it ever is. If no Content Type metadata is ever available, or if the type is not recognised as a text track format, then the resource's format must be assumed to be unsupported (this causes the load to fail, as described below). If a type is obtained, and represents a supported text track format, then the resource's data must be passed to the appropriate parser (e.g. the WebVTT parser if the Content Type metadata is text/vtt) as it is received, with the text track list of cues being used for that parser's output." We don't want to implement this, given the experience with <video>, see http://www.w3.org/html/wg/wiki/ChangeProposals/NoVideoContentType#Issues_for_Authors WebVTT already has the WEBVTT header magic to differentiate it from other formats if we should ever support such in <track>. 
The main consequence of doing this check is that <track> will appear broken to all authors when they first test it. Even if they figure out how to set Content-Type, it'll mostly be a waste of time and not provide any real benefits. 56190 2 franko 2011-09-01 15:46:44 +0000 > WebVTT already has the WEBVTT header magic to differentiate it from other > formats if we should ever support such in <track>. ...but this requires the user agent to download the file to check the track type; this is not a good solution, especially in mobile scenarios. > The main consequence of doing this check is that <track> will appear broken to > all authors when they first test it. Even if they figure out how to set > Content-Type, it'll mostly be a waste of time and not provide any real > benefits. I don't think that this is a big burden on developers; they use documentation and samples when using new elements. This would just have to be documented clearly, like a lot of other parts of the HTML5 spec. The real benefit is that multiple track formats are supported cleanly. 56212 3 annevk 2011-09-02 09:02:08 +0000 If you are already fetching the file it does not matter much whether you terminate the request when reading the Content-Type header or the first set of octets from the response entity body (even on mobile). 56277 4 silviapfeiffer1 2011-09-04 11:14:16 +0000 (In reply to comment #3) > If you are already fetching the file it does not matter much whether you > terminate the request when reading the Content-Type header or the first set of > octets from the response entity body (even on mobile). Not for one file. But it matters if you have many files, which in this case can be several dozens if not hundreds of caption, subtitle, audio description etc. files in different languages. 56278 5 annevk 2011-09-04 11:34:43 +0000 Why? The difference is marginal. 56280 6 silviapfeiffer1 2011-09-04 11:48:10 +0000 (In reply to comment #5) > Why? The difference is marginal. 
Wait, are you comparing the difference between a HEAD and a GET request with a request for the first 8 bytes? Then, yes, that's marginal. 56282 7 annevk 2011-09-04 12:02:58 +0000 That is what this bug is about. 56793 8 ian 2011-09-14 22:23:04 +0000 As far as I can tell, the experience with <video> is that we're moving towards more strict checking, not moving towards sniffing. In general, I agree that the platform would be more practical if we used sniffing everywhere instead of content type labeling. However, that's not how the platform is designed. If you want to deprecate Content-Type headers, I suggest bringing it up with the HTTP working group. 56812 9 philipj 2011-09-15 07:13:40 +0000 It is this spec that defines when and how the Content-Type headers are checked, not the HTTP spec. If the spec does not change, we will just ignore what it says on this point, since it is just annoying to web authors. As for Content-Type in video, I assume that the suggestion in http://www.w3.org/Bugs/Public/show_bug.cgi?id=11984#c22 is what will eventually be spec'd, which includes sniffing in the absence of a trusted Content-Type. Doing the same for <track> might be fine, but strict checking is not. 56874 10 ian 2011-09-15 23:04:29 +0000 Oh, if you mean just in the absence of a Content-Type, then sure, that can be added. I thought you meant ignoring the Content-Type altogether, which last I checked definitely violates HTTP. 56887 11 philipj 2011-09-16 08:29:18 +0000 I can't find any MUST requirements at all in http://tools.ietf.org/html/rfc2616#section-14.17 or related sections, how would this be a violation? As with <video>, I see 3 options here: 1. Always obey Content-Type 2. Always ignore Content-Type 3. Add sniffing for WebVTT to http://tools.ietf.org/html/draft-ietf-websec-mime-sniff-03 and use the sniffed type, not the official content type. If spec'd as I imagine it, this would allow serving WebVTT as text/plain, but serving it as application/ttml+xml would fail. 
(1) is unacceptable, (2) is what we've implemented, but (3) could be tolerable if <video> were also made consistent with that. 56888 12 julian.reschke 2011-09-16 09:06:35 +0000 (In reply to comment #11) > I can't find any MUST requirements at all in > http://tools.ietf.org/html/rfc2616#section-14.17 or related sections, how would > this be a violation? Whether or not there is a MUST doesn't tell you anything about whether it's a requirement or not. You seem to be confused about when or not BCP14 keywords need to be used; see RFC 2119 for details. Note that there are entire RFCs without any BCP 14 keywords; that doesn't affect their "normativeness" at all. That being said, <http://greenbytes.de/tech/webdav/rfc2616.html#rfc.section.7.2.1> says: "Any HTTP/1.1 message containing an entity-body SHOULD include a Content-Type header field defining the media type of that body. If and only if the media type is not given by a Content-Type field, the recipient MAY attempt to guess the media type via inspection of its content and/or the name extension(s) of the URI used to identify the resource. If the media type remains unknown, the recipient SHOULD treat it as type "application/octet-stream"." But also note: <http://trac.tools.ietf.org/wg/httpbis/trac/ticket/155> > As with <video>, I see 3 options here: > > 1. Always obey Content-Type > > 2. Always ignore Content-Type > > 3. Add sniffing for WebVTT to > http://tools.ietf.org/html/draft-ietf-websec-mime-sniff-03 and use the sniffed > type, not the official content type. If spec'd as I imagine it, this would > allow serving WebVTT as text/plain, but serving it as application/ttml+xml > would fail. > > (1) is unacceptable, (2) is what we've implemented, but (3) could be tolerable > if <video> were also made consistent with that. I don't think (1) is unacceptable. 
56899 13 philipj 2011-09-16 12:11:33 +0000 (In reply to comment #12) > "Any HTTP/1.1 message containing an entity-body SHOULD include a Content-Type > header field defining the media type of that body. If and only if the media > type is not given by a Content-Type field, the recipient MAY attempt to guess > the media type via inspection of its content and/or the name extension(s) of > the URI used to identify the resource. If the media type remains unknown, the > recipient SHOULD treat it as type "application/octet-stream"." > > But also note: > > <http://trac.tools.ietf.org/wg/httpbis/trac/ticket/155> Great, this is option 3. > > As with <video>, I see 3 options here: > > > > 1. Always obey Content-Type > > > > 2. Always ignore Content-Type > > > > 3. Add sniffing for WebVTT to > > http://tools.ietf.org/html/draft-ietf-websec-mime-sniff-03 and use the sniffed > > type, not the official content type. If spec'd as I imagine it, this would > > allow serving WebVTT as text/plain, but serving it as application/ttml+xml > > would fail. > > > > (1) is unacceptable, (2) is what we've implemented, but (3) could be tolerable > > if <video> were also made consistent with that. > > I don't think (1) is unacceptable. Will not implement, it is an unnecessary hoop for authors to jump through and not actually necessary to support multiple text formats. 57011 14 ian 2011-09-19 22:52:14 +0000 Ah, well, if you're saying you're not going to implement it then naturally I'll update the spec accordingly. Basically this means changing this paragraph: "The tasks queued by the fetching algorithm on the networking task source to process the data as it is being fetched must examine the resource's Content Type metadata, once it is available, if it ever is. If no Content Type metadata is ever available, or if the type is not recognised as a text track format, then the resource's format must be assumed to be unsupported (this causes the load to fail, as described below). 
If a type is obtained, and represents a supported text track format, then the resource's data must be passed to the appropriate parser (e.g. the WebVTT parser if the Content Type metadata is text/vtt) as it is received, with the text track list of cues being used for that parser's output." ...to something that instead of checking the Content-Type header recognises the file format based on signatures, and add a note mentioning that this is an intentional violation of the HTTP spec motivated by compatibility with implementations. What's the signature for TTML? 57025 15 philipj 2011-09-20 09:35:48 +0000 The specific signatures should be added to http://tools.ietf.org/html/draft-ietf-websec-mime-sniff-03, what is needed is to ensure that this spec has the necessary hooks and that we have consistency with how <video> works. Would it be acceptable to add another section for "Timed Text" or some such that is similar to http://tools.ietf.org/html/draft-ietf-websec-mime-sniff-03#section-8 ? Since most text formats are unsniffable I expect it would initially contain only an entry for text/vtt. (Note that <video> doesn't strictly obey Content-Type, yet there is no note citing a willful violation of HTTP. Oversight?) 57050 16 ian 2011-09-20 20:05:16 +0000 Didn't we change <video> to strictly adhere to HTTP here? What's the signature for TTML? 57065 17 philipj 2011-09-21 07:41:30 +0000 (In reply to comment #16) > Didn't we change <video> to strictly adhere to HTTP here? AFAIK, nothing has changed since http://www.w3.org/Bugs/Public/show_bug.cgi?id=11984 was filed. AFAICT, the spec still says to check if Content-Type is one of the supported types, missing or application/octet-stream and to then sniff. However, the spec is (deliberately?) vague here, so I can't say for sure. In any case we all know that the spec is bogus until Bug 11984 is resolved in a way that everyone is actually willing to implement. > What's the signature for TTML? 
It's XML, we all know that sniffing it is not a good idea. 57066 18 philipj 2011-09-21 08:00:51 +0000 Would it be acceptable to add another section for "Timed Text"? 57361 19 ian 2011-09-26 19:41:27 +0000 I misremembered what we did with media resources. > > What's the signature for TTML? > > It's XML, we all know that sniffing it is not a good idea. I'm at a loss as to how to make this work then. Could you elaborate on how you want this to work? 57391 20 silviapfeiffer1 2011-09-27 07:07:32 +0000 (In reply to comment #19) > > > What's the signature for TTML? > > > > It's XML, we all know that sniffing it is not a good idea. > > I'm at a loss as to how to make this work then. Could you elaborate on how you > want this to work? So, the XML Media Type registration [1] reckons you can use the string "<?xml" to identify it as a XML document. Further, the TTML spec [2] indicates in detail how to identify a TTML document (probably a simple approximation is to see if it contains some form of "http://www.w3.org/ns/ttml" in a namespace declaration). In any case - I'd leave these sniffing details to the browser that wants to implement support for TTML or just point to [2]. [1] http://tools.ietf.org/html/rfc3023 [2] http://www.w3.org/TR/ttaf1-dfxp/#doctypes 57396 21 philipj 2011-09-27 07:49:51 +0000 (In reply to comment #19) > I misremembered what we did with media resources. > > > > > What's the signature for TTML? > > > > It's XML, we all know that sniffing it is not a good idea. > > I'm at a loss as to how to make this work then. Could you elaborate on how you > want this to work? Add a section "Text Tracks" to http://tools.ietf.org/html/draft-ietf-websec-mime-sniff-03 that says: This section defines the *rules for sniffing text tracks specifically*. 
If the first octets match one of the signatures in Section 6 for one of the following media types, then let the sniffed-type be the corresponding media type and abort these steps: o text/vtt Otherwise, let the sniffed-type be the official-type and abort these steps. Of course, also update section 6 with the WebVTT magic. In effect, this will allow browsers that only support WebVTT to ignore Content-Type since anything that begins with "WEBVTT" works and anything that doesn't will fire an error event (technically for two different reasons, but the API doesn't make that distinction.) 57400 22 philipj 2011-09-27 08:27:48 +0000 To clarify, TTML would not be sniffed at all. If there should ever be a browser that implements it and they also don't think Content-Type is funny, they can work to add sniffing for it at that time. 57565 23 ian 2011-09-30 16:14:08 +0000 Oh, I see. So you're saying to not ignore the Content-Type altogether, but just to use it in combination with MIME sniff as a fallback. I guess we can do that. It'll make the HTTP guys flip out again... 57688 24 philipj 2011-10-03 07:58:48 +0000 Well, I want to ignore Content-Type, but since you bring up TTML I suggest that draft-abarth-mime-sniff do sniffing for WebVTT and none for TTML. The net result is that regardless of Content-Type, if the resource has the WEBVTT magic bytes, it'll be treated as WebVTT. For browsers that only support WebVTT, it's equivalent to ignoring Content-Type. 58596 25 ian 2011-10-20 23:29:10 +0000 What do you want to have happen if you browse a browser to a page that is labeled text/vtt and starts with "WEBVTT"? What do you want to have happen if you browse a browser to a page that is labeled text/plain and starts with "WEBVTT"? What do you want to have happen if a <track> points to a text/vtt file that starts with "WEBVTT"? What do you want to have happen if a <track> points to a text/xml file that starts with "WEBVTT"? 
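The text-track sniffing rule proposed in comment 21 can be sketched roughly as below. This is only an illustration, not spec text: the names `sniff_text_track`, `passes_type_check`, and `SUPPORTED_TRACK_TYPES` are hypothetical, and the signature test is simplified to "begins with WEBVTT" as worded in the thread, so it does not handle a byte order mark or check what follows the signature.

```python
# Simplified signature: "anything that begins with WEBVTT" (comment 21).
WEBVTT_SIGNATURE = b"WEBVTT"

# Hypothetical: the formats a WebVTT-only browser accepts in <track>.
SUPPORTED_TRACK_TYPES = {"text/vtt"}


def sniff_text_track(first_octets, official_type):
    """Return the sniffed type for a <track> resource.

    official_type is the Content-Type the server sent, or None if absent.
    """
    if first_octets.startswith(WEBVTT_SIGNATURE):
        # The signature wins regardless of the official type.
        return "text/vtt"
    # No known signature: fall back to the official type. TTML is never sniffed.
    return official_type


def passes_type_check(first_octets, official_type):
    """True if the resource would be handed to a parser at all.

    A resource labelled text/vtt without the signature passes this check
    but would then fail in the parser (see comment 32 and bug 14294).
    """
    return sniff_text_track(first_octets, official_type) in SUPPORTED_TRACK_TYPES
```

Under these assumptions, a `<track>` pointing at a text/xml file that starts with "WEBVTT" is sniffed as text/vtt, while a TTML document labelled application/ttml+xml keeps its official type and is rejected by a WebVTT-only browser.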
58597 26 contributor 2011-10-20 23:31:09 +0000 Checked in as WHATWG revision r6721. Check-in comment: (WIP - MIMESNIFF has not yet been updated accordingly) Change the spec to use MIMESNIFF rules for text tracks instead of blindly honouring MIME types. http://html5.org/tools/web-apps-tracker?from=6720&to=6721 58630 27 philipj 2011-10-21 08:29:19 +0000 (In reply to comment #25) > What do you want to have happen if you browse a browser to a page that is > labeled text/vtt and starts with "WEBVTT"? Not relevant to this bug, but just displaying it as plain text would seem the natural choice. I don't suppose we want to disallow fancier display methods, automatic translation, etc, but that sounds unlikely to happen. > What do you want to have happen if you browse a browser to a page that is > labeled text/plain and starts with "WEBVTT"? Per the "Web Pages" section in the sniffing spec text/plain leads to sniffing, so it'll be sniffed as text/vtt and the same thing as above will happen. > What do you want to have happen if a <track> points to a text/vtt file that > starts with "WEBVTT"? It'll be parsed and rendered as WebVTT. > What do you want to have happen if a <track> points to a text/xml file that > starts with "WEBVTT"? Per comment 21, it will be sniffed as text/vtt and then parsed+rendered as WebVTT. (The same will happen regardless of the media type.) Finally, although you didn't ask, navigating to a text/xml file that starts with "WEBVTT" would not cause it to be sniffed as WebVTT, since text/xml is not in the table that leads to sniffing in step 2 of <http://tools.ietf.org/html/draft-ietf-websec-mime-sniff-03#section-4>. 58861 28 ian 2011-10-25 04:54:17 +0000 Your last two paragraphs seem to contradict each other. Can you elaborate? 58893 29 annevk 2011-10-25 07:44:03 +0000 You need to contrast with <img> loading. 
When navigating not everything is sniffed, when loading from <img> everything is sniffed (unless image/svg+xml, which here would be application/ttml+xml or some such). 58899 30 philipj 2011-10-25 08:54:50 +0000 (In reply to comment #28) > Your last two paragraphs seem to contradict each other. Can you elaborate? There is no contradiction, since the same sniffing wouldn't be applied when navigating and when fetching in a <track> context. However, if there is a solution which allows the exact same sniffing rules to be applied for both <track> and <video> contexts and when navigating I'd be perfectly happy with that. The main point is that it should be possible to serve video and text tracks without Content-Type or with application/octet-stream and text/plain respectively and have it just work, which strictly speaking doesn't require always ignoring Content-Type, even if that is my preferred solution. 58956 31 ian 2011-10-25 23:33:01 +0000 We can do whatever, I'm just trying to work out what you want. :-) I didn't notice the distinction of navigate vs <track> in comment 27's last two paragraphs, sorry for the confusion. What we do when you navigate to a text/xml file is pretty much a given, I don't see any way we'd want to sniff for that. That's why I didn't ask about it. Currently, what Anne has said and what Philip has said conflict (comment 27 and comment 29). If we do what <img> does, then we have to explicitly list all the types we _don't_ want to sniff, which would include the TTML type. But that doesn't seem to be what Philip is suggesting; if I understand correctly, that's more a matter of just having a list of formats that we sniff for regardless of the type when used with <track>, and then falling back on the default; for navigation, though, the sniffing would work as now, with just one more row in the MIMESNIFF table. 
I guess the remaining question is, if we fallback to the Content-Type, and a file is labeled as text/vtt but doesn't have the WEBVTT signature, should we still try to parse it as WEBVTT, or should we not recognise it? (I guess if we say that if the signature is missing we still fire onerror, it becomes a non-issue. See bug 14294.) 58982 32 philipj 2011-10-26 08:53:46 +0000 (In reply to comment #31) > Currently, what Anne has said and what Philip has said conflict (comment 27 and > comment 29). If we do what <img> does, then we have to explicitly list all the > types we _don't_ want to sniff, which would include the TTML type. But that > doesn't seem to be what Philip is suggesting; if I understand correctly, that's > more a matter of just having a list of formats that we sniff for regardless of > the type when used with <track>, and then falling back on the default; for > navigation, though, the sniffing would work as now, with just one more row in > the MIMESNIFF table. Hmm, I hadn't noticed that special case of image sniffing and note that there is nothing similar for sniffing video (although that section is fiction until bug 11984 is resolved). Is there a particular reason for special-casing "image/svg+xml" that would apply to TTML as well? If not, my suggestion in comment 21 stands. > I guess the remaining question is, if we fallback to the Content-Type, and a > file is labeled as text/vtt but doesn't have the WEBVTT signature, should we > still try to parse it as WEBVTT, or should we not recognise it? (I guess if we > say that if the signature is missing we still fire onerror, it becomes a > non-issue. See bug 14294.) We should still try to parse it and it will fail. The whole assumption here is that this bug together with bug 14294 will allow implementations that only support WebVTT to ignore Content-Type and just try to parse it. 
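The claim in comments 21, 24 and 32, that for a browser supporting only WebVTT "sniff, then fall back to Content-Type" is indistinguishable from ignoring Content-Type, can be checked with a small sketch. All names here are hypothetical, and `parse_webvtt` is a stand-in that only models the assumed bug 14294 behaviour (a missing signature is a parse error), not a real parser.

```python
SIGNATURE = b"WEBVTT"


def parse_webvtt(body):
    """Stand-in parser: a missing signature is an error (assumed bug 14294 outcome)."""
    return "cues" if body.startswith(SIGNATURE) else "error"


def sniff_then_fall_back(body, content_type):
    """Sniff first; if the signature is absent, trust the Content-Type."""
    sniffed = "text/vtt" if body.startswith(SIGNATURE) else content_type
    if sniffed != "text/vtt":
        return "error"  # not a format this WebVTT-only browser supports
    return parse_webvtt(body)


def ignore_content_type(body, content_type):
    """Never look at the Content-Type; just try to parse."""
    return parse_webvtt(body)


# Compare both strategies over a few sample bodies and labels.
bodies = [b"WEBVTT\n\n", b"<?xml version='1.0'?>", b""]
types = ["text/vtt", "text/plain", "text/xml", "application/octet-stream", None]
assert all(
    sniff_then_fall_back(b, t) == ignore_content_type(b, t)
    for b in bodies
    for t in types
)
```

The two strategies only diverge once a second, unsniffed format such as TTML is supported, which is the case the fallback to Content-Type is meant to cover.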
59005 33 ian 2011-10-26 20:45:46 +0000 The reason for the image/svg+xml thing is to minimise how much we ignore Content-Type, so if that applies here too, we shouldn't do any sniffing... Anyway, looks like what you want is for <track src>, first do signature inspection (sniffing), and if that doesn't find anything, then trust the MIME type. Since there's only one signature, I can just do that inline instead of requiring changes to MIMESNIFF. That will also mean that there's no need for any magic in the navigation step. 59026 34 franko 2011-10-26 23:40:27 +0000 (In reply to comment #8) > As far as I can tell, the experience with <video> is that we're moving towards > more strict checking, not moving towards sniffing. Agree; In Internet Explorer, we do strict checking on the mime types for <video> and <track> sources. 59052 35 philipj 2011-10-27 08:24:26 +0000 (In reply to comment #33) > The reason for the image/svg+xml thing is to minimise how much we ignore > Content-Type, so if that applies here too, we shouldn't do any sniffing... > > Anyway, looks like what you want is for <track src>, first do signature > inspection (sniffing), and if that doesn't find anything, then trust the MIME > type. Since there's only one signature, I can just do that inline instead of > requiring changes to MIMESNIFF. That will also mean that there's no need for > any magic in the navigation step. That's an editorial choice, but spreading the sniffing around instead of keeping it in one place isn't the choice I would make. 59148 36 ian 2011-10-28 19:52:47 +0000 Philip: What you're asking for can't be "in one place" regardless of which spec it's in, since it would be a separate section even if it was in MIMESNIFF. Frank: Are you saying you disagree with the change Philip is proposing here? 59518 37 ian 2011-11-02 19:44:46 +0000 So what is the spec supposed to say, at this point? 
59564 38 ian 2011-11-03 16:30:55 +0000 Unless browser vendors agree on what they are intending to implement here, which seems not to be the case, I will probably go back to strict adherence to the HTTP specs, with a warning in the spec saying that this is likely to change, and then let the market figure it out. Once browsers converge on a particular behaviour, or once there is agreement on what they should converge on, I will update the spec to match it. As with the <video> element case, you are of course welcome to try to use the escalation process to make a decision sooner, but of course, as with the <video> case, if it doesn't match what finally gets implemented, it doesn't much matter what that decision is. 59568 39 philipj 2011-11-03 16:58:07 +0000 OK, as long as the spec is clear in both cases about the status of the issue, I can live with this. 59569 40 annevk 2011-11-03 17:01:40 +0000 FWIW, I think it should work like images and fonts. Always sniff, and add exceptions (e.g. application/ttml+xml) as needed. 59985 41 ian 2011-11-12 00:31:26 +0000 I've put a warning in the spec. Marking this LATER for now; once browser vendors have converged on a behaviour or desired behaviour I'll revisit this. (See comment 32.)