13995 – <track> Don't check Content-Type for <track>

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Status:	RESOLVED LATER

Alias:	None

Product:	HTML.next
Classification:	Unclassified
Component:	default
Version:	unspecified
Hardware:	Other other

Importance:	P3 normal
Target Milestone:	---
Assignee:	This bug has no owner yet - up for the taking
QA Contact:	HTML WG Bugzilla archive list

URL:	http://www.whatwg.org/specs/web-apps/...
Whiteboard:
Keywords:

Depends on:
Blocks:	14294

Reported:	2011-09-01 11:50 UTC by contributor
Modified:	2012-09-14 12:15 UTC
CC List:	14 users

Description contributor 2011-09-01 11:50:51 UTC

Specification: http://www.whatwg.org/specs/web-apps/current-work/multipage/the-video-element.html
Multipage: http://www.whatwg.org/C#sourcing-out-of-band-text-tracks
Complete: http://www.whatwg.org/c#sourcing-out-of-band-text-tracks

Comment:
Don't check Content-Type for <track>

Posted from: 83.218.67.122 by philipj@opera.com
User agent: Opera/9.80 (X11; Linux x86_64; U; Edition Next; en) Presto/2.9.186 Version/12.00

Comment 1 Philip Jägenstedt 2011-09-01 11:55:21 UTC

"The tasks queued by the fetching algorithm on the networking task source to process the data as it is being fetched must examine the resource's Content Type metadata, once it is available, if it ever is. If no Content Type metadata is ever available, or if the type is not recognised as a text track format, then the resource's format must be assumed to be unsupported (this causes the load to fail, as described below). If a type is obtained, and represents a supported text track format, then the resource's data must be passed to the appropriate parser  (e.g. the WebVTT parser if the Content Type metadata is text/vtt)  as it is received, with the text track list of cues being used for that parser's output."

We don't want to implement this, given the experience with <video>, see http://www.w3.org/html/wg/wiki/ChangeProposals/NoVideoContentType#Issues_for_Authors

WebVTT already has the WEBVTT header magic to differentiate it from other formats if we should ever support such in <track>.

The main consequence of doing this check is that <track> will appear broken to all authors when they first test it. Even if they figure out how to set Content-Type, it'll mostly be a waste of time and not provide any real benefits.
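
For illustration, the strict check in the quoted paragraph can be sketched roughly as below. This is a minimal sketch, not spec text: the function name `choose_parser_strict` and the table `SUPPORTED_TRACK_TYPES` are hypothetical, and text/vtt is the only format the quoted paragraph names.

```python
from typing import Optional

# Hypothetical table of supported text track formats.
# The quoted spec paragraph names only text/vtt.
SUPPORTED_TRACK_TYPES = {"text/vtt": "webvtt"}


def choose_parser_strict(content_type: Optional[str]) -> Optional[str]:
    """Return a parser name, or None when the load must fail."""
    if content_type is None:
        # No Content Type metadata is ever available: format is unsupported.
        return None
    # Drop parameters such as "; charset=utf-8" and normalise case.
    essence = content_type.split(";", 1)[0].strip().lower()
    # Unrecognised types map to None, which also fails the load.
    return SUPPORTED_TRACK_TYPES.get(essence)
```

Under this reading, a valid WebVTT file that the server labels text/plain yields `None`, which is the "appears broken" case described above.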

Comment 2 Frank Olivier 2011-09-01 15:46:44 UTC

> WebVTT already has the WEBVTT header magic to differentiate it from other
> formats if we should ever support such in <track>.

...but this requires the user agent to download the file to check the track type; this is not a good solution, especially in mobile scenarios.

> The main consequence of doing this check is that <track> will appear broken to
> all authors when they first test it. Even if they figure out how to set
> Content-Type, it'll mostly be a waste of time and not provide any real
> benefits.

I don't think that this is a big burden on developers; they use documentation and samples when using new elements. This would just have to be documented clearly, like a lot of other parts of the HTML5 spec.

The real benefit is that multiple track formats are supported cleanly.

Comment 3 Anne 2011-09-02 09:02:08 UTC

If you are already fetching the file it does not matter much whether you terminate the request when reading the Content-Type header or the first set of octets from the response entity body (even on mobile).

Comment 4 Silvia Pfeiffer 2011-09-04 11:14:16 UTC

(In reply to comment #3)
> If you are already fetching the file it does not matter much whether you
> terminate the request when reading the Content-Type header or the first set of
> octets from the response entity body (even on mobile).

Not for one file. But it matters if you have many files, which in this case can be several dozens if not hundreds of caption, subtitle, audio description etc. files in different languages.

Comment 5 Anne 2011-09-04 11:34:43 UTC

Why? The difference is marginal.

Comment 6 Silvia Pfeiffer 2011-09-04 11:48:10 UTC

(In reply to comment #5)
> Why? The difference is marginal.

Wait, are you comparing the difference between a HEAD and a GET request with a request for the first 8 bytes? Then, yes, that's marginal.

Comment 7 Anne 2011-09-04 12:02:58 UTC

That is what this bug is about.

Comment 8 Ian 'Hixie' Hickson 2011-09-14 22:23:04 UTC

As far as I can tell, the experience with <video> is that we're moving towards more strict checking, not moving towards sniffing.

In general, I agree that the platform would be more practical if we used sniffing everywhere instead of content type labeling. However, that's not how the platform is designed. If you want to deprecate Content-Type headers, I suggest bringing it up with the HTTP working group.

Comment 9 Philip Jägenstedt 2011-09-15 07:13:40 UTC

It is this spec that defines when and how the Content-Type headers are checked, not the HTTP spec. If the spec does not change, we will just ignore what it says on this point, since it is just annoying to web authors.

As for Content-Type in video, I assume that the suggestion in http://www.w3.org/Bugs/Public/show_bug.cgi?id=11984#c22 is what will eventually be spec'd, which includes sniffing in the absence of a trusted Content-Type. Doing the same for <track> might be fine, but strict checking is not.

Comment 10 Ian 'Hixie' Hickson 2011-09-15 23:04:29 UTC

Oh, if you mean just in the absence of a Content-Type, then sure, that can be added. I thought you meant ignoring the Content-Type altogether, which last I checked definitely violates HTTP.

Comment 11 Philip Jägenstedt 2011-09-16 08:29:18 UTC

I can't find any MUST requirements at all in http://tools.ietf.org/html/rfc2616#section-14.17 or related sections, how would this be a violation?

As with <video>, I see 3 options here:

1. Always obey Content-Type

2. Always ignore Content-Type

3. Add sniffing for WebVTT to http://tools.ietf.org/html/draft-ietf-websec-mime-sniff-03 and use the sniffed type, not the official content type. If spec'd as I imagine it, this would allow serving WebVTT as text/plain, but serving it as application/ttml+xml would fail.

(1) is unacceptable, (2) is what we've implemented, but (3) could be tolerable if <video> were also made consistent with that.
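
A rough way to compare the three options is to ask, for a given label and body, whether the resource ends up treated as WebVTT. The sketch below is an assumption-laden illustration: the function names are hypothetical, the signature test is reduced to a bare prefix check, and the set of labels that trigger sniffing in option 3 (`SNIFFABLE_LABELS`) is a guess at how it might be spec'd, not something the draft states.

```python
from typing import Optional

WEBVTT_MAGIC = b"WEBVTT"

# Assumption: labels for which option 3 would fall through to sniffing.
SNIFFABLE_LABELS = {None, "text/plain", "application/octet-stream"}


def looks_like_webvtt(body: bytes) -> bool:
    # Simplified signature test: a bare prefix match.
    return body.startswith(WEBVTT_MAGIC)


def option1_obey(content_type: Optional[str], body: bytes) -> bool:
    # 1. Always obey Content-Type; the body is never inspected.
    return content_type == "text/vtt"


def option2_ignore(content_type: Optional[str], body: bytes) -> bool:
    # 2. Always ignore Content-Type; only the body matters.
    return looks_like_webvtt(body)


def option3_sniffed(content_type: Optional[str], body: bytes) -> bool:
    # 3. Use the sniffed type: generic or missing labels are sniffed,
    #    specific labels such as application/ttml+xml are honoured.
    if content_type in SNIFFABLE_LABELS:
        return looks_like_webvtt(body)
    return content_type == "text/vtt"
```

With a body that starts with "WEBVTT", option 1 rejects a text/plain label, option 2 accepts any label, and option 3 accepts text/plain but rejects application/ttml+xml.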

Comment 12 Julian Reschke 2011-09-16 09:06:35 UTC

(In reply to comment #11)
> I can't find any MUST requirements at all in
> http://tools.ietf.org/html/rfc2616#section-14.17 or related sections, how would
> this be a violation?

Whether or not there is a MUST doesn't tell you anything about whether it's a requirement or not. You seem to be confused about when BCP 14 keywords need to be used; see RFC 2119 for details. Note that there are entire RFCs without any BCP 14 keywords; that doesn't affect their "normativeness" at all.

That being said, <http://greenbytes.de/tech/webdav/rfc2616.html#rfc.section.7.2.1> says:

"Any HTTP/1.1 message containing an entity-body SHOULD include a Content-Type header field defining the media type of that body. If and only if the media type is not given by a Content-Type field, the recipient MAY attempt to guess the media type via inspection of its content and/or the name extension(s) of the URI used to identify the resource. If the media type remains unknown, the recipient SHOULD treat it as type "application/octet-stream"."

But also note:

<http://trac.tools.ietf.org/wg/httpbis/trac/ticket/155>

> As with <video>, I see 3 options here:
> 
> 1. Always obey Content-Type
> 
> 2. Always ignore Content-Type
> 
> 3. Add sniffing for WebVTT to
> http://tools.ietf.org/html/draft-ietf-websec-mime-sniff-03 and use the sniffed
> type, not the official content type. If spec'd as I imagine it, this would
> allow serving WebVTT as text/plain, but serving it as application/ttml+xml
> would fail.
> 
> (1) is unacceptable, (2) is what we've implemented, but (3) could be tolerable
> if <video> were also made consistent with that.

I don't think (1) is unacceptable.
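
One possible reading of the RFC 2616 section 7.2.1 passage quoted in this comment is sketched below. The helper name `effective_media_type` is hypothetical, and the two guesses shown (a "WEBVTT" prefix and a ".vtt" extension) are illustrative assumptions; the RFC only says the recipient MAY guess from content and/or the name extension.

```python
from typing import Optional
from urllib.parse import urlsplit


def effective_media_type(content_type: Optional[str], body: bytes, url: str) -> str:
    """Guess the media type if and only if no Content-Type was given."""
    if content_type:
        # A Content-Type field is present, so no guessing is permitted.
        return content_type.split(";", 1)[0].strip().lower()
    # Guess via inspection of the content (illustrative check only).
    if body.startswith(b"WEBVTT"):
        return "text/vtt"
    # Guess via the name extension of the URI path, ignoring any query.
    if urlsplit(url).path.lower().endswith(".vtt"):
        return "text/vtt"
    # The type remains unknown: treat it as application/octet-stream.
    return "application/octet-stream"
```

Note that this reading never overrides a label that is present, which is the difference from options 2 and 3 in comment 11.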

Comment 13 Philip Jägenstedt 2011-09-16 12:11:33 UTC

(In reply to comment #12)

> "Any HTTP/1.1 message containing an entity-body SHOULD include a Content-Type
> header field defining the media type of that body. If and only if the media
> type is not given by a Content-Type field, the recipient MAY attempt to guess
> the media type via inspection of its content and/or the name extension(s) of
> the URI used to identify the resource. If the media type remains unknown, the
> recipient SHOULD treat it as type "application/octet-stream"."
> 
> But also note:
> 
> <http://trac.tools.ietf.org/wg/httpbis/trac/ticket/155>

Great, this is option 3.

> > As with <video>, I see 3 options here:
> > 
> > 1. Always obey Content-Type
> > 
> > 2. Always ignore Content-Type
> > 
> > 3. Add sniffing for WebVTT to
> > http://tools.ietf.org/html/draft-ietf-websec-mime-sniff-03 and use the sniffed
> > type, not the official content type. If spec'd as I imagine it, this would
> > allow serving WebVTT as text/plain, but serving it as application/ttml+xml
> > would fail.
> > 
> > (1) is unacceptable, (2) is what we've implemented, but (3) could be tolerable
> > if <video> were also made consistent with that.
> 
> I don't think (1) is unacceptable.

Will not implement, it is an unnecessary hoop for authors to jump through and not actually necessary to support multiple text formats.

Comment 14 Ian 'Hixie' Hickson 2011-09-19 22:52:14 UTC

Ah, well, if you're saying you're not going to implement it then naturally I'll update the spec accordingly.

Basically this means changing this paragraph:

"The tasks queued by the fetching algorithm on the networking task source to process the data as it is being fetched must examine the resource's Content Type metadata, once it is available, if it ever is. If no Content Type metadata is ever available, or if the type is not recognised as a text track format, then the resource's format must be assumed to be unsupported (this causes the load to fail, as described below). If a type is obtained, and represents a supported text track format, then the resource's data must be passed to the appropriate parser (e.g. the WebVTT parser if the Content Type metadata is text/vtt) as it is received, with the text track list of cues being used for that parser's output."

...to something that instead of checking the Content-Type header recognises the file format based on signatures, and add a note mentioning that this is an intentional violation of the HTTP spec motivated by compatibility with implementations.

What's the signature for TTML?

Comment 15 Philip Jägenstedt 2011-09-20 09:35:48 UTC

The specific signatures should be added to http://tools.ietf.org/html/draft-ietf-websec-mime-sniff-03, what is needed is to ensure that this spec has the necessary hooks and that we have consistency with how <video> works.

Would it be acceptable to add another section for "Timed Text" or some such that is similar to http://tools.ietf.org/html/draft-ietf-websec-mime-sniff-03#section-8 ? Since most text formats are unsniffable I expect it would initially contain only an entry for text/vtt.

(Note that <video> doesn't strictly obey Content-Type, yet there is no note citing a willful violation of HTTP. Oversight?)

Comment 16 Ian 'Hixie' Hickson 2011-09-20 20:05:16 UTC

Didn't we change <video> to strictly adhere to HTTP here?

What's the signature for TTML?

Comment 17 Philip Jägenstedt 2011-09-21 07:41:30 UTC

(In reply to comment #16)
> Didn't we change <video> to strictly adhere to HTTP here?

AFAIK, nothing has changed since http://www.w3.org/Bugs/Public/show_bug.cgi?id=11984 was filed. AFAICT, the spec still says to check if Content-Type is one of the supported types, missing or application/octet-stream and to then sniff. However, the spec is (deliberately?) vague here, so I can't say for sure. In any case we all know that the spec is bogus until Bug 11984 is resolved in a way that everyone is actually willing to implement.

> What's the signature for TTML?

It's XML, we all know that sniffing it is not a good idea.

Comment 18 Philip Jägenstedt 2011-09-21 08:00:51 UTC

Would it be acceptable to add another section for "Timed Text"?

Comment 19 Ian 'Hixie' Hickson 2011-09-26 19:41:27 UTC

I misremembered what we did with media resources.


> > What's the signature for TTML?
> 
> It's XML, we all know that sniffing it is not a good idea.

I'm at a loss as to how to make this work then. Could you elaborate on how you want this to work?

Comment 20 Silvia Pfeiffer 2011-09-27 07:07:32 UTC

(In reply to comment #19)
> > > What's the signature for TTML?
> > 
> > It's XML, we all know that sniffing it is not a good idea.
> 
> I'm at a loss as to how to make this work then. Could you elaborate on how you
> want this to work?

So, the XML Media Type registration [1] reckons you can use the string "<?xml" to identify it as an XML document.

Further, the TTML spec [2] indicates in detail how to identify a TTML document (probably a simple approximation is to see if it contains some form of "http://www.w3.org/ns/ttml" in a namespace declaration).

In any case - I'd leave these sniffing details to the browser that wants to implement support for TTML or just point to [2].

[1] http://tools.ietf.org/html/rfc3023
[2] http://www.w3.org/TR/ttaf1-dfxp/#doctypes
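
The simple approximation described in this comment might look something like the sketch below. It is deliberately crude and rests on assumptions: the names `looks_like_xml`, `looks_like_ttml` and the `probe_size` parameter are hypothetical, the document is assumed to be in an ASCII-compatible encoding, and an XML declaration is assumed to be present even though XML does not require one.

```python
UTF8_BOM = b"\xef\xbb\xbf"
TTML_NAMESPACE = b"http://www.w3.org/ns/ttml"


def looks_like_xml(data: bytes) -> bool:
    # Skip an optional UTF-8 byte order mark, then look for "<?xml".
    if data.startswith(UTF8_BOM):
        data = data[len(UTF8_BOM):]
    return data.startswith(b"<?xml")


def looks_like_ttml(data: bytes, probe_size: int = 1024) -> bool:
    # Only inspect the first probe_size octets (an arbitrary choice here).
    head = data[:probe_size]
    # Approximation: an XML declaration plus the TTML namespace string.
    return looks_like_xml(head) and TTML_NAMESPACE in head
```

A substring search like this can be fooled (for example by the namespace appearing in a comment), which is one reason to leave the details to an implementation that actually supports TTML.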

Comment 21 Philip Jägenstedt 2011-09-27 07:49:51 UTC

(In reply to comment #19)
> I misremembered what we did with media resources.
> 
> 
> > > What's the signature for TTML?
> > 
> > It's XML, we all know that sniffing it is not a good idea.
> 
> I'm at a loss as to how to make this work then. Could you elaborate on how you
> want this to work?

Add a section "Text Tracks" to http://tools.ietf.org/html/draft-ietf-websec-mime-sniff-03 that says:

This section defines the *rules for sniffing text tracks specifically*.

   If the first octets match one of the signatures in Section 6 for one
   of the following media types, then let the sniffed-type be the
   corresponding media type and abort these steps:

   o  text/vtt

   Otherwise, let the sniffed-type be the official-type and abort these
   steps.

Of course, also update section 6 with the WebVTT magic.

In effect, this will allow browsers that only support WebVTT to ignore Content-Type, since anything that begins with "WEBVTT" works and anything that doesn't will fire an error event (technically for two different reasons, but the API doesn't make that distinction).
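
The proposed rules could be sketched as follows, for a browser that supports only WebVTT. The function names are hypothetical, and the signature test is an assumption modelled on the WebVTT file format (optional UTF-8 BOM, then "WEBVTT" followed by a space, tab, line terminator, or end of data); the sniffing draft's section 6 would be the authority on the actual octets.

```python
from typing import Optional

UTF8_BOM = b"\xef\xbb\xbf"


def matches_webvtt_signature(data: bytes) -> bool:
    # Assumed signature: optional BOM, "WEBVTT", then whitespace or end.
    if data.startswith(UTF8_BOM):
        data = data[len(UTF8_BOM):]
    if not data.startswith(b"WEBVTT"):
        return False
    return data[6:7] in (b"", b" ", b"\t", b"\n", b"\r")


def sniff_text_track(official_type: Optional[str], first_octets: bytes) -> Optional[str]:
    # If the first octets match a listed signature, use that media type.
    if matches_webvtt_signature(first_octets):
        return "text/vtt"
    # Otherwise, let the sniffed-type be the official-type.
    return official_type


def load_track(official_type: Optional[str], data: bytes) -> str:
    sniffed = sniff_text_track(official_type, data)
    if sniffed != "text/vtt":
        return "error"  # reason 1: unsupported format
    if not matches_webvtt_signature(data):
        return "error"  # reason 2: the WebVTT parser rejects the file
    return "loaded"
```

Both failure paths return the same "error" result, mirroring the point that the API does not distinguish the two reasons.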

Comment 22 Philip Jägenstedt 2011-09-27 08:27:48 UTC

To clarify, TTML would not be sniffed at all. If there should ever be a browser that implements it and they also don't think Content-Type is funny, they can work to add sniffing for it at that time.

Comment 23 Ian 'Hixie' Hickson 2011-09-30 16:14:08 UTC

Oh, I see. So you're saying to not ignore the Content-Type altogether, but just to use it in combination with MIME sniff as a fallback. I guess we can do that. It'll make the HTTP guys flip out again...

Comment 24 Philip Jägenstedt 2011-10-03 07:58:48 UTC

Well, I want to ignore Content-Type, but since you bring up TTML I suggest that draft-abarth-mime-sniff do sniffing for WebVTT and none for TTML. The net result is that regardless of Content-Type, if the resource has the WEBVTT magic bytes, it'll be treated as WebVTT. For browsers that only support WebVTT, it's equivalent to ignoring Content-Type.

Comment 25 Ian 'Hixie' Hickson 2011-10-20 23:29:10 UTC

What do you want to have happen if you browse a browser to a page that is labeled text/vtt and starts with "WEBVTT"?

What do you want to have happen if you browse a browser to a page that is labeled text/plain and starts with "WEBVTT"?

What do you want to have happen if a <track> points to a text/vtt file that starts with "WEBVTT"?

What do you want to have happen if a <track> points to a text/xml file that starts with "WEBVTT"?

Comment 26 contributor 2011-10-20 23:31:09 UTC

Checked in as WHATWG revision r6721.
Check-in comment: (WIP - MIMESNIFF has not yet been updated accordingly) Change the spec to use MIMESNIFF rules for text tracks instead of blindly honouring MIME types.
http://html5.org/tools/web-apps-tracker?from=6720&to=6721

Comment 27 Philip Jägenstedt 2011-10-21 08:29:19 UTC

(In reply to comment #25)
> What do you want to have happen if you browse a browser to a page that is
> labeled text/vtt and starts with "WEBVTT"?

Not relevant to this bug, but just displaying it as plain text would seem the natural choice. I don't suppose we want to disallow fancier display methods, automatic translation, etc, but that sounds unlikely to happen.

> What do you want to have happen if you browse a browser to a page that is
> labeled text/plain and starts with "WEBVTT"?

Per the "Web Pages" section in the sniffing spec text/plain leads to sniffing, so it'll be sniffed as text/vtt and the same thing as above will happen.

> What do you want to have happen if a <track> points to a text/vtt file that
> starts with "WEBVTT"?

It'll be parsed and rendered as WebVTT.

> What do you want to have happen if a <track> points to a text/xml file that
> starts with "WEBVTT"?

Per comment 21, it will be sniffed as text/vtt and then parsed+rendered as WebVTT. (The same will happen regardless of the media type.)

Finally, although you didn't ask, navigating to a text/xml file that starts with "WEBVTT" would not cause it to be sniffed as WebVTT, since text/xml is not in the table that leads to sniffing in step 2 of <http://tools.ietf.org/html/draft-ietf-websec-mime-sniff-03#section-4>.
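
The answers above depend on the fetch context, which a small sketch may make easier to follow. Everything here is a simplification under stated assumptions: `sniff_in_context` is a hypothetical name, `NAVIGATION_SNIFFABLE` is an abbreviated stand-in for the table in the sniffing draft's "Web Pages" section, and the signature test is a bare prefix check.

```python
from typing import Optional

# Abbreviated, assumed stand-in for the labels that lead to sniffing
# when navigating; text/xml is deliberately absent.
NAVIGATION_SNIFFABLE = {None, "text/plain", "application/octet-stream"}


def sniff_in_context(official_type: Optional[str], body: bytes, context: str) -> Optional[str]:
    has_magic = body.startswith(b"WEBVTT")
    if context == "track":
        # In a <track> context the signature wins regardless of the label.
        return "text/vtt" if has_magic else official_type
    if context == "navigate":
        # When navigating, only certain labels lead to sniffing at all.
        if official_type in NAVIGATION_SNIFFABLE and has_magic:
            return "text/vtt"
        return official_type
    raise ValueError(f"unknown context: {context}")
```

With a body starting with "WEBVTT", a text/xml label gives text/vtt in the "track" context but stays text/xml in the "navigate" context.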

Comment 28 Ian 'Hixie' Hickson 2011-10-25 04:54:17 UTC

Your last two paragraphs seem to contradict each other. Can you elaborate?

Comment 29 Anne 2011-10-25 07:44:03 UTC

You need to contrast with <img> loading. When navigating not everything is sniffed, when loading from <img> everything is sniffed (unless image/svg+xml, which here would be application/ttml+xml or some such).

Comment 30 Philip Jägenstedt 2011-10-25 08:54:50 UTC

(In reply to comment #28)
> Your last two paragraphs seem to contradict each other. Can you elaborate?

There is no contradiction, since the same sniffing wouldn't be applied when navigating and when fetching in a <track> context.

However, if there is a solution which allows the exact same sniffing rules to be applied for both <track> and <video> contexts and when navigating I'd be perfectly happy with that. The main point is that it should be possible to serve video and text tracks without Content-Type or with application/octet-stream and text/plain respectively and have it just work, which strictly speaking doesn't require always ignoring Content-Type, even if that is my preferred solution.

Comment 31 Ian 'Hixie' Hickson 2011-10-25 23:33:01 UTC

We can do whatever, I'm just trying to work out what you want. :-)

I didn't notice the distinction of navigate vs <track> in comment 27's last two paragraphs, sorry for the confusion. What we do when you navigate to a text/xml file is pretty much a given, I don't see any way we'd want to sniff for that. That's why I didn't ask about it.

Currently, what Anne has said and what Philip has said conflict (comment 27 and comment 29). If we do what <img> does, then we have to explicitly list all the types we _don't_ want to sniff, which would include the TTML type. But that doesn't seem to be what Philip is suggesting; if I understand correctly, that's more a matter of just having a list of formats that we sniff for regardless of the type when used with <track>, and then falling back on the default; for navigation, though, the sniffing would work as now, with just one more row in the MIMESNIFF table.

I guess the remaining question is, if we fall back to the Content-Type, and a file is labeled as text/vtt but doesn't have the WEBVTT signature, should we still try to parse it as WEBVTT, or should we not recognise it? (I guess if we say that if the signature is missing we still fire onerror, it becomes a non-issue. See bug 14294.)

Comment 32 Philip Jägenstedt 2011-10-26 08:53:46 UTC

(In reply to comment #31)

> Currently, what Anne has said and what Philip has said conflict (comment 27 and
> comment 29). If we do what <img> does, then we have to explicitly list all the
> types we _don't_ want to sniff, which would include the TTML type. But that
> doesn't seem to be what Philip is suggesting; if I understand correctly, that's
> more a matter of just having a list of formats that we sniff for regardless of
> the type when used with <track>, and then falling back on the default; for
> navigation, though, the sniffing would work as now, with just one more row in
> the MIMESNIFF table.

Hmm, I hadn't noticed that special case of image sniffing and note that there is nothing similar for sniffing video (although that section is fiction until bug 11984 is resolved). Is there a particular reason for special-casing "image/svg+xml" that would apply to TTML as well? If not, my suggestion in comment 21 stands.

> I guess the remaining question is, if we fallback to the Content-Type, and a 
> file is labeled as text/vtt but doesn't have the WEBVTT signature, should we
> still try to parse it as WEBVTT, or should we not recognise it? (I guess if we
> say that if the signature is missing we still fire onerror, it becomes a
> non-issue. See bug 14294.)

We should still try to parse it and it will fail. The whole assumption here is that this bug together with bug 14294 will allow implementations that only support WebVTT to ignore Content-Type and just try to parse it.

Comment 33 Ian 'Hixie' Hickson 2011-10-26 20:45:46 UTC

The reason for the image/svg+xml thing is to minimise how much we ignore Content-Type, so if that applies here too, we shouldn't do any sniffing...

Anyway, looks like what you want is for <track src>, first do signature inspection (sniffing), and if that doesn't find anything, then trust the MIME type. Since there's only one signature, I can just do that inline instead of requiring changes to MIMESNIFF. That will also mean that there's no need for any magic in the navigation step.

Comment 34 Frank Olivier 2011-10-26 23:40:27 UTC

(In reply to comment #8)
> As far as I can tell, the experience with <video> is that we're moving towards
> more strict checking, not moving towards sniffing.

Agree; In Internet Explorer, we do strict checking on the mime types for <video> and <track> sources.

Comment 35 Philip Jägenstedt 2011-10-27 08:24:26 UTC

(In reply to comment #33)
> The reason for the image/svg+xml thing is to minimise how much we ignore
> Content-Type, so if that applies here too, we shouldn't do any sniffing...
> 
> Anyway, looks like what you want is for <track src>, first do signature
> inspection (sniffing), and if that doesn't find anything, then trust the MIME
> type. Since there's only one signature, I can just do that inline instead of
> requiring changes to MIMESNIFF. That will also mean that there's no need for
> any magic in the navigation step.

That's an editorial choice, but spreading the sniffing around instead of keeping it in one place isn't the choice I would make.

Comment 36 Ian 'Hixie' Hickson 2011-10-28 19:52:47 UTC

Philip: What you're asking for can't be "in one place" regardless of which spec it's in, since it would be a separate section even if it was in MIMESNIFF.

Frank: Are you saying you disagree with the change Philip is proposing here?

Comment 37 Ian 'Hixie' Hickson 2011-11-02 19:44:46 UTC

So what is the spec supposed to say, at this point?

Comment 38 Ian 'Hixie' Hickson 2011-11-03 16:30:55 UTC

Unless browser vendors agree on what they are intending to implement here, which seems not to be the case, I will probably go back to strict adherence to the HTTP specs, with a warning in the spec saying that this is likely to change, and then let the market figure it out. Once browsers converge on a particular behaviour, or once there is agreement on what they should converge on, I will update the spec to match it.

As with the <video> element case, you are of course welcome to try to use the escalation process to make a decision sooner, but of course, as with the <video> case, if it doesn't match what finally gets implemented, it doesn't much matter what that decision is.

Comment 39 Philip Jägenstedt 2011-11-03 16:58:07 UTC

OK, as long as the spec is clear in both cases about the status of the issue, I can live with this.

Comment 40 Anne 2011-11-03 17:01:40 UTC

FWIW, I think it should work like images and fonts. Always sniff, and add exceptions (e.g. application/ttml+xml) as needed.
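
A sketch of this "always sniff, with exceptions" model, assuming application/ttml+xml is the only exception and reducing the signature test to a prefix check (`NEVER_SNIFF` and `sniff_with_exceptions` are hypothetical names):

```python
from typing import Optional

# Assumed exception list; the comment gives application/ttml+xml as an example.
NEVER_SNIFF = {"application/ttml+xml"}


def sniff_with_exceptions(official_type: Optional[str], body: bytes) -> Optional[str]:
    # Listed exceptions are trusted as labelled and never sniffed.
    if official_type in NEVER_SNIFF:
        return official_type
    # Everything else is sniffed, whatever the label says.
    if body.startswith(b"WEBVTT"):
        return "text/vtt"
    return official_type
```

The practical difference from the comment 21 proposal is the first branch: a body starting with "WEBVTT" but labelled application/ttml+xml stays TTML here, whereas comment 21 would treat it as WebVTT.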

Comment 41 Ian 'Hixie' Hickson 2011-11-12 00:31:26 UTC

I've put a warning in the spec. Marking this LATER for now; once browser vendors have converged on a behaviour or desired behaviour I'll revisit this. (See comment 32.)