This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
<http://dev.w3.org/html5/spec/Overview.html#content-type-sniffing>:
"If it is a U+0022 QUOTATION MARK ('"') and there is a later U+0022 QUOTATION MARK ('"') in s
If it is a U+0027 APOSTROPHE ("'") and there is a later U+0027 APOSTROPHE ("'") in s
Return the encoding corresponding to the string between this character and the next earliest occurrence of this character."
This is indeed a violation of the Content-Type syntax defined in RFC 2616, because it does not handle backslash escapes inside quoted-string properly. The spec claims that this is required for "backwards compatibility with legacy content". I'm attaching a test case showing that the following browsers *do* handle escapes despite what the spec says: Opera, Safari, Konqueror 4.4. Please remove the requirement to violate the base syntax.
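To make the difference concrete, here is a minimal Python sketch of the two readings of a quoted charset value. The function names and sample inputs are hypothetical (they are not taken from the attached test case), and handling of unquoted values is omitted. The first function follows the quoted spec text (stop at the next matching quote, no escapes); the second follows the RFC 2616 quoted-string grammar, where a backslash escapes the next character (quoted-pair).

```python
def charset_spec_style(value: str) -> str:
    """Quoted value as in the spec text quoted above: the string between the
    opening quote and the next occurrence of the same quote character.
    Backslashes are treated as ordinary characters."""
    if value and value[0] in ('"', "'"):
        end = value.find(value[0], 1)
        if end != -1:
            return value[1:end]
    return value  # unquoted handling is out of scope for this sketch


def charset_rfc2616_style(value: str) -> str:
    """Quoted value per RFC 2616 quoted-string: only double quotes delimit,
    and a backslash escapes the following character (quoted-pair)."""
    if not value.startswith('"'):
        return value  # unquoted handling is out of scope for this sketch
    out = []
    i = 1
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value):
            out.append(value[i + 1])  # keep the escaped character, drop the backslash
            i += 2
        elif ch == '"':
            break  # unescaped quote closes the string
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# Hypothetical inputs: an escaped hyphen, and an escaped quote inside the value.
print(charset_spec_style(r'"UTF\-8"'))     # backslash kept in the result
print(charset_rfc2616_style(r'"UTF\-8"'))  # backslash removed
print(charset_spec_style(r'"a\"b"'))       # stops at the escaped quote
print(charset_rfc2616_style(r'"a\"b"'))    # escaped quote is part of the value
```

The two functions only disagree when the value contains a backslash, which is the case this bug is about.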
Created attachment 917: test case for backslash-escapes in quoted-string
It seems better for Opera to match Chrome / Gecko / IE here. We may very well hit problems because of this.
(In reply to comment #2)
> It seems better for Opera to match Chrome / Gecko / IE here. We may very well hit problems because of this.
Please elaborate. The backslash character isn't even allowed in charset names. I believe that requiring an incompatibility with a base spec needs much stronger justification.
(In reply to comment #3)
> The backslash character isn't even allowed in charset names.
Right, so we may need to ignore the rule rather than handle it.
(In reply to comment #4)
> Right, so we may need to ignore the rule rather than handle it.
Nope, unless you can prove that there is existing content that "breaks" because it uses "\x" although it wants to say "x". In the absence of that proof, the right thing to do is to handle escape sequences as specified.
EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document: http://dev.w3.org/html5/decision-policy/decision-policy.html
Status: Rejected
Change Description: no spec change
Rationale: WebKit trunk doesn't support this. Are you sure you haven't set your default encoding to UTF-8 in Safari? What version are you testing? Opera doesn't support this either. It just ignores all punctuation. For instance, see: http://www.hixie.ch/tests/adhoc/html/parsing/encoding/121.html This is known to be incompatible with legacy content, however (the spec used to do this too, but had to change for compat reasons).
1) I'm running Safari 5.0.2 on Windows, and "Text Encoding" is set to "default".
2) The question of *why* Opera is handling the escapes properly isn't really relevant; what counts is the observable behavior.
You have claimed that this violation is needed for "compatibility", yet neither the shipping versions of Opera and Safari on Windows nor Konqueror do this. I think this is proof that the claim is incorrect. Please provide more data.
What does Safari get for you on this test (the expected and actual encodings, not the pass/fail state)?: http://www.hixie.ch/tests/adhoc/html/parsing/encoding/113.html
(In reply to comment #8)
> What does Safari get for you on this test (the expected and actual encodings, not the pass/fail state)?:
> http://www.hixie.ch/tests/adhoc/html/parsing/encoding/113.html
Expected result: Windows-1252
Encoding used by browser is: Windows-1254
Upon further investigation, it turns out WebKit has changed behaviour. It used to ignore all punctuation (it didn't support backslash escapes; it just ignored punctuation, like Opera). Newer trunk builds, however, no longer ignore punctuation.
(In reply to comment #7)
> 2) The question of *why* Opera is handling the escapes properly isn't really relevant; what counts is the observable behavior.
The point is that Opera isn't handling the escapes at all. For example, if you put a backslash before the final quote, it doesn't continue the string; it still ends at the quote. Consider:
http://www.hixie.ch/tests/adhoc/html/parsing/encoding/122.html
http://www.hixie.ch/tests/adhoc/html/parsing/encoding/123.html
EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document: http://dev.w3.org/html5/decision-policy/decision-policy.html
Status: Rejected
Change Description: no spec change
Rationale: As far as I can tell, no browsers with more than 1% market share do anything with backslashes at all, and the browsers that ignored backslashes are actively moving towards what the spec says.
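The distinction drawn here (ignoring punctuation versus handling escapes) can be sketched in Python as follows. This is only an illustration under stated assumptions: `loose_label` is a hypothetical, simplified stand-in for loose charset-label matching (it drops every character except ASCII letters and digits and lowercases the rest), `raw_quoted_value` is a hypothetical helper that ends the value at the next quote, and the inputs are invented rather than taken from tests 122 and 123.

```python
import re


def loose_label(label: str) -> str:
    """Simplified loose matching (an assumption, not any browser's code):
    delete everything except ASCII letters and digits, then lowercase."""
    return re.sub(r"[^A-Za-z0-9]", "", label).lower()


def raw_quoted_value(value: str) -> str:
    """Text between the opening double quote and the next double quote.
    Backslashes are ordinary characters, so a backslash never continues
    the string past a quote."""
    end = value.find('"', 1)
    return value[1:end] if value.startswith('"') and end != -1 else value


# An escaped hyphen: punctuation stripping makes the label match "UTF-8",
# which looks the same from outside as if the escape had been processed.
print(loose_label(raw_quoted_value(r'"utf\-8"')) == loose_label("UTF-8"))

# A backslash before a quote: the value still ends at that quote,
# so the trailing text is never seen.
print(raw_quoted_value(r'"utf-8\" x"'))
```

Under these assumptions, an implementation that strips punctuation passes a test with an escaped hyphen without processing escapes at all, while a backslash placed before a quote exposes the difference.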
You still haven't provided evidence that this is needed for compatibility. Will raise a tracker issue.
(In reply to comment #11)
> You still haven't provided evidence that this is needed for compatibility.
>
> Will raise a tracker issue.
Are you suggesting that backslash-processing complexity be introduced into implementations without compatibility with existing content requiring it?
What I'm suggesting is that implementations should treat quoted-strings in HTTP parameters uniformly, so special-casing Content-Type is *adding* complexity.
Opera will very soon match Safari nightlies and everyone else here. Us "supporting" this was just a side effect of supporting the way too liberal UTS22 matching rules.
(In reply to comment #14)
> Opera will very soon match Safari nightlies and everyone else here. Us "supporting" this was just a side effect of supporting the way too liberal UTS22 matching rules.
A self-fulfilling prophecy. Anyway, I have shown that the claim that this is "required for compatibility" is wrong; thus I will escalate the issue.
Raised as http://www.w3.org/html/wg/tracker/issues/126