18396 – Encoding Sniffing Algorithm: Add an XML check as a step zero

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Status:	RESOLVED FIXED

Alias:	None

Product:	HTML WG
Classification:	Unclassified
Component:	HTML5 spec
Version:	unspecified
Hardware:	PC All

Importance:	P1 normal
Target Milestone:	---
Assignee:	This bug has no owner yet - up for the taking
QA Contact:	HTML WG Bugzilla archive list

URL:	http://dev.w3.org/html5/spec/Overview...
Whiteboard:	whatwg-resolved
Keywords:

Depends on:
Blocks:

Reported:	2012-07-25 12:31 UTC by Leif Halvard Silli
Modified:	2016-04-20 22:39 UTC
CC List:	7 users

Description Leif Halvard Silli 2012-07-25 12:31:00 UTC

Proposal: Extend the encoding sniffing algorithm by adding a new,
          explicit step zero, like so:

     0. If the document is an XML document, abort these steps.

Justification.

    Extending the algorithm this way gives it an *explicit* step to
'jump out of the algorithm if XML', for which it would also be possible
to write test cases.

    Currently, and especially if the XML document lives in a 'nested
browsing context'[1], some browsers (unless there is a BOM) let the XML
document default to the encoding of the 'parent browsing context'
instead of the default encoding of the XML format (UTF-8).
WebKit/Chromium/Opera have this error. Firefox does not have this
error. I have not tested IE9/10 yet, but suspect they are more on
Firefox's side. Regarding defaulting to the encoding of the parent
browsing context, [see bug #foo and see bug #bar]
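
A minimal sketch of how the proposed step zero could sit in front of the
other steps, and of the kind of test case it would allow. The function
name, its parameters and the windows-1252 fallback are assumptions made
for illustration only; the real algorithm has further steps (for
example transport-layer charset information and the <meta> prescan)
that are left out here.

```python
import codecs

# Byte order marks checked by this simplified sketch, paired with the
# encoding label each one implies.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16be"),
    (codecs.BOM_UTF16_LE, "utf-16le"),
]


def sniff_encoding(head, is_xml, parent_encoding=None,
                   html_default="windows-1252"):
    """Hypothetical, heavily simplified encoding sniffing.

    Returns an encoding label, or None when the steps are aborted.
    """
    # Step 0 (the proposal): XML documents never enter HTML sniffing,
    # so the XML processor applies its own rules (UTF-8 by default).
    if is_xml:
        return None

    # A BOM wins over everything else in this sketch.
    for bom, label in BOMS:
        if head.startswith(bom):
            return label

    # Nested browsing context: inherit the parent's encoding.
    if parent_encoding is not None:
        return parent_encoding

    # Assumed locale-dependent fallback for HTML.
    return html_default
```

With such a step in place, an XML document in a nested browsing context
no longer reaches the 'parent browsing context' step at all, which is
exactly the behaviour a test case could assert.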

More data in my related blog post.[2]

[1] http://dev.w3.org/html5/spec/Overview#nested-browsing-context
[2] http://målform.no/blog/white-spots-in-html5-s-encoding-sniffing-algorithm

Comment 1 Michael[tm] Smith 2015-06-16 10:17:52 UTC

Raising the priority of this bug to actively seek more feedback from implementers and web developers.

Comment 2 Chris Rebert 2016-02-04 07:42:34 UTC

It seems like the current spec addresses this sufficiently.

Quoting from https://mimesniff.spec.whatwg.org/#determining-the-computed-mime-type-of-a-resource :
> 4. If the supplied MIME type is an XML type, the computed MIME type is the supplied MIME type. Abort these steps.
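
As an illustration only, the quoted step could be sketched like this. `is_xml_mime_type` follows the MIME Sniffing definition of an XML type (a subtype ending in "+xml", or the essence "text/xml" or "application/xml"). `computed_mime_type` and its `sniff_rest` callback are hypothetical names standing in for the rest of that algorithm.

```python
def is_xml_mime_type(mime_type):
    """Return True if mime_type is an XML MIME type."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    essence = mime_type.split(";", 1)[0].strip().lower()
    if essence in ("text/xml", "application/xml"):
        return True
    # Otherwise require a type/subtype pair whose subtype ends in "+xml".
    return "/" in essence and essence.endswith("+xml")


def computed_mime_type(supplied, sniff_rest):
    """Hypothetical wrapper: bail out early for XML, else keep sniffing."""
    if supplied is not None and is_xml_mime_type(supplied):
        # Quoted step 4: the supplied type is used unchanged.
        return supplied
    # All other steps are represented by the caller-provided callback.
    return sniff_rest()
```

For example, `computed_mime_type("image/svg+xml", ...)` returns the supplied type without ever calling the callback.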

Comment 3 Travis Leithead [MSFT] 2016-04-20 22:39:06 UTC

HTML5.1 Bugzilla Bug Triage: Fixed!

Confirmed that W3C HTML links to [MIMESNIFF], which does indeed bail out early for XML.

If this resolution is not satisfactory, please copy the relevant bug details/proposal into a new issue at the W3C HTML5 Issue tracker: https://github.com/w3c/html/issues/new where it will be re-triaged. Thanks!