This is an archived snapshot of W3C's public Bugzilla bug tracker, decommissioned in April 2019.

Bug 18474 - Encoding Sniffing Algorithm: parent browsing context defines encoding default
Status: RESOLVED FIXED
Alias: None
Product: WHATWG
Classification: Unclassified
Component: HTML
Version: unspecified
Hardware: PC All
Importance: P2 normal
Target Milestone: Unsorted
Assignee: Ian 'Hixie' Hickson
QA Contact: contributor
URL: http://dev.w3.org/html5/spec/Overview...
Reported: 2012-08-02 19:05 UTC by Ian 'Hixie' Hickson
Modified: 2012-11-25 05:33 UTC

Description Ian 'Hixie' Hickson 2012-08-02 19:05:57 UTC
+++ This bug was initially created as a clone of Bug #18394 +++

Proposal: Extend the encoding sniffing algorithm[1] with a new,
          second-to-last step, like so:

     #. If the document lives in a 'nested browsing context'[2],
        then return the encoding of the 'parent browsing context',
        as a parent browsing context dictated default encoding,
        and abort these steps.
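
A minimal sketch of the proposed step, assuming a simplified document
model. The names `Document`, `fallback_encoding` and `locale_default`
are hypothetical and appear in neither the spec nor any browser; only
the last two steps of the algorithm are modelled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Document:
    # Encoding already determined for this document, if any.
    encoding: Optional[str] = None
    # Document of the parent browsing context; None for a top-level document.
    parent: Optional["Document"] = None


def fallback_encoding(doc: Document, locale_default: str = "windows-1252") -> str:
    """Model the tail of the sniffing algorithm with the proposed step added."""
    # Proposed second-to-last step: a document in a nested browsing context
    # takes the encoding of the document in the parent browsing context.
    if doc.parent is not None and doc.parent.encoding is not None:
        return doc.parent.encoding
    # Existing last step: implementation-defined or user-specified default.
    # "windows-1252" is only an illustrative value for that default.
    return locale_default
```

With this sketch, a frame without encoding information inside a
KOI8-R parent would come out as KOI8-R, while the same file loaded at
the top level would fall through to the locale default.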

Bug #3: Justification.

   (1) Currently, the HTML5 encoding sniffing algorithm does not take 
into account that, when the document of a nested browsing context has 
not been supplied with encoding information, Web browsers[*] do *not* 
"return an implementation-defined or user-specified default character 
encoding" (as HTML5 currently requires). Web browsers instead return a 
'parent browsing context-defined' character encoding: the encoding of 
the document in the parent browsing context.

     [*]I did not test the relevant editions of IE - IE8/IE9/IE10 - yet.
        But I know that IE6 does not consider the encoding of the parent
        browsing context.

   (2) By explicitly including the 'parent browsing context encoding 
default' in the algorithm, we make sure that browsers apply the 
default at the same step.
       The problem, right now, is that the browsers that thus far have 
implemented the encoding sniffing algorithm's current step 7 (encoding 
pattern matching/detection) disagree about whether it should take place 
*before* the parent browsing context default is applied or *after* 
the encoding of the parent browsing context has been considered.
       The latter approach, which Chrome seems to take, means that step 
7 is unlikely to take place at all if the document lives in a nested 
browsing context. Firefox 12 (which by default only performs step 7 for 
some locales or at user request) and Opera 12 (which, unlike at least 
Opera 10, applies step 7 for all locales) take the approach that 
encoding pattern matching/detection should occur before the locale 
default eventually is applied.
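
The ordering disagreement can be illustrated with a small sketch. It
assumes a hypothetical `detect` callback standing in for step 7 and a
hypothetical `detect_before_parent` flag selecting between the two
orderings; it does not model any browser's real implementation.

```python
from typing import Callable, Optional


def sniff_tail(
    body: bytes,
    parent_encoding: Optional[str],
    locale_default: str,
    detect: Callable[[bytes], Optional[str]],
    detect_before_parent: bool,
) -> str:
    """Return the encoding chosen by the last steps, in either ordering."""
    if detect_before_parent:
        # Detection (step 7) first, then the parent browsing context default.
        steps = (lambda: detect(body), lambda: parent_encoding)
    else:
        # Parent default first: detection only runs when there is no parent
        # encoding, so it rarely runs in a nested browsing context.
        steps = (lambda: parent_encoding, lambda: detect(body))
    for step in steps:
        result = step()
        if result is not None:
            return result
    # Neither step produced an encoding: use the locale default.
    return locale_default
```

For the same framed document, the two orderings can give different
answers whenever detection and the parent's encoding disagree, which
is why the step needs a fixed position in the algorithm.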


For more, see the blog post I wrote in connection with this bug report.[3]

[1] http://dev.w3.org/html5/spec/Overview#encoding-sniffing-algorithm
[2] http://dev.w3.org/html5/spec/Overview#nested-browsing-context
[3] http://målform.no/blog/white-spots-in-html5-s-encoding-sniffing-algorithm
Comment 1 Silvia Pfeiffer 2012-09-22 02:18:09 UTC
Isn't this fixed in
http://html5.org/tools/web-apps-tracker?from=7323&to=7324 ?
Comment 2 Leif Halvard Silli 2012-09-22 06:20:23 UTC
(In reply to comment #1)
> Isn't this fixed in
> http://html5.org/tools/web-apps-tracker?from=7323&to=7324 ?

Ian has started to fix the bug. 
He asked me some follow-up questions on the WHATWG list. 
I sent an answer, which Ian will probably reply to. 

I take it that he has not finished looking at it. But I don't know.
Comment 3 Henri Sivonen 2012-09-22 06:33:15 UTC
Don't inherit the encoding if the parent is different-origin (implemented in Gecko). Don't inherit the encoding when the parent encoding is not a rough ASCII superset (not implemented in Gecko yet, but we have a bug open for compat reasons).
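
A rough sketch of these two conditions, under stated assumptions:
`inherited_encoding`, `origin` and the `NOT_ASCII_COMPATIBLE` set are
hypothetical names, the set is only an illustrative guess at what "not
a rough ASCII superset" covers, and the origin comparison is simplified
(it does not normalise default ports).

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit

# Illustrative only: encodings assumed here not to be rough ASCII supersets.
NOT_ASCII_COMPATIBLE = {"utf-16", "utf-16le", "utf-16be"}


def origin(url: str) -> Tuple[str, Optional[str], Optional[int]]:
    """Simplified origin: (scheme, host, explicit port or None)."""
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname, parts.port)


def inherited_encoding(
    child_url: str, parent_url: str, parent_encoding: Optional[str]
) -> Optional[str]:
    """Return the encoding to inherit from the parent, or None if not allowed."""
    if parent_encoding is None:
        return None
    # Condition 1: never inherit across origins.
    if origin(child_url) != origin(parent_url):
        return None
    # Condition 2: never inherit an encoding that is not a rough ASCII superset.
    if parent_encoding.lower() in NOT_ASCII_COMPATIBLE:
        return None
    return parent_encoding
```

A `None` result means the caller would continue to the remaining
steps, for example the locale default.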
Comment 4 Leif Halvard Silli 2012-09-22 08:54:36 UTC
(In reply to comment #2)

> I sent an answer. Which [Ian] probably will answer. 

http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2012-September/037226.html

(In reply to comment #3)
> Don't inherit the encoding if the parent is different-Origin (implemented in
> Gecko).

Indeed. And IE, WebKit and Opera behave like Gecko.

> Don't inherit the encoding when the parent encoding is not a rough
> ascii superset (not implemented in Gecko, yet, but we have a bug open for
> compat reasons).

Interesting. I don't disagree.
Comment 5 Leif Halvard Silli 2012-09-23 01:31:21 UTC
(In reply to comment #4)
> (In reply to comment #3)
> > Don't inherit the encoding if the parent is different-Origin (implemented in
> > Gecko).
> 
> Indeed. And IE, Webkit and Opera behave like Gecko.

One exception: Opera seems to treat same-origin and different-origin parents alike:

http://www.xn--mlform-iua.no/blog/utf8files/locale_default_vs_doc_of_parent_browsing_context/
Comment 6 Ian 'Hixie' Hickson 2012-11-25 05:32:53 UTC
In future, please don't both file bugs and send e-mail; it's confusing.
Comment 7 contributor 2012-11-25 05:33:48 UTC
Checked in as WHATWG revision r7544.
Check-in comment: More detail on the inheritance of encodings from parent browsing contexts.
http://html5.org/tools/web-apps-tracker?from=7543&to=7544