<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://www.w3.org/Bugs/Public/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4"
          urlbase="https://www.w3.org/Bugs/Public/"
          
          maintainer="sysbot+bugzilla@w3.org"
>

    <bug>
          <bug_id>18474</bug_id>
          
          <creation_ts>2012-08-02 19:05:57 +0000</creation_ts>
          <short_desc>Encoding Sniffing Algorithm: parent browsing context defines encoding default</short_desc>
          <delta_ts>2012-11-25 05:33:48 +0000</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>WHATWG</product>
          <component>HTML</component>
          <version>unspecified</version>
          <rep_platform>PC</rep_platform>
          <op_sys>All</op_sys>
          <bug_status>RESOLVED</bug_status>
          <resolution>FIXED</resolution>
          
          
          <bug_file_loc>http://dev.w3.org/html5/spec/Overview#encoding-sniffing-algorithm</bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>normal</bug_severity>
          <target_milestone>Unsorted</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Ian &apos;Hixie&apos; Hickson">ian</reporter>
          <assigned_to name="Ian &apos;Hixie&apos; Hickson">ian</assigned_to>
          <cc>hsivonen</cc>
    
    <cc>ian</cc>
    
    <cc>mike</cc>
    
    <cc>silviapfeiffer1</cc>
    
    <cc>xn--mlform-iua</cc>
          
          <qa_contact>contributor</qa_contact>

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>71772</commentid>
    <comment_count>0</comment_count>
    <who name="Ian &apos;Hixie&apos; Hickson">ian</who>
    <bug_when>2012-08-02 19:05:57 +0000</bug_when>
    <thetext>+++ This bug was initially created as a clone of Bug #18394 +++

Proposal: Extend the encoding sniffing algorithm[1] with a new,
          second-to-last step, like so:

     #. If the document lives in a &apos;nested browsing context&apos;[2],
        then return the encoding of the &apos;parent browsing context&apos;,
        as a parent browsing context dictated default encoding,
        and abort these steps.

Bug #3: Justification.

   (1) Currently, the HTML5 encoding sniffing algorithm fails to take 
account of the fact that, when the document of a nested browsing 
context has not been supplied with encoding information, Web 
browsers[*] do *not* &quot;return an implementation-defined or 
user-specified default character encoding&quot; (as HTML5 currently 
requires). Web browsers instead return a &apos;parent browsing 
context-defined&apos; character encoding: the encoding of the document in 
the parent browsing context.

     [*] I have not yet tested the relevant versions of IE (IE8/IE9/IE10).
        But I know that IE6 does not consider the encoding of the parent
        browsing context.

   (2) By explicitly including the &apos;parent browsing context encoding 
default&apos; in the algorithm, we make sure that browsers apply the 
default at the same step.
       The problem, right now, is that the browsers that thus far have 
implemented the encoding sniffing algorithm&apos;s current step 7 (encoding 
pattern matching/detection) disagree about whether it should take place 
*before* the parent browsing context default is applied or *after* 
the encoding of the parent browsing context has been considered.
       The latter approach, which Chrome seems to take, means that step 
7 is unlikely to take place at all if the document lives in a nested 
browsing context. Firefox 12 (which by default only performs step 7 for 
some locales or at user request) and Opera 12 (which, unlike at 
least Opera 10, applies step 7 for all locales) take the approach that 
encoding pattern matching/detection should occur before the locale 
default eventually is applied.
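
The ordering question in (2) can be illustrated with a small toy model. This is
only a sketch and not spec text: the function name, its parameters and the
sample encodings are hypothetical, and it models only the last fallback steps
for a document that arrived without any declared encoding.

```python
# Toy model of the fallback steps discussed above; every name is hypothetical.

def sniff_fallback(detected, parent_encoding, locale_default, detect_first):
    """Return the encoding chosen for a document with no declared encoding.

    detected        -- result of step 7 (pattern matching/detection), or None
    parent_encoding -- encoding of the parent browsing context, or None
    locale_default  -- implementation-defined or user-specified default
    detect_first    -- True: step 7 runs before the parent default
                       False: the parent default is consulted first
    """
    if detect_first:
        candidates = [detected, parent_encoding, locale_default]
    else:
        candidates = [parent_encoding, detected, locale_default]
    # The first candidate that is actually available wins.
    for encoding in candidates:
        if encoding is not None:
            return encoding
    return None


# A framed document whose bytes look like UTF-8, inside a windows-1252 parent.
framed = dict(detected="utf-8", parent_encoding="windows-1252",
              locale_default="windows-1252")
print(sniff_fallback(detect_first=True, **framed))   # utf-8
print(sniff_fallback(detect_first=False, **framed))  # windows-1252
```

With detect_first=False, which roughly corresponds to the behaviour described
above for Chrome, the detection result is never reached for a framed document
whose parent has an encoding. That is the disagreement the proposed step is
meant to settle.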


For more, see the blog post I wrote in connection with this bug report.[3]

[1] http://dev.w3.org/html5/spec/Overview#encoding-sniffing-algorithm
[2] http://dev.w3.org/html5/spec/Overview#nested-browsing-context
[3] http://målform.no/blog/white-spots-in-html5-s-encoding-sniffing-algorithm</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>74298</commentid>
    <comment_count>1</comment_count>
    <who name="Silvia Pfeiffer">silviapfeiffer1</who>
    <bug_when>2012-09-22 02:18:09 +0000</bug_when>
    <thetext>Isn&apos;t this fixed in
http://html5.org/tools/web-apps-tracker?from=7323&amp;to=7324 ?</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>74300</commentid>
    <comment_count>2</comment_count>
    <who name="Leif Halvard Silli">xn--mlform-iua</who>
    <bug_when>2012-09-22 06:20:23 +0000</bug_when>
    <thetext>(In reply to comment #1)
&gt; Isn&apos;t this fixed in
&gt; http://html5.org/tools/web-apps-tracker?from=7323&amp;to=7324 ?

Ian has started to fix the bug. 
He asked me some follow-up questions on the WHATWG list. 
I sent an answer, which he will probably reply to. 

I assume he has not finished looking at it, but I don&apos;t know.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>74301</commentid>
    <comment_count>3</comment_count>
    <who name="Henri Sivonen">hsivonen</who>
    <bug_when>2012-09-22 06:33:15 +0000</bug_when>
    <thetext>Don&apos;t inherit the encoding if the parent is different-origin (implemented in Gecko). Don&apos;t inherit the encoding when the parent encoding is not a rough ASCII superset (not implemented in Gecko yet, but we have a bug open for compat reasons).</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>74305</commentid>
    <comment_count>4</comment_count>
    <who name="Leif Halvard Silli">xn--mlform-iua</who>
    <bug_when>2012-09-22 08:54:36 +0000</bug_when>
    <thetext>(In reply to comment #2)

&gt; I sent an answer. Which [Ian] probably will answer. 

http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2012-September/037226.html

(In reply to comment #3)
&gt; Don&apos;t inherit the encoding if the parent is different-Origin (implemented in
&gt; Gecko).

Indeed. And IE, WebKit and Opera behave like Gecko.

&gt; Don&apos;t inherit the encoding when the parent encoding is not a rough
&gt; ascii superset (not implemented in Gecko, yet, but we have a bug open for
&gt; compat reasons).

Interesting.  I don&apos;t disagree.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>74321</commentid>
    <comment_count>5</comment_count>
    <who name="Leif Halvard Silli">xn--mlform-iua</who>
    <bug_when>2012-09-23 01:31:21 +0000</bug_when>
    <thetext>(In reply to comment #4)
&gt; (In reply to comment #3)
&gt; &gt; Don&apos;t inherit the encoding if the parent is different-Origin (implemented in
&gt; &gt; Gecko).
&gt; 
&gt; Indeed. And IE, Webkit and Opera behave like Gecko.

One exception: Opera seems to treat same-origin and different-origin parents the same:

http://www.xn--mlform-iua.no/blog/utf8files/locale_default_vs_doc_of_parent_browsing_context/</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>78767</commentid>
    <comment_count>6</comment_count>
    <who name="Ian &apos;Hixie&apos; Hickson">ian</who>
    <bug_when>2012-11-25 05:32:53 +0000</bug_when>
    <thetext>In future, please don&apos;t both file a bug and send e-mail; it&apos;s confusing.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>78768</commentid>
    <comment_count>7</comment_count>
    <who name="">contributor</who>
    <bug_when>2012-11-25 05:33:48 +0000</bug_when>
    <thetext>Checked in as WHATWG revision r7544.
Check-in comment: More detail on the inheritance of encodings from parent browsing contexts.
http://html5.org/tools/web-apps-tracker?from=7543&amp;to=7544</thetext>
  </long_desc>
      
      

    </bug>

</bugzilla>