<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://www.w3.org/Bugs/Public/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4"
          urlbase="https://www.w3.org/Bugs/Public/"
          
          maintainer="sysbot+bugzilla@w3.org"
>

    <bug>
          <bug_id>19931</bug_id>
          
          <creation_ts>2012-11-10 15:47:02 +0000</creation_ts>
          <short_desc>Should not prefer byte order mark with UTF-8</short_desc>
          <delta_ts>2012-11-10 18:49:14 +0000</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>HTML WG</product>
          <component>pre-LC1 HTML/XHTML Compat. Authoring Guide (ed: Eliot Graff)</component>
          <version>unspecified</version>
          <rep_platform>All</rep_platform>
          <op_sys>All</op_sys>
          <bug_status>RESOLVED</bug_status>
          <resolution>DUPLICATE</resolution>
          <dup_id>13392</dup_id>
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords>externalComments, NE</keywords>
          <priority>P2</priority>
          <bug_severity>normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter>bugz.ate.my.horse</reporter>
          <assigned_to name="Eliot Graff">eliotgra</assigned_to>
          <cc>eliotgra</cc>
    
    <cc>mike</cc>
    
    <cc>public-html-admin</cc>
    
    <cc>public-html-wg-issue-tracking</cc>
    
    <cc>xn--mlform-iua</cc>
          
          <qa_contact name="HTML WG Bugzilla archive list">public-html-bugzilla</qa_contact>

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>78174</commentid>
    <comment_count>0</comment_count>
    <who name="">bugz.ate.my.horse</who>
    <bug_when>2012-11-10 15:47:02 +0000</bug_when>
    <thetext>In the section &quot;Specifying a Document&apos;s Character Encoding&quot;, it is stated that polyglot markup uses UTF-8. It then says that the preferred way to indicate this encoding is with a Byte Order Mark.

I feel this is not advisable because: UTF-8 does not require a BOM [3]; a BOM could cause problems with applications (apparently MSIE does or did have a problem) and programming languages (apparently including Java [4][5]); and it causes otherwise valid ASCII to stop being ASCII.

As such, I would swap the preferred method for indicating UTF-8 inside the document and add a note about using the BOM.

* By using &lt;meta charset=&quot;UTF-8&quot;/&gt; (the HTML encoding declaration) (preferred).
* By using the Byte Order Mark (BOM) character (could cause problems in some situations).
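
To illustrate the point about ASCII, here is a minimal Python sketch. The sample text and the variable names are assumptions made up for this example, not anything taken from the guide. It only shows that prepending the three BOM bytes turns pure-ASCII content into non-ASCII, and that a decoder which is not BOM-aware keeps the BOM as a leading U+FEFF character.

```python
import codecs

# Hypothetical sample text; any pure-ASCII content behaves the same way.
plain = "polyglot markup".encode("ascii")

# The UTF-8 BOM is the three bytes EF BB BF.
with_bom = codecs.BOM_UTF8 + plain

# Without the BOM every byte is 7-bit ASCII; with it, the data no longer is.
print(plain.isascii())
print(with_bom.isascii())

# Python's plain "utf-8" codec keeps the BOM as U+FEFF at the start of the text,
# while the BOM-aware "utf-8-sig" codec strips it.
print(repr(with_bom.decode("utf-8")))
print(repr(with_bom.decode("utf-8-sig")))
```

A consumer that expects ASCII, or that decodes without BOM handling, may therefore see three unexpected bytes (or one unexpected character) before the first character of the document.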


References: 
[1] https://en.wikipedia.org/wiki/Byte_order_mark#UTF-8
[2] https://en.wikipedia.org/wiki/UTF-8#Byte_order_mark
[3] http://www.unicode.org/faq/utf_bom.html#bom5
[4] http://bugs.sun.com/view_bug.do?bug_id=6378911
[5] http://bugs.sun.com/view_bug.do?bug_id=4508058</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>78177</commentid>
    <comment_count>1</comment_count>
    <who name="Leif Halvard Silli">xn--mlform-iua</who>
    <bug_when>2012-11-10 18:49:14 +0000</bug_when>
    <thetext>We are waiting for the editor to take action on bug 13392.

*** This bug has been marked as a duplicate of bug 13392 ***</thetext>
  </long_desc>
      
      

    </bug>

</bugzilla>