This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
In the section "Specifying a Document's Character Encoding", it is stated that polyglot markup uses UTF-8. It then says that the preferred way to indicate this encoding is with a Byte Order Mark (BOM). I feel this is not advisable because: UTF-8 does not require a BOM [3]; a BOM could cause problems with applications (apparently MSIE does or did have a problem) and programming languages (apparently including Java [4][5]); and a BOM causes otherwise valid ASCII to stop being ASCII.

As such, I would swap the preferred method for indicating UTF-8 inside the document and add a note about using the BOM:

* By using <meta charset="UTF-8"/> (the HTML encoding declaration) (preferred).
* By using the Byte Order Mark (BOM) character (could cause problems in some situations).

References:
[1] https://en.wikipedia.org/wiki/Byte_order_mark#UTF-8
[2] https://en.wikipedia.org/wiki/UTF-8#Byte_order_mark
[3] http://www.unicode.org/faq/utf_bom.html#bom5
[4] http://bugs.sun.com/view_bug.do?bug_id=6378911
[5] http://bugs.sun.com/view_bug.do?bug_id=4508058
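
To illustrate the point about ASCII, here is a minimal Python sketch (not taken from the spec or from the referenced bug reports). The helper names is_ascii and has_utf8_bom are hypothetical and exist only for this example. It encodes the same ASCII-only markup with and without a UTF-8 BOM and checks whether the resulting bytes are still valid ASCII.

```python
import codecs

# ASCII-only markup, used here purely as sample content.
text = '<meta charset="UTF-8"/>'

plain = text.encode("utf-8")         # UTF-8 without a BOM
with_bom = text.encode("utf-8-sig")  # UTF-8 with a leading BOM (bytes EF BB BF)


def is_ascii(data: bytes) -> bool:
    """Return True if every byte in data decodes as ASCII (hypothetical helper)."""
    try:
        data.decode("ascii")
        return True
    except UnicodeDecodeError:
        return False


def has_utf8_bom(data: bytes) -> bool:
    """Return True if data starts with the UTF-8 BOM (hypothetical helper)."""
    return data.startswith(codecs.BOM_UTF8)


if __name__ == "__main__":
    print("no BOM:   ascii =", is_ascii(plain), " bom =", has_utf8_bom(plain))
    print("with BOM: ascii =", is_ascii(with_bom), " bom =", has_utf8_bom(with_bom))
```

Without the BOM the bytes are identical to their ASCII encoding. With the BOM, three non-ASCII bytes precede the content, so a consumer that expects plain ASCII would see unexpected leading bytes.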
We are waiting for the editor to take action on bug 13392.

*** This bug has been marked as a duplicate of bug 13392 ***