This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 10150 - [polyglot] i18n comment 2 : In-document declarations always useful
Summary: [polyglot] i18n comment 2 : In-document declarations always useful
Status: RESOLVED FIXED
Alias: None
Product: HTML WG
Classification: Unclassified
Component: pre-LC1 HTML/XHTML Compat. Authoring Guide (ed: Eliot Graff)
Version: unspecified
Hardware: PC Windows XP
Importance: P2 normal
Target Milestone: ---
Assignee: Eliot Graff
QA Contact: HTML WG Bugzilla archive list
URL: http://www.w3.org/TR/2010/WD-html-pol...
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-07-13 19:50 UTC by Richard Ishida
Modified: 2010-10-05 13:07 UTC
CC List: 5 users

See Also:


Attachments

Description Richard Ishida 2010-07-13 19:50:01 UTC
Comment from the i18n review of:
http://www.w3.org/TR/2010/WD-html-polyglot-20100624/

Comment 2
At http://www.w3.org/International/reviews/1007-polyglot/
Editorial/substantive: S
Tracked by: RI

Location in reviewed document:
3. Character Encoding [http://www.w3.org/TR/2010/WD-html-polyglot-20100624/#character-encoding]

Comment: 
"In addition, polyglot markup need not include the meta charset declaration, because the parser would have to read UTF-16 in order to parse it by definition."

 
The i18n WG guidelines recommend, nevertheless, that you always include a visible encoding declaration in your document, since it helps developers, testers, or translation production managers who want to visually check the encoding of a document. So it's true to say that you strictly don't need it, but we would prefer that people do. Please could you reflect that in your document.
Comment 1 Eliot Graff 2010-09-27 21:44:32 UTC
The 27 September editor's draft contains the following changes:

Section 3 now reads:

**************************

3. Specifying a Document's Character Encoding
Polyglot markup uses either UTF-8 or UTF-16. UTF-8 is preferred. When polyglot markup uses UTF-16, it must include the BOM indicating UTF-16LE or UTF-16BE. 

Polyglot markup declares character encoding in one of two ways:

- By using the BOM.
- In the HTTP header of the response [HTTP11], as in the following:
    Content-type: text/html; charset=utf-8
  or
    Content-type: text/html; charset=utf-16

Note that polyglot markup may use either text/html or application/xhtml+xml for the value of the content type. 

Using <meta charset="*"/> has no effect in XML. Therefore, polyglot markup may use <meta charset="*"/> in combination with the BOM, as long as the meta element specifies the same character encoding as the BOM. In addition, the meta tag may be used in the absence of a BOM as long as it matches the already specified encoding. Note that the W3C Internationalization (i18n) Group recommends always including a visible encoding declaration in a document, because it helps developers, testers, or translation production managers check the encoding of a document visually.

**************************

I believe that this satisfies the request in this bug. Thank you for your patience.
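
For readers who want to check the rule in the quoted Section 3 (a meta charset declaration, when present, must agree with the BOM or with the otherwise specified encoding), the following is a minimal Python sketch. It is illustrative only and not part of the draft: the function names, the 1024-byte scan window, the regular expression, and the `fallback` parameter (standing in for an encoding known from elsewhere, such as the HTTP header) are all assumptions made for this example.

```python
import codecs
import re

# Hypothetical helpers for illustration; none of these names come from the draft.
_BOMS = (
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16le"),
    (codecs.BOM_UTF16_BE, "utf-16be"),
)

# Simplified pattern for <meta charset="..."/>; a real HTML parser is more lenient.
_META_CHARSET = re.compile(
    r"""<meta\s+charset\s*=\s*["']?([A-Za-z0-9_-]+)""", re.IGNORECASE
)


def bom_encoding(data: bytes):
    """Return the encoding implied by a leading BOM, or None if there is none."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


def meta_charset(data: bytes, encoding: str):
    """Return the lowercased meta charset value found near the start, or None."""
    text = data[:1024].decode(encoding, errors="replace")
    match = _META_CHARSET.search(text)
    return match.group(1).lower() if match else None


def declaration_is_consistent(data: bytes, fallback: str = "utf-8") -> bool:
    """Check that a visible meta charset does not contradict the BOM.

    `fallback` is the encoding assumed when no BOM is present (an assumption
    of this sketch, e.g. the charset from the HTTP Content-type header).
    """
    bom = bom_encoding(data)
    declared = meta_charset(data, bom or fallback)
    if declared is None:
        return True  # no in-document declaration, so nothing can conflict
    if bom is None:
        return declared == fallback.lower()
    # A plain "utf-16" declaration is treated as matching either byte order.
    return declared == bom or (declared == "utf-16" and bom.startswith("utf-16"))
```

A document that starts with a UTF-8 BOM and contains <meta charset="utf-8"/> passes the check, while a UTF-16LE document whose meta element claims utf-8 does not.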