Summarized test results:
HTML5, the input byte stream

Intended audience: users, XHTML/HTML coders (using editors or scripting), script developers (PHP, JSP, etc.), CSS coders, Web project managers, and anyone who wants to know how character encoding detection works in current browsers.

Updated

These tests check whether user agents recognize character encoding declarations for HTML5 documents. The test pages use HTML syntax and are served as text/html.

Note that these test results are for released versions of the browsers tested. Versions that are still in development may provide better support for these features. The tests do not use any vendor prefixes.

Results

The tables show results for tests run on the date shown. Above the tables are summaries of the results at that date. The table data may be more up-to-date than the summary. If the tables contain some incorrectly scored tests, or tests that relate to non-released versions of browsers, these are not included in the summary.

To see the test, click on the link in the left-most column. To see detailed results for a single test, click on the link in the right-most column.

Basic character encoding

Snapshot summary, 2014-02-17
Firefox 27.0, Chrome 32.0.1700.107, Safari 6.1.1, Internet Explorer 9, Opera 19.0

All user agents detected character encodings declared in the HTTP header.
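
As a rough illustration of what an HTTP-level declaration looks like, the sketch below extracts the charset parameter from a Content-Type header value using Python's standard library. The function name `charset_from_content_type` and the sample header values are assumptions made for this example; they are not taken from the test pages.

```python
from email.message import Message
from typing import Optional


def charset_from_content_type(header_value: str) -> Optional[str]:
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() lower-cases the value and returns None
    # when the header carries no charset parameter.
    return msg.get_content_charset()


# Hypothetical header values, as a server might send them.
print(charset_from_content_type("text/html; charset=ISO-8859-15"))  # iso-8859-15
print(charset_from_content_type("text/html"))                       # None
```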

In all browsers, a little-endian or big-endian UTF-16 BOM triggered the correct encoding.
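
BOM detection amounts to comparing the first bytes of the stream against a few known signatures. The following is a minimal sketch under that assumption; `sniff_bom` is a hypothetical helper name, and it only covers the UTF-8 and UTF-16 signatures exercised by these tests.

```python
import codecs
from typing import Optional


def sniff_bom(data: bytes) -> Optional[str]:
    """Return the encoding implied by a leading BOM, or None if there is none."""
    if data.startswith(codecs.BOM_UTF8):      # bytes EF BB BF
        return "utf-8"
    if data.startswith(codecs.BOM_UTF16_LE):  # bytes FF FE
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):  # bytes FE FF
        return "utf-16-be"
    return None


# A made-up page with no declarations other than a little-endian BOM.
page = codecs.BOM_UTF16_LE + "<!DOCTYPE html><p>caf\u00e9</p>".encode("utf-16-le")
print(sniff_bom(page))  # utf-16-le
```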

In the absence of other character encoding declarations, the XML declaration was used by Opera, Safari and Chrome to detect the character encoding for HTML documents. This use for HTML is not specified.

The meta content attribute was used by all browsers to set the encoding.

The meta charset attribute was also recognized by all browsers.
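
Both meta forms carry the encoding label after a `charset=` token, so a simplified scan can treat them alike. The sketch below is an approximation for illustration only, not the prescan algorithm defined in the HTML specification; the helper name `meta_declared_encoding` is an assumption, and the 1024-byte window reflects the HTML requirement that the declaration appear within the first 1024 bytes of the document.

```python
import re
from typing import Optional

# Matches <meta charset="..."> as well as the charset=... part of
# <meta http-equiv="Content-Type" content="text/html; charset=...">.
_META_CHARSET = re.compile(
    rb"""<meta[^>]*?charset\s*=\s*["']?\s*([A-Za-z0-9._:-]+)""",
    re.IGNORECASE,
)


def meta_declared_encoding(data: bytes, limit: int = 1024) -> Optional[str]:
    """Return the first encoding label declared in a meta element, or None."""
    match = _META_CHARSET.search(data[:limit])
    return match.group(1).decode("ascii").lower() if match else None


print(meta_declared_encoding(b'<meta charset="utf-8">'))  # utf-8
print(meta_declared_encoding(
    b'<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">'
))  # iso-8859-1
```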

The latest versions of all browsers tested, except Internet Explorer (versions 9 and 10), appear to use UTF-8 in the absence of encoding information.

Test link Assertion Details
HTTP charset The character encoding of a page can be set using the HTTP header charset declaration.
UTF-16LE BOM A page with no encoding declarations, but with a UTF-16 little-endian BOM will be recognized as UTF-16.
UTF-16BE BOM A page with no encoding declarations, but with a UTF-16 big-endian BOM will be recognized as UTF-16.
XML declaration [Exploratory] Setting the encoding in the XML declaration will NOT affect the encoding of a page served as text/html.
meta content attribute The character encoding of the page can be set by a meta element with http-equiv and content attributes.
meta charset attribute The character encoding of the page can be set by a meta element with charset attribute.
No encoding declaration A page with no encoding information in HTTP, BOM, XML declaration or meta element will be treated as UTF-8.

Character encoding precedence

Snapshot summary, 2014-02-17
Firefox 27.0, Chrome 32.0.1700.107, Safari 6.1.1, Internet Explorer 9, Opera 19.0

The BOM overrides the HTTP character encoding declaration in all browsers except Internet Explorer 10. (In Internet Explorer 9 the BOM overrides the HTTP header, and future versions of IE are expected to do so as well.)

A byte-order mark overrides all in-document declarations in all browsers tested.

Although some browsers use the XML declaration to set the encoding in the absence of other information (see the previous table), no browser uses that information in preference to a meta element. (Note that use of the XML declaration is not described in the HTML spec.)

When both a meta content attribute and a meta charset attribute appear in an HTML page, whichever declaration comes first in the document always takes precedence over the one that follows.

Test link Assertion Details
HTTP vs UTF-8 BOM A character encoding set in the HTTP header has lower precedence than the UTF-8 signature.
HTTP vs XML declaration [Exploratory] The HTTP header has a higher precedence than an XML declaration.
HTTP vs meta content The HTTP header has a higher precedence than an encoding declaration in a meta content attribute.
HTTP vs meta charset The HTTP header has a higher precedence than an encoding declaration in a meta charset attribute.
UTF-8 BOM vs meta content A page with a UTF-8 BOM will be recognized as UTF-8 even if the meta content attribute declares a different encoding.
UTF-8 BOM vs meta charset A page with a UTF-8 BOM will be recognized as UTF-8 even if the meta charset attribute declares a different encoding.
XML declaration vs meta content [Exploratory] The XML declaration has a lower precedence than an encoding declaration in a meta content attribute.
XML declaration vs meta charset [Exploratory] The XML declaration has a lower precedence than an encoding declaration in a meta charset attribute.
meta content, then meta charset [Exploratory] An encoding declaration in a meta content attribute has a higher precedence than a following encoding declaration in a meta charset attribute.
meta charset, then meta content [Exploratory] An encoding declaration in a meta charset attribute has a higher precedence than a following encoding declaration in a meta content attribute.
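
The precedence order asserted by these tests (BOM first, then the HTTP header, then the first meta declaration) can be sketched as a single function. This is an illustrative approximation, not a browser implementation: `pick_encoding` and its helper patterns are hypothetical names, the XML declaration is left out because its use in HTML is not specified, and the UTF-8 fallback simply mirrors the "No encoding declaration" assertion rather than any guaranteed browser behaviour.

```python
import codecs
import re
from typing import Optional, Tuple

_BOMS = (
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
)
# charset parameter in a Content-Type header value
_HTTP_CHARSET = re.compile(r'''charset\s*=\s*["']?([^\s;"']+)''', re.IGNORECASE)
# charset label in either meta form (simplified, not the spec's prescan)
_META_CHARSET = re.compile(
    rb"""<meta[^>]*?charset\s*=\s*["']?\s*([A-Za-z0-9._:-]+)""",
    re.IGNORECASE,
)


def pick_encoding(
    body: bytes,
    content_type: Optional[str] = None,
    default: str = "utf-8",  # assumption: fallback taken from the test assertion
) -> Tuple[str, str]:
    """Return (encoding, source) following the precedence the tests assert."""
    # 1. A BOM overrides everything else.
    for bom, name in _BOMS:
        if body.startswith(bom):
            return name, "bom"
    # 2. The HTTP header beats any in-document declaration.
    if content_type:
        match = _HTTP_CHARSET.search(content_type)
        if match:
            return match.group(1).lower(), "http"
    # 3. The first meta declaration in the first 1024 bytes wins.
    match = _META_CHARSET.search(body[:1024])
    if match:
        return match.group(1).decode("ascii").lower(), "meta"
    # 4. Nothing declared.
    return default, "default"


# Made-up page: UTF-8 BOM plus conflicting HTTP and meta declarations.
page = codecs.BOM_UTF8 + b'<meta charset="iso-8859-15"><p>test</p>'
print(pick_encoding(page, "text/html; charset=iso-8859-1"))  # ('utf-8', 'bom')
```

Returning the source alongside the encoding makes it easy to compare the sketch against an individual test row, for example checking that the result is `"http"` when the header and a meta element disagree.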