HTML5 Internationalization Tests
Character encodings
This page gathers links related to tests being developed by the Internationalization Core Working Group to assess internationalization support of user agents. These tests are subject to change from time to time.
For each of the sections below, this page links to summaries of the results of related tests (for major browsers), to more detailed results (by test suite) in the W3C Test Framework, and to pages in the framework that allow you to run the tests and record results. The remainder of each section provides direct links to the tests themselves, with information about the test assertion and relevant formats.
Note: The rules for character encodings in XHTML5 are not described in the HTML5 spec, which points to the XML spec instead. A number of tests for XHTML5 are, however, included here for convenience.
Basic tests
Links: Section 8.2.2.1 • Summary of results • Detailed results for 8.2.2.1 • Submit data for 8.2.2.1
- HTTP charset (character-encoding-001)
Setting the HTTP header charset declaration will affect the encoding of a page.
- UTF-16LE BOM (character-encoding-003)
A page with no encoding declarations, but with a UTF-16 little-endian BOM will be recognized as UTF-16.
- UTF-16BE BOM (character-encoding-004)
A page with no encoding declarations, but with a UTF-16 big-endian BOM will be recognized as UTF-16.
- Exploratory: XML declaration (character-encoding-006)
[Exploratory test] Setting the encoding in the XML declaration will not affect the encoding of a page served as text/html.
- meta content attribute (character-encoding-007)
Setting the encoding in the meta content attribute will affect the encoding of a page served as text/html.
- meta charset attribute (character-encoding-009)
Setting the encoding in the meta charset attribute will affect the encoding of an HTML5 page.
- Exploratory: charset on an a element (character-encoding-014)
[Exploratory test] A link to a page using the a element with a charset attribute will cause a page with no other encoding information to be rendered using the encoding specified in the charset attribute.
- No encoding declaration (character-encoding-015)
A page with no encoding information in HTTP, BOM, XML declaration or meta element will be treated as UTF-8.
- HTTP vs UTF-8 BOM (character-encoding-034)
A character encoding set in the HTTP header has lower precedence than the UTF-8 signature.
- HTTP vs UTF-16LE BOM (character-encoding-035)
A character encoding set in the HTTP header has lower precedence than the UTF-16LE BOM.
- HTTP vs UTF-16BE BOM (character-encoding-036)
A character encoding set in the HTTP header has lower precedence than the UTF-16BE BOM.
- Exploratory: HTTP vs XML declaration (character-encoding-022)
[Exploratory test] The HTTP header has a higher precedence than an XML declaration.
- HTTP vs meta content (character-encoding-016)
The HTTP header has a higher precedence than an encoding declaration in a meta content attribute.
- HTTP vs meta charset (character-encoding-018)
The HTTP header has a higher precedence than an encoding declaration in a meta charset attribute.
- UTF-8 BOM vs meta content (character-encoding-037)
A page with a UTF-8 BOM will be recognized as UTF-8 even if the meta content attribute declares a different encoding.
- UTF-8 BOM vs meta charset (character-encoding-038)
A page with a UTF-8 BOM will be recognized as UTF-8 even if the meta charset attribute declares a different encoding.
- Exploratory: XML declaration vs meta content (character-encoding-024)
[Exploratory test] The XML declaration has a lower precedence than an encoding declaration in a meta content attribute.
- Exploratory: XML declaration vs meta charset (character-encoding-026)
[Exploratory test] The XML declaration has a lower precedence than an encoding declaration in a meta charset attribute.
- meta content, then meta charset (character-encoding-027)
An encoding declaration in a meta content attribute has a higher precedence than a following encoding declaration in a meta charset attribute.
- meta charset, then meta content (character-encoding-030)
An encoding declaration in a meta charset attribute has a higher precedence than a following encoding declaration in a meta content attribute.
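The BOM-related assertions in this section can be illustrated with a minimal Python sketch. The helper name `sniff_bom` and the labels it returns are assumptions made for illustration; they are not part of the test suite. The sketch only inspects the leading bytes of a resource for the three byte order marks the tests mention.

```python
import codecs
from typing import Optional

# Byte order marks mentioned by the tests, paired with an illustrative label.
# The labels are hypothetical names chosen for this sketch.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),         # EF BB BF
    (codecs.BOM_UTF16_LE, "utf-16le"),  # FF FE
    (codecs.BOM_UTF16_BE, "utf-16be"),  # FE FF
]


def sniff_bom(data: bytes) -> Optional[str]:
    """Return the encoding signalled by a leading BOM, or None if there is none."""
    for bom, label in BOMS:
        if data.startswith(bom):
            return label
    return None
```

For example, `sniff_bom("hi".encode("utf-16"))` reports whichever UTF-16 byte order the platform's encoder emitted, while a resource that starts with plain ASCII bytes yields `None`.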
Basic tests (Charset attribute)
Basic tests (XHTML5)
Precedence
Links: Section 8.2.2.1 • Summary of results • Detailed results for 8.2.2.1 • Submit data for 8.2.2.1
- HTTP vs meta content (character-encoding-016)
The HTTP header has a higher precedence than an encoding declaration in a meta content attribute.
- HTTP vs meta charset (character-encoding-018)
The HTTP header has a higher precedence than an encoding declaration in a meta charset attribute.
- UTF-8 BOM vs meta content (character-encoding-037)
A page with a UTF-8 BOM will be recognized as UTF-8 even if the meta content attribute declares a different encoding.
- UTF-8 BOM vs meta charset (character-encoding-038)
A page with a UTF-8 BOM will be recognized as UTF-8 even if the meta charset attribute declares a different encoding.
- Exploratory: XML declaration vs meta content (character-encoding-024)
[Exploratory test] The XML declaration has a lower precedence than an encoding declaration in a meta content attribute.
- Exploratory: XML declaration vs meta charset (character-encoding-026)
[Exploratory test] The XML declaration has a lower precedence than an encoding declaration in a meta charset attribute.
- meta content, then meta charset (character-encoding-027)
An encoding declaration in a meta content attribute has a higher precedence than a following encoding declaration in a meta charset attribute.
- meta charset, then meta content (character-encoding-030)
An encoding declaration in a meta charset attribute has a higher precedence than a following encoding declaration in a meta content attribute.
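The ordering these assertions describe (BOM first, then the HTTP header, then the first meta declaration in document order) can be sketched as a small Python function. This is a simplified model under stated assumptions, not the algorithm of any particular user agent: the function name `pick_encoding`, its parameters, and the `fallback` default are hypothetical. The default mirrors the assertion of character-encoding-015, and the XML declaration is deliberately left out because the exploratory tests treat it as having no effect on text/html.

```python
from typing import Optional, Sequence


def pick_encoding(
    bom: Optional[str],
    http: Optional[str],
    meta_declarations: Sequence[str] = (),
    fallback: str = "utf-8",  # assumption: mirrors test 015's assertion
) -> str:
    """Pick an encoding label using the precedence the tests assert.

    bom:               label signalled by a byte order mark, if any
    http:              charset from the HTTP Content-Type header, if any
    meta_declarations: meta content/charset declarations in document order
    """
    if bom:                       # tests 034-038: the BOM beats HTTP and meta
        return bom
    if http:                      # tests 016, 018: HTTP beats meta
        return http
    if meta_declarations:         # tests 027, 030: the earlier meta wins
        return meta_declarations[0]
    return fallback


# A UTF-8 BOM overrides both the HTTP header and a meta declaration.
print(pick_encoding("utf-8", "iso-8859-15", ["iso-8859-1"]))  # utf-8
```

Passing the meta declarations as a single ordered sequence keeps the "first declaration wins" rule independent of whether each one came from a content or a charset attribute.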
Precedence (XHTML5)
Escapes
- hex ncr (escapes-001)
A hexadecimal numeric character reference produces the intended character.
- decimal ncr (escapes-002)
A decimal numeric character reference produces the intended character.
- lower-case entity (escapes-003)
A lower case character entity reference produces the intended character.
- upper-case entity (escapes-004)
An upper case character entity reference produces the intended character.
- supplementary character (escapes-005)
A hexadecimal numeric reference containing the Unicode code point of a supplementary character produces the intended character.
- hex ncr outside range of charset (escapes-006)
A hexadecimal numeric reference containing the Unicode code point of a character which is not supported by the current character encoding still produces the intended character.
- decimal ncr outside range of charset (escapes-007)
A decimal numeric reference containing the Unicode code point of a character which is not supported by the current character encoding still produces the intended character.
- character entity outside range of charset (escapes-008)
A character entity reference for a Unicode code point which is not supported by the current character encoding still produces the intended character.
- hex ncr for C1 position of euro sign (escapes-009)
A hexadecimal numeric reference containing the code point for the euro in the Windows 1252 code page should produce a euro sign.
- decimal ncr for C1 position of euro sign (escapes-010)
A decimal numeric reference containing the code point for the euro in the Windows 1252 code page should produce a euro sign.
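The behaviour these escape assertions describe can be tried out with Python's standard `html.unescape`, which resolves character references following HTML5 rules. The sketch below is only an illustration: the helper name `decode_then_resolve` is hypothetical, and the sample references are chosen for this example rather than taken from the tests. It shows that references resolve to Unicode code points regardless of the byte encoding of the page, and that a reference to 0x80 (decimal 128), the Windows 1252 position of the euro, resolves to the euro sign.

```python
import html


def decode_then_resolve(raw: bytes, encoding: str) -> str:
    """Decode page bytes, then resolve character references in the text.

    Resolution happens on decoded text, so a reference can name a code
    point that the byte encoding itself cannot represent.
    """
    return html.unescape(raw.decode(encoding))


# Hex and decimal numeric character references for U+20AC EURO SIGN.
assert html.unescape("&#x20AC;") == "\u20ac"
assert html.unescape("&#8364;") == "\u20ac"

# Lower-case and upper-case character entity references.
assert html.unescape("&eacute;") == "\u00e9"
assert html.unescape("&Eacute;") == "\u00c9"

# A supplementary character (outside the Basic Multilingual Plane).
assert html.unescape("&#x1D306;") == "\U0001D306"

# ISO-8859-1 has no euro sign, yet the reference still produces one.
assert decode_then_resolve(b"&#x20AC;", "iso-8859-1") == "\u20ac"

# References to the C1 position 0x80 map to the euro sign.
assert html.unescape("&#x80;") == "\u20ac"
assert html.unescape("&#128;") == "\u20ac"
```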
Content last changed 2012-03-29 18:36 GMT.