
Test results: C1 Character encoding detection (HTML5)

These tests check how characters in the C1 range of ISO code pages (i.e. 0x80-0x9F) are displayed for pages with ISO 8859-1 and ISO 8859-15 encodings declared using the HTML5 charset attribute, and whether their presence affects the encoding reported by the user agent.
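As a point of reference, a strict reading of either encoding maps every byte in the C1 range to the control code point with the same value. The following Python sketch illustrates that baseline; the names `C1_BYTES` and `strict_decode` are illustrative and are not part of the tests themselves.

```python
import unicodedata

# The C1 range discussed above: byte values 0x80 through 0x9F inclusive.
C1_BYTES = bytes(range(0x80, 0xA0))

def strict_decode(data, encoding):
    """Decode exactly as the named encoding defines, with no fallback."""
    return data.decode(encoding, errors="strict")

latin1 = strict_decode(C1_BYTES, "iso-8859-1")
latin9 = strict_decode(C1_BYTES, "iso-8859-15")

# Both encodings yield U+0080..U+009F, all in the "Cc" (control) category.
print([f"U+{ord(ch):04X}" for ch in latin1[:4]])
print(all(unicodedata.category(ch) == "Cc" for ch in latin1))
print(latin1 == latin9)
```

Any printable glyph shown for these bytes therefore indicates that the user agent is not applying the declared encoding literally.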

Summary & conclusions

See the results below for the user agents tested. This section summarizes the basic results of those tests.

All browsers tested displayed characters in the C1 range as per Windows cp1252 when the encoding of the document was set to ISO 8859-1. However, the two browsers that indicate the encoding of the document through the pull-down encoding menu on the user interface (IE and Firefox) did not report that the encoding had been changed to cp1252.
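The behaviour described above can be approximated in Python as follows. This is a minimal sketch, not any browser's actual code: `render_as_tested_browsers` is a hypothetical helper, and the second return value simply echoes the declared encoding, as the IE and Firefox encoding menus did.

```python
def render_as_tested_browsers(data, declared):
    """Hypothetical model: decode ISO 8859-1 pages as cp1252, report the declared label."""
    normalized = declared.lower().replace("_", "-")
    codec = "cp1252" if normalized in ("iso-8859-1", "latin-1", "latin1") else declared
    # errors="replace" because Python's cp1252 codec leaves
    # 0x81, 0x8D, 0x8F, 0x90 and 0x9D undefined.
    text = data.decode(codec, errors="replace")
    return text, declared  # (displayed text, encoding shown in the UI menu)

text, reported = render_as_tested_browsers(b"\x80 \x93quoted\x94 \x99", "ISO-8859-1")
print(text)      # € “quoted” ™
print(reported)  # ISO-8859-1
```

The mismatch between the codec actually used and the label reported is the point of interest: the displayed text is cp1252, while the reported encoding stays ISO 8859-1.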

For test 2, it is assumed that IE and Safari are trying to display cp1252 characters for which there are no font glyphs. For Firefox the behaviour is clear, since its result in test 2 (an icon indicating a missing font glyph) differs from its result in test 4 (the replacement character).
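One way to see why some C1 bytes might have no glyph when treated as cp1252 is to list the bytes that a cp1252 codec leaves unassigned. The sketch below relies on Python's codec tables, which is an assumption: they may not match the tables used by the browsers tested, and the results do not state which bytes test 2 used. The helper name `unassigned_c1_bytes` is illustrative.

```python
def unassigned_c1_bytes(codec):
    """Return the C1-range byte values that the given codec cannot decode."""
    missing = []
    for value in range(0x80, 0xA0):
        try:
            bytes([value]).decode(codec)
        except UnicodeDecodeError:
            missing.append(value)
    return missing

print([hex(b) for b in unassigned_c1_bytes("cp1252")])
print([hex(b) for b in unassigned_c1_bytes("cp1256")])
```

In Python's tables, cp1252 leaves five C1 bytes unassigned while cp1256 assigns all of them, so a byte that is meaningful in cp1256 can have nothing to display under cp1252.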

C1 range characters are not displayed as cp1252 characters in any browser when the encoding is ISO 8859-15.

Only Firefox uses the Unicode replacement character for characters in the C1 range when the encoding is ISO 8859-15. Since these characters map to Unicode characters that are unused by the HTML standard, one would have expected all browsers to use the replacement character.
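The Firefox behaviour for ISO 8859-15 can be sketched as a substitution applied after decoding. This is an illustration under stated assumptions, not Firefox's implementation; `replace_c1_controls` is a hypothetical name.

```python
REPLACEMENT_CHARACTER = "\ufffd"

def replace_c1_controls(data, encoding="iso-8859-15"):
    """Decode, then show U+FFFD in place of any C1 control character."""
    text = data.decode(encoding)
    return "".join(
        REPLACEMENT_CHARACTER if 0x80 <= ord(ch) <= 0x9F else ch
        for ch in text
    )

# 0xE9 and 0xA4 are ordinary ISO 8859-15 characters; 0x80 is in the C1 range.
print(replace_c1_controls(b"caf\xe9 \x80 \xa4"))  # café � €
```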

All browsers displayed characters above the C1 range as expected for the declared encoding.
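For reference, the bytes above the C1 range on which the two declared encodings differ can be listed with Python's codecs. The helper name `differing_bytes` is illustrative, and the sketch assumes Python's tables match the published ISO 8859-1 and ISO 8859-15 definitions.

```python
def differing_bytes(first, second, start=0xA0, stop=0x100):
    """List (byte, char_in_first, char_in_second) where two codecs disagree."""
    rows = []
    for value in range(start, stop):
        a = bytes([value]).decode(first)
        b = bytes([value]).decode(second)
        if a != b:
            rows.append((value, a, b))
    return rows

for value, a, b in differing_bytes("iso-8859-1", "iso-8859-15"):
    print(f"0x{value:02X}: {a} vs {b}")
```

A byte such as 0xA4 is a convenient probe: it is the currency sign in ISO 8859-1 and the euro sign in ISO 8859-15.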

Latest results

These are results for the latest versions of each browser tested. Green (yes) means the results are consistent with the encoding declared in the document, or with the default encoding; Red (no) means the results were not. Ochre is used where the results are unclear. Numbers in the table relate to notes immediately below.

UA                                             IE          Firefox                 Safari      IE
Version                                        7           3.0.1                   3.1.2       8 Beta
OS                                             XP          XP                      XP          XP
Date                                           20080718    20080718                20080718    20080718

1   ISO 8859-1, C1 cp1252   C1 results         cp1252 [1]  cp1252 [1]              cp1252 [1]  cp1252 [1]
                            >C1 results        yes         yes                     yes         yes
                            Reported encoding  yes         yes                     n/a         yes

2   ISO 8859-1, C1 cp1256   C1 results         ▯ [4]       no font glyph icon [5]  _ [6]       ▯ [4]
                            >C1 results        yes         yes                     yes         yes
                            Reported encoding  yes         yes                     n/a         yes

3   ISO 8859-15, C1 cp1252  C1 results         ▯ [4]       � [3]                   _ [6]       ▯ [4]
                            >C1 results        yes         yes                     yes         yes
                            Reported encoding  yes         yes                     n/a         yes

4   ISO 8859-15, C1 cp1256  C1 results         ▯ [4]       � [3]                   _ [6]       ▯ [4]
                            >C1 results        yes         yes                     yes         yes
                            Reported encoding  yes         yes                     n/a         yes

Notes:

  1. Characters displayed were those in Windows cp1252 (Western Europe).
  2. Characters displayed were those in Windows cp1256 (Arabic).
  3. The Unicode replacement character � was displayed.
  4. White vertical rectangles were displayed.
  5. Displayed codepoint because no font glyph was available.
  6. No glyph displayed.


Author: Richard Ishida, W3C.


Content first published 2008-08-01. Last substantive update 2008-08-01 10:42 GMT. This version 2008-08-01 10:42 GMT.

For the history of document changes, search for results-css-lang in the i18n blog.