This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 6589 - utf-8 character not recognized as such
Summary: utf-8 character not recognized as such
Status: RESOLVED INVALID
Alias: None
Product: Validator
Classification: Unclassified
Component: check
Version: 0.8.4
Hardware: All Linux
Importance: P2 normal
Target Milestone: ---
Assignee: This bug has no owner yet - up for the taking
QA Contact: qa-dev tracking
URL: http://www.webcodinghub.com/index.php...
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-02-17 03:39 UTC by Detlef Horchler
Modified: 2015-08-23 07:43 UTC
CC List: 2 users

See Also:


Description Detlef Horchler 2009-02-17 03:39:35 UTC
I get this error message, though I believe that "\xC2" is a valid UTF-8 character
(LATIN CAPITAL LETTER A WITH CIRCUMFLEX):


Sorry, I am unable to validate this document because on line 532 it contained one or more bytes that I cannot interpret as utf-8 (in other words, the bytes found are not valid values in the specified Character Encoding). Please check both the content of the file and the character encoding indication.

The error was: utf8 "\xC2" does not map to Unicode
Comment 1 Etienne Miret 2009-08-17 01:41:31 UTC
The UTF-8 encoding of LATIN CAPITAL LETTER A WITH CIRCUMFLEX (U+00C2) is the two-byte sequence \xC3\x82. The byte \xC2 is a lead byte and is valid in UTF-8 only when followed by a continuation byte in the range \x80-\xBF.
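
The byte sequences above can be checked with a short Python sketch. This is only an illustration using Python's built-in codecs, not the validator's own code, and the helper name is_valid_utf8 is hypothetical. The last line shows one plausible cause, which this report does not confirm: a lone \xC2 is "Â" in ISO-8859-1, so a file saved in that encoding but declared as UTF-8 would trigger this error.

```python
# U+00C2 (LATIN CAPITAL LETTER A WITH CIRCUMFLEX) takes two bytes in UTF-8.
encoded = "\u00c2".encode("utf-8")  # b'\xc3\x82'


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# A lone \xC2 is a lead byte with no continuation byte, so it is rejected.
lone_c2_ok = is_valid_utf8(b"\xc2")

# \xC2 followed by a byte in 80-BF is a valid two-byte sequence
# (here \xC2\xA0, which decodes to U+00A0 NO-BREAK SPACE).
c2_a0_ok = is_valid_utf8(b"\xc2\xa0")

# \xC2 followed by a byte outside 80-BF (here an ASCII space) is rejected.
c2_space_ok = is_valid_utf8(b"\xc2 ")

# Assumption for illustration: in ISO-8859-1 the single byte \xC2 is U+00C2.
latin1_char = b"\xc2".decode("latin-1")

print(encoded, lone_c2_ok, c2_a0_ok, c2_space_ok, latin1_char)
```

If the page really was saved as ISO-8859-1, either re-saving it as UTF-8 or declaring the actual encoding should make the error go away.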

I suggest closing this bug as invalid.