This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 6839 - CHARACTER_ENCODING_SUPPORT-2 not raised when stated encoding is in the XML declaration and is not supported
Status: NEW
Alias: None
Product: mobileOK Basic checker
Classification: Unclassified
Component: Java Library
Version: unspecified
Hardware: PC Windows 3.1
Importance: P2 minor
Target Milestone: ---
Assignee: Dominique Hazael-Massieux
Reported: 2009-04-21 08:59 UTC by fd
Modified: 2012-12-04 00:51 UTC

Description fd 2009-04-21 08:59:45 UTC
All of the following conditions must hold to trigger the bug:
- the stated encoding of an XHTML resource is specified in the XML declaration rather than in the HTTP Content-Type header,
- the stated encoding is not supported by the Checker (e.g. Big5),
- the resource content cannot actually be decoded as UTF-8.
In that case, CHARACTER_ENCODING_SUPPORT-2 is not returned in the report.
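
To make the third condition concrete, here is a minimal sketch of a strict UTF-8 check. The class and method names (Utf8Check, isValidUtf8) are hypothetical and do not come from the Checker's code; the sketch only relies on the standard java.nio.charset API.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of the Checker's code base.
final class Utf8Check {
    private Utf8Check() {}

    // Returns true only if the whole byte array is well-formed UTF-8.
    static boolean isValidUtf8(byte[] body) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                // REPORT makes the decoder throw instead of silently
                // substituting U+FFFD for bad input.
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(body));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }
}
```

For instance, the two bytes 0xA4 0xA4 (a typical Big5 double-byte sequence) start with a UTF-8 continuation byte that has no lead byte, so a strict decoder rejects them.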

[If the HTTP Content-Type header does not specify a character encoding:
 If there is no XML declaration, or UTF-8 character encoding is not specified in the XML declaration, FAIL]
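
As an illustration only, the branch of the rule quoted above could be sketched as follows. The names (EncodingRule, failsQuotedBranch) are hypothetical, null stands for "not specified", and the other branches of the CHARACTER_ENCODING_SUPPORT test are deliberately left out.

```java
// Hypothetical sketch of the quoted branch only; not the Checker's implementation.
final class EncodingRule {
    private EncodingRule() {}

    // httpCharset: charset from the HTTP Content-Type header, or null if absent.
    // xmlDeclEncoding: encoding from the XML declaration, or null if absent.
    static boolean failsQuotedBranch(String httpCharset, String xmlDeclEncoding) {
        if (httpCharset != null) {
            // The quoted branch only applies when the header states no encoding.
            return false;
        }
        // FAIL when there is no declared encoding, or it is not UTF-8.
        return xmlDeclEncoding == null || !"UTF-8".equalsIgnoreCase(xmlDeclEncoding);
    }
}
```

With no charset in the header and encoding="Big5" in the XML declaration, this branch yields a FAIL, which is the result the report is expected to carry.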

Example:
 <?xml version="1.0" encoding="unknown"?>
 [text garbage that cannot be decoded as UTF-8]

The problem is that XhtmlContent retrieves the stated encoding from the decoded text, so it cannot serialize the stated encoding when the text could not be decoded.
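
One possible way around this, sketched below, is to read the stated encoding from the raw bytes instead of from the decoded text. This is not how XhtmlContent works today, and DeclaredEncodingSniffer is a hypothetical name. The sketch assumes an ASCII-compatible encoding, no leading byte order mark, and an XML declaration within the first 256 bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the Checker's code base.
final class DeclaredEncodingSniffer {
    // Matches the encoding pseudo-attribute of an XML declaration at the very
    // start of the document; group 2 holds the encoding name.
    private static final Pattern XML_DECL_ENCODING = Pattern.compile(
        "^<\\?xml\\s[^>]*?encoding\\s*=\\s*([\"'])([A-Za-z][A-Za-z0-9._-]*)\\1");

    // Arbitrary upper bound on how many leading bytes are inspected.
    private static final int MAX_PROLOG_BYTES = 256;

    private DeclaredEncodingSniffer() {}

    // Returns the encoding named in the XML declaration, if there is one,
    // without decoding the body in that encoding or as UTF-8.
    static Optional<String> sniff(byte[] body) {
        int length = Math.min(body.length, MAX_PROLOG_BYTES);
        // ISO-8859-1 maps every byte to exactly one character, so this
        // conversion cannot fail whatever the real encoding is.
        String prolog = new String(body, 0, length, StandardCharsets.ISO_8859_1);
        Matcher m = XML_DECL_ENCODING.matcher(prolog);
        return m.find() ? Optional.of(m.group(2)) : Optional.empty();
    }
}
```

Because the declaration is read before any real decoding is attempted, the stated encoding remains available for the report even when the body turns out to be undecodable.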

CONTENT_FORMAT_SUPPORT-4 would still be triggered in such cases, which effectively prevents the URI from being mobileOK, but since the real problem is the encoding, we had better say so!