This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 6567 - Transcoding should modify encoding in XML declaration, meta elements
Summary: Transcoding should modify encoding in XML declaration, meta elements
Status: NEW
Alias: None
Product: Validator
Classification: Unclassified
Component: check
Version: 0.8.4
Hardware: All All
Importance: P2 normal
Target Milestone: ---
Assignee: This bug has no owner yet - up for the taking
QA Contact: qa-dev tracking
Depends on:
Reported: 2009-02-12 21:27 UTC by Ville Skyttä
Modified: 2009-02-12 21:38 UTC
CC List: 0 users


Description Ville Skyttä 2009-02-12 21:27:10 UTC
In addition to plain charset conversion, transcoding should also rewrite the encoding declared in the XML declaration, in <meta http-equiv>, and in <meta charset> (HTML5), preferably the same way the doctype override does it: by leaving the existing declaration in place inside a comment.

Without these replacements, problems arise when the transcoded content is passed on to other validators that honor the encoding declared in one or more of the above. There is already a hack in place for XML::LibXML (bug 4867), and html5_validate() attempts some workarounds for the HTML5 validator, but those are not enough when a charset or doctype override is in effect. I think it would be better to do this centrally (as part of the transcoding process?) and get rid of the parser-specific hacks and workarounds.
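
To illustrate the kind of central rewrite proposed above, here is a minimal Python sketch. It is not the validator's code (the validator itself is written in Perl), and the names rewrite_declared_encoding() and transcode(), as well as the wording of the inserted comment, are assumptions made for this example. It uses regular expressions rather than a real parser, so it can also match a "charset=" that merely appears in some unrelated meta attribute value, and it does not handle a leading byte order mark.

```python
import re

# XML declaration at the very start of the text, with an encoding pseudo-attribute.
XML_DECL = re.compile(
    r"""\A(<\?xml\b[^>]*?\bencoding\s*=\s*["'])([A-Za-z][A-Za-z0-9._-]*)(["'][^>]*\?>)"""
)

# Covers both <meta charset=...> and
# <meta http-equiv=... content="...; charset=...">.
META_CHARSET = re.compile(
    r"""(<meta\b[^>]*?\bcharset\s*=\s*["']?)([A-Za-z0-9._:-]+)([^>]*>)""",
    re.IGNORECASE,
)


def _swap(new_encoding):
    """Build a re.sub callback that swaps in new_encoding and notes the old one."""
    def replace(match):
        head, old, tail = match.groups()
        if old.lower() == new_encoding.lower():
            return match.group(0)  # already correct, leave untouched
        note = old.replace("--", "- -")  # "--" is not allowed inside a comment
        return (f"{head}{new_encoding}{tail}"
                f"<!-- transcoded; declared encoding was {note} -->")
    return replace


def rewrite_declared_encoding(text, new_encoding):
    """Point the XML declaration and meta elements at new_encoding."""
    text = XML_DECL.sub(_swap(new_encoding), text, count=1)
    return META_CHARSET.sub(_swap(new_encoding), text)


def transcode(data, source_encoding, target_encoding="utf-8"):
    """Decode the bytes, fix the declared encodings, then re-encode."""
    text = data.decode(source_encoding)
    return rewrite_declared_encoding(text, target_encoding).encode(target_encoding)
```

The comment recording the old encoding is placed after the rewritten construct rather than before it, because nothing may precede an XML declaration. Since the rewrite happens once, during transcoding, every downstream parser would see a declaration that agrees with the actual bytes.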