This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 5336 - Non-XML characters from input copied to XML output making it ill-formed
Summary: Non-XML characters from input copied to XML output making it ill-formed
Status: NEW
Alias: None
Product: Validator
Classification: Unclassified
Component: Templates
Version: 0.8.2
Hardware: All, OS: All
Importance: P2 major
Target Milestone: ---
Assignee: This bug has no owner yet - up for the taking
QA Contact: qa-dev tracking
URL: http://validator.w3.org/check?uri=htt...
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-01-02 14:00 UTC by Henri Sivonen
Modified: 2008-01-02 14:00 UTC
CC List: 0 users

See Also:


Attachments

Description Henri Sivonen 2008-01-02 14:00:40 UTC
Steps to reproduce:
1) Load http://validator.w3.org/check?uri=http%3A%2F%2Fphilip.html5.org%2Fmisc%2Fchars.html&charset=iso-8859-1&output=soap12
2) Examine the result or try parsing it as XML

Actual results:
At line 30, column 147, there's U+0000, which is forbidden in XML.

Expected results:
Characters that XML prohibits should be replaced with U+FFFD REPLACEMENT CHARACTER wherever the validator would otherwise copy them from the input to the output verbatim.
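
For illustration only, the expected behavior could be sketched roughly as follows in Python. This is not the validator's own template code; the names `replace_non_xml_chars` and `build_source_element` and the `<source>` element are hypothetical, and the sketch assumes the output is XML 1.0, whose Char production allows only U+0009, U+000A, U+000D, U+0020 to U+D7FF, U+E000 to U+FFFD and U+10000 to U+10FFFF.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Matches any code point outside the XML 1.0 "Char" production.
_NON_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def replace_non_xml_chars(text: str) -> str:
    """Replace every character XML 1.0 forbids with U+FFFD."""
    return _NON_XML_CHARS.sub("\uFFFD", text)


def build_source_element(source_line: str) -> str:
    """Hypothetical helper: wrap a line of user input in an XML element.

    Forbidden characters are replaced first, then markup characters
    (&, <, >) are escaped, so the result stays well-formed.
    """
    return "<source>" + escape(replace_non_xml_chars(source_line)) + "</source>"


# A line containing U+0000, as in the report, now yields parseable XML.
fragment = build_source_element("bad\x00char <p>")
parsed = ET.fromstring(fragment)
```

Replacing before escaping keeps the two concerns separate: the first step guarantees that every character is legal in XML at all, and the second handles characters that are legal but have markup meaning.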