This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
The serialization spec says that the value of the serialization parameter doctype-system can be any sequence of Unicode characters. However, if the string contains both a single-quote character and a double-quote character, then it is not possible to construct a doctype declaration that satisfies the XML syntax. The corresponding restriction for doctype-public was introduced as a side-effect of erratum SE.E1.
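To illustrate the problem, here is a minimal sketch of how a serializer might choose the delimiter for the system identifier. The function name quote_system_literal is hypothetical and is not taken from the spec or from any implementation; the sketch only assumes the XML 1.0 rule that a SystemLiteral is delimited by either quotation marks or apostrophes and cannot contain its own delimiter.

```python
def quote_system_literal(value: str) -> str:
    """Wrap a doctype-system value in a delimiter it does not contain.

    Hypothetical helper: XML's SystemLiteral is either "..." with no
    quotation mark inside, or '...' with no apostrophe inside.
    """
    if '"' not in value:
        return f'"{value}"'
    if "'" not in value:
        return f"'{value}'"
    # Neither delimiter is usable, so no well-formed doctype declaration exists.
    raise ValueError(
        "doctype-system contains both an apostrophe and a quotation mark"
    )
```

A value containing only one of the two characters can always be serialized by picking the other as the delimiter; only a value containing both reaches the error branch.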
I'll put forward two alternatives. In section 3 "Serialization Parameters," in the row that describes "doctype-system," after "A string of Unicode characters. This parameter may be absent," add either

1) "It is an error if doctype-system does not conform to the syntax of SystemLiteral[XML]." or
2) "It is an error if the value of doctype-system contains both an apostrophe and a quotation mark."

The advantage of 1) is that it follows the model of SE.E1.[1] The problem is that it's not strictly correct - the syntax of SystemLiteral (and of PubidLiteral, in the case of SE.E1) includes the enclosing apostrophes or quotation marks, while the values of doctype-system and doctype-public do not include those delimiters.

We could just ignore that issue, and go with 1) - it's not very likely to cause confusion - or we go with 2) and also alter the text added by SE.E1 to say "It is an error if the value of doctype-public contains a character that is not PubidChar[XML]." I'm inclined to take the latter route.

[1] http://www.w3.org/XML/2007/qt-errata/xslt-xquery-serialization-errata.html#E1
> we go with 2) and also alter the text added by SE.E1 to say "It is an error if the value of doctype-public contains a character that is not PubidChar[XML]." I'm inclined to take the latter route.

I agree.
At its teleconference of 2009-02-05, the XSL WG considered this bug report. The WG approved the substantive changes proposed by the second alternative of the last paragraph of comment 1. To reiterate: In section 3 "Serialization Parameters," in the row that describes "doctype-system," after "A string of Unicode characters. This parameter may be absent," add "It is an error if the value of doctype-system contains both an apostrophe and a quotation mark." And alter the text added by SE.E1 to say "It is an error if the value of doctype-public contains a character that is not a PubidChar[XML]." XQuery WG consideration of the bug is still pending.
At the joint teleconference of 2009-02-10, the XQuery WG concurred with the decision of the XSL WG. This will be erratum SE.E10.
After the changes to the Serialization 1.0 recommendation were accepted, I noted that the second paragraph of section 3 already states, "It is a serialization error [err:SEPM0016] if a parameter value is invalid for the given parameter," so I decided to make an editorial change to restate the descriptions of the doctype-system and doctype-public parameters in the positive, saying instead what values are permitted: For doctype-public, "A string of PubidChar[XML] characters. This parameter may be absent." For doctype-system, "A string of Unicode characters that does not include both an apostrophe (#x27) and a quotation mark (#x22) character. This parameter may be absent." I trust this change will be acceptable.
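The two permitted-value descriptions above can be sketched as simple checks. This is only an illustration, not text from the erratum: the function names are hypothetical, and the character class is transcribed from the PubidChar production in XML 1.0 (#x20 | #xD | #xA | [a-zA-Z0-9] | [-'()+,./:=?;!*#@$_%]).

```python
import re

# PubidChar as defined in XML 1.0; note that it allows the apostrophe
# but not the quotation mark.
_PUBID_CHARS = re.compile(r"[ \r\na-zA-Z0-9\-'()+,./:=?;!*#@$_%]*")


def is_valid_doctype_public(value: str) -> bool:
    """True if every character of the value is a PubidChar (hypothetical helper)."""
    return _PUBID_CHARS.fullmatch(value) is not None


def is_valid_doctype_system(value: str) -> bool:
    """True unless the value contains both #x27 and #x22 (hypothetical helper)."""
    return not ("'" in value and '"' in value)
```

Under these assumptions, a serializer would raise err:SEPM0016 whenever either check returns False. Because PubidChar excludes the quotation mark, a valid doctype-public value can always be delimited with quotation marks.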
Published in "Errata for XSLT 2.0 and XQuery 1.0 Serialization"[1] and PER draft of "XSLT 2.0 and XQuery 1.0 Serialization (Second Edition)."[2]

[1] http://www.w3.org/XML/2007/qt-errata/xslt-xquery-serialization-errata.html
[2] http://www.w3.org/TR/2009/PER-xslt-xquery-serialization-20090421/