This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 6466 - [SER] quotes in doctype-system parameter
Summary: [SER] quotes in doctype-system parameter
Status: CLOSED FIXED
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: Serialization 1.0
Version: Recommendation
Hardware: PC Windows NT
Importance: P2 normal
Target Milestone: ---
Assignee: Henry Zongaro
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
Reported: 2009-01-23 13:37 UTC by Michael Kay
Modified: 2009-06-16 15:26 UTC

Description Michael Kay 2009-01-23 13:37:17 UTC
The serialization spec says that the value of the serialization parameter doctype-system can be any sequence of Unicode characters.

However, if the string contains both a single-quote character and a double-quote character, then it is not possible to construct a doctype declaration that satisfies the XML syntax.

The corresponding restriction for doctype-public was introduced as a side effect of erratum SE.E1.
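
The problem described above can be sketched in a few lines of Python. This is only an illustration, not text from the spec or from any serializer: the function names system_literal and doctype_declaration are hypothetical, and the sketch assumes a serializer is free to pick whichever delimiter the value does not contain.

```python
def system_literal(value: str) -> str:
    """Wrap a doctype-system value in a delimiter that it does not contain."""
    if '"' not in value:
        return f'"{value}"'
    if "'" not in value:
        return f"'{value}'"
    # Neither delimiter is usable, so no well-formed SystemLiteral exists.
    raise ValueError(
        "doctype-system contains both an apostrophe and a quotation mark"
    )


def doctype_declaration(root_name: str, system_id: str) -> str:
    """Build a doctype declaration that has a system identifier only."""
    return f"<!DOCTYPE {root_name} SYSTEM {system_literal(system_id)}>"
```

A value containing only one kind of quote can still be serialized by switching delimiters; only a value containing both kinds leaves the serializer with no valid choice.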
Comment 1 Henry Zongaro 2009-01-29 10:42:24 UTC
I'll put forward two alternatives.  In section 3 "Serialization Parameters," in the row that describes "doctype-system," after "A string of Unicode characters. This parameter may be absent," add either

1) "It is an error if doctype-system does not conform to the syntax of SystemLiteral[XML]." or

2) "It is an error if the value of doctype-system contains both an apostrophe and a quotation mark."

The advantage of 1) is that it follows the model of SE.E1.[1]  The problem is that it's not strictly correct - the syntax of SystemLiteral (and of PubidLiteral, in the case of SE.E1) includes the enclosing apostrophes or quotation marks, while the values of doctype-system and doctype-public do not include those delimiters.

We could just ignore that issue, and go with 1) - it's not very likely to cause confusion - or we go with 2) and also alter the text added by SE.E1 to say "It is an error if the value of doctype-public contains a character that is not PubidChar[XML]."  I'm inclined to take the latter route.

[1] http://www.w3.org/XML/2007/qt-errata/xslt-xquery-serialization-errata.html#E1
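
To illustrate why alternative 1) is not strictly correct, here is a rough Python sketch of the SystemLiteral production from XML 1.0. The names SYSTEM_LITERAL, is_system_literal and value_is_serializable are made up for this example. The point is that the production matches the delimited form, so a bare parameter value never conforms to it as written.

```python
import re

# SystemLiteral ::= ('"' [^"]* '"') | ("'" [^']* "'")
SYSTEM_LITERAL = re.compile("\"[^\"]*\"|'[^']*'")


def is_system_literal(text: str) -> bool:
    """True if text, delimiters included, matches SystemLiteral."""
    return SYSTEM_LITERAL.fullmatch(text) is not None


def value_is_serializable(value: str) -> bool:
    """True if some choice of delimiter turns the bare value into a SystemLiteral."""
    return any(is_system_literal(q + value + q) for q in ('"', "'"))
```

Under these assumptions, is_system_literal("example.dtd") is false even though the value is unproblematic, whereas value_is_serializable fails only when the value contains both an apostrophe and a quotation mark, which is the condition stated in alternative 2).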
Comment 2 Michael Kay 2009-01-29 11:10:46 UTC
>we go with 2) and also alter the text added by SE.E1 to say "It is an error if the value of doctype-public contains a character that is not PubidChar[XML]."  I'm inclined to take the latter route.

I agree.
Comment 3 Henry Zongaro 2009-02-05 18:06:16 UTC
At its teleconference of 2009-02-05, the XSL WG considered this bug report.  The WG approved the substantive changes proposed by the second alternative of the last paragraph of comment 1.  To reiterate:

In section 3 "Serialization Parameters," in the row that describes "doctype-system," after "A string of Unicode characters.  This parameter may be absent," add "It is an error if the value of doctype-system contains both an apostrophe and a quotation mark."

And alter the text added by SE.E1 to say "It is an error if the value of doctype-public contains a character that is not a PubidChar[XML]."

XQuery WG consideration of the bug is still pending.
Comment 4 Henry Zongaro 2009-02-10 18:41:33 UTC
At the joint teleconference of 2009-02-10, the XQuery WG concurred with the decision of the XSL WG.  This will be erratum SE.E10.
Comment 5 Henry Zongaro 2009-03-24 17:22:04 UTC
After the changes to the Serialization 1.0 recommendation were accepted, I noted that the second paragraph of section 3 already states, "It is a serialization error [err:SEPM0016] if a parameter value is invalid for the given parameter," so I decided to make an editorial change to restate the descriptions of the doctype-system and doctype-public parameters in the positive, saying instead what values are permitted:

For doctype-public, "A string of PubidChar[XML] characters. This parameter may be absent."  For doctype-system, "A string of Unicode characters that does not include both an apostrophe (#x27) and a quotation mark (#x22) character. This parameter may be absent."

I trust this change will be acceptable.
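
For illustration only, the two restated descriptions could be checked along the following lines. This is a sketch, not the erratum text: the names PUBID_CHARS, valid_doctype_public and valid_doctype_system are hypothetical, and the character class is transcribed from the PubidChar production in XML 1.0. A real serializer would raise err:SEPM0016 where this sketch returns False.

```python
import re

# PubidChar ::= #x20 | #xD | #xA | [a-zA-Z0-9] | [-'()+,./:=?;!*#@$_%]
PUBID_CHARS = re.compile(r"[\x20\x0D\x0Aa-zA-Z0-9\-'()+,./:=?;!*#@$_%]*")


def valid_doctype_public(value: str) -> bool:
    """True if every character of the value is a PubidChar."""
    return PUBID_CHARS.fullmatch(value) is not None


def valid_doctype_system(value: str) -> bool:
    """True unless the value contains both an apostrophe (#x27)
    and a quotation mark (#x22)."""
    return not ("'" in value and '"' in value)
```

Note that PubidChar includes the apostrophe but not the quotation mark, so a valid doctype-public value can always be delimited with quotation marks.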
Comment 6 Henry Zongaro 2009-06-16 15:26:55 UTC
Published in "Errata for XSLT 2.0 and XQuery 1.0 Serialization"[1] and PER draft of "XSLT 2.0 and XQuery 1.0 Serialization (Second Edition)."[2]

[1] http://www.w3.org/XML/2007/qt-errata/xslt-xquery-serialization-errata.html
[2] http://www.w3.org/TR/2009/PER-xslt-xquery-serialization-20090421/