This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.
Section 5 says:

    If the XQuery contains characters that are prohibited in XML text (specifically < and &), except when they occur within a CDATA section within the XQuery, they must be "escaped" as either character entity references (&lt; and &amp;, respectively) or numeric character references

I think that the "except when they occur within a CDATA section within the XQuery" should be deleted and that all "<", including those within CDATA sections (and including the < in <![CDATA[ in such a section), should be escaped.

In addition, there is a third possibility for escaping besides entity or character references, namely to use CDATA sections, and in fact this possibility is demonstrated in the last example.

It goes on to say:

    CDATA sections within an XQuery expression are embedded in the same form in which they appear in any XML document.

I am not at all sure what this is intended to mean. Perhaps it is intended to mean that XQuery CDATA sections are encoded as XML CDATA sections. In that case I think it is completely wrong, and it means that this is a not-so-trivial embedding. The trivial embedding should take the XQuery text as plain text and embed it into XML using standard plain-text-to-XML constructs, without having to parse the XQuery expression. (The plain text XML serialiser has to scan for <, > and & but not parse the expression.)

The XQuery

    <x><![CDATA[<]]></x>

should be encoded as

    <xqx:xquery>&lt;x>&lt;![CDATA[&lt;]]&gt;&lt;/x></xqx:xquery>

not

    <xqx:xquery>&lt;x><![CDATA[<]]>&lt;/x></xqx:xquery>

as this latter embedding is an embedding of the XQuery <x><</x>, which has the same run-time behaviour as the first expression but is a different expression with a different parse tree. It's important not to lose the fact that the CDATA section was in the XQuery: although this example has the same behaviour if the CDATA section is replaced, in other cases the behaviour may differ because of white space stripping (which is suppressed by CDATA sections).
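The difference between the two embeddings can be checked mechanically. The following is a minimal sketch, not part of the original report: it assumes Python's standard `xml.etree.ElementTree` parser and the XQueryX namespace URI `http://www.w3.org/2005/XQueryX`, and the variable names are illustrative only. It parses both embeddings and compares the recovered query text with the original.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Namespace URI assumed for the xqx prefix in this sketch.
NS = "http://www.w3.org/2005/XQueryX"

# The XQuery from the report, containing an XQuery-level CDATA section.
query = "<x><![CDATA[<]]></x>"

# Embedding 1: treat the query as plain text and escape every <, > and &.
plain_text = f'<xqx:xquery xmlns:xqx="{NS}">{escape(query)}</xqx:xquery>'

# Embedding 2: escape the tags but carry the CDATA section over as XML CDATA.
cdata_as_xml = (
    f'<xqx:xquery xmlns:xqx="{NS}">&lt;x><![CDATA[<]]>&lt;/x></xqx:xquery>'
)

# The string-value of the element is what an XQuery processor would receive.
recovered_plain = ET.fromstring(plain_text).text
recovered_cdata = ET.fromstring(cdata_as_xml).text

print(recovered_plain)  # the original query, CDATA section intact
print(recovered_cdata)  # a different query: the CDATA section is gone
```

Only the first embedding round-trips; the second hands the processor `<x><</x>`, which is the point made above.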
    it is recommended that > always be "escaped" (for example, as &gt; or &#3E;).

There's a missing x in the hex character reference at the end of that sentence.
Your final argument re: white space behavior was persuasive to me, so I intend to propose to the Working Groups that your suggestion be adopted. Thanks for catching the missing "x" in the hex character reference, too.
The XML Query Working Group has considered your comment and agrees with the problem that you described. A solution has been developed and approved by the WG:

(1) Replace the fourth paragraph ("If the XQuery contains...") of section 5, A Trivial Embedding of XQuery, with:

****
XQuery expressions are, for the purposes of this trivial embedding, treated as literal text. Therefore, if the XQuery contains characters that are prohibited in XML text (specifically < and &), they must be "escaped" as character entity references (&lt; and &amp;, respectively) or as numeric character references (for example, &#x3C; and &#x26;, respectively), or they must be enclosed in a CDATA section (for example, <![CDATA[<]]> or <![CDATA[&]]>). Note that this includes the leading "<" of a CDATA section that appears in the original XQuery expression. In addition, because the sequence of characters "]]>" is always prohibited within a CDATA section, it is recommended that instances of > in the original XQuery always be "escaped" (for example, as &gt;, &#x3E;, or <![CDATA[>]]>).
****

(2) In addition, in the sixth paragraph ("The following two more..."), delete the entire sentence that reads "CDATA sections within an XQuery expression are embedded in the same form in which they appear in any XML document."

This is the official WG response. Please let us know if you agree with this resolution of your issue, by adding a comment to the issue record and changing the Status of the issue to Closed. Or, if you do not agree with this resolution, please add a comment explaining why. If you wish to appeal the WG's decision to the Director, then also change the Status of the record to Reopened. If you wish to record your dissent, but do not wish to appeal the decision to the Director, then change the Status of the record to Closed. If we do not hear from you in the next two weeks, we will assume you agree with the WG decision.
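The escaping rule in the proposed replacement text can be sketched in a few lines. This is an illustration under stated assumptions rather than anything taken from the specification: the function name `embed_xquery`, the sample query, and the use of entity references (instead of numeric references or CDATA sections, which the text equally allows) are choices made for this example only.

```python
import xml.etree.ElementTree as ET

# Namespace URI assumed for the xqx prefix in this sketch.
NS = "http://www.w3.org/2005/XQueryX"

# < and & must be escaped; > is escaped too, as the text recommends,
# so that the sequence ]]> can never appear in the output.
ESCAPES = {"&": "&amp;", "<": "&lt;", ">": "&gt;"}


def embed_xquery(query: str) -> str:
    """Wrap an XQuery, treated as literal text, in an xqx:xquery element."""
    body = "".join(ESCAPES.get(ch, ch) for ch in query)
    return f'<xqx:xquery xmlns:xqx="{NS}">{body}</xqx:xquery>'


# Hypothetical query containing <, >, an entity reference, and ]]>.
sample = 'if ($a < $b) then <a x="]]>">&amp;</a> else ()'
embedded = embed_xquery(sample)

# Parsing the embedding gives back exactly the original query text.
round_tripped = ET.fromstring(embedded).text
print(embedded)
```

Note that the function never parses the query; it only scans for three characters, which is what makes the embedding trivial.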
    In addition, because the sequence of characters "]]>" is always prohibited within a CDATA section,

This should say "within XML element content", not "within a CDATA section". ]]> is forbidden in all element content, not just inside a CDATA section, so an XQuery of

    <a x="]]>" />

can't be encoded as

    <xqx:xqueryx>&lt;a x="]]>" /></xqx:xqueryx>

You have to quote the > (or the ]) as well.

David
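A quick way to see this restriction in practice is to hand both forms to an XML parser. The sketch below assumes Python's standard `xml.etree.ElementTree` (backed by expat); the helper names `wrap` and `is_well_formed` and the namespace URI are assumptions for illustration, and the element name follows the example above.

```python
import xml.etree.ElementTree as ET

# Namespace URI assumed for the xqx prefix in this sketch.
NS = "http://www.w3.org/2005/XQueryX"


def wrap(body: str) -> str:
    """Place already-escaped text inside the wrapper element."""
    return f'<xqx:xqueryx xmlns:xqx="{NS}">{body}</xqx:xqueryx>'


def is_well_formed(document: str) -> bool:
    """Return True if the parser accepts the document."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# Only the < is escaped, so ]]> appears literally in element content.
only_lt_escaped = wrap('&lt;a x="]]>" />')

# The > after ]] is escaped as well.
gt_also_escaped = wrap('&lt;a x="]]&gt;" />')

print(is_well_formed(only_lt_escaped))
print(is_well_formed(gt_also_escaped))
```

The first document is rejected even though it contains no CDATA section at all.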
It seems to me that we're tying ourselves in knots in this section by trying to tell people in full gory detail how to write a serializer. All we need to say is:

    In the trivial embedding, the string of Unicode characters making up the text of an XQuery query forms the string-value of a text node, which itself is the only child of an xqx:xquery element.

    Note: when such an element is serialized, special characters such as < and & must be escaped in the usual way. For example...

(But frankly, this section on trivial embedding isn't worth the paper it isn't written on. We don't need a standard for how to represent a string of Unicode characters in an XML document. If an XML spec such as XML Schema or XSLT decides it wants to embed XQuery, it will probably do it in a different way anyway.)
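Under this formulation the escaping falls out of any ordinary XML serializer. As a minimal sketch, assuming Python's standard `xml.etree.ElementTree`, the namespace URI `http://www.w3.org/2005/XQueryX`, and a made-up sample query, the query becomes the text of an xqx:xquery element and the library does the rest:

```python
import xml.etree.ElementTree as ET

# Namespace URI assumed for the xqx prefix in this sketch.
NS = "http://www.w3.org/2005/XQueryX"
ET.register_namespace("xqx", NS)

# Hypothetical query text with a CDATA section and an ampersand.
query = '<x><![CDATA[<]]></x>[. = "a & b"]'

# The query is simply the text content of the element.
element = ET.Element(f"{{{NS}}}xquery")
element.text = query

# The serializer escapes the special characters without parsing the query.
serialized = ET.tostring(element, encoding="unicode")

# Reading the document back yields the original string of characters.
restored = ET.fromstring(serialized).text
print(serialized)
```

No XQuery-specific rule is needed: the CDATA section inside the query is just more text to the serializer.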
I just noticed that this comment has languished unCLOSED for some time. I have accepted the correction provided in comment #3 (http://www.w3.org/Bugs/Public/show_bug.cgi?id=2611#c3) and made the requisite changes editorially. I have NOT, however, done anything with respect to comment #4 (http://www.w3.org/Bugs/Public/show_bug.cgi?id=2611#c4) and have no plans to do so unless directed by the XML Query WG to do so. (Apologies, Mike!) May we now mark this bug as CLOSED?
Closing this (although I agree with comment #4 that the trivial embedding feature should be dropped).