This is an archived snapshot of W3C's public Bugzilla bug tracker, decommissioned in April 2019.

Bug 3637 - [XQuery] Adjacent text nodes
Summary: [XQuery] Adjacent text nodes
Status: RESOLVED FIXED
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: XQuery 1.0
Version: Candidate Recommendation
Hardware: PC Windows XP
Importance: P2 normal
Target Milestone: ---
Assignee: Don Chamberlin
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-08-28 08:57 UTC by Michael Kay
Modified: 2007-06-15 00:00 UTC

Description Michael Kay 2006-08-28 08:57:00 UTC
XDM states in 6.7.1: "In addition, Document and Element Nodes impose the constraint that two consecutive Text Nodes can never occur as adjacent siblings. When a Document or Element Node is constructed, Text Nodes that would be adjacent must be combined into a single Text Node."

At first sight XQuery 3.7.1.3 (Content of an element constructor) rule 2 appears to be restating this:

"2. Adjacent text nodes in the content sequence are merged into a single text node by concatenating their contents, with no intervening blanks. After concatenation, any text node whose content is a zero-length string is deleted from the content sequence."

However, this is followed by rule 3 which says: "If the content sequence contains a document node, the document node is replaced in the content sequence by its children."

Replacing the document node by its children may produce text nodes, which may then be adjacent to other text nodes that have already been merged under rule 2. So we have to rely on the statement in the data model to know that these text nodes now have to be merged with adjacent text nodes; in other words, we have to repeat the process described in rule 2.

It would be clearer if rule 3 came before rule 2: that is, replace a document node by its children, then merge adjacent text nodes, then remove zero-length text nodes. I believe that when you read the XQuery rules alongside the XDM rules, this is what has to happen anyway.
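
The effect of the rule ordering can be illustrated with a small sketch. This is a simplified Python model, not XQuery and not taken from either specification: the classes Text, Elem and Doc and the functions merge_text and expand_docs are hypothetical names standing in for nodes in a content sequence, for the text-merging rule, and for the document-node rule. It assumes a content sequence holding a text node followed by a document node whose first child is a text node.

```python
from dataclasses import dataclass, field


# Hypothetical stand-ins for XDM nodes; only what the sketch needs.
@dataclass
class Text:
    value: str


@dataclass
class Elem:
    name: str


@dataclass
class Doc:
    children: list = field(default_factory=list)


def merge_text(seq):
    """Model of the text rule: merge adjacent text nodes, then drop empty ones."""
    out = []
    for item in seq:
        if isinstance(item, Text) and out and isinstance(out[-1], Text):
            # Concatenate with no intervening blanks.
            out[-1] = Text(out[-1].value + item.value)
        else:
            out.append(item)
    return [n for n in out if not (isinstance(n, Text) and n.value == "")]


def expand_docs(seq):
    """Model of the document rule: replace each document node by its children."""
    out = []
    for item in seq:
        if isinstance(item, Doc):
            out.extend(item.children)
        else:
            out.append(item)
    return out


# Assumed content sequence: a text node, then a document node
# whose first child is also a text node.
content = [Text("a"), Doc([Text("b"), Elem("x")])]

# Order as originally written: merge first, then expand document nodes.
original_order = expand_docs(merge_text(content))

# Proposed order: expand document nodes first, then merge.
revised_order = merge_text(expand_docs(content))

print(original_order)
print(revised_order)
```

In this model the original order leaves two adjacent text nodes in the result, so a further merge is needed to satisfy the XDM constraint, while the proposed order produces a single merged text node in one pass.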

Document nodes with text node children don't actually arise much in XQuery (unlike XSLT) but they are still possible.
Comment 1 Don Chamberlin 2006-09-19 17:47:23 UTC
Michael,
On Sept. 19, 2006, the Query working group decided to accept your suggestion to reverse the order of Rules 2 and 3 in XQuery Section 3.7.1.3 (Content of an element constructor). This change will ensure that adjacent text nodes are correctly merged. Since you were present in the discussion, I have marked this bug report as Fixed and Closed.
Regards,
Don Chamberlin (for the Query working group)
Comment 2 Anthony Jones 2007-02-23 16:35:22 UTC
I believe a similar change also needs to be made under section "3.7.3.1 Computed Element Constructors".
Comment 3 Jim Melton 2007-02-26 00:23:19 UTC
The complete fix for this bug does not appear in the Recommendation of 23 January 2007. It will be considered for a future publication (either an Errata document or some possible future version of the specification).
Comment 4 Don Chamberlin 2007-02-27 18:45:11 UTC
Anthony,
Thanks for your perceptive comment #2, which points out an incomplete resolution of this issue. On Feb. 27, 2007, the working group agreed to invert the order of Rules 1 and 2 in XQuery Section 3.7.3.1, Computed Element Constructors (processing of content sequence). This change will cause document nodes to be replaced by their children before adjacent text nodes are merged. We believe that this change, which will appear in a future errata document, will resolve the issue that you have raised. If you agree, please mark this comment as Closed.
Regards,
Don Chamberlin (for the Query Working Group)
Comment 5 Michael Kay 2007-06-13 12:33:30 UTC
The same change needs to be made to 3.7.3.3, which describes the process for constructing document nodes (and which claims to be the same as the process for constructing element nodes, which is modified by the agreed erratum).
Comment 6 Don Chamberlin 2007-06-15 00:00:04 UTC
Michael,
Right you are. This change is clearly required for consistency with the other changes. I have added this change to my errata list (to be published later), and put this bug report back into the "Fixed" state.
Thanks for your attention to detail!
Don Chamberlin