This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 29431 - [XSLT30] item-separator applicability
Summary: [XSLT30] item-separator applicability
Status: CLOSED FIXED
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: XSLT 3.0
Version: Candidate Recommendation
Hardware: PC Windows NT
Importance: P2 normal
Target Milestone: ---
Assignee: Michael Kay
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-02-07 20:10 UTC by Abel Braaksma
Modified: 2016-03-31 23:13 UTC
CC List: 0 users

Description Abel Braaksma 2016-02-07 20:10:36 UTC
Not necessarily related to bug 29424.

I have trouble determining the scope of applicability of the item-separator attribute. We say on several occasions that it only applies when build-tree is false and item-separator is set to anything but "#absent".

# On @build-tree

Build-tree may be a bit of a misnomer: I think it does not prevent building a (sequence of) tree(s), but rather prevents wrapping the result in a document node. Build-document, build-document-node, or serialize-to-document-node (ouch) would seem to cover it better.

# On @item-separator

My gut tells me it is only applicable to method="text" or "json", but the text indicates otherwise. Does this mean that if you create a sequence of two elements, say "<a>1</a><b>2</b>" and item-separator="###", the (serialized!) result becomes "<a>1</a>###<b>2</b>"?

I would assume it only applies to the "root sequence", though we don't say that explicitly, I think.

We also say on several occasions that text nodes are merged together. I don't see how that works here. Suppose you have:

<xsl:template match="/">
   <xsl:text>A</xsl:text>
   <xsl:text>B</xsl:text>
   <xsl:text>C</xsl:text>
</xsl:template>

When and how do we merge the text nodes? If we do it early, this is serialized as "ABC"; if we do it as late as possible, we get "A###B###C". But I am unsure whether this may disrupt existing practices. My gut here says that item-separator does not come into play and we should always return "ABC", but I don't know exactly why.

Then, compare that with:

<xsl:template match="/">
   <xsl:sequence select="'A', 'B', 'C'" />
</xsl:template>

which is supposed to return "A B C" when item-separator is absent and "A###B###C" when it is present.
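
For reference, the atomic-value case can be written out as a complete stylesheet. This is only a minimal sketch: the choice of method="text" is an assumption made for illustration, and "###" is the separator value used throughout this report.

```xml
<xsl:stylesheet version="3.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- method="text" is an illustrative assumption; "###" is the
       separator value used in this report. -->
  <xsl:output method="text" item-separator="###"/>

  <xsl:template match="/">
    <!-- Three atomic values (strings), not text nodes -->
    <xsl:sequence select="'A', 'B', 'C'"/>
  </xsl:template>

</xsl:stylesheet>
```

As described above, this would be expected to serialize as "A###B###C", and as "A B C" once the item-separator attribute is removed.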

Note: we may already say this clearly; I just can't find it, and I don't know how it relates to the different output methods.
Comment 1 Michael Kay 2016-02-15 14:03:46 UTC
We spent considerable time discussing this.

One possible spec change that might help usability is for the "build tree" phase to take account of item-separator, so item-separator separates atomic values in the raw result sequence whether or not you are serializing.

Apart from that, we felt that improved explanations in the spec (including perhaps a diagram!) could aid understanding, but there was no need for technical change.

Note (from reading the serialization spec) that item-separator does NOT separate adjacent text or element nodes, only adjacent atomic values.
Comment 2 Michael Kay 2016-02-27 23:37:52 UTC
I propose the following.

(a) whether or not the principal and/or secondary results are serialized is a decision made by the application and established using the processor's API; it is not affected by anything in the stylesheet.

(b) in the case of the principal result, the "raw result" is the immediate result of the sequence constructor in the initial template or function, converted if necessary to the declared type of that template or function using the function conversion rules. In the case of a secondary result, the "raw result" is the immediate result of the sequence constructor in the xsl:result-document instruction (there is no declared type in this case, therefore no conversion).

(c) if a result document is to be serialized, then the raw result is passed to the serializer, where it goes through the various phases of serialization starting with "sequence normalization" (where sequence normalization applies to the chosen output method). The serialization parameters defined in xsl:output or xsl:result-document (as appropriate), including item-separator, are passed to the serializer. The effect of item-separator is thus (like all other serialization parameters) defined solely by the serialization spec.

(d) if a result document is _not_ to be serialized, then the application may receive either the raw result (as defined above), or the result of applying sequence normalization as defined in the serialization specification. This choice is controlled by the build-tree attribute of xsl:output or xsl:result-document, and may be overridden using the processor's transformation API. (If build-tree is yes, then the sequence normalization process is applied.) All serialization parameters other than item-separator are ignored (so sequence normalization takes place even if there are serialization parameters specifying, say, method="json").

The consequence of this rule is that item-separator is applicable whether or not the result is being serialized, and that its meaning is as defined in the serialization spec.
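
To make rules (c) and (d) concrete, here is a minimal sketch of a stylesheet that sets both attributes on xsl:output. The values method="text", build-tree="no" and item-separator="###" are illustrative assumptions, not part of the proposal. How the application asks for serialized or unserialized delivery is left to the processor's API and is not shown.

```xml
<xsl:stylesheet version="3.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Illustrative values only. Per rule (d), build-tree matters when the
       result is not serialized; per rule (c), item-separator is passed to
       the serializer when it is. -->
  <xsl:output method="text" build-tree="no" item-separator="###"/>

  <xsl:template match="/">
    <!-- Raw result: a sequence of three atomic values -->
    <xsl:sequence select="'A', 'B', 'C'"/>
  </xsl:template>

</xsl:stylesheet>
```

Under the proposal, an application that does not serialize this result would receive the raw sequence of three strings, because build-tree is "no". Changing it to "yes" would instead apply sequence normalization, with item-separator taken into account. If the result is serialized, the raw result is passed to the serializer together with the serialization parameters.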
Comment 3 Michael Kay 2016-03-31 15:29:49 UTC
The minutes of 10 March do not record a formal decision, but they do record:

Abel wonders if the validation should be done before or after the effect of
adding any item separators.

Decided to do it after sequence normalization (i.e. with any separators).

MSMQ comments:

25.1:
w.r.t. “1. The raw result may be delivered as is.” I wonder if
it’s worth observing that this is logically possible only through
some API or other interface that is out of scope for this spec?

A note might be helpful.
Comment 4 Michael Kay 2016-03-31 22:25:31 UTC
The proposal in comment 2 was accepted, with the observations noted in the minutes, which the editor is encouraged to take into account when implementing the proposal.

The WG indicated that it would be useful if the spec included an example showing the effect of item-separator on the serialized result.
Comment 5 Michael Kay 2016-03-31 23:13:49 UTC
The changes have been applied.