This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 29445 - [XSLT30] clarification / cleanup in section 5.7.2 Constructing Simple Content on the separator attribute
Status: CLOSED FIXED
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: XSLT 3.0
Version: Candidate Recommendation
Hardware: PC Windows NT
Importance: P2 normal
Target Milestone: ---
Assignee: Michael Kay
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
Reported: 2016-02-10 20:08 UTC by Abel Braaksma
Modified: 2016-02-18 22:33 UTC

Description Abel Braaksma 2016-02-10 20:08:07 UTC
The essential part of this revolves around the following instruction:

<a>
  <t:value-of>
   <t:sequence select="('a','b','c','d')"/>
  </t:value-of>
</a>

xsl:value-of has a sequence constructor that contains an xsl:sequence. I think this is supposed to work as follows:

1) the @select is evaluated, returning a sequence of four strings
2) the xsl:value-of's sequence constructor is evaluated, which returns the result of xsl:sequence, which is a sequence of four strings
3) section 5.7.2 Constructing Simple Content kicks in because the instruction creates a text node; it proceeds as follows:

3.1) zero-length text nodes are discarded, adjacent text nodes joined (n/a)
3.2) the sequence is atomized and each item is cast to a string (n/a, already a sequence of strings)
3.3) strings are concatenated with a default separator: #x20
3.4) the result is "a b c d"

This then becomes a text node, which is appended to <a> following "Constructing Complex Content".

The crux is, I think, that this construct does not return a sequence of four text nodes (which would then be concatenated without spaces), but a sequence of one text node, already created by xsl:value-of.

This test has been around for a while (4 years) and I only noticed it today because we fixed a similar bug in another location (principal result tree with build-tree options), which led to this test failing. I think we had it wrong all along.

If you would agree, I'll update the expected result.
Comment 1 Michael Kay 2016-02-10 20:32:24 UTC
I think the test is correct. It's written to test this rule in 11.4.3:

If the separator attribute is present, then the effective value of this attribute is used to separate adjacent items in the result sequence, as described in 5.7.2 Constructing Simple Content. In the absence of this attribute, the default separator is a single space (#x20) when the content is specified using the select attribute, or a zero-length string when the content is specified using a sequence constructor.
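
As a minimal sketch of the rule quoted above (not taken from the test suite; the element names out, by-select, by-constructor and explicit are made up for illustration), the following stylesheet can be applied to any well-formed input document with an XSLT 2.0 or later processor:

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical wrapper elements; only the xsl:value-of forms matter. -->
  <xsl:template match="/">
    <out>
      <!-- Content given by select: default separator is a single space.
           Text node value: "a b c d" -->
      <by-select>
        <xsl:value-of select="('a','b','c','d')"/>
      </by-select>

      <!-- Content given by a sequence constructor: default separator is
           a zero-length string. Text node value: "abcd" -->
      <by-constructor>
        <xsl:value-of>
          <xsl:sequence select="('a','b','c','d')"/>
        </xsl:value-of>
      </by-constructor>

      <!-- An explicit separator overrides the default for either form.
           Text node value: "a b c d" -->
      <explicit>
        <xsl:value-of separator=" ">
          <xsl:sequence select="('a','b','c','d')"/>
        </xsl:value-of>
      </explicit>
    </out>
  </xsl:template>

</xsl:stylesheet>
```

The second case has the same shape as the instruction in the description, so under this rule it yields "abcd" rather than "a b c d".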

I'm afraid I don't recall the full history of this rule (which was there in 2.0). xsl:attribute has the same rule, and it's possible that the rule was added to xsl:attribute to achieve 1.0 compatibility, and was then applied to xsl:value-of so that both would work the same way. Recall that in 1.0, the content of attributes was always constructed using a contained sequence constructor (no select attribute available), and the content of a text node was always constructed using a select attribute (xsl:value-of had to be empty). However I don't think that can be the full explanation because separators were only introduced in 2.0 - in 1.0 all items after the first were discarded.

In 1.0 you could write

<xsl:attribute name="x">
  <xsl:copy-of select="keywords/keyword"/>
</xsl:attribute>

and you got all the keywords with no space-separation. I suspect we felt obliged to retain that behaviour in 2.0, and then to make xsl:value-of do the same thing.
Comment 2 Abel Braaksma 2016-02-10 21:09:37 UTC
I was just about to sum up other occasions, glad I just saw your comment. An odd rule indeed, I know I have had my share of head-scratching every now and then when trying to find out why spaces were added in real-world scenarios.

I guess my fix was overzealous, then, and I didn't foresee the implications. Thanks for pointing that out.

I do think that we have a (small) conflict in the spec. We point to 5.7.2 Constructing Simple Content, but there we tell a different story about xsl:value-of. I think we should change the part where we say "the default separator is the single space", as that is clearly not always true:

<quote>
The strings within the resulting sequence are concatenated, with a (possibly zero-length) separator inserted between successive strings. The default separator is a single space. In the case of xsl:attribute and xsl:value-of, a different separator can be specified using the separator attribute of the instruction; it is permissible for this to be a zero-length string, in which case the strings are concatenated with no separator. In the case of xsl:comment, xsl:processing-instruction, and xsl:namespace, and when expanding a value template, the default separator cannot be changed.
</quote>
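
To make the mismatch concrete, here is a small sketch for xsl:attribute and an attribute value template, assuming an XSLT 2.0 or later processor and any well-formed input document. The element name e and the attribute names avt, spaced, joined and built are hypothetical, chosen only for this illustration:

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <!-- Attribute value template: the separator is always a single space
         and cannot be changed. Result: avt="a b c" -->
    <e avt="{('a','b','c')}">

      <!-- select attribute, no separator: single space.
           Result: spaced="a b c" -->
      <xsl:attribute name="spaced" select="('a','b','c')"/>

      <!-- select attribute with a zero-length separator.
           Result: joined="abc" -->
      <xsl:attribute name="joined" select="('a','b','c')" separator=""/>

      <!-- Sequence constructor, no separator: zero-length string,
           so the "single space" default does not apply here.
           Result: built="abc" -->
      <xsl:attribute name="built">
        <xsl:sequence select="('a','b','c')"/>
      </xsl:attribute>
    </e>
  </xsl:template>

</xsl:stylesheet>
```

The last attribute is the case where the sentence "The default separator is a single space" does not hold.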
Comment 3 Michael Kay 2016-02-10 21:13:18 UTC
I'm sympathetic. I recall my own struggles when implementing XSLT 1.0 when you would read one part of the spec without noticing that the rules were "refined" somewhere else.
Comment 4 Abel Braaksma 2016-02-10 22:59:45 UTC
I updated the title to reflect the (new) subject and that it is now a spec bug, as per comment#2 and comment#3.
Comment 5 Michael Kay 2016-02-15 16:01:00 UTC
Agreed that the sentence "The default separator is a single space." is potentially misleading. It should say something like "The default separator depends on the containing instruction."
Comment 6 Michael Kay 2016-02-18 22:33:32 UTC
The summary of the rules for selecting a separator in constructing-simple-content has been made more complete.