This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 29142 - [XSLT30] streamability of the xsl:merge-source/select expression
Summary: [XSLT30] streamability of the xsl:merge-source/select expression
Status: CLOSED FIXED
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: XSLT 3.0
Version: Last Call drafts
Hardware: PC Windows NT
Importance: P2 normal
Target Milestone: ---
Assignee: Michael Kay
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
Reported: 2015-09-23 15:12 UTC by Abel Braaksma
Modified: 2015-10-29 12:42 UTC

Description Abel Braaksma 2015-09-23 15:12:50 UTC
This bug followed from bug#29120.

We say in the same section 15.4:

> 3. The expression in the select attribute of that xsl:merge-source 
> element has striding posture;

I have the following observations:

1) Before we can assess the streamability of anything, we need to know the
   context posture and the context item type, which is not given here. It 
   stands to reason that the context posture should be striding. The context 
   item follows from the anchor item, which is the document node.

2) We don't mention the sweep. While this may be irrelevant (free-ranging 
   sweep results in roaming posture and v.v.), we seem to always mention it 
   elsewhere. 

3) Perhaps it is intentionally just striding, but I think we do not need to 
   exclude grounded posture. select="/snapshot(.)" or 
   select="/foo/bar/copy-of()" does not need to be illegal.

4) We currently imply that we do not disallow grounded posture, but that 
   if you return non-nodes, that the snapshot() call will fail. I don't see 
   a problem with leaving that as is (neither would I mind it if we allow 
   snapshot to operate on non-nodes).

I propose changing this to something like the following:

    <proposal>
    3. The expression in the select attribute of that xsl:merge-source 
    element, assessed with a context posture of striding and a context item 
    type of document(), has striding or grounded posture and a motionless or 
    consuming sweep.
    </proposal>

(Consequently, the example given uses a relative path expression, which is not wrong but may be confusing; I suggest changing it to "/events/...". This observation may be important because of the streamability rules on the root() function.)

I think that legal merge expressions should include:

/
which is motionless/striding

/foo/bar
which is consuming/striding

/foo/bar/@zed
which is consuming/striding

/foo//*/../@zed
a scanning expr, then climbing, then striding and consuming

/foo/bar/namespace::node()
which is consuming/striding

/foo/string-join(bar)
which is consuming/grounded

//*/string-join(ancestor-or-self::*/name())
which is striding, then scanning expr, then crawling, then grounded and consuming

(/foo//bar)[3]
which is crawling, then striding and consuming


I think all in all our rules were intended to allow these. The only nitpick is with xsl:merge-source/@select returning non-nodes, but this is covered by the current text.
Comment 1 Abel Braaksma 2015-10-15 15:58:50 UTC
> 4) We currently imply that we do not disallow grounded posture, but that 
>   if you return non-nodes, that the snapshot() call will fail. I don't see 
>   a problem with leaving that as is (neither would I mind it if we allow 
>   snapshot to operate on non-nodes).

Point 4 quoted above has been overtaken by events (we now allow atomic items for the implicit fn:snapshot() call). However, the other numbered items are still outstanding, I believe.
Comment 2 Michael Kay 2015-10-16 08:59:55 UTC
The WG studied this today and concluded that the points made are valid. I was asked to propose detailed changes to correct the problems identified. However, I think Abel has already proposed the changes which are needed, and I propose to accept these unchanged. Specifically:

(a) change rule 3 to:
 
    3. The expression in the select attribute of that xsl:merge-source 
    element, assessed with a context posture of striding and a context item 
    type of U{document-node()}, has striding or grounded posture and a motionless or 
    consuming sweep.

(b) add a leading "/" to the select expressions in the example, so they become /events/event, and /log/day/record.

I'm leaving the bug open because the WG didn't minute a formal decision on it, but I have applied the changes to the spec and I don't think any further action is needed.
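
For illustration only, a minimal sketch of a merge whose select expression satisfies the revised rule 3 might look like the following. The file names, the template name "main" and the @timestamp key are assumptions made for this sketch and do not come from the spec text quoted here. The for-each-stream attribute follows the draft-era syntax used in the example in comment 5; processors implementing the final XSLT 3.0 Recommendation may expect for-each-source instead.

```xml
<xsl:stylesheet version="3.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                exclude-result-prefixes="xs">

   <!-- Hypothetical entry point, intended to be invoked as the initial named template. -->
   <xsl:template name="main">
      <events>
         <xsl:merge>
            <!-- Hypothetical input files, each assumed to contain /events/event
                 elements already sorted by @timestamp. The absolute path is
                 assessed with the document node as the context item. -->
            <xsl:merge-source for-each-stream="'log-a.xml', 'log-b.xml'"
                              select="/events/event"
                              streamable="yes">
               <!-- The key is evaluated against a snapshot of each selected event. -->
               <xsl:merge-key select="xs:dateTime(@timestamp)"/>
            </xsl:merge-source>
            <xsl:merge-action>
               <!-- Copy the events that share the current merge key. -->
               <xsl:sequence select="current-merge-group()"/>
            </xsl:merge-action>
         </xsl:merge>
      </events>
   </xsl:template>

</xsl:stylesheet>
```

Under the revised wording, /events/event is consuming and striding when assessed from the document node, which is the combination rule 3 is meant to admit.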
Comment 3 Abel Braaksma 2015-10-16 10:08:55 UTC
Note that we are still saying the following, while the text on fn:snapshot has been updated:

<quote>
A consequence of the use of the snapshot function is that a type error occurs if the select expression delivers anything other than nodes. There is no rule to prevent the select expression returning grounded nodes from a different source document, or newly constructed nodes, but they are still processed using the snapshot function.
</quote>

At least the first sentence in this Note should be removed or updated to reflect the fact that we can return any kind of item. Perhaps we can give an example that returns non-nodes, for instance the following is streamable, will not raise an error and selects three items on each iteration:

select="/log/items ! (xs:dateTime(@ts), string(@error), string(@server))"
Comment 4 Abel Braaksma 2015-10-16 10:13:37 UTC
> select="/log/items ! (xs:dateTime(@ts), string(@error), string(@server))"

Actually, this shows a potential danger of this kind of grounding expression when used with merging, and I think it has the same issue as the grounding expressions we looked at in the discussion of fn:last(). As written, the input *may* be consumed all at once, because the expression is grounding. That is perhaps all the more reason to add it and to say something about it (i.e., that streamability is guaranteed but limited memory use is not, although optimizing processors will process this item by item).
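
To make the contrast concrete, here is a sketch of a variant that selects the items elements themselves (a striding, consuming expression) and derives the atomic values inside the merge key and the merge action, where each item is a snapshot. This is only an illustration under stated assumptions: the file name and the entry element are invented for the sketch, the fragment assumes the xsl and xs prefixes are bound as usual, and for-each-stream is the draft-era attribute name used in comment 5. Whether a given processor actually buffers less with this form than with the grounded select is implementation-dependent.

```xml
<xsl:merge>
   <!-- 'server-log.xml' is a hypothetical file name. -->
   <xsl:merge-source for-each-stream="'server-log.xml'"
                     select="/log/items"
                     streamable="yes">
      <!-- One key per selected element, taken from its snapshot. -->
      <xsl:merge-key select="xs:dateTime(@ts)"/>
   </xsl:merge-source>
   <xsl:merge-action>
      <!-- Each member of the group is a snapshot of an items element,
           so its attributes remain available here. -->
      <xsl:for-each select="current-merge-group()">
         <entry ts="{@ts}" error="{@error}" server="{@server}"/>
      </xsl:for-each>
   </xsl:merge-action>
</xsl:merge>
```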
Comment 5 Michael Kay 2015-10-22 13:46:14 UTC
Regarding comment #3, I have changed the offending paragraph to read:

<p diff="chg" at="S-bug29141">There is no rule to prevent the <code>select</code> expression returning atomic values, or grounded nodes from a different source document, or newly constructed nodes, but they are still processed using the <function>snapshot</function> function.</p>

I have added an example using atomic values as follows (though the non-XML source here is not marked streamable, so it might not be quite what you wanted, but I think it is still useful):

<events>
   <xsl:merge>
      <xsl:merge-source name="fax" 
                        select="unparsed-text-lines('fax-log.txt')">
         <xsl:merge-key select="xs:dateTime(substring-before(., ' '))"/>
      </xsl:merge-source>
      <xsl:merge-source name="mail"
                        for-each-stream="'mail-log.xml'" 
                        select="/log/day/record" 
                        streamable="yes">
         <xsl:merge-key select="dateTime(../@date, time)"/>
      </xsl:merge-source>
      <xsl:merge-action>
         <messages time="{current-merge-key()}">
            <xsl:where-populated>
               <fax>
                  <xsl:sequence select="current-merge-group('fax')!substring-after(., ' ')"/>
               </fax>
               <mail>
                  <xsl:sequence select="current-merge-group('mail')/*"/>
               </mail>
            </xsl:where-populated>   
         </messages>   
      </xsl:merge-action>
   </xsl:merge>
</events>

I don't propose to do anything about comment #4.
Comment 6 Michael Kay 2015-10-22 19:26:25 UTC
The changes were accepted today, and have been applied.