This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 24343 - xsl:merge cannot be streamable because of semantics doc() etc
Summary: xsl:merge cannot be streamable because of semantics doc() etc
Status: CLOSED DUPLICATE of bug 25335
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: XSLT 3.0
Version: Last Call drafts
Hardware: PC Windows NT
Importance: P2 normal
Target Milestone: ---
Assignee: Michael Kay
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 24497
Reported: 2014-01-21 11:57 UTC by Abel Braaksma
Modified: 2014-05-15 14:00 UTC

Description Abel Braaksma 2014-01-21 11:57:30 UTC
This bug was first raised on the xsl-wg mailing list [1], relevant text copied here in its entirety:

We currently define the @for-each attribute of xsl:merge-source to take a call to doc(), document(), or collection(), with @streamable="yes" indicating streamed processing of the sources. However, this conflicts with those functions, which are defined as deterministic, whereas streamed processing is deliberately not deterministic.

That means, in practice, that streamed processing for xsl:merge can at most be a compiler optimization, because determinism must still be retained: multiple invocations of such an xsl:merge instruction must produce the same results, and even invocations of doc() etc. with the same arguments elsewhere must return the same document node.
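
The conflict can be illustrated outside XSLT. The following is a minimal Python sketch, not XSLT and not taken from the specification: `doc` and `stream` are hypothetical stand-ins, where a dictionary cache models the determinism required of doc() and a generator models a one-pass stream that is opened afresh on every call.

```python
from typing import Dict, Iterator, List

# Hypothetical in-memory "file system": URI -> list of records.
SOURCES: Dict[str, List[str]] = {"a.xml": ["r1", "r2", "r3"]}

_doc_cache: Dict[str, List[str]] = {}


def doc(uri: str) -> List[str]:
    """Stand-in for a deterministic doc(): same URI, same object, every time."""
    if uri not in _doc_cache:
        # The whole document is materialized and retained.
        _doc_cache[uri] = list(SOURCES[uri])
    return _doc_cache[uri]


def stream(uri: str) -> Iterator[str]:
    """Stand-in for a streamed read: a fresh one-pass iterator per call."""
    yield from SOURCES[uri]


# Two calls to doc() with the same argument return the identical object.
same_node = doc("a.xml") is doc("a.xml")

# Two calls to stream() return two distinct iterators.
same_stream = stream("a.xml") is stream("a.xml")

# A stream can be consumed only once.
s = stream("a.xml")
first_pass = list(s)
second_pass = list(s)
```

In this model, retaining identity forces the cached copy to stay in memory, which is what streaming is meant to avoid.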

I propose to solve this the same way we did when introducing xsl:stream. Instead of a function call, xsl:merge should have a src attribute that is an AVT which returns a sequence of URIs. In line with this, we could also add a collection-src attribute.

This change would bring xsl:merge more in line with the other streamability features, and it would turn the claim that xsl:merge can stream into something that can be enforced by the guaranteed streamability rules.

[1] https://lists.w3.org/Archives/Member/w3c-xsl-wg/2014Jan/0012.html
Comment 1 C. M. Sperberg-McQueen 2014-02-10 15:59:55 UTC
We discussed this in the face to face meeting in Prague.

The syntax of merge-source does need to be revamped in the light of the current handling of streaming and streamability analysis.

We concluded that we will need three mutually exclusive attributes for merge-source:

- one to accept an expression which returns a set of objects (which are not necessarily streamable), each of which provides an input to the merge

- one to accept an AVT which evaluates to a sequence of URIs, each of which provides a document node as input to the merge

- one to accept an AVT which evaluates to a single collection URI, where the member documents of that collection are inputs to the merge

The second and third attributes can be regarded as opening the documents with doc() or document() in the normal case, and as opening streams analogous to those opened with the xsl:stream instruction when the merge-source element bears the attribute value specification streamable="yes".

We seem to have consensus this far, but no useful consensus on the names of these attributes.
Comment 2 Michael Kay 2014-02-11 21:30:20 UTC
Rather than the proposed three attributes, I propose we make it two:

for-each-item  

which is an expression that selects the anchor items (equivalent to the present for-each)

for-each-stream

which is an expression that selects a sequence of URIs, each being the URI of a streamed input document. Typical usages:

for-each-stream="'a.xml'"

for-each-stream="('a.xml', 'b.xml')"

for-each-stream="uri-collection('collection-uri')"

Do we also need validation and type attributes to allow control over input validation?
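
As an illustration of the kind of processing a sequence of stream URIs is meant to enable, here is a minimal Python sketch (not XSLT). It assumes each input is already sorted on the merge key; `SOURCES`, `open_stream`, and `merge_streams` are hypothetical names, and the standard-library `heapq.merge` stands in for the merge itself.

```python
import heapq
from typing import Dict, Iterator, List, Tuple

Record = Tuple[int, str]  # (merge key, payload)

# Hypothetical inputs: each URI maps to records already sorted by key.
SOURCES: Dict[str, List[Record]] = {
    "a.xml": [(1, "a1"), (4, "a4"), (7, "a7")],
    "b.xml": [(2, "b2"), (4, "b4"), (9, "b9")],
}


def open_stream(uri: str) -> Iterator[Record]:
    """Stand-in for opening one streamed input; yields records one at a time."""
    yield from SOURCES[uri]


def merge_streams(uris: List[str]) -> Iterator[Record]:
    """Lazily merge pre-sorted inputs by key, one pending record per input."""
    streams = [open_stream(u) for u in uris]
    return heapq.merge(*streams, key=lambda rec: rec[0])


merged = list(merge_streams(["a.xml", "b.xml"]))
merged_keys = [key for key, _ in merged]
```

Unlike xsl:merge, this sketch does not group records with equal keys into a merge group; it only shows that the merge needs no more than the current record of each input.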
Comment 3 C. M. Sperberg-McQueen 2014-02-12 13:58:46 UTC
The WG considered the proposal in comment 2 and found it good. The answer to the question at the end of comment 2 is "yes": we believe having type and validation attributes on merge-source would be a good thing.

The rules for guaranteed streamability must still be formulated, and the references to the doc(), document(), and collection() functions must be removed, to make the description logically consistent.
Comment 4 Michael Kay 2014-02-20 16:30:18 UTC
Fixed as agreed.
Comment 5 Abel Braaksma 2014-03-04 14:20:42 UTC
I'm taking the liberty of reopening this because of some (possible) bugs introduced in the new streaming rules (numbered 1 to 6):

1) About rule #2
"The expression in the select attribute of that xsl:merge-source element must have striding posture;"

This should probably be:

"The expression in the select attribute of that xsl:merge-source element must have striding or grounded posture and a sweep that is not free-ranging;"

I think limiting only the posture is not necessarily enough, yet I cannot find an expression that is free-ranging and striding/grounded at the same time. For example, the expression snapshot(a//b) is roaming and free-ranging because the argument has usage absorption, though at first glance it might seem grounded and free-ranging (because the result of snapshot is always grounded).

2) About rule #4
"The select expression of each merge key in that xsl:merge-source element must be a motionless expression;"

This should be something like:

"The select expression of each merge key in that xsl:merge-source element must be a motionless expression, assessed with the posture of the select expression of the immediate xsl:merge-source parent;"

3) About rule #5
Typo: "and and either" => "and either a"

Also, the context item of xsl:merge-action is the result of the select expression of xsl:merge-source, hence it might make sense to set the context posture to the higher result posture of all select expressions (allowing that, if all are grounded, free-ranging expressions are allowed). However, this might over-complicate things. If we don't add this, we should perhaps explain what the context posture is for the assessment of the posture of the xsl:merge-action (striding?).

4) About the example
Currently, the example is not streamable with the rules given. It is streamable if the changes mentioned above are applied.

The Note on the example should probably go into the green section of the example, as it applies to the example, not to the body of the text of section 15.4.

5) About context posture
It is not clear from the get-go what the context posture is for assessing the select expression of the xsl:merge-source element. Perhaps it is irrelevant?

6) About 19.8.4.25, Streamability of xsl:merge
There is no rule about how to assess xsl:merge-source children; the operands are not known.

The current rule:
"If all xsl:merge-source children are motionless then the instruction is grounded and motionless."

does not take into account that a select expression on xsl:merge-source can be motionless but can still select streamable nodes, hence allowing the xsl:merge-action to return streamable nodes and still be motionless. This should be disallowed.


I think we could fix this by writing something like the following:

#1. If all xsl:merge-source without @for-each-stream and @for-each-item have a select expression that is motionless and grounded;

#2. If all xsl:merge-source with @for-each-item have a for-each-item expression that is grounded and motionless;

#3. If all xsl:merge-source with @for-each-stream follow the rules under 15.4 Streamable Merging;

Then the instruction is grounded and motionless. Otherwise, it is roaming and free-ranging.
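
The proposed rule can be sketched as a small decision function. This is a minimal Python sketch under stated assumptions: the `MergeSource` record and its fields are hypothetical, and the posture, sweep, and "follows 15.4" results are taken as already computed by the streamability analysis rather than derived here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class MergeSource:
    """Hypothetical summary of one xsl:merge-source after analysis."""
    # "plain": neither attribute; "item": @for-each-item; "stream": @for-each-stream
    kind: str
    grounded: bool = False    # relevant expression has grounded posture
    motionless: bool = False  # relevant expression has motionless sweep
    meets_15_4: bool = False  # follows the rules under 15.4 Streamable Merging


def source_ok(src: MergeSource) -> bool:
    if src.kind in ("plain", "item"):
        # Rules #1 and #2: the select (or for-each-item) expression
        # must be grounded and motionless.
        return src.grounded and src.motionless
    if src.kind == "stream":
        # Rule #3: defer to the streamable-merging rules.
        return src.meets_15_4
    raise ValueError(f"unknown merge-source kind: {src.kind!r}")


def assess_merge(sources: List[MergeSource]) -> Tuple[str, str]:
    """Return (posture, sweep) of the xsl:merge instruction."""
    if all(source_ok(s) for s in sources):
        return ("grounded", "motionless")
    return ("roaming", "free-ranging")


ok = assess_merge([
    MergeSource("plain", grounded=True, motionless=True),
    MergeSource("stream", meets_15_4=True),
])
bad = assess_merge([
    MergeSource("item", grounded=False, motionless=True),
    MergeSource("stream", meets_15_4=True),
])
```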
Comment 6 Abel Braaksma 2014-03-04 16:12:09 UTC
It appears that MK and I were having the exact same thoughts regarding the streamability of xsl:merge: item (6) in the list in comment 5 was tackled as part of bug 24497.
Comment 7 Michael Kay 2014-04-16 08:28:16 UTC
I think the remaining issues are subsumed by the discussion of streamed merging in bug #25335: marking this as a duplicate for convenience.

*** This bug has been marked as a duplicate of bug 25335 ***