This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 27807 - [xslt 3.0] Streaming of "document fragments"
Summary: [xslt 3.0] Streaming of "document fragments"
Status: CLOSED FIXED
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: XSLT 3.0
Version: Candidate Recommendation
Hardware: PC All
Importance: P2 normal
Target Milestone: ---
Assignee: Michael Kay
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
 
Reported: 2015-01-12 13:21 UTC by Michael Kay
Modified: 2015-10-29 09:50 UTC

Description Michael Kay 2015-01-12 13:21:56 UTC
We say in section 4.6:

"Expressions such as (/) instance of document-node(element(invoice)) also 
require look-ahead as far as the start-tag of the first element."

with the assumption that if the first start-tag is <invoice>, then the expression is true. But what if the document node has a second element child? Then the expression should return false, which means the "instance of" test cannot be motionless.

We seem to be making the assumption that it is known a priori that a streamed document will either be a well-formed document, or will cause a dynamic error. If we don't make this assumption then type checking against document-node tests becomes impossible for streamed documents.

So I think we should make the assumption, and make it explicit that we are making it.
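
To illustrate why the test cannot be motionless without that assumption, here is a minimal Python sketch. It models a streamed document as a list of (kind, name) events; this event format and the function name is_document_of are assumptions made for the illustration, not anything defined by the spec, and text, comment and processing-instruction nodes are ignored.

```python
from typing import Iterable, Tuple

Event = Tuple[str, str]  # hypothetical event model: ("start" | "end", element name)


def is_document_of(events: Iterable[Event], name: str) -> Tuple[bool, int]:
    """Return (result, events_read) for a test modelled on
    `. instance of document-node(element(name))`.

    The result is True only if the document node has exactly one
    element child and that child is called `name`.
    """
    depth = 0
    top_level = 0
    read = 0
    for kind, tag in events:
        read += 1
        if kind == "start":
            if depth == 0:
                top_level += 1
                # A wrong name or a second top-level element settles it early.
                if tag != name or top_level > 1:
                    return False, read
            depth += 1
        elif kind == "end":
            depth -= 1
    return top_level == 1, read


well_formed = [("start", "invoice"), ("start", "line"), ("end", "line"), ("end", "invoice")]
fragment = well_formed + [("start", "invoice"), ("end", "invoice")]

print(is_document_of(well_formed, "invoice"))  # (True, 4)
print(is_document_of(fragment, "invoice"))     # (False, 5)
```

In this model the first start tag alone is enough to answer false, but a true answer is only available once every event has been read.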
Comment 1 Abel Braaksma 2015-01-14 11:47:09 UTC
Isn't this inherent to streaming in general? If a streamed document is not well-formed, dynamic, late-in-the-process errors can occur with just about any expression, like child::p, if it isn't closed.
Comment 2 Michael Kay 2015-01-14 12:51:39 UTC
The issue is with something like

<?xml version="1.0"?>
<a/>
<a/>

which is not a well-formed document, but has a legal representation in XDM, as a document node with two element-node children. If such "documents" are allowed (e.g. as the input of xsl:stream), then the expression ". instance of document-node(element(a))" becomes non-streamable, because it's defined to return false for the above "document".
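
As a quick check of the first claim, a conformant XML parser rejects this input. The sketch below uses Python's standard xml.etree.ElementTree; the helper name is_well_formed is made up for this illustration. It says nothing about how an XDM document node with two element children might be constructed by other means.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


two_roots = '<?xml version="1.0"?>\n<a/>\n<a/>'
print(is_well_formed('<?xml version="1.0"?>\n<a/>'))  # True
print(is_well_formed(two_roots))                       # False
```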
Comment 3 Michael Kay 2015-02-06 18:20:02 UTC
After further discussion, the WG accepted a "sketch proposal" to resolve this along the following lines: the expression "$X instance of ST" should not be motionless in the case where the item type of ST takes the form document-node(element(X)). The editor was asked to flesh this out into a detailed proposal. Here is the proposal.

A. In 19.8.7, in the table row for InstanceOfExpr [25], replace the proforma by a reference to a new section entitled "Streamability of InstanceOf expressions".

B. Add a new section "Streamability of InstanceOf expressions" with the following content:

For an expression of the form X instance of ST (where X is an expression and ST is a SequenceType), the posture and sweep are determined by the general streamability rules. There is a single operand X, whose operand usage is as follows:

(1) If the ItemType of ST is a DocumentTest, optionally parenthesized, then absorption

(2) Otherwise, inspection.

Note: In general, is is possible to determine whether a node matches an ItemType without consuming the node.

An ItemType of the form document-node(element(X)) is a exception to this rule because it matches a document node only if it has exactly one element node child, and this cannot be determined without consuming the document.

A processor may have knowledge that the document node cannot contain multiple element nodes, for example because it knows that the source of the streamed document is an XML parser that is not capable of generating such a stream. In such cases the processor may make a different assessment of the streamability of this construct. This comes under the general provision that a processor is always at liberty to use streaming even when the stylesheet is not guaranteed streamable.

Note: As with other constructs that are evaluated with inspection usage, for example the name() function or access to an attribute node, evaluation of a construct such as ($X instance of schema-element(E)) as true or false may be invalidated if reading of the input stream subsequently fails. Dynamic errors during streamed processing of an input document invalidate all output generated prior to the failure, and this case is no different.

Note: Given an expression such as (child::* instance of element(E)*), the expression as a whole is consuming and grounded. By contrast, the expression (. instance of element(E)*) is motionless and grounded. This can be verified by applying the general streamability rules to these cases.

[ED NOTE: a brave assertion, which I need to check...]
Comment 4 Michael Kay 2015-02-06 18:22:34 UTC
In the above, change (1) to:

(1) If the ItemType of ST is a DocumentTest (optionally parenthesized) that contains an ElementTest or SchemaElementTest, then absorption

(I overlooked that document-node() is also a DocumentTest.)
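
The amended rule can be sketched as a small Python function. The ItemType class and its fields are a hypothetical, much simplified model of the XPath grammar (not any processor's API) that only distinguishes the cases the rule cares about.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ItemType:
    """Hypothetical, simplified model of an XPath ItemType."""
    kind: str                     # e.g. "document-node", "element", "schema-element"
    inner: Optional[str] = None   # for document-node(...): kind of the nested test, if any


def operand_usage(item_type: ItemType) -> str:
    """Operand usage of X in `X instance of ST`, per rule (1) as amended."""
    if item_type.kind == "document-node" and item_type.inner in ("element", "schema-element"):
        return "absorption"
    return "inspection"


print(operand_usage(ItemType("document-node", "element")))  # absorption
print(operand_usage(ItemType("document-node")))             # inspection
print(operand_usage(ItemType("schema-element")))            # inspection
```

Under this model a bare document-node() test stays on the inspection branch, which is the case the original wording of (1) got wrong.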
Comment 5 Michael Kay 2015-02-17 08:59:27 UTC
The change was accepted.

Ed note: typos
(1) in "Note: In general, is is "
(2) in following para "is a exception to this rule"
Comment 6 Michael Kay 2015-02-26 21:19:23 UTC
The changes have been applied.