This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 29251 - [xslt30] Conformance and optional features
Summary: [xslt30] Conformance and optional features
Status: CLOSED FIXED
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: XSLT 3.0
Version: Last Call drafts
Hardware: PC All
Importance: P2 normal
Target Milestone: ---
Assignee: Michael Kay
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
Reported: 2015-10-29 19:36 UTC by Michael Kay
Modified: 2015-11-05 21:26 UTC

Description Michael Kay 2015-10-29 19:36:24 UTC
In reviewing the section of the specification on conformance and optional features, the WG noted today:

(1) (a) Section 3.5.8, "Using an XQuery Library Package", has never been specified with sufficient precision to allow interoperable testing, and (b) the new load-xquery-module() function in XPath 3.1 is probably a better (and certainly a more interoperable) solution to the underlying requirement. The WG therefore proposes to drop this feature and the associated XQuery Invocation Feature.

(Note: the function load-xquery-module() will be available in every processor that provides the XPath 3.1 feature and the higher-order-function feature. But under the rules of the function itself, the implementation is allowed to raise a dynamic error FOQM0006 if no suitable XQuery processor is available, and it is free to do this on every invocation if it wishes.)

(2) In XQuery 3.0 (and 3.1), higher-order-functions are an optional feature of the specification. XPath does not call out subsets of the XPath language to be treated as optional, but instead expects this to be defined in host languages. It has always been the WG's intent to make the feature optional but the specification does not state this.

The definition of the higher-order-functions optional feature can be based directly on the definition in XQuery 3.1, which is as follows:

<quote>
An implementation that does not provide the Higher-Order Function Feature MUST raise a static error [err:XQST0129] if it encounters a TypedFunctionTest, named function reference, inline function expression, or partial function application. The effect of this rule is that it is impossible to construct function items other than maps and arrays. Dynamic function calls are permitted, but the function that is called will necessarily be a map or array. The AnyFunctionTest sequence type is permitted and will match either a map or array. Such an implementation MAY raise a type error [err:XPTY0004] if an expression evaluates to a sequence that contains a function.

If an implementation provides the Higher-Order Function Feature, then it must provide all higher-order functions defined in [XQuery and XPath Functions and Operators 3.1]. If an implementation does not provide the Higher Order Function Feature, a static error is raised [err:XPST0017] if any of these functions is present in a query.
</quote>

Note: the functions classified as "higher-order" (for example fn:filter, fn:fold-left, etc) were enumerated in the XQuery 3.0 version of the specification, but in 3.1 the specification of each such function in F+O carries an annotation labelling it as higher-order, and the conformance statement refers to these labels.

Recasting the rule into the form used in the XSLT specification gives the following definition:

27.X Higher-Order Functions Feature

[Definition: An implementation that provides the Higher-Order Function feature provides constructs allowing the creation of functions as items in the data model, and invocation of such functions.]

More specifically: an implementation that does not provide the Higher-Order Function Feature MUST raise a static error [err:TBA] if it encounters a TypedFunctionTest, named function reference, inline function expression, or partial function application. The effect of this rule is that in such an implementation, it is impossible to construct function items other than maps and arrays. Dynamic function calls are permitted, but the function that is called will necessarily be a map or array. The AnyFunctionTest sequence type is permitted and will match either a map or array.

An implementation that does not provide the higher-order function feature constrains the data model by disallowing function items other than maps and arrays. Such a processor must raise a dynamic error [err:TBA] if the input to the processor includes a function item other than a map or array.

Note: see the notes for the analogous error condition XTDE1665.

An implementation that does not provide the Higher-Order Function feature excludes from the static context of all XPath expressions those functions which are labeled in F+O 3.1 as having the higher-order property (for example, filter, for-each, fold-left, function-lookup, etc). Calling function-available on these functions returns false.



To fn:system-property() we add the property name xsl:supports-function-items which returns "yes" if and only if the processor provides the higher-order-function feature.
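
The static rule in the proposed 27.X text can be made concrete with a small stylesheet. This is a minimal sketch, not text from the specification: it assumes an XSLT 3.0 processor that provides the feature, and the element names in the output are invented for illustration. The comments mark which constructs a processor without the feature would be expected to reject under the proposal.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:template name="xsl:initial-template">
    <out>
      <!-- Permitted with or without the feature: maps and arrays are functions,
           so dynamic calls on them and the AnyFunctionTest function(*) are allowed. -->
      <xsl:variable name="m" select="map { 'k': 'v' }"/>
      <xsl:variable name="a" select="[10, 20, 30]"/>
      <permitted>
        <xsl:value-of select="$m('k'), $a(2), $m instance of function(*)"/>
      </permitted>

      <!-- The declarations below use the four constructs named in the proposed
           rule; a processor without the feature would reject them statically. -->

      <!-- TypedFunctionTest (in the as attribute) and a named function reference -->
      <xsl:variable name="ref" as="function(xs:string) as xs:string"
                    select="upper-case#1"/>
      <!-- Inline function expression -->
      <xsl:variable name="inline" select="function($n) { $n + 1 }"/>
      <!-- Partial function application -->
      <xsl:variable name="partial" select="starts-with(?, 'x')"/>
      <needs-feature>
        <xsl:value-of select="$ref('abc'), $inline(1), $partial('xyz')"/>
      </needs-feature>
    </out>
  </xsl:template>
</xsl:stylesheet>
```

Note that the dynamic calls `$m('k')` and `$a(2)` use the same syntax as `$inline(1)`; under the proposal it is the construction of the function item, not the call syntax, that is restricted.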
Comment 1 Michael Kay 2015-10-30 10:30:31 UTC
These changes have been applied to the specification; the bug remains open until the WG has reviewed the changes.
Comment 2 Abel Braaksma 2015-10-30 11:51:55 UTC
Looks good.

Three minor editorial suggestions:

1) If possible, I would prefer to enumerate the exact function list. I understand they are visible, but in many similar situations we enumerate for completeness.

2) I thought the term "function item" was changed to "function"?

3) I would prefer the system-property to be called "xsl:supports-higher-order-functions". It matches the text better, plus it covers better what it does. 

Besides, any XSLT implementation will have to support function items (maps and arrays).

And a few caveats that we may or may not need to address:

4) 
> The effect of this rule is that in such an implementation, it is impossible 
> to construct function items other than maps and arrays.

Perhaps unlikely, but not entirely inconceivable, the initial match selection and global context item *may* contain function items. These are not constructed *in* the implementation, but can be provided from external sources. We should probably say what happens then.

5) 
I don't think we should forbid using (compiled) packages that can operate on function items. In fact, I don't think we can, because a package can be compiled with another version of a processor and it may not be visible to the package. I think it is too high a constraint to impose on package builders if they want to market their packages.

6)
By the same token: don't we have a rule of some sort on packages that are built with support for streaming or schema-awareness, used by a stylesheet built with lesser support?

7)
As a result of (5) and (6), such packages *may* return function items. We should say what happens if you try to match them (predicate pattern dot-matches-all). Of course, you can't do anything useful with them, but you can encounter them.
Comment 3 Michael Kay 2015-10-30 12:05:00 UTC
>1) If possible, I would prefer to enumerate the exact function list. I understand they are visible, but in many similar situations we enumerate for completeness.

I've given a few examples but I would prefer to avoid a normative list: it invites inconsistencies. We changed between XQ30 and XQ31 to using the function properties explicitly to avoid this problem.


>2) I thought the term "function item" was changed to "function"?

Yes. I sometimes feel the need to use "function item" to distinguish a function as the value of an item from a function declaration e.g. a stylesheet function.

>3) I would prefer the system-property to be called "xsl:supports-higher-order-functions". It matches the text better, plus it covers better what it does.

No, it's actually a misnomer. If the HOF feature isn't supported then you can't do

let $f := math:pi#0 return $f()

even though no higher order function is involved. We're disallowing all expressions that return functions as their value, not simply functions that accept or return functions.



And a few caveats that we may or may not need to address:

4) 
> The effect of this rule is that in such an implementation, it is impossible 
> to construct function items other than maps and arrays.


>Perhaps unlikely, but not entirely inconceivable, the initial match selection and global context item *may* contain function items. These are not constructed *in* the implementation, but can be provided from external sources. We should probably say what happens then.

Yes, I have addressed this. I've commoned up the language used for non-schema-aware processors to say that several optional features, if absent, constrain the data model by disallowing certain kinds of item, and generalizing the existing XTDE1665 error code to cover all these constraints in the same way.

5) 
>I don't think we should forbid using (compiled) packages that can operate on function items. In fact, I don't think we can, because a package can be compiled with another version of a processor and it may not be visible to the package. I think it is too high a constraint to impose on package builders if they want to market their packages.

The way I've written the rules, (a) a processor that doesn't support the HOF feature rejects certain syntactic constructs, and (b) may throw a dynamic error if it comes across (non-array/map) function items in incoming data. It's permitted to pass such items through transparently if it wants.

6)
>By the same token: don't we have a rule of some sort on packages that are built with support for streaming or schema-awareness, used by a stylesheet built with lesser support?

We haven't spelled out all the implications.

7)
>As a result of (5) and (6), such packages *may* return function items. We should say what happens if you try to match them (predicate pattern dot-matches-all). Of course, you can't do anything useful with them, but you can encounter them.

It's covered by the general dynamic error condition. But I'll expand the notes a little.
Comment 4 Abel Braaksma 2015-11-02 12:32:51 UTC
> let $f := math:pi#0 return $f()

> even though no higher order function is involved. We're disallowing all 
> expressions that return functions as their value, not simply functions that 
> accept or return functions.

For what it's worth, $f in your example is a function item, $f() a dynamic function invocation. The term "higher order function" typically refers to anything that involves creating, returning and invoking function items. Besides, you called the section "27.X Higher-Order Functions Feature". It's a bad mnemonic if the feature is called differently than the property.

Conversely, if we say "we don't support function items", it is not (entirely) true. Every implementation must support them, you just can't use them, invoke them, or return them.

I would dare say that for instance, one could define:

<xsl:template match=".[. instance of function(*)]">
    <xsl:message terminate="yes">We don't support function items</xsl:message>
</xsl:template>

And you can also do:

<xsl:template match=".[. instance of function(*)]">
    <xsl:try>
        <xsl:value-of select="'attempt (is it a map?): ' || map:keys(.)" />
        <xsl:catch>
            <xsl:message terminate="yes">
                We don't support function items
            </xsl:message>
        </xsl:catch>
    </xsl:try>
</xsl:template>

But perhaps this should fail. I mean in the sense that syntax ".[. instance of function(*)]" is allowed, but ".()" is not. But ".('x')" can be a map.

Note, you wrote:

> The AnyFunctionTest sequence type is permitted and will match either a map 
> or array.

So that would violate the ideas laid out above. It feels kinda strange if we have a sequence of function items (from external source) and it is treated as if it is an empty sequence. But empty($f) on that sequence will return false, while if($f instance of item()) will return true, and if($f instance of function(*)) will return false.

That sounds somewhat similar to function(*) (but not map/array) being mapped to xs:error.

I think I would prefer that we allow:
* operators castable as, cast as, instance of, is
* <, >, =, <=, >=, !=
* eq, ne, lt, gt, le, ge
* pattern .
* function-name, function-arity (or not)

And that we disallow
* dynamic function invocation
* partial function application
* inline functions
* arrow operator (?)
* core higher order functions

Note that, somewhat strangely, XP31 marked fn:load-xquery-module as higher order and fn:transform not. Both are in the higher-order section, though neither takes a function item as argument, or returns a function item.

The thing is, I think that this way it is *much* easier to implement. Instead of special-casing many situations, we can mark a set of functions, while default operations remain available and default templates are still called.

We could even go one step further. Simply allow the whole syntax of XP31 (the grammar) and say that any function item is wrapped in a new function with the same arguments, with as body a call to fn:error(err:XTDExxxx).

This would prevent painstaking updates to the grammar parsing. It is the "everything is allowed, except for a set of functions". This has the downside for end-users to create a late surprise (they can create an inline function but only receive an error when they call it).

In the end it is a little bit potáto - potàto what we choose, and my preference would go to the least disruptive.
Comment 5 Michael Kay 2015-11-02 13:06:42 UTC
Wikipedia:

In mathematics and computer science, a higher-order function (...) is a function that does at least one of the following:

* takes one or more functions as arguments,
* returns a function as its result.

(Although there does seem to be some disagreement about this definition).

We're restricting all use of functions (other than maps and arrays) as values, not only the use of functions as arguments or results of (higher-order) functions.

(In what follows, I'll use "FI" to mean an item whose value is a function other than a map or array.)

I think there are two questions here: terminology (what do we call the feature? what do we call the system property?), and substance (how exactly does a system behave if it doesn't support this optional feature?).

On the issue of substance, I think it's important to align with XQuery. The question of what is statically disallowed is fairly easy, the tricky bits come with what is dynamically disallowed. On this side I think we can recognize two classes of implementation in which the optional feature is not supported:

(a) implementations that don't define any representation of an FI. So the dynamic problem never arises, for example they don't have to look at a value returned from an extension function and say "check that the result isn't an FI", because it can't be.

(b) processors that don't support the feature, but that work with an implementation of the data model that provides a representation of FIs. (Perhaps the product can exchange data model instances with another product that does support the optional feature). There are two possible approaches here, and I think we should allow both: (i) reject any FI in any value supplied as "input to the processor" immediately with a dynamic error. (ii) allow FIs to exist in input values, and raise an error only if some unsupported operation is attempted, e.g. a dynamic function call, or an instance-of test: I don't think we need to prescribe exactly which operations are allowed and which aren't (we've banned the most interesting operations statically), only that the product is entitled to raise a dynamic error at any stage if it encounters an FI.

Note that this situation is very similar to the situation for a non-schema-aware processor handling typed nodes, which is why I have tried to common-up the rules.

On the terminology question, I agree that what I've proposed appears a little inconsistent. "Higher order function feature" is what XQuery calls it and it's a term that the user community will recognize, but it's not an accurate technical description of the set of capabilities we are making optional. I was looking for greater accuracy when selecting a language keyword.
Comment 6 Abel Braaksma 2015-11-02 17:11:53 UTC
> * takes one or more functions as arguments,
> * returns a function as its result.

Perhaps this is simpler? I agree with your (a), which means such processors should implement maps and arrays, not function items.

An analogy to streaming comes to mind. I have the feeling that this feature should mean that a "function item" cannot be invoked (absorption). But inspecting whether something is a function item should be allowed, so that both your option (a) and (b1/b2) can co-exist.

This can lead to dynamic errors only in the case of map/array-style invocation, where the actual item turns out being a function and not a map.

This means that if a processor follows (a), it is possible to do "$f instance of function(*)", it will simply return true for maps and arrays, and false otherwise (it cannot be a function item).

Likewise, it should be possible to do function-name($f), where $f can be a map or an array, in which case the name is the empty sequence. Another outcome is not possible (unless a type error). And function-arity($f) will always return 1, or an error.

This is then consistent between your (a) and (b), where in the case of (b) those functions SHOULD return something useful, if the function item comes from an external input sequence.

Going this way, it will allow users to have <xsl:param name="m-or-a" as="function(*)" />, which can then contain a map or an array. And "function(xs:int) as xs:string" can be a map with int keys or an array.

I think this is not too hard to get right (might be even simpler than your original proposal), and even though it marginally differs from XQuery, I think it does so for good reasons (and perhaps XQ31 wants to bring its definition in line with this, if there's some interest).

I would prefer not to differentiate between b1 and b2, because it may be inevitable when using a library package that such items at some point "exist". Opting for the inspection, but not absorption rule makes this as painless as possible. It's a bit like C# 5, which supports the async keyword for asynchronous processing. C# 4 users cannot use it, but can still call into libraries built with C# 5+ that use the async keyword.
Comment 7 Michael Kay 2015-11-03 09:10:55 UTC
Abel, I'm not clear what changes you are proposing. I think it would be helpful if you could put forward specific changes to the text that I put in the currently-posted draft for us to either accept or reject.
Comment 8 Abel Braaksma 2015-11-05 17:01:08 UTC
I was aiming at / proposing something along these lines:

Processors that do not support higher order functions MUST NOT allow any expression that can create function items (list?) but MAY support function items to be delivered by an external source. It MUST raise a static error when it encounters any function listed in FO31 under "higher-order functions", except fn:function-name and fn:function-arity, which can be used on maps and arrays. Any operation on externally provided function items MUST fail with a dynamic type error, except when such item is tested against a testing expression, such as "instance of", "castable as". Such expressions MAY return false if a function item is encountered.

It is implementation dependent whether a processor allows externally supplied function items, for instance through packages or the initial match selection, but it MUST NOT allow any operation other than inspection on such function items.

Note:
This is similar to saying that any externally supplied function item has type xs:error, and can therefore be tested, but not accessed.

Note:
This allows processors freedom in choosing whether or not they accept packages that use higher order functions internally, or whether they are made available through public parameters, modes or functions.
Comment 9 Michael Kay 2015-11-05 18:45:12 UTC
The WG decided to make one of the suggested changes: changing the name of the system-property.
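
As an illustration of how a stylesheet author might use the renamed property, here is a hedged sketch. It assumes the new name is the xsl:supports-higher-order-functions suggested in comment 2 (this comment does not spell the name out), and the f: namespace and the function name f:doubled are hypothetical. Because use-when is evaluated statically, the excluded variant is never compiled, so the inline function should not trigger a static error on a processor without the feature.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="http://example.com/functions"
    exclude-result-prefixes="xs f">

  <!-- Variant kept only when the processor reports the feature
       (property name is an assumption, see the lead-in). -->
  <xsl:function name="f:doubled" as="xs:integer*"
      use-when="system-property('xsl:supports-higher-order-functions') = 'yes'">
    <xsl:param name="in" as="xs:integer*"/>
    <xsl:sequence select="for-each($in, function($n) { $n * 2 })"/>
  </xsl:function>

  <!-- Fallback that constructs no function items. It is also selected when the
       property name is unknown, since system-property() then returns ''. -->
  <xsl:function name="f:doubled" as="xs:integer*"
      use-when="system-property('xsl:supports-higher-order-functions') != 'yes'">
    <xsl:param name="in" as="xs:integer*"/>
    <xsl:sequence select="for $n in $in return $n * 2"/>
  </xsl:function>

  <xsl:template name="xsl:initial-template">
    <result>
      <xsl:value-of select="f:doubled(1 to 3)"/>
    </result>
  </xsl:template>
</xsl:stylesheet>
```

Only one of the two declarations survives static evaluation, so there is no conflict between two functions with the same name and arity.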