7350 – [XPath 2.1] Higher Order Functions Need Sugar

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Status:	CLOSED FIXED

Alias:	None

Product:	XPath / XQuery / XSLT
Classification:	Unclassified
Component:	XPath 3.0
Version:	Working drafts
Hardware:	PC Windows NT

Importance:	P2 normal
Target Milestone:	---
Assignee:	Jonathan Robie
QA Contact:	Mailing list for public feedback on specs from XSL and XML Query WGs

Reported:	2009-08-16 22:23 UTC by Michael Kay
Modified:	2013-06-19 07:50 UTC
CC List:	3 users

Attachments
PDF document containing comments on the higher-order functions proposal (158.81 KB, application/pdf) 2009-08-16 22:23 UTC, Michael Kay
Revised comments on the HOF facility (152.01 KB, application/pdf) 2009-08-19 18:25 UTC, Michael Kay

Description Michael Kay 2009-08-16 22:23:54 UTC

Created attachment 741
PDF document containing comments on the higher-order functions proposal

I attach some comments received on the higher order function proposal (which, although it is not yet incorporated in any public working draft, is available on John Snelson's web pages). The comments were received from an individual who wished to submit them anonymously because they are not official comments from his company.

Please see PDF attachment.

Comment 1 Michael Kay 2009-08-16 22:50:10 UTC

Some comments on the proposed changes to the facility.

(1) Referring to function items without specifying the arity.

I'm sympathetic to the statement that it's a burden for users to have to specify the arity.

However, the proposed solution whereby current-date() becomes a function item, and the function call has to be written current-date(#void) is clearly a non-starter for both backwards compatibility and usability reasons.

It might be possible to allow the arity to be omitted in the case where there is only one function in the static context with a given QName. The function item literal might then be written, say current-date##. However, this lacks resilience to change: if the WG were to introduce a 1-argument version of current-date() in a future release, the reference to current-date## (or any other syntax that omitted the arity) would become invalid. One could defend against that by making current-date## refer to the version of the function with lowest arity. But I'm not convinced this is a good idea.

The proposal to introduce aliases for function names seems to me to introduce more complexity where the aim should be to have less.

(2) Allowing reference to functions in the op: namespace.

I have some sympathy with the proposal. For coherence, I think it would require making all the op: functions into standard user-visible functions.

(3) Partial application of more than one argument.

Again I think there is some merit in the proposal. I don't think underscore works as a placeholder, because it is a valid NCName; I would suggest question-mark instead. So you get for example

let $max_de := max(?, 'http://my-collation/de')

Interestingly this means that concat#2 can now be written instead as concat(?,?), which seems to obviate the need for function literals in all cases except 0-argument functions. I think we would need some special ad-hoc syntax for that case.

Comment 2 Michael Kay 2009-08-19 18:25:39 UTC

Created attachment 743
Revised comments on the HOF facility

This is a revised version of the comments that takes into account the observations in comment #1

Comment 3 Michael Dyck 2009-08-19 19:56:16 UTC

> Further, we introduce the rule that if values for the first K arguments
> are supplied and the function doesn't have two overloads with more than
> K arguments, then the trailing underscores in the function invocation
> pattern may be omitted. This is always the case with a function that
> has no overloads. Thus, we will usually only write:
>     f:func(1,2,3)
> instead of:
>     f:func(1,2,3, ?, ?, ?)

This seems like a bad idea to me. There's no visual signal that
    f:func(1,2,3)
is a partial application (yielding a function) rather than a simple
function call (yielding whatever it is that f:func normally returns).

So if the user writes
    f:func(1,2,3)
*intending* it as a function call (mistakenly thinking that f:func takes
3 arguments), the processor wouldn't be able to raise static error XPST0017
(pointing out that f:func doesn't have a signature with arity 3), rather it
would have to interpret it as a partial application, which would probably lead
to a type error somewhere else (possibly far away from the actual mistake).

Comment 4 John Snelson 2009-08-19 23:01:07 UTC

I'm intrigued who our anonymous commenter could be :-).

My comments on the proposal:

1) I'm not keen on using "(~)" for literal function items - at least the "#" symbol has an association with numbers and by extension arity. I see no reason not to allow "local:func#" (without specific arity) to reference either a function with unambiguous arity, or the function with minimum arity.

I strongly dislike the idea of retrofitting a "primary-overload" modifier to one of many overloaded function signatures. I think this is too complicated for its marginal benefit.

2) I have sympathy with the desire to reference functions for built-in operators, but think that exposing the underlying polymorphic operator functions is undesirable. It's my opinion that something like an EXPath module will quickly come into existence to fill this need.

3) I like the "?" syntax for partial application, and think it's a definite improvement. With regard to Michael Dyck's comment, my suggestion is that this is only allowed to be used when invoking function items ("dynamic function invocation" in my proposal) - meaning a regular function call like this will still be an error (wrong number of arguments):

fn:starts-with("a")

However, we instead allow this to bind the first argument of fn:starts-with():

fn:starts-with#("a")

And this could bind a collation name for fn:starts-with():

fn:starts-with#3(?, ?, "http://my.collation.com")

This means the proposal example becomes:

let $sum := f:foldl#(local:add#, 0)

Comment 5 John Snelson 2009-08-19 23:16:39 UTC

For those that like to see the EBNF, these are the changes I was suggesting in comment #4:

LiteralFunctionItem ::= QName "#" IntegerLiteral? /* ws:explicit */

DynamicFunctionInvocation ::= FunctionItem "(" (ArgOrNotProvided
  ("," ArgOrNotProvided)*)? ")"

FunctionItem ::= FilterExpr

ArgOrNotProvided ::= ExprSingle | "?"

The explicit whitespace flag has been added to the literal function item production to enable easier parsing, especially given the newly optional arity integer.

Note that using the "?" partial apply syntax provides no way to partially apply an argument to a 1-arity function to turn it into a 0-arity function. I don't see this as a big drawback, but it's worth pointing out.
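
A minimal sketch of one way around that limitation, assuming the inline function syntax that later shipped in XQuery 3.0 (it is not part of the EBNF above). The variable names are illustrative only. The call is wrapped in a zero-argument function rather than partially applied:

```xquery
xquery version "3.0";

(: fn:upper-case#1 is a one-argument function item. :)
let $upper := fn:upper-case#1

(: "?" cannot remove the last remaining parameter, so wrap the call
   in a zero-argument inline function instead. :)
let $thunk := function() { $upper("abc") }

return $thunk()
```

Here $thunk has arity 0 and the argument is fixed inside its body, which is the effect that "?" alone cannot express.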

Comment 6 John Snelson 2009-10-13 17:16:00 UTC

The XQuery and XSLT working groups would like to thank you for your bug report, which we discussed at our teleconference on 2009-10-13. Although the functionality in question is not yet published, the working groups expect to make the following changes:

(1) Referring to function items without specifying the arity.

The working groups decided to make no change for this.

(2) Allowing reference to functions in the op: namespace.

The working groups decided to make no change for this.

(3) Partial application of more than one argument.

The working groups decided to allow question marks ("?") in place of argument expressions for both FunctionCall and DynamicFunctionInvocation to signify partial application. The number of argument expressions plus the number of question marks must equal the number of arguments that the function or function item accepts. The fn:partial-apply() function will be removed.
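
A short sketch of how this decision reads in practice, assuming the syntax as it was eventually published in XQuery 3.0. The variable names are illustrative, and the collation URI is the standard Unicode codepoint collation:

```xquery
xquery version "3.0";

(: FunctionCall with placeholders: fixes the collation of the
   three-argument fn:starts-with and leaves two arguments open. :)
let $starts-with-cp :=
  fn:starts-with(?, ?, "http://www.w3.org/2005/xpath-functions/collation/codepoint")

(: DynamicFunctionInvocation with a placeholder: fixes the second
   argument of the function item held in $starts-with-cp. :)
let $starts-with-a := $starts-with-cp(?, "a")

return ($starts-with-cp("abc", "ab"), $starts-with-a("banana"))
```

Under this rule every argument position holds either an expression or a "?", so fn:starts-with("a") is still rejected with XPST0017 because no one-argument signature exists.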

Please close this bug report if you are satisfied with these solutions.

Comment 7 Michael Kay 2009-10-13 17:56:17 UTC

Perhaps, before we forget the reasoning, I could add a little bit more detail of the discussion that led to these decisions.

On (1) the discussion was mainly about forwards compatibility. Allowing the arity to be omitted only in the case where there were currently no overloads would prevent such overloads being added in future without causing existing programs to fail. Making an omitted arity refer to the function with minimum arity would allow new arguments to be added, but would not allow an existing argument to be made optional, as happened recently with string-join. Designating one of the overloads as primary requires a lot of syntactic machinery which we felt was not justified by the benefits.
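
To illustrate the string-join case, here is a sketch assuming the function library as published with XPath 3.0, where string-join has both a one-argument and a two-argument form. With the arity always written out, each reference keeps its meaning when a new overload appears:

```xquery
xquery version "3.0";

let $join-default := fn:string-join#1   (: one-argument form :)
let $join-with    := fn:string-join#2   (: two-argument form :)

return ($join-default(("a", "b")), $join-with(("a", "b"), "-"))
```

A reference that omitted the arity and meant "the function with minimum arity" would have silently switched from the two-argument form to the one-argument form once the latter was added.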

On (2) we felt that the existing family of op: functions was not especially well designed for this job. For example, instead of six functions representing the six operators op:greater-than, op:ge, op:lt, op:le, op:eq, and op:ne, there are generally only two or three. We felt we would need to redesign this interface if it were exposed to users, and this would be a lot of work. Meanwhile it was quite possible for users or third parties to build a function library on top of the operators and use this.
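
A minimal sketch of such a user-level library function, assuming XQuery 3.0 and the fn:filter argument order of the final 3.0 function library. The name local:gt is hypothetical and not part of any specification:

```xquery
xquery version "3.0";

(: Hypothetical user-defined wrapper around the "gt" value comparison. :)
declare function local:gt(
  $a as xs:anyAtomicType,
  $b as xs:anyAtomicType
) as xs:boolean {
  $a gt $b
};

(: Partially apply the wrapper and use it as a predicate. :)
fn:filter((1, 5, 10), local:gt(?, 3))
```

The wrapper gives the operator a name that can be passed around or partially applied, without exposing any op: function.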

On (3) there was a general feeling that the idea was a good one, and various discussions about the best way to do it (and the best characters to use - tilde in place of question mark made a strong showing). We decided to require either an expression or a "?" in each argument position so there was no ambiguity about which function was being referenced, regardless of whether a function was being named explicitly or by reference to an expression returning a function item.

Comment 8 Michael Kay 2013-06-19 07:50:08 UTC

Marking this closed, as the original anonymous commenter is not in a position to do so... Marking it "fixed", though some of the comments were accepted and others rejected.