29622 – [XP31] Atomization in Postfix Lookup, Unary Lookup

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Status:	RESOLVED FIXED

Alias:	None

Product:	XPath / XQuery / XSLT
Classification:	Unclassified
Component:	XPath 3.1
Version:	Candidate Recommendation
Hardware:	PC Windows NT

Importance:	P2 normal
Target Milestone:	---
Assignee:	Jonathan Robie
QA Contact:	Mailing list for public feedback on specs from XSL and XML Query WGs

Reported:	2016-05-11 12:08 UTC by Tim Mills
Modified:	2016-06-17 10:09 UTC
CC List:	3 users

Description Tim Mills 2016-05-11 12:08:00 UTC

In 3.11.3.1 Unary Lookup we read

for $k in KS
return .($k)

and in 3.11.3.2 Postfix Lookup we read

for $e in E, $s in S return $e($s)

Atomization is not explicit in either, so as written they are interpreted as:

for $k in KS
return .(fn:data($k))

and

for $e in E, $s in S return $e(fn:data($s))

Consider

['A', 'B', 'C'] ! ?([1, 2])

which is interpreted as

['A', 'B', 'C'] ! (for $k in ([1, 2]) return .(fn:data($k)))

which causes an error since $k iterates over the single array item, and fn:data([1, 2]) is the sequence (1, 2).

Similarly

['a', 'b', 'c']?([1, 2])

is interpreted as

for $e in ['a', 'b', 'c'], $s in [1, 2] return $e(fn:data($s))

which, under the current wording, fails in the same way.

I propose that the specification be modified to read in 3.11.3.1 Unary Lookup 

for $k in fn:data(KS)
return .($k)

and in 3.11.3.2 Postfix Lookup 

for $e in E, $s in fn:data(S) return $e($s)

The only effect is that some queries which would otherwise raise XPTY0004 now succeed.  It affects cases where the key specifier includes arrays, or nodes whose typed value is a sequence.

Note that one implementation already behaves in the manner suggested.
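
The difference between the two readings can be modeled outside XPath. What follows is a minimal Python sketch, not the specification's formal semantics. It assumes that Python lists stand in for XPath arrays and that a tuple stands in for the key-specifier sequence; the helper names atomize, array_get, lookup_per_item and lookup_whole are hypothetical, and a TypeError stands in for XPTY0004.

```python
def atomize(item):
    """Model fn:data on one item: an array (Python list) flattens to its atomic members."""
    if isinstance(item, list):
        atoms = []
        for member in item:
            atoms.extend(atomize(member))
        return atoms
    return [item]


def array_get(array, key):
    """Model the dynamic call $array($key): the key must be one 1-based integer."""
    if not isinstance(key, int):
        raise TypeError("XPTY0004: key must be a single integer")
    if not 1 <= key <= len(array):
        raise IndexError("FOAY0001: array index out of bounds")
    return array[key - 1]


def lookup_per_item(array, key_specifier):
    """Current wording: for $k in KS return .(fn:data($k))."""
    results = []
    for item in key_specifier:
        atoms = atomize(item)
        if len(atoms) != 1:
            # fn:data($k) produced a sequence, which is not a valid single key
            raise TypeError("XPTY0004: atomized key is not a single value")
        results.append(array_get(array, atoms[0]))
    return results


def lookup_whole(array, key_specifier):
    """Proposed wording: for $k in fn:data(KS) return .($k)."""
    atoms = []
    for item in key_specifier:
        atoms.extend(atomize(item))
    return [array_get(array, key) for key in atoms]
```

In this model the key specifier ([1, 2]) is the one-item tuple ([1, 2],). lookup_whole(['A', 'B', 'C'], ([1, 2],)) returns ['A', 'B'], while lookup_per_item raises the stand-in type error for the same input. For a key specifier that is already a sequence of atomic values, such as (1, 2), the two functions agree.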

Comment 1 Michael Kay 2016-05-24 17:18:06 UTC

I agree that we should make this change. I think the main justification is consistency: in every other case where we have an operator (or function) where one of the operands (or arguments) is required to be an atomic sequence, we atomize the operand as a whole, rather than atomizing each item in the operand individually.

For example A=B means (some $a in data(A), $b in data(B) satisfies $a eq $b); it does not mean (some $a in A, $b in B satisfies data($a) eq data($b)).

We don't have any language or machinery to say "the required type of the RHS is a sequence of items S such that the result of atomizing each item in S is a single atomic value". We do have machinery to say "the required type of the RHS (after applying the function conversion rules) is an atomic sequence". Let's re-use the machinery we have rather than doing something different.
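
The A=B comparison above can be modeled in the same way. This is a rough Python sketch under loud assumptions: Python lists stand in for items whose atomization yields several values, tuples stand in for sequences, Python's == stands in for eq, and type promotion and untyped-atomic casting are ignored entirely. The function names are hypothetical.

```python
def atomize(sequence):
    """Model fn:data on a whole sequence: lists flatten into the result."""
    atoms = []
    for item in sequence:
        if isinstance(item, list):
            atoms.extend(atomize(item))
        else:
            atoms.append(item)
    return atoms


def general_eq(a, b):
    """Operand-level atomization: some $a in data(A), $b in data(B) satisfies $a eq $b."""
    return any(x == y for x in atomize(a) for y in atomize(b))


def general_eq_per_item(a, b):
    """Item-level atomization: some $a in A, $b in B satisfies data($a) eq data($b)."""
    for x in a:
        for y in b:
            xs, ys = atomize([x]), atomize([y])
            if len(xs) > 1 or len(ys) > 1:
                # eq does not accept an operand of more than one atomic value
                raise TypeError("XPTY0004: eq operand is not a single value")
            if xs and ys and xs[0] == ys[0]:
                return True
    return False
```

With a = ([1, 2],) and b = (2,), general_eq returns True, whereas general_eq_per_item raises the stand-in type error because the single item in a atomizes to two values.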

Comment 2 Jonathan Robie 2016-06-07 15:56:36 UTC

I assume you want this for both maps and arrays.  If I understand this correctly, it would allow sequences of keys in either case:

let $m := map {
  "a" : 1,
  "b" : 2
}
return $m(("a", "b"))
=> (1, 2)


[ 2, 4, 6, 8](1, 3)
=> (2, 6)

The order of keys in the sequence determines the order of the result.

This seems useful.  It's probably not a hard change for implementations, but it's a significant change in behavior.

Comment 3 Michael Kay 2016-06-07 19:24:37 UTC

(In reply to Jonathan Robie from comment #2)
> I assume you want this for both maps and arrays.  If I understand this
> correctly, it would allow sequences of keys in either case:

No, I don't think that's an accurate description of the proposed change. 

(1) array:get($A, $K) and map:get($M, $K), as well as $A($K) and $M($K), all still require $K to be a single atomic value. 

(2) the lookup operator $A?$K and $M?$K already allows $K to be a sequence of atomic values.

What is changed is that in a lookup operator, $K is atomized as a whole, rather than being atomized item-by-item. This brings it into line with all other operations that expect a sequence of atomic values as an argument.

For example, suppose that @IDREFS is an attribute node of type xs:IDREFS with value "A B C D" -- that is, its typed value is ("A", "B", "C", "D").

And consider $M, a map {"A":1, "B":2, "E":5}

Under the current rules, $M?(@IDREFS) is an error because it translates to 

map:get($M, ("A", "B", "C", "D"))

Under the proposed rules, the same expression returns (1, 2), because it translates to

("A", "B", "C", "D") ! map:get($M, .)
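
The @IDREFS example can be traced with a small Python model. This is only an illustrative sketch: the AttributeNode class, the atomize and map_get helpers, and the two lookup functions are hypothetical stand-ins, a Python dict stands in for the map, tuples stand in for sequences, and a TypeError stands in for the type error.

```python
from dataclasses import dataclass


@dataclass
class AttributeNode:
    """Hypothetical stand-in for an attribute node; typed_value is its atomized value."""
    typed_value: tuple


def atomize(sequence):
    """Model fn:data on a sequence: nodes contribute their typed value."""
    atoms = ()
    for item in sequence:
        if isinstance(item, AttributeNode):
            atoms += item.typed_value
        else:
            atoms += (item,)
    return atoms


def map_get(mapping, key):
    """Model map:get: the key must be one atomic value; a missing key yields ()."""
    if isinstance(key, tuple):
        raise TypeError("XPTY0004: key must be a single atomic value")
    return (mapping[key],) if key in mapping else ()


def lookup_current(mapping, key_specifier):
    """Current rules: each item of the key specifier is atomized on its own."""
    result = ()
    for item in key_specifier:
        atoms = atomize((item,))
        key = atoms[0] if len(atoms) == 1 else atoms
        result += map_get(mapping, key)
    return result


def lookup_proposed(mapping, key_specifier):
    """Proposed rules: the key specifier is atomized as a whole, then iterated."""
    result = ()
    for key in atomize(key_specifier):
        result += map_get(mapping, key)
    return result
```

With m = {"A": 1, "B": 2, "E": 5} and idrefs = AttributeNode(("A", "B", "C", "D")), lookup_proposed(m, (idrefs,)) returns (1, 2), matching the result given above, and lookup_current(m, (idrefs,)) raises the stand-in type error.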

Comment 4 Abel Braaksma 2016-06-14 16:29:09 UTC

(reclassified the bug as XP31 bug)

Comment 5 Andrew Coleman 2016-06-17 10:09:32 UTC

At the meeting on 2016-06-14, the WG agreed to adopt the proposal in the Description.
Action A-646-06 will track this change.