25445
2014-04-24 22:03:15 +0000
[XP3.1] Replace curly array constructor with a function
2014-07-29 08:44:07 +0000
1
1
1
Unclassified
XPath / XQuery / XSLT
XPath 3.1
Last Call drafts
PC
All
RESOLVED
WONTFIX
P2
normal
---
1
mike
jonathan.robie
public-qt-comments
oldest_to_newest
104377
0
mike
2014-04-24 22:03:15 +0000
Semantically, array { X } is a pure function of X. I can't see any good reason why it needs special syntax; it should be a function call.
We have used curly brace syntax traditionally for things that are too complex to express as functions. This one isn't. It converts a sequence to an array; the inverse operation, to convert an array to a sequence, is a function.
There is no other case where we have used curly braces for something that is a simple non-higher-order function.
With the syntax growing, it's becoming increasingly difficult to remember where to use curlies and where to use parens. For things like typeswitch I have to look it up every time. So let's avoid new syntax where we don't need it, and let's avoid making the two parallel operations array() and seq() different.
Plus, using functions in a functional language has obvious benefits.
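As a rough illustration of the round trip described above (sequence to array and back), here is a minimal sketch in the syntax of the final XQuery 3.1 Recommendation. It does not use the seq() function named in this thread; the `?*` lookup is an assumed stand-in for it, and the variable names are invented for the example.

```xquery
xquery version "3.1";

(: Sketch only: "?*" stands in for the seq() function discussed in this thread. :)
let $seq := (1, 2, 3)
let $arr := array { $seq }        (: one array member per item of $seq :)
return (
  array:size($arr),               (: 3 :)
  deep-equal($arr?*, $seq)        (: true() :)
)
```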
104447
1
jonathan.robie
2014-04-25 13:59:02 +0000
The only reason we have two syntaxes is that we want one of these syntaxes to use commas the same way functions do, and we want the other to accept an arbitrary sequence and create an array whose members are the items in the sequence.
The square array constructor uses commas the same way that functions do: commas delimit the members of the resulting array. If we were to use function syntax for an array constructor, this would be a better candidate.
The curly array constructor takes an arbitrary expression and creates an array with one member for each item in the sequence. It does not use commas the same way as function calls; inside the curly constructor, a comma is the comma operator.
One of the use cases creates an array of maps. The current solution uses JSONiq, so the square array constructor has the same semantics as our curly array constructor:
[
for $w in $s()
return { "pos" : $w(2), "lemma" : $w(1) }
]
Creating arrays of maps or maps of arrays will be common when working with JSON. But with our current square constructor, this expression creates an array with one member, a sequence of maps. I was not happy with that decision. JSON doesn't even allow that structure, and I expect this to be a common mistake for people working with JSON.
I think your suggestion is probably backwards. We need one syntax in which the comma delimits arguments the same way as commas in function calls. Let's use function call syntax for that. We need another syntax in which the items in an arbitrary sequence are used as the members of an array. Function call syntax doesn't work well for that.
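To make the difference between the two constructors concrete, here is a small sketch written in the syntax of the final XQuery 3.1 Recommendation rather than JSONiq. The input `$words` and the map key are hypothetical; only the member counts matter.

```xquery
xquery version "3.1";

let $words  := ("alpha", "beta", "gamma")   (: hypothetical input :)
let $square := [ for $w in $words return map { "lemma" : $w } ]
let $curly  := array { for $w in $words return map { "lemma" : $w } }
return (
  array:size($square),   (: 1: a single member holding a sequence of three maps :)
  array:size($curly)     (: 3: one member per map :)
)
```

The square form treats the whole FLWOR expression as one member, which is the behavior described above as a likely source of mistakes for people coming from JSON.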
104450
2
mike
2014-04-25 14:47:17 +0000
I did not intend to reopen the discussion about the semantics of [a,b,c]. I think we got it right in Prague: it creates an array with three members, these being the values of a, b, and c. I think that's what most people would expect; as we discussed in Prague, many languages have such a construct, and they invariably create an array with N+1 members, where N is the number of commas. We can't do this with a function call unless it is a variable-arity function call.
I'm concerned here with the other construct, array{X}. I want to understand whether there is a good reason for having custom syntax for this, rather than using a function call as we do with its inverse, seq(X).
I see this being used in situations like
array {
for $x in employee
return $x/salary/data()
}
and perhaps this is why curlies were chosen; the FLWOR looks more like a statement than an expression to people from other cultures, because of its sentential syntax, and in those cultures curly braces are used to group "statements". But that's not our culture; we have an expression language, and array{} is semantically a pure function call. Making it a pure function allows things like
(a/b/c) => array()
which people will increasingly expect to be able to write.
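A sketch of how that pipeline style might look in the final XQuery 3.1 syntax. There is no built-in array() function there, so `local:to-array` is a hypothetical user-declared wrapper around the curly constructor, introduced only for this example.

```xquery
xquery version "3.1";

(: local:to-array is a hypothetical helper, not a function from any draft. :)
declare function local:to-array($items as item()*) as array(*) {
  array { $items }
};

(: The arrow operator passes the left-hand value as the first argument. :)
(10, 20, 30) => local:to-array() => array:size()   (: 3 :)
```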
104451
3
jonathan.robie
2014-04-25 15:52:33 +0000
(In reply to Michael Kay from comment #2)
> I did not intend to reopen the discussion about the semantics of [a,b,c]. I
> think we got it right in Prague: it creates an array with three members,
> these being the values of a, b, and c.
I think this issue inherently reopens the discussion of the syntax of array constructors. And frankly, I'd rather first rewrite all the use cases in our current syntax before we revisit these decisions; we spent a lot of time getting to where we are now. If we do revisit these decisions, we should look at construction of arrays in general, not just one kind of array constructor.
In Prague, we had agreement that we needed two kinds of constructors. One camp felt that the comma in [a,b,c] should have the same meaning that it has in (a,b,c). The other camp felt that the comma should have the same meaning that it has in function calls, separating arguments.
In this issue, you have suggested that we use function call syntax for array constructors. I think that makes most sense for the syntax that uses commas the same way function calls do.
> We can't do this with a function call unless it is a variable-arity
> function call.
Yes, it would be variable arity.
> I'm concerned here with the other construct, array{X}. I want to understand
> whether there is a good reason for having custom syntax for this, rather
> than using a function call as we do with its inverse, seq(X).
Why do you have this question only for one of the two array constructor syntaxes? The question seems equally apt for both.
> I see this being used in situations like
>
> array {
> for $x in employee
> return $x/salary/data()
> }
>
> and perhaps this is why curlies were chosen; the FLWOR looks more like a
> statement than an expression to people from other cultures, because of its
> sentential syntax, and in those cultures curly braces are used to group
> "statements". But that's not our culture; we have an expression language,
> and array{} is semantically a pure function call.
In our culture, we use {} for computed constructors of many kinds - documents, elements, attributes, maps, even text nodes, PIs, and comments. To me, this seems perfectly in line with those constructs.
> Making it a pure function allows things like
>
> (a/b/c) => array()
>
> which people will increasingly expect to be able to write.
We could certainly add a function to do that. I'm not sure how important it is to create structures this way, or why arrays are different from documents, elements, attributes, maps, etc.
We're talking about syntax here, and I think the best way to determine the most convenient syntax for expressions in a language is to look at a body of examples, written in each proposal. Syntax always has a high potential for bike shedding, so I suggest we first rewrite the use cases in our current syntax, then entertain change proposals for array constructors.
[
for $w in $s()
return { "pos" : $w(2), "lemma" : $w(1) }
]
array { a/b/c }
109328
4
jonathan.robie
2014-07-27 20:17:56 +0000
Here is a set of examples from the use cases in three different syntaxes:
** XQuery 3.1 CWD
The syntax in the current working draft.
** Changing array { } to array(())
Michael's proposal in comment 0 of this BZ. To my eyes, these
expressions are harder to read because of the number of parentheses.
** Changing array { } to [], changing [] to array()
A counterproposal that allows us to replace one of our array
constructor syntaxes with array() rather than array(()).
Here are some examples from the use cases.
Note:
I did not find an example that depends on the comma behavior
we have defined for the current [] operator, so I will try to
construct such an example in a subsequent comment.
* Example 1:
** XQuery 3.1 CWD
declare function local:spellcheck($languages, $text)
{
map:new (
{ "languages" : $languages },
{ "raw" : $text },
for $l in $languages
return map {
$l : array { $text ! ext:sc($l, .) }
}
)
};
** Changing array { } to array(())
declare function local:spellcheck($languages, $text)
{
map:new (
{ "languages" : $languages },
{ "raw" : $text },
for $l in $languages
return map {
$l : array (( $text ! ext:sc($l, .) ))
}
)
};
** Changing array { } to [], changing [] to array()
declare function local:spellcheck($languages, $text)
{
map:new (
{ "languages" : $languages },
{ "raw" : $text },
for $l in $languages
return map {
$l : [ $text ! ext:sc($l, .) ]
}
)
};
* Example 2:
** XQuery 3.1 CWD
[
for $w in $s()
return array { "pos" : $w(2), "lemma" : $w(1) }
]
** Changing array { } to array(())
[
for $w in $s()
return array (( "pos" : $w(2), "lemma" : $w(1) ))
]
** Changing array { } to [], changing [] to array()
[
for $w in $s()
return [ "pos" : $w(2), "lemma" : $w(1) ]
]
* Example 3:
** XQuery 3.1 CWD
map {
true() : array { $s[$p(.)] },
false() : array { $s[not($p(.))] }
}
** Changing array { } to array(())
map {
true() : array (( $s[$p(.)] )),
false() : array (( $s[not($p(.))] ))
}
** Changing array { } to [], changing [] to array()
map {
true() : [ $s[$p(.)] ],
false() : [ $s[not($p(.))] ]
}
* Example 4:
** XQuery 3.1 CWD
declare function local:mult( $matrix1, $matrix2 )
{
if (length($matrix1) != length($matrix2(1)))
then error("Matrices must be m*n and n*p to multiply!")
else array {
for $i in 1 to length($matrix1)
return array {
for $j in 1 to length($matrix2(1))
return
sum (
for $k in 1 to length($matrix2)
return $matrix1($i)($k) * $matrix2($k)($j)
)
}
}
};
** Changing array { } to array(())
declare function local:mult( $matrix1, $matrix2 )
{
if (length($matrix1) != length($matrix2(1)))
then error("Matrices must be m*n and n*p to multiply!")
else array ((
for $i in 1 to length($matrix1)
return array ((
for $j in 1 to length($matrix2(1))
return
sum (
for $k in 1 to length($matrix2)
return $matrix1($i)($k) * $matrix2($k)($j)
)
))
))
};
** Changing array { } to [], changing [] to array()
declare function local:mult( $matrix1, $matrix2 )
{
if (length($matrix1) != length($matrix2(1)))
then error("Matrices must be m*n and n*p to multiply!")
else [
for $i in 1 to length($matrix1)
return [
for $j in 1 to length($matrix2(1))
return
sum (
for $k in 1 to length($matrix2)
return $matrix1($i)($k) * $matrix2($k)($j)
)
]
]
};
* Example 5: assign items to groups
Note: We don't have really good use cases in our document for this.
I don't consider this one strong, but it illustrates the syntax.
** XQuery 3.1 CWD
let $x := (1, 2, 3, 4, 5, 6, 7, 8, 9)
return [$x[. mod 2 eq 0], $x[. mod 3 eq 0], $x[. mod 5 eq 0]]
** Changing array { } to array(())
(Same as above.)
let $x := (1, 2, 3, 4, 5, 6, 7, 8, 9)
return [$x[. mod 2 eq 0], $x[. mod 3 eq 0], $x[. mod 5 eq 0]]
** Changing array { } to [], changing [] to array()
let $x := (1, 2, 3, 4, 5, 6, 7, 8, 9)
return array( $x[. mod 2 eq 0], $x[. mod 3 eq 0], $x[. mod 5 eq 0])
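For reference, here is a sketch of what the grouping in Example 5 evaluates to when written in the syntax of the final XQuery 3.1 Recommendation, next to the curly form over the same operands. The variable names `$grouped` and `$flat` are invented for the comparison.

```xquery
xquery version "3.1";

let $x := (1, 2, 3, 4, 5, 6, 7, 8, 9)
(: Square constructor: each comma-separated operand becomes one member. :)
let $grouped := [ $x[. mod 2 eq 0], $x[. mod 3 eq 0], $x[. mod 5 eq 0] ]
(: Curly constructor: the comma operator flattens the operands first. :)
let $flat    := array { $x[. mod 2 eq 0], $x[. mod 3 eq 0], $x[. mod 5 eq 0] }
return (
  array:size($grouped),   (: 3 members: (2, 4, 6, 8), (3, 6, 9), 5 :)
  array:size($flat),      (: 8 members: 2, 4, 6, 8, 3, 6, 9, 5 :)
  $grouped(2)             (: 3, 6, 9 :)
)
```

Only the square form keeps the three groups apart, which is the comma behavior this example depends on.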
109469
5
jonathan.robie
2014-07-29 08:44:07 +0000
The WG decided to close this bug with no change.