Re: ungrouped variables used in projections - Further implications?

The more I think about this, the more I fear that this discussion has
further implications...

This is the current text:

"In aggregate queries and sub-queries variables that appear in the query
pattern, but are not grouped by cannot be projected nor used in project
expressions. In order to project arbitrary expressions the SAMPLE
aggregate may be used."

I am afraid this is still not entirely clear:

1) In principle, this forbids:

SELECT (COUNT(?O1) AS ?C)
WHERE { ?S :p ?O1; :q ?O2 } GROUP BY (?O1 + ?O2)
ORDER BY ?O12

(This is currently reflected in test case agg08.)
I am not sure we intended to forbid this.
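
For comparison, a sketch of the variant that groups by a named expression. This assumes the grammar lets a GROUP BY condition bind its expression to a variable with AS, and the prefix is a placeholder. Here ?O12 is itself a grouping key, so projecting it and ordering by it would not fall under the restriction:

```sparql
PREFIX : <http://example.org/>

# ?O12 names the grouping expression, so it is "grouped by"
# and may be projected and used in ORDER BY.
SELECT ?O12 (COUNT(?O1) AS ?C)
WHERE { ?S :p ?O1; :q ?O2 }
GROUP BY (?O1 + ?O2 AS ?O12)
ORDER BY ?O12
```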

2) ACTION-209 reflects our understanding that we shouldn't allow projected variables to be reused
in HAVING clauses. But what about non-grouped variables and expressions in HAVING clauses?
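
To make the question concrete, here is a made-up query of the kind I have in mind (the prefix and the threshold are placeholders, not taken from any test case). ?O2 is neither a grouping key nor inside an aggregate, yet it appears in HAVING:

```sparql
PREFIX : <http://example.org/>

# Open case: ?O2 is not grouped by and not aggregated.
# An aggregated form such as HAVING (SAMPLE(?O2) > 10)
# would not raise the same question.
SELECT ?S (COUNT(?O1) AS ?C)
WHERE { ?S :p ?O1; :q ?O2 }
GROUP BY ?S
HAVING (?O2 > 10)
```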

To address both 1) and 2), my current understanding is that we should change:

"In aggregate queries and sub-queries variables that appear in the query
pattern, but are not grouped by cannot be projected nor used in project
expressions. In order to project arbitrary expressions the SAMPLE
aggregate may be used."

-->

"In aggregate queries and sub-queries variables *or expressions* that appear in the query
pattern, but are not grouped by cannot be projected, nor be used in project
expressions *(except within aggregations)*, *nor be used in HAVING clauses*. 
In order to project arbitrary expressions the SAMPLE aggregate may be used."

The formulation gets a bit heavier, but at least it seems clearer.
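
As an illustration of how I read the reworded text, the following made-up query (placeholder prefix and variable names) should remain acceptable, because every non-grouped variable occurs only inside an aggregation:

```sparql
PREFIX : <http://example.org/>

# ?S is grouped by; ?O1 and ?O2 appear only within aggregates,
# both in the projection and in HAVING.
SELECT ?S (SUM(?O1 + ?O2) AS ?Total) (SAMPLE(?O2) AS ?AnyO2)
WHERE { ?S :p ?O1; :q ?O2 }
GROUP BY ?S
HAVING (COUNT(?O1) > 1)
```

By contrast, under the same reading, projecting ?O1 directly, projecting (?O1 + 1 AS ?X), or writing HAVING (?O2 > 1) in this query would each be rejected.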

If we can agree on that, I'd need to extend the test cases to cover 2).

best,
Axel

On 25 Aug 2010, at 09:53, Axel Polleres wrote:

> 
>> Right. When we closed ISSUE-11, we referenced ISSUE-41 about whether
>> expressions were allowed in GROUP BY and noted the 2 were closely
>> related. So I still don't think this is anything new, as it's something
>> we talked about and considered when resolving 11.
> 
> Thanks Lee, Steve for clarification!
> 
>>> I do think Axel's wording is less ambiguous though, I'll incorporate it into the text.
> 
> Perfect, I consider this settled then, and will try to come up with some additional syntax test cases that reflect it.
> 
> Thanks,
> Axel
> 
> 
> 
> On 25 Aug 2010, at 01:41, Lee Feigenbaum wrote:
> 
>> On 8/24/2010 5:37 PM, Steve Harris wrote:
>>> On 2010-08-24, at 21:01, Lee Feigenbaum wrote:
>>> 
>>>> On 8/24/2010 1:09 PM, Axel Polleres wrote:
>>>>> We couldn't really find consensus on the issue of ungrouped variables used in projections in aggregate queries in today's call. I volunteered to summarise my current understanding of the different positions:
>>>> 
>>>> Hi everyone,
>>>> 
>>>> I think that we resolved this question in favor of it being an error back in November when we closed ISSUE-11. (See http://www.w3.org/2009/sparql/meeting/2009-11-17#resolution_2 .)
>>>> 
>>>> It's unclear from the minutes what prompted this conversation. Is there new information about this topic? If so, could someone please share it on the list or point to it in the minutes? (I looked but couldn't discern it.) Otherwise, I suggest we stick with our resolution and spend our time on other topics.
>>> 
>>> That is also my understanding. I don't believe there's substantial new information.
>>> 
>>> One thing that was unknown when that decision was taken (if I remember correctly), was whether we would allow grouping by expressions, or just variables. I don't think that has a significant bearing on the decision though.
>> 
>> Right. When we closed ISSUE-11, we referenced ISSUE-41 about whether
>> expressions were allowed in GROUP BY and noted the 2 were closely
>> related. So I still don't think this is anything new, as it's something
>> we talked about and considered when resolving 11.
>> 
>>> I do think Axel's wording is less ambiguous though, I'll incorporate it into the text.
>> 
>> Cool.
>> 
>> Lee
>> 
>>> 
>>> - Steve
>>> 
>>>>> The issue is exemplified by the following query:
>>>>> 
>>>>> SELECT ?N COUNT(?P1) WHERE { ?P name ?N; knows ?P1 } GROUP BY ?P
>>>>> 
>>>>> 1) The current spec seems to be clear about this case...
>>>>> 
>>>>> "In aggregate queries and sub-queries only expressions which have been used as GROUP BY
>>>>>  expressions, or aggregated expressions (i.e. expressions where all variables appear
>>>>>  inside an aggregate) can be projected."
>>>>> 
>>>>> ... suggesting that it is an error.
>>>>> 
>>>>> An alternative handling would be to
>>>>> 2) treat the non-grouped variables as unbound (I think that's what Andy suggested)
>>>>> 3) or leave the behavior to the implementation (I think that would be least favorable, increasing
>>>>>    the ambiguity of the language and allowing implementations to do anything)
>>>>> 
>>>>> 
>>>>> An argument raised against 1) in favor of 2) was that we would raise an error on an otherwise syntactically correct query, which might be considered awkward and hard to implement, since the parser would essentially need to be context-aware.
>>>>> 
>>>>> Note that we have a similar behaviour (needing a context-aware parser) already in forbidding bnodes being shared among patterns:
>>>>> "When using blank nodes of the form _:abc,  labels for blank nodes are scoped to the basic graph pattern.  A label can be used in only a single basic graph pattern in any query."
>>>>> 
>>>>> If I understood correctly, Andy was arguing that checking reuse of bnodes was easier since the
>>>>> scope doesn't play a role, as opposed to GROUP BY. (A more detailed explanation would be appreciated here.)
>>>>> 
>>>>> We had a strawpoll which ended as follows:
>>>>> 
>>>>>   Should ungrouped variables in project expressions generate an error?
>>>>>   +1: 6 0: 6 -1: 0
>>>>> 
>>>>> There were no objections, but when I asked whether any of the supporters would object to NOT flagging an error, Souri said he'd probably object.
>>>>> 
>>>>> Summarising, that makes me lean towards forbidding projection, unless we get new information.
>>>>> 
>>>>> As a side remark, note that the current wording is not precise:
>>>>> 
>>>>> "In aggregate queries and sub-queries only expressions which have been used as GROUP BY
>>>>>  expressions, or aggregated expressions (i.e. expressions where all variables appear
>>>>>  inside an aggregate) can be projected."
>>>>> 
>>>>> Note that this does not cover the following case:
>>>>> 
>>>>>  SELECT (?N AS ?New) COUNT(?P1) WHERE { ?P name ?N; knows ?P1 } GROUP BY ?P
>>>>> 
>>>>> Thus, in case we stick with the general understanding of 1), I would suggest rewording:
>>>>> 
>>>>> "In aggregate queries and sub-queries variables that appear in the query pattern, but are not grouped by
>>>>>  cannot be projected nor used in project expressions."
>>>>> 
>>>>> In case we adopt 2) we should probably still say something about this case, maybe illustrate it with an example:
>>>>> 
>>>>> "In aggregate queries and sub-queries variables that appear in the query pattern, but are not grouped by
>>>>>  are unbound outside the query pattern. For instance, (add an example)"
>>>>> 
>>>>> 
>>>>> Opinions welcome!
>>>>> 
>>>>> best,
>>>>> Axel
>>>>> 

Received on Wednesday, 25 August 2010 12:34:17 UTC