From SPARQL Working Group

You wrote:

> Hi all
> Here are my comments/questions on the draft.  Some of this is based on
> personal opinion and some of this is based on starting to try and implement
> some of this stuff in my SPARQL implementation in my latest development
> builds of dotNetRDF (early Alpha release .Net RDF library).
> Aggregates
> - I definitely support the use of a separate HAVING keyword, I agree with
> Leigh that it makes clear that the type of constraint is different and helps
> SQL programmers make the move to SPARQL by making the syntax familiar

The WG has not yet made a decision on which syntax to use, but the editors are currently considering the options.

> - The proposed grammar has a couple of things which bug me:
>  1. It allows for a Having clause without a Group By clause which doesn't
> necessarily make sense depending on the constraint given.  I assume that if
> you have a Having clause without a Group By then it should act as a Filter
> on the result set?
>  The only reason I can see for allowing this is if the having clause uses
> an aggregate then this makes sense in some circumstances e.g. seeing whether
> there are a given number of Triples matching a pattern:
> SELECT * WHERE {?s a ?type} HAVING (COUNT(?s) > 10)

Currently, the proposal is that, by default, a binding multiset consists of a single group containing all solutions. This is compatible with the way SQL works. However, this design is not final and may well change in the near future.
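To illustrate that default, here is a minimal Python sketch (not normative, and subject to the same caveat that the design may change). It treats the whole solution multiset as one implicit group when no grouping variables are given, then applies a HAVING-style count test to that group. The function names `group_solutions` and `having_count_gt`, and the representation of solutions as dictionaries, are assumptions made purely for this example.

```python
from collections import defaultdict

def group_solutions(solutions, group_vars=()):
    """Partition solutions by the values bound to group_vars.

    With no group_vars every solution maps to the same empty key, so the
    result is one group holding all solutions (the proposed default).
    """
    groups = defaultdict(list)
    for sol in solutions:
        key = tuple(sol.get(v) for v in group_vars)
        groups[key].append(sol)
    return dict(groups)

def having_count_gt(groups, var, threshold):
    """Keep only groups in which more than `threshold` solutions bind `var`."""
    return {
        key: members
        for key, members in groups.items()
        if sum(1 for sol in members if var in sol) > threshold
    }

# Twelve hypothetical solutions for the pattern { ?s a ?type }.
solutions = [{"s": f"ex:s{i}", "type": "ex:T"} for i in range(12)]

implicit = group_solutions(solutions)            # no GROUP BY: one group
kept = having_count_gt(implicit, "s", 10)        # HAVING (COUNT(?s) > 10)
```

Under these assumptions the query quoted above has exactly one group to test, and the HAVING condition either keeps or discards that whole group.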

>  2. What is the intended meaning of grouping by an aggregate since the
> grammar permits this?  Should this be permitted at all?

Perhaps not. The grammar does not explicitly rule out all invalid expressions; some checks are left to the application in order to reduce the complexity of the grammar.

> - On the subject of which aggregates to include I think the ones currently
> proposed are all useful and I can't think of any obvious missing ones.  With
> regards to how MIN and MAX should operate would it be reasonable to suggest
> that since SPARQL defines a partial ordering over values that MIN/MAX should
> return the minimum/maximum based on that ordering.  This is much easier to
> implement than doing some form of type detection or having some complex
> algorithm for how they operate over mixed datatypes.  While it does have the
> disadvantage of potentially returning different results depending on how
> exactly the SPARQL engine orders values I would be happy with this
> behaviour.  If people really need type specific minima/maxima then they can
> use appropriate FILTERs in their queries or possibly extra aggregates could
> be introduced eg. NMIN/NMAX (Numeric minimum/maximum)

This is the behavior that the group is proposing to go with.
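A rough Python sketch of ordering-based MIN/MAX follows. It relies on the relative order that SPARQL gives for ORDER BY (blank nodes, then IRIs, then literals). Everything else is an assumption for illustration only: the `(kind, value)` tuple representation of terms, the names `order_key`, `ordered_min` and `ordered_max`, and in particular the choice to sort numeric literals ahead of other literals, which is one engine-specific way of filling in the part of the order the specification leaves open. Unbound values are not handled.

```python
# Relative rank of each term kind, following SPARQL's ORDER BY ordering.
KIND_RANK = {"bnode": 1, "iri": 2, "literal": 3}

def order_key(term):
    """Build a sort key for a (kind, value) term."""
    kind, value = term
    if kind == "literal" and isinstance(value, (int, float)):
        # Assumption: numeric literals come first among literals, by value.
        return (KIND_RANK[kind], 0, value, "")
    # Assumption: all other terms of a kind compare by their string form.
    return (KIND_RANK[kind], 1, 0, str(value))

def ordered_min(terms):
    """MIN as the first term under the engine's ordering."""
    return min(terms, key=order_key)

def ordered_max(terms):
    """MAX as the last term under the engine's ordering."""
    return max(terms, key=order_key)

# A hypothetical mixed-type group of values.
terms = [
    ("literal", 5),
    ("iri", "http://example.org/a"),
    ("literal", "abc"),
    ("bnode", "b0"),
    ("literal", 2.5),
]
lowest = ordered_min(terms)
highest = ordered_max(terms)
```

Because the key function is the only engine-specific part, two engines that order mixed literals differently would return different results for the same group, which is the trade-off noted in the comment.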

> Subqueries
> - I would appreciate some clearer guidance on variable scoping as I don't
> feel comfortable attempting to implement these until I have a better idea of
> how this should work.

Certainly. The intention is that only variables projected from the subquery be in scope in the outer expression.
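As a sketch of that scoping intention, the Python below projects the subquery's solutions down to the selected variables before joining them with the outer solutions, so an inner variable that is not projected cannot clash with an outer variable of the same name. The helper names `project`, `compatible` and `join`, and the assumption that the projected results are combined with the outer pattern by an ordinary join on shared variables, are illustrative rather than taken from the draft.

```python
def project(solutions, variables):
    """Keep only the named variables in each solution."""
    return [{v: sol[v] for v in variables if v in sol} for sol in solutions]

def compatible(a, b):
    """Two solutions are compatible if they agree on every shared variable."""
    return all(a[v] == b[v] for v in a.keys() & b.keys())

def join(left, right):
    """Merge every compatible pair of solutions."""
    return [{**a, **b} for a in left for b in right if compatible(a, b)]

# Hypothetical data: both queries use ?tmp, but the subquery selects only ?x.
inner = [{"x": 1, "tmp": "inner"}]
outer = [{"x": 1, "tmp": "outer"}]

scoped = join(outer, project(inner, ["x"]))   # inner ?tmp is hidden
leaky = join(outer, inner)                    # inner ?tmp would clash
```

With projection applied first, the outer `tmp` binding survives untouched; without it, the two unrelated `tmp` bindings conflict and the join produces no solutions.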

> Projection Expressions
> - I like these very much and have implemented these already.  I personally
> don't like the idea of a LET keyword, it certainly restricts the ability of
> the SPARQL processor to decide in what order it wishes to execute the query.
> Plus to my mind it starts to make SPARQL into the equivalent of Transact SQL
> (and other vendor specific SQL stored procedure languages) which feels wrong
> to me.

The ordering concern is not necessarily a problem, depending on how LET would be defined.

Your concern in the second point is noted.

The group previously decided not to address LET-like structures in this version of the recommendation; as yet, that decision has not been revisited.

Steve Harris, on behalf of the SPARQL working group.