CommentResponse:RV-2

From SPARQL Working Group

(In response to RV-2)


Rob,

Thanks for the comments.


Aggregates

There is currently no plan to reduce the set of aggregates, so SAMPLE is highly likely to be included in subsequent drafts. It is necessary because SPARQL, unlike SQL, has no implicit sampling behaviour.

GROUP_CONCAT is named that way so that there can also be a scalar CONCAT function. GROUP_CONCAT probably should not conventionally take two arguments, as that causes some confusion with the semantics of aggregates. One possibility is a syntax like GROUP_CONCAT(?x SEPARATOR "\t"); the intention is to add this before the next WD.

Property Paths

- I agree with some of the previous comments on the list that some of the
features in property paths seem overly complex, e.g. alternatives.  If you
really need to do alternatives isn't it best just to use UNIONs?

Property path features can be combined into complex paths. Allowing alternatives in property paths makes for more compact expressions than the equivalent UNIONs, particularly when the alternative is only one step of a longer path.
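
As a rough illustration of the compactness point, the sketch below combines a sequence with an alternative. The choice of foaf:made, dc:title and rdfs:label is ours, purely for illustration, and the path operators are written as in the current property paths draft, so details may change.

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dc:   <http://purl.org/dc/elements/1.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# Sequence (/) followed by an alternative (|): find a display label
# for anything a person has made, whichever predicate carries it.
SELECT ?person ?label
WHERE {
  ?person foaf:made/(dc:title|rdfs:label) ?label
}
```

Writing the same query with UNION needs an intermediate variable and one branch per alternative:

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dc:   <http://purl.org/dc/elements/1.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# Same intent without a path: ?work is introduced only to join the steps.
SELECT ?person ?label
WHERE {
  ?person foaf:made ?work .
  { ?work dc:title ?label }
  UNION
  { ?work rdfs:label ?label }
}
```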

- Returning results of a path expression in an ordered way (with regards to
RDF lists) seems at odds with the general evaluation model of SPARQL which
as I understood it was that the results were an unordered multiset up until
you start applying solution modifiers and only actually becomes ordered if
an OrderBy is applied

Order of results from property paths is not guaranteed.

- Providing lengths of paths would complicate things and I don't think it
should be in the 1.1 spec

Providing lengths is not currently planned. This does weaken the usefulness of property paths but, as property paths are a time-permitting feature, the WG is inclined to leave analysis and specification of path lengths to a later group, when more deployment experience is available. The WG believes it has not designed out the possibility: for example, potential syntax forms have been considered to make sure the syntax is not a barrier to a future WG.

- Limiting results of path expressions to being distinct seems logical and
would aid implementation since you can potentially build a list of valid
paths as you evaluate the expression and by checking that you haven't
already found a specific path you can do cycle detection very easily (I may
be wrong here I'm just thinking off the top of my head how I might implement
paths)

Thank you for the observation.

Open Issues

- 5: Surely there is nothing you can express in an ASK that you can't with
an EXISTS?

Yes - EXISTS behaves like a nested ASK.
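
To make the correspondence concrete, here is a minimal sketch, assuming the FILTER EXISTS form and using FOAF terms chosen only for illustration. For each candidate ?person, the inner pattern is tested much as an ASK over that pattern would be, with ?person already bound.

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

# Keep only people for whom the inner pattern has at least one match,
# i.e. for whom the corresponding ASK would return true.
SELECT ?person
WHERE {
  ?person a foaf:Person .
  FILTER EXISTS { ?person foaf:mbox ?mbox }
}
```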

- 14: I think the aggregates defined now are sufficient and people can
provide extensions as per my comments on issue 15

It is proposed that extension aggregates would be identified by a URI, as extension functions are.

- 15: Extension aggregates should be defined by URIs just as with extension
functions and the individual implementations can then generate appropriate
structures depending on whether the URI indicates an aggregate/expression.
For example I've already defined a few in the function library for my engine
[2] e.g.

PREFIX lfn:<http://www.dotnetrdf.org/leviathan#>

SELECT ?s lfn:all(IsUri(?o)) AS ?AllObjectsAreUris
WHERE
{
   ?s ?p ?o
} GROUP BY ?s

This is what the group intends to do. Extension aggregates will also be able to take the DISTINCT flag.

- 35: I think that with the aggregates currently proposed the only ones
which DISTINCT makes sense for are COUNT and possibly GROUP_CONCAT though
I'd rather have it as only valid for COUNT

The group has decided to allow DISTINCT as a flag to all aggregates, as per SQL.

- 36: I think this should be rejected at the parsing stage - you shouldn't
be able to project an expression to an existing variable

It can't be enforced by the grammar, but the current text supports this.

- 39: I don't see too much of an issue with this though this may require
some queries to be rewritten such that projection expressions are evaluated
in such an order that the necessary expressions are evaluated prior to their
value being used

The WG has discussed this and does plan to allow a variable to be used later in a SELECT expression list, with clear rules on scoping.
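
A sketch of the kind of query this would permit is shown below. The ex: namespace and its properties are hypothetical, and the exact scoping rules are still to be written down, so this only illustrates the intent: ?subtotal is introduced by one projection expression and used by the next.

```sparql
PREFIX ex: <http://example.org/ns#>

# ?subtotal is defined by the first expression and reused by the second,
# so the expressions must be evaluated left to right.
SELECT ?item (?price * ?quantity AS ?subtotal) (?subtotal * 1.2 AS ?withTax)
WHERE {
  ?item ex:price ?price ;
        ex:quantity ?quantity .
}
```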

- 41: GROUP BY expression should be permitted

The current text supports this.

Service Description

> Looking at the Service Description draft my main concern is that it allows 
> you to specify that you support some extension functions but not to say 
> anything about the arguments of those functions.  For example there's no way 
> to express that an extension function takes 2 arguments both of which must be 
> xsd:string and gives back an xsd:string This may be too complicated for the 
> service description to express easily and I guess you run into issues when 
> you have functions like fn:concat() which can take variable/unlimited numbers 
> of arguments.  Is the assumption that a user/their agent will be able to 
> retrieve the description of that function from somewhere else?

The intention of the Service Description vocabulary is to provide a minimal set of terms that can allow a simple description of a SPARQL endpoint, its dataset(s), and supported features. Importantly, we're not trying to provide a vocabulary with which to describe *all* possible aspects of an endpoint, including specifics of the supported functions (such as argument and return types) or dataset descriptions.

Our expectation is that with the infrastructure of service descriptions shared between endpoints, implementers can start to use/develop vocabularies for describing services in more detail, with consensus hopefully developing around specific features. For example, voiD[1] is likely to be a good way to describe datasets, while SPIN[2] might provide the sort of extension function descriptions you are talking about.

Grammar

As a general point on the query draft the EBNF in the grammar section is
still the 1.0 EBNF and does not contain the new rules which 1.1 introduces -
though I guess this may be in part due to the rules not being finalised?
Some of the new EBNF is embedded in the course of the text but some of it
seems to have disappeared at the moment.

The WG intends to produce a single grammar for both the query and update languages because they share many grammar rules. The EBNF fragments shown with the new features are included as an indication of the changes that will be made to the final SPARQL 1.1 grammar.


We hope this message addresses your comments. If it does, could you please help our comment tracking by replying to this message to say that you are satisfied with this response.


thanks,
Gregory Williams, Steve Harris, Andy Seaborne
on behalf of the SPARQL working group.

[1] http://rdfs.org/ns/void-guide
[2] http://spinrdf.org/sp.html