Feature:OneOfGp

From SPARQL Working Group


Feature: One-Of Group Pattern

An application may know of many semantically equivalent expressions of a query. In principle, given any one of these expressions, the query optimizer should be able to arrive at the optimal plan. In practice, this is not always the case.

Feature description

There are cases where an optimizer cannot make this choice. For example, the application may know that data which is equivalent for its purposes exists in several sources, e.g. an RDBMS mapped on demand to SPARQL and an extract of that RDBMS stored as RDF. In such a case, the application can issue a special kind of union of which only one term is evaluated, chosen at the discretion of the cost model.

We see this as especially useful for federated queries.



Example

select ?contact where {
  graph <mapped> { ?x foaf:name "Alice" . ?x foaf:knows ?f . ?f foaf:name ?contact }
  alternate graph <etl> { ?x foaf:name "Alice" . ?x foaf:knows ?f . ?f foaf:name ?contact }
}

The scope rules are the same as for UNION. A trivial implementation could always choose the first alternative, so the minimal implementation cost is low. Because a set of alternate patterns may occur at any level of nesting in a query, the feature cannot readily be replaced by giving the application access to cost model estimates.
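The two selection strategies mentioned here, always taking the first alternative or letting a cost model pick one, can be sketched roughly as follows. This is only an illustration and not part of any SPARQL draft or engine: `GraphPattern`, `choose_first`, `choose_cheapest` and the cost figures are all hypothetical names and values introduced for the example.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class GraphPattern:
    """Hypothetical stand-in for one alternative of a one-of group."""
    graph: str    # graph name, e.g. "mapped" or "etl"
    pattern: str  # the group pattern text (not parsed in this sketch)


def choose_first(alternatives: Sequence[GraphPattern]) -> GraphPattern:
    """Trivial strategy: always evaluate the first alternative."""
    if not alternatives:
        raise ValueError("a one-of group needs at least one alternative")
    return alternatives[0]


def choose_cheapest(
    alternatives: Sequence[GraphPattern],
    estimate_cost: Callable[[GraphPattern], float],
) -> GraphPattern:
    """Cost-based strategy: evaluate only the alternative with the lowest
    estimated cost. On ties, min() keeps the earliest alternative."""
    if not alternatives:
        raise ValueError("a one-of group needs at least one alternative")
    return min(alternatives, key=estimate_cost)


# Made-up cost estimates, purely for illustration.
ASSUMED_COSTS = {"mapped": 120.0, "etl": 15.0}

alternatives = [
    GraphPattern("mapped", '?x foaf:name "Alice" . ?x foaf:knows ?f . ?f foaf:name ?contact'),
    GraphPattern("etl", '?x foaf:name "Alice" . ?x foaf:knows ?f . ?f foaf:name ?contact'),
]

first = choose_first(alternatives)
cheapest = choose_cheapest(alternatives, lambda alt: ASSUMED_COSTS[alt.graph])
```

Under the assumed costs, the trivial strategy selects the `mapped` graph and the cost-based strategy selects the `etl` graph. In a real engine the estimate would come from the optimizer's own cost model, applied wherever the one-of group occurs in the nested query.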

Existing Implementation(s)

Virtuoso will support it as soon as the syntax is defined in some draft of the SPARQL spec.

Existing Specification / Documentation

List any existing text that attempts a formal definition of this extension. This could be a draft specification, API or syntax documentation, etc.

Compatibility

The feature introduces a new keyword, so no ambiguity is possible.

Links to postponed Issues

Has this extension/use case some history in the group already? I.e. are there postponed issues or archived mail threads related to this originating from DAWG?

Related Features

List and possibly link to other use cases or extensions that relate to that use case. That will help us grouping them together in the end.

Champions

Use cases

  • A federated query to numerous similar sources, so that the server may choose its favorite (say, one that is cached locally at the server), even if the sources use different ontologies.
  • A query to a server with precompiled SJMPVs, so that the server may choose between querying a small in-memory data set with a large amount of inference/transitive operations and querying a large disk-resident SJMPV with simple joins.

References