W3C

SPARQL 1.1 Query

W3C Working Draft 22 October 2009

This version:
http://www.w3.org/TR/2009/WD-sparql11-query-20091022/
Latest version:
http://www.w3.org/TR/sparql11-query/
Editors:
Steve Harris, Garlik  steve.harris@garlik.com
Andy Seaborne, Hewlett-Packard Laboratories, Bristol  andy.seaborne@gmail.com

Abstract

RDF is a directed, labeled graph data format for representing information in the Web. The SPARQL specification defines the syntax and semantics of the SPARQL query language for RDF.

This document describes changes that will be made to the SPARQL query language to form SPARQL 1.1 Query.

Status of this document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This is the First Public Working Draft of the "SPARQL 1.1 Query" specification for review by W3C members and other interested parties.

If you wish to make comments regarding this document, please send them to public-rdf-dawg-comments@w3.org (subscribe, archives)

Implementors should be aware that this specification is not stable. Implementors who are not taking part in the discussions are likely to find the specification changing out from under them in incompatible ways. Vendors interested in implementing this specification before it eventually reaches the Candidate Recommendation stage should join the aforementioned mailing lists and take part in the discussions.

This draft describes the differences between SPARQL/Query 1.0 and the proposed SPARQL/Query 1.1, rather than being a standalone draft recommendation. The final version of this draft shall be integrated with the earlier SPARQL/Query 1.0 document to produce a new document.

The grammar examples are indicative of changes that will be made to the final SPARQL/Query 1.1 grammar, in later drafts the complete grammar will be provided.

Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

The publication of this document by the W3C as a W3C Working Draft does not imply that all of the participants in the W3C SPARQL working group endorse the contents of the specification. Indeed, for any section of the specification, one can usually find many members of the working group or of the W3C as a whole who object strongly to the current text, the existence of the section at all, or the idea that the working group should even spend time discussing the concept of that section.

The W3C SPARQL Working Group is the W3C working group responsible for this specification's progress along the W3C Recommendation track.

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.


1 Introduction

This document covers the new features of the SPARQL query language being developed by the W3C SPARQL Working Group. The working group document SPARQL New Features and Rationale describes, in section SPARQL/Query 1.1, the background to the choice of this set of features.

No incompatibilities with existing valid SPARQL queries, in either syntax or results, will be introduced by these extensions to the language.

The mandatory features are:

The time-permitting features are:

This document describes the current status of the mandatory features for review and comment by the community. Comments on this document should be sent to public-rdf-dawg-comments@w3.org, a mailing list with a public archive.

The design of the features presented here is work-in-progress and does not represent the final decisions of the working group.  Implementers and application writers should not assume that the designs in this document will not change.

2 Aggregate Functions

Example

Data:

@prefix : <http://books.example/> .

:org1 :affiliates :auth1, :auth2 .
:auth1 :writesBook :book1, :book2 .
:book1 :price 9 .
:book2 :price 5 .
:auth2 :writesBook :book3 .
:book3 :price 7 .
:org2 :affiliates :auth3 .
:auth3 :writesBook :book4 .
:book4 :price 7 .

Query:

PREFIX  <http://books.example/>
SELECT SUM(?lprice) AS ?totalPrice
WHERE { 
  ?org :affiliates ?auth .
  ?auth :writesBook ?book .
  ?book :price ?lprice 
} 
GROUP BY ?org
HAVING (SUM(?lprice) > 10)

Results:

?totalPrice
21

Syntax

Grammar rules:

@@ Alternatively the FILTER keyword could be used in place of HAVING. It is believed that the use of FILTER in this case is always unambiguous.

SolutionModifier   ::=   GroupClause? HavingClause? OrderClause? LimitOffsetClauses?
GroupClause   ::=   'GROUP' 'BY' GroupCondition+
GroupCondition   ::=   ( BuiltInCall | FunctionCall | '(' Expression ( 'AS' Var )? ')' | Var )
HavingClause   ::=   'HAVING' HavingCondition+
HavingCondition   ::=   Constraint
BuiltInCall  ::=   ...
| 'SUM' '(' AggregateExpression ')'
| 'AVG' '(' AggregateExpression ')'
| 'COUNT' '(' ( AggregateExpression | 'DISTINCT'? '*' ) ')'
| 'MIN' '(' AggregateExpression ')'
| 'MAX' '(' AggregateExpression ')'
...
AggregateExpression  ::='DISTINCT'? Expression

@@ Aggregate functions are not yet decided, but examples of possible functions are given above.

Algebra Operators

The operators that make up aggregates consist of three functions: Key, Partition, and Aggregation.

Key is a function that projects aggregated solution values from a solution:

Definition: key

Single valued function: returns a solution projected down to named variables only:

key(varlist, μ) = { (v,x) | (v,x) in μ, v in varlist }

and the set of all keys:

key(varlist, Ω) = { k | μ in Ω, k=key(varlist,μ) }

Partition is a function which groups solutions according to the GROUP BY expression:

Definition: Partition

The partition of the multiset Ω is:

Partition(GroupClause, Ω) = { (k,μ) | μ in Ω, k=key(GroupClause, μ) }

Aggregation, a function which calculates a scalar value as an output of the aggregate expression in the SELECT clause.

Definition: Aggregation

Aggregation is a function agg(key, X). It produces a single value for each key and partition for that key (key, X).

Aggregation(GroupClause, Var, agg, Ω) =
   { merge(k, (Var, agg(k, Ω')) | (k, Ω') in Partition(GroupClause, Ω), Var is a variable }

@@ this needs to be generalised to account for group expressions, if we allow for those.

@@ need to define HAVING as a filter

@@ need to define some aggregate functions in algebra

Mapping from Abstract Syntax to Algebra

Example:

SELECT SUM(?val)
WHERE {
  ?a rdf:value ?val .
} GROUP BY ?a

Becomes Aggregation(?a, ?val, Sum, BGP(?x rdf:value ?val)).

@@ This needs to be a more comprehensive example.

3 Subqueries

Example

Data:

@prefix : <http://people.example/> .

:alice :name "Alice", "Alice Foo", "A. Foo" .
:alice :knows :bob, :carol . 
:bob :name "Bob", "Bob Bar", "B. Bar" .
:carol :name "Carol", "Carol Baz", "C. Baz" .

Return a name (the one with the lowest sort order) from all the people that know Alice and have a name.

Query:

PREFIX : <http://people.example/>
SELECT ?y ?name WHERE {
  :alice :knows ?y .
  {
    SELECT ?y ?name WHERE {
      ?y :name ?name
    }
    ORDER BY ?name
    LIMIT 1
  }
}

Results:

yname
:bob"B. Bar"
:carol"C. Baz"

Syntax

The included sub-select is the same as a top-level SELECT except it excludes the prologue (BASE, PREFIX) and dataset description clause (FROM, FROM NAMED)

Grammar rules:

[7]   SubSelect   ::=   Project WhereClause SolutionModifier
[43]   GroupGraphPattern   ::=   '{' ( SubSelect | GroupGraphPatternSub ) '}'
[44]   GroupGraphPatternSub   ::=   TriplesBlock? ( ( GraphPatternNotTriples | Filter ) '.'? TriplesBlock? )*

Algebra Operator

As it stands Subqueries require one new algebra operator, toMultiset, which takes Lists and returns Multisets.

@@ define toMultiset function

@@ alternatively define SPARQL algebra in terms of tables.

Mapping from Abstract Syntax to Algebra

Example:

{
  SELECT ?z WHERE {
   ?x ?y ?z .
  }
}

Becomes Project(BGP(?x ?y ?z), ?z).

In general, GroupGraphPatternSub is evaluated and then the resulting multiset is projected with respect to Project, as Project(GroupGraphPatternSub, Project).

As a consequence the ordering from any ORDER BY expressions is not propagated outside the Project expression.

4 Negation

The NOT EXISTS pattern tests whether a graph pattern does not match the dataset, given the values of variables in-scope. It does not generate any additional bindings. It can be used in graph patterns and in FILTER expressions. Note that the filter form applies to the whole group in which the filter appears. In the case where the NOT EXISTS pattern is used, it applies only to variables defined earlier in the pattern.

The form EXISTS is also provided, as a pattern and as a filter expression.  It tests whether the pattern does match the data or not; it does not generate any additional bindings.

Data:

@prefix  :       <http://example/> .
@prefix  rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix  foaf:   <http://xmlns.com/foaf/0.1/> .

:alice  rdf:type   foaf:Person .
:alice  foaf:name  "Alice" .
:bob    rdf:type   foaf:Person .

Query:

PREFIX  rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#> 
PREFIX  foaf:   <http://xmlns.com/foaf/0.1/> 

SELECT ?person
WHERE 
{
    ?person rdf:type  foaf:Person .
    NOT EXISTS { ?person foaf:name ?name }
}

Query Result:

person
<http://example/bob>

Syntax

The keywords EXISTS and NOT EXISTS can be used both as part of graph patterns and also in FILTER expressions.

As a graph pattern:

Grammar rules:
GraphPatternNotTriples   ::=   ... | ExistsElt | NotExistsElt
ExistsElt   ::=   'EXISTS' GroupGraphPattern
NotExistsElt   ::=   'NOT EXISTS' GroupGraphPattern

In a FILTER:

Grammar rules:
BuiltInCall   ::=     ...
| ExistsFunc
| NotExistsFunc
ExistsFunc   ::=   'EXISTS' GroupGraphPattern
NotExistsFunc   ::=   'NOT EXISTS' GroupGraphPattern

The rules ExistsFunc and NotExistsFunc are the same (syntactically) as rules  ExistsPattern and NotExistsPattern but are pulled out for convenience of parser writers, where a parser might trigger different internal structures for the different cases.

 Note: NOT EXISTS as a graph pattern operator between patterns P1 and P2 is equivalent to the filter form:

P1 NOT EXISTS {P2}

{P1 FILTER (NOT EXISTS {P2})}

The introduction of the outer { } puts the FILTER in the place where the graph pattern NOT EXISTS occurred.

@@Ref spec: filters happen at end of BGPs.

Algebra Operator

There is a filter operator "exists" that takes a graph pattern. exists returns true/false depending on whether the pattern matches. No additional binding of variables occurs. The NOT EXISTS form translates into fn:not(exists(...)).

 xsd:boolean   EXISTS {pattern pat}

Returns true if pattern pat matches the dataset. Returns false otherwise.

@@active graph

Variables in the pattern pat  that are bound in the current solution mapping take the value they have from the solution mapping. Variables in the pattern pat that are not bound in the current solution mapping take part in pattern matching.

To facilitate this, we introduce an algebra operation for the evaluation of the pattern in an algebra EXISTS operation:

Definition: Substitute

Let μ a solution mapping. We define:

    substitute(pattern, μ) to be the pattern formed by replacing every occurrence of a variable in μ by its value in μ.

Note: NOT EXISTS as a graph pattern is translated into an algebra filter and positioned exactly where it occurs in the graph pattern, unlike a FILTER expression which is applied over the whole group it occurs in, as done during algebra translation. See below.

We define a expression function "exists" using "substitute":

Definition: Exists

Let μ a solution mapping. We define:

    exists(pattern, μ) = true if and only if eval(substitute(pattern, μ), D[g]) has any solutions.

Mapping from Abstract Syntax to Algebra

@@Example only.

The syntax element for NOT EXISTS graph pattern translates to a filter using not and exists. The translation does not move the location of the operation unlike the translation of FILTER. Translation of a filter expression in that uses NOT EXISTS or EXIST in a FILTER proceeds as for all other filter operations.

 { ?s rdf:type <t> 
   NOT EXISTS { ?s <p> ?v }
   ?s :p ?o
 }

Example:

{ ?s rdf:type <t>
   NOT EXISTS { ?s <p> ?v }
   ?s :p ?o
 }
(join
   (filter (not (exists { ?s <p> ?v }))
      (bgp (triple (?s rdf:type <t>))))
   (bgp (triple (?s :p ?o))))

SPARQL-WG Note: Alternative Design: MINUS

The working group has also considered a different design for negation: the MINUS operator.

Minus(Ω1, Ω2) = { μ | μ in Ω1 such that for all μ′ in Ω2, either μ and μ′ are not compatible or dom(μ) and dom(μ') are disjoint }

The additional restriction on dom(μ) and dom(μ') is added so that if any solution mapping has no variables in common with solution mappings of Ω1 then Minus(Ω1, Ω2)  is empty, regardless of the rest of Ω2. The empty solution  mapping is compatible with every other solution mapping so {P} MINUS {} would otherwise be empty.

@@editor: the definition is the editors interpretation of discussion.

See also: SPARQL Diff which is used in the definition of SPARQL LeftJoin.

5 Project expressions

Example:

 PREFIX foaf: <http://xmlns.com/foaf/0.1/>
 PREFIX : <http://www.example.org/>
 
 SELECT (fn:concat(?givenName, ' ', ?surname) AS ?fullName)
 WHERE {
   ?person foaf:givenname ?givenName ;
           foaf:surname ?surname ;
           foaf:interest :trees .
 }

To be decided: exact syntax.  Variations of this extension exist in several system with slightly different syntax.

See also time permitting feature in the SPARQL New Features and Rationale, section 2.5 Query language syntax

SELECT (?x+?y AS ?fullName)
SELECT (?x+?y) AS ?fullName
Grammar rules:
Project   ::=   'SELECT' ( 'DISTINCT' | 'REDUCED' )?
  ( '*' | ( Var | BuiltInCall | Aggregate | FunctionCall |
          ( '(' Expression ( 'AS' Var )? ')' ) )+ )

Algebra Operator

@@require some way to add into a solution mapping, either as a complex project operator or as a separate operation.

@@Scoping of variables - does "AS ?v" allow ?v to be used in another expression?   No reason why not if careful about solution extension ordering.

Definition: Extend

Let μ a solution mapping, var a variable and term an RDF term, then we define:

extend(μ, var, term) = { (x,t) | x in dom(μ) and t = μ(v) or x = var and t = term }

extend(Ψ, var, term) = { extend(μ, var, term) | μ in Ψ }

 

@@Error if var is in dom(μ) - synatctic version of this restriction

 

Mapping from Abstract Syntax to Algebra

Example:

SELECT (?x+?y) AS ?z
{ ... }
ORDER BY ...
(project (?z)
  (extend ?z (+ ?x ?y)
     (orderby ...)
  ))

 


Raw CVS log:

 
$Log: Overview.html,v $
Revision 1.4  2009/10/23 14:16:29  ivan
*** empty log message ***

Revision 1.3  2009/10/22 07:20:24  ivan
*** empty log message ***

Revision 1.2  2009/10/21 16:56:12  ivan
*** empty log message ***

Revision 1.1  2009/10/21 14:23:27  ivan
*** empty log message ***

Revision 1.7  2009/10/21 03:56:30  lfeigenb
added note to SotD about this being a temporary delta

Revision 1.6  2009/10/21 03:15:05  lfeigenb
fixed duplicate anchor

Revision 1.5  2009/10/21 03:13:36  lfeigenb
removed broken fragments

Revision 1.4  2009/10/21 03:11:32  lfeigenb
removed broken fragments

Revision 1.3  2009/10/21 02:57:20  lfeigenb
close errant a tag

Revision 1.2  2009/10/21 02:56:13  lfeigenb
publication prep

Revision 1.1  2009/10/21 02:49:12  lfeigenb
for pub

Revision 1.22  2009/10/20 21:40:08  sharris2
Added an abstract for FPWD process

Revision 1.21  2009/10/20 08:51:57  sharris2
Change title to reflect WG descision

Revision 1.20  2009/10/07 11:27:03  aseaborne
Editorial preparation for FPWD; validation

Revision 1.19  2009/10/05 17:23:38  sharris2
Use draft stylesheet

Revision 1.18  2009/10/05 17:20:11  sharris2
Incorporate changes suggested by Birte and Axel.

Revision 1.17  2009/10/05 12:51:54  sharris2
Fix typo in aggregate example

Revision 1.16  2009/10/05 09:24:44  sharris2
Add some more text about this draft being a delta on 1.0

Sync aggregate grammar with HTML combined grammar doc

Revision 1.15  2009/09/28 15:25:39  sharris2
Add note saying that this is a delta

Revision 1.14  2009/09/28 14:50:57  sharris2
Add section on Subqueries, replace invalid 8859 characters.

Revision 1.13  2009/09/22 11:40:40  sharris2
Removed link to my email address

Revision 1.12  2009/09/22 09:19:04  aseaborne
Fix grammar rule for 'project'

Revision 1.11  2009/09/20 17:55:44  sharris2
Fill in section on Aggregates, still needs some work, especially around the
algebra

Revision 1.10  2009/09/17 13:28:02  aseaborne
Defn for EXISTS

Revision 1.9  2009/09/17 12:01:29  aseaborne
Simple NOT EXISTS example added

Revision 1.8  2009/09/13 18:45:29  aseaborne
Correct for XHTML

Revision 1.7  2009/09/13 18:42:27  aseaborne
Editorial + start on extend()

Revision 1.6  2009/09/07 18:11:22  aseaborne
Partial content for Project Expressions section

Revision 1.4  2009/09/06 19:33:54  aseaborne
Fix up XHTML; WG note about MINUS

Revision 1.3  2009/09/06 18:29:24  aseaborne
Partial content for negation section

Revision 1.2  2009/09/02 12:54:08  aseaborne
Add links to design: on wiki; added to intro.

Revision 1.1  2009/08/27 15:25:00  aseaborne
Initial check-in of query 1.1