SPARQL Query Language for RDF - Formal Definitions

Definition: RDF Term

let I be the set of all IRIs.
let RDF-L be the set of all RDF Literals
let RDF-B be the set of all blank nodes in RDF graphs

The set of RDF Terms, RDF-T, is I union RDF-L union RDF-B.

Definition: Query Variable

A query variable is a member of the set V where V is infinite and disjoint from RDF-T.

Definition: Graph Pattern

A Graph Pattern is one of:

Definition: SPARQL Query

A SPARQL query is a tuple (GP, DS, SM, R) where:

The graph pattern of a query is called the query pattern.

Definition: Triple Pattern

A triple pattern is member of the set:
(RDF-T union V) x (I union V) x (RDF-T union V)

Definition: Pattern Solution

A variable solution is a substitution function from a subset of V, the set of variables, to the set of RDF terms, RDF-T.

A pattern solution, S, is a variable substitution whose domain includes all the variables in V and whose range is a subset of the set of RDF terms.

The result of replacing every member v of V in a graph pattern P by S(v) is written S(P).

If v is not in the domain of S then S(v) is defined to be v.

Definition: Basic Graph Pattern

A Basic Graph Pattern is a set of Triple Patterns.

Definition: E-entailment Regime

An E-entailment regime is a binary relation between subsets of RDF graphs.

A graph in the range of an E-entailment is called well-formed for the E-entailment.

Definition: Basic Graph Pattern equivalence

Two basic graph patterns are equivalent if there is a bijection M between the terms of the triple patterns that maps blank nodes to blank nodes and maps variables, literals and IRIs to themselves, such that a triple ( s, p, o ) is in the first pattern if and only if the triple ( M(s), M(p) M(o) ) is in the second.

Definition: Scoping Set

A Scoping Set B is some set of RDF terms.

Definition: Scoping Graph

The Scoping Graph G' for RDF graph G, is an RDF Graph that is graph-equivalent to G

Definition: Basic Graph Pattern E-matching

Given an entailment regime E, a basic graph pattern BGP, and RDF graph G, with scoping graph G', then BGP E-matches with pattern solution S on graph G with respect to scoping set B if:

• BGP' is a basic graph pattern that is graph-equivalent to BGP
• G' and BGP' do not share any blank node labels.
• (G' union S(BGP')) is a well-formed RDF graph for E-entailment
• G E-entails (G' union S(BGP'))
• The RDF terms introduced by S all occur in B.

Definition: Value Constraint

A value constraint is a boolean-valued expression of variables and RDF Terms.

For value constraint C, a solution S matches C if S(C) is true, where S(C) is the boolean-valued expression obtained by substitution of the variables mentioned in C.

Definition: Group Graph Pattern

A group graph pattern GP is a set of graph patterns, GPi.

A solution of Group Graph Pattern GP on graph G is any solution S such that, for every element GPi of GP, S is a solution of GPi.

Definition: Optional Graph Pattern

An optional graph pattern is a combination of a pair of graph patterns. The second pattern modifies pattern solutions of the first pattern but does not fail matching of the overall optional graph pattern.

If Opt(A, B) is an optional graph pattern, where A and B are graph patterns, then S is a solution of optional graph pattern if S is a pattern solution of A and of B otherwise if S is a solution to A, but not to A and B.

Definition: Union Graph Pattern

A union graph pattern is a set of graph patterns GPi.

A union graph pattern matches a graph G with solution S if there is some GPi such that GPi matches G with solution S.

Definition: RDF Dataset

An RDF dataset is a set:
{ G, (<u1>, G1), (<u2>, G2), . . . (<un>, Gn) }
where G and each Gi are graphs, and each <ui> is an IRI. Each <ui> is distinct.

G is called the default graph. (<ui>, Gi) are called named graphs.

There may be no named graphs.

Definition: Target Graph

One graph of the dataset is the target graph. A graph pattern P, where P is not an RDF Dataset Graph Pattern, matches an RDF dataset DS with solution S if P matches the target graph with solution S.

The initial target graph is the default graph.

Definition: RDF Dataset Graph Pattern

If D is a dataset {G, (<u1>, G1), ...}, and P is a graph pattern then S is a pattern solution of RDF Dataset Graph Pattern GRAPH(g, P) if either of:

1. g is an IRI where g = <ui> for some i, and S is pattern solution of P on target graph Gi.
2. g is a variable, S maps the variable g to <ui> and S is a pattern solution of P on target graph Gi

Definition: Solution Sequence

A solution sequence S is a list of solutions.

S = ( S1, S2, . . . , Sn)

The solution sequence from matching the query pattern is an unordered collection formed from the solutions of the query pattern.

Definition: Solution Sequence Modifier

A solution sequence modifier is one of:

If SM is set of modifiers, and QS is the collection of solutions of a query, we write SM(QS) for the sequence formed by applying SM to the solution sequence formed from QS.

Definition: Result Forms

The result form of a query is one of

Definition: Projection

The projection of solution QS over a set of variables VS is the solution
project(QS, VS) = { (v, QS(v)) | v in VS }

For a solution sequence S = ( S1, S2, . . . , Sn) and a finite set of variables VS,
project(S, VS) = { (project(Si, VS) | i = 1,2, . . . n }

Definition: Distinct Solution Sequence

A Distinct Solution Sequence is a solution sequence in which no two solutions the same.

Definition: Ordered Solution Sequence

A ordered solution sequence is a solution sequence where the sequence is partially ordered with respect to some ordering condition.

A solution sequence S = ( S1, S2, . . . , Sn) is ordered with respect to an ordering condition C if, for Si, SJ, then i < j if C orders Si before Sj.

Definition: Limited Solution Sequence

A Limited Solution Sequence has at most a given, fixed number of members.

The limit of solution sequence S = (S1, S2, . . . , Sn) is

limit(S,m) =
(S1, S2, . . . , Sm) if n > m
(S1, S2, . . . , Sn) if n <= m

Definition: Offset Solution Sequence

An Offset Solution Sequence with respect to another solution sequence S, is one which starts at a given index of S.

For solution sequence S = (S1, S2, . . . , Sn), the offset solution sequence
offset(S, k), k >= 0  is
(Sk, Sk+1, . . ., Sn) if n >= k
(), the empty sequence, if k > n

Definition: SELECT

Given Q = (GP, DS, SM, SELECT VS) where

then, if QS is the set of solutions formed by matching dataset DS with graph pattern GP, the SELECT result is project(SM(QS), VS)

Definition: Graph Template

A graph template is a set of triple patterns.

If T = { tj | j = 1,2 ... m } is a graph template and S is a pattern solution then SC(S, tj) is a set of one RDF triple S(tj) if all variables in tj are in the domain of S. SC(S, tj) is the empty set otherwise.

Write SC(S, T) for the union of SC(S, tj).

Definition: CONSTRUCT

Let Q = (GP, DS, SM, CONSTRUCT T) where

then, if QS is the set of solutions formed by matching dataset DS with graph pattern GP, then write SM(QS) = { Si | i = 1,2 ... n }.

Let Ti be a sequence of sets of triple patterns, such that each Ti is basic graph pattern equivalent to T and no Ti have a blank node in common and no Ti has a blank node in common with any blank node in SM(QS).

Let Ri be the RDF graph formed from SC(Si, Ti).

The CONSTRUCT result is the RDF graph formed by the union of Ri.

Definition: DESCRIBE

Let Q = (GP, DS, SM, DESCRIBE V) where

then, if QS is the set of solutions formed by matching dataset DS with graph pattern GP, the DESCRIBE result is an RDF graph formed by information relating elements of
distinct(U union project(SM(QS), VS)).

This definition intentionally does not prescribe the nature of the relevant information.