This document has no status as a Data Access Working Group document whatsoever. It is merely workspace while developing the text for rq24.
Background
The SPARQL Algebra defines the semantics of a SPARQL query execution. The algebraic expression is formed from the query string by first parsing, and then applying transformations as described below. The result of a query can then be calculated by the evaluation rules applied to the algebraic expression. This gives the correct results of query -- it does not imply the actual execution of a query must be performed in this manner.
Parsing a SPARQL query string produces an syntax tree comprised of:
The result is a SPARQL abstract query that uses the symbols:
This is not the grammar in rq24. Because of the explicit procedure to map groups to Join/LeftJoin expressions, this more direct form can be used and it more accurately relates the the conversion procedure.
[20] |
GroupGraphPattern |
::= | '{' GraphPatternElement '}' |
[21] |
GraphPatternElement |
::= | BasicGraphPattern? ( GraphPatternNotTriples '.'? GraphPatternElement )? |
[22] |
BasicGraphPattern |
::= | TriplesSameSubject ( '.' BasicGraphPattern? )? |
[23] |
GraphPatternNotTriples |
::= | OptionalGraphPattern | GroupOrUnionGraphPattern | GraphGraphPattern | Constraint |
[24] |
OptionalGraphPattern |
::= | 'OPTIONAL' GroupGraphPattern |
[25] |
GraphGraphPattern |
::= | 'GRAPH' VarOrIRIref GroupGraphPattern |
[26] |
GroupOrUnionGraphPattern |
::= | GroupGraphPattern ( 'UNION' GroupGraphPattern )* |
[27] |
Constraint |
::= | 'FILTER' ( BrackettedExpression | BuiltInCall | FunctionCall ) |
Define "GraphPattern" as one of
GroupGraphPattern
,
GroupOrUnionGraphPattern
,
GraphGraphPattern
,
OptionalGraphPattern
and
BasicGraphPattern
.
Step 1 : BasicGraphPatterns
Replace all BasicGraphPattern
elements
by BGP(list of triple patterns)
Step 2 : GroupOrUnionGraphPattern
Replace any
GroupOrUnionGraphPattern
elements:
GroupGraphPattern
,
replace with the GroupGraphPattern
.GroupGraphPatterns
,
connected with 'UNION'
terminals, then replace with a sequence of nested union operators:Step 3 : GraphGraphPattern
Map GRAPH IRIGroupGraphPattern
to Graph(IRI,GroupGraphPattern
)
Map GRAPH VarGroupGraphPattern
to Graph(var,GroupGraphPattern
)
Step 4 : GroupGraphPattern
We introduce the following symbols:
After parsing, a group pattern consists of a sequence of graph patterns and filters (Constraint
).
A group pattern is mapped into the SPARQL algebra as follows:
This is the left-associative operator assignment, mapping optional to LeftJoin, adjacency to Join, and noting that LeftJoin can take a filter from its right (fixed) subelement.
Write: "A" for an algebra expression
Map all graph patterns contained in this group to give a list, SP, of algebra expressions Let F := the conjunction of all filters appearing as elements of the GROUP expressions Let G := the empty pattern, {} for i := 0 ; i < length(SP); i++ If SP[i] is an OPTIONAL, If SP[i] is of the form OPTIONAL(Filter(F, A)) G := LeftJoin(G, A, F) else G := LeftJoin(G , A, true) Otherwise for expression SP[i], G := Join(G, SP[i]) If F is not empty: If G = empty pattern then G := Filter(F, empty pattern) If G = LeftJoin(A1, A2, true) then G := LeftJoin(A1, A2, F) If G = Join(A1, A2) then G := Filter(F, Join(A1, A2) If G = Union(A1, A2) then G := Filter(F, Union(A1, A2)) If G = Graph(x, A) then G := Filter(F, Graph(x, A)) where x is a variable or IRI. The result is G
Step 5 : Simplification
Groups of one graph pattern (not a filter) become join({}, A) and can be replaced by A
Replace join({}, A) by A
@@Join(A, {}) can also be removed.
The second form of a rewrite example is the first with empty group joins removed by step 5.
Symbols after parsing:
ToList turns a multiset into a sequence with the same elements and cardinality. There is no implied ordering to the sequence; duplicates need not be adjacent.
Slice is the combination of Offset and Limit
where Mod is any of the modifiers above.
Step 1 : ToList
Set M = ToList(Pattern)
Step 2 : ORDER BY
If the query string has any ORDER BY clauses
M = OrderBy(M, list of of order conditions)
Step 3 : Projection
M = Project(M, vars)
where vars is the set of variables mentioned in the SELECT clause or all named variables in the query if SELECT * used.
Step 4 : Distinct
If the query contains DISTINCT,
M = Distinct(M)
Step 5 : OFFSET and LIMIT
If the query mentions "OFFSET start" or "LIMIT length"
M = Slice(M, start, length)
start defaults to 0
length defaults to the (size(M)-start)
The overall abstract query is M.
@@A note about multisets: set and a cardinality function
These operators are not those of the SPARQL abstract query form. The SPARQL algebra operators of the same name are used to evaluate SPARQL abstract query nodes as described in the section "Evaluation Semantics"
Solution is a mapping.
Defn: Substitution function
A mapping μ from V to T is a partial function μ : V -> T.
The domain of μ, dom(μ) is the subset of V where μ is defined.
Write μ0 for the mapping such that dom(μ0) is the empty set.
Write Ω0 for the multiset consisting of exactly the empty mapping μ0, with cardinality 1.
Write μ(?x->t) for the substitution function mapping variable x to RDF term t : { (x, t) }
Write Ω(?x->t) for multiset consisting of exactly μ(?x->t), that is, { { (x, t) } } with cardinality 1.
Defn: Compatible Mappings
Two mappings μ1 and μ2 are compatible if, for every variable v in dom(μ1) and in dom(μ2), μ1(v) = μ2(v).
If μ1 and μ2 are compatible then μ1 set-union μ2 is also a mapping.
Write merge(μ1, μ2) for μ1 set-union μ2
Write card[Ω](μ) for the cardinality of μ in a multiset of mappings Ω.
@@Cardinality function is not defined for μ not in Ω.
@@ Define expr(μ)
Defn: Filter
Let Ω be a multiset of mappings and expr be an expression. We define:
Filter(expr, Ω) = { μ | μ in Ω and expr(μ) is an expression that has a boolean effective value of true }
card[Filter(expr, Ω)](μ) = card[Ω](μ)
Defn: Join
Let Ω1 and Ω2 be multisets of mappings. We define:
Join(Ω1, Ω2) = { merge(μ1, μ2) | μ1 in Ω1and μ2 in Ω2, and μ1 and μ2 are compatible }
card[Join(Ω1, Ω2)](μ) = sum over μ in (Ω1 set-union Ω2), card[Ω1](μ1)*card[Ω2](μ2)
Defn: Diff
Let Ω1 and Ω2 be multisets of mappings. We define:
Diff(Ω1, Ω2, expr) =
{ μ | μ in Ω1 such that for all
μ′ in Ω2, μ and μ′ are
not compatible }
set-union
{ μ | μ in Ω1 such that for all
μ′ in Ω2, μ and μ' are
compatible and expr(merge(μ, μ')) is false }
card[Diff(Ω1, Ω2, expr)](μ) = card[Ω1](μ)
Defn: LeftJoin
Let Ω1 and Ω2 be multisets of mappings and F a filter. We define:
LeftJoin(Ω1, Ω2, expr) = filter(expr, Join(Ω1, Ω2)) set-union diff(Ω1, Ω2, expr)
card[LeftJoin(Ω1, Ω2, expr)](μ) = card[filter(expr, Join(Ω1, Ω2))](μ) + card[diff(Ω1, Ω2, expr)](μ)
Written in full that is:
LeftJoin(Ω1, Ω2, expr)
=
{ merge(μ1, μ2) |
μ1 in Ω1and μ2 in
Ω2, and μ1 and μ2
are compatible and expr(merge(μ1,
μ2)) is true }
set-union
{ μ1 | μ1 in
Ω1and μ2 in Ω2,
and μ1 and μ2 are not compatible
}
set-union
{ μ1 | μ1 in
Ω1and μ2 in Ω2,
and μ1 and μ2 are compatible and
expr(merge(μ1, μ2)) is false }
As these are distinct, the cardinality of LeftJoin is cardinality of these individual components of the definition.
Defn: Union
Let Ω1 and Ω2 be multisets of mappings. We define:
Union(Ω1, Ω2) = { μ | μ in Ω1 or μ in Ω2 }
card[Union(Ω1, Ω2)](μ) = card[Ω1](μ) + card[Ω2](μ)
Write [x | C] for a sequence of elements satisfying some condition C
Write card[L](x) to be the cardinality of x in L
Let Ω be a multiset of mappings. We define:
ToList(Ω) = a sequence of mappings μ in Ω in any order, with card[Ω](μ) occurrences of μ
ToList(Ω) = [ μ | μ in Ω] ; card[ToList(Ω)](μ) = card[Ω](μ)
Let Ψ be a sequence of mappings. We define:
OrderBy(Ψ, condition) = [ μ | μ in Ψ and the sequence satisfies the ordering condition]
card[OrderBy(Ψ, condition)](μ) = card[Ψ](μ)
Let Ψ be a sequence of mappings. We define:
Distinct(Ψ) = [ μ | μ in Ψ] ; card[Distinct(Ψ)](μ) = 1 ;
The order of Distinct(Ψ) must preserve any ordering given by OrderBy.
Let Ψ be a sequence of mappings and V a set of variables.
For mapping μ, write P(μ, V) to be the restriction of μ to variables in V.
Project(Ψ, V)[i] = P(Ψ[i], V)
Let Ψ be a sequence of mappings. We define:
Slice(Ψ, start, length)[i] = Ψ[start+i] for i = 0 to (length-1)
We define eval(D(G), graph pattern) as the evaluation of a graph pattern with respect to a dataset D having active graph G. The active graph is initially the default graph. The active graph is used to match the pattern unless otherwise stated.
@@Note the change from Join(pattern, pattern) (introduced parse symbols) to Join(mapping multiset, mapping multiset) etc.
D : a dataset D(G) : D a dataset with active graph G (the one patterns match against) D[i] : The graph with IRI i in dataset D D[DFT] : the default graph of D
eval(D(G), Join(P1, P2)) = Join(eval(D(G), P1), eval(D(G), P2)) eval(D(G), Join(P1, P2), F) = Filter(F, Join( eval(D(G), P1), eval(D(G), P2) ) )
eval(D(G), LeftJoin(P1, P2, F)) = LeftJoin(eval(D(G), P1), eval(D(G), P2), F)
Section 2.5
eval(D(G), Union(P1,P2)) = Union(eval(D(G), P1), eval(D(G), P2))
eval(D(G), Graph(IRI,P)) = eval(D(D[i]), P) eval(D(G), Graph(var,P)) = multiset-union over IRI i in D : join( eval(D(D[i]), P) , Ω(?v->i) )
eval(List(P), Graph(IRI,P)) = ToList(eval(D[DFT], Graph(IRI,P))
eval(Distict(L)) = Distinct(eval(L))
eval(Project(L, vars)) = Project(eval(L), vars)
eval(OrderBy(L, condition)) = OrderBy(eval(L), condition)
eval(Slice(L, start, length)) = Slice(eval(L), start, length)
$Log: rq24-algebra.html,v $ Revision 1.15 2007/01/03 15:14:36 aseaborne Corrected from Fred Zemke (28 Dec 2006) Revision 1.14 2006/12/15 18:09:07 aseaborne Changes in response to 2006OctDec/0206 as noted in 2006OctDec/0208