XSPARQL: Semantics

XSPARQL is a query language combining XQuery and SPARQL for transformations between RDF and XML. This document defines the semantics of XSPARQL.

Status of this document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications can be found in the W3C technical reports index at http://www.w3.org/TR/.

This document is a part of the XSPARQL Submission which comprises five documents:

By publishing this document, W3C acknowledges that the Submitting Members have made a formal Submission request to W3C for discussion. Publication of this document by W3C indicates no endorsement of its content by W3C, nor that W3C has, is, or will be allocating any resources to the issues addressed by it. This document is not the product of a chartered W3C group, but is published as potential input to the W3C Process. A W3C Team Comment has been published in conjunction with this Member Submission. Publication of acknowledged Member Submissions at the W3C site is one of the benefits of W3C Membership. Please consult the requirements associated with Member Submissions of section 3.3 of the W3C Patent Policy. Please consult the complete list of acknowledged W3C Member Submissions.

1. Introduction

This document defines the semantics of XSPARQL based on the XQuery semantics [XQUERYSEMANTICS]. Each pragma-free XQuery is an XSPARQL query and has the same result under both XQuery and XSPARQL semantics.

2. XSPARQL Semantics

The semantics of XSPARQL follows the formal treatment of the XQuery semantics [XQUERYSEMANTICS]. We adopt the notation provided there and define XSPARQL semantics by means of respective normalization mapping rules and inference rules.

We define dynamic evaluation inference rules for a new built-in function fs:sparql which evaluates SPARQL queries according to the SPARQL semantics, cf. [SPARQL]. Other modifications include normalization of the XSPARQL constructs to XQuery expressions. This means that we do do not need new grammar productions but only those defined in the XQuery Core syntax.

2.1. FLWOR' Expressions

The XSPARQL syntax [XSPARQLLANGUAGE] defines, together with the XQuery FLWOR expression, a new for-loop for iterating over SPARQL results: SparqlForClause. This object stands at the same level as XQuery's for and let expressions, i.e., such type of clauses are allowed to start new FLWOR' expressions, or may occur inside nested XSPARQL queries.

To this end, our new normalization mapping rules [·]_Expr' inherit from the definitions of XQuery's [·]_Expr mapping rules and overload some expressions to accommodate XSPARQL's new syntactic objects. The semantics of XSPARQL expressions hence stands on top of XQuery's semantics.

Here, [·]_SparqlQuery and [·]_SparqlResult are auxiliary mapping rules for expanding the given expressions:

We now define the meaning of fs:sparql. It is, following the style of [XQUERYSEMANTICS], an abstract function which returns a SPARQL query result XML document [SPARQLPROTOCOL] expected as the result of a SPARQL select query. I.e., the result of fs:sparql conforms to the XML Schema definition http://www.w3.org/2007/SPARQL/result.xsd. We further assume that the _sparql_result: namespace is declared implicitly in each XSPARQL query and tied to the namespace URI http://www.w3.org/2007/SPARQL/result#.

Firstly, fs:sparql implicitly expands PrefixedNames within the SPARQL query string it is given, according to the namespace declarations in the prolog of XSPARQL queries.

Static typing rules apply here according to the rules given in the XQuery semantics.

The fs:serialize function behaves like the fn:concat function on arguments of xs:anyAtomicType, cf. [XPATHFUNCT], and serializes arguments bound to XML nodes to xs:string by converting them to a corresponding string in the lexical space of rdf:XMLLiteral as defined in [RDFCONCEPTS]. This ensures that the structure of the XML literals is preserved in the output, while the conversion to xs:string of fn:concat only retains the text data of the XML Literal.

Since this function must be evaluated according to the SPARQL semantics, we need to get the value of fs:sparql in the dynamic evaluation semantics of XSPARQL.

In case of error (for instance, the query string is not syntactically correct, or the DatasetClause cannot be accessed), fs:sparql issues an error:

The only remaining part is defining the semantics of a GroupGraphPattern using our extended [·]_Expr'. This mapping rule takes care that variables in scope of XSPARQL expressions are properly substituted using the evaluation mechanism of XQuery. To this end, we assume that [·]_Expr' takes expressions in SPARQL's GroupGraphPattern syntax and constructs a sequence of strings and variables, by applying the auxiliary mapping rule [·]_VarSubst to each of the graph pattern's variables. This rule looks up bound variables from the static environment and possibly replaces them to variables or to a string expression, where the value of the string is the name of the variable. This has the effect that unbound variables in GroupGraphPattern will be evaluated by SPARQL instead of XQuery. The statical semantics for [·]_VarSubst is defined below using the next inference rules. They use the new judgement $VarName bound, which holds if the variable $VarName is bound in the current static environment.

Next, we define the normalization of for expressions. In order to handle blank nodes appropriately in construct expressions, we need to decorate the variables of standard XQuery for-expressions with position variables. First, we must normalize for-expressions to core for-loops:

Now we can apply our decoration of the core for-loops (without position variables) recursively:

We do not specify where and order by clauses here, as they can be handled similarly as above let and for expressions.

2.2. CONSTRUCT Expressions

We define now the semantics for the ReturnClause. Expressions of form return Expr are evaluated as defined in the XQuery semantics. Stand-alone construct-clauses are normalized as follows:

The auxiliary mapping rule [·]_{SubjPredObjlist} rewrites variables and blank nodes inside of ConstructTemplates using the normalization mapping rules [·]_Subject, [·]_PredObjList, and [·]_ObjList. They use the judgements expr is valid subject, valid predicate, and valid object, which holds if the expression expr is, according to the RDF specification [RDFCONCEPTS], a valid subject, predicate, and object, resp: i.e., subjects must be bound and not literals, predicates, must be bound, not literals and not blank nodes, and objects must be bound. If, for any reason, one criterion fails, the triple containing the ill-formed expression will be removed from the output. Free variables in the construct are unbound, hence triples containing such variables must be removed too. The boundness condition can be checked at runtime by wrapping each variable and FLWOR' into a fn:empty() assertion, which removes the corresponding triple from the ConstructTemplate output. Next, we sketch only some of the normalization rules; the missing rules should be clear from the context:

Otherwise, if one of the premises is not true, we suppress the generation of this triple. One of the negated rules is the following:

The normalization for subjects, verbs, and objects according to [·]_Expr' is similar to GroupGraphPattern: all variables in it will be replaced using [·]_VarSubst.

Blank nodes inside of construction templates must be treated carefully by adding position variables from surrounding for expressions. To this end, we use [·]_BNodeSubst. Since we normalize every for-loop by attaching position variables, we just need to retrieve the available position variables from the static environment. We assume a new static environment component statEnv.posVars which holds - similar to the statEnv.varType component - all in-context positional variables in the given static environment, that is, the variables defined in the at clause of any enclosing for loop.

2.3. SPARQL Filter Operators

SPARQL filter expressions in WHERE GroupGraphPattern are evaluated using fs:sparql. But we additionally allow the following functions inherited from SPARQL in XSPARQL:

3. Correspondence between XSPARQL and XQuery

XSPARQL syntactically subsumes XQuery and - taking into account the shortcut notations defined in Section 4 of [XSPARQLLANGUAGE] - SPARQL construct queries. Concerning semantics, XSPARQL equally builds on top of its constituent languages. As shown above, we have extended the formal semantics of XQuery from [XQUERYSEMANTICS] by additional reduction rules which reduce each XSPARQL query to XQuery expressions which operate on results of SPARQL queries in the SPARQL's XML result format [SPARQLPROTOCOL].

Since we add only new reduction rules for SPARQL-like heads and bodies, it is easy to see that each native XQuery is treated in a semantically equivalent way in XSPARQL. The only thing affecting native XSPARQL queries are the "decoration" rules, which however, do not affect the final query result

Proposition 1. Each pragma-free XQuery is an XSPARQL query and has the same result under both XQuery and XSPARQL semantics

Proof (Sketch). As easily seen, given an XSPARQL query falling in the XQuery fragment, the result of our normalization is again an XQuery. Note that, however, even this fragment, our additional rewriting rules do change the original query in some cases. More concretely, what happens is that by our "decoration" rule each position-variable free for loop (i.e., that does not have an at clause) is decorated with a new position variable. As these new position variables start with an underscore they cannot occur in the original query, so this rewriting does not interfere with the semantics of the original query. The only rewriting rules which use the newly created position variables are those for rewriting blank nodes in construct parts, i.e., the [·]_BNodeSubst rule. However, this rule only applies to XSPARQL queries which fall outside the native XQuery fragment.

A similar correspondence holds for native SPARQL queries. Let us now sketch the proof showing the equivalence of XSPARQL's semantics and the evaluation of rewritten SPARQL queries into native XQuery. Intuitively, we "inherit" the SPARQL semantics from the fs:sparql "oracle".

Let Ω denote a solution sequence of a an abstract SPARQL query q =(E, DS, R) where E is a SPARQL algebra expression, DS is an RDF Dataset and R is a set of variables called the query form (cf. [SPARQL]). Then, by SPARQLResult(Ω) we denote the SPARQL result XML format representation of Ω.

We are now ready to state some properties about our transformations. The following proposition states that any SPARQL select query can be equivalently viewed as an XSPARQL F'DWMR query.

Proposition 2. Let q = (E_WM,DS,$x₁,...,$x_n) be a SPARQL query of the form select $x₁,...,$x_n DWM, where we denote by DS the RDF dataset (cf. [SPARQL]) corresponding to the DatasetClause (D), by G the respective default graph of DS, and by E_WM the SPARQL algebra expression corresponding to WM and P be the pattern defined in the where part (W). If eval(DS(G), q) = Ω₁, and

Then, Ω₁ ≡ Ω₂ modulo representation. Here, by equivalence (≡) modulo representation we mean that both Ω₁ and Ω₂ represent the same sequences of (partial) variable bindings.

[·]_SparqlQuery builds q as string without replacing any variable, since all variables in P are free. Then, the resulting string is applied to fs:sparql, which - since q was unchanged - by definition returns exactly SPARQLResult(Ω₁), and thus the return part return ($x₁, ..., $x_n) which extracts Ω₂ is obviously just a representational variant of Ω₁.

By similar arguments, we can see that SPARQL's construct queries are treated semantically equivalent in XSPARQL and in SPARQL, taking into account the shortcut notations defined in Section 4 of [XSPARQLLANGUAGE]. The idea here is that the rewriting rules constructs from Section 2.2 extract exactly the triples from the solution sequence from the body defined as defined in the SPARQL semantics [SPARQL].

4. References

[RDFCONCEPTS]: Graham Klyne and Jeremy Carroll (eds.). Resource Description Framework (RDF): Concepts and Abstract Syntax, February 2004. W3C Recommendation, available at http://www.w3.org/TR/rdf-concepts/.
[SPARQL]: Eric Prud'hommeaux and Andy Seaborne (eds.). SPARQL Query Language for RDF, 15 January 2008. W3C Recommendation, available at http://www.w3.org/TR/rdf-sparql-query/.
[SPARQLPROTOCOL]: Kendall Grant Clark, Lee Feigenbaum, and Elias Torres. SPARQL Protocol for RDF, 15 January 2008. W3C Recommendation, available at http://www.w3.org/TR/rdf-sparql-protocol/.
[XPATHFUNCT]: Ashok Malhotra, Jim Melton, and Norman Walsh (eds.). XQuery 1.0 and XPath 2.0 Functions and Operators. W3C Recommendation, 23 January 2007. Available at http://www.w3.org/TR/xpath-functions/.
[XQUERYSEMANTICS]: Denise Draper, Peter Fankhauser, Mary Fernández, Ashok Malhotra, Kristoffer Rose, Michael Rys, Jérôme Siméon, and Philip Wadler. XQuery 1.0 and XPath 2.0 Formal Semantics. W3c recommendation, W3C, January 2007. W3C Recommendation, available at http://www.w3.org/TR/xquery-semantics/.
[XSPARQLLANGUAGE]: XSPARQL Language Specification. Document included in the present W3C Member Submission.