Copyright © 2009 by DERI Galway at the National University of Ireland, Galway, Ireland.
This work is supported by Science Foundation Ireland under grants number SFI/02/CE1/I131 and SFI/08/CE/I1380 and under the European Commission European FP6 project inContext (IST-034718).
This document is available under the W3C Document License. See the W3C Intellectual Rights Notice and Legal Disclaimers for additional information.
XSPARQL is a query language combining XQuery and SPARQL for transformations between RDF and XML. This document provides a description of a prototype implementation of the language based on off-the-shelf XQuery and SPARQL engines. Along with a high-level description of the prototype the document presents a set of test queries and their expected output which are to be understood as illustrative help for possible other implementers.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications can be found in the W3C technical reports index at http://www.w3.org/TR/.
This document is a part of the XSPARQL Submission which comprises five documents:
By publishing this document, W3C acknowledges that the Submitting Members have made a formal Submission request to W3C for discussion. Publication of this document by W3C indicates no endorsement of its content by W3C, nor that W3C has, is, or will be allocating any resources to the issues addressed by it. This document is not the product of a chartered W3C group, but is published as potential input to the W3C Process. A W3C Team Comment has been published in conjunction with this Member Submission. Publication of acknowledged Member Submissions at the W3C site is one of the benefits of W3C Membership. Please consult the requirements associated with Member Submissions of section 3.3 of the W3C Patent Policy. Please consult the complete list of acknowledged W3C Member Submissions.
This document introduces a simple implementation of the XSPARQL language as a proof of concept. The implementation described here can be found in the XSPARQLer Open Source project page at SourceForge.net.
The main idea behind our implementation is translating XSPARQL queries to corresponding XQueries which possibly use interleaved calls to a SPARQL endpoint. The architecture of our prototype shown in Figure 1 consists of three main components: (1) a query rewriter, which turns an XSPARQL query into an XQuery; (2) a SPARQL endpoint, for querying RDF from within the rewritten XQuery; and (3) an XQuery engine for computing the result document.
The rewriter (Algorithm 1) takes as input a full XSPARQL QueryBody [XQUERYSEMANTICS] q (i.e., a sequence of FLWOR' expressions), a set of bound variables b and a set of position variables p, which we explain below. For a FL (or F', resp.) clause s, we denote by vars(s) the list of all newly declared variables (or the varlist, resp.) of s. We only sketch the core rewriting function rewrite() here; the full implementation needs additional machinery to handle the prolog, including function, variable, module, and namespace declarations. The rewriting is initiated by invoking rewrite(q, ∅, ∅) with empty sets of bound and position variables, and results in a syntactically valid XQuery that can be executed using an off-the-shelf XQuery implementation.
Input: XSPARQL query q, set of bound variables b, set of position variables p
Result: XQuery
 1  if q is of form s1, ... , sk then
 2      return rewrite(s1, b, p), ... , rewrite(sk, b, p)
 3  else if q is of form for $x1 in XPathExpr1, ... , $xk in XPathExprk s1 then
 4      return for $x1 at $x1_pos in XPathExpr1, ... , $xk at $xk_pos in XPathExprk
 5             rewrite(s1, b, p ∪ {$x1_pos, ... , $xk_pos})
 6  else if q is of form for $x1 ... $xn from D where { pattern } M s1 then
 7      return let $aux_query := sparql(D, {$x1, ... , $xn}, pattern, M, b)
 8             for $aux_result in doc($aux_query)//sparql:result
 9             auxvars({$x1, ... , $xn}) rewrite(s1, b ∪ vars(q), p)
10  else if q is of form construct {template} then
11      return return ( rewrite-template(template, b, p) )
12  else
13      split q into its subexpressions s1, ... , sn
14      for j := 1, ... , n do bj = b ∪ ⋃(1 ≤ i ≤ j-1) vars(si)
15      if n > 1 then return q[s1/rewrite(s1, b1, p), ... , sn/rewrite(sn, bn, p)]
16      else return q
17  end
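The plain-for case of Algorithm 1 (lines 3-5) can be illustrated with a minimal Python sketch. It is not the XSPARQLer code: the ForClause class, the string-based query representation and the $x_pos naming (taken from the algorithm, not from the generated output) are assumptions made for illustration, and every other case of the algorithm is simply copied through.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class ForClause:
    """Hypothetical stand-in for a plain XQuery for clause with a body."""
    bindings: List[Tuple[str, str]]      # (variable, XPath expression) pairs
    body: Union["ForClause", str]        # nested clause or verbatim text


def rewrite(q, bound, positions):
    """Return rewritten XQuery text; only the plain-for case is covered."""
    if isinstance(q, ForClause):
        decorated = []
        inner_positions = set(positions)
        for var, expr in q.bindings:
            pos_var = var + "_pos"       # fresh position variable, e.g. $x_pos
            decorated.append(f"{var} at {pos_var} in {expr}")
            inner_positions.add(pos_var)
        # The body is rewritten with the enlarged set of position variables.
        body = rewrite(q.body, bound, inner_positions)
        return "for " + ", ".join(decorated) + " " + body
    # Anything else is copied to the output unchanged in this sketch.
    return q


query = ForClause(
    [("$person", 'doc("relations.xml")//person'), ("$nameA", "$person/@name")],
    "return data($nameA)",
)
print(rewrite(query, set(), set()))
```

The position variables collected here are what a later construct template would use to make blank node identifiers unique per iteration.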
The rewriter is implemented as a Python script which is part of the XSPARQLer Open Source distribution. We provide an online interface (informative and not part of the submission) where example queries can be found and tested at http://xsparql.deri.org/. Figure 3 shows the output of our translation for the construct query in Figure 2. Let us explain the algorithm, which may be viewed as consisting of two parts, responsible for lifting and lowering (cf. Section 2 of [XSPARQLLANGUAGE]), respectively, using this sample output.
prefix vc: <http://www.w3.org/2001/vcard-rdf/3.0#>
prefix foaf: <http://xmlns.com/foaf/0.1/>
construct { _:b foaf:name { fn:concat($N," ",$F) } . }
from <vc.rdf>
where { $P vc:Given $N. $P vc:Family $F. }
 1  import module namespace _xsparql = "http://xsparql.deri.org/XSPARQLer/xsparql.xquery"
 2    at "http://xsparql.deri.org/XSPARQLer/xsparql.xquery";
 3  declare namespace vc = "http://www.w3.org/2001/vcard-rdf/3.0#";
 4  declare namespace foaf = "http://xmlns.com/foaf/0.1/";
 5  declare namespace _sparql_result = "http://www.w3.org/2005/sparql-results#";
 6  declare variable $_NS1 := "prefix vc: <http://www.w3.org/2001/vcard-rdf/3.0#> ";
 7  declare variable $_NS2 := "prefix foaf: <http://xmlns.com/foaf/0.1/> ";
 8  _xsparql:_serialize(("@",$_NS1,".","@",$_NS2,".")),
 9  let $_aux1 := _xsparql:_serialize((
10    "http://example.org/sparql?query=", fn:encode-for-uri(_xsparql:_serialize(($_NS1, $_NS2,
11    "select $P $N $F from <vc.rdf> where {$P vc:Given $N. $P vc:Family $F.}")))))
12  for $_aux_result1 at $_aux_result1_Pos in doc($_aux1)//_sparql_result:result
13  let $_P_Node := $_aux_result1/_sparql_result:binding[@name="P"]
14  let $P := data($_P_Node/*)
15  let $_P_NodeType := name($_P_Node/*)
16  let $_P_NodeDatatype := string($_P_Node/*/@datatype)
17  let $_P_NodeLang := string($_P_Node/*/@lang)
18  let $_P_RDFTerm := _xsparql:_rdf_term($_P_NodeType,$P)
19  let $_N_Node := $_aux_result1/_sparql_result:binding[@name="N"]
20  let $N := data($_N_Node/*)
21  let $_N_NodeType := name($_N_Node/*)
22  let $_N_NodeDatatype := string($_N_Node/*/@datatype)
23  let $_N_NodeLang := string($_N_Node/*/@lang)
25  let $_N_RDFTerm := _xsparql:_rdf_term($_N_NodeType,$N)
26  let $_F_Node := $_aux_result1/_sparql_result:binding[@name="F"]
27  let $F := data($_F_Node/*)
28  let $_F_NodeType := name($_F_Node/*)
29  let $_F_NodeDatatype := string($_F_Node/*/@datatype)
30  let $_F_NodeLang := string($_F_Node/*/@lang)
31  let $_F_RDFTerm := _xsparql:_rdf_term($_F_NodeType,$F)
32  let $_validSubject1 := _xsparql:_serialize(("_:b", "_", data($_aux_result1_Pos)))
33  let $_validObject2 := _xsparql:_serialize(('"', fn:concat($N , " " , $F ) , '"'))
34  return if (_xsparql:_validSubject("", $_validSubject1)) then (
35    if (_xsparql:_validObject("", $_validObject2)) then (
36      _xsparql:_serialize(($_validSubject1, " foaf:name ",
37        $_validObject2, " ." )) ) else "") else ""
Before we rewrite the QueryBody q, we process the prolog (P) of the XSPARQL query and output every namespace declaration as Turtle string literals "@prefix ns: <URI>." After generating the prolog (lines 1-8 of the output), the rewriting of the QueryBody is performed recursively following the syntax of XSPARQL. During the traversal of the nested FLWOR' expressions, SPARQL-like bodies (lowering) or heads (lifting) will be replaced by XQuery expressions, which handle our two tasks. The lowering part is processed first:
Lowering. Normal XQuery-like FLWO expressions are simply copied to the output and "decorated" (cf. [XSPARQLSEMANTICS]) with position variables; see lines 3-5 of Algorithm 1. The lowering part of XSPARQL, i.e., SPARQL-like F'DWM blocks, is "encoded" in XQuery with interleaved calls to an external SPARQL endpoint. To this end, we translate F'DWM blocks into equivalent XQuery FLWO expressions which retrieve SPARQL result XML documents [SPARQLRESULT] from a SPARQL engine; i.e., we "push" each F'DWM body to the SPARQL side by translating it to a native select query string; see lines 6-9 of Algorithm 1. The auxiliary function sparql() in line 7 of our rewriter transforms the where { pattern } part of F'DWM clauses into XQuery expressions in which all bound variables in pattern are replaced by their values; "free" XSPARQL variables serve as binding variables for the SPARQL query result. The outcome of the sparql() function is a list of expressions, which is concatenated and URI-encoded using XQuery's XPath functions, and wrapped into a URI with http scheme pointing to the SPARQL query service (lines 9-11 of the output), cf. [SPARQLPROTOCOL]. Then we create a new XQuery for-loop that iterates over variable $aux_result, i.e., over the query answers extracted from the SPARQL XML result returned by the SPARQL query processor (line 12 of the output). For each variable $xi ∈ vars(s) (i.e., in the (F') for clause of the original F'DWM body), new auxiliary variables are defined in separate let-expressions extracting its node, content, type (i.e., literal, uri, or blank), datatype URI or language tag if present, and the corresponding RDFTerm ($xi_Node, $xi, $xi_NodeType, $xi_NodeDatatype, $xi_NodeLang and $xi_RDFTerm, resp.) by appropriate XPath expressions (lines 13-31 of Figure 3); the auxvars() helper function in line 9 of Algorithm 1 is responsible for this.
Thereafter, the rewriter is called again recursively, with the newly declared variables added to b.
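The two lowering steps described above, wrapping a select query into an endpoint URI and extracting per-variable bindings from the SPARQL XML result, can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names endpoint_url and bindings and the sample result document are made up, the endpoint is never contacted, and the sketch reads the xml:lang attribute defined by the SPARQL result format, whereas the generated XQuery shown in this document reads @lang.

```python
from urllib.parse import quote
import xml.etree.ElementTree as ET

SPARQL_NS = "http://www.w3.org/2005/sparql-results#"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def endpoint_url(endpoint, prefixes, select_query):
    """Concatenate prefixes and query, URI-encode them, and append to the endpoint."""
    query = " ".join(prefixes + [select_query])
    return endpoint + "?query=" + quote(query, safe="")


def bindings(result_xml, variables):
    """Return one dict per result, mirroring the auxiliary let-variables."""
    root = ET.fromstring(result_xml)
    rows = []
    for result in root.iter(f"{{{SPARQL_NS}}}result"):
        row = {}
        for var in variables:
            node = result.find(f"{{{SPARQL_NS}}}binding[@name='{var}']")
            if node is None or len(node) == 0:
                continue                              # variable unbound in this result
            term = node[0]                            # uri, literal or bnode element
            row[var] = {
                "value": term.text or "",             # like $x
                "type": term.tag.split("}")[1],       # like $x_NodeType
                "datatype": term.get("datatype", ""), # like $x_NodeDatatype
                "lang": term.get(XML_LANG, ""),       # like $x_NodeLang
            }
        rows.append(row)
    return rows


# Made-up sample result document, used instead of a live endpoint.
SAMPLE = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="N"/><variable name="F"/></head>
  <results>
    <result>
      <binding name="N"><literal>Axel</literal></binding>
      <binding name="F"><literal>Polleres</literal></binding>
    </result>
  </results>
</sparql>"""

print(endpoint_url("http://example.org/sparql",
                   ["prefix foaf: <http://xmlns.com/foaf/0.1/>"],
                   "select $x where {$s foaf:knows $x}"))
print(bindings(SAMPLE, ["N", "F"]))
```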
Lifting. For the lifting part, i.e., SPARQL-like constructs in the R part, the transformation process is straightforward: Algorithm 1 is called on q and recursively decorates every for $Var expression with fresh position variables (line 12 of our example output); ultimately, construct templates are rewritten to an assembled string of the pattern's constituents, filling in variable bindings and evaluated subexpressions (lines 32-37 of the output). Blank nodes in constructs need special care, since, according to SPARQL's semantics, these must create new blank node identifiers for each solution binding. This is solved by "adorning" each blank node identifier in the construct part with the above-mentioned position variables from any enclosing for-loops, thus creating a new, unique blank node identifier in each loop (line 32 in the output). The auxiliary function rewrite-template() in line 11 of the algorithm provides this functionality by simply appending the list of all position variables p, as expressions, to each blank node id; if there are nested expressions in the supplied construct {template}, rewrite-template() will return a sequence of nested FLWORs, with rewrite() applied recursively to each of these expressions with the in-scope bound and position variables. rewrite-template() will create new variables for each of the RDFTerms that need to be dynamically evaluated for validity (variables $_validSubject1 and $_validObject2 in lines 32 and 33 of the output). Finally, rewrite-template() generates a return clause which checks the validity of the generated triples in Turtle syntax by respective helper function calls (validSubject(), validPredicate(), validObject(), which are declared, along with all other helper functions, in the http://xsparql.deri.org/XSPARQLer/xsparql.xquery library); see lines 34-37 of the output.
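The blank node "adornment" can be illustrated with a small Python sketch. The function names adorn_blank_node and construct_names are hypothetical, the input rows stand in for SPARQL solutions, and the validity checks performed by the helper library are deliberately left out.

```python
def adorn_blank_node(label, positions):
    """Append the positions of all enclosing loops to a blank node label."""
    return "_:" + "_".join([label] + [str(p) for p in positions])


def construct_names(rows):
    """Emit one Turtle triple per (given, family) solution, as the construct template would."""
    triples = []
    for pos, (given, family) in enumerate(rows, start=1):   # pos plays the role of the position variable
        subject = adorn_blank_node("b", [pos])               # fresh identifier per solution
        obj = '"' + given + " " + family + '"'               # like fn:concat($N, " ", $F)
        triples.append(f"{subject} foaf:name {obj} .")
    return triples


for triple in construct_names([("Axel", "Polleres")]):
    print(triple)
```

Because the position differs in every iteration, two solutions never share a blank node identifier, which is the behaviour SPARQL's construct semantics requires.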
Note that expressions involving SPARQL-like construct clauses create Turtle [TURTLE] output. If needed, other output formats such as RDF/XML can be generated in our implementation by post-processing the Turtle output with standard RDF processing tools.
Finally, let us remark that although both our implementation and XSPARQL's semantics definition [XQUERYSEMANTICS] use the SPARQL result format [SPARQLRESULT] for retrieving the results of a SPARQL query, other implementations can be conceived that do not require this intermediate step, but either use optimized internal data structures to pass results between native SPARQL and XQuery processors, or implement a completely native, integrated processor.
In the following, a set of XSPARQL test queries is presented. All these queries, along with the necessary input data, are also available in the Examples, Test cases and Use cases file which is part of the present submission. Along with the original queries we also present the rewriting performed by our implementation as well as the expected query results.
This example query generates FOAF data from attribute values and element content in an input XML document, extracted by simple XPath expressions. It is intended to demonstrate a simple lifting transformation from XML to RDF.
declare namespace foaf = "http://xmlns.com/foaf/0.1/";
for $person in doc("relations.xml")//person,
    $nameA in $person/@name,
    $nameB in $person/knows
construct {
  [ foaf:name {data($nameA)}; a foaf:Person ]
    foaf:knows
  [ foaf:name {data($nameB)}; a foaf:Person ].
}

import module namespace _xsparql = "http://xsparql.deri.org/XSPARQLer/xsparql.xquery"
  at "http://xsparql.deri.org/XSPARQLer/xsparql.xquery";
declare namespace _sparql_result = "http://www.w3.org/2005/sparql-results#";
declare namespace foaf = "http://xmlns.com/foaf/0.1/" ;
declare variable $_NS1 := "prefix foaf: <http://xmlns.com/foaf/0.1/>";
_xsparql:_serialize((" @", $_NS1, ".", "")),
for $person at $_person_Pos in doc("relations.xml")//person ,
    $nameA at $_nameA_Pos in $person/@name ,
    $nameB at $_nameB_Pos in $person/knows
let $_validObject1 := _xsparql:_serialize(('"', data($nameA), '"'))
let $_validObject2 := _xsparql:_serialize(('"', data($nameB), '"'))
return (_xsparql:_removeEmpty(_xsparql:_serialize((
  "[",
  if ( _xsparql:_validObject("", $_validObject1)) then
    (_xsparql:_serialize(" foaf:name ", $_validObject1, ";")) else "",
  _xsparql:_serialize(" a ", 'foaf:Person', ";"),
  _xsparql:_serialize("foaf:knows ", "[",
    if ( _xsparql:_validObject( "", $_validObject2)) then
      (_xsparql:_serialize(" foaf:name ", $_validObject2, ";")) else "",
    _xsparql:_serialize(" a ", 'foaf:Person',";"),
    " ]") ,
  " ] ."))))

@prefix foaf: <http://xmlns.com/foaf/0.1/>.
[ foaf:name "Alice"; a foaf:Person;foaf:knows [ foaf:name "Bob"; a foaf:Person; ] ] .
[ foaf:name "Alice"; a foaf:Person;foaf:knows [ foaf:name "Charles"; a foaf:Person; ] ] .
[ foaf:name "Bob"; a foaf:Person;foaf:knows [ foaf:name "Charles"; a foaf:Person; ] ] .
This query performs a task very similar to the previous one: it generates FOAF data from input XML. It demonstrates a full lifting transformation from XML to RDF. The difference to the previous transformation is that the same blank node identifier is given to people with the same name (assuming that names uniquely identify people in the input XML file at hand). The blank node identifier is "computed" from the document position of the last occurrence of the name in the source XML tree, i.e., the node that is not followed by any other node carrying the same name.
declare namespace foaf="http://xmlns.com/foaf/0.1/";
let $doc := doc("relations.xml")
let $persons := $doc//*[@name or ../knows]
return
  for $p in $persons
  let $n := if( $p[@name]) then $p/@name else $p
  let $id := count($p/preceding::*) + count($p/ancestor::*)
  where not(exists($p/following::*[@name=$n or data(.)=$n]))
  construct {
    _:b{$id} a foaf:Person; foaf:name {data($n)}.
    { for $k in $persons
      let $kn := if( $k[@name]) then $k/@name else $k
      let $kid :=count($k/preceding::*) + count($k/ancestor::*)
      where $kn = data($doc//*[@name=$n]/knows) and
            not(exists($kn/../following::*[@name=$kn or data(.)=$kn]))
      construct {
        _:b{$id} foaf:knows _:b{$kid}.
        _:b{$kid} a foaf:Person.
      }
    }
  }

import module namespace _xsparql = "http://xsparql.deri.org/XSPARQLer/xsparql.xquery"
  at "http://xsparql.deri.org/XSPARQLer/xsparql.xquery";
declare namespace _sparql_result = "http://www.w3.org/2005/sparql-results#";
declare namespace foaf = "http://xmlns.com/foaf/0.1/" ;
declare variable $_NS1 := "prefix foaf: <http://xmlns.com/foaf/0.1/>";
_xsparql:_serialize((" @", $_NS1, ".", "")),
let $doc := doc("relations.xml")
let $persons := $doc//*[@name or ../knows ]
return
  for $p at $_p_Pos in $persons
  let $n := if ( $p[@name ]) then $p/@name else $p
  let $id := count($p/preceding::*)+count($p/ancestor::*)
  let $_validSubject4 := _xsparql:_serialize(("_:b", data($id)))
  let $_validObject5 := _xsparql:_serialize(( '"', data($n) , '"'))
  where not(exists($p/following::*[@name =$n or data(.) =$n ]))
  return (
    if ( _xsparql:_validSubject( "", $_validSubject4)) then (
      _xsparql:_serialize(($_validSubject4, " a ", 'foaf:Person', " .")),
      if ( _xsparql:_validObject( "", $_validObject5)) then
        (_xsparql:_serialize(($_validSubject4, " foaf:name ", $_validObject5, " .")))
      else ""
    ) else "" ,
    for $k at $_k_Pos in $persons
    let $kn := if ( $k[@name ]) then $k/@name else $k
    let $kid := count($k/preceding::*)+count($k/ancestor::*)
    let $_validSubject1 := _xsparql:_serialize(("_:b", data($id)))
    let $_validObject2 := _xsparql:_serialize(("_:b", data($kid)))
    let $_validSubject3 := _xsparql:_serialize(("_:b", data($kid)))
    where $kn =data($doc//*[@name =$n ]/knows) and
          not(exists($kn/../following::*[@name =$kn or data(.) =$kn ]))
    return (
      if ( _xsparql:_validSubject( "", $_validSubject1)) then (
        if ( _xsparql:_validObject( "", $_validObject2)) then
          (_xsparql:_serialize(($_validSubject1, " foaf:knows ", $_validObject2, " .")))
        else ""
      ) else "" ,
      if ( _xsparql:_validSubject( "", $_validSubject3)) then
        (_xsparql:_serialize(($_validSubject3, " a ", 'foaf:Person', " .")))
      else ""
    )
  )

@prefix foaf: <http://xmlns.com/foaf/0.1/>.
_:b1 a foaf:Person .
_:b1 foaf:name "Alice" .
_:b1 foaf:knows _:b4 .
_:b4 a foaf:Person .
_:b1 foaf:knows _:b6 .
_:b6 a foaf:Person .
_:b4 a foaf:Person .
_:b4 foaf:name "Bob" .
_:b4 foaf:knows _:b6 .
_:b6 a foaf:Person .
_:b6 a foaf:Person .
_:b6 foaf:name "Charles" .
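How the identifiers _:b1, _:b4 and _:b6 come about can be traced with a short Python sketch. The content of relations.xml is not reproduced in this document, so the XML below is an assumed input chosen to be consistent with the results shown; the function person_ids is hypothetical and only approximates the XPath conditions of the query (it treats elements with a name attribute and knows elements as name carriers).

```python
import xml.etree.ElementTree as ET

# Assumed input: relations.xml itself is not shown in this document.
RELATIONS = """<relations>
  <person name="Alice"><knows>Bob</knows><knows>Charles</knows></person>
  <person name="Bob"><knows>Charles</knows></person>
  <person name="Charles"/>
</relations>"""


def person_ids(xml_text):
    """Map each name to the document-order index of the last element carrying it."""
    root = ET.fromstring(xml_text)
    ids = {}
    # root.iter() walks elements in document order; the index of an element
    # (root = 0) equals count(preceding::*) + count(ancestor::*) in the query.
    for index, element in enumerate(root.iter()):
        if element.get("name") is not None:
            name = element.get("name")
        elif element.tag == "knows":
            name = (element.text or "").strip()
        else:
            continue
        ids[name] = index            # a later occurrence overwrites an earlier one
    return ids


print(person_ids(RELATIONS))
```

Under the assumed input this yields the positions 1, 4 and 6 for Alice, Bob and Charles, matching the blank node labels in the result above.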
This query performs a simple mapping from vCard given and family name properties into FOAF full names; it shows the use of XPath and XQuery built-in functions for manipulating RDF.
prefix vc: <http://www.w3.org/2001/vcard-rdf/3.0#>
prefix foaf: <http://xmlns.com/foaf/0.1/>
construct { _:b foaf:name {fn:concat($N," ", $F)}.}
from <vCard.rdf>
where { $P vc:Given $N. $P vc:Family $F. }

import module namespace _xsparql = "http://xsparql.deri.org/XSPARQLer/xsparql.xquery"
  at "http://xsparql.deri.org/XSPARQLer/xsparql.xquery";
declare namespace _sparql_result = "http://www.w3.org/2005/sparql-results#";
declare namespace vc = "http://www.w3.org/2001/vcard-rdf/3.0#";
declare namespace foaf = "http://xmlns.com/foaf/0.1/";
declare variable $_NS1 := "prefix vc: <http://www.w3.org/2001/vcard-rdf/3.0#>";
declare variable $_NS2 := "prefix foaf: <http://xmlns.com/foaf/0.1/>";
_xsparql:_serialize(( "@", $_NS1, ".", " @", $_NS2, ".", "")),
let $_aux1 := _xsparql:_serialize(("http://example.org/sparql?query=",
  fn:encode-for-uri( _xsparql:_serialize(( $_NS1, $_NS2,
    " select $N $P $F from <vCard.rdf> where { $P vc:Given $N . $P vc:Family $F . } ")))))
for $_aux_result1 at $_aux_result1_Pos in doc($_aux1)//_sparql_result:result
let $_N_Node := $_aux_result1/_sparql_result:binding[@name="N"]
let $N := data($_N_Node/*)
let $_N_NodeType := name($_N_Node/*)
let $_N_NodeDatatype := string($_N_Node/*/@datatype)
let $_N_NodeLang := string($_N_Node/*/@lang)
let $_N_RDFTerm := _xsparql:_rdf_term($_N_NodeType,$N)
let $_P_Node := $_aux_result1/_sparql_result:binding[@name="P"]
let $P := data($_P_Node/*)
let $_P_NodeType := name($_P_Node/*)
let $_P_NodeDatatype := string($_P_Node/*/@datatype)
let $_P_NodeLang := string($_P_Node/*/@lang)
let $_P_RDFTerm := _xsparql:_rdf_term($_P_NodeType,$P)
let $_F_Node := $_aux_result1/_sparql_result:binding[@name="F"]
let $F := data($_F_Node/*)
let $_F_NodeType := name($_F_Node/*)
let $_F_NodeDatatype := string($_F_Node/*/@datatype)
let $_F_NodeLang := string($_F_Node/*/@lang)
let $_F_RDFTerm := _xsparql:_rdf_term($_F_NodeType,$F)
let $_validSubject1 := _xsparql:_serialize(("_:b", "_", data($_aux_result1_Pos)))
let $_validObject2 := _xsparql:_serialize(( '"', fn:concat($N , " " , $F) , '"'))
return
  if ( _xsparql:_validSubject( "", $_validSubject1)) then
    (if ( _xsparql:_validObject( "", $_validObject2)) then
       (_xsparql:_serialize(($_validSubject1, " foaf:name ", $_validObject2, " .")))
     else "" )
  else ""

@prefix vc: <http://www.w3.org/2001/vcard-rdf/3.0#>.
@prefix foaf: <http://xmlns.com/foaf/0.1/>.
_:b_1 foaf:name "Axel Polleres" .
This query generates XML data from an input RDF file containing FOAF data. It demonstrates the lowering task, i.e., mapping from RDF to XML.
declare namespace foaf = "http://xmlns.com/foaf/0.1/";
<relations>
{ for $Person $Name from <relations.rdf>
  where { $Person foaf:name $Name }
  order by $Name
  return <person name="{$Name}">
    { for $FName from <relations.rdf>
      where { $Person foaf:knows $Friend.
              $Person foaf:name $Name.
              $Friend foaf:name $FName. }
      return <knows> { $FName }</knows>
    } </person>
} </relations>

import module namespace _xsparql = "http://xsparql.deri.org/XSPARQLer/xsparql.xquery"
  at "http://xsparql.deri.org/XSPARQLer/xsparql.xquery";
declare namespace _sparql_result = "http://www.w3.org/2005/sparql-results#";
declare namespace foaf = "http://xmlns.com/foaf/0.1/" ;
declare variable $_NS1 := "prefix foaf: <http://xmlns.com/foaf/0.1/>";
<relations>{
  let $_aux1 := _xsparql:_serialize(("http://example.org/sparql?query=",
    fn:encode-for-uri( _xsparql:_serialize(( $_NS1,
      " select $Person $Name from <relations.rdf> where { $Person foaf:name $Name . } order by $Name")))))
  for $_aux_result1 at $_aux_result1_Pos in doc($_aux1)//_sparql_result:result
  let $_Person_Node := ($_aux_result1/_sparql_result:binding[@name = "Person"])
  let $Person := data($_Person_Node/*)
  let $_Person_NodeType := name($_Person_Node/*)
  let $_Person_NodeDatatype := string($_Person_Node/*/@datatype)
  let $_Person_NodeLang := string($_Person_Node/*/@lang)
  let $_Person_RDFTerm := _xsparql:_rdf_term($_Person_NodeType, $Person)
  let $_Name_Node := ($_aux_result1/_sparql_result:binding[@name = "Name"])
  let $Name := data($_Name_Node/*)
  let $_Name_NodeType := name($_Name_Node/*)
  let $_Name_NodeDatatype := string($_Name_Node/*/@datatype)
  let $_Name_NodeLang := string($_Name_Node/*/@lang)
  let $_Name_RDFTerm := _xsparql:_rdf_term($_Name_NodeType, $Name)
  return <person name = "{$Name}">{
    let $_aux2 := _xsparql:_serialize(("http://example.org/sparql?query=",
      fn:encode-for-uri( _xsparql:_serialize(( $_NS1,
        "select $FName from <relations.rdf> where { ",
        $_Person_RDFTerm, " foaf:knows $Friend .",
        $_Person_RDFTerm, " foaf:name ", $_Name_RDFTerm,
        " . $Friend foaf:name $FName . }"))) ))
    for $_aux_result2 at $_aux_result2_Pos in doc($_aux2)//_sparql_result:result
    let $_FName_Node := ($_aux_result2/_sparql_result:binding[@name = "FName"])
    let $FName := data($_FName_Node/*)
    let $_FName_NodeType := name($_FName_Node/*)
    let $_FName_NodeDatatype := string($_FName_Node/*/@datatype)
    let $_FName_NodeLang := string($_FName_Node/*/@lang)
    let $_FName_RDFTerm := _xsparql:_rdf_term($_FName_NodeType, $FName)
    return <knows>{ $FName }</knows>
  }</person>
}</relations>

<relations>
  <person name="Alice">
    <knows>Charles</knows>
    <knows>Bob</knows>
  </person>
  <person name="Bob">
    <knows>Charles</knows>
  </person>
  <person name="Charles"/>
</relations>
This query selects only persons "known by somebody" in the input RDF data. All these persons are then mapped to a class whose URI is assigned to a variable using an XQuery let clause. The example demonstrates the combination of constructs from XQuery and SPARQL, more specifically the reuse of XQuery variables within SPARQL-like construct clauses.
prefix : <http://www.example.org>
prefix foaf: <http://xmlns.com/foaf/0.1/>
let $y := "http://www.example.org/knownPerson"
for $x from <foaf.rdf>
where {$s foaf:knows $x}
construct {$x a <{$y}> }
import module namespace _xsparql = "http://xsparql.deri.org/XSPARQLer/xsparql.xquery"
  at "http://xsparql.deri.org/XSPARQLer/xsparql.xquery";
declare namespace _sparql_result = "http://www.w3.org/2005/sparql-results#";
declare default element namespace "http://www.example.org";
declare namespace foaf = "http://xmlns.com/foaf/0.1/";
declare variable $_NS1 := "prefix : <http://www.example.org>";
declare variable $_NS2 := "prefix foaf: <http://xmlns.com/foaf/0.1/>";
_xsparql:_serialize(( " @", $_NS1, ".", " @", $_NS2, ".", "" )),
let $y := "http://www.example.org/knownPerson"
let $_aux1 := _xsparql:_serialize(("http://example.org/sparql?query=",
  fn:encode-for-uri( _xsparql:_serialize(( $_NS1, $_NS2,
    " select $x from <http://www.polleres.net/foaf.rdf> where { $s foaf:knows $x . } ")))))
for $_aux_result1 at $_aux_result1_Pos in doc($_aux1)//_sparql_result:result
let $_x_Node := ($_aux_result1/_sparql_result:binding[@name = "x"])
let $x := data($_x_Node/*)
let $_x_NodeType := name($_x_Node/*)
let $_x_NodeDatatype := string($_x_Node/*/@datatype)
let $_x_NodeLang := string($_x_Node/*/@lang)
let $_x_RDFTerm := _xsparql:_rdf_term($_x_NodeType, $x)
let $_validObject1 := _xsparql:_serialize(("<" , $y , ">"))
return (
  if ( _xsparql:_validSubject( "", $_x_RDFTerm)) then
    (if ( _xsparql:_validObject( "", $_validObject1)) then
       (_xsparql:_serialize(($_x_RDFTerm, " a ", $_validObject1, " .")))
     else "" )
  else "" )
@prefix : <http://www.example.org>.
@prefix foaf: <http://xmlns.com/foaf/0.1/>.
_:b0 a <http://www.example.org/knownPerson> .
<http://danbri.org/foaf.rdf#danbri> a <http://www.example.org/knownPerson> .
_:b1 a <http://www.example.org/knownPerson> .
<http://richard.cyganiak.de/foaf.rdf#cygri> a <http://www.example.org/knownPerson> .
<http://nets.ii.uam.es/~rlara/foaf.rdf#me> a <http://www.example.org/knownPerson> .
_:b2 a <http://www.example.org/knownPerson> .
_:b3 a <http://www.example.org/knownPerson> .
_:b4 a <http://www.example.org/knownPerson> .
<http://eyaloren.org/foaf.rdf#me> a <http://www.example.org/knownPerson> .
_:b5 a <http://www.example.org/knownPerson> .
<http://harth.org/andreas/foaf#ah> a <http://www.example.org/knownPerson> .
<http://www.aifb.uni-karlsruhe.de/Personen/viewPersonOWL/id2084instance> a <http://www.example.org/knownPerson> .
<http://page.mi.fu-berlin.de/mochol/foaf.rdf#me> a <http://www.example.org/knownPerson> .
_:b6 a <http://www.example.org/knownPerson> .
<http://page.mi.fu-berlin.de/~nixon/foaf.rdf#nixon> a <http://www.example.org/knownPerson> .
_:b7 a <http://www.example.org/knownPerson> .
_:b8 a <http://www.example.org/knownPerson> .
_:b9 a <http://www.example.org/knownPerson> .
<http://www.postsubmeta.net/foaf.rdf#TK> a <http://www.example.org/knownPerson> .
_:b10 a <http://www.example.org/knownPerson> .
_:b11 a <http://www.example.org/knownPerson> .
_:b12 a <http://www.example.org/knownPerson> .
<http://sw.deri.org/~haller/foaf.rdf#ah> a <http://www.example.org/knownPerson> .
_:b13 a <http://www.example.org/knownPerson> .
This query performs the same task as the previous one, but uses a SPARQL filter expression to remove persons that are identified only by a blank node.
prefix : <http://www.example.org>
prefix foaf: <http://xmlns.com/foaf/0.1/>
let $y := "http://www.example.org/knownPerson"
for * from <foaf.rdf>
where {$s foaf:knows $x filter (!isblank($x))}
construct {$x a <{$y}> }

import module namespace _xsparql = "http://xsparql.deri.org/XSPARQLer/xsparql.xquery"
  at "http://xsparql.deri.org/XSPARQLer/xsparql.xquery";
declare namespace _sparql_result = "http://www.w3.org/2005/sparql-results#";
declare default element namespace "http://www.example.org";
declare namespace foaf = "http://xmlns.com/foaf/0.1/";
declare variable $_NS1 := "prefix : <http://www.example.org>";
declare variable $_NS2 := "prefix foaf: <http://xmlns.com/foaf/0.1/>";
_xsparql:_serialize(( " @", $_NS1, ".", " @", $_NS2, ".", "" )),
let $y := "http://www.example.org/knownPerson"
let $_aux1 := _xsparql:_serialize(("http://example.org/sparql?query=",
  fn:encode-for-uri( _xsparql:_serialize(( $_NS1, $_NS2,
    " select $s $x from <http://www.polleres.net/foaf.rdf> where { $s foaf:knows $x . filter(!isblank($x))} ")))))
for $_aux_result1 at $_aux_result1_Pos in doc($_aux1)//_sparql_result:result
let $_s_Node := ($_aux_result1/_sparql_result:binding[@name = "s"])
let $s := data($_s_Node/*)
let $_s_NodeType := name($_s_Node/*)
let $_s_NodeDatatype := string($_s_Node/*/@datatype)
let $_s_NodeLang := string($_s_Node/*/@lang)
let $_s_RDFTerm := _xsparql:_rdf_term($_s_NodeType, $s)
let $_x_Node := ($_aux_result1/_sparql_result:binding[@name = "x"])
let $x := data($_x_Node/*)
let $_x_NodeType := name($_x_Node/*)
let $_x_NodeDatatype := string($_x_Node/*/@datatype)
let $_x_NodeLang := string($_x_Node/*/@lang)
let $_x_RDFTerm := _xsparql:_rdf_term($_x_NodeType, $x)
let $_validObject1 := _xsparql:_serialize(("<" , $y , ">"))
return (
  if ( _xsparql:_validSubject( "", $_x_RDFTerm)) then
    (if ( _xsparql:_validObject( "", $_validObject1)) then
       (_xsparql:_serialize(( $_x_RDFTerm, " a ", $_validObject1 , " .")))
     else "" )
  else "" )

@prefix : <http://www.example.org>.
@prefix foaf: <http://xmlns.com/foaf/0.1/>.
<http://danbri.org/foaf.rdf#danbri> a <http://www.example.org/knownPerson> .
<http://richard.cyganiak.de/foaf.rdf#cygri> a <http://www.example.org/knownPerson> .
<http://nets.ii.uam.es/~rlara/foaf.rdf#me> a <http://www.example.org/knownPerson> .
<http://eyaloren.org/foaf.rdf#me> a <http://www.example.org/knownPerson> .
<http://www.aifb.uni-karlsruhe.de/Personen/viewPersonOWL/id2084instance> a <http://www.example.org/knownPerson> .
<http://harth.org/andreas/foaf#ah> a <http://www.example.org/knownPerson> .
<http://page.mi.fu-berlin.de/mochol/foaf.rdf#me> a <http://www.example.org/knownPerson> .
<http://page.mi.fu-berlin.de/~nixon/foaf.rdf#nixon> a <http://www.example.org/knownPerson> .
<http://www.postsubmeta.net/foaf.rdf#TK> a <http://www.example.org/knownPerson> .
<http://sw.deri.org/~haller/foaf.rdf#ah> a <http://www.example.org/knownPerson> .
Given a set of dated entries, this example extracts the distribution of these entries over time, grouping the entries by day and counting the entries for each day. A typical scenario where such data could apply would be a set of IRC logs in a given month, annotated in RDF, where one wants to get an overview of the activity on a channel over the month. This example shows how XQuery features can be used to perform aggregation of data, which is not possible with pure SPARQL. In our current implementation, the naive rewriting leaves some room for improvement, and we expect future XSPARQL engines to optimize such queries.
prefix foaf: <http://xmlns.com/foaf/0.1/>
prefix dct: <http://purl.org/dc/terms/>
let $results :=
  for $entry $date from <sample_distribution_data.nt>
  where {$entry dct:created $date}
  return <entry date="{$date}"/>
return
  let $days := for $day in data($results/@date)
               return day-from-dateTime(xs:dateTime($day))
  for $day in distinct-values($days)
  order by $day
  return <day d="{$day}">{count($results[day-from-dateTime(xs:dateTime(@date)) = $day])}</day>
Observe that, assuming an optimal sorting algorithm for evaluating the order by clause (i.e., O(n log n)), this query runs in O(n² log n), where n is the number of dated entries in the input data.
import module namespace _xsparql = "http://xsparql.deri.org/XSPARQLer/xsparql.xquery" at "http://xsparql.deri.org/XSPARQLer/xsparql.xquery"; declare namespace _sparql_result = "http://www.w3.org/2005/sparql-results#"; declare namespace foaf = "http://xmlns.com/foaf/0.1/"; declare namespace dct = "http://purl.org/dc/terms/"; declare variable $_NS1 := "prefix foaf: <http://xmlns.com/foaf/0.1/>"; declare variable $_NS2 := "prefix dct: <http://purl.org/dc/terms/>"; let $results := let $_aux1 := _xsparql:_serialize(("http://example.org/sparql?query=", fn:encode-for-uri( _xsparql:_serialize(( $_NS1, $_NS2, " select $entry $date from <sample_distribution_data.nt> where { $entry dct:created $date . } "))))) for $_aux_result1 at $_aux_result1_Pos in doc($_aux1)//_sparql_result:result let $_entry_Node := ($_aux_result1/_sparql_result:binding[@name = "entry"]) let $entry := data($_entry_Node/*) let $_entry_NodeType := name($_entry_Node/*) let $_entry_NodeDatatype := string($_entry_Node/*/@datatype) let $_entry_NodeLang := string($_entry_Node/*/@lang) let $_entry_RDFTerm := _xsparql:_rdf_term($_entry_NodeType, $entry ) let $_date_Node := ($_aux_result1/_sparql_result:binding[@name = "date"]) let $date := data($_date_Node/*) let $_date_NodeType := name($_date_Node/*) let $_date_NodeDatatype := string($_date_Node/*/@datatype) let $_date_NodeLang := string($_date_Node/*/@lang) let $_date_RDFTerm := _xsparql:_rdf_term($_date_NodeType, $date ) return <entry date = "{$date}"/> return let $days := for $day at $_day_Pos in data($results/@date ) return day-from-dateTime(xs:dateTime($day ) ) for $day at $_day_Pos in distinct-values($days ) order by $day return <day d = "{$day}">{ count($results[day-from-dateTime(xs:dateTime(@date ) ) = $day ] ) }</day>
<day d="12">41</day>
<day d="13">22</day>
<day d="14">166</day>
<day d="15">252</day>
This query performs the same task as the previous one, except that a custom function is used to improve the complexity of the algorithm. The example illustrates that there is considerable room for query optimizers for XSPARQL.
prefix foaf: <http://xmlns.com/foaf/0.1/>
prefix dct: <http://purl.org/dc/terms/>

declare function local:_distribution_count($s, $i, $c) {
  let $x :=
    if ($i > count($s)) then ()
    else if ($s[$i] eq $s[$i + 1]) then
      local:_distribution_count($s, $i + 1, $c + 1)
    else
      fn:concat(
        fn:concat($s[$i], ", ", $c) ,
        " ",
        local:_distribution_count($s, $i + 1, 1)
      )
  return $x
};

let $days :=
  for $entry $date from <sample_distribution_data.nt>
  where {$entry dct:created $date}
  let $day := day-from-dateTime(xs:dateTime($date))
  order by $day
  return $day
return local:_distribution_count($days, 1, 1)
Observe that the sorting in the first let again takes O(n.log(n)) steps for an optimal engine. The recursive custom function for counting is then called on the resulting sequence of days, sorted in ascending order. Since this function boils down to a simple iteration over the sorted days, it runs in O(n), and the overall complexity thus stays within O(n.log(n)), which is an improvement over the previous query. An intelligent query optimizer could possibly catch such cases.
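The sort-then-scan strategy can be illustrated outside XSPARQL with a minimal Python sketch. It is not part of the XSPARQLer implementation; the function name sorted_distribution and the sample dates are hypothetical, and the recursion of local:_distribution_count is replaced here by an equivalent loop. After sorting, equal days are adjacent, so a single pass that compares each day with the last group seen is enough to count them.

```python
from datetime import datetime

# Hypothetical sample data standing in for the dct:created values.
sample_dates = [
    "2009-01-12T10:00:00",
    "2009-01-13T09:30:00",
    "2009-01-12T18:45:00",
    "2009-01-14T07:15:00",
    "2009-01-12T23:59:59",
]

def sorted_distribution(dates):
    """Count entries per day with one sort followed by one linear pass."""
    # Corresponds to the first let: extract the day and order by it.
    days = sorted(datetime.fromisoformat(d).day for d in dates)
    result = []
    for day in days:
        if result and result[-1][0] == day:
            # Same day as the current group: increment its counter.
            result[-1] = (day, result[-1][1] + 1)
        else:
            # A new day starts: open a new group with count 1.
            result.append((day, 1))
    return result

print(sorted_distribution(sample_dates))  # prints [(12, 3), (13, 1), (14, 1)]
```

The loop touches every element exactly once, so the sorting step dominates the overall cost.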
import module namespace _xsparql = "http://xsparql.deri.org/XSPARQLer/xsparql.xquery"
  at "http://xsparql.deri.org/XSPARQLer/xsparql.xquery";
declare namespace _sparql_result = "http://www.w3.org/2005/sparql-results#";
declare namespace foaf = "http://xmlns.com/foaf/0.1/";
declare namespace dct = "http://purl.org/dc/terms/";

declare function local:_distribution_count ( $s , $i , $c ) {
  let $x :=
    if ( $i > count($s ) ) then ()
    else if ( $s[$i ] eq $s[$i+1 ] ) then
      local:_distribution_count($s , $i+1 , $c+1 )
    else
      fn:concat(fn:concat($s[$i ] , ", " , $c ) , " " , local:_distribution_count($s , $i+1 , 1 ) )
  return $x
} ;

declare variable $_NS1 := "prefix foaf: <http://xmlns.com/foaf/0.1/>";
declare variable $_NS2 := "prefix dct: <http://purl.org/dc/terms/>";

let $days :=
  let $_aux1 := _xsparql:_serialize(("http://example.org/sparql?query=",
    fn:encode-for-uri( _xsparql:_serialize(( $_NS1, $_NS2,
      " select $entry $date from <sample_distribution_data.nt> where { $entry dct:created $date . } ")))))
  for $_aux_result1 at $_aux_result1_Pos in doc($_aux1)//_sparql_result:result
  let $_entry_Node := ($_aux_result1/_sparql_result:binding[@name = "entry"])
  let $entry := data($_entry_Node/*)
  let $_entry_NodeType := name($_entry_Node/*)
  let $_entry_NodeDatatype := string($_entry_Node/*/@datatype)
  let $_entry_NodeLang := string($_entry_Node/*/@lang)
  let $_entry_RDFTerm := _xsparql:_rdf_term($_entry_NodeType, $entry )
  let $_date_Node := ($_aux_result1/_sparql_result:binding[@name = "date"])
  let $date := data($_date_Node/*)
  let $_date_NodeType := name($_date_Node/*)
  let $_date_NodeDatatype := string($_date_Node/*/@datatype)
  let $_date_NodeLang := string($_date_Node/*/@lang)
  let $_date_RDFTerm := _xsparql:_rdf_term($_date_NodeType, $date )
  let $day := day-from-dateTime(xs:dateTime($date ) )
  order by $day
  return $day
return local:_distribution_count($days , 1 , 1 )
12, 41 13, 22 14, 166 15, 252