Please refer to the errata for this document, which may include some normative corrections.
The previous errata for this document are also available.
See also translations.
Copyright © 2010 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.
RDF is a directed, labeled graph data format for representing information in the Web. SPARQL can be used to express queries across diverse data sources, whether the data is stored natively as RDF or viewed as RDF via middleware. This specification defines the syntax and semantics of a SPARQL 1.1 Query extension for executing distributed queries.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
This is the First Public Working Draft of the "SPARQL 1.1 Federation Extensions" specification for review by W3C members and other interested parties.
The documents produced by this Working Group are:
This publication includes two extensions, SERVICE and BINDINGS, to the SPARQL 1.1 Query specification.
The structure of this document will change to fully integrate the new features.
The design of the features presented here is work-in-progress and does not represent the final decisions of the working group. Implementers and application writers should not assume that the designs in this document will not change.
Comments on this document should be sent to public-rdf-dawg-comments@w3.org, a mailing list with a public archive. Questions and comments about SPARQL that are not related to this specification, including extensions and features, can be discussed on the mailing list public-sparql-dev@w3.org, (public archive).
Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
The publication of this document by the W3C as a W3C Working Draft does not imply that all of the participants in the W3C SPARQL working group endorse the contents of the specification. Indeed, for any section of the specification, one can usually find many members of the working group or of the W3C as a whole who object strongly to the current text, the existence of the section at all, or the idea that the working group should even spend time discussing the concept of that section.
The W3C SPARQL Working Group is the W3C working group responsible for this specification's progress along the W3C Recommendation track.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
1 Introduction
1.1 Document Conventions
1.1.1 Namespaces
1.1.2 Result Descriptions
1.1.3 Terminology
2 SERVICE Graph Patterns
2.1 Variable Services
3 BINDINGS
4 Definition of Federation Extensions to SPARQL
4.1 Definition of SERVICE
4.2 Definition of BINDINGS
5 SPARQL Federation Extensions Grammar
6 Conformance
7 Security Considerations (Informative)
8 Internet Media Type, File Extension and Macintosh File Type
A References
A.1 Normative References
A.2 Other References
B CVS History
The growing suite of SPARQL query services offers consumers an opportunity to merge data distributed across the web.
A small number of extensions to SPARQL 1.1 enable expression of the merging queries.
In particular, a SERVICE keyword allows one to direct a portion of a query to a particular SPARQL query service, just as a GRAPH directs queries to particular named graphs.
A BINDINGS keyword adds a compact syntax for transferring results which constrain a query.
The combination of these extensions allows one to compose a query which delegates parts of the query to a series of services.
This specification defines the syntax and semantics of these extensions.
The SPARQL query language is closely related to the following specifications:
In this document, examples assume the following namespace prefix bindings unless otherwise stated:
Prefix | IRI |
---|---|
rdf: | http://www.w3.org/1999/02/22-rdf-syntax-ns# |
rdfs: | http://www.w3.org/2000/01/rdf-schema# |
xsd: | http://www.w3.org/2001/XMLSchema# |
fn: | http://www.w3.org/2005/xpath-functions# |
Result sets are illustrated in tabular form.
A 'binding' is a pair (variable, RDF term). In this result set, there are three variables: x, y and z (shown as column headers). Each solution is shown as one row in the body of the table. Here, there is a single solution, in which variable x is bound to "Alice", variable y is bound to <http://example/a>, and variable z is not bound to an RDF term. Variables are not required to be bound in a solution.
The SPARQL language includes IRIs, a subset of RDF URI References that omits spaces. Note that all IRIs in SPARQL queries are absolute; they may or may not include a fragment identifier [RFC3987, section 3.1]. IRIs include URIs [RFC3986] and URLs. The abbreviated forms (relative IRIs and prefixed names) in the SPARQL syntax are resolved to produce absolute IRIs.
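The resolution of abbreviated forms can be sketched as follows. This is only an illustration, assuming the prefix bindings from the Namespaces table; the `expand` helper and the base IRI `http://example/` are hypothetical names introduced for the example, and `urljoin` stands in for full RFC 3986 reference resolution.

```python
from urllib.parse import urljoin

# Prefix bindings from the Namespaces table (keys omit the trailing colon).
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "fn": "http://www.w3.org/2005/xpath-functions#",
}


def expand(name, base=None):
    """Resolve a prefixed name or a <relative IRI> to an absolute IRI string."""
    if name.startswith("<") and name.endswith(">"):
        iri = name[1:-1]
        # Relative IRIs are resolved against the base IRI, when one is given.
        return urljoin(base, iri) if base else iri
    prefix, _, local = name.partition(":")
    return PREFIXES[prefix] + local


print(expand("xsd:integer"))
print(expand("<a>", base="http://example/"))
```

A production parser would also handle escapes in local names and the empty prefix, which this sketch ignores.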
The following terms are defined in SPARQL Query Language 1.1 [SQRY] and used in SPARQL:
RDF URI reference
Queries over distributed data often entail querying one source and using the acquired information to constrain queries of the next source. For instance, one endpoint contains information about transmembrane receptors (molecules which cross cell walls):
    # Data in default graph at service: http://bio.example/receptors
    @prefix iuphar: <http://iuphar.example/ns#> .
    @prefix entrez: <http://entrez.example/ns#> .

    _:GABBR1 iuphar:code    "2.3:GABA:1:GABAB1:" .
    _:GABBR1 iuphar:ligand  "GABA" .
    _:GABBR1 iuphar:species _:h2550 .
    _:h2550  iuphar:name    "GABBR1" .
    _:h2550  entrez:id      2550 .

    _:GABBR2 iuphar:code    "2.3:GABA:1:GABAB2:" .
    _:GABBR2 iuphar:ligand  "GABA" .
    _:GABBR2 iuphar:species _:h9568 .
    _:h9568  iuphar:name    "GABBR2" .
    _:h9568  entrez:id      9568 .
Another endpoint contains information about protein concentrations after drug exposure:
    # Data in default graph at service: http://study.example/analyzed
    @prefix med:    <http://med.example/testDrug#> .
    @prefix entrez: <http://entrez.example/ns#> .
    @prefix study:  <http://study.example/affects#> .

    _:study9  entrez:id    2550 .
    _:study9  med:ication  "Illudium Phosdex" .
    _:study9  study:change -.23 .

    _:study10 entrez:id    2986 .
    _:study10 med:ication  "Illudium Phosdex" .
    _:study10 study:change +.38 .
A researcher may wish to know which medications significantly suppress receptors attached to the GABA ligand, supplying explicit service IRIs for the two services containing the critical information:
    PREFIX iuphar: <http://iuphar.example/ns#>
    PREFIX entrez: <http://entrez.example/ns#>
    PREFIX med:    <http://med.example/testDrug#>
    PREFIX study:  <http://study.example/affects#>

    SELECT ?med ?species ?iuphar
    WHERE {
      SERVICE <http://bio.example/receptors> {
        ?receptor iuphar:ligand  "GABA" .
        ?receptor iuphar:species ?species .
        ?species  iuphar:name    ?iuphar .
        ?species  entrez:id      ?id .
      }
      SERVICE <http://study.example/analyzed> {
        ?study entrez:id     ?id .
        ?study study:species ?species .
        ?study med:ication   ?med .
        ?study study:change  ?change .
        FILTER (?change < -.2)
      }
    }
The results of this query are identical to one executed over the graphs at the two services:
The mechanics of executing a query over a graph differ from those of querying a service. Typically, a GRAPH constraint is matched against an RDF graph which is in the querying system, perhaps as the result of parsing the response to an HTTP GET on the named graph. The mechanics of querying a service are different and the SERVICE directive can be used to indicate that these special mechanics are required.
Querying a SPARQL service requires encoding the GRAPH-constrained pattern as a stand-alone SPARQL query and passing that query to the endpoint, either as a GET or a POST. Note that WSDL defines the behavior with respect to constructing HTTP URLs from an endpoint and a set of query parameters, in particular appending '?' or '&' to an endpoint URL which may already have them.
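As an illustration of the GET case, the following minimal sketch appends the SPARQL Protocol `query` parameter to an endpoint URL, using '&' when the endpoint already carries a query string and '?' otherwise. The helper name `build_get_url` and the sample query are assumptions made for this example; nothing here is mandated by this specification.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def build_get_url(endpoint, query):
    """Return the endpoint URL with a URL-encoded `query` parameter appended."""
    parts = urlsplit(endpoint)
    encoded = urlencode({"query": query})
    # Preserve any parameters the endpoint URL already has.
    new_query = f"{parts.query}&{encoded}" if parts.query else encoded
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, new_query, parts.fragment)
    )


url = build_get_url(
    "http://bio.example/receptors",
    'SELECT * WHERE { ?receptor <http://iuphar.example/ns#ligand> "GABA" }',
)
print(url)
```

Long queries may exceed practical URL length limits, in which case a POST is the safer choice.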
If the HTTP response is of type application/sparql-results+xml, the response is parsed into a Solution Sequence which is processed according to the SPARQL Algebra. For any other response, the query fails. For example, the first service invocation in the above query would get a result set like:
    <sparql xmlns="http://www.w3.org/2005/sparql-results#">
      <head>
        <variable name="species"/>
        <variable name="iuphar"/>
        <variable name="id"/>
      </head>
      <results>
        <result>
          <binding name="species"><bnode>_:n1</bnode></binding>
          <binding name="iuphar"><literal>GABBR1</literal></binding>
          <binding name="id"><literal datatype="http://www.w3.org/2001/XMLSchema#integer">2550</literal></binding>
        </result>
        <result>
          <binding name="species"><bnode>_:n2</bnode></binding>
          <binding name="iuphar"><literal>GABBR2</literal></binding>
          <binding name="id"><literal datatype="http://www.w3.org/2001/XMLSchema#integer">9568</literal></binding>
        </result>
      </results>
    </sparql>
conveying a result set with two solutions:
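The two solutions can be recovered from such a response with a few lines of code. The sketch below assumes a well-formed response of the shape shown above; `parse_solutions` and the (kind, value) tuple representation of RDF terms are choices made for this example only.

```python
import xml.etree.ElementTree as ET

NS = {"sr": "http://www.w3.org/2005/sparql-results#"}
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"

RESPONSE = f"""<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head>
    <variable name="species"/><variable name="iuphar"/><variable name="id"/>
  </head>
  <results>
    <result>
      <binding name="species"><bnode>_:n1</bnode></binding>
      <binding name="iuphar"><literal>GABBR1</literal></binding>
      <binding name="id"><literal datatype="{XSD_INTEGER}">2550</literal></binding>
    </result>
    <result>
      <binding name="species"><bnode>_:n2</bnode></binding>
      <binding name="iuphar"><literal>GABBR2</literal></binding>
      <binding name="id"><literal datatype="{XSD_INTEGER}">9568</literal></binding>
    </result>
  </results>
</sparql>"""


def parse_solutions(xml_text):
    """Return one dict per <result>, mapping variable name to (kind, value)."""
    root = ET.fromstring(xml_text)
    solutions = []
    for result in root.findall("sr:results/sr:result", NS):
        solution = {}
        for binding in result.findall("sr:binding", NS):
            term = binding[0]  # the single <uri>, <literal> or <bnode> child
            kind = term.tag.split("}")[1]  # strip the "{namespace}" part
            solution[binding.get("name")] = (kind, term.text)
        solutions.append(solution)
    return solutions


solutions = parse_solutions(RESPONSE)
for solution in solutions:
    print(solution)
```

Datatypes and language tags are dropped here for brevity; a real client would keep them to reconstruct the RDF terms exactly.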
By SPARQL join semantics, variables shared between a SERVICE graph pattern and the rest of the query serve as constraints. For instance, the bindings of ?species and ?id constrain the results from the query on http://study.example/analyzed to not include ?id=>2986. The strategy for this is undefined; a query federator may relay an unconstrained query to http://study.example/analyzed, it may insert FILTER constraints reflecting the result set, or it may issue a query with BINDINGS (see below).
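The constraining effect of shared variables can be illustrated with a small join over solution mappings. This is a sketch of the first strategy (an unconstrained remote query joined locally). The rows in `studies` are a hypothetical unconstrained answer from http://study.example/analyzed, and for simplicity only ?id is shared between the two sides.

```python
def compatible(a, b):
    """Two solution mappings are compatible if they agree on shared variables."""
    return all(a[v] == b[v] for v in a.keys() & b.keys())


def join(left, right):
    """Merge every compatible pair of solution mappings."""
    return [{**a, **b} for a in left for b in right if compatible(a, b)]


receptors = [
    {"species": "_:n1", "iuphar": "GABBR1", "id": 2550},
    {"species": "_:n2", "iuphar": "GABBR2", "id": 9568},
]
studies = [  # hypothetical rows, not taken from an actual service response
    {"study": "_:study9", "id": 2550, "med": "Illudium Phosdex"},
    {"study": "_:study10", "id": 2986, "med": "Illudium Phosdex"},
]

joined = join(receptors, studies)
print(joined)
```

The row with ?id 2986 has no compatible partner among the receptor solutions, so it does not appear in the joined result.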
A variable may be used in place of a service IRI indicating that the service call for any solution depends on that variable's binding in that solution. For instance, the default graph may contain information about which services contain information about particular entrezgene identifiers:
    # Default graph
    @prefix void:    <http://rdfs.org/ns/void#> .
    @prefix dcterms: <http://purl.org/dc/terms/> .
    @prefix entrez:  <http://entrez.example/ns#> .

    <http://GABABR1.example/SPARQL> a void:Dataset ;
        dcterms:subject entrez:h2550 .
    <http://GABABR2.example/SPARQL> a void:Dataset ;
        dcterms:subject entrez:h9568 .
    <http://GABA-B-R3.example/SPARQL> a void:Dataset ;
        dcterms:subject entrez:h33248 .
    # …
    # Data in default graph at service: http://GABABR1.example/SPARQL
    @prefix iuphar: <http://iuphar.example/ns#> .
    @prefix entrez: <http://entrez.example/ns#> .

    _:GABBR1     iuphar:code    "2.3:GABA:1:GABAB1:" .
    _:GABBR1     iuphar:ligand  "GABA" .
    _:GABBR1     iuphar:species entrez:h2550 .
    entrez:h2550 iuphar:name    "GABBR1" .
    entrez:h2550 entrez:id      2550 .
    # Data in default graph at service: http://GABABR2.example/SPARQL
    @prefix iuphar: <http://iuphar.example/ns#> .
    @prefix entrez: <http://entrez.example/ns#> .

    _:GABBR2     iuphar:code    "2.3:GABA:1:GABAB2:" .
    _:GABBR2     iuphar:ligand  "GABA" .
    _:GABBR2     iuphar:species entrez:h9568 .
    entrez:h9568 iuphar:name    "GABBR2" .
    entrez:h9568 entrez:id      9568 .
For a set of genes, we can acquire the entrez gene id and iuphar name, as well as the service that provided the answer:
    PREFIX iuphar:  <http://iuphar.example/ns#>
    PREFIX entrez:  <http://entrez.example/ns#>
    PREFIX void:    <http://rdfs.org/ns/void#>
    PREFIX dcterms: <http://purl.org/dc/terms/>

    SELECT ?service ?id ?iuphar
    WHERE {
      # Find the service with the expertise.
      ?service dcterms:subject ?gene .
      FILTER (?gene = entrez:h2550 || ?gene = entrez:h9568)

      # Query that service for species and iuphar.
      SERVICE ?service {
        ?receptor iuphar:species ?species .
        ?species  iuphar:name    ?iuphar .
        ?species  entrez:id      ?id .
      }
    }
The bindings of ?service provide the location of the service to query, yielding:
Editorial note: This notion of "already bound" (note the related constraint in the grammar) is still an issue for the SPARQL Working Group, as is the question of having variables in SERVICE calls at all. Feedback from the community is encouraged.
To communicate constraints to SPARQL endpoints efficiently, the querier may follow the WHERE clause with BINDINGS. For example, the query on http://study.example/analyzed could be expressed as follows:
    PREFIX entrez: <http://entrez.example/ns#>
    PREFIX med:    <http://med.example/testDrug#>
    PREFIX study:  <http://study.example/affects#>

    SELECT ?med ?species ?iuphar
    WHERE {
      ?study entrez:id     ?id .
      ?study study:species ?species .
      ?study med:ication   ?med .
      ?study study:change  ?change .
      FILTER (?change < -.2)
    }
    BINDINGS ?human ?iuphar ?id {
      ("human" "GABBR1" "2550")
      ("human" "GABBR2" "9568")
    }
which yields a single solution:
The SERVICE extension is defined as an additional type of GroupGraphPattern, with an accompanying addition to SPARQL Query 1.1's Transform(syntax form):
If the form is ServiceGraphPattern
The result is Service(IRI, Transform(GroupGraphPattern))
Example: a SERVICE graph pattern in a series of joins:
The evaluation of Service is defined in terms of the SPARQL Results [RESULTS] returned by a SPARQL Protocol [SPROT] execution of the nested graph pattern:
Let vars be the set of variables in pattern P. Then:
eval(Service(IRI,P)) = Invocation( IRI, Project(P, vars) )
where Invocation(IRI, Q) is an implementation of the SPARQL protocol against endpoint IRI, with a query Q and no default-graph-uri or named-graph-uri (see SPARQL Protocol [SPROT] section 2.1.1.1).
If IRI is not a service name, or if the service returns an error, evaluation fails.
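The definition above can be sketched in code as follows. This is a non-normative illustration: `eval_service`, `ServiceEvaluationError` and the injected `invoke` callable are hypothetical names, the pattern is passed as already-serialised SPARQL text, and `fake_invoke` stands in for a real SPARQL Protocol client.

```python
class ServiceEvaluationError(Exception):
    """Raised when a SERVICE pattern cannot be evaluated."""


def eval_service(iri, pattern, variables, invoke):
    """Project the pattern onto its variables and send it to the endpoint.

    `invoke(iri, query)` is assumed to return a list of solution mappings
    and to raise an exception if `iri` is not a service or returns an error.
    """
    projection = " ".join(f"?{v}" for v in variables)
    query = f"SELECT {projection} WHERE {{ {pattern} }}"
    try:
        return invoke(iri, query)
    except Exception as exc:
        # Any protocol failure makes evaluation of the whole pattern fail.
        raise ServiceEvaluationError(f"SERVICE <{iri}> failed") from exc


def fake_invoke(iri, query):
    """Stand-in transport used only for this example."""
    if iri != "http://bio.example/receptors":
        raise ConnectionError("not a SPARQL service")
    return [{"id": "2550"}, {"id": "9568"}]


rows = eval_service(
    "http://bio.example/receptors",
    "?species <http://entrez.example/ns#id> ?id",
    ["species", "id"],
    fake_invoke,
)
print(rows)
```

Passing the transport in as a parameter keeps the sketch testable without network access; it is a design convenience, not something the definition requires.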
If a WhereClause has a BindingsClause, the resulting Solution Sequence is a BindingsSolutionSequence:
Let P be the algebra expression for the WhereClause, V the list of variables in the BindingsClause, and St the list of tuples Binding*.
For each tuple tp in St and each variable v in V:
Iv = the position of v in V,
term(t) = the RDF interpretation of the SPARQL grammatical form IRIref or RDFLiteral or NumericLiteral or BooleanLiteral or BlankNode,
interp(v, t) = if t is UNDEF, an empty Solution Mapping; else, a Solution Mapping of v⇒term(tp[Iv]),
M(tp) = the sum of the Solution Mappings for the terms in tp.
Rbc = the Solution Sequence consisting of M(tp) for each tuple tp
eval(BindingsSolutionSequence(P, V, St)) = Join(Rbc, P)
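A minimal sketch of this evaluation is given below, under several assumptions: solution mappings are plain dictionaries, the solutions of P are already available as a list, the `UNDEF` sentinel represents an unbound position (the UNDEF keyword of the grammar), and every tuple is assumed to have already passed the arity check. The sample data is illustrative only.

```python
UNDEF = object()  # sentinel for an unbound position in a tuple


def bindings_rows(variables, tuples):
    """Build Rbc: one solution mapping M(tp) per tuple, skipping UNDEF terms."""
    return [
        {v: t for v, t in zip(variables, tp) if t is not UNDEF}
        for tp in tuples
    ]


def join(left, right):
    """Merge every pair of mappings that agree on their shared variables."""
    return [
        {**a, **b}
        for a in left
        for b in right
        if all(a[v] == b[v] for v in a.keys() & b.keys())
    ]


def eval_bindings(pattern_solutions, variables, tuples):
    """eval(BindingsSolutionSequence(P, V, St)) = Join(Rbc, P)."""
    return join(bindings_rows(variables, tuples), pattern_solutions)


pattern_solutions = [
    {"id": 2550, "med": "Illudium Phosdex"},
    {"id": 2986, "med": "Illudium Phosdex"},
]
result = eval_bindings(
    pattern_solutions,
    ["iuphar", "id"],
    [("GABBR1", 2550), ("GABBR2", UNDEF)],
)
for row in result:
    print(row)
```

The first tuple constrains ?id and so matches only one pattern solution, while the second leaves ?id unbound and is therefore compatible with both.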
Example: a graph pattern and a BINDINGS assertion:
The EBNF notation used in the grammar is defined in Extensible Markup Language (XML) 1.1 [XML11] section 6 Notation.
SPARQL Federation Extensions introduces the case-insensitive keywords SERVICE, BINDINGS and UNDEF:
[7] | SelectQuery | ::= | SelectClause DatasetClause* WhereClause SolutionModifier BindingsClause |
[10] | ConstructQuery | ::= | 'CONSTRUCT' ConstructTemplate DatasetClause* WhereClause SolutionModifier BindingsClause |
[11] | DescribeQuery | ::= | 'DESCRIBE' ( VarOrIRIref+ | '*' ) DatasetClause* WhereClause? SolutionModifier BindingsClause |
[12] | AskQuery | ::= | 'ASK' DatasetClause* WhereClause BindingsClause |
[28] | BindingsClause | ::= | ( 'BINDINGS' Var+ '{' ( '(' BindingValue+ ')' )* '}' )? |
[29] | BindingValue | ::= | IRIref | RDFLiteral | NumericLiteral | BooleanLiteral | 'UNDEF' |
[49] | GraphPatternNotTriples | ::= | GroupGraphPattern | OptionalGraphPattern | UnionGraphPattern | MinusGraphPattern | GraphGraphPattern | ServiceGraphPattern | ExistsElt | NotExistsElt | Filter |
[52] | ServiceGraphPattern | ::= | 'SERVICE' VarOrIRIref GroupGraphPattern |
It is a syntax error to use a variable as the first argument to a ServiceGraphPattern if that variable is not bound (at least optionally) in the left hand side of a join with the ServiceGraphPattern on the right. If a solution does not bind the variable, or binds it to something which cannot resolve to a SPARQL service, that solution is eliminated.
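The elimination rule for variable services might be implemented along the following lines. This is a sketch only: `group_by_service` and the `is_service` predicate are hypothetical, and how an implementation decides that an IRI resolves to a SPARQL service is left open by this specification.

```python
def group_by_service(solutions, service_var, is_service):
    """Group solutions by the service they name, dropping unusable ones.

    A solution is eliminated when it leaves `service_var` unbound or binds it
    to a value for which `is_service` returns False.
    """
    calls = {}
    for solution in solutions:
        iri = solution.get(service_var)
        if iri is None or not is_service(iri):
            continue  # solution eliminated
        calls.setdefault(iri, []).append(solution)
    return calls


known = {"http://GABABR1.example/SPARQL", "http://GABABR2.example/SPARQL"}
solutions = [
    {"service": "http://GABABR1.example/SPARQL", "gene": "entrez:h2550"},
    {"service": "http://GABABR2.example/SPARQL", "gene": "entrez:h9568"},
    {"gene": "entrez:h33248"},                      # ?service unbound
    {"service": "mailto:nobody@example", "gene": "entrez:h1"},  # not a service
]

calls = group_by_service(solutions, "service", lambda iri: iri in known)
print(sorted(calls))
```

Grouping by service IRI lets a federator issue one remote query per distinct endpoint rather than one per solution.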
It is a syntax error if the number of elements in any Binding does not equal the number of variables in the BindingsClause.
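A federator that generates BINDINGS clauses can enforce this arity rule before sending the query. The sketch below is illustrative: `render_bindings` is a hypothetical helper, and each value is assumed to be already serialised as SPARQL text (a quoted literal, a number, an IRI in angle brackets, or the keyword UNDEF).

```python
def render_bindings(variables, rows):
    """Serialise a BINDINGS clause, rejecting rows of the wrong length."""
    for row in rows:
        if len(row) != len(variables):
            raise ValueError(
                f"binding {row!r} has {len(row)} values, "
                f"expected {len(variables)}"
            )
    head = " ".join(f"?{v}" for v in variables)
    body = " ".join("(" + " ".join(row) + ")" for row in rows)
    return f"BINDINGS {head} {{ {body} }}"


clause = render_bindings(
    ["iuphar", "id"],
    [('"GABBR1"', "2550"), ('"GABBR2"', "UNDEF")],
)
print(clause)
```

Checking the arity locally turns what would be a remote syntax error into an immediate, easier to diagnose failure.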
See appendix A SPARQL Grammar regarding conformance of SPARQL Query strings, and section 10 Query Forms for conformance of query results. See appendix E. Internet Media Type for conformance to the application/sparql-query media type.
This specification is intended for use in conjunction with the SPARQL Protocol [SPROT] and the SPARQL Query Results XML Format [RESULTS]. See those specifications for their conformance criteria.
Note that the SPARQL protocol describes an abstract interface as well as a network protocol, and the abstract interface may apply to APIs as well as network interfaces.
SPARQL queries using SERVICE imply that a URI will be dereferenced, and that the result will be incorporated into a working data set. All of the security issues of SPARQL Protocol 1.1 [SPROT] Section 3.1, SPARQL Query 1.1 [SQRY] Section 18, and Uniform Resource Identifier (URI): Generic Syntax [RFC3986] Section 7 should be considered.
It's probably not worth the cost of a differential media type. If it were, that registration would probably look like:
The Internet Media Type / MIME Type for the SPARQL Federation Extensions is "application/sparql-query".
It is recommended that SPARQL query files have the extension ".rq" (all lowercase) on all platforms.
It is recommended that SPARQL query files stored on Macintosh HFS file systems be given a file type of "TEXT".
    $Log: Overview.html,v $
    Revision 1.4 2018/10/09 13:23:09 denis fix validation of xhtml documents
    Revision 1.3 2017/10/02 10:42:14 denis add fixup.js to old specs
    Revision 1.2 2010/06/01 17:47:38 bertails (bertails) Changed through Jigsaw on edit.w3.org
    Revision 1.4 2010/06/01 17:32:52 lfeigenb fix links
    Revision 1.3 2010/06/01 17:31:07 lfeigenb fix links
    Revision 1.2 2010/06/01 17:28:05 lfeigenb fix links
    Revision 1.1 2010/06/01 15:41:01 lfeigenb initial checkin
    Revision 1.6 2010/05/25 18:38:40 lfeigenb remove invalid css attributes
    Revision 1.5 2010/05/25 18:36:02 lfeigenb for publication
    Revision 1.4 2010/05/25 18:33:24 lfeigenb for publication
    Revision 1.3 2010/05/25 18:29:16 lfeigenb for publication
    Revision 1.2 2010/05/25 18:27:59 lfeigenb for publication
    Revision 1.1 2010/05/25 18:24:43 lfeigenb for publication
    Revision 1.7 2010/05/25 13:30:58 lfeigenb move version in document title
    Revision 1.6 2010/05/18 00:35:55 eric - some extraneous references
    Revision 1.5 2010/05/18 00:31:18 eric per SPARQL Working Group feedback via LeeF - BINDINGS and SPARQL Update section + Variable Services section with ednote ~ copied grammar from complete SPARQL grammar
    Revision 1.4 2010/03/31 12:40:33 eric ~ corrections from imikhailov@openlinksw.com mid:1269962023.3105.192.camel@octo.iv.dev.null
    Revision 1.3 2010/03/29 16:23:36 eric ~ incorporated feedback from AndyS
    Revision 1.2 2010/03/27 03:32:47 eric + Bindings
    Revision 1.1 2010/03/26 15:47:18 eric CREATED