SPARQL Query Language for RDF

An RDF graph is a set of triples; each triple consists of a subject, a predicate and an object. This is defined in RDF Concepts and Abstract Syntax [12]. These triples can come from a variety of sources. For instance, they may come directly from an RDF document. They may be inferred from other RDF triples. They may be the RDF expression of data stored in other formats, such as XML or relational databases.

SPARQL is a query language for getting information from such RDF graphs. It provides facilities to:

As a data access language, it is suitable for both local and remote use. The companion SPARQL Protocol for RDF document [SPROT] describes the remote access protocol.

1.1 Document Conventions

In this document, examples assume the following namespace prefix bindings unless otherwise stated:

Prefix	IRI
`rdf`	`http://www.w3.org/1999/02/22-rdf-syntax-ns#`
`rdfs`	`http://www.w3.org/2000/01/rdf-schema#`
`xsd`	`http://www.w3.org/2001/XMLSchema#`

2 Making Simple Queries

The SPARQL query language is based on matching graph patterns. The simplest graph pattern is the triple pattern, which is like an RDF triple but with the possibility of a variable in any of the subject, predicate or object positions. Combining these gives a basic graph pattern, where an exact match to a graph is needed to fulfill a pattern.

Later sections of this document describe how other graph patterns can be built using the graph operators OPTIONAL and UNION; how graph patterns can be grouped together; how queries can extract information from more than one graph, and how it is also possible to restrict the values allowed in matching a pattern.

In this section, we cover simple triple patterns, basic graph patterns, and the SPARQL syntax for basic pattern queries.

2.1 Writing a Simple Query

The example below shows a SPARQL query to find the title of a book from the information in the given RDF graph. The query consists of two parts, the SELECT clause and the WHERE clause. The SELECT clause identifies the variables to appear in the query results, and the WHERE clause has one triple pattern.

title
"SPARQL Tutorial"

2.1.1 Query Term Syntax

The terms delimited by "<>" are IRI references [RFC3987]. They stand for IRIs, either directly, or relative to a base IRI. IRIs are a generalization of URIs [RFC3986] and are fully compatible with URIs and URLs.

Query terms can also be literals: a string (enclosed in quotes, either double quotes "" or single quotes ''), with an optional language tag (introduced by @) or an optional datatype IRI or QName (introduced by ^^). As a convenience, integers can be written directly and are interpreted as typed literals of datatype xsd:integer; floating point numbers can also be written directly and are interpreted as xsd:double. Values of type xsd:boolean can also be written as true or false.

Variables in SPARQL queries have global scope; it is the same variable everywhere in the query that the same name is used. Variables are indicated by "?"; the "?" does not form part of the variable. "$" is an alternative to "?". In a query, $abc and ?abc are the same variable.

SPARQL provides two abbreviation mechanisms for IRIs: namespace prefixes and relative IRIs.

The PREFIX keyword binds a prefix to a namespace IRI [NAMESPACE]. A prefix binding applies to any QNames in the query with that prefix; a prefix may be defined only once. A QName is mapped to an IRI by appending the local name to the namespace IRI corresponding to the prefix.
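The mapping from a QName to an IRI described above can be sketched in a few lines of Python. This is an illustration only, not part of the SPARQL language: the function name expand_qname is hypothetical, and the prefix table simply reuses two of the bindings from the Document Conventions table.

```python
# Prefix bindings as a PREFIX declaration would establish them.
# The two entries reuse bindings from the Document Conventions table.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}

def expand_qname(qname, prefixes):
    """Map 'prefix:local' to an IRI by appending the local name
    to the namespace IRI bound to the prefix (hypothetical helper)."""
    prefix, separator, local = qname.partition(":")
    if not separator:
        raise ValueError(f"not a QName: {qname!r}")
    if prefix not in prefixes:
        raise KeyError(f"unbound prefix: {prefix!r}")
    return prefixes[prefix] + local

print(expand_qname("rdf:type", PREFIXES))
```

Because the expansion is plain string concatenation, two different prefix names bound to the same namespace IRI yield the same IRI, which is why prefix names themselves do not affect a query.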

Relative IRIs are combined with base IRIs as per Uniform Resource Identifier (URI): Generic Syntax [RFC3986] using only the basic algorithm in section 5.2. Neither Syntax-Based Normalization nor Scheme-Based Normalization (described in sections 6.2.2 and 6.2.3 of RFC3986) is performed. Characters additionally allowed in IRI references are treated in the same way that unreserved characters are treated in URI references, per section 6.5 of Internationalized Resource Identifiers (IRIs) [RFC3987].

The BASE keyword defines the Base IRI used to resolve relative IRIs per RFC3986 section 5.1.1, "Base URI Embedded in Content". Section 5.1.2, "Base URI from the Encapsulating Entity", defines how the Base IRI may come from an encapsulating document, such as a SOAP envelope with an xml:base directive, or a MIME multipart document with a Content-Location header. The "Retrieval URI" identified in 5.1.3, "Base URI from the Retrieval URI", is the URL from which a particular SPARQL query was retrieved.
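As a rough illustration of resolving relative IRIs against a base, the sketch below uses urljoin from the Python standard library as a stand-in for the RFC3986 section 5.2 algorithm. The base IRI and the relative references are assumptions made for this example; a SPARQL processor is not required to use this function, and its treatment of characters outside the URI repertoire is not examined here.

```python
from urllib.parse import urljoin

# Hypothetical base IRI, as a BASE declaration might supply it.
BASE = "http://example.org/book/"

def resolve(base, reference):
    """Resolve an IRI reference against a base IRI (stand-in for RFC3986 5.2)."""
    return urljoin(base, reference)

for reference in ("book1", "#title", "../vocab/title",
                  "http://purl.org/dc/elements/1.1/title"):
    print(reference, "->", resolve(BASE, reference))
```

A reference that is already absolute is returned unchanged, so a query can mix relative and absolute IRI references freely.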

Triple Patterns are written as a list of subject, predicate, object; there are abbreviated ways of writing some common triple pattern constructs.

2.1.2 Examples of Query Syntax

Prefixes are syntactic: the prefix name does not affect the query, nor do prefix names in queries need to be the same prefixes as used for data. The following query is equivalent to the previous examples and will give the same results when applied to the same data:

2.1.3 Data descriptions used in this document

The data format used in this document is Turtle [15], used to show each triple explicitly. Turtle allows URIs to be abbreviated with prefixes:

2.1.4 Result Descriptions used in this document

The term "binding" is used as a descriptive term to refer to a pair of (variable, RDF term). In this document, we illustrate results in tabular form. If variable x is bound to "Alice" and variable y is bound to "Bob", we show this as:

x	y
"Alice"	"Bob"

2.2 Initial Definitions

This definition of RDF Term collects together several basic notions from the RDF data model.

Note that all IRIs are absolute; they may or may not include a fragment identifier [RFC3987, section 3.1]. Also note that IRIs include URIs [RFC3986] and URLs.

Queries can include blank nodes; the blank nodes in a query are disjoint from all blank nodes in the RDF graphs being matched and members of the set of variables.

The graph pattern may be the empty pattern. The set of solution modifiers may be the empty set.

2.3 Triple Patterns

The building blocks of queries are triple patterns. The following triple pattern has a subject variable (the variable book), a predicate of dc:title and an object variable (the variable title).

Matching a triple pattern to a graph gives bindings between variables and RDF Terms so that the triple pattern, with the variables replaced by the corresponding RDF terms, is a triple of the graph being matched.
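The matching step can be sketched informally in Python. The representation is an assumption made for this illustration: triples are tuples of strings, a variable is any string starting with "?", and the helper name match_triple is hypothetical. The single data triple mirrors the book title example used in this section.

```python
def is_var(term):
    # In this sketch a variable is any string starting with "?"; the "?" is
    # only a marker and is stripped when the binding is recorded.
    return term.startswith("?")

def match_triple(pattern, triple):
    """Return a dict of bindings if the pattern matches the triple, else None."""
    binding = {}
    for pattern_term, graph_term in zip(pattern, triple):
        if is_var(pattern_term):
            name = pattern_term[1:]
            # A variable used twice in one pattern must take the same value.
            if binding.setdefault(name, graph_term) != graph_term:
                return None
        elif pattern_term != graph_term:
            return None
    return binding

graph = [("<http://example.org/book/book1>", "dc:title", '"SPARQL"')]
pattern = ("?book", "dc:title", "?title")

solutions = []
for triple in graph:
    binding = match_triple(pattern, triple)
    if binding is not None:
        solutions.append(binding)

print(solutions)
```

Substituting the bindings back into the pattern reproduces a triple of the graph, which is exactly the condition stated above.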

Any SPARQL triple pattern with a literal as subject will fail to match on any RDF graph.

2.4 Pattern Solutions

has a single triple pattern as the query pattern. It matches a graph of a single triple:

?book	?title
<http://example.org/book/book1>	"SPARQL"

2.5 Basic Graph Patterns

For a basic graph pattern to match some dataset, there must be a solution where each of the triple patterns matches the dataset with that solution.

There is a blank node [12] in this dataset, identified by _:a. The label is only used within the file for encoding purposes. The label information is not in the RDF graph.

This query contains a basic graph pattern of two triple patterns, each of which must match for the graph pattern to match.

mbox
<mailto:jlow@example.com>

2.6 Multiple Matches

The results of a query are all the ways a query can match the graph being queried. Each result is one solution to the query and there may be zero, one or multiple results to a query.

name	mbox
"Johnny Lee Outlaw"	<mailto:jlow@example.com>
"Peter Goodguy"	<mailto:peter@example.org>

The results enumerate the RDF terms to which the selected variables can be bound in the query pattern. In the above example, the following two subsets of the data provided the two matches.

This is a simple, conjunctive graph pattern match, and all the variables used in the query pattern must be bound in every solution.
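A conjunctive match of this kind can be sketched as a nested-loop join over triple patterns. The code below is an informal illustration, not the formal definition: the tuple representation, the foaf: predicate names, the blank node labels and the extra _:c triple are all assumptions chosen so that the output reproduces the two rows shown above.

```python
def match_triple(pattern, triple, binding):
    """Extend a binding so that the pattern matches the triple, or return None."""
    extended = dict(binding)
    for pattern_term, graph_term in zip(pattern, triple):
        if pattern_term.startswith("?"):
            if extended.setdefault(pattern_term[1:], graph_term) != graph_term:
                return None
        elif pattern_term != graph_term:
            return None
    return extended

def match_bgp(patterns, graph):
    """Every triple pattern must match under the same solution."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for binding in solutions:
            for triple in graph:
                extended = match_triple(pattern, triple, binding)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions

graph = [
    ("_:a", "foaf:name", '"Johnny Lee Outlaw"'),
    ("_:a", "foaf:mbox", "<mailto:jlow@example.com>"),
    ("_:b", "foaf:name", '"Peter Goodguy"'),
    ("_:b", "foaf:mbox", "<mailto:peter@example.org>"),
    ("_:c", "foaf:mbox", "<mailto:carol@example.org>"),  # no name: cannot match
]
patterns = [("?x", "foaf:name", "?name"), ("?x", "foaf:mbox", "?mbox")]

rows = sorted((s["name"], s["mbox"]) for s in match_bgp(patterns, graph))
for row in rows:
    print(row)
```

The node labelled _:c contributes no solution because only one of the two triple patterns can match it, so the variable name would be left unbound.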

2.7 Blank Nodes

Blank Nodes and Queries

A blank node can appear in a query pattern. A blank node in a query pattern may match any RDF term.

Blank Nodes and Query Results

The presence of blank nodes can be indicated by labels in the serialization of query results. An application or client receiving the results of a query can tell that two solutions or two variable bindings differ in blank nodes but this information is only scoped to the results as defined in "SPARQL Variable Binding Results XML Format" or the CONSTRUCT result form.

The results above could equally be given with different blank node labels because the labels in the results only indicate whether RDF terms in the solutions were the same or different.

x	name
_:c	"Alice"
_:d	"Bob"

x	name
_:r	"Alice"
_:s	"Bob"

These two results have the same information: the blank nodes used to match the query are different in the two solutions. There is no relation between using _:a in the results and any blank node label in the data graph.
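One informal way to check that two result tables carry the same information is to rename blank node labels in order of first appearance and compare the outcome. The sketch below assumes rows are given in the same order in both tables and that blank nodes are written as strings starting with "_:"; the function name canonical is hypothetical.

```python
def canonical(rows):
    """Rename blank node labels by order of first appearance."""
    mapping = {}
    renamed_rows = []
    for row in rows:
        renamed = []
        for term in row:
            if term.startswith("_:"):
                # The fresh label is only used if the term is not yet mapped.
                term = mapping.setdefault(term, f"_:b{len(mapping)}")
            renamed.append(term)
        renamed_rows.append(tuple(renamed))
    return renamed_rows

first = [("_:c", '"Alice"'), ("_:d", '"Bob"')]
second = [("_:r", '"Alice"'), ("_:s", '"Bob"')]
# A different result: the same blank node is bound in both solutions.
same_node = [("_:r", '"Alice"'), ("_:r", '"Bob"')]

print(canonical(first) == canonical(second))
print(canonical(first) == canonical(same_node))
```

The labels themselves never survive the comparison; only the pattern of equal and different blank nodes does.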

2.8 Other Syntactic Forms

There are a number of syntactic forms that abbreviate some common sequences of triples. These syntactic forms do not change the meaning of the query.

Predicate-Object Lists

Triple patterns with a common subject can be written so that the subject is only written once, and used for more than one triple pattern by employing the ";" notation.

Object Lists

If triple patterns share both subject and predicate, then these can be written using the "," notation.

Blank Nodes

Blank nodes have labels which are scoped to the query. They are written as "_:a" for a blank node with label "a".

A blank node that is used in only one place in the query syntax can be abbreviated with []. A unique blank node will be created and used to form the triple pattern.

The [:p :v] construct can be used in triple patterns. It creates a blank node label which is used as the subject of all contained predicate-object pairs. The created blank node can also be used in further triple patterns in the subject and object positions.

allocate a unique blank node label (here "b57") and are equivalent to writing:

This allocated blank node label can be used as the subject or object of further triple patterns. For example, as a subject:

Abbreviated blank node syntax can be combined with other abbreviations for common predicates and common objects.

This is the same as writing the following basic graph pattern for some uniquely allocated blank node:

RDF Collections

RDF collections can be written in triple patterns using the syntax "( )". The form () is an alternative for the IRI rdf:nil which is http://www.w3.org/1999/02/22-rdf-syntax-ns#nil. When used with collection elements, such as (1 ?x 3) then triple patterns and blank nodes are allocated for the collection and the blank node at the head of the collection can be used as a subject or object in other triple patterns.
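The allocation of triple patterns for a collection can be sketched as follows, using the rdf:first and rdf:rest properties of the RDF collection vocabulary. The blank node labels (_:list0 and so on) and the function name are hypothetical; a processor may allocate any unique labels.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def expand_collection(elements):
    """Return (head, triple_patterns) for a collection written as ( e1 e2 ... )."""
    nil = f"<{RDF}nil>"
    if not elements:
        return nil, []  # the form () stands for rdf:nil
    # One blank node per element; the labels are illustrative only.
    nodes = [f"_:list{i}" for i in range(len(elements))]
    triples = []
    for i, element in enumerate(elements):
        rest = nodes[i + 1] if i + 1 < len(nodes) else nil
        triples.append((nodes[i], f"<{RDF}first>", element))
        triples.append((nodes[i], f"<{RDF}rest>", rest))
    return nodes[0], triples

head, triples = expand_collection(["1", "?x", "3"])
print(head)
for triple in triples:
    print(triple)
```

The returned head is the blank node that can then appear as a subject or object in other triple patterns.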

Other

The keyword "a" can be used as a predicate in a triple pattern and is an alternative for the IRI rdf:type, which is http://www.w3.org/1999/02/22-rdf-syntax-ns#type.

3 Working with RDF Literals

An RDF Literal is written in SPARQL as a string containing the lexical form of the literal, followed by an optional language tag or an optional datatype. There are convenience forms for numeric-typed literals, which are of type xsd:integer or xsd:double, and also for xsd:boolean.

3.1 Matching RDF Literals

Matching Integers

The pattern in the following query has a solution :x because 42 is syntax for "42"^^<http://www.w3.org/2001/XMLSchema#integer>.

Matching Arbitrary Datatypes

The following query has a solution with variable v being :y. The query processor does not have to have any understanding of the values in the space of the datatype because, in this case, lexical form and datatype IRI both match exactly.

Matching Language Tags

The following query has no solution because "cat" is not the same RDF literal as "cat"@en:

3.2 Value Constraints

Graph pattern matching creates bindings of variables. It is possible to further restrict solutions by constraining the allowable bindings of variables to RDF Terms. Value constraints take the form of boolean-valued expressions; the language also allows application-specific constraints on the values in a query solution.

title	price
"The Semantic Web"	23

By having a constraint on the "price" variable, only book2 matches the query because there is a restriction on the allowable values of "price".

3.3 Value Constraints – Definition

Constraints may be restrictions on the value of an RDF Term or they may be restrictions on some part of an RDF term, such as its lexical form. There is a set of functions and operators in SPARQL for constraints. In addition, there is an extension mechanism to provide access to functions that are not defined in the SPARQL language.

A constraint may lead to an error condition when testing some RDF term. The exact error will depend on the constraint: for example, in numeric operations, solutions with variables bound to a non-number or a blank node will lead to an error. Any potential solution that causes an error condition in a constraint will not form part of the final results, but does not cause the query to fail.
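The error behaviour described above can be sketched in Python: a solution is kept only when the constraint evaluates to true, and a solution that raises an error is dropped without failing the query. The solution dictionaries, the threshold of 30, the price of 42 and the last two rows are assumptions made for this illustration; only the title "The Semantic Web" with price 23 comes from the result shown earlier.

```python
def keep(solution, constraint):
    """A solution survives only if the constraint is true and raises no error."""
    try:
        return constraint(solution) is True
    except (TypeError, KeyError):
        return False  # an error removes the solution, not the whole query

def price_below_30(solution):
    price = solution["price"]  # KeyError if the variable is unbound
    if isinstance(price, bool) or not isinstance(price, (int, float)):
        raise TypeError("price is not numeric")
    return price < 30

solutions = [
    {"title": "SPARQL Tutorial", "price": 42},
    {"title": "The Semantic Web", "price": 23},
    {"title": "Untitled draft", "price": "_:b1"},  # blank node: type error
    {"title": "No price at all"},                   # unbound variable: error
]

kept = [s["title"] for s in solutions if keep(s, price_below_30)]
print(kept)
```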

4 Graph Patterns

Complex graph patterns can be made by combining simpler graph patterns. The ways of creating graph patterns are:

4.1 Group Graph Patterns

For any solution, the same variable is given the same value everywhere in the set of graph patterns making up the group graph pattern. A Basic Graph Pattern is a group of triple patterns. For example, this query has a group pattern of one basic graph pattern as the query pattern.

Because a solution to a group graph pattern is a solution to each element of the group, and a solution of a basic graph pattern is a solution to each triple pattern, these queries also have the same solutions as:

4.2 Unbound variables

Solutions to graph patterns do not necessarily have to have every variable bound in every solution that causes a graph pattern to be matched. In particular, the OPTIONAL and UNION graph patterns can lead to query results where a variable may be bound in some solutions, but not in others.

4.3 Order of Evaluation

There is no implied order of graph patterns within a Group Graph Pattern. Any solution for the group graph pattern that can satisfy all the graph patterns in the group is valid, independently of the order that may be implied by the lexical order of the graph patterns in the group.

5 Including Optional Values

Basic graph patterns allow applications to make queries where the entire query pattern must match for there to be a solution. For every solution of the query, every variable is bound to an RDF Term in a pattern solution. RDF is semi-structured: a regular, complete structure cannot be assumed, and it is useful to be able to have queries that allow information to be added to the solution where the information is available, but not to have the solution rejected because some part of the query pattern does not match. Optional matching provides this facility; if the optional part does not lead to any solutions, variables can be left unbound.

5.1 Optional Pattern Matching

Optional parts of the graph pattern may be specified syntactically with the OPTIONAL keyword applied to a graph pattern:

name	mbox
"Alice"	<mailto:alice@example.com>
"Bob"

There is no value of mbox in the solution where the name is "Bob". It is unbound.

This query finds the names of people in the data. If there is a triple with predicate mbox and the same subject, a solution will contain the object of that triple as well. In the example, only a single triple pattern is given in the optional match part of the query but, in general, it can be any graph pattern. The whole graph pattern of an optional graph pattern must match for the optional graph pattern to add to the query solution.
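Optional matching behaves much like a left outer join over solutions. The sketch below illustrates that idea under stated assumptions: solutions are dictionaries, the two input lists stand for the matches of the required and optional parts, and the blank node labels are hypothetical. It is not the formal definition.

```python
def compatible(left, right):
    """Two solutions are compatible if they agree on every shared variable."""
    return all(right[var] == value for var, value in left.items() if var in right)

def left_join(required, optional):
    """Extend each required solution where possible; otherwise pass it through."""
    results = []
    for left in required:
        extensions = [{**left, **right} for right in optional
                      if compatible(left, right)]
        results.extend(extensions if extensions else [left])
    return results

names = [{"x": "_:a", "name": "Alice"}, {"x": "_:b", "name": "Bob"}]
mboxes = [{"x": "_:a", "mbox": "<mailto:alice@example.com>"}]

rows = [(s["name"], s.get("mbox")) for s in left_join(names, mboxes)]
print(rows)
```

The solution for "Bob" is passed through unchanged, so the variable mbox is simply absent from it rather than bound to some placeholder value.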

5.2 Constraints in Optional Pattern Matching

title	price
"SPARQL Tutorial"
"The Semantic Web"	23

No price appears for the book with title "SPARQL Tutorial" because the optional graph pattern did not lead to a solution involving the variable price.

5.3 Multiple Optional Graph Patterns

Graph patterns are defined recursively. A graph pattern may have zero or more optional graph patterns, and any part of a query pattern may have an optional part. In this example, there are two optional graph patterns.

name	mbox	hpage
"Alice"		<http://work.example.org/alice/>
"Bob"	<mailto:bob@example.com>

5.4 Optional Matching – Formal Definition

In an optional match, either an additional graph pattern matches a graph, thereby defining one or more pattern solutions; or it passes any solutions without adding any additional bindings.

5.5 Nested Optional Graph Patterns

Optional patterns can occur inside any group graph pattern, including a group graph pattern which itself is optional, forming a nested pattern. The outer optional graph pattern must match for any nested optional pattern to be matched.

This query finds the name, optionally the mbox, and also the vCard given name; further, if there is a vCard Family name as well as the Given name, the query finds that as well.

foafName	mbox	gname	fname
"Alice"	<mailto:alice@work.example>	"Alice"	"Hacker"
"Bob"	<mailto:bob@work.example>
"Ella"		"Eleanor"

By nesting the optional pattern involving vcard:Family, the query only reaches these if there is a vcard:N predicate. Here the expression is a simple triple pattern on vcard:N but it could be a complex graph pattern with value constraints.

6 Matching Alternatives

SPARQL provides a means of combining graph patterns so that one of several alternative graph patterns may match. If more than one of the alternatives matches, all the possible pattern solutions are found.

6.1 Joining Patterns with UNION

title
"SPARQL Protocol Tutorial"
"SPARQL"
"SPARQL (updated)"
"SPARQL Query Language Tutorial"

This query finds titles of the books in the data, whether the title is recorded using Dublin Core properties from version 1.0 or version 1.1. If the application wishes to know how exactly the information was recorded, then the query:

x	y
	"SPARQL (updated)"
	"SPARQL Protocol Tutorial"
"SPARQL"
"SPARQL Query Language Tutorial"

will return results with the variables x or y bound depending on which way the query processor matches the pattern to the data. Note that, unlike an OPTIONAL pattern, if neither part of the UNION pattern matched, then the graph pattern would not match.

The UNION operator combines graph patterns, so more than one triple pattern can be given in each alternative possibility:

This query will only match a book if it has both a title and creator predicate from the same version of Dublin Core.

author	title
"Alice"	"SPARQL Protocol Tutorial"
"Bob"	"SPARQL Query Language Tutorial"

6.2 Union Matching – Formal Definition

Query results involving a pattern containing GP1 and GP2 will include separate solutions for each match where GP1 and GP2 give rise to different sets of bindings.
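Informally, the solutions of a UNION are the solutions of each alternative collected together. The sketch below assumes a small graph of tuples with hypothetical subjects and the abbreviated predicates dc10:title and dc11:title; the titles are those of the result tables in the previous section.

```python
def match(graph, predicate, variable):
    """Solutions of the single triple pattern (?book predicate ?variable),
    keeping only the binding for the selected variable."""
    return [{variable: obj} for (_subj, pred, obj) in graph if pred == predicate]

def union(left_solutions, right_solutions):
    """Each alternative contributes its own solutions."""
    return left_solutions + right_solutions

graph = [
    ("_:a", "dc10:title", "SPARQL Query Language Tutorial"),
    ("_:b", "dc11:title", "SPARQL Protocol Tutorial"),
    ("_:c", "dc10:title", "SPARQL"),
    ("_:c", "dc11:title", "SPARQL (updated)"),
]

solutions = union(match(graph, "dc10:title", "x"),
                  match(graph, "dc11:title", "y"))
for solution in solutions:
    print(solution)
```

Each solution binds exactly one of x and y, which is how an application can tell which alternative produced it.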

7 RDF Dataset

The RDF data model expresses information as graphs comprising triples with subject, predicate and object. Many RDF data stores hold multiple RDF graphs, and record information about each graph, allowing an application to make queries that involve information from more than one graph.

A SPARQL query is executed against an RDF Dataset which represents such a collection of graphs. Different parts of the query may be matched against different graphs as described in the next section. There is one graph, the default graph, which does not have a name, and zero or more named graphs, each identified by IRI.

In the previous sections, all queries have been shown executed against a single, default graph. A query does not need to involve the default graph; the query can just involve matching named graphs.

7.1 Examples of RDF Datasets

The definition of RDF Dataset does not restrict the relationships of named and default graphs. Two useful arrangements are:

Example 1:

In this example, the default graph contains the names of the publishers of two named graphs. The triples in the named graphs are not visible in the default graph in this example.

Example 2:

RDF data can be combined by RDF merge [RDF-MT] of graphs so that the default graph can be made to include the RDF merge of some or all of the information in the named graphs.

In this next example, the named graphs contain the same information as before. The RDF dataset includes an RDF merge of the named graphs in the default graph, re-labeling blank nodes to keep them distinct.

8 Querying the Dataset

When querying a collection of graphs, the GRAPH keyword is used to match patterns against named graphs. This is by either using an IRI to select a graph or using a variable to range over the IRIs naming graphs.
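Both uses of GRAPH can be sketched with a dictionary from graph IRIs to graphs. The data below is an assumption modelled on the FOAF examples of this section (two named graphs, each recording a nick for the person with Bob's mailbox), and the helper names are hypothetical.

```python
def nicks_for_mbox(graph, mbox):
    """Nicks of whoever has the given mailbox, within a single graph."""
    people = {s for (s, p, o) in graph if p == "foaf:mbox" and o == mbox}
    return [o for (s, p, o) in graph if p == "foaf:nick" and s in people]

def graph_with_variable(named_graphs, mbox):
    """GRAPH with a variable: range over the IRIs naming graphs."""
    return [
        {"src": iri, "bobNick": nick}
        for iri, graph in named_graphs.items()
        for nick in nicks_for_mbox(graph, mbox)
    ]

BOB = "<mailto:bob@work.example>"
named_graphs = {
    "<http://example.org/foaf/aliceFoaf>": [
        ("_:a", "foaf:mbox", "<mailto:alice@work.example>"),
        ("_:b", "foaf:mbox", BOB),
        ("_:b", "foaf:nick", "Bobby"),
    ],
    "<http://example.org/foaf/bobFoaf>": [
        ("_:z", "foaf:mbox", BOB),
        ("_:z", "foaf:nick", "Robert"),
    ],
}

rows = graph_with_variable(named_graphs, BOB)
print(rows)

# GRAPH with an IRI: select one graph and match only against it.
print(nicks_for_mbox(named_graphs["<http://example.org/foaf/bobFoaf>"], BOB))
```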

8.1 Accessing Graph Names

The query below matches the graph pattern on each of the named graphs in the dataset and forms solutions which have the src variable bound to IRIs of the graph being matched.

The query result gives the name of the graphs where the information was found and the value for Bob's nick:

src	bobNick
<http://example.org/foaf/aliceFoaf>	"Bobby"
<http://example.org/foaf/bobFoaf>	"Robert"

8.2 Restricting by Graph IRI

The query can restrict the matching applied to a specific graph by supplying the graph IRI. This query looks for Bob's nick as given in the graph http://example.org/foaf/bobFoaf.

nick
"Robert"

8.3 Restricting by Bound Variables

A variable used in the GRAPH clause may also be used in another GRAPH clause or in a graph pattern matched against the default graph in the dataset.

This can be used to find information in one part of a query, and thus restrict the graphs matched in another part of the query. The query below uses the graph with IRI http://example.org/foaf/aliceFoaf to find the profile document for Bob; it then matches another pattern against that graph. The pattern in the second GRAPH clause finds the blank node (variable w) for the person with the same mail box (given by variable mbox) as found in the first GRAPH clause (variable whom), because the blank node used to match for variable whom from Alice's FOAF file is not the same as the blank node in the profile document (they are in different graphs).

mbox	nick	ppd
<mailto:bob@work.example>	"Robert"	<http://example.org/foaf/bobFoaf>

Any triple in Alice's FOAF file giving Bob's nick is not used to provide a nick for Bob because the pattern involving variable nick is restricted by ppd to a particular Personal Profile Document.

8.4 Named and Default Graphs

Query patterns can involve both the default graph and the named graphs. In this example, an aggregator has read in a Web resource on two different occasions. Each time a graph is read into the aggregator, it is given an IRI by the local system. The graphs are nearly the same but the email address for "Bob" has changed.

The default graph is used to record the provenance information, and the RDF data actually read is kept in two separate graphs, each of which is given a different IRI by the system. The RDF dataset consists of two named graphs and the information about them.

This query finds email addresses, detailing the name of the person and the date the information was discovered.

name	mbox	date
"Bob"	<mailto:bob@oldcorp.example.org>	"2004-12-06"^^xsd:date
"Bob"	<mailto:bob@newcorp.example.org>	"2005-01-10"^^xsd:date

9 Specifying RDF Datasets

A SPARQL query may specify the dataset to be used for matching. The FROM clauses give IRIs that the query processor can use to create the default graph and the FROM NAMED clause can be used to specify named graphs. The RDF dataset may also be specified in a SPARQL protocol request, in which case the protocol description overrides any description in the query itself.

A query processor may use these IRIs in any way to associate an RDF Dataset with a query. For example, it could use IRIs to retrieve documents, parse them and use the resulting triples as one of the graphs; alternatively, it might only service queries that specify IRIs of graphs that it already has stored.

The FROM and FROM NAMED keywords allow a query to specify an RDF dataset by reference; they indicate that the dataset should include graphs that are obtained from representations of the resources identified by the given IRIs (i.e. the absolute form of the given IRI references). The dataset resulting from a number of FROM and FROM NAMED clauses is:

If a query provides such a dataset description, then it is used in place of any dataset that the query service would use if no dataset description is provided in a query.

9.1 Specifying the Default Graph

The FROM clause contains an IRI that indicates the graph to be used to form the default graph. This does not automatically put the graph in as a named graph; a query can do this by also specifying the graph in the FROM NAMED clause.

name
"Alice"

If a query provides more than one FROM clause, providing more than one IRI to indicate the default graph, then the default graph is based on the RDF merge of the graphs obtained from representations of the resources identified by the given IRIs.
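The RDF merge used here differs from a plain set union in that blank nodes from different source graphs are kept apart. The sketch below illustrates this under simple assumptions: graphs are sets of string tuples, and blank nodes are renamed with a hypothetical per-graph prefix, which is only safe if the sources do not already use labels of that form.

```python
def rdf_merge(graphs):
    """Union of graphs after renaming blank nodes so that labels coming
    from different source graphs stay distinct."""
    merged = set()
    for index, graph in enumerate(graphs):
        def rename(term, index=index):
            # Prefix the label with the position of its source graph.
            return f"_:g{index}_{term[2:]}" if term.startswith("_:") else term
        merged.update(tuple(rename(term) for term in triple) for triple in graph)
    return merged

# Both source documents happen to use the label _:a for different people.
graph1 = {("_:a", "foaf:name", '"Alice"')}
graph2 = {("_:a", "foaf:name", '"Bob"')}

merged = rdf_merge([graph1, graph2])
subjects = {subject for (subject, _predicate, _object) in merged}
print(sorted(merged))
```

Without the renaming step, the default graph would wrongly describe a single node with two names.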

9.2 Specifying Named Graphs

A query can supply IRIs for the named graphs in the RDF Dataset using the FROM NAMED clause. Each IRI is used to provide one named graph in the RDF Dataset.

src	name
<http://example.org/bob>	"Bob"
<http://example.org/alice>	"Alice"

The FROM NAMED syntax suggests that the IRI identifies the corresponding graph, but actually the relationship between an IRI and a graph in an RDF dataset is indirect. The IRI identifies a resource, and the resource is represented by a graph (or, more precisely, by a document that serializes a graph). For further details see [20].

9.3 Combining FROM and FROM NAMED

who	g	mbox
"Bob Hacker"	<http://example.org/bob>	<mailto:bob@oldcorp.example.org>
"Alice Hacker"	<http://example.org/alice>	<mailto:alice@work.example.org>

This query finds the mbox together with the information in the default graph about the publisher. <http://example.org/dft.ttl> is just the IRI used to form the default graph, not its name.

10 Query Result Forms

SPARQL has four query result forms. These result forms use the solutions from pattern matching to form result sets or RDF graphs. The query forms are:

10.1 Solution Sequences and Result Forms

Query patterns generate an unordered collection of solutions, each solution being a function from variables to RDF terms. These solutions are then treated as a sequence, initially in no specific order; any sequence modifiers are then applied to create another sequence. Finally, this latter sequence is used to generate one of the SPARQL result forms.

The effect of applying these controls is as if they are applied in the order given.

10.1.1 Projection

The solution sequence can be transformed into one involving only a subset of the variables. For each solution in the sequence, a new solution is formed using a specified selection of the variables.

The following example shows a query to extract just the names of people described in an RDF graph using FOAF properties.

name
"Bob"
"Alice"

10.1.2 DISTINCT

The solution sequence can be modified by adding the DISTINCT keyword which ensures that every combination of variable bindings (i.e. each solution) in the sequence is unique.

name
"Alice"

If DISTINCT is specified together with LIMIT or OFFSET, then duplicates are eliminated before the limit or offset is applied.

10.1.3 ORDER BY

The ORDER BY clause takes a solution sequence and applies ordering conditions. An ordering condition can be a variable or a function call. The direction of ordering is ascending by default. It can be explicitly set to ascending or descending by enclosing the condition in ASC() or DESC() respectively. If multiple conditions are given, then they are applied in turn until one gives the indication of the ordering.

Using ORDER BY on a solution sequence for a result form other than SELECT has no direct effect because only SELECT returns a sequence of results. In combination with LIMIT and OFFSET, it can be used to return partial results.

The "<" operator (see the Operator Mapping Table) defines the relative order of pairs of numerics, xsd:dateTimes and xsd:strings. SPARQL defines a fixed, arbitrary order between some kinds of RDF terms that would not otherwise be ordered. This arbitrary order is necessary to provide slicing of query solutions by use of LIMIT and OFFSET.

If the ordering criteria do not specify the order of values, then the ordering in the solution sequence is undefined.

Ordering a sequence of solutions always results in a sequence with the same number of solutions in it, even if the ordering criteria do not differentiate between two solutions.

10.1.4 LIMIT

The LIMIT form puts an upper bound on the number of solutions returned. If the number of actual solutions is greater than the limit, then at most the limit number of solutions will be returned.

A limit of 0 would cause no results to be returned. A limit may not be negative.

10.1.5 OFFSET

OFFSET causes the solutions generated to start after the specified number of solutions. An OFFSET of zero has no effect.

The order in which solutions are returned is initially undefined. Using LIMIT and OFFSET to select different subsets of the query solutions will not be useful unless the order is made predictable by using ORDER BY.
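The interaction of DISTINCT, ORDER BY, OFFSET and LIMIT described in this section can be sketched as a small pipeline over a list of solutions. The function and its parameter names are hypothetical, and the placement of the sorting step relative to duplicate elimination is a choice made for this sketch; the only ordering taken from the text is that duplicates are removed before the offset and limit are applied.

```python
def apply_modifiers(solutions, order_key=None, descending=False,
                    distinct=False, offset=0, limit=None):
    """Apply DISTINCT, ORDER BY, OFFSET and LIMIT to a list of solution dicts."""
    if limit is not None and limit < 0:
        raise ValueError("a limit may not be negative")
    rows = list(solutions)
    if distinct:
        seen, unique = set(), []
        for row in rows:
            key = tuple(sorted(row.items()))
            if key not in seen:
                seen.add(key)
                unique.append(row)
        rows = unique
    if order_key is not None:
        # sorted() keeps every row, even when keys compare equal.
        rows = sorted(rows, key=lambda row: row[order_key], reverse=descending)
    rows = rows[offset:]
    return rows if limit is None else rows[:limit]

solutions = [{"name": "Bob"}, {"name": "Alice"}, {"name": "Alice"}, {"name": "Carol"}]

page = apply_modifiers(solutions, order_key="name", distinct=True, offset=1, limit=1)
print(page)
```

With the order fixed by the sort, successive calls with increasing offsets walk through the solutions page by page without skipping or repeating any of them.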

10.2 Selecting Variables

The SELECT form of results returns the variables directly. The syntax SELECT * is an abbreviation that selects all of the named variables.

Results can be thought of as a table or result set, with one row per query solution. Some cells may be empty because a variable is not bound in that particular solution.

nameX	nameY	nickY
"Alice"	"Bob"
"Alice"	"Clare"	"CT"

Result sets can be accessed by the local API but also can be serialized into either XML or an RDF graph. The SPARQL Query Results XML Format form of this result set gives:

10.3 Constructing an Output Graph

The CONSTRUCT result form returns a single RDF graph specified by a graph template. The result is an RDF graph formed by taking each query solution in the solution sequence, substituting for the variables into the graph template and combining the triples into a single RDF graph by set union.

If any such instantiation produces a triple containing an unbound variable, or an illegal RDF construct (such as a literal in subject or predicate position), then that triple is not included in the RDF graph, and a warning may be generated. The graph template may contain ground or explicit triples, that is, triples with no variables, and these also appear in the RDF graph returned by the CONSTRUCT query form.
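Template instantiation can be sketched as substitution followed by set union. In the illustration below the template, the solutions and the IRIs are assumptions; literals are recognised by a leading double quote, and the sketch silently skips triples that cannot be formed instead of issuing a warning.

```python
def construct(template, solutions):
    """Instantiate the template once per solution and union the triples."""
    graph = set()
    for solution in solutions:
        for triple in template:
            instantiated = tuple(
                solution.get(term[1:]) if term.startswith("?") else term
                for term in triple
            )
            if None in instantiated:
                continue  # unbound variable: the triple is not included
            if instantiated[0].startswith('"') or instantiated[1].startswith('"'):
                continue  # literal as subject or predicate: illegal in RDF
            graph.add(instantiated)
    return graph

template = [
    ("?x", "foaf:name", "?name"),
    # A ground triple: no variables, so it is the same for every solution.
    ("<http://example.org/doc>", "rdf:type", "<http://example.org/Report>"),
]
solutions = [
    {"x": "<http://example.org/alice>", "name": '"Alice"'},
    {"x": "<http://example.org/bob>"},                 # name is unbound
    {"x": '"oops"', "name": '"Literal subject"'},      # would be illegal
]

result = construct(template, solutions)
for triple in sorted(result):
    print(triple)
```

Because the result is a set, the ground triple appears once even though it is produced for every solution.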

10.3.1 Templates with Blank Nodes

A template can create an RDF graph containing blank nodes. The blank node labels are scoped to the template for each solution. If the same label occurs twice in a template, then there will be one blank node created for each query solution but there will be different blank nodes across triples generated by different query solutions.

The use of variable ?x in the template, which in this example will be bound to blank nodes (which have labels _:a and _:b in the data) causes different blank node labels (_:v1 and _:v2) as shown by the results.
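The per-solution scoping of blank node labels in a template can be sketched as follows. The template, the vcard: predicate names, the solutions and the generated labels (_:v1, _:v2) are assumptions for this illustration, and every variable in the template is assumed to be bound in every solution.

```python
import itertools

def instantiate_with_bnodes(template, solutions):
    """Allocate fresh blank nodes for template labels, once per solution."""
    counter = itertools.count(1)
    graph = set()
    for solution in solutions:
        local = {}  # template label -> fresh label, reset for each solution
        for triple in template:
            instantiated = []
            for term in triple:
                if term.startswith("_:"):
                    if term not in local:
                        local[term] = f"_:v{next(counter)}"
                    term = local[term]
                elif term.startswith("?"):
                    term = solution[term[1:]]  # assumes the variable is bound
                instantiated.append(term)
            graph.add(tuple(instantiated))
    return graph

template = [
    ("_:p", "vcard:givenName", "?gname"),
    ("_:p", "vcard:familyName", "?fname"),
]
solutions = [
    {"gname": '"Alice"', "fname": '"Hacker"'},
    {"gname": '"Bob"', "fname": '"Hacker"'},
]

result = instantiate_with_bnodes(template, solutions)
subjects = {subject for (subject, _predicate, _object) in result}
print(sorted(result))
```

The two triples built from one solution share a subject, while triples built from different solutions never do.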

10.3.2 Accessing Graphs in the RDF Dataset

Using CONSTRUCT, it is possible to extract parts or the whole of graphs from the target RDF dataset. This first example returns the graph (if it is in the dataset) with IRI label http://example.org/aGraph; otherwise, it returns an empty graph.

The access to the graph can be conditional on other information. Suppose the default graph contains metadata about the named graphs in the dataset, then a query like the following one can extract one graph based on information about the named graph:

where app:customDate identifies an extension function to turn the data format into an xsd:dateTime RDF Term.

10.3.3 Solution Modifiers and CONSTRUCT

The solution modifiers of a query affect the results of a CONSTRUCT query. In this example, the output graph from the CONSTRUCT template is formed from just 2 of the solutions from graph pattern matching. The query outputs a graph with the names of the people with the top 2 sites, rated by hits. The triples in the RDF graph are not ordered.

10.4 Descriptions of Resources

The DESCRIBE form returns a single result RDF graph containing RDF data about resources. This data is not prescribed by a SPARQL query, where the query client would need to know the structure of the RDF in the data source, but, instead, is determined by the SPARQL query processor.

The query pattern is used to create a result set. The DESCRIBE form takes each of the resources identified in a solution, together with any resources directly named by IRI, and assembles a single RDF graph by taking a "description" from the target knowledge base. The description is determined by the query service.

If a data source has no information about a resource, no RDF triples are added to the result graph but the query does not fail.

10.4.1 Explicit IRIs

The DESCRIBE clause itself can take IRIs to identify the resources. The simplest DESCRIBE query is just an IRI in the DESCRIBE clause:

10.4.2 Identifying Resources

The resources can also be a query variable from a result set. This enables description of resources whether they are identified by IRI or by blank node in the dataset:

The property foaf:mbox is defined as being an inverse functional property in the FOAF vocabulary. If treated as such, this query will return information about at most one person. If, however, the query pattern has multiple solutions, the RDF data for each is the union of all RDF graph descriptions.

10.4.3 Descriptions of Resources

The RDF returned is determined by the information publisher. It is the useful information the service has about a resource. It may include information about other resources: the RDF data for a book may also include details about the author.

might return a description of the employee and some other potentially useful details:

which includes the blank node closure for the vcard vocabulary vcard:N. For a vocabulary such as FOAF, where the resources are typically blank nodes, it would be appropriate to return sufficient information to identify a node, such as the InverseFunctionalProperty foaf:mbox_sha1sum, as well as information such as name and other recorded details. In the example, the match to the WHERE clause was returned, but this is not required.

10.5 Asking "yes or no" questions

Applications can use the ASK form to test whether or not a query pattern has a solution. No information is returned about the possible query solutions, just whether the server can find one or not.

On the same data, the following returns no match because Alice's mbox is not mentioned.

11 Testing Values

SPARQL FILTERs restrict the set of solutions according to the given expression. Specifically, FILTERs eliminate any solutions that, when substituted into the expression, either result in an effective boolean value of false or produce a type error. Effective boolean values are defined in section 11.2.2 Effective Boolean Value; type errors are defined in XQuery 1.0: An XML Query Language [18] section 2.3.1, Kinds of Errors.
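A small Python sketch may help make the elimination rule concrete. It is only an illustration: solutions are modelled as dictionaries, and the exception class `SparqlTypeError`, the helper `apply_filter` and the test expression `price_below_30` are hypothetical names introduced here.

```python
class SparqlTypeError(Exception):
    """Stands in for a type error raised while evaluating an expression."""

def apply_filter(solutions, expression):
    """Keep only the solutions for which the expression evaluates to true."""
    kept = []
    for solution in solutions:
        try:
            if expression(solution):
                kept.append(solution)
        except SparqlTypeError:
            pass  # a type error eliminates the solution, just like false
    return kept

def price_below_30(solution):
    """Hypothetical FILTER expression: ?price < 30."""
    price = solution.get("price")
    if not isinstance(price, (int, float)):
        raise SparqlTypeError("?price is unbound or not numeric")
    return price < 30

solutions = [
    {"title": "A", "price": 42},
    {"title": "B", "price": 23},
    {"title": "C", "price": "cheap"},   # wrong type: type error
    {"title": "D"},                     # ?price unbound: type error
]
kept = apply_filter(solutions, price_below_30)
```

Only the solution for title "B" survives: one solution is rejected because the expression is false, and two because evaluation produces a type error.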

SPARQL expressions are constructed according to the grammar and provide access to named functions and syntactically constructed operations. The operands of these functions and operators are the subset of XML Schema DataTypes {xsd:string, xsd:decimal, xsd:double, xsd:dateTime} and types derived from xsd:decimal. The SPARQL operations are listed in table 11.1 and are associated with their productions in the grammar. In addition, SPARQL imports a subset of the XPath casting functions, listed in table 11.2, which are invokable by name within a SPARQL query. These functions and operators are taken from the XQuery 1.0 and XPath 2.0 Functions and Operators [17].

As described above, RDF Terms are made of IRIs, Literals and Blank Nodes. RDF Literals may have datatypes in the instance data:

The first dc:date arc has no type information. The second is tagged with the type xsd:dateTime. SPARQL operators compare the values of typed literals:

The namespace for XPath functions that are directly available by name is http://www.w3.org/2004/07/xpath-functions. The associated namespace prefix used in this document is fn:. XPath operators are named with the prefix op:, XML Schema datatypes with the prefix xsd:, and types of RDF Schema terms with the prefix rdfs:. SPARQL operators are named with the prefix sop:.

11.1 Operand Data Types

SPARQL defines a subset of the XPath functions and operators with operands of the following XML Schema datatypes [XMLSCHEMA-2]:

11.1.1 Type Promotion

XPath defines a set of Numeric Type Promotions. Numeric operators are defined for the following three primitive XML Schema numeric types:

These invoke XQuery's numeric type promotion to cast function arguments to the appropriate type. In summary: each of the numeric types is promoted to any type earlier in the above ordered list when used as an argument to a function expecting that higher type. When an argument is promoted, the value is cast to the expected type. For instance, a "7"^^xsd:decimal will be converted to a "7.0E0"^^xsd:double when passed to an argument expecting an xsd:double. Promotion does not change the bindings of variables.

The operators defined below that take numeric arguments expect all arguments to be the same type. This is accomplished by promoting the argument with the lower type to the same type as the other argument. For example, "7"^^xsd:decimal + "6.5"^^xsd:float would call op:numeric-add("7"^^xsd:float, "6.5"^^xsd:float). In addition, any rdfs:Literal may be cast to xsd:string or xsd:numeric when used as an argument to an operator expecting that type.
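The promotion of mixed operands can be sketched in Python. The sketch assumes the XPath promotion order xsd:decimal, then xsd:float, then xsd:double; the helpers `promote` and `numeric_add` and the (value, datatype) pair encoding are hypothetical, and a Python float stands in for both xsd:float and xsd:double.

```python
from decimal import Decimal

# Promotion rank, following XPath numeric type promotion (assumed order):
# xsd:decimal -> xsd:float -> xsd:double
RANK = {"xsd:decimal": 0, "xsd:float": 1, "xsd:double": 2}

def promote(value, datatype, target):
    """Cast (value, datatype) to target; promotion never goes downwards."""
    if RANK[target] < RANK[datatype]:
        raise ValueError(f"cannot promote {datatype} to {target}")
    if target == "xsd:decimal":
        return Decimal(value), target
    return float(value), target  # Python float models both float and double

def numeric_add(a, b):
    """Promote both operands to the higher of the two types, then add."""
    target = max(a[1], b[1], key=RANK.get)
    (x, _), (y, _) = promote(*a, target), promote(*b, target)
    return x + y, target

total = numeric_add((Decimal("7"), "xsd:decimal"), (6.5, "xsd:float"))
as_double = promote(Decimal("7"), "xsd:decimal", "xsd:double")
```

The operands themselves are left untouched; only the copies passed to the addition are cast, which mirrors the statement that promotion does not change the bindings of variables.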

XML Schema defines a set of types derived from decimal: integer; nonPositiveInteger; negativeInteger; long; int; short; byte; nonNegativeInteger; unsignedLong; unsignedInt; unsignedShort; unsignedByte and positiveInteger. These are all treated as decimals for arithmetic operations in FILTERs. SPARQL does not specifically require integrity checks on derived subtypes. SPARQL has no numeric type test operators so the distinction between a primitive type and a type derived from that primitive type is unobservable.

11.2 SPARQL Functions and Operators

SPARQL provides a subset of the functions and operators defined by XQuery Operator Mapping. XQuery 1.0 section 2.2.3 Expression Processing describes the invocation of XPath functions. The following rules accommodate the differences in the data and execution models between XQuery and SPARQL:

The logical-and and logical-or truth table for true, false, and error is as follows:

A	B	A \|\| B	A && B
E	E	E	E
E	T	T	E
E	F	E	F
T	E	T	E
T	T	T	T
T	F	T	F
F	E	E	F
F	T	T	F
F	F	F	F
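The three-valued logic tabulated above can be modelled in a few lines of Python. This is a sketch under the assumption that an error is represented by a sentinel object; the names `ERROR`, `logical_or` and `logical_and` are introduced here for illustration only.

```python
ERROR = object()  # sentinel standing for an evaluation error (assumption)

def logical_or(a, b):
    """A || B: true wins over error, error wins over false."""
    if a is True or b is True:
        return True
    if a is ERROR or b is ERROR:
        return ERROR
    return False

def logical_and(a, b):
    """A && B: false wins over error, error wins over true."""
    if a is False or b is False:
        return False
    if a is ERROR or b is ERROR:
        return ERROR
    return True
```

An error in one operand is therefore masked whenever the other operand alone decides the result: true for logical-or, false for logical-and.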

11.2.1 Invocation

SPARQL defines a syntax for invoking functions and operators on a list of arguments. These are invoked as follows:

If any of these steps fails, the invocation generates an error. The effects of type errors are defined in SPARQL Functions and Operators.

11.2.2 Effective Boolean Value

When an operand is coerced to xsd:boolean through invoking a function that takes a xsd:boolean argument, the following rules apply:

Table 11.1 Operator Mapping

The SPARQL grammar identifies a set of operators (for instance, &&, *, isIRI) used to construct constraints. The following table associates each of these grammatical productions with an operator defined by either the XQuery Operator Mapping or the additional SPARQL operators specified in section 11.2.3. When selecting the operator definition for a given set of parameters, the definition with the most specific parameters applies. For instance, when evaluating xsd:integer = xsd:signedInt, the definition for = with two numeric parameters applies, rather than the one with two RDF terms. The table is arranged so that the upper-most viable candidate is the most specific.

Some of the operators are associated with nested function expressions, e.g. fn:not(op:numeric-equal(A, B)). Note that per the XPath definitions, fn:not and op:numeric-equal return an error if their argument is an error.
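The "upper-most viable candidate" rule can be sketched as an ordered lookup. The Python below is a simplified illustration for the = operator only: plain Python values stand in for RDF terms, the type tests `is_numeric` and `is_rdf_term` and the function `select_equals` are hypothetical, and the term-equality lambda is a rough stand-in for sop:RDFterm-equal.

```python
import operator

def is_numeric(term):
    """True for Python numbers, which model numeric typed literals here."""
    return isinstance(term, (int, float)) and not isinstance(term, bool)

def is_rdf_term(term):
    """Every value in this sketch counts as an RDF term."""
    return True

# Ordered from most specific to least specific, as in the operator table.
EQUALS_CANDIDATES = [
    ("op:numeric-equal", is_numeric, operator.eq),
    ("sop:RDFterm-equal", is_rdf_term,
     lambda a, b: type(a) is type(b) and a == b),
]

def select_equals(a, b):
    """Apply the first definition whose parameter types accept both operands."""
    for name, accepts, function in EQUALS_CANDIDATES:
        if accepts(a) and accepts(b):
            return name, function(a, b)
    raise TypeError("no viable operator definition")
```

With two numeric operands the numeric definition is chosen even though the RDF term definition would also accept them, because it appears first.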

SPARQL Unary Operators
Operator	Type(A)	Function	Result type
XQuery Connectives
! A	xsd:boolean	fn:not(A)	xsd:boolean
+ A	numeric	op:numeric-unary-plus(A)	numeric
- A	numeric	op:numeric-unary-minus(A)	numeric
SPARQL Tests, defined in section 11.2.3
BOUND(A)	variable	sop:isBound(A)	xsd:boolean
isIRI(A)	RDF term	sop:isIRI(A)	xsd:boolean
isBLANK(A)	RDF term	sop:isBlank(A)	xsd:boolean
isLITERAL(A)	RDF term	sop:isLiteral(A)	xsd:boolean
SPARQL Accessors
STR(A)	rdf:literal	sop:str(A)	xsd:string
STR(A)	xsd:anyURI	sop:str(A)	xsd:string
LANG(A)	rdf:literal	sop:lang(A)	xsd:string
DATATYPE(A)	rdf:literal	sop:datatype(A)	xsd:anyURI

SPARQL Binary Operators
Operator	Type(A)	Type(B)	Function	Result type
Logical Connectives
A \|\| B	xsd:boolean	xsd:boolean	sop:logical-or(A, B)	xsd:boolean
A && B	xsd:boolean	xsd:boolean	sop:logical-and(A, B)	xsd:boolean
XPath Tests
A = B	numeric	numeric	op:numeric-equal(A, B)	xsd:boolean
A = B	xsd:string	xsd:string	op:numeric-equal(fn:compare(A, B), 0)	xsd:boolean
A = B	xsd:dateTime	xsd:dateTime	op:dateTime-equal(A, B)	xsd:boolean
A != B	numeric	numeric	fn:not(op:numeric-equal(A, B))	xsd:boolean
A != B	xsd:string	xsd:string	fn:not(op:numeric-equal(fn:compare(A, B), 0))	xsd:boolean
A != B	xsd:dateTime	xsd:dateTime	fn:not(op:dateTime-equal(A, B))	xsd:boolean
A < B	numeric	numeric	op:numeric-less-than(A, B)	xsd:boolean
A < B	xsd:string	xsd:string	op:numeric-equal(fn:compare(A, B), -1)	xsd:boolean
A < B	xsd:dateTime	xsd:dateTime	op:dateTime-less-than(A, B)	xsd:boolean
A > B	numeric	numeric	op:numeric-greater-than(A, B)	xsd:boolean
A > B	xsd:string	xsd:string	op:numeric-equal(fn:compare(A, B), 1)	xsd:boolean
A > B	xsd:dateTime	xsd:dateTime	op:dateTime-greater-than(A, B)	xsd:boolean
A <= B	numeric	numeric	sop:logical-or(op:numeric-less-than(A, B), op:numeric-equal(A, B))	xsd:boolean
A <= B	xsd:string	xsd:string	fn:not(op:numeric-equal(fn:compare(A, B), 1))	xsd:boolean
A <= B	xsd:dateTime	xsd:dateTime	fn:not(op:dateTime-greater-than(A, B))	xsd:boolean
A >= B	numeric	numeric	sop:logical-or(op:numeric-greater-than(A, B), op:numeric-equal(A, B))	xsd:boolean
A >= B	xsd:string	xsd:string	fn:not(op:numeric-equal(fn:compare(A, B), -1))	xsd:boolean
A >= B	xsd:dateTime	xsd:dateTime	fn:not(op:dateTime-less-than(A, B))	xsd:boolean
A * B	numeric	numeric	op:numeric-multiply(A, B)	numeric
A / B	numeric	numeric	op:numeric-divide(A, B)	numeric; but xsd:decimal if both operands are xsd:integer
A + B	numeric	numeric	op:numeric-add(A, B)	numeric
A - B	numeric	numeric	op:numeric-subtract(A, B)	numeric
SPARQL Tests, defined in section 11.2.3
A = B	RDF term	RDF term	sop:RDFterm-equal(A, B)	xsd:boolean
A != B	RDF term	RDF term	fn:not(sop:RDFterm-equal(A, B))	xsd:boolean
langMATCHES(A, B)	xsd:string	xsd:string	sop:langMatches(A, B)	xsd:boolean
REGEX(STRING, PATTERN)	xsd:string	xsd:string	fn:matches(STRING, PATTERN)	xsd:boolean

SPARQL Trinary Operators
Operator	Type(A)	Type(B)	Type(C)	Function	Result type
SPARQL Tests, defined in section 11.2.3
REGEX(STRING, PATTERN, FLAGS)	xsd:string	xsd:string	xsd:string	fn:matches(STRING, PATTERN, FLAGS)	xsd:boolean

11.2.3 Operators introduced in SPARQL

This section defines the operators introduced by the SPARQL Query language. The names of the operators are prefixed with sop:. The examples show the behavior of the operators as invoked by the appropriate grammatical constructs.

11.2.3.1 sop:bound

Returns true if a var is bound to a value. Returns false otherwise. Variables with the value NaN or INF are considered bound. See 4.2 Unbound Variables for a discussion of why variables may be unbound.

givenName
"Bob"

One may test that a graph pattern is not expressed by specifying an optional graph pattern that introduces a variable and testing to see that the variable is not bound. This is called Negation as Failure in logic programming.

name
"Alice"

11.2.3.2 sop:isIRI

name	mbox
"Alice"	<mailto:alice@work.example>

11.2.3.3 sop:isBlank

given	family
"Bob"	"Smith"

In this example, there were two objects of foaf:knows predicates, but only one (_:c) was a blank node.

11.2.3.4 sop:isLiteral

This query is similar to the one in 1.2.1.3 except that it matches the people with a name and an mbox which is a Literal. This could be used to look for erroneous data (foaf:mbox should only have a URI as its object).

name	mbox
"Bob"	"bob@work.example"

11.2.3.5 sop:str

Returns the lexical form of literal (an rdf:literal); returns the codepoint representation of iri (an xsd:anyURI). This is useful for examining parts of an IRI, for instance, the host-name.

This query selects the set of people who use their work.example address in their foaf profile:

name	mbox
"Alice"	<alice@work.example>

11.2.3.6 sop:lang

Returns an [RFC3066] language tag representing the XML schema language datatype of arg if arg has a language tag. It returns "" if the argument is a literal with no language tag.

name	mbox
"Roberto"@ES	<mailto:bob@work.example>

11.2.3.7 sop:datatype

Returns the datatype of arg if arg is a typed literal. It returns <xsd:string> if the argument is an untyped literal.

name	shoeSize
"Bob"	42

11.2.3.8 sop:logical-or

Returns a logical OR of left and right. As with other functions and operators with boolean arguments, sop:logical-or operates on the effective boolean value of its arguments.

11.2.3.9 sop:logical-and

Returns a logical AND of left and right. As with other functions and operators with boolean arguments, sop:logical-and operates on the effective boolean value of its arguments.

11.2.3.10 sop:RDFterm-equal

xs:boolean   RDF term term1 = RDF term term2

The following sop:RDFterm-equal example passes the test because the mbox terms are the same RDF term:

In this query for documents that were annotated on new years day (2004 or 2005), the RDF terms are not the same, but have equivalent values:

name1	name2
"Alice"	"Ms A."
"Ms A."	"Alice"

annotates
<http://www.w3.org/TR/rdf-sparql-query/>

11.2.3.11 sop:langMatches

Returns true if language-tag (first argument) matches language-range (second argument) per Tags for the Identification of Languages [RFC3066] section 2.5. RFC3066 defines a case-insensitive, hierarchical matching algorithm which operates on ISO-defined subtags for language and country codes, and user defined subtags. In SPARQL, a language-range of "*" matches any non-empty language-tag string.

This query uses langMatches and lang (described in section 11.2.3.6) to find the French titles for the show known in English as "That Seventies Show":

title
"Cette Série des Années Soixante-dix"@fr
"Cette Série des Années Septante"@fr-BE

The idiom langMatches( lang( ?v ), "*" ) will not match literals without a language tag as lang( ?v ) will return an empty string.
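The matching behaviour can be approximated in Python. This sketch takes the language tag first and the language range second, as in the idiom langMatches( lang( ?v ), "*" ), and assumes the RFC 3066 rule that a range matches a tag if it equals the tag or equals a prefix of the tag that is followed by "-". The function name `lang_matches` is hypothetical.

```python
def lang_matches(language_tag, language_range):
    """Case-insensitive prefix matching of a language tag against a range."""
    tag = language_tag.lower()
    rng = language_range.lower()
    if rng == "*":
        return tag != ""          # "*" matches any non-empty tag
    return tag == rng or tag.startswith(rng + "-")
```

For instance, `lang_matches("fr-BE", "FR")` holds because "fr" is a prefix of the tag followed by "-", while `lang_matches("", "*")` does not, which is why literals without a language tag are not matched by the idiom.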

11.2.3.12 sop:regex

Invokes the XPath fn:matches function to match text against a regular expression pattern. The regular expression language is defined in XQuery 1.0 and XPath 2.0 Functions and Operators section 7.6.1 Regular Expression Syntax [17]. The collation is defined in section 7.3.1 Collations.

name
"Alice"

Table 11.2 SPARQL Constructor Functions

SPARQL imports a subset of the XPath constructor functions defined in XQuery 1.0 and XPath 2.0 Functions and Operators [17] in section 17.1 Casting from primitive types to primitive types. SPARQL constructors include all of the XPath constructors for the SPARQL operand data types plus the additional datatypes imposed by the RDF data model. Casting in SPARQL is performed by calling a constructor function for the target type on an operand of the source type. The table below summarizes the casting operations that are always allowed (Y), never allowed (N) and dependent on the lexical value (M). A casting operation from an xsd:string (the first row) to an xsd:float (the second column) is dependent on the lexical value (M).

From \ To	str	flt	dbl	dec	int	dT	bool	IRI	ltrl
str	Y	M	M	M	M	M	M	Y	Y
flt	Y	Y	Y	M	M	N	Y	N	N
dbl	Y	Y	Y	M	M	N	Y	N	N
dec	Y	Y	Y	Y	Y	N	Y	N	N
int	Y	Y	Y	Y	Y	N	Y	N	N
dT	Y	N	N	N	N	Y	N	N	N
bool	Y	Y	Y	Y	Y	N	Y	N	N
IRI	Y	N	N	N	N	N	N	Y	Y
ltrl	Y	M	M	M	M	M	M	Y	Y
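The casting table can be transcribed into a lookup structure. The Python below copies the rows of the table above verbatim; the helper `cast_allowed` is a hypothetical name, and the sketch only reports the table entry without performing any cast.

```python
TYPES = ["str", "flt", "dbl", "dec", "int", "dT", "bool", "IRI", "ltrl"]

# One row per source type; columns follow the order of TYPES.
ROWS = {
    "str":  "Y M M M M M M Y Y",
    "flt":  "Y Y Y M M N Y N N",
    "dbl":  "Y Y Y M M N Y N N",
    "dec":  "Y Y Y Y Y N Y N N",
    "int":  "Y Y Y Y Y N Y N N",
    "dT":   "Y N N N N Y N N N",
    "bool": "Y Y Y Y Y N Y N N",
    "IRI":  "Y N N N N N N Y Y",
    "ltrl": "Y M M M M M M Y Y",
}

def cast_allowed(source, target):
    """Return 'Y' (always), 'N' (never) or 'M' (depends on lexical value)."""
    return ROWS[source].split()[TYPES.index(target)]
```

A processor would still need to inspect the lexical value for every "M" entry, for example when casting the string "abc" to a float.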

11.2.4 Extensible Value Testing

An expression can also be a function call to an extension function. A function is named by an IRI, and returns an RDF term. The semantics of these functions are identified by the IRI identifying the function.

If a query request contains a function that is not supported, the query is not executed and an error is returned.

SPARQL queries using extension functions are likely to have limited interoperability.

A function returns an RDF term. It might be used to test some application datatype not supported by the core SPARQL specification, or it might be a transformation between datatype formats, for example into an XSD dateTime RDF term from another date format.

A. SPARQL Grammar

A SPARQL query string is a Unicode character string (cf. section 6.1 String concepts of [CHARMOD]) in the language defined by the following grammar, starting with the Query production. The EBNF format is the same as that used in the XML 1.1 specification [XML11]. Please see the "Notation" section of that specification for specific information about the notation.

A.1 IRI References

Text matched by the Q_IRI_REF production and QName (after prefix expansion) production, after escape processing, must conform to the generic syntax of IRI references in section 2.2 of RFC 3987 "ABNF for IRI References and IRIs" [RFC3987]. For example, the Q_IRI_REF <abc#def> may occur in a SPARQL query string, but the Q_IRI_REF <abc##def> must not.

A.2 White Space

White space (production WS) is used to separate two terminals which would otherwise be (mis-)recognized as one terminal. White space is significant in strings, and as noted in the grammar, but otherwise white space is ignored. As a hint, rule names below in capitals indicate a possible choice of terminals.

A.3 Keywords

The exception is the keyword 'a' which, in line with Turtle and N3, is used in place of the IRI rdf:type (in full, http://www.w3.org/1999/02/22-rdf-syntax-ns#type).

A.4 Comments

Comments in SPARQL queries take the form of '#', outside an IRI or string, and continue to the end of line or end of file if there is no end of line after the comment marker.

A.5 Escape sequences in strings

Strings are used for the lexical form of RDF terms and in expressions. Within a string, the following escape sequences apply. The escape character is backslash "\" (#x5C). No other escape sequences are defined for strings. Names for characters given are the common names.

Escape	Unicode code point
'\u' HEX HEX HEX HEX	A Unicode code point in the range U+0 to U+FFFF inclusive corresponding to the encoded hexadecimal value.
'\U' HEX HEX HEX HEX HEX HEX HEX HEX	A Unicode code point in the range U+10000 to U+10FFFF inclusive corresponding to the encoded hexadecimal value.
'\t'	U+0009 (tab)
'\n'	U+000A (line feed)
'\r'	U+000D (carriage return)
'\b'	U+0008 (backspace)
'\f'	U+000C (form feed)
'\"'	U+0022 (quotation mark, double quote mark)
"\'"	U+0027 (apostrophe-quote, single quote mark)
'\\'	U+005C (backslash)

These escape sequences are included in the grammar below through the ECHAR rule.
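Escape processing for string bodies can be sketched in Python as below. This is an illustrative decoder, not a normative algorithm: the names `SIMPLE`, `ESCAPE` and `unescape` are hypothetical, and the input is assumed to be the text between the quotation marks.

```python
import re

# Single-character escapes from the table above.
SIMPLE = {"t": "\t", "n": "\n", "r": "\r", "b": "\b", "f": "\f",
          '"': '"', "'": "'", "\\": "\\"}

# A backslash followed by \uXXXX, \UXXXXXXXX, or any single character.
ESCAPE = re.compile(r"\\(u[0-9A-Fa-f]{4}|U[0-9A-Fa-f]{8}|.?)", re.DOTALL)

def unescape(body):
    """Decode the escape sequences in the body of a string literal."""
    def replace(match):
        token = match.group(1)
        if len(token) > 1:                      # \uXXXX or \UXXXXXXXX
            return chr(int(token[1:], 16))
        if token in SIMPLE:
            return SIMPLE[token]
        # No other escape sequences are defined for strings.
        raise ValueError("undefined escape sequence: \\" + token)
    return ESCAPE.sub(replace, body)
```

For example, `unescape(r"a\tb")` yields the three characters a, tab, b, while an undefined sequence such as `\q` raises `ValueError`.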

A.6 Escape sequences in IRI references, prefixed names and variable names

Escape	Unicode code point
'\u' HEX HEX HEX HEX	A Unicode code point in the range U+0 to U+FFFF inclusive corresponding to the encoded hexadecimal value.
'\U' HEX HEX HEX HEX HEX HEX HEX HEX	A Unicode code point in the range U+10000 to U+10FFFF inclusive corresponding to the encoded hexadecimal value.

These escape sequences are not included in the grammar below. Only escape sequences for characters that would be legal at that point in the grammar may be given. For example, the variable "?x\u0020y" is not legal (\u0020 is a space and is not permitted in a variable name).

A.7 Grammar

`[1]`	`Query`	::=	`Prolog ( SelectQuery \| ConstructQuery \| DescribeQuery \| AskQuery )`
`[2]`	`Prolog`	::=	`BaseDecl? PrefixDecl*`
`[3]`	`BaseDecl`	::=	`'BASE' Q_IRI_REF`
`[4]`	`PrefixDecl`	::=	`'PREFIX' QNAME_NS Q_IRI_REF`
`[5]`	`SelectQuery`	::=	`'SELECT' 'DISTINCT'? ( Var+ \| '*' ) DatasetClause* WhereClause SolutionModifier`
`[6]`	`ConstructQuery`	::=	`'CONSTRUCT' ConstructTemplate DatasetClause* WhereClause SolutionModifier`
`[7]`	`DescribeQuery`	::=	`'DESCRIBE' ( VarOrIRIref+ \| '*' ) DatasetClause* WhereClause? SolutionModifier`
`[8]`	`AskQuery`	::=	`'ASK' DatasetClause* WhereClause`
`[9]`	`DatasetClause`	::=	`'FROM' ( DefaultGraphClause \| NamedGraphClause )`
`[10]`	`DefaultGraphClause`	::=	`SourceSelector`
`[11]`	`NamedGraphClause`	::=	`'NAMED' SourceSelector`
`[12]`	`SourceSelector`	::=	`IRIref`
`[13]`	`WhereClause`	::=	`'WHERE'? GroupGraphPattern`
`[14]`	`SolutionModifier`	::=	`OrderClause? LimitClause? OffsetClause?`
`[15]`	`OrderClause`	::=	`'ORDER' 'BY' OrderCondition+`
`[16]`	`OrderCondition`	::=	`( ( 'ASC' \| 'DESC' ) BrackettedExpression ) \| ( FunctionCall \| Var \| BrackettedExpression )`
`[17]`	`LimitClause`	::=	`'LIMIT' INTEGER`
`[18]`	`OffsetClause`	::=	`'OFFSET' INTEGER`
`[19]`	`GroupGraphPattern`	::=	`'{' GraphPattern '}'`
`[20]`	`GraphPattern`	::=	`Triples? ( GraphPatternNotTriples '.'? GraphPattern )?`
`[21]`	`GraphPatternNotTriples`	::=	`OptionalGraphPattern \| GroupOrUnionGraphPattern \| GraphGraphPattern \| Constraint`
`[22]`	`OptionalGraphPattern`	::=	`'OPTIONAL' GroupGraphPattern`
`[23]`	`GraphGraphPattern`	::=	`'GRAPH' VarOrBlankNodeOrIRIref GroupGraphPattern`
`[24]`	`GroupOrUnionGraphPattern`	::=	`GroupGraphPattern ( 'UNION' GroupGraphPattern )*`
`[25]`	`Constraint`	::=	`'FILTER' ( BrackettedExpression \| BuiltInCall \| FunctionCall )`
`[26]`	`ConstructTemplate`	::=	`'{' ConstructTriples '}'`
`[27]`	`ConstructTriples`	::=	`( Triples1 ( '.' ConstructTriples )? )?`
`[28]`	`Triples`	::=	`Triples1 ( '.' Triples? )?`
`[29]`	`Triples1`	::=	`VarOrTerm PropertyListNotEmpty \| TriplesNode PropertyList`
`[30]`	`PropertyList`	::=	`PropertyListNotEmpty?`
`[31]`	`PropertyListNotEmpty`	::=	`Verb ObjectList ( ';' PropertyList )?`
`[32]`	`ObjectList`	::=	`GraphNode ( ',' ObjectList )?`
`[33]`	`Verb`	::=	`VarOrBlankNodeOrIRIref \| 'a'`
`[34]`	`TriplesNode`	::=	`Collection \| BlankNodePropertyList`
`[35]`	`BlankNodePropertyList`	::=	`'[' PropertyListNotEmpty ']'`
`[36]`	`Collection`	::=	`'(' GraphNode+ ')'`
`[37]`	`GraphNode`	::=	`VarOrTerm \| TriplesNode`
`[38]`	`VarOrTerm`	::=	`Var \| GraphTerm`
`[39]`	`VarOrIRIref`	::=	`Var \| IRIref`
`[40]`	`VarOrBlankNodeOrIRIref`	::=	`Var \| BlankNode \| IRIref`
`[41]`	`Var`	::=	`VAR1 \| VAR2`
`[42]`	`GraphTerm`	::=	`IRIref \| RDFLiteral \| NumericLiteral \| BooleanLiteral \| BlankNode \| NIL`
`[43]`	`Expression`	::=	`ConditionalOrExpression`
`[44]`	`ConditionalOrExpression`	::=	`ConditionalAndExpression ( '\|\|' ConditionalAndExpression )*`
`[45]`	`ConditionalAndExpression`	::=	`ValueLogical ( '&&' ValueLogical )*`
`[46]`	`ValueLogical`	::=	`RelationalExpression`
`[47]`	`RelationalExpression`	::=	`NumericExpression ( '=' NumericExpression \| '!=' NumericExpression \| '<' NumericExpression \| '>' NumericExpression \| '<=' NumericExpression \| '>=' NumericExpression )?`
`[48]`	`NumericExpression`	::=	`AdditiveExpression`
`[49]`	`AdditiveExpression`	::=	`MultiplicativeExpression ( '+' MultiplicativeExpression \| '-' MultiplicativeExpression )*`
`[50]`	`MultiplicativeExpression`	::=	`UnaryExpression ( '*' UnaryExpression \| '/' UnaryExpression )*`
`[51]`	`UnaryExpression`	::=	`'!' PrimaryExpression \| '+' PrimaryExpression \| '-' PrimaryExpression \| PrimaryExpression`
`[52]`	`BuiltInCall`	::=	`'STR' '(' Expression ')' \| 'LANG' '(' Expression ')' \| 'DATATYPE' '(' Expression ')' \| 'BOUND' '(' Var ')' \| 'isIRI' '(' Expression ')' \| 'isBLANK' '(' Expression ')' \| 'isLITERAL' '(' Expression ')' \| RegexExpression`
`[53]`	`RegexExpression`	::=	`'REGEX' '(' Expression ',' Expression ( ',' Expression )? ')'`
`[54]`	`FunctionCall`	::=	`IRIref ArgList`
`[55]`	`IRIrefOrFunction`	::=	`IRIref ArgList?`
`[56]`	`ArgList`	::=	`( NIL \| '(' Expression ( ',' Expression )* ')' )`
`[57]`	`BrackettedExpression`	::=	`'(' Expression ')'`
`[58]`	`PrimaryExpression`	::=	`BrackettedExpression \| BuiltInCall \| IRIrefOrFunction \| RDFLiteral \| NumericLiteral \| BooleanLiteral \| BlankNode \| Var`
`[59]`	`NumericLiteral`	::=	`INTEGER \| FLOATING_POINT`
`[60]`	`RDFLiteral`	::=	`String ( LANGTAG \| ( '^^' IRIref ) )?`
`[61]`	`BooleanLiteral`	::=	`'true' \| 'false'`
`[62]`	`String`	::=	`STRING_LITERAL1 \| STRING_LITERAL2 \| STRING_LITERAL_LONG1 \| STRING_LITERAL_LONG2`
`[63]`	`IRIref`	::=	`Q_IRI_REF \| QName`
`[64]`	`QName`	::=	`QNAME \| QNAME_NS`
`[65]`	`BlankNode`	::=	`BLANK_NODE_LABEL \| ANON`
`[66]`	`Q_IRI_REF`	::=	'<' ([^<>'{}\|^`]-[#00-#20])* '>'
`[67]`	`QNAME_NS`	::=	`NCNAME_PREFIX? ':'`
`[68]`	`QNAME`	::=	`NCNAME_PREFIX? ':' NCNAME?`
`[69]`	`BLANK_NODE_LABEL`	::=	`'_:' NCNAME`
`[70]`	`VAR1`	::=	`'?' VARNAME`
`[71]`	`VAR2`	::=	`'$' VARNAME`
`[72]`	`LANGTAG`	::=	`'@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)*`
`[73]`	`INTEGER`	::=	`[0-9]+`
`[74]`	`DECIMAL`	::=	`[0-9]+ '.' [0-9]* \| '.' [0-9]+`
`[75]`	`FLOATING_POINT`	::=	`[0-9]+ '.' [0-9]* EXPONENT? \| '.' ([0-9])+ EXPONENT? \| ([0-9])+ EXPONENT`
`[76]`	`EXPONENT`	::=	`[eE] [+-]? [0-9]+`
`[77]`	`ECHAR`	::=	`'\' [tbnrf\"']`
`[78]`	`STRING_LITERAL1`	::=	`"'" ( ([^#x27#x5C#xA#xD]) \| ECHAR )* "'"`
`[79]`	`STRING_LITERAL2`	::=	`'"' ( ([^#x22#x5C#xA#xD]) \| ECHAR )* '"'`
`[80]`	`STRING_LITERAL_LONG1`	::=	`"'''" ( [^'\] \| ECHAR \| ("'" [^']) \| ("''" [^']) )* "'''"`
`[81]`	`STRING_LITERAL_LONG2`	::=	`'"""' ( [^"\] \| ECHAR \| ('"' [^"]) \| ('""' [^"]) )* '"""'`
`[82]`	`NIL`	::=	`'(' WS* ')'`
`[83]`	`WS`	::=	`#x20 \| #x9 \| #xD \| #xA`
`[84]`	`ANON`	::=	`'[' WS* ']'`
`[85]`	`NCCHAR1p`	::=	`[A-Z] \| [a-z] \| [#x00C0-#x00D6] \| [#x00D8-#x00F6] \| [#x00F8-#x02FF] \| [#x0370-#x037D] \| [#x037F-#x1FFF] \| [#x200C-#x200D] \| [#x2070-#x218F] \| [#x2C00-#x2FEF] \| [#x3001-#xD7FF] \| [#xF900-#xFDCF] \| [#xFDF0-#xFFFD] \| [#x10000-#xEFFFF]`
`[86]`	`NCCHAR1`	::=	`NCCHAR1p \| '_'`
`[87]`	`VARNAME`	::=	`( NCCHAR1 \| [0-9] ) ( NCCHAR1 \| [0-9] \| #x00B7 \| [#x0300-#x036F] \| [#x203F-#x2040] )*`
`[88]`	`NCCHAR`	::=	`NCCHAR1 \| '-' \| [0-9] \| #x00B7 \| [#x0300-#x036F] \| [#x203F-#x2040]`
`[89]`	`NCNAME_PREFIX`	::=	`NCCHAR1p ((NCCHAR\|'.')* NCCHAR)?`
`[90]`	`NCNAME`	::=	`NCCHAR1 ((NCCHAR\|'.')* NCCHAR)?`

The SPARQL grammar is LL(1) when the rules with uppercased names are used as terminals.
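A few of the terminal rules can be transcribed directly into regular expressions, which is one way to experiment with the lexical level of the grammar. The Python below covers only INTEGER, DECIMAL, FLOATING_POINT and LANGTAG; the dictionary `TERMINALS` and the helper `classify` are hypothetical names for this sketch.

```python
import re

EXPONENT = r"[eE][+-]?[0-9]+"

# Regular expressions transcribed from rules [72] to [76].
TERMINALS = {
    "INTEGER": r"[0-9]+",
    "DECIMAL": r"[0-9]+\.[0-9]*|\.[0-9]+",
    "FLOATING_POINT": (
        rf"[0-9]+\.[0-9]*(?:{EXPONENT})?"
        rf"|\.[0-9]+(?:{EXPONENT})?"
        rf"|[0-9]+{EXPONENT}"
    ),
    "LANGTAG": r"@[a-zA-Z]+(?:-[a-zA-Z0-9]+)*",
}

def classify(text):
    """Return the names of all terminals that match the whole text."""
    return [name for name, pattern in TERMINALS.items()
            if re.fullmatch(pattern, text)]
```

Note that a lexeme such as 4.2 is matched by both DECIMAL and FLOATING_POINT as written, so a tokenizer built on these rules has to decide which terminal takes precedence.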

B. Security Considerations

SPARQL queries using FROM, FROM NAMED, or GRAPH may cause the specified URI to be dereferenced. This may cause additional use of network, disk or CPU resources along with associated secondary issues such as denial of service. The security issues of Uniform Resource Identifier (URI): Generic Syntax [RFC3986] Section 7 should be considered. In addition, the contents of file: URIs can in some cases be accessed, processed and returned as results, providing unintended access to local resources.

The SPARQL language permits extensions, which will have their own security implications.

Multiple IRIs may have the same appearance. Characters in different scripts may look similar (a Cyrillic "о" may appear similar to a Latin "o"). A character followed by combining characters may have the same visual representation as another character (LATIN SMALL LETTER E followed by COMBINING ACUTE ACCENT has the same visual representation as LATIN SMALL LETTER E WITH ACUTE). Users of SPARQL must take care to construct queries with IRIs that match the IRIs in the data. Further information about matching of similar characters can be found in Unicode Security Considerations [UNISEC] and Internationalized Resource Identifiers (IRIs) [RFC3987] Section 8.

D. Internet Media Type, File Extension and Macintosh File Type (Normative)

The Internet Media Type / MIME Type for the SPARQL Query Language is "application/sparql-query".

It is recommended that SPARQL query files have the extension ".rq" (all lowercase) on all platforms.

It is recommended that SPARQL query files stored on Macintosh HFS file systems be given a file type of "TEXT".

The information that follows is intended to be submitted to the IESG for review, approval, and registration with IANA.

F. Acknowledgements

G. Change Log

Changes since the 21 July 2005 Working Draft:

 
$Log: rq23-cmp.html,v $
Revision 1.2  2005/10/21 15:58:07  eric
added stule

Revision 1.1  2005/10/21 15:54:46  eric
alternative layout for WWW/2001/sw/DataAccess/rq23/Overview.html

Revision 1.513  2005/10/20 18:35:26  eric
~ reworked sop:langMatches to not lie.
~ clarified domain of sop:lang and sop:datatype by enumerating the types of valid input and the associated return types

Revision 1.512  2005/10/18 12:22:57  aseaborne
@@ to remember to link in generated parsers

Revision 1.511  2005/10/17 16:44:26  aseaborne
Correct reference in 2.1 to RFC 3987/3.1

Revision 1.510  2005/10/11 18:02:03  eric
~ changed sop:lang and sop:datatype per DAWG meeting 2005-10-11T15:48:08Z
~ fixed some operator argument types (! isIRI isBLANK isLITERAL) per AndyS's observations

Revision 1.509  2005/10/11 15:59:59  eric
~ changed sop:lang and sop:datatype per DAWG meeting 2005-10-11T15:48:08Z

Revision 1.508  2005/10/11 15:14:34  eric
typo

Revision 1.507  2005/10/11 15:13:15  eric
~ fixed direction of lt/gt

Revision 1.506  2005/10/11 13:42:03  eric
~ s/isIri/isURI/ -- some snuck back in but AndyS caught them.

Revision 1.505  2005/10/11 09:10:17  eric
~ clarified behavoir of sop:datatype and sop:lang (message to DAWG)

Revision 1.504  2005/10/11 08:15:30  eric
~ clarified sop:langMatches per advice from fsasaki

Revision 1.503  2005/10/10 18:31:44  eric
~ addressed TimBL's comments per 2005 Sep 13 teleconference resolution (message to DAWG)

Revision 1.502  2005/10/10 16:35:37  eric
+ sketched out langMatches design

Revision 1.501  2005/10/10 11:31:32  eric
~ updated 2.1.1 Query Term Syntax per redefining a namespace prefix and Base IRI definition

Revision 1.500  2005/10/10 10:39:43  eric
~ fixed xs: namespaces erroneously added in 1.493

Revision 1.499  2005/10/07 08:46:27  aseaborne
Remove comma in 9.2

Revision 1.498  2005/10/06 10:13:15  aseaborne
~ A.1 Added ",after escape processing,"
~ Grammar chnaged to add extra excluded characters to IRI rule.
~ Removed the comment about IRIs in the grammar itself.

Revision 1.497  2005/10/05 14:46:33  aseaborne
~ Missing [ in a ref to RFC 3987

Revision 1.496  2005/10/05 14:33:54  aseaborne
~ Removed reference to RDF URI references in 2.2

Revision 1.495  2005/09/28 12:40:01  aseaborne
~ A.1 IRI References text updated
~ Whitespace made a reference into the gramamr table

+ Section numbering in 2 and 10.
+ Ids for section 2/H4's

Revision 1.494  2005/09/27 17:15:39  aseaborne
~ Grammar update
  * Explicit tokens for '(' ')' and '[' ']'
  * BNODE_LABEL => BLANK_NODE_LABEL

Revision 1.493  2005/09/27 11:44:32  eric
- removed the "If the ordering condition is a named variable" constraint on ordering RDF literals
- literal < literal
+ xs:string < xs:string
- func-literal-less-than section
- func-literal-greater-than section

Revision 1.492  2005/09/26 13:53:51  eric
~ ordering literals depends exclusively on the < operator
+ precedence rules for the Operator Mapping Table
+ < operators et al for literals
- simplifed func-RDFterm-equal by depending on precedence in Operator Mapping Table

Revision 1.491  2005/09/23 04:06:00  eric
~ editorial: verb and article agreement, redundent adjectives
- removed unused rdf/rdfs namespace declarations in examples

Revision 1.490  2005/09/14 10:08:42  aseaborne
~ Improve IRI text in grammar section

Revision 1.489  2005/09/12 12:58:28  eric
~ editorial changes -- namespaces and normative XML Schema ref

Revision 1.488  2005/09/12 11:12:01  aseaborne
~ 2005Sep/0002 -- editorial changes

Revision 1.487  2005/09/10 05:10:36  eric
~ changed r:IRI and r:Literal to xs:anyURI and rdfs:Literal per DanC's comments (announced)

Revision 1.486  2005/09/08 17:30:44  aseaborne
Part changes from: Sep2005/0011
~ Added URI for loading default graph in comment in data for 9.1 & 9.3

Revision 1.485  2005/09/07 16:09:37  aseaborne
Part changes from: Sep2005/0011
~ 8.4 Remove urn:x-local and use tag:example.org,2005-06-06:
  in manifest default graph
~ 10.2 "?name X" ==>  ?nameX in example

Revision 1.484  2005/09/07 12:30:39  aseaborne
BNodes explicitly like variables in defn Pattern Solution

Revision 1.483  2005/09/07 09:26:03  aseaborne
~ Editorial changes : S2005Sep/0001

Revision 1.482  2005/09/06 21:28:41  eric
~ update Query Term Syntax to respond to DanC's amendment to clarify IRI namespace resolution

Revision 1.481  2005/09/06 21:14:28  eric
~ respond to derhoermi's comment
  (text proposed to WG)

Revision 1.480  2005/09/05 15:00:39  eric
irc://irc.w3.org/#dawg 2005-09-05T14:55:36Z <DanC> can you change eric@w3.org to public-rdf-dawg-comments@w3.org real quick, ericp?

Revision 1.479  2005/08/31 17:12:40  aseaborne
Fix links (in CVSlog)

Revision 1.478  2005/08/31 14:22:15  aseaborne
~ See 2005JulSep/0323 and 2005JulSep/0324

Revision 1.477  2005/08/31 09:57:26  aseaborne
Reset grammar (CVS problems)

Revision 1.476  2005/08/31 09:27:45  aseaborne
~ & => &amp; in MIME type registration

~ Tidied grammar

Revision 1.475  2005/08/30 13:55:17  aseaborne
grammar section: s/is is/and is/

Revision 1.474  2005/08/30 12:17:49  aseaborne
~ Fixed typo
  2005Aug/0084

Revision 1.473  2005/08/30 12:13:53  aseaborne
Typo s/must valid/must be valid/

Revision 1.472  2005/08/30 09:30:28  aseaborne
Grammar:
~ QuotedIRIRef => IRI_REF
+ text stating the grammar is LL(1)

Revision 1.471  2005/08/29 08:50:33  eric
+ integrated DaveB's mime registration text
~ renumbered Appendixes

Revision 1.470  2005/08/23 15:08:40  aseaborne
Grammar section.

Text to say IRI refs must be legal IRIs
so that <a##b> is not legal even though
the grammar rule might allow it.

Revision 1.469  2005/08/23 15:04:29  aseaborne
Grammar section.

Removed comment on ambiguity.  No longer applies.

Revision 1.468  2005/08/23 14:57:40  aseaborne
Grammar section.

~ grammar upgrade: draft LL(1) grammar; tested
  with LALR(1) (Bison) as well.
  Escapes (\t etc) in grammar

+ Escapes \u described but not in grammar.
~ IRIRefOrFunction to be clear about the expression case.

~ Some tidying.

Revision 1.467  2005/08/16 13:56:55  eric
~ use some of DanC's words about base IRI normalization

Revision 1.466  2005/08/16 13:35:40  eric
+ clarify SPARQL QL spec about base IRI normalization per ericP's action

Revision 1.465  2005/08/16 08:13:20  eric
~ updated B Security Considerations:
  + added text from mime.txt

Revision 1.464  2005/08/10 15:51:50  aseaborne
Validation fix

Revision 1.463  2005/08/10 15:46:08  aseaborne
+ Added placeholder for acknowledgements section
~ Upgraded the text of r[:p :v] along the lines of Dave Beckett's
suggestion in 2005JulSep/0178

Revision 1.462  2005/08/10 06:47:52  eric
~ s/RFC3896/RFC3986/g
~ s/RFC3897/RFC3987/g

Revision 1.461  2005/08/10 06:37:55  eric
~ updated B Security Considerations:
  ~ reference RFC3986 (instead of 1738) for URL security considerations
  + add reference to RFC3987 for IRI security considerations

Revision 1.460  2005/08/09 15:50:47  eric
+added B Security Considerations in response to Bjoern Hoehrmann comments

Revision 1.459  2005/08/08 08:55:15  aseaborne
Didn't flush this log to previous version:

Added text to say the rules of RFC 3897/3896 apply to SPARQL.
Changed references marked as [13] and [19] to [RFC3897] and [RFC3896].

Revision 1.458  2005/08/08 08:49:51  aseaborne
+ Escape sequences (finished)
~ Validation fixes (CVS log!)

Revision 1.457  2005/08/05 21:49:22  aseaborne
Fix examples in 9.2 and 9.3

Revision 1.456  2005/08/05 15:48:29  aseaborne
Make the fact that CONSTRUCT templates can have triples (no variables) explicit

Revision 1.455  2005/08/04 13:42:59  aseaborne
Escape corrections

Revision 1.454  2005/08/04 12:51:16  aseaborne
Escape examples

Revision 1.453  2005/08/04 12:33:05  aseaborne
+ Escape sequences (finished)
~ Validation fixes (CVS log!)

Revision 1.452  2005/08/04 10:57:36  aseaborne
+ Added "_" to start of variable names
~ Tidied up by introducing NCCHAR1p and making NCCHAR1 be NCCHAR1p | '_'

Revision 1.451  2005/08/03 14:48:47  aseaborne
+ Added "UNICODE" to
  "A SPARQL query string is a UNICODE character string ..."

+ Escape sequences (not complete - info web site offline)

Revision 1.450  2005/08/02 11:27:49  aseaborne
Add link to protocol doc about dataset description

Revision 1.449  2005/08/01 14:03:03  aseaborne
Add text to explain [:p :q] and (1 2 3) as a subject or object
in response to 2005Jul/0053

Revision 1.448  2005/08/01 13:20:06  aseaborne
~ Use the future SPARQL query results namespace
  to http://www.w3.org/2005/sparql-results#
~ Change the published W3C Technical Report
  Version link to be the current latest version

Revision 1.447  2005/08/01 10:25:53  aseaborne
part II of v1.446

Revision 1.445  2005/07/27 17:36:40  eric
~s/isURI/isIRI/ potentially addressing Bjoern Hoehrmann's comments

Revision 1.444  2005/07/27 17:20:58  eric
~s/11.2.1.1/11.2.2/ and fixed refs

Revision 1.443  2005/07/27 16:48:03  eric
~less brutal segue from constraints to F&O ref

Revision 1.442  2005/07/27 16:38:57  eric
~rewording refs to "implementation"

Revision 1.441  2005/07/26 09:01:47  aseaborne
+ RDF collections link to
http://www.w3.org/TR/2004/REC-rdf-mt-20040210/#collections
+ Removed direct link for URI of rdf:nil.
+ Removed direct link for URI of rdf:type.

+ Fixed charmod reference (use &#252;)

Revision 1.440  2005/07/26 08:42:42  aseaborne
+ Reference to charmod
+ refer to this reference in grammar section

Revision 1.439  2005/07/25 17:19:12  aseaborne
appNS:myDataType -> appNS:appDataType

Revision 1.438  2005/07/25 14:47:57  aseaborne
Editorial in response to 2005Jul/0028
+ Sec 2.8  replace ":myClass" with ":appClass"
+ Sec 3:   "abc"^^myNS:myDataType => "abc"^^appNS:myDataType
+ Sec 10.3:
    http://example.org/myGraph => http://example.org/aGraph
    app:myDate => app:customDate
+ sec 10.4
    myOrg.example => org.example.com
    myOrg: => exOrg:

+ sec 11.2.4 Extensible Value Testing
    my: => func:
    http://my.example => http://example.org/
    myGeo: => aGeo:

Editorial in response to 2005Jul/0041
+ Fix example in 9.3: use mailto:aloce@work.example.org

Same mistake in 5.1 - fixed

Revision 1.437  2005/07/25 14:33:16  connolly
added link to XML11 from grammar section

Revision 1.436  2005/07/24 00:15:01  connolly
- class="norm" for
 - rdf-concepts (#dfn-literal etc.)
 - rdf-mt (#graphdefs, merge, etc.)
  - added [RDF-MT] at first mention
  - added bib entry
 - xquery
 - xpath-functions
 - xpath20
 - xmlschema-2
 - REC-xml11

- class="inform" for
 - rdf-dawg-uc
 - rdf-sparql-protocol
 - turtle df1/
 - rdf-sparql-XMLres
 - dublin core
 - webarch
 - vcard-rdf

- added normative ref entries for
 - RDF-MT
 - RFC 3066 (hmm... informative?)
 - XPATH20
 - XMLSCHEMA-2
 - XML11

- added informative ref entries for
 - rdf-sparql-protocol
 - unicode security considerations (hmm... normative?)
 - VCARD
 - dublin core

- replaced bogus [11] with [SPROT]

- changed RFC 3066 from faqs.org to ietf.org;
  not sure if it's informative or normative

- reformatted RFC 3986 bib entry to use cite etc.

- tweaked latest version URI of /TR/xquery/ to
  match links from the body (but differ from
  bibliography generator. hmm.)

- got rid of see also list under TOC
 - DAWG test cases link goes to status section
 - protocol, results docs become informative citations
 - punt slide set

- got rid of issues list pointer under TOC (belongs in status)

- removed one of the two "Valid XHTML!" buttons;
  added class="nav"

- truncated the CVS log at 21 July