SPARQL Query Language for RDF

@@Consider putting formal definitions first for OPTIONAL, UNION and result forms (10.2, 10.3, 10.4, 10.5)

@@ Grammar extracts may be out of line with the grammar while the grammar is revised for clarity.

An RDF graph is a set of triples; each triple consists of a subject, a predicate and an object. RDF graphs are defined in RDF Concepts and Abstract Syntax [CONCEPTS]. These triples can come from a variety of sources. For instance, they may come directly from an RDF document; they may be inferred from other RDF triples; or they may be the RDF expression of data stored in other formats, such as XML or relational databases. The RDF graph may be virtual, in that it is not fully materialized, only doing the work needed for each query to execute.

SPARQL is a query language for getting information from such RDF graphs. It provides facilities to:

As a data access language, it is suitable for both local and remote use. The companion SPARQL Protocol for RDF document [SPROT] describes the remote access protocol.

1.1 Document Outline

Later sections of this document describe how other graph patterns can be built using the graph operators OPTIONAL and UNION; how graph patterns can be grouped together; how queries can extract information from more than one graph, and how it is also possible to restrict the values allowed in matching a pattern.

1.2 Document Conventions

1.2.1 Namespaces

In this document, examples assume the following namespace prefix bindings unless otherwise stated:

Prefix	IRI
`rdf:`	`http://www.w3.org/1999/02/22-rdf-syntax-ns#`
`rdfs:`	`http://www.w3.org/2000/01/rdf-schema#`
`xsd:`	`http://www.w3.org/2001/XMLSchema#`
`fn:`	`http://www.w3.org/2005/xpath-functions#`

1.2.2 Data Descriptions

The data format used in this document is Turtle [TURTLE], used to show each triple explicitly. Turtle allows URIs to be abbreviated with prefixes:

1.2.3 Result Descriptions

x	y	z
"Alice"	`<http://example/a>`

The term "binding" is used as a descriptive term to refer to a pair (variable, RDF term). In this result set, there are variables x, y and z (shown as column headers). Each solution is shown as a row in the body of the table. Here, there is a single solution, where variable x is bound to "Alice", variable y is bound to <http://example/a>, and variable z is not bound to an RDF term. Variables are not required to be bound in a solution; for example, optional matches and alternative matches may leave some variables unbound in some rows.

2 Making Simple Queries

The SPARQL query language is based on matching graph patterns. The simplest graph pattern is the triple pattern, which is like an RDF triple, but with the possibility of a query variable instead of an RDF term in the subject, predicate or object positions. Combining triple patterns gives a basic graph pattern, where an exact match to a graph is needed to fulfill a pattern.

2.1 Writing a Simple Query

The example below shows a SPARQL query to find the title of a book from the information in the given RDF graph. The query consists of two parts, the SELECT clause and the WHERE clause. The SELECT clause identifies the variables to appear in the query results, and the WHERE clause has one triple pattern.

title
"SPARQL Tutorial"
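The following non-normative sketch illustrates how a single triple pattern could be matched against a graph to produce the result above. It is written in Python purely for illustration. The data triple, the dc:title predicate IRI and the helper name match_triple_pattern are assumptions made for this sketch, not part of SPARQL.

```python
# Assumed data: one triple, written as (subject, predicate, object).
DC_TITLE = "http://purl.org/dc/elements/1.1/title"  # assumed predicate IRI
graph = [("http://example.org/book/book1", DC_TITLE, "SPARQL Tutorial")]

def match_triple_pattern(pattern, graph):
    """Return one {variable: term} solution per triple that matches `pattern`."""
    solutions = []
    for triple in graph:
        binding = {}
        for p, t in zip(pattern, triple):
            if p.startswith("?"):
                # A variable matches any term, but must match consistently.
                if binding.setdefault(p, t) != t:
                    break
            elif p != t:
                break
        else:
            solutions.append(binding)
    return solutions

# WHERE clause: one triple pattern. SELECT clause: keep only ?title.
pattern = ("http://example.org/book/book1", DC_TITLE, "?title")
titles = [s["?title"] for s in match_triple_pattern(pattern, graph)]
print(titles)
```

Terms beginning with "?" play the role of query variables; every other term must be identical to the corresponding term of the triple.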

2.2 Multiple Matches

The result of a query is a sequence of solutions, giving the ways in which the query pattern matches the data. The sequence of solutions is further modified by the solution sequence modifiers. There may be zero, one or multiple solutions to a query.

The results enumerate the RDF terms to which the selected variables can be bound in the query pattern in order to match triples in the data. In the above example, the following two subsets of the data provided the two matches.

name	mbox
"Johnny Lee Outlaw"	<mailto:jlow@example.com>
"Peter Goodguy"	<mailto:peter@example.org>

This is a basic graph pattern match, and all the named variables used in the query pattern must be bound in every solution.

2.3 Matching RDF Literals

v
<http://example.org/ns#x>

v
<http://example.org/ns#y>

x
<http://example.org/ns#z>

2.4 Value Constraints

Graph pattern matching creates bindings of variables. It is possible to further restrict solutions by constraining the RDF terms that can be used as bindings of variables. Value constraints take the form of boolean-valued expressions; the language also allows application-specific constraints on the values in a solution.

title
"SPARQL Tutorial"

title
"The Semantic Web"

title	price
"The Semantic Web"	23

By constraining the price variable, only book2 matches the query because only book2 has a price less than 30.5, as the filter condition requires.
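As a non-normative illustration, a value constraint can be thought of as a boolean test applied to each solution produced by pattern matching. In the Python sketch below, the price of 42 for the first book and the helper name apply_filter are assumptions; only the second book and the bound of 30.5 come from the text above.

```python
# Solutions from pattern matching; the price of 42 is an assumed value.
solutions = [
    {"title": "SPARQL Tutorial", "price": 42},
    {"title": "The Semantic Web", "price": 23},
]

def apply_filter(solutions, constraint):
    """Keep only the solutions for which the boolean constraint holds."""
    return [s for s in solutions if constraint(s)]

# Corresponds to constraining the price variable to be less than 30.5.
cheap = apply_filter(solutions, lambda s: s["price"] < 30.5)
print(cheap)
```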

2.5 Term Constraints

2.6 Querying Reification Vocabulary

RDF defines a reification vocabulary which provides for describing RDF statements without stating them. These descriptions of statements can be queried by using the defined vocabulary. SPARQL does not treat querying reified data differently from any other RDF data. SPARQL can be used to query graph-pattern matches using the reification vocabulary.

book	title
<http://example.org/book/book1>	"SPARQL Tutorial"

2.7 Blank Node Labels in Query Results

The presence of blank nodes in query results can be indicated by labels in the serialization of query results.

Blank node labels are local to the result set of a query. An application or client cannot usefully use blank node labels to formulate a query that directly refers to a particular blank node. In effect, this means that information about co-occurrences of blank nodes may be treated as scoped to the results as defined in "SPARQL Query Results XML Format" or the CONSTRUCT result form.

x	name
_:c	"Alice"
_:d	"Bob"

x	name
_:r	"Alice"
_:s	"Bob"

These two results have the same information: the blank nodes used to match the query are different in the two solutions. There would be no relation between a label such as _:a used in the results and a blank node carrying the same label in the data graph.

3 SPARQL Term Syntax

3.1 RDF Term Syntax

3.1.1 Syntax for IRIs

The terms delimited by "<>" are IRI references [RFC3987]; the delimiters do not form part of the reference. They stand for IRIs, either directly, or relative to a base IRI. IRIs are a generalization of URIs [RFC3986] and are fully compatible with URIs and URLs.

The SPARQL syntax provides two abbreviation mechanisms for IRIs, prefixed names and relative IRIs.

Prefixed names

The PREFIX keyword associates a prefix label with an IRI. A prefixed name is a prefix label and a local part, separated by a colon ":". It is mapped to an IRI by concatenating the local part to the IRI corresponding to the prefix. The prefix label may be the empty string.
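The mapping from a prefixed name to an IRI is plain string concatenation. The non-normative Python sketch below illustrates it; the binding for the empty prefix and the function name expand are assumptions made for the example.

```python
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "": "http://example.org/book/",  # assumed binding for the empty prefix
}

def expand(prefixed_name, prefixes=PREFIXES):
    """Map 'label:local' to an IRI by appending the local part to the prefix IRI."""
    label, local = prefixed_name.split(":", 1)
    return prefixes[label] + local

print(expand("rdf:type"))
print(expand(":book1"))
```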

Relative IRIs

Relative IRIs are combined with base IRIs as per Uniform Resource Identifier (URI): Generic Syntax [RFC3986] using only the basic algorithm in Section 5.2. Neither Syntax-Based Normalization nor Scheme-Based Normalization (described in sections 6.2.2 and 6.2.3 of RFC3986) is performed. Characters additionally allowed in IRI references are treated in the same way that unreserved characters are treated in URI references, per section 6.5 of Internationalized Resource Identifiers (IRIs) [RFC3987].

The BASE keyword defines the Base IRI used to resolve relative IRIs per RFC3986 section 5.1.1, "Base URI Embedded in Content". Section 5.1.2, "Base URI from the Encapsulating Entity" defines how the Base IRI may come from an encapsulating document, such as a SOAP envelope with an xml:base directive, or a MIME multipart document with a Content-Location header. The "Retrieval URI" identified in 5.1.3, "Base URI from the Retrieval URI", is the URL from which a particular SPARQL query was retrieved. If none of the above specifies the Base URI, the default Base URI (section 5.1.4, "Default Base URI") is used.
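As a non-normative illustration, the Python standard library function urllib.parse.urljoin can stand in for the reference resolution algorithm of RFC3986. It is used here only as an approximation for simple cases; the base IRI and the helper name resolve are assumptions made for the sketch.

```python
from urllib.parse import urljoin

# Assumed base IRI, such as one a BASE declaration might supply.
BASE = "http://example.org/book/"

def resolve(relative, base=BASE):
    """Resolve a relative IRI reference against the base IRI."""
    return urljoin(base, relative)

print(resolve("book1"))                   # sibling of the base path
print(resolve("../ns#x"))                 # parent directory plus fragment
print(resolve("http://other.example/x"))  # absolute references are unchanged
```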

3.1.2 Syntax for Literals

The general syntax for literals is a string (enclosed in quotes, either double quotes "" or single quotes '' ), with either an optional language tag (introduced by @) or an optional datatype IRI or prefixed name (introduced by ^^).

As a convenience, integers can be written directly and are interpreted as typed literals of datatype xsd:integer; decimal numbers, where there is '.' in the number but no exponent, are interpreted as xsd:decimal and a number with an exponent is interpreted as an xsd:double. Values of type xsd:boolean can also be written as true or false.
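The shorthand rules above can be sketched as a small classifier. This non-normative Python example approximates the rules with regular expressions; the authoritative token definitions are those of the SPARQL grammar, signs are not handled, and the function name shorthand_datatype is an assumption.

```python
import re

XSD = "http://www.w3.org/2001/XMLSchema#"

def shorthand_datatype(token):
    """Return the datatype IRI implied by an unquoted literal token, or None."""
    if token in ("true", "false"):
        return XSD + "boolean"
    if re.fullmatch(r"[0-9]+", token):
        return XSD + "integer"
    # A '.' in the number but no exponent.
    if re.fullmatch(r"(?:[0-9]*\.[0-9]+|[0-9]+\.[0-9]*)", token):
        return XSD + "decimal"
    # Any number with an exponent.
    if re.fullmatch(r"(?:[0-9]+\.?[0-9]*|\.[0-9]+)[eE][+-]?[0-9]+", token):
        return XSD + "double"
    return None

for token in ("1", "1.3", "1.0e6", "true"):
    print(token, shorthand_datatype(token))
```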

To facilitate writing literal values which themselves contain quotation marks or which are long and contain newline characters, SPARQL provides an additional quoting construct in which literals are enclosed in three single- or double-quotation marks.

3.1.3 Syntax for Query Variables

Query variables in SPARQL queries have global scope; use of a given variable name anywhere in a query identifies the same variable. Variables are indicated by "?"; the "?" does not form part of the variable name. "$" is an alternative to "?". In a query, $abc and ?abc are the same variable. The possible names for variables are given in the SPARQL grammar.

3.1.4 Syntax for Blank Nodes

A blank node can appear in a query pattern and will take part in the pattern matching. Blank nodes are indicated by either the label form "_:a" or by use of "[ ]". A blank node that is used in only one place in the query syntax can be abbreviated with []. A unique blank node will be created and used to form the triple pattern. Blank node labels are written as "_:a" for a blank node with label "a" and the label is scoped to the basic graph pattern.

The [:p :v] construct can be used in triple patterns. It creates a blank node label which is used as the subject of all contained predicate-object pairs. The created blank node can also be used in further triple patterns in the subject and object positions.

allocate a unique blank node label (here "b57") and are equivalent to writing:

This allocated blank node label can be used as the subject or object of further triple patterns. For example, as a subject:

Abbreviated blank node syntax can be combined with other abbreviations for common subjects and predicates.

This is the same as writing the following basic graph pattern for some uniquely allocated blank node:

3.2 Syntax for Triple Patterns

Triple Patterns are written as a list of subject, predicate, object; there are abbreviated ways of writing some common triple pattern constructs.

3.2.1 Predicate-Object Lists

Triple patterns with a common subject can be written so that the subject is only written once, and used for more than one triple pattern by employing the ";" notation.

3.2.2 Object Lists

If triple patterns share both subject and predicate, then these can be written using the "," notation.

3.2.3 RDF Collections

RDF collections can be written in triple patterns using the syntax "( )". The form () is an alternative for the IRI http://www.w3.org/1999/02/22-rdf-syntax-ns#nil. When used with collection elements, such as (1 ?x 3 4), triple patterns and blank nodes are allocated for the collection and the blank node at the head of the collection can be used as a subject or object in other triple patterns. The blank nodes allocated do not occur elsewhere in the query.
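The allocation of triple patterns for a collection can be illustrated with the following non-normative Python sketch, which uses the rdf:first and rdf:rest properties of the RDF collection vocabulary. The generated labels (_:b0, _:b1, ...) and the function name expand_collection are assumptions; a real processor may allocate blank nodes in any way that keeps them unique.

```python
from itertools import count

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def expand_collection(elements, labels=None):
    """Return (head, triple patterns) for the collection syntax ( e1 e2 ... )."""
    if not elements:
        return RDF + "nil", []  # the form () stands for rdf:nil
    labels = labels or (f"_:b{i}" for i in count())
    nodes = [next(labels) for _ in elements]
    triples = []
    for i, (node, element) in enumerate(zip(nodes, elements)):
        rest = nodes[i + 1] if i + 1 < len(nodes) else RDF + "nil"
        triples.append((node, RDF + "first", element))
        triples.append((node, RDF + "rest", rest))
    return nodes[0], triples

head, triples = expand_collection(["1", "?x", "3", "4"])
for triple in triples:
    print(triple)
```

The returned head is the blank node that can then be used as a subject or object in other triple patterns.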

is a short form for (the blank node labels do not occur anywhere else in the query):

3.2.4 rdf:type

4 Initial Definitions

SPARQL is defined in terms of IRIs. RDF Concepts and Abstract Syntax "anticipates an RFC on Internationalized Resource Identifiers. Implementations may issue warnings concerning the use of RDF URI References that do not conform with [IRI draft] or its successors."

4.1 RDF Terms

SPARQL is defined in terms of IRIs, a subset of RDF URI References that omits spaces.

This definition of RDF Term collects together several basic notions from the RDF data model, but updated to refer to IRIs rather than RDF URI references.

Note that all IRIs are absolute; they may or may not include a fragment identifier [RFC3987, section 3.1]. IRIs include URIs [RFC3986] and URLs. The abbreviated forms (relative IRIs and prefixed names) in the SPARQL syntax are resolved to produce absolute IRIs.

4.2 Query Variables

4.3 Triple Patterns

Because RDF graphs may not contain literal subjects, any SPARQL triple pattern with a literal as subject will fail to match on any RDF graph.

4.4 Graph Pattern

SPARQL queries are made of one or more graph patterns. Graph patterns can be combined into larger patterns.

4.5 SPARQL Query

The graph pattern may be the empty pattern. The set of solution modifiers may be the empty set.

4.6 Pattern Solutions

Graph patterns match against the default graph of an RDF dataset, except for the RDF Dataset Graph Pattern. In this section, all matching is described for a single graph, being the default graph of the RDF dataset being queried.

4.7 Value Constraints

Constraints may be restrictions on the value associated with an RDF Term or they may be restrictions on some part of an RDF term, such as its lexical form. There is a set of functions and operators in SPARQL for constraints. In addition, there is an extension mechanism to provide access to functions that are not defined in the SPARQL language. Restrictions on the value of an RDF term are based on its value, as given by any datatype; value tests only apply to RDF literals.

A constraint may lead to an error condition when testing some RDF term. The exact error will depend on the constraint: for example, in numeric operations, solutions with variables bound to a non-number or a blank node will lead to an error. Any potential solution that causes an error condition in a constraint will not form part of the final results, but does not cause the query to fail.
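The treatment of errors can be sketched as follows. In this non-normative Python example, a Python exception stands in for a SPARQL error condition; the sample solutions and the name filter_with_errors are assumptions made for illustration.

```python
def filter_with_errors(solutions, constraint):
    """Keep solutions where the constraint is true; an error rejects the solution."""
    kept = []
    for solution in solutions:
        try:
            if constraint(solution):
                kept.append(solution)
        except (TypeError, KeyError):
            # Error condition (here: a non-number in a numeric test, or an
            # unbound variable). The solution is dropped; the query goes on.
            continue
    return kept

solutions = [{"price": 23}, {"price": "n/a"}, {"price": 42}, {}]
cheap = filter_with_errors(solutions, lambda s: s["price"] < 30.5)
print(cheap)
```

The solutions with a non-numeric price or no price at all are removed from the results, and the remaining solutions are still evaluated.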

@@Filters apply to the whole of the group they are in. Canonically, all matching is done, then filters are applied. Implementations will optimize this.

5 Basic Graph Patterns

A basic graph pattern is a set of triple patterns and forms the basis of SPARQL query matching. Matching a basic graph pattern is defined in terms of generic entailment to allow for future extension of the language.

Logical entailment may result in inconsistent RDF graphs. For example, "-1"^^xsd:positiveInteger is inconsistent with respect to D-entailment [RDF-MT]. The effect of any query on an inconsistent graph is not covered by this specification.

This definition extends that for RDF graph-equivalence to basic graph patterns by preserving variable names across equivalent graphs.

5.1 General Framework

The scoping set restricts the values of variable assignments in a solution. The scoping set may be characterized differently by different entailment regimes.

The scoping graph makes the graph to be matched independent of the chosen blank node names.

The same scoping set and scoping graph are used for all basic graph pattern matching in a single SPARQL query request.

The introduction of the basic graph pattern BGP' in the above definition makes the query basic graph pattern independent of the choice of blank node names in the basic graph pattern.

5.2 SPARQL Basic Graph Pattern Matching

These definitions allow for future extensions to SPARQL. This document defines SPARQL for simple entailment and the scoping set B is the set of all RDF terms in G'.

When using simple entailment, the operation of querying an RDF graph provides access to the graph structure, up to blank node renaming; nothing that is not already in the graph G needs to be inferred or constructed, even implicitly.

A pattern solution can then be defined as follows: to match a basic graph pattern under simple entailment, it is possible to proceed by finding a mapping from blank nodes and variables in the basic graph pattern to terms in the graph being matched; a pattern solution is then a mapping restricted to just the variables, possibly with blank nodes renamed. Moreover, a uniqueness property guarantees the interoperability between SPARQL systems: given a graph and a basic graph pattern, the set of all the pattern solutions is unique up to blank node renaming.

5.3 Example of Basic Graph Pattern Matching

mbox
<mailto:outlaw@example.com>

This query contains a basic graph pattern of two triple patterns, each of which must match with the same solution for the graph pattern to match. The pattern solution matching the basic graph pattern maps the variable 'x' to blank node _:a and variable 'mbox' to the IRI mailto:outlaw@example.com. The query only returns the variable 'mbox'.
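The requirement that both triple patterns match with the same solution can be illustrated by a small non-normative matcher in Python. The data below is assumed for the sketch (only the blank node _:a and the mbox IRI appear in the text above), and the helper names are hypothetical. The sketch matches by direct comparison of terms and does not attempt to model entailment or blank node renaming.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
graph = [  # assumed data
    ("_:a", FOAF + "name", "Johnny Lee Outlaw"),
    ("_:a", FOAF + "mbox", "mailto:outlaw@example.com"),
    ("_:b", FOAF + "name", "A. N. Other"),
    ("_:b", FOAF + "mbox", "mailto:other@example.com"),
]

def match_triple(pattern, triple, binding):
    """Extend `binding` so that `pattern` equals `triple`, or return None."""
    result = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            if result.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return result

def match_bgp(patterns, graph):
    """Solutions under which every triple pattern matches some triple."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [m for s in solutions for t in graph
                     if (m := match_triple(pattern, t, s)) is not None]
    return solutions

patterns = [("?x", FOAF + "name", "Johnny Lee Outlaw"),
            ("?x", FOAF + "mbox", "?mbox")]
solutions = match_bgp(patterns, graph)
print([s["?mbox"] for s in solutions])  # the query returns only ?mbox
```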

5.4 Basic Graph Patterns in the SPARQL Syntax

In the SPARQL syntax, Basic Graph Patterns are sequences of triple patterns. Other graph patterns separate basic patterns. The two query fragments below each contain the same basic graph pattern of

6 Group Graph Patterns

Complex graph patterns can be made by combining simpler graph patterns. The ways of creating graph patterns are:

6.1 Group Graph Patterns

For any solution, the same variable is given the same value everywhere in the set of graph patterns making up the group graph pattern.

In a SPARQL query string, a group graph pattern is delimited with braces: {}. For example, this query has a group graph pattern of one basic graph pattern as the query pattern.

6.2 Empty Group Pattern

matches any graph (including the empty graph) requiring no substitutions for variables.

6.3 Order of Evaluation

There is no implied order of graph patterns within a Group Graph Pattern. Any solution for the group graph pattern that can satisfy all the graph patterns in the group is valid, independently of the order that may be implied by the lexical order of the graph patterns in the group.

7 Including Optional Values

Basic graph patterns allow applications to make queries where the entire query pattern must match for there to be a solution. For every solution of the query, every variable is bound to an RDF Term in a pattern solution. However, regular, complete structures cannot be assumed in all RDF graphs and it is useful to be able to have queries that allow information to be added to the solution where the information is available, but not to have the solution rejected because some part of the query pattern does not match. Optional matching provides this facility; if the optional part does not lead to any solutions, it creates no bindings.

7.1 Optional Pattern Matching

Optional parts of the graph pattern may be specified syntactically with the OPTIONAL keyword applied to a graph pattern:

name	mbox
"Alice"	<mailto:alice@example.com>
"Alice"	<mailto:alice@work.example>
"Bob"

There is no value of mbox in the solution where the name is "Bob". It is unbound.

This query finds the names of people in the data. If there is a triple with predicate mbox and same subject, a solution will contain the object of that triple as well. In the example, only a single triple pattern is given in the optional match part of the query but, in general, it is any graph pattern. The whole graph pattern of an optional graph pattern must match for the optional graph pattern to affect the query solution.
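A non-normative way to picture optional matching is as an extension step that keeps a solution unchanged when the optional part finds nothing. The Python sketch below assumes a small FOAF data set consistent with the result table above; the blank node labels and helper names are assumptions, and the optional part is limited to a single triple pattern for brevity.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
NAME, MBOX = FOAF + "name", FOAF + "mbox"
graph = [  # assumed data
    ("_:a", NAME, "Alice"),
    ("_:a", MBOX, "mailto:alice@example.com"),
    ("_:a", MBOX, "mailto:alice@work.example"),
    ("_:b", NAME, "Bob"),
]

def match(pattern, graph, binding):
    """Every extension of `binding` under which `pattern` is a triple of `graph`."""
    results = []
    for triple in graph:
        candidate = dict(binding)
        if all(candidate.setdefault(p, t) == t if p.startswith("?") else p == t
               for p, t in zip(pattern, triple)):
            results.append(candidate)
    return results

def optional(solutions, pattern, graph):
    """Extend each solution where possible; otherwise pass it through as is."""
    out = []
    for s in solutions:
        out.extend(match(pattern, graph, s) or [s])
    return out

required = match(("?x", NAME, "?name"), graph, {})
rows = [(s["?name"], s.get("?mbox"))
        for s in optional(required, ("?x", MBOX, "?mbox"), graph)]
print(rows)
```

In the printed rows, None marks the unbound mbox in the solution for "Bob".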

7.2 Constraints in Optional Pattern Matching

title	price
"SPARQL Tutorial"
"The Semantic Web"	23

No price appears for the book with title "SPARQL Tutorial" because the optional graph pattern did not lead to a solution involving the variable "price".

7.3 Multiple Optional Graph Patterns

Graph patterns are defined recursively. A graph pattern may have zero or more optional graph patterns, and any part of a query pattern may have an optional part. In this example, there are two optional graph patterns.

name	mbox	hpage
"Alice"		<http://work.example.org/alice/>
"Bob"	<mailto:bob@work.example>

7.4 Optional Matching – Formal Definition

In an optional match, either an additional graph pattern matches a graph, thereby defining one or more pattern solutions; or it passes the solution without adding any additional bindings.

7.5 Nested Optional Graph Patterns

Optional patterns can occur inside any group graph pattern, including a group graph pattern which itself is optional, forming a nested pattern. The outer optional graph pattern must match for any nested optional pattern to be matched.

This query finds the name, optionally the mbox, and also the vCard given name; further, if there is a vCard Family name as well as the Given name, the query finds that as well.

foafName	mbox	gname	fname
"Alice"	<mailto:alice@work.example>	"Alice"	"Hacker"
"Bob"	<mailto:bob@work.example>
"Ella"		"Eleanor"

By nesting the optional pattern involving vcard:Family, the query only matches these if there are appropriate vcard:N and vcard:Given triples in the data. Here the expression is a simple triple pattern on vcard:N but it could be a complex graph pattern with value constraints.

8 Matching Alternatives

SPARQL provides a means of combining graph patterns so that one of several alternative graph patterns may match. If more than one of the alternatives matches, all the possible pattern solutions are found.

8.1 Joining Patterns with UNION

This query will only match a book if it has both a title and creator predicate from the same version of Dublin Core.

title
"SPARQL Protocol Tutorial"
"SPARQL"
"SPARQL (updated)"
"SPARQL Query Language Tutorial"

x	y
	"SPARQL (updated)"
	"SPARQL Protocol Tutorial"
"SPARQL"
"SPARQL Query Language Tutorial"

author	title
"Alice"	"SPARQL Protocol Tutorial"
"Bob"	"SPARQL Query Language Tutorial"

8.2 Union Matching – Formal Definition

Query results involving a pattern containing GP1 and GP2 will include separate solutions for each match where GP1 and GP2 give rise to different sets of bindings.
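As a non-normative sketch, the solutions of a union can be pictured as the solutions of each alternative collected together. The Python example below assumes a single book described with both Dublin Core 1.0 and 1.1 title properties; the data and the helper names are assumptions made for illustration.

```python
DC10_TITLE = "http://purl.org/dc/elements/1.0/title"
DC11_TITLE = "http://purl.org/dc/elements/1.1/title"
graph = [  # assumed data
    ("_:c", DC10_TITLE, "SPARQL"),
    ("_:c", DC11_TITLE, "SPARQL (updated)"),
]

def match(pattern, graph):
    """Solutions of a single triple pattern against the graph."""
    results = []
    for triple in graph:
        binding = {}
        if all(binding.setdefault(p, t) == t if p.startswith("?") else p == t
               for p, t in zip(pattern, triple)):
            results.append(binding)
    return results

def union(*solution_lists):
    """All solutions of every alternative, kept as separate solutions."""
    return [s for solutions in solution_lists for s in solutions]

gp1 = match(("?book", DC10_TITLE, "?x"), graph)
gp2 = match(("?book", DC11_TITLE, "?y"), graph)
rows = [(s.get("?x"), s.get("?y")) for s in union(gp1, gp2)]
print(rows)
```

Because the two alternatives bind different variables, each solution leaves one of x and y unbound (shown as None).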

9 RDF Dataset

The RDF data model expresses information as graphs, consisting of triples with subject, predicate and object. Many RDF data stores hold multiple RDF graphs, and record information about each graph, allowing an application to make queries that involve information from more than one graph.

A SPARQL query is executed against an RDF Dataset which represents a collection of graphs. An RDF Dataset comprises one graph, the default graph, which does not have a name, and zero or more named graphs, each identified by IRI. A SPARQL query can match different parts of the query pattern against different graphs as described in the query section.

In the previous sections, all queries have been shown executed against a single graph, being the default graph of an RDF dataset. A query does not need to involve the default graph; the query can just involve matching named graphs.

9.1 Examples of RDF Datasets

The definition of RDF Dataset does not restrict the relationships of named and default graphs. Information can be repeated in different graphs; relationships between graphs can be exposed. Two useful arrangements are:

In this example, the default graph contains the names of the publishers of two named graphs. The triples in the named graphs are not visible in the default graph in this example.

RDF data can be combined by the RDF merge [RDF-MT] of graphs. One possible arrangement of graphs in an RDF Dataset is to have the default graph being the RDF merge of some or all of the information in the named graphs.

In this next example, the named graphs contain the same triples as before. The RDF dataset includes an RDF merge of the named graphs in the default graph, re-labeling blank nodes to keep them distinct.

9.2 Specifying RDF Datasets

A SPARQL query may specify the dataset to be used for matching using the FROM clause and the FROM NAMED clause to describe the RDF dataset. If a query provides such a dataset description, then it is used in place of any dataset that the query service would use if no dataset description is provided in a query. The RDF dataset may also be specified in a SPARQL protocol request, in which case the protocol description overrides any description in the query itself. A query service may refuse a query request if the dataset description is not acceptable to the service.

A query processor may use these IRIs in any way to associate an RDF Dataset with a query. For example, it could use IRIs to retrieve documents, parse them and use the resulting triples as one of the graphs; alternatively, it might only service queries that specify IRIs of graphs that it has already stored.

The FROM and FROM NAMED keywords allow a query to specify an RDF dataset by reference; they indicate that the dataset should include graphs that are obtained from representations of the resources identified by the given IRIs (i.e. the absolute form of the given IRI references). The dataset resulting from a number of FROM and FROM NAMED clauses is:

9.2.1 Specifying the Default Graph

Each FROM clause contains an IRI that indicates the graph to be used to form the default graph. This does not put the graph in as a named graph; a query can do this by also specifying the graph in the FROM NAMED clause.

name
"Alice"

If a query provides more than one FROM clause, providing more than one IRI to indicate the default graph, then the default graph is based on the RDF merge of the graphs obtained from representations of the resources identified by the given IRIs.
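The RDF merge used to form the default graph can be illustrated with a non-normative Python sketch. It relabels blank nodes per source graph so that blank nodes from different graphs stay distinct, then takes the union of the triples. The relabeling scheme, the sample data and the name rdf_merge are assumptions; the sketch treats any term beginning with "_:" as a blank node.

```python
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"

def rdf_merge(*graphs):
    """Union of the graphs after relabeling blank nodes apart."""
    merged = set()
    for index, graph in enumerate(graphs):
        def rename(term):
            # Prefix each blank node label with the index of its source graph.
            return f"_:g{index}_{term[2:]}" if term.startswith("_:") else term
        merged.update(tuple(rename(t) for t in triple) for triple in graph)
    return merged

# Both graphs happen to use the label _:a for different blank nodes.
g1 = {("_:a", FOAF_NAME, "Alice")}
g2 = {("_:a", FOAF_NAME, "Bob")}
default_graph = rdf_merge(g1, g2)
print(sorted(default_graph))
```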

9.2.2 Specifying Named Graphs

A query can supply IRIs for the named graphs in the RDF Dataset using the FROM NAMED clause. Each IRI is used to provide one named graph in the RDF Dataset. Using the same IRI in two or more FROM NAMED clauses results in one named graph with that IRI appearing in the dataset.

The FROM NAMED syntax suggests that the IRI identifies the corresponding graph, but actually the relationship between an IRI and a graph in an RDF dataset is indirect. The IRI identifies a resource, and the resource is represented by a graph (or, more precisely: by a document that serializes a graph). For further details see [WEBARCH].

9.2.3 Combining FROM and FROM NAMED

who	g	mbox
"Bob Hacker"	<http://example.org/bob>	<mailto:bob@oldcorp.example.org>
"Alice Hacker"	<http://example.org/alice>	<mailto:alice@work.example.org>

This query finds the mbox together with the information in the default graph about the publisher. <http://example.org/dft.ttl> is just the IRI used to form the default graph, not its name.

9.3 Querying the Dataset

When querying a collection of graphs, the GRAPH keyword is used to match patterns against named graphs. This is by either using an IRI to select a graph or using a variable to range over the IRIs naming graphs.

src	bobNick
<http://example.org/foaf/aliceFoaf>	"Bobby"
<http://example.org/foaf/bobFoaf>	"Robert"

nick
"Robert"

mbox	nick	ppd
<mailto:bob@work.example>	"Robert"	<http://example.org/foaf/bobFoaf>

Any triple in Alice's FOAF file giving Bob's nick is not used to provide a nick for Bob because the pattern involving variable nick is restricted by ppd to a particular Personal Profile Document.
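The two uses of the GRAPH keyword, with a variable ranging over graph names and with a fixed IRI, can be sketched as follows. This non-normative Python example models the named graphs as a mapping from IRI to graph; the blank node labels, the triples and the helper names are assumptions chosen to be consistent with the first two result tables in this section.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
MBOX, NICK = FOAF + "mbox", FOAF + "nick"
BOB = "mailto:bob@work.example"
named_graphs = {  # assumed contents of the two named graphs
    "http://example.org/foaf/aliceFoaf": [("_:b", MBOX, BOB), ("_:b", NICK, "Bobby")],
    "http://example.org/foaf/bobFoaf": [("_:z", MBOX, BOB), ("_:z", NICK, "Robert")],
}

def match_bgp(patterns, graph, binding):
    """Solutions extending `binding` under which all patterns match `graph`."""
    solutions = [dict(binding)]
    for pattern in patterns:
        extended = []
        for s in solutions:
            for triple in graph:
                m = dict(s)
                if all(m.setdefault(p, t) == t if p.startswith("?") else p == t
                       for p, t in zip(pattern, triple)):
                    extended.append(m)
        solutions = extended
    return solutions

def match_graph(name, patterns, named_graphs):
    """GRAPH name { patterns }: `name` is an IRI or a variable such as '?src'."""
    solutions = []
    for iri, graph in named_graphs.items():
        if name.startswith("?"):
            start = {name: iri}  # the variable ranges over the graph names
        elif name == iri:
            start = {}           # a fixed IRI selects one graph
        else:
            continue
        solutions.extend(match_bgp(patterns, graph, start))
    return solutions

patterns = [("?x", MBOX, BOB), ("?x", NICK, "?nick")]
by_source = [(s["?src"], s["?nick"])
             for s in match_graph("?src", patterns, named_graphs)]
only_bob = [s["?nick"] for s in
            match_graph("http://example.org/foaf/bobFoaf", patterns, named_graphs)]
print(by_source)
print(only_bob)
```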

9.3.4 Named and Default Graphs

Query patterns can involve both the default graph and the named graphs. In this example, an aggregator has read in a Web resource on two different occasions. Each time a graph is read into the aggregator, it is given an IRI by the local system. The graphs are nearly the same but the email address for "Bob" has changed.

The default graph is being used to record the provenance information and the RDF data actually read is kept in two separate graphs, each of which is given a different IRI by the system. The RDF dataset consists of two named graphs and the information about them.

name	mbox	date
"Bob"	<mailto:bob@oldcorp.example.org>	"2004-12-06"^^xsd:date
"Bob"	<mailto:bob@newcorp.example.org>	"2005-01-10"^^xsd:date

10 Query Result Forms

SPARQL has four query result forms. These result forms use the solutions from pattern matching to form result sets or RDF graphs. The query result forms are:

10.1 Solution Sequences and Result Forms

Query patterns generate an unordered collection of solutions, each solution being a function from variables to RDF terms. These solutions are then treated as a sequence, initially in no specific order; any sequence modifiers are then applied to create another sequence. Finally, this latter sequence is used to generate one of the SPARQL result forms.

The solution sequence from matching the query pattern is a collection formed from the solutions of the query pattern with no defined order.

10.1.1 ORDER BY

The ORDER BY clause takes a solution sequence and applies an ordering condition based on all the expressions and directions specified in the ORDER BY clause. The ordering condition also involves a direction of ordering which is ascending by default. It can be explicitly set to ascending or descending by enclosing the condition in ASC() or DESC() respectively. If multiple expressions are given, then they are applied in turn until one gives the indication of the ordering.

Using ORDER BY on a solution sequence for a result form other than SELECT has no direct effect because only SELECT returns a sequence of results. In combination with LIMIT and OFFSET, it can be used to return partial results.

The "<" operator (see the Operator Mapping) defines the relative order of pairs of numerics, xsd:dateTimes and xsd:strings.

IRIs are ordered by comparing the character strings making up each IRI using the "<" operator.

SPARQL also defines a fixed, arbitrary order between some kinds of RDF terms that would not otherwise be ordered. This arbitrary order is necessary to provide slicing of query solutions by use of LIMIT and OFFSET.

If the ordering criteria do not specify the order of values, then the ordering in the solution sequence is undefined.

Ordering a sequence of solutions always results in a sequence with the same number of solutions in it, even if the ordering criteria do not differentiate between two solutions.
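Applying several ordering conditions in turn can be sketched with a stable sort. The non-normative Python example below assumes that every ordered variable is bound and that the values are mutually comparable; it does not model the fixed order between different kinds of RDF terms. The sample solutions and the name order_by are assumptions.

```python
def order_by(solutions, *conditions):
    """Order by (variable, descending) pairs, most significant first."""
    ordered = list(solutions)
    # Sorting by the least significant condition first, with a stable sort,
    # leaves earlier conditions to decide and later ones to break ties.
    for variable, descending in reversed(conditions):
        ordered.sort(key=lambda s: s[variable], reverse=descending)
    return ordered

solutions = [  # assumed sample solutions
    {"name": "Alice", "age": 30},
    {"name": "Carol", "age": 25},
    {"name": "Bob", "age": 30},
]
# Roughly: ORDER BY DESC(?age) ?name
ordered = order_by(solutions, ("age", True), ("name", False))
print([s["name"] for s in ordered])
```

The ordered sequence has the same number of solutions as the input, whether or not the conditions distinguish all of them.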

10.1.2 Projection

The solution sequence can be transformed into one involving only a subset of the variables. For each solution in the sequence, a new solution is formed using a specified selection of the variables.

The following example shows a query to extract just the names of people described in an RDF graph using FOAF properties.

name
"Bob"
"Alice"

10.1.3 DISTINCT

The solution sequence can be modified by adding the DISTINCT keyword which ensures that every combination of variable bindings (i.e. each solution) in the sequence is unique.

name
"Alice"

If DISTINCT and LIMIT or OFFSET are specified, then duplicates are eliminated before the limit or offset is applied.
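The effect of eliminating duplicates before applying a limit can be seen in the following non-normative Python sketch. The sample solutions and the function name distinct are assumptions; a Python list slice stands in for LIMIT.

```python
def distinct(solutions):
    """Drop repeated solutions, keeping the first occurrence of each."""
    seen, unique = set(), []
    for s in solutions:
        key = frozenset(s.items())
        if key not in seen:
            seen.add(key)
            unique.append(s)
    return unique

names = [{"name": "Alice"}, {"name": "Alice"}, {"name": "Bob"}]

distinct_then_limit = distinct(names)[:2]  # the order described above
limit_then_distinct = distinct(names[:2])  # the other order, for contrast
print(distinct_then_limit)
print(limit_then_distinct)
```

With duplicates removed first, a limit of 2 yields two different solutions; taking the limit first would leave only one.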

10.1.4 OFFSET

OFFSET causes the solutions generated to start after the specified number of solutions. An OFFSET of zero has no effect.

The order in which solutions are returned is initially undefined. Using LIMIT and OFFSET to select different subsets of the query solutions will not be useful unless the order is made predictable by using ORDER BY.

10.1.5 LIMIT

The LIMIT form puts an upper bound on the number of solutions returned. If the number of actual solutions is greater than the limit, then at most the limit number of solutions will be returned.

A limit of 0 would cause no results to be returned. A limit may not be negative.
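OFFSET and LIMIT together select a slice of the solution sequence, which can be sketched as follows. This Python example is non-normative; the function name slice_solutions is an assumption, and rejecting a negative offset is an assumption of the sketch rather than something stated above.

```python
def slice_solutions(solutions, offset=0, limit=None):
    """Skip `offset` solutions, then return at most `limit` of the rest."""
    if limit is not None and limit < 0:
        raise ValueError("a limit may not be negative")
    if offset < 0:
        raise ValueError("offset assumed to be non-negative in this sketch")
    end = None if limit is None else offset + limit
    return solutions[offset:end]

solutions = list(range(10))  # stand-ins for ten ordered solutions
print(slice_solutions(solutions, offset=2, limit=3))
print(slice_solutions(solutions, limit=0))
```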

10.2 Selecting Variables

The SELECT form of results returns the variables directly. The syntax SELECT * is an abbreviation that selects all of the variables in a query.

nameX	nameY	nickY
"Alice"	"Bob"
"Alice"	"Clare"	"CT"

10.3 Constructing an Output Graph

The CONSTRUCT result form returns a single RDF graph specified by a graph template. The result is an RDF graph formed by taking each query solution in the solution sequence, substituting for the variables into the graph template, and combining the triples into a single RDF graph by set union.

If any such instantiation produces a triple containing an unbound variable, or an illegal RDF construct (such as a literal in subject or predicate position), then that triple is not included in the output RDF graph. The graph template can contain ground or explicit triples, that is, triples with no variables, and these also appear in the output RDF graph returned by the CONSTRUCT query form.
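Template instantiation can be sketched as substitution followed by set union. In the non-normative Python example below, the IRIs, the sample solutions and the name construct are assumptions. The sketch only skips triples that contain an unbound variable; it does not check for illegal RDF constructs and does not handle blank nodes in templates.

```python
NS = "http://example.org/ns#"  # assumed vocabulary namespace
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

def construct(template, solutions):
    """Instantiate each template triple per solution; combine by set union."""
    graph = set()
    for solution in solutions:
        for triple in template:
            terms = [solution.get(t) if t.startswith("?") else t for t in triple]
            if None not in terms:  # skip triples with an unbound variable
                graph.add(tuple(terms))
    return graph

template = [("?x", NS + "name", "?name"), ("?x", RDF_TYPE, NS + "Person")]
solutions = [
    {"?x": "http://example.org/alice", "?name": "Alice"},
    {"?x": "http://example.org/bob"},  # ?name is unbound in this solution
]
output = construct(template, solutions)
print(sorted(output))
```

The second solution contributes only the rdf:type triple, because the name triple would contain an unbound variable.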

10.3.1 Templates with Blank Nodes

A template can create an RDF graph containing blank nodes. The blank node labels are scoped to the template for each solution. If the same label occurs twice in a template, then there will be one blank node created for each query solution but there will be different blank nodes across triples generated by different query solutions.

The use of variable ?x in the template, which in this example will be bound to blank nodes (which have labels _:a and _:b in the data), causes different blank node labels (_:v1 and _:v2), as shown by the results.
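
The scoping of template blank node labels to each solution can be sketched as below. The function name `instantiate_blank_nodes`, the string encoding of terms and the generated labels `_:b1`, `_:b2` are assumptions for illustration; a real processor may choose any fresh labels.

```python
from itertools import count

def instantiate_blank_nodes(template, solutions):
    """Give each template blank node label a fresh blank node per solution."""
    fresh = count(1)
    triples = []
    for solution in solutions:
        renamed = {}  # template label -> fresh label, reset for every solution
        for triple in template:
            out = []
            for term in triple:
                if term.startswith("_:"):
                    if term not in renamed:
                        renamed[term] = f"_:b{next(fresh)}"
                    term = renamed[term]
                elif term.startswith("?"):
                    term = solution[term]
                out.append(term)
            triples.append(tuple(out))
    return triples

template = [("_:v", "<givenName>", "?g"), ("_:v", "<familyName>", "?f")]
solutions = [
    {"?g": '"Alice"', "?f": '"Hacker"'},
    {"?g": '"Bob"', "?f": '"Hacker"'},
]
for triple in instantiate_blank_nodes(template, solutions):
    print(triple)
```

Within one solution both triples share a blank node; across solutions the blank nodes differ.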

10.3.2 Accessing Graphs in the RDF Dataset

Using CONSTRUCT it is possible to extract parts or the whole of graphs from the target RDF dataset. This first example returns the graph (if it is in the dataset) with IRI label http://example.org/aGraph; otherwise, it returns an empty graph.

The access to the graph can be conditional on other information. If the default graph contains metadata about the named graphs in the dataset, then a query like the following one can extract one graph based on information about the named graph:

where app:customDate identifies an extension function that turns the date format into an xsd:dateTime RDF term.

10.3.3 Solution Modifiers and CONSTRUCT

The solution modifiers of a query affect the results of a CONSTRUCT query. In this example, the output graph from the CONSTRUCT template is formed from just 2 of the solutions from graph pattern matching. The query outputs a graph with the names of the people with the top 2 sites, rated by hits. The triples in the RDF graph are not ordered.

10.4 Descriptions of Resources (Non-normative)

Current conventions for DESCRIBE return an RDF graph without any specified constraints. Future SPARQL specifications may further constrain the results of DESCRIBE, rendering some currently valid DESCRIBE responses invalid. As with any query, a service may refuse to serve a DESCRIBE query.

The DESCRIBE form returns a single result RDF graph containing RDF data about resources. This data is not prescribed by a SPARQL query, where the query client would need to know the structure of the RDF in the data source, but, instead, is determined by the SPARQL query processor. The query pattern is used to create a result set. The DESCRIBE form takes each of the resources identified in a solution, together with any resources directly named by IRI, and assembles a single RDF graph by taking a "description" from the target knowledge base. The description is determined by the query service. The syntax DESCRIBE * is an abbreviation that identifies all of the variables in a query.

10.4.1 Explicit IRIs

The DESCRIBE clause itself can take IRIs to identify the resources. The simplest DESCRIBE query is just an IRI in the DESCRIBE clause:

10.4.2 Identifying Resources

The resources can also be a query variable from a result set. This enables description of resources whether they are identified by IRI or by blank node in the dataset:

The property foaf:mbox is defined as being an inverse functional property in the FOAF vocabulary. If treated as such, this query will return information about at most one person. If, however, the query pattern has multiple solutions, the RDF data for each is the union of all RDF graph descriptions.

10.4.3 Descriptions of Resources

The RDF returned is determined by the information publisher. It is the useful information the service has about a resource. It may include information about other resources: the RDF data for a book may also include details about the author.

which includes the blank node closure for the vcard vocabulary vcard:N. Other possible mechanisms for deciding what information to return include Concise Bounded Descriptions [CBD].

For a vocabulary such as FOAF, where the resources are typically blank nodes, it would be appropriate to return sufficient information to identify a node, such as the InverseFunctionalProperty foaf:mbox_sha1sum, as well as information like name and other recorded details. In the example, the match to the WHERE clause was returned, but this is not required.

10.5 Asking "yes or no" questions

Applications can use the ASK form to test whether or not a query pattern has a solution. No information is returned about the possible query solutions, just whether the server can find one or not.

11 Testing Values

SPARQL FILTERs restrict the set of solutions according to a given expression. Specifically, FILTERs eliminate any solutions that, when substituted into the expression, either result in an effective boolean value of false or produce an error. Effective boolean values are defined in section 11.2.2 Effective Boolean Value; errors are defined in XQuery 1.0: An XML Query Language [XQUERY] section 2.3.1, Kinds of Errors.
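
As an informal illustration of this rule, the following Python sketch keeps a solution only when the filter expression evaluates to true, and drops it when the expression is false or raises an error. The names `ExprError`, `apply_filter` and `price_below_30` are hypothetical, and solutions are modelled as dictionaries.

```python
class ExprError(Exception):
    """Stands in for an error raised while evaluating a filter expression."""

def apply_filter(solutions, expression):
    kept = []
    for solution in solutions:
        try:
            if expression(solution):
                kept.append(solution)
        except ExprError:
            pass  # an error eliminates the solution, just as false does
    return kept

def price_below_30(solution):
    if "price" not in solution:
        raise ExprError("?price is unbound")
    return solution["price"] < 30

books = [
    {"title": "SPARQL Tutorial", "price": 42},
    {"title": "The Semantic Web", "price": 23},
    {"title": "Untitled draft"},  # no binding for ?price
]
print(apply_filter(books, price_below_30))
```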

The SPARQL operators are listed in section 11.3 and are associated with their productions in the grammar.

In addition, SPARQL provides the ability to invoke arbitrary functions, including a subset of the XPath casting functions, listed in section 11.5. They are invoked by name (an IRI) within a SPARQL query:

11.1 Operand Data Types

SPARQL functions and operators operate on RDF terms and SPARQL variables. A subset of these functions and operators are taken from the XQuery 1.0 and XPath 2.0 Functions and Operators [FUNCOP] and have XML Schema typed value arguments and return types. RDF typed literals passed as arguments to these functions and operators are mapped to XML Schema typed values with a string value of the lexical form and an atomic datatype corresponding to the datatype IRI. The returned typed values are mapped back to RDF typed literals @@by reversing this mapping@@.

When referring to a type, the following terms denote a typed literal with the corresponding XML Schema [XSDT] datatype IRI:

Extended SPARQL implementations may treat additional types as being derived from numeric types.

11.2 Filter Evaluation

SPARQL provides a subset of the functions and operators defined by XQuery Operator Mapping. XQuery 1.0 section 2.2.3 Expression Processing describes the invocation of XPath functions. The following rules accommodate the differences in the data and execution models between XQuery and SPARQL:

The logical-and and logical-or truth table for true (T), false (F), and error (E) is as follows:
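
The three-valued behaviour can also be written as a small Python sketch that prints every combination of inputs. The encoding of the three outcomes as the strings "T", "F" and "E" is an assumption made for this illustration.

```python
T, F, E = "T", "F", "E"  # true, false, error

def logical_or(a, b):
    if a == T or b == T:
        return T          # a true branch wins even if the other is an error
    if a == E or b == E:
        return E
    return F

def logical_and(a, b):
    if a == F or b == F:
        return F          # a false branch wins even if the other is an error
    if a == E or b == E:
        return E
    return T

print("A B A||B A&&B")
for a in (T, F, E):
    for b in (T, F, E):
        print(a, b, logical_or(a, b), logical_and(a, b))
```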

11.2.1 Invocation

SPARQL defines a syntax for invoking functions and operators on a list of arguments. These are invoked as follows:

If any of these steps fails, the invocation generates an error. The effects of errors are defined in Filter Evaluation.

11.2.2 Effective Boolean Value (EBV)

The XQuery Effective Boolean Value rules rely on the definition of XPath's fn:boolean. The following rules reflect the rules for fn:boolean applied to the argument types present in SPARQL Queries:

An EBV of true is represented as a typed literal with a datatype of xsd:boolean and a lexical value of "true"; an EBV of false is represented with a lexical value of "false".

[Informative: Effective boolean value is used to calculate the arguments to the logical functions logical-and, logical-or, and fn:not, as well as evaluate the result of a filter.]
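
A minimal Python sketch of the effective boolean value calculation follows. It assumes that argument values have already been mapped to Python types (bool for xsd:boolean, str for plain literals and xsd:string, int, float and Decimal for numeric types); the function name `ebv` and this mapping are illustrative choices, not part of the specification.

```python
import math
from decimal import Decimal

def ebv(value):
    """Return the effective boolean value, or raise TypeError for other arguments."""
    if isinstance(value, bool):           # checked first: bool is a subclass of int
        return value
    if isinstance(value, str):
        return len(value) > 0             # false only for the zero-length string
    if isinstance(value, (int, float, Decimal)):
        if isinstance(value, float) and math.isnan(value):
            return False                  # NaN has an EBV of false
        return value != 0                 # false if numerically equal to zero
    raise TypeError(f"no effective boolean value for {value!r}")

for sample in (True, "", "false", 0, 2.5, float("nan")):
    print(repr(sample), ebv(sample))
```

Note that the non-empty string "false" has an EBV of true; only a value typed as xsd:boolean is interpreted by its boolean value.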

11.3 Operator Mapping

The SPARQL grammar identifies a set of operators (for instance, &&, *, isIRI) used to construct constraints. The following table associates each of these grammatical productions with the appropriate operands and an operator function defined by either XQuery 1.0 and XPath 2.0 Functions and Operators [FUNCOP] or the SPARQL operators specified in section 11.4. When selecting the operator definition for a given set of parameters, the definition with the most specific parameters applies. For instance, when evaluating xsd:integer = xsd:signedInt, the definition for = with two numeric parameters applies, rather than the one with two RDF terms. The table is arranged so that the upper-most viable candidate is the most specific. Operators invoked without appropriate operands result in a type error.

SPARQL follows XPath's scheme for numeric type promotions and subtype substitution for arguments to numeric operators. The XPath Operator Mapping rules for numeric operands {xs:integer, xs:decimal, xs:float, xs:double, and types derived from a numeric type} apply to SPARQL operators as well (see XML Path Language (XPath) 2.0 [XPATH20] for definitions of numeric type promotions and subtype substitution). Some of the operators are associated with nested function expressions, e.g. fn:not(op:numeric-equal(A, B)). Note that per the XPath definitions, fn:not and op:numeric-equal produce an error if their argument is an error.

The collation for fn:compare is defined by XPath and identified by http://www.w3.org/2005/xpath-functions/collation/codepoint. This collation allows for string comparison based on code point values. Codepoint string equivalence can be tested with RDF term equivalence.

11.3.1 Operator Extensibility

Extended SPARQL implementations may support additional associations between operators and operator functions; this amounts to adding rows to the table above. No additional operator support may yield a result that replaces any result other than a type error in an unextended implementation. The consequence of this rule is that extended SPARQL implementations will produce at least the same solutions as an unextended implementation, and may, for some queries, produce more solutions.

11.4 Operators Definitions

This section defines the operators introduced by the SPARQL Query language. The examples show the behavior of the operators as invoked by the appropriate grammatical constructs.

11.4.1 bound

Returns true if a var is bound to a value. Returns false otherwise. Variables with the value NaN or INF are considered bound. See 6.3 Unbound Variables for a discussion of why variables may be unbound.

11.4.2 isIRI

Returns true if term is an IRI. Returns false otherwise. isURI is an alternate spelling for the isIRI operator.

11.4.3 isBlank

In this example, there were two objects of foaf:knows predicates, but only one (_:c) was a blank node.

11.4.4 isLiteral

11.4.5 str

Returns the lexical form of ltrl, a literal; returns the codepoint representation of rsrc (an IRI). This is useful for examining parts of an IRI, for instance, the host-name.

11.4.6 lang

Returns the language tag of ltrl, if it has one. It returns "" if ltrl has no language tag.

11.4.7 datatype

Returns the datatype IRI of ltrl if ltrl is a typed literal; returns xsd:string if ltrl is a simple literal; produces an error otherwise.
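
The three accessors described in sections 11.4.5 to 11.4.7 can be sketched in Python as follows. The classes `IRI` and `Literal` and the function names `sparql_str`, `sparql_lang` and `sparql_datatype` are hypothetical; they model RDF terms only as far as this example requires.

```python
from dataclasses import dataclass
from typing import Optional

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"

@dataclass(frozen=True)
class IRI:
    value: str

@dataclass(frozen=True)
class Literal:
    lexical: str
    lang: Optional[str] = None       # language tag, if any
    datatype: Optional[str] = None   # datatype IRI, if the literal is typed

def sparql_str(term):
    if isinstance(term, Literal):
        return term.lexical          # lexical form of a literal
    if isinstance(term, IRI):
        return term.value            # codepoint representation of an IRI
    raise TypeError("str is defined for literals and IRIs only")

def sparql_lang(term):
    if isinstance(term, Literal):
        return term.lang or ""       # "" when there is no language tag
    raise TypeError("lang is defined for literals only")

def sparql_datatype(term):
    if isinstance(term, Literal):
        if term.datatype is not None:
            return IRI(term.datatype)
        if term.lang is None:
            return IRI(XSD_STRING)   # simple literal
    raise TypeError("datatype is not defined for this term")

print(sparql_str(IRI("http://example/a")))
print(sparql_lang(Literal("chat", lang="fr")))
print(sparql_datatype(Literal("Alice")))
```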

11.4.8 logical-or

Returns a logical OR of left and right. As with other functions and operators with boolean arguments, logical-or operates on the effective boolean value of its arguments.

Note: see section 11.2, Filter Evaluation, for the || operator's treatment of errors.

11.4.9 logical-and

Returns a logical AND of left and right. As with other functions and operators with boolean arguments, logical-and operates on the effective boolean value of its arguments.

Note: see section 11.2, Filter Evaluation, for the && operator's treatment of errors.

11.4.10 RDFterm-equal

Returns TRUE if term1 and term2 are the same RDF term as defined in Resource Description Framework (RDF): Concepts and Abstract Syntax [CONCEPTS]; produces a type error if the arguments are both literal but are not the same RDF term ^*; returns FALSE otherwise. term1 and term2 are the same if any of the following is true:

In this query for documents that were annotated on New Year's Day (2004 or 2005), the RDF terms are not the same, but have equivalent values:

^* Invoking RDFterm-equal on two typed literals tests for equivalent values. An extended implementation may have support for additional datatypes. An implementation processing a query that tests for equivalence on unsupported datatypes (and non-identical lexical form and datatype URI) returns an error, indicating that it was unable to determine whether or not the values are equivalent. For example, an unextended implementation will produce an error when testing either "iiii"^^my:romanNumeral = "iv"^^my:romanNumeral or "iiii"^^my:romanNumeral != "iv"^^my:romanNumeral.

11.4.10bis sameTerm

Unlike RDFterm-equal, sameTerm can be used to test for non-equivalent typed literals with unsupported data types:

The test for boxes with the same weight may also be done with the '=' operator (RDFterm-equal) as the test for "100"^^t:kilos = "85"^^t:kilos will result in an error, eliminating that potential solution. In the same way that pointer comparisons are usually more efficient than value comparisons, sameTerm is likely to be more efficient than RDFterm-equal. @@is that worth mentioning?@@
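
The difference between the two operators on unsupported datatypes can be sketched in Python. Terms are modelled as tuples, the datatype IRI `http://example/t#kilos` is a made-up stand-in for t:kilos, and the function names are hypothetical; the sketch follows the definition of RDFterm-equal given in section 11.4.10 for an unextended implementation.

```python
KILOS = "http://example/t#kilos"  # hypothetical expansion of t:kilos

def literal(lexical, datatype):
    return ("literal", lexical, datatype)

def same_term(a, b):
    """True only when both arguments are the identical RDF term; never an error."""
    return a == b

def rdfterm_equal(a, b):
    if a == b:
        return True
    if a[0] == "literal" and b[0] == "literal":
        # Both literals, not the same term: equivalence cannot be decided.
        raise TypeError("cannot compare values of these literals")
    return False

heavy = literal("100", KILOS)
light = literal("85", KILOS)
print(same_term(heavy, light))
try:
    rdfterm_equal(heavy, light)
except TypeError as error:
    print("error:", error)
```

In a FILTER, the error from rdfterm_equal eliminates the candidate solution, whereas same_term yields a definite false.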

11.4.11 langMatches

Returns true if language-range (second argument) matches language-tag (first argument) per Tags for the Identification of Languages [RFC3066] section 2.5. RFC3066 @@reference rfc4647 (3066bis) term "well-formed" if the RFC is ready in time@@ defines a case-insensitive, hierarchical matching algorithm which operates on ISO-defined subtags for language and country codes, and user defined subtags. In SPARQL, a language-range of "*" matches any non-empty language-tag string.
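
A minimal Python sketch of this matching follows, assuming the RFC 3066 rule that a range matches a tag when it equals the tag or equals a prefix of the tag that is immediately followed by "-". The function name `lang_matches` is hypothetical.

```python
def lang_matches(language_tag, language_range):
    """Case-insensitive, hierarchical match of a language tag against a range."""
    if language_range == "*":
        return language_tag != ""   # "*" matches any non-empty tag
    tag = language_tag.lower()
    rng = language_range.lower()
    return tag == rng or tag.startswith(rng + "-")

print(lang_matches("en-US", "en"))
print(lang_matches("en", "en-US"))
print(lang_matches("", "*"))
```

The range "en" matches the tag "en-US", but not the other way round, and not the tag "eng".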

11.4.12 regex

Invokes the XPath fn:matches function to match text against a regular expression pattern. The regular expression language is defined in XQuery 1.0 and XPath 2.0 Functions and Operators section 7.6.1 Regular Expression Syntax [FUNCOP].

11.5 Constructor Functions

XPath defines only the casts from one XML Schema datatype to another. The remaining casts are defined as follows:

The table below summarizes the casting operations that are always allowed (Y), never allowed (N) and dependent on the lexical value (M). For example, a casting operation from an xsd:string (the first row) to an xsd:float (the second column) is dependent on the lexical value (M).

11.6 Extensible Value Testing

A PrimaryExpression (see the grammar, production [56]) can be a call to an extension function named by an IRI. An extension function takes some number of RDF terms as arguments and returns an RDF term. The semantics of these functions are identified by the IRI that identifies the function.

SPARQL queries using extension functions are likely to have limited interoperability.

For a second example, consider a function aGeo:distance that calculates the distance between two points, which is used here to find the places near Grenoble:

An extension function might be used to test some application datatype not supported by the core SPARQL specification, or it might be a transformation between datatype formats, for example into an XSD dateTime RDF term from another date format.

A. SPARQL Grammar

A.1 SPARQL Query String

A SPARQL query string is a Unicode character string (cf. section 6.1 String concepts of [CHARMOD]) in the language defined by the following grammar, starting with the Query production. For compatibility with future versions of Unicode, the characters in this string may include Unicode codepoints that are unassigned as of the date of this publication (see Identifier and Pattern Syntax [UNIID] section 4 Pattern Syntax). For productions with excluded character classes (for example [^<>'{}|^`]), the characters are excluded from the range #x0 - #x10FFFF.

A.2 Codepoint Escape Sequences

A SPARQL Query String is processed for codepoint escape sequences before parsing by the grammar defined in EBNF below. The codepoint escape sequences for a SPARQL query string are:

Codepoint escape sequences can appear anywhere in the query string. They are processed before parsing based on the grammar rules and so may be replaced by codepoints with significance in the grammar, such as ":" marking a prefixed name.

These escape sequences are not included in the grammar below. Only escape sequences for characters that would be legal at that point in the grammar may be given. For example, the variable "?x\u0020y" is not legal (\u0020 is a space and is not permitted in a variable name).
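
The pre-parsing step can be sketched in Python as a single substitution pass over the query string. The function name `unescape_codepoints` is hypothetical, and the sketch does not check whether the resulting character is legal at that point in the grammar.

```python
import re

# '\u' followed by 4 hex digits, or '\U' followed by 8 hex digits.
_CODEPOINT_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")

def unescape_codepoints(query):
    """Replace codepoint escape sequences before the query is parsed."""
    def replace(match):
        hex_digits = match.group(1) or match.group(2)
        return chr(int(hex_digits, 16))  # raises ValueError above U+10FFFF
    return _CODEPOINT_ESCAPE.sub(replace, query)

print(unescape_codepoints(r"SELECT ?x WHERE { ?x ex\u003Aname ?n }"))
print(unescape_codepoints(r"<http://example/caf\u00E9>"))
```

In the first query the escape is replaced by ":", which the parser then treats as the separator of a prefixed name.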

A.3 White Space

White space (production WS) is used to separate two terminals which would otherwise be (mis-)recognized as one terminal. Rule names below in capitals indicate where whitespace is significant; these form a possible choice of terminals for constructing a SPARQL parser. White space is significant in strings.

A.4 Comments

Comments in SPARQL queries take the form of '#', outside an IRI or string, and continue to the end of line (marked by characters 0x0D or 0x0A) or end of file if there is no end of line after the comment marker. Comments are treated as white space.

A.5 IRI References

Text matched by the Q_IRI_REF production and QName (after prefix expansion) production, after escape processing, must conform to the generic syntax of IRI references in section 2.2 of RFC 3987 "ABNF for IRI References and IRIs" [RFC3987]. For example, the Q_IRI_REF <abc#def> may occur in a SPARQL query string, but the Q_IRI_REF <abc##def> must not.

Base IRIs declared with the BASE keyword must be absolute IRIs. A prefix declared with the PREFIX keyword may not be re-declared in the same query. See section 2.1.1, Syntax of IRI Terms, for a description of BASE and PREFIX.

A.6 Escape sequences in strings

Any escaped character in the range #x0 - #x10FFFF may appear in any string production. For instance, "\n" may appear in a STRING_LITERAL1 even though the unescaped form is not valid in that production.

A.7 Grammar

The EBNF notation used in the grammar is defined in Extensible Markup Language (XML) 1.1 [XML11] section 6 Notation.

Keywords are matched in a case-insensitive manner with the exception of the keyword 'a' which, in line with Turtle and N3, is used in place of the IRI rdf:type (in full, http://www.w3.org/1999/02/22-rdf-syntax-ns#type).

`[1]`	`Query`	::=	`Prolog ( SelectQuery \| ConstructQuery \| DescribeQuery \| AskQuery )`
`[2]`	`Prolog`	::=	`BaseDecl? PrefixDecl*`
`[3]`	`BaseDecl`	::=	`'BASE' Q_IRI_REF`
`[4]`	`PrefixDecl`	::=	`'PREFIX' QNAME_NS Q_IRI_REF`
`[5]`	`SelectQuery`	::=	`'SELECT' 'DISTINCT'? ( Var+ \| '*' ) DatasetClause* WhereClause SolutionModifier`
`[6]`	`ConstructQuery`	::=	`'CONSTRUCT' ConstructTemplate DatasetClause* WhereClause SolutionModifier`
`[7]`	`DescribeQuery`	::=	`'DESCRIBE' ( VarOrIRIref+ \| '*' ) DatasetClause* WhereClause? SolutionModifier`
`[8]`	`AskQuery`	::=	`'ASK' DatasetClause* WhereClause`
`[9]`	`DatasetClause`	::=	`'FROM' ( DefaultGraphClause \| NamedGraphClause )`
`[10]`	`DefaultGraphClause`	::=	`SourceSelector`
`[11]`	`NamedGraphClause`	::=	`'NAMED' SourceSelector`
`[12]`	`SourceSelector`	::=	`IRIref`
`[13]`	`WhereClause`	::=	`'WHERE'? GroupGraphPattern`
`[14]`	`SolutionModifier`	::=	`OrderClause? LimitOffsetClauses?`
`[15]`	`LimitOffsetClauses`	::=	`( LimitClause OffsetClause? \| OffsetClause LimitClause? )`
`[16]`	`OrderClause`	::=	`'ORDER' 'BY' OrderCondition+`
`[17]`	`OrderCondition`	::=	`( ( 'ASC' \| 'DESC' ) BrackettedExpression ) \| ( FunctionCall \| BuiltInCall \| Var \| BrackettedExpression )`
`[18]`	`LimitClause`	::=	`'LIMIT' INTEGER`
`[19]`	`OffsetClause`	::=	`'OFFSET' INTEGER`
`[20]`	`GroupGraphPattern`	::=	`'{' GroupElement '}'`
`[21]`	`GroupElement`	::=	`GraphPattern ( OptionalGraphPattern '.'? GroupElement )?`
`[22]`	`GraphPattern`	::=	`BasicGraphPattern? ( GraphPatternNotTriples '.'? GraphPattern )?`
`[23]`	`BasicGraphPattern`	::=	`BlockOfTriples`
`[24]`	`BlockOfTriples`	::=	`TriplesSameSubject ( '.' TriplesSameSubject? )*`
`[25]`	`GraphPatternNotTriples`	::=	`GroupOrUnionGraphPattern \| GraphGraphPattern \| Constraint`
`[26]`	`OptionalGraphPattern`	::=	`'OPTIONAL' GroupGraphPattern`
`[27]`	`GraphGraphPattern`	::=	`'GRAPH' VarOrIRIref GroupGraphPattern`
`[28]`	`GroupOrUnionGraphPattern`	::=	`GroupGraphPattern ( 'UNION' GroupGraphPattern )*`
`[29]`	`Constraint`	::=	`'FILTER' ( BrackettedExpression \| BuiltInCall \| FunctionCall )`
`[30]`	`FunctionCall`	::=	`IRIref ArgList`
`[31]`	`ArgList`	::=	`( NIL \| '(' Expression ( ',' Expression )* ')' )`
`[32]`	`ConstructTemplate`	::=	`'{' ConstructTriples '}'`
`[33]`	`ConstructTriples`	::=	`( TriplesSameSubject ( '.' ConstructTriples )? )?`
`[34]`	`TriplesSameSubject`	::=	`VarOrTerm PropertyListNotEmpty \| TriplesNode PropertyList`
`[35]`	`PropertyList`	::=	`PropertyListNotEmpty?`
`[36]`	`PropertyListNotEmpty`	::=	`Verb ObjectList ( ';' PropertyList )?`
`[37]`	`ObjectList`	::=	`GraphNode ( ',' ObjectList )?`
`[38]`	`Verb`	::=	`VarOrIRIref \| 'a'`
`[39]`	`TriplesNode`	::=	`Collection \| BlankNodePropertyList`
`[40]`	`BlankNodePropertyList`	::=	`'[' PropertyListNotEmpty ']'`
`[41]`	`Collection`	::=	`'(' GraphNode+ ')'`
`[42]`	`GraphNode`	::=	`VarOrTerm \| TriplesNode`
`[43]`	`VarOrTerm`	::=	`Var \| GraphTerm`
`[44]`	`VarOrIRIref`	::=	`Var \| IRIref`
`[45]`	`Var`	::=	`VAR1 \| VAR2`
`[46]`	`GraphTerm`	::=	`IRIref \| RDFLiteral \| ( '-' \| '+' )? NumericLiteral \| BooleanLiteral \| BlankNode \| NIL`
`[47]`	`Expression`	::=	`ConditionalOrExpression`
`[48]`	`ConditionalOrExpression`	::=	`ConditionalAndExpression ( '\|\|' ConditionalAndExpression )*`
`[49]`	`ConditionalAndExpression`	::=	`ValueLogical ( '&&' ValueLogical )*`
`[50]`	`ValueLogical`	::=	`RelationalExpression`
`[51]`	`RelationalExpression`	::=	`NumericExpression ( '=' NumericExpression \| '!=' NumericExpression \| '<' NumericExpression \| '>' NumericExpression \| '<=' NumericExpression \| '>=' NumericExpression )?`
`[52]`	`NumericExpression`	::=	`AdditiveExpression`
`[53]`	`AdditiveExpression`	::=	`MultiplicativeExpression ( '+' MultiplicativeExpression \| '-' MultiplicativeExpression )*`
`[54]`	`MultiplicativeExpression`	::=	`UnaryExpression ( '*' UnaryExpression \| '/' UnaryExpression )*`
`[55]`	`UnaryExpression`	::=	`'!' PrimaryExpression \| '+' PrimaryExpression \| '-' PrimaryExpression \| PrimaryExpression`
`[56]`	`PrimaryExpression`	::=	`BrackettedExpression \| BuiltInCall \| IRIrefOrFunction \| RDFLiteral \| NumericLiteral \| BooleanLiteral \| Var`
`[57]`	`BrackettedExpression`	::=	`'(' Expression ')'`
`[58]`	`BuiltInCall`	::=	`'STR' '(' Expression ')' \| 'LANG' '(' Expression ')' \| 'LANGMATCHES' '(' Expression ',' Expression ')' \| 'DATATYPE' '(' Expression ')' \| 'BOUND' '(' Var ')' \| 'sameTerm' '(' Expression ',' Expression ')' \| 'isIRI' '(' Expression ')' \| 'isURI' '(' Expression ')' \| 'isBLANK' '(' Expression ')' \| 'isLITERAL' '(' Expression ')' \| RegexExpression`
`[59]`	`RegexExpression`	::=	`'REGEX' '(' Expression ',' Expression ( ',' Expression )? ')'`
`[60]`	`IRIrefOrFunction`	::=	`IRIref ArgList?`
`[61]`	`RDFLiteral`	::=	`String ( LANGTAG \| ( '^^' IRIref ) )?`
`[62]`	`NumericLiteral`	::=	`INTEGER \| DECIMAL \| DOUBLE`
`[63]`	`BooleanLiteral`	::=	`'true' \| 'false'`
`[64]`	`String`	::=	`STRING_LITERAL1 \| STRING_LITERAL2 \| STRING_LITERAL_LONG1 \| STRING_LITERAL_LONG2`
`[65]`	`IRIref`	::=	`Q_IRI_REF \| QName`
`[66]`	`QName`	::=	`QNAME \| QNAME_NS`
`[67]`	`BlankNode`	::=	`BLANK_NODE_LABEL \| ANON`
`[68]`	`Q_IRI_REF`	::=	'<' ([^<>'{}\|^`]-[#x00-#x20])* '>'
`[69]`	`QNAME_NS`	::=	`NCNAME_PREFIX? ':'`
`[70]`	`QNAME`	::=	`NCNAME_PREFIX? ':' NCNAME?`
`[71]`	`BLANK_NODE_LABEL`	::=	`'_:' NCNAME`
`[72]`	`VAR1`	::=	`'?' VARNAME`
`[73]`	`VAR2`	::=	`'$' VARNAME`
`[74]`	`LANGTAG`	::=	`'@' [a-zA-Z]+ ('-' [a-zA-Z0-9]+)*`
`[75]`	`INTEGER`	::=	`[0-9]+`
`[76]`	`DECIMAL`	::=	`[0-9]+ '.' [0-9]* \| '.' [0-9]+`
`[77]`	`DOUBLE`	::=	`[0-9]+ '.' [0-9]* EXPONENT \| '.' ([0-9])+ EXPONENT \| ([0-9])+ EXPONENT`
`[78]`	`EXPONENT`	::=	`[eE] [+-]? [0-9]+`
`[79]`	`STRING_LITERAL1`	::=	`"'" ( ([^#x27#x5C#xA#xD]) \| ECHAR )* "'"`
`[80]`	`STRING_LITERAL2`	::=	`'"' ( ([^#x22#x5C#xA#xD]) \| ECHAR )* '"'`
`[81]`	`STRING_LITERAL_LONG1`	::=	`"'''" ( ( "'" \| "''" )? ( [^'\] \| ECHAR ) )* "'''"`
`[82]`	`STRING_LITERAL_LONG2`	::=	`'"""' ( ( '"' \| '""' )? ( [^"\] \| ECHAR ) )* '"""'`
`[83]`	`ECHAR`	::=	`'\' [tbnrf\"']`
`[84]`	`HEX`	::=	`[0-9] \| [A-F] \| [a-f]`
`[85]`	`NIL`	::=	`'(' WS* ')'`
`[86]`	`WS`	::=	`#x20 \| #x9 \| #xD \| #xA`
`[87]`	`ANON`	::=	`'[' WS* ']'`
`[88]`	`NCCHAR1P`	::=	`[A-Z] \| [a-z] \| [#x00C0-#x00D6] \| [#x00D8-#x00F6] \| [#x00F8-#x02FF] \| [#x0370-#x037D] \| [#x037F-#x1FFF] \| [#x200C-#x200D] \| [#x2070-#x218F] \| [#x2C00-#x2FEF] \| [#x3001-#xD7FF] \| [#xF900-#xFDCF] \| [#xFDF0-#xFFFD] \| [#x10000-#xEFFFF]`
`[89]`	`NCCHAR1`	::=	`NCCHAR1P \| '_'`
`[90]`	`VARNAME`	::=	`( NCCHAR1 \| [0-9] ) ( NCCHAR1 \| [0-9] \| #x00B7 \| [#x0300-#x036F] \| [#x203F-#x2040] )*`
`[91]`	`NCCHAR`	::=	`NCCHAR1 \| '-' \| [0-9] \| #x00B7 \| [#x0300-#x036F] \| [#x203F-#x2040]`
`[92]`	`NCNAME_PREFIX`	::=	`NCCHAR1P ((NCCHAR\|'.')* NCCHAR)?`
`[93]`	`NCNAME`	::=	`NCCHAR1 ((NCCHAR\|'.')* NCCHAR)?`

The SPARQL grammar is LL(1) when the rules with uppercased names are used as terminals.
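
As an illustration of treating the uppercased rules as terminals, the numeric productions [75] to [78] translate directly into regular expressions. The Python sketch below is one possible rendering; the function name `classify_numeric` is hypothetical.

```python
import re

EXPONENT = r"[eE][+-]?[0-9]+"                      # production [78]
TERMINALS = [
    ("DOUBLE", rf"[0-9]+\.[0-9]*{EXPONENT}|\.[0-9]+{EXPONENT}|[0-9]+{EXPONENT}"),
    ("DECIMAL", r"[0-9]+\.[0-9]*|\.[0-9]+"),
    ("INTEGER", r"[0-9]+"),
]

def classify_numeric(text):
    """Return the name of the numeric terminal matching the whole text, or None."""
    for name, pattern in TERMINALS:
        if re.fullmatch(pattern, text):
            return name
    return None

for sample in ("42", "3.", ".5", "1e10", "1.0E-3", "-1"):
    print(sample, classify_numeric(sample))
```

A leading sign is not part of these terminals; it is handled by the GraphTerm and UnaryExpression productions, so "-1" is not classified here.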

B. Conformance

This specification is intended for use in conjunction with the SPARQL Protocol [SPROT] and the SPARQL Query Results XML Format [RESULTS]. See those specifications for their conformance criteria.

Note that the SPARQL protocol describes an abstract interface as well as a network protocol, and the abstract interface may apply to APIs as well as network interfaces.

C. Security Considerations

SPARQL queries using FROM, FROM NAMED, or GRAPH may cause the specified URI to be dereferenced. This may cause additional use of network, disk or CPU resources along with associated secondary issues such as denial of service. The security issues of Uniform Resource Identifier (URI): Generic Syntax [RFC3986] Section 7 should be considered. In addition, the contents of file: URIs can in some cases be accessed, processed and returned as results, providing unintended access to local resources.

The SPARQL language permits extensions, which will have their own security implications.

Multiple IRIs may have the same appearance. Characters in different scripts may look similar (a Cyrillic "о" may appear similar to a Latin "o"). A character followed by combining characters may have the same visual representation as another character (LATIN SMALL LETTER E followed by COMBINING ACUTE ACCENT has the same visual representation as LATIN SMALL LETTER E WITH ACUTE). Users of SPARQL must take care to construct queries with IRIs that match the IRIs in the data. Further information about matching of similar characters can be found in Unicode Security Considerations [UNISEC] and Internationalized Resource Identifiers (IRIs) [RFC3987] Section 8.
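
The following Python sketch illustrates the point using the standard unicodedata module. The IRIs are made-up examples; the sketch only shows that visually identical strings can be different code point sequences, and does not suggest that SPARQL processors normalize IRIs.

```python
import unicodedata

composed = "http://example/caf\u00e9"      # LATIN SMALL LETTER E WITH ACUTE
decomposed = "http://example/cafe\u0301"   # "e" followed by COMBINING ACUTE ACCENT

# The two strings look alike but are different code point sequences.
print(composed == decomposed)
print(unicodedata.normalize("NFC", decomposed) == composed)

# Characters from different scripts stay distinct even after normalization.
cyrillic_o = "\u043e"
latin_o = "o"
print(unicodedata.normalize("NFKC", cyrillic_o) == latin_o)
```

A query that uses one form of the IRI will not match data that uses the other, since IRIs are compared as written.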

D. Collected Formal Definitions

E. Internet Media Type, File Extension and Macintosh File Type (Normative)

The Internet Media Type / MIME Type for the SPARQL Query Language is "application/sparql-query".

It is recommended that SPARQL query files have the extension ".rq" (all lowercase) on all platforms.

It is recommended that SPARQL query files stored on Macintosh HFS file systems be given a file type of "TEXT".

The information that follows is intended to be submitted to the IESG for review, approval, and registration with IANA.

G. Acknowledgements

In addition, we have had comments and discussions with many people through the working group comments list. All comments go to making a better document. Andy would also like to particularly thank Geoff Chappell, Bob MacGregor, Yosi Scharf and Richard Newman for exploring specific issues related to SPARQL. Eric would like to acknowledge the invaluable help of Björn Höhrmann.

Operator	Type(A)	Function	Result type
XQuery Unary Operators
! A	xsd:boolean	fn:not(A)	xsd:boolean
+ A	numeric	op:numeric-unary-plus(A)	numeric
- A	numeric	op:numeric-unary-minus(A)	numeric
SPARQL Tests, defined in section 11.4
BOUND(A)	variable	bound(A)	xsd:boolean
isIRI(A) isURI(A)	RDF term	isIRI(A)	xsd:boolean
isBLANK(A)	RDF term	isBlank(A)	xsd:boolean
isLITERAL(A)	RDF term	isLiteral(A)	xsd:boolean
SPARQL Accessors
STR(A)	literal	str(A)	simple literal
STR(A)	IRI	str(A)	simple literal
LANG(A)	literal	lang(A)	simple literal
DATATYPE(A)	RDF term	datatype(A)	IRI

Operator	Type(A)	Type(B)	Type(C)	Function	Result type
SPARQL Tests, defined in section 11.4
REGEX(STRING, PATTERN, FLAGS)	simple literal	simple literal	simple literal	fn:matches(STRING, PATTERN, FLAGS)	xsd:boolean

name1	name2
"Alice"	"Ms A."
"Ms A."	"Alice"

name1	name2
"Alice"	"Ms A."
"Ms A."	"Alice"

aLabel	bLabel
"Container 1"	"Container 2"
"Container 2"	"Container 1"

Operator	Type(A)	Type(B)	Function	Result type
Logical Connectives, defined in section 11.4
A \|\| B	xsd:boolean	xsd:boolean	logical-or(A, B)	xsd:boolean
A && B	xsd:boolean	xsd:boolean	logical-and(A, B)	xsd:boolean
XPath Tests
A = B	numeric	numeric	op:numeric-equal(A, B)	xsd:boolean
A = B	simple literal	simple literal	op:numeric-equal(fn:compare(A, B), 0)	xsd:boolean
A = B	xsd:string	xsd:string	op:numeric-equal(fn:compare(STR(A), STR(B)), 0)	xsd:boolean
A = B	xsd:boolean	xsd:boolean	op:boolean-equal(A, B)	xsd:boolean
A = B	xsd:dateTime	xsd:dateTime	op:dateTime-equal(A, B)	xsd:boolean
A != B	numeric	numeric	fn:not(op:numeric-equal(A, B))	xsd:boolean
A != B	simple literal	simple literal	fn:not(op:numeric-equal(fn:compare(A, B), 0))	xsd:boolean
A != B	xsd:string	xsd:string	fn:not(op:numeric-equal(fn:compare(STR(A), STR(B)), 0))	xsd:boolean
A != B	xsd:boolean	xsd:boolean	fn:not(op:boolean-equal(A, B))	xsd:boolean
A != B	xsd:dateTime	xsd:dateTime	fn:not(op:dateTime-equal(A, B))	xsd:boolean
A < B	numeric	numeric	op:numeric-less-than(A, B)	xsd:boolean
A < B	simple literal	simple literal	op:numeric-equal(fn:compare(A, B), -1)	xsd:boolean
A < B	xsd:string	xsd:string	op:numeric-equal(fn:compare(STR(A), STR(B)), -1)	xsd:boolean
A < B	xsd:boolean	xsd:boolean	op:boolean-less-than(A, B)	xsd:boolean
A < B	xsd:dateTime	xsd:dateTime	op:dateTime-less-than(A, B)	xsd:boolean
A > B	numeric	numeric	op:numeric-greater-than(A, B)	xsd:boolean
A > B	simple literal	simple literal	op:numeric-equal(fn:compare(A, B), 1)	xsd:boolean
A > B	xsd:string	xsd:string	op:numeric-equal(fn:compare(STR(A), STR(B)), 1)	xsd:boolean
A > B	xsd:boolean	xsd:boolean	op:boolean-greater-than(A, B)	xsd:boolean
A > B	xsd:dateTime	xsd:dateTime	op:dateTime-greater-than(A, B)	xsd:boolean
A <= B	numeric	numeric	logical-or(op:numeric-less-than(A, B), op:numeric-equal(A, B))	xsd:boolean
A <= B	simple literal	simple literal	fn:not(op:numeric-equal(fn:compare(A, B), 1))	xsd:boolean
A <= B	xsd:string	xsd:string	fn:not(op:numeric-equal(fn:compare(STR(A), STR(B)), 1))	xsd:boolean
A <= B	xsd:boolean	xsd:boolean	fn:not(op:boolean-greater-than(A, B))	xsd:boolean
A <= B	xsd:dateTime	xsd:dateTime	fn:not(op:dateTime-greater-than(A, B))	xsd:boolean
A >= B	numeric	numeric	logical-or(op:numeric-greater-than(A, B), op:numeric-equal(A, B))	xsd:boolean
A >= B	simple literal	simple literal	fn:not(op:numeric-equal(fn:compare(A, B), -1))	xsd:boolean
A >= B	xsd:string	xsd:string	fn:not(op:numeric-equal(fn:compare(STR(A), STR(B)), -1))	xsd:boolean
A >= B	xsd:boolean	xsd:boolean	fn:not(op:boolean-less-than(A, B))	xsd:boolean
A >= B	xsd:dateTime	xsd:dateTime	fn:not(op:dateTime-less-than(A, B))	xsd:boolean
XPath Arithmetic
A * B	numeric	numeric	op:numeric-multiply(A, B)	numeric
A / B	numeric	numeric	op:numeric-divide(A, B)	numeric; but xsd:decimal if both operands are xsd:integer
A + B	numeric	numeric	op:numeric-add(A, B)	numeric
A - B	numeric	numeric	op:numeric-subtract(A, B)	numeric
SPARQL Tests, defined in section 11.4
A = B	RDF term	RDF term	RDFterm-equal(A, B)	xsd:boolean
A != B	RDF term	RDF term	fn:not(RDFterm-equal(A, B))	xsd:boolean
sameTerm(A, B)	RDF term	RDF term	sameTerm(A, B)	xsd:boolean
langMATCHES(A, B)	simple literal	simple literal	langMatches(A, B)	xsd:boolean
REGEX(STRING, PATTERN)	simple literal	simple literal	fn:matches(STRING, PATTERN)	xsd:boolean

From \ To	str	flt	dbl	dec	int	dT	bool
str	Y	M	M	M	M	M	M
flt	Y	Y	Y	M	M	N	Y
dbl	Y	Y	Y	M	M	N	Y
dec	Y	Y	Y	Y	Y	N	Y
int	Y	Y	Y	Y	Y	N	Y
dT	Y	N	N	N	N	Y	N
bool	Y	Y	Y	Y	Y	N	Y
IRI	Y	N	N	N	N	N	N
ltrl	Y	M	M	M	M	M	M

Escape	Unicode code point
'\u' HEX HEX HEX HEX	A Unicode code point in the range U+0 to U+FFFF inclusive corresponding to the encoded hexadecimal value.
'\U' HEX HEX HEX HEX HEX HEX HEX HEX	A Unicode code point in the range U+0 to U+10FFFF inclusive corresponding to the encoded hexadecimal value.

Escape	Unicode code point
'\t'	U+0009 (tab)
'\n'	U+000A (line feed)
'\r'	U+000D (carriage return)
'\b'	U+0008 (backspace)
'\f'	U+000C (form feed)
'\"'	U+0022 (quotation mark, double quote mark)
"\'"	U+0027 (apostrophe-quote, single quote mark)
'\\'	U+005C (backslash)