SparqlUpdateLanguage
The first version of SPARQL is designed to query RDF data; while update features have been postponed for a future version of the standard, let's share experimental implementation experience here.
Related Work
In addition to nearby topics such as DeltaView, DiffAndPatch, BeesAndAnts, RDFAccessProtocol, UpdatingRelationalDataViaSPARUL, a survey of related work includes the following (listed newest first):
- 2009-07-02: http://www.w3.org/TR/sparql-features/#sparql-update
- 2008-07: W3C submission on SPARQL/Update (10 submitting organisations). Team comment.
- 2007-11: SPARQL+ by BenjaminNowack
- 2007-07 (?): SPARQL Update in Algae by Eric Prud'Hommeaux (IRC Log).
- 2007-03: "SPARQL/Update: A language for updating RDF graphs" by Andy Seaborne and Geetha Manjunath; cited in a 9 Mar 2007 message to public-sparql-dev. This proposal has been updated several times since it was first published. It is commonly referred to as SPARUL. Virtuoso also supports SPARUL.
- 2007-02: Implementation of insert mechanisms using SPARQL and a modified sparql4j driver by BenjaminHorak
- 2006-11: straw proposal by MaxVoelkel and RichardCyganiak; see below.
- 2005-12: DAWG postpones update issue
- 2005-03: DAWG discussion 1 Mar 2005 in Boston
- 2004-??: RUL: A Declarative Update Language for RDF, M. Magiridou, S. Sahtouris, V. Christophides, and M. Koubarakis. Based on updating RDF Schema instances, not RDF graphs.
- 2004-03 Delta: an ontology for the distribution of differences between RDF graphs, Berners-Lee and Connolly 2004 (work in progress)
hmm... dbin has some diff/sync stuff, no?
and where is the IBM/Boca stuff? LinkMe
Update queries to RDF stores using sparql4j
I implemented a rudimentary update mechanism using CONSTRUCT queries and sparql4j. I developed a small driver that inserts the results of a CONSTRUCT query into an RDF store using rdf2go. Maybe I could implement a rewrite mechanism for the queries described in "SPARQL/Update: A language for updating RDF graphs" to get a small update query prototype.
Strawman Proposal
This is a strawman proposal for a SPARQL-based RDF update language. It is based on discussions between MaxVölkel and RichardCyganiak.
Big missing pieces: this deals only with updates to the default graph; there is no account of how to deal with blank nodes when removing triples; and there is no account of what to do in the presence of inferred triples.
The WHERE keyword is still a full query pattern which can include OPTIONAL and FILTER, etc., like in traditional SPARQL. The ADD and REMOVE keywords take a graph template similar to CONSTRUCT. The template is optional in cases where a graph is provided as an input or generated by a DESCRIBE clause in the query.
In operations that both add and delete statements, deletion always happens first, no matter which order the ADD and REMOVE keywords appear in. Maybe we should require the REMOVE to be written first.
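The remove-before-add rule can be illustrated with a minimal sketch. It assumes a graph is modelled as a plain Python set of (subject, predicate, object) tuples; `apply_update` is a hypothetical helper introduced here for illustration, not part of the proposal.

```python
# Minimal sketch: a graph is a set of (subject, predicate, object) tuples.
# apply_update is a hypothetical helper, not part of the strawman proposal.

def apply_update(graph, add=(), remove=()):
    """Return a new graph with `remove` applied before `add`."""
    result = set(graph)
    result -= set(remove)  # deletion always happens first
    result |= set(add)     # additions are applied afterwards
    return result

triple = (":person123", "foaf:mbox", "<mailto:max@xam.de>")
graph = {triple}

# The same triple is listed for both ADD and REMOVE. Because removal runs
# first, the triple is present afterwards, whatever order the caller used.
updated = apply_update(graph, add=[triple], remove=[triple])
```

If addition ran first instead, the same request would leave the graph without the triple, which is why the order has to be fixed by the language rather than by keyword position.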
No-input operations
Add a fixed graph:
ADD { [] a foaf:Person; foaf:name "Max"; foaf:mbox <mailto:max@xam.de> }
Delete a fixed graph:
REMOVE { :person123 foaf:mbox <mailto:max@xam.de> }
Atomic add and delete:
ADD { :person123 foaf:mbox_sha1sum "9dec5f368c386776b648332359a4b9ec0156471d" } REMOVE { :person123 foaf:mbox <mailto:max@xam.de> }
Find bindings matching a query pattern, then delete based on a graph template:
REMOVE { ?s ?p ?o } WHERE { ?s foaf:name "Max" }
Do atomic find/add/delete based on a query pattern and two graph templates:
ADD { ?x foaf:name "Max Völkel" } REMOVE { ?x foaf:name [] } WHERE { ?x foaf:mbox <mailto:max@xam.de> }
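As a rough sketch of how the find/add/delete example above could be evaluated, the following code matches a single triple pattern, instantiates both templates, and applies removals before additions. Several things here are assumptions made for illustration: the graph is a set of string tuples, the WHERE clause is limited to one triple pattern, and `[]` in the REMOVE template is treated as a wildcard (the proposal leaves blank node handling open). The names `match`, `instantiate`, `covers` and `find_add_delete` are hypothetical.

```python
WILDCARD = "[]"  # assumption: a blank node in a REMOVE template matches any term

def match(pattern, graph):
    """Yield a bindings dict for each triple matching one triple pattern."""
    for triple in graph:
        bindings = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable must bind to the same value everywhere it occurs.
                if bindings.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield bindings

def instantiate(template, bindings):
    """Replace variables in a template triple with their bound values."""
    return tuple(bindings.get(term, term) for term in template)

def covers(selector, triple):
    """True if the instantiated REMOVE template selects this triple."""
    return all(t == WILDCARD or t == v for t, v in zip(selector, triple))

def find_add_delete(graph, where, add, remove):
    """Collect all changes against the original graph, then remove before adding."""
    to_add, to_remove = set(), set()
    for bindings in match(where, graph):
        selector = instantiate(remove, bindings)
        to_remove |= {t for t in graph if covers(selector, t)}
        to_add.add(instantiate(add, bindings))
    return (set(graph) - to_remove) | to_add

graph = {
    (":p", "foaf:mbox", "<mailto:max@xam.de>"),
    (":p", "foaf:name", "Max"),
    (":q", "foaf:name", "Max"),
}
result = find_add_delete(
    graph,
    where=("?x", "foaf:mbox", "<mailto:max@xam.de>"),
    add=("?x", "foaf:name", "Max Völkel"),
    remove=("?x", "foaf:name", WILDCARD),
)
```

Only `:p` has the matching mailbox, so only its old name is replaced; the name of `:q` is left alone.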
Delete based on DESCRIBE. It's up to the server to decide which triples “belong” to the resource.
REMOVE DESCRIBE :person123
Delete based on DESCRIBE for every resource matched by a query pattern. Again, it's up to the server to decide which triples “belong” to each resource.
REMOVE DESCRIBE ?x WHERE { ?x foaf:name "Max" }
Graph input operations
These operations take one or two input graphs. In the HTTP bindings, the input would be POSTed. For two-graph operations, a MIME multipart body (multipart/*) would be used.
Adds an input graph:
ADD
Deletes an input graph:
REMOVE
Atomic delete and add of two input graphs:
UPDATE
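A two-graph UPDATE body could be assembled roughly as follows, using Python's standard `email` package. This is only a sketch under stated assumptions: the proposal does not say which part carries which graph, so the order (graph to remove first, graph to add second), the `text/turtle` media type, the `multipart/mixed` subtype and the part filenames are all choices made here for illustration. `build_update_body` is a hypothetical helper.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

def build_update_body(remove_turtle, add_turtle):
    """Build a multipart message with two graph parts.

    Assumed layout (not defined by the proposal): the first part is the
    graph to remove, the second part is the graph to add.
    """
    body = MIMEMultipart("mixed")
    for name, turtle in (("remove", remove_turtle), ("add", add_turtle)):
        part = MIMEText(turtle, "turtle", "utf-8")  # Content-Type: text/turtle
        part.add_header("Content-Disposition", "attachment",
                        filename=name + ".ttl")
        body.attach(part)
    return body

remove_graph = ':person123 foaf:mbox <mailto:max@xam.de> .'
add_graph = ':person123 foaf:nick "max" .'
body = build_update_body(remove_graph, add_graph)
parts = body.get_payload()  # list of the two attached parts, in order
```

Calling `body.as_string()` yields the serialized message, including the boundary that a server would need in order to split the two graphs apart again.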
Result set operations
These operations take a result set as an input. I'm not sure if any of these are useful. Probably not.
Create several persons based on data from the input bindings:
ADD { [] foaf:name ?name; a foaf:Person; foaf:mbox ?email }
Delete several statements based on data from input bindings:
REMOVE { ?x foaf:nick [] }
Delete several persons based on data from input bindings. The server decides which triples “belong” to each person:
REMOVE DESCRIBE ?x
Update the names of several persons based on input bindings. The input bindings contain the variables ?x, ?oldName, ?newName:
ADD { ?x foaf:name ?newName } REMOVE { ?x foaf:name ?oldName }
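To make the role of the input bindings concrete, here is a small sketch of the rename operation driven by a result set. It assumes the result set arrives as a list of dicts keyed by variable name and that the graph is a set of string tuples; `instantiate` and `apply_bindings` are hypothetical names used only for this example.

```python
def instantiate(template, row):
    """Replace each variable in a template triple with its value from one row."""
    return tuple(row.get(term, term) for term in template)

def apply_bindings(graph, rows, add_template, remove_template):
    """Instantiate both templates once per input row, then remove before adding."""
    removals = {instantiate(remove_template, row) for row in rows}
    additions = {instantiate(add_template, row) for row in rows}
    return (set(graph) - removals) | additions

graph = {
    (":a", "foaf:name", "Max"),
    (":b", "foaf:name", "Rich"),
}
# Input bindings, one dict per result row (the shape is an assumption).
rows = [
    {"?x": ":a", "?oldName": "Max", "?newName": "Max Völkel"},
    {"?x": ":b", "?oldName": "Rich", "?newName": "Richard Cyganiak"},
]
renamed = apply_bindings(
    graph,
    rows,
    add_template=("?x", "foaf:name", "?newName"),
    remove_template=("?x", "foaf:name", "?oldName"),
)
```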
Q & A
Q: Why not just use HTTP?
There are many things that HTTP alone can't do, like atomic updates to large graphs (see the ADD {...} REMOVE {...} example), or updates based on a query result (see the REMOVE {...} WHERE example).
That said, SPARQL has HTTP bindings. The update language must of course work over HTTP too. The graph-input operations in particular would work well with that interface: POST to ...?query=ADD to add a subgraph; POST to ...?query=REMOVE to remove a subgraph.
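A graph-input request of this shape might be built as in the sketch below, which uses only the standard library and constructs the request without sending it. The endpoint URL is a made-up placeholder, and the `text/turtle` content type and the helper name `graph_input_request` are assumptions for illustration; the proposal does not fix any of them.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint; the proposal leaves the actual URL unspecified.
ENDPOINT = "http://example.org/sparql"

def graph_input_request(operation, turtle):
    """Build (but do not send) a POST carrying a graph for ADD or REMOVE."""
    url = ENDPOINT + "?" + urlencode({"query": operation})
    return Request(
        url,
        data=turtle.encode("utf-8"),              # the input graph is the body
        headers={"Content-Type": "text/turtle"},  # assumed serialization
        method="POST",
    )

subgraph = ':person123 foaf:mbox <mailto:max@xam.de> .'
add_request = graph_input_request("ADD", subgraph)
remove_request = graph_input_request("REMOVE", subgraph)
```

Passing either request to `urllib.request.urlopen` would send it; that step is left out because no real endpoint is assumed.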
REST-style HTTP operations could play a bigger role in operations for adding, updating, and removing entire named graphs.
Criticism
Don't do protocol "bindings"
The current notion of SPARQL protocol bindings is broken. It just happens to (sort of) work out because SPARQL is read-only and so can be mapped into URIs (and therefore GET) without breaking too many principles of Web architecture. That's not the case for updates. Please try to avoid making the same mistakes as Web services; application protocols were not made to be "bound onto", because doing so requires masking most of their value.
See also: Mark Baker: The trouble with “binding” and discussion
You should use other HTTP verbs, not PUT and POST
While PUT and POST are generally useful, neither REST nor the Web architecture precludes the use of other methods which might better facilitate "atomic updates to large graphs". I expect that HTTP PATCH would be quite useful for those kinds of updates.
However, PATCH was never really implemented and was dropped when RFC 2616 replaced RFC 2068. Proposals that rely on new HTTP verbs are usually not popular; see URIQA.
Atomicity doesn't require a single message
This approach to atomicity may be overloading the HTTP verbs unnecessarily; a reliable ADD+REMOVE should be doable without combining them in a single message. See also: HTTPLR