From Semantic Web Standards
Jump to: navigation, search

This is for discussion among folks. No decisions have been made. Don't rely on the numbers -- they'll change when people edit this page.

If you care, please add vote and comment lines. Votes: +1 this should be a requirement, +0 this should be considered desirable but not required, -0 this isn't even desirable, -1 this would make the system not usable for me. Syntax like:

sandro: +1 in product foo we've been using this, and need it

Proposed Requirements:

1 Syntax

1.1 Human Readable

Patches need to be fairly humanly readable, like Turtle or SPARQL, more than N-Quads or RDF/XML.

1.2 SPARQL Syntax

Patch syntax is a subset of SPARQL Update syntax.


 PREFIX foaf <> 
   <> foaf:mbox <>
   <> foaf:mbox <>

Probably only makes sense if the semantics are the same as SPARQL's, for the given expressions. Unclear how to provide metadata.

1.3 Dataset Syntax

Patches are expressed in RDF Datasets (quads). As such, they can be transmitted in TriG and stored in quad stores (perhaps with a little munging of graph names).

In TriG it might be serialized like:

[ a :BasicPatch;
  :startETag "fa65e4c5ad0e5f7a94337910847bd10f7af10c74";
  :delete _:d1;
  :insert _:i1 ].
GRAPH _:d1 {   
   <> foaf:mbox < >
GRAPH _:i1 {     
   <> foaf:mbox <>

1.4 Streaming

Patches can be processed in a single pass, using much less memory than it takes to store large patches.

1.5 Metadata

Syntax includes a way to include arbitrary metadata about patches.

1.6 End-State ID

Sender can say what the ending state ID is.

For example, Alice and Bob both know the exact graph which is the state of resource R, with etag "v1". Alice sense Bob a patch to change it to a new state. Does Alice get to tell Bob, as part of that patch, what she wants the new etag to be?

1.7 Packaging

A single patch document may contain multiple patches, which can fail or succeed independently, and are processed as separate transactions, if ACID properties are being provided.

1.8 Compact

Patches are smaller than just sending the triples to be deleted and the triples to be added (in some normal RDF syntax).

2 General

2.1 Universality

Given any two RDF Graphs, G1 and G2, it is possible construct a patch from G1 to G2

2.2 Tractability

All operations are tractable (that is, can be done in polynomial time and space with respect to the inputs).

This is not easy with RDF, because subgraph matching involves the NP Graph Isomorphism Problem. More intuitively for some people, a database WHERE clause can easily have a combinatorial explosion in the number of matches, so a patch involving WHERE matching might take a very long time to process.

Can be broken into:

  • It is possible to construct a tractable patch, for any pair of graphs
  • It is tractable to construct a tractable patch, for any pair of graphs
  • All patches are tractable, for any pair of graphs

2.3 Atomicity

Patches are "all or nothing". If the patch fails in some way, then no change is done.

Either all patches are atomic, or a patch can be flagged as requiring atomic processing.

2.4 Isolation

Patches appear to other systems to be applied all at once.

Either all patches are processes in isolation, or the client can ask that they are.

2.5 Durability

Once a patch has been confirmed as processed, the client can assume it will not be lost, eg due to power failure or software crash.

2.6 Rigidity

Patches may be expressed as working only against a given start graph.

If a rigid patch says some triples are to be deleted, and any of them are not present, the patch is rejected.

2.7 Blindness

Patches may be expressed as working against any graph, where the patch constructor doesn't exactly know the content of the graph. Might just be adding triples, regardless of the graph. Might use wildcards or variables.

If a blind patch says some triples are to be deleted, and they are not present, the patch process continues anyway.

2.8 Concurrency

Systems are allowed to handle multiple rigid patches to the same start graph via some kind of safe merging. For example, two patches that each add triples can both be handled.

(Concurrent blind patches are not an issue, since they don't specify the start graph.)

2.9 Turing Complete

Patches need to be able to make arbitrary, smart, blind changes by including code which runs on the server, using the state of the resource at the time of processing.

3 Functionality

3.1 Wildcards

Patches may include wildcard terms, like "delete { * :p * }" to delete all triples using predicate :p. Weaker/simpler than full variables.

3.2 Variables

Patches can be written with variables. Relates to All-Matches and Wildcards, but different. Tends to create combinatoric explosion, making processing intractable.

Variables might be explicit, or blank nodes might be treated as variables, or both.

Variables which occur only in inserted triples (not in delete or where) are probably to be interpreted as fresh blank nodes.

3.3 Repeated Matching

Patches can be written which operate against all matching triples, possibly with combinatoric explosion. Alternatively, patches might only process one match and then stop.

(This only matters if there are some kind of variables.)

3.4 Node Renaming

Patches may express that some node is to be replaced, wherever it occurs, with some other node.

3.5 List Operations

Well-formed RDF lists can be manipulated (insert & delete) by simple instructions in a patch.

3.6 String Operations

If there's a triple like { :s :p "This is a very, very long string." }, the string can be modified in a patch without sending all the new text of the string.

Something like replace_substring(:s, :p, 1, 3, "hat") would change the above triple to { :s :p "That is a very, very long string." } (assuming there were no other { :s :p ?x } triples in the store.

3.7 Node Renumbering

See near the word "range".