There is currently no specification for how a client can determine which patch classes a server supports. The HTTP mechanism only works on content-type, so it doesn't apply when all patches are expressed as RDF datasets. Options include using a profile argument on the content-type, or using a predicate like { ?resource :acceptsPATCH ?patchClass }
Some possible ideas for extensions:
* Allow wildcard deletes, so that in some cases triples can be deleted without enumerating them * Allows some code (JavaScript or JVM) to be included which programmatically adjusts the patch, so that arbitrary change can be made server-side. ## Blank Nodes and Performance The RDF model uses (subject, property, value) triples, where the subject and any non-literal values are optionally identified by an IRI. If an IRI is not supplied, the terms are called "blank nodes". In supplying data, decisions must be made about when to assign IRIs instead of leaving nodes blank. With patch, this decision has particular significance since patching a resource with blank nodes is a form of the graph isomorphism problem, which is generally accepted as being intractable. Fortunately, the problem can usually be managed in practice; details are covered in Performance Considerations. ## Syntax In datapatch, patches themselves are expressed in RDF. In order to conveniently talk about RDF inside RDF, datapatch uses RDF datasets, which go beyond a single set of triples (a single graph) to also include additional "named" graphs. A downside to this approach is that conventional RDF syntaxes like RDF/XML and Turtle cannot conventiently be used to express patches. Instead, newer dataset-aware syntaxes must be used, such a JSON-LD and TriG. SPARQL works with datasets as well, so SPARQL can be used to store and query over patches.There is no good syntax to use in this document. The SPARQL graph pattern syntax and TriG are almost the same, but they differ in trivial ways, making them incompatible. Perhaps the next draft of TriG will fix this. For now, we'll assume it does, and use a subset of SPARQL as a declarative language, as if it were TriG.
## Relationship to SPARQL UpdateTODO: Explain how users should navigate the overlap in functionality between datapatch and SPARQL 1.1 Update.
# Use Cases and Examples The following subsections explain some of situations in which datapatch might be useful and which have motivated its creations. They are each illustrated with examples based on the following scenario:Imagine a group of scientists is replicating the classic Stanford marshmallow experiment, where a child is given a marshmallow and offered a second one if they can refrain from eating the first one 15 minutes. In this example, the scientists are looking for cultural variations, so they have recruited collaborators from around the world. Each of the collaborators is given access to the complete dataset and asked to add their own observations each time they run the experiment. The collaborators are granted machine-level access so they can create their own local-language/local-culture experimenter user interface, and they are given free reign to include their own additional data (beyond the common fields such as age of the child, and how long they refrained from eating the first marshmallow).## Adding Data The simplest use case for datapatch is to add to some unified collection of data (a database, in the most general sense) using HTTP. Without PATCH, this would require either (1) downloading the contents and then uploading a modified version using PUT, or (2) defining an application-specific usage of POST. Using PATCH, TriG, and datapatch:BasicPatch, it could look like this:
Client sends:## Removing Data ## Updating ## Keeping a Record of Changes ## Mirroring Data ## Sharing Application State # Definition of BasicPatch @@@ rename to DataPatch? A _patch_ is a relation involving two sets of RDF triples (that is, two RDF Graphs). Specifically, a patch __from__ RDF Graph G1 __to__ RDF Graph G2 embodies the differences between G1 and G2 such that a system storing G1 can algorithmically construct G2. A _BasicPatch_ is a kind patch of which embodies these differences by simply enumerating the triples to change. If one is construcing G2 from a copy of G1, the BasicPatch says which triples to delete and which to insert. If one is constructing G2 from G1 (without modification), the BasicPatch can be understood as saying which triples to leave out and which extra ones to include. There is one complication: blank nodes. Because they lack identifiers (IRIs), when they occur in triples, those triples cannot simply be enumerated for deletion or insertion. Instead, BasicPatch uses another blank node as a placeholder and provides additional constraints showing which blank node it is standing-in for. Servers are then required to figure out which blank node in G1 is meant by each placeholder in the patch. (See Performance Considerations for guidance on how to do this.) Formally speaking, a BasicPatch is expressed using the elements of a dataset matching this SPARQL pattern:PATCH /observations HTTP/1.1 Host: www.example.org Content-Type: text/trig Transfer-Encoding: Chunked PREFIX : <http://www.w3.org/ns/ldp#> PREFIX xs: <http://www.w3.org/2001/XMLSchema#> PREFIX mme: <http://www.example.org/2013/world-marshmallow-vocab#> PREFIX univ: <http://local-universite.example.org/psychlab-vocab#> [ a :BasicPatch; :where <#w1>; :delete <#d1>; :insert <#i1> ] <#w1> { } <#d1> { } <#i1> { [ a mme:ExperimentRun; mme:observer univ:DrSmith, univ:DrJones; mme:countryCode "UK"; mme:childAge 7; mme:secondsLasted 35 ] }
Server responds with a 2xx, indicated success:HTTP/1.1 204 No Content Content-Location: /observations ETag: "e0023aa4f"
?patchId a :BasicPatch;
:where ?w;
:delete ?d;
:insert ?i.
GRAPH ?w { ... the "W" graph ... }
GRAPH ?d { ... the "D" graph ... }
GRAPH ?i { ... the "I" graph ... }
To define the meaning of this pattern, we first define these functions:
* Given two RDF graphs X and Y, union(X, Y) is the RDF Graph which contains exactly those triples which are in X or Y
* Given two RDF graphs X and Y, minus(X, Y) is the RDF Graph which contains exactly those triples which are in X but not in Y
* Given a function M and a graph X, mapNodes(M, X) is the Graph which contains exactly the triples (M(S), P, M(O)) such that (S, P, O) is a triple in X.
In the conditions below, "W" is the graph paired with ?w in the dataset, as suggested in the pattern above. Similarly for D/?d and I/?i.
If the dataset pattern above is matched in a dataset, that dataset is true only if:
* ?patchId is a "patch" from some graph G1 to some graph G2.
* M1 is a mapping such that mapNodes(M1, union(W,D)) is a subgraph of G1
* M2 is a mapping to fresh blank nodes from each the blank nodes in I which is not in W or D
* M is the union of M1 and M2
* G2 = union(mapNodes(M, I), minus(G1, mapNodes(M, D)))
# HTTP PATCH using BasicPatch
## Describing vs Sending
?patch :onNamed ?NamedGraph lets you patch a dataset; you send this PATCH and including this triple to say it applies to a specific named graph If you don't say this, its just a patch of the default graph.## Literal Matching In RDF (using the Turtle notation), these literals are different literals with the same value: * "01"^^xs:int * "1"^^xs:int * "1"^^xs:decimal * "1.0"^^xs:decimal
RDF Datatype | JavaScript Type | Constraints |
---|---|---|
xs:integer | Number | integer, in canonical form, in range -9007199254740992 .. 9007199254740992) |
xs:double | Number | in canonical form, non-integer, or integer outside of in range -9007199254740992 .. 9007199254740992) |
SHOULD BE: respect if-match, use if-match. @@@ Basically, respect if-match. But *I THINK* the server is allowed to cheat a little, and do a patch even if if-match is wrong, as long as the result would have been the same doing two patches in a different order. Or not.... Actually as a client, what I want is to be getting a feed of those other people's changes. If a server gets two or more patches with the same if-match etag it SHOULD merge them (instead of rejecting all but one) as long as the result would be the same AND it has some way to tell the client someone else also made some changes....??? Or I just don't if-match -- I could just patch blindly. TEST CASE: two changes with the same if-match. what happens??? In practice, it's probably okay to ADD blindly and even DEL blindly, but set/edit/alter (DEL+ADD) .# HTTP GET-ing BasicPatch
@@@ just raw notes. Basically, the only way to gracefully handle multiple clients doing patches is if they can also GET patches. That's pretty easy to do, I think:Servers MAY allow clients to get patches, so that clients can maintain an accurate copy of the resource representation when it is being changed by other clients or by server logic. One approach for this: the server provides this information either as part of the resource representation or via the Link header:
?resourceURL ldp:nextBasicPatch ?nextPatchURL
This tells clients a GET of ?nextPatchURL can be used to obtain a patch from the version which has that triple in it to some later version. The later version need not be the patch actually sent by a client; the server MAY merge all patches from the starting version until the time the GET is actually handled.
If there have been no changes since this version, GET on ?nextPatchURL MUST return a 404. In contrast:
?resourceURL ldp:nextBasicPatchLongpoll ?nextPatchURL
has the same semantics, and the same behavior, except that if no changes have occured, a GET on ?nextPatchURL MUST not return a 404 or a 2xx. The server MUST keep the connection open, not returning any result, until there is a change or the server resource limits for this client are exceeded (in which case a 503 Service Unavailable MUST be returned.) Clients SHOULD anticipate receiving 504 Gateway Timeout as well, due to intervening proxies.
# Test Cases
@@@ use TriG to define dptest:LoopTest sequences of patches from a zero-triple resource back to a zero-triple graph.