REST

From Semantic Web Standards
Revision as of 19:44, 16 December 2011 by Sandro (Talk | contribs)

This is a discussion draft, with no formal status at W3C.

Edited by Sandro Hawke (sandro@w3.org), based on ideas from many sources, including the SPARQL 1.1 Graph Store HTTP Protocol and discussions in the RDF Working Group, in the SPARQL Working Group, and with participants of the Linked Enterprise Data Patterns Workshop.

RDF Simple Data Interface Protocol - Level Zero

1 Introduction

This document shows a simple way to use HTTP operations (GET, PUT, etc) to read and alter information exposed as RDF graphs, following the REST architectural style. Many types of systems can act as servers for this protocol, including both domain-specific applications, such as CAD, games, GIS, finance, etc, and general storage systems, such as RDF quadstores, SQL databases, and filesystems. This protocol may be implemented in parallel to other interfaces (SOAP, SPARQL, etc), providing an easy-to-implement standard interface to some or all of the same underlying features.

This document specifies an initial, "Level Zero" interface that is, arguably, already specified by a combination of existing standards: HTTP/1.1, WebArch, and RDF. This document gathers and rearticulates the necessary concepts to make clear the RESTful approach to accessing RDF information.

This document anticipates future work to define higher levels of this interface, providing significant additional functionality that is likely to be necessary for most practical applications. The primary goal of this document is to clarify the foundation on which that more-useful work is to be built.

This design is based on a Web server exposing a view of its internal state as one or more RDF graphs. In some cases, the server will simply be storing RDF triples; in other cases, it will implement a mapping between portions of its internal state and RDF triples which completely convey that state. The details of how to design such a mapping are out of scope for this specification.

2 Operations on Elements

This protocol applies to "Graph-State Resources" (GSRs), which are Resources that have their state exposed on the Web as sets of RDF triples, also known as RDF graphs. These are the REST elements of this protocol; we are specifying how Web servers are to respond to HTTP operations on Graph-State Resources. Level zero does not specify how to determine whether a Resource is a Graph-State Resource, so that information must be conveyed using other conventions.

HTTP Verb Behavior
GET Returns a serialization of the RDF graph which encodes the state of the given resource. Content negotiation MUST be performed. In level zero, at least one of application/rdf+xml or text/turtle SHOULD be available from the server.
HEAD As normal, on the information GET would return. In particular, metadata may be returned using Link Headers, indicating, for instance, a SPARQL endpoint which can be used to query the data.
POST Not specified in general, to allow for application use.
PUT If the media type of the payload is an RDF graph serialization language, then set the resource state to be as encoded in the serialized RDF graph. Otherwise, not specified in level zero. Creates the resource, if it does not already exist. Some resources may be flagged "no clobber", to reject PUT if they already exist; level zero does not specify how to indicate this.
DELETE Remove the association between the resource and the URL used in the delete operation. The server MAY retain the underlying resource, perhaps accessible via a different URL. That is, a successful DELETE removes this one reference to the resource, but does not necessarily affect other references, or the resource itself. Creating multiple references or determining whether they exist is beyond the scope of level zero.
PATCH Modify the state of the resource as specified by the payload, according to its media type. Servers MAY accept SPARQL 1.1 Update (Content-Type: application/sparql-update) on an element resource to modify the RDF graph view of the resource state, acting as a SPARQL endpoint with only a default graph.
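
As an illustration of the table above, here is a minimal in-memory sketch of GET, HEAD, PUT, and DELETE on Graph-State Resources. It is not a real HTTP server: the class name GraphStateServer, the handle method, and the specific status codes (204, 404, 405, 415) are assumptions made for this example and are not set by level zero. Content negotiation is omitted for brevity, and graphs are stored as opaque serialized strings.

```python
# Hypothetical sketch only: names and status codes are illustrative assumptions.
RDF_TYPES = ("text/turtle", "application/rdf+xml")


class GraphStateServer:
    def __init__(self):
        # Maps URL -> (media type, serialized graph) for each Graph-State Resource.
        self.resources = {}

    def handle(self, method, url, content_type=None, body=None):
        """Return a (status, headers, body) tuple for one request."""
        if method in ("GET", "HEAD"):
            if url not in self.resources:
                return 404, {}, ""
            media_type, graph = self.resources[url]
            headers = {"Content-Type": media_type}
            # HEAD carries the same headers as GET but no body.
            return 200, headers, graph if method == "GET" else ""
        if method == "PUT":
            if content_type not in RDF_TYPES:
                # Non-RDF payloads are not specified in level zero.
                return 415, {}, ""
            created = url not in self.resources
            # PUT replaces the whole state, creating the resource if needed.
            self.resources[url] = (content_type, body)
            return (201 if created else 204), {}, ""
        if method == "DELETE":
            if url not in self.resources:
                return 404, {}, ""
            # Only the association between this URL and the resource is removed.
            del self.resources[url]
            return 204, {}, ""
        # POST and PATCH are left to applications or higher levels in this sketch.
        return 405, {}, ""
```

A PUT to a new URL followed by a GET on the same URL returns the stored serialization; after a DELETE, the same GET yields 404 in this sketch.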

3 Operations on Collection GSRs

A "Collection GSR" is a Graph-State Resource which is also a collection. Collection GSRs are minimally defined in level zero:

  • A POST to a Collection GSR is a request to create a new Resource which has an initial state which is the same as the payload of the POST operation.

In response to a successful POST operation, the server MUST return a 201 Created status code and a Location: header giving a URL of the new Resource. (The 202 Accepted response, defined for HTTP/1.1 as "deferred creation", is NOT RECOMMENDED.)
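
The POST-to-Collection behavior can be sketched as follows. This is an assumption-laden illustration: the class CollectionGSR, its post method, and the scheme of minting member URLs from a counter are all hypothetical, since level zero does not say how the server chooses the new URL.

```python
import itertools


class CollectionGSR:
    """Hypothetical in-memory Collection GSR; URL minting is an assumption."""

    def __init__(self, url):
        self.url = url.rstrip("/") + "/"
        self.members = {}  # new resource URL -> (media type, serialized graph)
        self._ids = itertools.count(1)

    def post(self, content_type, body):
        # Create a new Resource whose initial state is the POST payload.
        new_url = f"{self.url}{next(self._ids)}"
        self.members[new_url] = (content_type, body)
        # 201 Created plus a Location header giving a URL of the new Resource.
        return 201, {"Location": new_url}
```

Each successful call returns 201 together with a fresh Location value, so a client can immediately GET or PUT the new Resource at that URL.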

Collection GSRs are not related to RDF Lists, which are sometimes called RDF Collections.

Level zero does not say anything about the state of Collection GSRs. In particular, it does not specify how to find out which GSRs are "in" the collection. Note that conceptually a Collection GSR's state typically includes the URLs of its elements, but not the state of those elements. A GET on a collection is used, with the right vocabulary, to find out what Resources are in the collection, not to get the full state of all those resources in one operation.

4 Not in Level Zero

The following features are likely to be useful, but are left to be addressed in higher-level specifications.

4.1 Self-Reference During POST

How does one create a GSR, using POST-to-Collection, and have the new content refer to itself, such as to give its cc:license?

Possible Solution: Use an empty relative URI, like <> in Turtle. But does this violate the RDF specs that say all node-labeling URIs in a graph are absolute? Perhaps not; the base is just not known by the client.
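
To illustrate the idea, the sketch below resolves relative references against the URL the server assigns, using the standard library's urljoin. The function name resolve_reference and the example URLs are assumptions; the point is only that an empty reference resolves to the base URL itself once the base is known.

```python
from urllib.parse import urljoin


def resolve_reference(location, reference):
    """Resolve a relative reference against the new Resource's URL.

    `location` stands for the URL returned in the Location header (assumed
    here to be used as the base); `reference` is the text inside <...>.
    """
    return urljoin(location, reference)
```

With a base of http://example.org/items/1, the empty reference resolves to that same URL, and a fragment-only reference such as #me resolves to http://example.org/items/1#me.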

4.2 Creating Collections

How does a client request the creation of a new collection?

Possible Solution: POST to the collection, with the graph containing the triple { <> a rdf:ResourceCollection }.

4.3 Finding SPARQL Endpoints

How does a client find out about SPARQL servers that might be able to answer queries about the state of a GSR?

Possible Solution: Respond to HEAD and GET, and in a 201 Created response, with one or more Link: headers, with a (not yet registered) link relation type, "sparql-endpoint". At the given endpoint, the state of this GSR would be available using the GRAPH keyword with the GSR's URL.
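
A rough sketch of this proposal follows. The helper names sparql_link_header and graph_query are hypothetical, and the "sparql-endpoint" relation type is, as noted above, not registered; the header value follows the usual Link header syntax of a URL in angle brackets followed by a rel parameter.

```python
def sparql_link_header(endpoint_url):
    """Build a Link header value pointing at a SPARQL endpoint (assumed rel name)."""
    return f'<{endpoint_url}>; rel="sparql-endpoint"'


def graph_query(gsr_url):
    """Build a query that reads the GSR's state via the GRAPH keyword."""
    return f"SELECT * WHERE {{ GRAPH <{gsr_url}> {{ ?s ?p ?o }} }}"
```

A client that finds the header on a HEAD or GET response could then send the generated query to the advertised endpoint to read the triples of that GSR.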

4.4 Declaring as a GSR, or GSR Collection

How does a client know which Resources are GSRs or Resource Collections?

Possible Solution: Either in the returned RDF, or in a Link: header, indicate rdf:type rdf:GraphStateResource or rdf:ResourceCollection.

4.5 Signalling Writability

How does a client know when a Resource will accept PUT and PATCH operations?

4.6 Listing a Collection

How does a client find out which elements are in a Resource Collection?

Possible Solution: Have the Collection state include a triple like { <CollectionURL> rdfs:member <ElementURL> } for each element in the collection.
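
For example, a server might generate the collection state along these lines. The function name collection_listing is hypothetical, and the sketch assumes the URLs contain no characters that would need escaping in Turtle.

```python
def collection_listing(collection_url, element_urls):
    """Serialize one rdfs:member triple per element, as Turtle."""
    lines = ["@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> ."]
    for element in element_urls:
        lines.append(f"<{collection_url}> rdfs:member <{element}> .")
    return "\n".join(lines)
```

The result lists only the element URLs, consistent with the point that a collection's state does not include the state of its elements.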

4.7 Paging

How does a client limit the size of the RDF graph serialization it receives?

Possible Solution: Use HTTP query parameters "page" (for the page number, >=1) and "maxPageBytes". If omitted, maxPageBytes is 1048576 (2^20, 1 MiB). Servers transmitting GSR state serializations over 1 MiB SHOULD support the page and maxPageBytes parameters. The Last-Modified date of the pages MUST be the same as for the full GSR, and MUST be altered if the contents change in some way that would affect traversal via page numbers.
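
One way a server might cut pages is sketched below. The proposal does not say where page boundaries fall, so this sketch assumes a line-oriented serialization (one triple per line, as in N-Triples) and splits only between lines; a single line larger than the limit gets a page of its own. The function names paginate and get_page are hypothetical.

```python
DEFAULT_MAX_PAGE_BYTES = 1048576  # 2**20, the default when maxPageBytes is omitted


def paginate(lines, max_page_bytes=DEFAULT_MAX_PAGE_BYTES):
    """Group serialized lines into pages of at most max_page_bytes each."""
    pages, current, size = [], [], 0
    for line in lines:
        line_bytes = len(line.encode("utf-8")) + 1  # +1 for the newline
        # Start a new page when adding this line would exceed the limit.
        if current and size + line_bytes > max_page_bytes:
            pages.append(current)
            current, size = [], 0
        current.append(line)
        size += line_bytes
    if current:
        pages.append(current)
    return pages


def get_page(lines, page=1, max_page_bytes=DEFAULT_MAX_PAGE_BYTES):
    """Return the body for a 1-based page number, or None if out of range."""
    pages = paginate(lines, max_page_bytes)
    if page < 1 or page > len(pages):
        return None
    return "\n".join(pages[page - 1]) + "\n"
```

Because page boundaries here depend on the order and size of the lines, any change to the contents can shift them, which is why the Last-Modified requirement above matters for clients walking the pages.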

4.8 Formats

Which RDF graph serializations can a client assume a server will accept and transmit-upon-request?

Possible Solution: Turtle.

4.9 Patch Types

Which PATCH media types can a client assume will be accepted?

Possible Solution: TurtlePatch

4.10 Patch-Workaround

How can clients perform PATCH operations in environments (like some Web Browsers) which do not yet support the PATCH verb?

Possible Solution: Servers SHOULD accept a POST to a GSR, with the URL modified to add the query parameter "_method=PATCH", as a PATCH to the GSR.
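
A server-side check for this workaround might look like the following sketch. The function name effective_method is an assumption; only the "_method=PATCH" query parameter comes from the proposal above.

```python
from urllib.parse import parse_qs, urlsplit


def effective_method(method, url):
    """Treat POST with ?_method=PATCH as PATCH; leave other requests unchanged."""
    query = parse_qs(urlsplit(url).query)
    if method == "POST" and query.get("_method") == ["PATCH"]:
        return "PATCH"
    return method
```

The override is applied only to POST requests in this sketch, so a GET carrying the same parameter is still handled as a GET.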

4.11 Multigraph Operations

How can a client operate on a large number of small graphs efficiently (without using SPARQL)?

Possible Solution: Servers MAY use a Link header with the (not yet registered) relation type "in-dataset" to give the URL of an RDF dataset which contains this graph state. RESTful operations may then be performed on that dataset, using suitable formats, when defined (such as TriG or N-Quads).

5 Comparison to SPARQL 1.1 Graph Store HTTP Protocol

The 2009 SPARQL WG Charter includes this deliverable:

Protocol enhancements for update. The group will also define protocol to update RDF graphs using ReSTful methods.

This is linked to additional details, as Protocol enhancements for update:

4.2.1 Motivations

By making it possible to update an RDF graph using RESTful HTTP methods, it becomes possible to use either a SPARQL endpoint or a plain Web server to update RDF data.

4.2.2 Description

It should be possible to manipulate RDF graphs using HTTP verbs, notably PUT, POST and DELETE. By this, clients doesn't need to know the SPARQL language to update graphs when it is not needed.

4.2.3 Existing implementation(s)

The following systems are known by the WG at the time of publication to support a RESTful update protocol.

  • Garlik's JXT supports HTTP PUT and DELETE.
  • IBM's Jazz Foundation supports graph update via a RESTful protocol.

The editor's draft addressing this is the SPARQL 1.1 Graph Store HTTP Protocol.

The differences, as of this writing, between that document (GSHP) and this one (SDIP):

  • The terminology and conceptualization are quite different. In GSHP the focus is the GraphStore; SDIP has no concept of a GraphStore, only Graph-State Resources, Collections, Web Servers, and Clients.
  • GSHP says POST "SHOULD" be understood as a request to merge in the payload to the state graph
  • GSHP says RDF/XML "SHOULD" be understood as the payload language @@check this
  • GSHP has two non-element URIs, the "Graph Store" and the "Dataset", neither of which quite aligns with SDIP's "Collection"
  • GSHP defines how to construct URIs for access to graphs in SPARQL datasets, with a graphstore+"?graph="+graphlabel construct. (Perhaps this is better done in the SPARQL Service Description instead, with a predicate relating the dataset to a URI prefix string, not necessarily graphstore+"?graph=".)
  • @@@ review more, looking for other things