HTTP Protocol review from Gregory Williams on 2010-05-15 (public-rdf-dawg@w3.org from April to June 2010)

From: Gregory Williams <greg@evilfunhouse.com>
Date: Sat, 15 May 2010 03:31:49 -0400
To: Chimezie Ogbuji <ogbujic@ccf.org>, SPARQL Working Group <public-rdf-dawg@w3.org>
Message-Id: <D91B9E99-F289-4BD3-BC3B-3E5B200372D9@evilfunhouse.com>

Chime,

Below is my review (ACTION-235) of the HTTP Protocol document. Most of the issues detailed just need some clarification in the text, or are formatting issues. There are a few bigger issues (the use and meaning of "dataset", discussion of what constitutes a "compliant implementation", the effect of conditional requests on non-GET operations). These might need discussion, but I don't know that they have to be nailed down before this draft publication.

thanks,
.greg

--------
Abstract

You link to SPARQL Query, but not to SPARQL Update (even though both are mentioned directly).

--------
1. Introduction

I'm not sure I understand the second sentence. It reads:

"""It emphasizes a clear separation between the RDF graph management actions performed from the networked body of RDF knowledge identified by a URI as the target of the actions, the lexical form of a Request URI, the URI of a graph in an RDF dataset, and the (optional) RDF delivered with the message."""

Is there any way to make that more clear? In particular, I can't seem to parse "the RDF graph management actions performed from the networked body of RDF knowledge identified by a URI as the target of the actions".

--------
2. Terminology

sp. "enduce".

"Architectural style" is only ever used in the immediately following definition for REST. Is it really necessary, or might it be rolled into the definition for REST?

"IRI" should be defined before it is used in the definition for "Resource". Also, this definition says "Before an IRI found in a document is used by HTTP, the IRI is first converted to a URI. See Section 4.1", but section 4.1 doesn't seem to actually talk about this conversion process.

"Network-manipulable RDF (Dataset)" - Is "dataset" here used in the same way as it is in SPARQL (with a default graph)? I can't find any discussion of a default graph in the rest of the document, and fear two definitions of "dataset" in SPARQL documents might be confusing. Also, worth adding an explicit reference to section 4.2 when you say "URIs that can be embedded in the query component of URI in a manner described later in this document".

"Networked RDF knowledge" - I find the use of "knowledge" to be confusing for an information resource (perhaps just a personal preference, but I'd think knowledge would be more likely to be a non-info resource, all else being equal). The use of "IRI" and "URI" in this definition and the previous one doesn't seem consistent (they are almost being used interchangeably).

At the end of this section, you define "the resolvable URI of a graph" as a shorthand, but that phrase isn't actually used in the rest of the document as far as I can see. You do use "resolvable URI" once, but in a parenthetical, so might just be worth expanding the shorthand for the one use.

--------
3. Protocol Model

This section says, "A compliant implementation of this specification MUST accept HTTP requests directed at its dataset and handle them as specified by this protocol," but this disagrees with the discussion in section 7 (Security Considerations) and doesn't seem to leave much wiggle room for things like refusing requests that seem like DOS attacks, etc.

--------
4.1 Direct Graph Identification

"However, in using URIs in this way, we are not directly identifying the RDF graphs but rather the networked RDF knowledge they represent." Isn't this backwards? Doesn't the networked RDF knowledge represent the graph?

The diagram in this section should be labeled. The description of the diagram talks about URIs, but the diagram itself uses "IRI". As previously, I'm not sure if the "representedBy" arc in the diagram should be "represents". Is "NetworkManipulableDataset" meant to signify a single graph, or a dataset? Finally, what do the black arcs represent (serialized RDF)?

--------
4.2 Indirect Graph Identification

Are there any restrictions on what URIs can be used with the ?graph=... query component? It might be helpful to see a full example URI using indirect graph identification.
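
As an illustration of the kind of full example URI requested here, the following is a minimal sketch only. The service endpoint and the graph URI are hypothetical, and percent-encoding the embedded URI is an assumption about what the document intends, not something the draft states.

```python
from urllib.parse import quote, urlsplit, parse_qs

def indirect_graph_uri(service: str, graph_uri: str) -> str:
    """Embed graph_uri in the query component of the service URI."""
    # safe="" percent-encodes every reserved character, including ':' and '/'.
    return f"{service}?graph={quote(graph_uri, safe='')}"

# Hypothetical endpoint and graph URI, used only for illustration.
SERVICE = "http://example.org/rdf-graphs/service"
GRAPH = "http://example.org/other/graph"

uri = indirect_graph_uri(SERVICE, GRAPH)
print(uri)

# Decoding the query component recovers the original graph URI.
recovered = parse_qs(urlsplit(uri).query)["graph"][0]
print(recovered)
```

Whether a server should accept an unencoded graph URI in the query component is exactly the kind of restriction the text could spell out.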

This diagram should be labelled, too, and at least some mention made of it in text. There should also be some sort of visual connection in the diagram between the "Networked RDF knowledge" node and the 'http://..?graph=...' node.

--------
5. Graph Management Operations

I'm not sure "Networked RDF Knowledge" can be used in the first sentence if you want to include indirectly identified graphs as possible targets for these operations. The definition of "Networked RDF Knowledge" seems to only include the directly identifiable graphs. This terminology issue occurs throughout section 5.

--------
5.1 PUT

The example SPARQL UPDATE operations should be separated by semicolons. The syntax should also be aligned with the most recent draft of the Update doc (I believe this means using 'INSERT DATA { GRAPH <graph_uri> { ... } }' instead of 'INSERT DATA INTO <graph_uri> { ... }').
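
A rough sketch of the shape being suggested, written as a small helper that assembles the update string. The helper name, the graph URI, and the triple are hypothetical, and the exact 'DROP GRAPH' and 'CREATE GRAPH' keyword forms are an assumption that should be checked against the current Update draft.

```python
def put_as_update(graph_uri: str, triples: str) -> str:
    """Build a semicolon-separated update sequence equivalent to a PUT."""
    ops = [
        f"DROP GRAPH <{graph_uri}>",
        f"CREATE GRAPH <{graph_uri}>",
        # GRAPH block inside INSERT DATA, rather than INSERT DATA INTO.
        f"INSERT DATA {{ GRAPH <{graph_uri}> {{ {triples} }} }}",
    ]
    # Operations are separated by semicolons.
    return " ;\n".join(ops)

# Hypothetical graph URI and triple, used only for illustration.
update = put_as_update(
    "http://example.org/g",
    '<http://example.org/s> <http://example.org/p> "o" .',
)
print(update)
```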

"Note that the DROP and CREATE expressions are only necessary if the networked RDF knowledge does not already exist in the server." I'm not sure I understand this. If the networked RDF knowledge doesn't exist, why would the DROP be necessary?

The use of the word "can" seems very weak in describing the semantics of PUT: "the origin server can create the knowledge with that URI in the associated network-manipulable dataset". Why not "SHOULD"?

--------
5.2 DELETE

Why does the text here explicitly say that "the client cannot be guaranteed that the operation has been carried out, even if the status code returned from the origin server indicates that the action has been completed successfully"? What good is a "completed successfully" response if it doesn't actually mean anything? Would using HTTP 202 (Accepted) be a better code to use in that situation?

Why are brackets used in the SPARQL UPDATE operation?

--------
5.3 POST

Again, why are the brackets used in the SPARQL UPDATE? This is another place where the SPARQL UPDATE syntax should be aligned with the current draft.

I can't make sense of this: "and distinguish such a request from the insertion use case on the basis of whether or not the request URI identifies networked RDF knowledge managed by the server". Is the "insertion use case" just the use of POST on a previously non-existent graph? Should the Location HTTP header be returned with a 201 Created response even if the graph is indirectly identified (and possibly not dereferenceable)?

--------
6. Conditional Requests

The text in this section talks about "any of the operations", but only goes on to talk about GET requests. I'd like to see some discussion of how conditional requests work with PUT, DELETE, and POST requests, and especially in combination with indirect graph identification.
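
To show the kind of case that could be discussed, here is a minimal sketch of a conditional PUT that is only meant to succeed if the graph is unchanged since it was last fetched. The graph URL, the payload, and the ETag value are hypothetical, and the request is only constructed, never sent. Under ordinary HTTP semantics a server that finds the entity tag no longer matches would answer 412 Precondition Failed; whether that is the intended behavior for this protocol is the open question.

```python
from urllib.request import Request

def conditional_put(graph_url: str, rdf_payload: bytes, etag: str) -> Request:
    """Build (but do not send) a PUT guarded by an If-Match precondition."""
    return Request(
        graph_url,
        data=rdf_payload,
        method="PUT",
        headers={
            "Content-Type": "application/rdf+xml",
            # Only replace the graph if its current entity tag still matches.
            "If-Match": etag,
        },
    )

# Hypothetical values, used only for illustration.
req = conditional_put(
    "http://example.org/rdf-graphs/employees",
    b'<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"/>',
    '"abc123"',
)
print(req.get_method(), req.full_url)
# urllib stores header names capitalized, hence "If-match" for the lookup.
print(req.get_header("If-match"))
```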

Received on Saturday, 15 May 2010 07:32:21 UTC