Awwsw200OkResources

From W3C Wiki
Revision as of 16:26, 18 June 2010 by Jrees (Talk | contribs)

See also AwwswHome.

Things for which a 200 response to a GET request with target URI U is considered "good practice", assuming that an agent controls HTTP responses for U and that the agent says that U "identifies" (names) that thing. Also, which 200 responses are OK for which things.

This document is meant to be provocative. Please disagree and record your view here.

Why does this matter? I have a thing in mind, and I'm trying to decide what URI to give it (if any), and how that URI should be served. I have to decide between

  1. Non-hash URI and GET/200
  2. Non-hash URI and GET/303
  3. Hash URI
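As a rough illustration of what a client would observe under each of the three choices, here is a minimal sketch. The function name, the strategy labels, and the ".about" suffix used for the redirect target are all invented for this example; nothing here is prescribed by HTTP or by this page.

```python
from urllib.parse import urldefrag

def observe_get(uri, strategy):
    """Return (requested_uri, status, headers) for a GET of `uri`.

    `strategy` is one of "200", "303", "hash" (labels invented for this sketch).
    """
    # A client strips any fragment before sending the request.
    requested_uri, _fragment = urldefrag(uri)
    if strategy == "200":
        return requested_uri, 200, {"Content-Type": "text/html"}
    if strategy == "303":
        # Redirect to a separate (hypothetical) document about the thing.
        return requested_uri, 303, {"Location": requested_uri + ".about"}
    if strategy == "hash":
        # The 200 answers for the fragment-less URI, not for the hash URI itself.
        return requested_uri, 200, {"Content-Type": "text/html"}
    raise ValueError("unknown strategy: " + strategy)
```

Note that in the hash case the server never sees the URI that names the thing, so the 200 it sends commits it to nothing about that thing.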

This also bears on how we might use LRDD (Link: header and .well-known/host-meta) to provide metadata for resources served with 200 responses. That is, if we haven't decided what the resource is but are committed to delivering 200 responses, how much freedom do we have in preparing metadata for the resource?
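To make the Link: header case concrete, the following sketch builds a 200 response that points at a separate metadata document. It assumes the "describedby" link relation and a made-up metadata URI; whether LRDD settles on that relation type is outside what this page says.

```python
def response_with_description_link(body, description_uri):
    """Build (status, headers, body) for a 200 that advertises metadata.

    `description_uri` is a hypothetical location for the metadata document.
    """
    headers = {
        "Content-Type": "text/html",
        # Web Linking syntax: target URI in angle brackets, then parameters.
        "Link": '<%s>; rel="describedby"' % description_uri,
    }
    return 200, headers, body
```

The open question in the text is what the document at the linked URI is then free to say about the resource.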

I'm not calling this class "information resources" due to lack of consensus on what that term means. Maybe it's the same, maybe not. The question is what a 200 commits us to, not whether we want to call something an "information resource" (which is of absolutely no consequence other than to tell us that a 200 is OK).

Note that a 200 response is never required. A server is always free to give a 404, 303, etc., no matter what kind of thing it is.

Positive examples

Things on the web

If the URI owner has said nothing at all about what a URI U names, and the server is already delivering 200 responses, then web architecture (HTTP, RFC 3986, AWWSW) says that there exists a thing and that U names it. (This is purely a theoretical convenience, as such an assertion is not testable.) So it is always OK for the agent to say that U names some thing, without committing to any particular properties of the thing. In this case, according to the httpRange-14 rule, the thing is an information resource, even though the agent hasn't explicitly said so. That's OK, since the agent doesn't care what it is.

The 200 responses that are OK for these things include, at least, the ones that are returned in 'authoritative' responses when it is agreed that the URI 'identifies' the thing.

Wild-type web resources

I like to think there is a class of "wild type web resources" whose members are defined to be things that a URI owner can say that the resource is, when it isn't anything else. This way, if an agent is challenged to say what one of its URIs names, it can just conjure the appropriate member of this class, and it will be right by construction. (This would of course make it harder for the agent to change its mind later on.) Any 200 response at all to a GET is permitted, since the resource is defined to be the one whose webarch-representations are the ones that are actually delivered in 200 responses.

For each URI, there would be one of these resources, although the URI wouldn't necessarily name it.

This catchall category would contain all kinds of pathological resources - non-safe, non-RESTy, highly context sensitive, etc.

The 200 responses that are OK for these things include, and perhaps are limited to, the 'authoritative' ones (see above).

REST resources

If a resource is of the kind described by Fielding's REST theory - i.e., if it goes through a sequence of "states" through time, and the current state has a "representation" permitted by the resource's current value set - then it seems reasonable for a server to yield a 200 response containing such a representation for a URI that names the resource.

The 200 responses that are OK for these things are the ones that contain "representations" of the state of the resource in accordance with REST theory.

Files

If you were to approach a site operator with a URI for something on the site, and ask them which resource the URI "identifies" for their purposes, it's easy to imagine that they would look at their site configuration and say that the URI identifies some particular file in a file system. Indeed this seems very close to the original design of the Web.

They couldn't be wrong about this, could they?

Services

Consider this URI and the behavior one observes when doing a GET on it:

If you were the operator of the site, what might you say about what this URI names? It's not a REST resource, since the "representation" (in the response) does not reflect any particular state through which the "resource" passes (well, you could force this to be true through some unnatural construction, but that would be in poor taste, and counter to the spirit of REST).

The natural description of this entity might be "random number generator" and it might be a kind of "service".
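A handler with this behavior might look like the sketch below. The function name, the number range, and the choice of headers are assumptions made for illustration, not a description of any actual site.

```python
import random

def handle_get():
    """Answer every GET with a fresh random number in a 200 response."""
    body = str(random.randint(0, 999999)).encode("ascii")
    headers = {
        "Content-Type": "text/plain",
        # Each response is a one-off, so ask caches not to keep it.
        "Cache-Control": "no-store",
    }
    return 200, headers, body
```

Nothing is stored between calls, which is why it is hard to read the body as a "representation" of any state the "resource" passes through.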

Oaxaca weather report

The example in AWWW [TBD]

Not clear whether any current Oaxaca weather report is OK as a 200 response for this resource, or whether there is some restriction such as being authorized by some domain owner.

News site

E.g. http://news.google.com/ - it seems perfectly reasonable to say that this is a news site, and that there are certain things about it that make it a news site, and that it has properties that are characteristic of a news site (e.g. frequency of update, geographical coverage).

200 responses are current news.

foaf:Document

Lots of things would break if 200 responses were not permitted for members of this class, so I think we're obligated to say they're OK. There is no definition to speak of for this class, so maybe it has members for which we wouldn't want to permit a 200 (e.g. someone could reasonably put a physical copy of a book in this class).

The 200 responses that are OK for a foaf:Document would be those that are the document in its current state (the current version of the document). But which languages and encodings are acceptable may be a function of the particular document - i.e. two documents with the same content at all times may differ just because a French "representation" is allowed for one but not the other. That is, the content does not determine the document; restrictions on acceptable "representations" also affect the document's identity.

Maximally ill-behaved web thing

Thought experiment: Consider a server that processes GET U as follows: It chooses a URI Z more or less at random, does GET Z, and returns the response as the response to GET U. Basically, every webarch-representation is an authorized webarch-representation.
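The thought experiment can be sketched in a few lines. Here `candidate_uris` and `fetch` are hypothetical stand-ins: a pool of URIs to choose Z from, and whatever function performs GET Z and returns its response.

```python
import random

def ill_behaved_get(candidate_uris, fetch, rng=random):
    """Handle GET U by relaying the response to GET Z for a random Z.

    `candidate_uris` is a non-empty sequence of URIs; `fetch` maps a URI
    to a response object of any shape. Both are supplied by the caller.
    """
    z = rng.choice(candidate_uris)
    # The response for Z is passed through unchanged as the response for U.
    return fetch(z)
```

The handler never consults U at all, which is the sense in which every webarch-representation is authorized.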

Web architecture says that U names some 'resource'. The HTTP protocol licenses the server to respond as it does. REST advises that this is not a good thing to do since architecturally we haven't specified what the resource's states or 'representations' are, but REST is not 'normative'.

One could take any of various attitudes toward this.

  1. This is not a useful resource, so we are unlikely to talk about it, so it doesn't much matter what we think or say about it.
  2. We can meddle (where obviously no one cares what we say) and say that this is 'not good practice'.
  3. We can modify webarch and say that not all 200ish URIs "identify" "resources" - this URI doesn't.

etc.

Negative examples

Physical things

It seems generally agreed, except by those who still reject the httpRange-14 rule entirely, that GETs of URIs naming physical things such as (physical) books, people, and cities should not lead to 200 responses.

Boundary cases

It is suggested that we not spend any time worrying about boundary cases. There are two possible attitudes one could take toward such an agreement:

  1. If I want to know whether 200s are OK for a URI naming x (for example, some particular 'information content entity' in IAO), I have the freedom to say (within reason) that x is an 'information resource', if I like, if it seems reasonable to me.
  2. It is better to err on the safe side, and when in doubt never use 200 responses (always use a # URI, 303, or whatever).

Mathematical things

It is said that URIs for mathematical things like numbers, strings, and graphs similarly should never lead to 200 responses. This is described not in ontological terms but as a type discipline: putting one of these where a 200-OK resource was expected, or vice versa, is probably a programmer error. This is a rejection of the AWWW definition of 'information resource' since clearly all essential characteristics of a number can be conveyed in a message.

A 200 response for a mathematical thing would be a textual encoding of that thing, e.g. for a number you might get a numeral and for an RDF graph you might get a serialization of the graph.

Nobody I know much likes this practice, but I felt I had to include it as illustrating an inconsistency with AWWW.

Things that might not be on the web

TimBL likes to say that 200s are OK for URIs naming literary works (e.g. The King James Bible, or Moby Dick). More detail at Generic Resources.

Pat Hayes disagrees.

FRBR

It seems reasonable that if you allow a 200 for a URI naming Moby Dick, you would also allow a 200 for a URI naming any member of the FRBR classes Manifestation, Expression, and Work. (Not Item, as Items are physical things.)

In each case the 200 response would carry what amounts to a Manifestation.

Resources named by data: URIs

data: URIs presumably "identify" "resources". What do they identify? If a data: URI were the request-URI in an HTTP GET request, would it be correct for a server to respond with a 200? The RFC is not explicit but I would think so, by analogy to what browsers do when they encounter a data: URI in @href or @src.

The 200 response would obviously have the Content-Type and octet sequence specified by the URI itself. I would guess that adding information not present in the URI, such as Content-Language, would be an error. Expires: is sort of meaningless but could be set far in the future to permit caching.
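One way to picture such a response is the sketch below, which derives everything from the data: URI itself. The function name and the particular far-future date are assumptions for illustration; the default media type is the one RFC 2397 gives for a data: URI that omits it.

```python
import base64
from datetime import datetime, timezone
from email.utils import format_datetime
from urllib.parse import unquote_to_bytes

def data_uri_response(uri):
    """Build (status, headers, body) using only what the data: URI says."""
    if not uri.startswith("data:"):
        raise ValueError("not a data: URI")
    header, comma, payload = uri[len("data:"):].partition(",")
    if not comma:
        raise ValueError("data: URI has no comma separator")
    is_base64 = header.endswith(";base64")
    if is_base64:
        header = header[:-len(";base64")]
    # RFC 2397 default when the media type is omitted.
    media_type = header or "text/plain;charset=US-ASCII"
    body = base64.b64decode(payload) if is_base64 else unquote_to_bytes(payload)
    # Arbitrary far-future date (an assumption), since the content cannot change.
    expires = format_datetime(datetime(2100, 1, 1, tzinfo=timezone.utc), usegmt=True)
    # Deliberately no Content-Language: the URI does not carry one.
    return 200, {"Content-Type": media_type, "Expires": expires}, body
```

For example, `data_uri_response("data:,A%20brief%20note")` yields the body `b"A brief note"` with the default media type.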

Webarch-representations

TimBL and others insist that URIs naming "representations" (in the webarch/REST sense) are not suitable for 200 responses, but this seems very weird to me, and it has some strange consequences (clearly they possess Dublin Core properties, like many other 200-OK resources, so what does this say about the domain of the DC properties?).

Now in a sense this doesn't matter, as no one is likely to give a URI to a webarch-representation. However, these things will occur in any ontology of web architecture. If we were to attempt to categorize ontologically those things for which it's OK for a GET of a URI naming the thing to lead to a 200, it would be nice to know what is intrinsic to these things that disqualifies them.

If one did get a 200 for a URI naming a webarch-representation, it would make sense for the response to carry that representation directly, with Expires: arbitrarily far in the future (since it is resources that change, not their webarch-representations).

Textual things

What about words? Sentences? Paragraphs?

Journal articles

Similar question to FRBR Expression. As a kind of literary work, Pat would say no.

This question bears on how the BIBO ontology is to be deployed and how Zotero works. Extant practice may have already overtaken anything we might say. To be researched.

Predicates

Dan Brickley feels (or felt in 2007, TBD: find the email) that making classes (one-place predicates) and 'properties' (two-place predicates) be 'information resources' is quite natural.

TBD

To examine: IAO, CITO, BIBO, IRW, etc. etc.