ErrataHttpRange14

From W3C Wiki

The httpRange-14 resolution has some bugs. This page proposes to repair some of the wording, without altering the intent.

The literature on httpRange-14 is huge. See http://www.w3.org/wiki/HttpRange14Webography for some representative contributions.

Original Text

agreed on 15 Jun 2005

The TAG provides advice to the community that they may mint "http" URIs for any resource provided that they follow this simple rule for the sake of removing ambiguity:

  • If an "http" resource responds to a GET request with a 2xx response, then the resource identified by that URI is an information resource;
  • If an "http" resource responds to a GET request with a 303 (See Other) response, then the resource identified by that URI could be any resource;
  • If an "http" resource responds to a GET request with a 4xx (error) response, then the nature of the resource is unknown.
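Read as a decision procedure, the three bullets above can be sketched roughly as follows. This is only an illustration of the wording as agreed, not part of the resolution: the names `Nature` and `nature_from_status` are hypothetical, and returning `None` for status codes the resolution does not mention (for example 301 or 5xx) is an assumption.

```python
from enum import Enum
from typing import Optional


class Nature(Enum):
    """What the resolution lets a client conclude about the resource (hypothetical names)."""
    INFORMATION_RESOURCE = "information resource"
    ANY_RESOURCE = "could be any resource"
    UNKNOWN = "nature unknown"


def nature_from_status(status: int) -> Optional[Nature]:
    """Map the status code of a GET response to the resolution's conclusion."""
    if 200 <= status <= 299:
        # 2xx: the resource identified by the URI is an information resource.
        return Nature.INFORMATION_RESOURCE
    if status == 303:
        # 303 (See Other): the resource could be any resource.
        return Nature.ANY_RESOURCE
    if 400 <= status <= 499:
        # 4xx (error): the nature of the resource is unknown.
        return Nature.UNKNOWN
    # Assumption: the resolution is silent on every other code, so say nothing.
    return None
```

Note that the function takes only the status code as input, which is the point of the wording complaint that follows: it is the response to a GET on a URI that carries the signal, not the resource "responding".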

"http resource responds" - what?

This would imply that a resource can "respond". This appears to be a typo (as noted in discussion during awwsw). A 'server' is the sort of thing that responds, not necessarily a resource.

'"http" resource' isn't defined; I think what's probably meant is a resource that has an http: URI, although the advice really shouldn't be specific to http: URIs.

We will try to reword this to make it clearer and/or submit this to the TAG. Something like: If a GET request (on an http: URI?) yields a 2xx response, then the resource must be an information resource (etc.).

 DBooth: Or: "If a GET request on an http: URI yields a 2xx response, then the URI identifies an information resource".  
 In general I prefer "denotes" to "identifies", but since "identifies" has been used so many times in AWWW already, it may 
 make more sense to continue to use it in this context. 
 JAR: I prefer "names" to either...

is an information resource - really?

AWWW says that the naming authority gets to say of what the URI is a name. Suppose the naming authority says that the URI names a person (not an IR), but then, through error or willfulness, delivers a 200 response for the URI. The rule has not been followed, leading to an apparent contradiction. Who is wrong?

This is easily fixed by changing the wording of the rule to make it sound more like a rule - e.g. change "is" to "must be". The httpRange-14 resolution gives permission, not granted by the prevailing specification (RFC 2616), to use http: URIs to name things that are not information resources, but only when 2xx responses are not used in conjunction with them. The use of 2xx in this case would then be a failure to follow the extended protocol, not an alternative communication providing contradictory information about the resource.
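Under this "must be" reading, a 2xx response for a URI that the naming authority says names a non-IR is a conformance failure rather than a second, conflicting claim about the resource. A minimal sketch of that check, assuming a hypothetical function `violates_rule` and assuming the naming authority's declaration is available as a simple boolean:

```python
def violates_rule(declared_information_resource: bool, status: int) -> bool:
    """Return True when a GET is answered with 2xx for a URI declared to name a non-IR.

    Hypothetical helper: the boolean stands in for whatever the naming
    authority has said about the URI by some other means.
    """
    is_2xx = 200 <= status <= 299
    return (not declared_information_resource) and is_2xx
```

For example, `violates_rule(False, 200)` flags the person-URI case described above, while `violates_rule(False, 303)` does not, since a 303 response leaves the URI free to name anything.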

DBooth: I think the httpRange-14 decision is for more than just giving permission to use http URIs to denote non-IRs. 
I think it is *good* that it gives a rule.  Changing "is" to "must be" doesn't help to my ear.  I think the reality is 
that if the URI owner says one thing by configuring his/her server to give a 200 response and says a different thing 
via some other pronouncement, the fact is that the owner made contradictory statements.  Perhaps the term "conclusive 
evidence" would help: "a 2xx response is conclusive evidence that the URI denotes an information resource".
 JAR: I think you're reading too much into it. I don't think you can conclude from the resolution that the TAG meant 
 for 200 to communicate that the resource is an IR.  The resolution just says "to avoid ambiguity" - it could just be 
 good practice because some user-agents might (whether permitted to or not) infer IR-ness from a 200.
  DBooth: I guess we're reading it differently.  To my mind there are two things being said.  One is the advice for 
  people who are minting URIs.  But the second thing is the *reason* for this advice: that a 200 response *does* imply 
  an IR.  I don't view the httpRange-14 decision as an attempt to legislate this inference rule.  Rather, this 
  inference rule is being mentioned as a pre-existing fact, in order to explain the URI minting advice that is being 
  given.  To my mind "200 response implies IR" is a tautology, because only IRs can have awww:representations, but as 
  I check AWWW to see where this is stated, I see it is not stated explicitly: http://www.w3.org/TR/webarch/#id-resources 
   There is a problem with "is" in that naming (so-called identification) is not objective; it is part of a 3-way relation 
   between a name, someone using the name, and what that someone takes the thing to name.  So it's important to deconstruct 
   what's going on here and not hide behind objective terminology.  If necessary we should talk about conforming 
   clients and servers, wording that is common in many specifications.
    By the way, the server does not necessarily speak for the naming authority (consider absolute URIs) unless, maybe, 
    it's conforming to RFC 2616, and there's no way to test for that.  Assertions carried by a response are only 
    "authoritative" regarding the binding of the URI if they are attributable to the naming authority.
     DBooth: Yes, it could happen that the owner of domain foo.example cannot control what is served from URIs 
     at http://foo.example/*, but if so it was the owner's *choice* to give up that control.  The owner pays for
     the domain, and if the owner chooses, he/she certainly can have it configured to serve response codes as desired.

JAR's version of what the resolution should have been

The TAG provides advice to the community that they may mint absolute http: URIs to name arbitrary things provided that they follow this simple rule for the sake of removing ambiguity:

  • If an absolute URI is dereferenceable (i.e. if an HTTP GET request could be answered with a 2xx response), then it names the information resource whose associated representations are retrieved using that URI.
  • If an HTTP GET request for an absolute http: URI is answered with a 303 (See Other) response, then the URI can name anything, including any information resource, as determined by the URI owner.

(Longer version.)

DBooth: 1. I disagree with casting this as "the TAG asks the community to observe" this rule.
The whole point of this rule is that it is needed architecturally to enable clients to use a simple,
automatable, architecturally authoritative algorithm to determine the identity that the URI owner has chosen to 
establish for the resource.  It is *not* a matter of asking the community to be nice.  It is stating (part of) 
what that algorithm is, so that semantic web clients and publishers anywhere in the world can be assured of
communicating with full fidelity *if* they so choose. 

JAR: Agreed, I changed "ask" to "advise". I almost used the word "recommend" but that already has a technical meaning in W3C.

I haven't a clue what you mean by "architecturally". Architecture has to be either empirical or prescriptive. The rule is not empirically true but merely respected most of the time, and the level of review (i.e. claim to legitimacy) is just that the TAG in 2005 thought it was a good enough idea that it advised the community to follow it; and then the idea got uptake.

As I remember, your definition of "identity" is that a URI's identity is a set of RDF statements that's somehow glued to the URI. That is a very strange way to use the word, and I will certainly not use the word "identity" that way. But the important thing is agreement on how to use URIs, not on what RDF is glued to them. If RDF happens to help achieve agreement, that's fine, just as it would be wonderful if SVG or Excel could help. But it's not something I particularly care about, and neither does the TAG.

That said, expressing in RDF that a URI refers to the information resource at that URI is trivial, and is covered in the note.

DBooth: 2. The definition of "information resource" in AWWW *does* need to be corrected as part of clarifying
the httpRange-14 resolution, but I think that should be done as an addendum to AWWW, rather than here.  In particular, the
definition should either: (a) specify exactly what RDF assertions constitute its definition; or (b) make 
clear that it is *intended* for things like documents and web pages, whose "essential 
characteristics can be conveyed in a message", but some URIs may identify resources that are (ambiguously) both
"information resources" and resources that one may not normally think of as being "information resources",
such as cars or people.

JAR: I am not interested in issuing a new edition of AWWW right now. The purpose of this blurb is that I have frequently had need to refer to something like this (most recently at the TAG F2F). To talk about essence or cars would be completely wrong and would just feed the FUD. If other people want to introduce FUD that's their business, but the deductive contract I've spelled out for dereferenceable absolute URIs is clear and provides definite, actionable consequences. Expressing those consequences in RDF would be possible but harmful since it would suggest that there is something special about RDF.

Alan has asked me to use a noncolliding word instead of "information resource" and I may yet do so, although I think my definition is probably much closer to actual usage than AWWW's is, so not likely to cause confusion among people who haven't been tainted by AWWW.

Anyhow I agree that trying to define a local term is distracting, and I've removed the definition.

DBooth: 3. I think it would be better just to refer to the AWWW definition of "URI owner" instead
of providing a new definition here.

JAR: I don't like AWWW's "URI owner" but have reverted it to avoid distraction. May get burned by it though.