DRAFT AWWSW Progress Report

Jonathan Rees, David Booth, Michael Hausenblas
25 May 2010

A report of the AWWSW Task Force.

This is a draft.

Introduction

The AWWSW "task force" has been working to clarify a number of thorny questions of web architecture. These include

  1. For any particular entity, what might we expect, or not expect, when a URI that names it is used as the request URI in an HTTP GET request?
  2. Under what circumstances should an http: (or https:) URI be used to name something, in particular a metadata subject?
  3. What is "best practice" around the interaction between HTTP redirects and the things designated by redirected URIs?
  4. How might designation (the use of a URI to designate a thing) be communicated and/or discovered, and with what authority?

We cannot recommend any particular "best practice" yet, so for the time being the goal is to put all positions in a common framework so that they can be rationally compared.

The purpose of the project is to apply a rudimentary formal approach to the debates around location/designation best practice. The idea is to record as a logical statement each opinion that has been expressed, which can then either be accepted as a new axiom or not. The benefit of this approach over informal natural-language debate is that, if it works, it will turn fruitless arguments over definitions and boundary cases into a "cafeteria" of possible solutions, and instead of talking about what is right or wrong, we can turn our attention to solution selection based on engineering and pragmatic concerns.

While the main audience is users of RDF and OWL, the framework applies to web architecture as a whole and ought to apply equally to any language or protocol that uses URIs to designate entities of any kind.

The exposition applies the axiomatic method. By this we mean nothing is to be taken for granted - if some result can't be proven from axioms already stated, it is not to be assumed to be true. If you think an important statement is true but not provable here, please email it to jar@creativecommons.org.

URIs for metadata subjects

The volume of structured information about document-like objects, including web pages, articles, books, images, audio and video recordings, data sets, and databases, is poised to increase in the next five years, spurred by ongoing work on standards for metadata formats and protocols, including DC, BIBO, CITO, FRBR, genont, ORE, IAO, IRW, POWDER, and LRDD. These objects - we will call them "metadata subjects" even though this may not be quite accurate - must be designated (named) somehow in structured descriptive records, i.e. it must be possible to give a metadata subject a name and explain which thing is named by that name.

[Undecided whether 'designated' or 'named' is better]

When a metadata subject is made available on the web it may end up being designated by an actionable http: URI that is somehow associated with it, as web publishers may be strongly inclined to reuse an existing web URI rather than 'mint' a different kind of designator for the purpose, such as a URN, handle, or http: URIref with fragment id.

Understanding 'best practice' around the relationship between these two uses of URIs - as locators and as designators - is therefore necessary in order to avoid ambiguities and mistakes. But converging on 'best practice' for naming metadata subjects has eluded the community for years due mainly, in my opinion, to the inability of the parties to the debate to define terms carefully and communicate their assumptions clearly.

To take a concrete example: Suppose Alice writes a book, publishes it, and puts it on the web at URI U1. There are many notations that Bob, a metadata expert, might use for preparing descriptive metadata about Alice's book (author, title, publication date, etc.). In RDF (OWL, etc.) it is desirable (for the purpose of data linking) to designate the subject of such metadata assertions using a URI U2. Is it a good idea for Bob (with the cooperation of U1's "URI owner") to take U1 = U2, i.e. use the same URI both with HTTP and with RDF? Is the book necessarily different from the "resource" that RFC 2616 talks about? Does it matter?

A variation on this problem: What if GET U1 delivers not the book itself but a brief abstract of the book? Is it still a good idea to take U1 = U2?

There are many variants of these questions, with different kinds of metadata subject substituted for 'book' and responses of varying degrees of pathology substituted for the content carried in the response.

Web architecture review

We won't make much progress without a rigorous treatment of web architecture, even if we end up disagreeing with it, since all the relevant specifications and discussions assume it.

(Indented text is meant to give motivation, or suggest a way in which the formal treatment might be informally conceived or interpreted. It should be taken as "informative" rather than "normative". If you think the indented remarks entail some interesting truth, by all means propose a new axiom or theorem to capture that truth.)

Let Thing be the universe of discourse, i.e. for all x, x is a Thing.

We use the word "thing" instead of "resource" to avoid confusion with the many mutually incompatible definitions of "resource".

Let 'RR' be a proper subclass of Thing.

Think of RRs ('REST-representations') as being similar to HTTP 'entities' - they consist of content and a few headers such as Content-Language and Content-Type. If R and S have the same content and headers then R = S. We'll figure out the details of encodings, which response headers to include, and so forth, later on.

Note that these RRs are not "on the wire" or associated with particular events or messages or "resources". A single RR might be stored in a file or transmitted multiple times. To talk about events one might have a different class, say a particular arrival of a RR at some agent.

Posit a relationship, which I'll call 'W', between some Things and some RRs. Write W(T,R) to mean that W holds between T, a Thing, and R, a member of RR.
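
To make this concrete, here is a minimal sketch (in Python, and purely informative) of how RRs and W-statements might be modeled. The names RR, assert_W, and the particular header fields are ours, chosen for illustration; the value equality of the dataclass mirrors the stipulation that RRs with the same content and headers are identical.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class RR:
        """A REST-representation: content plus a few headers.
        Two instances with the same fields compare equal, as stipulated."""
        content: bytes
        content_type: str
        content_language: str = ""

    # W is modeled extensionally as a set of (thing-term, RR) pairs; Things are
    # represented here only by the logical terms that name them (see the <...>
    # convention below), written as strings.
    W = set()

    def assert_W(thing_term, rr):
        W.add((thing_term, rr))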

Stipulate that a GET/200 HTTP exchange expresses a W-relationship between a Thing and an RR. That is, let us interpret an HTTP exchange with request GET U and a 200 response carrying R, where R is an RR, as another way of writing the logical statement W(<U>,R). Here <U> is a term in our meta-language that refers to the "requested resource" (for a deconstruction of which see below).

Suppose, for example, that U is "http://example.org/z". Then a request GET http://example.org/z followed by a 200 response carrying an RR R is interpreted as W(<http://example.org/z>, R). Nothing is said about what <http://example.org/z> "means" or "refers to".
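
Purely as an informative sketch of the stipulation (reusing the RR class and assert_W from the sketch above; the helper name get_exchange is ours), one might mechanically record a W-statement whenever a GET yields a 200:

    import http.client
    from urllib.parse import urlsplit

    def get_exchange(uri, headers=None):
        """Issue GET uri; on a 200 response carrying R, record W(<uri>, R)."""
        parts = urlsplit(uri)
        conn = http.client.HTTPConnection(parts.netloc)
        conn.request("GET", parts.path or "/", headers=headers or {})
        resp = conn.getresponse()
        body = resp.read()
        conn.close()
        if resp.status == 200:
            r = RR(content=body,
                   content_type=resp.getheader("Content-Type", ""),
                   content_language=resp.getheader("Content-Language", ""))
            assert_W("<%s>" % uri, r)  # the exchange read as W(<uri>, R)
            return r
        return None   # not a GET/200 exchange; no W-statement recorded

    # get_exchange("http://example.org/z") would, on a 200 carrying R, add
    # W(<http://example.org/z>, R) to the set of statements we may assume.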

We are not saying what W 'means' - merely that whatever it is that an HTTP GET/200 exchange means, that is what W means. The exchange and the W-statement have the same consequences.

A GET/200 exchange says more than just a W-statement: the response headers also tell us about cache control and other circumstances related to the exchange. Let's ignore these details for now.

We're using the notational convention <...> only as a way to choose constants in our meta-language for use in W-statements. Any other consistent rule for choosing logical terms for the things designated by request URIs in HTTP exchanges would work just as well. Later on we'll look more carefully at how URIs map to things.

There is a view that URIs in HTTP are purely operational and either do not designate anything or designate exactly some member of RR delivered in a recent GET request. For example, see this message and the email thread that contains it. In this view "resources" (here, the referents of <...> terms in our logic) are unnecessary theoretical constructs that don't exist. However, if we understand Pat correctly, "designation" ("naming", etc.) refers to the linguistic behavior of a community, and we have at least two communities that use http: URIs to designate: the expositors of web architecture and the specifications derived from it, and users of RDF and OWL. For web architecture, the resource theory is a useful framework for explaining recommended practice, and for RDF and OWL the question of whether resources exist or are necessary is hardly of consequence - if the theory helps in getting work done (for example, in proving consistency of a set of axioms) that's adequate justification, just as it is in mathematics or physics.

A series of HTTP exchanges may then be understood as a set of statements (perhaps statements that someone might take to be axioms of some theory), and we can consider the consequences of those statements within an axiomatic system of the kind that we're developing here.

Whether we choose to assume a W-statement W(<U>,R) derived from an HTTP exchange, and act on its consequences, is another story. (Consider a buggy or malicious proxy. HTTPbis starts to address believability by trying to specify a notion of 'authority', see below.) Issues of trust and authority will be treated separately, if we get around to them at all. A statement can still be meaningful without being believed.

We might fudge this by speaking of "credible" HTTP exchanges without saying exactly what that means (as indeed one cannot say).

At this point in the treatment, W-statements have no consequences so it hardly matters whether we assume them or not.

The relationship between a Thing and an RR, as expressed by a GET/200 exchange and here called W, is discussed in RFC 2616, in AWWW, and in Fielding & Taylor's writings. It is described in a variety of ways, but nowhere is the relationship defined - thus the need for reverse engineering.

A single RR can be W-related to multiple Things, i.e. there exist T, T', R such that W(T,R) and W(T',R) and T != T'.

Example:
T = <http://www.w3.org/TR/2004/REC-rdf-mt-20040210/>,
T' = <http://www.w3.org/TR/rdf-mt/>,
R = {an RR whose content is 233076 octets long}

Some people might want to argue with this by saying that REST-representations are necessarily "on the wire" or something like that. Well, that kind of REST-representation is not what we're talking about.

W is not functional: A single Thing can be W-related to more than one RR, i.e. there exist T, R, R' such that W(T,R) and W(T,R') and R != R'.

Example: any web page subject to content negotiation, e.g. http://xmlns.com/foaf/0.1/ or http://www.debian.org/intro/cn

Example: a bank uses a URI http://bank.example.com/balances to show a customer's account balances. If ten different customers are logged in to the bank's site at the same time, and all browse to that URI, each will get a different response, resulting in W(T,R1), W(T,R2), ... W(T,R10) (where T = <http://bank.example.com/balances>).
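
An informative sketch of how such non-functionality shows up in practice, using the get_exchange helper from the earlier sketch (whether a given server actually negotiates, rather than redirecting or ignoring the header, is of course an empirical matter):

    # Two GETs of the same URI with different request headers may yield
    # distinct RRs R1 != R2, both W-related to the same Thing T.
    r1 = get_exchange("http://www.debian.org/intro/cn", {"Accept-Language": "en"})
    r2 = get_exchange("http://www.debian.org/intro/cn", {"Accept-Language": "fr"})
    # If both responses were 200 and r1 != r2, we have learned
    # W(<http://www.debian.org/intro/cn>, R1) and W(<...>, R2) with R1 != R2.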

These W-relationships are a consequence of web architecture as expressed in RFC 2616 and AWWW.

GET/200 exchanges are only one way to express W(<U>,R). The absence of such an exchange as a historical fact does not necessarily imply that W(<U>,R) is false. For example, it might be useful to assume W(<data:,x>, R1) where R1 is an RR with content-type text/plain and content "x", even when no GET/200 exchange has stated this.
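
For data: URIs the RR can be computed from the URI itself (per RFC 2397), with no exchange at all; a small informative sketch, with the helper name ours:

    import base64
    from urllib.parse import unquote_to_bytes

    def data_uri_rr(uri):
        """Compute the RR that a data: URI carries in itself (RFC 2397)."""
        assert uri.startswith("data:")
        meta, _, payload = uri[len("data:"):].partition(",")
        if meta.endswith(";base64"):
            content = base64.b64decode(payload)
            meta = meta[: -len(";base64")]
        else:
            content = unquote_to_bytes(payload)
        return RR(content=content,
                  content_type=meta or "text/plain;charset=US-ASCII")

    # data_uri_rr("data:,x") == RR(b"x", "text/plain;charset=US-ASCII"), so
    # W(<data:,x>, data_uri_rr("data:,x")) can be assumed without any GET/200.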

A trivial consequence of the stipulation is that if W(<U>,R) is true (i.e. if a GET U/200 R exchange is 'true') then <U> is a Thing. But this is just because the domain of W is Thing (or a subclass).

You can at this point deny that a particular URI U designates a Thing by just denying the truth of all statements involving <U>, in particular W(<U>,R). The more careful one is about which W-statements are assumed, the easier it will be for the W-statements that are assumed to be consistent with a restrictive theory of W.

The equivalence of GET/200 and W is a powerful constraint on W, when one takes actual web behavior to be 'true'. A server can produce whatever 200 responses it likes for a URI that it controls, and not violate protocol. That is, for any sequence R1 ... Rn of RRs concoctable by a server, the server might, for some URI U, respond to a series of n GET U requests with 200 R1 ... 200 Rn. The resulting W-axioms - should we choose to assume them - would imply the existence of a Thing <U> satisfying W(<U>,R1), ... W(<U>,Rn).

No number of GET U/200 R exchanges can tell you exactly what <U> is - they do not "identify" <U> in the dictionary sense of "establish the identity of someone or something". There are several reasons for this.

  1. The absence of a GET/200 giving W(T,R) does not mean that W(T,R) isn't true.
  2. Two Things T,T' could have W(T,R) but not W(T',R) for some REST-representation R not hitherto obtained by a GET/200 exchange.
  3. T and T' could satisfy {W(T,R) iff W(T',R)} for all R in RR, and still be different.

Information distinguishing such Things, if it were available, would have to come from a different source (e.g. RDF).

What is not provable at this point

We haven't assumed a way to falsify any W-statement. That is, there is no way to infer that W(T,R) does not hold. Therefore this theory is satisfiable by an interpretation in which a single Thing T has the property that W(T,R) for all REST-representations R.

(Well, the rdf-mt example could be taken to be an axiom that forces the existence of two Ts.)

We don't have any consequences of W(T,R) yet. This doesn't mean that there aren't any, just that it will take work to figure out what they are.

Note on time

Although W is time-sensitive, we'll ignore time for now, since accounting for it is not yet helpful and notationally it gets in the way. Later we'll redo the treatment to take time into account.

So W is OK as a binary relation from Things to RRs for now. Later it might be a ternary relation W(X,R,t); or operate on timeslices of Things, W(slice(X,t),R); or use the 'correspondence' idea of our earlier report; or we might switch to temporal logic.

Note on RDF / OWL

RDF and its variants are just vectors for various fragments of first-order logic (FOL), and the ordinary mathematical notation for FOL has advantages for expository purposes (it is more expressive than RDF and OWL and clearly not tied to them), so it is better to start with FOL and then render it in RDF (and/or other notations) later on.

Properties of W

The TAG's httpRange-14 decision casts uncertainty on location/designation practice by advising for or against the use of GET/200 depending on what the URI names. At the same time it (a) fails to explain which things are restricted, (b) fails to prove the existence of the claimed ambiguity (between a thing and a web page about it), and (c) fails to resolve the ambiguity when the thing itself is an "information resource". It appears that the intent is that a GET should retrieve something that is "like" the resource, yet the "like" relationship isn't specified adequately to explain the use of 200 responses with entities defined by the various metadata ontologies.

There is enough confusion over the use of http: URIs to name such entities, and over the "resource" theory that justifies it, that web publishers are likely to either ignore the httpRange-14 advice altogether and use http: URIs in situations where such use might lead to problems, or use distinct URIs for the object and for the web page when a single URI would have sufficed.

The admissibility of http: URIs sans fragment id to name things that aren't "information resources", and the use of the 303 response to support this, is only a side issue, since even if you reject this practice and commit to # URIrefs for non-"information resources", you still have to determine that something is an "information resource" before you are allowed to use fragmentless URIs and 200 responses.

A goal here is to render the ambiguity (between a thing and a description of a thing) referenced in the resolution as a contradiction within the axiomatic exposition. This goal has not yet been achieved.

Let WR be a proper subclass of Thing containing the domain of W, i.e. suppose W(T,R) implies that T is in WR. That is, we may not know what W means, but suppose that being in the domain of W implies membership in the class WR.

WR plays the role of "information resource" - if you're not in WR you can't be W-related to a RR and a GET/200 won't be "true". Again, we don't know what WR is; the approach is to take WR as an unknown to be solved for.

WR might also be a subset of the "metadata subjects" of the introduction, above.

That WR is a proper subclass of Thing is implied by the httpRange-14 resolution read in conjunction with AWWW 2.2 (cars are not in WR).

WR = domain of W (every member of WR is W-related to some R) would be a plausible axiom, but I'm not sure it's needed.

Axioms proposed for WR:

  1. TimBL: "generic resources" such as the King James Bible (see genont) are in WR
  2. TimBL: literary works are in WR
  3. Pat: literary works are not in WR
  4. Larry M (?): WR is web resources only. Web pages, network resources, whatever. Not literary works.
  5. AWWW 2.2: cars, dogs, and printed documents are not in WR (by extension: anything physical)
  6. TimBL: strings and numbers are not in WR (by extension: anything mathematical)
  7. Pat: RDF graphs are not in WR (follows from prohibition of mathematical things)
  8. TimBL: RR is disjoint with WR (JAR doesn't understand the point of this)
  9. ?: Logical predicates (i.e. classes and so-called "properties") are not in WR (that is, WR is a subclass of owl:Thing)
  10. Mark Nottingham / Lisa Dusseault: at least some logical relations are in WR [need reference to thread]
  11. TimBL: members of WR are not determined by their W-relations. I.e. one might have W(T,R) iff W(T',R) for all R in RR, yet T != T' (time sheet example, recall AWWSW discussions a while back of the "trace" of a resource and of "phlogiston") (Pat: "sad, if true") (JAR: If members of WR must have phlogiston, this means that data: URIs can't refer to members of WR!)

None of these axioms (except the conflicting ones about literary works) provides much help with classifying metadata subjects (those treated in DC, FRBR, and so on) as being in WR or not in WR.
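
The conflict can nonetheless be rendered mechanically, in the spirit of the introduction: record each opinion as an axiom and check what follows. Here is an informative sketch using the z3 solver's Python bindings (the predicate and constant names are ours); adopting axioms 2 and 3 together with the existence of a single literary work is unsatisfiable.

    from z3 import (DeclareSort, Function, BoolSort, Const, ForAll,
                    Implies, Not, Solver)

    Thing = DeclareSort('Thing')
    WR = Function('WR', Thing, BoolSort())
    LiteraryWork = Function('LiteraryWork', Thing, BoolSort())
    x = Const('x', Thing)

    s = Solver()
    s.add(ForAll([x], Implies(LiteraryWork(x), WR(x))))       # axiom 2 (TimBL)
    s.add(ForAll([x], Implies(LiteraryWork(x), Not(WR(x)))))  # axiom 3 (Pat)
    kjv = Const('kjv', Thing)            # some literary work, say the KJV Bible
    s.add(LiteraryWork(kjv))
    print(s.check())                     # unsat: the two axioms can't both be adopted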

There are various theories of W in the works that are all intended to (a) sharpen the axioms for WR, (b) explain why you would need the existence of x such that x is not in WR, and (c) help explain what distinguishes acceptable W-statements from incorrect ones. Here are some.

Says

Dan C's speaks-for theory: W(T,R) means T 'says' R in the sense of ABLP logic [need ref]. This means that T is a "principal" - anything that's not a principal can't be in WR - and that R is attributable to ("said" by) T, which ought to place a constraint on W.

The relation to ABLP logic, which would turn T into an ambient authority, suggests vulnerability to confused deputy attacks. This in turn suggests that what we infer from W(T,R) should follow only from verifiable (encrypted or signed) statements residing in R and not from anything we know about T.

On the web

Alan's what-is-on-the-web theory. This says that WR consists of those kinds of things that http: URIs actually designated before we started making them designate exotic things using RDF. Work in progress.

Property sharing

JAR's property-transfer-from-representation-to-resource theory. This says that if W(T,R), then T has some properties in common with R. (Unfortunately, which properties is not clear.) Something that can't share properties with any RR can't be in WR. An axiom of the form if W(T,R) and dc:creator(R,C) then dc:creator(T,C) would make dog(<U>) inconsistent with GET U / 200 R assuming dogs are not in the domain of dc:creator. (Not sure if this is right, but consider it.) Inspired by the way the FRBR hierarchy works. Work in progress.
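
An informative sketch of how the dog example might be worked through (the fact names, and the assumption that R has a stated dc:creator, are ours):

    # Facts from an exchange GET U / 200 R, plus some RDF about R and about <U>.
    facts = {
        ("W", "<U>", "R"),
        ("dc:creator", "R", "Alice"),    # assumes the RR has a stated creator
        ("dog", "<U>"),
    }

    def property_transfer(facts):
        """Proposed axiom: W(T,R) and dc:creator(R,C) imply dc:creator(T,C)."""
        derived = set(facts)
        for (_, t, r) in [f for f in facts if f[0] == "W"]:
            for (_, r2, c) in [f for f in facts if f[0] == "dc:creator"]:
                if r2 == r:
                    derived.add(("dc:creator", t, c))
        return derived

    derived = property_transfer(facts)
    # Dogs are assumed not to be in the domain of dc:creator, so this clashes:
    clash = any(f[0] == "dc:creator" and ("dog", f[1]) in derived for f in derived)
    print(clash)   # True: dog(<U>) plus the transfer axiom is inconsistent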

Head-in-sand

The theory in which W(T,R) iff T = R. (Pat reluctantly relaxed it to also admit T = a modest generalization of R.) This has the virtue that an HTTP response does identify the referent of the URI. It has limited expressive power, but if it is accepted, a new class of entities could be introduced to recover the ability to talk about variation over time, multiple sessions, conneg, and so on.

Not a problem

There are also dissenters who say that if there is ambiguity, restricting WR is not the way to solve it.

Communication

Work in progress

An agent may, as we have been doing, formulate its own theory of the world - its own set of axioms that can be fed into some deductive engine and acted upon. If it needs information held by another such agent, it needs to communicate. To communicate, it needs a language, and expressive languages require 'words'. In the web architecture approach these 'words' are URIs.

There is no requirement that an agent use URIs in its own internal deliberations (axioms / theorems), but when it communicates it needs to choose URIs that will be understood by the agent it consults.

There is also no requirement that agents' theories be consistent with one another, or even that each URI designates the same thing to each communicating agent. Some amount of ambiguity of reference is unavoidable. The bottom line is whether successful communication is taking place, as questions of whether a word means the same thing to two agents are unanswerable in general. [need elaboration. start with Hurd's paper. note connections to ABLP and modal logic. bow to Wittgenstein. try to keep can of worms closed.]

Assume that we don't need to worry about choosing different URIs for different interlocutors; that is, for any thing, the same URI for that thing can be used with all interlocutors. This seems reasonable given the web architecture goal of global naming, although it is easy to see how it might go wrong. We could develop a theory without this assumption but it would be unnecessarily complex for this level of analysis.

End of informal preliminaries...

Assume a proper subclass of Thing whose members are URIs... [hmm, call it URI perhaps?] [include also URIrefs of the form URI#fragid?]

Assume a relation hasURI(T,U) relating Things to URIs.

Write hasURI(T,U) to mean that we can use the URI U with our interlocutors to refer to thing T. This is not to say that we necessarily know what T is, or that our interlocutors also refer to T using U.

(This is a bit confused; we may need two relations, one for mapping URIs we get from others, and one for generating URIs to transmit to others. And choice of URI may depend on purpose: it makes sense to use a temporary redirect target for an HTTP request, but not for a SPARQL request. Operational details need working out.)

Following the notational convention introduced above for GET/200 exchanges, we would write, for example: hasURI(<http://example.org/>,"http://example.org/")

If hasURI(T,"http://example.org/"), then we might issue a request GET http://example.org/ , obtaining R, hence learning W(T,R) (assuming the server is to be believed). (Again, this doesn't tell us what T "is" except under the Pat/Ian theory of WR.)

HasURI is not total.

There are things that do not "have" URIs.

HasURI is not inverse total.

There are URIs that the community does not "understand."

It ought to be useful for us to assume that hasURI is inverse functional, i.e. for any U there is at most one T with hasURI(T,U).

I.e. we (privately, within the theory we are building) "understand" any given URI as naming at most one thing (AWWW 2.2.1).

OK, now we should be in a position to talk about how 'hasURI' changes over time, e.g. through PUT/201, DELETE/200, 410, redirects, and nose-following. To account for these, the hasURI relation needs to be time-indexed just as W is.

An exchange GET U / 301 Location: V should cause hasURI(T,U) and hasURI(T',V) to stop being true at the time of the exchange and should cause hasURI(T,V) to start being true. This is a relation between U (or T?), V, and the time(s) at which the exchange occurs: [UNDEBUGGED]

PermanentRedirect(T,V,t1,t2)

There is some wobble around just what t1 and t2 are. Dan C will tell us to do the analysis in terms of events, not times, and he's probably right.

The exchange GET U / 307 Location: V should cause hasURI(T,U) and hasURI(T',V) to stop being true at the time of the exchange, and hasURI(T,V) to start being true for the duration specified by the cache control information in the response. If there is no cache control, we have to be more operational (you'll want to use the redirection at least once, probably). 302 is the same.

PUT U R/201 (created) should cause hasURI(T,U) to start being true at the time of the request, where T is some thing such that W(T,R).

If we have a time-sensitive exists(T) predicate then it should transition from being false to being true.

DELETE U/200 is a funny one. exists(T) should go from true to false, and we might or might not want hasURI(T,U) to go to false.
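
As a very rough and purely informative sketch (the rules above are explicitly undebugged), the time-indexed relation could be modeled as an event log, in line with Dan C's suggestion to analyze in terms of events rather than times; all the names here are ours:

    # Each entry is (event_index, "start" or "stop", thing_term, uri).
    events = []

    def has_uri(thing_term, uri, at):
        """Whether hasURI(thing, uri) holds at event index 'at', per the log."""
        state = False
        for (i, kind, t, u) in events:
            if i <= at and (t, u) == (thing_term, uri):
                state = (kind == "start")
        return state

    def permanent_redirect(u, v, t_of_u, t_of_v, i):
        """GET u / 301 Location: v at event i, as the (undebugged) rule reads:
        hasURI(T,u) and hasURI(T',v) stop; hasURI(T,v) starts."""
        events.append((i, "stop", t_of_u, u))
        events.append((i, "stop", t_of_v, v))
        events.append((i, "start", t_of_u, v))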

These are all predicated on our "believing" (assuming) the exchange, which is a private matter possibly influenced by beliefs in the security of the network and/or the "authority" of the response (per HTTPbis http://tools.ietf.org/html/draft-ietf-httpbis-p1-messaging-09#section-2.6.1 ).

Nose-following

A community using a vocabulary may find a dictionary helpful. A dictionary is not necessarily an authority but rather is likely to be an imperfect but useful guide to usage and meaning.

Ideally a dictionary tells you how your community is using words, but as language is constantly changing through coinage, misuse, semantic drift, and reinterpretation, there is always likely to be some variance between what the dictionary says and what the community practices.

Community agreement on the "authority" of a dictionary (as when one is chosen to adjudicate disputes in Scrabble) is possible but the circumstances in which this authority is to be respected must be clearly circumscribed.

The principle of nose-following is that the web itself can act as a dictionary for URIs. An informally specified method maps URIs to "descriptions" (smallish documents, similar to dictionary entries) that contain structured or unstructured components, or both. The sum of all such descriptions constitutes the virtual dictionary.

We might write: hasDescription(U,T) meaning thing T is a description associated with URI U. (This relation is also time-sensitive.)

Method to find D where hasDescription(U,D)

  1. If U = V#fragid then hasDescription(U,<V>)
  2. Using the HTTP protocol if necessary, either find R such that W(<U>,R) or find D such that hasDescription(U,D)
  3. If 200 R, then hasDescription(U,D) where D consists of a rendering of the statement W(<U>,R)
  4. Else give up

"Using HTTP protocol" means following redirections, and interpreting GET U / 303 Location: D as expressing hasDescription("U",D). This one is a bit iffy, but is sanctioned by HTTPbis. [Should probably work this out in more detail.]

To match current practice the word "description" has to be interpreted quite liberally. For example one often obtains in this way a single document (a vocabulary or ontology) describing many URIs, not just the one in question.

Not coincidentally, the nose-following rule is consistent with the webarch "URI ownership" idea, since the description comes from the URI owner and may permit someone reading it to make an "association" of a resource chosen by the URI owner with the URI. (The URI owner is the party responsible for controlling HTTP responses to requests on <U>.)

Oddly, this method could be made to work with non-http: URIs, but generally HTTP servers only deliver useful responses for http: URIs (with the exception of proxy servers that handle ftp:).

When the method is formulated as above, one obtains an explanation of what a redirect with Location: uri#fragid has to mean, assuming parsimony.

If you have a way to turn the description into axioms to feed into your working theory, then you might choose to assert these axioms. However, this should only be done if the consequences of possible mischief are commensurate with the degree to which you trust the source.

The description you get for a GET/200 is pretty unsatisfying (except under the Pat/Ian theory) - you learn that the thing satisfies this one W-statement, but not much else. The extension that has been floated to address this is to add LRDD (Link: + .well-known/host-meta) to the nose-following method. Then one may get an arbitrary description even when a GET yields a 200 response.

The quality of this virtual URI dictionary might be expected to be highly variable. Actual use of a URI in the community may bear no resemblance to the URI's description, especially if the description has gone through rounds of revision. Entries may be missing (404), internally inconsistent, or mutually inconsistent. As with any dictionary it is buyer beware; unfortunately, as compared to (say) wiktionary, there is no organized method for fixing this one.