AwwswHome/Late2010

From W3C Wiki

Toward a next draft

Make a pass over this to separate discourse level from meta-level... identify the two different "we"s

Metadata use cases

See W3C Metadata Activity Statement - a good argument for web metadata.

We want to be able to write metadata, and we want to use a document's URI as the subject in statements.

Terms that we will need

T3. URI (class)

A string that is a syntactically valid URI. We can use the datatype xsd:anyURI for this class without going too far wrong.

(check to see what Tabulator does.)

T2. Representation(?) - what gets delivered (class)

The payload of a message. Content (bits) together with directives specifying the intended interpretation of the content (media type, language, possibly other metadata).

Roughly the same as "entity" in RFC 2616 or "representation" in AWWW or HTTPbis.

These things can be copied, cached, retransmitted by a proxy, etc.

Stuart and Noah prefer to reserve the word "representation" for the bits while they're "on the wire" (cannot be copied, cached, etc.).

A similar question is whether these are formal entities or if the bit-identical "representations" can be considered different on account of their provenance.
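A minimal Python sketch of the "formal entity" reading of T2, where a representation is just a value made of content plus directives. The class and field names are illustrative assumptions, not taken from any of the cited specifications.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Representation:
    """T2: content (bits) plus directives for interpreting the content."""
    content: bytes
    media_type: str
    language: Optional[str] = None


# Two bit-identical payloads with the same directives, obtained separately
# (say, one from the origin server and one from a proxy cache).
from_origin = Representation(b"<p>Hello</p>", "text/html", "en")
from_cache = Representation(b"<p>Hello</p>", "text/html", "en")

# Under the "formal entity" reading they are the same T2: equality is by
# value, and provenance plays no part.
same_representation = from_origin == from_cache

# The same bits under a different media type make a different T2.
as_plain_text = Representation(b"<p>Hello</p>", "text/plain", "en")
differs_by_directive = from_origin != as_plain_text
```

If provenance were meant to matter, the class would need an extra field (for example, the exchange that produced the payload), and the two values above would no longer compare equal.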

AWWW defines "representation" differently, as does RFC 3986, as does RFC 2616.

Not sure what Tabulator does about this class; I suspect it talks about responses rather than T2s, but I haven't checked.

T1. dereferencesTo

U T1 Z means that U (a URI) can legitimately dereference to Z (a representation) a la RFC 3986. (We'll have to reverse the polarity to get this into RDF.)

In particular, if we observe that U dereferences to Z, and believe that our infrastructure is behaving correctly, then we can deduce that U T1 Z. In turn, if we know that U T1 Z, then we can become part of the infrastructure and relay this information to someone else.

In HTTP, dereferencing is associated with GET U / 200 Z exchanges. The representation is chosen or generated by a server, where the choice of server may be nondeterministic if multiple DNS A records, load balancing routers, etc. are involved.

Another way to deduce U T1 Z is by exploiting caching parameters and protocols.
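One possible reading of "exploiting caching parameters", sketched in Python: a stored 200 response whose Expires time has not yet passed is taken to license U T1 Z without a fresh GET. The record type and function name are hypothetical, and judging freshness by Expires alone (ignoring Cache-Control, Date, and clock skew) is a simplifying assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


@dataclass(frozen=True)
class CachedResponse:
    """A stored 200 response (hypothetical record, not a real cache API)."""
    uri: str
    body: bytes
    expires: str  # raw value of the Expires header


def deducible_from_cache(entry: CachedResponse, uri: str, now: datetime) -> bool:
    """True if the stored entry still licenses `uri T1 entry.body` at `now`.

    Assumption: freshness is judged by the Expires header alone.
    """
    return entry.uri == uri and parsedate_to_datetime(entry.expires) > now


entry = CachedResponse(
    uri="http://example.org/doc",
    body=b"<p>Hello</p>",
    expires="Wed, 01 Dec 2010 16:00:00 GMT",
)
before = datetime(2010, 12, 1, 12, 0, tzinfo=timezone.utc)
after = datetime(2010, 12, 2, 12, 0, tzinfo=timezone.utc)

fresh_before = deducible_from_cache(entry, "http://example.org/doc", before)
fresh_after = deducible_from_cache(entry, "http://example.org/doc", after)
```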

According to HTTPbis, only responses "authorized" by the domain owner are legitimate. The assumption is that the server hardware is acting on the domain owner's behalf.

AWWW generalizes "authorization" to other URI schemes with its "URI owner" concept.

We probably have to assume the ICANN DNS root and adherence to all appropriate specifications such as TCP/IP.

On factoring dereference.

The central dogma of web architecture is that whenever U T1 Z (U dereferencesTo Z) holds, there is an intermediate entity involved, called a "resource" in RFCs 3986 and 2616. The resource is said to be "located at" U (in URI space, that is) and Z is said to be a "representation" of the resource. That is:

U dereferencesTo Z.   if and only if    [locatedAt U] hasRepresentation Z.

There is no way to test whether this is true, as the URI owner may have arranged for U to dereference to Z in the absence of any such resource. In the case of HTTP, however, it is impossible to follow the letter of the specification without coming up with *some* resource as the one "located at" U, so we can assume that, if challenged, the URI owner ought to be able to say what the resource is. Even so, the choice of resource is arbitrary, as the specifications give no way to falsify any statement of the form R hasRepresentation Z.

In fact, the specs give us no way to prove that there is more than one such resource. A consistent interpretation would be for the server to simply look at the URI and deliver a representation based on that, choosing the resource at random. Another consistent interpretation would be to take "located at" to be the identity, equating resources and URIs (as many people do anyhow).
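The factoring can be modelled as composition of two relations. The toy Python sketch below shows that an interpretation with owner-designated documents and the identity interpretation (resource = URI) both reproduce the same observable dereferencesTo relation, so nothing observable distinguishes them. All URIs, resource names, and representation labels are made up for illustration.

```python
def compose(located_at, has_representation):
    """Return {(U, Z)} such that some R is located at U and has representation Z.

    located_at:         set of (resource, uri) pairs
    has_representation: set of (resource, representation) pairs
    """
    return {
        (uri, z)
        for (r1, uri) in located_at
        for (r2, z) in has_representation
        if r1 == r2
    }


# Observed behaviour: what each URI dereferences to (T1).
dereferences_to = {
    ("http://example.org/a", "Z1"),
    ("http://example.org/a", "Z2"),
    ("http://example.org/b", "Z3"),
}

# Interpretation 1: the URI owner names a document behind each URI.
located_at_docs = {
    ("doc-A", "http://example.org/a"),
    ("doc-B", "http://example.org/b"),
}
has_rep_docs = {("doc-A", "Z1"), ("doc-A", "Z2"), ("doc-B", "Z3")}

# Interpretation 2: "located at" is the identity, so the resource is the URI.
located_at_id = {(u, u) for (u, _) in dereferences_to}
has_rep_id = set(dereferences_to)

both_consistent = (
    compose(located_at_docs, has_rep_docs) == dereferences_to
    and compose(located_at_id, has_rep_id) == dereferences_to
)
```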

(still mulling this over.)

Why is this factoring useful - why talk about resources rather than URIs?

  • No one should care about the URI, only about associated properties and behavior. (Ideally, permuting the entire URI space ought to cause no change in the behavior of the web.) (Compare "referential transparency".)
  • Separating a thing from its location facilitates relocation (and reasoning about same)
  • Addressing systems other than URIs ought to be admissible (e.g. in some successor to the web, or in a logic or programming language)
  • This is how the web specs are written, so useful or essential consequences may have already been drawn from the factoring (inertia)

Clearly, total freedom to designate the resource independently of retrieved representations is not what was intended. Some observations...

  • The architecture was designed with information retrieval systems in mind, so this is the primary use case. The representation is a specific incarnation of the resource, which is a more or less generic digital object.
  • In the information retrieval case, resources are documents (or more generally, digital objects). They are dynamic by design. Conneg was introduced as a way to get URI portability: to future-proof the Web, promote internationalization, and accommodate variation in client capabilities.
  • Roy's REST model sets up a similar framework that is useful if you like it. REST can be gamed and is hard to falsify.
  • Not clear whether continuous sensors and random number generators fit in the REST model.
  • Not clear whether REST, which is a "software architecture", has any business in an ontological treatment or in RDF.
  • Sessions (client-modulated content) completely break the connection between a URI and any particular digital-object-like resource.
  • Servers can do whatever they please - they are not obligated to respect REST or "information retrieval". Example: Random page.
  • With POST there can be an effect beyond just one resource or even its server(s) (e.g. send email, control robot). Is POST behavior a property of a "resource", or just something the server does on the side?
  • ?queries, CSS, Javascript, AJAX, plugins, and so on make the boundaries and nature of the "resource" even less clear.
  • From these examples it feels as if the resource or a copy of it is somehow situated inside the server(s), but in a way that enables mobility.

So the central dogma creates the problem of explaining what might - or might not - be said to be located at any particular web URI. (By "web URI" I just mean one in the domain of T1.)

The choice could be left unconstrained, in which case anyone could "locate" whatever thing they liked at a web URI they controlled, and assign any representation at all as a representation of the thing. Perhaps some common practice would emerge.

However, the representations of something located at a URI are not meant to be arbitrarily chosen by the URI owner, even if the URI owner is somewhat involved in doing the "representing". The main message of a representation, its "essence", as opposed to inconsequential aspects such as formatting or sidebar advertisements, is supposed to derive from - or ideally *be* - the resource.

(This sort of hand waving is inadequate to constrain "representation". Still struggling.)

Another way to say this: The resource is what is invariant under relocation; its properties are what we want to preserve and talk about, not what is a hosting or conneg accident. When someone talks about a web "resource" (in natural language or RDF):

  • what are they likely to be talking about? (or, what would we like to talk about?)
  • what might they say about it? (or, what would we like to say?)
  • what goals do we want to advance? (or, would others like to advance?)
  • when nothing is already being said, are there certain practices we'd want to encourage over others, in order to advance those goals?
  • to the extent we're creating new linguistic practices, what kinds of constructions are most useful?

But this is all pretty circular.

T5. locatedAt - between a thing and a URI

I suggest calling this particular interpretation relation "located at" as suggested by RFC 2616. This is not to imply physical location, but rather logical location in URI space. (Actually I hadn't realized before how confusing this is: not the same sense as the L in URL.)

T5 is inverse functional, i.e. there can only be one thing on the web at a particular URI.

The resource is one determined by the server implementation (for example, a file?) or otherwise designated by the "URI owner". The URI owner can choose subject to appropriate constraints, namely that for all Z, U dereferencesTo Z implies that [locatedAt U] hasRepresentation Z (i.e., a server under the URI owner's control is limited to delivering representations of the resource). [and probably also some other constraints on PUT and POST and other operations like PROPFIND ??]

If no resource is designated, or the designated resource doesn't make sense, it should be taken as unknown (except to the extent that its representations are known). If multiple resources are designated, all bets are off. If you don't trust the owner to tell you the same resource that s/he tells others, things are bad. This just isn't objective, and I don't see how it can be, because it's all about predictions about what people are going to do.

(The weird thing here is that the owner gets to either choose the resource and then generate representations appropriate for it, or generate representations and then in hindsight decide what the resource is. This seems like too much freedom, not enough objectivity. Weird.)

Popper-like, there is no way (for document-like things at least) to prove that any particular resource is located at a given URI; such a statement can only be falsified, by the retrieval of a representation that is not a representation of the resource. You can be pretty darn sure by investigation of the apparatus (server) and by consulting with those who manage it, but there is always the chance that a non-representation will slip out at some unknown future time. Like all human institutions, web sites are fallible.
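The asymmetry can be sketched in Python as a check that only ever returns "falsified" or "not yet falsified". It assumes an oracle that decides whether a payload is a representation of the claimed resource, which is exactly the part these notes have not pinned down; the substring test used here is a hypothetical stand-in.

```python
from typing import Callable, Iterable


def falsified(
    is_representation_of_claimed: Callable[[bytes], bool],
    retrieved: Iterable[bytes],
) -> bool:
    """True if some retrieved payload is not a representation of the claimed resource.

    A False result means only "not falsified so far"; it never proves the claim.
    """
    return any(not is_representation_of_claimed(z) for z in retrieved)


def looks_like_bleak_house(payload: bytes) -> bool:
    # Hypothetical stand-in for "is a representation of the claimed resource".
    return b"Bleak House" in payload


retrievals = [b"<title>Bleak House</title>", b"Bleak House, chapter 1"]
falsified_so_far = falsified(looks_like_bleak_house, retrievals)

# A later retrieval that is not a representation of the claimed resource.
retrievals.append(b"<title>Hard Times</title>")
falsified_later = falsified(looks_like_bleak_house, retrievals)
```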

T6. Between a thing and one of its "representations"

The other (and more interesting) part of the factoring of T1.

The party line is that the main purpose of the resource is to carry, or to be, 'abstract information', and each representation has to also carry this information, possibly with some degradation but only if required by delivery requirements. The representation can contain other information too, such as advertising, but only 'inconsequential' information that is not a 'main message'.

Even if we have no idea what's located at the URI, we can still say that a 200 response to a GET yields a representation that is T6-related to the thing located at that URI.

Examples:

  • Generic resources (sensu TimBL)
    • A 'fixed resource' has its single 'representation'
    • A 'representation' of a document would be a representation that is recognizable as being that document... same title, author, content
    • Generic resources can relate to one another by 'subresource' relations (degrees of specificity), e.g. two might have the same content but one might be language-invariant and another not. (fixed resource as a special case.)
  • A 'representation' of the thing located at http://en.wikipedia.org/wiki/Special:Random would be a random page. If successive GETs on a URI did not yield random pages, then this random-page thing would not be located at that URI.
  • Everything on the web. The design requirement is that whenever U is T1-related to z on the GOFHW (good old-fashioned hypermedia web), there exists some x such that x is T5-related to U and x is T6-related to z. However, it is hard to say just what the best choice of x would be.

T4. "Information resource"

The domains of T6 and T5 ought to be compatible if they're to compose to make T1. We can consider inventing a class "information resource" for those things that can act as intermediaries in the T1 relation. (However, something can "have representations" without being "on the web", and possibly vice versa (e.g. a POST target that doesn't handle GET).) The hard part is saying what is not an IR.

IRs seem to have nothing in common with one another. Trying to define this class ontologically is difficult, if not impossible, and possibly not even desirable.

Examples:

  • Everything on the GOFHW (good old-fashioned hypermedia web)
  • Documents
  • Journal articles
  • 'Fixed resources'
  • Random page e.g. http://en.wikipedia.org/wiki/Special:Random
  • Resource with blank representation but nontrivial action for POST

Near misses:

  • REST resource (time-varying information with associated 'representations') (problem: random page)
  • A generic individual, generalizing specific 'representations' (Pat?) (problem: random page)
  • A web page, 'page of the web' (problem: things not yet on the web. need to include potential web pages)
  • Something on the web, or that could be on the web (circular. can a dog be on the web?)
  • A facet of a web server (that part of it specialized to serving a particular URI)
  • An API or contract (are documents APIs?)
  • An 'object' in an OOP sense (can a journal article handle an OOP message?)
  • Something induced by an 'exchange class' (maybe, but what? and how?)
  • Either a generic media resource (declarative-friendly) or an OOP 'object' (operational-friendly) but not both
  • "Metadata subject"

Consider the possibility that this is not a natural category, or that there is no natural way to talk about these things and they have to be treated operationally and axiomatically (i.e. non-ontologically).

Talk about Alan R's idea of an information resource "court" empowered to approve classes (with their representation relations) as IR classes.

Reflection

In any given RDF graph, <U> might or might not be used as a name for the resource [locatedAt U]. For graphs that are intended to be taken as reflecting the truth, the practice of interpreting <U> to be the resource located at U is recommended, although contradictions are a risk if another graph to be integrated takes it to mean something else. To be sure, include the statement <U> locatedAt U. so that (if there is inference and all goes well) mistakes show up as inconsistencies or validation failures.
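A rough Python sketch of how such a mistake could surface, relying on locatedAt being inverse functional: two names located at the same URI must denote the same thing, so a clash appears as soon as some graph also declares them distinct. The graph contents, the name "alice-home-page", and the different_from set are hypothetical, and this is a hand-rolled check rather than real OWL inference.

```python
def located_at_conflicts(located_at, different_from):
    """Find clashes with the inverse functionality of locatedAt.

    located_at:     set of (name, uri) pairs, merged from all graphs
    different_from: set of frozenset({name1, name2}) declared distinct
    """
    by_uri = {}
    for name, uri in located_at:
        by_uri.setdefault(uri, set()).add(name)

    conflicts = []
    for uri, names in sorted(by_uri.items()):
        ordered = sorted(names)
        for i, a in enumerate(ordered):
            for b in ordered[i + 1:]:
                if frozenset({a, b}) in different_from:
                    conflicts.append((uri, a, b))
    return conflicts


U = "http://example.org/alice"

# Graph A follows the recommended practice and says so explicitly.
graph_a = {("<http://example.org/alice>", U)}

# Graph B uses <U> for a person and a separate node for the page at U.
graph_b = {("alice-home-page", U)}
graph_b_distinct = {frozenset({"<http://example.org/alice>", "alice-home-page"})}

conflicts = located_at_conflicts(graph_a | graph_b, graph_b_distinct)
without_assertion = located_at_conflicts(graph_b, graph_b_distinct)
```

With the explicit assertion from graph A the clash is detected; without it, the merged data is silently ambiguous.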

Invariant content property transfer

If P is a content property (roughly speaking: informative over representations), and R has at least one representation, then we encourage that the following rule be established and used:

  • if every representation of R has property P, then deduce that R has property P.

Dually, if R has property P, and Z is a representation of R, then deduce that Z has property P.

For example, if every representation of R has dc:title "Bleak House", then conclude that R has dc:title "Bleak House".

Conclude this independently of the type of R.
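The rule and its dual can be sketched in Python as follows. The sketch can only quantify over the representations known so far, whereas the rule speaks of every representation, so any conclusion is provisional; the function names and the property dictionaries are illustrative assumptions.

```python
def lift_invariant_properties(representations):
    """Content properties shared, with the same value, by every known representation.

    representations: list of dicts mapping property name -> value.
    Returns {} for an empty list, since the rule needs at least one representation.
    """
    if not representations:
        return {}
    first, *rest = representations
    return {
        prop: value
        for prop, value in first.items()
        if all(prop in r and r[prop] == value for r in rest)
    }


def push_down(resource_properties, representation):
    """Dual rule: a representation of R inherits R's content properties."""
    merged = dict(representation)
    merged.update(resource_properties)
    return merged


known = [
    {"dc:title": "Bleak House", "dc:format": "text/html"},
    {"dc:title": "Bleak House", "dc:format": "application/pdf"},
]

lifted = lift_invariant_properties(known)   # {"dc:title": "Bleak House"}
new_rep = push_down(lifted, {"dc:format": "text/plain"})
```

Nothing in either function looks at the type of R, matching the instruction to draw the conclusion independently of it.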

The purpose of this rule is to make metadata curation both possible (you don't need the URI owner's permission) and useful (it provides testable information).

This practice forces the 2xx rule given in the httpRange-14 resolution. Suppose that we have GET U / 200 Z where Z is about, say, a person P. Note that "is about" is a content property. If every GET U yields 200 Z where Z is about P (as it very well might), we conclude that [locatedAt U] is about P. If P locatedAt U, we have P is about P, which is not reasonable.

(tighten this up. there are multiple ways this can go wrong, fix them all.)

HTTP

As HTTP is the primary incarnation of web architecture, it deserves some ontology of its own.

  • Exchange (a kind of event)
  • Message
  • Request
  • Response
  • a property giving the target-URI in an Exchange (or in the Request)
  • properties for method name, status code, etc.
  • properties for media type, language
  • particular media types... (see TimBL's ontologies)

We can express in OWL: From a GET U / 200 Z exchange, deduce [locatedAt U] hasRepresentation Z.
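The same deduction can also be written procedurally. The Python sketch below is not the OWL formulation; it models the Exchange, Request, and Response terms from the list above as plain records, and the field names and triple encoding are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Request:
    method: str
    target_uri: str


@dataclass(frozen=True)
class Response:
    status: int
    body: bytes
    media_type: Optional[str] = None
    language: Optional[str] = None


@dataclass(frozen=True)
class Exchange:
    """One request/response pair (a kind of event)."""
    request: Request
    response: Response


def deduce(exchange: Exchange):
    """From GET U / 200 Z, return ([locatedAt U], hasRepresentation, Z), else None."""
    req, resp = exchange.request, exchange.response
    if req.method == "GET" and resp.status == 200:
        z = (resp.body, resp.media_type, resp.language)
        return (("locatedAt", req.target_uri), "hasRepresentation", z)
    return None


ok = Exchange(
    Request("GET", "http://example.org/doc"),
    Response(200, b"<p>Hello</p>", "text/html", "en"),
)
missing = Exchange(
    Request("GET", "http://example.org/gone"),
    Response(404, b"not found", "text/plain"),
)

triple = deduce(ok)
nothing = deduce(missing)
```

Only the GET / 200 case yields a statement; redirects, other status codes, and other methods are left undecided here.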

Redirects and content-location:

Caching (at least Expires:)

describedby link relation / POWDER property

Time