AWWSW Final Report

Initial draft by Jonathan Rees (through 12-Nov-2011)
Substantially reorganized and edited by David Booth

Major AWWSW Topics

The AWWSW task force was created at a 2007 HCLS/TAG meeting to look into the semantics of web architecture, the semantic web, and HTTP interactions. Although the task force discussed many thorny questions over its duration (see AwwswQuestions), the bulk of the work centered on the following four topic areas, which are expanded below:

The meaning of a URI: where it comes from, what it is.
What is the purpose of "information resources"?
How should we interpret/repair httpRange-14?
What should be inferred from an HTTP response?

This report summarizes the results of the group's work. In most cases, the group did not produce consensus documents. Documents were typically written by individual members and circulated for comments by others.

The meaning of a URI: where it comes from, what it is.

When a URI appears in an RDF statement, how can the reader of that statement determine the author's intended meaning? What RDF triples characterize that meaning? Where does the meaning come from? How should the meaning be determined, particularly in the context of the HTTP protocol, for an http URI? Can we codify a suite of nose-following methods for semantic web use -- a recipe one can follow in order to obtain a canonical graph (or "definition", "description resource", "URI documentation") for a URI? Some points:

There is some actual practice around this, namely # and 303, and some documentation ("Cool URIs for the Semantic Web"), and this could be "standardized".
HTTPbis documents GET/303. The meaning of GET/303 is legitimized because, before the httpRange-14 resolution, there was no use of GET/303 on the web.
Alan Ruttenberg is not happy with how vague HTTPbis is about GET/303.
What do we mean here by "meaning"? Some views:
- The set of RDF triples that constrain the intended interpretations of a graph that uses that URI
- Meaning of a message = agreement between sender and receiver on antecedents and consequents
- Meaning of message to sender or receiver = what they infer from it
- When sender/receiver not specified we assume some general case - "typical" sender and/or receiver within some community
- Meaning depends on context of interaction (HTTP vs. SMTP vs. XML for example)
- Common sense usage. Not particularly philosophical. No need to ask "what is a meaning" (Quine).
- HST: Shouldn't something be said here about 'interpretation', per RDF Semantics?
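
The nose-following recipe asked about above (# and 303) can be sketched in a few lines. This is only an illustration of the practice described in "Cool URIs for the Semantic Web", not a method the group agreed on. The function name locate_documentation and the fetch callable (which stands in for an HTTP GET and returns a status code and a Location value) are hypothetical, introduced here so the sketch runs without network access.

```python
from urllib.parse import urldefrag


def locate_documentation(uri, fetch):
    """Return the URI of a document presumed to document `uri`, or None.

    `fetch` is a hypothetical stand-in for an HTTP GET:
    fetch(uri) -> (status_code, location_header_or_None).
    """
    stem, fragment = urldefrag(uri)
    if fragment:
        # Hash URI: the document retrieved from the stem is taken
        # as the documentation for the fragment URI.
        status, _ = fetch(stem)
        return stem if status == 200 else None
    # Hashless URI: a 303 response points at the documentation.
    status, location = fetch(uri)
    if status == 303 and location:
        return location
    # Assumption: any other response (including 200) yields no
    # documentation under this recipe.
    return None
```

A caller would then retrieve the returned URI and parse it as RDF to obtain the graph for the original URI; that step is left out because the report does not fix a format.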

Documents that address this topic:

What is the purpose of "information resources"?

What use is the AWWW notion of "information resource"? Is it needed? How should it be described formally? This group rejects the AWWW definition of "information resource" as untenable. We believe the AWWW notion of "information resource" needs to be replaced with something closer to TimBL's "generic resource", for the following reasons:

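The only sensible purpose of "generic resource" is to enable the rule: P(<U>) and GET U/200 Z implies P(Z). That is, if property P holds of the resource identified by U, and a GET on U yields a 200 response carrying representation Z, then P also holds of Z. For example, <U> dc:title "Fog" and GET U/200 Z implies Z dc:title "Fog".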
It seems to agree with actual practice (e.g. use of Dublin Core and non-landing-page uses of xhv:license).
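
A minimal sketch of the rule above, with triples modeled as plain tuples. The function name transfer_properties, the tuple layout of an exchange, and the transferable set are assumptions made for illustration; the report does not say which properties P the rule should apply to.

```python
def transfer_properties(triples, exchanges, transferable):
    """Apply: P(<U>) and GET U/200 Z implies P(Z).

    triples:      set of (subject, predicate, object) tuples
    exchanges:    iterable of (uri, status, representation_id) tuples
    transferable: predicates assumed to carry over from a resource
                  to its representations (an assumption of this sketch)
    """
    inferred = set()
    for uri, status, representation in exchanges:
        if status != 200:
            continue  # the rule only covers GET/200
        for subject, predicate, obj in triples:
            if subject == uri and predicate in transferable:
                inferred.add((representation, predicate, obj))
    return inferred


# The dc:title example from the text; "_:z" names the representation Z.
known = {("http://example.org/fog", "dc:title", "Fog")}
seen = [("http://example.org/fog", 200, "_:z")]
result = transfer_properties(known, seen, {"dc:title"})
```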

But has TimBL reviewed this idea? Pat Hayes and Tim don't seem to agree on the ontology of generic resources. Pat rejects over-genericness, and might prefer a practice where U is constrained only to the extent that any such Z is a correct interpretation of it (this may be equivalent). Alan Ruttenberg doesn't buy the notion of generic resource on ontological grounds: he wants a suite of particular kinds of generic resources, and insists on commitment to some category in each case. Jonathan notes that the TimBL architecture is a special case of the Fielding architecture: if x is an instance of y then x is a representation of y (but not vice versa).

Documents that address this topic:

How should we interpret/repair httpRange-14?

Although the TAG "resolved" the httpRange-14 issue in 2005, questions and controversy have continued, and some notable figures point the finger at the httpRange-14 resolution as a major impediment to linked data uptake. What does the httpRange-14 resolution mean formally? If httpRange-14 clause (a) means anything at all, its meaning depends on what "information resource" means. Furthermore, it may be that the httpRange-14 resolution cannot be made to stick (e.g. flickr, jamendo, Ian Davis), in which case there should be an explicit way to communicate what needs to be said. Should httpRange-14 be further clarified, rescinded or replaced? What is the practical reasoning behind it? Can it be amended to be sensible? Can it be formalized somehow? At the least, httpRange-14 needs to be amended to say that the URI "identifies" the generic resource served from that URI -- not some other information resource -- though in practice everyone interprets the intent of the httpRange-14 resolution this way anyway.

HST: This section, particularly the next paragraph, is out-of-date -- it should reference the public consultation process the TAG set in motion, and the results to date.

As preparation for a W3C Recommendation-track document that would supersede the existing httpRange-14 resolution, Jonathan has drafted a baseline proposal, Understanding URI Hosting Practice as Support for Documentation Discovery. The hope is that the TAG will adopt that draft as its starting point, and encourage people to submit change proposals against that document as a means toward achieving a consensus-based Recommendation.

Documents that address this topic:

Understanding URI Hosting Practice as Support for Documentation Discovery -- Jonathan's proposed baseline document for superseding httpRange-14
Jonathan's DRAFT call for change proposals against his baseline document
Interoperability of referential uses of hashless URIs

What should be inferred from an HTTP response?

For example, what can be inferred from an HTTP 200 response? Possible work items include a survey of status codes and requests, and a formalization of RFC 2616: an ontology or set of rules that characterizes it. For example, Tabulator does HTTP requests. Can we standardize on a way to express, in RDF, what it learns from doing these HTTP exchanges? What does an HTTP exchange mean, especially GET/200 and GET/303?

Some thoughts from Jonathan:

Usually for GET/200 the meaning comes from the hypertext context. GET means send me back whatever you like. The parties sort of just figure it out, often based on the text inside an anchor element.
At the protocol level what the client means, and what the server means, depend on which agreements they buy into:
- Do they agree on RFC 3986?
- Do they agree on RFC 2616?
- Do they agree on httpRange-14?
- Do they agree on the use of hash URIs ("#") in semantic-web fashion?
But when you do an exchange with someone you don't usually know what they're likely to assume.

Jonathan's answer: According to RFC 3986 and RFC 2616, an HTTP exchange tells the client nothing about the resource. "Identifies" and "representation of" are so underspecified that plausible deniability would always hold: post hoc explanations of the "resource" and "representation" can always be synthesized by the server to meet the specs. But GET/200 probably is widely understood to give us a "definition" for fragid-URIs whose stem is the URI, and GET/303 is understood as giving us the URI of a "definition" (description document, whatever) for the original URI. These exchanges are meaningful in a semantic web context.
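
One way to picture this answer is as a function whose conclusions depend on which agreements the client assumes. The sketch below is an assumption-laden illustration, not a formalization produced by the group: the function name interpret_exchange, the agreement labels "hash-uris" and "httpRange-14", and the relation names in the returned tuples are all hypothetical.

```python
def interpret_exchange(uri, status, location=None, agreements=frozenset()):
    """Return (relation, document_uri, subject_uri) conclusions.

    With no agreements beyond RFC 3986 and RFC 2616, nothing is
    concluded about the resource, per the answer above.
    """
    conclusions = []
    if status == 200 and "hash-uris" in agreements:
        # The retrieved document is taken to define URIs of the
        # form <uri>#fragment.
        conclusions.append(("defines-fragment-uris-with-stem", uri, uri))
    elif status == 303 and location and "httpRange-14" in agreements:
        # The Location target is taken to be a definition
        # (description document) for the original URI.
        conclusions.append(("defines", location, uri))
    return conclusions
```

Calling it with an empty agreements set always returns an empty list, which mirrors the point that the base specifications alone tell the client nothing about the resource.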

Documents that address this topic:

Web Architecture Ontology
DBooth draft N3 rules for HTTP inferences
Toward an ontology for HTTP/1.1 semantics
JAR thinks Nathan and Michael may have written more about this.

Other AWWSW Documents