Text by JAR, struggling toward an account of what the project is about.
---------------

The volume of structured information about document-like objects,
including web pages, articles, books, images, audio and video
recordings, data sets, and databases, is poised to increase greatly in
the next five years, spurred by ongoing work on standards for metadata
formats and protocols, including BIBO, CITO, FRBR, ORE, IAO, POWDER,
and LRDD. These objects - I will call them "metadata subjects",
acknowledging that the class of objects in question may be broader
than just "data" - must be designated (named) somehow in structured
descriptive records, i.e. it must be possible to give a metadata
subject a name and explain which subject that name names.

[Undecided whether 'designated' or 'named' is better]

When a metadata subject is made available on the web it may end up
being designated by an actionable http: URI that is somehow associated
with it, as web publishers may be strongly inclined to reuse an
existing web URI rather than 'mint' a different kind of designator for
the purpose.

Understanding 'best practice' around the relationship between these
two uses of URIs - as locators and as designators - is therefore
crucial in order to avoid confusion and mistakes. But converging on
'best practice' for naming metadata subjects has eluded the community
for years due mainly, in my opinion, to the inability of the parties
to the debate to define terms carefully and communicate their
assumptions clearly.

The purpose of the present project is to apply a rudimentary formal
approach to the debates around location/designation best practice. The
idea is to record each opinion that has been expressed as a logical
statement, which can then either be accepted as a new axiom or
rejected. The benefit of this approach over informal debate is that,
if it works, it will turn a host of fruitless arguments over
definitions and boundary cases into a 'cafeteria' of possible
solutions, and instead of talking endlessly about what is "right" or
"wrong", we can turn our attention to engineering and pragmatic
concerns.

------

[Raman and Larry keep insisting that the question of whether and how
to push RDF and http: URIs on the world seems to be still open.  Is
some kind of apology needed?]

------

The document will have several parts [some of which haven't been
written yet].

1. Web architecture (resource/representation)
2. httpRange-14
3. Redirects
4. URI binding and resolution
5. Provenance and authority

The exposition applies the "axiomatic method".  By this I mean nothing
is to be taken for granted - if some result can't be proven from
axioms already stated, it is not to be assumed to be true.

Indented text is meant to suggest a way in which the formal symbols
might be informally conceived or interpreted, or to give motivation;
but it must be taken as "informative" rather than "normative".  If
you think the indented remarks entail some interesting truth, by all
means propose a new axiom or theorem to capture that truth.

------

1. Web architecture

Let Thing be the universe of discourse, i.e. for all x, x is a
Thing.

Let 'RR' be a proper subclass of Thing.

  Think of RRs ('REST-representations') as being similar to HTTP
  'entities' - they consist of content and a few headers such as
  Language and Content-type.  If R and S have the same content and
  headers then R = S.  We'll figure out the details of encodings,
  which headers, and so on later.

  Note that these RRs are not "on the wire" or associated with
  particular events or messages or "resources".  A single RR might be
  stored in a file or transmitted multiple times.

Posit a relationship, which I'll call 'W', between some Things and
some RRs.  Write W(T,R) to mean that W holds between T, a Thing, and
R, a RR.

Stipulate that a GET/200 HTTP exchange expresses a W-relationship
between a Thing and an RR.  That is, let us interpret an HTTP exchange
with request GET U and response containing R, where R is a RR, as
another way of writing the logical statement W(<U>,R).

  Suppose, for example, that U is "http://example.org/z".  Then a
  request GET http://example.org/z followed by a response carrying a
  RR R is interpreted as W(<http://example.org/z>, R).  Nothing is
  said about what <http://example.org/z> "means" or "refers to".

  Note that we are not saying what W 'means' - merely that whatever it
  is that an HTTP GET/200 exchange means, that is what W means.  The
  exchange and the W-statement have the same consequences.

A series of HTTP exchanges may then be understood as a set of
statements (perhaps statements that someone might take to be axioms of
some theory), and we can consider their consequences within an
axiomatic system of the kind that we are developing.
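
  As a rough illustration of the stipulation, the following Python
  sketch reads a log of HTTP exchanges as a set of W-statements.  The
  names RR, Exchange and w_statements are hypothetical, chosen only for
  this example; a URI string stands in for the symbol <U>, and nothing
  is claimed about what that symbol designates.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Set, Tuple


@dataclass(frozen=True)
class RR:
    """A REST-representation: content plus a few headers, compared by value."""
    content: bytes
    headers: FrozenSet[Tuple[str, str]]


@dataclass(frozen=True)
class Exchange:
    """One observed HTTP exchange (hypothetical record type for this sketch)."""
    method: str
    uri: str
    status: int
    rr: RR


def w_statements(exchanges: Iterable[Exchange]) -> Set[Tuple[str, RR]]:
    """Read each GET/200 exchange as the statement W(<uri>, rr)."""
    return {(e.uri, e.rr) for e in exchanges
            if e.method == "GET" and e.status == 200}


html = frozenset({("Content-Type", "text/html")})
page = RR(b"<html>hello</html>", html)
same_page = RR(b"<html>hello</html>", html)  # same content and headers

log = [
    Exchange("GET", "http://example.org/z", 200, page),
    Exchange("GET", "http://example.org/z", 200, same_page),    # same RR again
    Exchange("GET", "http://example.org/missing", 404, page),   # not GET/200
]

facts = w_statements(log)
```

  Because an RR is compared by content and headers, the two 200
  responses yield a single statement, and the 404 exchange yields none.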

  Whether we choose to "believe" a statement W(<U>,R) derived from an
  HTTP exchange, or act on its consequences, is another story.
  (Consider a buggy or malicious proxy.  HTTPbis starts to address
  believability by trying to specify a notion of 'authority'.)  Issues
  of trust and authority will be treated separately, if we get around
  to them at all.  A statement can still be meaningful without being
  believed.

  We might fudge this by speaking of "credible" HTTP exchanges without
  saying exactly what that means (as indeed one cannot say).

  At this point in the treatment, W-statements have no consequences so
  it hardly matters whether we "believe" them or not.

  The relationship between a Thing and an RR expressed by a GET/200
  exchange, here called W, is discussed in RFC 2616, in AWWW, and in
  Fielding & Taylor's writings.  It is described in a variety of ways:

    R is "an entity corresponding to" T (RFC 2616 10.2.1)
    T "corresponds to" R (RFC 2616 10.3.1)
    R is a representation of the state of T (Fielding and Taylor)
    R "encodes information about state" of T (AWWW glossary)
    R "is a representation of" T (AWWW 2.4)

The same RR can be W-related to multiple Things, i.e. there exist T,
T', R such that W(T,R) and W(T',R) and T != T'.

  Example: 
    T = <http://www.w3.org/TR/2004/REC-rdf-mt-20040210/>,
    T' = <http://www.w3.org/TR/rdf-mt/>,
    R = {an RR whose content is 233076 octets long}

  Some people might want to argue with me here by saying that
  REST-representations are necessarily "on the wire" or something like
  that.  Well, that kind of REST-representation is not what I am
  talking about.  I am talking about Things ('RR's) that satisfy the
  axioms I write down.

One Thing can be W-related to more than one RR, i.e. there exist T, R,
R' such that W(T,R) and W(T,R') and R != R'.

  Example: any web page subject to content negotiation.  (Need
  concrete example... FOAF?)

  Example: a bank uses a URI http://bank.example.com/balances to show
  a customer's account balances.  If 100 different customers are
  logged in to the bank's site at the same time, and all browse to
  that URI, each will get a different response, resulting in W(T,R1),
  W(T,R2), ... W(T,R100) (where T =
  <http://bank.example.com/balances>).

  These W-relationships are a consequence of web architecture as
  expressed in RFC 2616 and AWWW.
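
  The two existence claims above can be checked mechanically over any
  finite set of W-statements.  The sketch below is only an
  illustration: strings stand in for Things and RRs, the RR labels
  (including "alice" and "bob") are invented, and it assumes that
  distinct strings denote distinct Things and RRs, which the theory
  does not by itself guarantee.

```python
from collections import defaultdict

# Hypothetical W-statements as (thing, rr) pairs.
W = {
    ("<http://www.w3.org/TR/2004/REC-rdf-mt-20040210/>", "rr-rdf-mt"),
    ("<http://www.w3.org/TR/rdf-mt/>", "rr-rdf-mt"),
    ("<http://bank.example.com/balances>", "rr-balance-alice"),
    ("<http://bank.example.com/balances>", "rr-balance-bob"),
}


def things_by_rr(w):
    """Map each RR to the set of Things it is W-related to."""
    index = defaultdict(set)
    for thing, rr in w:
        index[rr].add(thing)
    return index


def rrs_by_thing(w):
    """Map each Thing to the set of RRs it is W-related to."""
    index = defaultdict(set)
    for thing, rr in w:
        index[thing].add(rr)
    return index


# RRs W-related to more than one Thing, and Things W-related to more than one RR.
shared_rrs = {rr for rr, ts in things_by_rr(W).items() if len(ts) > 1}
multi_rr_things = {t for t, rs in rrs_by_thing(W).items() if len(rs) > 1}
```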

  GET/200 exchanges are only one way to express W(<U>,R).  The absence
  of such an exchange as a historical fact does not necessarily imply
  that W(<U>,R) is false.  For example, it might be useful to assume
  W(<data:,x>, {RR with content-type text/plain and content "x"}), even
  when no GET/200 exchange has stated this.

A trivial consequence of the stipulation is that if W(<U>,R) is true
(i.e. if a GET U/200 R exchange is 'true') then <U> is a Thing.  But
this is just because the domain of W is Thing (or a subclass).

  You can at this point deny that a particular URI U designates a
  Thing by just denying the truth of all statements involving <U>, 
  W(<U>,R) in particular.  This seems to be the position advanced by
  the current HTML editor, who has said that resources don't exist.

  The equivalence of GET/200 and W is a powerful constraint on W, when
  one takes actual web behavior to be 'true'.  A server can produce
  whatever 200 responses it likes for a URI that it controls, and not
  violate protocol.  That is, for any *arbitrary* sequence of RRs
  R1 ... Rn concoctable by a server, the server might, for some URI U,
  respond to a series of n GET U requests with 200 R1 ... 200 Rn.  The
  resulting W-axioms - should we choose to believe them - would imply
  the existence of a Thing <U> satisfying W(<U>,R1), ... W(<U>,Rn).

Note on what is NOT provable at this point

  We haven't assumed a way to falsify any W-statement.  That is, there
  is no way to infer that W(T,R) does not hold.  Therefore this theory
  is satisfiable by having a single Thing T having the property that
  W(T,R) for all REST-representations R.

  (well, the rdf-mt example could be taken to be an axiom that forces
  two Ts.)
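
  A small finite-model check makes the point concrete.  This is a
  sketch under assumptions: the symbols u1, u2, r1, r2 and the function
  satisfies are hypothetical, and only positive W-statements and
  explicit inequality axioms are modelled.

```python
def satisfies(interpretation, w_relation, w_axioms, distinct_axioms=()):
    """Check axioms against a finite model.

    interpretation:  maps each symbol (URI or RR name) to a model element
    w_relation:      the model's W, a set of (thing, rr) pairs
    w_axioms:        asserted statements W(<u>, r), as (u, r) symbol pairs
    distinct_axioms: asserted statements <u> != <v>, as (u, v) symbol pairs
    """
    holds_w = all((interpretation[u], interpretation[r]) in w_relation
                  for u, r in w_axioms)
    holds_distinct = all(interpretation[u] != interpretation[v]
                         for u, v in distinct_axioms)
    return holds_w and holds_distinct


axioms = [("u1", "r1"), ("u1", "r2"), ("u2", "r1")]

# Degenerate model: a single Thing "t" that is W-related to every RR.
one_thing = {"u1": "t", "u2": "t", "r1": "rr1", "r2": "rr2"}
w_everything = {("t", "rr1"), ("t", "rr2")}

ok_without_inequality = satisfies(one_thing, w_everything, axioms)
ok_with_inequality = satisfies(one_thing, w_everything, axioms,
                               distinct_axioms=[("u1", "u2")])
```

  The degenerate model satisfies every W-statement, and fails only
  once an axiom of the form T != T' is added.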

Note on time

  Although W is time-sensitive, we'll ignore time as it is not helpful
  to account for it right now and notationally it gets in the way.
  Later we'll redo the treatment to take time into account.

  So W is OK as a binary relation on Things to RRs for now.  Later it
  might be a ternary W(X,R,t) or operate on timeslices of Things
  W(s(X,t),R).
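
  For orientation only, here is one way the ternary form could relate
  to the binary one.  The data and function names are hypothetical,
  and an integer clock tick stands in for t.

```python
# Hypothetical timestamped statements W(X, R, t).
W3 = {
    ("<http://example.org/news>", "rr-monday", 1),
    ("<http://example.org/news>", "rr-tuesday", 2),
}


def forget_time(w3):
    """Project ternary statements onto the time-free binary W."""
    return {(x, r) for x, r, _t in w3}


def at_time(w3, t):
    """Binary W restricted to one instant, a rough reading of W(s(X,t), R)."""
    return {(x, r) for x, r, t2 in w3 if t2 == t}


W2 = forget_time(W3)
monday_only = at_time(W3, 1)
```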

Note on RDF / OWL

  RDF and its variants are just vectors for various fragments of
  first-order logic (FOL), and FOL is a lot easier to read and think
  about, so better to start with FOL and then render it in RDF (and/or
  other notations) later on.

No number of GET/200 exchanges can tell you what a resource is.
There are several reasons for this.
  1. The absence of a GET/200 giving W(T,R) does not mean that W(T,R)
     isn't true.
  2. Two Things T,T' could have W(T,R) but not W(T',R) for some
     REST-representation R not hitherto obtained by a GET/200 exchange.
  3. T and T' could satisfy {W(T,R) iff W(T',R)} for all R in RR, and
     *still* be different.

Information distinguishing such Things, if it were available, would
have to come from a different source (e.g. RDF).
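
  Reason 3 can be illustrated with a toy model in which Things have
  identity apart from their W-relations.  The class Thing, the labels
  and the RR name below are hypothetical and serve only this sketch.

```python
class Thing:
    """A model element with identity only: two instances are equal
    only if they are the same object (default Python equality)."""

    def __init__(self, label):
        self.label = label


def w_extension(thing, w):
    """The set of RRs that a given Thing is W-related to."""
    return frozenset(rr for t, rr in w if t is thing)


t1 = Thing("t1")
t2 = Thing("t2")
W = {(t1, "rr-shared"), (t2, "rr-shared")}

same_extension = w_extension(t1, W) == w_extension(t2, W)
same_thing = t1 is t2
```

  Here the two Things have identical W-extensions yet remain distinct,
  so no collection of W-statements alone could tell them apart.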

httpRange-14
------------

[[

The TAG's httpRange-14 decision throws doubt on location/designation
practice by advising against the use of http: URIs to name certain
kinds of things.  At the same time it (a) fails to explain which
things are restricted, (b) fails to explain the inferential source of
the ambiguity (between a thing and a web page about it), and (c) fails
to resolve the ambiguity that arises when the thing itself is a web
page.  It appears that the intent is that a GET should retrieve
something that is "like" the resource, yet the "like" relationship
isn't specified.

There is so much controversy over the use of http: URIs to name such
objects, and over the "resource" theory that justifies it, that web
publishers are likely to either ignore the httpRange-14 advice
altogether and use http: URIs in situations where such use might lead
to problems, or use distinct URIs for the object and for the web page
when a single URI would have sufficed.

The 303 issue is a red herring since even if # URIs are used you have
to decide whether you need a # URI at all.

It would be nice if we could provide a defense of the practice (of
using URIs involved in GET/200 exchanges to refer to members of class
WR) that might be understandable to those who simply attack it now ...

I would think we ought to be able to reconstruct the ambiguity
referenced in the resolution within the axiomatic exposition.  
]]

Let WR be a proper subclass of Thing containing the domain of W,
i.e. suppose W(T,R) implies that T is in WR.  That is, we may not know
what W means, but suppose that being in the domain of W implies
membership in the class WR: W(T,R) implies T in WR.

  WR plays the role of "information resource" - if you're not in WR
  you can't be W-related to a RR.

  WR might also roughly coincide with the "metadata subjects" of the
  introduction, above.

  That WR is a *proper* subclass of Thing is, I believe, implied by
  the httpRange-14 resolution - otherwise why would one raise the
  question?

  WR = domain of W would be a plausible axiom, but I'm not sure it's
  needed.
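
  Over a finite model the WR axioms reduce to set comparisons.  The
  sketch below assumes Python sets for Thing, WR and W; the element
  labels and the function check_wr_axioms are hypothetical.

```python
def domain(w):
    """The Things that appear on the left of some W-statement."""
    return {t for t, _rr in w}


def check_wr_axioms(things, wr, w):
    """Report which of the candidate WR axioms hold in a finite model."""
    return {
        "wr_is_proper_subclass": wr < things,   # strict subset of Thing
        "domain_in_wr": domain(w) <= wr,        # W(T,R) implies T in WR
        "wr_equals_domain": domain(w) == wr,    # the optional stronger axiom
    }


things = {"page", "essay", "dog", "rr1"}
wr = {"page", "essay"}
w = {("page", "rr1")}

report = check_wr_axioms(things, wr, w)
```

  In this model "essay" is in WR without being W-related to any RR, so
  the first two axioms hold while WR = domain of W does not.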

Axioms proposed for WR:
  TimBL: "generic resources" such as the King James Bible (see genont)
    are in WR
  TimBL: literary works are in WR  (Pat Hayes disagrees)
  Pat: literary works are not in WR

  TimBL: the classes of dogs and of people are disjoint with WR
    (by extension: anything physical)
  TimBL: strings and numbers are disjoint with WR
    (by extension: anything mathematical)
  TimBL: RR is disjoint with WR
    (JAR doesn't understand the point of this)
  Pat: RDF graphs are not in WR
    (follows from prohibition of mathematical things)

  TimBL: members of WR are not determined by their W-relations
    i.e. one might have W(T,R) iff W(T',R) for all REST-representations
    R, yet T != T'   (time sheet example, recall discussions a while
    back of the "trace" of a resource and of "phlogiston")
    (Pat: "sad, if true")

We have various theories of W that are all intended to (a) sharpen the
axioms for WR, (b) explain why you would want the existence of x such
that x is not in WR, and (c) help explain what distinguishes acceptable
W-statements from incorrect ones.

. Dan C's speaks-for theory:  W(T,R) means T 'says' R in the sense of
  ABLP logic.  This means that T is a "principal", which has certain
  consequences.

. Alan's what-is-on-the-web theory.  This says that WR consists of
  those things that http: URIs actually designated before about 1999
  when we started making them designate exotic things using RDF.
  Work in progress.

. JAR's property-transfer-from-representation-to-resource theory.
  This says that in W(T,R), T has to have some properties in common
  with R.  (Unfortunately, which ones is not clear.)  Things that
  can't share properties with any RRs can't be in WR. Work in
  progress.

. Pat's theory in which WR = RR (plus *possibly* some things that are
  very modest generalizations of RRs)

. David B's "it depends on how you interpret things"

. Xiaoshu's architecture in which W(T,R) only means that R "is about" T

. JAR's 'good practice note' theory which says explain to people what
  the real problem is and leave it up to them


Redirects
---------

The relation W abstracts from our everyday experience with the web: it
holds when we get a 200, but it also holds when we consult a cache, or
when we do the equivalent of GET using an API or an alternative
protocol.  Essentially W is whatever it has to be in order to suppose
that there is a *reason* for 200 and caching.  Of course this is a
sort of fiction, but it helps to explain the architecture and give
freedom in infrastructure design.

We can apply the same approach to redirection, by giving names to the
relationships that give rise to redirection responses.  E.g. we might
write

  PermanentRedirect(T1, T2)
  Found(T1, T2)
  TemporaryRedirect(T1, T2)

as being what a 301, 302, or 307 response communicates:

  301 U1 / Location: U2  ==  PermanentRedirect(<U1>,<U2>)

and so on.
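
  The reading of redirect responses as statements can be sketched as a
  lookup from status code to relation name.  The table and the function
  redirect_statement are hypothetical; angle brackets around a URI
  string stand for the Thing the URI is taken to designate.

```python
REDIRECT_RELATIONS = {
    301: "PermanentRedirect",
    302: "Found",
    307: "TemporaryRedirect",
}


def redirect_statement(request_uri, status, location):
    """Return (relation, <U1>, <U2>) for a 301/302/307 response, else None."""
    relation = REDIRECT_RELATIONS.get(status)
    if relation is None:
        return None
    return (relation, f"<{request_uri}>", f"<{location}>")


moved = redirect_statement("http://example.org/a", 301, "http://example.org/b")
not_a_redirect = redirect_statement("http://example.org/a", 200, "")
```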

[Some rambling... need to tighten up this argument]

At this point some people raise the question: Why are these relations
between a thing and another thing, instead of between a thing and a
URI, or between a URI and a thing, or a URI and a URI?  Why should
there even exist things T1 and T2 that are related in some way?

The methodological answer is that Occam's razor should be applied: If
you talk about U1 being related somehow to U2, then you have to
introduce a new class of URIs into the universe of discourse, and that
hasn't been needed so far.  Then if U2 is involved in a GET/200
exchange, what resource is the exchange about?  To answer this you
need to explain how a URI comes to be interpreted as a Thing,
introducing a new relationship that has to be explained and
axiomatized.  We tackle this further down, but the answer ends up
being much more complicated than anything else we've done.

If instead the URI / Thing relationship isn't brought into the
picture, and URIs are just taken to be symbols in our axiomatic
system, the theory of redirects is much simpler.  And that's good
because nobody really cares about URIs - they are just names and
easily changed.  Rather it is the things that matter.

But that doesn't really answer the question.

The HTTP spec explains redirects in terms of the resource <U1>
"residing at" the URI U2.  (Fielding and Taylor do something similar
in allowing a "value" to be a URI.)  However even if you take this
interpretation to be true, the likely next step for a client is going
to be to use U2 in a GET request.  The outcome of the request is to
learn something not about the URI U2, but about the Thing <U2>.  That
is, the URI U2 is *only* used to designate the Thing <U2>, so we might
as well be talking about that Thing, and not the URI.  That is:

  301 U1 / Location: U2
     == ResidesAt(<U1>,"U2")
     == PermanentRedirect(<U1>,<U2>)  assuming U2 designates <U2>

Here "designation" as a logical relation is assumed to be
unproblematic, but it is not unproblematic.
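
  The dependence on designation can be made explicit in a sketch.
  Below, the mapping designates is an assumed, externally supplied
  table from URI strings to Things; where it says nothing about a URI,
  the statement stays in its Thing-to-URI form.  All names are
  hypothetical.

```python
def lift_resides_at(thing, uri, designates):
    """Rewrite ResidesAt(thing, "uri") as PermanentRedirect(thing, <uri>).

    The rewrite is only possible when `designates` assigns the URI a Thing.
    """
    if uri not in designates:
        return ("ResidesAt", thing, uri)  # remains a Thing-to-URI statement
    return ("PermanentRedirect", thing, designates[uri])


designates = {"http://example.org/new": "thing-new"}

lifted = lift_resides_at("thing-old", "http://example.org/new", designates)
unlifted = lift_resides_at("thing-old", "http://example.org/unknown", designates)
```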

------
URI binding and resolution

To cover:
. the role of RDF and OWL model theory
. the inevitability of ambiguity
. the role of follow-your-nose
. why one would be skeptical: inconsistencies, ontology version skew,
  deception, use at variance from definition, etc.
. LRDD (Link: and .well-known/host-meta)

Pat: "How is that choice [of what thing a URI refers to]
manifested functionally? What if two people make different choices for
the same URI?

Seems to me that 'personal' choice is a red herring. What matters for
any actual use of referential names is that some *community* of people
has somehow conspired to use a name referentially among themselves.

...if we are communicating and I understand a name in the message to
refer to X but you intended to refer to Y, usually we have a genuine
MIScommunication. That is different from me misunderstanding something
you say about X, while still knowing that it is X you are referring
to."

JAR: reference is not "identification" (in the dictionary sense)

------
Provenance and authority

About the new text in HTTPbis explaining "authority"

Different metadata from different sources

Embedded metadata

  If we consider the server to be "authoritative" for GET U/200 and
  therefore statements of the form W(<U>,...), we are obligated to
  "believe" in such a Thing <U>.

------
Acks