<?xml version="1.0" encoding="UTF-8"?> 
<?xml-stylesheet type="text/xsl" href="../../../doc/xmlspec.xsl"?>
<!DOCTYPE spec SYSTEM
"http://www.w3.org/2002/xmlspec/dtd/2.6/xmlspec.dtd" [ 
<!--
================================================================
--> 
<!ATTLIST spec xmlns:xlink CDATA #IMPLIED>
<!ENTITY mdash " &#8212; "> 
<!ENTITY epsilon "&#949;"> 
<!ENTITY Oacute "&#211;"> 

<!ENTITY draft.day "10"> 
<!ENTITY draft.monthname "April"> 
<!ENTITY draft.year "2011">
]>

<!-- 
Alan R: - feeling at atm - just before glossary, is that good content but
better presentation order needed.
In intro make clear that it is use of URI is in sentences. (there are
other points to move there, I think).
 -->

<!-- Providing and discovering URI definitions -->

<spec xmlns:xlink="http://www.w3.org/1999/xlink" w3c-doctype="wd" role="editors-copy"> 
  <header>

    <title> Providing and discovering definitions of URIs
    </title>

  <!-- 
    <w3c-designation>http://www.w3.org/TR/2009/WD-hash-in-url-20090415/</w3c-designation> 
  -->
    <w3c-doctype>Editor's Draft</w3c-doctype> 
    <pubdate> 
      <day>&draft.day;</day>
      <month>&draft.monthname;</month> 
      <year>&draft.year;</year>
    </pubdate> 

    <publoc> 
      <loc href="http://www.w3.org/2001/tag/awwsw/issue57/20110410/" >
        http://www.w3.org/2001/tag/awwsw/issue57/20110410/
      </loc>
    </publoc>

    <prevlocs>
      <loc href="http://www.w3.org/2001/tag/awwsw/issue57/20110327/" >
        http://www.w3.org/2001/tag/awwsw/issue57/20110327/
      </loc>
    </prevlocs>

    <altlocs>
      <loc role="xml" href="issue57.xml"
           xlink:type="simple">XML</loc>
    </altlocs>
    <latestloc> 
      <loc href="http://www.w3.org/2001/tag/awwsw/issue57/latest/" 
        >http://www.w3.org/2001/tag/awwsw/issue57/latest/</loc> 
    </latestloc>  

    <authlist> 
      <author>

        <name>Jonathan A. Rees
        </name> 
        <email href="mailto:rees@mumble.net"
	   >rees@mumble.net</email> 
      </author>

    </authlist> 
    <status> 
      <p>
        This report has been developed by the 
        <loc href="http://www.w3.org/2001/tag/awwsw/"
          >AWWSW Task Group</loc>
        of the
        <loc href="http://www.w3.org/2001/tag/"
          >W3C Technical Architecture Group</loc>
        in order to provide background material for further discussion
        among those affected by this architectural question, and to help drive
        TAG issue 57 <bibref ref="issue-57"/> to a conclusion.
      </p> 

      <p>
        This version has
        not received review within either the task force or the TAG.</p>

      <p>
        Publication of this draft
        finding does not imply endorsement by the W3C Membership. This is
        a draft document and may be updated, replaced or obsoleted by
        other documents at any time.
      </p> 

      <p>
        Please send comments on this
        document to the editor at
	<loc href="mailto:rees@mumble.net" 
	 >rees@mumble.net</loc>.
        The development of this report is discussed on the public-awwsw@w3.org
        mailing list, with archives at 
        <loc href="http://lists.w3.org/Archives/Public/public-awwsw/" 
         >http://lists.w3.org/Archives/Public/public-awwsw/</loc>.
      </p>

      <!-- 
      <p>Please send comments on this
      document to the publicly archived TAG mailing list 
      <loc
          href="mailto:www-tag@w3.org" >www-tag@w3.org
      </loc> (
      <loc
          href="http://lists.w3.org/Archives/Public/www-tag/"
          >archive
      </loc>).
      </p> 
      -->
    </status> 

    <abstract> 
      <p>
        The specification governing Uniform Resource Identifiers
        (URIs) <bibref ref="rfc3986"/> allows URIs to mean anything at all,
        and this unbounded flexibility is exploited in
        a variety of contexts, notably the Semantic Web and Linked Data.
        To use a URI to mean something, an agent (a) selects a URI,
        (b) provides a definition of the URI in a manner that
        permits discovery by agents who encounter
        the URI, and (c) uses the URI.  
	Subsequently other agents may not only understand the URI (by
        discovering and consulting the definition) but may use it
        themselves.
	<!--  redundant:
	As long as the definition remains
        discoverable, the URI may then be used and understood by other
        agents.
        or,
        As long as the definition remains discoverable, agents
        encountering the URI will be able to understand it [to the
        extent that the definition is helpful].
         -->
      </p>
      <p>
        A few widely known methods are in use to help agents provide
        and discover URI definitions,
        including RDF fragment
        identifier resolution and the HTTP 303 redirect.  However,
        difficulties in using these methods
        have led to a search for new methods that
        are easier to deploy, and perform better,
        than the established ones.  This report
        brings together in one place information on current and
        proposed practices, with analysis of benefits and shortcomings
        of each.
      </p>
      <p>
        The purpose of this report is not to make recommendations but
        rather to initiate a discussion that might lead to
        consensus on the use of current and/or new methods.
      </p>
    </abstract> 

    <langusage> 
      <language id="en-US">English</language> 
    </langusage>

    <revisiondesc> 
      <p>
        <ulist> 
          <item>
            <p>$Id: issue57.xml,v 1.1 2011/04/11 03:14:40 jrees Exp $
            </p>
          </item>          
        </ulist> 
      </p> 
    </revisiondesc> 
  </header>

  
  <body> 
    <div1>
      <head>Introduction
      </head>

<p><emph>This is an old issue, and people are tired of it.  
&mdash;Sandro Hawke, January 2003</emph> 
<bibref ref="disambiguating"/></p>

      <p>
        In any kind of discourse it is very useful for an agent to be
        able to provide a definition of a term, in such a way that other agents
        can discover and use that definition in order to make sense of
        sentences that use that term, and to compose new ones.
      </p>

      <example>
	<head>Definition discovery</head>

	<graphic source="discovery.png"
		 alt='Definition of "Peak XV"'/>

	<p>
	  Suppose that Alice, in
	  communication with Bob, uses
	  the term "Peak XV" to mean
	  Mount Everest, as in "Alice would
	  like to climb Peak XV next summer".  If Bob does
	  not know what "Peak XV" means, he will have to find out. He
	  might be able to ask Alice directly, although in many 
	  cases this will be impossible - Alice might be too busy, or
	  otherwise unavailable.
	  Lacking that option he must do some research, consulting
	  dictionaries, encyclopedias, or search engines in the hope of
	  obtaining the correct 
	  explanation of Alice's use of the term "Peak XV".
	  </p>

	<p>
	  The essential idea is that there are one or more methods
	  available to Bob by which he can discover 
	  bits of writing that explain what Alice
	  means by "Peak XV".
	</p>

      </example>

      <p>
        In this report, the terms being defined are URIs, and
 	the bits of writing that might explain the meaning of a URI are
 	called "definitions".  URIs can be used
	to mean all sorts of things
	in many different technical contexts.  Two contexts of 
	special interest to this report are
	in natural language (e.g. "The W3C home page is 
	'http://www.w3.org/'"), and in declarative languages
 	such as RDF and OWL.
      </p>

      <p>
        The nature of definitions need not concern us here - many forms
        are familiar, including translation between
        languages (e.g. providing an English or Spanish term equivalent to a
        given term), descriptions (the term refers to an entity possessing
        some set of properties), explanation by example, axiomatic
        systems, and so on.  Also
        not of concern here are the many ways in which
        meaning can fail as a result
        of <emph>what</emph> a definition says about the
        term in question, or how a term is used.  Our concern is only with
        the method by which definitions are conveyed.
      </p>

      <p>
        When the term in question is a URI,
        discovery methods
        include, in addition to those already mentioned, network
        protocols such as HTTP that involve the URI directly.  
        <!-- 
        Methods that use the
        Web protocol for the URI (HTTP, in this case) in order to determine
        what the URI means are called "follow
        your nose" (FYN) methods.
         -->
      </p>

      <!-- 
      <p> [Draft note, Alan:
But this suggests that you introduce earlier: "sentences, phrases"
etc, as the scope of URI use you are interested in.
I see you define "phrase" later. With this audience they will read
their own meaning. So either use terms outside their repertoire or use
typography to distinguish and warning at top of document to read
carefully.] </p>
 -->

      <p>
        Definition discovery is not
        the same as Web dereference, however, 
	since dereferencing a URI gives you 
	information - i.e. the document, image, etc. specified by the
        URI - not necessarily related to defining anything.
        Care must be taken to avoid confusing the two operations.
        In theory dereference could
        play a role in explaining the meaning of a dereferenceable URI
        (see
	<specref ref="depends"/>), but this is
        not generally done at present, since a dereferenceable URI
        refers to the information resource accessible via that URI,
 	not to what that information resource defines or describes
 	(see <specref ref="ir-ref"/>).
      </p>

      <p>
	The reason we define definition discovery methods is 
	interoperability - so that everyone gets the same definition
	of each URI.
	We only need consensus on methods such as the ones
	surveyed here for URIs
	that are to be shared widely.  If 
	agents that use a URI in one way never use it in communication with
	agents that use it in another way, then it is OK for the URI
	to have
	distinct senses in the two communities, and there is no
	problem to be solved - each community can use the URI in its
	own way, and there will be no confusion.
      </p>

      <p>
        The operative word here is "if".  Isolation is fragile and
        means lost opportunities for synergy and unintended reuse; all
        the arguments in favor of a World Wide Web, which depends on the
        global nature of the URI vocabulary, apply here.
      </p>

      <p>
        This report presents discovery methods in current use,
        reports some 
        criticisms of them, and presents some new discovery methods that
        have been proposed to address the criticisms.
      </p>

      <!-- 
      <p>
        [Draft note: Maybe talk in the introduction about alternatives
        to defining a URI: using
        non-URI phrases and syntactic sugar (these used to be sections).
	Discussion currently relegated to <specref ref="ddi"/>. ]
      </p>
     -->


    <div2>
      <head>Glossary</head>

      <p>
        This section defines terms that are used in this report.
        An attempt has been made to avoid gratuitous differences
        from the way these terms are used elsewhere, but in a few
        cases choice of terminology has been difficult and words
        with other meanings (such as "definition") are
        given technical definitions.  These definitions are not being
        proposed for general adoption.
      </p>

      <p>
        [Draft comment: All terminology choices are provisional; 
        for most of them I
        am testing the waters to see how well the word works, and am
        prepared to change.]
      </p>

      <glist>
        <label>accessible via</label>
        <def> 
          When a URI is dereferenceable,
          "the information resource accessible via a URI" 
          (abbreviated IR(that URI), see below)
          is the information resource whose versions 
          are the versions obtained by dereferencing that URI.
        </def>

	<!-- 
        <label>ADI(u,v)</label>
        <def> 
	  The meaning of URI v, as defined in IR(u).
	  [not sure we need this one.]
        </def>
	 -->

        <label>definition</label>
        <def>
          A document or document part that provides
          information about the meaning of a URI or other kind of term.
	  This term is not meant to be either rigorous or exclusive.  The
          "information" could be prose, RDF, OWL, or some combination.
          It needn't be successful, specific, or comprehensive in defining the 
	  term in the ordinary sense of "defining".  Rather, the term
          as used here refers to the role it plays in discovery.  We
          might more accurately say "putative definition".

          [Draft note: Alan R: Is sound recording possible definition?]

          <!-- 
          [We need a word for this, and its relation
          to a phrase whose meaning is in question.  "Description" (or
          Eran H-L's "description resource") is
          incorrect as it shifts focus from the term to some (unknown)
          resource - I don't start out knowing what the resource is and then
          look for a description of it, I start out knowing a term and
          then I want to know what resource is meant.  "Definition" is
          another option but may be misleading.  David B likes
          "URI declaration" but this term is evocative of his architecture,
          which I don't want to evoke.]
           -->
        </def> 

        <label>dereferenceable</label>
        <def> 
          A URI is dereferenceable if it may be used with a standard
          access mechanism to retrieve information, or to perform some
          other action on an associated 
          resource (<bibref ref="rfc3986"/> section 1.2.2).
          In particular, hashless http: URIs 
          are
          dereferenceable if some HTTP method (or
          equivalent) is successful (2xx response).  Some URIs
          belonging to some other 
          URI schemes are also dereferenceable.
        </def> 

        <label>fixed information resource</label>
        <def>
          A document, image, sound recording, or
          other replicable entity as encoded in
          an octet sequence, together with
          optional brief annotations, such as media type and language,
          intended to guide the interpretation of the
          content.  There is no requirement that a given fixed
          information resource is accessible via any URI.
        </def>

        <label>hashless</label>
        <def>
	  A URI is 'hashless' if it has no hash '#' sign or subsequent
	  fragment identifier.  'Hashless URI' is synonymous with
	  'absolute URI' as defined in <bibref ref="rfc3986"/>.
        </def>

        <label>http: URI</label>
        <def>
          A URI whose scheme (the part before the colon) is 'http' or 'https'.
        </def>

        <label>information resource</label>
        <def>
          Roughly speaking, something that is appropriate as the
          subject of metadata.  See <specref ref="ir"/>.
	  <!-- 
          Because of the controversy around this term we will not
          attempt to define it, but rather say that: (1) An
          information resource
          is associated with a set of fixed information 
          resources (its versions). (2)
          An information resource is "similar" to its versions in that
          metadata that applies to each version of an
          information resource applies
          to the information resource itself, and vice versa.
 	  -->
        </def>

        <label>IR(u)</label>
        <def> 
          IR(u) is shorthand for the information resource accessible
          via URI u.  For example, 
          if 'http://example/image23' is dereferenceable, then
          IR('http://example/image23')
          is the information resource accessible via that URI.
        </def>

        <label>metadata</label>
        <def>
          Information about information, or about an information 
          resource.  In RDF, metadata might
          be written using vocabularies such as Dublin Core, FOAF,
          or CC REL.
        </def>

        <label>term</label>
        <def>A URI, word, name, or phrase
          that can serve in subject or object position in a statement.  In an
          RDF serialization, for example, a term might be a qname,
          URI, or blank
          node label.  In Turtle, a term might be any Turtle term,
          including one written using blank node [...] notation.
        </def>

        <label>refer</label>
        <def>
          For the purposes of this report, reference is just one way to
          mean.  There may be ways to mean other than to
          refer, but none are specified here.
        </def>

        <label>version (of an information resource)</label>
        <def>
          A fixed information resource associated with an information
          resource is a version of the information resource.
          <footnote>
            "Version of" as used here is similar to one of the senses in
            which "representation of" is used in
            discussions of Web architecture.
            We have two reasons to avoid "representation".  One is that
	    "representation" has been used in different ways by
            different parties and it seems wise to avoid risk of
            misinterpretation.  Another is that our versions have to be
            the same kind of thing as the information resource that
            they are versions of, so that they can have metadata.
 	    In most treatments of Web
            Architecture, representations are considered very
            different from information resources and
            do not have the same sorts of properties as information
            resources.
	    <!-- 
            due to the different ways in which Roy Fielding (in his
            REST work) and Tim Berners-Lee [citation needed] use the word.
            It seems better to avoid the word entirely and use a new
            word to specifically mean the Tim Berners-Lee sense.
 	    -->
          </footnote>

          <!-- 
          [Cf. TimBL 'fixed resource.']
          [Searching for a new term since Nathan and JAR don't 
          like "representation".
          Consider: version, content+, continent, malcontent, discontent, epresentation, 
          represen-tation, specific information resource, simple information
          resource, fixed resource, specialization.]
          [Consider trying to write the document without any need for
          this word!]
          -->
        </def>

      </glist>
    </div2>

    </div1> <!-- end introduction -->


    <div1>
      <head>Use case scenarios</head> 

      <p>
        Use cases need to be presented as being independent of any
        particular solution to be used, in order that the solution space
        can be explored more objectively.  This leads to some
        frustrating vaguenesses in the following, but the vagueness is
        intentional and necessary.
      </p>

      <div2>
        <head>Choosing a URI, providing a definition of the URI, using 
          the URI</head> 
        <p>
          Alice wants to refer to a particular canoe being offered for
          sale.
          Alice "mints" a new URI (one that is not yet in use) with the
          purpose of using that URI to refer to her canoe.  Alice
          publishes a document containing a definition of the URI, i.e.
	  a document that
          would lead a reader to understand that the URI refers to the
          canoe.
        </p>
        <p>
          Bob then learns of Alice's URI and uses it in a document
          of his own.
        </p>
        <p>
          Subsequently Carol encounters Bob's document.  Wanting to
          know what the URI means, she 
          is led to Alice's published definition, which she reads.  She is
          enlightened.
        </p>

	<p>
	  Any method for implementing this use case would need to explain:
	  what kind of URI Alice should use (syntactic constraints);
	  where and how Alice should publish the definition so that it
	  can be found;
	  and how Carol might come to discover Alice's definition, given
	  the URI.
	</p>

        <!-- 
        <p>
          Variant use case: Same as above, but instead of the canoe, the
          referent of the URI is to be an 
          information resource that is not accessible on the Web, or at least
          not at any URI known to Alice.  The definition might describe where
          the information resource might be found, and other aspects
          such as bibliographic metadata (author, title, etc.) or SHA1 hash.
        </p>
        <p>
          Variant use case: Same as above, but instead of the canoe, the
          referent is to be an 
          information resource that <emph>is</emph>
          Web accessible, via a URI known to Alice.
          The definition that Alice
          writes explains that the term is to refer to that information
          resource.  That is, there are <emph>two</emph> information
          resource at play here, one carrying the definition and one
          that's a subject of the definition.  It's important in this
          case to make sure that metadata can be written about either
          information resource.
          (In this situation, which is common in the publishing industry and
          digital archives, Alice's definition is often
          called a "landing page".)
        </p>
	-->

      </div2>

      <div2>
        <head>Using a document as a definition by reference to its
          primary topic</head>  
        <p>
          Bob desires to refer to Chicago.  
          He finds a Web page
          at 'http://example/about-chicago' (provided by,
          say, Alice) that consists
          of a description of Chicago, and wants to use it for the
          purpose of referring to Chicago.  He chooses
          a URI and associates it with Alice's Web page 
          in such a way that Bob's URI will be understood as referring to
          Chicago.
        </p>
        <p>
          Carol encounters Bob's URI, is led to 'http://example/about-chicago'
          and thence to Alice's description of
          Chicago, and then somehow understands that Bob's URI is
          meant to refer to Chicago.
        </p>
        <p>
          (This differs from the previous use case in that here Bob is not
          involved in the creation of the
          definition (the description of Chicago).  The document about
          Chicago was
          not written with the purpose of defining Bob's URI - in fact 
          Bob's URI doesn't even occur in it.  Rather than look 
	  in the document for a
          definition mentioning Bob's URI, Carol must determine the
          topic of the document and take the topic as the meaning of
          Bob's URI.)
        </p>
      </div2>

    </div1> 


    <div1>
      <head>General definition methods in current use</head>

        <p>
          This section describes how people currently implement the
          use cases.
        </p>

	<!-- 
        <p>
          (Any of these methods may be used to define a URI that refers
          to something for which a more specialized generic 
	  definition already exists, for
          example a mailbox (for which there is the mailto: URI scheme
          registration document) or
          an information resource (for which there are the
          registrations for http:, ftp:, gopher:, data:, and so on).
          In theory, an information 
          resource could be specified in a URI definition by
          spelling out the details of its versions, perhaps in RDF.
          However, this is ordinarily not necessary, since usually the
          specialized naming system can be used.)
        </p>
 	-->

      <div2 id="colocate">
        <head>Colocate definition and use</head> 
        <p>
          One way to lead someone encountering a URI to a definition
	  of the URI is to
	  make sure that the definition of the URI is propagated into
	  each document in which the URI occurs.
	  This makes the definition easy to find, since anyone who
	  encounters the URI will have in hand
	  the definition that they need.  
	  The form of the URI in this case is arbitrary.
        </p>
      </div2>


      <div2 id="cite-source">
        <head>Link to documents containing definitions</head> 
        <p>
          Whenever using a URI to refer to something, provide,
	  again in the document or message in which the URI occurs,
          a link to a document that carries a definition of the URI.  
          This is the approach taken by OWL; the document containing
          the URI is related to the one from which the definition of
          the URI should be obtained via the owl:imports 
	  relation.<footnote>More precisely, the definition will be
	  found in the imports closure of the
          document containing the URI.</footnote>
        </p>
        <p>
          The rdfs:isDefinedBy property might also be used for this
          purpose.
        </p>

        <!-- 
        <p>
          Both of these properties beg the question in that
          they do not say how to figure out what the URI that is the
          target of owl:imports or rdfs:definedBy refers to. 
	  _____

          If the
          meaning of <emph>that</emph> URI had to be given by citing a source,
          there would be infinite regress.
        </p>
        -->
      </div2>

      <div2 id="new-scheme">
        <head>Register a URI scheme or URN namespace</head>

        <p>
          Each URI scheme, e.g. mailto:, http:, ftp:, and so on, has
          its own URI scheme registration, accessible via a registry
          maintained by IANA
          <bibref ref="rfc4395"/>.
          A URI scheme registration helps to define the
          meaning of URIs using that scheme.
          For example, the registration for the data: URI scheme
          fully explains the meaning of every URI that uses that
          scheme, while the mailto: scheme registration explains 
	  that each URI refers to a particular mailbox, understood
          relative to the domain name system and the mailbox
          assignments made by each particular host.
        </p>

        <p>
          Most URI scheme registrations, such as that for http:, only
          provide a partial definition, and other sources of
          information must be consulted in order to understand a
          particular URI using that scheme.  For example, to
          understand an http: URI, one generally needs to dereference
          it (and even then one only knows a single version of it; see
          <specref ref="ir-ref"/>).
        </p>

	<example>
	  <head>Defining a URI by registering a URI scheme</head>
	  <p>
	    To define a URI to refer to Mount Everest, Alice
	    invents a new URI scheme, say mountain:, and publishes a
	    registration for it via IETF and IANA that says that
	    'mountain:peakxv' refers to Mount Everest.  Bob, on
	    encountering 'mountain:peakxv', checks the IANA URI scheme 
	    registry
	    (which he knows about because the registry is specified
	    by IETF),
	    obtains a link to Alice's 
	    registration for the 'mountain:' scheme, 
	    reads the registration, and is enlightened.
	  </p>
	</example>

  	<p>
	  Practically
          speaking, this approach is very challenging due to the
          rigor of the review process for URI scheme registrations
	  (see <bibref ref="rfc4395"/>).
          Furthermore, Web clients will not understand the new URI
          scheme, making the definition of the URI
          effectively inaccessible for most agents encountering the URI,
	  at least until the mountain: scheme becomes as well known as
          the http: scheme.
        </p>

        <p>
          URN namespaces [draft note: cite RFC 3406]
			 work in a similar way.  Each namespace has a
          registration document that is formally reviewed through IETF and
          placed on file
          with IANA.
        </p>
      </div2>

      <div2 id="lsid">
        <head>Use the LSID getMetadata() method</head>
	<!-- 
        <p>
          [Draft note: LSID is not exactly common - is this worthy of mention?
          Maybe rule out all non-linked-data solutions up front?
          But it is used and I'd like some of those users to read this report.]
        </p>
 	-->
        <p>
          A URN namespace for which there is a general definition method
          is the 'lsid' namespace.<footnote
            >Unfortunately the 'lsid' URN namespace is not in the
            IANA registry.  Someone encountering an LSID may need
            to do a search in order to locate the LSID specification and
            consequently determine what the LSID means.
	    In addition each LSID contains an "authority" field
	    whose meaning is not assigned by the LSID specification,
            requiring even more research on the part of someone trying
            to understand an LSID.
	    </footnote>
          URIs beginning 'urn:lsid:' are called LSIDs.
          LSIDs have an associated SOAP-based 
          protocol that has separate methods for dereference (getData)
          and discovery (getMetadata).
          According to the LSID specification,
          an LSID for which the getData method yields nonempty
          content refers to what is here called a fixed information resource,
          while the LSID could refer to
          anything at all if getData yields empty content.  
          In the latter case the information yielded by the
          getMetadata method generally constitutes, or at least
          contains, a definition of the LSID.
        </p>
      </div2>

      <div2 id="hash">
        <head>'Hash URI'</head> 
        <p>
          With this method, the URI must be a 'hash URI', i.e. must
	  contain a fragment identifier.
	  The definition of the URI
          is placed in the document accessible via the URI that is the
	  pre-fragment stem of the URI.
        </p>
	<example>
	  <head>'Hash URI'</head>
	  <graphic source="hash.png"/>
	</example>
        <p>
          The interpretation of a URI possessing a fragment
          identifier, say 'http://example/sale#p16',
          is governed (according to <bibref ref="rfc3986"/>) 
	  by the media type of some version
          of the information resource accessible at its stem URI
          'http://example/sale'.
          For RDF-enabled media types, the media type registration
 	  defers to the content of the version - that is, the version
          itself gets to arbitrarily define what the 'hash' URI 
	  means.<footnote>
          If
          the information resource IR('http://example/sale')
          has multiple versions, it is important that all versions
          provide definitions of every URI that needs one, and that
          corresponding definitions in different versions be
          compatible with one another.
          (See <bibref ref="webarch"/> section 3.2.)</footnote>
        </p>
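        <p>
          As a rough sketch of the first step of this method, the
          following Python fragment separates a 'hash URI' into its
          stem and its fragment identifier.  The function names
          <code>is_hashless</code> and <code>split_hash_uri</code> are
          assumptions made for this illustration.  Retrieving the
          document at the stem and locating the definition within it
          are not shown, since those steps depend on the media type.
        </p>
        <eg><![CDATA[
```python
from typing import Tuple
from urllib.parse import urldefrag


def is_hashless(uri: str) -> bool:
    """A URI is hashless if it contains no '#' sign at all."""
    return "#" not in uri


def split_hash_uri(uri: str) -> Tuple[str, str]:
    """Return (stem, fragment) for a hash URI.

    The stem is the URI at which the definition is expected to be
    found; the fragment picks out the term within that document.
    """
    if is_hashless(uri):
        raise ValueError("not a hash URI: " + uri)
    stem, fragment = urldefrag(uri)
    return stem, fragment
```
]]></eg>
        <p>
          For 'http://example/sale#p16' this yields the stem
          'http://example/sale' and the fragment 'p16'; an agent would
          then dereference the stem and look in the retrieved version
          for a definition of the full URI.
        </p>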
      </div2>

      <div2 id="303">
        <head>'Hashless URI' with HTTP 303 See Other redirect</head> 
        <p>
          In this approach, one mints a hashless http: URI,
	  makes a definition of it accessible via a second URI,
	  and then arranges for a GET request of the first URI to
	  redirect to the second URI.  The first URI then gets its
	  meaning from the document accessible via the second
	  URI.
	</p>

	<example>
	<head>303 redirect</head>
	<graphic source="303.png"/>
	<p>
 	  Alice chooses 'http://example/p16' as the way she will refer
 	  to a particular canoe being put up for sale.
	  At 'http://example/about-p16' she publishes text and/or RDF
	  that defines 'http://example/p16', explaining the URI by
 	  providing details about the
 	  canoe (make, model, length, location).
	  For the URI 'http://example/p16', which will not be
 	  dereferenceable (since otherwise it would refer to the
 	  information resource at that URI, not the canoe),
	  she arranges that a GET request yields a 303 redirect with
 	  a Location: header specifying 'http://example/about-p16' as the
 	  redirect target.
        </p>
        <p>
          Those encountering 'http://example/p16' will attempt 
          to dereference it, but
          this will fail, with a 303 redirect delivered instead.  
          The 303 redirect indicates that
          the URI does not refer to IR('http://example/p16'), but rather that
          the document at 'http://example/about-p16'
          provides a definition of 'http://example/p16'.
          [Draft note: TBD: cite HTTPbis]
        </p>
	</example>
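        <p>
          A client's reading of the response might be sketched as follows.
          This is an illustration under stated assumptions, not a normative
          algorithm: the function name interpret_get and the labels it
          returns are hypothetical, and the response is represented simply
          as a status code plus a dictionary of headers keyed by their
          usual capitalization.
        </p>
        <eg><![CDATA[
```python
def interpret_get(uri, status, headers):
    """Classify the outcome of a GET on `uri` under the 303 convention.

    `status` is the HTTP status code and `headers` a dict of response headers.
    The function name and the returned labels are illustrative only.
    """
    if status == 303 and "Location" in headers:
        # The URI is defined by the document at the redirect target.
        return ("defined-by", headers["Location"])
    if 200 <= status < 300:
        # A successful response: the URI refers to the information
        # resource at the URI.
        return ("information-resource", uri)
    # Anything else tells us nothing about what the URI refers to.
    return ("unknown", None)


print(interpret_get("http://example/p16", 303,
                    {"Location": "http://example/about-p16"}))
```
]]></eg>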

	<!-- 
        <p>
	  [ADI('http://example/about-p16', 'http://example/p16') ?]
        </p>
	 -->
        <p>
          Another pattern is to use a 303 redirect to a document whose
          primary topic is the intended referent, similar to the
          Chicago example above.  This could, in theory, lead to
          ambiguities, as the primary topic and the entity referred to
          using the URI might be different.

	  <!-- 
          [Draft note: Is anyone, in practice, deploying 303 redirects to a
          "primary topic" page not mentioning the URI to be 
          defined, rather than to a document that explicitly mentions
          the URI?   == YES == ]
	   -->
        </p>
      </div2>

      <!-- 
        <p>
          With any of these methods other than dereferenceable hashless URIs,
          the URI may refer to anything at all, including an
          information resource.  [COMMON MISUNDERSTANDING, not sure
          where this goes in the document.
          <loc href="http://lists.w3.org/Archives/Public/public-lod/2010Nov/0249.html"
          >This email</loc>, for example, gets it wrong; the question is not
          IR vs. NIR, it's about which thing the URI is to refer to,
          IR(u) vs. FV(u).]
        </p>
 -->
    </div1>


    <div1>
      <head>Critique of the current methods</head>

      <p>
        The methods <specref ref="colocate"/>
	and <specref ref="cite-source"/>
	work as far as they go, but may not be practical in all
	situations.  <specref ref="new-scheme"/>
	is not practical, and <specref ref="lsid"/>
	relies on an unregistered URN namespace and a complex protocol.
	And when domain names are used as authority components in
	LSIDs, as they often are,
	the resulting URNs are no more "persistent" than http: URIs,
	leaving the advantages of LSIDs over http: URIs uncertain.
      </p>

      <p>
        We therefore focus in this section on the two widely used
        definition methods, fragment identifiers and 303 redirects.
      </p>

      <div2>
        <head>Fragment identifiers are fragile</head> 
        <p>
          "People forget to put it there
          when writing and cut and pasting URIs."  (Harry)
          [Draft note: More information needed.]
        </p>

	<p>
	  The meaning of a hash URI "depends on how you access it, which
	  is nuts. Its as though a word has different meanings
	  depending on whether you read it in a book or have it read
	  out to you." 
	  (<a href="http://blog.iandavis.com/2007/11/17/fragmentation-reprise/"
	    >Ian Davis</a>)
	    &mdash; I think he's talking about the situation where
	  there is content
	  negotiation <emph>and</emph> there is inconsistency between the
	  variants.  The more common problem with content negotiation is
	  that there is no way to know ahead of time which variant 
	  has the definition at all, and thus which one to
	  request in content negotiation.
	</p>

	<p>
	  Ian points out that RDF Concepts says:
	  "a URI reference in an RDF graph is treated with respect to
	  the MIME type application/rdf+xml [RDF-MIME-TYPE]. Given an
	  RDF URI reference consisting of an absolute URI and a
	  fragment identifier, the fragment identifer identifies the
	  same thing that it does in an application/rdf+xml
	  representation of the resource identified by the absolute
	  URI component."
	  and that this appears to conflict with webarch.
	  [Draft note: TBD: try to figure out what is going on here.]
	</p>

      </div2>

      <div2>
        <head>The common 'hash URI' pattern fails with large namespaces</head> 
        <p>When a large number of URIs are formed by combining a
          fixed "namespace" prefix with many suffixes using hash as a
          connector, there will be a single underlying document 
	  at the pre-hash URI
	  that must
          provide definitions of all of the large number of URIs.
          A client that needs the definition of just one of these URIs
          must therefore retrieve the whole document, which burdens the
          server, the network, and the client alike.  'Hashless' URIs
          don't have this problem, as the response can be specific to
          each URI.
        </p>
	<p>[Draft note: look at 
	  <a href="http://norman.walsh.name/2007/02/18/bigBang"
	   >Norm Walsh's 2007-02-17</a> post]</p>
      </div2>

      <div2>
        <head>Hash URIs don't support REST architecture</head> 
        <p>
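          Hash URIs don't work with HTTP PUT, POST, or DELETE methods.
          (Manu)
          The fragment identifier is not sent to the server as part of
          a request, so any such request applies to the document at the
          stem URI rather than to whatever the hash URI refers to.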
        </p>
      </div2>

      <div2>
        <head>303 is difficult, sometimes impossible, to deploy</head> 
        <p>
          Deploying a 303 redirect requires giving the correct
          directive to a web server, for example adding 
          a Redirect line to .htaccess in Apache.  Unfortunately
          many hosting solutions do not allow this.</p>
        <p>
          The Chicago use case is an extreme version of this - the
          entity providing access to the Chicago document (Alice) does not
          even care about providing URIs that refer to Chicago; it is
          someone having no control over how the URI dereferences (Bob)
          who needs a reference to Chicago.
        </p>
      </div2>

      <div2>
        <head>303 leads to too many round trips</head> 
        <p>To get definitions of N URIs by redirecting through
          303 responses,
          you need to do 2N HTTP requests.</p>
      </div2>

      <div2>
        <head>303 makes the URI difficult to bookmark</head> 
        <p>
	  "the user enters one URI into their browser and ends up at
          a different one, causing confusion when they want to reuse
          the URI of the toucan. Often they use the document URI by
          mistake." 
	  (<a href="http://iand.posterous.com/is-303-really-necessary"
	    >Ian Davis</a>)
	</p>
        <!-- 
           [See also JAR's "tempolink" blog post]
 -->

        <p>
	  "Redirection has in fact very confusing side effects; as we expect the
	  semantic web to work seamlessly with the web, it is very odd that a
	  semantic web uri cannot be copy pasted to a browser without seeing it
	  change to something that is not the same as before."  
	  <bibref ref="giovanni"/>
        </p>

      </div2>

      <div2>
        <head>The normative specifications are incomplete</head> 
        <p>
          The architecture in common use has not received adequate
	  review, such as that provided by the W3C Recommendation
	  track; in fact it is not really documented in any adequate form.
	  [Harry's complaint]
        </p>
        <p>
          [Draft note: Talk about conneg, media type, and FYN woes here?]
        </p>
      </div2>

    </div1>


    <div1>
      <head>Possible mitigations</head>

      <p>
        With fragment identifiers and the 303 redirect identified as
        the sources of current difficulties, a number of alternative
        methods have been suggested to get around their problems.
      </p>

      <div2 id="ddi">
        <head>Use something other than a URI</head> 

        <p>
           [Draft note: This section derives from 
            <a href="http://www.w3.org/2001/tag/2011/02/metadata-arch.html#slide9"
            >JAR's TAG F2F presentation slides</a>.  The purpose of
            talking about this idea is
            mainly to remind people that the problem is one of notational
            engineering, not philosophy.  This doesn't work very well,
            though, and I will probably flush this section.]
        </p>

        <p>
	  URIs are just one kind of term that might be used to
          refer to something.  If defining a URI is too difficult or
          costly, then perhaps one might do without.
          In RDF serializations such as Turtle, 
          for example, we have blank node notation:
        </p>
        <eg>
         [ foaf:isPrimaryTopicOf &lt;http://example/about-chicago&gt; ] </eg>
        <p>
          Here we have managed to refer to Chicago without defining a
          new URI; we have simply referred indirectly using a URI that 
          refers to an information resource according to a generic method
          (see <specref ref="ir-ref"/>).
        </p>

        <!-- 
      </div2>
      <div2 id="sugar">
        <head>Syntactic sugar</head> 
         -->

        <p>
          A more concise alternative is syntactic sugar:
        </p>
        <eg>
           *&lt;http://example/about-chicago&gt; </eg>
        <p>
	  which might be supported in a hypothetical RDF serialization
          as a shorthand for the previous example.
          (The asterisk is meant to be suggestive of indirection in the
          C programming language.)
        </p>
      </div2>


      <div2 id="suffix">
        <head>'Hash URI' with fixed suffix</head> 
        <p>
          This idea attempts to address one reason for using 'hashless'
          URIs instead of fragment identifiers.  Suppose you want to
          combine a large number of local names a, b, c, ... into a
          namespace.  The usual solutions would be to write
          'http://example/namespace#a' (a "hash namespace") or 
          'http://example/namespace/a' (a "hashless namespace").
        </p>
        <p>  
          In the "singleton fragid" approach one would write
          'http://example/namespace/a#' (a null fragment identifier) or
          'http://example/namespace/a#_', using a fixed suffix for every
          URI and varying the part between the namespace prefix and
          the suffix.
        </p>
        <p>
          As in the 303 approach, each URI in the namespace would (or
          could) have its own document, providing a definition for that
          single URI rather than for every URI in the namespace.
        </p>
        <p>
          The choice of fixed fragment identifier (null, "_", or
          something else) is largely a matter of taste.
        </p>
        <p>
          A null fragid precludes the use of qnames to abbreviate such URIs.
          (In particular it would not be possible to use them as
          predicate names in RDF/XML.)
          However, SPARQL, Turtle, and RDFa 
          are being extended to admit CURIEs that include #, making this a
          newly attractive option.
        </p>

        <p>
          To address the "hash gets lost" problem we could explore
          heuristics to automatically replace 'http://example/p16' with
          'http://example/p16#' (or 'http://example/p16#_') when needed.
        </p>
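        <p>
          One very simple heuristic of this kind is sketched below.  It is
          an assumption made for illustration, not a worked-out proposal:
          the helper name with_fixed_fragment is hypothetical, and the
          sketch leaves open the real question of how a tool would know
          that a given namespace follows the fixed-suffix convention.
        </p>
        <eg><![CDATA[
```python
from urllib.parse import urldefrag


def with_fixed_fragment(uri, fragment="_"):
    """Append the fixed fragment identifier if `uri` has no '#' part at all.

    Pass fragment="" to use the null fragment identifier instead of "_".
    """
    stem, existing = urldefrag(uri)
    if existing or uri.endswith("#"):
        # The URI already carries a (possibly null) fragment identifier.
        return uri
    return stem + "#" + fragment


print(with_fixed_fragment("http://example/p16"))
print(with_fixed_fragment("http://example/p16", fragment=""))
```
]]></eg>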
      </div2>

      <div2 id="hostrule">
          <head>'Hashless URI' with site-specific discovery rules</head> 
          <p>
	    The network round-trip (303 redirect) to map the URI whose
	    definition is 
	    to be discovered to the URI of the information resource
	    that defines it can be avoided if we know a general rule
	    that maps one kind of URI to the other, as such a rule can
	    be applied on the client without server involvement.
	    It is too much to hope that a single rule could work
	    uniformly for all URIs whose definition might be sought,
	    but an individual host may have a rule that applies for 
	    URIs at that host.
          </p>
          <p>
            The "well known URIs" protocol gives a place where
	    a file containing such rules can be stored <bibref ref="rfc5988"/>.
	    The rule might be stored in a well-known file
	    'definition-rule', as in 
	    'http://example/.well-known/definition-rule'.
	    To obtain a definition of 'http://example/p16', obtain the
	    definition-rule file for its host.
            Then if the rule says to map 'http://example/{path}' to, say,
            'http://example/{path}.about', a
            definition of 'http://example/p16' can be sought by dereferencing
 	    'http://example/p16.about'.
          </p>
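          <p>
            Applying such a rule on the client might look like the
            following sketch.  The template syntax with a {path}
            placeholder simply mirrors the example above; since the
            format of the definition-rule file has not been specified,
            both that syntax and the function name apply_definition_rule
            are assumptions.  Query strings are ignored for simplicity.
          </p>
          <eg><![CDATA[
```python
from urllib.parse import urlsplit


def apply_definition_rule(uri, template):
    """Map `uri` to the URI of its definition using a host-level template.

    `template` contains the literal placeholder {path}, which is replaced
    by the path of `uri` without its leading slash.
    """
    path = urlsplit(uri).path.lstrip("/")
    return template.replace("{path}", path)


rule = "http://example/{path}.about"  # as if read from the host's rule file
print(apply_definition_rule("http://example/p16", rule))
```
]]></eg>
          <p>
            No request to the host is needed to compute the second URI
            once the rule itself has been obtained.
          </p>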
          <p>
            When the mapping is cached, this reduces the number
            of round trips from two (in the 303 case) to one.
          </p>
          <p>
            This would be a new protocol and the name and format of
	    the definition-rule file would have to be pinned down.
	    One option might be to use the link-template feature of
	    the host-meta file <bibref ref="rfc5988"/>.
          </p>
	  <!-- 
          <p>
            Such rules could augment or replace the use of 303
            responses in order to reduce the number of round trips
            required to obtain definitions of URIs.
          </p> -->
          <p>
            Looking for a definition-rule file for every host that has URIs
            for which definitions need to be discovered would be
            expensive if only a few of them have such files, but with some
            cleverness the number of such failed requests can probably
            be kept small.
            The details would have to be worked out, but this approach
            could be a boon to bulk consumers of 'hashless' URI definitions.
          </p>
      </div2>

      <div2 id="newhttp">
          <head>'Hashless URI' with new HTTP request or response</head> 
          <p> To reduce the number of round trips, we might use a new
            HTTP method to request a definition of 
            a URI, or the server could use a new status code
            to indicate that what it is returning is a definition of 
            the request URI.
          </p>
          <p> 
            The URIQA specification <bibref ref="uriqa"/> defines MGET, 
            a new HTTP request method.
            An MGET request on a URI yields a response containing a
            definition of that URI.
          </p>
          <p> 
            In response to GET of a URI,
            a server might provide a definition in a non-success
            response.  (A successful response would mean that the URI
            refers to the information resource at the URI.)
            Possibilities for HTTP response status codes that might
            signal this situation: 
            203 Non-Authoritative Information, a new 2xx status
            (e.g. 209), a new 3xx status (e.g. 308), 
            or a variety of 4xx codes.
            (Placing the definition in the content of a redirect response
            (status code 301, 302, 303, or 307) is unsatisfactory, as the
            content would not be displayed in a Web browser.)
          </p>
          <p>
	    The Link: header or other HTTP header might play a role here.
	    [TBD: explain and cite Web Linking and HTTPbis]
          </p>
          <p>
            Any of these options would mean fewer round trips than
            following a 303 redirect.
            A downside is that they are all generally as difficult, or more
            difficult, to deploy than 303 redirects.
          </p> 
      </div2>

      <div2 id="chimera">
        <head>'Hashless' URI dereferences to its definition (compatibly)</head>

        <p>
	  [Draft note: We are trying to represent 
	  <a href="http://inkdroid.org/journal/2010/07/07/linking-things-and-common-sense/"
	   >Ed Summers's view</a>
	  in this section.]
        </p>
        <p>
          Currently we use a dereferenceable hashless URI 'http://example/p16'
          to refer to the
          information resource at that URI, IR('http://example/p16')
          (see <specref ref="ir-ref"/>).
          To use an http: scheme 'hashless URI' to refer to anything
          else, one uses a 303 redirect.
          To address performance and deployment difficulties with 303 
	  redirects,
          it has been suggested that the same URI
	  be used for two purposes: to refer to the information
          resource at that URI,
          <emph>and</emph> to refer to
          some entity given by a definition of the URI
	  that is carried by (a version of) the information resource
          itself.
        </p>

	<example>
	  <head>A URI that refers to, <emph>and</emph> is defined by,
	    its information resource</head>
        <p>
	  Suppose that Alice wants to use the URI 'http://example/p16' to
	  refer to a canoe.  She publishes a definition containing the
	  following at 'http://example/p16':</p>
	  <eg>
          &lt;http://example/p16&gt; foo:mass 2140 .
          &lt;http://example/p16&gt; foaf:name "Assabet Angler" . </eg>
	<p>
	  Bob then comes along and writes the
	  following metadata about IR('http://example/p16') in the
	  usual way, i.e. using the URI to refer to the information
	  resource, based on what information is accessed via that
	  URI:
        </p>
        <eg>
          &lt;http://example/p16> dc:creator "Alice".
          &lt;http://example/p16> dc:title "All about the Assabet Angler".</eg>

        <p>
	  Carol encounters both bits of RDF (or either) and needs to
	  make sense of 
	  them.  Suppose she is aware that 'http://example/p16' might be
	  used in both ways - in metadata, with the intent that the
	  metadata is about IR('http://example/p16'); and also according to
	  a definition of 'http://example/p16' found 
	  in IR('http://example/p16').  For each use of
	  'http://example/p16' she (or her software) needs to
	  determine which sense is supposed to apply.
	</p>
	</example>

        <p>
          In general, what agents using this protocol need - both those
          composing statements and those deciphering them - is an
          agreed rule for classifying
          each occurrence of a URI u as referring either to the
          information resource IR(u) or to what the content at IR(u)
          describes.
	</p>

        <p>
	  There are probably
          many ways in which one might accomplish this; the
	  following method is provided for illustration.
	  Suppose that it "makes sense" or "is appropriate" for the subject of a
          particular property
          to be an information resource.  
	  For example, the subject of Dublin Core properties might be
          seen as "making sense" when the subject is an information resource.
	  The judgment of "making sense" might be
          made according to an asserted 
          or inferred domain constraint, or it might simply be by
          fiat (asserted).  
 	  Call such a property a subject-IRS property.
	  A property that is not subject-IRS would be
          subject-NIRS.
	  Similarly, we would have object-IRS and object-NIRS properties.
          The decision of
          which sense is meant for a particular occurrence of a URI
          is then based on the subject or object classification of the 
          property in the statement in which the URI occurs.
        </p>

        <p>
	  In the example, dc:title would be classified as
	  subject-IRS object-NIRS, while foo:mass would be classified
	  as subject-NIRS object-NIRS.  To avoid mistakes,
	  these classifications would have to
	  be understood in the same way by both Bob and Carol,
	  i.e. they would have to 
	  be part of the shared meaning of the properties in question.
        </p>
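        <p>
          The illustrative method can be sketched in a few lines.  The
          table of subject-IRS properties below is hypothetical: the text
          above classifies only dc:title and foo:mass, so treating
          dc:creator as subject-IRS and foaf:name as subject-NIRS is an
          assumption, as is the function name subject_sense.
        </p>
        <eg><![CDATA[
```python
# Hypothetical classification table; in practice it would have to be part
# of the shared meaning of each property.
SUBJECT_IRS = {"dc:creator", "dc:title"}


def subject_sense(prop):
    """Return which sense the subject URI takes in a statement using `prop`."""
    if prop in SUBJECT_IRS:
        return "information resource"
    return "described entity"


statements = [
    ("http://example/p16", "foo:mass", "2140"),
    ("http://example/p16", "foaf:name", "Assabet Angler"),
    ("http://example/p16", "dc:creator", "Alice"),
    ("http://example/p16", "dc:title", "All about the Assabet Angler"),
]

for subject, prop, obj in statements:
    print(prop, "->", subject_sense(prop))
```
]]></eg>
        <p>
          The same URI is thus read in two senses depending solely on the
          property with which it is used.
        </p>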

        <p>
          This approach presents a couple of challenges.
        </p>

        <p>
          First, not all subject or object positions of properties
	  are easily classified as IRS vs. NIRS.  For example, 
	  the object of "likes" and the subject of "is located at"
	  are not obviously either IRS or NIRS.
	  No matter which choice is made in these cases, meanings 
	  that required the other choice would be
	  difficult to express - you would have to revert to a mode of
	  expression that did not involve a 200 response (hash, 303,
	  blank node, etc.).
        </p>

        <p> 
          Second, care must be taken to ensure that 
	  the two senses can be recovered 
	  even in the presence of 
          inference.  Especially troubling would be if 
          equations were inferred for one sense
 	  that were unsound for the other, e.g. an incorrect
          identification of two information resources on the basis of
          an identity between the things they describe.
          To rule this out would require 
          adoption of practices and conventions designed to prevent
          such conclusions, such as avoiding the use of functional
          properties and owl:sameAs in conjunction with URIs subject
          to dual use.
        </p>
      </div2>


      <div2 id="depends">
        <head>'Hashless' URI dereferences to its definition (incompatibly)</head>

        <p>
	  Under this proposal, a dereferenceable URI would, in some
          cases at least, get its meaning according to a definition
          found in the document to which the URI dereferences,
	  rather than according to the 
          <loc href="#ir-ref"
	  >IR reference rule</loc>.
	  This approach avoids the deployment and performance
	  difficulties of 303
          redirects.  Defining a URI is easy - it is the same as
          publishing any Web document - and access to the definition 
	  is also easy, not
          requiring an indirection step.
        </p>

        <p>
          This would be an incompatible change, as tools
          that assume that u
          refers to IR(u) will misunderstand
          uses of u where u is meant to be defined by the definition
          in IR(u)'s content, and vice versa.
          However, 
          most of the time, a URI does not dereference to a definition
          of itself.  Therefore
          it might make sense for some of those URIs to refer
          to their information resources.  This would maintain
          backward compatibility for those URIs, at least, limiting
          the damage incurred by the incompatible change.  (Tools that
          uniformly assume the IR reference rule would still be
          incompatible, of course.)
        </p>

        <p>
          The challenge is how to distinguish the two situations.  The criterion
          "provides a definition of URI u" is not machine
          actionable as stated, both because the definition might be couched
          in an arbitrary language or notation, and because it is not obvious
          how to distinguish content that contains a
          definition of a particular URI from content that doesn't.
          But an approximation to the criterion
          might be made actionable, based on some combination of media type
          and aspects of the content.  One approximation that has been
          proposed is as follows: If
          IR(u) has a version with media type 'application/rdf+xml', then
          take u to be defined by IR(u), otherwise take u to refer to
          IR(u).  This rule would generate false positives (e.g. RDF
          documents that do not mention u) and false negatives
          (e.g. definitions available only in a text/html version),
          but it illustrates the idea.
        </p>
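        <p>
          Stated as code, the approximate rule might read as in the
          sketch below.  The function name referent_of and the returned
          labels are hypothetical; the input is assumed to be the set of
          media types of the versions of IR(u), however the client has
          learned them.
        </p>
        <eg><![CDATA[
```python
def referent_of(media_types):
    """Apply the approximate rule to the media types of the versions of IR(u)."""
    if "application/rdf+xml" in media_types:
        # Some version is RDF/XML: take u to be defined by IR(u).
        return "defined by IR(u)"
    # Otherwise fall back to the IR reference rule.
    return "IR(u)"


print(referent_of({"application/rdf+xml", "text/html"}))
print(referent_of({"text/html"}))
```
]]></eg>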

        <p>
          Some machine-actionable rule is desirable, since without one there
          is no reliable way to use <emph>any</emph>
          hashless dereferenceable URI u to
          refer to IR(u), and all currently deployed metadata would fail.  There
          would always be the possibility that 
          u might be understood to be defined by IR(u) instead.
        </p>

        <p>
          Whatever rule is adopted (if any),
          for those URIs u whose meaning would be changed
          incompatibly, another way would have to
          be provided to refer to IR(u), so that metadata applicable
          to IR(u) could be written.  This could be done in RDF 
          given a standard way to write the predicate corresponding
          to what we've been calling 'is accessible via'.  For
          example, the Turtle term
        </p>
        <eg>
          [ :accessibleVia "http://example/p16"^^xsd:anyURI ] </eg>
        <p>
          could be a new way to refer to 
          IR('http://example/p16'), which we formerly
 	  referred to in Turtle as '&lt;http://example/p16&gt;'.
          A local shorthand
          could be defined to the same effect:
        </p>
        <eg>
          :about-p16 :accessibleVia "http://example/p16"^^xsd:anyURI . </eg>
        <p>
	  (Note that either a 'hash' URI or a 303 URI could be used to
	  refer to an information resource, perhaps defined in this way.)
        </p>
        <p>
          Or the referring document could just assert that it's using
          the URI to refer to the IR in question:
        </p>
        <eg>
          &lt;http://example/p16&gt; :accessibleVia "http://example/p16"^^xsd:anyURI . </eg>
        <p>
          which would constitute an explicit opt-in to the
	  <loc href="#ir-ref">
	  IR reference rule</loc>, running some
	  interoperability risk.  (This would be an
	  instance of <specref ref="colocate"/>.)
          <footnote>

	  <p>
	    One might think that the notation 
	    for referring to information resources could relate the
	    information resource to the referent of u (written
	    '&lt;http://example/p16&gt;' in Turtle) instead of to the
	    URI u itself
	    (written '"http://example/p16"^^xsd:anyURI'):
	  </p>
	  <eg>
	    [ rdfs:isDefinedBy &lt;http://example/p16&gt; ] </eg>
	  <p>
	    However, the meaning of this expression is then sensitive to the
	    interpretation of the URI 'http://example/p16', which
	    is what is in doubt and is therefore what the notation
	    has to avoid depending on.
	    The &lt;...&gt; notation is also
	    ambiguous according to RDF semantics, because
	    if two URIs, say 
	    'http://example/p16' and 'http://example/canoe571',
	    both refer to the same thing (whatever it is), there might
	    be two distinct information
	    resources IR('http://example/p16') and
	    IR('http://example/canoe571') satisfying this relationship,
	    with no way for the property to choose between them.
	  </p>
	  </footnote>
	</p>

        <p>
          To avoid the need for the :accessibleVia notation,
          some convention might be 
          used to provide a URI (other than u) to refer to IR(u), when
          one is available.
          This could be done using a Link: HTTP response header, or
          via an RDF statement such as
        </p>
        <eg>
          &lt;http://example/p16#ir&gt; :accessibleVia "http://example/p16"^^xsd:anyURI . </eg>

      </div2>

    </div1>


    <div1>
      <head>Summary</head>
      <p>
        The following table summarizes the current and candidate discovery methods,
        evaluating each against a set of criteria, as described below.
      </p>
      <table rules='all'>
       <thead>
        <tr><td></td> <td>compatible?</td>
                      <td>robust?</td>
                      <td>easy to deploy?</td>
                      <td>min round trips</td> 
                      <td>ns scales?</td> 
                      <td>&gt;1 definition?</td> 
                  </tr>
       </thead>
       <tbody>
        <tr><td><loc href="#hash"
                 >Hash</loc>    </td>
            <td>+</td>
            <td>-</td> <td>+</td> <td>1</td>
            <td>-</td>
            <td>+</td></tr>

        <tr><td><loc href="#303"
                 >Hashless + 303</loc>    </td>
            <td>+</td>
            <td>+</td> <td>-</td> <td>2</td>
            <td>+</td>
            <td>+</td></tr>

        <tr><td><loc href="#suffix"
                 >Hash + fixed suffix</loc>    </td>
            <td>+</td>
            <td>-</td> <td>+</td> <td>1</td>
            <td>+</td>
            <td>+</td></tr>
        <tr><td><loc href="#hostrule"
                 >Hashless + definition-rule</loc></td>
            <td>+</td>
            <td>+</td> <td>?</td> <td>1+&epsilon;</td>
            <td>+</td>
            <td>+</td></tr>
        <tr><td><loc href="#newhttp"
                 >Hashless + new HTTP</loc> </td>
            <td>+</td>
            <td>+</td> <td>-</td> <td>1</td>
            <td>+</td>
            <td>+</td></tr>
        <tr><td><loc href="#chimera"
                 >Overload</loc></td>
            <td>+</td>
            <td>+</td> <td>+</td> <td>1</td>
            <td>+</td>
            <td>-</td></tr>
        <tr><td><loc href="#depends"
                 >Depends</loc></td>
            <td>-</td>
            <td>+</td> <td>+</td> <td>1</td>
            <td>+</td>
            <td>+</td></tr>
       </tbody>
      </table>

      <glist>
        <label>compatible?</label>
        <def> 
          Does it avoid assigning a new, incompatible definition to existing URIs?
        </def> 

        <label>robust?</label>
        <def> 
          Is the URI free of fragment identifiers that can get lost?
        </def> 

        <label>easy to deploy?</label>
        <def> 
          Can a publisher with a file-upload-only hosting solution use 
          this method?
        </def> 

        <label>min round trips</label>
        <def> 
          How many network round trips are needed to find
          a definition, assuming (a) the definition is not cached and
          (b) the /.well-known/host-meta cache misses with probability
          &epsilon; ?
        </def> 

        <label>ns scales?</label>
        <def> 
          Can definition-containing document sizes be bounded as
          namespaces grow in size?
        </def> 

        <label>&gt;1 definition?</label>
        <def> 
          Can distinct definitions give the same meaning to distinct URIs?
        </def> 
      </glist>

      <p>[Draft note: 
	  For reference, 
         <a href="http://hueniverse.com/2008/09/discovery-and-http/"
          >here</a>'s a similar analysis - not the same problem, but a
        related one - with its own matrix.]
      </p>
    </div1> 


    <div1 id="ir">
      <head>Appendix. About information resources</head>

        <p>
          "Information resources" figure in this story
          as providers of definitions, as things that
          one refers to (metadata subjects), and as things
	  that are the referents of URIs.  As the desire to refer to
          information resources using dereferenceable URIs competes with the
          proposal (<specref ref="depends"/>) to refer to other things
          using those same dereferenceable 
          URIs, it is important to understand what needs are 
          met by information resources, what kinds of things one
          says about them, and what one means by
          saying things about them.
        </p>

      <div2>
        <head>Use case: Preparing and consuming metadata 
          for a Web-accessible information resource</head>

        <p>
          Bob is preparing a bibliography.  He finds a report on
          cicadas provided by Alice at the URI 'http://example/cicada'
          and wishes to refer to the report for the purpose of composing
          metadata such as its title, author, and
          publication date. He selects a URI, blank
          node, or other term to use to refer to the
          report, then composes the metadata,
          using his term in the metadata to refer to the report.
	  (Bob's term might be 'http://example/cicada' but could be
          something else,
          if there is the possibility that 'http://example/cicada'
	  does not refer to Alice's document.)
        </p>
        <p>
          Subsequently Carol encounters an entry from Bob's
          bibliography.  Wanting
          to know what the subject of the entry is, she is led somehow
          (depending on discovery method) from Bob's term to
          Alice's URI, and from there to Alice's document
          IR('http://example/cicada'), which is the document 
	  that Bob's term refers to.
        </p>

      <p>[Draft note: DB:
       a. What URI should Bob use to refer to the report?
       b. How should Carol know to dereference http://example/cicada ?
       c. How should Carol know that Bob's URI is intended to refer to
       IR('http://example/cicada')?]</p>

      </div2>

      <div2>
        <head>Natural history of information resources</head>
        <p>
          The following explains the particular theory of "information
          resources" assumed in this report.  The theory is
          independent of how one refers to information resources.
	  More elaborate theories
          are certainly possible, but this is all we <emph>need</emph> to
          assume in order to explain how information resources work and
          what they are good for.
        </p>

        <p>
          Each information resource has one or more
          associated <emph>versions,</emph> where each version 
          is a <emph>fixed information resource,</emph> consisting of
	  fixed content (octet
          sequence) and additional information (media type, language)
          affecting the interpretation of the content.
          Different versions may be appropriate at different times 
          or in different interaction contexts.
          No particular meaning is implied by the word "version"; the
          word is chosen as suggestive of its most common use.
        </p>

        <p>
          Metadata statements such as those giving authorship, title,
          and topic are true or false of fixed information resources
          in the obvious way &mdash; they are true according to the
          content, its interpretation, or its provenance.
          Such statements
          also apply to arbitrary information resources in a systematic way,
          as follows: If
          a statement is true of all versions of the
          information resource, 
	  then the statement should be taken as true of the information
          resource, and vice versa.
        </p>

        <p>
          Operationally, this means that if you have knowledge of 
          an information resource's versions, you can write metadata using
          the information resource as subject, and someone reading this
          metadata can then apply that metadata to whatever version
          they access.
        </p>
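
        <p>
          As a minimal sketch of this rule, assuming that a version can
          be modelled as a record and a metadata statement as a predicate
          over versions, the following Python lifts a statement from
          versions to the information resource.  The names FixedIR,
          holds_of_resource, and about_steel, and the sample contents,
          are illustrative assumptions and are not defined by this report.
        </p>
        <eg>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FixedIR:
    """A version: fixed content plus information affecting its interpretation."""
    content: bytes
    media_type: str
    language: str
    topics: frozenset = frozenset()  # hypothetical stand-in for "topic" metadata


def holds_of_resource(versions, statement):
    """A statement holds of an information resource exactly when it holds
    of every one of its versions."""
    if not versions:
        raise ValueError("an information resource has one or more versions")
    return all(statement(version) for version in versions)


def about_steel(version):
    """Example statement: steel is a topic of this version."""
    return "steel" in version.topics


# Hypothetical versions of one information resource.
english = FixedIR(b"Chicago and its steel mills", "text/plain", "en",
                  frozenset({"chicago", "steel"}))
french = FixedIR(b"Chicago et ses acieries", "text/plain", "fr",
                 frozenset({"chicago", "steel"}))
stub = FixedIR(b"Chicago", "text/plain", "en", frozenset({"chicago"}))

print(holds_of_resource([english, french], about_steel))
print(holds_of_resource([english, french, stub], about_steel))
```
        </eg>
        <p>
          The first call prints True because steel is a topic of both
          versions; the second prints False because one version does not
          have steel as a topic, so the statement cannot safely be applied
          to whatever version a reader happens to access.
        </p>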

        <p>
          An information resource need not be accessible via a URI, or
          even have any associated URI at all.  An information resource
          might exist only inside a local file system or database, or it
          might be ephemeral.
        </p>

      </div2> 

      <div2 id="ir-ref">
        <head>Using a URI to refer to the information resource
              accessible via that URI</head>
        <p>
          To refer to the information resource accessible via a
          URI when that URI is dereferenceable, one generally uses the
          URI itself.
          E.g. 'http://example/ir' refers to IR('http://example/ir'),
          if 'http://example/ir' is dereferenceable.
          One might use such a URI in a
          metadata statement, for example: "The creator of
          http://example/ir is Carol", 
          or, expressed equivalently in Turtle,
        </p>
        <eg>
          &lt;http://example/ir> dc:creator "Carol". </eg>

        <p>
          If one wants to refer to an information resource, 
          but it isn't accessible via any URI, one might choose a URI,
          publish the information resource's versions
          at that URI, and then use the URI to refer to the
          information resource.
        </p>

        <p>
          An agent who encounters a URI and wants to know what the URI means
          can dereference it, and if the
          dereference is successful (HTTP 2xx status as opposed to 303 or 404 or
          anything else),<footnote>
            Simple redirects (301, 302, 307) are generally taken as
            transparent with respect to dereference, but this is a
            side issue that we don't want to take up in this report.
          </footnote>
          the agent can take the URI
          to be a reference to the information resource 
          that is accessible via that 
          URI.<footnote>
	    The "u refers to IR(u)" convention is a common and intuitive interpretation of
	    the HTTP specification and is in widespread
	    use.  In 2005 the W3C TAG confirmed this interpretation 
	    (in contrast to "IR(u) defines u") in 
	    its "httpRange-14 resolution"
	    <bibref ref="issue-14-resolved"/>.
          </footnote>
        </p>
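
        <p>
          The convention just described can be sketched as a decision
          on the HTTP response status.  This is an illustration under
          stated assumptions, not a normative algorithm: the function
          name and the returned strings are hypothetical, and the
          handling of 301, 302, and 307 only reflects the footnote's
          remark that simple redirects are generally taken as transparent.
        </p>
        <eg>
```python
SIMPLE_REDIRECTS = (301, 302, 307)  # treated as transparent; a side issue here


def interpret_dereference(uri, status):
    """Say what an agent may conclude about `uri` from one response status."""
    if status // 100 == 2:
        # Successful dereference: take the URI to refer to the
        # information resource accessible via that URI.
        return f"'{uri}' refers to IR('{uri}')"
    if status in SIMPLE_REDIRECTS:
        # Assumption: follow the redirect, then apply the rule again.
        return "follow the redirect and reconsider"
    # 303, 404, and anything else: this convention licenses no conclusion.
    return "no conclusion from this convention"


for code in (200, 303, 404):
    print(code, interpret_dereference("http://example/ir", code))
```
        </eg>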

        <example>
	<head>Sample information resource</head>
	  <graphic source="ir.png"
		   alt="Relationships among URI, IR, versions, metadata"/>
          <p>
	    In this example, the URI 'http://example/chicago'
	    dereferences on two different occasions to two different
	    fixed information resources.  
	    (Perhaps the document was edited, or is available in two
	    different languages.)
	    These fixed IRs are versions
	    of the information resource IR('http://example/chicago').
	    Steel is a topic of both versions.  If steel is a
	    topic of <emph>every</emph> version of
	    IR('http://example/chicago'), it will also be considered
	    a topic of IR('http://example/chicago').
          </p>
          <p>
	    Dashed lines indicate relationships that are induced by
	    circumstances.
          </p>
	</example>

      </div2>
    </div1>


    <div1>
      <head>Acknowledgments</head> 
      <p>
        David Booth, Michael Hausenblas, Nathan Rixham, and
        Alan Ruttenberg contributed to
        the creation of this report.
      </p>
    </div1>

    <div1>
      <head>References
      </head> 
      <blist> 
        <bibl id="issue-14-resolved"
              href="http://lists.w3.org/Archives/Public/www-tag/2005Jun/0039.html">
          Roy Fielding.
	  <titleref href="http://lists.w3.org/Archives/Public/www-tag/2005Jun/0039.html"
           >[httpRange-14] Resolved.</titleref>
          Email to www-tag list, 2005.
        </bibl> 

        <bibl id="issue-57"
              href="http://www.w3.org/2001/tag/group/track/issues/57">
          <titleref href="http://www.w3.org/2001/tag/group/track/issues/57"
           >Issue 57.</titleref>
          W3C Technical Architecture Group, 2007-2011.
        </bibl> 

        <bibl id="rfc3986"
              href="http://www.ietf.org/rfc/rfc3986.txt">
          T. Berners-Lee, R. Fielding, L. Masinter.
	  <titleref href="http://www.ietf.org/rfc/rfc3986.txt"
           >Uniform Resource Identifier (URI): Generic Syntax.</titleref>
          RFC 3986, IETF, 2005.
        </bibl> 
        
        <bibl id="rfc4395"
              href="http://www.ietf.org/rfc/rfc4395.txt">
	  T. Hansen, T. Hardie, and L. Masinter.
          <titleref href="http://www.ietf.org/rfc/rfc4395.txt"
           >Guidelines and Registration Procedures for New URI Schemes.</titleref>
          RFC 4395, IETF, 2006.
        </bibl> 
        
        <bibl id="rfc5988"
              href="http://www.ietf.org/rfc/rfc5988.txt">
          M. Nottingham.
	  <titleref href="http://www.ietf.org/rfc/rfc5988.txt"
           >Web Linking.</titleref>
          RFC 5988, IETF, 2010.
        </bibl> 

        <bibl id="hostmeta"
              href="http://tools.ietf.org/html/draft-hammer-hostmeta-13">
          E. Hammer-Lahav.
	  <titleref href="http://tools.ietf.org/html/draft-hammer-hostmeta-13"
           >Web Host Metadata.</titleref>
          Internet-Draft, IETF, 2010.
        </bibl> 

        <bibl id="webarch"
              href="http://www.w3.org/TR/webarch/">
          Ian Jacobs and Norman Walsh, editors.
	  <titleref href="http://www.w3.org/TR/webarch/"
           >Architecture of the World Wide Web, Volume One.</titleref>
          W3C Recommendation, December 2004.
        </bibl> 

        <bibl id="disambiguating"
              href="http://www.w3.org/2002/12/rdf-identifiers/">
          Sandro Hawke.
	  <titleref href="http://www.w3.org/2002/12/rdf-identifiers/"
           >Disambiguating RDF Identifiers.</titleref>
          W3C, January 2003.
        </bibl> 

        <bibl id="uriqa"
              href="http://sw.nokia.com/uriqa/URIQA.html">
          Patrick Stickler.
	  <titleref href="http://sw.nokia.com/uriqa/URIQA.html"
           >The URI Query Agent Protocol.</titleref>
          Nokia, 2010.
        </bibl> 

        <bibl id="giovanni"
              href="http://lists.w3.org/Archives/Public/www-tag/2007Jul/0034.html">
          Giovanni Tummarello.
	  <titleref href="http://lists.w3.org/Archives/Public/www-tag/2007Jul/0034.html"
	   >http-range-14 303 issue, request for reopening the 
	    discussion.</titleref>
          Email to www-tag list, 2007.
         </bibl>

      </blist>

    </div1> 
  </body> 
</spec>


	<!-- 
        <label>FV(u)</label>
        <def> 
          FV(u) is shorthand for the meaning of a URI u
          according to the definition of u in (a version of)
          the information resource IR(u).  For
          example, if IR('http://example/p16') says that 
          'http://example/p16' refers to Alice's canoe,
          then FV('http://example/p16') is Alice's canoe.
          ('FV' stands for 'take at face value'.)
        </def>
 -->


      <!-- 
      <div2>
        <head>Alternative URI schemes and/or URN namespaces</head> 
        <p>
          The purpose of URI scheme registration is to create new
          classes of URIs with meanings specified by the
          registration.  That is, the registration is a definition
          (perhaps partial) of the meanings of the URIs having that
          scheme.
        </p>
        <p>
          One could derive a URI to refer to a canoe from a URI
          that dereferences to a definition (of the derived URI) by prefixing
          a particular URI prefix to
          the URI for the definition, e.g. fv:http://example/about-p16 .
        </p>
        <p>
          tdb: is close to this, but it covers the primary-topic-of
          use case, not the mint-a-term one - these would not have the
          same behavior.
        </p>
        <p>
          The process for registering a URI scheme is documented by 
          RFC 4395, and for registering a URN namespace is in RFC 
          3406.
        </p>
        <p>
          A problem shared by all non-http URIs is that they won't "work" in
          unmodified browsers.  (But "it's not about 
          browsers," cries Mark Wilkinson.)
        </p>
      </div2>
       -->



        <!-- 
        <p>
          Variant use case: Same as above, but Bob's bibliography
          includes a number of RDF 
          documents, and his metadata includes information relevant
          for making use of those RDF documents.
        </p>
        <p>
          Variant use case: Same as above, but instead of being a 
          person, Bob is a tool that
          is charged with updating all the documents on a Web site with
          license metadata.
        </p>
 	-->

      <!-- 
      <p>
        [Terminology option: Maybe "metadata subject" instead of
        "information resource"??]
      </p>
      -->


        <!-- 
        <p>
          (Why would one be dealing with both kinds of statements 
          at the same time?  Well, the two groups of statements
          might be inserted as RDFa into a
          single HTML document by different tools, or by different
          modules in a content management system.  Or the statements
          might be combined in a single triple store from multiple sources.)
        </p>

        <p>
          (Another way to make sense of this approach is to say that
          URI u refers to IR(u),
          but predicates such as foo:mass and foaf:name have their
          domains expanded 
          to include information resources, and IR(u) is "coerced" to
          FV(u) as needed in order for the predicates to make sense.
          In this view it is the predicates that are the chimeras, not
          the entities they apply to.)
        </p>
        -->

	<!-- 
        <p> 
          Second,
          if the definition of 'http://example/p16'
	  happens to specify an information resource
          other than IR('http://example/p16'), we will end up
          with incorrect statements, since metadata for two distinct
          information 
          resources would be attributed to a single entity.
          Consider, for example, the case where copyright license A applies to
          IR('http://example/p16') and copyright license B applies to
          FV('http://example/p16').  This would lead to both licenses
          being applied to CH('http://example/p16'), which would be
          impossible to interpret correctly, as neither subject is
          such that both licenses apply to it.
          We would have to obtain general agreement that the
          definition at IR('http://example/p16')
          must not
          lead to the URI being understood to refer to
          any information resource other than
          IR('http://example/p16') itself.
        </p>
 	-->

      <!-- 
      <p>
        [Draft note: still thrashing on terminology "definition"
        vs. "documentation" vs. "account"]
      </p>

      <p>
        Languages such as OWL and RDF that
        pervasively use URI-based vocabularies require that
        one be able to refer [mean?], in those languages, to things one
        has to refer to,
        in such a way that the reference will be understood by someone
        encountering the reference.  These references either are URIs
        or are built on URIs, so the problem of referring
        reduces to that of either knowing, or influencing, the way that
        readers will interpret URIs referentially.
      </p>
      -->
