<?xml version="1.0" encoding="UTF-8"?> 
<?xml-stylesheet type="text/xsl" href="../../../doc/xmlspec.xsl"?>
<!DOCTYPE spec SYSTEM
"http://www.w3.org/2002/xmlspec/dtd/2.6/xmlspec.dtd" [ 
<!--
================================================================
--> 
<!ATTLIST spec xmlns:xlink CDATA #IMPLIED>
<!ENTITY mdash " &#8212; "> 
<!ENTITY epsilon "&#949;"> 
<!ENTITY Oacute "&#211;"> 

<!ENTITY draft.day "2"> 
<!ENTITY draft.monthname "February"> 
<!ENTITY draft.year "2012">
]>

<!-- 
  Reference
  http://hueniverse.com/2009/11/xrd-alignment-with-link-syntax/ ??

  Reference http://tools.ietf.org/html/draft-hammer-discovery-06 ??
    (but that's only for documents)

  Leo S: * try to quote the TR/Cooluris/ as much as you can because it
  was very heavily peer reviewed by TAG and IvanHerman and others.

   LINK ELEMENT?

  Larry doesn't like the time dependence of it all.
  Ashok wonders how this relates to Web Linking.

  Larry: my concern about hash and 303 is that they make meaning depend
  on deployment of services infrastructure, which cost power and money
  to maintain, and that to mean something you shouldn't have to have a
  foundation

  Larry: I'd like to see in section 5 the issue of requirement for
  long-term availability of URI data or services (303 or hash)
    [addressed in another document...]

  HT: You might want to reference XRI and a whole bunch of other stuff...

  Jeni: as we highlighted at the F2F, as we come to use named graphs
  to enable us to describe the provenance/trust/temporal coverage
  characteristics of particular sets of information, it's going to
  become really important to keep the notions of document/named graph
  and the topic of that document/named graph separate in some way. I
  think you could make more of a case for that, to counter the view
  that information about documents isn't really important.
     - added mention of provenance as an app that needs this
-->

<!-- 
DB:
 16. Similarly, I think it would be helpful if each proposed solution
 explicitly stated what Alice, Bob and Carol should do, according to that
 solution: "According to this approach, in scenario 2.1, Alice
 should . . . Bob should  . . . Carol should . . . ".  I first noticed
 the need for this in sec 3.4 (LSID), perhaps because I don't know the
 details of how LSID works.
 -->

<!-- 
Alan R: - feeling at atm - just before glossary, is that good content but
better presentation order needed.
In intro make clear that it is use of URI is in sentences. (there are
other points to move there, I think).
 -->

<!-- Providing and discovering URI documentation -->

<spec xmlns:xlink="http://www.w3.org/1999/xlink" w3c-doctype="wd" role="editors-copy"> 
  <header>

    <title> Providing and Discovering URI Documentation
    </title>

  <!-- 
    <w3c-designation>http://www.w3.org/TR/2009/WD-hash-in-url-20090415/</w3c-designation> 
  -->
    <w3c-doctype>Editor's Draft</w3c-doctype> 
    <pubdate> 
      <day>&draft.day;</day>
      <month>&draft.monthname;</month> 
      <year>&draft.year;</year>
    </pubdate> 

    <publoc> 
  <!-- 
      No stable URI for this version.  When citing please specify date
      given above.
  -->

      <loc href="http://www.w3.org/2001/tag/awwsw/issue57/20120202/" >
        http://www.w3.org/2001/tag/awwsw/issue57/20120202/</loc>

    </publoc>

    <prevlocs>
      <loc href="http://www.w3.org/2001/tag/awwsw/issue57/20120130/" >
        http://www.w3.org/2001/tag/awwsw/issue57/20120130/</loc>,
      <loc href="http://www.w3.org/2001/tag/awwsw/issue57/20110625/" >
        http://www.w3.org/2001/tag/awwsw/issue57/20110625/</loc>
    </prevlocs>

    <altlocs>
      <loc role="xml" href="issue57.xml"
           xlink:type="simple">XML</loc>
    </altlocs>
    <latestloc> 
      <loc href="http://www.w3.org/2001/tag/awwsw/issue57/latest/" 
        >http://www.w3.org/2001/tag/awwsw/issue57/latest/</loc> 
    </latestloc>  

    <authlist> 
      <author>

        <name>Jonathan A. Rees
        </name> 
        <email href="mailto:rees@mumble.net"
	   >rees@mumble.net</email> 
      </author>

    </authlist> 
    <status> 
      <p>
        This report has been developed by the 
        <loc href="http://www.w3.org/2001/tag/awwsw/"
          >AWWSW Task Group</loc>
        of the
        <loc href="http://www.w3.org/2001/tag/"
          >W3C Technical Architecture Group</loc>
        in order to provide background material for further discussion
        among those affected by this architectural question, and to help drive
        TAG issue 57 <bibref ref="issue-57"/> to a conclusion.
	The task group's public discussion list is
	public-awwsw@w3.org
        (<loc href="http://lists.w3.org/Archives/Public/public-awwsw/" 
          >archive</loc>).
      </p> 

      <p>
        Earlier versions of this document have been reviewed by the
        task group and the TAG but this version has not.
	The content of this version is the sole responsibility of the
        editor.	<!-- 
	, and has not been formally endorsed by the task group
        or the TAG. -->
      </p>

      <p>
        Publication of this draft
        does not imply endorsement by the W3C Membership. This is
        a draft document and may be updated, replaced, or obsoleted by
        other documents at any time.
      </p> 

      <p>
	<!-- 
        Please send comments on this
        document to the editor at
	<loc href="mailto:rees@mumble.net" 
	 >rees@mumble.net</loc>.
 	-->
	Please send comments on this
	document to the publicly archived TAG mailing list 
	<loc
	    href="mailto:www-tag@w3.org">www-tag@w3.org</loc>
	(<loc href="http://lists.w3.org/Archives/Public/www-tag/"
	   >archive</loc>).
      </p>

      <!-- 
      <p>
        Changes expected for the next version of the document include:
        make the problem of
        metadata incompatibility much more prominent in abstract.
      </p>
 -->

    </status> 

    <abstract> 
      <p>
        The specification governing Uniform Resource Identifiers
        (URIs) <bibref ref="rfc3986"/> allows URIs to "identify" anything at all,
        and this unbounded flexibility is exploited in
        a variety of contexts, notably Semantic Web and Linked Data
        applications.
        To exercise this freedom and use a URI to "identify" (or more
        generally "mean")
        something, an agent (a) selects a URI,
        (b) provides documentation for the URI in a manner that
        permits discovery by agents who encounter
        the URI, and (c) uses the URI.  
	Subsequently other agents may not only understand the URI (by
        discovering and consulting the documentation) but may also use
        the URI themselves with the intended meaning.
	<!--  redundant:
	As long as the URI documentation remains
        discoverable, the URI may then be used and understood by other
        agents.
        or,
        As long as the URI documentation remains discoverable, agents
        encountering the URI will be able to understand it [to the
        extent that the URI documentation is helpful].
         -->
      </p>
      <p>
        A few widely known methods are in use to help agents provide
        and discover URI documentation,
        including RDF fragment identifier resolution and the HTTP 303
        'See Other'
        redirect.  
        Difficulties in using these methods
        have led to a search for new methods that
        are easier to deploy, and perform better,
        than the established ones.  
	However, some of the proposed methods introduce new problems, such
        as incompatible changes to the way metadata is written.
	This report
        brings together in one place information on current and
        proposed practices, with analysis of benefits and shortcomings
        of each.
      </p>
      <p>
        The purpose of this report is not to make recommendations but
        rather to
	explore the design space and
	initiate a discussion that might lead to
        consensus on the use of current and/or new methods.
      </p>
    </abstract> 

    <langusage> 
      <language id="en-US">English</language> 
    </langusage>

    <revisiondesc> 
      <p>
        <ulist> 
          <item>
            <p>$Id: issue57.xml,v 1.4 2012/02/03 01:53:59 jrees Exp $
            </p>
          </item>          
        </ulist> 
      </p> 
    </revisiondesc> 
  </header>

  
  <body> 
    <div1>
      <head>Introduction</head>

<p><emph>This is an old issue, and people are tired of it.  
&mdash;Sandro Hawke, January 2003</emph> 
<bibref ref="disambiguating"/></p>

      <p>
        In any kind of discourse it is very useful for an agent to be
        able to provide documentation for a term, in such a way that other agents
        can discover and use that documentation in order to make sense of
        utterances that use that term, and to compose new utterances
        that use it.
      </p>

      <example>
	<head>Term documentation discovery</head>

	<graphic source="discovery.png"
		 alt='Term documentation for "EQ 018"'/>

	<p>
	  Suppose that Alice, in
	  communication with Bob, uses
	  the term "EQ 018" to mean
	  the Loma Prieta earthquake, as in "Alice was in the laboratory
	  during EQ 018".  If Bob does
	  not know what "EQ 018" means, he will have to find out. He
	  might be able to ask Alice directly, although  
	  this may be impossible, as Alice might be too busy, or
	  otherwise unavailable.
	  Lacking that option he does some research, consulting
	  a dictionary or similar resource (reference book, database, 
	  search engine)
	  in order to obtain the 
	  explanation of Alice's use of the term "EQ 018".
	</p>

	<!-- 
	<p>
	  The essential idea is that there are one or more methods
	  available to Bob by which he can discover 
	  bits of writing that explain what 
	  what Alice
	  means by "EQ 018".
	</p>
 	-->

      </example>

      <p>
        In this report, the terms to be documented are assumed to be
	URIs.  URIs can be used 
	to mean all sorts of things
	in many different technical contexts.  Contexts of 
	special interest to this report are
	those processed by machine,
 	including the RDF and OWL family of languages.  The question
	may appear to 
	be limited to RDF and its derivatives, but to the
	extent that there is supposed to be a single 
	meaning for each URI common to RDF and Web architecture
	<bibref ref="webarch"/>, the issue transcends RDF.
      </p>

      <p>
        The nature of URI documentation need not concern us here; many forms
        are familiar, including translation between
        languages (e.g. providing an English or Spanish phrase equivalent to a
        URI), descriptions (the URI refers to an entity possessing
        some set of properties), explanation by example, axiomatic
        method, and so on.  Also
        not of concern here are the many ways in which
        meaning can fail as a result
        of <emph>what</emph> URI documentation says or doesn't say about the
        URI in question, or the particular way in which a URI is
        used.  Our concern is only with 
        the method by which documentation is conveyed, and with meaning
        only to the extent the method impinges on interpretation.
      </p>

      <p>
        URI documentation is typically carried in documents.  No
        assumptions are made about what else might be in such a
        document; there could be additional related information,
        documentation for other URIs, and so on.  Nor is it important
        here that URI documentation be delimited or set off from the other
        information in the document.  As in an encyclopedia, the
        URI documentation part blurs into the other-information parts of the
        document.
      </p>

      <p>
        URI documentation
        discovery methods
        include, in addition to those already mentioned, network
        protocols such as HTTP that involve the URI as a protocol element.  
	Henceforth, in a URI documentation discovery scenario, the URI whose URI
        documentation is to be discovered will be called
        the <emph>probe URI</emph>.
      </p>

      <p>
        URI documentation discovery is similar to Web retrieval in that in
        both cases one can start with a URI and end with a document.
	The two must not be confused, however, since retrieval often
        yields information that does <emph>not</emph> document the URI,
        is not recognized as doing so, or is not intended to do so.
      </p>

      <p>
	The reason we define URI documentation discovery methods is 
	interoperability: so that there is agreement on how each URI
 	is to be understood.
	In principle, we only need consensus on methods, such as the ones
	surveyed here, for URIs
	that are to be shared widely.  If 
	agents in one community never use the URI in communication with
	agents in another community, then it is OK for the URI
	to have
	distinct senses in the two communities, and there is no
	problem to be solved.  Each community can use the URI in its
	own way, and there will be no confusion.
      </p>

      <p>
        The operative word here is "if".  Isolation is fragile and
        means lost opportunities for synergy and unintended reuse.  All
        the arguments in favor of a World Wide Web, which depends on the
        global nature of the URI vocabulary, apply here.
      </p>

      <p>
        This report presents discovery methods in current use,
        reports some 
        criticisms of them, and describes some additional discovery methods that
        have been proposed to address the criticisms.
      </p>

      <!-- 
      <p>
        [Draft note: Maybe talk in the introduction about alternatives
        to documenting a URI: using
        non-URI phrases and syntactic sugar (these used to be sections).
	Discussion currently relegated to <specref ref="ddi"/>. ]
      </p>
     -->

      <div2 id="desiderata">
        <head>Desiderata</head>
	<p>
	  No consensus on criteria for judging success of a
	  discovery method
	  has emerged from the
	  discussion of this question.  The following properties have
	  been articulated as desirable by various parties to the
	  discussion.  Unfortunately they apparently form a mutually
	  inconsistent set.
	</p>

	<glist>
	  <!-- 
	  <label id="d.simple">
	      Simple
	  </label><def>
	    Having too many options or too many things to remember makes
	    discovery fragile and impedes uptake.
	  </def>
 	  -->

	  <label id="d.uniform">
	      Uniform
	  </label><def>
	    The URI, considered as a reference to something, should
	    make sense on its own, independent of context or community of use.
	    Its meaning or "identification" should be uniform regardless of
	    whether it's used as a protocol element, hyperlink, or name.
	    This property cannot necessarily be enforced through technical
	    design, but a discovery solution should not depend on non-uniform
	    meaning.  <bibref ref="rfc3986"/>
	  </def>

	  <label id="d.retrieval">
	      Retrieval-friendly
	  </label><def>
	    It should be possible to configure
	    a URI that has discoverable documentation
	    so that a retrieval request using the URI yields information
	    (such as URI documentation)
	    that is relevant and useful, especially in a Web browser
	    context.
	  </def>

	  <label id="d.easy">
	      Easy to deploy using a current widely deployed protocol stack
	  </label><def>
	    Discovery should employ a widely deployed network protocol
	    such as HTTP in order to avoid the need to deploy a
	    new protocol stack.  (This would likely be implied by "retrieval
	    friendly".)
	  </def>

	  <label id="d.hosting">
	      Easy to deploy on Web hosting services
	  </label><def>
	    Uptake of linked data depends on the technology being
	    accessible to as many Web publishers as possible, so
	    discovery should not require control over Web server behavior that
	    is not provided by typical hosting services.  For example,
	    some hosting services provide no way to deliver a 303 redirect.
	  </def>

	  <label id="d.efficient">
	      Efficient
	  </label><def>
	    (a) Accessing URI documentation should require at most one network
	    round trip, and (b) URI documentation should be cacheable.
	  </def>

	  <label id="d.resistant">
	      Substitution resistant
	  </label><def>
	    A URI should be conveyed unmodified through
	    typical client and server side data flows, so that it
	    retains its utility (such as for bookmarking).
	    Difficulties might include (a) content management systems that strip
	    fragment ids off of URIs at inopportune moments, and (b) Web
	    browsers that replace a redirected URI with its
	    redirect target.
	    Also (c) misspellings yielding invalid URIs should become evident
	    through routine error checks.
	  </def>

	  <label id="d.metadata">
	      Compatible with use of URI as metadata subject
	  </label><def>
	    Convention 1 (below) is widely observed, and it would be
	    nice if discovery methods didn't interfere with it.
	  </def>

	  <label id="d.inference">
	      Compatible with inference
	  </label><def>
	    URIs should participate gracefully in deployed frameworks for
	    ontologies and logical inference, specifically RDF and OWL.
	  </def>
	</glist>
	<p>
	  It is not certain that all of these goals can be met
	  simultaneously.
	</p>

	<p>
	  As any overall discovery <emph>solution</emph> will combine a
	  number of methods, avoiding conflict between adopted methods is
	  also a goal for any solution.
	</p>
      </div2>


    </div1> <!-- end introduction -->


    <div1>
      <head>Use case scenarios</head> 

      <p>
        Use cases need to be presented as being independent of any
        particular solution to be used, in order that the solution space
        can be explored without bias.  This leads to some
        frustrating vagueness in the following, but the vagueness is
        intentional and necessary.
      </p>

      <div2 id="uc.abstraction">
        <head>Choose a URI, provide documentation for the URI, then use
          the URI</head> 
        <p>
          Alice wants to refer to a particular earthquake.
          Alice "mints" a new URI (one that is not yet in use) with the
          purpose of using that URI to refer to the earthquake.  Alice
          publishes a document containing documentation for the URI, i.e.
	  a document that
          would lead a reader to understand that the URI refers to the
          earthquake.
        </p>
        <p>
          Bob then learns of Alice's URI and its documentation, and uses
	  the URI in a document
          of his own.
        </p>
        <p>
          Subsequently Carol encounters Bob's document.  Wanting to
          know what the URI means, she 
          is led somehow to Alice's published URI documentation, which she
          reads.  She is enlightened.
        </p>

	<p>
	  Any method for implementing this use case would need to explain:
	  what kind of URI Alice should use (syntactic constraints);
	  where and how Alice should publish the documentation so that it
	  can be found;
	  and how Carol might come to discover Alice's documentation, given
	  the URI.
	</p>

      </div2>

      <div2 id="uc.chicago">
        <head>Using a document as URI documentation by reference to its
          primary topic</head>  

	<ednote>
	  <date>2011-04-14</date>
	  <edtext>
	    Consider dropping this use case, and explain the
	    situation in some less prominent way.
	    The only evidence we have for this situation is from 
	    <loc href="http://lists.w3.org/Archives/Public/semantic-web/2011Apr/0001.html"
	     >Hugh Glaser's message</loc>,
	    and most of the discussion in this document does not apply
	    to this case.  On the other hand it is important to
	    understand the distinction being made.
	  </edtext>
	</ednote>

        <p>
          Bob desires to refer to Chicago.  
          He finds a Web page 
          on the Web at 'http://example/about-chicago' (provided by,
          say, Alice) that consists
          of a description of Chicago, and wants to use it for the
          purpose of referring to Chicago.  He chooses
          a URI and associates it with Alice's Web page 
          in such a way that Bob's URI will be understood as referring to
          Chicago.
        </p>
        <p>
          Carol encounters Bob's URI, is led to 'http://example/about-chicago'
          and thence to Alice's description of
          Chicago, and then somehow understands that Bob's URI is
          meant to refer to Chicago.
        </p>
        <p>
	  Any method for implementing this use case would need to
	  explain: what are the syntactic constraints on the URI Bob 
	  chooses; what
	  Bob needs to do to associate his URI with the document about
	  Chicago; and how Carol comes to discover and use that
	  association.
        </p>
        <p>
          (This differs from the previous use case in that the
	  document about Chicago was
          not written with the purpose of documenting Bob's URI.  In fact 
          Bob's URI doesn't even occur in it.  Rather than look 
	  in the document for 
          URI documentation for Bob's URI, Carol must determine the
          topic of the document and take the topic as the meaning of
          Bob's URI.)
        </p>
      </div2>

      <div2 id="uc.ir-ref">
        <head> Find instances of a resource at a URI, then refer to the
	       resource in metadata </head>
	<p>
	  Alice finds interesting retrieval responses at a URI and
	  wants to talk to Bob about them.  In communication with Bob,
	  Alice uses some agreed syntax (such as the URI itself, if
	  Alice and Bob so agree) to
	  refer to the resource of which 
	  these responses are instances. (An instance is a particular
	  kind of "representation," in the sense of
	  <bibref ref="rfc3986"/>, that may share
 	  properties such as title or subject with a resource of
	  which it is an instance.
	  See <bibref ref="generic"/> for discussion of instances and
	  generic resources.)
	</p>
	<p>
	  Upon receipt of this syntax, Bob attempts retrieval of an
	  instance using the URI.  If this succeeds then Bob
	  understands Alice to be
	  referring to the resource of which it is an instance, not to
	  something else.
	</p>
      </div2>

    </div1> 


    <div1>
      <head>URI documentation discovery methods in current general use</head>

        <p>
          This section describes currently accepted methods for
          providing and discovering URI documentation.
        </p>

      <div2 id="colocate">
        <head>Colocate URI documentation and use</head> 
        <p>
          One way to lead someone encountering a URI to documentation
	  for the URI is to
	  make sure that the URI documentation occurs in
	  each document in which the URI occurs.
	  This makes the URI documentation easy to find, since anyone who
	  encounters the URI will already have it in hand.  
	  The form of the URI in this case is arbitrary.
        </p>
        <p>
	  This method treats URIs similarly to blank nodes in RDF, which
	  have to stay close to their own documentation, since they
	  are scoped to a graph.  An example of the application of
	  this approach would be the use of a 
	  URI in an OWL ontology file that carries the URI documentation.
        </p>

        <p>
	  <emph>Criticism:</emph>
	  In RDF, this method is fragile in the same way as are blank nodes,
	  because use and documentation can get
	  separated, e.g. when uses of the URI are deposited into a
	  triple store and then retrieved by a query.  
	  Carrying documentation around with a reference 
	  does not help in 
	  the common case where an out-of-context reference is needed (as
	  one would want in, say, a Semantic Web).
	  (Desideratum: <specref ref="d.uniform"/>.)
        </p>
      </div2>


      <div2 id="cite-source">
        <head>Specifically point (link) to the URI documentation</head> 
	<!-- 
	<p>
	  [Draft note: HH said the former section title "Point (link)
	  to the URI documentation" was not specific
	  enough. "Link to URI documentation using a special 
	  kind of link"?]
	</p>
 	-->
        <p>
          When using a URI, provide,
	  again in the document in which the URI occurs,
          a recognizable
	  reference to a document that carries the URI documentation.  
          This is the approach taken by OWL; the document containing
          the URI is related to the one from which the 
          URI documentation should be obtained via the owl:imports 
	  relation.<footnote>More precisely, the URI documentation will be
	  found in the imports closure of the
          document containing the URI.</footnote>
        </p>
        <p>
          The rdfs:isDefinedBy property might also be used for this
          purpose, but it probably isn't.
        </p>
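        <p>
          As a rough illustration of the imports closure mentioned in
          the footnote above, the following sketch walks a table of
          import links and reports which documents in the closure
          document a given URI.  The tables passed in (which document
          imports which, and which URIs each document documents), the
          function names, and all example URIs are assumptions made
          for illustration; a real OWL tool would obtain this
          information by parsing the documents themselves.
        </p>
	<eg>
```python
from collections import deque


def imports_closure(start, imports):
    """Return the set of documents reachable from start via import links.

    imports is a hypothetical table mapping a document URI to the list of
    document URIs it imports. The start document is included in the result.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        doc = queue.popleft()
        for target in imports.get(doc, ()):
            if target not in seen:  # guards against import cycles
                seen.add(target)
                queue.append(target)
    return seen


def find_documentation(uri, start, imports, documented):
    """List the documents in the imports closure that document uri.

    documented is a hypothetical table mapping a document URI to the set
    of URIs for which that document carries documentation.
    """
    closure = imports_closure(start, imports)
    return sorted(doc for doc in closure if uri in documented.get(doc, ()))


# Hypothetical data: Bob's document imports a chain that contains a cycle.
IMPORTS = {
    "http://example/bob": ["http://example/geo"],
    "http://example/geo": ["http://example/quakes"],
    "http://example/quakes": ["http://example/geo"],
}
DOCUMENTED = {"http://example/quakes": {"http://example/eq018#_"}}

print(find_documentation("http://example/eq018#_", "http://example/bob",
                         IMPORTS, DOCUMENTED))
```
</eg>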

        <p>
	  <emph>Criticism:</emph>
	  Like the previous approach, this one is good so far as it
	  goes, but it suffers in similar ways.  The URI and the link to
	  its documentation can get separated, or 
	  keeping the documentation link close to the occurrence of the URI
	  may prove to be too difficult for applications.
	  (Desideratum: <specref ref="d.uniform"/>.)
        </p>

        <!-- 
        <p>
          Both of these properties beg the question in that
          they do not say how to figure out what the URI that is the
          target of owl:imports or rdfs:definedBy refers to. 
	  _____

          If the
          meaning of <emph>that</emph> URI had to be given by citing a source,
          there would be infinite regress.
        </p>
        -->
      </div2>

      <div2 id="not-http">
        <head> Use non-http: URIs and a non-HTTP protocol</head>

        <p>
	  It is possible to create a new URI scheme or URN
	  namespace equipped with its own URI documentation discovery regime.
	  A recent example is RFC 5870 for URIs documented as naming
	  geographic locations, where the RFC itself constitutes URI
	  documentation for all of its URIs.  Another is
	  <loc href="http://tools.ietf.org/html/draft-holsten-about-uri-scheme"
	  >the URI documentation for the URI about:blank</loc> and other
	  about: URIs,
	  which is in progress as of this writing.
	  A "tdb:" (thing-described-by) URI scheme has also been
	  proposed 
	  [TBD: cite 
	  <loc href="http://tools.ietf.org/html/draft-masinter-dated-uri"
	  >Masinter</loc>],
	  as has "xri:" for 
	  <loc href="http://tools.ietf.org/html/draft-yevstifeyev-xri-uri-rsrv-00"
	   >"extensible resource identifiers"</loc>
	  (n.b. xri: has been deprecated in favor of http: and Web Linking).
	  <!-- http://tools.ietf.org/html/draft-holsten-about-uri-scheme-06 -->
	  See <bibref ref="rfc4395"/> and <bibref ref="rfc3406"/>
	  for details.
	</p>

        <p>
	  The most fully developed and widely implemented such design is
          the 'lsid' URN namespace.
	    <!-- 
	    <footnote
            >Unfortunately the 'lsid' URN namespace is not in the
            IANA registry.  Someone encountering an LSID may need
            to do a search in order to locate the LSID specification and
            consequently determine what the LSID means.
	    In addition each LSID contains an "authority" field
	    whose meaning is not assigned by the LSID specification,
            requiring even more research on the part of someone trying
            to understand an LSID.
	    </footnote> -->
          URIs beginning 'urn:lsid:' are called LSIDs. <bibref ref="lsid"/>
          LSIDs have an associated SOAP-based 
          protocol that has separate methods for retrieval (getData)
          and discovery (getMetadata).
          According to the LSID specification,
          an LSID for which the getData method yields nonempty
          content refers to a 
	  <loc href="#representation">representation</loc>,
          while the LSID could refer to
          anything at all if getData yields empty content.  
          In the latter case the information yielded by the
          getMetadata method generally constitutes, or at least
          contains, documentation for the LSID.
        </p>
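        <p>
          The decision rule just described might be modeled as in the
          following sketch.  It does not speak the SOAP-based LSID
          protocol; the function name, its parameters, and the result
          labels are hypothetical, and the two arguments merely stand
          in for whatever the getData and getMetadata methods
          returned.
        </p>
	<eg>
```python
def interpret_lsid(data, metadata):
    """Classify an LSID from the results of its two protocol methods.

    data     -- bytes obtained from getData (empty if there is no content)
    metadata -- text obtained from getMetadata (empty if there is none)

    The labels returned are illustrative only.
    """
    if data:
        # Nonempty getData content: the LSID refers to a representation.
        return "representation"
    if metadata:
        # Empty getData content: the LSID could refer to anything, and
        # the metadata generally serves as its documentation.
        return "documented by metadata"
    # Neither method yielded anything to go on.
    return "undocumented"


print(interpret_lsid(b"", "an earthquake observed in 1989"))
```
</eg>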

        <p>
	  For clients lacking an LSID protocol implementation,
	  HTTP/LSID gateways are available, suggesting the possible
	  applicability of the <specref ref="global-rule"/>
	  discovery method as an alternative to the LSID protocol.
        </p>

        <p>
	  <emph>Criticism:</emph>
	  The LSID protocol itself is not widely deployed,
	  and LSIDs are not currently processed in any useful way by
	  most Web clients.
	  (Desiderata: <specref ref="d.retrieval"/>, 
	    		 <specref ref="d.easy"/>,
	  		 <specref ref="d.hosting"/>.)
	</p>

      </div2>

      <div2 id="hash">
        <head>'Hash URI'</head> 
        <p>
          With this method, the probe URI must be a 'hash URI', i.e. must
	  contain a hash character '#'.
	  The URI documentation
          is placed in the document on the Web at the stem (where
	  stem URI = the
	  pre-hash prefix of the URI).
        </p>
        <p>
	  For historical reasons the part of the URI following '#' is
	  called the 
	  'fragment identifier', even when it is null.
	  We will call these 'local identifiers' in
	  recognition of their uses beyond just references to document
	  fragments.
        </p>
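        <p>
          A minimal sketch of how a client might split a probe URI
          into its stem URI and local identifier follows.  The
          function name is an assumption made for illustration; the
          split is done on the first '#' only, with no further
          normalization of the URI.
        </p>
	<eg>
```python
def split_hash_uri(probe_uri):
    """Split a hash URI into (stem URI, local identifier).

    The stem is everything before the first '#'. The local identifier is
    everything after it and may be the empty string.
    """
    stem, sep, local_id = probe_uri.partition("#")
    if not sep:
        raise ValueError("not a hash URI: " + probe_uri)
    return stem, local_id


# The stem is where URI documentation for the probe URI is to be sought.
stem, local_id = split_hash_uri("http://example/eq018#_")
print(stem, local_id)
```
</eg>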
	<example>
	  <head>'Hash URI'</head>
	  <graphic source="hash.png"
		   alt="Hash URI documentation discovery"/>
	</example>
        <p>
          The interpretation of a 'hash URI', say 'http://example/eq018#_',
          depends (according to <bibref ref="rfc3986"/>) on  
	  the media types of 
	  <loc href="#representation">representations</loc>
	  of the resource on the Web at its stem URI
          'http://example/eq018'.
          For media type application/rdf+xml, the media type registration
 	  defers to the content of the
	  <loc href="#representation">representation</loc>
	  &mdash; that is, the 
	  <loc href="#representation">representation</loc>
          itself of the stem URI gets to document what the probe URI 
	  means.<footnote>
          If
          OW@('http://example/eq018')
          (the resource at URI 'http://example/eq018')
	  has multiple
	  <!-- associated -->
	  <loc href="#representation">representations</loc>,
	  it is important that all
	  <loc href="#representation">representations</loc>
          provide documentation for every URI that needs it, and that
          corresponding URI documentation in different
	  <loc href="#representation">representations</loc>
	  be compatible with one another.
          (See <bibref ref="webarch"/> section 3.2.)</footnote>
        </p>

	<p>
	    Because of the dependence on media type, care must be
	    taken to ensure that content negotiation does not muddy the
	    meaning of the probe URI.  Fortunately any of three
	    approaches may be used: (1) avoid content negotiation, (2) 
	    make sure that all representations provide the same
	    documentation (following section 3.2.2 of
	    <bibref ref="webarch"/>), or (3) institute, as a new
	    consensus practice, a priority ordering on media types,
	    so that, say, media type application/rdf+xml
	    deterministically takes 
	    priority over text/html (or vice versa).  (The last of these in turn
	    requires modifications to discovery clients, so this would
	    be in effect a new discovery method.)
	  </p>
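	<p>
	    Approach (3) could look roughly like the following sketch
	    on the client side.  The particular ordering shown is
	    hypothetical; no such consensus ordering exists, and the
	    opposite ordering would serve equally well as an
	    illustration.
	</p>
	<eg>
```python
# Hypothetical priority ordering, highest priority first. Nothing in
# current practice fixes this order; it is assumed for illustration.
HYPOTHETICAL_PRIORITY = ["application/rdf+xml", "text/html"]


def choose_documenting_type(available, priority=HYPOTHETICAL_PRIORITY):
    """Pick the media type whose representation documents the probe URI.

    available is the collection of media types in which the stem URI's
    resource can be retrieved. Returns None if none of them is ranked.
    """
    for media_type in priority:
        if media_type in available:
            return media_type
    return None


print(choose_documenting_type({"text/html", "application/rdf+xml"}))
```
</eg>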

	<p>
	    Similar considerations apply for competing use of local
	    identifiers as script-defined or as document fragment
	    identifiers: any potential conflicts must be either
	    avoided or resolved.
	</p>

        <p>
	  A second caveat around hash URIs is that
	  when a number of hash URIs are formed by combining a
	  fixed namespace prefix (stem) with many different suffixes
	  using hash as a
	  connector, there must be a single underlying document 
	    at the stem URI that
	    provides URI documentation for all of the URIs.
	    This leads to a number of annoyances, including inefficiency
	    (repeated retrieval of a large document is an unacceptable
	    performance hit for the server, the
	    network, and the client), analytics imprecision, and
	    unavailability of HTTP methods such as DELETE specific to
	    the particular URI.
	</p>

	  <p>
	    The answer to this difficulty has been reported a number of times
	    (e.g. <bibref ref="degraauw"/>) and might be called the
	    "single-hash-URI-per-stem-URI pattern" of use of hash URIs.
	    For a set of namespace members a, b, c, ...
	    instead of using URIs
	  </p>
	  <eg>
  http://example/ns#a  http://example/ns#b  http://example/ns#c ...</eg>
          <p>
	    use URIs that look like
	  </p>
	  <eg>
  http://example/ns/a#_  http://example/ns/b#_  http://example/ns/c#_ ...</eg>
	  <p>
	    where _ is a common suffix of your choice.
	  </p>
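	  <p>
	    The rewriting involved might be sketched as below.  The
	    function name <code>to_single_hash_uri</code> is hypothetical,
	    and the sketch assumes that the local identifier can be used
	    unchanged as a path segment.
	  </p>
	  <eg><![CDATA[
```python
from urllib.parse import urldefrag


def to_single_hash_uri(hash_uri, suffix="_"):
    """Rewrite stem#local as stem/local#suffix.

    Assumes the local identifier is usable as a path segment as-is
    (a simplification; no escaping is attempted).
    """
    stem, local_id = urldefrag(hash_uri)
    if not local_id:
        raise ValueError("expected a hash URI with a non-empty local identifier")
    return stem.rstrip("/") + "/" + local_id + "#" + suffix
```
]]></eg>
	  <p>
	    Each resulting URI has its own stem, so documentation for each
	    namespace member can be retrieved separately.
	  </p>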


	<div3 id="misspellings">
	  <head>Local identifier misspellings go undetected</head> 
	  <p>
	    <emph>Criticism:</emph>
	    A hashless URI that is misspelled, when submitted to an
	    HTTP server, would normally evoke a 404 Not Found
	    response, alerting a user quickly to a misspelling.  A
	    hash URI, on the other hand, isn't sent to an HTTP server.
	    Any misspelling in the local identifier may go
	    undetected for a long time, since
	    it would only be detected as a failure to recover expected
	    information from the 
	    content that was supposed to document it.
 	    (Desideratum: <specref ref="d.resistant"/>.)
	  </p>

	  <p>
	    <emph>Response:</emph>
	    This is hard to argue with.  Mitigations such as use of
	    JavaScript for error checking might be possible.
	  </p>
	</div3>

	<div3>
	  <head>Local identifiers are easily lost</head> 
	  <p>
	  <emph>Criticism:</emph>
	    Harry Halpin <bibref ref="halpin"/> reports that local
	    identifiers are often lost during document preparation
	    and cut/paste operations.  
	  </p>
	  <p>
	    Rumor has it that some MVC-based web frameworks (Django?,
	    Sinatra?) are not
	    good about preserving local identifiers.  This
	    needs to be verified.
 	    (Desideratum: <specref ref="d.resistant"/>.)
	  </p>
	  <p>
	    <emph>Response:</emph>
	    More information needed; it's not obvious [to the editor]
	    that this should be the case.
	    Concrete scenarios would help.
	  </p>
	</div3>

      </div2>


      <div2 id="seeother">
        <head>Hashless URI with HTTP 303 'See Other' redirect</head> 
        <p>
	  Initially (around 2000) 'hash URIs' were advanced
	  as the recommended method for URI documentation
 	  provision and discovery.
	  In the 2002-2005 time period
	  demand arose for a discovery method applicable to hashless
	  URIs.  This led 
	  to the invention of a new protocol for use in
	  situations where
	  'hash URIs' are considered unacceptable.
	</p>
        <p>
          In this approach, one mints an absolute hashless http: URI,
	  puts documentation for it on the Web at a second URI,
	  and then arranges for a GET request of the first (probe) URI to
	  redirect, using a 303 'See Other' status code, to the second
	  URI.  The probe URI is not
	  retrieval-enabled, and therefore does not name the
	  resource at that URI according to Convention 1 (since there
	  is none).  The probe
	  URI then gets its meaning 
	  by interpreting the document on the Web at the second URI,
	  which presumably contains documentation for the first URI.
	  The document may carry documentation for other URIs as well,
	  so the referent of the URI is not necessarily the document's
	  primary topic - it may be only one of many things "described
	  by" the document.
          [Draft note: TBD: cite HTTPbis]
	</p>

	<!-- redundant
        <p>
	  Similar to this is the practice of a 303 redirect to a
	  document, where the URI is taken to refer to the document's
	  primary topic.  Using this rule could give a lead
	  to a different meaning for the URI compared to what the
	  previous rule would give, so some tie-breaker is
	  needed in practice.
	</p> -->

	<example>
	  <head>303 redirect</head>
	  <graphic source="303.png"
		   alt="303 URI documentation discovery"/>
	  <p>
	    Alice chooses 'http://example/eq018' as the way she will refer
	    to a particular earthquake.
	    At 'http://example/about-eq018' she publishes text and/or RDF
	    that carries URI documentation for 'http://example/eq018', 
	    explaining the URI's meaning by
	    providing details about the
	    earthquake (date, location).
	    For the URI 'http://example/eq018', which will not be
	    retrieval-enabled (since otherwise, it would, by
	    Convention 1, refer to the
	    resource on the Web at that URI <bibref ref="generic"/>, 
	    not the earthquake),
	    she arranges that a GET request yields a 303 redirect with
	    a Location: header specifying 'http://example/about-eq018' as the
	    redirect target.
	  </p>
	  <p>
	    Those encountering 'http://example/eq018' will attempt 
	    a retrieval, but
	    this will fail, with a 303 redirect delivered instead.  
	    The 303 redirect indicates that
	    the document at 'http://example/about-eq018'
	    provides documentation of the URI 'http://example/eq018'.
	  </p>
	</example>
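	<p>
	  From the client's side, the exchange in this example might be
	  sketched as follows.  The names
	  <code>discover_documentation_uri</code>, <code>fetch</code>, and
	  <code>alice_server</code> are hypothetical; <code>fetch</code>
	  stands in for an HTTP GET that does not follow redirects, and
	  header-name matching is simplified to an exact match.
	</p>
	<eg><![CDATA[
```python
from urllib.parse import urljoin


def discover_documentation_uri(probe_uri, fetch):
    """Return the URI of the document documenting probe_uri, or None.

    `fetch` is a hypothetical callable taking a URI and returning
    (status_code, headers) for a GET that does not follow redirects.
    """
    status, headers = fetch(probe_uri)
    if status == 303 and "Location" in headers:
        # Location may be relative, so resolve it against the probe URI.
        return urljoin(probe_uri, headers["Location"])
    return None


def alice_server(uri):
    """Stand-in for Alice's server configuration in the example."""
    if uri == "http://example/eq018":
        return 303, {"Location": "http://example/about-eq018"}
    if uri == "http://example/about-eq018":
        return 200, {"Content-Type": "text/html"}
    return 404, {}
```
]]></eg>
	<p>
	  A second GET, this time on the returned URI, is still needed to
	  obtain the documentation itself.
	</p>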

        <p>
          Another pattern is to use a 303 redirect to a document whose
          primary topic is the intended referent, similar to the
          Chicago use case (<specref ref="uc.chicago"/>).  This
          could, in theory, lead to 
          ambiguities, as the 
	  entity to which the URI refers in the document may not be
          the document's primary topic.

	  <!-- 
          [Draft note: This is the second use case.
		       Is anyone, in practice, deploying 303 redirects to a
          "primary topic" page not mentioning the URI to be 
          documented, rather than to a document that explicitly mentions
          the URI?  YES - Hugh Glaser.]
	   -->
        </p>

        <p>
	  Again, a number of objections to this approach have been raised:
	</p>

	<div3>
	  <head>303 is difficult, sometimes impossible, to deploy</head> 
	  <p>
	  <emph>Criticism:</emph>
	    Deploying a 303 redirect requires giving the correct
	    directive to a web server, for example adding 
	    a Redirect line to .htaccess in Apache HTTPD.  Unfortunately
	    many hosting solutions do not allow this, putting this
	    manner of publishing URI documentation off limits to many who
	    would otherwise like to use it.
	    (Desideratum: <specref ref="d.hosting"/>.)
	  </p>

	  <p>
 	  <emph>Response:</emph>
	    Web publishers whose ISP does not permit them to set up a
	    303 redirect, or for whom the overhead such as expertise
	    acquisition is
	    prohibitive in some other way, could choose to use a service
	    that provides 303 redirects to a location of their choosing.
	    One such service is purl.org, operated by OCLC, which
	    permits anyone to set up a 303 or other redirect from the purl.org domain.
	    The URI to be documented would have to have the form
	    http://purl.org/..., while the URI for the document carrying
	    the URI documentation could be anything at all.
	  </p>
	  <p>
	    Unfortunately,
	    use of a redirect service makes one dependent on two
	    service providers instead of 
	    one, making one's URI documentation more vulnerable than if only
	    one provider were involved.
	  </p>
	</div3>

	<div3>
	  <head>303 leads to too many round trips</head> 
	  <p>
	  <emph>Criticism:</emph>
	    To get URI documentation for N URIs by redirecting through
	    303 responses,
	    you need to do 2N HTTP requests (in the absence of cache hits).
	    This is a frustrating and apparently gratuitous performance
	    hit for those interested in publishing and accessing
	    large numbers of URI documentation-carrying documents.
 	    (Desideratum: <specref ref="d.efficient"/>.)
	  </p>
	  <p>
 	    <emph>Response:</emph>
	    See <specref ref="global-rule"/>.
	  </p>
	</div3>

	<div3>
	  <head>303 responses aren't cached</head> 
	  <p>
	  <emph>Criticism:</emph>
	    RFC 2616 <bibref ref="rfc2616"/> says that 303 responses
	    shouldn't be cached. 
	    Some caching software obeys this directive, with
	    negative consequences for the performance of GET/303 exchanges.
 	    (Desideratum: <specref ref="d.efficient"/>.)
	  </p>
	  <p>
 	    <emph>Response:</emph>
	    This problem was recognized quite early on as a mistake in
	    RFC 2616 <bibref ref="rfc2616"/>,
	    and an erratum was circulated. This is one of many changes
	    made in HTTPbis, which is being developed by the IETF
	    HTTP working group and should be published some time
	    soon.  Any software that fails to cache 303 responses
	    when allowed to by HTTPbis needs to be fixed.
	  </p>
	</div3>

	<div3>
	  <head>303 makes the URI difficult to bookmark</head> 

	  <p>
	  <emph>Criticism:</emph>
	    "The user enters one URI into their browser and ends up at
	    a different one, causing confusion when they want to reuse
	    the URI ... Often they use the document URI by
	    mistake."
	    <bibref ref="davis"/>
	  </p>

	  <p>
	    "Redirection has in fact very confusing side effects; as
	    we expect the 
	    semantic web to work seamlessly with the web, it is very odd that a
	    semantic web uri cannot be copy pasted to a browser without seeing it
	    change to something that is not the same as before."  
	    <bibref ref="tumarello"/>

 	    (Desideratum: <specref ref="d.resistant"/>.)
	  </p>

	  <p>
 	    <emph>Response:</emph>
	    The location bar issue is discussed
	    <a href="http://www.w3.org/QA/2010/04/why_does_the_address_bar_show.html"
	     >here</a>. [TBD: citation]
	    The content from the redirect target does
	    not originate from the referent of the original URI, so
	    an interface that suggests otherwise is guilty of misattribution.
	    The best answer to this is that an additional user
	    interface element should be added to browsers that
	    provides access to the original URI.  Accomplishing this
	    would be a challenge.
	  </p>

	</div3>

	<!-- 
	<div3>
	  <head>This use of 303 has no consensus specification</head> 
	  <p>
	    HH: "The hash 303 redirect method in common use has
	    not received adequate 
	    review such as W3C recommendation track; in fact it is not
	    really documented at all in any adequate form."
	    <bibref ref="halpin"/>
	  </p>
	  <p>
 	    <emph>Response:</emph>
	    The IETF HTTP working group has taken on this issue.
	    <a href="http://tools.ietf.org/html/draft-ietf-httpbis-p2-semantics-14#section-8.3.4"
	     >HTTPbis's new text for GET/303</a>
 	    specifies the pattern, which
	    is now in common use in RDF deployment.  There is no issue
	    of incompatibility with prior usage because the current HTTP 
	    specification <bibref ref="rfc2616"/>
	    only defines what 303
	    means in conjunction with POST and says nothing about what
	    it means with GET.
	  </p>
	</div3>
 	-->

      </div2>


      <div2 id="convention1">
        <head>Retrieval as equivalent to instance relationship</head> 
	<p>
	  A widely observed convention relating retrieval to
	  meaning is the following:
	</p>
	<slist>
	  <rfc2119>Convention 1:</rfc2119> A
	  <loc href="#retrieval-enabled">retrieval-enabled</loc>
	  hashless URI
	  refers to the resource 
	  <loc href="#on-web-at">on the Web at</loc>
	  that URI 
	  (see <bibref ref="generic"/>),
	  independent of anything that the retrieval results (representations)
	  say about what the URI means.
	</slist>
	<p>
	  In effect, a response to a retrieval request is equivalent,
	  according to
	  Convention 1, to URI documentation that says that the response
	  is an instance of the thing named by the URI.  This in turn
	  implies (as explained in <bibref ref="generic"/>) that the
	  response (or rather its representation 
	  payload) may resemble that thing in properties such
	  as title, author, subject, creation date, and so on.
	  The URI is then useful as the subject of a statement of
	  metadata, which is understood as applying to the instance.
	  (See use case <specref ref="uc.ir-ref"/>.)
	</p>

	<p>
	  <emph>Criticism:</emph>
	  From the fact that a response is an instance of the URI's referent,
	  you learn that the referent is of a kind 
	  that might have properties characteristic of a retrieval
	  response.  But 
	  it's not obvious what in particular the response tells you about the
	  referent of the URI, since most relevant properties,
	  such as length, creation date, or even author, can vary among
	  responses.  According to the theory in <bibref ref="generic"/>,
	  a property will hold if it holds of <emph>all</emph> potential
	  responses, but whether it does would have to be
	  either conjectured or learned through other channels.
	</p>

      </div2>


    </div1>


    <div1>
      <head>Don't do it: Potential workarounds</head>

      <p>
        If issues around 'hash URIs' and 303 redirects
	render them unacceptable, it is worth considering alternatives.
	In this section we reconsider ways in which URI documentation
	discovery can be bypassed altogether.  In the following section
	potential new discovery methods are considered.
      </p>

      <div2 id="ddi">
        <head>Use something other than a URI</head> 

        <ednote>
	  <date>2011-04-14</date>
          <edtext>This section derives from 
            <loc href="http://www.w3.org/2001/tag/2011/02/metadata-arch.html#slide9"
            >JAR's TAG F2F presentation slides</loc>.  The purpose of
            talking about this idea would be
            mainly to remind people that the problem is one of notational
            engineering, not philosophy.  I have been asked to remove this
            section.
	  </edtext>
        </ednote>

        <p>
	  URIs are just one kind of term that might be used to
          refer to something.  If defining a URI is too difficult or
          costly, then perhaps one might do without.
          In RDF serializations such as Turtle, 
          for example, we have blank node notation:
        </p>
        <eg>
  [ foaf:isPrimaryTopicOf &lt;http://example/about-chicago&gt; ] </eg>
        <p>
          Here we have managed to refer to Chicago without defining a
          new URI; we have simply referred indirectly using a URI that 
          refers to the resource on the Web at that URI
 	  according to a generic method
          (see <bibref ref="generic"/>).
        </p>

	<!-- 
      </div2>
      <div2 id="sugar">
        <head>Syntactic sugar</head> 
 	-->

        <p>
          A concise alternative would be syntactic sugar:
        </p>
        <eg>
  *&lt;http://example/about-chicago&gt; </eg>
        <p>
	  which might be supported in a hypothetical new RDF serialization
          as a shorthand for the previous example.
          (The asterisk is meant to be suggestive of indirection in the
          C programming language.)
        </p>

        <p>
	  <emph>Criticism:</emph>
	  These are good as far as they go, but they do not meet the
	  demand for documented URIs.  In particular, it is possible but
	  <a href="http://www.w3.org/wiki/RdfSmushing"
	   >difficult</a>
	  to detect that blank nodes in separate graphs are meant to
	  refer to the same thing.  Data integration is easier when
	  shared URIs are used.
        </p>

	<!-- 
	<p>
	  Each thing to be referenced has to have a dedicated page; pages
	  cannot be shared among multiple things.
	</p>
	 -->

        <p>
	  In the case of syntactic sugar, there would be
	  adoption overhead in publishing new 
	  RDF serialization specifications and getting them implemented.
        </p>

      </div2>

      <div2 id="parallel">
        <head>Express data in terms of named documents (parallel
 	  properties)</head> 
	<p>
	  The idea here is that you don't need to document a URI
	  if you are willing to use properties that 
	  are defined or understood as indirecting
	  through documents.  Instead, just use a URI that
	  refers to the document on the Web at that URI, and use
	  it as the subject of such properties.
	</p>
	<p>
	  Assume that each named document (i.e. document+name pair)
	  can have an associated entity, which we'll call its
	  "designated subject".<footnote>Why
	    'designated subject' instead of 'primary topic'?
	    Because they might be different things.  Consider
	    identical content,
	    served from two URIs u1 and u2,
	    containing information about subjects associated with the
	    two URIs.  Even though the content is identical, the URIs
	    would have to refer to distinct
	    entities with different designated subjects
	    (the ones associated with their respective names).
	    But the content can have only one primary topic.
	  </footnote>
	  Information about the designated subject is expressed using
	  properties whose subject is the document.
	</p>


	<example>
	  <head>Combining metadata and data using the same URI</head>
        <p>
	  Suppose that Alice wants to record some information about an
 	  earthquake.  She publishes URI documentation containing the
	  following so that it's on the Web at the URI
 	  'http://example/eq018':</p>
	  <eg>
  &lt;http://example/eq018> eq:magnitude 6.9.
  &lt;http://example/eq018> eq:epicenter &lt;geo:37.040,-121.877>. </eg>
	<p>
	  Bob then comes along and writes the
	  following metadata about OW@('http://example/eq018') in the
	  usual way, i.e. using the URI to refer to that
	  resource, based on what information is accessed via that
	  URI:
        </p>
        <eg>
  &lt;http://example/eq018> dc:creator "Alice".
  &lt;http://example/eq018> dc:title 
    "Documentation for Loma Prieta earthquake URI".</eg>

        <p>
	  Suppose that
	  Carol encounters both bits of RDF (or either) and needs to
	  make sense of 
	  them.  She is aware that 'http://example/eq018' might be
	  used in both kinds of statement - in metadata, with the
	  intent that the 
	  metadata is about OW@('http://example/eq018'); and also 
	  in statements that relate to an earthquake.
	  <!-- 
	  as described 
	  in OW@('http://example/eq018').  For each use of
	  'http://example/eq018' she (or her software) needs to
	  determine which sense is supposed to apply. -->
	</p>
	</example>

	<p>
	  Instead of defining eq:epicenter to be a property 
	  relating an earthquake to its
	  epicenter, one documents eq:epicenter to be a property
	  that relates a document to the
	  epicenter of its designated subject.  
	  Then, as long as you
	  have a URI for the 
	  IR, you don't need a URI for the earthquake.
	  If property eq:epicenter has domain eq:Earthquake,
	  then the members of eq:Earthquake are IRs
	  whose designated subjects are earthquakes.
	</p>

	<p>
	  The nature of the designated subject is inferred from
	  information found in the IR.  For example, if the IR says
	  that its eq:epicenter is E, then you can infer that the
	  designated subject has epicenter E.
	</p>

	<graphic source="proxy.png"
		 alt="Document as proxy for thing URI documentation discovery"/>

	<p>
	  The overall effect when reading the RDF is that the
	  documents, being ubiquitous, seem to disappear,
	  and one focuses naturally on information about their
	  designated subjects without being aware of the indirection.
	</p>

        <p>
	  All considerations that apply to the subject of a property
	  also apply to the object, making the situation more complex in
	  ways that we won't work out in detail here.
        </p>

	<p>
	  [via TimBL?]
	  This pattern has some degree of uptake.  Using the 
	  <a href="http://ogp.me/"
	   >open graph protocol</a>
	  on Facebook, you can get a page about a movie. 
	  The RDF references &lt;&gt;, which is of class Movie.
	  (&lt;&gt; is equivalent to a reference via the base URI,
	  the one from which the page was retrieved, and therefore
	  refers to a document.)
	  The members of class Movie are documents whose
	  designated subjects are movies.
	  [is <a href="http://lists.w3.org/Archives/Public/public-lod/2010Nov/0559.html"
	  >this message</a> on topic?]
	</p>

	<!-- 
	<p>
	  This is an old idea, going back to the 
	  <a href="http://www.w3.org/History/1989/proposal.html"
	   >original description of the Web</a>.
	</p> -->

	<p>
	  <emph>Criticism:</emph>
	  If a property that refers directly to movies also needs to be used,
 	  then two properties have to be defined (with distinct URIs), one
	  relating to the movie and one relating to the Movie.  This
	  results in clerical overhead and potential user confusion.
	</p>

      </div2>

    </div1>


    <div1>
      <head>Some potential new discovery methods</head>

      <p>
        All rules presented in this section assume that the probe URI
	is a hashless http: URI.
      </p>

          <p>
	    For compatibility with clients that are not aware of
	    new method(s) for hashless URIs, a complete discovery solution 
	    should grandfather discovery methods that are currently widely known,
	    such as 303 redirects.  A current method
	    should be deployed when possible, redundantly.  Lacking this, a 404
	    should be returned, and if the content of the 404 response
	    can be controlled it should provide suitable
	    information such as a link to the URI documentation.
	    Agents would be faced with the 
	    problem of which method to attempt first, since if
	    the new method doesn't yield URI documentation, a
	    retrieval using the probe URI might have to be attempted
	    (in hope of either success or a 303 See Other), 
	    resulting in one or two extra retrieval
	    requests.  It is the editor's belief that
	    this problem is not insurmountable, but the details would
	    have to be worked out.
          </p>

      <div2 id="global-rule">
          <head>Global rule yielding documentation URI</head> 
          <p>
	    The network round-trip (303 redirect) used to map the
	    probe URI to the URI of the document
	    that carries its URI documentation can be avoided if we
	    know a general rule 
	    that maps the one kind of URI to the other, as such a rule can
	    be applied on the client without server involvement.
          </p>
          <p>
            The "well known URIs" specification <bibref ref="rfc5988"/>
	    provides a solution to this problem.
	    For any origin (the part of the URI preceding the path part)
	    we can prefix the path of URIs with a
	    fixed string, say, '/.well-known/meta', to obtain the
	    URI documentation URI.  For example, if the URI is
	    'http://example/eq018', then its URI documentation would
	    be found by retrieval using the URI
	    'http://example/.well-known/meta/eq018'.
          </p>

          <p>
	    (There is nothing special about the string 'meta'; it
	    could as easily be, say, 'about' or 'seealso'.)
          </p>
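          <p>
	    A client-side version of this rule might be sketched as
	    follows.  The prefix '/.well-known/meta' is the illustrative
	    string used above, not a registered name, and the function
	    name <code>documentation_uri</code> is hypothetical.
          </p>
          <eg><![CDATA[
```python
from urllib.parse import urlsplit, urlunsplit

# Illustrative prefix taken from the text; not a registered well-known name.
WELL_KNOWN_PREFIX = "/.well-known/meta"


def documentation_uri(probe_uri):
    """Map a hashless http: probe URI to its documentation URI.

    The mapping is computed entirely on the client, so no redirect
    round trip is needed to learn where the documentation lives.
    """
    if "#" in probe_uri:
        raise ValueError("expected a hashless URI")
    parts = urlsplit(probe_uri)
    if parts.scheme != "http":
        raise ValueError("expected an http: URI")
    new_path = WELL_KNOWN_PREFIX + parts.path
    return urlunsplit((parts.scheme, parts.netloc, new_path, parts.query, ""))
```
]]></eg>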

          <p>
	    <emph>Criticism:</emph>
	    Web publishers without the ability to control
	    retrieval results for the
	    /.well-known/meta/... URIs would not be able to take advantage of
	    this method.
	    (Desideratum: <specref ref="d.hosting"/>.)
          </p>

	  <p>
	    <emph>Criticism:</emph>
	    Jeni Tennison says: "the disadvantage is that you lose the
	    distinction between status codes for the thing [described]
	    and the
	    document [instantiated]".  [But the editor doesn't
	    understand this.  Any
	    information that would have been conveyed by the status
	    code from a GET on the probe URI, could be conveyed in
	    the document retrieved by URI documentation discovery?]
          </p>
      </div2>
      <div2 id="hostrule">
          <head>Site-specific rule yielding documentation URI</head> 
          <p>
	    Considering the transformation rule idea of the previous section,
	    it is probably too much to hope that a single rule could work
	    uniformly for all hosts whose documentation
	    might be sought,
	    but each individual host may have a rule that applies to
	    URIs at that host.
          </p>
          <p>
            To support site-specific rules,
	    a file containing such rules can be provided <bibref ref="rfc5988"/>
	    using a well-known path, say
	    '/.well-known/documentation-rule', e.g.
	    'http://example/.well-known/documentation-rule'.
	    To obtain documentation for 'http://example/eq018', first
	    retrieve (and cache) the
	    documentation-rule document for its host.
            Then if the rule says to map 'http://example/{path}' to, say,
            'http://example/{path}.about', 
            documentation for 'http://example/eq018' can be sought by
	    a retrieval request using 'http://example/eq018.about'.
          </p>
          <p>
            When the mapping rule is cached, the number
            of round trips is one instead of two.
          </p>
          <p>
            Although it would not be difficult to specify a new
	    .well-known path and syntax for the documentation-rule document,
	    it might be possible to use the link-template feature of
	    the <a href="http://tools.ietf.org/html/draft-hammer-hostmeta-13"
		 >host-meta file</a>.  There are pros and cons for
	    each approach.
          </p>

          <p>
	    This approach is essentially the same as the ARK design,
	    [TBD: reference https://wiki.ucop.edu/display/Curation/ARK
	    or something better]
	    which uses as its global URI transformation appending a
	    '?' to the URI.  The main differences are that the ARK
	    rule only works when the path begins 'ark:', and that the risk of
	    'squatting' on part of a domain owner's URI space (not all
	    '?'-ended URIs are for URI documentation discovery) is
	    somewhat higher than in the case of /.well-known/meta/,
	    which would be sanctioned by <bibref ref="rfc5988"/>.
          </p>

	  <p>
	    <emph>Criticism:</emph>
	    Web publishers without the ability to control
	    retrieval results for
	    /.well-known/documentation-rule would not be able to
	    take advantage of
	    this method.
	    (Desideratum: <specref ref="d.hosting"/>.)
          </p>

	  <p>
	    <emph>Criticism:</emph>
	    Jeni Tennison says: "in some cases the mapping from thing URI to
	    document URI can be complex or change over time in ways that
	    make it hard to use a documentation rule file; in
	    legislation.gov.uk for example, we return a 303 redirection
	    from a legislation item to <emph>either</emph> an as-enacted 
	    version
	    <emph>or</emph> the most recently revised version, depending 
	    on what is
	    available for that particular item of legislation (which
	    changes as new revised versions are added). It would be
	    quite hard to create a documentation-rule file in those
	    circumstances (we would have to solve it by having a simple
	    mapping with some URIs 307 redirecting to others)."
	  </p>
      </div2>

      <div2 id="linkheader">
        <head>HTTP response header that links to documentation</head> 
          <p>
	    The Link: HTTP header <bibref ref="rfc5988"/>
	    is useful for indicating a metadata source for an
	    information resource (see POWDER spec [citation needed]).  
	    (Although well documented by its normative specifications,
	    this method is not listed in this document under "methods in
	    current use" because the editor is not aware of any deployment.)
	    The URI needn't be retrieval-enabled,
	    as Link: could be used in any non-success response
	    for directing a client to documentation for the URI.
          </p>
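          <p>
	    A client might extract the documentation link from a Link:
	    header value roughly as sketched below.  The relation type
	    'describedby' and the function name
	    <code>find_documentation_link</code> are assumptions made for
	    illustration, and the parsing is deliberately simplified: it
	    does not handle commas or semicolons inside URIs or quoted
	    strings.
          </p>
          <eg><![CDATA[
```python
def find_documentation_link(link_header, rel="describedby"):
    """Return the target URI of the first link with the given relation.

    Simplified parsing: link values are split on commas and parameters
    on semicolons, which is not a complete Link header parser.
    """
    for link_value in link_header.split(","):
        target, _, params = link_value.strip().partition(";")
        target = target.strip()
        if not (target.startswith("<") and target.endswith(">")):
            continue
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.strip().lower() != "rel":
                continue
            # A rel value may list several space-separated relation types.
            relations = value.strip().strip('"').lower().split()
            if rel in relations:
                return target[1:-1]
    return None
```
]]></eg>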

          <p>
	    <emph>Criticism:</emph>
  	    The advantage of Link: over a 303 redirect 
	    in the non-retrieval-enabled case
	    is unclear, since
	    a second network round trip would be required either way.
 	    (Desideratum: <specref ref="d.efficient"/>.)
          </p>
      </div2>

      <div2 id="mget">
          <head>New HTTP request method eliciting documentation</head> 
          <p> 
	    To reduce the number of round trips relative to the 303 
	    redirect, we might have 
            HTTP requests that are somehow understood as signalling a
	    request for URI
	    documentation, as opposed to retrieval of an instance of
	    a resource on the Web at the URI, with the documentation coming
	    back in the HTTP response.  Such a request
	    could be distinguished from a retrieval or other request
	    by its method, headers, and/or content.
          </p>
          <p> 
            The URIQA specification <bibref ref="uriqa"/> defines MGET, 
            a new HTTP request method.
            An MGET request on a URI yields a response containing 
	    information about the referent of the URI.
	    If the URI is retrieval-enabled, then (by Convention 1) the URI
	    refers to the resource on the Web at that URI,
	    so the MGET result is metadata for that
	    resource.
	    Otherwise, the MGET result might be documentation for the
            URI.  In that case a GET request should
            yield either a 303 See Other 
	    linking to the same URI documentation
	    <!-- -carrying document -->
 	    obtained by MGET, or
	    maybe a 405 Method Not Allowed
            response.
          </p>
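          <p>
            The server-side behavior just described might be sketched as
            follows for the non-retrieval-enabled case.  This is not the
            URIQA specification's own design in any detail: the tables,
            the function name <code>handle</code>, and the placeholder
            document content are all hypothetical, and MGET on a
            retrieval-enabled URI (which would return metadata) is not
            modeled.
          </p>
          <eg><![CDATA[
```python
# Hypothetical tables: probe URI -> documentation URI, and document content.
DOCUMENTATION = {"http://example/eq018": "http://example/about-eq018"}
DOCUMENTS = {"http://example/about-eq018": "documentation for http://example/eq018"}


def handle(method, uri):
    """Return (status, headers, body) for a request on uri."""
    if uri in DOCUMENTATION:
        doc_uri = DOCUMENTATION[uri]
        if method == "MGET":
            # Documentation comes back directly: one round trip.
            return 200, {}, DOCUMENTS[doc_uri]
        if method == "GET":
            # Existing clients are redirected to the same documentation.
            return 303, {"Location": doc_uri}, ""
    if method == "GET" and uri in DOCUMENTS:
        return 200, {}, DOCUMENTS[uri]
    return 404, {}, ""
```
]]></eg>
          <p>
            An MGET-aware client obtains the documentation in one
            request, while a client limited to GET needs two.
          </p>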

          <p> 
	    <emph>Criticism:</emph>
	    Not possible to deploy on many hosting services.
	    (Desideratum: <specref ref="d.hosting"/>.)
          </p> 
      </div2>

      <div2 id="status-code">
        <head>New HTTP response status code</head>

          <p> 
	    In response to GET of a URI,
            a server might provide documentation for the URI directly 
	    in a non-200
            response, as opposed to indirectly via a 303 redirect.  
	    (The URI documentation can't go in a successful GET response
 	    since that would mean that the URI
            refers to the resource on the Web at the URI.)
            Possibilities for HTTP response status codes that might
            signal this situation: 
            203 Non-Authoritative Information; a new 2xx status
            (maybe 209); a new 3xx status (maybe 309);
            or a variety of 4xx codes.
            Placing the URI documentation in the content of a redirect response
            (status codes 301,
            302, 303, and 307) is unsatisfactory as the
            content would not be displayed in a Web browser; the same
            situation might apply to any 3xx or 4xx response, making a
            2xx status code the most attractive.
          </p>

          <p>
	  <emph>Criticism:</emph>
	    Probably impossible for many hosting services.
  	    Not clear whether
            proxies, caches, and Web clients do
            something reasonable with the proposed status code.
	    (Desiderata: <specref ref="d.retrieval"/>, 
	    <specref ref="d.hosting"/>.)
          </p>
      </div2>
    </div1>

    <div1>
      <head> Discovery methods where some retrieval responses carry URI
             documentation</head>

          <p> 
	    A range of discovery method designs involve having clients
	    interpret parts 
	    of retrieval (HTTP GET/200 or equivalent) responses, or
	    entire responses, as URI documentation.  Depending on
	    design details, any particular response might be treated
	    as carrying URI documentation (or expected to do so),
	    treated as an instance (per Convention 1), both (instance
	    with embedded metadata), or neither.
	  </p>

          <p> 
	    The following illustration diagrams the case where all
	    retrieval responses are treated as carrying URI documentation,
	    i.e. every response is an instance of something that
	    carries URI documentation and that is different from what
	    the URI refers to.
	  </p>

	<graphic source="change.png"
		 alt='"Take at face value" discovery'/>

          <p> 
	    These designs have in common that at most a single HTTP
	    round trip is required, when discovery uses the HTTP protocol.
	  </p>

          <p> 
	    After surveying the design choices that have to be made, a few
	    representative method designs are presented.  The entire space
	    of possibilities is too broad to cover here.
	  </p>

      <div2 id="in-content">
          <head> Design space overview</head>
          <p> 
	    Designs in this space differ in important ways:
	  </p>
	  <olist>
	    <item>
	      Which kinds of retrieval responses are recognized as carrying
	      applicable URI documentation?  Possible design elements
	      (not mutually exclusive):
	      <olist>
		<item> all responses (client may just fail to
	      	  understand it) </item>
		<item> only responses with
		  particular media types (see <bibref ref="davis"/>) </item>
		<item> only responses possessing some 
		  other special indicator (response header) </item>
		<item> only responses that contain URI documentation
		  for the probe URI (e.g. response header) </item>
		<item> only responses to
		  retrieval requests that explicitly request documentation
		  (e.g. using HTTP request headers, see
		  <bibref ref="thompson"/>, which has a detailed
		  analysis) </item> 
		<item> only responses
		  in which the probe URI occurs (e.g. as an RDF
		  subject), or has some other 
		  distinguishing syntactic property </item>
		<item> (n.b. "no responses" would put the design outside this
		  space; see the other discovery methods, above)
		  </item>
	      </olist>
	    </item>
	    <item>
	      How is URI documentation found inside such a response?
	      <olist>
	        <item> consider the entire response to be potential documentation
	          for the probe URI (no need to find it) </item>
	        <item> somehow select some relevant part
	      	  of it, such as statements that use the probe URI </item>
		<!-- 
	        <item> URI documentation is linked in a distinguished way
	      	  from the content (e.g. HTML &lt;link&gt; element) 
		</item>
		 -->
	      </olist>
	    </item>
	    <item>
	      Which kinds of retrieval responses are recognized as carrying
	      instances of what the URI names?
	      <olist>
		<item> all of them (any embedded URI documentation is
		  metadata) </item>
		<item> none of them </item> 
		<item> those not recognized
		  as containing URI documentation for the probe URI </item> 
		<item> those responses the truth of which implies that
		  the URI refers to 
		  a kind of thing that can have 
		  instances </item> 
		<item> those responses that are consistent with the URI
		  documentation </item> 
	      </olist>
	    </item>
	    <item>
	      When the probe URI does not refer to what's on the Web
	      at that URI (when all retrieval responses carry
	      documentation, that would be a URI documentation 
	      carrier), how does
	      one refer to what's on the Web at the URI?
	      <olist>
	        <item> this case does not arise - all URI
	               documentation carriers are also instances </item> 
	        <item> using a second URI found via some
		       method </item> 
	        <item> using blank node notation <bibref ref="generic"/></item> 
	        <item> using syntactic sugar </item> 
	        <item> using the URI, distinguishing between cases based
	               on context of use of the URI </item> 
	        <item> not specified </item> 
	      </olist>
	    </item>
          </olist>

	  <p>
	    Regarding the last question,
	    any method that conflicts with Convention 1 makes some
	    URIs unavailable for expressing what the URIs mean
	    according to Convention 1.
	    There are many applications that need a method for
	    writing a reference to the resource at an arbitrary 
	    retrieval-enabled hashless URI, including
	    those concerned with metadata (including licensing), provenance, 
	    Web site testing, validation, text processing, text annotation, and 
	    access control.  Therefore any complete discovery solution that
	    includes a discovery method that preempts Convention 1
 	    for any URI should include a way
	    to write such references.
	  </p>
	  <!-- 
	  <p>
	    This is not a fatal
 	    flaw, since it is possible to compensate by
	    providing new notational "homes"
	    for those meanings.  That is,
	    some new notational device can be specified that
 	    yields a way to refer to the URI documentation.
	    This is not a matter of semantics or philosophy; it is
	    just notational engineering.  <bibref ref="generic"/>
	    gives one way to do this using RDF blank node
 	    notation, and  suggests specifying
 	    a second URI using the Content-location: HTTP header
 	    (although perhaps Link: would work as well?).
	  </p>
	  <p>
  	    Another way one might refer to OW@(u)
	    is just to use the URI, and differentiate the two meanings
	    of the URI by context - document-preferred contexts
	    vs. non-document-preferred.
	    This approach is equivalent in practice to 
	    <specref ref="coercion"/> and bears some resemblance to
	    <specref ref="parallel"/>, but may differ in boundary
	    cases.
        </p>

      <ednote>
        <date>2012-01-29</date>
	<edtext>
	  Obviously this bears elaboration.
	</edtext>
      </ednote>

 -->
          <p> 
	    The workaround <specref ref="parallel"/> described above
	    falls in this design space, but as it can be used
	    immediately with no new consensus, it is not listed here.
	  </p>

          <p>
	    <emph>Criticism:</emph>
            Designs requiring new request or response headers fail
	    desideratum <specref ref="d.hosting"/>.
            Designs in which some responses are non-instances fail 
	    desideratum <specref ref="d.metadata"/> since metadata
            might be interpreted to be about the URI documentation.
	    Designs in which URI interpretation is context sensitive fail 
	    <specref ref="d.uniform"/>.
          </p>

      </div2>

      <div2 id="unnecessary-303">
        <head>A "take it at face value" discovery method design</head>

          <p>
	    One particular point in the design space is presented in
	    <bibref ref="davis"/>, and will be taken as
	    representative.
          </p>

          <p>
	    For discovery, do a GET requesting media type
	    application/rdf+xml.  If the result is 
	    application/rdf+xml, then assume no retrieval response is an
	    instance of the referent (?), and assume the result carries
	    URI documentation for the probe URI.  
	    To refer to the URI
	    documentation, use the URI in the Content-Location: header
	    of the response.
          </p>

          <p>
	    If there is no application/rdf+xml variant then assume the
	    URI refers to what's on the Web at the probe URI.
          </p>

          <p>
	    When an instance is sought (application/rdf+xml not
	    requested), and the result is application/rdf+xml, it is
	    not clear [to the editor] how the result
	    should be classified: as both instance and URI documentation,
	    just an instance, or just URI documentation.
          </p>

          <p>
	    <emph>Criticism:</emph>
            Designs in which some responses are non-instances fail 
	    desideratum <specref ref="d.metadata"/> since metadata
            might be interpreted to be about the URI documentation.
          </p>

          <p>
	    <emph>Criticism:</emph>
	    This design does not seem to support other
	    URI documentation formats such as RDFa or Turtle.
          </p>

      </div2>

      <div2 id="coercion">
        <head>Rely on implicit coercion from a named document to its
          intended subject</head>

        <p>
	  If one's domain of discourse mixes documents
	  with entities that might be their designated subjects,
	  then maintaining parallel properties
	  (see <specref ref="parallel"/>), one set that applies
	  the 'designated subject' coercion and one that doesn't,
	  might be considered an unacceptable cognitive and clerical burden.
	  (There is quite a lot of variation in opinion on this point.)
	  In this case one might try combining the two properties
	  into a single property that can be used in either
	  way.  Suppose that P is the initial property (not
	  defined via designated subject coercion) and Q is the
	  overloaded property
	  we'd like to define and write.  Then obvious documentation for
	  Q would be
        </p>
        <slist>
	  <sitem> Q(x,y) </sitem>
	  <sitem> &nbsp;&nbsp;&nbsp;if and only if </sitem>
	  <sitem> P(x,y) OR P(designated-subject(x),y)
	  </sitem>
        </slist>

        <p>
  	  For example, taking P = dc:creator as defined by the Dublin
	  Core documentation, and Q = dc:creator as overloaded, the 
	  statement 
	</p><eg>
  &lt;http://example/eq018> dc:creator "Alice". </eg>
        <p>
          could be taken to imply P(&lt;http://example/eq018>, "Alice"),
	  i.e. that Alice created the document itself,
	  as long as it is agreed ahead of time that earthquakes
	  (here the document's designated subject) don't
	  have creators.
        </p>
        <p>
	  This manner of overloading can make correct recovery of
	  P-relationships impossible when a
	  designated subject
	  is a document, so it's probably better
	  to use a "tie-breaking" rule such as
        </p>
        <slist>
	  <sitem> Q(x,y) </sitem>
	  <sitem> &nbsp;&nbsp;&nbsp;if and only if </sitem>
	  <sitem> P(x,y) OR
  	  	  {P(designated-subject(x),y) AND
 		     designated-subject(x) is not a document}
	  </sitem>
        </slist>
        <p>
	  There may be better tie-breakers than this one; this is just
	  for illustration.
        </p>
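        <p>
	  The tie-breaking rule can be sketched in Python as follows.
	  This is an illustration only: the function q_holds, the
	  set-of-pairs encoding of P, and the identifiers in the sample
	  data are all invented for the example and do not come from
	  any existing vocabulary or library.
        </p>
<eg><![CDATA[
```python
from typing import Dict, Set, Tuple

Fact = Tuple[str, str]  # one (x, y) pair for which P(x, y) holds


def q_holds(x: str, y: str,
            p_facts: Set[Fact],
            designated_subject: Dict[str, str],
            documents: Set[str]) -> bool:
    """Evaluate the overloaded property Q(x, y) with the tie-breaker."""
    if (x, y) in p_facts:
        return True  # P(x, y) holds directly
    subject = designated_subject.get(x)
    if subject is None:
        return False  # x has no designated subject to coerce to
    # Coerce only when the designated subject is not itself a document.
    return subject not in documents and (subject, y) in p_facts


# Hypothetical sample data.
documents = {"doc-a", "doc-b", "doc-c"}
designated_subject = {"doc-a": "quake-18", "doc-b": "doc-c"}
p_facts = {("quake-18", "Bob"), ("doc-c", "Carol")}

coerced = q_holds("doc-a", "Bob", p_facts, designated_subject, documents)
blocked = q_holds("doc-b", "Carol", p_facts, designated_subject, documents)
direct = q_holds("doc-c", "Carol", p_facts, designated_subject, documents)
```
]]></eg>
        <p>
	  In the sample data the statement about doc-a is satisfied by
	  coercion to its designated subject, while the statement about
	  doc-b is not, because its designated subject is itself a
	  document.
        </p>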

        <p>
	  All considerations that apply to the subject of a property
	  also apply to the object, making the coercion rules that
	  much more complex.
        </p>

        <p>
	  <emph>Criticism:</emph>
          Any tie-breaking rule is going to be fragile and will
          make the "losing" side of the race difficult to express.
	  One can expect many mistakes where the designated subject
          was the intended subject of some metadata but the tie-breaking
          rule implicated the other resource.
          (Desideratum: <specref ref="d.metadata"/>)
        </p>

        <p> 
	  <emph>Criticism:</emph>
          This method, by design, creates the illusion that
          the URI actually refers to the designated subject, not the
          resource at the URI.  If
          predicates that already 
          possess meaning are being reinterpreted as overloaded 
	  properties, there is risk
          that an agent will draw unsound conclusions.  For example,
          if two URIs u, v refer to distinct resources
          with the same designated subject,
          and one then writes &lt;u&gt; owl:sameAs &lt;v&gt;
	  having their designated subjects in mind, then an agent 
          can incorrectly conclude that the two resources
 	  are identical.  A similar situation holds with
          functional properties, which induce equations.
          (Desideratum: <specref ref="d.inference"/>)
        </p>
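        <p>
	  The owl:sameAs hazard can be made concrete with a small
	  Python sketch.  It assumes a deliberately naive reasoner
	  that copies statements between subjects declared to be the
	  same; the function same_as_closure and the sample triples
	  are invented for illustration and do not model any
	  particular inference engine.
        </p>
<eg><![CDATA[
```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)
SAME_AS = "owl:sameAs"


def same_as_closure(triples: Set[Triple]) -> Set[Triple]:
    """Copy statements between subjects related by owl:sameAs."""
    result = set(triples)
    changed = True
    while changed:
        changed = False
        pairs = {(s, o) for s, p, o in result if p == SAME_AS}
        pairs |= {(o, s) for s, o in pairs}  # sameAs is symmetric
        for a, b in pairs:
            for s, p, o in list(result):
                if s == a and p != SAME_AS and (b, p, o) not in result:
                    result.add((b, p, o))
                    changed = True
    return result


# u and v are distinct documents about the same designated subject.
# The author wrote the sameAs statement with that subject in mind.
triples = {
    ("u", "dc:creator", "Alice"),
    ("v", "dc:creator", "Bob"),
    ("u", SAME_AS, "v"),
}

inferred = same_as_closure(triples)
```
]]></eg>
        <p>
	  Under these assumptions the closure contains a statement
	  that Alice is a creator of v and one that Bob is a creator
	  of u, neither of which was intended.
        </p>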
      </div2>

      <div2>
        <head>Boundary cases
        </head>
	<p>
	  Designers of discovery methods based on this idea should
	  consider what the specified outcome is to be in
	  the following test scenarios. 
	  Assume that U is a hashless http: URI and Z is the payload
	  ("entity" in <bibref ref="rfc2616"/> terms) of a response
	  to a successful retrieval request (GET/200 in HTTP).
	  The outcome could be that Z is seen as carrying URI
	  documentation for U,  
	  Z is seen as an instance of what U refers to / identifies,
	  both of these, neither of these, or not specified by the method
	  definition (answer is out of scope).
	</p>
	<olist>
	  <item>
	    Z has media type application/rdf+xml, but U does not occur
	      in the RDF graph
	  </item><item>
	    Z has media type text/turtle
	  </item><item>
	    Z has media type text/plain and carries URI documentation
	      for U
	  </item><item>
	    Z has media type text/html and contains RDFa markup in which
	      U occurs in the equivalent RDF graph
	  </item><item>
	    Z has media type text/html and contains RDFa markup in which
	      U does not occur in the equivalent RDF graph, or no RDFa
	      markup at all
	  </item><item>
	    Z has a media type specifying an equivalent RDF graph, but the type
	      is not registered with IANA
	  </item><item>
	    one content negotiation variant is recognized to have URI
	      documentation for U but another isn't
	  </item><item>
	    URI documentation for U in Z is consistent with U "identifying"
	      an unspecified document
	  </item><item>
	    URI documentation for U in Z is consistent with U "identifying"
	      a document that carries no URI documentation for U (i.e. an
	      information resource but not the one at U)
	  </item><item>
	    Z has a primary topic that is inconsistent with URI
	      documentation for U contained in Z
	  </item><item>
	    URI documentation for U in Z is internally inconsistent
	      (meaning U would not refer to anything)
	  </item><item>
	    Z carries information that would be harmful if considered true
	  </item>
	</olist>
      </div2>


    </div1>


    <div1>
      <head>Summary</head>
      <p>
        The following table summarizes some of the current and
 	proposed
	URI documentation discovery methods,
        evaluating each against the desiderata stated in the
 	introduction, as explained in the 
 	key below.
      </p>
      <p>
        A complete discovery solution would combine
        methods in some way, conceivably resulting in an overall approach
        possessing more or fewer virtues than any of its individual 
	constituent methods.
      </p>
      <p>
        A table entry of '?' means that the answer depends on the
        details of the method design, while '~' means it depends on
        the interpretation of the desideratum statement (i.e. the
        vagueness of the desideratum statement makes it hard to say).
      </p>
      <table rules='all'>
       <thead>
        <tr><td>
		</td>
	    <td><loc href="#d.uniform"  >uniform</loc></td> 
	    <td><loc href="#d.retrieval">retrieval</loc></td> 
	    <td><loc href="#d.easy"	>easy</loc></td> 
	    <td><loc href="#d.hosting"  >hosting</loc></td>
	    <td><loc href="#d.efficient">round trips</loc></td> 
	    <td><loc href="#d.resistant">resistant</loc></td>
	    <td><loc href="#d.metadata" >metadata</loc></td>
	    <td><loc href="#d.inference"   >inference</loc></td> 
	</tr>
       </thead>
       <tbody>
        <tr><td><specref ref="colocate"/>    </td>
            <td>-</td> <!-- it's context sensitive -->
	    <td>-</td> <!-- GET doesn't do it -->
            <td>+</td> <!-- don't need to use anything at all -->
            <td>+</td>
            <td>0</td>
            <td>~</td> <!-- resistant? -->
	    <td>~</td>
            <td>+</td></tr>

        <tr><td><specref ref="cite-source"/>    </td>
            <td>-</td> <!-- context sensitive (link is context) -->
	    <td>-</td> <!-- GET doesn't work -->
            <td>+</td> <!-- mild tweak, follow the link -->
            <td>+</td>
            <td>1</td>
            <td>~</td> <!-- resistant? -->
	    <td>~</td>
            <td>+</td></tr>

        <tr><td><specref ref="not-http"/>    </td>
            <td>+</td>
	    <td>-</td>
            <td>?</td> <!-- if proxy servers are available -->
            <td>-</td>
            <td>1</td>
            <td>+</td>
	    <td>+</td>
            <td>+</td></tr>

        <tr><td><specref ref="hash"/>    </td>
            <td>+</td>
	    <td>+</td>
            <td>+</td>
            <td>+</td>
            <td>1</td>
            <td>-</td>
	    <td>+</td>
            <td>+</td></tr>

        <tr><td><specref ref="seeother"/>    </td>
            <td>+</td>
	    <td>+</td>
            <td>+</td>
            <td>-</td>
            <td>2</td>
            <td>-</td> <!-- browsers lose original uri on redirect -->
	    <td>+</td>
            <td>+</td></tr>

        <tr><td><specref ref="parallel"/> </td>
            <td>+</td>
	    <td>+</td>
            <td>+</td>
            <td>+</td>
            <td>1</td>
            <td>+</td>
	    <td>+</td>
            <td>+</td></tr>

        <tr><td><specref ref="global-rule"/></td>
            <td>+</td>
	    <td>-</td> <!-- GET loses unless you do a 303 or 404 as well -->
            <td>+</td>
            <td>+</td> <!-- probably -->
            <td>1</td>
            <td>+</td>
	    <td>+</td>
            <td>+</td></tr>

        <tr><td><specref ref="hostrule"/></td>
            <td>+</td>
	    <td>-</td>
            <td>+</td>
            <td>+</td> <!-- probably -->
            <td>1+&epsilon;</td>
            <td>+</td>
	    <td>+</td>
            <td>+</td></tr>

        <tr><td><specref ref="mget"/> </td>
            <td>+</td>
	    <td>-</td> <!-- simple GET loses -->
            <td>~</td> <!-- depends on whether MGET is 'easy' -->
            <td>-</td>
            <td>1</td>
            <td>+</td>
	    <td>+</td>
            <td>+</td></tr>

        <tr><td><specref ref="linkheader"/> </td>
            <td>+</td>
	    <td>?</td> <!-- GET doesn't do it -->
            <td>~</td> <!-- you have to parse new header -->
            <td>-</td>
            <td>2</td>
            <td>+</td>
	    <td>+</td>
            <td>+</td></tr>

        <tr><td><specref ref="status-code"/> </td>
            <td>+</td>
	    <td>+</td>
            <td>~</td> <!-- need to handle new status -->
            <td>-</td>
            <td>1</td>
            <td>+</td>
	    <td>+</td>
            <td>+</td></tr>

        <tr><td><specref ref="unnecessary-303"/></td>
            <td>+</td> <!-- what about HTTP request-URI context vs. RDF?? -->
	    <td>+</td> <!-- GET+conneg  ??? -->
            <td>+</td>
            <td>+</td>
            <td>1</td>
            <td>+</td>
	    <td>-</td> <!-- metadata loses, in most of these designs -->
            <td>+</td>
	</tr>

        <tr><td><specref ref="coercion"/> </td>
            <td>~</td> <!-- think about this -->
	    <td>+</td>
            <td>+</td>
            <td>+</td>
            <td>1</td>
            <td>+</td>
	    <td>+</td>
            <td>-</td> <!-- hard to reason due to pred overloading -->
	</tr>
       </tbody>
      </table>

      <p>Refer to
         <specref ref="desiderata"/>,
	 as follows, for explanations of each column in the table:</p>

      <glist>
        <label>uniform</label>
        <def> 
          <loc href="#d.uniform"
	    >Uniform</loc>.
        </def> 

        <label>retrieval</label>
        <def> 
          <loc href="#d.retrieval"
	    >Retrieval-friendly</loc>.
        </def> 

        <label>easy</label>
        <def> 
          <loc href="#d.easy"
	    >Easy to deploy using existing widely deployed protocol stacks</loc>.
        </def> 

        <label>hosting</label>
        <def> 
          <loc href="#d.hosting"
	    >Easy to deploy on Web hosting services</loc>.
        </def> 

        <label>round trips</label>
        <def> 
          <loc href="#d.efficient"
	    >Efficient</loc>. The table gives the minimum number of network
	    round trips needed to find 
          URI documentation, assuming (a) the URI documentation is not cached and
          (b) the /.well-known/host-meta cache misses with probability
          &epsilon;.
        </def> 

        <label>resistant</label>
        <def> 
          <loc href="#d.resistant"
	    >Substitution resistant</loc>.
        </def> 

        <label>metadata</label>
        <def> 
          <loc href="#d.metadata"
	    >Compatible with use of Web metadata per Convention 1</loc>.
        </def> 

        <label>inference</label>
        <def> 
          <loc href="#d.inference"
	    >Compatible with inference</loc>. 
        </def> 
      </glist>

      <ednote>
        <date>2011-04-11</date>
	<edtext>
	  For reference, 
          <loc href="http://hueniverse.com/2008/09/discovery-and-http/"
           >here</loc>'s a similar analysis &mdash; not the same problem, but a
          closely related one &mdash; with its own matrix.
	</edtext>
      </ednote>

    </div1> 


    <div1 id="glossary">
      <head>Glossary</head>

      <!-- 
      <p>[Draft note: HH: Put Glossary at end. Otherwise, I doubt
      anyone will get past it.]</p>
      -->

      <p>
        This section defines terms that are used in this report.
        An attempt has been made to avoid gratuitous differences
        from the way these terms are used elsewhere, but in a few
        cases choice of terminology has been difficult and words
        with other meanings are
        given technical definitions.  These definitions are not being
        proposed for general adoption.
      </p>

      <p>
        [Draft comment: All terminology choices are provisional; 
        for most of them I
        am testing the waters to see how well the word works, and am
        prepared to change.]
      </p>

      <glist>
        <label id="hashless">hashless</label>
        <def>
	  A URI is hashless if it contains no hash '#' sign.
        </def>

        <label>http: URI</label>
        <def>
          A URI whose scheme (the part before the colon) is 'http' or 'https'.
        </def>

        <label>local identifier</label>
 	<def>
	  The part of a URI that follows a '#' character (perhaps
	  null); fragment identifier.
 	</def>

        <label>metadata</label>
        <def>
          Information about information, 
	  e.g. about a document, image, or audio recording.  In RDF, metadata might
          be written using vocabularies such as Dublin Core, FOAF,
          or CC REL.
        </def>

        <label>named document</label>
        <def>
	  A document that has a particular URI associated with it.
	  Two named documents might have identical content, but be
	  distinguishable by virtue of having distinct associated URIs.
        </def>

        <label id="on-web-at">on the Web at</label>
        <def> 
          When a URI is retrieval-enabled,
          "the resource on the Web at a URI" 
          (abbreviated OW@(that URI), see below)
          is the resource whose associated 
	  <loc href="#representation">representations</loc>
          are the 
	  ones obtained by retrieval requests using that URI (or more precisely,
          the ones that are authorized for retrievals using that URI).
	  Note that without Convention 1
	  "the resource on the Web at u" may be different
          from what u refers to.
	  See <bibref ref="generic"/> for a rigorous definition.
        </def>

        <label>OW@(u)</label>
        <def> 
          OW@(u) is shorthand for the generic resource (generic
          information entity)
	  <loc href="#on-web-at">on the Web at</loc>
          URI u.
        </def>

        <label>refer</label>
        <def>
          For the purposes of this report, reference is just one way to
          mean.  There may be ways to mean other than to
          refer, but none are specified here.
        </def>

        <label id="representation">representation</label>
        <def>
	  Content (an octet sequence) tagged with media type and perhaps
      	  other information meant to guide interpretation of the content.
      	  "Representation" is used as a term of art; these representations
      	  don't necessarily "represent" anything at all.  Similar to
      	  "entity" in RFC 2616 <bibref ref="rfc2616"/>.
	  See <bibref ref="generic"/> for a treatment of representations
      	  and their resources.
        </def>

        <label id="retrieval-enabled">retrieval-enabled</label>
        <def>
          A URI is "retrieval-enabled" iff a retrieval request could
          legitimately lead to a successful response.
          (Source: <bibref ref="rfc3986"/> section 1.2.2.)
          In particular, a hashless http: URI is
          retrieval-enabled if an HTTP GET request or
          equivalent is, or could correctly be, successful (yields a 2xx
          response).  Some URIs belonging to some other 
          URI schemes are also retrieval-enabled.
        </def> 

	<!-- 
        <label>term</label>
        <def>A URI, word, name, or phrase
          that can serve in subject or object position in a statement.  In an
          RDF serialization, for example, a term might be a qname,
          URI, or blank
          node label.  In Turtle, a term might be any Turtle term,
          including one written using blank node [...] notation.
	  [Draft note: HH says that to be correct, need to admit that
          URIs are also used as predicates.]
        </def>
        -->

        <label>URI documentation</label>
        <def>
          A document or document part that provides
          information about the meaning of a URI.
	  This term is not meant to be either rigorous or exclusive.  The
          "information" could be provided in
	  any human-readable or machine-readable language,
	  or combination of languages.

          [Draft note: Alan R: Is a sound recording possible documentation?]

          <!-- 
          [We need a word for this, and its relation
          to a phrase whose meaning is in question.  "Description" (or
          Eran H-L's "description resource") is
          incorrect as it shifts focus from the term to some (unknown)
          resource - I don't start out knowing what the resource is and then
          look for a description of it, I start out knowing a term and
          then I want to know what resource is meant.  "Definition" is
          another option but may be misleading.  David B likes
          "URI declaration" but this term is evocative of his architecture,
          which I don't want to evoke.]
           -->
        </def> 

      </glist>
    </div1>

    <div1>
      <head>Acknowledgments</head> 
      <p>
        <loc href="http://www.w3.org/2001/tag/awwsw/"
          >AWWSW Task Group</loc>
	members
	David Booth, Michael Hausenblas, Nathan Rixham, and
        Alan Ruttenberg contributed to
        the creation of this report.  
	Pat Hayes and Henry S. Thompson participated in discussions.
	Timothy Danford gave some helpful suggestions on a draft.
	Dave Reynolds gave 
        detailed advice on the handling of desiderata throughout the
        document, and other valuable comments.
	Jeni Tennison and the rest of the TAG gave many helpful comments.
	Martin J. Dürst clarified the technical meaning of the term
 	"absolute URI".
      </p>
    </div1>

    <div1>
      <head>References</head> 
      <blist> 

        <bibl id="ark"
              href="https://wiki.ucop.edu/display/Curation/ARK">
          John Kunze.
	  <titleref href="https://wiki.ucop.edu/display/Curation/ARK"
           >ARK: Archival Resource Key</titleref>.
          Web page, January 2012, accessed 30 January 2012.
        </bibl> 

        <bibl id="davis"
              href="http://blog.iandavis.com/2010/11/07/a-guide-to-publishing-linked-data-without-redirects/">
          Ian Davis.
	  <titleref href="http://blog.iandavis.com/2010/11/07/a-guide-to-publishing-linked-data-without-redirects/"
           >A guide to publishing linked data without redirects</titleref>.
          Blog post, November 2010, accessed 27 January 2012.
        </bibl> 


<!-- 
        <bibl id="davis"
              href="http://blog.iandavis.com/2010/11/04/is-303-really-necessary/">
          Ian Davis.
	  <titleref href="http://blog.iandavis.com/2010/11/04/is-303-really-necessary/"
           >Is 303 really necessary?</titleref>
          Blog post, November 2010, accessed 20 January 2012.
        </bibl> 
 -->

        <bibl id="degraauw"
              href="http://www.marcdegraauw.com/2007/02/20/the-referent-convention/">
          Marc de Graauw.
	  <titleref href="http://www.marcdegraauw.com/2007/02/20/the-referent-convention/"
	   >The #referent convention</titleref>.
	   Blog post, 2007, accessed 20 January 2012.
         </bibl>

        <bibl id="disambiguating"
              href="http://www.w3.org/2002/12/rdf-identifiers/">
          Sandro Hawke.
	  <titleref href="http://www.w3.org/2002/12/rdf-identifiers/"
           >Disambiguating RDF Identifiers</titleref>.
          W3C, January 2003.
        </bibl> 

        <bibl id="halpin"
              href="http://lists.w3.org/Archives/Public/public-awwsw/2011Jan/0021.html">
          Harry Halpin.
	  <titleref href="http://lists.w3.org/Archives/Public/public-awwsw/2011Jan/0021.html"
	   >Reversing HTTP Range 14 and SemWeb Cool URIs decision</titleref>.
	   Email to public-awwsw list, 2011.
         </bibl>

        <bibl id="hostmeta"
              href="http://tools.ietf.org/html/draft-hammer-hostmeta-13">
          E. Hammer-Lahav.
	  <titleref href="http://tools.ietf.org/html/draft-hammer-hostmeta-13"
           >Web Host Metadata</titleref>.
          Internet-draft, IETF, 2010.
        </bibl> 

        <bibl id="generic"
	      href="http://www.w3.org/2001/tag/awwsw/ir/20120127/">
          Jonathan A. Rees, editor.
	  <titleref href="http://www.w3.org/2001/tag/awwsw/ir/20120127/"
	   >Generic resources and Web metadata</titleref>.
	  Editor's draft, W3C, 2012.
	</bibl>

        <bibl id="issue-14-resolved"
              href="http://lists.w3.org/Archives/Public/www-tag/2005Jun/0039.html">
          Roy Fielding.
	  <titleref href="http://lists.w3.org/Archives/Public/www-tag/2005Jun/0039.html"
           >[httpRange-14] Resolved</titleref>.
          Email to www-tag list, 2005.
        </bibl> 

        <bibl id="issue-57"
              href="http://www.w3.org/2001/tag/group/track/issues/57">
          <titleref href="http://www.w3.org/2001/tag/group/track/issues/57"
           >Issue 57</titleref>.
          W3C Technical Architecture Group, 2007-2012.
        </bibl> 

        <bibl id="lsid"
	      href="http://www.omg.org/cgi-bin/doc?dtc/04-05-01.pdf">
          <titleref href="http://www.omg.org/cgi-bin/doc?dtc/04-05-01.pdf"
	   >Life Sciences Identifiers Specification</titleref>.
	  Object Management Group, 2004.
	</bibl>

        <bibl id="rfc2616"
              href="http://www.ietf.org/rfc/rfc2616.txt">
          R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter,
          P. Leach, and T. Berners-Lee.
	  <titleref href="http://www.ietf.org/rfc/rfc2616.txt"
           >Hypertext Transfer Protocol -- HTTP/1.1</titleref>.
          RFC 2616, IETF, 1999.
        </bibl> 
        
        <bibl id="rfc3406"
	      href="http://www.ietf.org/html/rfc3406.txt">
	  L. Daigle, D.W. van Gulik, R. Iannella, and P. Faltstrom.
	  <titleref href="http://www.ietf.org/html/rfc3406.txt"
	   >Uniform Resource Names (URN) Namespace Definition 
	    Mechanisms</titleref>.
	  RFC 3406, IETF, 2002.
        </bibl> 

        <bibl id="rfc3986"
              href="http://www.ietf.org/rfc/rfc3986.txt">
          T. Berners-Lee, R. Fielding, L. Masinter.
	  <titleref href="http://www.ietf.org/rfc/rfc3986.txt"
           >Uniform Resource Identifier (URI): Generic Syntax</titleref>.
          RFC 3986, IETF, 2005.
        </bibl> 
        
        <bibl id="rfc4395"
              href="http://www.ietf.org/rfc/rfc4395.txt">
	  T. Hansen, T. Hardie, and L. Masinter.
          <titleref href="http://www.ietf.org/rfc/rfc4395.txt"
           >Guidelines and Registration Procedures for New URI Schemes</titleref>.
          RFC 4395, IETF, 2006.
        </bibl> 
        
        <bibl id="rfc5988"
              href="http://www.ietf.org/rfc/rfc5988.txt">
          M. Nottingham.
	  <titleref href="http://www.ietf.org/rfc/rfc5988.txt"
           >Web Linking</titleref>.  
          RFC 5988, IETF, 2010.
        </bibl> 

        <bibl id="sporny"
              href="http://lists.w3.org/Archives/Public/public-awwsw/2011Jan/0012.html">
          Manu Sporny.
	  <titleref href="http://lists.w3.org/Archives/Public/public-awwsw/2011Jan/0012.html"
	   >Reversing HTTP Range 14 and SemWeb Cool URIs decision</titleref>.
	   Email to public-awwsw list, 2011.
         </bibl>

        <bibl id="thompson"
              href="http://www.ltg.ed.ac.uk/~ht/wantOther.html">
          Henry S. Thompson.
	  <titleref href="http://www.ltg.ed.ac.uk/~ht/wantOther.html"
	   >Yet another workaround for definition discovery</titleref>.
	  W3C informal memo, 19 September 2011.
        </bibl>

        <bibl id="tumarello"
              href="http://lists.w3.org/Archives/Public/www-tag/2007Jul/0034.html">
          Giovanni Tummarello.
	  <titleref href="http://lists.w3.org/Archives/Public/www-tag/2007Jul/0034.html"
	   >http-range-14 303 issue, request for reopening the 
	    discussion</titleref>.
	   Email to www-tag list, 2007.
         </bibl>

        <bibl id="uriqa"
              href="http://sw.nokia.com/uriqa/URIQA.html">
          Patrick Stickler.
	  <titleref href="http://sw.nokia.com/uriqa/URIQA.html"
           >The URI Query Agent Protocol</titleref>.
          Nokia, 2010.
        </bibl> 

        <bibl id="webarch"
              href="http://www.w3.org/TR/webarch/">
          Ian Jacobs and Norman Walsh, editors.
	  <titleref href="http://www.w3.org/TR/webarch/"
           >Architecture of the World Wide Web, Volume One</titleref>.
          W3C Recommendation, December 2004.
        </bibl> 



      </blist>

    </div1> 

    <div1>
      <head>Change log</head> 
      <ulist>
        <item>
	  2011-11-07 Terminology correction: "absolute" to "hashless"</item>
        <item>
 	  2011-11-07 Added new material on 
 	     	     scripts,
 	     	     <specref ref="misspellings"/>, and
 	     	     analytics </item>
	<item>
 	  2012-01-19 Terminology change:
	     	     "definition" to "URI documentation"</item>
        <item>
 	  2012-01-19 Terminology correction:
 	     	     "dereferenceable" to "retrieval-enabled"</item>
	<item>
 	  2012-01-19 Added new section <specref ref="global-rule"/>
	       	     </item>
	<item>
	  2012-01-20 Terminology change:
	       "fragment identifier" to "local identifier"</item>
	<item>
 	  2012-01-20 
	       Defined "convention 1" and introduced more care
	       regarding the question of how widespread
	       observance of httpRange-14(a) is.
	       </item>
	<item>
 	  2012-01-24 Changed "criteria" to "desiderata".</item>
	<item> 
	  2012-01-24
	       Removed "unattractive, silly, and/or vestigial"
	       criticism of hash URIs pending better documentation.</item>
	<item>
 	  2012-01-24 Removed unfair criticism of LSID.</item>
	<item> 
	  2012-01-24 Moved new-status-code method into its own section.
	       </item>
	<item>
 	  2012-01-24 
	    Overhauled treatment of desiderata in intro and summary.</item>
	<item>
 	  2012-01-27
	    Purged the term "information resource" from the
	    document.  Cross-referenced criticisms to desiderata.
	    Some reorganization. </item>
	<item>
 	  2012-01-28
	    Hash URI section reorganization; recast 2xx ("convention 1") as a
	    discovery method, and added criticism. </item>
	<item>
 	  2012-01-30
	    Added comment about ARK </item>
	<item>
 	  2012-01-30
	    Reorganized "face value" again; expanded design space 
	    overview </item>
	<item>
 	  2012-01-30
	    Removed section "This use of 303 has no consensus
	    specification" since <emph>all</emph> discovery solutions 
	    (other than hash and Link:) have this problem.
	    </item>
	<item>
 	  2012-02-02
	    Added use case for discovering that a URI refers to the
	    generic resource at that URI.
	    </item>
	<item>
 	  2012-02-02
	    Added section listing boundary cases.
	    </item>
      </ulist>
    </div1>

  </body> 
</spec>

	  <!-- 
        <p>
	  <emph>Criticism:</emph>
	  It is likely that there is deployed content that would
	  be interpreted differently under the proposed rule than at
	  present.  This would be hard to know, and inconsistencies
	  could be consequential, such as the assignment of authorship
	  or a copyright license to the wrong resource.
	  (Think about the case where a resource at URI U
	  documents U as being a different resource.)
	  More complex
	  and costly heuristics than those given above might eliminate
	  some kinds of misinterpretation, but would never eliminate it.
        </p>

      </div2>
 -->
	  <!-- 
        <div3>
          <head> How to refer to things that are in the Web,
          	 then?</head> 
 -->

	  <!-- 
	  <p>
	    Another design option would be a rule or protocol
	    for providing a URI (other
	    than u) to refer to OW@(u), when 
	    one is available.
	    One way to do this would be with a Link: HTTP response header
	    <bibref ref="rfc5988"/>: if GET u or HEAD u yielded a
	    response with a Link: header with an agreed link relation,
	    the target of the link would be the URI naming OW@(u).
	    Using a Content-location: header
	    has also 
 -->
	    <!-- 
	    <a href="http://blog.iandavis.com/2010/11/07/a-guide-to-publishing-linked-data-without-redirects/"
	     >been suggested</a>.  
 -->
	    <!-- 
	    been suggested <bibref ref="davis"/>.
	    It would be necessary that
	    the extra header be provided for <emph>all</emph> indirect URIs,
	    since otherwise some of these
	    resources would lack URIs.
	  </p>
	  <p>
	    It is not clear how difficult it would be to correctly deploy
	    Link: or Content-location: headers on hosting services.
	  </p>
 -->
	    <!-- 
  , or
	    via an RDF statement such as
	  </p>
	  <eg>
    &lt;http://example/eq018#ir&gt; wa:hasInstanceUri "http://example/eq018"^^xsd:anyURI . </eg>
       </div3>
   -->

	    <!-- 
	<p>
	  <emph>Criticism:</emph>
	  We're not seriously advancing this option.
	  The review process for new URI schemes and URN namespaces is
	  probably too stringent for all but a very few URI documentation
	  discovery applications.
	  There would likely be poor protocol support for discovering
	  URI documentation in a new URI scheme or URN namespace.  It is
	  possible, manually, to look up a scheme or namespace in the
	  appropriate registry, but few client applications are able
	  to do this, and the resulting document is not machine
	  actionable in any standard way.  One could attempt to modify
	  all Web clients to understand the new scheme, but this would
	  be difficult.
	    (Desiderata: <specref ref="d.retrieval"/>,
			 <specref ref="d.hosting"/>.)
	</p>
 -->

	  <!-- 
        <p>
	  Under this approach, some or all retrieval-enabled 
	  hashless URIs - call them "indirect" URIs - might
          get their meaning according to URI documentation
          found in the resource (document, usually) at the URI,
	  even if that is (or is seen as) inconsistent with
	  referring to the resource on the Web
          as suggested by Convention 1.
          Defining an indirect URI is easy &mdash; it is the same as
          publishing any Web document &mdash; and access to its URI documentation
	  is also easy, not requiring an indirection step.
        </p>
 	  -->

	<!-- 

        <p>
	  How does one learn whether a URI is indirect or not?
	  One might like to say that an indirect URI is one whose
	  retrieval results are documentation for itself, and that all others
	  are direct.
	  But this criterion is not machine
          actionable as stated, both because the URI documentation
 	  might be couched
          in an arbitrary language or notation (the number of RDF
	  serializations is increasing steadily), and because even for
	  a known notation it may not be obvious 
          how to distinguish content that contains 
          documentation for a particular URI from content that doesn't.
          One actionable approximation that has been
          proposed is as follows: If OW@(u) has an associated
	  <loc href="#representation">representation</loc>
	  with media type 'application/rdf+xml', then
          take u to be indirect; otherwise take u to be direct
	  (see <bibref ref="davis"/>).
	  This
	  rule would generate false positives (e.g. RDF/XML documents 
          not containing u) and false negatives (e.g. those defining the
	  URI only in an associated text/owl-manchester
	  <loc href="#representation">representation</loc>),
          but it illustrates the idea.
        </p>

        <p>
	  In order to compose or use metadata, agents would 
	  first check whether a URI is direct by
	  requesting an application/rdf+xml representation.  If the URI
	  is direct, agents could compose or use metadata in the
	  usual way (at some risk that the URI might change status in
	  the future from direct to indirect).  If the URI is
	  indirect, agents 
	  would have to write or interpret the metadata in some new
	  way (see below).
        </p>
 -->

<!-- 
	  <p>
	    
	  </p>
	  <p>
	    A standard way to refer to
	    OW@(u) is needed in a variety of circumstances:
	  </p>
	  <ol>
	    <li>when u is an indirect URI</li>
	    <li>when it is not known whether u is direct or indirect</li>
	    <li>when the cost of determining whether u is direct or
	      indirect is judged to be too high</li>
	    <li>when it is desired not to impose on others the cost of
	      determining whether u is indirect</li>
	    <li>to guard against u possibly becoming indirect in the future</li>
	  </ol>
	  <p>
	    Although direct URIs might still be used to refer per
	    Convention 1, 
	    when they are known to be direct,
	    it is possible that the risks and costs of doing so might
	    lead some people to stop using them in that way, 
	    in preference to a common
	    approach that worked uniformly for both direct and indirect URIs.
	  </p>
	  <p>
	    In any case, there are many design alternatives for referring
	    to a resource on the Web at a URI other than via that URI.  For
	    example, the Turtle term 
	  </p>
	  <eg>
    [ wa:hasInstanceUri "http://example/eq018"^^xsd:anyURI ] </eg>
	  <p>
	    could be a new way to refer to 
	    OW@('http://example/eq018'), which we formerly
	    referred to in Turtle as '&lt;http://example/eq018&gt;'.
	    [See <bibref ref="generic"/> for a more detailed exposition of
	    the wa:hasInstanceUri relation.]
	    [TBD: also reference Halpin and Presutti's closed access
	    ESWC 2009 paper.]
	    A local shorthand for use within a document or graph
	    could be defined to the same effect:
	  </p>
	  <eg>
    :about-eq018 wa:hasInstanceUri "http://example/eq018"^^xsd:anyURI . </eg>
	  <p>
	    (Note that :about-eq018 could be either a 'hash' URI or a
	    303 URI.)
	  </p>
	  <p>
	    Yet another possible replacement notation would be syntactic sugar:
	  </p>
	  <eg>
    &amp;&lt;http://example/eq018&gt; </eg>
	  <p>
	    which might be supported in a hypothetical new RDF serialization.
	    (The ampersand is meant to be suggestive of the address-of
	    operator in the C programming language.)
	    (This would of course have significant deployment cost.)
	  </p>
 -->

	  <!-- 
	  <p>
	    [Draft note: HH requested that the idea be presented of
	    syntactic sugar to
	    support references to IRs.  He suggested something having to
	    do with quotation and named graphs that I didn't understand,
	    but I think he's referring to something that's basically the same
	    as the address-of operator in my 
	    <loc href="http://www.w3.org/2001/tag/2011/02/metadata-arch.html#slide9"
	      >TAG F2F slides</loc>.]
	  </p>
	  -->

	  <!-- 
	  <p>
	    Alternatively, the referring document could just assert that
	    a URI is to be treated as direct, without checking whether
	    it is or not:
	  </p>
	  <eg>
    &lt;http://example/eq018&gt; wa:hasInstanceUri "http://example/eq018"^^xsd:anyURI . </eg>
	  <p>
	    This would be an instance of <specref ref="colocate"/>.
	    However, this runs some interoperability risk as there may
	    be other agents that interpret the same URI as indirect.
	    <footnote>
	      <p>
		One might think that the notation 
		for referring to the resource at u could relate the
		resource to the referent of u (written
		'&lt;http://example/eq018&gt;' in Turtle) instead of to the
		URI u itself
		(written '"http://example/eq018"^^xsd:anyURI'):
	      </p>
	      <eg>
    [ rdfs:isDefinedBy &lt;http://example/eq018&gt; ] </eg>
	      <p>
		However, the meaning of this expression is then sensitive to the
		interpretation of the URI 'http://example/eq018', which
		is what was in doubt in the first place
		and is therefore something that the notation
		has to avoid depending on.
	  -->
		<!-- 
		The &lt;...&gt; notation is also
		ambiguous according to RDF semantics, because -->
		<!-- 
		If two URIs, say 
		'http://example/eq018' and 'http://example/earthquake571',
		both refer to the same thing (whatever it is), there might
		be two distinct
		resources
		OW@('http://example/eq018') and
		OW@('http://example/earthquake571') satisfying this relationship,
		with no way for the property, which is defined on the
		interpretations of the URIs and not on the URIs
		themselves, to choose between them.
 		-->
		<!-- 
	      </p>
	    </footnote>
	  </p>
 		-->

	<!-- 
	<div3>
	  <head>'Hash URIs' are unattractive or seem redundant</head> 
	  <p> There has been resistance to hash URIs on aesthetic
	      grounds.</p>
	</div3>
 	-->

	<!-- 
	<div3 id="analytics">
	  <head>Use of local identifiers reduces analytics quality</head> 
	  <p>
	  <emph>Criticism:</emph>
	    Ian Davis 
	    <a href="http://lists.w3.org/Archives/Public/www-tag/2011Aug/0127.html"
	     >reports</a> that
	    it might be useful to know, for analytics purposes, which
	    hits are due to accesses using the probe URI, vs. which
	    ones are due to accesses using the documentation's URI.
	  </p>
	  <p>
	    <emph>Response:</emph>
	    If the
	    single-hash-URI-per-stem-URI pattern as described above
	    is used, then
	    discovery use of the stem and probe URIs will be in 1-1
	    correspondence, so analytics will be precise.
	  </p>
	</div3>

	<div3>
	  <head>'Hash URIs' don't support REST architecture</head> 
	  <p>
	  <emph>Criticism:</emph>
	    Manu Sporny
	    <bibref ref="sporny"/>
	    reports that
	    hash URIs should work with HTTP PUT, POST, and DELETE
	    methods; they don't.
	  </p>
	  <p>
	    <emph>Response:</emph>
	    More information needed.  Why not use a separate
	    retrieval-enabled URI for REST controls related to 
	    the referent and/or documentation of a hash URI?
	    In particular, using the single-hash-URI-per-stem-URI pattern,
	    the REST controls could be applied to the stem URI.
	  </p>
	</div3>
 -->

	<!-- 	  <p>
	    (One might consider using an empty suffix, since there is
	    no information to convey:
	  </p>
	  <eg>
  http://example/ns/a#  http://example/ns/b#  http://example/ns/c# ...</eg>
	  <p>
	    but, while technically correct, this approach interacts 
	    <a href="http://www.w3.org/2001/tag/2011/06/07-minutes.html#item03"
	     >badly</a> with many deployed tools.)
	  </p>
 -->

	<!-- 
          <p>
	    The following are some of the answers that have been
	    advanced; this is certainly not a complete list of the
	    possibilities.
          </p>

	  <ulist>
	    <item>
	      Filter by new response header: Only responses carrying a
	      certain header or kind of header are recognized as
	      carrying URI documentation, and/or as not being instances.
	      Perhaps a new request header 
	      (new content negotiation dimension) could be used to elicit
	      such a response from participating servers.
	      See <bibref ref="thompson"/>.
	    </item>
	    <item>
	      Filter by media type: Only certain media type(s),
	      e.g. application/rdf+xml, are
	      recognized as carrying URI documentation, and/or as not
	      being instances.
	      Content negotiation can be used to elicit these media
	      types.  See <bibref ref="davis"/>.
	    </item>
	    <item>
	      Perhaps responses that are non-instances
	      can <emph>only</emph> be elicited via appropriate
	      request headers, per previous two bullets.
	    </item>
	    <item>
	      Responses that pass one of these filters might be
	      always presumed to be non-instances, or...
	    </item>
	    <item>
	      Responses that pass one of these filters might be
	      treated as non-instances when URI documentation is
	      actually found in the response.  Although 
	      the topic of figuring out whether a document contains
	      URI documentation and if so what it is has not (to the
	      editor's knowledge) been 
	      discussed, URI documentation might
	      be recognized as being some part of the document (some
	      set of statements, in RDF or OWL) in
	      which the URI occurs.
	    </item>
	    <item>
	      If URI documentation is consistent with the response
	      being an instance, then perhaps it is to be taken as
	      one.
	    </item>
	    <item>
	      The status as above of a response may influence the
	      status of other responses - for example, an
	      application/rdf+xml response that implies that the
	      resource does not have instances might force other
	      responses to be not treated as instances.  Therefore,
	      before assuming any response is an instance of what the
	      URI names, discovery
	      would have to be performed.
	    </item>
	  </ulist>

 -->

	<!-- 
	  Thanks to Convention 1, it is currently easy to write and
	  interpret Web metadata 
	  (meaning metadata written about retrieval results at
	  a retrieval-enabled hashless URI).
	  By conflicting with Convention 1
	  this proposal makes metadata more complicated, fragile, and
	  costly, and requires producers and 
	  consumers of Web metadata to be updated to be aware
	  of indirect URIs.
        </p>

        <p> 

        <p>
	  As most of the Web (e.g. HTTP clients and servers) will
	  continue to adhere to the current interpretation of
	  Convention 1, the proposed rule
	  introduces a split in the URI namespace, with two
	  communities interpreting the same URIs in incompatible
	  ways.  Having multiple namespaces 
	  imposes an overall system cost in that one has to
	  determine which one to use in each instance 
	  (see <bibref ref="webarch"/> 2.2.1).

	  (Desideratum: <specref ref="d.metadata"/>.)
        </p>

-->

	<!-- 
	  LSIDs rely on an unregistered URN namespace, calling their
	  consensus status into question and making them impossible to
	  understand through the usual "follow your nose" chain of
	  IETF URI specifications. 
	  As currently used, LSIDs rely on DNS 
	  for both authority and resolution,
	  and therefore have the same
	  vulnerabilities as http: URIs.
        -->


	<!-- 

        <p>
          This would be incompatible with Convention 1.
	  Clearly some agents, such as Web servers, would have to
	  respect the current rule, since under the proposal they
          might interpret an indirect URI in a manner not expected by any
          client.  (Consider the situation where the indirect URI is
	  documented as meaning
          a resource that is not the one
	  on the Web at the URI.)  So the first effect would be to
	  partition URI contexts into those 
          where indirect URIs are interpreted according to the current
          rule, and those where they're interpreted according to
          the proposal.  For
          example, an indirect URI as the target URI of an HTTP
          message would be interpreted according to the current rule,
          while an indirect URI occurring in an RDF document might be 
          interpreted according to the proposal.
        </p>

        <p>
          Some machine-actionable rule is desirable, since without one there
          is no reliable way to use <emph>any</emph>
          retrieval-enabled hashless URI u to
          refer to IR(u), and much currently deployed metadata, which
          relies on Convention 1, would fail.  There
          would always be the possibility that 
          u might be understood to be documented by IR(u) instead.
        </p>
 	-->


	<!-- 
        <label>information resource</label>
        <def>
          Roughly speaking, something that is appropriate as the
          subject of metadata.  See <bibref ref="generic"/>.
        </def>
 -->


	<!-- 
	<div3>
	  <head>'Hash URI' semantics is sensitive to media type</head> 
	  <p>
	  <emph>Criticism:</emph>
	    If there is content negotiation, session sensitivity,
	    etc., then the URI documentation that is intended and sought may
	    not be present in the 
	    <loc href="#representation">representation</loc>
	    that is accessed.
	    Worse, the URI documentation that is found may be incompatibly
	    different from the one that is meant.  For example, if
	    there is an application/rdf+xml 
	    <loc href="#representation">representation</loc>
	    and a text/html 
	    <loc href="#representation">representation</loc>,
	    then the former may document the
	    URI as naming an earthquake, while the latter may
	    document it as naming an HTML element.
	  </p>

	  <p>
	    <emph>Response:</emph>
	    The answer to this objection is that a server that wants
	    to avoid risking such confusion shouldn't do this.  A
	    server should 
	    either avoid content negotiation completely, or if it must
	    do CN, it should make sure that the URI is documented
	    in all 
	    <loc href="#representation">representations</loc>,
 	    and in the same way in all of them.
	  </p>

	  <p>
	    At present the only media type registration that directly supports
	    documenting 'hash URIs' in arbitrary ways is
	    application/rdf+xml.  Since this media type has no
	    human-friendly presentation and is not enabled for XSLT,
	    many providers (e.g. FOAF, dx.doi.org) use CN between
	    HTML and RDF so that access in a browser
	    delivers information that is useful to a human.  
	    E.g. if you access FOAF without
	    special CN parameters you will not get discoverable URI
	    documentation for
	    its non-element fragids.
	  </p>

	  <p>
	    The advent of RDFa, which should eliminate the need for HTML/RDF
	    content negotiation, may create an
	    opportunity to smooth this inconsistency over.
	  </p>

	  <p>
	    [Draft note: Talk about what RDF Concepts says - in RDF, the
	    meaning depends only on an application/rdf+xml
	    representation (possibly a hypothetical one).  This is
	    odd, but it has the virtue of removing any context
	    dependence i.e. dependence on the media type of
	    the representation that's actually retrieved.]
	  </p>

	  <p>
	    [Note: After this was written, Henry Thompson pointed out
	    that any media type specifying RDF graph equivalence and a
	    normative reference chain leading to the RDF 
	    Concepts recommendation supports RDF-style hash URIs.  So
	    N3, Turtle, Manchester syntax, etc. should be covered.]
	  </p>
 -->

	  <!-- 
	  <p>
	    [Draft note: See
	    <a href="http://www.dehora.net/journal/2007/10/19/fragged/"
	     >Bill de h&Oacute;ra's blog post "Fragged"</a>
	    and <a href="http://blog.iandavis.com/2007/11/17/fragmentation-reprise/"
	      >Ian Davis's post on fragids</a> and
	    AWWW 3.2.2 <bibref ref="webarch"/>.]
	  </p>
          -->
	  <!-- 
	</div3>
	  -->

	  <!-- 
	<div3 id="compete">
	  <head>Markup and scripts compete with URI documentation for use
	    of local identifiers</head> 
	  <p>
	  <emph>Criticism:</emph>
	    Local identifiers, in addition to admitting
	    URI documentation, are also used by markup (for example, anchor
	    names) and by scripts, for potentially conflicting purposes.
	  </p>
	  <p>
	    <emph>Response:</emph>
	    All three uses are under the control of the author of the
	    file.  It is only necessary to coordinate the three uses so
	    that no local identifier is used for more than one purpose.
	  </p>
	</div3>
          -->


        <!-- 
        <p>
          Variant use case: Same as above, but instead of the earthquake, the
          referent of the URI is to be an 
          information resource that is not accessible on the Web, or at least
          not at any URI known to Alice.  The definition might describe where
          the information resource might be found, and other aspects
          such as bibliographic metadata (author, title, etc.) or SHA1 hash.
        </p>
        <p>
          Variant use case: Same as above, but instead of the earthquake, the
          referent is to be an 
          information resource that <emph>is</emph>
          Web accessible, via a URI known to Alice.
          The definition that Alice
          writes explains that the term is to refer to that information
          resource.  That is, there are <emph>two</emph> information
          resources at play here, one carrying the definition and one
          that's a subject of the definition.  It's important in this
          case to make sure that metadata can be written about either
          information resource.
          (In this situation, which is common in the publishing industry and
          digital archives, Alice's definition is often
          called a "landing page".)
        </p>
	-->




	<!-- 
	<p>
	  [Draft note: HH: I think the answer to this should be a strong
	  "No" and should be 
 	  discouraged, rather than heavily described as currently is. I feel too
 	  much space is used on this example.  JAR: Removed most of
	  it; is it OK now?]
	</p>
        <p>
          Each URI scheme, e.g. mailto:, http:, ftp:, and so on, has
          its own URI scheme registration, accessible via a registry
          maintained by IANA
          <bibref ref="rfc4395"/>.
          A URI scheme registration defines the
          meaning of URIs using that scheme either directly or by
          delegation to additional defining mechanisms.
          For example, the registration for the data: URI scheme
          fully explains the meaning of every URI that uses that
          scheme, while the mailto: scheme registration explains 
	  that each URI refers to a particular mailbox, understood
          relative to the domain name system and the mailbox
          assignments made by each particular host.
        </p>

        <p>
          URN namespaces <bibref ref="rfc3406"/>
 	  work in a similar way.  Each namespace has a
          registration document that is formally reviewed through IETF and
          placed on file
          with IANA.
        </p>
 -->



	    <!-- 
	    <a href="http://blog.iandavis.com/2007/11/17/fragmentation-reprise/"
	      >Ian Davis:</a>
	    The meaning of a hash URI "depends on how you access it, which
	    is nuts. Its as though a word has different meanings
	    depending on whether you read it in a book or have it read
	    out to you." 
	      &mdash; JAR: I think he's talking about the situation where
	    there is content
	    negotiation <emph>and</emph> there is inconsistency between the
	    variants.  The more common problem with content negotiation is
	    that there is no way to know ahead of time which variant 
	    has the definition at all, and thus which one to
	    request in content negotiation.
	  <p>
	    Ian points out that RDF Concepts says:
	    "a URI reference in an RDF graph is treated with respect to
	    the MIME type application/rdf+xml [RDF-MIME-TYPE]. Given an
	    RDF URI reference consisting of an absolute URI and a
	    fragment identifier, the fragment identifer identifies the
	    same thing that it does in an application/rdf+xml
	    <loc href="#representation">representation</loc>
	    of the resource identified by the absolute
	    URI component."
	    and that this appears to conflict with webarch.
	    [Draft note: TBD: try to figure out what is going on here.]
	  </p>
 	    -->

<!-- 
        <label>fixed information resource</label>
        <def>
          A document, image, sound recording, or
          other replicable entity as encoded in
          an octet sequence, together with
          optional brief annotations, such as media type and language,
          intended to guide the interpretation of the
          content.  There is no requirement that a given fixed
          information resource is accessible via any URI.
        </def>
 -->
	  <!-- 
	  <p>
	    The Chicago use case is an extreme version of this - the
	    entity providing access to the Chicago document (Alice) does not
	    even care about providing URIs that refer to Chicago; it is
	    someone having no control over retrievals using the URI (Bob)
	    who needs a reference to Chicago.
	  </p>
	  -->

	<!-- 
        <p>
	  [ADI('http://example/about-eq018', 'http://example/eq018') ?]
        </p>
	 -->

<!-- 
      <div2 id="suffix">
        <head>'Hash URI' with fixed suffix</head> 
        <p>
          This idea attempts to address one reason for using 'hashless'
          URIs instead of local identifiers.  Suppose you want to
          combine a large number of local names a, b, c, ... into a
          namespace.  The usual solutions would be to write
          'http://example/namespace#a' (a "hash namespace") or 
          'http://example/namespace/a' (a "hashless namespace").
        </p>
        <p>  
          In the "singleton fragid" approach one would write
          'http://example/namespace/a#' (a null local identifier) or
          'http://example/namespace/a#_', using a fixed suffix for every
          URI and varying the part between the namespace prefix and
          the suffix.
        </p>
        <p>
          As in the 303 approach, each URI in the namespace would (or
          could) have its own document, providing a definition for that
          single URI rather than for every URI in the namespace.
        </p>
        <p>
          The choice of fixed local identifier (null, "_", or
          something else) is largely a matter of taste.
        </p>
        <p>
          A null fragid precludes the use of qnames to abbreviate such URIs.
          (In particular it would not be possible to use them as
          predicate names in RDF/XML.)
          However, SPARQL, Turtle, and RDFa 
          are being extended to admit CURIEs that include #, making this a
          newly attractive option.
        </p>

        <p>
          To address the "hash gets lost" problem we could explore
          heuristics to automatically replace 'http://example/eq018' with
          'http://example/eq018#' (or 'http://example/eq018#_') when needed.
        </p>
      </div2>
     -->





<!-- Unused stuff. -->


      <!-- 
        <p>
          With any of these methods other than retrieval-enabled hashless URIs,
          the URI may refer to anything at all, including an
          information resource.  [COMMON MISUNDERSTANDING, not sure
          where this goes in the document.
          <loc href="http://lists.w3.org/Archives/Public/public-lod/2010Nov/0249.html"
          >This email</loc>, for example, gets it wrong; the question is not
          IR vs. NIR, it's about which thing the URI is to refer to,
          IR(u) vs. FV(u).]
        </p>
 -->


<!-- 
        <label>version (of an information resource)</label>
        <def>
          A fixed information resource associated with an information
          resource is a version of the information resource.
          <footnote>
            "Version of" as used here is similar to one of the senses in
            which "representation of" is used in
            discussions of Web architecture.
            We have two reasons to avoid "representation".  One is that
	    "representation" has been used in different ways by
            different parties and it seems wise to avoid risk of
            misinterpretation.  Another is that our versions have to be
            the same kind of thing as the information resource that
            they are versions of, so that they can have metadata.
 	    In most treatments of Web
            Architecture, representations are considered very
            different from information resources and
            do not have the same sorts of properties as information
            resources.
 -->
	    <!-- 
            due to the different ways in which Roy Fielding (in his
            REST work) and Tim Berners-Lee [citation needed] use the word.
            It seems better to avoid the word entirely and use a new
            word to specifically mean the Tim Berners-Lee sense.
 	    -->
	    <!-- 
          </footnote> -->

          <!-- 
          [Cf. TimBL 'fixed resource.']
          [Searching for a new term since Nathan and JAR don't 
          like "representation".
          Consider: version, content+, continent, malcontent, discontent, epresentation, 
          represen-tation, specific information resource, simple information
          resource, fixed resource, specialization.]
          [Consider trying to write the document without any need for
          this word!]
          -->
	  <!-- 
        </def>
     -->
	  <!-- 
          Because of the controversy around this term we will not
          attempt to define it, but rather say that: (1) An
          information resource
          is associated with a set of fixed information 
          resources (its versions). (2)
          An information resource is "similar" to its versions in that
          metadata that applies to each version of an
          information resource applies
          to the information resource itself, and vice versa.
 	  -->

	<!-- 
        <label>FV(u)</label>
        <def> 
          FV(u) is shorthand for the meaning of a URI u
          according to the definition of u in (a version of)
          the information resource IR(u).  For
          example, if IR('http://example/p16') says that 
          'http://example/p16' refers to Alice's canoe,
          then FV('http://example/p16') is Alice's canoe.
          ('FV' stands for 'take at face value'.)
        </def>
 -->


      <!-- 
      <div2>
        <head>Alternative URI schemes and/or URN namespaces</head> 
        <p>
          The purpose of URI scheme registration is to create new
          classes of URIs with meanings specified by the
          registration.  That is, the registration is a definition
          (perhaps partial) of the meanings of the URIs having that
          scheme.
        </p>
        <p>
          One could derive a URI that refers to a canoe from a URI
          whose retrievals yield a definition (of the derived URI) by
          adding a particular prefix to
          the URI for the definition, e.g. fv:http://example/about-p16.
        </p>
        <p>
          The tdb: scheme is close to this, but it covers the
          primary-topic-of use case, not the mint-a-term one; the two
          would not have the same behavior.
        </p>
        <p>
          The process for registering a URI scheme is documented in
          RFC 4395, and the process for registering a URN namespace
          is documented in RFC 3406.
        </p>
        <p>
          A problem shared by all non-http URIs is that they won't "work" in
          unmodified browsers.  (But "it's not about 
          browsers," cries Mark Wilkinson.)
        </p>
      </div2>
       -->



        <!-- 
        <p>
          Variant use case: Same as above, but Bob's bibliography
          includes a number of RDF 
          documents, and his metadata includes information relevant
          for making use of those RDF documents.
        </p>
        <p>
          Variant use case: Same as above, but instead of being a 
          person, Bob is a tool that
          is charged with updating all the documents on a Web site with
          license metadata.
        </p>
 	-->

      <!-- 
      <p>
        [Terminology option: Maybe "metadata subject" instead of
        "information resource"??]
      </p>
      -->


        <!-- 
        <p>
          (Why would one be dealing with both kinds of statements 
          at the same time?  Well, the two groups of statements
          might be inserted as RDFa into a
          single HTML document by different tools, or by different
          modules in a content management system.  Or the statements
          might be combined in a single triple store from multiple sources.)
        </p>

        <p>
          (Another way to make sense of this approach is to say that
          URI u refers to IR(u),
          but predicates such as foo:mass and foaf:name have their
          domains expanded 
          to include information resources, and IR(u) is "coerced" to
          FV(u) as needed in order for the predicates to make sense.
          In this view it is the predicates that are the chimeras, not
          the entities they apply to.)
        </p>
        -->

	<!-- 
        <p> 
          Second,
          if the definition of 'http://example/p16'
	  happens to specify an information resource
          other than IR('http://example/p16'), we will end up
          with incorrect statements, since metadata for two distinct
          information 
          resources would be attributed to a single entity.
          Consider, for example, the case where copyright license A applies to
          IR('http://example/p16') and copyright license B applies to
          FV('http://example/p16').  This would lead to both licenses
          being applied to CH('http://example/p16'), which would be
          impossible to interpret correctly, as neither subject is
          such that both licenses apply to it.
          We would have to obtain general agreement that the
          definition at IR('http://example/p16')
          must not
          lead to the URI being understood to refer to
          any information resource other than
          IR('http://example/p16') itself.
        </p>
 	-->

      <!-- 
      <p>
        [Draft note: still thrashing on terminology "definition"
        vs. "documentation" vs. "account"]
      </p>

      <p>
        Languages such as OWL and RDF that
        pervasively use URI-based vocabularies require that
        one be able to refer [mean?], in those languages, to things one
        has to refer to,
        in such a way that the reference will be understood by someone
        encountering the reference.  These references either are URIs
        or are built on URIs, so the problem of referring
        reduces to that of either knowing, or influencing, the way that
        readers will interpret URIs referentially.
      </p>
      -->

	<!-- 
        <p>
          (Any of these methods may be used to document a URI that refers
          to something for which a more specialized generic 
	  definition already exists, for
          example a mailbox (for which there is the mailto: URI scheme
          registration document) or
          an information resource (for which there are the
          registrations for http:, ftp:, gopher:, data:, and so on).
          In theory, an information 
          resource could be specified in a URI definition by
          spelling out the details of its versions, perhaps in RDF.
          However, this is ordinarily not necessary, since usually the
          specialized naming system can be used.)
        </p>
 	-->

	<!-- 
        <p>
          Most URI scheme registrations, such as that for http:, only
          provide a partial definition, and other sources of
          information must be consulted in order to understand a
          particular URI using that scheme.  For example, to
          understand the meaning of an http: URI, one generally needs
          to request a retrieval using
          it (and even then one learns only a single version of the
          information resource; see
          <specref ref="ir-ref"/>).
        </p>

	<example>
	  <head>Defining a URI by registering a URI scheme</head>
	  <p>
	    To document a URI as referring to Mount Everest, Alice
	    invents a new URI scheme, say mountain:, and publishes a
	    registration for it via IETF and IANA that says that
	    'mountain:peakxv' refers to Mount Everest.  Bob, on
	    encountering 'mountain:peakxv', checks the IANA URI scheme 
	    registry
	    (which he knows about because the registry is specified
	    by IETF),
	    obtains a link to Alice's 
	    registration for the 'mountain:' scheme, 
	    reads the registration, and is enlightened.
	  </p>
	</example>

  	<p>
	  Practically
          speaking, this approach is very challenging due to the
          rigor of the review process for URI scheme registrations
	  (see <bibref ref="rfc4395"/>).
          Furthermore, Web clients will not understand the new URI
          scheme, making the definition of the URI
          effectively inaccessible for most agents encountering the URI,
	  at least until the mountain: scheme becomes as well known as
          the http: scheme.
        </p>
 -->

	<!-- 
        <label>ADI(u,v)</label>
        <def> 
	  The meaning of URI v, as documented in IR(u).
	  [not sure we need this one.]
        </def>
	 -->
        <!-- 
        Methods that use the
        Web protocol for the probe URI (HTTP, in this case) in order to determine
        what the probe URI means are called "follow
        your nose" (FYN) methods.   [Henry doesn't like this usage.]
         -->

      <!-- 
      <p> [Draft note, Alan:
But this suggests that you introduce earlier: "sentences, phrases"
etc, as the scope of URI use you are interested in.
I see you define "phrase" later. With this audience they will read
their own meaning. So either use terms outside their repertoire or use
typography to distinguish and warning at top of document to read
carefully.] </p>
 -->
      <!-- 
	<ednote>
	  <date>2011-06-09</date>
	  <edtext>
        [Jeni Tennison: "I think you could do with making more of (ie explaining
        in more detail up front) the criteria against which the
        various alternatives are judged."]
	  </edtext>
	</ednote>
 -->
