<?xml version="1.0" encoding="UTF-8"?> 
<?xml-stylesheet type="text/xsl" href="../../../doc/xmlspec.xsl"?>
<!DOCTYPE spec SYSTEM
"http://www.w3.org/2002/xmlspec/dtd/2.6/xmlspec.dtd" [ 
<!--
================================================================
--> 
<!ATTLIST spec xmlns:xlink CDATA #IMPLIED>
<!ENTITY mdash " &#8212; "> 
<!ENTITY epsilon "&#949;"> 
<!ENTITY Oacute "&#211;"> 

<!ENTITY draft.day "25"> 
<!ENTITY draft.monthname "June"> 
<!ENTITY draft.year "2011">
]>

<!-- Larry doesn't like the time dependence of it all.
  Ashok wonders how this relates to Web Linking.

  Larry: my concern about hash and 303 is that the make meaning depend
  on deployment of services infrastructure, which cost power and money
  to maintain, and that to mean something you shouldn't have to have a
  foundation

  Larry: i'd like to see in section 5 the issue of requirement for
  long-term availability of URI data or services (303 or hash)

  HT: You might want to reference XRI and a whole bunch of other stuff...

  Jeni: as we highlighted at the F2F, as we come to use named graphs
  to enable us to describe the provenance/trust/temporal coverage
  characteristics of particular sets of information, it's going to
  become really important to keep the notions of document/named graph
  and the topic of that document/named graph separate in some way. I
  think you could make more of a case for that, to counter the view
  that information about documents isn't really important.
     - added mention of provenance as an app that needs this
-->

<!-- 
DB:
 16. Similarly, I think it would be helpful if each proposed solution
 explicitly stated what Alice, Bob and Carol should do, according to that
 solution: "According to this approach, in scenario 2.1, Alice
 should . . . Bob should  . . . Carol should . . . ".  I first noticed
 the need for this in sec 3.4 (LSID), perhaps because I don't know the
 details of how LSID works.
 -->

<!-- 
Alan R: - feeling at atm - just before glossary, is that good content but
better presentation order needed.
In intro make clear that it is use of URI is in sentences. (there are
other points to move there, I think).
 -->

<!-- Providing and discovering URI definitions -->

<spec xmlns:xlink="http://www.w3.org/1999/xlink" w3c-doctype="wd" role="editors-copy"> 
  <header>

    <title> Providing and discovering definitions of URIs
    </title>

  <!-- 
    <w3c-designation>http://www.w3.org/TR/2009/WD-hash-in-url-20090415/</w3c-designation> 
  -->
    <w3c-doctype>Editor's Draft</w3c-doctype> 
    <pubdate> 
      <day>&draft.day;</day>
      <month>&draft.monthname;</month> 
      <year>&draft.year;</year>
    </pubdate> 

    <publoc> 
  <!-- 
      No stable URI for this version.  When citing please specify date
      given above.
  -->

      <loc href="http://www.w3.org/2001/tag/awwsw/issue57/20110625/" >
        http://www.w3.org/2001/tag/awwsw/issue57/20110625/
      </loc>
    </publoc>

    <prevlocs>
      <loc href="http://www.w3.org/2001/tag/awwsw/issue57/20110531/" >
        http://www.w3.org/2001/tag/awwsw/issue57/20110531/
      </loc>
    </prevlocs>

    <altlocs>
      <loc role="xml" href="issue57.xml"
           xlink:type="simple">XML</loc>
    </altlocs>
    <latestloc> 
      <loc href="http://www.w3.org/2001/tag/awwsw/issue57/latest/" 
        >http://www.w3.org/2001/tag/awwsw/issue57/latest/</loc> 
    </latestloc>  

    <authlist> 
      <author>

        <name>Jonathan A. Rees
        </name> 
        <email href="mailto:rees@mumble.net"
	   >rees@mumble.net</email> 
      </author>

    </authlist> 
    <status> 
      <p>
        This report has been developed by the 
        <loc href="http://www.w3.org/2001/tag/awwsw/"
          >AWWSW Task Group</loc>
        of the
        <loc href="http://www.w3.org/2001/tag/"
          >W3C Technical Architecture Group</loc>
        in order to provide background material for further discussion
        among those affected by this architectural question, and to help drive
        TAG issue 57 <bibref ref="issue-57"/> to a conclusion.
	The task group's public discussion list is
	public-awwsw@w3.org
        (<loc href="http://lists.w3.org/Archives/Public/public-awwsw/" 
          >archives</loc>).
      </p> 

      <p>
        Earlier versions of this document have been reviewed by the
        task group and the TAG but this version has not.
	The content of this version is the sole responsibility of the
        editor.	<!-- 
	, and has not been formally endorsed by the task group
        or the TAG. -->
      </p>

      <p>
        Publication of this draft
        does not imply endorsement by the W3C Membership. This is
        a draft document and may be updated, replaced, or obsoleted by
        other documents at any time.
      </p> 

      <p>
	<!-- 
        Please send comments on this
        document to the editor at
	<loc href="mailto:rees@mumble.net" 
	 >rees@mumble.net</loc>.
 	-->
	Please send comments on this
	document to the publicly archived TAG mailing list 
	<loc
	    href="mailto:www-tag@w3.org">www-tag@w3.org</loc>
	(<loc href="http://lists.w3.org/Archives/Public/www-tag/"
	   >archive</loc>).
      </p>

      <!-- 
      <p>
        Changes expected for the next version of the document include:
        make the problem of
        metadata incompatibility much more prominent in abstract.
      </p>
 -->

    </status> 

    <abstract> 
      <p>
        The specification governing Uniform Resource Identifiers
        (URIs) <bibref ref="rfc3986"/> allows URIs to mean anything at all,
        and this unbounded flexibility is exploited in
        a variety of contexts, notably the Semantic Web and Linked Data.
        To use a URI to mean something, an agent (a) selects a URI,
        (b) provides a definition of the URI in a manner that
        permits discovery by agents who encounter
        the URI, and (c) uses the URI.  
	Subsequently other agents may not only understand the URI (by
        discovering and consulting the definition) but may also use
        the URI themselves.
	<!--  redundant:
	As long as the definition remains
        discoverable, the URI may then be used and understood by other
        agents.
        or,
        As long as the definition remains discoverable, agents
        encountering the URI will be able to understand it [to the
        extent that the definition is helpful].
         -->
      </p>
      <p>
        A few widely known methods are in use to help agents provide
        and discover URI definitions,
        including RDF fragment identifier resolution and the HTTP 303
        redirect.  
        Difficulties in using these methods
        have led to a search for new methods that
        are easier to deploy, and perform better,
        than the established ones.  
	However, some of the proposed methods introduce new problems, such
        as incompatible changes to the way metadata is written.
	This report
        brings together in one place information on current and
        proposed practices, with analysis of benefits and shortcomings
        of each.
      </p>
      <p>
        The purpose of this report is not to make recommendations but
        rather to initiate a discussion that might lead to
        consensus on the use of current and/or new methods.
      </p>
    </abstract> 

    <langusage> 
      <language id="en-US">English</language> 
    </langusage>

    <revisiondesc> 
      <p>
        <ulist> 
          <item>
            <p>$Id: issue57.xml,v 1.2 2011/06/25 14:41:57 jrees Exp $
            </p>
          </item>          
        </ulist> 
      </p> 
    </revisiondesc> 
  </header>

  
  <body> 
    <div1>
      <head>Introduction
      </head>

<p><emph>This is an old issue, and people are tired of it.  
&mdash;Sandro Hawke, January 2003</emph> 
<bibref ref="disambiguating"/></p>

      <p>
        In any kind of discourse it is very useful for an agent to be
        able to provide a definition of a term, in such a way that other agents
        can discover and use that definition in order to make sense of
        utterances that use that term, and to compose new ones.
      </p>

      <example>
	<head>Definition discovery</head>

	<graphic source="discovery.png"
		 alt='Definition of "EQ 018"'/>

	<p>
	  Suppose that Alice, in
	  communication with Bob, uses
	  the term "EQ 018" to mean
	  the Loma Prieta earthquake, as in "Alice was in the laboratory
	  during EQ 018".  If Bob does
	  not know what "EQ 018" means, he will have to find out. He
	  might be able to ask Alice directly, although  
	  this may be impossible, as Alice might be too busy, or
	  otherwise unavailable.
	  Lacking that option he does some research, consulting
	  a dictionary or similar resource (reference book, database, 
	  search engine)
	  in order to obtain the 
	  explanation of Alice's use of the term "EQ 018".
	</p>

	<!-- 
	<p>
	  The essential idea is that there are one or more methods
	  available to Bob by which he can discover 
	  bits of writing that explain what 
	  what Alice
	  means by "EQ 018".
	</p>
 	-->

      </example>

      <p>
        In this report, the terms to be defined are assumed to be
	URIs.  URIs can be used 
	to mean all sorts of things
	in many different technical contexts.  Contexts of 
	special interest to this report are
	those processed by machine,
 	including the RDF and OWL family of languages.  The question
	may appear to 
	be limited to RDF and its derivatives, but to the
	extent that there is supposed to be a single 
	meaning for each URI common to RDF and Web architecture
	<bibref ref="webarch"/>, the issue transcends RDF.
      </p>

      <p>
        The nature of definitions need not concern us here; many forms
        are familiar, including translation between
        languages (e.g. providing an English or Spanish phrase equivalent to a
        URI), descriptions (the URI refers to an entity possessing
        some set of properties), explanation by example, axiomatic
        method, and so on.  Also
        not of concern here are the many ways in which
        meaning can fail as a result
        of <emph>what</emph> a definition says or doesn't say about the
        URI in question, or the particular way in which a URI is
        used.  Our concern is only with 
        the method by which definitions are conveyed, and with meaning
        only to the extent the method impinges on interpretation.
      </p>

      <p>
        Definitions are typically carried in documents.  No
        assumptions are made about what else might be in such a
        document; there could be additional related information,
        definitions of other URIs, and so on.  Nor is it important
        here that a definition be delimited or set off from the other
        information in the document.  As in an encyclopedia, the
        definition part blurs into the other-information parts of the
        document.
      </p>

      <p>
        When the term to be defined is a URI,
        discovery methods
        include, in addition to those already mentioned, network
        protocols such as HTTP that involve the URI as a protocol element.  
        <!-- 
        Methods that use the
        Web protocol for the URI (HTTP, in this case) in order to determine
        what the URI means are called "follow
        your nose" (FYN) methods.
         -->
      </p>

      <!-- 
      <p> [Draft note, Alan:
But this suggests that you introduce earlier: "sentences, phrases"
etc, as the scope of URI use you are interested in.
I see you define "phrase" later. With this audience they will read
their own meaning. So either use terms outside their repertoire or use
typography to distinguish and warning at top of document to read
carefully.] </p>
 -->

      <p>
        Definition discovery is similar to Web dereference in that in
        both cases one starts with a URI and ends with a document.
	The two must not be confused, however, since dereference often
        yields a document that either does <emph>not</emph> define the URI
        or is not recognized as doing so.
	At present, by convention, a dereferenceable absolute URI
        refers to the information resource 
	<loc href="#on-web-at">on the Web at</loc>
	that URI 
 	(see <bibref ref="ir"/>),
 	independent of anything that the information resource
	says about what the URI means.
      </p>

      <p>
	The reason we define definition discovery methods is 
	interoperability: so that there is agreement on how each URI
 	is to be understood.
	In principle, we only need consensus on methods such as the ones
	surveyed here for URIs
	that are to be shared widely.  If 
	agents in one community never use the URI in communication with
	agents in another community, then it is OK for the URI
	to have
	distinct senses in the two communities, and there is no
	problem to be solved.  Each community can use the URI in its
	own way, and there will be no confusion.
      </p>

      <p>
        The operative word here is "if".  Isolation is fragile and
        means lost opportunities for synergy and unintended reuse.  All
        the arguments in favor of a World Wide Web, which depends on the
        global nature of the URI vocabulary, apply here.
      </p>

      <p>
        This report presents discovery methods in current use,
        reports some 
        criticisms of them, and describes some additional discovery methods that
        have been proposed to address the criticisms.
      </p>

      <!-- 
      <p>
        [Draft note: Maybe talk in the introduction about alternatives
        to defining a URI: using
        non-URI phrases and syntactic sugar (these used to be sections).
	Discussion currently relegated to <specref ref="ddi"/>. ]
      </p>
     -->

      <div2>
        <head>Success criteria</head>
	<p>
	  The ideal definition discovery method would have the
	  following properties:
	</p>
	<ol>
	  <li>
	    <emph>
	      Simple.
	    </emph>
	    Having too many options or too many things to remember makes
	    discovery fragile and impedes uptake.
	  </li>
	  <li>
	    <emph>
	      Easy to deploy on Web hosting services.
	    </emph>
	    Uptake of linked data depends on the technology being
	    accessible to as many Web publishers as possible, so
	    a method should not require control over Web server behavior that
	    is not provided by typical hosting services.
	  </li>

	  <li>
	    <emph>
	      Easy to deploy using existing Web client stacks.
	    </emph>
	    Discovery should employ a widely deployed network protocol
	    in order to avoid the need to deploy new protocol stacks.
	  </li>

	  <li>
	    <emph>
	      Efficient.
	    </emph>
	    Accessing a definition should require at most one network
	    round trip, and definitions should be cacheable.
	  </li>

	  <li>
	    <emph>
	      Browser-friendly.
	    </emph>
	    It should be possible to configure
	    a URI that has a discoverable definition
	    so that 'browsing' to it yields information
	    useful to a human.
	  </li>

	  <li>
	    <emph>
	      Compatible with Web architecture.
	    </emph>
	    A URI should have a single agreed meaning globally,
	    whether it's used as a protocol element, hyperlink, or name.
	  </li>
	</ol>
	<p>
	  It is not certain that all of these goals can be met
	  simultaneously.
	</p>
      </div2>


    </div1> <!-- end introduction -->


    <div1>
      <head>Use case scenarios</head> 

      <p>
        Use cases need to be presented as being independent of any
        particular solution to be used, in order that the solution space
        can be explored without bias.  This leads to some
        frustrating vagueness in the following, but the vagueness is
        intentional and necessary.
      </p>

      <div2>
        <head>Choosing a URI, providing a definition of the URI, using 
          the URI</head> 
        <p>
          Alice wants to refer to a particular earthquake.
          Alice "mints" a new URI (one that is not yet in use) with the
          purpose of using that URI to refer to the earthquake.  Alice
          publishes a document containing a definition of the URI, i.e.
	  a document that
          would lead a reader to understand that the URI refers to the
          earthquake.
        </p>
        <p>
          Bob then learns of Alice's URI and its definition, and uses
	  the URI in a document
          of his own.
        </p>
        <p>
          Subsequently Carol encounters Bob's document.  Wanting to
          know what the URI means, she 
          is led somehow to Alice's published definition, which she
          reads.  She is enlightened.
        </p>

	<p>
	  Any method for implementing this use case would need to explain:
	  what kind of URI Alice should use (syntactic constraints);
	  where and how Alice should publish the definition so that it
	  can be found;
	  and how Carol might come to discover Alice's definition, given
	  the URI.
	</p>

      </div2>

      <div2 id="chicago">
        <head>Using a document as a definition by reference to its
          primary topic</head>  

	<ednote>
	  <date>2011-04-14</date>
	  <edtext>
	    Consider dropping this use case, and explain the
	    situation in some less prominent way.
	    The only evidence we have for this situation is from 
	    <loc href="http://lists.w3.org/Archives/Public/semantic-web/2011Apr/0001.html"
	     >Hugh Glaser's message</loc>,
	    and most of the discussion in this document does not apply
	     to this case. 
	  </edtext>
	</ednote>

        <p>
          Bob desires to refer to Chicago.  
          He finds a Web page 
          on the Web at 'http://example/about-chicago' (provided by,
          say, Alice) that consists
          of a description of Chicago, and wants to use it for the
          purpose of referring to Chicago.  He chooses
          a URI and associates it with Alice's Web page 
          in such a way that Bob's URI will be understood as referring to
          Chicago.
        </p>
        <p>
          Carol encounters Bob's URI, is led to 'http://example/about-chicago'
          and thence to Alice's description of
          Chicago, and then somehow understands that Bob's URI is
          meant to refer to Chicago.
        </p>
        <p>
	  Any method for implementing this use case would need to
	  explain: what are the syntactic constraints on the URI Bob 
	  chooses; what
	  Bob needs to do to associate his URI with the document about
	  Chicago; and how Carol comes to discover and use that
	  association.
        </p>
        <p>
          (This differs from the previous use case in that the
	  document about Chicago was
          not written with the purpose of defining Bob's URI.  In fact 
          Bob's URI doesn't even occur in it.  Rather than look 
	  in the document for a
          definition mentioning Bob's URI, Carol must determine the
          topic of the document and take the topic as the meaning of
          Bob's URI.)
        </p>
      </div2>

    </div1> 


    <div1>
      <head>General definition methods in current use</head>

        <p>
          This section describes currently accepted methods for
          providing and discovering definitions of URIs.
        </p>

      <div2 id="colocate">
        <head>Colocate definition and use</head> 
        <p>
          One way to lead someone encountering a URI to a definition
	  of the URI is to
	  make sure that the definition of the URI occurs in
	  each document in which the URI occurs.
	  This makes the definition easy to find, since anyone who
	  encounters the URI will have in hand
	  the definition that they need.  
	  The form of the URI in this case is arbitrary.
        </p>
        <p>
	  This method treats URIs similarly to blank nodes in RDF, which
	  have to stay close to their own definition, since they
	  are scoped to a graph.  An example of the application of
	  this approach would be the use of a 
	  URI in an OWL ontology file that defines that URI.
        </p>
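
        <p>
          As a minimal sketch of this method, the following RDF/XML
          file both defines a URI and uses it, so that a reader of the
          file has the definition in hand.  The URIs under
          'http://example/' and the property 'ex:occurredDuring' are
          hypothetical and chosen only for illustration.
        </p>
        <eg><![CDATA[
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:ex="http://example/terms/">

  <!-- Definition: states what the (hypothetical) URI refers to. -->
  <rdf:Description rdf:about="http://example/terms/eq018">
    <rdfs:label>Loma Prieta earthquake</rdfs:label>
    <rdfs:comment>The earthquake that struck the San Francisco
      Bay Area on 17 October 1989.</rdfs:comment>
  </rdf:Description>

  <!-- Use: the same file refers to the URI it has just defined. -->
  <rdf:Description rdf:about="http://example/terms/lab-visit">
    <ex:occurredDuring rdf:resource="http://example/terms/eq018"/>
  </rdf:Description>

</rdf:RDF>]]></eg>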

        <p>
	  <emph>Criticism:</emph>
	  In RDF, this method is fragile in the same way as are blank nodes,
	  because use and definition can get
	  separated, e.g. when uses of the URI are deposited into a
	  triple store and then retrieved by a query.  
	  Carrying a definition around with a reference 
	  does not help in 
	  the common case where an out-of-context reference is needed (as
	  one would want in, say, a Semantic Web).
        </p>
      </div2>


      <div2 id="cite-source">
        <head>Point to the document that contains the URI's definition</head> 
	<!-- 
	<p>
	  [Draft note: HH says this section title is not specific
	  enough. "Link to a URI with the definition using a special 
	  kind of link"?]
	</p>
 	-->
        <p>
          When using a URI, provide,
	  again in the document in which the URI occurs,
          a reference to a document that carries a definition of the URI.  
          This is the approach taken by OWL; the document containing
          the URI is related to the one from which the definition of
          the URI should be obtained via the owl:imports 
	  relation.<footnote>More precisely, the definition will be
	  found in the imports closure of the
          document containing the URI.</footnote>
        </p>
        <p>
          The rdfs:isDefinedBy property might also be used for this
          purpose, but it probably isn't.
        </p>
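
        <p>
          The owl:imports arrangement described above might look
          roughly as follows.  This is an illustrative sketch only:
          the document URIs, the URI being used, and the property
          'ex:occurredDuring' are all hypothetical.
        </p>
        <eg><![CDATA[
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns:ex="http://example/terms/">

  <!-- This document points at the document that carries the
       definition of the URI used below. -->
  <owl:Ontology rdf:about="http://example/bob/report">
    <owl:imports rdf:resource="http://example/alice/earthquakes"/>
  </owl:Ontology>

  <!-- Use of a URI whose definition is expected to be found in
       the imports closure of this document. -->
  <rdf:Description rdf:about="http://example/bob/lab-visit">
    <ex:occurredDuring rdf:resource="http://example/alice/eq018"/>
  </rdf:Description>

</rdf:RDF>]]></eg>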

        <p>
	  <emph>Criticism:</emph>
	  Like the previous approach, this one is good so far as it
	  goes, but it suffers in similar ways.  The URI and the link to
	  its definition can get separated, or 
	  keeping the definition link close to the occurrence of the URI
	  may prove to be too difficult for applications.
        </p>

        <!-- 
        <p>
          Both of these properties beg the question in that
          they do not say how to figure out what the URI that is the
          target of owl:imports or rdfs:definedBy refers to. 
	  _____

          If the
          meaning of <emph>that</emph> URI had to be given by citing a source,
          there would be infinite regress.
        </p>
        -->
      </div2>

      <div2 id="new-scheme">
        <head>Register a URI scheme or URN namespace</head>

        <p>
	  In principle, one could create a new URI scheme or URN
	  namespace, in which case the registration document would
	  constitute a definition (although perhaps not on its own;
	  often there is delegation of some kind to other documents).
	  A recent example is RFC 5870 for URIs defined to name
	  geographic locations.  Another is
	  the definition of the URI about:blank,
	  which is in progress as of this writing.
	  A "tdb:" (thing-described-by) URI scheme has also been
	  proposed, 
	  [TBD: cite Masinter]
	  as has "xri:" for 
	  <a href="http://tools.ietf.org/html/draft-yevstifeyev-xri-uri-rsrv-00"
	   >"extensible resource identifiers"</a>
	  (n.b. xri: has been deprecated in favor of http: and Web Linking).
	  <!-- http://tools.ietf.org/html/draft-holsten-about-uri-scheme-06 -->
	  See <bibref ref="rfc4395"/> and <bibref ref="rfc3406"/>
	  for details.
	</p>

	<p>
	  <emph>Criticism:</emph>
	  The review process for new URI schemes and URN namespaces is
	  probably too stringent for all but a very few definition
	  discovery applications.
	  There would likely be poor protocol support for discovering
	  definitions in a new URI scheme or URN namespace.  It is
	  possible, manually, to look up a scheme or namespace in the
	  appropriate registry, but few client applications are able
	  to do this, and the resulting document is not machine
	  actionable in any standard way.  One could attempt to modify
	  all Web clients to understand the new scheme, but this would
	  be difficult.
	</p>
      </div2>

      <div2 id="use-lsid">
        <head>Use the LSID getMetadata() method</head>
	<!-- 
        <p>
          [Draft note: LSID is not exactly common - is this worthy of mention?
          Maybe rule out all non-linked-data solutions up front?
          But it is used and I'd like some of those users to read this report.]
        </p>
 	-->
        <p>
          A URN namespace for which there is a general definition method
          is the 'lsid' namespace.
	    <!-- 
	    <footnote
            >Unfortunately the 'lsid' URN namespace is not in the
            IANA registry.  Someone encountering an LSID may need
            to do a search in order to locate the LSID specification and
            consequently determine what the LSID means.
	    In addition each LSID contains an "authority" field
	    whose meaning is not assigned by the LSID specification,
            requiring even more research on the part of someone trying
            to understand an LSID.
	    </footnote> -->
          URIs beginning 'urn:lsid:' are called LSIDs <bibref ref="lsid"/>.
          LSIDs have an associated SOAP-based 
          protocol that has separate methods for dereference (getData)
          and discovery (getMetadata).
          According to the LSID specification,
          an LSID for which the getData method yields nonempty
          content refers to a 
	  <loc href="#representation">representation</loc>,
          while the LSID could refer to
          anything at all if getData yields empty content.  
          In the latter case the information yielded by the
          getMetadata method generally constitutes, or at least
          contains, a definition of the LSID.
        </p>

        <p>
	  For clients lacking an LSID protocol implementation,
	  HTTP/LSID gateways are available.
        </p>

        <p>
	  The LSID protocol improves on 303 redirects (see below)
	  in that only one
	  round trip is required to obtain a definition.
        </p>

        <p>
	  <emph>Criticism:</emph>
	  LSIDs rely on an unregistered URN namespace, calling their
	  consensus status into question and making them impossible to
	  understand through the usual chain of IETF URI specifications.
	  The LSID protocol itself is poorly deployed.
	  As currently used, LSIDs rely on DNS 
	  for both authority and resolution,
	  and therefore have the same
	  vulnerabilities as http: URIs.
	  LSIDs do not meet the "Browser-friendly" criterion.
	</p>

      </div2>

      <div2 id="hash">
        <head>'Hash URI'</head> 
        <p>
          With this method, the URI must be a 'hash URI', i.e. must
	  contain a hash character '#'.
	  (For historical reasons the part of the URI following '#' is
	  called the 
	  'fragment identifier', even when it is null.)
	  The definition of the URI
          is placed in the document on the Web at the URI that is the
	  pre-hash stem of the URI.
        </p>
	<example>
	  <head>'Hash URI'</head>
	  <graphic source="hash.png"/>
	</example>
        <p>
          The interpretation of a 'hash URI', say 'http://example/eq#eq018',
          depends (according to <bibref ref="rfc3986"/>) on  
	  the media types of 
	  <loc href="#representation">representations</loc>
	  of the information resource on the Web at its stem URI
          'http://example/eq'.
          For media type application/rdf+xml, the media type registration
 	  defers to the content of the
	  <loc href="#representation">representation</loc>
	  &mdash; that is, the 
	  <loc href="#representation">representation</loc>
          itself gets to arbitrarily define what the 'hash' URI 
	  means.<footnote>
          If
          IR('http://example/eq')
          (the information resource at URI 'http://example/eq')
	  has multiple
	  <!-- associated -->
	  <loc href="#representation">representations</loc>,
	  it is important that all
	  <loc href="#representation">representations</loc>
          provide definitions of every URI that needs one, and that
          corresponding definitions in different
	  <loc href="#representation">representations</loc>
	  be compatible with one another.
          (See <bibref ref="webarch"/> section 3.2.)</footnote>
        </p>
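
        <p>
          For illustration, an application/rdf+xml representation
          served at the stem URI 'http://example/eq' might define
          'http://example/eq#eq018' along the following lines.  The
          choice of properties and the wording of the description are
          assumptions made for this sketch, not something prescribed
          by the media type registration.
        </p>
        <eg><![CDATA[
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xml:base="http://example/eq">

  <!-- "#eq018" resolves against the base URI to
       http://example/eq#eq018, the 'hash URI' being defined. -->
  <rdf:Description rdf:about="#eq018">
    <rdfs:label>Loma Prieta earthquake</rdfs:label>
    <rdfs:comment>The earthquake that struck the San Francisco
      Bay Area on 17 October 1989.</rdfs:comment>
  </rdf:Description>

</rdf:RDF>]]></eg>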

        <p>
	  <emph>Criticism:</emph>
	  Using 'hash URIs' in this way retrofits definition discovery
	  onto an existing architecture intended for locating parts
	  (fragments) of documents.
	  As such the mechanism has some rough edges.
	  Some of the objections to the use of 'hash URIs' are
	  as follows.
        </p>

	<div3>
	  <head>'Hash URI' semantics is sensitive to media type</head> 
	  <p>
	    If there is content negotiation, session sensitivity,
	    etc., then the definition that is intended and sought may
	    not be present in the 
	    <loc href="#representation">representation</loc>
	    that is accessed.
	    Worse, the definition that is found may be incompatibly
	    different from the one that is meant.  For example, if
	    there is an application/rdf+xml 
	    <loc href="#representation">representation</loc>
	    and a text/html 
	    <loc href="#representation">representation</loc>,
	    then the former may define the
	    URI to name an earthquake, while the latter may
	    define it to name an HTML element.
	  </p>

	  <p>
	    <emph>Response:</emph>
	    The answer to this objection is that a server that wants
	    to avoid risking such confusion should not let the
	    definitions diverge.  A
	    server should
	    either avoid content negotiation (CN) completely or, if it must
	    use CN, make sure that the URI is defined
	    in all
	    <loc href="#representation">representations</loc>,
	    and in the same way in all of them.
	  </p>

	  <p>
	    At present the only media type registration that supports
	    defining 'hash URIs' in arbitrary ways is
	    application/rdf+xml.  Since this media type has no
	    human-friendly presentation and is not enabled for XSLT,
	    many providers (e.g. FOAF, dx.doi.org) use CN between
	    HTML and RDF so that access in a browser
	    delivers information that is useful to a human.  
	    For example, if you access FOAF without
	    special CN parameters you will not get discoverable definitions of
	    its non-element fragids.
	  </p>

	  <p>
	    The advent of RDFa, which should eliminate the need for HTML/RDF
	    CN, may create an
	    opportunity to smooth this inconsistency over.
	  </p>

	  <!-- 
	  <p>
	    [Draft note: See
	    <a href="http://www.dehora.net/journal/2007/10/19/fragged/"
	     >Bill de h&Oacute;ra's blog post "Fragged"</a>
	    and <a href="http://blog.iandavis.com/2007/11/17/fragmentation-reprise/"
	      >Ian Davis's post on fragids</a> and
	    AWWW 3.2.2 <bibref ref="webarch"/>.]
	  </p>
          -->

	</div3>

	<div3>
	  <head>The common 'hash URI' pattern fails with large namespaces</head> 
	  <p>When a large number of URIs are formed by combining a
	    fixed "namespace" prefix with many suffixes using hash as a
	    connector, there will be a single underlying document 
	    at the pre-hash URI that must
	    provide definitions of all of the large number of URIs.
	    This is an unacceptable performance hit for the server, the
	    network, and the client.  Absolute URIs don't have this problem
	    as the response can be specific to each URI.
	  </p>

	  <p>
	    <emph>Response:</emph>
	    The answer to this has been reported a number of times
	    <bibref ref="degraauw"/>.
	    For a set of namespace members a, b, c, ...
	    instead of using URIs
	  </p>
	  <eg>
  http://example/ns#a  http://example/ns#b  http://example/ns#c ...</eg>
          <p>
	    use URIs that look like
	  </p>
	  <eg>
  http://example/ns/a#_  http://example/ns/b#_  http://example/ns/c#_ ...</eg>
	  <p>
	    where _ is a common suffix of your choice.  (One might consider
	    an empty suffix:
	  </p>
	  <eg>
  http://example/ns/a#  http://example/ns/b#  http://example/ns/c# ...</eg>
	  <p>
	    but, while technically correct, this approach interacts 
	    <a href="http://www.w3.org/2001/tag/2011/06/07-minutes.html#item03"
	     >badly</a> with
	    many deployed tools.)
	  </p>
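	  <p>
	    As a rough sketch of this rewriting, the following Python
	    shows that each rewritten URI gets its own pre-hash document,
	    whereas the original URIs all share one.  The helper names
	    <code>per_member_uri</code> and <code>document_uri</code> and
	    the default suffix are assumptions made for illustration; they
	    are not part of any specification.
	  </p>
	  <eg><![CDATA[
```python
from urllib.parse import urldefrag

def per_member_uri(hash_uri, suffix="_"):
    """Turn 'ns#member' into 'ns/member#suffix' (hypothetical helper)."""
    base, member = urldefrag(hash_uri)
    if not member:
        raise ValueError("expected a URI with a non-empty fragment id")
    return f"{base.rstrip('/')}/{member}#{suffix}"

def document_uri(uri):
    """The pre-hash URI, i.e. the document a client actually fetches."""
    return urldefrag(uri).url

old = ["http://example/ns#a", "http://example/ns#b"]
new = [per_member_uri(u) for u in old]

# Every old URI shares one document; every new URI has its own.
shared = {document_uri(u) for u in old}
separate = [document_uri(u) for u in new]
```
]]></eg>
	  <p>
	    Here <code>shared</code> contains the single document URI
	    'http://example/ns', while <code>separate</code> lists one
	    document URI per namespace member.
	  </p>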

	</div3>

	<div3>
	  <head>Fragment identifiers are easily lost</head> 
	  <p>
	    Harry Halpin <bibref ref="halpin"/> says that fragment
	    identifiers are often lost during document preparation
	    and cut/paste operations.
	  </p>
	  <p>
	    Rumor has it that some MVC-based web frameworks (Django?,
	    Sinatra?) are not
	    good about preserving fragids.  But this is just rumor; it
	    needs to be verified.
	  </p>
	  <p>
	    <emph>Response:</emph>
	    It's not obvious that this should be the case.
	    More detail is needed on this objection.  Concrete
	    scenarios would help.  This is really important because
	    without the anti-hash arguments, there is no need to use
	    absolute URIs.
	  </p>
	</div3>

	<div3>
	  <head>'Hash URIs' don't support REST architecture</head> 
	  <p>
	    <a href="http://lists.w3.org/Archives/Public/public-awwsw/2011Jan/0012.html"
	     >Manu Sporny</a>
	    says that
	    hash URIs should work with HTTP PUT, POST, and DELETE
	    methods; they don't.
	  </p>
	  <p>
	    <emph>Response:</emph>
	    More information needed.  Why not use a separate
	    dereferenceable URI for REST controls related to 
	    the referent and/or documentation of a hash URI?
	  </p>
	</div3>

	<div3>
	  <head>'Hash URIs' are unattractive, silly, and/or vestigial</head> 
	  <p>?</p>
	</div3>

      </div2>


      <div2 id="303">
        <head>Absolute URI with HTTP 303 See Other redirect</head> 
        <p>
	  Initially (around 2000) 'hash URIs' were advanced
	  as the recommended method for definition provision and discovery.
	  In the 2002-2005 time period
	  demand arose for a discovery method applicable to absolute
	  URIs.  This led 
	  to the invention of a new protocol for use in
	  situations where
	  'hash URIs' are considered unacceptable.
	</p>
        <p>
          In this approach, one mints an absolute (i.e. hashless) http: URI,
	  puts a definition of it on the Web at a second URI,
	  and then arranges for a GET request of the first URI to
	  redirect, using a 303 'See Other' status code, to the second
	  URI.  The first URI is not
	  dereferenceable, and therefore does not name the information
	  resource at that URI (since there is none).  The first
	  URI then gets its meaning 
	  by interpreting the document on the Web at the second URI,
	  which presumably contains a definition of the first URI.
	  The document may carry definitions of other URIs as well,
	  so the referent of the URI is not necessarily the document's
	  primary topic - it may be only one of many things "described
	  by" the document.
          [Draft note: TBD: cite HTTPbis]
	</p>

	<!-- redundant
        <p>
	  Similar to this is the practice of a 303 redirect to a
	  document, where the URI is taken to refer to the document's
	  primary topic.  Using this rule could give a lead
	  to a different meaning for the URI compared to what the
	  previous rule would give, so some tie-breaker is
	  needed in practice.
	</p> -->

	<example>
	  <head>303 redirect</head>
	  <graphic source="303.png"/>
	  <p>
	    Alice chooses 'http://example/eq018' as the way she will refer
	    to a particular earthquake.
	    At 'http://example/about-eq018' she publishes text and/or RDF
	    that defines 'http://example/eq018', explaining the URI by
	    providing details about the
	    earthquake (date, location).
	    For the URI 'http://example/eq018', which will not be
	    dereferenceable (since otherwise, it would refer to the
	    information resource at that URI <bibref ref="ir"/>, 
	    not the earthquake),
	    she arranges that a GET request yields a 303 redirect with
	    a Location: header specifying 'http://example/about-eq018' as the
	    redirect target.
	  </p>
	  <p>
	    Those encountering 'http://example/eq018' will attempt 
	    to dereference it, but
	    this will fail, with a 303 redirect delivered instead.  
	    The 303 redirect indicates that
	    the document at 'http://example/about-eq018'
	    provides a definition of the URI 'http://example/eq018'.
	  </p>
	</example>
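	<p>
	  A client's side of this convention might be sketched as
	  follows.  This is only an illustration under simplifying
	  assumptions: the function name <code>interpret_get</code> and
	  the result labels are invented here, header lookup is
	  case-sensitive, and canned responses stand in for real HTTP
	  traffic.
	</p>
	<eg><![CDATA[
```python
def interpret_get(uri, status, headers):
    """Classify a GET response under the 303 convention.

    Returns (kind, document_uri) where kind is one of:
      'information-resource' - the URI is dereferenceable and names the
                               information resource at that URI;
      'see-definition'       - the URI names something else, defined by
                               the document at document_uri;
      'unknown'              - nothing can be concluded from the response.
    """
    if status == 200:
        return ("information-resource", uri)
    if status == 303 and "Location" in headers:
        return ("see-definition", headers["Location"])
    return ("unknown", None)

# Alice's two URIs, with canned responses in place of network requests.
earthquake = interpret_get(
    "http://example/eq018", 303,
    {"Location": "http://example/about-eq018"})
document = interpret_get("http://example/about-eq018", 200, {})
```
]]></eg>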

        <p>
          Another pattern is to use a 303 redirect to a document whose
          primary topic is the intended referent, similar to the
          Chicago use case (<specref ref="chicago"/>).  This
          could, in theory, lead to 
          ambiguities, as the primary topic of the document and the
          entity referred to 
          using the URI might be different things.

	  <!-- 
          [Draft note: This is the second use case.
		       Is anyone, in practice, deploying 303 redirects to a
          "primary topic" page not mentioning the URI to be 
          defined, rather than to a document that explicitly mentions
          the URI?  YES - Hugh Glaser.]
	   -->
        </p>

        <p>
	  <emph>Criticism:</emph>
	  Again, a number of objections to this approach have been raised:
	</p>

	<div3>
	  <head>303 is difficult, sometimes impossible, to deploy</head> 
	  <p>
	    Deploying a 303 redirect requires giving the correct
	    directive to a web server, for example adding 
	    a Redirect line to .htaccess in Apache HTTPD.  Unfortunately
	    many hosting solutions do not allow this, putting this
	    manner of publishing definitions off limits to many who
	    would otherwise like to use it.
	  </p>

	  <p>
 	  <emph>Response:</emph>
	    Web publishers whose ISP does not permit them to set up a
	    303 redirect, or for whom the overhead such as expertise
	    acquisition is
	    prohibitive in some other way, could choose to use a service
	    that provides 303 redirects to a location of their choosing.
	    One such service is purl.org, operated by OCLC, which
	    permits anyone to set up a 303 or other redirect from a URI
	    in the purl.org domain.
	    The URI to be defined would have to have the form
	    http://purl.org/..., while the URI for the document carrying
	    the definition could be anything at all.
	  </p>
	  <p>
	    Unfortunately,
	    use of a redirect service makes one dependent on two
	    service providers instead of 
	    one, making one's definitions more vulnerable than if only
	    one provider were involved.
	  </p>
	</div3>

	<div3>
	  <head>303 leads to too many round trips</head> 
	  <p>
	    To get definitions of N URIs by redirecting through
	    303 responses,
	    you need to do 2N HTTP requests.
	    This is a frustrating and apparently gratuitous performance
	    hit for those interested in publishing and accessing
	    large numbers of definitions.
	  </p>
	  <p>
 	    <emph>Response:</emph>
	    See <specref ref="hostrule"/>.
	  </p>
	</div3>

	<div3>
	  <head>303 responses aren't cached</head> 
	  <p>
	    RFC 2616 <bibref ref="rfc2616"/> says that 303 responses
	    shouldn't be cached. 
	    Some caching software obeys this directive, with
	    negative consequences for the performance of GET/303 exchanges.
	  </p>
	  <p>
 	    <emph>Response:</emph>
	    This problem was recognized quite early on as a mistake in
	    RFC 2616 <bibref ref="rfc2616"/>,
	    and an erratum was circulated. This is one of many changes
	    made in HTTPbis, which is being developed by the IETF
	    HTTP working group and should be published some time
	    soon.  Any software that fails to cache 303 responses
	    when allowed to by HTTPbis needs to be fixed.
	  </p>
	</div3>

	<div3>
	  <head>303 makes the URI difficult to bookmark</head> 
	  <!-- 
	  <p>
	    "The user enters one URI into their browser and ends up at
	    a different one, causing confusion when they want to reuse
	    the URI ... Often they use the document URI by
	    mistake."
	    (<a href="http://iand.posterous.com/is-303-really-necessary"
	      >Ian Davis</a>)
	  </p>
 	  -->

	  <p>
	    "Redirection has in fact very confusing side effects; as we expect the
	    semantic web to work seamlessly with the web, it is very odd that a
	    semantic web uri cannot be copy pasted to a browser without seeing it
	    change to something that is not the same as before."  
	    <bibref ref="tumarello"/>
	  </p>

	  <p>
 	    <emph>Response:</emph>
	    The location bar issue is discussed
	    <a href="http://www.w3.org/QA/2010/04/why_does_the_address_bar_show.html"
	     >here</a>. [TBD: citation]
	    The content from the redirect target does
	    not originate from the referent of the original URI, so
	    an interface that suggests otherwise is guilty of misattribution.
	    The best answer to this is that an additional user
	    interface element should be added to browsers that
	    provides access to the original URI.
	  </p>

	</div3>

	<div3>
	  <head>This use of 303 has no consensus specification</head> 
	  <p>
	    HH: "The hash 303 redirect method in common use has
	    not received adequate 
	    review such as W3C recommendation track; in fact it is not
	    really documented at all in any adequate form."
	    <bibref ref="halpin"/>
	  </p>
	  <p>
 	    <emph>Response:</emph>
	    The IETF HTTP working group has taken on this issue.
	    <a href="http://tools.ietf.org/html/draft-ietf-httpbis-p2-semantics-14#section-8.3.4"
	     >HTTPbis's new text for GET/303</a>
 	    specifies the pattern, which
	    is now in common use in RDF deployment.  There is no issue
	    of incompatibility with prior usage because the current HTTP 
	    specification <bibref ref="rfc2616"/>
	    only defines what 303
	    means in conjunction with POST and says nothing about what
	    it means with GET.
	  </p>
	</div3>


      </div2>

    </div1>


    <div1>
      <head>Don't do it: Potential workarounds</head>

      <p>
        If issues around 'hash URIs' and 303 redirects
	render them unacceptable, it is worth considering alternatives.
	In this section we reconsider ways in which definition
	discovery can be bypassed altogether.  In the following section
	potential new discovery methods are considered.
      </p>

      <div2 id="ddi">
        <head>Use something other than a URI</head> 

        <ednote>
	  <date>2011-04-14</date>
          <edtext>This section derives from 
            <loc href="http://www.w3.org/2001/tag/2011/02/metadata-arch.html#slide9"
            >JAR's TAG F2F presentation slides</loc>.  The purpose of
            talking about this idea would be
            mainly to remind people that the problem is one of notational
            engineering, not philosophy.  I have been asked to remove this
            section.
	  </edtext>
        </ednote>

        <p>
	  URIs are just one kind of term that might be used to
          refer to something.  If defining a URI is too difficult or
          costly, then perhaps one might do without.
          In RDF serializations such as Turtle, 
          for example, we have blank node notation:
        </p>
        <eg>
  [ foaf:isPrimaryTopicOf &lt;http://example/about-chicago&gt; ] </eg>
        <p>
          Here we have managed to refer to Chicago without defining a
          new URI; we have simply referred indirectly using a URI that 
          refers to an information resource according to a generic method
          (see <bibref ref="ir"/>).
        </p>

	<!-- 
      </div2>
      <div2 id="sugar">
        <head>Syntactic sugar</head> 
 	-->

        <p>
          A concise alternative would be syntactic sugar:
        </p>
        <eg>
  *&lt;http://example/about-chicago&gt; </eg>
        <p>
	  which might be supported in a hypothetical new RDF serialization
          as a shorthand for the previous example.
          (The asterisk is meant to be suggestive of indirection in the
          C programming language.)
        </p>

        <p>
	  <emph>Criticism:</emph>
	  These are good as far as they go, but they do not meet the
	  demand for defined URIs.  In particular, it can be 
	  <a href="http://www.w3.org/wiki/RdfSmushing"
	   >difficult</a>
	  to detect that blank nodes in separate graphs are meant to
	  refer to the same thing.  Data integration is easier when
	  shared URIs are used.
        </p>

	<!-- 
	<p>
	  Each thing to be referenced has to have a dedicated page; pages
	  cannot be shared among multiple things.
	</p>
	 -->

        <p>
	  In the case of syntactic sugar, there would be
	  adoption overhead in publishing new 
	  RDF serialization specifications and getting them implemented.
        </p>

      </div2>

      <div2 id="parallel">
        <head>Express data in terms of information resources</head>
	<p>
          [Or, "parallel properties."]
	  The idea here is that you don't need to define a URI
	  if you are willing to use properties that 
	  are defined or understood as indirecting
	  through information resources.  Instead, just use a URI that
	  refers to the information resource at that URI, and use
	  it as the subject of such properties.
	</p>
	<p>
	  Assume that each information resource
	  can have an associated entity, which we'll call its
	  "designated subject".<footnote>Why
	    'designated subject' instead of 'primary topic'?
	    Because they might be different things.  Consider
	    identical content,
	    served from two URIs u1 and u2,
	    containing information about the designated subjects of both
	    u1 and u2.  Even though the content is identical, the URIs
	    would have to refer to distinct information 
	    resources with different designated subjects.
	    But the content can have only one primary topic.
	  </footnote>
	  Information about the designated subject is expressed using
	  properties whose subject is the information resource.
	</p>


	<example>
	  <head>Combining metadata and data using the same URI</head>
        <p>
	  Suppose that Alice wants to record some information about an
 	  earthquake.  She publishes a definition containing the
	  following so that it's on the Web at the URI
 	  'http://example/eq018':</p>
	  <eg>
  &lt;http://example/eq018> eq:magnitude 6.9.
  &lt;http://example/eq018> eq:epicenter &lt;geo:37.040,-121.877>. </eg>
	<p>
	  Bob then comes along and writes the
	  following metadata about IR('http://example/eq018') in the
	  usual way, i.e. using the URI to refer to the information
	  resource, based on what information is accessed via that
	  URI:
        </p>
        <eg>
  &lt;http://example/eq018> dc:creator "Alice".
  &lt;http://example/eq018> dc:title 
    "Loma Prieta earthquake URI definition".</eg>

        <p>
	  Suppose that
	  Carol encounters both bits of RDF (or either) and needs to
	  make sense of 
	  them.  She is aware that 'http://example/eq018' might be
	  used in both kinds of statement - in metadata, with the
	  intent that the 
	  metadata is about IR('http://example/eq018'); and also 
	  in statements that relate to an earthquake.
	  <!-- 
	  as described 
	  in IR('http://example/eq018').  For each use of
	  'http://example/eq018' she (or her software) needs to
	  determine which sense is supposed to apply. -->
	</p>
	</example>

	<p>
	  Instead of defining eq:epicenter to be a property 
	  relating an earthquake to its
	  epicenter, one defines eq:epicenter to be a property
	  that relates an information resource to the
	  epicenter of its designated subject.  
	  Then, as long as you
	  have a URI for the 
	  IR, you don't need a URI for the earthquake.
	  If property eq:epicenter has domain eq:Earthquake,
	  then the members of eq:Earthquake are IRs
	  whose designated subjects are earthquakes.
	</p>

	<p>
	  The nature of the designated subject is inferred from
	  information found in the IR.  For example, if the IR says
	  that its eq:epicenter is E, then you can infer that the
	  designated subject has epicenter E.
	</p>
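	<p>
	  One way to picture this inference is the Python sketch
	  below, which sorts statements made about a single URI into
	  facts about the designated subject and facts about the
	  information resource.  The set <code>INDIRECTING</code> and
	  the function name are assumptions for illustration; in
	  practice the distinction would come from the property
	  definitions themselves.
	</p>
	<eg><![CDATA[
```python
# Triples as (subject, property, object). All names are illustrative.
IR_URI = "http://example/eq018"
TRIPLES = [
    (IR_URI, "eq:magnitude", 6.9),
    (IR_URI, "eq:epicenter", "geo:37.040,-121.877"),
    (IR_URI, "dc:creator", "Alice"),
]

# Assumption: these properties are defined as relating an information
# resource to a feature of its designated subject.
INDIRECTING = {"eq:magnitude", "eq:epicenter"}

def split_statements(triples, ir_uri):
    """Separate facts about the designated subject from facts about the IR."""
    about_subject, about_ir = {}, {}
    for s, p, o in triples:
        if s != ir_uri:
            continue  # statement concerns some other resource
        target = about_subject if p in INDIRECTING else about_ir
        target[p] = o
    return about_subject, about_ir

earthquake_facts, document_facts = split_statements(TRIPLES, IR_URI)
```
]]></eg>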

	<graphic source="proxy.png"/>

	<p>
	  The overall effect when reading the RDF is that the
	  information resources, being ubiquitous, seem to disappear,
	  and one focuses naturally on information about their
	  designated subjects without being aware of the indirection.
	</p>

        <p>
	  All considerations that apply to the subject of a property
	  also apply to the object, making the situation more complex in
	  ways that we won't work out in detail here.
        </p>

	<p>
	  [via TimBL]
	  This pattern has some degree of uptake.  Using the 
	  <a href="http://ogp.me/"
	   >open graph protocol</a>
	  on Facebook, you can get a page about a movie. 
	  The RDF references &lt;&gt;, which is of class Movie.
	  (&lt;&gt; is equivalent to a reference via the base URI,
	  the one from which the page was retrieved, and therefore
	  refers to an information resource.)
	  The members of class Movie are information resources whose
	  designated subjects are movies.
	</p>

	<!-- 
	<p>
	  This is an old idea, going back to the 
	  <a href="http://www.w3.org/History/1989/proposal.html"
	   >original description of the Web</a>.
	</p> -->

	<p>
	  <emph>Criticism:</emph>
	  If a property that refers directly to movies also needs to be used,
 	  then two properties have to be defined (with distinct URIs), one
	  relating to the movie and one relating to the Movie.  This
	  results in clerical overhead and potential user confusion.
	</p>

      </div2>


      <div2 id="chimera">
        <head>Rely on implicit coercion from an information resource to its
          designated subject</head>

        <p>
	  [Draft note: We are trying to represent 
	  <a href="http://inkdroid.org/journal/2010/07/07/linking-things-and-common-sense/"
	   >Ed Summers's proposal</a>, which others have echoed,
	  in this section.  This is sometimes called "punning".]
        </p>
        <p>
	  If one's domain of discourse mixes information resources
	  (used as above) and entities that might be their designated subjects,
	  then maintaining parallel properties, one set that applies
	  the 'designated subject' coercion and one that doesn't,
	  might be considered an unacceptable cognitive and clerical burden.
	  (There is quite a lot of variation in opinion on this point.)
	  In this case one might try combining the two properties
	  into a single property that can be used in either
	  way.  Suppose that P is the initial property (not
	  defined via designated subject coercion) and Q is the
	  overloaded property
	  we'd like to define and write.  Then an obvious definition
	  of Q would be
        </p>
        <slist>
	  <sitem> Q(x,y) </sitem>
	  <sitem> &nbsp;&nbsp;&nbsp;if and only if </sitem>
	  <sitem> P(x,y) OR P(designated-subject(x),y)
	  </sitem>
        </slist>

        <p>
  	  For example, taking P = dc:creator as defined by the Dublin
	  Core definition, and Q = dc:creator as overloaded, the 
	  statement 
	</p><eg>
  &lt;http://example/eq018> dc:creator "Alice". </eg>
        <p>
          could be taken to imply that P(&lt;http://example/eq018>, "Alice")
	  as long as it is agreed ahead of time that earthquakes don't
	  have creators.
        </p>
        <p>
	  This manner of overloading can make correct recovery of
	  P-relationships impossible when a designated 
	  subject is an information resource, so it's probably better
	  to use a "tie breaking" rule such as
        </p>
        <slist>
	  <sitem> Q(x,y) </sitem>
	  <sitem> &nbsp;&nbsp;&nbsp;if and only if </sitem>
	  <sitem> P(x,y) OR
  	  	  {P(designated-subject(x),y) AND
 		     designated-subject(x) is not an information resource}
	  </sitem>
        </slist>
        <p>
	  There may be better tie-breakers than this one; this is just
	  for illustration.
        </p>
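        <p>
	  The difference between the two definitions of Q can be seen
	  in a small Python sketch.  The facts, identifiers, and
	  function names below are hypothetical; P is read here as
	  "x has creator y".
        </p>
        <eg><![CDATA[
```python
# Hypothetical P facts: (x, y) means "x has creator y".
P = {
    ("doc:review", "Alice"),    # Alice wrote the review
    ("doc:report", "Bob"),      # Bob wrote the report
    ("thing:statue", "Carol"),  # Carol made the statue
}

# doc:review is about doc:report (itself an information resource);
# doc:statue-page is about a statue (not an information resource).
DESIGNATED_SUBJECT = {
    "doc:review": "doc:report",
    "doc:statue-page": "thing:statue",
}
INFORMATION_RESOURCES = {"doc:review", "doc:report", "doc:statue-page"}

def q_simple(x, y):
    """Q(x,y) iff P(x,y) OR P(designated-subject(x), y)."""
    ds = DESIGNATED_SUBJECT.get(x)
    return (x, y) in P or (ds is not None and (ds, y) in P)

def q_tiebreak(x, y):
    """As q_simple, but ignore designated subjects that are themselves IRs."""
    ds = DESIGNATED_SUBJECT.get(x)
    return (x, y) in P or (
        ds is not None
        and ds not in INFORMATION_RESOURCES
        and (ds, y) in P
    )
```
]]></eg>
        <p>
	  With these facts, <code>q_simple("doc:review", "Bob")</code>
	  holds even though Bob did not write the review, while
	  <code>q_tiebreak("doc:review", "Bob")</code> does not.  Both
	  versions accept <code>("doc:statue-page", "Carol")</code>.
        </p>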

        <p>
	  All considerations that apply to the subject of a property
	  also apply to the object, making the coercion rules that
	  much more complex.
        </p>

        <p>
	  <emph>Criticism:</emph>
          This approach presents a couple of challenges.
        </p>

        <p>
          First, any tie-breaking rule is going to be fragile and will
          make the "losing" side of the tie difficult to express.
	  One can expect many mistakes where the designated subject
          was the intended subject of some metadata but the tie-breaking
          rule selected the other information resource.
        </p>

        <p> 
          Second, this method, by design, creates the illusion that
          the URI actually refers to the designated subject, not the
          information resource.  If
          predicates that already 
          possess meaning are being reinterpreted as overloaded 
	  properties, there is risk
          that an agent will draw unsound conclusions.  For example,
          if two URIs u, v refer to distinct information resources
          with the same designated subject,
          and one then writes &lt;u&gt; owl:sameAs &lt;v&gt;
	  having their designated subjects in mind, then an agent 
          can incorrectly infer that the two information resources
 	  are identical.  A similar situation holds with
          functional properties, which induce equations.
        </p>
      </div2>

    </div1>


    <div1>
      <head>Potential new discovery methods</head>
      <div2 id="hostrule">
          <head>Absolute URI with site-specific discovery rules</head> 
          <p>
	    The network round-trip (303 redirect) used to map the URI whose
	    definition is 
	    to be discovered to the URI of the information resource
	    that defines it can be avoided if we know a general rule
	    that maps the one kind of URI to the other, as such a rule can
	    be applied on the client without server involvement.
	    It is probably too much to hope that a single rule could work
	    uniformly for all URIs whose definition might be sought,
	    but an individual host may have a rule that applies for 
	    URIs at that host.
          </p>
          <p>
            The "well known URIs" protocol gives a place where
	    a file containing such rules can be stored <bibref ref="rfc5988"/>.
	    The rule might be stored in a well-known file
	    'definition-rule', as in 
	    'http://example/.well-known/definition-rule'.
	    To obtain a definition of 'http://example/eq018', obtain the
	    definition-rule file for its host.
            Then if the rule says to map 'http://example/{path}' to, say,
            'http://example/{path}.about', a
            definition of 'http://example/eq018' can be sought by dereferencing
 	    'http://example/eq018.about'.
          </p>
          <p>
            When the mapping rule is cached, this reduces the number
            of round trips from two (in the 303 case) to one.
          </p>
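          <p>
            The client-side mapping and the resulting request counts
            might be sketched in Python as follows.  The rule format
            (a per-host template with a <code>{path}</code>
            placeholder) and all names are assumptions, since the
            definition-rule file format has not been specified.
          </p>
          <eg><![CDATA[
```python
from urllib.parse import urlsplit

# Hypothetical cached rules: host -> template with a {path} placeholder.
RULES = {"example": "http://example/{path}.about"}

def definition_uri(uri, rules):
    """Apply the host's rule on the client, without contacting the server."""
    parts = urlsplit(uri)
    template = rules.get(parts.netloc)
    if template is None:
        return None  # no rule known: fall back to GET and a 303 redirect
    return template.format(path=parts.path.lstrip("/"))

def requests_with_rules(uris, rules_cached):
    """Requests needed when every host involved publishes a rule."""
    hosts = {urlsplit(u).netloc for u in uris}
    rule_fetches = 0 if rules_cached else len(hosts)
    return rule_fetches + len(uris)

def requests_with_303(uris):
    """Each URI costs a redirect plus a fetch of the definition."""
    return 2 * len(uris)

uris = ["http://example/eq018", "http://example/eq019", "http://example/eq020"]
```
]]></eg>
          <p>
            For these three URIs on one host, the 303 approach needs
            six requests, the rule-based approach four when the rule
            must first be fetched, and three once the rule is cached.
          </p>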
          <p>
            This would be a new protocol and the name and format of
	    the definition-rule file would have to be pinned down.
	    One option might be to use the link-template feature of
	    the <a href="http://tools.ietf.org/html/draft-hammer-hostmeta-13"
		 >host-meta file</a>,
	    but registering a new well-known file name would also be a
	    viable option.
          </p>
          <p>
            Looking for a definition-rule file for every host that has URIs
            for which definitions need to be discovered would be
            expensive if only a few of them have such files, but with some
            cleverness the number of such failed requests can probably
            be kept small.
            The details would have to be worked out, but this approach
            could be a boon to bulk consumers of absolute URI definitions.
          </p>

          <p>
	    For compatibility with clients that are not aware of
	    discovery rules, 303 redirects for these URIs should be
	    retained when possible.
          </p>

          <p>
	    <emph>Criticism:</emph>
	    Web site authors without write access to the appropriate
	    .well-known file would not be able to take advantage of
	    this facility.
          </p>

	  <p>
	    Jeni says: "the disadvantage is that you lose the
	    distinction between status codes for the thing [described]
	    and the
	    document" -- but JAR doesn't understand this.  Any
	    information that would have been conveyed by the status
	    code from a GET on the original URI, could be conveyed in
	    the document retrieved by definition discovery.
          </p>

	  <p>
	    Jeni says: "in some cases the mapping from thing URI to
	    document URI can be complex or change over time in ways that
	    make it hard to use a definition rule file; in
	    legislation.gov.uk for example, we return a 303 redirection
	    from a legislation item to <emph>either</emph> an as-enacted 
	    version
	    <emph>or</emph> the most recently revised version, depending 
	    on what is
	    available for that particular item of legislation (which
	    changes as new revised versions are added). It would be
	    quite hard to create a definition-rule file in those
	    circumstances (we would have to solve it by having a simple
	    mapping with some URIs 307 redirecting to others)."
	  </p>
      </div2>

      <div2 id="newhttp">
          <head>Absolute URI with new HTTP method or status code</head> 
          <p> 
	    To reduce the number of round trips relative to the 303 
	    redirect, we might use a new HTTP status code
            to indicate that what is being returned is a definition of 
            the request URI, rather than a representation
	    associated with the information resource at that URI.
	    Alternatively, we could define an
            HTTP method to request a definition of 
            a URI.
          </p>
          <p> 
            <emph>New status code:</emph>
	    In response to GET of a URI,
            a server might provide a definition of the URI directly 
	    in a response with a status other than 200 OK,
            as opposed to indirectly via a 303 redirect.  
	    (The definition can't go in an ordinary 200 OK response to GET
 	    since that would mean that the URI
            refers to the information resource at the URI.)
            Possibilities for HTTP response status codes that might
            signal this situation: 
            203 Non-Authoritative Information; a new 2xx status
            (maybe 209); a new 3xx status (maybe 308);
            or a variety of 4xx codes.
            Placing the definition in the content of a redirect response
            (status codes 301,
            302, 303, and 307) is unsatisfactory as the
            content would not be displayed in a Web browser; the same
            situation might apply to any 3xx or 4xx response, making a
            2xx status code the most attractive.
          </p>
          <p> 
            <emph>New method:</emph>
            The URIQA specification <bibref ref="uriqa"/> defines MGET, 
            a new HTTP request method.
            An MGET request on a URI yields a response containing 
	    information about the referent of the URI.
	    If the URI is dereferenceable, then the URI
	    refers to the information resource at that URI,
	    so the MGET result is metadata for that information
            resource.
	    Otherwise, the MGET result might be a definition of the
            URI.  A GET in that case would
            yield either a 303 See Other redirect
	    to the same definition obtained by MGET, or
	    perhaps a 405 Method Not Allowed
            response.
          </p>
          <p>
            Either of these options would mean fewer round trips than
            following a 303 redirect.
          </p>
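          <p>
            The saving can be illustrated with a small Python
            simulation.  It is a sketch only: status 209 is used as a
            stand-in for whatever new code might be chosen, the
            <code>discover</code> function is hypothetical, and
            dictionaries of canned responses replace real servers.
          </p>
          <eg><![CDATA[
```python
# 209 is a hypothetical status code; HTTP assigns it no such meaning.
DEFINITION_STATUS = 209

def discover(uri, fetch, max_hops=5):
    """Return (definition, round_trips); definition is None if not found.

    fetch(uri) must return a (status, headers, body) tuple.
    """
    redirected = False
    trips = 0
    while trips < max_hops:
        status, headers, body = fetch(uri)
        trips += 1
        if status == DEFINITION_STATUS:
            return body, trips   # definition delivered directly
        if status == 303 and "Location" in headers:
            uri, redirected = headers["Location"], True
            continue             # a second round trip is needed
        if status == 200 and redirected:
            return body, trips   # document reached via the 303
        break                    # plain 200, 404, etc.: no definition
    return None, trips

# Canned responses standing in for two differently configured servers.
WITH_303 = {
    "http://example/eq018":
        (303, {"Location": "http://example/about-eq018"}, ""),
    "http://example/about-eq018": (200, {}, "definition of eq018"),
}
WITH_209 = {"http://example/eq018": (209, {}, "definition of eq018")}

via_303 = discover("http://example/eq018", WITH_303.__getitem__)
via_209 = discover("http://example/eq018", WITH_209.__getitem__)
```
]]></eg>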
          <p>
	    The Link: HTTP header <bibref ref="rfc5988"/>
	    is useful for indicating a metadata source for an
	    information resource (see POWDER spec, citation needed).  
	    In case a URI is not dereferenceable,
	    Link: could be used for directing
	    a client to a definition of a URI.  However, the
	    advantage of Link: over a 303 redirect is unclear, since
	    a second network round trip would be required in either case.
          </p>
          <p>
	    <emph>Criticism:</emph>
            Although they reduce the expected number of round trips,
	    HTTP extensions are generally at least as difficult
            to deploy as 303 redirects.  And it's not clear
            which status codes play nicely with the "browser
            friendly" goal.
	    We would have to check to make sure that
            proxies, caches, and Web clients do
            something reasonable with the proposed status code.
          </p>
      </div2>

      <div2 id="depends">
        <head>Repurpose some or all dereferenceable absolute URIs</head>

        <p>
	  Under this approach, some or all dereferenceable 
	  absolute URIs - call them "indirect" URIs - would
          get their meaning according to a definition
          found in the information resource (document, usually) at the URI;
	  they would no longer refer to their information resource
          <bibref ref="ir"/>.
	  This approach avoids the deployment and performance
	  difficulties of 303
          redirects.  Defining an indirect URI is easy &mdash; it is the same as
          publishing any Web document &mdash; and access to its definition
	  is also easy, not requiring an indirection step.
        </p>

	<graphic source="change.png"/>

        <p>
	  How does one learn whether a URI is indirect or not?
	  One might like to say that an indirect URI is one that
	  dereferences to a definition of itself, and that all others
	  are direct.
	  But this criterion is not machine
          actionable as stated, both because the definition might be couched
          in an arbitrary language or notation (the number of RDF
	  serializations is increasing steadily), and because even for
	  a known notation it may not be obvious 
          how to distinguish content that contains a
          definition of a particular URI from content that doesn't.
          One actionable approximation that has been
          proposed is as follows: If IR(u) has an associated
	  <loc href="#representation">representation</loc>
	  with media type 'application/rdf+xml', then
          take u to be indirect; otherwise take u to be direct.  This
	  rule would generate false positives (e.g. RDF/XML documents 
          not containing u) and false negatives (e.g. those defining the
	  URI only in an associated text/owl-manchester
	  <loc href="#representation">representation</loc>),
          but it illustrates the idea.
        </p>
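        <p>
	  The approximation, together with its failure modes, might
	  look like this in Python.  The function name and the way
	  media types are passed in (as a list of Content-Type values
	  observed for IR(u)) are assumptions made for this sketch.
        </p>
        <eg><![CDATA[
```python
INDIRECT_MEDIA_TYPE = "application/rdf+xml"

def is_indirect(representation_media_types):
    """Proposed approximation: u is indirect if and only if IR(u) has
    a representation whose media type is application/rdf+xml."""
    return any(
        # Drop parameters such as '; charset=utf-8' before comparing.
        mt.split(";")[0].strip().lower() == INDIRECT_MEDIA_TYPE
        for mt in representation_media_types
    )

# An RDF/XML document is classed as indirect even if it never mentions u
# (a false positive); a definition served only as text/owl-manchester is
# classed as direct (a false negative).
rdf_xml_page = is_indirect(["text/html", "application/rdf+xml; charset=utf-8"])
manchester_only = is_indirect(["text/owl-manchester"])
plain_html = is_indirect(["text/html"])
```
]]></eg>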

        <p>
	  In order to compose or use metadata, agents would 
	  first check whether a URI is direct by
	  requesting an application/rdf+xml representation.  If the URI
	  is direct, agents could compose or use metadata in the
	  usual way (at some risk that the URI might change status in
	  the future from direct to indirect).  If the URI is
	  indirect, agents 
	  would have to write or interpret the metadata in some new
	  way (see below).
        </p>

        <p>
	  <emph>Criticism:</emph>
	  Currently it is easy to write and interpret Web metadata
	  (meaning metadata written using a dereferenceable absolute
	  URI to refer to the information resource at that URI).
	  This proposal makes metadata more complicated, fragile, and
	  costly, and forces all existing producers and 
	  consumers of Web metadata to be updated to be aware
	  of indirect URIs.
        </p>

        <p>
	  It is likely that there is deployed content that would
	  be interpreted differently under the proposed rule than at
	  present.  This would be hard to know, and inconsistencies
	  could be consequential, such as the assignment of authorship
	  or a copyright license to the wrong information resource.
	  (Think about the case where an information resource at URI U
	  defines U to be a different information resource.)
	  More complex
	  and costly heuristics than those given above might eliminate
	  some kinds of misinterpretation, but would never eliminate it.
        </p>

        <p>
	  As most of the Web (e.g. HTTP clients and servers) will
	  continue to adhere to the current interpretation of
	  dereferenceable absolute URIs, the proposed rule
	  introduces a split in the URI namespace, with two
	  communities interpreting the same URIs in incompatible
	  ways.  Having multiple namespaces 
	  imposes an overall system cost in that one has to
	  determine which one to use in each instance 
	  (see <bibref ref="webarch"/> 2.2.1).
        </p>

	<!-- 

        <p>
          This would be an incompatible change.
	  Clearly some agents, such as Web clients servers, would have to
	  respect the current rule, since under the proposal they
          might dereference an indirect URI in a manner not expected by any
          client.  (Consider the situation where the indirect URI is defined to
          be an information resource that is not the one
	  on the Web at the URI.)  So the first effect would be to
	  partition URI contexts into those 
          where indirect URIs are interpreted according to the current
          rule, and those where they're interpreted them according to
          the proposal.  For
          example, an indirect URI as the target URI of an HTTP
          messages would be interpreted according to the current rule,
          while an indirect URI occurring in an RDF document might be 
          interpreted according to the proposal.
        </p>

        <p>
          Some machine-actionable rule is desirable, since without one there
          is no reliable way to use <emph>any</emph>
          dereferenceable absolute URI u to
          refer to IR(u), and all currently deployed metadata would fail.  There
          would always be the possibility that 
          u might be understood to be defined by IR(u) instead.
        </p>
 	-->

        <div3>
          <head>How to refer to information resources, then?</head>
	  <p>
	    Any proposal that displaces the current meanings of some URIs 
	    from those URIs has to compensate by providing new homes
	    for those meanings.  That is,
	    some rule must be specified that
 	    yields a way to refer to IR(u), given any
	    dereferenceable absolute URI u.
	    This is not a matter of semantics or philosophy; it is
	    just notational engineering.
	  </p>
	  <p>
	    There are many applications that need such a rule for
	    writing references to information resources at arbitrary 
	    URIs, including
	    those concerned with metadata (including licensing), provenance, 
	    Web site testing, validation, text processing, text annotation, and 
	    access control.
	  </p>
	  <p>
	    A standard way to refer to
	    IR(u) is needed in a variety of circumstances:
	  </p>
	  <ol>
	    <li>when u is an indirect URI</li>
	    <li>when it is not known whether u is direct or indirect</li>
	    <li>when the cost of determining whether u is direct or
	      indirect is judged to be too high</li>
	    <li>when it is desired not to impose on others the cost of
	      determining whether u is indirect</li>
	    <li>to guard against u possibly becoming indirect in the future</li>
	  </ol>
	  <p>
	    Although direct URIs might still be used to refer to their
	    information resources
	    when they are known to be direct,
	    the risks and costs of doing so might
	    lead people to abandon this practice in favor of a common
	    approach that works uniformly for direct and indirect URIs.
	  </p>
	  <p>
	    In any case, there are many design alternatives for referring
	    to an information resource other than using its URI.  For
	    example, the Turtle term 
	  </p>
	  <eg>
    [ ir:onWebAt "http://example/eq018"^^xsd:anyURI ] </eg>
	  <p>
	    could be a new way to refer to 
	    IR('http://example/eq018'), which we formerly
	    referred to in Turtle as '&lt;http://example/eq018&gt;'.
	    [TBD: Reference Halpin and Presutti's closed access ESWC 2009 paper.]
	    A local shorthand for use within a document or graph
	    could be defined to the same effect:
	  </p>
	  <eg>
    :about-eq018 ir:onWebAt "http://example/eq018"^^xsd:anyURI . </eg>
	  <p>
	    (Note that :about-eq018 could be either a 'hash' URI or a
	    303 URI.)
	  </p>
	  <p>
	    Yet another possible replacement notation would be syntactic sugar:
	  </p>
	  <eg>
    &amp;&lt;http://example/eq018&gt; </eg>
	  <p>
	    which might be supported in a hypothetical new RDF serialization.
	    (The ampersand is meant to suggest the address-of
	    operator in the C programming language.  Introducing a new
	    serialization would of course have significant deployment cost.)
	  </p>
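	  <p>
	    The first two notations above can be generated mechanically
	    from u.  The following is a minimal Python sketch, assuming
	    the <code>ir:</code> and <code>xsd:</code> prefixes used in
	    the examples are declared elsewhere; the function names
	    <code>turtle_string</code>, <code>ir_blank_node</code>, and
	    <code>ir_shorthand</code> are hypothetical.
	  </p>
	  <eg><![CDATA[
```python
def turtle_string(s: str) -> str:
    """Quote s as a Turtle double-quoted string (escapes backslash and quote)."""
    return '"' + s.replace("\\", "\\\\").replace('"', '\\"') + '"'


def ir_blank_node(u: str) -> str:
    """Blank-node term referring to IR(u) without relying on <u> itself."""
    return "[ ir:onWebAt %s^^xsd:anyURI ]" % turtle_string(u)


def ir_shorthand(local_name: str, u: str) -> str:
    """Statement defining a local name (in the default namespace) for IR(u)."""
    return ":%s ir:onWebAt %s^^xsd:anyURI ." % (local_name, turtle_string(u))
```
]]></eg>
	  <p>
	    Both functions treat u as a string, so the result does not
	    depend on whether u is direct or indirect.
	  </p>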

	  <!-- 
	  <p>
	    [Draft note: HH requested that the idea be presented of
	    syntactic sugar to
	    support references to IRs.  He suggested something having to
	    do with quotation and named graphs that I didn't understand,
	    but I think he's referring to something that's basically the same
	    as the address-of operator in my 
	    <loc href="http://www.w3.org/2001/tag/2011/02/metadata-arch.html#slide9"
	      >TAG F2F slides</loc>.]
	  </p>
	  -->

	  <p>
	    Alternatively, the referring document could just assert that
	    a URI is direct, without checking whether it is or not:
	  </p>
	  <eg>
    &lt;http://example/eq018&gt; ir:onWebAt "http://example/eq018"^^xsd:anyURI . </eg>
	  <p>
	    This would be an instance of <specref ref="colocate"/>.
	    However, this runs some interoperability risk as there may
	    be other agents that interpret the same URI as indirect.
	    <footnote>
	      <p>
		One might think that the notation 
		for referring to information resources could relate the
		information resource to the referent of u (written
		'&lt;http://example/eq018&gt;' in Turtle) instead of to the
		URI u itself
		(written '"http://example/eq018"^^xsd:anyURI'):
	      </p>
	      <eg>
    [ rdfs:isDefinedBy &lt;http://example/eq018&gt; ] </eg>
	      <p>
		However, the meaning of this expression is then sensitive to the
		interpretation of the URI 'http://example/eq018', which
		is what was in doubt in the first place
		and is therefore something that the notation
		has to avoid depending on.
		<!-- 
		The &lt;...&gt; notation is also
		ambiguous according to RDF semantics, because -->
		<!-- 
		If two URIs, say 
		'http://example/eq018' and 'http://example/earthquake571',
		both refer to the same thing (whatever it is), there might
		be two distinct information
		resources IR('http://example/eq018') and
		IR('http://example/earthquake571') satisfying this relationship,
		with no way for the property, which is defined on the
		interpretations of the URIs and not on the URIs
		themselves, to choose between them.
 		-->
	      </p>
	    </footnote>
	  </p>

	  <p>
	    Another design option would be a rule or protocol
	    for providing a URI (other
	    than u) to refer to IR(u), when 
	    one is available.
	    One way to do this would be with a Link: HTTP response header
	    <bibref ref="rfc5988"/>: if GET u or HEAD u yielded a
	    response with a Link: header with an agreed link relation,
	    the target of the link would be the URI naming IR(u).
	    Using a Content-Location: header
	    has also 
	    <loc href="http://blog.iandavis.com/2010/11/07/a-guide-to-publishing-linked-data-without-redirects/"
	     >been suggested</loc>.  It would be necessary that
	    the extra header be provided for <emph>all</emph> indirect URIs,
	    since otherwise some of these
	    information resources would lack URIs.
	  </p>
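	  <p>
	    The Link: header variant might look roughly like the
	    following on the client side.  This is a Python sketch
	    under stated assumptions: the relation name
	    <code>IR_REL</code> is a placeholder (no relation has been
	    agreed), the function name is hypothetical, and the parsing
	    is deliberately simplified rather than a complete
	    implementation of <bibref ref="rfc5988"/>.
	  </p>
	  <eg><![CDATA[
```python
import re
from typing import Optional
from urllib.parse import urljoin

# Placeholder only: the text says "an agreed link relation" without naming one.
IR_REL = "http://example/rel/ir"

_LINK_VALUE = re.compile(r'<([^>]*)>([^<]*)')
_REL_PARAM = re.compile(r';\s*rel\s*=\s*(?:"([^"]*)"|([^\s;,]+))', re.IGNORECASE)


def ir_uri_from_link(u: str, link_header: str, rel: str = IR_REL) -> Optional[str]:
    """Return the URI naming IR(u) advertised in a Link: header value, if any.

    Simplified parsing: assumes no '<' or ';' inside quoted parameter values.
    """
    for target, params in _LINK_VALUE.findall(link_header):
        for match in _REL_PARAM.finditer(params):
            relations = (match.group(1) or match.group(2) or "").split()
            if rel in relations:
                # A relative target is resolved against u in this sketch.
                return urljoin(u, target)
    return None
```
]]></eg>
	  <p>
	    A result of <code>None</code> corresponds to the case warned
	    about above: an indirect URI served without the extra header
	    leaves its information resource without a URI.
	  </p>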
	  <p>
	    It is not clear how difficult it would be to correctly deploy
	    Link: or Content-Location: headers on hosting services.
	  </p>
	    <!-- 
  , or
	    via an RDF statement such as
	  </p>
	  <eg>
    &lt;http://example/eq018#ir&gt; ir:onWebAt "http://example/eq018"^^xsd:anyURI . </eg>
   -->
       </div3>

      </div2>

    </div1>


    <div1>
      <head>Summary</head>
      <p>
        [Jeni: "I think you could do with making more of (ie explaining
        in more detail up front) the criteria against which the
        various alternatives are judged. There are various criteria
        that crop up in the criticism sections that aren't necessarily
        reflected in the table here, such as the copy/paste factor,
        cachability (as I described above)"]
      </p>
      <p>
        The following table summarizes some of the current and
 	proposed
	definition discovery methods,
        evaluating each against a set of criteria, as explained in the
 	key below.
      </p>
      <table rules='all'>
       <thead>
        <tr><td></td> <td>webarch?</td>
                      <td>robust?</td>
                      <td>easy to deploy?</td>
                      <td>min round trips</td> 
                      <td>sound?</td> 
                  </tr>
       </thead>
       <tbody>
        <tr><td><loc href="#hash"
                 >Hash</loc>    </td>
            <td>+</td>
            <td>-</td> <td>+</td> <td>1</td>
            <td>+</td></tr>

        <tr><td><loc href="#303"
                 >Absolute + 303</loc>    </td>
            <td>+</td>
            <td>+</td> <td>-</td> <td>2</td>
            <td>+</td></tr>

	<!-- 
        <tr><td><loc href="#suffix"
                 >Hash + fixed suffix</loc>    </td>
            <td>+</td>
            <td>-</td> <td>+</td> <td>1</td>
            <td>+</td></tr>
       -->
        <tr><td><loc href="#hostrule"
                 >Absolute + discovery-rule</loc></td>
            <td>+</td>
            <td>+</td> <td>?</td> <td>1+&epsilon;</td>
            <td>+</td></tr>
        <tr><td><loc href="#newhttp"
                 >Absolute + new HTTP</loc> </td>
            <td>+</td>
            <td>+</td> <td>-</td> <td>1</td>
            <td>+</td></tr>
        <tr><td><loc href="#chimera"
                 >Coerce</loc></td>
            <td>+</td>
            <td>+</td> <td>+</td> <td>1</td>
            <td>-</td></tr>
        <tr><td><loc href="#depends"
                 >Take at face value</loc></td>
            <td>-</td>
            <td>+</td> <td>+</td> <td>1</td>
            <td>+</td></tr>
       </tbody>
      </table>

      <glist>
        <label>webarch?</label>
        <def> 
          Does it avoid assigning a new, incompatible meaning to existing URIs?
        </def> 

        <label>robust?</label>
        <def> 
          Is the URI free of fragment identifiers that can get lost
	  or misinterpreted?
        </def> 

        <label>easy to deploy?</label>
        <def> 
          Can a publisher with a file-upload-only hosting solution use 
          this method?
        </def> 

        <label>min round trips</label>
        <def> 
          How many network round trips are needed to find
          a definition, assuming (a) the definition is not cached and
          (b) the /.well-known/host-meta cache misses with probability
          &epsilon; ?
        </def> 

        <label>sound?</label>
        <def> 
          Is the method likely to respect deployed
          axioms and inference rules (i.e. is safe with respect to
          logical soundness)?
        </def> 
      </glist>
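      <p>
        The entry 1+&epsilon; can be read as an expected value.  As a
        worked sketch, assuming that a /.well-known/host-meta cache hit
        costs no extra round trip and that a miss costs exactly one
        extra round trip before the definition itself is fetched:
      </p>
      <eg><![CDATA[
```latex
\mathbb{E}[\text{round trips}]
  = (1-\varepsilon)\cdot 1 + \varepsilon\cdot 2
  = 1 + \varepsilon
```
]]></eg>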

      <!-- 
      <ednote>
        <date>2011-04-11</date>
	<edtext>
	  For reference, 
          <loc href="http://hueniverse.com/2008/09/discovery-and-http/"
           >here</loc>'s a similar analysis &mdash; not the same problem, but a
          related one &mdash; with its own matrix.
	</edtext>
      </ednote>
      -->
    </div1> 


    <div1 id="glossary">
      <head>Glossary</head>

      <!-- 
      <p>[Draft note: HH: Put Glossary at end. Otherwise, I doubt
      anyone will get past it.]</p>
      -->

      <p>
        This section defines terms that are used in this report.
        An attempt has been made to avoid gratuitous differences
        from the way these terms are used elsewhere, but in a few
        cases choice of terminology has been difficult and words
        with other meanings (such as "definition") are
        given technical definitions.  These definitions are not being
        proposed for general adoption.
      </p>

      <p>
        [Draft comment: All terminology choices are provisional; 
        for most of them I
        am testing the waters to see how well the word works, and am
        prepared to change.]
      </p>

      <glist>
        <label id="absolute">absolute</label>
        <def>
	  A URI is absolute if it contains no hash '#' sign.  This usage is
	  a bit unintuitive
	  but is used for consistency with RFC 3986
	  <bibref ref="rfc3986"/>.
        </def>

        <label>associated with</label>
        <def>
	  [Draft note: This is too sketchy.  TBD.]
	  "Association" of a 
	  <loc href="#representation">representation</loc>
	  with an information
	  resource is by fiat according to each particular
	  information resource.  See <bibref ref="ir"/>.
        </def>

        <label>definition</label>
        <def>
          A document or document part that provides
          information about the meaning of a URI or other kind of term.
	  This term is not meant to be either rigorous or exclusive.  The
          "information" could be provided in
	  any human-readable or machine-readable language,
	  or combination of languages.
          It needn't be successful, specific, or comprehensive in defining the 
	  term in the ordinary sense of "defining".  Rather, the term
          as used here refers to the role it plays in discovery.  We
          might more accurately say "putative definition".

          [Draft note: Alan R: Is a sound recording a possible definition?]

          <!-- 
          [We need a word for this, and its relation
          to a phrase whose meaning is in question.  "Description" (or
          Eran H-L's "description resource") is
          incorrect as it shifts focus from the term to some (unknown)
          resource - I don't start out knowing what the resource is and then
          look for a description of it, I start out knowing a term and
          then I want to know what resource is meant.  "Definition" is
          another option but may be misleading.  David B likes
          "URI declaration" but this term is evocative of his architecture,
          which I don't want to evoke.]
           -->
        </def> 

        <label>dereferenceable</label>
        <def> 
          A URI is dereferenceable if 
	  there is at least one 
	  <loc href="#representation">representation</loc>
	  that is authorized as the result of a retrieval
	  operation. (This definition is derived from
	  <bibref ref="rfc3986"/>
	  section 1.2.2, which also applies 'dereference' to
	  operations such as POST.)
          In particular, absolute http: URIs are
          dereferenceable if an HTTP GET request or
          equivalent is successful (yields a 2xx response).  Some URIs
          belonging to some other 
          URI schemes are also dereferenceable.
        </def> 

        <label>http: URI</label>
        <def>
          A URI whose scheme (the part before the colon) is 'http' or 'https'.
        </def>

        <label>information resource</label>
        <def>
          Roughly speaking, something that is appropriate as the
          subject of metadata.  See <bibref ref="ir"/>.
        </def>

        <label>IR(u)</label>
        <def> 
          IR(u) is shorthand for the information resource
	  <loc href="#on-web-at">on the Web at</loc>
          URI u.  For example, 
          if 'http://example/image23' is dereferenceable, then
          IR('http://example/image23')
          is the information resource on the Web at that URI.
        </def>

        <label>metadata</label>
        <def>
          Information about information, or about an information 
          resource.  In RDF, metadata might
          be written using vocabularies such as Dublin Core, FOAF,
          or CC REL.
        </def>

        <label id="on-web-at">on the Web at</label>
        <def> 
          When a URI is dereferenceable,
          "the information resource on the Web at a URI" 
          (abbreviated IR(that URI), see above)
          is the information resource whose associated 
	  <loc href="#representation">representations</loc>
          are the 
	  ones obtained by dereferencing that URI (or more precisely,
          the ones that are authorized for dereferences of that URI).
	  See <bibref ref="ir"/> for a rigorous definition.
        </def>

        <label>refer</label>
        <def>
          For the purposes of this report, reference is just one way to
          mean.  There may be ways to mean other than to
          refer, but none are specified here.
        </def>

        <label id="representation">representation</label>
        <def>
	  Content (an octet sequence) tagged with media type and perhaps
      	  other information meant to guide interpretation of the content.
      	  "Representation" is used as a term of art; these representations
      	  don't necessarily "represent" anything at all.  Similar to
      	  "entity" in RFC 2616 <bibref ref="rfc2616"/>.
	  See <bibref ref="ir"/> for a treatment of representations
      	  and information resources.
        </def>

	<!-- 
        <label>term</label>
        <def>A URI, word, name, or phrase
          that can serve in subject or object position in a statement.  In an
          RDF serialization, for example, a term might be a qname,
          URI, or blank
          node label.  In Turtle, a term might be any Turtle term,
          including one written using blank node [...] notation.
	  [Draft note: HH says that to be correct, need to admit that
          URIs are also used as predicates.]
        </def>
 -->

      </glist>
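      <p>
        Several of the terms above are simple enough to state as
        predicates.  The following Python sketch is illustrative only:
        the function names are hypothetical, the dereferenceability
        test covers only absolute http: URIs, and the retrieval itself
        is supplied by the caller through the hypothetical
        <code>get_status</code> callback.
      </p>
      <eg><![CDATA[
```python
from typing import Callable
from urllib.parse import urlsplit


def is_absolute(uri: str) -> bool:
    """Glossary sense of "absolute": the URI contains no '#'."""
    return "#" not in uri


def is_http_uri(uri: str) -> bool:
    """True if the scheme (the part before the colon) is 'http' or 'https'."""
    return urlsplit(uri).scheme.lower() in ("http", "https")


def is_dereferenceable(uri: str, get_status: Callable[[str], int]) -> bool:
    """Approximate test for absolute http: URIs: retrieval yields a 2xx status.

    `get_status` is a hypothetical callback that performs the retrieval
    and returns the HTTP status code.
    """
    if not (is_absolute(uri) and is_http_uri(uri)):
        raise ValueError("sketch only covers absolute http: URIs")
    return 200 <= get_status(uri) < 300
```
]]></eg>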
    </div1>

    <div1>
      <head>Acknowledgments</head> 
      <p>
        David Booth, Michael Hausenblas, Nathan Rixham, and
        Alan Ruttenberg contributed to
        the creation of this report.  
	Pat Hayes and Henry S. Thompson participated in discussions.
	Timothy Danford gave some helpful suggestions on a draft.
	Jeni Tennison and the rest of the TAG gave many helpful comments.
      </p>
    </div1>

    <div1>
      <head>References
      </head> 
      <blist> 
        <bibl id="issue-57"
              href="http://www.w3.org/2001/tag/group/track/issues/57">
          <titleref href="http://www.w3.org/2001/tag/group/track/issues/57"
           >Issue 57.</titleref>
          W3C Technical Architecture Group, 2007-2011.
        </bibl> 

        <bibl id="rfc3986"
              href="http://www.ietf.org/rfc/rfc3986.txt">
          T. Berners-Lee, R. Fielding, L. Masinter.
	  <titleref href="http://www.ietf.org/rfc/rfc3986.txt"
           >Uniform Resource Identifier (URI): Generic Syntax.</titleref>
          RFC 3986, IETF, 2005.
        </bibl> 
        
        <bibl id="disambiguating"
              href="http://www.w3.org/2002/12/rdf-identifiers/">
          Sandro Hawke.
	  <titleref href="http://www.w3.org/2002/12/rdf-identifiers/"
           >Disambiguating RDF Identifiers.</titleref>
          W3C, January 2003.
        </bibl> 

        <bibl id="webarch"
              href="http://www.w3.org/TR/webarch/">
          Ian Jacobs and Norman Walsh, editors.
	  <titleref href="http://www.w3.org/TR/webarch/"
           >Architecture of the World Wide Web, Volume One.</titleref>
          W3C Recommendation, December 2004.
        </bibl> 

        <bibl id="ir"
	      href="http://www.w3.org/2001/tag/awwsw/ir/20110625/">
          Jonathan A. Rees, editor.
	  <titleref href="http://www.w3.org/2001/tag/awwsw/ir/20110625/"
	   >Information resources and Web metadata.</titleref>
	  Editor's draft, W3C, 2011.
	</bibl>

        <bibl id="rfc4395"
              href="http://www.ietf.org/rfc/rfc4395.txt">
	  T. Hansen, T. Hardie, and L. Masinter.
          <titleref href="http://www.ietf.org/rfc/rfc4395.txt"
           >Guidelines and Registration Procedures for New URI Schemes.</titleref>
          RFC 4395, IETF, 2006.
        </bibl> 
        
        <bibl id="rfc3406"
	      href="http://www.ietf.org/html/rfc3406.txt">
	  L. Daigle, D.W. van Gulik, R. Iannella, and P. Faltstrom.
	  <titleref href="http://www.ietf.org/html/rfc3406.txt"
	   >Uniform Resource Names (URN) Namespace Definition 
	    Mechanisms.</titleref>
	  RFC 3406, IETF, 2002.
        </bibl> 

        <bibl id="lsid"
	      href="http://www.omg.org/cgi-bin/doc?dtc/04-05-01.pdf">
          <titleref href="http://www.omg.org/cgi-bin/doc?dtc/04-05-01.pdf"
	   >Life Sciences Identifiers Specification.</titleref>
	  Object Management Group, 2004.
	</bibl>


	<!-- 
	Appears not to be used in this document any more.
        <bibl id="issue-14-resolved"
              href="http://lists.w3.org/Archives/Public/www-tag/2005Jun/0039.html">
          Roy Fielding.
	  <titleref href="http://lists.w3.org/Archives/Public/www-tag/2005Jun/0039.html"
           >[httpRange-14] Resolved.</titleref>
          Email to www-tag list, 2005.
        </bibl> 
 	-->

        <bibl id="rfc5988"
              href="http://www.ietf.org/rfc/rfc5988.txt">
          M. Nottingham.
	  <titleref href="http://www.ietf.org/rfc/rfc5988.txt"
           >Web Linking.</titleref>  
          RFC 5988, IETF, 2010.
        </bibl> 

        <bibl id="hostmeta"
              href="http://tools.ietf.org/html/draft-hammer-hostmeta-13">
          E. Hammer-Lahav.
	  <titleref href="http://tools.ietf.org/html/draft-hammer-hostmeta-13"
           >Web Host Metadata.</titleref>
          Internet-draft, IETF, 2010.
        </bibl> 

        <bibl id="uriqa"
              href="http://sw.nokia.com/uriqa/URIQA.html">
          Patrick Stickler.
	  <titleref href="http://sw.nokia.com/uriqa/URIQA.html"
           >The URI Query Agent Protocol.</titleref>
          Nokia, 2010.
        </bibl> 

        <bibl id="halpin"
              href="http://lists.w3.org/Archives/Public/public-awwsw/2011Jan/0021.html">
          Harry Halpin.
	  <titleref href="http://lists.w3.org/Archives/Public/public-awwsw/2011Jan/0021.html"
	   >Reversing HTTP Range 14 and SemWeb Cool URIs decision.</titleref>
	   Email to public-awwsw list, 2011.
         </bibl>

        <bibl id="degraauw"
              href="http://www.marcdegraauw.com/2007/02/20/the-referent-convention/">
          Marc de Graauw.
	  <titleref href="http://www.marcdegraauw.com/2007/02/20/the-referent-convention/"
	   >The #referent convention.</titleref>
	   Blog post, 2007.
         </bibl>

        <bibl id="tumarello"
              href="http://lists.w3.org/Archives/Public/www-tag/2007Jul/0034.html">
          Giovanni Tummarello.
	  <titleref href="http://lists.w3.org/Archives/Public/www-tag/2007Jul/0034.html"
	   >http-range-14 303 issue, request for reopening the 
	    discussion.</titleref>
	   Email to www-tag list, 2007.
         </bibl>

        <bibl id="rfc2616"
              href="http://www.ietf.org/rfc/rfc2616.txt">
          R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter,
          P. Leach, and T. Berners-Lee.
	  <titleref href="http://www.ietf.org/rfc/rfc2616.txt"
           >Hypertext Transfer Protocol -- HTTP/1.1.</titleref>
          RFC 2616, IETF, 1999.
        </bibl> 
        


      </blist>

    </div1> 
  </body> 
</spec>


        <!-- 
        <p>
          Variant use case: Same as above, but instead of the earthquake, the
          referent of the URI is to be an 
          information resource that is not accessible on the Web, or at least
          not at any URI known to Alice.  The definition might describe where
          the information resource might be found, and other aspects
          such as bibliographic metadata (author, title, etc.) or SHA1 hash.
        </p>
        <p>
          Variant use case: Same as above, but instead of the earthquake, the
          referent is to be an 
          information resource that <emph>is</emph>
          Web accessible, via a URI known to Alice.
          The definition that Alice
          writes explains that the term is to refer to that information
          resource.  That is, there are <emph>two</emph> information
          resource at play here, one carrying the definition and one
          that's a subject of the definition.  It's important in this
          case to make sure that metadata can be written about either
          information resource.
          (In this situation, which is common in the publishing industry and
          digital archives, Alice's definition is often
          called a "landing page".)
        </p>
	-->




	<!-- 
	<p>
	  [Draft note: HH: I think the answer to this should be a strong
	  "No" and should be 
 	  discouraged, rather than heavily described as currently is. I feel too
 	  much space is used on this example.  JAR: Removed most of
	  it; is it OK now?]
	</p>
        <p>
          Each URI scheme, e.g. mailto:, http:, ftp:, and so on, has
          its own URI scheme registration, accessible via a registry
          maintained by IANA
          <bibref ref="rfc4395"/>.
          A URI scheme registration defines the
          meaning of URIs using that scheme either directly or by
          delegation to additional defining mechanisms.
          For example, the registration for the data: URI scheme
          fully explains the meaning of every URI that uses that
          scheme, while the mailto: scheme registration explains 
	  that each URI refers to a particular mailbox, understood
          relative to the domain name system and the mailbox
          assignments made by each particular host.
        </p>

        <p>
          URN namespaces <bibref ref="rfc3406"/>
 	  work in a similar way.  Each namespace has a
          registration document that is formally reviewed through IETF and
          placed on file
          with IANA.
        </p>
 -->



	    <!-- 
	    <a href="http://blog.iandavis.com/2007/11/17/fragmentation-reprise/"
	      >Ian Davis:</a>
	    The meaning of a hash URI "depends on how you access it, which
	    is nuts. Its as though a word has different meanings
	    depending on whether you read it in a book or have it read
	    out to you." 
	      &mdash; JAR: I think he's talking about the situation where
	    there is content
	    negotiation <emph>and</emph> there is inconsistency between the
	    variants.  The more common problem with content negotiation is
	    that there is no way to know ahead of time which variant 
	    has the definition at all, and thus which one to
	    request in content negotiation.
	  <p>
	    Ian points out that RDF Concepts says:
	    "a URI reference in an RDF graph is treated with respect to
	    the MIME type application/rdf+xml [RDF-MIME-TYPE]. Given an
	    RDF URI reference consisting of an absolute URI and a
	    fragment identifier, the fragment identifer identifies the
	    same thing that it does in an application/rdf+xml
	    <loc href="#representation">representation</loc>
	    of the resource identified by the absolute
	    URI component."
	    and that this appears to conflict with webarch.
	    [Draft note: TBD: try to figure out what is going on here.]
	  </p>
 	    -->

<!-- 
        <label>fixed information resource</label>
        <def>
          A document, image, sound recording, or
          other replicable entity as encoded in
          an octet sequence, together with
          optional brief annotations, such as media type and language,
          intended to guide the interpretation of the
          content.  There is no requirement that a given fixed
          information resource is accessible via any URI.
        </def>
 -->
	  <!-- 
	  <p>
	    The Chicago use case is an extreme version of this - the
	    entity providing access to the Chicago document (Alice) does not
	    even care about providing URIs that refer to Chicago; it is
	    someone having no control over how the URI dereferences (Bob)
	    who needs a reference to Chicago.
	  </p>
	  -->

	<!-- 
        <p>
	  [ADI('http://example/about-eq018', 'http://example/eq018') ?]
        </p>
	 -->

<!-- 
      <div2 id="suffix">
        <head>'Hash URI' with fixed suffix</head> 
        <p>
          This idea attempts to address one reason for using 'hashless'
          URIs instead of fragment identifiers.  Suppose you want to
          combine a large number of local names a, b, c, ... into a
          namespace.  The usual solutions would be to write
          'http://example/namespace#a' (a "hash namespace") or 
          'http://example/namespace/a' (a "hashless namespace").
        </p>
        <p>  
          In the "singleton fragid" approach one would write
          'http://example/namespace/a#' (a null fragment identifier) or
          'http://example/namespace/a#_', using a fixed suffix for every
          URI and varying the part between the namespace prefix and
          the suffix.
        </p>
        <p>
          As in the 303 approach, each URI in the namespace would (or
          could) have its own document, providing a definition for that
          single URI rather than for every URI in the namespace.
        </p>
        <p>
          The choice of fixed fragment identifier (null, "_", or
          something else) is largely a matter of taste.
        </p>
        <p>
          A null fragid precludes the use of qnames to abbreviate such URIs.
          (In particular it would not be possible to use them as
          predicate names in RDF/XML.)
          However, SPARQL, Turtle, and RDFa 
          are being extended to admit CURIEs that include #, making this a
          newly attractive option.
        </p>

        <p>
          To address the "hash gets lost" problem we could explore
          heuristics to automatically replace 'http://example/eq018' with
          'http://example/eq018#' (or 'http://example/eq018#_') when needed.
        </p>
      </div2>
     -->





<!-- Unused stuff. -->


      <!-- 
        <p>
          With any of these methods other than dereferenceable hashless URIs,
          the URI may refer to anything at all, including an
          information resource.  [COMMON MISUNDERSTANDING, not sure
          where this goes in the document.
          <loc href="http://lists.w3.org/Archives/Public/public-lod/2010Nov/0249.html"
          >This email</loc>, for example, gets it wrong; the question is not
          IR vs. NIR, it's about which thing the URI is to refer to,
          IR(u) vs. FV(u).]
        </p>
 -->


<!-- 
        <label>version (of an information resource)</label>
        <def>
          A fixed information resource associated with an information
          resource is a version of the information resource.
          <footnote>
            "Version of" as used here is similar to one of the senses in
            which "representation of" is used in
            discussions of Web architecture.
            We have two reasons to avoid "representation".  One is that
	    "representation" has been used in different ways by
            different parties and it seems wise to avoid risk of
            misinterpretation.  Another is that our versions have to be
            the same kind of thing as the information resource that
            they are versions of, so that they can have metadata.
 	    In most treatments of Web
            Architecture, representations are considered very
            different from information resources and
            do not have the same sorts of properties as information
            resources.
 -->
	    <!-- 
            due to the different ways in which Roy Fielding (in his
            REST work) and Tim Berners-Lee [citation needed] use the word.
            It seems better to avoid the word entirely and use a new
            word to specifically mean the Tim Berners-Lee sense.
 	    -->
	    <!-- 
          </footnote> -->

          <!-- 
          [Cf. TimBL 'fixed resource.']
          [Searching for a new term since Nathan and JAR don't 
          like "representation".
          Consider: version, content+, continent, malcontent, discontent, epresentation, 
          represen-tation, specific information resource, simple information
          resource, fixed resource, specialization.]
          [Consider trying to write the document without any need for
          this word!]
          -->
	  <!-- 
        </def>
     -->
	  <!-- 
          Because of the controversy around this term we will not
          attempt to define it, but rather say that: (1) An
          information resource
          is associated with a set of fixed information 
          resources (its versions). (2)
          An information resource is "similar" to its versions in that
          metadata that applies to each version of an
          information resource applies
          to the information resource itself, and vice versa.
 	  -->

	<!-- 
        <label>FV(u)</label>
        <def> 
          FV(u) is shorthand for the meaning of a URI u
          according to the definition of u in (a version of)
          the information resource IR(u).  For
          example, if IR('http://example/p16') says that 
          'http://example/p16' refers to Alice's canoe,
          then FV('http://example/p16') is Alice's canoe.
          ('FV' stands for 'take at face value'.)
        </def>
 -->


      <!-- 
      <div2>
        <head>Alternative URI schemes and/or URN namespaces</head> 
        <p>
          The purpose of URI scheme registration is to create new
          classes of URIs with meanings specified by the
          registration.  That is, the registration is a definition
          (perhaps partial) of the meanings of the URIs having that
          scheme.
        </p>
        <p>
          One could derive a URI that refers to a canoe from a URI
          that dereferences to a definition (of the derived URI) by
          prepending a particular prefix to
          the URI for the definition, e.g. fv:http://example/about-p16 .
        </p>
        <p>
          tdb: is close to this, but it covers the primary-topic-of
          use case, not the mint-a-term one; the two would not have the
          same behavior.
        </p>
        <p>
          The process for registering a URI scheme is documented in
          RFC 4395, and the process for registering a URN namespace
          in RFC 3406.
        </p>
        <p>
          A problem shared by all non-http URIs is that they won't "work" in
          unmodified browsers.  (But "it's not about 
          browsers," cries Mark Wilkinson.)
        </p>
      </div2>
       -->



        <!-- 
        <p>
          Variant use case: Same as above, but Bob's bibliography
          includes a number of RDF 
          documents, and his metadata includes information relevant
          for making use of those RDF documents.
        </p>
        <p>
          Variant use case: Same as above, but instead of being a 
          person, Bob is a tool that
          is charged with updating all the documents on a Web site with
          license metadata.
        </p>
 	-->

      <!-- 
      <p>
        [Terminology option: Maybe "metadata subject" instead of
        "information resource"??]
      </p>
      -->


        <!-- 
        <p>
          (Why would one be dealing with both kinds of statements 
          at the same time?  Well, the two groups of statements
          might be inserted as RDFa into a
          single HTML document by different tools, or by different
          modules in a content management system.  Or the statements
          might be combined in a single triple store from multiple sources.)
        </p>

        <p>
          (Another way to make sense of this approach is to say that
          URI u refers to IR(u),
          but predicates such as foo:mass and foaf:name have their
          domains expanded 
          to include information resources, and IR(u) is "coerced" to
          FV(u) as needed in order for the predicates to make sense.
          In this view it is the predicates that are the chimeras, not
          the entities they apply to.)
        </p>
        -->

	<!-- 
        <p> 
          Second,
          if the definition of 'http://example/p16'
	  happens to specify an information resource
          other than IR('http://example/p16'), we will end up
          with incorrect statements, since metadata for two distinct
          information 
          resources would be attributed to a single entity.
          Consider, for example, the case where copyright license A applies to
          IR('http://example/p16') and copyright license B applies to
          FV('http://example/p16').  This would lead to both licenses
          being applied to CH('http://example/p16'), which would be
          impossible to interpret correctly, as neither subject is
          such that both licenses apply to it.
          We would have to obtain general agreement that the
          definition at IR('http://example/p16')
          must not
          lead to the URI being understood to refer to
          any information resource other than
          IR('http://example/p16') itself.
        </p>
 	-->

      <!-- 
      <p>
        [Draft note: still thrashing on terminology "definition"
        vs. "documentation" vs. "account"]
      </p>

      <p>
        Languages such as OWL and RDF that
        pervasively use URI-based vocabularies require that
        one be able to refer [mean?], in those languages, to things one
        has to refer to,
        in such a way that the reference will be understood by someone
        encountering the reference.  These references either are URIs
        or are built on URIs, so the problem of referring
        reduces to that of either knowing, or influencing, the way that
        readers will interpret URIs referentially.
      </p>
      -->

	<!-- 
        <p>
          (Any of these methods may be used to define a URI that refers
          to something for which a more specialized generic 
	  definition already exists, for
          example a mailbox (for which there is the mailto: URI scheme
          registration document) or
          an information resource (for which there are the
          registrations for http:, ftp:, gopher:, data:, and so on).
          In theory, an information 
          resource could be specified in a URI definition by
          spelling out the details of its versions, perhaps in RDF.
          However, this is ordinarily not necessary, since usually the
          specialized naming system can be used.)
        </p>
 	-->

	<!-- 
        <p>
          Most URI scheme registrations, such as that for http:, only
          provide a partial definition, and other sources of
          information must be consulted in order to understand a
          particular URI using that scheme.  For example, to
          understand an http: URI, one generally needs to dereference
          it (and even then one only knows a single version of it; see
          <specref ref="ir-ref"/>).
        </p>

	<example>
	  <head>Defining a URI by registering a URI scheme</head>
	  <p>
	    To define a URI to refer to Mount Everest, Alice
	    invents a new URI scheme, say mountain:, and publishes a
	    registration for it via IETF and IANA that says that
	    'mountain:peakxv' refers to Mount Everest.  Bob, on
	    encountering 'mountain:peakxv', checks the IANA URI scheme 
	    registry
	    (which he knows about because the registry is specified
	    by IETF),
	    obtains a link to Alice's 
	    registration for the 'mountain:' scheme, 
	    reads the registration, and is enlightened.
	  </p>
	</example>

  	<p>
	  Practically
          speaking, this approach is very challenging due to the
          rigor of the review process for URI scheme registrations
	  (see <bibref ref="rfc4395"/>).
          Furthermore, Web clients will not understand the new URI
          scheme, making the definition of the URI
          effectively inaccessible for most agents encountering the URI,
	  at least until the mountain: scheme becomes as well known as
          the http: scheme.
        </p>
 -->

	<!-- 
        <label>ADI(u,v)</label>
        <def> 
	  The meaning of URI v, as defined in IR(u).
	  [not sure we need this one.]
        </def>
	 -->
