<?xml version="1.0" encoding="UTF-8"?>
<!-- <!DOCTYPE spec SYSTEM "C:\XMLspec\spec-prod\dtd\xmlspec.dtd" [ -->
<!DOCTYPE spec SYSTEM "http://www.w3.org/2002/xmlspec/dtd/2.3/xmlspec.dtd" [
<!-- ================================================================ -->
<!ENTITY mdash " &#8212; ">
<!ENTITY draft.day "1">
<!ENTITY draft.month "11">
<!ENTITY draft.monthname "November">
<!ENTITY draft.year "2006">
<!ENTITY iso6.doc.date "&draft.year;-&draft.month;-&draft.day;">
<!ENTITY http-ident "http://www.w3.org/2001/tag/doc/alternatives-discovery">
]>

<?xml-stylesheet type="text/xsl" href="./xmlspec.xsl"?> 

<spec xmlns:xlink="http://www.w3.org/1999/xlink">
  <header>
    <title>
      On Linking Alternative Representations To Enable Discovery And Publishing
    </title>
    <w3c-designation>
      http://www.w3.org/2001/tag/doc/alternatives-20060524
    </w3c-designation>
    <w3c-doctype>TAG Finding</w3c-doctype>
    <pubdate>
      <day>&draft.day;</day>
      <month>&draft.monthname;</month>
      <year>&draft.year;</year>
    </pubdate>
    <publoc>
      <loc href="http://www.w3.org/2001/tag/doc/alternatives-discovery-20061101.html" >
      http://www.w3.org/2001/tag/doc/alternatives-discovery-20061101.html</loc>
    </publoc>
    <latestloc>
      <loc href="http://www.w3.org/2001/tag/doc/alternatives-discovery.html" > Latest Version</loc> 
    </latestloc>
    <prevlocs>
      <loc
          href="http://www.w3.org/2001/tag/doc/alternatives-discovery-20060915.html">September
      15, 2006</loc>
      <loc
          href="http://www.w3.org/2001/tag/doc/alternatives-discovery-20060620.html">June
      20, 2006</loc>
    </prevlocs>
    <authlist>
      <author>
        <name>T. V. Raman</name>
        <email href="mailto:raman@google.com">raman@google.com</email>
      </author>
    </authlist>
    <status>
      <p>This document has been developed for discussion by the
      <loc href="http://www.w3.org/2001/tag/">W3C Technical
      Architecture Group</loc>. This version, dated November 1, 2006,
      was approved by the TAG at its teleconference on October 31, 2006;
      this approval closes the TAG
      issue <loc href="">Generic-resources-53</loc>.</p>
      <p>Publication of this finding does not imply endorsement
      by the W3C Membership. This is a draft document and may be
      updated, replaced or obsoleted by other documents at any
      time.</p>
      <p>
        <loc href="http://www.w3.org/2001/tag/findings">Additional
        TAG findings</loc>, both approved and in draft state, may
      also be available. </p>
      <p>Please send comments on this finding to the publicly
      archived TAG mailing list <loc href="mailto:www-tag@w3.org"
      >www-tag@w3.org</loc> (<loc
      href="http://lists.w3.org/Archives/Public/www-tag/"
      >archive</loc>).</p>
    </status>
    <abstract>
      <p>
        Content creators wishing to publish multiple versions of a given
        resource on the Web face a number of questions about how the
        URIs for those versions are created, published and discovered.
        Questions include:

        <ulist>
          <item><p>Given a resource
          <code>http://example.com/ubiquity/</code>
          that can be delivered in a multiplicity of representations,
          how should one publish the relevant URIs to enable automatic
          discovery of these representations (AKA <emph>specific resources</emph>)?</p></item>
          <item><p>How does one ensure that the
          <emph>alternative</emph> relationship amongst these
          various representations is available in a  machine
          readable form, and
          consequently <emph>discoverable</emph>?</p></item>
          <item><p>Here, multiple representations might include:
          <slist>
            <sitem>Representations appropriate for different delivery
            contexts</sitem>
            <sitem>Alternative formats of the resource
            distinguished by <code>Content-type</code></sitem>
            <sitem>Different versions of the resource, e.g., by
            language or date</sitem>
            <sitem>Representations in different languages</sitem>
          </slist>
          </p> </item>
        </ulist>


      </p>
      <p>This document explores the issues that arise in this context,
      and attempts to define best practices that help:
      
<ulist>
        <item><p>Preserve the <emph>One Web</emph>  while enabling   content
        publishing to  a multiplicity of delivery contexts.</p>
        </item>
        <item><p>Enable the creation of <emph>RESTful</emph> URIs that
        remain representation agnostic while delivering the correct
        end-user experience.</p>
        </item>
        <item><p>Enable automatic discovery of the available
        representations.</p></item>
<item>
  <p>Enable web crawlers to discover the relationship between a
  given generic resource and the specific resources that correspond to
  its various alternatives.
  This will help search engines build better Web indices and avoid
  the need to index all available alternatives of a given resource.</p>
</item>
      </ulist>

      
    </p></abstract>
    <langusage>
      <language/>
    </langusage>
    <revisiondesc>
      <p><ulist>
        <item><p>$Id: alternatives-discovery-20061101.xml,v 1.5 2006/11/01 17:36:39 vquint Exp $</p></item>
        <item><p>Updated by TVR after June F2F.</p></item>
        <item><p>
          Created May 2006 by TVR for discussion at the June 2006 F2F.
        </p></item>
      </ulist>
      </p>
    </revisiondesc>
  </header>

  <body>
    <div1>
      <head>Introduction</head>
      
      <p>There has always been a need to serve user-agent
      specific content for a given URI &mdash; thus highlighting the
      distinction between <emph>Resource</emph> and
      <emph>Representation</emph> on the Web. The increasing
      importance of the mobile, multilingual Web makes this
      requirement even stronger. At the same time, published
      content (and its various representations) needs to be
      <emph>discoverable</emph> on the Web; as an example,
      crawlers and web-bots need to be able to discover the
      availability of alternative forms of a given
      resource. Documents published on the Web become
      <emph>discoverable</emph> via the hyperlinked structure of
      the Web; to enable discovery of alternative
      representations, the relation between these multiple
      representations needs to be captured by the hyperlink
      structure of the Web. This finding enumerates some of the
      issues faced by content creators on the Web today and
      proposes a sequence of best practices to foster the
      following long-term goals:</p>
      
      <olist>
        <item><p>Preserve a <emph>Single Web</emph>
        i.e., a Web where content is universally accessible from a
        variety of end-user devices.</p></item>
        <item><p>Ensure that the <emph>One Web</emph>
        enables the easy exchange of resources (and pointers to resources)
        across its different facets, i.e.,
        mobile and desktop users should be able to share references to Web Resources (URIs)
        with the accessing user being able to retrieve an
        appropriate representation (specific resource).</p></item>
        <item><p>Ensure that contents published to a given
        facet of the Web are <emph>linkable</emph>,
        <emph>discoverable</emph>, <emph>crawlable</emph>,
        <emph>searchable</emph> and
        <emph>browsable</emph>  from any of its
        other facets.</p></item>
        <item><p>Enable content providers to clearly advertise the
        relationship between a given generic resource and the
        various specific resources that correspond to the
        available alternatives for that generic resource.
        </p></item>
      </olist>
    </div1>
    <div1>
      <head>Use Case Scenarios</head>

      <p>This section enumerates the candidate use case scenarios
      along with  accompanying issues and suggested solutions.
      See the next section for
      recommended best practices that  are a generalization of
      these solutions.</p>


      <p>The owners of <code>http://example.com/ubiquity</code>
      would like to publish their content to a wide variety of end-user
      devices ranging from desktop Web browsers to mobile devices such
      as cell-phones and PDAs.
      They also serve multiple geographies using different languages.
      They know about the different markup language variants that are
      currently in vogue on these devices, and are capable of
      generating the  representation that is most appropriate for the
      accessing user-agent. In publishing their content and associated
      URIs, they face the following issues.
      </p>


      <div2>
        <head>Publishing Desktop And Mobile Versions</head>
        <p>
          Given a generic resource <code>http://example.com/ubiquity/resource</code>
          with corresponding alternatives for a desktop browser, a PDA
          and a cell-phone:
          
          <ulist>
            <item><p>Should the different alternatives have distinct URIs?</p></item>
            
<item><p>Should the generic resource have a single URI that delivers the appropriate
            representation?</p></item>
            <item><p>If publishing  distinct URIs for the resource and its
            various representations, how should the relationship
            between these URIs be expressed in a discoverable,
            machine-readable form? How  should this relationship
            be reflected in the hyperlink structure of the Web? </p></item>
          </ulist>
        </p>    
        
        
        

        <div3>
          <head>Suggested Solution</head>

          <p>We suggest the following approach for this situation:

          <olist>
            <item><p>Create representation-specific URIs
            (specific resources) for each available
            alternative (<code>representation_i</code>), e.g.,
            <code>http://example.com/ubiquity/resource/representation_i</code>.</p></item>
            <item><p>If no content negotiation is in place,
            serve a canonical representation (generic resource) of the content at
            <code>http://example.com/ubiquity/resource</code>.</p></item>
            <item><p>At that same URI,
            use HTTP content negotiation, together with the correct
            HTTP <code>Vary</code> header, to serve the appropriate representation
            at access time. Ensure that the <code>Vary</code> header names the
            request headers that were used to choose the representation that is
            being served; this is important for correct behavior when
            using caching proxies.</p></item>
            <item><p>
              As an alternative to the previous step,
              arrange for the server to generate an <code>HTTP
              302 (Found)</code>
              redirect that automatically serves up
              <code>http://example.com/ubiquity/resource/representation_i</code>
              when <code>http://example.com/ubiquity/resource</code> is accessed by
              <code>user-agent_i</code>.
              This form of redirect involves an extra client/server
              round-trip, and may therefore be suboptimal for mobile devices.
              This is a <emph>temporary</emph> redirect;
              the accessing user-agent should continue to use the canonical URI
              when creating bookmarks or emailing URIs.
              Doing so ensures that later uses of the URI produce the
              expected end-user results; e.g., in the following scenario:
              <slist>
                <sitem>Cell-phone user emails link</sitem>
                <sitem>Recipient opens message on a desktop</sitem>
                <sitem>Clicks on the link</sitem>
              </slist>
              The user following the link from inside the email message on a
              desktop browser should receive the desktop version, and not the
              mobile version. Notice that passing around the canonical URI is
              critical in achieving this behavior.
              Finally, note that to optimize link traversal out of the resulting
              document, the content provider might wish to rewrite relative
              links to point at the specific resource.
            </p>
            <p>Additionally, contrast this solution with using HTTP
            content negotiation with the <code>Vary</code> header; redirecting to the
            URI of a specific resource has the advantage of freezing
            all parameters that were used to choose that representation into
            the URI.</p>
            </item>
            <item><p> Use linking mechanisms provided by the
            representation being served to create
            <emph>links</emph> to the other available
            representations. As an example, when using HTML, one
            might use <code>a</code> and <code>link</code>
            elements to advertise the availability of alternate
            representations. In this context, note that there are
            two distinct types of such links:
            <slist>
              <sitem>Links for human consumption that are to be
              presented to the user</sitem>
              <sitem>And links for machine consumption, that are
              used by the user agent to provide additional functionality.</sitem>
            </slist>
            As an example, links to available alternatives meant
            for human consumption might use the HTML
            <code>a</code> element since these are rendered by
            user-agents. In contrast, links meant for use by bots
            might use the HTML <code>link</code> element &mdash;
            as an example, this reflects present practice when
            publishing pointers to 
            Atom/RSS feeds.
          </p>
          <p>In either case, notice that following these steps
          creates a
          mini-graph comprising the canonical URI and the URIs
          for its various representations.
          </p>
            </item>
          </olist>
        </p>
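        <p>The negotiation and redirect steps above can be sketched as the
        following HTTP exchanges. This is an illustration under stated
        assumptions, not a normative recipe: the <code>User-Agent</code> value,
        the media type and the <code>mobile</code> path segment (standing in
        for <code>representation_i</code>) are hypothetical.</p>
        <eg><![CDATA[
(a) Content negotiation at the canonical URI
    (hypothetical user agent and representation name)

GET /ubiquity/resource HTTP/1.1
Host: example.com
User-Agent: ExampleMobileBrowser/1.0

HTTP/1.1 200 OK
Content-Type: application/xhtml+xml
Content-Location: http://example.com/ubiquity/resource/mobile
Vary: User-Agent

(b) Alternative: temporary redirect to the specific resource

GET /ubiquity/resource HTTP/1.1
Host: example.com
User-Agent: ExampleMobileBrowser/1.0

HTTP/1.1 302 Found
Location: http://example.com/ubiquity/resource/mobile
Vary: User-Agent
]]></eg>
        <p>The linking step might then look as follows inside the mobile
        representation. The <code>desktop</code> path segment, the
        <code>media</code> value and the link text are likewise assumptions
        made for the example.</p>
        <eg><![CDATA[
<head>
  <!-- For machine consumption: hypothetical desktop alternative -->
  <link rel="alternate" media="screen"
        href="http://example.com/ubiquity/resource/desktop">
</head>
<body>
  <!-- For human consumption: rendered by the user agent -->
  <p><a href="http://example.com/ubiquity/resource/desktop">Desktop version</a></p>
  <!-- Link back to the canonical URI (generic resource) -->
  <p><a href="http://example.com/ubiquity/resource">Canonical link to this page</a></p>
</body>
]]></eg>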
      </div3>
    </div2>


    <div2>
      <head>Publishing In Multiple Languages</head>
      <p>
        The owners of <code>http://example.com/global</code>
        publish their content in a multiplicity of languages.
        They wish to publish any given announcement at a
        <emph>canonical</emph> URI,
        while retaining the ability to serve up a version in a language
        that is most appropriate for the user.
        Further, they wish to create URIs for 
        each available language to facilitate hyperlinking and
        discovery. At the same time, they do not wish to hard-wire the
        language in which a given announcement is accessed when such URIs
      are passed around by end-users.</p>

      <div3>
        <head>Suggested Solution</head>

        <p>For a design pattern that has worked well over the years, see
        the W3C practice of publishing press releases in multiple
        languages. Here are its salient characteristics:</p>
        <olist>
          <item><p>
            Press releases are announced with a <emph>canonical</emph> URI.</p>
          </item>
          <item><p>
            Accessing this <emph>canonical</emph> URI with the appropriate
            <code>Accept-Language</code> header results in an automatic redirect
          that delivers the document in the desired language.</p></item>
          <item><p>
            Each language version
            contains links to URIs that in turn serve
            a representation in one of the other available languages.
          </p></item>
          <item><p> Since these translations are typically for
          human consumption, these  links 
          are encoded as HTML
          <code>a</code> elements so that they get displayed in
          browsers.</p></item>
        </olist>
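        <p>A minimal sketch of this pattern, assuming the language preference
        arrives in the HTTP <code>Accept-Language</code> request header. The
        <code>announcement</code> path and the <code>.ja</code> and
        <code>.en</code> suffixes are hypothetical names chosen for
        illustration; they are not taken from the W3C site.</p>
        <eg><![CDATA[
Request for the canonical URI (hypothetical path):

GET /global/announcement HTTP/1.1
Host: example.com
Accept-Language: ja

Redirect to the language-specific URI (hypothetical suffix):

HTTP/1.1 302 Found
Location: http://example.com/global/announcement.ja
Vary: Accept-Language

Link inside the Japanese version, pointing at another language:

<a href="http://example.com/global/announcement.en" hreflang="en">English</a>
]]></eg>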
      </div3>
    </div2>

    <div2>
      <head>Publishing Continuously Updating Content</head>
      <p>
        The owners of <code>http://example.com/blogosphere/current</code>
        publish up-to-date content. Once published, they would like users
        to be able to reliably bookmark the published content.
        At the same time, they would like end-users always to be able to
        access a <emph>canonical</emph> URI when looking for the most
      recently published content.</p>

      <div3>
        <head>Suggested Solution</head>
        <p>The issue identified here has been faced and successfully
        solved by the blogging community during the last few years.
        </p>
        <olist>
          <item><p>
            Accessing a blog's <emph>canonical</emph> URI retrieves recent
          posts.</p></item>
          <item><p>
            Posted items have a <emph>bookmark</emph> or
            <emph>permalink</emph> pointer that can be used to reliably
          access postings from the past.</p></item>
          <item><p> Pointers to alternative content are encoded
          as <code>link</code> elements. This enables agents
          such as blog-readers, content-aggregators and Web
          crawlers to discover the availability of alternative
          versions. Note that this design pattern is widely
          deployed on the Web in the context of RSS/Atom feeds
          to advertise permalinks and other pointers and so make
          them discoverable. In the case of RSS/Atom feeds,
          this has enabled Web sites to embed such links within
          the <code>head</code> element of HTML pages, and have
          them <emph>revealed</emph> to the user by Web browsers
          that are capable of consuming such feeds. </p></item>
        </olist>
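        <p>As a sketch of this pattern, the HTML page served at the canonical
        URI might carry markup along the following lines. The feed location,
        the permalink path and the titles are hypothetical.</p>
        <eg><![CDATA[
<head>
  <title>Example blog</title>
  <!-- For machine consumption: hypothetical Atom feed location -->
  <link rel="alternate" type="application/atom+xml" title="Atom feed"
        href="http://example.com/blogosphere/current/feed.atom">
</head>
<body>
  <!-- For human consumption: hypothetical permalink to a single posting -->
  <h2><a rel="bookmark"
         href="http://example.com/blogosphere/2006/11/01/example-post">Example post</a></h2>
</body>
]]></eg>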
      </div3>
    </div2>
  </div1>
  <div1>
    <head>Recommended Best Practices</head>
    
    <p>As can be seen from the use-cases and suggested solutions
    enumerated in the previous section,
    pointers to Web Resources (URIs) can:
    <ulist>
      <item><p>Be canonical URIs, i.e., have no context
      hard-wired. Such canonical URIs identify a generic resource.</p></item>
      <item><p>Encapsulate partial context, e.g.,
       language. Such URIs identify a specific resource that is
       one  possible alternative of a generic resource.</p>
      </item>
      <item><p>Encapsulate multiple context bits, e.g., language and
      device profile. </p></item>
      <item><p>Capture <emph>all</emph>  context, i.e., 
      the creator of the URI <emph>guarantees</emph> that all state is
      completely captured by the URI.</p></item>
    </ulist>
    </p>
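    <p>To make these levels concrete, one possible URI layout is sketched
    below. The path segments <code>en</code> and <code>mobile</code> are
    assumptions made for illustration; this finding does not prescribe a
    particular URI syntax.</p>
    <eg><![CDATA[
http://example.com/ubiquity/resource             no context hard-wired (generic resource)
http://example.com/ubiquity/resource/en          partial context: language only
http://example.com/ubiquity/resource/en/mobile   multiple context bits: language and device profile
]]></eg>
    <p>If language and device profile are the only inputs used to select a
    representation, the last URI also captures all context.</p>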
    <p>
      Our primary take-aways from these observations are:
      <ulist>
        <item><p>URIs are cheap; we suggest creating as many distinct
        URIs as is meaningful.</p></item>
        <item><p>The hyperlink structure of the Web is crucial
        for content discovery; when creating a multiplicity of
        URIs for a given canonical resource, ensure that the
        relationship amongst these multiple URIs is captured by
        the hyperlink structure of the content. This will ensure
        that Web user-agents (both human-facing as well as web
        crawlers) are able to discover the various available
        alternatives and even more importantly, discover the
        inter-relationship amongst these specific resources,
        and their mutual relationship to the generic resource.</p></item>
        <item><p> Encourage users and user-agents to work with
        canonical URIs; leave it to the underlying
        infrastructure to generate appropriate redirects in
        order to serve users the appropriate representation
        (specific resource). For each such available
        representation that is generated as a function of user
        context, ensure that there is a URI that can reproduce
        that representation (specific resource) in the absence
        of user context; or equivalently: for every
        representation, ensure that there is a URI that
        hard-wires all user context e.g., language, device
        preference etc., required to generate that
specific resource.</p></item>
      </ulist>
    </p>
  </div1>
  <div1>
    <head>Conclusions</head>

    <p>Principal conclusions:</p>
    <ulist>
      <item><p>URIs are cheap. Create them as needed, publish them to
      the Web, and ensure that they are appropriately linked in to the
      rest of the Web.
      Thus,
      each representation of interest should
      get its own URI (become a specific resource)
      and there should be one additional URI representing the
      generic resource.</p></item>
      <item><p>Enable discoverability of alternative representations by
      leveraging the hyperlink structure of the Web.
      Thus, 
      given one of the alternatives for a resource, ensure that one can reach the corresponding <emph>generic
      resource</emph>
      by traversing a contained hyperlink.
      When  creating a <emph>generic resource</emph>
      with multiple alternatives, encode hyperlinks to the available
      alternatives with the generic resource. This will enable crawlers
      and other web agents to discover the availability of these
      alternatives,
      and to establish the correct semantic linkage amongst the various
      alternatives.
      </p></item>
      <item><p>Hyperlinks can be designed either for human consumption
      (HTML <code>a</code> element), purely for machine consumption
      (HTML <code>link</code> element), or both.
      To maintain a single Web, 
      ensure that the hyperlink structure of the Web is leveraged to
      create a graph structure whose transitive closure includes all
      available representations of a given generic resource.
    </p>
      </item>
    </ulist>
  </div1>
<div1 id="open-issues">
<head>Open Issues</head>
<p>This finding has highlighted the need to capture the
relationship between a generic resource and its specific
alternatives. We have illustrated such linking using the present
practice of using <code>link</code> elements with an appropriate
<code>rel</code> attribute.
It would be useful for groups defining various hypertext formats
 to arrive at a common set of values for the <code>rel</code>
 attribute 
that appropriately capture the various types of relationships
that are envisioned between a generic resource and its specific
alternatives. For some initial ideas, see <loc
href="#generic-ontology">Web Architecture: Generic
Resources</loc>,
which sketches an ontology;
also see <loc
href="http://www.w3.org/2001/tag/issues.html#standardizedFieldValues-51">TAG
Issue 51 (Standardized Field Values)</loc>.
</p>
</div1>
  <div1 role="appendix">
    <head>Figures</head>
    <example>
      <graphic  alt="Illustrates multiple representations forming a connected graph with the generic resource at the center."
                source="af0x.png"/>
      <p>This figure shows a <emph>Generic Resource</emph> along with
      its multiple representations. In addition to its generic
      representation, the resource is available in print and mobile
      versions in both English and Japanese. URIs are assigned to each
      of these possible representations, and the illustration shows
      that these individual representations (specific resources) have
      links to/from the <emph>Generic Resource</emph>. Additional
      dotted arcs indicate that the content provider may create
      additional links that connect specific resources.</p>
    </example>
  </div1>
  
  <div1>
    <head>References</head>
    <blist>
      <bibl id='mwi'
            href='http://www.w3.org/TR/2006/CR-mobile-bp-20060627/' >
        Jo Rabin, Charles McCathieNevile.
        <titleref>Mobile Web Best Practices 1.0</titleref>. W3C, June 27,
      2006.</bibl>
      <bibl id="dics"
            href="http://www.w3.org/TR/cselection/">
        Rhys Lewis, Roland Merrick.
        <titleref>Content Selection For Device Independence</titleref>.
      May 2, 2005.</bibl>
      <bibl id="generic-ontology"
            href="http://www.w3.org/DesignIssues/Generic">
        Tim Berners-Lee.
        <titleref>Web Architecture: Generic Resources</titleref>.
      </bibl>
      <bibl id="metadata31"
            href="http://www.w3.org/2001/tag/doc/metaDataInURI-31">
        Noah Mendelsohn, Stuart Williams.
        <titleref>The Use Of MetaData In URLs</titleref>. W3C, September
        16, 2006.
      </bibl>
    </blist> 

  </div1>
</body>
</spec>

