<?xml version="1.0" encoding="UTF-8"?> 
<?xml-stylesheet type="text/xsl" href="../../../2002/xmlspec/xhtml/1.13/xmlspec.xsl"?>
<!DOCTYPE spec SYSTEM
"../../../2002/xmlspec/dtd/2.10/xmlspec.dtd" [ 
<!--
================================================================
--> 
<!ATTLIST spec xmlns:xlink CDATA #IMPLIED>
<!ENTITY mdash " &#8212; "> 

<!ENTITY draft.day "15"> 
<!ENTITY draft.monthname "April"> 
<!ENTITY draft.year "2009">
]>

<spec xmlns:xlink="http://www.w3.org/1999/xlink" w3c-doctype="wd"> 
  <header>

    <title> Usage Patterns For Client-Side URL parameters 
    </title>

    <w3c-designation>http://www.w3.org/TR/2009/WD-hash-in-url-20090415/</w3c-designation> 
    <w3c-doctype>W3C Working Draft: TAG Finding</w3c-doctype> 
    <pubdate> 
      <day>&draft.day;</day>
      <month>&draft.monthname;</month> 
      <year>&draft.year;</year>
    </pubdate> 
    <publoc> 
      <loc href="http://www.w3.org/TR/2009/WD-hash-in-url-20090415/" >
        http://www.w3.org/TR/2009/WD-hash-in-url-20090415/
      </loc>

    </publoc>
    <altlocs>
      <loc role="xml" href="hash-in-url.xml"
           xlink:type="simple">XML</loc>
    </altlocs>
    <latestloc> 
      <loc
          href="http://www.w3.org/TR/hash-in-url/" > Latest
      Version
      </loc> 
    </latestloc>  
    <authlist> 
      <author>

        <name>T. V. Raman
        </name> 
        <email
            href="mailto:raman@google.com">raman@google.com
        </email> 
      </author>

    </authlist> 
    <status> 

      <p>
<strong>This document is now developed in <em>hash-in-uri.xml</em>. Do not edit this file!</strong>
This document has been developed for
      discussion by the 
      <loc href="http://www.w3.org/2001/tag/">W3C
      Technical Architecture Group
      </loc> and is being published as a
      Public Working Draft in order to get additional input from the
      Web community. This version, dated April 15, 2009, is a follow-up
      to the previous version dated March 20, 2008. Sections that need
      additional work are intentionally left as empty place-holder
      sections so that the Web community gets a sense of where we would
      like to take this document. 
      </p> 
      <p>Publication of this draft
      finding does not imply endorsement by the W3C Membership. This is
      a draft document and may be updated, replaced or obsoleted by
      other documents at any time.
      </p> 
      <p>Please send comments on this
      finding to the publicly archived TAG mailing list 
      <loc
          href="mailto:www-tag@w3.org" >www-tag@w3.org
      </loc> (
      <loc
          href="http://lists.w3.org/Archives/Public/www-tag/"
          >archive
      </loc>).
      </p> 
    </status> 
    <abstract> 
      <p>Designers of URLs
      have traditionally used 
      <code>?
      </code> to encode

      <emph>server-side 
      </emph> parameters. At its inception, the Web
      also introduced fragment identifiers (preceded by 
      <code>#
      </code>)
      as a means of addressing specific locations in a document. As
      highly interactive applications get built using Web parts (HTML,
      CSS and JavaScript component resources that are themselves Web
      addressable; see 
      <bibref ref="tvr-cacm2009"/>), there is an
      increasing need for encoding interaction state as part of the
      URL. The Web is beginning to discover and codify design patterns
      based on fragment identifiers for many of these use cases.
      </p>

      <p>This draft finding is being prepared in response to 
      <loc
          href="http://www.w3.org/2001/tag/group/track/issues/60"> TAG
      issue #60
      </loc>. This document explores the issues that arise in
      this context, and attempts to define best practices that help:
      </p>
      
      <ulist> 
        <item>
          <p>Create URLs for intermediate pages in a
          Web application so that the 
          <emph>back button does the right
          thing
          </emph>
          </p>
        </item> 
        <item> 
          <p>Enable clients to address into
          specific points in a stream of content, e.g., video.
          </p>
        </item>

      </ulist>
      <p>The goal of this finding is to initially collect
      the various usage scenarios that are leading to innovative uses
      of client-side URL parameters, along with the solutions that have
      been developed by the Web community. When this exercise is
      complete, this finding will conclude by ensuring that these
      design patterns are mutually compatible. If some of these usage
      patterns are identified as being in conflict, we will recommend
      best practices that help side-step such conflicts. We encourage
      the wider Web community to point us at emerging usage scenarios
      and design patterns so that we maximize our chances of arriving
      at a final finding that helps move forward the architecture of
      the Web in a self-consistent manner.

      
      </p> 
    </abstract> 
    <langusage> 
      <language id="en-US">English</language> 
    </langusage>

    <revisiondesc> 
      <p>
        <ulist> 
          <item>
            <p>$Id: hash-in-url.xml,v 1.25
            2008/09/04 13:37:28 ht Exp $
            </p>
          </item>          
        </ulist> 
      </p> 
    </revisiondesc> 
  </header>

  
  <body> 
    <div1>
      <head>Introduction
      </head>



      
      <p>At the beginning of the Web, we decided to encode

      <emph>server-side
      </emph> URL parameters with a 
      <code>?
      </code>. At
      the same time, the Web adopted 
      <code>#
      </code> to attach fragment
      identifiers to URLs so that user-agents could address into
      specific locations in an HTML document. Nearly 20 years later,
      the Web has built a strong set of conventions around how URL
      parameters are used. As transactional applications began moving
      on to the Web in the late 1990s, server-side parameters formed a
      core building block for how application state was communicated
      between client and server. In this phase of Web evolution,
      clients were still comparatively simple, and client-side URL
      parameters did not move beyond the use of fragment
      identifiers. But with Web 2.0 applications increasingly moving
      traditional client-side applications to the Web, we are now
      seeing a variety of design patterns beginning to emerge with
      respect to how client-side URL parameters are used in order to
      influence client interaction. The need to remain consistent with
      the prevalent Web architecture has seen these design patterns
      build on the existing mechanism of fragment identifiers in
      URLs. This finding enumerates the various emerging patterns along
      with their associated use cases as a means of documenting
      existing practice on the Web.
      </p> 
    </div1> 
    <div1>
      <head>Use Case
      Scenarios
      </head> 
      <p>This section enumerates the various usage
      scenarios that are leading to innovative uses of client-side URL
      parameters on the Web.
      </p> 
      <div2>
        <head>Addressing Into Multimedia
        Streams
        </head>

        
        <p>When publishing multimedia streams, there is often a need
        to address into specific points in the multimedia stream, e.g.,
        by using a time-index. The simplest means of doing this is to
        pass in the start-time as a server-side parameter in the URL,
        e.g.,

        <code>http://www.example.com/media.stream?start=03:06:09
        </code>
        and have the server begin streaming at 3
        hours, 6 minutes and 9 seconds into the content. This has the
        side benefit of creating a distinct URL for each point
        in the media stream; such URLs can be used to bookmark
        locations of interest. 
        </p> 
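        <p>As a rough sketch of the server-side variant, the following
        JavaScript converts the <code>start</code> parameter of the example
        URL above into an offset in seconds. The helper name
        <code>parseStartOffset</code> and the strict
        <code>HH:MM:SS</code> validation are assumptions made for
        illustration, not part of any deployed service.</p>
        <example>
          <eg><![CDATA[
```javascript
// Hypothetical helper: read the "start" query parameter (HH:MM:SS)
// and convert it to seconds. Returns null when absent or malformed.
function parseStartOffset(urlString) {
  const start = new URL(urlString).searchParams.get("start");
  if (start === null) return null;
  const match = /^(\d{1,2}):([0-5]\d):([0-5]\d)$/.exec(start);
  if (match === null) return null;
  const [hours, minutes, seconds] = match.slice(1).map(Number);
  return hours * 3600 + minutes * 60 + seconds;
}

// 3 hours, 6 minutes and 9 seconds into the stream.
const offset = parseStartOffset(
  "http://www.example.com/media.stream?start=03:06:09");
```
]]></eg>
        </example>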
        <p>It is also possible to leverage
        client-side parameters encoded as part of the URL (using a

        <code>#
        </code>), where this 
        <emph>pseudo
        </emph> fragment
        identifier is used by client-side scripts as an argument to be
        passed to an appropriate 
        <emph>locator
        </emph> function. Consider
        the following example taken from 
        <emph>cnn.com
        </emph>:
        </p>

        
        <example> 
          <eg><![CDATA[<a href="http://www.cnn.com/video/#/video/tech/2008/02/19/vo.aus.sea.spider.ap">
          Giant sea spider filmed deep underwater
          </a>]]></eg> 
        </example>

        <p> CNN uses links like the above for all the topical video
        segments that are published on its site. The URL in this case has
        the following components:</p> 
        <table border="2" cellspacing="0"
               cellpadding="6" rules="groups" frame="hsides"> 
          <thead>

            <tr>
              <th>Component
              </th>
              <th>Value
              </th>
            </tr> 
          </thead> 
          <tbody>

            <tr>
              <td>Protocol
              </td>
              <td>http
              </td>
            </tr>

            <tr>
              <td>Host
              </td>
              <td>www.cnn.com
              </td>
            </tr>

            <tr>
              <td>Path
              </td>
              <td>/video/
              </td>
            </tr> 
            <tr>
              <td>Client
              Param
              </td>
              <td>#/video/tech/2008/02/19/vo.aus.sea.spider.ap
              </td>
            </tr>

          </tbody> 
        </table> 
        <div3>
          <head>Things To Note
          </head>
          <slist> 
            <sitem>The browser is expected to do a GET of the URL
            leading up to the fragment, and the processing application (in
            this case, the JavaScript embedded in the HTML Response) processes
            the portion of the URL following the 
            <code>#
            </code>.
            </sitem>

            <sitem> Note that in the general case, the JavaScript function
            that eventually processes the client param may not have been
            present in the original HTTP Response; it may come from a
            JavaScript library that was loaded by a subsequent
            HTTP GET request triggered by a 
            <code>script
            </code> element in the
            text/html response.
            </sitem> 
            <sitem> The fragment identifier has
            been intentionally identified as a 
            <emph>client
            parameter
            </emph>.
            </sitem> 
            <sitem> Treating it as a regular
            fragment identifier in this usage would result in one incorrectly
            inferring that the URL for the video resource being addressed is

            <code>http://www.cnn.com/video/
            </code>.
            </sitem> 
            <sitem>This would
            result in all the video links on the CNN site getting the same
            URL.
            </sitem> 
            <sitem>Thus, the entire URL in this case is

            <![CDATA[
                     http://www.cnn.com/video/#/video/tech/2008/02/19/vo.aus.sea.spider.ap]]>

            </sitem> 
            <sitem> A consumer of this URL who goes looking for an

            <code>id
            </code> within the 
            <emph>Response
            </emph> that matches the

            <code>#-suffix
            </code> of this URL will fail.
            </sitem> 
            <sitem>The
            reported 
            <emph>Content-Type
            </emph> for the resource is

            <code>text/html
            </code>. However, the behavior of the

            <code>#-suffix
            </code> in this case is not defined by the HTML
            specification.
            </sitem> 
            <sitem>As used, the 
            <code>#-suffix
            </code>
            is a first-class 
            <emph>client parameter
            </emph> in that it gets
            consumed by a 
            <code>script
            </code> that is served as part of the
            HTML document returned by the server upon receiving a GET
            request.
            </sitem> 
            <sitem>This embedded script examines the URL
            available to it as script variable 
            <code>content.location
            </code>,
            strips off the 
            <code>#
            </code> and uses the remainder of the suffix as
            an argument to a function that generates the actual content URL. 
            </sitem>

            <sitem> Having constructed this content URL, the script then
            proceeds to instruct the browser to play the media at the newly
            constructed location.
            </sitem> 
            <sitem>Note further that the
            behavior of a user-agent that does not execute the embedded
            JavaScript is different given this URL, and that the
            HTTP Response headers do not give the client any indication that
            this is likely to be so. 
          </sitem></slist>
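          <p>The steps in the list above can be sketched in JavaScript as
          follows. This is a minimal illustration, not the script that CNN
          actually serves: the function name
          <code>contentUrlFromPageUrl</code>, the host
          <code>media.example.com</code> and the <code>.flv</code> suffix
          are assumptions, since the real mapping from client parameter to
          content URL is site-specific.</p>
          <example>
            <eg><![CDATA[
```javascript
// Sketch of the client-parameter pattern. The media host and the
// ".flv" suffix are hypothetical; the real mapping is site-specific.
const MEDIA_BASE = "http://media.example.com";

function contentUrlFromPageUrl(pageUrl) {
  const hash = new URL(pageUrl).hash;     // "#/video/tech/..." or ""
  if (hash.length <= 1) return null;      // no client parameter present
  const clientParam = hash.slice(1);      // strip the leading "#"
  return MEDIA_BASE + clientParam + ".flv";
}

const cnnPageUrl =
  "http://www.cnn.com/video/#/video/tech/2008/02/19/vo.aus.sea.spider.ap";
const contentUrl = contentUrlFromPageUrl(cnnPageUrl);
```
]]></eg>
          </example>
          <p>A user-agent that does not run such a script never computes the
          content URL; it only ever sees the page URL.</p>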

          
        </div3> 
        <div3>
          <head>Extrapolating From This Pattern
          </head>

          
          <p>The CNN example cited above is not unique with respect to
          its use of 
          <code>#
          </code> within the URL for encoding parameters
          to the receiving application. It shows that in a world of dynamic
          documents, the traditional fragment identifier need no longer be
          an 
          <code>idref
          </code> value that addresses an existing node in
          the serialized HTML making up the HTTP Response. Besides
          possibly being a static 
          <code>idref
          </code>, the fragment
          identifier in the URL can, by the pattern demonstrated here,
          also be either of the following:</p>
          <slist> 
            <sitem>An 
            <code>idref
            </code> to a
            dynamically generated node.
            </sitem> 
            <sitem>A parameter to be

            <emph>consumed
            </emph> by the 
            <emph>application
            </emph> that is
            delivered as the HTTP Response to the original GET
            request.
            </sitem> 
          </slist>
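          <p>A script that supports both interpretations has to decide which
          one applies to a given fragment. The sketch below shows one
          hypothetical way to do so. The names
          <code>classifyFragment</code> and <code>knownIds</code> are
          illustrative, and checking for a matching <code>id</code> first
          is an arbitrary choice made for the example.</p>
          <example>
            <eg><![CDATA[
```javascript
// Hypothetical dispatcher: decide how a fragment should be interpreted.
// knownIds stands in for the ids present in the (possibly dynamically
// generated) document.
function classifyFragment(urlString, knownIds) {
  const fragment = new URL(urlString).hash.slice(1);
  if (fragment === "") return { kind: "none" };
  if (knownIds.has(fragment)) return { kind: "idref", id: fragment };
  return { kind: "parameter", value: fragment };
}

const ids = new Set(["intro", "details"]);
const asIdref = classifyFragment("http://www.example.com/page#intro", ids);
const asParam = classifyFragment("http://www.example.com/page#/video/tech", ids);
```
]]></eg>
          </example>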
        </div3> 
        <div3>
          <head>Architectural
          Questions
          </head> 
          <p>This section enumerates some of the questions
          raised by this design pattern:
          </p> 
          <slist> 
            <sitem> What if the
            returned HTML contains an element that has the same fragment ID
            as the one being used as a client-side parameter &mdash; who
            wins?
            </sitem> 
            <sitem>What should the correct behavior be in the
            face of such conflicts: (1) scroll down to
            that element, (2) play the video, (3) display an error
            message, or (4) do nothing?
            </sitem>
            <sitem>What happens if the receiving client does
            not implement JavaScript, or has had scripting turned
            off?
            </sitem> 
            <sitem>Until now, URLs have been equally useful to
            browsers and non-browser consumers. This pattern demonstrates a
            case where the 
            <emph>URL
            </emph> inferred by browsers vs
            non-browsers is 
            <emph>different
            </emph>. A non-browser that
            receives a URL as in the above, and sees a

            <code>Content-Type
            </code> of 
            <code>text/html
            </code> might assume
            (incorrectly) that the URL for this video resource is

            <code>http://www.cnn.com/video/
            </code>.
            </sitem> 
            <sitem> A
            related question about the meaning of the fragment identifier
            arises when one considers content negotiation. For instance:
            </sitem> 
            <sitem>a) 
            <![CDATA[
                     get application/rdf+xml "http://example.com/exp/#something"]]>

            </sitem>

            
            <sitem>b) 
            <![CDATA[get text/html
                     "http://example.com/exp/#something"]]>
            </sitem> 
            <sitem>Given that
            the fragment identifier leads to a subsequent request, who should
            process the error response if one should be raised by that
            subsequent request?
            </sitem>

            
          </slist>
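          <p>One fact bears on the content-negotiation question: the
          fragment is never sent to the server, so requests a) and b)
          above target the same URL and differ only in their
          <code>Accept</code> header. The sketch below computes the URL
          that is actually requested; <code>requestTarget</code> is a
          hypothetical helper name.</p>
          <example>
            <eg><![CDATA[
```javascript
// Only the part of the URL before "#" is used for the HTTP request;
// the fragment stays on the client.
function requestTarget(urlString) {
  const url = new URL(urlString);
  url.hash = "";                // drop the fragment before sending
  return url.href;
}

const target = requestTarget("http://example.com/exp/#something");
```
]]></eg>
          </example>
          <p>Whatever meaning <code>#something</code> carries must therefore
          be worked out on the client, after the negotiated representation
          has arrived.</p>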
          
          
        </div3>


      </div2> 
      <div2>
        <head>Interaction State And Browser History
        </head>

        <p>AKA 
        <emph>make the back button do the right thing
        </emph>. For
        live examples of this design pattern, see 
        <loc href="http://mail.google.com">GMail</loc> and 
        <loc
            href="http://maps.google.com">Google Maps
        </loc>, both of which take
        extreme care to ensure that the 
        <emph>back button
        </emph> works as
        the user would expect. These applications use  
        <code>iframe
        </code>
        proxies to achieve the desired effect.
        </p> 
      </div2> 
      <div2>

        <head>AJAX Libraries And State Management
        </head> 
        <p>AJAX
        applications use features of Dynamic HTML (DHTML) to create
        highly reactive user experiences. Updates to the Web user
        interface in response to user actions no longer require a full
        page reload. Consequently, the user can perform a sequence of
        interaction steps while remaining on the 
        <emph>same page
        </emph>
        at least as seen from the browser's perspective of

        <code>content.location
        </code>. This makes for a good user
        experience, but it complicates the following:</p>
        <slist> 
          <sitem>Recording
          key points in the interaction flow, e.g., for
          bookmarking.
          </sitem> 
          <sitem>Providing intuitive behavior for the
          browser's history mechanism.
          </sitem> 
          <sitem>Snapshotting
          interaction state so that one can return to a partially completed
          task at a later time.
          </sitem> 
        </slist>


        <p> Today, many of the details of AJAX programming have been
        abstracted away by higher level toolkits such as Dojo 
        <bibref
            ref="dojo"/> and GWT 
        <bibref ref="google-gwt"/>. Management of
        interaction state and browser history is one of the key
        affordances implemented in these libraries. History mechanisms in
        AJAX libraries like GWT and Dojo share a lot in common, and the
        approach can be traced back to 
        <loc
            href="http://code.google.com/p/reallysimplehistory/">Really
        Simple History (RSH)
        </loc>. In addition, the mechanism described
        here has also been adopted by a recent update to GMail. 
        </p> 
        <p>
          The basic premise is to keep track of the application's

          <code>internal state
          </code> in the URL fragment identifier. This
          works because updating the fragment doesn't typically cause the
        page to be reloaded. This approach has several benefits:</p>
        <slist>

          <sitem> It's about the only way to control the browser's history
          reliably.
          </sitem> 
          <sitem> It provides good feedback to the
          user.
          </sitem> 
          <sitem> It's 
          <code>bookmarkable
          </code> &mdash;
          i.e., the user can create a bookmark to the current state and
          save it, email it, or whatever.
          </sitem> 
        </slist>
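        <p>The premise can be illustrated with a small state codec. This is
        a sketch under stated assumptions, not the mechanism used by GWT,
        Dojo or RSH: the function names <code>urlWithState</code> and
        <code>stateFromUrl</code>, the key/value encoding and the example
        application URL are all hypothetical.</p>
        <example>
          <eg><![CDATA[
```javascript
// Hypothetical state codec: store application state as key=value pairs
// in the fragment, so that each state gets its own bookmarkable URL.
function urlWithState(baseUrl, state) {
  const url = new URL(baseUrl);
  url.hash = new URLSearchParams(state).toString();
  return url.href;
}

// Recover the state object from a URL produced by urlWithState.
function stateFromUrl(urlString) {
  const fragment = new URL(urlString).hash.slice(1);
  return Object.fromEntries(new URLSearchParams(fragment));
}

const bookmark = urlWithState("http://www.example.com/app",
                              { view: "inbox", page: "2" });
const restored = stateFromUrl(bookmark);
```
]]></eg>
        </example>
        <p>In a browser, a script would typically assign the encoded string
        to <code>window.location.hash</code> on each state change and decode
        it again when the page is loaded from a bookmark.</p>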



      </div2> 
      <div2>
        <head>Web Command Lines
        </head> 
        <p>When applications
        can be built of Web parts, there is a need to configure them at
        the point the application is launched. Traditional applications
        would call these default start-up or 
        <emph>command-line
        </emph>
        options. We see the equivalent emerging for configuring desktop
        gadgets and widgets where command-line options are passed in via
        URL parameters &mdash; in this context, the URL is the Web
        command-line. For one sample implementation and its associated
        usage, see 
        <loc
            href="http://internet-apps.blogspot.com/2007/11/using-urls-to-pass-parameters-to-web.html">Using
        URLs To Pass Parameters To The Web
        </loc>. Dave Raggett's 
        <loc
            href="http://www.w3.org/Talks/Tools/Slidy/">HTMLSlidy
        </loc> uses
        URLs of the form 
        <code>...#(nn)
        </code> to address into a deck of
        slides.
        </p> 
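        <p>A fragment of the form <code>#(nn)</code> is straightforward to
        interpret on the client. The following sketch assumes that
        <code>nn</code> is a decimal slide number; the function name
        <code>slideNumberFromUrl</code> and the example URL are
        hypothetical, and this is not HTML Slidy's own code.</p>
        <example>
          <eg><![CDATA[
```javascript
// Parse a fragment of the form "#(nn)" into a slide number.
// Returns null when the fragment does not match that form.
function slideNumberFromUrl(urlString) {
  const match = /^#\((\d+)\)$/.exec(new URL(urlString).hash);
  return match === null ? null : Number(match[1]);
}

const slide = slideNumberFromUrl("http://www.example.com/slides.html#(12)");
```
]]></eg>
        </example>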
      </div2> 
      <div2>
        <head>Passing Data Among Frames
        </head>

        <p>Web applications that use multiple frames often need to pass
        data between them. This problem gets even more interesting when
        the child frame displays content from a domain different from
        that of its parent. In this case, the parent and child frames do
        not share any script context &mdash; that would open a cross-site
        scripting hole. A common technique that is used where the parent
        and child have mutually agreed to collaborate is for the parent
        to pass data to the child via a fragment identifier by resetting
        the child's 
        <code> location
        </code> URL. Thus, given a parent
        frame 
        <code>P
        </code> and a child frame 
        <code>C
        </code>, where the
        location URLs 
        <code>U_P
        </code> and 
        <code>U_C
        </code> come from
        different domains, the parent frame might pass data to the child
        by resetting its location URL to 
        <code>U_C#data
        </code>; the child
        picks up this data by polling for changes in its location
        URL. This technique is common in 
        <loc
            href="http://en.wikipedia.org/wiki/Comet_(programming)">Comet
        Programming
        </loc>. As an example, the 
        <loc
            href="http://dojotoolkit.org/node/87">Dojo AJAX toolkit
        </loc>
        uses an 
        <loc
            href="http://www.google.com/search?&amp;q=iframe+proxy&amp;num=25">IFrame
        proxy
        </loc> to enable cross-domain XML HTTP Requests. This is a
        useful technique when writing cross-site mashups. As an example,
        see 
        <loc
            href="http://code.google.com/p/google-axsjax/wiki/Showcase">XKCD
        and AxsJAX
        </loc> &mdash; a cross-site mashup that mashes together
        XKCD comics with their associated transcripts to create a
        speech-friendly XKCD experience.
        </p> 
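        <p>The polling half of this exchange can be sketched as follows.
        To keep the example runnable outside a browser, the child's
        location is simulated with a plain variable; the names
        <code>makeFragmentPoller</code>, <code>getLocation</code> and
        <code>onData</code>, and the host
        <code>child.example.org</code>, are assumptions made for
        illustration.</p>
        <example>
          <eg><![CDATA[
```javascript
// Sketch of the child's polling loop. getLocation is a stand-in for
// reading the frame's own location URL; onData receives each new
// fragment payload exactly once.
function makeFragmentPoller(getLocation, onData) {
  let lastFragment = null;
  return function poll() {
    const fragment = new URL(getLocation()).hash.slice(1);
    if (fragment !== lastFragment) {
      lastFragment = fragment;
      if (fragment !== "") onData(fragment);
    }
  };
}

// Simulated exchange: the "parent" resets the child's location URL.
let childLocation = "http://child.example.org/frame.html";
const received = [];
const poll = makeFragmentPoller(() => childLocation,
                                (data) => received.push(data));

poll();                 // no fragment yet, nothing delivered
childLocation = "http://child.example.org/frame.html#hello";
poll();                 // fragment changed, "hello" is delivered
poll();                 // fragment unchanged, nothing delivered
```
]]></eg>
        </example>
        <p>In a real child frame, <code>poll</code> would be invoked
        periodically, for instance from a timer set up with
        <code>setInterval</code>.</p>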
      </div2> 
      <div2>
        <head>The

        <emph>Naked
        </emph> Hash-Ref
        </head> 
        <p>As the final item in the
        usage scenarios 
        <emph>as seen on the Web
        </emph>, this section
        documents the use of a single 
        <code>#
        </code> sign as the value of
        the 
        <code>href
        </code> attribute on HTML anchors. This can be
        thought of as a 
        <emph>relative URL
        </emph> with a

        <emph>null
        </emph> fragment identifier. Web sites wishing to
        override the 
        <emph>default-target
        </emph> behavior of anchors use
        this when attaching a JavaScript event-handler to anchor elements
        for mouse-clicks. The only justification for placing a naked

        <code>#
        </code> as the value of the 
        <code>href
        </code> attribute
        appears to be to avoid anything showing up on the browser status
        bar as the user activates the link. Note that this idiom also
        creates significant hurdles for non-mouse users of the Web.
        </p>

      </div2> 
    </div1> 
    <div1>
      <head>Recommended Best Practices
      </head>
      <p>This section will be populated upon completion of this finding.
      Note that the preceding sections have identified design patterns without prejudice &mdash; with a view to enumerating the pros and cons of the various idioms seen on the Web today.</p>

    </div1> 
    <div1>
      <head>Affected Communities To Liaise With</head>
      <p>It is clear that we will need to liaise effectively with
      standards groups that are active in defining the formats and
      protocols that come together in turning an HTTP Response into an
      interactive user interface for a Web application. This section
      will be used to track these dependencies,
      and may be removed upon final publication of this document.</p>
      <slist>
        <sitem>The <loc href="http://www.whatwg.org">WHATWG</loc>, which presently defines the behavior of conforming HTML5 Web browsers in conjunction with the W3C HTML WG.</sitem>
        <sitem>The HTTP work in the IETF.</sitem>
      </slist>
    </div1>
    <div1>
      <head>Conclusions
      </head>
      <p>This section will be completed when this finding is ready for final publication as an officially approved TAG  Finding.</p>



    </div1> 
    <div1>
      <head>Pending Work</head>
      <p>This section will track pending work items, including
      technical proposals currently in existence within and outside the
      W3C that are relevant to this issue. As we continue to finalize
      this work, these <emph>pending items</emph> will move into relevant sections of this document from being
      <emph>editorial notes</emph> in this section.</p>

      <div2>
        <head>WHATWG: pushState()</head>
        <p>Here is a link to a proposal that is the topic of ongoing
        discussion in the <loc href="http://www.whatwg.org">WHATWG</loc> for encoding client-side state.
        <ednote>
          <date>May 11, 2009</date>
          <edtext>
            The proposed pushState() allows for changing the whole URL using
            ECMAScript so that the URL exposed to copy-and-paste can still
            make sense in contexts without scripting. It also addresses the
            back button concern &mdash; see <loc href="http://www.whatwg.org/specs/web-apps/current-work/multipage/history.html#dom-history-pushstate">pushState()</loc>.
          </edtext>
        </ednote></p>
      </div2>
    </div1>
    <div1 id="open-issues">
      <head>Open Issues
      </head>


    </div1>


    <div1>
      <head>References
      </head> 
      <blist> 
        <bibl id="www-tag-archive"
              href="http://lists.w3.org/Archives/Public/www-tag/2007Jul/0148.html">
          Mail thread on WWW-TAG from 2007 that initiated some of these
          discussions.
        </bibl> 
        <bibl id="JsonP"
              href="http://ajaxian.com/archives/jsonp-json-with-padding">
          JSONP: JSON With Padding
        </bibl>


        
        <bibl href="http://en.wikipedia.org/wiki/Comet_(programming)"
              id="wikipedia-comet"> Comet Programming from Wikipedia
        </bibl>

        <bibl id="sidewinder-hash"
              href="http://internet-apps.blogspot.com/2007/11/using-urls-to-pass-parameters-to-web.html">
          Mark Birbeck: Using URLs To Pass Parameters To The Web
        </bibl>

        <bibl id="google-gwt" href="http://code.google.com/webtoolkit/">
          Google Web Toolkit &mdash; Java software development framework
          that makes writing AJAX applications like Google Maps and GMail
          easy for developers by taking care of browser and platform
          details. 
        </bibl> 
        <bibl id="tvr-cacm2009"
              href="http://portal.acm.org/citation.cfm?id=1461945"> Toward 2^W
        &mdash; Beyond Web-2.0, T. V. Raman, Communications Of The ACM,
        ACM, New York.


        
        </bibl> 
        <bibl id="dojo" href="http://dojotoolkit.org/"> The
        JavaScript Toolkit by the Dojo Foundation. 
        </bibl> 
      </blist>

    </div1> 
  </body> 
</spec>
