Copyright © 2006 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.
Content creators wishing to publish multiple versions of a given resource on the Web face a number of questions with respect to how such URIs are created, published and discovered. Questions include:
Given a resource http://example.com/newspaper that can be delivered in a multiplicity of representations, how should one publish the relevant URIs to enable automatic discovery of these representations?
Here, multiple representations might include:
Representations appropriate for different delivery contexts,
Representations in different languages,
to name a few.
This document explores the issues that arise in this context, and attempts to define best practices that help to:
Preserve the One Web while enabling content publishing to a multiplicity of delivery contexts.
Enable automatic discovery of the available representations.
Enable the creation of RESTful URIs that remain representation agnostic while delivering the correct end-user experience.
Editors DRAFT
This document has been developed for discussion by the W3C Technical Architecture Group. This finding addresses the TAG issue Generic-Resources-53.
The content of this document is intended for discussion and does NOT necessarily represent a consensus position of the TAG. An informal guide to previous discussion of this topic is available and may be useful to reviewers of this draft.
The terms MUST, MUST NOT, SHOULD, and SHOULD NOT are used in this document in accordance with [RFC2119].
Publication of this finding does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time.
Additional TAG findings, both approved and in draft state, may also be available.
Please send comments on this finding to the publicly archived TAG mailing list www-tag@w3.org (archive).
1 Introduction
2 Use Case Scenarios
2.1 Publishing Desktop And Mobile Versions
2.1.1 Suggested Solution
2.2 Publishing In Multiple Languages
2.2.1 Suggested Solution
2.3 Publishing Continuously Updating Content
2.3.1 Suggested Solution
3 Recommended Best Practices
4 Conclusions
5 References
There has always been a need to serve user-agent-specific content for a given URI, which highlights the distinction between Resource and Representation on the Web. The increasing importance of the mobile, multilingual Web makes this requirement even stronger. At the same time, published content (and its various representations) needs to be discoverable on the Web; as an example, crawlers and web-bots need to be able to discover the availability of alternative representations of a given resource. Documents published on the Web become discoverable via the hyperlinked structure of the Web; to enable discovery of alternative representations, the relation between the multiple representations needs to be captured by the hyperlink structure of the Web. This finding enumerates some of the issues faced by content creators on the Web today and proposes a set of best practices to foster the following long-term goals:
Preserve a Single Web, i.e., a Web where content is universally accessible from a variety of end-user devices.
Ensure that the One Web enables the easy exchange of resources (and pointers to resources) across its different facets, i.e., mobile and desktop users should be able to share references to Web Resources (URIs) with the accessing user being able to retrieve an appropriate representation.
Ensure that contents published to a given facet of the Web are linkable, discoverable and browsable from any of its other facets.
This section enumerates the candidate use case scenarios along with accompanying issues and suggested solutions. See the next section for recommended best practices that are a generalization of these solutions.
The owners of http://example.com/ubiquity would like to publish their content to a wide variety of end-user devices ranging from desktop Web browsers to mobile devices such as cell-phones and PDAs. They also serve multiple geographies using different languages. They know about the different markup language variants that are currently in vogue on these devices, and are capable of generating the representation that is most appropriate for the accessing user-agent. In publishing their content and associated URIs, they face the following issues:
Given resource http://example.com/ubiquity/resource with corresponding representations for a desktop browser, a PDA and a cell-phone, should these different representations:
Have distinct URIs?
Have a single URI that delivers the appropriate representation?
In either of the above cases, how should the availability of multiple representations be advertised to enable discoverability?
If publishing URIs for the resource and its various representations, how should the relationship between these URIs be expressed in a discoverable, machine-readable form?
We suggest the following approach for this situation:
Create representation-specific URIs for each available representation (representation_i), e.g., http://example.com/ubiquity/resource/representation_i.
Serve a canonical representation of the content at http://example.com/ubiquity/resource
Using content negotiation, arrange for the server to generate an HTTP 302 (moved temporarily) redirect that automatically serves up http://example.com/ubiquity/representation_i when http://example.com/ubiquity is accessed by user-agent_i. Because this is a temporary redirect, the accessing user-agent should continue to use the canonical URI when creating bookmarks or emailing the URI. This ensures that later uses of the URI produce the expected end-user results; e.g., in the following scenario:
Cell-phone user emails link,
Recipient opens message on a desktop,
Clicks on the link.
The user following the link from inside the email message on a desktop browser should receive the desktop version, and not the mobile version. Notice that passing around the canonical URI is critical in achieving this behavior.
Use linking mechanisms provided by the representation being served to create links to the other available representations. As an example, when using HTML, one might use the a and link elements to advertise the availability of alternate representations. In this context, note that there are two distinct types of such links:
Links for human consumption,
And links for machine consumption.
As an example, links to available alternatives meant for human consumption might use the HTML a element since these are rendered by user-agents. In contrast, links meant for machine consumption, e.g., Atom/RSS feeds, might use the HTML link element.
In either case, notice that following these steps creates a mini-graph consisting of the canonical URI and the URIs for its various representations.
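The steps above might be sketched as follows. This is a minimal illustration under stated assumptions, not a prescribed implementation: the device classes, the per-representation URI suffixes, the keyword-based User-Agent classification and the function names are all hypothetical and chosen only for the example.

```python
# Hypothetical URIs: the path suffixes below are illustrative assumptions.
CANONICAL = "http://example.com/ubiquity/resource"
REPRESENTATIONS = {
    "desktop": CANONICAL + "/desktop",
    "pda": CANONICAL + "/pda",
    "cellphone": CANONICAL + "/cellphone",
}


def classify_user_agent(user_agent):
    """Crude keyword match on the User-Agent string (assumption: a real
    deployment would rely on a far more complete device description)."""
    ua = user_agent.lower()
    if "pda" in ua or "palm" in ua:
        return "pda"
    if "mobile" in ua or "phone" in ua:
        return "cellphone"
    return "desktop"


def respond(requested_uri, user_agent):
    """Return (status, headers) for a request to the canonical URI."""
    if requested_uri != CANONICAL:
        return 404, {}
    target = REPRESENTATIONS[classify_user_agent(user_agent)]
    # A temporary redirect keeps the canonical URI as the one to bookmark and share.
    return 302, {"Location": target, "Vary": "User-Agent"}


def alternate_links(current):
    """HTML link elements pointing at every representation except `current`."""
    return [
        '<link rel="alternate" title="%s" href="%s">' % (name, uri)
        for name, uri in REPRESENTATIONS.items()
        if name != current
    ]
```

In the email scenario, the cell-phone user shares CANONICAL rather than the cell-phone URI, so calling respond(CANONICAL, ...) with the recipient's desktop User-Agent redirects to the desktop representation.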
The owners of http://example.com/global publish their content in a multiplicity of languages. They wish to publish any given announcement at a canonical URI, while retaining the ability to serve up a version in a language that is most appropriate for the user. Further, they wish to create URIs for each available language to facilitate hyperlinking and discovery. At the same time, they do not wish to hard-wire the language in which a given announcement is accessed when such URIs are passed around by end-users.
For a design pattern that has worked well over the years, see the W3C practice of publishing press releases in multiple languages. Here are its salient characteristics:
Press releases are announced with a canonical URI.
Accessing this canonical URI with the appropriate Accept-Language header results in an automatic redirect that delivers the document in the desired language.
Each language version contains pointers to available languages.
Since these translations are typically for human consumption, they are encoded as HTML a elements so that they get displayed in browsers.
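The language selection described above might be sketched as follows. This is only an illustration: the canonical URI, the set of available languages, the default language and the pattern of appending a language suffix to the canonical URI are assumptions made for the example, and the parsing covers only the common tag;q=value form of the Accept-Language header.

```python
# Hypothetical canonical URI and language set, chosen for illustration only.
CANONICAL = "http://example.com/global/announcement"
AVAILABLE = {"en", "fr", "ja"}
DEFAULT = "en"


def parse_accept_language(header):
    """Return language tags from an Accept-Language value, best first."""
    prefs = []
    for item in header.split(","):
        tag, _, params = item.strip().partition(";")
        if not tag:
            continue
        quality = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality > 0:
            prefs.append((tag.strip().lower(), quality))
    # sorted() is stable, so tags with equal q keep their header order.
    return [tag for tag, _ in sorted(prefs, key=lambda p: -p[1])]


def language_uri(header):
    """Pick the language-specific URI to redirect to from the canonical URI."""
    for tag in parse_accept_language(header):
        # Try the full tag first, then its primary subtag (fr-ca -> fr).
        for candidate in (tag, tag.split("-")[0]):
            if candidate in AVAILABLE:
                return CANONICAL + "." + candidate
    return CANONICAL + "." + DEFAULT
```

Because the redirect target is computed per request, end-users can keep passing around the canonical URI without hard-wiring a language into it.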
The owners of http://example.com/blogosphere/current publish up-to-date content. Once published, they would like users to be able to reliably bookmark the published content. At the same time, they would like end-users to be able to always access a canonical URL when looking for the most recently published content.
The blogging community has faced and successfully solved the issue identified here during the last few years:
Accessing a blog's canonical URI retrieves recent posts.
Posted items have a bookmark or permalink pointer that can be used to reliably access postings from the past.
Pointers to alternative content are encoded as link elements, and the same mechanism is used within RSS/Atom feeds to advertise permalinks and other pointers to make them discoverable.
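To illustrate how a crawler or other automated agent might discover such pointers, the following sketch collects link elements from an HTML page using Python's standard html.parser module. The sample page, the feed URI and the class name are hypothetical and serve only as an example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (rel, type, href) for every link element that has both rel and href."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        if attributes.get("rel") and attributes.get("href"):
            self.links.append(
                (attributes["rel"].lower(), attributes.get("type"), attributes["href"])
            )


# Hypothetical page served at the canonical URI.
SAMPLE_PAGE = """
<html><head>
<title>Current posts</title>
<link rel="alternate" type="application/atom+xml"
      href="http://example.com/blogosphere/current/feed">
</head><body><a href="http://example.com/blogosphere/2006/post-1">Permalink</a></body></html>
"""

collector = LinkCollector()
collector.feed(SAMPLE_PAGE)
```

Here the a element remains visible to human readers, while the link element is picked up only by the machine-oriented collector.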
As can be seen from the use-cases and suggested solutions enumerated in the previous section, pointers to Web Resources (URIs) can either:
Be canonical URIs, i.e., have no context hard-wired,
Encapsulate partial context, e.g., encapsulate language,
Encapsulate multiple context bits, e.g., language and device profile,
Capture all context, i.e., the creator of the URI guarantees that all state is completely captured by the URI.
Our primary take-aways from these observations are:
URIs are cheap; we suggest creating as many distinct URIs as is meaningful.
The hyperlink structure of the Web is crucial for content discovery. When creating a multiplicity of URIs for a given canonical resource, ensure that the relationship amongst these multiple URIs is captured in a machine-readable form.
Encourage users and user-agents to work with canonical URIs; leave it to the underlying infrastructure to generate appropriate redirects in order to serve users the appropriate representation. For each such available representation that is generated as a function of user context, ensure that there is a URI that can reproduce that representation in the absence of user context. Equivalently: for every representation, ensure that there is a URI that hard-wires all of the user context (e.g., language, device preference) required to generate that representation.
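The last practice can be checked mechanically. The sketch below is one possible way to do so, assuming a hypothetical URI scheme in which each piece of context becomes a path segment; the language and device lists, the segment names and the function names are invented for the example.

```python
from itertools import product

# Hypothetical canonical URI and context dimensions.
CANONICAL = "http://example.com/ubiquity/resource"
LANGUAGES = ("en", "fr")
DEVICES = ("desktop", "cellphone")


def uri_for(language=None, device=None):
    """Build a URI carrying none, some, or all of the user context."""
    parts = [CANONICAL]
    if language:
        parts.append("lang-" + language)
    if device:
        parts.append("device-" + device)
    return "/".join(parts)


def fully_qualified_uris():
    """Map every (language, device) combination to a URI hard-wiring both."""
    return {
        (language, device): uri_for(language, device)
        for language, device in product(LANGUAGES, DEVICES)
    }


def missing_representations(published):
    """Return the context combinations that lack a fully qualified URI."""
    return [ctx for ctx in product(LANGUAGES, DEVICES) if ctx not in published]
```

A publisher could run missing_representations over the table of URIs it actually serves; an empty result means every representation is reachable without relying on user context.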
Contrast these findings with the findings on metadata in URIs and on state, each of which enumerates use cases where user context SHOULD NOT be encapsulated by URIs.