Dereferencing HTTP URIs

Draft Tag Finding 31 August 2007

This version:
http://www.w3.org/2001/tag/doc/httpRange-14/2007-08-31/HttpRange-14.html
Latest version:
http://www.w3.org/2001/tag/doc/httpRange-14/HttpRange-14.html
Previous version:
http://www.w3.org/2001/tag/doc/httpRange-14/2007-05-31/HttpRange-14.html
Editor:
Rhys Lewis, Volantis Systems Ltd. <rhys@volantis.com>

Abstract

Editorial note 
The title is deliberately vague at this point. We mentioned a number of ways of making this finding available, including a direct response to httpRange-14, material in an updated version of AWWW, or as part of another finding. Currently this is written as a stand-alone finding.
This ....

Status of this Document

This document is an editor's copy that has no official standing. In particular, it does not yet necessarily reflect consensus within the working group or within the wider community.

This document has been produced by the W3C Technical Architecture Group (TAG). This finding addresses TAG issue httpRange-14.

This version of the document is an editor's draft. It does not necessarily represent consensus within the TAG or within the wider community.

Additional TAG findings, both accepted and in draft state, may also be available. The TAG may incorporate this and other findings into future versions of the [AWWW].

The terms must, must not, required, shall, shall not, should, should not, recommended, may, and optional are used in this document in accordance with [KEYWORDS].

Please send comments on this finding to the publicly archived TAG mailing list www-tag@w3.org (archive).

Table of Contents

1 The Web
    1.1 A Presence on the Web
2 Resources Whose Essence is Information
    2.1 Dynamically Generated Representations
    2.2 Web Pages as Information Resources
    2.3 Media and Styling Resources
3 Resources Whose Essence is not Information
4 Associating Information Resources with Other Resources via Redirection
    4.1 A Web Presence for Non-Information Resources
    4.2 Using HTTP Redirection to Represent Associations
    4.3 Interpreting HTTP Response Codes
        4.3.1 Success Response Codes
        4.3.2 Redirection Response Codes
        4.3.3 Error Response Codes
        4.3.4 Summary of Response Codes
    4.4 Using Content Negotiation with Redirection
5 Using Secondary Resources for Association
    5.1 Secondary Resources in HTML
    5.2 Secondary Resources in RDF
    5.3 Using Content Negotiation with Secondary Resources
6 Generating Representations via Transformation
7 Choosing Between Redirection and Secondary Resources

Appendices

A References
B Changes in this version (Non-Normative)


1 The Web

The World Wide Web (Web) is an information space in which the items of interest are known as resources. Resources are identified by Uniform Resource Identifiers (URIs). The URIs used in the Web are based on the HTTP [HTTP] scheme. The general syntax of URIs is described in detail in [URI], which also gives many examples.

URIs are globally unique within the Web. The way in which URIs are structured provides an important contribution to the ability of the Web to scale to support very large numbers of uniquely identified resources. In particular, the structure of URIs supports delegation of the authority for their allocation. This approach largely removes bottlenecks from the process for creating new URIs. Usually, a person or an organization can acquire the authority to create URIs within some part of the space of all possible URIs. They are then free to create any URIs that fall within their delegated authority without the need to request further permission. The particular approach used provides freedom of allocation whilst preserving the uniqueness of individual URIs.

Organizations or individuals which have the authority to create URIs can be thought of as the owners of those URIs (see for example [AWWW] section 2.2.2).
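As an illustration only, the following Python sketch shows how the structure of an HTTP URI exposes the authority under which it was allocated. The URI is a hypothetical example, and the helper name authority_of is our own invention rather than anything defined by [URI].

```python
from urllib.parse import urlsplit


def authority_of(uri: str) -> str:
    """Return the authority component (host, plus port if present) of a URI."""
    return urlsplit(uri).netloc


# Hypothetical example URI, used for illustration only.
uri = "http://www.example.com/solar_system/Mars"
parts = urlsplit(uri)

print(parts.scheme)        # the scheme, here "http"
print(authority_of(uri))   # the delegated authority, here "www.example.com"
print(parts.path)          # the part allocated freely by the URI owner
```

Whoever controls the authority component is free to allocate any path beneath it, which is the delegation described above.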

1.1 A Presence on the Web

A URI uniquely identifies a resource. However, on its own, a URI is not enough to provide a resource with a presence on the Web. For a resource to have a Web presence, there must be a means by which it can be accessed. HTTP specifies the way in which such access can occur and defines the various participants in the process. Until the appropriate steps have been taken to create a Web presence for a resource, attempts to access it will fail.

Once a resource has a Web presence, its URI provides the information necessary to allow it to be accessed. This is in addition to providing its unique identification. A Web presence allows additional information to be associated with a URI. In some cases the association will be direct and the additional information will be available directly by accessing the Web presence associated with the URI itself. In other cases, the association may need to be more indirect. We'll examine both cases in later sections and see when each is appropriate.

The relationship between a resource and the Web presence that supports access to it is important. If an author makes an assertion about a resource, its Web presence should behave in a manner that supports that assertion. To do otherwise would be misleading.

Story

Peter has a Web site concerned with guitar playing. He is in the process of updating his site to add a tutorial on playing the classical guitar. Part of the tutorial is a set of pages, each of which describes how to play one of the standard chords. Each page includes a chart showing the position of the fingers needed to play that particular chord. Peter prepares each of these pages and uploads them to his server, thereby creating their Web presence and allocating their URIs. To aid navigation to these new pages, he also prepares a page containing a list of the chords. For each chord, he also provides a link to the appropriate page in the tutorial.

Unfortunately, due to a typing error, Peter incorrectly references the page describing the D major chord in the link whose text claims it references the details for the E major chord.

Fortunately, the users of Peter's Web site soon notice the error and inform him of it. He corrects the misleading information by changing the incorrect reference.

Including a link in a Web page is one mechanism that authors have for making assertions about resources. The text associated with a link is an assertion about the resource that the link identifies. Of course, in Web pages, such assertions are meant only for human users. Peter incorrectly asserted that one particular link referenced the details for E major. This was misleading, and caused his users to report this as an error in his site. In this case it was an innocent error on his part and it was a simple matter to correct the mistake.

Good Practice

Authorities MAY create a Web presence for any resource whose URI they own.

Authorities SHOULD NOT make misleading assertions about the Web presence of any resource, whether or not they own its URI.

2 Resources Whose Essence is Information

Many resources that have a Web presence are actually documents. Documents provide physical representations of bodies of information. A document might, for example, provide a description of the planet Mars (see for example [MARS]). URIs allow documents to be uniquely identified as resources in the Web. In addition, their Web presence allows the information they embody to be accessed and retrieved. Normally, such retrieval is direct. The resource itself consists of a body of information that is amenable to transmission across the Web within suitable messages. In general, we call such resources information resources (see for example [AWWW] section 2.2) because their essence is information.

To allow transmission across the Web, it must be possible to represent the information associated with a resource, such as a document, in a suitable form. For example, HTTP specifies that the information must be represented as a stream of octets. In addition, the person or system requesting the information associated with the resource may place constraints on the representations that are considered acceptable. For example, two different requesters might ask for the information in different languages. One might specify Italian, while another might specify Japanese. The mechanism by which the Web presence of a resource is accessed is able to support such constraints. Whether the requested representations are actually available for a given resource depends on the nature of the materials provided in support of its Web presence.

When we use the term representation, we mean specifically a form of the information associated with a resource that can be transmitted across the Web. So, in the previous example, we might say that two different representations of the resource are available, one containing material written in Italian, and the other containing the same material written in Japanese. The architecture of the Web [AWWW] clearly separates the notion of a resource from that of the representations which may be provided by its Web presence. As we've seen, different representations may be retrieved from the Web presence of a given resource under different conditions. The representation of a document returned in Italian is different from one returned in Japanese, though both should convey the same information if they are retrieved from the Web presence of the same resource. The process by which a suitable representation is chosen, based on constraints in the request, is known as content negotiation. The process is described in detail in [HTTP] in section 12.
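The following Python sketch illustrates one way a Web presence might choose between the Italian and Japanese representations using the Accept-Language request header. It is a simplification under stated assumptions, not the full algorithm of [HTTP]: the function names are hypothetical, only the q parameter is interpreted, and a language range such as "it-CH" falls back to its primary tag.

```python
from typing import Dict, List, Optional, Tuple


def parse_accept_language(header: str) -> List[Tuple[str, float]]:
    """Parse an Accept-Language value into (language, q) pairs, best first."""
    prefs = []
    for item in header.split(","):
        lang, _, params = item.strip().partition(";")
        lang = lang.strip().lower()
        if not lang:
            continue
        q = 1.0  # a missing q parameter means full preference
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0
        prefs.append((lang, q))
    # sorted() is stable, so header order is kept among equal q values.
    return sorted(prefs, key=lambda pref: pref[1], reverse=True)


def choose_language(header: str, available: Dict[str, str]) -> Optional[str]:
    """Return the key of the best available representation, or None."""
    for lang, q in parse_accept_language(header):
        if q <= 0:
            continue
        if lang == "*":
            return next(iter(available), None)
        if lang in available:
            return lang
        primary = lang.split("-")[0]  # e.g. "it-ch" falls back to "it"
        if primary in available:
            return primary
    return None


# Hypothetical representations of one resource, keyed by language.
representations = {"it": "Marte è il quarto pianeta.", "ja": "火星は第4惑星です。"}
print(choose_language("ja, it;q=0.5", representations))
```

Both entries in the mapping are representations of the same resource; the request header only influences which one is returned.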

Although we recognize that representations are returned in response to requests made to the Web presence of a resource, for brevity, it's often convenient to talk about the representations of a resource. Again, in the previous example, we might say that the resource has two representations, one written in Italian and the other written in Japanese.

2.1 Dynamically Generated Representations

Documents are not the only kind of information resource. The information associated with some resources is provided by computing systems. These perform work when the Web presence of the resource is accessed. Some systems might be able to retrieve data from sources that do not themselves have a Web presence. They may also perform computations in order to assimilate the information that will ultimately be returned to the requester in a suitable representation.

Story

Joan needs to check the transactions against her bank account for a particular month. Her bank provides an on-line system, via the Web, that allows her to do this. After authenticating with her bank, Joan enters the details required to retrieve the appropriate month's account activity. The on-line system uses this information to construct a URI that references the appropriate resource. The Web presence of this resource extracts the necessary data from the bank's databases and converts it into a textual form. The representation is created by embedding the data in a stream of HTML markup, which also references the appropriate styling information and an image of the bank's logo.

The representation flows across the Web to Joan's browser, where it is rendered. She is able to view the information and confirm that the transaction in which she is particularly interested has indeed been processed.

Systems, like the one provided by Joan's bank, which process requests and dynamically create representations are clearly not themselves documents. However, they can be viewed as dynamically creating documents in response to the requests that they receive. Just as with resources that are actually documents, the representations returned by such systems may need to meet constraints specified in the original request. Once again, this may result in the need for content negotiation. From the perspective of the requester, the form of the information received from this kind of system may be indistinguishable from that received from a resource that actually is a document.

Resources of this type can be associated with information which is changing rapidly over time. For example, a resource that represents the current wind conditions at the Lizard lighthouse, in Cornwall, England, might return representations that vary from minute to minute.

2.2 Web Pages as Information Resources

Probably the most widely recognized type of information resource on the Web currently is the Web page. To most users, the Web appears as a very large number of interlinked pages. Pages themselves usually contain references to other resources. Often, these are rendered in ways that allow user interaction. Such references are often termed links. A user may be able to activate a rendered link. This may cause a representation of the resource whose URI it references to be made available to the user. Pages may also contain other kinds of reference to resources. These might identify additional materials, such as images, or style information, needed to render the page successfully. Users don't normally interact with these references, which denote resources that are not themselves Web pages. We'll look at these types of information resource in the next section.

Although the term Web page tends to imply a rather static entity that provides information, in addition, pages actually allow access to a huge variety of operations, from purchasing a book to remotely controlling a robot. Information available through Web pages may also be highly dynamic in nature. Increasingly, organizations are making applications available to their users as sets of Web pages with which they can interact.

The idea of activating links to navigate between and within pages is so natural for Web users that most of them do not distinguish between a resource, the representation that traverses the Web and the final rendered version that they perceive.

2.3 Media and Styling Resources

Media resources, such as images, and styling resources, such as Cascading Style Sheets [CSS], are examples of resources with which normal Web users rarely interact directly. Representations of Web pages may contain references to media resources to provide images, audio or video that forms part of the final rendering. They may also refer to style resources that control aspects of that rendering, including color schemes and the use of particular fonts.

Media resources are information resources. Media resources may have multiple representations. Often the differences between representations are technical in nature. Different representations of a resource might use different encoding schemes. Images, for example, might be provided in representations based on various formats, such as JPEG, PNG or GIF, in order to satisfy the needs of various requests. The most appropriate representation for a particular request is typically determined by content negotiation. Similar approaches can be used with other kinds of media, such as audio and video clips.

Styling resources are also information resources. As with media resources, references to styling resources are usually processed automatically and result in a representation of the style information being made available for use in rendering.

3 Resources Whose Essence is not Information

The vast majority of resources on the Web are information resources. Representations of these resources are made available through the appropriate interaction with their Web presence. However, increasingly there is interest in being able to use URIs to uniquely identify resources whose essence is not information. We term such resources non-information resources.

When a resource is not an information resource, it is important that it does not behave on the Web as if it were an information resource. To do so would be misleading. For example, it is important that non-information resources do not respond with representations if they have a Web presence. Indeed, the whole question of the purpose of a Web presence for non-information resources, and the circumstances under which they should have one, needs consideration. We'll explore these questions in more detail in the rest of this section.

Story

Angela is creating a semantically rich description of the Solar System, using RDF. The description includes astronomical terms associated with the Solar System. It also provides information about particular celestial objects, such as the planets.

In building definitions associated with particular planets, Angela discovers the need to identify each member of the Solar System. For example, she needs to identify the planet Mars in order to form assertions about its properties, such as mass and diameter. Following the advice in [AWWW], Angela uses a URI as a unique identifier for the planet Mars. She creates the URI http://www.example.com/solar_system/Mars to represent the planet itself.

Following additional advice, that owners of URIs should provide representations, Angela is keen to comply. However, the choices of possible representations appear legion. Given that the URI is being used in the context of an RDF description, Angela first considers a representation that consists of some RDF triples that allow suitable computer systems to discover more information about the planet Mars. She then worries that these might be less useful to a human user, who might prefer the appropriate Wikipedia entry [MARS]. Perhaps, she reasons, a better approach would be to create a representation which itself contains a set of URIs to a range of resources that provide related representations. Perhaps content negotiation can help? She could arrange for different representations to be returned, based on the content type specified in the request.

Angela's dilemma is based on the fact that none of the representations she is considering are actually representations of the planet Mars, at least in the sense in which we defined the term in 2 Resources Whose Essence is Information. The problem is that, whatever the essence of the planet Mars is, it is clearly not information. The planet itself cannot be transmitted over the Web. The resource identified by the URI that Angela has created is a non-information resource. The representations that Angela is considering are not of the planet Mars. Instead, they are representations of information resources related to the planet Mars. Consequently, it would be inappropriate for any of these representations to be returned by accessing the URI that Angela created to identify the planet itself.

However, Angela clearly feels that there is benefit in providing a means for accessing additional information about Mars from the URI she created for the planet. What is needed is a way to associate these information resources with the URI she created for Mars, without misleading users into thinking that the associated representations are of the planet itself. We'll look at this in more detail in 4 Associating Information Resources with Other Resources via Redirection. Angela faces an additional challenge. Any mechanism for retrieving this associated information, based on the Web, will require her to create some kind of Web presence for the planet Mars, and to associate it with the URI that she has created. We'll look at this in more detail in 4.1 A Web Presence for Non-Information Resources.

4 Associating Information Resources with Other Resources via Redirection

It can be extremely useful to associate information resources with other kinds of resource. However, it would be misleading to claim that the representations of such information resources are representations of the resources with which they are associated. In the previous example, the associated information resources can convey information about the planet Mars. However, no resource can provide a representation that conveys the essence of the planet, since that is not information.

Information resources that are associated with a non-information resource need to have their own URIs. They are themselves distinct resources and provide representations. They may have uses other than providing additional information about the non-information resource. For example, one of the information resources that Angela considered, the Wikipedia entry for Mars, provides information about the planet independently from her description. When such resources are associated with a non-information resource, we need a means to identify the association in an unambiguous way. Fortunately, HTTP redirection provides a means to indicate this association by the use of a suitable Web presence.

4.1 A Web Presence for Non-Information Resources

The Web presence for an information resource is responsible for returning suitable representations when accessed, as we have seen. We've also noted that the Web presence for a non-information resource must not return representations when accessed. To do so would be misleading and would imply that the resource was actually an information resource. The Web presence for a non-information resource must behave differently when accessed. It must indicate that it is not returning a representation, and must also indicate where the associated information can be found. This behavior can be achieved by use of suitable HTTP facilities.

4.2 Using HTTP Redirection to Represent Associations

The representation returned when a Web presence is accessed consists of two parts. One part is the data, the other part is metadata. For a Web page, for example, the data is the part that contains the markup and associated instructions from which the rendered version is created. The metadata includes the information within the HTTP headers that form part of the response. This metadata includes the HTTP response code. One particular value of this response code forms the mechanism by which HTTP itself can indicate associations.

HTTP response code 303, named 'See Other', indicates that there is no representation for the accessed URI, but that associated information may be available. Importantly, this response also provides a URI which may be accessed in pursuit of the associated information. This type of redirection of the request to a different URI is exactly the behavior that we need in order to indicate that there may be additional information associated with a non-information resource. If the Web presence for a non-information resource is arranged to have this behavior, information can be associated with the resource without misleading people or systems that access it.

Of course, there is no guarantee that the URI returned in a 'See Other' response will actually provide a representation when accessed. Access to the returned URI is a separate operation from access to the original non-information resource. It could be that the returned URI leads to the need for further redirection actions before a representation can be retrieved. Indeed, it might not be possible to access the returned URI at all. At a practical level, either behavior is suboptimal since it increases network traffic without necessarily allowing a representation to be retrieved.

Good Practice

Authorities MAY create HTTP URIs for non-information resources as well as for information resources.

If a URI identifies an information resource, the URI owner SHOULD create a Web presence for it. The Web presence SHOULD provide representations of the resource when accessed. This is based on the discussion in Representation Management in [AWWW].

If a URI identifies a non-information resource, the URI owner MAY create a Web presence for it. The Web presence SHOULD respond by indicating an associated information resource that is related to the non-information resource. The URI of this associated information resource SHOULD be indicated using the redirection mechanism based on the HTTP 'See Other' response code, 303.

A URI owner providing an information resource associated with a non-information resource SHOULD avoid the need for additional redirection operations after the original 'See Other' response. In particular, the URI returned in the 'See Other' response SHOULD be able to provide representations of the associated information resource.

Let's see how Angela resolves her particular dilemma using this set of good practices.

Story

In creating her description of the solar system, Angela is ready to provide information about the planet Mars. She creates a Web presence for information about the planet and associates it with the URI http://www.example.com/solar_system/information/Mars. This is an information resource. Its Web presence responds to appropriate requests by returning a representation in RDF. Within this representation, the planet itself is identified by the URI http://www.example.com/solar_system/Mars, that she created earlier.

Angela now creates a Web presence for the URI that identifies the planet. By configuring her web server, she arranges for an HTTP 303 response code to be returned when http://www.example.com/solar_system/Mars is accessed. She arranges for the URI http://www.example.com/solar_system/articles/Mars.html to be returned in the HTTP 303 response. This URI refers to an information resource that provides HTML representations of an article describing the planet.
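The story does not say how Angela configures her server. As a minimal sketch, assuming a server written with Python's standard http.server module, the behavior might look like the following. The lookup tables, the respond function, the handler class and the placeholder HTML body are all hypothetical; only the two URIs come from the story.

```python
from http import HTTPStatus
from http.server import BaseHTTPRequestHandler

# Paths of non-information resources, mapped to the URI of an associated
# information resource (hypothetical table; the URIs are from the story).
SEE_OTHER = {
    "/solar_system/Mars":
        "http://www.example.com/solar_system/articles/Mars.html",
}

# Paths of information resources, mapped to (media type, representation).
# The HTML body is placeholder content.
DOCUMENTS = {
    "/solar_system/articles/Mars.html":
        ("text/html", b"<html><body>An article about Mars</body></html>"),
}


def respond(path):
    """Return (status, headers, body) for a GET request on the given path."""
    if path in SEE_OTHER:
        # No representation is returned, only the URI of a related resource.
        return HTTPStatus.SEE_OTHER, {"Location": SEE_OTHER[path]}, b""
    if path in DOCUMENTS:
        media_type, body = DOCUMENTS[path]
        return HTTPStatus.OK, {"Content-Type": media_type}, body
    return HTTPStatus.NOT_FOUND, {}, b""


class SolarSystemHandler(BaseHTTPRequestHandler):
    """Thin adapter from respond() to the http.server request loop."""

    def do_GET(self):
        status, headers, body = respond(self.path)
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.end_headers()
        self.wfile.write(body)
```

The handler could be served with http.server.HTTPServer(("localhost", 8000), SolarSystemHandler).serve_forever(). Keeping the decision in a plain function makes the 303 behavior easy to check without opening a socket.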

4.3 Interpreting HTTP Response Codes

The HTTP specification [HTTP] describes in detail all of the possible codes that might be received in response to a request. In this section, we'll look at some specific codes and how they can be interpreted when received in response to attempts to access a representation from a resource.

4.3.1 Success Response Codes

HTTP indicates the success of a request with response codes in the range 200-299. Often these are termed the 2XX codes. In fact, only the first few of these codes are actually in use. The following list describes codes that are of particular interest in the context of this finding.

Response Code 200 - OK

This code indicates that an HTTP operation completed successfully.

When the operation is an attempt to retrieve a representation from the Web presence of a resource, this code indicates that the retrieval has been successful. A representation has been returned as part of the response.

Under these circumstances, it is possible to conclude that the resource, whose Web presence was accessed, is an information resource.

4.3.2 Redirection Response Codes

Sometimes, it is not possible to access a URI as specified in an HTTP request, although no error has occurred. Some further action may be necessary for successful access. Generally this action is known as redirection. HTTP indicates the need for redirection with response codes in the range 300-399. These are termed the 3XX codes. Once again, only the first few codes are actually in use.

When redirection occurs, the Web presence for the URI referenced in the initial request indicates that a different resource needs to be accessed instead. Different response codes indicate the reasons for the redirection. The following list describes codes that are of particular interest in the context of this finding.

Response Code 301 - Moved Permanently

This code indicates that the resource being accessed has been moved permanently. It has a new URI, which is returned in the response to the access request.

This response code indicates that the resource has been moved, but otherwise has not changed. This code can be interpreted as an assertion that the URI used in the original request and the URI returned in the response identify the same resource. If a representation is successfully retrieved from the Web presence of the returned URI, it is a representation of the resource identified by the URI in the original request.

The URI used in the original request and the URI returned in responses carrying this code are aliases of one another. URI aliases are described in the section on URI aliases in [AWWW].

Response Code 302 - Found

This code indicates that the resource being accessed has been moved temporarily. It has a new URI, which is returned in the response to the access request.

Once again, this response code indicates that the resource has been moved, but otherwise has not changed. This code can be interpreted as an assertion that the URI used in the original request and the URI returned in the response identify the same resource. If a representation is successfully retrieved from the Web presence of the returned URI, it is a representation of the resource identified by the URI in the original request.

Again, the URI used in the original request and the URI returned in responses carrying this code are aliases of one another.

Response Code 303 - See Other

This code indicates that there is no representation available for the resource being accessed. However, it also indicates that a response may be found using a different URI, which is returned in the response to the access request.

An important distinction between this code and either 301 or 302 is that the returned URI is not an alias of the original URI. Instead, the resource referenced in the response that carries a 303 code is related to that in the original request, but does not substitute for it. If a representation is successfully retrieved from the Web presence for the returned URI, it is not a representation of the resource specified in the original request. This is important for the use of this code in associating information resources with non-information resources, as we saw earlier.

This code can be interpreted as indicating that the URI used in the original request references a non-information resource.

Editorial note: Rhys 2007-08-07
Are we happy to make this assertion?

Of course, there is no guarantee that the URI returned with a 303 response code will lead to a representation, although often it will. Only by accessing its Web presence and processing any response can anything further be concluded. One possibility is that the URI returned in the 303 might itself lead to further redirections. However, if by following any such redirections it is possible to retrieve a representation, we can conclude that the information it contains is related to the URI that originally led to the 303 response code.
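The difference between alias redirections (301, 302) and 'See Other' redirections (303) can be sketched from the client's point of view. The following Python function is an illustration under stated assumptions: fetch is a hypothetical callable that performs one request without following redirections and returns the status code and any Location URI, and the relation labels are our own.

```python
from typing import Callable, Optional, Tuple

# One response, reduced to (status code, Location URI or None).
Response = Tuple[int, Optional[str]]


def dereference(uri: str,
                fetch: Callable[[str], Response],
                max_hops: int = 5) -> Tuple[str, str, int]:
    """Follow 301, 302 and 303 responses, starting from uri.

    Returns (final URI, relation, final status). The relation is
    "same-resource" if only 301/302 were followed, and "related-resource"
    as soon as any 303 has been seen along the way.
    """
    relation = "same-resource"
    current = uri
    for _ in range(max_hops + 1):
        status, location = fetch(current)
        if status in (301, 302, 303) and location:
            if status == 303:
                relation = "related-resource"
            current = location
            continue
        return current, relation, status
    raise RuntimeError("too many redirections")


# A fake fetch for illustration, mirroring the example in the story.
responses = {
    "http://www.example.com/solar_system/Mars":
        (303, "http://www.example.com/solar_system/articles/Mars.html"),
    "http://www.example.com/solar_system/articles/Mars.html": (200, None),
}


def fake_fetch(uri: str) -> Response:
    return responses.get(uri, (404, None))


print(dereference("http://www.example.com/solar_system/Mars", fake_fetch))
```

A representation retrieved at the end of a chain that includes a 303 is treated as related to the original resource, never as a representation of it.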

Editorial note: Rhys 2007-08-07
In the May/June F2F we discussed mandating the meaning of 303 in such a way that the associated resource MUST be a 'description' of the non-information resource. Do we actually want to make that assertion? We also noted that the description of 303 in the HTTP specification might need revision.

4.3.3 Error Response Codes

HTTP indicates errors in the processing of a request with response codes in the range 400-599. Often these are termed the 4XX and 5XX codes. The 4XX codes indicate errors that are likely to have originated on the client, while the 5XX codes indicate errors likely to have originated on the server.

As with other code ranges, only the first few of the codes in each range are actually in use. The following list describes codes that are of particular interest in the context of this finding.

Response Code 406 - Not Acceptable

This code indicates that there is no representation available that satisfies the constraints specified in the access request. No representation is returned. However, this response indicates that the Web presence of the resource is capable of providing representations, even though it cannot supply one that can satisfy this particular request.

From this response code, it is possible to infer that the resource is an information resource, though without altering the constraints in the request, it will not be possible to retrieve a representation.
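A minimal Python sketch of how a Web presence might arrive at a 406 response is given below. It is a simplification, not the negotiation algorithm of [HTTP]: the function names are hypothetical, preferences are ordered by q value only, and the precedence of more specific media ranges is ignored.

```python
from typing import List, Optional, Tuple


def matches(media_range: str, media_type: str) -> bool:
    """True if a media range such as "image/*" covers the media type."""
    if media_range == "*/*":
        return True
    if media_range.endswith("/*"):
        return media_type.startswith(media_range[:-1])
    return media_range == media_type


def negotiate(accept: str, available: List[str]) -> Tuple[int, Optional[str]]:
    """Return (200, chosen media type), or (406, None) if nothing fits."""
    prefs = []
    for item in accept.split(","):
        media_range, _, params = item.strip().partition(";")
        media_range = media_range.strip().lower()
        if not media_range:
            continue
        q = 1.0
        for param in params.split(";"):
            param = param.strip()
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        prefs.append((media_range, q))
    prefs.sort(key=lambda pref: pref[1], reverse=True)
    for media_range, q in prefs:
        if q <= 0:
            continue
        for media_type in available:
            if matches(media_range, media_type):
                return 200, media_type
    # The resource has representations, but none satisfies this request.
    return 406, None


# Hypothetical image resource available in two formats.
formats = ["image/png", "image/jpeg"]
print(negotiate("image/gif", formats))
```

The 406 outcome is reached only after consulting the list of available representations, which is why the response suggests that the resource is able to provide representations at all.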

Editorial note: Rhys 2007-08-07
Are we happy to make this assertion?

Other 4XX Response Codes

When a 4XX error response code, other than those explicitly described above, is received, nothing can be inferred about the URI that led to the error or about the resource that it identifies. In particular, it is not possible to determine whether the resource is an information resource or not. It is also impossible to determine whether there are any resources associated with the resource whose URI led to the error.

5XX Response Codes

When a 5XX error response code is received, nothing can be inferred about the URI that led to the error or about the resource that it identifies. In particular, it is not possible to determine whether the resource is an information resource or not. It is also impossible to determine whether there are any resources associated with the resource whose URI led to the error.

4.3.4 Summary of Response Codes

Based on this discussion, Table 1 summarizes the information that can be inferred from the results of dereferencing a URI.

Table 1: Summary of inferences that can be made when dereferencing an HTTP URI
Code | Meaning | Material Returned | Inference
200 | OK | A representation | The resource is an information resource and a representation of it has been returned.
301 | Moved Permanently | A URI | The URI specified in the request and the URI returned in the response are aliases and refer to the same resource. The resource might be an information resource or a non-information resource.
302 | Found | A URI | The URI specified in the request and the URI returned in the response are aliases and refer to the same resource. The resource might be an information resource or a non-information resource.
303 | See Other | A URI | The resource is a non-information resource. There is an associated resource whose URI has been returned. The associated resource might or might not be an information resource.
406 | Not Acceptable | Nothing | The resource is an information resource, but no representation could be returned.
Other 4XX | Error | Nothing | Nothing can be inferred about the nature of the resource.
5XX | Error | Nothing | Nothing can be inferred about the nature of the resource.

Editorial note: Rhys2007-08-07
Again, do we want to assert that 303 indicates a non-information resource? In the May/June F2F we discussed the possibility that the description of 303 in the HTTP spec might need revision.
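
The inferences in Table 1 can be expressed as a simple lookup. The Python sketch below is illustrative only: the function name and the wording of the returned strings are assumptions made for the example, and the 303 entry reflects the draft position that remains open in the editorial notes.

```python
# Short labels for the inferences in Table 1 (wording is illustrative).
INFORMATION = "information resource; representation returned"
ALIAS = "request and response URIs are aliases; nature of resource unknown"
NON_INFORMATION = "non-information resource; URI of an associated resource returned"
INFORMATION_NO_REPRESENTATION = "information resource; no representation returned"
NOTHING = "nothing can be inferred"
NOT_COVERED = "not covered by Table 1"


def infer_from_status(code: int) -> str:
    """Map an HTTP response code to the inference listed in Table 1."""
    if code == 200:
        return INFORMATION
    if code in (301, 302):
        return ALIAS
    if code == 303:
        # Draft position; see the editorial notes on 303.
        return NON_INFORMATION
    if code == 406:
        return INFORMATION_NO_REPRESENTATION
    if 400 <= code <= 599:
        # Other 4XX codes and all 5XX codes.
        return NOTHING
    # Codes that the table does not discuss (for example 1XX or 204).
    return NOT_COVERED
```

For example, infer_from_status(404) and infer_from_status(503) both return the "nothing can be inferred" label, while infer_from_status(406) still identifies an information resource.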

4.4 Using Content Negotiation with Redirection

As we've seen, it is possible for information resources to provide a variety of different representations to meet the needs of specific requests. Content negotiation is used to match the appropriate representation to the request. Information resources associated with non-information resources may be able to respond with different representations to satisfy different requests.

Story

For a short period of time, Angela is content to associate just the article about the planet Mars with the URI that she created for the planet itself. However, the need to support automatic discovery of additional information about Mars soon arises. She decides to alter the behavior of the Web presence she created for the URI for the planet (http://www.example.com/solar_system/Mars). She changes the 303 response that it returns to contain the URI http://www.example.com/solar_system/related/Mars. She arranges for the Web presence for this URI to support content negotiation. In particular, it provides equivalent representations in HTML, for human users, and as a set of RDF triples, for use by suitable computer systems.

Subsequently, she adds a third representation. This is also in HTML, but contains a French translation of the original text.

The usefulness of an associated information resource can be increased by supporting materials both for human users and for computer systems that can apply reasoning. It is important, however, to ensure that the different representations of the resource are effectively equivalent. The reasons why there may be differences between representations and the issues that may arise if this occurs are discussed in [AWWW] section 3.1.1.
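
As an illustration of how a Web presence might choose between an HTML and an RDF representation, the following Python sketch selects a media type from an HTTP Accept header. It is a simplified sketch under stated assumptions: the function names are invented for the example, application/rdf+xml is assumed as the RDF media type, and the matching ignores the precedence rules for more specific media ranges that [HTTP] defines. Selecting the French translation would rely on the Accept-Language header, which is not shown.

```python
from typing import List, Optional, Tuple


def parse_accept(header: str) -> List[Tuple[str, float]]:
    """Parse an Accept header into (media_range, q) pairs; q defaults to 1.0."""
    prefs: List[Tuple[str, float]] = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_range = parts[0].lower()
        if not media_range:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q value as "not acceptable"
        prefs.append((media_range, q))
    return prefs


def matches(media_range: str, media_type: str) -> bool:
    """True if `media_type` falls within `media_range` (e.g. text/*)."""
    if media_range == "*/*":
        return True
    if media_range.endswith("/*"):
        return media_type.startswith(media_range[:-1])
    return media_range == media_type


def negotiate(accept: str, available: List[str]) -> Optional[str]:
    """Return the best available media type, or None (a 406 situation)."""
    prefs = parse_accept(accept)
    best, best_q = None, 0.0
    for media_type in available:
        # Simplification: take the highest q among all matching ranges.
        q = max((q for rng, q in prefs if matches(rng, media_type)), default=0.0)
        if q > best_q:
            best, best_q = media_type, q
    return best


# Media types assumed for the two representations in the story.
AVAILABLE = ["text/html", "application/rdf+xml"]
```

With these definitions, a request carrying Accept: application/rdf+xml selects the RDF representation, while a request that accepts only image/png yields None, which corresponds to the 406 case described earlier.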

Good Practice

URI owners MAY create multiple representations for an information resource which is associated with a non-information resource. They MAY make such representations available via content negotiation.

Such representations SHOULD be equivalent to one another.

5 Using Secondary Resources for Association

So far, the discussion has focused on primary resources. The Web also provides mechanisms for identifying secondary resources. Secondary resources are indirectly associated with primary resources. They are identified by a URI formed from that of the associated primary resource with the addition of a fragment identifier. The terms primary resource, secondary resource and fragment identifier are defined in [URI] in section 3.5 Fragment.
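
The relationship between the two URIs can be shown with the Python standard library, which provides urllib.parse.urldefrag for separating a fragment identifier from the rest of a URI. The URI in this sketch is hypothetical and the helper name is an assumption made for the example.

```python
from typing import Tuple
from urllib.parse import urldefrag


def split_secondary(uri: str) -> Tuple[str, str]:
    """Split a URI into (primary resource URI, fragment identifier).

    The fragment is an empty string when the URI has none, in which case
    the URI identifies a primary resource.
    """
    primary, fragment = urldefrag(uri)
    return primary, fragment


# Hypothetical URI used only for illustration.
primary, fragment = split_secondary("http://www.example.com/doc#part")
```

Here primary is http://www.example.com/doc and fragment is part. Only the primary resource URI is used when an access is attempted; the fragment identifier is retained by the client.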

A secondary resource might, for example, be some portion or subset of the primary resource, some view on representations of the primary resource, or some other resource defined or described by those representations. URIs provide unique identification of secondary resources. However, just as with primary resources, the existence of the URI does not imply that any kind of access is possible.

A secondary resource has neither its own Web presence nor its own representations. It cannot itself be transmitted over the Web. However, the primary resource, with which a secondary resource is associated, may have a Web presence and may provide representations in whose context the secondary resource may be interpreted.

Importantly, the semantics associated with a secondary resource depend on the content type of the representation retrieved from the associated primary resource. Effectively, this means that the way fragment identifiers are interpreted, once a representation has been retrieved, depends on the content type of that representation. The interpretation of fragment identifiers in HTML is different from that in RDF, for example.

We'll discuss these differences in more detail in the following sections.

5.1 Secondary Resources in HTML

Secondary resources in HTML identify specific subsets of the representation of the primary resource. Any subset of the HTML markup might constitute a secondary resource.

There is no guarantee, of course, that a representation of the primary resource contains material related to the secondary resource simply because that representation can be accessed.

Story

Peter wants to enhance his guitar playing tutorial by providing a page that links directly to the fingering chart for each chord. First, he adds an id attribute with the value fingeringChart to the part of the markup in the page for each chord that contains the fingering chart. Then he prepares a page that lists each chord and provides a link to the appropriate secondary resource, in this case the fingering chart. Each link includes text asserting that it provides the fingering chart for a particular chord. It also includes the URI for the appropriate secondary resource. For example, the URI for the fingering chart for the D major chord is http://www.example.com/guitar/tutorial/chords/Dmajor.html#fingeringChart.

When a user, browsing Peter's site, activates one of these links, a representation of the page for the chord is retrieved by their browser. The browser creates the rendered version of the page, and scrolls it to ensure that the section on fingering is presented to the user.

To see how the relationship between primary and secondary resources works in this case, we'll follow through what happens if an attempt is made to access the URI http://www.example.com/guitar/tutorial/chords/Dmajor.html#fingeringChart.

Suppose that the user of a conventional Web browser activates the link for this URI. The browser recognizes that the URI refers to a secondary resource and attempts to retrieve an HTML representation of the associated primary resource. It attempts to access URI http://www.example.com/guitar/tutorial/chords/Dmajor.html. Since Peter has arranged a Web presence for this resource and it can return HTML, the access is successful, and the Web page for the D Major chord is returned.

Because the retrieved representation is HTML, the Web browser knows the appropriate semantics to apply when processing the fragment identifier from the URI of the secondary resource. In particular, it looks for and locates an element in the Web page that has the id value fingeringChart. It renders the page, scrolling if necessary to ensure that the element with that id is presented to the user.
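
The lookup that the browser performs can be approximated with the Python standard library. The sketch below is illustrative only: the sample markup is invented for the example and is not Peter's actual page, and a real browser does considerably more than locate the first element with a matching id.

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urldefrag


class IdFinder(HTMLParser):
    """Records the tag name of the first element whose id matches."""

    def __init__(self, target_id: str) -> None:
        super().__init__()
        self.target_id = target_id
        self.found_tag: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if self.found_tag is None and dict(attrs).get("id") == self.target_id:
            self.found_tag = tag


def locate_fragment(uri: str, html: str) -> Optional[str]:
    """Return the tag name of the element identified by the URI's fragment,
    or None if the URI has no fragment or no element carries that id."""
    _, fragment = urldefrag(uri)
    if not fragment:
        return None
    finder = IdFinder(fragment)
    finder.feed(html)
    return finder.found_tag


# Hypothetical markup standing in for the retrieved representation.
SAMPLE_HTML = """<html><body>
<h1>D major</h1>
<div id="fingeringChart"><img src="dmajor.png" alt="D major fingering"></div>
</body></html>"""

CHART_URI = "http://www.example.com/guitar/tutorial/chords/Dmajor.html#fingeringChart"
```

With the sample markup, locate_fragment(CHART_URI, SAMPLE_HTML) identifies the div element. A result of None corresponds to the case where the representation contains no material related to the secondary resource.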

Editorial note: Rhys12/8/2007
It feels as though we might be able to make the following statement about representations in HTML and indeed any content type for which secondary resources identify subsets of primary resources. "The representation of the secondary resource is a subset of the representation of the primary resource." Or maybe that is going too far and opening up ambiguity in the definition of representation.

5.2 Secondary Resources in RDF

If an RDF representation of the primary resource associated with a secondary resource is available, it may provide additional information about the secondary resource. Since the representation is associated with the primary resource and not the secondary resource, it is clear that it is not a representation of the secondary resource. However, the representation of the primary resource might provide information associated with that secondary resource.

There is no guarantee, of course, that a representation of the primary resource contains information about the secondary resource simply because that representation can be accessed.

Story

Angela needs to extend her description of the solar system to add information about the moons that orbit various planets. Since, for some planets, there are significant numbers of moons, she decides to define a primary resource for the moons of each planet and then to define a secondary resource within that primary resource for each individual moon.

To add information about the moons of Mars, she first creates the URI http://www.example.com/solar_system/Mars/moons. Next, she creates a Web presence for this URI and arranges for it to return a representation in RDF. Within this representation she arranges for assertions about the moon Phobos to be identified with an rdf:ID value of Phobos and for those about the moon Deimos to be identified with an rdf:ID value of Deimos.

Finally, she adds assertions, to the existing RDF representation of information about the planet Mars, at URI http://www.example.com/solar_system/information/Mars, to specify that http://www.example.com/solar_system/Mars/moons#Phobos and http://www.example.com/solar_system/Mars/moons#Deimos are moons of Mars.

To see how the relationship between primary and secondary resources works in this case, we'll follow through what happens if an attempt is made to access the URI http://www.example.com/solar_system/Mars/moons#Phobos, in Angela's description of the solar system. This URI identifies the moon itself. It's a non-information resource.

Suppose the user of a semantic Web browser attempts to discover more about Phobos. The browser recognizes that the URI refers to a secondary resource and attempts to retrieve an RDF representation of the associated primary resource. It attempts to access URI http://www.example.com/solar_system/Mars/moons. Since Angela has arranged a Web presence for this resource and it can return RDF, the access is successful, and a set of RDF triples is returned.

Because the retrieved representation is RDF, the semantic Web browser knows the appropriate semantics to apply when processing the fragment identifier from the URI of the secondary resource. In particular, it looks for and locates assertions related to the rdf:ID value Phobos. The semantic Web browser recognizes that these assertions are related to the secondary, non-information resource identified by the original URI. The browser renders additional information for the user, based on these new assertions, and adds them to the set about which it is capable of reasoning.
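
A much simplified version of this lookup can be sketched in Python using only the standard library XML parser. The sketch makes several assumptions: the representation is RDF/XML, its base URI is that of the primary resource, the ex: vocabulary is hypothetical, and only top-level nodes carrying an rdf:ID attribute are examined. A real RDF processor would build the full set of triples instead.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional
from urllib.parse import urldefrag

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical RDF/XML standing in for Angela's representation; the ex:
# vocabulary is invented for this sketch.
SAMPLE_RDF = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://www.example.com/hypothetical/terms#">
  <ex:Moon rdf:ID="Phobos">
    <ex:name>Phobos</ex:name>
  </ex:Moon>
  <ex:Moon rdf:ID="Deimos">
    <ex:name>Deimos</ex:name>
  </ex:Moon>
</rdf:RDF>"""


def describe_secondary(uri: str, rdf_xml: str) -> Optional[Dict[str, str]]:
    """Return {property element name: text} for the top-level node whose
    rdf:ID equals the URI's fragment, or None if there is no such node."""
    _, fragment = urldefrag(uri)
    root = ET.fromstring(rdf_xml)
    for node in root:
        if node.get("{%s}ID" % RDF_NS) == fragment:
            return {child.tag: (child.text or "").strip() for child in node}
    return None


PHOBOS_URI = "http://www.example.com/solar_system/Mars/moons#Phobos"
```

Here describe_secondary(PHOBOS_URI, SAMPLE_RDF) returns the single hypothetical ex:name property for Phobos, keyed by its namespace-qualified element name.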

Editorial note 
Of course, this behaviour of the imagined semantic web browser is pure speculation. Should we have a more concrete example?

5.3 Using Content Negotiation with Secondary Resources

Content negotiation can be used with secondary resources, just as it can when redirection is used to associate information resources with non-information resources. Once again, it's important that the different representations of the information resource are effectively equivalent. In particular, the fragment identifier for each secondary resource needs to be interpretable in each of the representations.

Story

Angela decides to provide additional representations of materials from her ontology that relate to moons. She creates new representations in HTML and arranges for these to be returned under the appropriate conditions. For example, she arranges for the Web presence for the URI http://www.example.com/solar_system/Mars/moons to support content negotiation and to return either an HTML representation or an RDF representation. She is careful to ensure that the new HTML version includes id attributes with the values Phobos and Deimos. This allows the fragment identifiers #Phobos and #Deimos to be interpreted whether the RDF or HTML representations are retrieved. It allows the information associated with the secondary resources to be accessed, regardless of which representation is retrieved.

For consistency with the materials available for the planet, she also adds a third representation of the material about the moons. This is also in HTML, but contains a French translation of the original text.

Angela already has an RDF representation for the moons of Mars that uses secondary resources to relate the moons themselves to the primary information resource. She needs to create an HTML representation that contains the same information. She also needs to ensure that the fragment identifiers associated with the secondary resources, and which are already supported in the RDF version, can also be interpreted in the HTML version.

Good Practice

URI owners MAY create multiple representations for an information resource which is associated with one or more secondary resources. They MAY make such representations available via content negotiation.

Such representations SHOULD be equivalent to one another. In particular it SHOULD be possible for the fragment identifiers of the associated secondary resources to be interpreted in the context of every such representation.
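
One way a URI owner might check the second part of this practice is to confirm that every fragment identifier in use is defined in each representation. The Python sketch below is a minimal illustration under stated assumptions: it considers only id attributes in HTML and rdf:ID attributes in RDF/XML, and the sample representations are invented for the example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser
from typing import Dict, Iterable, Set

RDF_ID = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}ID"


class _IdCollector(HTMLParser):
    """Collects the values of all id attributes in an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.ids: Set[str] = set()

    def handle_starttag(self, tag, attrs):
        value = dict(attrs).get("id")
        if value:
            self.ids.add(value)


def html_fragment_ids(html: str) -> Set[str]:
    collector = _IdCollector()
    collector.feed(html)
    return collector.ids


def rdf_fragment_ids(rdf_xml: str) -> Set[str]:
    root = ET.fromstring(rdf_xml)
    return {el.get(RDF_ID) for el in root.iter() if el.get(RDF_ID)}


def missing_fragments(required: Iterable[str], html: str, rdf_xml: str) -> Dict[str, Set[str]]:
    """Report which required fragment identifiers each representation lacks."""
    needed = set(required)
    return {
        "html": needed - html_fragment_ids(html),
        "rdf": needed - rdf_fragment_ids(rdf_xml),
    }


# Hypothetical representations of the moons resource.
MOONS_HTML = '<html><body><h2 id="Phobos">Phobos</h2><h2 id="Deimos">Deimos</h2></body></html>'
MOONS_RDF = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:ID="Phobos"/>
  <rdf:Description rdf:ID="Deimos"/>
</rdf:RDF>"""
```

For the sample data, missing_fragments(["Phobos", "Deimos"], MOONS_HTML, MOONS_RDF) reports empty sets for both representations, indicating that both fragment identifiers can be interpreted in either one.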

6 Generating Representations via Transformation

Editorial note 
During the last F2F the TAG mentioned the question of transformation during generation of a representation. I noted two particular scenarios (listed below). Unfortunately I don't seem to have a clear description of the problem that was being discussed.

7 Choosing Between Redirection and Secondary Resources

We've described two possible ways of associating information and non-information resources. One involves indicating an association via redirection. The other indicates an association using secondary resources. Each approach has practical advantages and consequences that may make one preferable to the other in a specific case.

Setting up the definitions required for redirection can be time consuming when a significant number of associations is being defined. Also, since each association leads to at least one additional access attempt, there can be implications for network and server traffic.

Associations defined using secondary resources do not generate additional network traffic. In addition, if several such resources are associated with the same primary resource, it may be possible to satisfy multiple requests for the non-information resources from a single access to the primary resource. One additional consequence of this approach is that requests for a single association will incur the overhead of loading all of the associations, not simply the one in which the client is interested.
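
The difference in network traffic can be made concrete with a rough count of access attempts. The Python sketch below rests on simplifying assumptions that are not part of this finding: each redirection involves exactly one 303 response followed by one retrieval, nothing is cached in the redirection case, and a client reuses a representation of a primary resource once it has been retrieved.

```python
from typing import Iterable
from urllib.parse import urldefrag


def requests_via_redirection(uris: Iterable[str]) -> int:
    """Each URI costs one request answered by 303 plus one retrieval of
    the associated resource (single redirection, no caching assumed)."""
    return 2 * len(list(uris))


def requests_via_secondary(uris: Iterable[str]) -> int:
    """One retrieval per distinct primary resource, assuming the client
    reuses a representation it has already retrieved."""
    return len({urldefrag(uri)[0] for uri in uris})


MOON_URIS = [
    "http://www.example.com/solar_system/Mars/moons#Phobos",
    "http://www.example.com/solar_system/Mars/moons#Deimos",
]
```

Under these assumptions, resolving both moons through secondary resources needs a single access, while two separately redirected URIs would need four. The single access, however, transfers the information about every moon, even if only one is of interest.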

Good Practice

Where a relatively small set of closely associated non-information resources is involved, associations with related information resources SHOULD be indicated using the secondary resource approach.

Where a large set of non-information resources is involved, or those resources bear little or no relation to one another, associations with information resources SHOULD be indicated using the redirection approach.

Where these associations involve the use of RDF or OWL, the natural approach is usually to use secondary resources.

A References

AWWW
Architecture of the World Wide Web I. Jacobs and N. Walsh, 2004, W3C. (See http://www.w3.org/TR/webarch/.)
KEYWORDS
RFC 2119: Key words for use in RFCs to Indicate Requirement Levels Internet Engineering Task Force, 1997. (See http://www.ietf.org/rfc/rfc2119.txt.)
HTTP
RFC 2616: Hypertext Transfer Protocol -- HTTP/1.1 Internet Engineering Task Force, 1999. (See http://www.ietf.org/rfc/rfc2616.txt.)
URI
RFC 3986: Uniform Resource Identifier (URI): Generic Syntax (See http://www.ietf.org/rfc/rfc3986.txt.)
MARS
Mars. Description in Wikipedia. (See http://en.wikipedia.org/wiki/Mars.)
METRE
Metre. Description in Wikipedia. (See http://en.wikipedia.org/wiki/Metre.)
SI
The NIST Reference on Constants, Units and Uncertainty (See http://physics.nist.gov/cuu/Units/.)
Pub RDF
Best Practices Recipes for Publishing RDF Vocabularies (See http://www.w3.org/TR/swbp-vocab-pub/.)
CSS
Cascading Style Sheets Level 2 Revision 1 (CSS 2.1) B. Bos et al., 2007, W3C. (See http://www.w3.org/TR/CSS21/.)
CoolURI
Cool URIs for the Semantic Web L. Sauermann et al. (See http://www.dfki.uni-kl.de/~sauermann/2006/11/cooluris/.)

B Changes in this version (Non-Normative)

This version represents a major rewrite. Some material has been retained from the previous version. However, significant structural changes have been made. In addition there is a significant amount of new material. Material that caused significant discussion during TAG meetings and on the public TAG mailing list has also been heavily revised.

In particular the following changes have been made: