W3C

Giving Information About Other Resources in HTML

W3C Working Draft 20-Nov-95

This version
http://www.w3.org/pub/WWW/TR/WD-resource-951120
Latest version
http://www.w3.org/pub/WWW/TR/WD-resource
Editors
Tim Berners-Lee <timbl@w3.org>, Dave Raggett <dsr@w3.org>

**FUTURE** Status of this document

SCRIBBLED NOTE: ONLY FOR USE OF W3C TEAM AND INVITEES AS YET

This is a W3C Working Draft for review by W3C members and other interested parties. It is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to use W3C Working Drafts as reference material or to cite them as other than "work in progress". A list of current W3C working drafts can be found at: http://www.w3.org/pub/WWW/TR/

Abstract

This specification defines an extension of HTML to allow information about other documents, objects, etc. ("resources", as in the "R" of "URL") to be transmitted within an HTML document. Typical uses would include giving advance information about embedded or referenced documents, such as title and size information, and alternative available versions and forms.

This specification includes:

  1. A definition of the <resource> element
  2. A minor change to the definition of the link relationship syntax to make link relationships potentially first class objects;
  3. The definition of some specific link relationship values to allow information about related versions of a document to be transmitted.

Introduction

The HyperText Markup Language (HTML) is a simple markup language used to create hypertext documents that are portable from one platform to another. HTML documents are SGML documents with generic semantics that are appropriate for representing information from a wide range of applications.

A number of developments in the Web require the ability to transmit within an HTML document information about other documents. This specification extends HTML in such a way as to allow this.

Requirements

The requirements addressed by or taken into consideration in the design of this specification include the following:

There is a need to be able to specify within a document two alternative renderings of an image, for example one animated and one (failing animation) still image.

There is a need to be able to describe to a client, to a server or in a CD-ROM distributed database, the relationships between multiple language versions of the same document. Examples of documents which may well be multilingual are PICS schemes.

There is a need to be able to specify a set of optional style sheets, for operator selection.

There is a need for combinations of the above.

There is a requirement to be able to convert between information expressed in RFC822-style headers, as in the HTTP protocol's object headers, and the new form for inclusion within HTML documents, implying a linking of the specifications in some way.

There is information about HTML documents themselves already present within the <HEAD> element, and as little reinvention as possible was required with respect to that.

Specification

This section is the normative part of this document, excluding the rationale shown thus.

The RESOURCE element

The RESOURCE element is a new element similar to the HEAD element. Whereas HEAD contains information about the HTML document itself, RESOURCE contains information about a different document or other resource.

Example

Within a document http://www.w3.org/pub/foobar.html, the text
  <resource href="myimage.png">
    <meta http-equiv="Content-type" content="image/png">
    <meta http-equiv="Content-length" content="120756">
  </resource>
would indicate that image http://www.w3.org/pub/myimage.png could be expected to have HTTP headers
  Content-type: image/png
  Content-length: 120756
There is no requirement that information returned in an HTTP transaction in the HTTP headers should exactly match that given in a RESOURCE or HEAD element. The semantics are defined to be equivalent, though. When the same item of information is provided in both forms, the value should therefore in most cases be the same. For example, if the resource is an HTML document, any TITLE element given in a RESOURCE element about that document should match the title given in that document's actual TITLE element.
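
The equivalence between META elements inside a RESOURCE element and HTTP headers can be illustrated with a minimal Python sketch. It uses only the standard library's html.parser and urllib.parse. The names ResourceHeaders and expected_headers are illustrative and not part of this specification, and nested RESOURCE elements are not handled here.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ResourceHeaders(HTMLParser):
    """Collect META HTTP-EQUIV pairs for each RESOURCE element (illustrative)."""

    def __init__(self, document_url):
        super().__init__()
        self.document_url = document_url
        self.current = None   # absolute URL of the RESOURCE being read
        self.headers = {}     # resource URL -> list of (name, value) pairs

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)   # the parser lowercases tag and attribute names
        if tag == "resource" and "href" in attrs:
            self.current = urljoin(self.document_url, attrs["href"])
            self.headers[self.current] = []
        elif tag == "meta" and self.current and "http-equiv" in attrs:
            self.headers[self.current].append(
                (attrs["http-equiv"], attrs.get("content", "")))

    def handle_endtag(self, tag):
        if tag == "resource":
            self.current = None


def expected_headers(document_url, html_text):
    """Return the headers each described resource could be expected to have."""
    parser = ResourceHeaders(document_url)
    parser.feed(html_text)
    parser.close()
    return parser.headers
```

Applied to the example above with the document URL http://www.w3.org/pub/foobar.html, the sketch associates the Content-type and Content-length pairs with http://www.w3.org/pub/myimage.png.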

A resource record embedded in a document should be treated as part of that document for the purposes of authorship. The resource element contains, in effect, a set of statements which have been made by the author of the document, and the same body is responsible for them as is responsible for the document.

DTD fragment

The following fragment gives the syntax of the resource element.

<!ENTITY % resource.content "TITLE & ISINDEX? & ">
<!ELEMENT RESOURCE - -  (%resource.content) +(META|LINK|RESOURCE)>
<!ATTLIST RESOURCE
        ID      @@@
        HREF    CDATA   #REQUIRED    >
The HEAD and BODY elements may both contain resource elements.
<!ELEMENT HEAD O O  (%head.content) +(META|LINK|RESOURCE)>

and
<!ENTITY % text "#PCDATA | A | IMG | BR | RESOURCE ">

A combined DTD is not given as part of this specification.

The significance of a resource element is the same whether it occurs within the head or anywhere within the body of a document.

RESOURCE elements have no significance attachable to their position within the body of an HTML document, and should therefore be put into the head of the document.

However, allowing RESOURCE elements to occur within the body allows them to be given locally to a reference to a resource. This may be more convenient for the generator of a document, particularly a pipelined automatic document generator.

Although the content models of HEAD and RESOURCE are similar, for compatibility with older browsers one should not put within a RESOURCE element's content anything, such as a TITLE element, that older browsers would detrimentally interpret as applying to the enclosing document itself.

Specific-Generic links

Four link relationship values are defined for specifying that one document is a more specific instance of another in some dimension.

The concept of a generic resource expands the concept of the identity of objects referenced by URLs. In practice, resources are available in different versions, different renderings, or different translations. While one sometimes wants to refer to an immutable resource as a fixed byte string, in other circumstances one needs to be able to refer to a resource whose identity is unchanged by the processes of updating, translation, or rendering. The former is a specific resource, the latter a generic resource.

The direction of the link relationship is as follows. When used as the value of the REL attribute to a link from resource A to resource B, or in the REV attribute of a link from B to A, this indicates that document A is a generic document and B a more specific document.
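
The direction rule can be stated compactly in code. The following Python sketch is illustrative only: the function name generic_specific and its parameters are assumptions made for this example and are not defined by this specification.

```python
def generic_specific(source, target, rel=None, rev=None):
    """Return (generic, specific) for a link from source to target.

    REL on a link from A to B marks A as generic and B as specific.
    REV on a link from B to A states the same relationship.
    """
    if (rel is None) == (rev is None):
        raise ValueError("give exactly one of rel or rev")
    if rel is not None:
        return (source, target)
    return (target, source)
```

Both forms therefore yield the same pair: a REL link from mydoc to mydoc.fr and a REV link from mydoc.fr to mydoc each identify mydoc as the generic resource.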

Most objects on the Web are version-generic, in that the URL at any time refers to the current version, which may change. This is the default assumption.

When a generic resource is presented, normally this is done by implicitly selecting a corresponding specific resource. In the case of a file which is served, often there is no record of anything other than the most specific version.

Some HTTP servers use format negotiation to select from specific documents with different content-types or languages.

The relationships below may be used with LINK elements within a resource element, or any other context in which link relationships may be specified.

Extended syntax

In the HTML 2.0 standard, the value of the relationship attribute to a link (REL or REV) is constrained to be syntactically an SGML name. In future applications of the web, it is interesting for link relationship values to become first class objects, to which end the existing strings are, according to this specification, to be interpreted as relative URLs to be parsed relative to some as yet unspecified default URL.

(This interest was particularly evident at the W3C Collaboration workshop of September 1995.)

The future growth path here is to define a language in which the characteristics of a given relationship may be represented. Dereferencing of the relationship value as a URL would allow these characteristics to be read by a browser encountering a new relationship value. This could in turn enable reasoning and/or appropriate presentation to the user.

The content-type-specific relationship

When used as the value of the REL attribute to a link from document A to document B, or in the REV attribute of a link from B to A, this indicates that document A is a generic document and B a more specific document, in that A may exist in multiple content-types, but B is available in a particular content-type.

The significance of content-type is that of the MIME specification.

Example

The document below contains an image which is available in three different forms, differing by content-type.
  <resource href="myimage">
    <link rel="content-type-specific" href="myimage.png">
    <link rel="content-type-specific" href="myimage.gif">
    <link rel="content-type-specific" href="myimage.avi">
  </resource>
  <img SRC="myimage">
This RESOURCE element indicates that myimage is available in three specific content-types, with specific URLs myimage.png, myimage.gif, and myimage.avi.

Relative URLs inside RESOURCE elements

The HREF attribute of the RESOURCE element or of a LINK element, if not absolute, is parsed relative to the URL of the closest enclosing RESOURCE element, if any, or else relative to the URL of the enclosing document, in accordance with HTML 2.0.

If RESOURCE elements are nested, then the only effect of nesting them is to change the way relative URLs are parsed within the inner RESOURCE element. There is no other implied relationship between the two resources.
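
The effect of nesting on relative URLs can be sketched in Python with a stack of base URLs, using urljoin from the standard library. The class name NestedResourceResolver and the URLs in the usage note are hypothetical, chosen only for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class NestedResourceResolver(HTMLParser):
    """Resolve HREFs against the closest enclosing RESOURCE (illustrative)."""

    def __init__(self, document_url):
        super().__init__()
        self.bases = [document_url]   # innermost base URL is last
        self.resolved = []            # (tag, absolute URL) in document order

    def handle_starttag(self, tag, attrs):
        if tag not in ("resource", "link"):
            return
        href = dict(attrs).get("href")
        # A missing HREF leaves the base unchanged, so the stack stays balanced.
        absolute = urljoin(self.bases[-1], href) if href else self.bases[-1]
        if href:
            self.resolved.append((tag, absolute))
        if tag == "resource":
            # Everything inside this element is now parsed relative to it.
            self.bases.append(absolute)

    def handle_endtag(self, tag):
        if tag == "resource" and len(self.bases) > 1:
            self.bases.pop()
```

For instance, in a document at http://www.w3.org/pub/foobar.html, a RESOURCE with href="images/myimage" resolves to http://www.w3.org/pub/images/myimage, and a LINK inside it with href="myimage.png" resolves against that URL rather than against the document.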

In the above example, the actual content-types of the specific documents are not given. They can be specified within resource elements for the instances themselves. Although it would save some characters to have a special attribute to contain this information in the original link, this would break the clean use of LINK, and would only cater for the content-type-specific case.

Clearly, shorthand elements which give, in an abbreviated form, some combination of RESOURCE and LINK elements could be defined in future.

Further specific relationship values follow. There is a possibility (not taken up in this specification) of a general rule for the assumption of xxx-specific relationship values for any negotiable HTTP header xxx.

The content-language-specific relationship

When used as the value of the REL attribute to a link from document A to document B, this indicates that document A is a generic document and B a more specific document, in that A may exist in multiple content-languages, but B is available in a particular content-language.

The significance of content-language is that of the MIME specification.

Example

  <resource href="mydoc">
    <link rel="content-language-specific" href="mydoc.eng">
    <link rel="content-language-specific" href="mydoc.fr">
  </resource>

The version-specific relationship

When used as the value of the REL attribute to a link from document A to document B, this indicates that document A is a generic document and B a more specific document, in that A may exist in multiple versions, of which B is a particular version.

"Version" in this case indicates one of a sequence of instances of a resource through time, each of which makes previous versions obsolete.

A version-specific link may be used to indicate the relationship between a "living" document and a frozen version. This can allow one to deliberately refer to either the living document as exemplified by its latest version, or to a particular version. One might imagine a link creation dialogue box in a GUI to allow for this selection by a link creator.

The significance of content-type is that of the MIME specification.

Example

  <resource href="mydoc">
    <link rel="content-language-specific" href="mydoc.eng">
    <link rel="content-language-specific" href="mydoc.fr">
  </resource>
This indicates that mydoc is available in two languages, with specific URLs mydoc.eng and mydoc.fr.

The choice-specific relationship value

In some cases, two alternative versions of a resource are available at the discretion of the reader. An example is the provision of two alternative styles for a document, perhaps one for reading and another for printing.

This is specified using a choice-specific link.

Example: Alternative style sheets

Two style sheets are available for a given document, each provided by the author who leaves the choice at the reader's discretion.
   <resource href="mystyle">
     <title>Style options for the Kooltown telegraph</title>
     <link rel=choice-specific href="mystyle-old.css"
        title="Original Kooltown telegraph Style">
     <link rel=choice-specific href="mystyle-new.css"
        title="New Kooltown telegraph style">
   </resource>
   <link rel=stylesheet href="mystyle">
The user agent, on reading this, might for example prompt the user with a question as to which style is preferred:
For Style options for the Kooltown telegraph, do you prefer Original Kooltown telegraph Style or New Kooltown telegraph style?
in some sort of form or dialog panel. The server should make a default choice if the user agent simply requests the generic style sheet.
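
A user agent could derive such a prompt mechanically from the TITLE of the RESOURCE element and the TITLE attributes of its choice-specific links. The following Python sketch shows one way to do so. The names ChoiceCollector and style_prompt are illustrative and not part of this specification, and the sketch assumes a single, non-nested RESOURCE element.

```python
from html.parser import HTMLParser


class ChoiceCollector(HTMLParser):
    """Gather the TITLE and choice-specific LINK titles of a RESOURCE."""

    def __init__(self):
        super().__init__()
        self.in_resource = False
        self.in_title = False
        self.title = ""
        self.choices = []   # (link title, href) pairs in document order

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "resource":
            self.in_resource = True
        elif tag == "title" and self.in_resource:
            self.in_title = True
        elif (tag == "link" and self.in_resource
              and attrs.get("rel") == "choice-specific"):
            href = attrs.get("href")
            # Fall back to the URL when a link carries no TITLE attribute.
            self.choices.append((attrs.get("title", href), href))

    def handle_endtag(self, tag):
        if tag == "resource":
            self.in_resource = False
        elif tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def style_prompt(html_text):
    """Build the question a user agent might put to the reader."""
    collector = ChoiceCollector()
    collector.feed(html_text)
    collector.close()
    options = " or ".join(title for title, _ in collector.choices)
    return f"For {collector.title.strip()}, do you prefer {options}?"
```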

MIME Type

A MIME type is defined by this standard for a file whose body contains only RESOURCE elements. This allows one to refer explicitly to resource information, to store it and transmit it.
MIME type
application/about

Resource description link

The final part of this specification defines a link type, "description".

A link with this type as REL attribute between resource A and resource B indicates that resource B contains information describing resource A. Document B need not necessarily be

DTD: TBD @@

Usage Scenarios

Addressing some of the requirements listed above, and providing further explanation of the usage of the RESOURCE element and the generic-specific link types, there follow some example scenarios.

Server performs negotiation

A server locally has a naming convention which allows the name of a resource file to be deduced from the URL of an HTML file. When asked for the generic URL, it consults the resource description for a map of related files, and returns the most suitable one to the client.
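
The selection step in this scenario might look like the following Python sketch. It is a deliberate simplification and not the negotiation algorithm of any HTTP specification: it takes the highest q value among matching Accept patterns rather than applying precedence by specificity. The function names, the shape of the variants map, and the content types shown in the usage note are assumptions made for illustration.

```python
def parse_accept(header):
    """Parse an Accept header into (media type, q) pairs, ignoring other parameters."""
    prefs = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        q = 1.0   # a missing q parameter means full preference
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((parts[0].lower(), q))
    return prefs


def choose_variant(variants, accept_header):
    """Return the URL of the preferred variant, or None if none is acceptable.

    variants maps a content type to the URL of the specific resource.
    """
    prefs = parse_accept(accept_header)
    best_url, best_q = None, 0.0
    for content_type, url in variants.items():
        major = content_type.split("/")[0]
        q = 0.0
        for pattern, pattern_q in prefs:
            if pattern in (content_type, major + "/*", "*/*"):
                q = max(q, pattern_q)
        if q > best_q:
            best_url, best_q = url, q
    return best_url
```

Given a map such as {"image/png": "myimage.png", "image/gif": "myimage.gif"} built from the resource description, a request with "Accept: image/gif;q=0.5, image/png" would select myimage.png.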

Server returns resource information

A server locally has a naming convention which allows the name of a resource file to be deduced from the URL of an HTML file. When asked for the generic URL, it notices that the client accepts application/about and returns that, allowing the client to make the decision.

Is this reasonable? Should it in fact return something to indicate that it is the document about X, rather than X itself, which is being returned? Yes. A suitable method would be a new HTTP return code.

Server returns resource information

A server locally has a naming convention which allows the name of a resource file to be deduced from the URL of an HTML file. When asked for the HEAD information about the generic URL, it generates that information from the resource description file.

A server returns resource description preemptively

When asked for a document, the server notices that the client supports multipart responses, and sends back two objects: the one requested, and a description file containing the descriptions of all documents referenced by or embedded in that object.

Shorthand Elements

This document describes a consistent structure for defining the relationships between generic and specific documents. In some cases, use of the RESOURCE and LINK elements by themselves may seem too lengthy, and specific elements could be defined as shorthand. For example, one single hypothetical SPECIFIC element
 <resource href="foo">
  <specific href=foo.en.html
        ct="text/html"
        cl="en_us">
 </resource>
could indicate in one go two links between the enclosing resource and a specific version, and two attributes of that version. This syntax is less flexible than the general syntax but more efficient.

There could be variations on this.

References

HTML 2.0
The HTML 2.0 proposed standard, available as W3C WD-HTML2-9505@@
HTTP 1.0
HyperText Transfer Protocol,@@@

Security Considerations

The resource element allows information to be given about other documents, potentially false information. An example of an attack would be for an arbitrary hostile document to contain misinformation that a version of the attacked document is in fact available elsewhere, on a hostile site. The danger is that the browser would pick up this information silently and use it to give the user what it was told was a valid rendering of the attacked document.

The general principle is that for any metainformation (indeed, any information), systems must be aware of the source. The same information relating documents A and B has different security value depending upon whether it is provided by the server of A, the server of B or a third party.

A user is free, of course, to use whatever security policy he or she wants. However, a reasonable default policy would be, when accessing resource A, to use only information obtained from the server of A or from a specifically trusted third party such as a local proxy or friendly server. The granularity of trust could alternatively be an authorization domain within an HTTP server.
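
The default policy described above can be sketched as a simple host comparison in Python. The function name may_use_information and the trusted_hosts parameter are illustrative assumptions. A real user agent might use a finer granularity of trust, such as an authorization domain, which this sketch does not model.

```python
from urllib.parse import urlsplit


def may_use_information(resource_url, source_url, trusted_hosts=()):
    """Decide whether information about resource_url, found in source_url, may be used.

    The information is accepted only if it comes from the server of the
    resource itself or from a host the user has explicitly trusted.
    trusted_hosts is expected to hold lowercase host names.
    """
    resource_host = urlsplit(resource_url).hostname
    source_host = urlsplit(source_url).hostname
    if source_host is None:
        return False
    return source_host == resource_host or source_host in trusted_hosts
```

Under this policy, a statement about http://www.w3.org/pub/a.html found in a document on another host is ignored unless that host appears in the trusted set, for example a local proxy.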

In general, the trust in a piece of network information relies on trust in the network server as an agent for the author of the information. If the same server distributes information from parties which do not trust each other, then that server must ensure that it does not serve any misinformation by one party about the other party.

The provision of secure means of transferring resource (or any) information is independent of and outside the scope of this document. Clearly, digital signature techniques can allow information originating from A about A to be redistributed by B's server and verified as authentically authored by A.


W3C: The World Wide Web Consortium http://www.w3.org/