<a shape="rect" href="http://www.w3.org/"> W3C

<a shape="rect" name="title" id="title"> Architecture of the World Wide Web 1.0

<a shape="rect" id="date" name="date"> Editor's Draft 28 November 2003

This version:
<a shape="rect" class="editorcopy" href="http://www.w3.org/2001/tag/2003/webarch-20031128/"> http://www.w3.org/2001/tag/2003/webarch-20031128/
Latest editor's draft:
<a shape="rect" href="http://www.w3.org/2001/tag/webarch/"> http://www.w3.org/2001/tag/webarch/
Previous version:
<a shape="rect" class="editorcopy" href="http://www.w3.org/2001/tag/2003/webarch-20031111/"> http://www.w3.org/2001/tag/2003/webarch-20031111/
Latest version:
<a shape="rect" href="http://www.w3.org/TR/webarch/"> http://www.w3.org/TR/webarch/
Editor:
Ian Jacobs, W3C
Authors:
See <a shape="rect" href="#acks"> acknowledgments .

<a shape="rect" name="abstract" id="abstract"> Abstract

The World Wide Web is a network-spanning information space of resources interconnected by links. This information space is the basis of, and is shared by, a number of information systems. Within each of these systems, agents (people and software) retrieve, create, display, analyze, and reason about resources.

Web architecture includes the definition of the information space in terms of identification and representation of its contents, and of the protocols that support the interaction of agents in an information system making use of the space. Web architecture is influenced by social requirements and software engineering principles, leading to design choices that constrain the behavior of systems using the Web in order to achieve desired properties of the shared information space: efficiency, scalability, and the potential for indefinite growth across languages, cultures, and media. This document reflects the three bases of Web architecture: identification, interaction, and representation.

<a shape="rect" name="status" id="status"> Status of this document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the <a shape="rect" href="http://www.w3.org/TR/"> W3C technical reports index at http://www.w3.org/TR/.

This document has been developed by W3C's <a shape="rect" href="http://www.w3.org/2001/tag/"> Technical Architecture Group (TAG) ( <a shape="rect" href="http://www.w3.org/2001/07/19-tag"> charter ). Please send comments on this document to the public W3C TAG mailing list <a shape="rect" href="mailto:www-tag@w3.org"> www-tag@w3.org ( <a shape="rect" href="http://lists.w3.org/Archives/Public/www-tag/"> archive ).

This draft incorporates changes based on discussion by the TAG at its November 2003 face-to-face meeting in Japan. A complete <a shape="rect" href="/2001/tag/webarch/changes"> list of changes since the previous Working Draft is available on the Web.

The TAG is preparing to start a Last Call review of the version 1.0 Architecture Document. The TAG's issues list indicates which issues the TAG intends to address before a Last Call review of this document, and which issues the TAG intends to address after publication of a version 1.0 Recommendation.

This document uses the concepts and terms regarding URIs as defined in draft-fielding-uri-rfc2396bis-03, preferring them to those defined in RFC 2396. The IETF Internet Draft draft-fielding-uri-rfc2396bis-03 is expected to obsolete RFC 2396, which is the current URI standard. The TAG is tracking the evolution of draft-fielding-uri-rfc2396bis-03.

Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than "work in progress." The latest information regarding <a shape="rect" rel="disclosure" href="/2001/tag/disclosures"> patent disclosures related to this document is available on the Web.

<a shape="rect" id="contents" name="contents"> Table of Contents

<a shape="rect" id="principles" name="principles"> List of Principles and Good Practice Notes

The following principles and good practice notes explained in this document are listed here for convenience.

General Architecture Principles
  1. Error recovery
Identification
  1. Identify with URIs
  2. URI uniqueness
  3. URI assignment
  4. URI aliases
  5. Consistent URI usage
  6. URI ambiguity
  7. New URI schemes
  8. URI opacity
Interaction
  1. Fragment identifier consistency
  2. Authoritative server metadata
  3. Appropriate metadata
  4. Safe retrieval
  5. Consistent representation
  6. Available representation
Data Formats
  1. Version information
  2. Namespace policy
  3. Extensibility mechanisms
  4. Unknown extensions
  5. Separation of content, presentation, interaction
  6. Link mechanisms
  7. Web linking
  8. Generic URIs
  9. Hypertext links
  10. Namespace adoption
  11. Namespace documents
  12. QName Mapping
  13. QNames Indistinguishable from URIs
  14. XML and text/*
  15. XML and character encodings

1. <a shape="rect" id="intro" name="intro"> Introduction

The World Wide Web (WWW, or simply Web) is an information space in which the items of interest, referred to as resources, are identified by global identifiers called Uniform Resource Identifiers (URIs).

A <a shape="rect" name="scenario" id="scenario"> travel scenario is used throughout this document to illustrate typical behavior of Web agents: software acting on this information space on behalf of a person, entity, or process. Software agents include servers, proxies, spiders, browsers, and multimedia players.

Story

While planning a trip to Mexico, Nadia reads "Oaxaca weather information: 'http://weather.example.com/oaxaca'" in a glossy travel magazine. Nadia has enough experience with the Web to recognize that "http://weather.example.com/oaxaca" is a URI. Given the context in which the URI appears, she expects that it allows her to access weather information. When Nadia enters the URI into her browser:

  1. The browser performs an information retrieval action in accordance with its configured behavior for resources identified via the "http" URI scheme.
  2. The authority responsible for "weather.example.com" provides information in a response to the retrieval request.
  3. The browser displays the retrieved information, which includes hypertext links to other information. Nadia can follow these hypertext links to retrieve additional information.

This scenario illustrates the three architectural bases of the Web that are discussed in this document:

  1. <a shape="rect" href="#identification"> Identification. Each <a shape="rect" href="#def-resource"> resource is identified by a URI. In this travel scenario, the resource is about the weather in Oaxaca and the URI is "http://weather.example.com/oaxaca".
  2. <a shape="rect" href="#interaction"> Interaction. Protocols define the syntax and semantics of messages exchanged by agents over a network about Web resources. Web agents communicate information about the state of a resource through <a shape="rect" href="#def-representation"> representations. In the travel scenario, Nadia (by clicking on a <a shape="rect" href="#link"> hypertext link) tells her browser to request a representation of the resource identified by the URI in the hypertext link. The browser sends an HTTP GET request to the server at "weather.example.com". The server responds with a representation that includes XHTML data and the Internet Media Type "application/xhtml+xml".
  3. <a shape="rect" href="#formats"> Formats. Representations are built from a non-exclusive set of data formats, used separately or in combination (including XHTML, CSS, PNG, XLink, RDF/XML, SVG, and SMIL animation). In this scenario, the representation data is XHTML, which includes hypertext links to several SVG weather map images. While interpreting the XHTML representation data, which includes references to weather maps identified by URIs, the browser retrieves and displays those maps.
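
The three bases can be made concrete with a minimal Python sketch. It uses only the standard library and sends nothing over the network. The `Accept` header value and the helper `is_xhtml` are illustrative assumptions, not something this document prescribes; `application/xhtml+xml` is the registered XHTML media type.

```python
from urllib.parse import urlsplit
from urllib.request import Request

uri = "http://weather.example.com/oaxaca"

# Identification: the URI names the resource; its scheme selects the protocol.
parts = urlsplit(uri)
print(parts.scheme, parts.netloc, parts.path)

# Interaction: build (but do not send) the HTTP GET request a browser might issue.
request = Request(uri, headers={"Accept": "application/xhtml+xml"}, method="GET")
print(request.get_method(), request.full_url)


# Formats: the media type of the response tells the agent how to interpret the data.
def is_xhtml(media_type: str) -> bool:
    """Return True when the media type (ignoring parameters) denotes XHTML."""
    return media_type.split(";")[0].strip().lower() == "application/xhtml+xml"
```

Keeping the three steps separate in code mirrors the separation in the text: the URI can be parsed without knowing which representations exist, and the media type check does not depend on how the URI was obtained.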

The following illustration shows the simplest relationship between identifier, resource, and representation.

A resource (Oaxaca Weather Info) is identified by a particular URI and is represented by pseudo-HTML content

Editor's note : The TAG may include additional illustrations in this document to help explain important terms and their relationships.

1.1. <a shape="rect" name="about" id="about"> About this Document

This document attempts to describe the properties we desire of the Web and the design choices that have been made to achieve them.

This document promotes re-use of existing standards when suitable, and gives guidance on how to innovate in a manner consistent with the Web architecture.

The terms MUST, MUST NOT, SHOULD, SHOULD NOT, and MAY are used in the good practice notes, principles, etc. in accordance with RFC 2119 [RFC2119]. However, this document does not include conformance provisions for at least these reasons:

1.1.1. Audience of this Document

This document is intended to inform discussions about issues of Web architecture. The intended audience for this document includes:

  1. Participants in W3C Activities; i.e., developers of Web technologies and specifications in W3C
  2. Other groups and individuals developing technologies to be integrated into the Web
  3. Implementers of W3C specifications
  4. Web content authors and publishers

Readers will benefit from familiarity with the <a shape="rect" href="http://www.ietf.org/rfc.html"> Requests for Comments ( RFC ) series from the <a shape="rect" href="http://www.ietf.org/"> IETF , some of which define pieces of the architecture discussed in this document.

1.1.2. <a shape="rect" name="doc-scope" id="doc-scope"> Scope of this Document

This document presents the general architecture of the Web. Other groups inside and outside W3C also address specialized aspects of Web architecture, including accessibility, internationalization, device independence, and Web Services. The section on <a shape="rect" href="#archspecs"> Architectural Specifications includes references.

This document strives for brevity and precision while including illustrative examples. <a shape="rect" href="http://www.w3.org/2001/tag/findings"> TAG findings </a>, informational documents that provide more detail about selected topics, complement this document. This document includes some important material from the findings. Since the findings are expected to evolve independently, this document also includes references to approved TAG findings. For other TAG issues covered by this document but without an approved finding, references are to entries in the TAG issues list.

1.1.3. Principles, Constraints, and Good Practice

The important points of this document are categorized as follows:

Constraint
An architectural constraint is a restriction in behavior or interaction within the system. Constraints may be imposed for technical, policy, or other reasons.
Design Choice
In the design of the Web, some design choices, like the names of the <p> and <li> elements in HTML, or the choice of the colon character in URIs, are somewhat arbitrary; if <par>, <elt>, or * had been chosen instead, the large-scale result would, most likely, have been the same. Other design choices are more fundamental; these are the focus of this document.
Good practice
Good practice — by software developers, content authors, site managers, users, and specification writers — increases the value of the Web.
Principle
An architectural principle is a fundamental rule that applies to a large number of situations and variables. Architectural principles include "separation of concerns", "generic interface", "self-descriptive syntax," "visible semantics," "network effect" (Metcalfe's Law), and Amdahl's Law: "The speed of a system is determined by its slowest component."
Property
Architectural properties include both the functional properties achieved by the system, such as accessibility and global scope, and non-functional properties, such as relative ease of evolution, re-usability of components, efficiency, and dynamic extensibility.

This categorization is derived from Roy Fielding's work on "Representational State Transfer" [ <a shape="rect" href="#REST"> REST ]. Authors of protocol specifications in particular should invest time in understanding the REST model and consider which of its principles could guide their design: statelessness, clear assignment of roles to parties, uniform address space, and a limited, uniform set of verbs.

1.2. <a shape="rect" name="general" id="general"> General Architecture Principles

A number of general architecture principles apply across all three bases of Web architecture.

1.2.1. <a shape="rect" id="orthogonal-specs" name="orthogonal-specs"> Orthogonal Specifications

Identification, interaction, and representation are orthogonal (or, "independent", or "loosely coupled") concepts: an identifier can be assigned without knowing what representations are available, agents can interact with any identifier, and representations can change without regard to the identifiers or interactions that may dereference them.

Orthogonality of specifications facilitates a flexible design that can evolve over time. The fact, for example, that an image can be identified using a URI without needing any information about the representation of that image allowed PNG and SVG to evolve independently of the specifications that define image elements. Similarly, XML schema is defined as a schema language where the concept "datatype" is filled by an independent list of data types. The schema language can be extended by adding new datatypes.

Orthogonal abstractions deserve orthogonal specifications. Specifications should clearly indicate those features that simultaneously access information from otherwise orthogonal abstractions. For example, a specification should draw attention to a feature that requires information from both the header and the body of a message, or a feature that needs to infer information about the representations of a URI that are available.

Although the HTTP, HTML, and URI specifications are orthogonal for the most part, they are not completely orthogonal. Experience demonstrates that where they are not orthogonal, problems have arisen:

  • The HTML specification includes a protocol extension of sorts: it specifies how a user agent sends HTML form data to a server (as a URI query string). The design works reasonably well, although there are limitations related to internationalization (see the TAG finding "URIs, Addressability, and the use of HTTP GET and POST") and the query string design impinges on server design. Developers (for example, of [CGI] applications) might have an easier time finding the specification if it were published separately and then cited from the HTTP, URI, and HTML specifications.
  • The HTML specification allows content providers to instruct HTTP servers to build response headers from META element instances. This is a clear abstraction violation; the developer community deserves to be able to find all HTTP headers from the HTTP specification (including any associated extension registries and specification updates per IETF process). Furthermore, this design has led to confusion in user agent development. The HTML specification states that META in conjunction with http-equiv is intended for HTTP servers, but many HTML user agents interpret http-equiv='refresh' as a client-side instruction.
  • Some authors use the META/http-equiv approach to declare the character encoding scheme of an HTML document. By design, this is a hint that an HTTP server should emit a corresponding "Content-Type" header field. In practice, the use of the hint in servers is not widely deployed and many user agents peek inside the HTML document in preference to the "Content-Type" header field. This works against the principle of authoritative representation metadata.
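
The character encoding case can be sketched in Python as follows. This is a minimal illustration, assuming an agent that treats the "Content-Type" header field as authoritative and consults the META hint only when the header carries no charset parameter; that fallback policy and the names `charset_from_content_type`, `MetaCharsetFinder`, and `effective_charset` are assumptions of the sketch, not behavior defined by this document.

```python
from email.message import Message
from html.parser import HTMLParser
from typing import Optional


def charset_from_content_type(value: str) -> Optional[str]:
    """Extract the charset parameter from a Content-Type value, if present."""
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_charset()


class MetaCharsetFinder(HTMLParser):
    """Record the charset hinted by <meta http-equiv="Content-Type" ...>."""

    def __init__(self) -> None:
        super().__init__()
        self.charset: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        is_hint = (attributes.get("http-equiv") or "").lower() == "content-type"
        if tag == "meta" and is_hint and self.charset is None:
            self.charset = charset_from_content_type(attributes.get("content") or "")


def effective_charset(header_value: str, body: bytes) -> Optional[str]:
    """Prefer the protocol-level declaration; use the META hint only as a fallback."""
    from_header = charset_from_content_type(header_value)
    if from_header is not None:
        return from_header
    finder = MetaCharsetFinder()
    finder.feed(body.decode("ascii", errors="replace"))
    return finder.charset
```

An agent written this way never lets the document body override what the server declared, which is the ordering the last bullet argues for.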

1.2.2. Extensibility of Languages

The information in the Web and the technologies used to represent that information change over time. Some examples of successful technologies designed to allow change while minimizing disruption include:

  • the fact that URI schemes are independently specified,
  • the use of an open set of Internet media types in mail and HTTP to specify document interpretation,
  • the separation of the generic XML grammar and the open set of XML namespaces for element and attribute names,
  • Cascading Style Sheets (CSS) rules for handling unknown style properties and property values,
  • forward-compatible style sheet processing in [XSLT10],
  • the SOAP extensibility model, and
  • user agent plug-ins.

The following applies to languages, in particular the specifications of data formats, of message formats, and URIs. Note: This document does not distinguish in any formal way the terms "format" and "language." Context has determined which term is used.

Language subset: one language is a subset (or "profile") of another if and only if any document in the first language is also a valid document in the second language and has the same interpretation in the second language.

Language extension: one language is an extension of another if and only if the second is a language subset of the first (thus, the extension is a superset). "Extensibility" is the property of a language that allows the creation of extensions. The original language design can accomplish extensibility by defining, for predictable unknown extensions, how implementations handle them: for example, that they be ignored (in some way) or be considered errors.

For example, from early on in the Web, HTML agents followed the convention of ignoring unknown elements. This choice left room for innovation (i.e., non-standard elements) and encouraged the deployment of HTML. However, interoperability problems arose as well. In this type of environment, there is an inevitable tension between interoperability in the short term and the desire for extensibility. Experience shows that designs that strike the right balance between allowing change and preserving interoperability are more likely to thrive and are less likely to disrupt the Web community. Orthogonal specifications help reduce the risk of disruption.
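
The two ways of handling unknown extensions, ignoring them or treating them as errors, can be compared in a small Python sketch. The vocabulary in `KNOWN_ELEMENTS` and the names `ElementChecker`, `UnknownElementError`, and `check` are hypothetical and chosen only for illustration; real HTML agents are considerably more involved.

```python
from html.parser import HTMLParser

# Hypothetical vocabulary of the original language.
KNOWN_ELEMENTS = {"html", "head", "title", "body", "p", "a"}


class UnknownElementError(ValueError):
    """Raised when the strict policy meets an element outside the vocabulary."""


class ElementChecker(HTMLParser):
    def __init__(self, ignore_unknown: bool) -> None:
        super().__init__()
        self.ignore_unknown = ignore_unknown
        self.seen = []     # known elements, in document order
        self.ignored = []  # unknown elements skipped under the lenient policy

    def handle_starttag(self, tag, attrs):
        if tag in KNOWN_ELEMENTS:
            self.seen.append(tag)
        elif self.ignore_unknown:
            self.ignored.append(tag)
        else:
            raise UnknownElementError(f"unknown element: {tag}")


def check(document: str, ignore_unknown: bool) -> ElementChecker:
    """Run the checker over a document under the chosen policy."""
    checker = ElementChecker(ignore_unknown)
    checker.feed(document)
    checker.close()
    return checker
```

Under the lenient policy a document that uses a non-standard element is still processed, which favors deployment; under the strict policy the same document is rejected, which favors short-term interoperability.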

For further discussion of extensibility, see the section on versioning and extensibility.

1.2.3. Error Handling

Errors occur in networked information systems. The manner in which they are dealt with depends on application context. A user agent acts on behalf of the user and therefore is expected to help the user understand the nature of errors, and possibly overcome them. User agents that correct errors without the consent of the user are not acting on the user's behalf.

Principle: Error recovery

Silent recovery from error is harmful.

To promote interoperability, specifications should set expectations about behavior in the face of known error conditions. Experience has led to the following observations about error-handling approaches.

  • Protocol designers should provide enough information about the error condition so that an agent can address the error condition. For instance, an HTTP 404 message ("resource not found") is useful because it allows user agents to present relevant information to users, enabling them to contact the author of the representation that included the (broken) link. Similarly, experience with the cost of building a user agent to handle the diverse forms of ill-formed HTML content convinced the authors of the XML specification to require that agents fail deterministically upon encountering ill-formed content. Because users are unlikely to tolerate such failures, this design choice has pressured all parties into respecting XML's constraints, to the benefit of all.
  • An agent that encounters unrecognized content may handle it in a number of ways, including as an error. See the section on extensibility and versioning for related information.
  • Error behavior that is appropriate for a person may not be appropriate for software. People are capable of exercising judgement in ways that software applications generally cannot. An informal error response may suffice for a person but not for a processor.
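
Deterministic failure on ill-formed XML, with the error reported rather than silently repaired, might look like the following Python sketch. The function name `load_forecast` and the shape of its return value are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Tuple


def load_forecast(xml_text: str) -> Tuple[Optional[ET.Element], Optional[str]]:
    """Parse XML; on ill-formed input, report the error instead of guessing a repair."""
    try:
        return ET.fromstring(xml_text), None
    except ET.ParseError as error:
        # Surface the parser's diagnostic so a person or calling agent can act on it.
        return None, f"ill-formed XML: {error}"
```

The caller receives either a parsed document or an explicit description of the problem, never a best-effort reconstruction of what the author might have meant.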

See the TAG finding "Client handling of MIME headers" for more discussion about error reporting. See also TAG issue errorHandling-20.

1.2.4. Protocol-based Interoperability

The Web follows Internet tradition in that its important interfaces are defined not in terms of APIs or data structures or object models, but in terms of protocols, by specifying the content and sequence of the messages interchanged. The messages exchanged among agents in the Web last longer than the agents themselves.

It is common for programmers working with the Web to write code that generates and parses these messages directly. It is less common, but not unusual, for end users to have direct exposure to these messages. This leads to the well-known "view source" effect, whereby users gain expertise in the workings of the systems by direct exposure to the underlying protocols.
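
As a rough illustration of code that generates and parses such messages directly, the Python sketch below builds an HTTP/1.1 GET request as bytes and splits a response into status line, header fields, and body. The function names `build_get_request` and `parse_response` are hypothetical, and the parsing is deliberately simplified (for example, it assumes the status line includes a reason phrase and ignores transfer encodings).

```python
from email.parser import Parser


def build_get_request(host: str, path: str) -> bytes:
    """Serialize a minimal HTTP/1.1 GET request."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}", "Connection: close", "", ""]
    return "\r\n".join(lines).encode("ascii")


def parse_response(raw: bytes):
    """Split a raw HTTP response into (status code, reason, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, _, header_block = head.decode("iso-8859-1").partition("\r\n")
    _version, code, reason = status_line.split(" ", 2)
    # HTTP header fields share their syntax with mail headers, so the
    # standard library's mail header parser can read them.
    headers = Parser().parsestr(header_block, headersonly=True)
    return int(code), reason, headers, body
```

Nothing in the sketch depends on a particular client or server library: the contract between the two sides is the sequence of bytes on the wire.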

Widespread APIs such as the Simple API for XML [ SAX ] greatly facilitate the development of Web software, and XPath and XQuery show the utility of abstract data models.

2. Identification

Parties who wish to communicate must agree upon a shared set of identifiers and on their meanings. This shared vocabulary has a tangible value: it reduces the cost of communication. The ability to use common identifiers across communities motivates global identifiers in Web architecture. Thus, Uniform Resource Identifiers ([ URI ], currently being revised), which are global identifiers in the context of the Web, are central to Web architecture.

Constraint: Identify with URIs

The identification mechanism for the Web is the URI.

A URI must be assigned to a resource in order for agents to be able to refer to the resource. It follows that a resource should be assigned a URI if a third party might reasonably want to link to it, make or refute assertions about it, retrieve or cache a representation of it, include all or part of it by reference into another representation, annotate it, or perform other operations on it.

Constraint: URI uniqueness

Web architecture does not constrain a Web resource to be identified by a single URI.

Resources exist before URIs; a resource may be identified by zero URIs. However, there are many benefits to assigning a URI to a resource, including linking, bookmarking, caching, and indexing by search engines. Designers should expect that it will prove useful to be able to share a URI across applications, even if that utility is not initially evident.

The scope of a URI is global: the resource identified by a URI does not depend on the context in which the URI appears. Of course, what an agent does with a URI may vary. The TAG finding " URIs, Addressability, and the use of HTTP GET and POST " discusses additional benefits and considerations.

When a representation uses a URI (instead of a local identifier) as an identifier, it gains great power from the vastness of the choice of resources to which it can refer. The phrase "network effect" describes the fact that the usefulness of the technology is dependent on the size of the deployed Web.

Principle: URI assignment

A resource owner SHOULD assign a URI to each resource that others will expect to refer to.

This principle dates back at least as far as Douglas Engelbart's seminal work on open hypertext systems; see section Every Object Addressable in [ Eng90 ].

2.1. URI Comparisons

As stated above, Web architecture allows resource owners to assign more than one URI to a resource. Thus, URIs that are not identical (character for character) do not necessarily refer to different resources. The most straightforward way of establishing that two parties are referring to the same Web resource is to compare, as character strings, the URIs they are using. URI equivalence is discussed in section 6 of [ URI ].

Good practice: URI aliases

Resource owners should not create arbitrarily different URIs for the same resource.

URI producers should be conservative about the number of different URIs they produce for the same resource. For example, the parties responsible for weather.example.com should not use both "http://weather.example.com/Oaxaca" and "http://weather.example.com/oaxaca" to refer to the same resource; agents will not detect the equivalence relationship by following specifications. On the other hand, there may be good reasons for creating similar-looking URIs. For instance, one might reasonably create URIs that begin with "http://www.example.com/tempo" and "http://www.example.com/tiempo" to provide access to resources by users who speak Italian and Spanish.

Likewise, URI consumers should ensure URI consistency. For instance, when transcribing a URI, agents should not gratuitously escape characters. The term "character" refers to URI characters as defined in section 2 of [ URI ].

Good practice: Consistent URI usage

If a URI has been assigned to a resource, agents SHOULD refer to the resource using the same URI, character for character.

Applications may apply rules beyond basic string comparison (for example, for "http" URIs, the authority component is case-insensitive) that are licensed by specifications to reduce the risk of false negatives and positives. Agents that reach conclusions based on comparisons that are not licensed by relevant specifications take responsibility for any problems that result. Agents should not assume, for example, that "http://weather.example.com/Oaxaca" and "http://weather.example.com/oaxaca" identify the same resource, since none of the specifications involved states that the path part of an "http" URI is case-insensitive.

See section 6 of [ URI ] for more information about comparing URIs and reducing the risk of false negatives and positives. See the section on future directions for solutions other than string comparison that may allow different parties to determine that two URIs identify the same resource.
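
The comparison rules discussed in this section can be sketched as follows. This is only an illustration, not a normative algorithm: the function names are hypothetical, and the only rule applied beyond string comparison is the case-insensitivity of the scheme and host of an "http" URI. The path is deliberately left untouched.

```python
from urllib.parse import urlsplit, urlunsplit


def same_by_string(a: str, b: str) -> bool:
    """Baseline test: compare the two URIs character for character."""
    return a == b


def normalize_http(uri: str) -> str:
    """Lowercase only the scheme and host of an "http" URI.

    The path, query, and fragment are left as-is, because no
    specification states that they are case-insensitive.
    """
    parts = urlsplit(uri)  # urlsplit already lowercases the scheme
    if parts.scheme != "http":
        return uri  # this sketch applies no rules for other schemes
    # Keep any userinfo ("user@") unchanged; lowercase host and port only.
    userinfo, at, hostport = parts.netloc.rpartition("@")
    netloc = userinfo + at + hostport.lower()
    return urlunsplit((parts.scheme, netloc, parts.path,
                       parts.query, parts.fragment))


def same_http_uri(a: str, b: str) -> bool:
    """String comparison after the licensed normalization above."""
    return normalize_http(a) == normalize_http(b)
```

Under these assumptions, "http://Weather.Example.com/Oaxaca" and "http://weather.example.com/Oaxaca" compare as equal, while the "Oaxaca" and "oaxaca" paths do not.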

2.2. URI Ownership

The requirement for URIs to be unambiguous demands that two agents do not assign the same URI to different resources. URI scheme specifications assure this using a variety of techniques, including:

The approach taken for the "http" URI scheme follows the pattern whereby the Internet community delegates authority, via the IANA URI scheme registry [ IANASchemes ] and the DNS, over a set of URIs with a common prefix to one particular owner. One consequence of this approach is the Web's heavy reliance on the central DNS registry.

Whatever the techniques used, except for the checksum case, the agent has a unique relationship with the URI, called URI ownership. The phrase "authority responsible for a URI" is synonymous with "URI owner" in this document.

The social implications of URI ownership are not discussed here. However, the success or failure of these different approaches depends on the extent to which there is consensus in the Internet community to abide by the defining specifications, as expressed through protocol messages. Of particular importance are those messages that express a relationship between a URI and a representation of the resource it identifies. The concept of URI ownership is especially visible in the case of the HTTP protocol, which enables the URI owner to serve authoritative representations of a resource. In this case, the HTTP origin server (defined in [ RFC2616 ]) is the agent acting on behalf of the URI owner.

2.3. URI Ambiguity

Just as a shared vocabulary has tangible value, the ambiguous use of terms imposes a cost in communication. URI ambiguity refers to the use of the same URI to refer to more than one distinct resource.

Good practice: URI ambiguity

Avoid URI ambiguity.

URI ambiguity should not be confused with ambiguity in natural language. The English statement "'http://www.example.com/moby' identifies 'Moby Dick'" is ambiguous because one could understand the statement to refer to distinct resources: a particular printing of this work, or the work itself in an abstract sense, or the fictional white whale, or a particular copy of the book on the shelves of a library (via the Web interface of the library's online catalog), or the record in the library's electronic catalog which contains the metadata about the work, or the Gutenberg project's online version.

2.3.1. URIs in other Roles

In Web architecture, URIs identify resources. They are also useful in other roles, but this should not normally lead to ambiguity in the identification function. Consider the following scenario: a software-development group building a database of information about companies might choose to use the URI of each company's Web site as a unique lookup key, since URIs have useful properties of uniqueness, longevity, and moderate length. In this application, the Web site URI is being used indirectly to identify the company. The same software-development group might build another database of Web pages, very likely indexed by URI. However, this does not mean that the company has become its Web site, that some Web-page record is actually a company, that the fields of the two databases would be consistent, or that the URIs would necessarily be useful as a basis for merging.

Similarly, people may be identified by their email addresses. When conference organizers ask attendees to register by giving their email addresses, both parties know that they are using the mailbox identifier indirectly to identify the person. The resource identified by the URI "mailto:nadia@example.com" is still a mailbox, not a person.

2.4. URI Schemes

In the URI "http://weather.example.com/", the "http" that appears before the colon (":") names a URI scheme. Each URI scheme has a normative specification that explains how identifiers are assigned within that scheme. The URI syntax is thus a federated and extensible naming mechanism wherein each scheme's specification may further restrict the syntax and semantics of identifiers within that scheme. Furthermore, the URI scheme specification may specify whether and how an agent can dereference the URI.
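
As an illustration of how an agent might locate the scheme name before applying any scheme-specific rules, consider the following minimal sketch. The set of handled schemes and both function names are hypothetical and stand in for whatever a given agent actually implements.

```python
from urllib.parse import urlsplit

# Hypothetical set of schemes that this imaginary agent knows how to process.
HANDLED_SCHEMES = {"http", "mailto"}


def scheme_of(uri: str) -> str:
    """Return the scheme name: the part of the URI before the first colon."""
    return urlsplit(uri).scheme


def is_handled(uri: str) -> bool:
    """Report whether this agent has scheme-specific processing for the URI."""
    return scheme_of(uri) in HANDLED_SCHEMES
```

With these assumptions, scheme_of("http://weather.example.com/") yields "http", and is_handled("weather://travel.example.com/oaxaca") is false because the agent has no processing rules for a "weather" scheme.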

Examples of URIs from various schemes include:

While Web architecture allows the definition of new schemes, introducing a new URI scheme is costly. Many aspects of URI processing are scheme-dependent, and a significant amount of deployed software already processes URIs of well-known schemes. Introducing a new URI scheme requires the development and deployment not only of client software to handle the scheme, but also of ancillary agents such as gateways, proxies, and caches. See [ RFC2718 ] for other considerations and costs related to URI scheme design.

Because of these costs, if a URI scheme exists that meets the needs of an application, designers should use it rather than invent one. The "https" scheme [ RFC2818 ] is an example of a URI scheme that, though commonly implemented by agents, is problematic for a number of reasons:

Good practice: New URI schemes

Authors of specifications SHOULD NOT introduce a new URI scheme when an existing scheme provides the desired properties of identifiers and their relation to resources.

Consider our travel scenario: should the authority providing information about the weather in Oaxaca register a new URI scheme "weather" for the identification of resources related to the weather? They might then publish URIs such as "weather://travel.example.com/oaxaca". When a software agent dereferences such a URI, if what really happens is that HTTP GET is invoked to retrieve a representation of the resource, then an "http" URI would have sufficed.

If the motivation behind registering a new scheme is to allow a software agent to launch a particular application when retrieving a representation, such dispatching can be accomplished at lower expense via Internet Media Types. If you are designing a new data format, the appropriate mechanism to promote its deployment on the Web is the Internet Media Type.
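
Dispatching on the Internet Media Type, rather than on a new URI scheme, might look like the following sketch. The handler functions and the HANDLERS table are hypothetical; a real agent would register its own handlers.

```python
from typing import Callable, Dict

Handler = Callable[[bytes], str]


def render_jpeg(data: bytes) -> str:
    # Placeholder for launching an image viewer.
    return f"rendered JPEG image ({len(data)} octets)"


def render_html(data: bytes) -> str:
    # Placeholder for launching an HTML renderer.
    return f"rendered HTML page ({len(data)} octets)"


# Hypothetical table mapping media types to the application to launch.
HANDLERS: Dict[str, Handler] = {
    "image/jpeg": render_jpeg,
    "text/html": render_html,
}


def dispatch(media_type: str, data: bytes) -> str:
    """Choose an application from the media type of the representation."""
    handler = HANDLERS.get(media_type.lower())
    if handler is None:
        # The representation was still retrieved, even though no
        # handler is registered for its format.
        return f"no handler for {media_type}; {len(data)} octets retrieved"
    return handler(data)
```

Supporting a new data format in this sketch means adding one entry to the table; the URI scheme and the retrieval code stay the same.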

Note that even if an agent cannot yet handle representation data in a new format, the representation data may contain enough information to allow a user or user agent to find more information. When an agent does not handle a new URI scheme, it cannot retrieve a representation.

2.4.1. URI Scheme Registration

The Internet Assigned Numbers Authority ( IANA ) maintains a registry [ IANASchemes ] of mappings between URI scheme names and scheme specifications. For instance, the IANA registry indicates that the "http" scheme is defined in [ RFC2616 ]. The process for registering a new URI scheme is defined in [ RFC2717 ].

The use of unregistered URI schemes is discouraged for a number of reasons:

  • There is no generally accepted way to locate the scheme specification.
  • Someone else may be using the scheme for other purposes.
  • One should not expect that general-purpose software will do anything useful with URIs of this scheme; the network effect is lost.

Note: Some URI scheme specifications (such as the "ftp" URI scheme specification) use the term "designate" where the current document would use "identify."

TAG issue siteData-36 is about expropriation of naming authority.

Editor's note: In a future version of this document, the TAG may summarize some URI schemes and what the scheme specification licenses agents to infer by recognizing the scheme.

2.5. URI Opacity

It is tempting to guess the nature of a resource by inspection of a URI that identifies it. However, the Web is designed so that agents communicate resource state through representations, not identifiers. In general, one cannot determine the Internet Media Type of representations of a resource by inspecting a URI for that resource. For example, the ".html" at the end of "http://example.com/page.html" provides no guarantee that representations of the identified resource will be served with the Internet Media Type "text/html". The HTTP protocol does not constrain the Internet Media Type based on the path component of the URI; the server is free to return a representation in PNG or any other data format for that URI.
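
A minimal sketch of an agent that respects this separation follows. It takes the media type from the 'Content-Type' value supplied with the representation and deliberately ignores the spelling of the URI. The function name is hypothetical, and the header values are assumed to have been received in an HTTP response.

```python
from email.message import Message


def media_type_for(uri: str, content_type_header: str) -> str:
    """Return the media type declared by the server for a representation.

    The uri argument is intentionally unused: a suffix such as ".html"
    is not evidence of the media type.
    """
    msg = Message()
    msg["Content-Type"] = content_type_header
    # get_content_type() drops parameters such as charset and lowercases
    # the type/subtype pair.
    return msg.get_content_type()
```

For example, media_type_for("http://example.com/page.html", "image/png") returns "image/png", even though the URI ends in ".html".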

Resource state may evolve over time. Requiring resource owners to change URIs to reflect resource state would lead to a significant number of broken links. For robustness, Web architecture promotes independence between an identifier and the identified resource.

Good practice: URI opacity

Agents making use of URIs MUST NOT attempt to infer properties of the referenced resource except as licensed by relevant specifications.

The example URI used in the travel scenario ("http://weather.example.com/oaxaca") suggests that the identified resource has something to do with the weather in Oaxaca. A site reporting the weather in Oaxaca could just as easily be identified by the URI "http://vjc.example.com/315". And the URI "http://weather.example.com/vancouver" might identify the resource "my photo album."

On the other hand, the URI "mailto:joe@example.com" indicates that the URI refers to a mailbox. The "mailto" URI scheme specification authorizes agents to infer that URIs of this form identify Internet mailboxes.

In some cases, relevant technical specifications license URI assignment authorities to publish assignment policies. For more information about URI opacity, see the TAG finding " The use of Metadata in URIs ".

2.6. Fragment Identifiers

Story

When navigating within the XHTML data that Nadia receives as a representation of the resource identified by "http://weather.example.com/oaxaca", Nadia finds that the URI "http://weather.example.com/oaxaca#tom" refers to information about tomorrow's weather in Oaxaca. This URI includes the fragment identifier "tom" (the string after the "#").

The fragment identifier of a URI allows indirect identification of a secondary resource by reference to a primary resource and additional information. More precisely:

The secondary resource may be some portion or subset of the primary resource, some view on representations of the primary resource, or some other resource. The interpretation of fragment identifiers is discussed in the section on media types and fragment identifier semantics.
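
Separating the fragment identifier from the rest of a URI is a purely syntactic step, sketched below with a hypothetical helper. What the fragment identifier means still depends on the media type of the retrieved representation, which this sketch does not address.

```python
from typing import Tuple
from urllib.parse import urldefrag


def split_fragment(uri: str) -> Tuple[str, str]:
    """Split a URI into the primary-resource URI and the fragment identifier."""
    result = urldefrag(uri)
    return result.url, result.fragment
```

Applied to "http://weather.example.com/oaxaca#tom", the helper returns the pair ("http://weather.example.com/oaxaca", "tom").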

See the TAG finding " Abstract Component References " for information about indirect identification of abstract components such as those identified in description languages such as WSDL and RDF.

TAG issue DerivedResources-43 : How are secondary resources derived?

2.7. Future Directions for Identifiers

There remain open questions regarding identifiers on the Web. The following sections identify a few areas of future work in the Web community.

2.7.1. Internationalized Identifiers

The integration of internationalized identifiers (i.e., composed of characters beyond those allowed by [ URI ]) into the Web architecture is an important and open issue. See TAG issue IRIEverywhere-27 for discussion about work going on in this area.

3. Interaction

Communication between agents over a network about resources involves URIs, messages, and data.

Story

Nadia follows a hypertext link labeled "satellite image" expecting to retrieve a satellite photo of the Oaxaca region. The link to the satellite image is an XHTML link encoded as <a href="http://example.com/satimage/oaxaca">satellite image</a>. Nadia's browser analyzes the URI and determines that its scheme is "http". The browser configuration determines how it locates the identified information, which might be via a cache of prior retrieval actions, by contacting an intermediary (such as a proxy server), or by direct access to the server identified by the URI. In this example, the browser opens a network connection to port 80 on the server at "example.com" and sends a "GET" message as specified by the HTTP protocol, requesting a representation of the resource identified by "/satimage/oaxaca".

The server sends a response message to the browser, once again according to the HTTP protocol. The message consists of several headers and a JPEG image. The browser reads the headers, learns from the 'Content-Type' field that the Internet Media Type of the representation is image/jpeg , reads the sequence of octets that comprises the representation data, and renders the image.
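
The message exchange in this story can be sketched without any network access. The two helper functions below are hypothetical: the first builds the text of a minimal HTTP/1.1 GET request for a URI, and the second reads the media type from the head of a response. The sample response is an assumed illustration, not output captured from a real server.

```python
from email.parser import Parser
from urllib.parse import urlsplit


def build_get_request(uri: str) -> str:
    """Build the text of a minimal HTTP/1.1 GET request for an "http" URI."""
    parts = urlsplit(uri)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return f"GET {target} HTTP/1.1\r\nHost: {parts.netloc}\r\n\r\n"


def content_type_of(response_head: str) -> str:
    """Return the media type named by the 'Content-Type' response field."""
    # Skip the status line; the remaining lines are header fields.
    _status_line, _, header_block = response_head.partition("\r\n")
    headers = Parser().parsestr(header_block, headersonly=True)
    return headers.get_content_type()


# Assumed example of the start of the server's response in the story.
SAMPLE_RESPONSE_HEAD = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: image/jpeg\r\n"
    "Content-Length: 1234\r\n"
    "\r\n"
)
```

Here build_get_request("http://example.com/satimage/oaxaca") produces a request line of "GET /satimage/oaxaca HTTP/1.1", and content_type_of(SAMPLE_RESPONSE_HEAD) reports "image/jpeg", which is what tells the browser how to interpret the octets that follow.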

This section describes the architectural principles and constraints regarding interactions between agents, including such topics as network protocols and interaction styles, along with interactions between the Web as a system and the people who make use of it. The fact that the Web is a highly distributed system affects architectural constraints and assumptions about interactions.

Note: Web Architecture does not require a formal definition of the commonly used phrase "on the Web." Informally, a resource is "on the Web" when it has a URI and an agent can use the URI to retrieve a representation of it using network protocols (given appropriate access privileges, network connectivity, etc.). See the related TAG issue httpRange-14.

3.1. Using a URI to Access a Resource

Agents may use a URI to access the referenced resource; this is called dereferencing the URI. Access may take many forms, including retrieving a representation of resource state (for instance, by using HTTP GET or HEAD), modifying the state of the resource (for instance, by using HTTP POST or PUT), and deleting the resource (for instance, by using HTTP DELETE).
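
The same URI can be paired with different access methods depending on what the agent intends to do. The sketch below only constructs request objects and sends nothing over the network. The context names "render" and "check" and the function name are hypothetical.

```python
from urllib.request import Request

# Hypothetical mapping from what the agent wants to do to an HTTP method.
METHOD_FOR_CONTEXT = {
    "render": "GET",   # retrieve a representation in order to display it
    "check": "HEAD",   # only establish whether a representation is available
}


def request_for(uri: str, context: str) -> Request:
    """Build (but do not send) a request suited to the application context."""
    return Request(uri, method=METHOD_FOR_CONTEXT[context])
```

Both request_for(uri, "render") and request_for(uri, "check") target the same URI; only the method differs, which reflects the separation between identifying a resource and interacting with it.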

There may be more than one way to access a resource for a given URI; application context determines which access mechanism an agent uses. For instance, a browser might use HTTP GET to retrieve a representation of a resource, whereas a link checker might use HTTP HEAD on the same URI simply to establish whether a representation is available. Some URI schemes set expectations about available access mechanisms, others (such as the URN scheme [ RFC 2141 ]) do not. Section 1.2.2 of [ URI ] discusses the separation of identification and interaction in more detail. For more information about relationships between multiple access mechanisms and URI addressability, see the TAG finding " URIs, Addressability, and the use of HTTP GET and POST ".

Although many URI schemes are named after protocols, this does not imply that use of such a URI will result in access to the resource via the named protocol. Even when an agent uses a URI to retrieve a representation, that access might be through gateways, proxies, caches, and name resolution services that are independent of the protocol associated with the scheme name.

Dereferencing a URI generally involves a succession of steps as described in multiple independent specifications and implemented by the agent. The following example illustrates the series of specifications that are involved when a user instructs a user agent to follow a hypertext link that is part of an SVG document. In this example, the URI is "http://weather.example.com/oaxaca" and the application context calls for the user agent to retrieve and render a representation of the identified resource.

  1. Since the URI is part of a hypertext link in an SVG document, the first relevant specification is the SVG 1.1 Recommendation [ SVG11 ]. Section 17.1 of this specification imports the link semantics defined in XLink 1.0 [ XLink10 ]. "The remote resource (the destination for the link) is defined by a URI specified by the XLink href attribute on the 'a' element." The SVG specification goes on to state that interaction with an 'a' element involves retrieving a representation of a resource, identified by the XLink href attribute: "By activating these links (by clicking with the mouse, through keyboard input, voice commands, etc.), users may visit these resources."
  2. The XLink 1.0 [ XLink10 ] specification, which defines the attribute xlink:href in section 5.4, states that "The value of the href attribute must be a URI reference as defined in [IETF RFC 2396], or must result in a URI reference after the escaping procedure described below is applied."
  3. The URI specification [ URI ] states that "Each URI begins with a scheme name that refers to a specification for assigning identifiers within that scheme." The URI scheme name in this example is "http".
  4. [ IANASchemes ] states that the "http" scheme is defined by the HTTP/1.1 specification (RFC 2616 [ RFC2616 ], section 3.2.2).
  5. In this SVG context, the agent constructs an HTTP GET request (per section 9.3 of [ RFC2616 ]) to retrieve the representation.
  6. Section 6 of [ RFC2616 ] defines how the server constructs a corresponding response message, including the 'Content-Type' field.
  7. Section 1.4 of [ RFC2616 ] states "HTTP communication usually takes place over TCP/IP connections." This example does not address that step in the process, or other steps such as Domain Name System ( DNS ) resolution.
  8. The agent deleted text: implements those specifications, it interprets the data accordingly; <a shape="rect" href="#error-handling"> error handling </a> is discussed below. In this document, returned representation according to the deleted text: phrase "media type M" is shorthand for "the data format defined by specification that corresponds to the specification(s) paired with representation's Internet Media Type M (the value of the HTTP 'Content-Type') in the relevant IANA registry." </p> registry [ MEDIATYPEREG ].
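
The middle of this sequence (identifying the URI scheme, then constructing an HTTP GET request) can be illustrated with a short sketch. It is not part of any of the cited specifications; the function name `build_get_request` is hypothetical, and the sketch assumes an "http" URI and omits everything a real agent would add (DNS resolution, the TCP connection, additional headers).

```python
from urllib.parse import urlsplit


def build_get_request(uri: str) -> str:
    """Build a minimal HTTP/1.1 GET request message for an "http" URI.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(uri)
    # The scheme name tells the agent which specification governs the URI.
    if parts.scheme != "http":
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    # An empty path is requested as "/".
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # HTTP/1.1 requires a Host header; no other headers are sent here.
    return f"GET {target} HTTP/1.1\r\nHost: {parts.netloc}\r\n\r\n"
```

For "http://weather.example.com/oaxaca", the request line is "GET /oaxaca HTTP/1.1" and the Host header carries "weather.example.com".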

3.2. Messages and Representations

The Web's protocols (including HTTP, FTP, SOAP, NNTP, and SMTP) are based on the exchange of messages. A message may include data, metadata about the data, and message metadata: metadata about the message (such as the HTTP Transfer-Encoding header). A message may even include metadata about the message metadata (for message-integrity checks, for instance).

Two important classes of message are those that request a representation of a resource, and those that return the result of such a request. Such a response message (for example, a response to an HTTP GET) carries a representation of the state of the resource. A representation is an octet sequence that consists logically of two parts:

  1. Representation data, electronic data about resource state, expressed in one or more formats used separately or in combination, and
  2. Representation metadata. One important piece of metadata is the Internet Media Type, discussed below.
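
As an informal illustration of this two-part structure, a representation can be modeled as an octet sequence plus a set of metadata fields. The class and attribute names below are hypothetical, and the sketch assumes that metadata arrives as HTTP-style header fields.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class Representation:
    """Illustrative model: representation data plus representation metadata."""

    data: bytes  # the octet sequence describing resource state
    metadata: Dict[str, str] = field(default_factory=dict)  # e.g. header fields

    @property
    def media_type(self) -> Optional[str]:
        """Return the Internet Media Type without parameters, if declared."""
        value = self.metadata.get("Content-Type")
        if value is None:
            return None
        # Drop parameters such as "; charset=utf-8" and normalize case.
        return value.split(";", 1)[0].strip().lower()
```

Keeping the metadata separate from the data mirrors the distinction drawn above: the media type is not derived from the octets but is carried alongside them.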

Some protocols (such as HTTP) may also allow agents to exchange resource metadata. For example, when using HTTP, some resource metadata is specified by headers such as 'Alternates' and 'Vary'.

Agents use representations to modify as well as retrieve resource state. Note that even though the response to an HTTP POST request may contain the above types of data, the response to an HTTP POST request is not necessarily a representation of the state of the resource identified in the POST request.

3.3. Internet Media Type

The Internet Media Type [RFC2046] of a representation determines which data format specification(s) provide the authoritative interpretation of the representation data (including fragment identifier syntax and semantics, if any). The IANA registry [MEDIATYPEREG] maps media types to data formats.

See the TAG finding "Internet Media Type registration, consistency of use" for more information about media type registration.

3.3.1. Media Types and Fragment Identifier Semantics

Story

In one of his XHTML pages, Dirk links to an image that Nadia has published on the Web. He creates a hypertext link with <a href="http://www.example.com/images/nadia#hat">Nadia's hat</a>. Nadia serves an SVG representation of the image, so the authoritative interpretation of the fragment identifier "hat" depends on the SVG specification.

Per [URI], in order to know the authoritative interpretation of a fragment identifier, one must dereference the URI containing the fragment identifier. The Internet Media Type of the retrieved representation specifies the authoritative interpretation of the fragment identifier. Thus, in the case of Dirk and Nadia, the authoritative interpretation depends on the SVG specification, not the XHTML specification (i.e., the context where the URI appears).

Interpretation of the fragment identifier during a retrieval action is performed solely by the agent; the fragment identifier is not passed to other systems during the process of retrieval. This means that some intermediaries in the Web architecture (such as proxies) have no interaction with fragment identifiers and that redirection (in HTTP [RFC2616], for example) does not account for them.
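
The separation between what is sent during retrieval and what the agent keeps for itself can be shown with a small sketch. The function name `split_for_retrieval` is hypothetical; `urldefrag` is part of the Python standard library.

```python
from urllib.parse import urldefrag


def split_for_retrieval(uri: str):
    """Return (uri_to_dereference, fragment_kept_by_agent).

    Only the first element is used to build the request; the fragment is
    interpreted by the agent after a representation has been retrieved.
    """
    base, fragment = urldefrag(uri)
    return base, fragment
```

Applied to Dirk's link, "http://www.example.com/images/nadia#hat", this yields the URI "http://www.example.com/images/nadia" for the request and the fragment "hat", which never reaches the server or any intermediary.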

Note that one can use a URI with a fragment identifier even if one does not have a representation available for interpreting the fragment identifier (one can compare two such URIs, for example). Parties that make conclusions about the interpretation of a fragment identifier without retrieving a representation do so at their own risk; such interpretations are not authoritative.

3.3.2. Fragment Identifiers and Multiple Representations

Story

Dirk informs Nadia that he would also like her to make her images available in formats other than SVG. For the same resource (thus, the same URI), Nadia makes available a PNG image as well. Dirk's user agent and Nadia's server negotiate so that the user agent retrieves a suitable representation. Which specification specifies the authoritative interpretation of the "hat" fragment identifier, the PNG specification or the SVG specification?

For a given resource, an agent may have the choice between representation data in more than one data format (through HTTP content negotiation, for example). Since different data formats may define different fragment identifier semantics, it is important to note that by design the secondary resource identified by a URI with a fragment identifier is expected to be the same independent of the representations. Thus, if a fragment has defined semantics in any one representation, the fragment is identified for all of them, even though a particular data format cannot represent it.

Suppose, for example, that the authority responsible for "http://weather.example.com/oaxaca/map#zicatela" provides representations of the resource identified by http://weather.example.com/oaxaca/map using three image formats: SVG, PNG, and JPEG/JFIF. The SVG specification defines semantics for fragment identifiers while the other specifications do not. It is not considered an error that only one of the data formats specifies semantics for the fragment identifier. Because the Web is a distributed system in which formats and agents are deployed in a non-uniform manner, the architecture allows this sort of discrepancy. Authors may take advantage of more powerful data formats, while still ensuring reasonable backward-compatibility for users whose agents do not yet implement them.

On the other hand, it is considered an error if the semantics of the fragment identifiers used in two representations of a secondary resource are inconsistent.

Good practice: Fragment identifier consistency

A resource owner that creates a URI with a fragment identifier and that uses content negotiation to serve multiple representations of the identified resource SHOULD NOT serve representations with inconsistent fragment identifier semantics.

Inconsistent fragment identifier semantics are one source of URI ambiguity.
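
A resource owner could check for this kind of inconsistency before publishing. The sketch below is only one possible approach under stated assumptions: each representation is summarized as a mapping from fragment identifier to a label for what that fragment identifies, and formats that define no fragment semantics contribute an empty mapping. The function name and the labels are hypothetical.

```python
from typing import Dict, List


def inconsistent_fragments(representations: Dict[str, Dict[str, str]]) -> List[str]:
    """Return fragment identifiers that identify different things in different formats.

    `representations` maps a media type to {fragment: what_it_identifies}.
    A format that defines no semantics for a fragment simply omits it,
    which is not treated as an error.
    """
    seen: Dict[str, str] = {}
    conflicts = set()
    for fragments in representations.values():
        for fragment, target in fragments.items():
            if fragment in seen and seen[fragment] != target:
                conflicts.add(fragment)
            seen.setdefault(fragment, target)
    return sorted(conflicts)
```

With the Oaxaca map served as SVG, PNG, and JPEG/JFIF, where only the SVG representation defines "zicatela", the check reports no conflict; it reports "zicatela" only if a second format defines that fragment to mean something else.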

See related TAG issues httpRange-14 and RDFinXHTML-35 .

3.4. Authoritative Representation Metadata

Successful communication between two parties using a piece of information relies on shared understanding of the meaning of the information. Arbitrary numbers of independent parties can identify and communicate about a Web resource. To give these parties the confidence that they are all talking about the same thing when they refer to "the resource identified by the following URI ..." the design choice for the Web is, in general, that the owner of a resource assigns the authoritative interpretation of representations of the resource. See the TAG finding "Client handling of MIME headers" for related discussion. See also TAG issue rdfURIMeaning-39.

In our travel scenario, the authority responsible for "weather.example.com" has license to create representations of this resource. Which representation(s) Nadia receives depends on a number of factors, including:

  1. Whether the authority responsible for "weather.example.com" responds to requests at all;
  2. Whether the authority responsible for "weather.example.com" makes available one or more representations for the resource identified by "http://weather.example.com/oaxaca";
  3. Whether Nadia has access privileges to such representations (see the section on linking and access control);
  4. If the authority responsible for "weather.example.com" has provided more than one representation (in different formats such as HTML, PNG, or RDF, or in different languages such as English and Spanish), the resulting representation may depend on negotiation between the user agent and server that occurs as part of the HTTP transaction.
  5. When Nadia made the request. Since the weather in Oaxaca changes, Nadia should expect that representations will change over time.
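
The negotiation mentioned in the list above can be sketched as a server-side choice driven by the HTTP 'Accept' request header. This is a simplified illustration, not the full algorithm of [RFC2616]: it assumes well-formed header values, considers only the "q" parameter, and breaks ties in favor of the first available type. The function name `choose_representation` is hypothetical.

```python
from typing import Dict, List, Optional


def choose_representation(accept_header: str, available: List[str]) -> Optional[str]:
    """Pick the available media type that the Accept header ranks highest."""
    preferences: Dict[str, float] = {}
    for item in accept_header.split(","):
        media_range, _, params = item.strip().partition(";")
        q = 1.0  # default quality when no q parameter is given
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name == "q":
                q = float(value)
        preferences[media_range.strip().lower()] = q

    def quality(media_type: str) -> float:
        main_type = media_type.split("/")[0]
        # The most specific matching range wins: exact type, then type/*, then */*.
        for candidate in (media_type, main_type + "/*", "*/*"):
            if candidate in preferences:
                return preferences[candidate]
        return 0.0

    best = max(available, key=quality, default=None)
    return best if best is not None and quality(best) > 0 else None
```

If Dirk's user agent sends "image/svg+xml, image/png;q=0.8" and Nadia's server offers both formats, the sketch selects "image/svg+xml"; a user agent that accepts only "text/html" gets no match.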

3.4.1. Inconsistencies between Metadata and Representation Data

Inconsistencies between the data format of representation data and assigned representation metadata do occur. Examples that have been observed in practice include:

  • The actual character encoding of a representation is inconsistent with the charset parameter in the representation metadata.
  • The namespace of the root element of the representation data is inconsistent with the value of the 'Content-Type' field in HTTP headers.

User agents should detect such inconsistencies but should not resolve them without involving the user.

Principle: Authoritative server metadata

User agents MUST NOT silently ignore authoritative server metadata.

Thus, for example, if the parties responsible for "weather.example.com" mistakenly label the satellite photo of Oaxaca as "image/gif" instead of "image/jpeg", and if Nadia's browser detects a problem, Nadia's browser must not silently ignore the problem and render the JPEG image. Nadia's browser can notify Nadia of the problem or notify Nadia and take corrective action. Of course, user agent designers should not ignore usability issues when handling this type of error; notification may be discreet, and handling may be tuned to meet the user's preferences. See the TAG finding "Client handling of MIME headers" for more in-depth discussion and examples.
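
One way a user agent might detect such a mislabeled image is to compare the declared media type with the leading bytes of the data. The sketch below is illustrative only: the function name is hypothetical, the signature table covers just three image formats, and a real user agent would decide for itself how to surface the returned message to the user.

```python
from typing import Optional

# Leading byte sequences ("magic numbers") of three common image formats.
SIGNATURES = {
    "image/gif": (b"GIF87a", b"GIF89a"),
    "image/jpeg": (b"\xff\xd8\xff",),
    "image/png": (b"\x89PNG\r\n\x1a\n",),
}


def check_declared_type(declared: str, data: bytes) -> Optional[str]:
    """Return None if data is consistent with the declared type, else a notice.

    The caller is expected to present the notice to the user rather than
    silently reinterpret the data under a guessed type.
    """
    expected = SIGNATURES.get(declared.lower())
    if expected is None or data.startswith(expected):
        return None  # unknown type, or no inconsistency detected
    return f"Data does not look like {declared}; server metadata may be wrong."
```

The design choice here is that the function only detects and reports; it never substitutes a guessed media type for the one the server declared.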

Furthermore, server managers can help reduce the risk of error through careful assignment of representation metadata. The section on media types for XML presents an example of reducing the risk of error by providing no metadata about character encoding when serving XML.

Good practice: Appropriate metadata

Server managers MUST ensure that representation metadata is appropriate for each representation.

3.5. Safe Interactions

Story

Nadia decides to book a vacation to Oaxaca at "booking.example.com." She enters data into a series of online forms (built with [XFORMS10]) and is ultimately asked for credit card information to purchase the airline tickets. She provides this information in another form. When she presses the "Purchase" button, her browser opens another network connection to the server at "booking.example.com" and sends a message composed of form data using the POST method. Note that this is not a safe interaction; Nadia wishes to change the state of the system by exchanging money for airline tickets.

The server reads the POST request, and after performing the booking transaction returns a message to Nadia's browser that contains a representation of the results of Nadia's request. The representation data is in XHTML so that it can be saved or printed out for Nadia's records. Note that neither the data transmitted with the POST nor the data received in the response necessarily correspond to any resource named by a URI.

Nadia's retrieval of weather information (an example of a read-only query or lookup) qualifies as a "safe" interaction; a safe interaction is one where the agent does not incur any obligation beyond the interaction. An agent may incur an obligation through other means (such as by signing a contract). If an agent does not have an obligation before a safe interaction, it does not have that obligation afterwards.

Other Web interactions resemble orders more than queries. These unsafe interactions may cause a change to the state of a resource and the user may be held responsible for the consequences of these interactions. Unsafe interactions include subscribing to a newsletter, posting to a list, or modifying a database.

Safe interactions are important because these are interactions where users can browse with confidence and where agents (including search engines and browsers that pre-cache data for the user) can follow links safely. Users (or agents acting on their behalf) do not commit themselves to anything by querying a resource or following a link.

Principle: Safe retrieval

Agents do not incur obligations by retrieving a representation.

For instance, it is incorrect to publish a link that, when followed, subscribes a user to a mailing list. Remember that search engines may follow such links.

For more information about safe and unsafe operations using HTTP GET and POST, and handling security concerns around the use of HTTP GET, see the TAG finding "URIs, Addressability, and the use of HTTP GET and POST".
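
An agent that follows links automatically, such as a search engine or a pre-caching browser, can apply this principle with a simple guard. The sketch below assumes the HTTP/1.1 convention that GET and HEAD are the safe methods (section 9.1.1 of [RFC2616]); the function name is hypothetical.

```python
# Methods that HTTP/1.1 designates as safe.
SAFE_METHODS = frozenset({"GET", "HEAD"})


def may_follow_automatically(method: str) -> bool:
    """Return True if an agent may issue this request without user consent."""
    return method.upper() in SAFE_METHODS
```

A crawler using this guard would retrieve the Oaxaca weather page with GET but would never submit Nadia's booking form with POST on its own initiative.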

3.5.1. Unsafe Interactions and Accountability

Story

Nadia pays for her airline tickets online (through an unsafe POST interaction as described above). She receives a Web page with confirmation information and wishes to bookmark it so that she can refer to it when she calculates her expenses. Although Nadia can print out the results, or save them to a file, she cannot bookmark the results. In fact, neither the POST request, which expresses her commitment to pay, nor the airline company's response, which expresses its acknowledgment and its own commitment, can be referenced by URIs.

It is a breakdown of the Web architecture if agents cannot use URIs to reconstruct a "paper trail" of transactions, i.e., to refer to receipts and other evidence of accepting an obligation. Indeed, each electronic mail message includes a unique message identifier, one reason why email is so useful for managing accountability (since, for example, email can be copied to public archives). On the other hand, HTTP servers and deployed user agents do not generally keep records of POST transactions, making it difficult for all parties to reconstruct a series of transactions.

There are mechanisms in HTTP, not widely deployed, to remedy this situation. HTTP servers can assign a URI to the results of a POST transaction using the "Content-Location" header (described in section 14.14 of [RFC2616]), and allow authorized parties to retrieve a record of the transaction thereafter via this URI (the value of URI persistence is apparent in this case). User agents can provide an interface for managing transactions where the user agent has incurred an obligation on behalf of the user.
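
The "Content-Location" mechanism can be illustrated with an in-memory sketch of a server that records each POST transaction under its own URI path. Everything here is an assumption made for illustration: the class name, the "/receipts/" path layout, and the absence of any authorization check, which a real deployment would need before releasing a transaction record.

```python
import itertools
from typing import Dict, Optional, Tuple


class BookingServer:
    """Toy server that assigns a URI path to the result of each POST."""

    def __init__(self) -> None:
        self._records: Dict[str, Dict[str, str]] = {}
        self._ids = itertools.count(1)

    def post(self, form: Dict[str, str]) -> Tuple[int, Dict[str, str]]:
        """Record the transaction; return (status, response headers)."""
        path = f"/receipts/{next(self._ids)}"  # hypothetical URI layout
        self._records[path] = dict(form)
        # Content-Location tells the client where this result can be found.
        return 200, {"Content-Location": path}

    def get(self, path: str) -> Tuple[int, Optional[Dict[str, str]]]:
        """Return (status, record) for a previously recorded transaction."""
        if path in self._records:
            return 200, self._records[path]
        return 404, None
```

With such a server, Nadia's browser could bookmark the value of the "Content-Location" header from the purchase response and later retrieve the receipt with an ordinary, safe GET.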

3.6. Representation Management

Story

Since Nadia finds the Oaxaca weather site useful, she emails a review to her friend Dirk recommending that he check out 'http://weather.example.com/oaxaca'. Dirk clicks on the link in the email he receives and is surprised to see his browser display a page about auto insurance. Dirk confirms the URI with Nadia, and they both conclude that the resource is unreliable. Although the managers of Oaxaca have chosen the Web as a communication medium, they have lost two customers due to ineffective resource management.

The usefulness of a resource depends on good management by its owner. As is the case with many human interactions, confident interactions with a resource depend on stability and predictability. The value of a URI increases with the predictability of interactions using that URI. Avoiding unnecessary URI aliases is one aspect of proper resource management.

Good practice: Consistent representation

Publishers of a URI SHOULD provide representations of the identified resource consistently and predictably.

This section discusses important aspects of representation management.

3.6.1. Representation availability

The authority responsible for a resource may supply zero or more representations of a resource. The authority is also responsible for accepting or rejecting requests to modify a resource, for example, by configuring a server to accept or reject HTTP PUT data based on Internet Media Type, validity constraints, or other constraints.

Good practice: Available representation

Publishers of a URI SHOULD provide representations of the identified resource.
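As a rough sketch of how an authority might accept or reject HTTP PUT data based on Internet Media Type and other constraints, consider the function below. The accepted media types, the size limit, and the name `check_put` are hypothetical choices made for this example; a real server would apply whatever policy its authority defines.

```python
# Hypothetical policy: the media types this resource's authority accepts.
ACCEPTED_TYPES = {"text/html", "application/xhtml+xml"}
MAX_BYTES = 64 * 1024  # illustrative size constraint (assumption)


def check_put(content_type, body):
    """Return an HTTP status code for a PUT request under the policy above."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type not in ACCEPTED_TYPES:
        return 415  # Unsupported Media Type
    if len(body) > MAX_BYTES:
        return 413  # Request Entity Too Large
    return 204  # accepted; no response body needed
```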

3.6.2. URI Persistence

There are strong social expectations that once a URI identifies a particular resource, it should continue indefinitely to refer to that resource; this is called URI persistence. URI persistence is a matter of policy and commitment on the part of authorities servicing URIs. The choice of a particular URI scheme provides no guarantee that those URIs will be persistent or that they will not be persistent.

Since representations are used to communicate resource state, persistence is directly affected by how well representations are served. Service breakdowns include:

  • Inconsistent representations served. Note the difference between a resource owner changing representations predictably in light of the nature of the resource (the changing weather of Oaxaca) and the owner changing representations arbitrarily.
  • Improper use of content negotiation, such as serving two images as equivalent through HTTP content negotiation, where one image represents a square and the other a circle.

HTTP [ RFC2616 ] has been designed to help manage URIs. For example, HTTP redirection (using the 3xx response codes) permits servers to tell an agent that further action needs to be taken by the agent in order to fulfill the request (for example, the resource has been assigned a new URI). In addition, content negotiation also promotes consistency, as a site manager is not required to define new URIs when adding support for a new format specification. Protocols that do not support content negotiation (such as FTP) require a new identifier when a new data format is introduced.

For more discussion about URI persistence, see [ Cool ].
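The following minimal sketch illustrates redirection and content negotiation together. The table of moved URIs, the available representations, and the function `respond` are assumptions made for the example; the negotiation is deliberately simplified and ignores q-values and wildcards, which real Accept header processing takes into account.

```python
# Hypothetical site data: moved URIs and the formats available per resource.
MOVED = {"/oaxaca-weather": "/oaxaca"}
REPRESENTATIONS = {"/oaxaca": {"text/html": "<p>Sunny</p>", "text/plain": "Sunny"}}


def respond(path, accept):
    """Return (status, headers, body) for a GET on `path`."""
    if path in MOVED:
        # 301 tells the agent the resource has been assigned a new URI.
        return 301, {"Location": MOVED[path]}, ""
    formats = REPRESENTATIONS.get(path)
    if formats is None:
        return 404, {}, ""
    # Simplified negotiation: the first listed media type we can serve wins.
    for item in accept.split(","):
        media_type = item.split(";")[0].strip()
        if media_type in formats:
            return 200, {"Content-Type": media_type}, formats[media_type]
    return 406, {}, ""  # Not Acceptable: no listed format is available
```

Note that adding a third format to `REPRESENTATIONS["/oaxaca"]` requires no new URI, which is the consistency benefit described above.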

3.6.3. Access Control

It is reasonable to limit access to a resource (for commercial or security reasons, for example), but it is unreasonable to prohibit others from merely identifying the resource.

As an analogy: The owners of a building might have a policy that the public may only enter the building via the main front door, and only during business hours. People who work in the building and who make deliveries to it might use other doors as appropriate. Such a policy would be enforced by a combination of security personnel and mechanical devices such as locks and pass-cards. One would not enforce this policy by hiding some of the building entrances, nor by requesting legislation requiring the use of the front door and forbidding anyone to reveal the fact that there are other doors to the building.

Story

Nadia and Dirk both subscribe to the "weather.example.com" newsletter. Nadia wishes to point out an article of particular interest to Dirk, using a URI. The authority responsible for "weather.example.com" can offer newsletter subscribers such as Nadia and Dirk the benefits of URIs (such as bookmarking and linking) and still limit access to the newsletter to authorized parties.

The Web provides several mechanisms to control access to resources; these mechanisms do not rely on hiding or suppressing URIs for those resources. For more information, see the TAG finding "'Deep Linking' in the World Wide Web".
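One way to picture access control that does not depend on hiding URIs is the sketch below. The subscriber list, the article path, and the function `get_article` are hypothetical; the point is only that the URI can be shared freely while the server decides, per request, whether to serve a representation.

```python
# Hypothetical subscriber list and newsletter content for illustration.
SUBSCRIBERS = {"nadia", "dirk"}
ARTICLES = {"/newsletter/2003/11/storms": "Storm season outlook"}


def get_article(path, user=None):
    """Serve an article only to subscribers; the URI itself stays public."""
    if path not in ARTICLES:
        return 404, ""  # no such resource
    if user is None:
        return 401, ""  # the resource exists; credentials are required
    if user not in SUBSCRIBERS:
        return 403, ""  # identified, but not authorized
    return 200, ARTICLES[path]
```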

3.7. Future Directions for Interaction

There remain open questions regarding Web interactions. The TAG expects future versions of this document to address in more detail the relationship between the architecture described herein, Web Services, the Semantic Web, peer-to-peer systems (including Freenet, MLdonkey, and NNTP [ RFC977 ]), instant messaging systems (including [ XMPP ]), and voice-over-IP (including RTSP [ RFC2326 ]).

4. Data Formats

A data format (including XHTML, CSS, PNG, XLink, RDF/XML, and SMIL animation) specifies the interpretation of representation data. The first data format used on the Web was HTML. Since then, data formats have grown in number. The Web architecture does not constrain which data formats content providers can use. This flexibility is important because there is constant evolution in applications, resulting in new data formats and refinements of existing formats. Although the Web architecture allows for the deployment of new data formats, the creation and deployment of new formats (and agents able to handle them) is expensive. Thus, before inventing a new data format, designers should carefully consider re-using one that is already available.

For a data format to be usefully interoperable between two parties, the parties must have a shared understanding of its syntax and semantics. This is not to imply that a sender of data can count on constraining its treatment by a receiver; simply that making good use of a data format requires knowledge of its designers' intentions. Below we describe some characteristics of a data format that make it easier to integrate into the Web architecture. This document does not address generally beneficial characteristics of a specification such as readability, simplicity, attention to programmer goals, attention to user needs, accessibility, and internationalization. The section on architectural specifications includes references to additional format specification guidelines.

4.1. Binary and Textual Data Formats

A textual data format is one in which the data is specified as a sequence of characters. HTML, Internet e-mail, and all XML-based formats are textual. In modern textual data formats, the characters are usually taken from the Unicode repertoire [ UNICODE ].

Binary data formats are those in which portions of the data are encoded for direct use by computer processors, for example thirty-two bit little-endian two's-complement and sixty-four bit IEEE double-precision floating-point. The portions of data so represented include numeric values, pointers, and compressed data of all sorts.
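To make the two binary encodings named above concrete, the following sketch uses Python's standard `struct` module. The sample values (-2 and 1.0) are arbitrary choices for illustration.

```python
import struct

# "<" = little-endian; "i" = 32-bit two's-complement; "d" = 64-bit IEEE double.
packed_int = struct.pack("<i", -2)
packed_double = struct.pack("<d", 1.0)

print(packed_int.hex())     # feffffff
print(packed_double.hex())  # 000000000000f03f

# The textual form of the same values is simply a sequence of characters.
text = "-2 1.0".encode("utf-8")
print(text)                 # b'-2 1.0'
```

The binary forms can be loaded by a processor with little or no conversion, while the textual form can be read directly by a person.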

In principle, all data can be represented using textual formats.

The trade-offs between binary and textual data formats are complex and application-dependent. Binary formats can be substantially more compact, particularly for complex pointer-rich data structures. Also, they can be consumed more rapidly by agents in those cases where they can be loaded into memory and used with little or no conversion.

Textual formats are usually more portable and interoperable. Textual formats also have the considerable advantage that they can be directly read and understood by human beings. This can simplify the tasks of creating and maintaining software, and allow the direct intervention of humans in the processing chain without recourse to tools more complex than the ubiquitous text editor. Finally, it simplifies the necessary human task of learning about new data formats (the "view source" effect).

It is important to emphasize that intuition as to such matters as data size and processing speed is not a reliable guide in data format design; quantitative studies are essential to a correct understanding of the trade-offs. Therefore, data format specification authors should make a considered choice between binary and textual format design.
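A small measurement shows why intuition about data size can mislead. The sketch below compares a binary encoding (64-bit doubles) with a comma-separated textual encoding for two hypothetical data sets; the helper `sizes` and the sample values are assumptions made for the example, and the result says nothing about any particular real format.

```python
import struct


def sizes(values):
    """Return (binary_bytes, text_bytes) for a list of floats."""
    binary = struct.pack(f"<{len(values)}d", *values)  # 8 bytes per value
    text = ",".join(repr(v) for v in values).encode("ascii")
    return len(binary), len(text)


small = [float(i) for i in range(10)]    # 0.0 .. 9.0
precise = [i / 3 for i in range(1, 11)]  # values like 0.3333333333333333

print(sizes(small))    # (80, 39): here the textual form is smaller
print(sizes(precise))  # here the textual form is larger than 80 bytes
```

Which encoding is more compact depends on the data, so a measurement on representative data is a better guide than a general assumption.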

Note: Text (i.e., a sequence of characters from a repertoire) is distinct from serving data with a media type beginning with "text/". Although XML-based formats are textual, many such formats are not primarily comprised of phrases in natural language. See the section on media types for XML for issues that arise when "text/" is used in conjunction with an XML-based format.

TAG issue binaryXML-30 : Standardize a "binary XML" format?

4.2. Versioning and Extensibility

Extensibility and versioning are strategies to help manage the natural evolution of information on the Web and technologies used to represent that information.

For more information about versioning strategies and agent behavior in the face of unrecognized extensions, see TAG issue XMLVersioning-41 : What are good practices for designing extensible XML languages and for handling versioning? See also the TAG finding "Versioning XML Languages" and "Web Architecture: Extensible Languages" [ EXTLANG ].

4.2.1. Versioning

There is typically a (long) transition period during which multiple versions of a format, protocol, or agent are simultaneously in use.

Good practice: Version information

Format designers SHOULD provide for version information in language instances.

Story

Nadia and Dirk are designing an XML data format to encode data about the film industry. They provide for extensibility by using XML namespaces and creating a schema that allows the inclusion, in certain places, of elements from any namespace. When they revise their format, Nadia proposes a new optional "lang" attribute on the "film" element. Dirk feels that such a change requires them to assign a new namespace name, which might require changes to deployed software. Nadia explains to Dirk that their choice of extensibility strategy in conjunction with their namespace policy allows certain changes that do not affect conformance of existing content and software, and thus no change to the namespace identifier is required. They chose this policy to help them meet their goals of reducing the cost of change.

Dirk and Nadia have chosen a particular namespace change policy that allows them to avoid changing the namespace name whenever they make changes that do not affect conformance of deployed content and software. They might have chosen a different policy, for example that any new element or attribute has to belong to a namespace other than the original one. Whatever the chosen policy, it should set clear expectations for users of the format.

Good practice: Namespace policy

Format designers SHOULD document change policies for XML namespaces.
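The effect of Nadia and Dirk's policy on deployed software can be sketched as follows. The namespace name, the element content, and the function `read_title` are hypothetical; the example only shows that a reader written before the "lang" attribute existed keeps working when the namespace name is left unchanged.

```python
import xml.etree.ElementTree as ET

FILM_NS = "http://film.example.org/ns"  # hypothetical namespace name

OLD_DOC = f'<film xmlns="{FILM_NS}"><title>An Example Film</title></film>'
NEW_DOC = f'<film xmlns="{FILM_NS}" lang="es"><title>An Example Film</title></film>'


def read_title(xml_text):
    """A deployed reader written before the "lang" attribute existed."""
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{FILM_NS}}}film":
        raise ValueError("not a film document")
    return root.find(f"{{{FILM_NS}}}title").text


# The same reader handles both revisions because the namespace is unchanged.
print(read_title(OLD_DOC))
print(read_title(NEW_DOC))
```

Had the revision been given a new namespace name, the tag comparison in `read_title` would fail for new documents, which is the cost Dirk was concerned about.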

As an example of a change policy designed to reflect the variable stability of a namespace, consider the W3C namespace policy for documents on the W3C Recommendation track. The policy sets expectations that the Working Group responsible for the namespace may modify it in any way until a certain point in the process ("Candidate Recommendation"), at which point W3C constrains the set of possible changes to the namespace in order to promote stable implementations.

Note that since namespace names are URIs, the party (if any) responsible for a namespace URI has the authority to decide the namespace change policy.

4.2.2. Extensibility

Designers can facilitate the transition process by making careful choices about extensibility during the design of a language or protocol specification.

Good practice: Extensibility mechanisms

Language designers SHOULD provide mechanisms that allow any party to create extensions that do not interfere with conformance to the original specification.

Application needs determine the most appropriate extension strategy for a specification. For example, applications designed to operate in closed environments may allow specification authors to define a versioning strategy that would be impractical at the scale of the Web. As part of defining an extensibility mechanism, a specification should set expectations about agent behavior in the face of unrecognized extensions.

Good practice: Unknown extensions

Language designers SHOULD specify agent behavior in the face of unrecognized extensions.

Two strategies have emerged as being particularly useful:

  1. "Must ignore": The agent ignores any content it does not recognize.
  2. "Must understand": The agent treats markup from an unrecognized namespace as an error condition.

A powerful design approach is for the language to allow either form of extension, but to distinguish explicitly between them in the syntax.
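A minimal sketch of a language that allows both forms of extension, distinguished in the syntax, is shown below. The namespaces, the `mustUnderstand` flag attribute, and the function `process` are assumptions made for this example and do not describe any particular specification.

```python
import xml.etree.ElementTree as ET

KNOWN_NS = "http://film.example.org/ns"      # hypothetical base namespace
EXT_NS = "http://film.example.org/ext-flag"  # hypothetical flag namespace
MUST_UNDERSTAND = f"{{{EXT_NS}}}mustUnderstand"


def process(xml_text):
    """Collect the text of recognized child elements, applying both strategies."""
    root = ET.fromstring(xml_text)
    known = []
    for child in root:
        if child.tag.startswith(f"{{{KNOWN_NS}}}"):
            known.append(child.text)
        elif child.get(MUST_UNDERSTAND) == "true":
            # "Must understand": unrecognized and flagged, so report an error.
            raise ValueError(f"unrecognized required extension: {child.tag}")
        # Otherwise "must ignore": skip the unrecognized element.
    return known
```

An extension author who needs a receiver to stop rather than silently skip the extension sets the flag; all other extensions are ignored by agents that do not recognize them.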

Additional strategies include prompting the user for more input, automatically retrieving data from available links, and falling back to default behavior. More complex strategies are also possible, including mixing strategies. For instance, a language can include mechanisms for overriding standard behavior. Thus, a data format can specify "must ignore" semantics but also allow people to create extensions that override that semantics in light of application needs (for instance, with "must understand" semantics for a particular extension).

Extensibility is not free. Providing hooks for extensibility is one of many requirements to be factored into the costs of language design. Experience suggests that the long term benefits of extensibility generally outweigh the costs.

4.2.3. Composition of Data Formats

What agents do with a link Many modern data format specifications include mechanisms for composition. For example:

  • It is not constrained by Web architecture. Agent behavior depends possible to embed text comments in some image formats, such as JPEG/JFIF. Although these comments are embedded in the containing data, they have little or no effect on deleted text: application context, which may include additional metadata about the relationship embodied by display of the link. For instance, agents may consider a link may image.
  • There are container formats such as SOAP which fully expect to be active or passive. Hypertext browsers usually consider anchors composed from multiple namespaces but which provide an overall semantic relationship of message envelope and in-line image references payload.
  • RDF allows well-defined mixing of vocabularies, and allows text and XML to be active links (also called <a name="hyperlink" id="hyperlink"> <dfn> hyperlinks </dfn> </a> ). Behavior may vary for these links however: a user agent may retrieve an image automatically but require user interaction in order to follow a link specified with an element such used as deleted text: "a" in HTML. On the other hand, a reasoning system might focus activity on assertions, a messaging agent might traverse service descriptions, or data type values within a subscriber might describe "callback" control-points. </p> <div class="boxedtext"> statement having clearly defined semantics.

<span class="practicelab"> Good practice: <a shape="rect" name="generic-uri" id="generic-uri"> Generic URIs </a> </span> </p> <p class="practice"> Format specification authors SHOULD allow authors to use URIs without constraining them These relationships can be mixed and nested arbitrarily. In principle, a SOAP message can contain a JPEG image that contains an RDF comment that refers to a limited set vocabulary of URI schemes. terms for describing the image.

deleted text: </div>

Users of the hypertext Web expect to be able to navigate links among representations. Data formats Note however, that do not allow authors to create hyperlinks lead to for general XML there is no semantic model that defines the creation interactions within XML documents with elements and/or attributes from a variety of "terminal nodes" namespaces. Each application must define how namespaces interact and what effect the namespace of an element has on the Web. element's ancestors, siblings, and descendants.

TAG issue mixedUIXMLNamespace-33: Composability for user interface-oriented XML namespaces

TAG issue xmlFunctions-34: XML Transformation and composability (XSLT, XInclude, Encryption)

TAG issue RDFinXHTML-35: Syntax and semantics for embedding RDF in XHTML

4.3. Separation of Content, Presentation, and Interaction

The Web is a heterogeneous environment where a wide variety of agents provide access to content to users with a wide variety of capabilities. It is good practice for authors to create content that can reach the widest possible audience, including users with graphical desktop computers, hand-held devices and cell phones, users with disabilities who may require speech synthesizers, and devices not yet imagined. Furthermore, authors cannot predict in some cases how an agent will display or process their content. Experience shows that allowing authors to separate content, presentation, and interaction concerns promotes reuse and device-independence (see [DIPRINCIPLES]); this follows from the principle of orthogonal specifications.

Good practice: Separation of content, presentation, interaction

Language designers SHOULD design formats that allow authors to separate content from presentation and interaction concerns.

Note that when content, presentation, and interaction are separated by design, agents need to recombine them. There is a recombination spectrum, with "client does all" at one end and "server does all" at the other. There are advantages to each: recombination on the server allows the server to send out generally smaller amounts of data that can be tailored to specific devices (such as mobile phones). However, such data will not be readily reusable by other clients and may not allow client-side agents to perform useful tasks unanticipated by the author. When a client does the work of recombination, content is likely to be more reusable by a broader audience and more robust. However, such data may be of greater size and may require more computation by the client.

Of course, it may not always be desirable to reach the widest possible audience. Application context may require a very specific display (for a legally-binding transaction, for example). Also, digital signature technology, access control, and other technologies are appropriate for controlling access to content.

Some data formats are designed to describe presentation (including SVG and XSL Formatting Objects). Data formats such as these demonstrate that one can only separate content from presentation (or interaction) so far; at some point it becomes necessary to talk about presentation. Per the principle of orthogonal specifications, these data formats should only address presentation issues.

See the TAG issues formattingProperties-19 and contentPresentation-26 .

4.4. Hypertext

A defining characteristic of the Web is that it allows embedded references to other Web resources via URIs. The simplicity of creating links using absolute URIs ( <a href="http://www.example.com/foo"> ) and relative URI references ( <a href="foo"> and <a href="foo#anchor"> ) is partly (perhaps largely) responsible for the birth of the hypertext Web as we know it today.

When one resource (representation) refers to another resource with a URI, this constitutes a link between the two resources. Additional metadata may also form part of the link (see [XLink10], for example).

Good practice: Link mechanisms

Language designers SHOULD provide mechanisms for identifying links to other resources and to portions of representation data (via fragment identifiers).

Good practice: Web linking

Language designers SHOULD provide mechanisms that allow Web-wide linking, not just internal document linking.

Good practice: Generic URIs

Language designers SHOULD allow authors to use URIs without constraining them to a limited set of URI schemes.

What agents do with a hypertext link is not constrained by Web architecture and may depend on application context. Users of hypertext links expect to be able to navigate links among representations. Data formats that do not allow authors to create hypertext links lead to the creation of "terminal nodes" on the Web.

Good practice: Hypertext links

Language designers SHOULD incorporate hypertext links into a data format if hypertext is the expected user interface paradigm.

4.4.1. URI References

Links are commonly expressed using URI references (defined in section 4.2 of [URI]), which may be combined with a base URI to yield a usable URI. Section 5.1 of [URI] explains different mechanisms for establishing a base URI for a resource and establishes a precedence among the various mechanisms. For instance, the base URI may be a URI for the resource, or specified in a representation (see the "base" element in HTML and XML, and the HTTP 'Content-Location' header). See also the section on links in XML.

Agents resolve a URI reference before using the resulting URI to interact with another agent. URI references help in content management by allowing authors to design a representation locally, i.e., without concern for which global identifier may later be used to refer to the associated resource.
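As a minimal sketch of the resolution step described above, the following uses Python's standard `urllib.parse.urljoin` to combine URI references with a base URI. The base URI and the helper name `resolve` are assumptions made for illustration only; the references are the ones used as examples earlier in this section.

```python
from urllib.parse import urljoin

# Hypothetical base URI of the representation that contains the links.
BASE_URI = "http://www.example.com/docs/page"


def resolve(reference: str, base: str = BASE_URI) -> str:
    """Resolve a URI reference against a base URI to obtain a usable URI."""
    return urljoin(base, reference)


# A relative reference, a relative reference with a fragment identifier,
# and an absolute URI (which is returned unchanged).
for ref in ("foo", "foo#anchor", "http://www.example.com/foo"):
    print(ref, "->", resolve(ref))
```

Because the author writes only `foo`, the same representation can be served from a different location without editing its links; only the base URI changes.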

4.5. XML-Based Data Formats

Many data formats are XML-based, that is to say they conform to the syntax rules defined in the XML specification [XML10]. This section discusses issues that are specific to such formats. Anyone seeking guidance in this area is urged to consult the "Guidelines For the Use of XML in IETF Protocols" [IETFXML], which contains a thorough discussion of the considerations that govern whether or not XML ought to be used, as well as specific guidelines on how it ought to be used. While it is directed at Internet applications with specific reference to protocols, the discussion is generally applicable to Web scenarios as well.

The discussion here should be seen as ancillary to the content of [IETFXML]. Refer also to "XML Accessibility Guidelines" [XAG] for help designing XML formats that lower barriers to Web accessibility for people with disabilities.

4.5.1. When to Use an XML-Based Format

XML defines textual data formats that are naturally suited to describing data objects which are hierarchical and processed in an in-order sequence. It is widely, but not universally applicable for data format specifications; an audio or video format, for example, is unlikely to be well suited to expression in XML. Design constraints that would suggest the use of XML include:

  1. Requirement for a hierarchical structure.
  2. The data's usefulness should outlive the tools currently used to process it (though obviously XML can be used for short-term needs as well).
  3. Ability to support internationalization in a self-describing way that makes confusion over coding options unlikely.
  4. Early detection of encoding errors with no requirement to "work around" such errors.
  5. A high proportion of human-readable textual content.
  6. Potential composition of the data format with other XML-encoded formats.

4.5.2. Links in XML

Sophisticated linking mechanisms have been invented for XML formats. XPointer allows links to address content that does not have an explicit, named anchor. XLink allows links to have multiple ends and to be expressed either inline or in "link bases" stored external to any or all of the resources identified by the links it contains.

For formats based on XML, language designers should consider using XLink and the XPointer framework. To define fragment identifier syntax, use at least the XPointer Framework and XPointer element() Schemes.

XLink is an appropriate specification for representing links in hypertext XML applications.

TAG issue: What is the scope of using XLink? xlinkScope-23.

4.5.3. XML Namespaces

Story

The authority responsible for "weather.example.com" realizes that it can provide more interesting representations by creating instances that consist of elements defined in different XML-based formats, such as XHTML, SVG, and MathML.

How do the application designers ensure that there are no naming conflicts when they combine elements from different formats (for example, suppose that the "p" element is defined in two or more XML formats)? "Namespaces in XML" [XMLNS] provides a mechanism for establishing a globally unique name that can be understood in any context.

Language specification designers that declare namespaces thus provide a namespace other than the original one. Whatever the chosen policy, it should set clear expectations global context for users instances of the data format. Establishing this global context allows those instances (and portions thereof) to be re-used and combined in novel ways not yet imagined. Failure to provide a namespace makes such re-use more difficult, perhaps impractical in some cases.
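The following sketch illustrates how namespaces keep two "p" elements apart. It uses Python's standard `xml.etree.ElementTree`, which reports each element name in expanded form, `{namespace URI}local-name`. The sample document is hypothetical: it assumes the weather format uses the namespace "http://weather.example.com/2003/format" and gives its own "p" element a meaning (pressure) unrelated to the XHTML paragraph.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance mixing a weather vocabulary with XHTML.
# Both vocabularies define an element whose local name is "p".
DOC = """\
<report xmlns="http://weather.example.com/2003/format"
        xmlns:h="http://www.w3.org/1999/xhtml">
  <p>1013</p>
  <h:p>Pressure is steady.</h:p>
</report>"""

root = ET.fromstring(DOC)

# Each child is identified by its namespace URI plus local name,
# so the two "p" elements do not conflict.
expanded_names = [child.tag for child in root]
for name in expanded_names:
    print(name)
```

The prefixes themselves carry no meaning here; a namespace-aware processor compares only the namespace URI and the local name.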

Good practice: Namespace adoption

Language designers who create new XML vocabularies SHOULD place all element names and global attribute names in a namespace.

Attributes are always scoped by the element on which they appear. An attribute that is "global," that is, one that might meaningfully appear on different elements, including elements in other namespaces, should be explicitly placed in a namespace. Local attributes, ones associated with only a particular element, need not be included in a namespace since their meaning will always be clear from the context provided by that element.

The type attribute from W3C XML Schema is an example of a global attribute. It can be used by authors of any vocabulary to make an assertion about the type of the element on which it appears. The type attribute occurs in the W3C XML Schema instance namespace and must always be fully qualified. The frame attribute on an HTML table is an example of a local attribute. There is no value in placing that attribute in a namespace since the attribute is unlikely to be useful on an element other than an HTML table.

Applications that rely on DTD processing must impose additional constraints on the use of namespaces. DTDs perform validation based on the lexical form of the element and attribute names in the document. This makes prefixes syntactically significant in ways that are not anticipated by [XMLNS].

<a shape="rect" name="namespace-documents" id="namespace-documents"> 4.5.4. Namespace Documents

Story

Nadia receives representation data from "weather.example.com" in an unfamiliar data format. She knows enough about XML to recognize which XML namespace the elements belong to. Since the namespace is identified by the URI "http://weather.example.com/2003/format", she asks her browser to retrieve a representation of the namespace via that URI. Nadia is requesting the namespace document.

Nadia gets back some useful data that allows her to learn more about the data format. Nadia's browser may also be able to perform some operations automatically (i.e., unattended by a human overseer) given data that has been optimized for software agents. For example, her browser might, on Nadia's behalf, download additional agents to process and render the format.

There are many reasons to provide information about a namespace. A person might want to:

  • understand its purpose,
  • learn how to use the markup vocabulary in the namespace,
  • find out who controls it,
  • request authority to access schemas or collateral material about it, or
  • report a bug or situation that could be considered an error in some collateral material.

A processor might want to:

  • retrieve a schema, for validation,
  • retrieve a style sheet, for presentation, or
  • retrieve ontologies, for making inferences.

In general, there is no established best practice for creating a namespace document. Application expectations will influence what data format or formats are used to create a namespace document. Application expectations will also influence whether relevant information appears in the namespace document itself or is referenced from it.

Good practice: <a shape="rect" name="namespace-docs" id="namespace-docs"> Namespace documents

Resource owners who publish an XML namespace name SHOULD make available material intended for people to read and material optimized for software agents in order to meet the needs of those who will use the namespace vocabulary.

The following are examples of formats used to create namespace documents: [OWL10], [RDDL], [XMLSCHEMA], and [XHTML11]. Each of these formats meets different requirements described above for satisfying the needs of a person or an agent that wants more information about the namespace. Note, however, that issues related to fragment identifiers and multiple representations arise if content negotiation is used with namespace documents.

Issue : <a shape="rect" href="http://www.w3.org/2001/tag/ilist#namespaceDocument-8"> namespaceDocument-8 : What should a "namespace document" look like?

Issue : <a shape="rect" href="http://www.w3.org/2001/tag/ilist.html#abstractComponentRefs-37"> abstractComponentRefs-37 : Definition of abstract components with namespace names and frag ids

4.5.5. QNames in XML

Qualified names ("QNames") were introduced by "Namespaces in XML" [XMLNS]. In that specification QNames are element and attribute names and provide a mechanism for concisely creating a URI/local-name pair. Other specifications, starting with [XSLT10], have employed the QName idea in contexts other than element and attribute names. Specifically, QNames have been used in attribute values and element content. Some specifications use QNames as shortcuts for unique identifiers derived from a URI/local-name pair that have no relationship to element or attribute names.

Using a QName as a shortcut for a URI/local-name pair is often convenient, but at a cost. There is no single, accepted way to convert a QName into a URI/local-name pair or vice-versa. Experience has also revealed other limitations to QNames, such as losing namespace bindings after XML canonicalization. Although QNames are convenient, they do not replace the URI as the identifying mechanism of the Web. The use of QNames as identifiers without providing a mapping to URIs is inconsistent with Web architecture.

Good practice: QName Mapping

Language designers who use QNames MUST provide a mapping to URIs.

QNames and URIs cannot be distinguished lexically.

Good practice: QNames Indistinguishable from URIs

Language designers MUST NOT define an attribute whose value may be either a URI or QName since the two types cannot be distinguished by syntax.

For examples of QName-to-URI mappings, see [RDF10]. See the TAG finding "Using QNames as Identifiers in Content" for more information. See also TAG issues rdfmsQnameUriMapping-6, qnameAsId-18, and abstractComponentRefs-37.
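As a hedged illustration of the two practice notes above, the sketch below shows one possible QName-to-URI mapping: concatenating the namespace name and the local name, which is the approach taken by RDF. Other mappings are possible, and this document does not mandate one. The function name `qname_to_uri` and the prefix binding are assumptions chosen for the example.

```python
from typing import Mapping
from urllib.parse import urlsplit


def qname_to_uri(qname: str, bindings: Mapping[str, str]) -> str:
    """Map a QName to a URI by concatenating namespace name and local name."""
    prefix, _, local = qname.rpartition(":")
    # An unprefixed name has prefix "" and relies on a default binding, if any.
    return bindings[prefix] + local


# Illustrative in-scope namespace binding (an assumption for this example).
BINDINGS = {"dc": "http://purl.org/dc/elements/1.1/"}

uri = qname_to_uri("dc:title", BINDINGS)
print(uri)

# The same string is also a syntactically valid URI whose scheme is "dc",
# so syntax alone cannot tell whether "dc:title" is a QName or a URI.
print(urlsplit("dc:title").scheme)
```

The mapping depends on the namespace bindings in scope, which is why a QName taken out of its document context (for example, after canonicalization drops the bindings) can no longer be converted.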

4.5.6. XML ID Semantics

Consider the following fragment of XML: <section name="foo"> . Does the section element have the ID "foo"? One cannot answer this question by examining the element and its attributes alone. In XML, the quality of "being an ID" is associated with the type of the attribute, not its name. Finding the IDs in a document requires additional processing.

  1. Processing the document with a processor that recognizes DTD attribute list declarations (in the external or internal subset) might reveal a declaration that identifies the name attribute as an ID. Note: This processing is not necessarily part of validation. A non-validating, DTD-aware processor can perform ID assignment.
  2. Processing the document with a W3C XML Schema might reveal an element declaration that identifies the name attribute as an xs:ID .
  3. In practice, processing the document with another schema language, such as RELAX NG [RELAXNG], might reveal the attributes of type ID. Many modern specifications begin processing XML at the Infoset [INFOSET] level and do not specify normatively how an Infoset is constructed. For those specifications, any process that establishes the ID type in the Infoset (and Post Schema Validation Infoset (PSVI) defined in [XMLSCHEMA]) may usefully identify the attributes of type ID.

To further complicate matters, DTDs establish the ID type in the Infoset whereas W3C XML Schema produces a PSVI but does not modify the original Infoset. This leaves open the possibility that a processor might only look in the Infoset and consequently would fail to recognize schema-assigned IDs.
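The point that ID-ness is a property of the attribute's type rather than its name can be sketched with Python's standard `xml.dom.minidom`. In this example no DTD or schema is processed, so the lookup by ID initially finds nothing; the application then assigns the ID type itself through the DOM method `setIdAttribute`, standing in for whatever process (DTD, schema, or application convention) would normally establish it. The wrapper element `doc` is an assumption added so that the fragment forms a complete document.

```python
from xml.dom.minidom import parseString

# No DTD or schema is involved, so nothing declares "name" to be of type ID.
doc = parseString('<doc><section name="foo"/></doc>')

# The attribute is merely called "name"; the lookup by ID finds nothing.
before = doc.getElementById("foo")

# Additional processing (here done by hand) establishes the ID type.
section = doc.getElementsByTagName("section")[0]
section.setIdAttribute("name")

# Only now does the section element have the ID "foo".
after = doc.getElementById("foo")

print(before is None, after is section)
```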

See TAG finding " <a shape="rect" href="http://www.w3.org/2001/tag/doc/xmlIDsemantics-32.html"> How should the problem of identifying ID semantics in XML formats be addressed in the absence of a DTD? " .

<a shape="rect" name="xml-media-types" id="xml-media-types"> 4.5.7. Media Types for XML

RFC 3023 defines the Internet Media Types application/xml and text/xml , and describes a convention whereby XML-based data formats use Internet Media Types with a +xml suffix, for example image/svg+xml .

These Internet Media Types create two problems: First, for data identified as text/* , Web intermediaries are allowed to "transcode", i.e., convert one character encoding to another. Transcoding may make the self-description false or may cause the document to be not well-formed.

Good practice: <a shape="rect" name="no-text-xml" id="no-text-xml"> XML and text/*

In general, server managers SHOULD NOT assign Internet Media Types beginning with text/ to XML representations.

Second, representations whose Internet Media Types begin with text/ are required, unless the charset parameter is specified, to be considered to be encoded in US-ASCII. Since the syntax of XML is designed to make documents self-describing, it is good practice to omit the charset parameter, and since XML is very often not encoded in US-ASCII, the use of "text/" Internet Media Types effectively precludes this good practice.

TAG issue <a shape="rect" href="http://www.w3.org/2001/tag/ilist#xmlFunctions-34"> xmlFunctions-34 </a>: Good practice: XML deleted text: Transformation and composability (e.g., XSLT, XInclude, Encryption) character encodings

<p> TAG issue <a shape="rect" href="http://www.w3.org/2001/tag/ilist#RDFinXHTML-35"> RDFinXHTML-35 </a>: Syntax and semantics

In general, server managers SHOULD NOT specify the character encoding for embedding RDF XML data in XHTML protocol headers since the data is self-describing.

deleted text: </div>
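The practices on text/* media types and character encodings rest on XML data being self-describing with respect to its encoding. The following is a minimal sketch of that idea in Python, using only the standard library. The sample document and the helper names `parse_self_described` and `decode_as_text_default` are assumptions introduced for illustration; they do not come from this document. The sketch parses the same content from two differently encoded byte streams using only the encoding declaration inside the data, and shows that reading the same bytes as US-ASCII (the default for text/* when no charset parameter is given) fails.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; {enc} is replaced by the declared encoding.
TEMPLATE = "<?xml version='1.0' encoding='{enc}'?><greeting>caf\u00e9</greeting>"

# The same document serialized in two different encodings.
latin1_bytes = TEMPLATE.format(enc="ISO-8859-1").encode("iso-8859-1")
utf8_bytes = TEMPLATE.format(enc="UTF-8").encode("utf-8")


def parse_self_described(data: bytes) -> str:
    """Parse raw bytes; the parser takes the encoding from the XML declaration."""
    return ET.fromstring(data).text


def decode_as_text_default(data: bytes) -> str:
    """Decode as US-ASCII, the default for text/* without a charset parameter."""
    return data.decode("us-ascii")


# Both byte streams differ on the wire but yield the same character data.
same_text = parse_self_described(latin1_bytes) == parse_self_described(utf8_bytes)

try:
    decode_as_text_default(latin1_bytes)
    ascii_reading_ok = True
except UnicodeDecodeError:
    ascii_reading_ok = False

print("same text from both encodings:", same_text)
print("US-ASCII reading succeeded:", ascii_reading_ok)
```

No protocol-level charset value is consulted anywhere in the sketch, which is the point of the practice: the data itself says how it is encoded.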
4.5.8. Fragment Identifiers in XML

The section on media types and fragment identifier semantics discusses the interpretation of fragment identifiers. Designers of an XML-based data format specification should define the semantics of fragment identifiers in that format. The XPointer Framework [ XPTRFR ] provides an interoperable starting point.

When the media type assigned to representation data is application/xml, there are no semantics defined for fragment identifiers, and authors should not make use of fragment identifiers in such data. The same is true if the assigned media type has the suffix +xml (defined in "XML Media Types" [ RFC3023 ]), and the data format specification does not specify fragment identifier semantics. In short, just knowing that content is XML does not provide information about fragment identifier semantics.

Many people assume that the fragment identifier #abc, when referring to XML data, identifies the element in the document with the ID "abc". However, there is no normative support for this assumption.

TAG issue fragmentInXML-28: Do fragment identifiers refer to a syntactic element (at least for XML content), or can they refer to abstractions?
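The rule for fragment identifiers in XML data can be sketched as a small lookup. This is an illustration only: the function `fragment_semantics`, its `format_defines_fragments` flag, the returned strings, and the example URI are hypothetical and were introduced here. They paraphrase the text above rather than any registry or library API.

```python
from urllib.parse import urldefrag


def fragment_semantics(media_type: str, format_defines_fragments: bool = False) -> str:
    """Hypothetical helper: report where fragment identifier semantics come from.

    format_defines_fragments is an assumption supplied by the caller; it states
    whether the data format specification defines fragment identifier semantics.
    """
    # Drop parameters such as "; charset=utf-8" and normalize case.
    essence = media_type.split(";", 1)[0].strip().lower()
    if essence == "application/xml":
        # Generic XML: no fragment identifier semantics are defined.
        return "undefined"
    if essence.endswith("+xml"):
        # The +xml suffix alone says nothing; the format specification decides.
        return "format specification" if format_defines_fragments else "undefined"
    # Other media types: consult the media type's own specification.
    return "media type specification"


# Split an example URI into the part sent to the server and the fragment.
uri, fragment = urldefrag("http://example.org/data.xml#abc")

print(uri, fragment, fragment_semantics("application/xml"))
```

In this sketch an agent that receives `application/xml` learns nothing about what `#abc` identifies, which matches the caution against assuming that it names the element with ID "abc".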

4.6. Future Directions for Formats

There remain open questions regarding resource representations. The following sections identify a few areas of future work in the Web community.

4.6.1. XML Profiles

TAG issue xmlProfiles-29: When, whither and how to profile W3C specifications in the XML Family?

5. Term Index

Dereference a URI
Access the resource identified by the URI.
Fragment identifier
The part of a URI that allows identification of a secondary resource.
Language extension
One language is an extension of another if and only if the second is a language subset of the first.
Language subset
One language is a subset of another if and only if any document in the first language is also a valid document in the second language and has the same interpretation in the second language.
Link
A relationship between two resources when one resource (representation) refers to the other resource by means of a URI.
Message
A unit of communication between agents.
Message metadata
Metadata about a message.
Namespace document
The resource identified by a namespace URI.
Representation
An octet sequence that consists of representation data and representation metadata, especially a media type.
Representation data
Electronic data expressing resource state, part of a representation of the resource.
Representation metadata
The metadata part of a representation.
Resource
An item of interest in the information space known as the World Wide Web.
Resource metadata
Metadata about a resource.
Safe interaction
Interaction with a resource where an agent does not incur any obligation beyond the interaction.
Secondary resource
A resource that is related to another resource by a relationship between representation data, a fragment identifier, and a media type for interpreting the data.
URI ambiguity
The use of the same URI to refer to more than one distinct resource.
URI ownership
The relationship between an assigning agent and a URI that is defined by a URI scheme.
URI persistence
The social expectation that once a URI identifies a particular resource, it should continue indefinitely to refer to that resource.
URI reference
An operational shorthand for a URI.
Uniform Resource Identifier (URI)
A global identifier in the context of the World Wide Web.
Unsafe interaction
Interaction with a resource that is not safe interaction.
User agent
One type of Web agent; a piece of software acting on behalf of a person.
Web agent
A person or a piece of software acting on the information space on behalf of a person, entity, or process.
XML-based format
One that conforms to the syntax rules defined in the XML specification.

6. <a shape="rect" id="refs" name="refs"> References

6.1. Internet Specifications

<a shape="rect" name="IANASchemes" id="IANASchemes"> IANASchemes
IANA's <a shape="rect" href="http://www.iana.org/assignments/uri-schemes"> online registry of URI Schemes is available at http://www.iana.org/assignments/uri-schemes.
<a shape="rect" href="http://www.w3.org/Addressing/schemes"> Dan Connolly's list of URI schemes is a useful resource for finding out which references define various URI schemes.
<a shape="rect" name="MEDIATYPEREG" id="MEDIATYPEREG"> MEDIATYPEREG
IANA's <a shape="rect" href="http://www.iana.org/assignments/media-types/index.html"> online registry of Internet Media Types is available at http://www.iana.org/assignments/media-types/index.html.
<a shape="rect" name="RFC2045" id="RFC2045"> RFC2045
IETF <a shape="rect" href="http://www.ietf.org/rfc/rfc2045.txt"> RFC 2045: Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies, N. Freed, N. Borenstein, November 1996. Available at http://www.ietf.org/rfc/rfc2045.txt.
<a shape="rect" name="RFC2046" id="RFC2046"> RFC2046
IETF <a shape="rect" href="http://www.ietf.org/rfc/rfc2046.txt"> RFC 2046: Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types, N. Freed, N. Borenstein, November 1996. Available at http://www.ietf.org/rfc/rfc2046.txt.
<a shape="rect" name="RFC2119" id="RFC2119"> RFC2119
IETF <a shape="rect" href="http://www.ietf.org/rfc/rfc2119.txt"> RFC 2119: Key words for use in RFCs to Indicate Requirement Levels, S. Bradner, March 1997. Available at http://www.ietf.org/rfc/rfc2119.txt.
<a shape="rect" name="URI" id="URI"> URI
"Uniform Resource Identifiers (URI): Generic Syntax" (T. Berners-Lee, R. Fielding, L. Masinter, Eds.) is currently being revised. The IETF Internet Draft <a shape="rect" href="http://www.apache.org/~fielding/uri/rev-2002/rfc2396bis.html"> draft-fielding-uri-rfc2396bis-03 </a> is expected to obsolete <a shape="rect" href="http://www.ietf.org/rfc/rfc2396.txt"> RFC 2396 </a>, which is the current URI standard. "Architecture of the World Wide Web" uses the concepts and terms defined in draft-fielding-uri-rfc2396bis-03, preferring them to those defined in RFC 2396. The TAG is tracking the evolution of draft-fielding-uri-rfc2396bis-03. Citations labeled [ URI ] refer to draft-fielding-uri-rfc2396bis-03.
<a shape="rect" name="RFC2616" id="RFC2616"> RFC2616
IETF <a shape="rect" href="http://www.ietf.org/rfc/rfc2616.txt"> RFC 2616: Hypertext Transfer Protocol -- HTTP/1.1, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P. Leach, T. Berners-Lee, June 1999. Available at http://www.ietf.org/rfc/rfc2616.txt.
<a shape="rect" name="RFC2717" id="RFC2717"> RFC2717
IETF <a shape="rect" href="http://www.ietf.org/rfc/rfc2717.txt"> RFC 2717: Registration Procedures for URL Scheme Names, R. Petke, I. King, November 1999. Available at http://www.ietf.org/rfc/rfc2717.txt.

6.2. <a shape="rect" name="archspecs" id="archspecs"> Architectural Specifications

<a shape="rect" name="ATAG10" id="ATAG10"> ATAG10
<a shape="rect" href="http://www.w3.org/TR/2000/REC-ATAG10-20000203/"> Authoring Tool Accessibility Guidelines 1.0, J. Treviranus, C. McCathieNevile, I. Jacobs, J. Richards, Editors, W3C Recommendation, 3 February 2000, http://www.w3.org/TR/2000/REC-ATAG10-20000203. Latest version available at http://www.w3.org/TR/ATAG10.
UAAG10
User Agent Accessibility Guidelines 1.0, I. Jacobs, J. Gunderson, E. Hansen, Editors, W3C Recommendation, 17 December 2002, http://www.w3.org/TR/2002/REC-UAAG10-20021217/. Latest version available at http://www.w3.org/TR/UAAG10/.
XAG
XML Accessibility Guidelines, D. Dardailler, S. B. Palmer, C. McCathieNevile, Editors, W3C Working Draft (work in progress), 3 October 2002, http://www.w3.org/TR/2002/WD-xag-20021003. Latest version available at http://www.w3.org/TR/xag.
CHARMOD
Character Model for the World Wide Web 1.0, T. Texin, M. J. Dürst, F. Yergeau, R. Ishida, M. Wolf, Editors, W3C Working Draft (work in progress), 22 August 2003, http://www.w3.org/TR/2003/WD-charmod-20030822/. Latest version available at http://www.w3.org/TR/charmod/.
DIPRINCIPLES
Device Independence Principles, R. Gimson, Editor, W3C Working Group Note, 1 September 2003, http://www.w3.org/TR/2003/NOTE-di-princ-20030901/. Latest version available at http://www.w3.org/TR/di-princ/.
QA
QA Framework: Specification Guidelines, D. Hazaël-Massieux, L. Henderson, L. Rosenthal, Editors, W3C Candidate Recommendation (work in progress), 10 November 2003, http://www.w3.org/TR/2003/CR-qaframe-spec-20031110/. Latest version available at http://www.w3.org/TR/qaframe-spec/.
WCAG20
Web Content Accessibility Guidelines 2.0, W. Chisholm, G. Vanderheiden, J. White, B. Caldwell, Editors, W3C Working Draft (work in progress), 24 June 2003, http://www.w3.org/TR/2003/WD-WCAG20-20030624/. Latest version available at http://www.w3.org/TR/WCAG20/.
WSA
Web Services Architecture, M. Champion, C. Ferris, D. Orchard, D. Booth, H. Haas, F. McCabe, E. Newcomer, Editors, W3C Working Draft (work in progress), 8 August 2003, http://www.w3.org/TR/2003/WD-ws-arch-20030808/. Latest version available at http://www.w3.org/TR/ws-arch/.
EXTLANG
Web Architecture: Extensible Languages, T. Berners-Lee, D. Connolly, 10 February 1998. This W3C Note is available at http://www.w3.org/TR/1998/NOTE-webarch-extlang-19980210.
<a shape="rect" name="Fielding" id="Fielding"> Fielding
<a shape="rect" href="http://www.ics.uci.edu/~fielding/pubs/webarch_icse2000.pdf"> Principled Design of the Modern Web Architecture, R.T. Fielding and R.N. Taylor, UC Irvine. In Proceedings of the 2000 International Conference on Software Engineering (ICSE 2000), Limerick, Ireland, June 2000, pp. 407-416. This document is available at http://www.ics.uci.edu/~fielding/pubs/webarch_icse2000.pdf.
<a shape="rect" name="RFC1958" id="RFC1958"> RFC1958
IETF <a shape="rect" href="http://www.ietf.org/rfc/rfc1958.txt"> RFC 1958: Architectural Principles of the Internet, B. Carpenter, June 1996. Available at http://www.ietf.org/rfc/rfc1958.txt.

6.3. Additional References

CGI
Common Gateway Interface/1.1 Specification. Available at http://hoohoo.ncsa.uiuc.edu/cgi/interface.html.
<a shape="rect" name="Cool" id="Cool"> Cool
<a shape="rect" href="http://www.w3.org/Provider/Style/URI.html"> Cool URIs don't change, T. Berners-Lee, W3C, 1998. Available at http://www.w3.org/Provider/Style/URI. Note that the title is somewhat misleading. It is not the URIs that change; it is what they identify.
<a shape="rect" name="Eng90" id="Eng90"> Eng90
<a shape="rect" href="http://www.bootstrap.org/augment/AUGMENT/132082.html"> Knowledge-Domain Interoperability and an Open Hyperdocument System, D. C. Engelbart, June 1990.
FREENET
The Free Network Project.
<a shape="rect" name="IETFXML" id="IETFXML"> IETFXML
IETF <a shape="rect" href="http://www.imc.org/ietf-xml-use/xml-guidelines-07.txt"> Guidelines For The Use of XML in IETF Protocols, S. Hollenbeck, M. Rose, L. Masinter, eds., 2 November 2002. This IETF Internet Draft is available at http://www.imc.org/ietf-xml-use/xml-guidelines-07.txt. If this document is no longer available, refer to the <a shape="rect" href="http://www.imc.org/ietf-xml-use/index.html"> ietf-xml-use mailing list.
<a shape="rect" name="IRI" id="IRI"> IRI
IETF <a shape="rect" href="http://www.w3.org/International/iri-edit/draft-duerst-iri.html"> Internationalized Resource Identifiers (IRIs), M. Duerst, M. Suignard, November 2002. This IETF Internet Draft is available at http://www.w3.org/International/iri-edit/draft-duerst-iri.html. If this document is no longer available, refer to the home page for <a shape="rect" href="http://www.w3.org/International/iri-edit/"> Editing 'Internationalized Resource Identifiers (IRIs)'.
MLDONKEY
The MLDonkey Project.
<a shape="rect" name="RDDL" id="RDDL"> RDDL
<a shape="rect" href="http://www.tbray.org/tag/rddl/rddl3.html"> Resource Directory Description Language (RDDL), J. Borden, T. Bray, eds., 1 June 2003. This document is available at http://www.tbray.org/tag/rddl/rddl3.html.
RELAXNG
The RELAX NG schema language project.
<a shape="rect" name="REST" id="REST"> REST
<a shape="rect" href="http://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm"> Representational State Transfer (REST), Chapter 5 of "Architectural Styles and the Design of Network-based Software Architectures", Doctoral Thesis of R. T. Fielding, 2000. Available at http://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm.
RFC977
IETF RFC 977: Network News Transfer Protocol, B. Kantor, P. Lapsley, February 1986. Available at http://www.ietf.org/rfc/rfc977.txt.
<a shape="rect" name="RFC2141" id="RFC2141"> RFC2141
IETF <a shape="rect" href="http://www.ietf.org/rfc/rfc2141.txt"> RFC 2141: URN Syntax, R. Moats, May 1997. Available at http://www.ietf.org/rfc/rfc2141.txt.
RFC2326
IETF RFC 2326: Real Time Streaming Protocol (RTSP), H. Schulzrinne, A. Rao, R. Lanphier, April 1998. Available at http://www.ietf.org/rfc/rfc2326.txt.
<a shape="rect" name="RFC2718" id="RFC2718"> RFC2718
IETF <a shape="rect" href="http://www.ietf.org/rfc/rfc2718.txt"> RFC 2718: Guidelines for new URL Schemes, L. Masinter, H. Alvestrand, D. Zigmond, R. Petke, November 1999. Available at http://www.ietf.org/rfc/rfc2718.txt.
RFC2818
IETF RFC 2818: HTTP Over TLS, E. Rescorla, May 2000. Available at http://www.ietf.org/rfc/rfc2818.txt.
<a shape="rect" name="RFC3023" id="RFC3023"> RFC3023
IETF <a shape="rect" href="http://www.ietf.org/rfc/rfc3023.txt"> RFC 3023: XML Media Types, M. Murata, S. St. Laurent, D. Kohn, January 2001. Available at http://www.rfc-editor.org/rfc/rfc3023.txt.
<a shape="rect" name="RFC3236" id="RFC3236"> RFC3236
IETF <a shape="rect" href="http://www.ietf.org/rfc/rfc3236.txt"> RFC 3236: The 'application/xhtml+xml' Media Type, M. Baker, P. Stark, January 2002. Available at http://www.rfc-editor.org/rfc/rfc3236.txt.
SAX
Resources related to the Simple API for XML (SAX).
<a shape="rect" name="UNICODE" id="UNICODE"> UNICODE
See the <a shape="rect" href="http://www.unicode.org/"> Unicode Consortium home page for information about the latest version of Unicode and character repertoires.
<a shape="rect" name="UniqueDNS" id="UniqueDNS"> UniqueDNS
<a shape="rect" href="http://www.icann.org/correspondence/iab-tech-comment-27sept99.htm"> IAB Technical Comment on the Unique DNS Root, B. Carpenter, 27 September 1999. Available at http://www.icann.org/correspondence/iab-tech-comment-27sept99.htm.
XMPP
Extensible Messaging and Presence Protocol (XMPP) IETF Working Group developing "an open, XML-based protocol for near real-time extensible messaging and presence. It is the core protocol of the Jabber Instant Messaging and Presence technology..."
<a shape="rect" name="XPTRFR" id="XPTRFR"> XPTRFR
<a shape="rect" href="http://www.w3.org/TR/2003/REC-xptr-framework-20030325/"> XPointer Framework, P. Grosso, E. Maler, J. Marsh, N. Walsh, eds., 25 March 2003. This W3C Recommendation is available at http://www.w3.org/TR/2003/REC-xptr-framework-20030325/.

7. <a shape="rect" name="acks" id="acks"> Acknowledgments

This document was authored by the W3C Technical Architecture Group which included the following participants: Tim Berners-Lee (co-Chair, W3C), Tim Bray (Antarctica Systems), Dan Connolly (W3C), Paul Cotton (Microsoft Corporation), Roy Fielding (Day Software), Chris Lilley (W3C), David Orchard (BEA Systems), Norman Walsh (Sun), and Stuart Williams (co-Chair, Hewlett-Packard).

The TAG appreciates the many contributions on the TAG's public mailing list, www-tag@w3.org ( <a shape="rect" href="http://lists.w3.org/Archives/Public/www-tag/"> archive ), that have helped to improve this document.