DRAFT: Architectural Principles of the World Wide Web

This version:: http://www.w3.org/2001/tag/2002/0826-archdoc
Superseded by:: http://www.w3.org/2001/tag/2002/0828-archdoc
Previous version:: http://www.w3.org/2001/tag/2002/0813-archdoc
Editor:: Ian Jacobs, W3C

Abstract

The World Wide Web is a networked information system. Web Architecture is the set of principles that all agents in the system follow that result in the large-scale effect of a shared information space. Identification, data formats, and protocols are the main technical components of Web Architecture, but the large-scale effect depends on social behavior as well.

This document strives to establish a reference set of principles for Web architecture. Some of these principles may conflict with current practice, and so education and outreach will be required to improve on that practice. Other principles may fill in gaps in published specifications or may call attention to known weaknesses in those specifications.

Status of this document

This document has been superseded. See next version.

This document has been developed for discussion by the W3C Technical Architecture Group.

This draft is highly unstable. This draft represents substantial input from TAG participants, but does not yet represent consensus. It is a draft with no official standing. Once this document has undergone substantial revision, the TAG expects to develop it on the W3C Recommendation track.

Please send comments on this document to the public W3C TAG mailing list www-tag@w3.org (archive).

Publication of this document by W3C indicates no endorsement by W3C.

1. Introduction

The World Wide Web ("Web" from here on) is a networked information system consisting of agents (programs acting on behalf of another person, entity, or process) that exchange information. Open: Web Architecture is the set of principles that all agents in the system follow that result in the large-scale effect of a shared information space that scales well and behaves predictably.

This architecture consists of:

Identifiers. A single specification of the way in which objects in the system are identified: the Uniform Resource Identifier (URI) [RFC2396].
Formats. Specifications of a nonexclusive set of data formats designed for interchange between agents in the system. This includes several formats used in isolation or in combinations (e.g., XHTML, CSS, PNG, XLink, RDF, SMIL animation), as well as technologies for designing new formats (XML, XML namespaces).
Protocols. Specifications of a small and nonexclusive set of protocols for interchanging information between agents, including HTTP [RFC2616], SMTP and others. Several of these protocols share a reliance on the Internet Media Type (or, "MIME") metadata/packaging system [RFC2046].

1.1. Structure and conventions of this document

After this introduction, chapters two, three, and four discuss identifiers, formats, and protocols, respectively. Each of those chapters includes principles of Web architecture. Each principle has a title and is highlighted visually in a shaded box. The last section of the introduction lists all of the principles.

The terms MUST, SHOULD, MAY, etc. are used in accordance with RFC 2119 [RFC2119].

Open issues or questions are highlighted.

1.2. Audience of this document

The intended audience for this document includes:

Participants in W3C groups
Groups outside of W3C developing technologies to be integrated into the Web

The authors have made every effort to keep this document terse, with the expectation that additional documents will elaborate on the principles below.

1.3. Limits of this document

This document focuses on architectural principles specific to or fundamental to the Web. It does not address general principles of design, which are also important to the success of the Web. Indeed, behind many of the principles of Web Architecture lie these and other principles: minimal constraint (fewer rules make the system more flexible), modularity, minimum redundancy, extensibility, simplicity, robustness, etc.

This document does not address architectural design goals covered by targeted W3C specifications:

Internationalization; see W3C's Internationalization Activity.
Accessibility; see W3C's Web Accessibility Initiative.
Device independence; see W3C's Device Independence Activity.

1.4. List of principles in this document

This document establishes the following principles:

1. Use URIs:: All important resources SHOULD be identified by an absolute URI reference.
2. Valid use of an absolute URI reference:: If you are using a registered URI scheme and following all the other relevant protocol specifications, it is unambiguous what resource you are referring to.
3. Allow dereference:: Agents SHOULD be able to dereference absolute URI references for important resources.
4. Describe resources:: Owners of resources that are important abstract concepts (for example, Internet protocol parameters) SHOULD make available human and/or machine readable representations that describe the nature and purpose of those resources.
5. Representation retrieval is safe:: Agents do not incur obligations by retrieving a representation (e.g., by following a link). [TAG finding "URIs, Addressability, and the use of HTTP GET"]
6. Consistent representations:: There is a strong expectation of consistency between the representations of a resource; to the extent possible, representations SHOULD be equivalent.
7. Context-insensitive absolute URI references:: An absolute URI reference SHOULD denote the same resource or concept independent of the context(s) in which the identifier is used.
8. Persistent absolute URI references:: Those who create and manage resources and their identifiers SHOULD design the identifiers in such a way as to ensure their persistence.
9. New URI schemes expensive:: Since correct processing of URIs is often scheme-dependent, and since a huge range of software is expected to be able to process URIs, the cost of introduction of new URI schemes is very high. Authors of specifications SHOULD avoid introducing new URI schemes when existing schemes can be used to meet the same goal.
10. Public use of unregistered schemes:: People MUST NOT use an unregistered URI scheme on the public Internet.
11. URI case sensitivity:: People SHOULD NOT assume that two URIs that differ only in case can be used interchangeably.
12. Content negotiation and fragments:: Authors SHOULD NOT use HTTP content negotiation for different media types that do not share the same fragment identifier semantics.

2. Identifiers and resources

The Web is a universe of resources. Resources are a generalization over documents, files, menu items, machines, and services, as well as people, organizations, concepts, etc. Web architecture starts with a uniform syntax for resource identifiers, so that we can refer to resources, access them, describe them, share them, etc. The syntax employs an extensible set of URI schemes. Several URI schemes incorporate established identification mechanisms (that pre-date the Web) into this syntax:

mailto:nobody@example.org. The MAILTO scheme is for mailbox names (including DNS domain names).
ftp://example.org/aDirectory/aFile. The FTP scheme is for ftp file names (including DNS domain names).
news:comp.infosystems.www. The NEWS scheme is for newsgroup names.
tel:+1-816-555-1212. The TEL scheme is for telephone numbers.
urn:uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882. For UUIDs, from Apollo/DCE/COM

Other URI schemes have been introduced since the advent of the Web, including those introduced as a consequence of new protocols:

http://www.example.org/something?with=arg1;and=arg2. For HTTP resources.
ldap://ldap.itd.umich.edu/c=GB?objectClass?one. For LDAP resources.
urn:oasis:SAML:1.0. A namespace from an Oasis specification.

Identifiers in any of these schemes can be composed with a fragment identifier to yield an identifier for a resource that is a part of, or a view of, another resource:

ftp://example.org/aDirectory/aDocument#section1
http://www.example.org/aList#item1
http://www.example.org/states#texas

Note that while this composition is syntactically fully general, many cases such as mailto:nobody@example.org#abc do not make much sense to any deployed software or specifications.

To summarize, a Uniform Resource Identifier, or URI, is a character sequence starting with a scheme name, followed by a number of scheme-specific fields. An absolute URI reference is a URI followed optionally by a fragment identifier (see [RFC2396] for the complete list of syntactic constraints). URIs and absolute URI references identify Web resources. The principles in this document are expressed in terms of absolute URI references.
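As an illustration of this summary, the following minimal sketch uses Python's standard urllib.parse module to separate an absolute URI reference into its URI and its optional fragment identifier, and to read the scheme name. The helper name split_reference is hypothetical; the sample identifiers are taken from the lists earlier in this chapter.

```python
from urllib.parse import urldefrag, urlsplit


def split_reference(reference):
    """Return (scheme, uri, fragment) for an absolute URI reference.

    split_reference is a hypothetical helper used only for illustration.
    """
    uri, fragment = urldefrag(reference)  # fragment is "" when absent
    scheme = urlsplit(uri).scheme         # the text before the first ":"
    return scheme, uri, fragment


for ref in (
    "http://www.example.org/aList#item1",
    "mailto:nobody@example.org",
    "urn:uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882",
):
    print(split_reference(ref))
```

The sketch only splits the character sequence; it does not check the scheme-specific fields against the constraints in [RFC2396].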

Open: While people agree that URIs identify resources (per RFC 2396 [RFC2396]), there is not yet consensus that absolute URI references with fragment identifiers may be used to identify resources. Some people contend that an absolute URI reference with a fragment identifier identifies a portion of a representation.

Note: The current URI specification, RFC 2396 [RFC2396], also includes the concept of the "relative URI reference." The syntax for a relative URI reference is a shortened form of that for absolute URI reference, where some prefix of the URI is missing and certain path components ("." and "..") have a special meaning when, and only when, interpreting a relative path. For example, in a document whose base URI is http://example/dir1/dir2/file1, the relative URI reference ../file2 abbreviates http://example/dir1/file2 and the relative URI reference #abc abbreviates http://example/dir1/dir2/file1#abc.
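The two abbreviations in the note above can be reproduced with a short sketch. It assumes Python's standard urllib.parse.urljoin as the resolver; the base URI and relative references are the ones given in the note.

```python
from urllib.parse import urljoin

# Base URI of the document that contains the relative references.
base = "http://example/dir1/dir2/file1"

# ".." steps up one path segment before "file2" is appended.
print(urljoin(base, "../file2"))

# A bare fragment identifier keeps the whole base URI.
print(urljoin(base, "#abc"))
```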

2.1. Resources, URIs, and the shared information space

When one resource refers to another via an absolute URI reference, a link is formed. When many resources are linked this way, the large-scale effect is a shared information space, addressable by an absolute URI reference. The Web is more valuable for every resource in the space, and in turn, resources are more valuable when they are addressable in the Web. Hence:

Use URIs: All important resources SHOULD be identified by an absolute URI reference.¹

The impact of making resources addressable with absolute URI references varies from linking and bookmarking to unintended consequences, such as global search services. See the TAG finding URIs, Addressability, and the use of HTTP GET for some details about the interaction of this principle in HTTP application design.

Some resources do not have URIs. URIs are denumerable, which means, for example, that there are not enough of them to give one to every real number without collisions.

Open: Say something here à la what Tim Bray said: "Designers SHOULD NOT build a world of resources that cannot be identified by URI."?

2.2. Operations on URIs

The two primary operations on absolute URI references are:

Interaction with resources.
Comparison (e.g., of XML namespace identifiers [XMLNS] for equivalence).

There may be applications where comparison is expected to be the sole or primary operation on an absolute URI reference. In such cases, it does not matter whether one has chosen a URI or an absolute URI reference to identify a resource.

When one expects to interact with a resource, there are some advantages to identifying that resource with a URI rather than an absolute URI reference: only URIs work with intermediaries in the Web architecture (e.g., proxies) or with redirection (in HTTP, for example).

Note: Even if an absolute URI reference with a fragment identifier is used to refer to a resource, one may refer to a portion of that resource with a different absolute URI reference.

2.2.1. Interactions with resources

To dereference an absolute URI reference is to use it to interact with the resource it identifies. One interacts with a resource through representations of the resource. A resource is an abstraction for which there is a conceptual mapping to a (possibly empty) set of representations. A representation may be full fidelity, i.e. a complete description, or it may be partial, i.e. describes some aspect of the resource. The interpretation of any such representation is determined by its Media type.

For instance, suppose the URI http://weather.yahoo.com/forecast/MXOA0069 identifies a resource that is "the weather forecast for Oaxaca, Mexico". A representation retrieved by means of that URI may be encoded in any number of formats, including HTML, XHTML, SVG, etc.; see chapter 3 for more information about formats.

Interaction with a resource is governed by recursive application of a finite set of specifications, beginning with the specification that governs the scheme of the URI. For example, suppose the absolute URI reference http://www.example.org/test/foo.svg is used within an a element of an SVG document. The sequence of specifications applied is:

The URI specification [RFC2396]. This specification says (in section 3.1) that the scheme defines "the semantics for the remainder of the URI string." In this case, the URI scheme is HTTP.
The HTTP/1.1 protocol. Section 3.2.2 of RFC2616 [RFC2616] explains the semantics of HTTP URIs.
The SVG 1.0 Recommendation [SVG10], which imports the link semantics defined by XLink 1.0 [XLink10]. Section 17.4 of the SVG specification suggests that interaction with an a link involves retrieving a representation of a resource, identified by the XLink href attribute: "By activating these links (by clicking with the mouse, through keyboard input, voice commands, etc.), users may visit these resources." This means that the GET method defined in HTTP/1.1 is used to retrieve the representation of the resource.
Once the representation has been retrieved, the Media type of the representation (here, SVG 1.0) governs its interpretation (here, rendering).
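The first steps of this sequence can be sketched in code. The example below is an illustration under stated assumptions, not a description of any particular agent: it uses Python's standard urllib.request.Request to build, without sending, the request an agent might issue for the example URI. It shows the scheme selecting the protocol and the default method being GET.

```python
from urllib.request import Request

# Build (but do not send) a request for the URI used in the example above.
request = Request("http://www.example.org/test/foo.svg")

print(request.type)          # scheme, which selects the governing specification
print(request.host)          # authority that the agent would contact
print(request.selector)      # path sent to that authority
print(request.get_method())  # method used when no request body is supplied
```

Interpreting the retrieved representation according to its media type (the last step of the sequence) happens only after a response arrives, so the sketch leaves it out.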

Each valid use of an absolute URI reference unambiguously identifies one resource.

Valid use of an absolute URI reference: If you are using a registered URI scheme and following all the other relevant protocol specifications, it is unambiguous what resource you are referring to.

There may be several ways to interact with a resource. One of the most important operations for the Web is to retrieve a representation of a resource (such as with HTTP GET). There are other ways to interact with a resource (such as with HTTP POST).

Allow dereference: Agents SHOULD be able to dereference absolute URI references for important resources.

Describe resources: Owners of resources that are important abstract concepts (for example, Internet protocol parameters) SHOULD make available human and/or machine readable representations that describe the nature and purpose of those resources.

Representation retrieval is safe: Agents do not incur obligations by retrieving a representation (e.g., by following a link). [TAG finding "URIs, Addressability, and the use of HTTP GET"]

Open: Need to say something about difference between assertions about a resource and assertions about a representation.

2.2.2. Consistent representations

The representations of a resource may vary as a function of factors including time, the identity of the agent accessing the resource, data submitted to the resource when interacting with it, and changes external to the resource (e.g., the weather). For example, for the resource "the weather forecast for Oaxaca, Mexico," the representations depend on (at least) time, the expressed preference of the user for Fahrenheit or Celsius, and the identity of the user-agent software receiving the representation.

Consistent representations: There is a strong expectation of consistency between the representations of a resource; to the extent possible, representations SHOULD be equivalent.

Open: Need to clarify what "equivalent" means in the previous sentence.

2.3. Some generalities about absolute URI references

The following statements are useful generalities about some absolute URI references. Some of these generalities do not hold for some URI schemes.

The authority over an absolute URI reference determines which resource it identifies.
It is not generally possible to inspect an absolute URI reference and determine what resource it identifies. For example, in general, one cannot look at http://www.example.com/lj45sr and know that it refers to "my old car" or "the weather forecast for Oaxaca, Mexico." Note that over time, we trust that some absolute URI references will identify familiar resources. That trust derives from social behavior, not the spelling of the identifier.
In general, several absolute URI references may identify the same resource.
It is not generally possible to inspect two absolute URI references and determine that they identify the same resource. This does not prevent some URI schemes from mandating equivalence for particular sets of URIs using that scheme.
It is possible to compare two absolute URI references to see whether they are spelled equivalently; see the section on URI equivalence and comparison for more details.

2.3.1. Absolute URI references and context-sensitivity

Each valid use of an absolute URI reference identifies one resource, but the resource itself may be inherently context-sensitive. For instance, http://www.example.com/ identifies the same resource in any context. On the other hand, http://localhost/ and file:/etc/hosts each identify one resource, but that resource is "local" to a particular computer. It is valid to use a URI such as file:/etc/hosts on a given computer, and even on several computers, if you are confident that all of those computers are running the same type of operating system.

Context-insensitive absolute URI references: An absolute URI reference SHOULD denote the same resource or concept independent of the context(s) in which the identifier is used.

2.4. Characteristics of absolute URI references

2.4.1. Persistence

Note the difference between changes in representations of a resource and changes in the binding between an absolute URI reference and a resource. Today, the absolute URI reference http://www.w3.org/ identifies the resource "the W3C home page." A representation retrieved today for that absolute URI reference is likely to differ from one you get tomorrow, since W3C updates its home page frequently with news items. These changes in representation are predictable, and the resource remains "the W3C home page".

On the other hand, if tomorrow, the same absolute URI reference identified a different resource (for example, because the domain was sold and the new owner decided to assert a different URI-Resource relationship), the identifier would lose value. This type of indiscriminate use of identifiers undermines their value and interferes with people who relied on them (e.g., historians, court archives, new archive services, and anybody frustrated by a broken link).

There are strong social expectations that once an absolute URI reference identifies a particular resource, it should continue indefinitely to refer to that resource. Persistence is always a matter of policy and commitment on the part of authorities assigning URIs rather than a constraint imposed by technological means.

Persistent absolute URI references: Those who create and manage resources and their identifiers SHOULD design the identifiers in such a way as to ensure their persistence.

For example, each W3C technical report (e.g., "the SVG specification") is in fact a series of documents that reflects the maturation of the technical report (Working Drafts, Candidate Recommendations, Proposed Recommendations, and a Recommendation). W3C assigns an absolute URI reference to the "latest version" in the specification series (e.g., http://www.w3.org/TR/SVG). W3C also assigns an absolute URI reference for each specification in the series (called the "this version URI", as in http://www.w3.org/TR/2001/PR-SVG-20010719/). W3C policy is that representations of the "latest version" resource will change over time (with each new publication of an SVG specification). W3C policy is also that representations of a specification designated by a "this version" identifier will not change over time (to the best of W3C's ability to maintain its archives intact).

For more discussion about persistence, refer to "Cool URIs don't change" [Cool].²

2.4.2. URI Schemes

One important characteristic of a URI is its scheme (the string that precedes ":" in a URI). For example the scheme of the URI http://www.example.com/ is "http", and for ftp://ftp.example.com/ it is "ftp". It is common to classify URIs by scheme, calling the two preceding examples respectively an "HTTP URI" and an "FTP URI".

Many of the properties of URIs are scheme-dependent.

New URI schemes expensive: Since correct processing of URIs is often scheme-dependent, and since a huge range of software is expected to be able to process URIs, the cost of introduction of new URI schemes is very high. Authors of specifications SHOULD avoid introducing new URI schemes when existing schemes can be used to meet the same goal.

While "myscheme:blort" is a URI that satisfies the syntactic constraints of [RFC2396], if "myscheme" is not registered, you don't have license to use that URI in any Internet protocols; there aren't any valid uses of it. You can't expect anybody to know what you mean by it, and you aren't guaranteed that somebody else isn't already using it for something else.

Public use of unregistered schemes: People MUST NOT use an unregistered URI scheme on the public Internet.

The IANA registry [IANASchemes] lists URI schemes and the specifications that define them. For instance, the HTTP URI scheme is defined in section 3.2.2 of the HTTP specification [RFC2616]. Refer to RFC2717 for information about registering a new URI scheme.
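A tool could apply the preceding principle by checking a scheme name against a locally held list of registered schemes. The sketch below is only an illustration: KNOWN_SCHEMES is a hypothetical, hand-written subset drawn from the example URIs in this document, not a copy of the IANA registry, and uses_known_scheme is a hypothetical helper name.

```python
from urllib.parse import urlsplit

# Hypothetical snapshot for illustration only; a real tool would consult
# the IANA registry [IANASchemes] rather than this hand-written subset.
KNOWN_SCHEMES = {"ftp", "http", "ldap", "mailto", "news", "tel", "urn"}


def uses_known_scheme(reference, known=KNOWN_SCHEMES):
    """Return True if the reference's scheme appears in the given set."""
    # urlsplit lower-cases the scheme, so the comparison ignores case.
    return urlsplit(reference).scheme in known


print(uses_known_scheme("http://www.example.com/"))
print(uses_known_scheme("myscheme:blort"))
```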

2.4.2.1. Scheme-specific Resource Classes

Some URI schemes are used for identifying specific classes of resources. For example, TELNET URIs identify telnet services and MAILTO URIs electronic mailboxes.

Open: issue httpRange-14: What is the range of the HTTP dereference function? Two views held within the TAG are that the range is (1) anything or (2) documents, used in a very broad sense (see Tim Berners-Lee's "What do HTTP URIs identify?").

2.4.2.2. Dereference mechanisms

The procedure for retrieving a representation may vary from scheme to scheme. For example, HTTP URIs are dereferencable using the protocol of the same name, and the dereferencing procedure is defined in section 3.2.2 of the HTTP specification [RFC2616].

On the other hand, the URN scheme [RFC2141] does not guarantee that a dereference procedure is defined for any given URN.

Open: "Since HTTP GET is defined and widely deployed, agents SHOULD use HTTP URIs."

Open: issue deepLinking-25: What to say in defense of principle that deep linking is not an illegal act?

2.4.2.3. Social Governance

The deployment and use of different URI schemes may require varying degrees of central coordination and administration. For example, HTTP URIs depend (in practice at least) on the use of the DNS infrastructure. Also, there is a central registry of URN subclasses.

2.4.2.4. Equivalence and Comparison

Certain URI schemes provide rules for determining the syntactic equivalence of absolute URI references, i.e., whether two absolute URI references are different spellings of the same identifier. These rules vary from scheme to scheme.

For example, URNs begin with two colon-delimited fields, the first of which must be urn and the second of which identifies the subclass of URN, for example urn:ietf:example. In URNs, these two fields are to be compared in a case-insensitive fashion. The remainder of the URN following the second colon is subject to rules dependent on the content of the second field (following the first colon); thus the equivalence rules may vary within subclasses of URNs.
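The URN rule just described might be sketched as follows. The function name urns_equivalent is hypothetical. The sketch makes one conservative assumption that the text does not state: it compares the remainder after the second colon exactly, whereas a real implementation would apply the rules of the particular URN subclass.

```python
def urns_equivalent(a, b):
    """Compare two URNs: first two fields ignoring case, remainder exactly.

    Exact comparison of the remainder is an assumption made for this
    sketch; the actual rule depends on the URN subclass.
    """
    fields_a = a.split(":", 2)
    fields_b = b.split(":", 2)
    for fields in (fields_a, fields_b):
        if len(fields) != 3 or fields[0].lower() != "urn":
            raise ValueError("not a URN: " + ":".join(fields))
    same_subclass = fields_a[1].lower() == fields_b[1].lower()
    return same_subclass and fields_a[2] == fields_b[2]


print(urns_equivalent("URN:IETF:example", "urn:ietf:example"))
print(urns_equivalent("urn:ietf:example", "urn:ietf:Example"))
```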

Section 3.2.3 of the HTTP specification [RFC2616] states that, when comparing two HTTP URIs, the host name part must be considered case-insensitive, so http://WWW.EXAMPLE/ and http://www.example/ identify the same resource.

URI case sensitivity: People SHOULD NOT assume that two URIs that differ only in case can be used interchangeably.
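A minimal sketch of scheme-specific comparison for HTTP URIs follows. The function name http_uris_equivalent is hypothetical, and the sketch covers only part of section 3.2.3 of [RFC2616]: case-insensitive scheme and host name, the default port, and an empty path treated as "/". It does not normalize escaped characters. The path and query are compared exactly, in line with the principle above.

```python
from urllib.parse import urlsplit


def http_uris_equivalent(a, b):
    """Partial HTTP URI comparison; a sketch, not a complete implementation."""

    def key(uri):
        parts = urlsplit(uri)  # scheme and hostname come back lower-cased
        if parts.scheme != "http":
            raise ValueError("not an HTTP URI: " + uri)
        # A missing port means the HTTP default, and an empty path means "/".
        return (parts.hostname, parts.port or 80, parts.path or "/", parts.query)

    return key(a) == key(b)


print(http_uris_equivalent("http://WWW.EXAMPLE/", "http://www.example/"))
print(http_uris_equivalent("http://www.example/A", "http://www.example/a"))
```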

Note: Equivalence of URIs is not the same as equivalence of representations of a resource.

Open: issue URIEquivalence-15: When are two URI variants considered equivalent?

2.5. Fragment identifiers

In some URI schemes, absolute URI references may end with a fragment identifier. The fragment identifier is interpreted only after the retrieval of a representation. Section 4.1 of [RFC2396] states that "the format and interpretation of fragment identifiers is dependent on the media type [RFC2046] of the retrieval result," that is, the representation.

For instance, if the representation is an HTML document, the fragment identifies a hypertext anchor. In the case of a graphics format, a URI reference might identify a circle or spline. In the case of RDF, a URI reference can identify anything, be it abstract (e.g., a dream) or concrete (e.g., my car). The media type 'text/plain' does not define semantics for fragment identifiers.
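The dependence of fragment interpretation on media type can be sketched as a lookup performed after retrieval. Everything named here is hypothetical: the table FRAGMENT_MEANING, the function interpret_fragment, and the choice of "image/svg+xml" and "application/rdf+xml" as stand-ins for the graphics format and RDF mentioned above.

```python
from urllib.parse import urldefrag

# Hypothetical table for illustration; the media type names for the
# graphics and RDF cases are assumptions, not taken from this document.
FRAGMENT_MEANING = {
    "text/html": "a hypertext anchor",
    "image/svg+xml": "a graphical object such as a circle or spline",
    "application/rdf+xml": "anything, abstract or concrete",
}


def interpret_fragment(reference, media_type):
    """Return (uri_to_retrieve, meaning of the fragment or None)."""
    uri, fragment = urldefrag(reference)  # only the URI is used for retrieval
    if not fragment:
        return uri, None
    # None signals that the media type defines no fragment semantics.
    return uri, FRAGMENT_MEANING.get(media_type)


print(interpret_fragment("http://www.example.org/aList#item1", "text/html"))
print(interpret_fragment("http://www.example.org/aList#item1", "text/plain"))
```

Because the same fragment identifier yields a different result for each media type, a single absolute URI reference can mean different things when the representation served for it varies.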

2.5.1. Design weakness: HTTP content negotiation and fragment identifiers

Content negotiation and fragments: Authors SHOULD NOT use HTTP content negotiation for different media types that do not share the same fragment identifier semantics.

Open: New access protocols should provide a means to convert fragment identifiers according to media type.

3. Formats

3.1. Scope

What is a format, and how does it relate to the concept of a document? Do all documents have a format? Is a document a collection of resources of different formats organized into a whole? Is a document the same as a resource? the same as a message body? as a non-multipart message body? What is the distinction between documents and data, if any? Does 'document' imply human readable and if so, does it imply presentation? Does it imply a hierarchically structured, report-like document with headings and subheadings? Is a catalog a document? Is a rave flyer a document?

Negotiation (stuff above might go here also) by network request, by listed alternatives in content any preference? Resource variants, foo.css and foo.html unlikely to be equivalent.

3.2. Model View Controller

Separation allows more easily composable specifications, allows multimodal access, clarifies the concept of multiple, synchronous views of a document, and enhances accessibility.

3.3. The model - document formats

Composability (ns-meaning). Use of XML for tree structured content. Linking in general v. idref in one document. Human readable v. machine data. Served or not (hidden behind server - semantic firewall, accessibility). Linking into parts of the model, transclusion of parts. Compound documents, components from multiple servers - scalability, deep linking. Processing models, error handling.

3.4. The view - presentation

Presentation by decoration (application of CSS to XML as presentation), and by derivation (creation of html/svg/etc as presentation). Linking between view and model. Inheritance of properties across namespaces. Consistency of property names. Subsets. 'Applies to' as opposed to 'set on'. Specificity of properties as attributes, chaining styling, restyling. Time-lines, linking to portions of a time-line.

3.5. The controller - animation, scripting, events, client/server interaction

Declarative v. script based - accessibility, power; formalization of common functionality (loop animation, rollovers) in declarative form. DOM - making additional methods, add to rather than replacing XML DOM. Effect of script/programming language limitations on choice of element and attribute names. Linking to active components - XForms example with model and abstract form control, can be extended to presentational instantiation of form control.

Ideas and issues:

For new format specifications, use XML family of specifications unless there's a good reason not to. Open: Which XML specifications? Open: which particular family members?
Format designers should use URIs without constraining content providers to particular URI schemes. Open: what does "use" mean? IDREF v. linking - web-wide rather than document-wide references.
Namespaces. Issues namespaceDocument-8, mixedNamespaceMeaning-13
Qnames: Issues rdfmsQnameUriMapping-6, qnameAsId-18 and finding "Using QNames as Identifiers in Content"
Formatting properties: Issue formattingProperties-19
Error handling: Issue errorHandling-20
Media type registration: RFC3023Charset-21, finding Internet Media Type registration, consistency of use. Also, make sure to define fragment identifier semantics.
Effect of Mobile on architecture - size, complexity, memory constraints. Binary infosets, storage efficiency. Composable subsets.
What is the scope of using XLink? xlinkScope-23
Can a specification include rules for overriding HTTP content type parameters? contentTypeOverride-24

4. Protocols

As mentioned in the introduction, the Web is designed to create the large-scale effect of a shared information space that scales well and behaves predictably. The architectural style known as Representational State Transfer [REST] encapsulates this notion of a shared information space. According to Fielding:

REST provides a set of architectural constraints that, when applied as a whole, emphasizes scalability of component interactions, generality of interfaces, independent deployment of components, and intermediary components to reduce interaction latency, enforce security, and encapsulate legacy systems.
-- Roy Fielding, Section 5.5 of [REST]

HTTP has been specially designed for REST interactions. HTTP has a variety of methods designed to manipulate resource state through representation transfer between agents. These methods include GET (covered in section 2.2.1), POST, PUT, and DELETE.

This chapter uses the REST model to explain how Web protocols take into account the properties of resources and URIs, as well as real-world time and space constraints, in order to improve the user's Web experience.

Ideas and issues:

Consistency of media types and message contents (from "TAG Finding: Internet Media Type registration, consistency of use").
Consistency of communicating character encoding (same source).
HTTP as a substrate protocol [TAG issue HTTPSubstrate-16]

5. Tips on URIs

5.1. Spelling of URIs

Do not make assumptions about a resource based on the spelling of a URI that refers to it (other than what is defined in specifications for the URI scheme). Since URIs are opaque, it is an error to assume, for example, that a URI that happens to end with the string ".html" refers to a resource that has an HTML representation. Though people must not infer anything about the nature of a resource representation from a URI ending in ".html", resource owners must not create confusion by purposely mis-assigning suffixes and representation types.
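In practice, an agent that follows this tip takes the format of a representation from the metadata that accompanies it, not from the spelling of the URI. The sketch below assumes the agent has already received a Content-Type header value; the function name representation_media_type is hypothetical, and Python's standard email.message.Message is used only as a convenient header parser.

```python
from email.message import Message


def representation_media_type(content_type_header):
    """Return the media type carried by a Content-Type header value."""
    message = Message()
    message["Content-Type"] = content_type_header
    return message.get_content_type()  # parameters such as charset are dropped


# The URI http://www.example.com/lj45sr has no ".html" suffix, yet the
# header that accompanies its representation can still declare HTML.
print(representation_media_type("text/html; charset=utf-8"))
```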

At times it is useful or necessary to reveal a URI (e.g., in an advertisement on the side of a bus), in which case, good social behavior requires that the URI be easy to use. In general, URIs should be hidden from view as they tend to lure us into thinking they hold definitive meaning about a resource.

Open: Canonical form of URIs. See issue URIEquivalence-15.

5.2. Unique URIs

Authors should not use a URI to identify more than one resource.

Nothing prevents us from considering "a representation of the novel Moby Dick" to be a resource itself (and thus to have an assigned URI). Authors should not use the same URI to refer to the resource "Moby Dick" and to the particular representation of that resource. Similarly, authors should not use the same URI to refer to a person and to that person's mailbox.

6. End notes

¹ This principle dates back at least as far as Douglas Engelbart's seminal work on open hypertext systems; see section Every Object Addressable in [Eng90].
² The title is somewhat misleading. It's not the URIs that change, it's what they identify.

7. Glossary

Absolute URI Reference: a URI followed optionally by a fragment identifier.
Agents: programs acting on behalf of another person, entity, or process.
Dereference: To dereference an absolute URI reference is to use it to interact with the resource it identifies.
Internet Media Type: metadata/packaging system [RFC2046].
Link: When one resource refers to another via an absolute URI reference, a link is formed.
Persistence
Resource: a generalization over documents, files, menu items, machines, and services, as well as people, organizations, concepts, etc.
URI Scheme
Uniform Resource Identifier (URI): a character sequence starting with a scheme name, followed by a number of scheme-specific fields.
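
The definitions of "Absolute URI Reference" and "Uniform Resource Identifier" above can be illustrated with Python's standard urllib.parse module. This is a sketch, not a validating parser; the function name and the example references are hypothetical.

```python
from urllib.parse import urldefrag, urlsplit


def split_absolute_reference(reference):
    """Split an absolute URI reference into (scheme, uri, fragment).

    fragment is the empty string when the reference has none.
    """
    uri, fragment = urldefrag(reference)
    scheme = urlsplit(uri).scheme
    return scheme, uri, fragment


parts = split_absolute_reference("http://www.example.org/doc#section2")
```

In this example the scheme name is "http", the URI is everything before the "#", and the fragment identifier is "section2".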

8. References

8.1. Normative References

IANASchemes: IANA's online registry of URI Schemes is available at http://www.iana.org/assignments/uri-schemes. Dan Connolly's list of URI schemes is a useful resource for finding out which references define various URI schemes.
RFC2046: IETF "RFC 2046: Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types", N. Freed, N. Borenstein, November 1996. Available at http://www.ietf.org/rfc/rfc2046.
RFC2119: IETF "RFC 2119: Key words for use in RFCs to Indicate Requirement Levels", S. Bradner, March 1997. Available at http://www.ietf.org/rfc/rfc2119.txt.
RFC2396: IETF "RFC 2396: Uniform Resource Identifiers (URI): Generic Syntax", T. Berners-Lee, R. Fielding, L. Masinter, August 1998. Available at http://www.ietf.org/rfc/rfc2396.
RFC2616: IETF "RFC 2616: Hypertext Transfer Protocol -- HTTP/1.1", R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P. Leach, T. Berners-Lee, June 1999. Available at http://www.ietf.org/rfc/rfc2616.
RFC2717: IETF "RFC 2717: Registration Procedures for URL Scheme Names", R. Petke, I. King, November 1999. Available at http://www.ietf.org/rfc/rfc2717.

8.2. Non-Normative References

Axioms: "Universal Resource Identifiers - Axioms of Web Architecture", T. Berners-Lee, living document dated December 1996. Available at http://www.w3.org/DesignIssues/Axioms
Cool: "Cool URIs don't change", T. Berners-Lee, W3C, 1998. Available at http://www.w3.org/Provider/Style/URI
CSS2: "Cascading Style Sheets, level 2", B. Bos, H. Lie, C. Lilley, I. Jacobs, 12 May 1998. This W3C Recommendation is available at http://www.w3.org/TR/1998/REC-CSS2-19980512/.
Eng90: "Knowledge-Domain Interoperability and an Open Hyperdocument System", D. C. Engelbart, June 1990.
Fielding: "Principled Design of the Modern Web Architecture", R.T. Fielding and R.N. Taylor, UC Irvine. In Proceedings of the 2000 International Conference on Software Engineering (ICSE 2000), Limerick, Ireland, June 2000, pp. 407-416.
Fragments: "Fragment Identifiers on URIs", T. Berners-Lee, living document dated April 1997. Available at http://www.w3.org/DesignIssues/Fragment
HTML40: "HTML 4.01 Specification", D. Raggett, A. Le Hors, I. Jacobs, 24 December 1999. This W3C Recommendation is available at http://www.w3.org/TR/1999/REC-html401-19991224/.
P3P10: "The Platform for Privacy Preferences 1.0 (P3P1.0) Specification", M. Marchiori, ed., 16 April 2002. This W3C Recommendation is available at http://www.w3.org/TR/2002/REC-P3P-20020416/.
REST: "Representational State Transfer (REST)", Chapter 5 of "Architectural Styles and the Design of Network-based Software Architectures", Doctoral Thesis of R. T. Fielding, 2000.
RFC2141: IETF "RFC 2141: URN Syntax", R. Moats, May 1997. Available at http://www.ietf.org/rfc/rfc2141.txt.
RFC2718: IETF "RFC 2718: Guidelines for new URL Schemes", L. Masinter, H. Alvestrand, D. Zigmond, R. Petke, November 1999. Available at http://www.ietf.org/rfc/rfc2718.txt.
RFC3236: IETF "RFC 3236: The 'application/xhtml+xml' Media Type", M. Baker, P. Stark, January 2002. Available at http://www.rfc-editor.org/rfc/rfc3236.
SVG10: "Scalable Vector Graphics (SVG) 1.0 Specification", J. Ferraiolo, ed., 4 Sep 2001. This W3C Recommendation is available at http://www.w3.org/TR/2001/REC-SVG-20010904/.
UniqueDNS: "IAB Technical Comment on the Unique DNS Root", B. Carpenter, 27 Sep 1999.
XHTML10: "XHTML 1.0: The Extensible HyperText Markup Language: A Reformulation of HTML 4 in XML 1.0", S. Pemberton et al., 26 January 2000. The latest version of this W3C Recommendation is available at http://www.w3.org/TR/xhtml1/.
XLink10: "XML Linking Language (XLink) Version 1.0", S. DeRose, E. Maler, D. Orchard, 27 June 2001. This W3C Recommendation is available at http://www.w3.org/TR/2001/REC-xlink-20010627/.
XML10: "Extensible Markup Language (XML) 1.0 (Second Edition)", T. Bray, J. Paoli, C.M. Sperberg-McQueen, E. Maler, 6 October 2000. This W3C Recommendation is available at http://www.w3.org/TR/2000/REC-xml-20001006.
XMLNS: "Namespaces in XML", T. Bray, D. Hollander, A. Layman, 14 Jan 1999. This W3C Recommendation is available at http://www.w3.org/TR/1999/REC-xml-names-19990114/.
W3CPROCESS: "W3C Process Document", 19 July 2001 Version.

Ian Jacobs
Last modified $Date: 2002/08/28 13:29:26 $ by $Author: ijacobs $
Version: $Version$
