Architecture of the World Wide Web

PROPOSED REPLACEMENT TEXT FOR WEBARCH (section 4)
Draft $Id: v3.html,v 1.9 2003/01/02 10:06:02 sandro Exp $

Status: This text (itself version 1.8 2002/12/05) was never actually proposed to the TAG. I talked it over with TimBL and he convinced me that the expository style was more appropriate for a background-material paper. But I haven't re-organized it yet, and may never do so, as my sense of what to say has changed. Some of these issues I raised directly on the mailing list. -- sandro

4. Interaction

While in a literal sense the web is made up of agents, it appears to users as an enormous collection of web pages. The pages are clustered into web sites, which tend to have an obvious identity (like a brand) and a coherent theme or purpose. Individually, each page conveys some information, typically with words and perhaps a few pictures, much like a brochure or magazine article. The pages are linked to one another in a way that allows the user to move easily to a task- or interest-appropriate page. Since the actual medium of the page is computer-based (not paper-based), it can contain sound, animation, video, etc.; the architectural guidance is for grouping into identifiable, navigable units, not toward particular languages or media within the units.

The construction of this "information space" of pages on behalf of users is performed jointly by all the web agents, acting in concert through web protocols. These protocols (such as HTTP) allow relatively simple computer programs to participate in the web. Most participants act either on the "server" side, where pages conceptually exist, or on the "client" side, where access to pages is desired; access is provided via the web's network communication protocols.

While the information space abstraction has so far been used mostly to make the web's vast store of information tractable to people, it may also be useful for machine interoperation. In this area, in particular, architectural guidance is important, as the lack of human users tempts designers to stretch a guiding metaphor they might not understand. In fact, for human or machine users, the web's use of navigable information units is essential to its architecture.

4.1. An Information Space

The idea of an information space is that information can be collected into units, and these units can be arranged so as to be navigable [Bees&Ants]. A simple example is a wall of bulletin boards, where people post small items of general interest, such as a job opening, a bicycle for sale, or an open meeting of a chess club. Each item appears on its own index card, tacked up on one of the boards. When you look at the board, your eyes can wander from one item to another. If you find a particularly interesting one, you can remember its rough location on one of the boards and return to it later; you may even be able to describe the location to another person ("on the left side of board number six, about a third of the way up from the bottom").

The whole of scientific literature forms another, less tangible information space. Here the unit is the published article and the identifier, the "location", is the bibliographic information. The interwoven citations of the literature, which have been essential to the progress of science, are a strong inspiration for the web.

Another kind of information space is the computer filesystem, where the unit is the file and the identifier is the absolute-path filename. Computer files are often used to store web pages, with the page URIs mapping directly to filenames.
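
The direct mapping from page URIs to filenames can be sketched as follows. This is a minimal illustration, not a description of any particular server: the document root "/var/www", the function name uri_to_filename, and the example.org URI are all assumptions made for the example.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote, urlsplit

# Hypothetical document root; a real server would configure its own.
DOCUMENT_ROOT = PurePosixPath("/var/www")


def uri_to_filename(uri: str) -> PurePosixPath:
    """Map a page URI to an absolute-path filename under DOCUMENT_ROOT."""
    path = unquote(urlsplit(uri).path)
    # Drop empty, "." and ".." segments so the result stays under the root.
    parts = [p for p in path.split("/") if p not in ("", ".", "..")]
    return DOCUMENT_ROOT.joinpath(*parts)


print(uri_to_filename("http://example.org/docs/intro.html"))
```

Only the path component of the URI takes part in the mapping; the scheme and host select the server, and the server selects the file.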

By maintaining information in such separate, addressable units, the web allows people to point to pages and talk about them. This allows for advertisements on the sides of buses, search engines, bookmarks, and hypertext links in general. Without usable and relatively persistent URIs, the web would be like the cards from the bulletin board, put in a box and shaken; there would be no way to point to information of interest, and the whole system would rapidly grow unworkable.

4.2. Web Protocols

The division into pages also allows a division of labor, as certain agents (web servers) can each have responsibility for managing the content of certain pages. The pages themselves, like index cards, are inanimate things; they may be acted upon but do not act. The web servers for a particular page, however, are active; they speak network protocols and can handle requests concerning the pages.

Web protocols allow agents to communicate about web pages. The conversation between a client and server is similar to the telephone conversation two people might have if one (the server) were standing in front of a bulletin board and the other (the client) were interested in the contents of some of the cards on the board. The most basic operation would be "read me a (particular) card", but other operations like "tell me about the card", "take down the card", and "put up a new card for me" are possible.
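
The four card operations above can be sketched as a toy in-memory server. The class name BulletinBoard and the card path are hypothetical, and naming the methods after the HTTP methods GET, HEAD, PUT, and DELETE is an assumed correspondence made for illustration; the text above does not itself state that mapping.

```python
class BulletinBoard:
    """Toy server-side store mapping a card's path to its contents (opaque bytes)."""

    def __init__(self):
        self._cards = {}

    def get(self, path):
        # "read me a (particular) card"
        return self._cards[path]

    def head(self, path):
        # "tell me about the card": report metadata without the contents
        return {"length": len(self._cards[path])}

    def put(self, path, body):
        # "put up a new card for me"
        self._cards[path] = body

    def delete(self, path):
        # "take down the card"
        del self._cards[path]


board = BulletinBoard()
board.put("/cards/chess-club", b"Open meeting Tuesday")
print(board.get("/cards/chess-club"))
print(board.head("/cards/chess-club"))
board.delete("/cards/chess-club")
```

The store never inspects the bytes it holds; every operation is addressed to a card, not to the meaning of what is written on it.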

If several clients were interested in modifying the same card at the same time, the conversations could get quite complicated. If the information were private, the server would need to be able to identify the client and somehow protect against wiretapping. If the authenticity of the information were important, the client would have to be able to identify the server.

While the protocols might get complicated, they remain focused on the cards. The client does not ask, "What is the weather forecast for tomorrow?" Instead it asks for the contents of a card which it has reason to believe contains that weather forecast. The server just talks about and manipulates the cards and their contents, not the meaning of their contents. If the server spoke only French and the cards were all in German, the server could still do its job, talking about the individual Latin-alphabet letters on the card with no understanding of the words.

@@ media-types, representations, knowledge representation languages, extensibility — most of that should go elsewhere though, I think... But I probably have some things to say about them.

4.2.1 HTTP

@@@ a few paragraphs about how HTTP is really supposed to work?

4.2.2 FTP as a Web Protocol

@@@

4.2.3 Transactions (DAV)

@@@ include RDF-database access issues?

4.2.* (other protocol issues?)

4.3 Turning Messages into Web Pages

While the web is intended to support communication indirectly, with information traveling via addressable storage areas (web pages), the more direct message-passing approach can be effectively and productively incorporated by recognizing that the contents of a message are often stored for a time by the message receiver.

...

A common view of HTTP POST is that it provides a mechanism for clients to send messages to servers, while its original purpose (as above) was more in the file-creation vein. The posting operation was intended to allow for new information to be made available on the server, much like a new file. These two views are compatible, however, if we view the posted information (a message) as something which the server MAY store and to which it MAY assign a URI.

This view matches written communication, where messages are passed via persistent and identifiable media, such as pieces of paper. The receiver of some written correspondence is generally entitled to keep it for some time and in some cases even to publish it. W3C archived mailing lists turn potentially transient messages into proper web content by assigning them URIs and assuming responsibility as web servers for those URIs. Web-based bulletin board systems often do the same using messages sent via HTTP POST.

The architectural point here is subtle but important: if messages are assigned suitable URIs, they become units in the web's information space and get all the benefits of linking and the web's infrastructure. Systems which have a reason to keep the information in a message they receive SHOULD assign it a URI and be a server for it, provided that privacy mechanisms appropriate to the context of the message are in place.
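
As a rough sketch of a receiver that keeps the messages it is sent and assigns each one a URI, consider the following. The class name MessageArchive, the sequential numbering scheme, and the example.org base URI are assumptions made for illustration, and the privacy mechanisms called for above are deliberately left out.

```python
import itertools


class MessageArchive:
    """Stores each posted message and assigns it a URI of its own."""

    def __init__(self, base_uri):
        self._base = base_uri.rstrip("/")
        self._ids = itertools.count(1)
        self._messages = {}

    def post(self, body):
        # Assumed scheme: number messages sequentially under the base URI.
        uri = f"{self._base}/{next(self._ids)}"
        self._messages[uri] = body
        return uri

    def get(self, uri):
        # Once stored, the message can be read like any other page.
        return self._messages[uri]


archive = MessageArchive("http://example.org/archive")
uri = archive.post(b"Minutes of the chess club meeting")
print(uri)
print(archive.get(uri))
```

The returned URI is what turns the message from a transient event into something that can be linked to, bookmarked, and described.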

For example, when a user presses "BUY NOW" to finalize a purchase, the seller will (for business reasons) need to maintain a record of that transaction. If that record is assigned a URI, it becomes an information unit in the web; links can be made to it, it can have metadata, a link to it can be dragged and dropped onto a desktop! Both internally and externally, given sufficient access control, the URI can greatly assist in coherent sharing of knowledge about the transaction.

As a more complex example, consider arranging travel to the San Francisco Bay Area. You can fly into SFO, OAK, or SJC, and then rent a car and drive to your destination. Your choice of airport depends on price and availability from both the airlines and the car-rental agencies. As a human attempting this transaction, or as a software agent, you will need some kind of "temporary hold" protocol with the service providers, where you are quoted a price which is guaranteed for a short but sufficient time. Lacking that, the best you can do is try to minimize your risk. @@@ detail using HTTP and WebDAV [is WebDAV sufficient?]

4.4 Turning "Objects" into Web Pages

Object-oriented approaches to the construction of information systems are well-known and widely used. In object-oriented analysis, the different classes of conceptual or physical "objects" in some domain of interest are identified and characterized. Object-oriented programming systems then provide mechanisms for all the information about the objects in each class to be handled in the same, class-dependent manner.

Object-oriented systems are, in effect, information spaces with additional restrictions. Like information spaces in general, they gather information into navigable units, but in object-oriented systems the units correspond one-to-one to instances in the domain of discourse. In a running C++ program storing information about five students, there may well be a "Student" class, with five instances; each instance will have a block of memory for storing the information about the corresponding student (or pointers to the information). Software developers may well speak of the block of memory as if it were the student, a useful simplification allowing them to better engage the problem domain.

Object-oriented systems also require that access to the information about an object be controlled by class-specific software. This continues the engagement illusion, allowing developers to think of the "student" performing operations like "add test score (for yourself)" and "calculate (your) final grade." These operations are of course performed by the class-specific software modules, not the student herself or the block of memory.
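
The "Student" example can be sketched as follows, in Python rather than C++ for brevity. The student names, the method names, and the rule that the final grade is the arithmetic mean of the test scores are all assumptions made for illustration.

```python
class Student:
    """One instance per student; the instance holds that student's information."""

    def __init__(self, name):
        self.name = name
        self._scores = []

    def add_test_score(self, score):
        # "add test score (for yourself)"
        self._scores.append(score)

    def calculate_final_grade(self):
        # "calculate (your) final grade"; assumed here to be the mean score.
        if not self._scores:
            raise ValueError("no test scores recorded")
        return sum(self._scores) / len(self._scores)


# Five students in the problem domain, five instances in the program.
students = [Student(name) for name in ("Ana", "Ben", "Chen", "Dee", "Eli")]
students[0].add_test_score(90)
students[0].add_test_score(80)
print(students[0].calculate_final_grade())
```

The score list is reached only through the class's methods, which is the mediated access the paragraph above describes.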

In adapting object-oriented technology to the web, these two differences raise potential problems. At first glance, the fit is good: objects (in the OOP sense) are really units of information storage, just like web pages. But what about the class-specific behaviors? Web protocols operate directly on the stored information, not acting through problem-domain operations like "calculate final grade".

4.4.1. Direct Access to Object Information (REST)

The relationship between web pages and object-information is clear in the Representational State Transfer (REST) [REST] architecture. In this view, web pages (like OOP objects) correspond one-to-one with the things about which information is being stored. Accessing a page via a web protocol allows mediated transfer of information about the object, which in the OO system is generally called the object's "state".

A classic example here is a coffee machine on the web. It has a URI, and when your browser fetches a representation for you, you can see the state of the machine (is there coffee brewed? how hot is it? how much coffee has been brewed in the past 72 hours?). Following object orientation, the unit of information about the machine is intentionally conflated with the machine itself, with the same URI denoting both things.

This conflation can be confusing, however. What if there is a page about two coffee pots? What object is that? What if there are two slightly-different pages about the same coffee pot? When should you use one URI instead of the other, and how can you say they are distinct when they are the same object? What if there is a web page about the web page about the coffee pot?

The problem here is that by using the same identifier for two things and saving ourselves some mental work (letting ourselves think in the domain of discourse), we have become unable to distinguish between the two things. We might view the identifier as still being unambiguous, as identifying something which is by its nature both a web page and a coffee pot, but it's unlikely that approach will support semantic coherence in the long run.

The same situation actually occurs in object-oriented programming systems when for some reason the programmer wants to break through the facade that a block of memory is a student and actually access the bytes of memory. This kind of breaking through is considered very dangerous; as systems increase in scale and complexity, the likelihood of making errors when doing so increases.

Still, there are significant benefits of giving URIs to coffee pots and all the other things we might want to talk about. If we give up conflation, then how can we use the information-space architecture to manage information about these things? We need some simple mapping from non-page URIs to relevant page URIs. One proposal has been to use the "#" character as a separator, with the portion on the left being a page URI and the entire URI denoting the non-page thing. This approach comes close to the use of fragment identifiers in HTML and XPointer, but perhaps not in a compatible manner.

Consensus has not yet been reached on this issue.
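
The "#" proposal described above can be sketched as a one-step mapping from a non-page URI to the page URI on its left. The function name page_uri_for and the example URI are hypothetical; the sketch only shows how the mapping would work if the proposal were adopted.

```python
from urllib.parse import urldefrag


def page_uri_for(thing_uri: str) -> str:
    """Return the page URI: the portion of thing_uri to the left of '#'."""
    page_uri, _fragment = urldefrag(thing_uri)
    return page_uri


# The whole URI would denote the coffee pot; the left part denotes a page about it.
print(page_uri_for("http://example.org/kitchen#coffeepot"))
```

Under this scheme the two URIs stay distinct, so statements about the pot and statements about the page need not be confused.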

In general, however, object-oriented methodologies can be applied quite effectively to arranging a part of the web's information space in correspondence with some particular problem domain. When pages correspond to objects, navigation and authoring are often much simpler. [@ more details?]

4.4.2. Problem-Domain Operations on Objects

The other restriction in object-oriented systems, that access be mediated by class-specific software, requires adding another layer on top of web protocols. The web's protocols allow controlled access to units of information only in a generic manner; to provide class-specific operations, the client and server must share a higher-layer protocol where the identity of such operations is encoded within the information itself.
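
One way to picture such a higher-layer protocol is a message body that names the operation to perform, which the server decodes and dispatches. The sketch below is not SOAP; the JSON envelope, its "operation" and "arguments" fields, and the handler names are all assumptions made for illustration.

```python
import json


def add_test_score(student, score):
    student["scores"].append(score)
    return len(student["scores"])


def calculate_final_grade(student):
    # Assumed rule: the final grade is the mean of the recorded scores.
    return sum(student["scores"]) / len(student["scores"])


# Class-specific operations the server is willing to perform, keyed by name.
OPERATIONS = {
    "add_test_score": add_test_score,
    "calculate_final_grade": calculate_final_grade,
}


def handle_post(student, body):
    """Decode the operation named inside the posted body and run it."""
    request = json.loads(body)
    operation = OPERATIONS[request["operation"]]
    result = operation(student, *request.get("arguments", []))
    return json.dumps({"result": result}).encode()


student = {"scores": []}
handle_post(student, b'{"operation": "add_test_score", "arguments": [90]}')
print(handle_post(student, b'{"operation": "calculate_final_grade"}'))
```

The generic web protocol sees only a posted body addressed to a URI; the choice of problem-domain operation is visible only to parties that understand the envelope.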

@@ possible example of how SOAP does this.
