W3 Concepts

The world-wide web is conceived as a seamless world in which ALL information, from any source, can be accessed in a consistent and simple way.

Universal Readership

Before W3, finding some information at CERN typically meant having one of a number of different terminals connected to a number of different computers, and learning a number of different programs to access that data.

The W3 principle of universal readership is that once information is available, it should be accessible from any type of computer, in any country, and an (authorized) person should only have to use one simple program to access it. This is now the case.

In practice the web hangs on a number of essential concepts. Though not the most important, the most famous is that of hypertext.

Hypertext

Hypertext is text with links. Hypertext is not a new idea: in fact, when you read a book there are links between references (see section 3), footnotes, and between the table of contents or index and the text. If you include bibliographies which refer to other books and papers, text is in fact already full of references. With hypertext, the computer makes following such references as easy as turning the page. This means that the reader can escape from the sequential organization of the pages to pursue a thread of his or her own. This makes hypertext an incredibly powerful tool for learning. Hypertext authors design their material to make it open to active exploration, and in doing so communicate their information and ideas more effectively.

W3 uses hypertext as the method of presentation, although as we shall see, this does not necessarily require that authors write hypertext. In W3, links can lead from all or part of a document to all or part of another document. Documents need not be text: they can be graphics, movies and sound, so the term "hypermedia", meaning "multimedia hypertext", applies equally well to W3.

Searching

Whilst hypertext is a powerful tool for finding information, it cannot cope with large amorphous masses of data. For these cases, computer-generated indexes allow the user to pick out interesting items from textual input. There are therefore two operations a reader can use: the hypertext jump and the text search. Indexes appear within the web just like other documents, but a search panel (or FIND command) accompanies them which allows the input of text. Behind each index is some search engine: many different search engines with different capabilities exist on different servers. However, they are all used in exactly the same simple way: you type in some text, and you get back a hypertext answer which points you to things which were found by the search.
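The search operation described above can be sketched in a few lines of Python. This is only an illustration, not the code of any real W3 search engine: the document addresses, the word-based index, and the function names (build_index, search, hypertext_answer) are all assumptions made for the example. The point it shows is that the reader types in text and gets back a hypertext page of links.

```python
import html
from collections import defaultdict

# Hypothetical documents, keyed by made-up addresses (assumptions for illustration).
DOCUMENTS = {
    "http://example.org/physics.html": "particle physics at the laboratory",
    "http://example.org/phone.html": "laboratory phone book",
    "http://example.org/news.html": "news about hypertext",
}


def build_index(documents):
    """Map each word to the set of addresses whose text contains it."""
    index = defaultdict(set)
    for address, text in documents.items():
        for word in text.lower().split():
            index[word].add(address)
    return index


def search(index, query):
    """Return the addresses that contain every word of the query, sorted."""
    words = query.lower().split()
    if not words:
        return []
    matches = set(index.get(words[0], set()))
    for word in words[1:]:
        matches &= index.get(word, set())
    return sorted(matches)


def hypertext_answer(query, addresses):
    """Format the search result as a small hypertext page of links."""
    items = "\n".join(
        f'<li><a href="{html.escape(a)}">{html.escape(a)}</a></li>'
        for a in addresses
    )
    return f"<h1>Results for {html.escape(query)}</h1>\n<ul>\n{items}\n</ul>"


if __name__ == "__main__":
    index = build_index(DOCUMENTS)
    print(hypertext_answer("laboratory", search(index, "laboratory")))
```

A different server could replace build_index and search with something far more capable, and the reader would still see the same interface: text in, hypertext out.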

Client-Server Model

To allow the web to scale, it was designed without any centralized facility. Anyone can publish information, and anyone (authorized) can read it. There is no central control. To publish data you run a server, and to read data you run a client. All the clients and all the servers are connected to each other by the Internet. The W3 protocols and other standard protocols allow all clients to communicate with all servers.
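As a rough illustration of "to publish data you run a server, and to read data you run a client", the sketch below starts a tiny server and reads from it with a tiny client, both using Python's standard library. The page content, the PageHandler class, and the fetch_once helper are assumptions invented for this example; a real server would of course keep running and serve many documents.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<h1>Hello from one small server</h1>"  # made-up content


class PageHandler(BaseHTTPRequestHandler):
    """Publishes one fixed hypertext page for every GET request."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_once():
    """Start a server on a free local port, read one page as a client, stop."""
    server = HTTPServer(("127.0.0.1", 0), PageHandler)  # port 0: pick any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        connection.request("GET", "/")
        body = connection.getresponse().read().decode("utf-8")
        connection.close()
        return body
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(fetch_once())
```

Nothing in the sketch registers the server with any central facility: the client needs only the server's address and a protocol they both speak.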

Format negotiation

Since computers were invented, there has been a great variety of different codes for representing information. It has never been possible to pick one as the "best" code, as each has its advantages and its advocates. Our experience is that any attempt to enforce a particular representation such as PostScript, TeX, or SGML leads to immediate war.

A feature of HTTP is that the client sends a list of the representations it understands along with its request, and the server can then ensure that it replies in a suitable way. We needed this feature to cope with the existing mass of graphics formats, for example (GIF, TIFF and JPEG, to name but a few). If we cannot cope with the existing formats, how can we hope to evolve to take advantage of all the exciting new formats yet to be invented? Format negotiation allows the web to distance itself from the technical and political battles of the data formats.
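The negotiation step can be sketched as follows. The sketch assumes the client's list arrives as a comma-separated header of media types with optional "q" preference weights, in the style of the HTTP Accept header; the function names parse_accept and choose_format are invented for the example, and wildcards such as "image/*" are deliberately ignored to keep it short.

```python
def parse_accept(header):
    """Return the media types a client lists, most preferred first.

    Types with q=0 (explicitly refused) are dropped. Ties keep the
    order in which the client wrote them.
    """
    preferences = []
    for position, item in enumerate(header.split(",")):
        parts = [part.strip() for part in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        quality = 1.0  # default weight when no q parameter is given
        for parameter in parts[1:]:
            name, _, value = parameter.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # unreadable weight: treat as refused
        if quality > 0:
            preferences.append((-quality, position, media_type))
    return [media_type for _, _, media_type in sorted(preferences)]


def choose_format(accept_header, available):
    """Pick the client's most preferred format that the server can supply."""
    for media_type in parse_accept(accept_header):
        if media_type in available:
            return media_type
    return None  # no common format


if __name__ == "__main__":
    server_has = ["image/jpeg", "image/gif", "image/tiff"]
    print(choose_format("image/tiff;q=0.5, image/gif", server_has))
```

Because the choice is made per request, a new format can be added on the server without disturbing clients that have never heard of it.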

A spin-off of this involves high-level formats for specific data. In certain fields, special data formats have been designed for handling, for example, DNA codes, the spectra of stars, classical Greek, or the design of bridges. Those working in the field have software allowing them not only to view this data, but to manipulate it, analyse it, and modify it. When the server and the client both understand such a high-level format, then they can take advantage of it, and the data is transferred in that way. At the same time, other people (for example high school students) without the special software can still view the data, if the server can convert it into an inferior but still useful form. We keep the W3 goal of "universal readership" without compromising total functionality at the high level.
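The fallback described above might look like this on the server side. Everything named here is hypothetical: the specialist media type "application/x-example-dna", the record layout, and the to_plain_text converter are made up to illustrate the idea that the full format is sent when the client understands it, and a simpler converted form otherwise.

```python
# Hypothetical specialist format name (an assumption, not a registered type).
NATIVE_FORMAT = "application/x-example-dna"


def to_plain_text(record):
    """Convert a record to a readable but less capable plain-text form."""
    return f"{record['name']}: {record['bases']}"


# Formats the server can fall back to, each with its converter.
CONVERTERS = {"text/plain": to_plain_text}


def respond(client_formats, record):
    """Return (format, payload) for the best form this client can use.

    client_formats is the list of formats the client says it understands,
    most preferred first. Returns None if nothing suitable exists.
    """
    if NATIVE_FORMAT in client_formats:
        return NATIVE_FORMAT, record  # full-function data, unchanged
    for media_type in client_formats:
        if media_type in CONVERTERS:
            return media_type, CONVERTERS[media_type](record)
    return None


if __name__ == "__main__":
    sample = {"name": "sample-1", "bases": "ACGT"}
    print(respond([NATIVE_FORMAT, "text/plain"], sample))  # specialist client
    print(respond(["text/plain"], sample))                 # general reader
```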

Part of the W3 seminar. On to W3 protocols.

Tim BL