Henrik Frystyk, July 1994

Summary

This report describes the World-Wide Web as a generic model for exchanging information on the Internet. The model supports several application-layer protocols, such as FTP and Gopher, but its main protocol is the Hypertext Transfer Protocol (HTTP), a client-server protocol that runs on top of TCP. TCP has an advantage over UDP in that it provides HTTP with a reliable, stream-oriented service. On the other hand, TCP incurs a substantial overhead from the three-way handshake (3WHS) used to set up a connection. A very interesting alternative is Transactional TCP (T/TCP), which short-circuits the 3WHS connection establishment and optimizes the reliable stream transport for client-server applications that make many consecutive connections between the same hosts.
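
The cost of the handshake can be illustrated with a rough count of round trips. The sketch below is not taken from the report; it assumes that a TCP connection spends one round trip on the 3WHS before the request can be sent, that T/TCP sends the request together with the connection setup in the best case, and that server processing and transmission time are negligible. The function name and the numbers are hypothetical.

```python
def total_time_ms(rtt_ms, connections, handshake_round_trips):
    """Time for `connections` consecutive request/response exchanges.

    Each exchange costs `handshake_round_trips` round trips for connection
    setup plus one round trip for the request and its response.
    Server processing and transmission time are ignored (assumption).
    """
    return connections * (handshake_round_trips + 1) * rtt_ms


RTT_MS = 100       # assumed round-trip time between the two hosts
CONNECTIONS = 10   # assumed number of consecutive connections

# Plain TCP: one round trip for the three-way handshake, then the request.
tcp_ms = total_time_ms(RTT_MS, CONNECTIONS, handshake_round_trips=1)

# T/TCP, best case: the request travels with the connection setup.
ttcp_ms = total_time_ms(RTT_MS, CONNECTIONS, handshake_round_trips=0)

print(tcp_ms, ttcp_ms)
```

Under these assumptions the handshake accounts for half of the elapsed time of each short transaction, which is why avoiding it matters most for many small, consecutive requests.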

The following software modules and programs, all developed and maintained by the CERN World-Wide Web team, are described:

World-Wide Web Library of Common Code: A general-purpose code base for many World-Wide Web applications. It contains code for accessing and parsing information from the Internet and passing it to the client application. An updated flow diagram is presented.
Line Mode Browser: A test tool for the library that is also useful as a filter run from batch jobs.
HTTP Server: A generic HTTP server that handles client requests, access authentication, etc.
Proxy Server: A special HTTP server that can be used on a firewall machine. The proxy waits for a request from inside the firewall, forwards the request to the remote server outside the firewall, reads the response, and then sends it back to the client. Furthermore, the CERN Proxy server includes a cache manager. Because the proxy is a central node for connections, this cache can improve network performance significantly.
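
The forward-and-cache behaviour of the proxy can be sketched as follows. This is a minimal illustration and not the CERN cache manager: the class name, the `fetch` callable that stands in for the request to the remote server, and the URL are all hypothetical, and expiry and revalidation of cached documents are left out.

```python
class CachingProxy:
    """Forwards requests to a remote server and keeps a copy of each reply."""

    def __init__(self, fetch):
        self._fetch = fetch        # callable(url) -> bytes; hypothetical remote request
        self._cache = {}           # url -> cached response body
        self.remote_requests = 0   # how often the remote server was contacted

    def handle(self, url):
        # Serve from the cache when the document has been fetched before.
        if url in self._cache:
            return self._cache[url]
        # Otherwise forward the request outside the firewall and remember the reply.
        self.remote_requests += 1
        body = self._fetch(url)
        self._cache[url] = body
        return body


def fake_origin(url):
    """Stand-in for a server outside the firewall."""
    return ("document at " + url).encode("ascii")


proxy = CachingProxy(fake_origin)
first = proxy.handle("http://info.cern.ch/")
second = proxy.handle("http://info.cern.ch/")   # answered from the cache
```

Because every client inside the firewall goes through the same proxy, a document requested by one client is already cached when the next client asks for it.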

A new HTTP client has been designed and implemented as an alpha version (August 1994). It is based on a state machine concept in order to provide the client application with a multi-threaded, interruptible I/O interface. The multi-threaded design is based on a single-stack, single-process environment in order to make the library portable to inherently single-threaded platforms such as PCs. An event loop has been introduced into the control flow of the library. The event loop, implemented inside the library, is based on a set of callback functions to the client application that are triggered by events from either the network interface or standard input.
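
The idea of state machines driven by an event loop in a single process can be sketched as follows. This is only an illustration under stated assumptions, not the library's implementation: the class and function names are hypothetical, Python's standard `selectors` module stands in for the library's event loop, local socket pairs stand in for remote servers, and events from standard input are left out.

```python
import selectors
import socket


class Request:
    """One transfer modelled as a small state machine (SEND -> READ -> DONE)."""

    def __init__(self, name, sock, on_done):
        self.name = name
        self.sock = sock
        self.on_done = on_done   # callback into the client application
        self.state = "SEND"
        self.data = b""

    def step(self, loop):
        """Perform one non-blocking action, then return to the event loop."""
        if self.state == "SEND":
            # Short request; partial writes are not handled in this sketch.
            self.sock.send(b"GET /" + self.name.encode("ascii") + b"\r\n")
            self.state = "READ"
            loop.modify(self.sock, selectors.EVENT_READ, self)
        elif self.state == "READ":
            chunk = self.sock.recv(4096)
            if chunk:
                self.data += chunk
            else:  # empty read: the peer has finished sending
                self.state = "DONE"
                loop.unregister(self.sock)
                self.sock.close()
                self.on_done(self.name, self.data)


def run(requests):
    """Single-threaded event loop that interleaves all pending requests."""
    loop = selectors.DefaultSelector()
    for req in requests:
        req.sock.setblocking(False)
        loop.register(req.sock, selectors.EVENT_WRITE, req)
    while loop.get_map():  # until every request has unregistered itself
        for key, _events in loop.select():
            key.data.step(loop)
    loop.close()


results = {}


def on_done(name, data):
    results[name] = data


peers, requests = [], []
for name in ("a", "b"):
    client_end, server_end = socket.socketpair()
    # Stand-in for a remote server: queue a reply, then signal end of data.
    server_end.sendall(b"reply for " + name.encode("ascii"))
    server_end.shutdown(socket.SHUT_WR)
    peers.append(server_end)
    requests.append(Request(name, client_end, on_done))

run(requests)
for peer in peers:
    peer.close()
```

Each call to `step` returns as soon as its one action is done, so no request can block the others, and the saved `state` field takes the place of a per-thread stack.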

An interface through which World-Wide Web clients use the library to post a data object has been designed and implemented in an alpha version (August 1994). It gives the user the opportunity to post a data object to more than one recipient at the same time by creating a POST Web with links to the different destinations, for example an NNTP news group, an SMTP email address, and an HTTP server.
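
A POST Web of this kind can be pictured as one data object with a list of links, each naming a destination. The sketch below is an assumption about how such a structure might look and is not the library's interface: the class, the handler functions, and the destination addresses are hypothetical, and the handlers only report what they would send.

```python
from urllib.parse import urlsplit


class PostWeb:
    """A data object together with links to every destination it is posted to."""

    def __init__(self, body):
        self.body = body
        self.links = []

    def link(self, destination):
        self.links.append(destination)

    def dispatch(self, handlers):
        """Hand the body to one handler per link, chosen by the URL scheme."""
        outcome = {}
        for destination in self.links:
            scheme = urlsplit(destination).scheme
            outcome[destination] = handlers[scheme](destination, self.body)
        return outcome


def reporter(protocol):
    """Build a stub handler that reports what it would do (no network access)."""
    def handler(destination, body):
        return protocol + " would deliver " + str(len(body)) + " bytes to " + destination
    return handler


# Hypothetical mapping from URL scheme to the protocol module that handles it.
handlers = {
    "news": reporter("NNTP"),
    "mailto": reporter("SMTP"),
    "http": reporter("HTTP"),
}

web = PostWeb(b"Hello, Web!")
web.link("news:comp.infosystems.www")
web.link("mailto:someone@example.org")
web.link("http://example.org/post")
outcome = web.dispatch(handlers)
```

Keeping the destinations as links on a single object means the client application builds the data object once and the choice of protocol is made per link.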

The posting interface is to be followed up by an enhancement of the HTTP protocol to handle multiple POST requests at the same time. A draft proposal is under development but is not yet finished. The plan is to combine this feature with a multiple GET feature so that the HTTP protocol is expanded to handle compound requests in both directions, all within the same transaction. The basic stateless client-server model is nevertheless preserved, as the client still sends exactly one request and the server sends exactly one response. The current specifications allow only a single data object to be transferred per transaction.

Henrik Frystyk, frystyk@info.cern.ch, July 1994