W3C libwww Architecture

The Net Class

In a single-process, single threaded environment all requests to, for example, the I/O interface blocks any further progress in the process. Any combination of a multiprocess or multi threaded implementation of the Library makes provision for the application to request several independent documents at the same time without getting blocked by slow I/O operations. As a Web application is expected to use much of the time doing I/O such as "connect" and "read", a high degree of optimization can be obtained if multiple threads can run at the same time.

The Net class manages information related to a "thread" in libwww. As libwww threads are not really threads but a notion of using interleaved, non-blocking I/O for accessing data objects from the network (or local file system), they can be used on any platform with or without support for native threads. In the case where you have an application using real threads the Net class is simply a object maintaining links to all other objects involved in serving the request. If you are using the libwww pseudo threads then the Net object contains enough information to stop and start a request based on which BSD sockets are ready. In practise this is of course transparent to the application - this is just to explain the difference.

As libwww threads are not really threads but a notion of using non-blocking I/O for accessing data objects from the network (or local file system), it can be used on any number of platforms with or without native support for threads, and this section describes the model behind libwww threads and how it affects applications.

The Net object

The Net object contains all the state information required to stop and start execute a request using asynchronous IO. The use of asynchronous IO has an important implication on the implementation of the access modules in the Library, for example the HTTP module which is explained later:

Pseudo Threads

The main reason for keeping the Net object separate from the Request object is that some requests require more than one Net object, for example FTP which has a Net object for the control TCP connection and a Net object for each data TCP connection. In the case of HTTP/0.9 and HTTP/1.0, there is a 1:1 correspondence between a Net object and a request object. In HTTP/1.1 a Net object can live longer than a single request as persistent connections might handle a set of requests over the same TCP socket. Net objects can be used in three different ways:

  1. All requests are preemptive and all I/O is blocking
  2. Requests are non-preemptive managed by an internal event loop
  3. Requests are non-preemptive managed by an external event loop

The three modes are described in more detail in the section on Internal and External Events. In mode 2) the Net objects is used to make the binding between the socket based internal event loop (using a select() call) and a request, so that a socket ready for an I/O action can make the corresponding libwww thread active. In mode 1) and 3) Net objects represents the socket interface of a request.

Creation and Termination of a Net object

A Net object is created by the Net manager from within the Request manager every time a request is passed to the Library. A request can either be issued by the application or the Library itself for example as a result of redirection, access authentication, or when a new data connection is created in a FTP request. All new Net objects are automatically associated with a group which might already exist or be created together with the new Net object.

When a Net object has been created, the Request manager returns immediately to the caller and does not see the request before it has terminated either with a success or an error as result. The request can either be started immediately by the Net manager or put into a queue if the maximum number of open TCP connections is reached. When a request is terminated there are typically a set of tasks that the application would like to do:

Handling the termination of a request is based on call back functions that can be registered in the Net manager dynamically at run time. Multiple call back functions can be registered in which case they are all called from the Net manager in the sequence they were registered. As an example, the Request manager registers a call back function to handle the status of the request regarding to some internal actions. This function is registered at initialization time of the Library. The application can add its own call back functions to be called on termination of a request.

Henrik Frystyk, libwww@w3.org,

@(#) $Id: Threads.html,v 1.15 1997/01/06 16:56:44 jigsaw Exp $