Jigsaw: An object oriented server

This document explains the current design of Jigsaw. Since this project started a year ago, the only thing that has remained constant across the various test implementations is the choice of Java as the implementation language. It is not the purpose of this document to explain this choice, but we still think that having threads, garbage collection, portability and a secure language helped us a lot throughout the actual design.

This paper is divided into three sections, each of them describing a precise point in the space of possible server designs. We started with a simple-minded file-based server, whose principal drawback was the lack of an explicit meta-information store. The second design addressed this drawback, but lacked the ability to write documents to the server (in the wide sense of write; you can always implement PUT as a CGI script). The third design took on the idea of persistent objects that emerged from the second design, and brought it up to a fully writable, persistent object store.

As we walk through these various designs, it is left as an exercise to the reader to try to find a suitable answer to the following problems:

Efficient content negotiation: How can content negotiation be efficiently implemented in this server design? By efficient we typically mean the number of file-system accesses, the number of files to parse, etc. required before the actual content negotiation algorithm can run.
How does this design allow for handling PUT: Although acknowledged by vendors only lately, the WWW was initially conceived as a read/write space. Most servers can handle PUT through CGI scripts; however, there is more to editing than just putting a file. Does the proposed design allow you to dynamically plug in authentication information, or other kinds of meta-information?
How does it handle configuration: When it comes to configuration, most servers today require a stop/edit/restart cycle. Does the proposed design allow for something better?

We will see by the conclusion that the current Jigsaw design is tuned for answering each of these precise questions. As we walk through the designs, we will use the configuration process of each of the designed servers as the key to enter their implementations. The configuration of servers is often overlooked, although it can be used to reverse engineer the implementation of a server, as we will see.

The file-based design (or the lack of meta-information)

The first of Jigsaw's designs was aimed at serving files, and supporting things like CGI scripts. Generally, it was the result of trying to emulate what current servers do, while still trying to take advantage of Java features (such as dynamic class loading).

Configuration process

In this design, the configuration process was pretty similar to classical servers. A central configuration file would split the exported information space (implemented by the underlying file-system) into various areas, each of them being handled in some specific ways. Typical statements of the configuration file would include things like:

/foo/bar/* FileHandler()

The server would compile this statement by creating some rule stating that all URLs having the /foo/bar prefix were to be handled by some instance of the FileHandler Java class. At runtime, when receiving a request, the server engine would extract the target URL, find the best matching rule and hand out the request for processing by the appropriate handler. The FileHandler would do the normal work of serving the physical file mapped to the requested URL.
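
The "find the best matching rule" step can be pictured as a longest-prefix match over the compiled rules. The following is only a minimal sketch under that assumption, not the original Jigsaw code: the RuleTable class, its Handler interface and the method names are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a rule table; none of these names come from Jigsaw.
final class RuleTable {
    /** Hypothetical handler contract: turn a URL path into a reply body. */
    interface Handler {
        String handle(String path);
    }

    private final Map<String, Handler> rules = new LinkedHashMap<>();

    /** Registers a rule such as "/foo/bar/*"; the trailing "*" is dropped. */
    void addRule(String pattern, Handler handler) {
        String prefix = pattern.endsWith("*")
                ? pattern.substring(0, pattern.length() - 1)
                : pattern;
        rules.put(prefix, handler);
    }

    /** Returns the handler with the longest matching prefix, or null. */
    Handler match(String path) {
        String best = null;
        for (String prefix : rules.keySet()) {
            if (path.startsWith(prefix)
                    && (best == null || prefix.length() > best.length())) {
                best = prefix;
            }
        }
        return best == null ? null : rules.get(best);
    }

    public static void main(String[] args) {
        RuleTable table = new RuleTable();
        table.addRule("/*", path -> "default:" + path);
        table.addRule("/foo/bar/*", path -> "file:" + path);
        System.out.println(table.match("/foo/bar/a.html").handle("/foo/bar/a.html"));
    }
}
```

Note that the table holds one handler per area, not one object per document, which is the memory property discussed later in this section.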

This rule mechanism was directly inspired by the CERN server, with the syntax change justified by the fact that you would be able to dynamically load new area handlers. So, for example, if the rule:

/cgi-bin/* CgiHandler()

was present in the configuration file, the instance of the CgiHandler Java class would be created on demand (i.e. only if some document in this area was requested). This handler would be used to run the scripts in the given area, and serve their result, following the CGI specification.
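
On-demand creation of this kind can be built on Java's dynamic class loading. The sketch below is an illustration only: the LazyHandler name is hypothetical, and it assumes that handler classes expose a public no-argument constructor, which the paper does not state.

```java
// Hypothetical sketch: the handler class is loaded and instantiated on first use only.
final class LazyHandler {
    private final String className;
    private Object instance; // stays null until some request needs the handler

    LazyHandler(String className) {
        this.className = className;
    }

    synchronized boolean isLoaded() {
        return instance != null;
    }

    /** Loads the class by name and creates the single instance on the first call. */
    synchronized Object get() {
        if (instance == null) {
            try {
                instance = Class.forName(className)
                        .getDeclaredConstructor()
                        .newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("cannot load handler " + className, e);
            }
        }
        return instance;
    }
}
```

A rule table would store one such holder per rule and call get() only when a request actually falls in the corresponding area.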

The configuration file syntax included the concept of filters. Filters were invoked before the actual processing of a request by its area handler. To set up a filter on some area, you would add a statement like:

/protected/* AuthFilter("realm", CgiHandler())

This functional view of filters was inspired by the Strand work at OSF (see the Strand paper of the WWW'95 conference), which uses them in proxies rather than in servers themselves. In the above case, the request would first be run through the AuthFilter (which would check its authorization information) before being handed to the CgiHandler. Note how the configuration file syntax allowed for passing parameters to the actual filters or handlers (such as the realm in the above statement). The server provided a way to specify what arguments were required by each of the handlers (or filters), and would parse them in some uniform manner, so that the configuration file syntax could be kept consistent. In spirit, this was similar to the way Apache handles configuration files, allowing each module to register for a set of configuration directives.
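
The functional reading of the AuthFilter("realm", CgiHandler()) statement can be sketched as a filter that wraps a handler and returns a new handler. Everything below is hypothetical (the FilterSketch class, the map-based request, the reply strings), and the authorization check is reduced to testing that a credential is present at all.

```java
import java.util.Map;

// Hypothetical sketch of filters as handler wrappers; not the original Jigsaw API.
final class FilterSketch {
    /** Hypothetical handler contract working on a request reduced to a map of fields. */
    interface Handler {
        String handle(Map<String, String> request);
    }

    /** Stand-in for the handler that would run a CGI script. */
    static Handler cgiHandler() {
        return request -> "200 script output for " + request.get("url");
    }

    /** Wraps a handler: the check runs first, the wrapped handler only on success. */
    static Handler authFilter(String realm, Handler next) {
        // Simplification: a real filter would validate the credentials against the realm.
        return request -> request.containsKey("authorization")
                ? next.handle(request)
                : "401 authentication required for realm " + realm;
    }

    public static void main(String[] args) {
        Handler protectedArea = authFilter("realm", cgiHandler());
        System.out.println(protectedArea.handle(Map.of("url", "/protected/a")));
        System.out.println(protectedArea.handle(
                Map.of("url", "/protected/a", "authorization", "Basic ...")));
    }
}
```

Because the wrapped object is itself a handler, the server engine does not need a separate authentication phase: it just calls whatever handler the rule yields.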

Lessons learned

This first design had some nice properties. The first one was simplicity. Not so much simplicity per-se, but simplicity because the design would match what people expected from a web server. We felt that the ability to run the server straight out of the box, without having to deal with each exported information individually was important (i.e. the ability to set global defaults, that would apply to everything in the exported information space).

From an implementation point of view, having only one handler object per area also meant that the server would not be overcrowded with thousands of small objects representing each exported resource. This meant that memory requirements would be kept low (well, at least proportional to the number of areas, which is expected to be much smaller than the number of exported objects).

However, the big win in this design was the integration of the concept of filters inside the server architecture. This proved invaluable as a means of decoupling the server's functionality. The simple fact that authentication could be handled in a more generic framework, for example, brought a lot of simplicity to the server design. The important thing here was that the number of phases to process a request was significantly reduced. Instead of being:

URL translation to a local name
Access authentication
Compute the reply
...

We would now have:

URL translation
Compute the reply
...

This is because the authentication phase would be integrated (through filters) into the computation of the reply. Reducing the number of phases in request processing also meant that the general request processing model of the server would be simpler, which in turn meant that extension writers would have their job eased. Two other important benefits of filters were:

The ability to extend the server functionality orthogonally to the way the request was actually processed by its handler.
The cost of running the filters would be paid only when appropriate. It allowed you, for example, to specify that only the /foo/bar area should be enhanced with some special logging.

However, there was bad news too. Things began to turn out really badly when we tried to add content negotiation to this design. There were mainly two ways to proceed:

The CERN server way: Once the requested URL has been converted to a file-system path, check if it comes with an extension. If no extension is available, list all the files in the requested document's directory, and match them against the root of the file name. Then run the negotiation process among the set of matching entries.
The index file way (currently implemented by Apache and WN): For a document to be negotiated, we have to write some map or index file in that directory, describing each variant of the negotiated resource. When needed, the server parses this file, and negotiates among the set of described variants.

The CERN server approach was quite nice since it would run behind the scenes: the web master would just start the server, and with (nearly) no additional information, the server would support content negotiation. However, the efficiency of the above scheme was obviously poor, since every negotiated request required listing a directory. The second approach coped with the efficiency problem (although the server would still have to parse an ASCII file to get the set of variants), but incurred more work for the web master (who would now have to write the appropriate description files).

Although we felt content negotiation was an important feature to provide, what this revealed was a much more general problem: the lack of an efficient meta-information store. With such a thing, we would be able to efficiently access a resource's meta-information, which would allow for a fast implementation of the content-negotiation algorithm.

To make this problem more apparent, ask yourself how your favorite server deduces the content-type of the document it serves. For example, the CERN server walks through all file extensions it knows of (which it gets from the central configuration file), trying to find the one that matches the name of the document (a file in this case) to be served. The thing that matters here is not really the computation of the content-type by itself, but rather the fact that it is recomputed each time the same document is served. If your document is accessed fifty times per second, then you will walk through the extensions list fifty times, in the same second.

For content-type this is not a big deal, although if the list of known extensions grows too much, this can significantly damage the server's performance. Where things really hurt is when it comes to meta-information that is hard to compute, such as digital signatures, or meta-information to be extracted by parsing an HTML document (e.g. as given through the META tag). In these cases, you really don't want the server to spend time recomputing the meta-information; what you want is to cache it once it has been computed, for later reuse.
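
The difference between recomputing and caching can be made concrete with a small sketch. It is an illustration only: the ContentTypeCache class, its extension table and the scan counter are hypothetical, and the content-type lookup merely stands in for any piece of meta-information that is costly to derive.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: compute a piece of meta-information once, then reuse it.
final class ContentTypeCache {
    // Assumed extension table; a real server would read it from its configuration.
    private static final Map<String, String> TYPES = Map.of(
            ".html", "text/html",
            ".gif", "image/gif",
            ".txt", "text/plain");

    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final AtomicInteger scans = new AtomicInteger(); // walks through the table

    /** Walks the extension table, as a server without a store would on each request. */
    private String scan(String fileName) {
        scans.incrementAndGet();
        for (Map.Entry<String, String> entry : TYPES.entrySet()) {
            if (fileName.endsWith(entry.getKey())) {
                return entry.getValue();
            }
        }
        return "application/octet-stream";
    }

    /** Computes the type once per file name, then answers from the cache. */
    String contentType(String fileName) {
        return cache.computeIfAbsent(fileName, this::scan);
    }

    int scans() {
        return scans.get();
    }
}
```

With such a cache, a document requested fifty times in a second triggers a single walk through the extension list instead of fifty.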

This lack of an efficient meta-information store has already been acknowledged by some server authors. The WN server [reference], for example, defines per-directory configuration files that allow web masters to provide the additional meta-information pertaining to specific documents in some given directory. We will come back to this later.

Summary

At this point, we had a first server running, but we were stuck, mainly by the lack of a meta-information store. We had also left out a number of important features, such as how PUT would be handled and how a single file would be protected (although this would be possible as an extreme case of an area containing a single file).

In fact, we were ready for our next redesign.

Efficient per-directory configuration files (or the read-only server)

Guess what? The second redesign of Jigsaw emphasised meta-information storage. Our first goal was to set up a real database of meta-information about resources. Numerous ways were available for implementing this, ranging from connecting the server to a full-blown database, down to simple per-directory configuration files. The WN server uses the latter approach with some success, so we first settled on this.

Toward an efficient meta-information store

The first question we had to answer was how the meta-information store would be implemented. The WN implementation offered at least one possible solution. Each piece of meta-information would be written down in a per-directory configuration file by the web master (with the optional help of an indexer program). After the translation process (i.e. once the URL was converted to a local name), the server would check the appropriate per-directory configuration file, parse it and get from it the required information (e.g. the content-type of the document to be served, its title, etc.).

We were not quite happy with this model, for several reasons. The first obvious one was the CPU cost of reading this per-directory configuration file on each request. Although this can be less expensive than recomputing the meta-information entirely from scratch, we felt that the server had better things to do than parsing some ASCII file before being able to handle an incoming request. We felt this was particularly important since we were to use a multi-threaded server rather than a forking server, which meant that we would, at least, be able to cache the result of parsing the per-directory configuration file (more on this later).

The other concern we had, which was consistent with our first experiment, was to try to reduce the number of phases needed to process a request (with the hope that this would make the overall design simpler, which ultimately would allow us to provide a simple extension API). In particular, we were interested in having an object oriented lookup rather than the not-so-nice translation phase of common servers. What we really wanted to achieve here was a simplification of the server main loop, so that it would be captured by something similar to the following pseudo code:

public class httpd extends Thread {

    public void run() {
        while ( true ) {
            // Block until a client connects, then serve it in its own thread.
            Socket client = accept() ;
            new httpdClient(client).start() ;
        }
    }
}

public class httpdClient extends Thread {

    public void run() {
        // Serve requests on this connection for as long as it is kept alive.
        while (keep_alive) {
            Request  request = getNextRequest() ;
            Resource target  = lookup(request) ;
            Reply    reply   = target.perform(request) ;
            reply.emit() ;
        }
    }
}

Once again, our goal here was to provide a simple enough request processing model, to allow for an easy to use server-side API.

Finally, in the process of architecting this new design, another idea came about: as the server would have to maintain a database of per-resource information, why not use this database to also configure the resource behavior? A resource would have a set of attributes, containing both its meta-information, and its configuration data. As an example, the FileResource (i.e. the resource class that knows how to serve files) could have a writable attribute to configure the fact that PUT is allowed.

As a result of all this, the idea that exported resources should be persistent objects seemed appealing enough to be worth a try.

Configuration process

Instead of going into the implementation details, we will quickly walk through the configuration process, giving the details needed to understand the pitfalls of this design. To implement persistent objects, we first needed a way of describing these objects, so that the web master would be able to create and configure them. We chose to use per-directory configuration files, along with an ASCII syntax for describing these objects. These files would then be read and parsed by an indexer program, that would resurrect the objects out of their ASCII representation, and dump them out in a binary format, suitable to be fed to the server itself. The main purpose of this extra compilation stage was to release the server from having to parse the ASCII files as mentioned above.

In this setting, a typical configuration file would look like this:

 1] (entity w3c.http.core.DirectoryEntity
 2]  :name "root"
 3]  :entities ((entity w3c.http.core.FileResource 
 4]              :name "Welcome.html"
 5]              :content-type "text/html")
 6]             (filter w3c.http.auth.BasicAuthFilter
 7]              :realm "my-realm"
 8]              (entity w3c.http.core.FileResource 
 9]               :name "protected.html" 
10]               :content-type "text/html"))))

This file would be put in some directory, then the web master would run the jindex program, that would convert this file (usually named jindex.txt) into a binary file, containing a serialization of each of the objects described in it (this last file would be called jindex). More precisely, the above file describes three different resources:

The first one (starting at line 1), of Java class DirectoryResource, describes how to export the directory in which the above configuration file was found. It has two attributes: its name, whose value here is root, and an entities attribute giving the list of sub-resources (lines 3 to 10).
The first sub-resource (lines 3 to 5) described here is a simple file resource, whose name is Welcome.html and whose content-type is text/html. In the URL space, this resource would be named .../root/Welcome.html. The lookup process would first get to the resource named root, and would invoke the root resource lookup method, with the Welcome.html string as parameter, to get back the actual FileResource encapsulating the file.
The last sub-resource (lines 6 to 10) here is filtered by an instance of the BasicAuthFilter class. This class has a realm attribute indicating in which realm incoming requests should be authenticated, before being handed to the actual resource.
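
The lookup walk described for .../root/Welcome.html can be sketched as each resource resolving only the next component of the path. This is a minimal illustration with hypothetical names (LookupSketch and its nested Resource class); it leaves out filters and the loading of resources from disk.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of an object oriented lookup; not the original Jigsaw classes.
final class LookupSketch {
    /** Hypothetical resource: a name plus optional children. */
    static final class Resource {
        final String name;
        private final Map<String, Resource> children = new HashMap<>();

        Resource(String name) {
            this.name = name;
        }

        /** Adds a child and returns this resource, so that calls can be chained. */
        Resource add(Resource child) {
            children.put(child.name, child);
            return this;
        }

        /** Each resource resolves only the next path component. */
        Resource lookup(String component) {
            return children.get(component);
        }
    }

    /** Walks a path such as "/root/Welcome.html" one component at a time. */
    static Resource lookup(Resource top, String path) {
        Resource current = top;
        for (String component : path.split("/")) {
            if (component.isEmpty()) {
                continue; // skip the empty string before the leading "/"
            }
            current = current.lookup(component);
            if (current == null) {
                return null; // the server would turn this into a "not found" reply
            }
        }
        return current;
    }
}
```

Since every step is a method call on a resource object, a directory resource is free to answer a lookup in its own way, for instance by running filters first.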

Once this file was compiled, the server would then be run. Upon receiving a request the lookup process would load into memory the appropriate binary files, and resurrect the objects as required. Loading these files was done through a cache, so that efficient access would be allowed (we'll come back to this later).

With this new design, we were eager to see how content negotiation would be handled (hey, the whole point was to have an efficient implementation of content negotiation). Content negotiation would be handled by the NegotiatedResource Java class, that would store all its variants, and their meta-information, in order to run the negotiation algorithm as quickly as possible. A typical negotiated resource would be described by:

(entity w3c.http.core.NegotiatedResource
 :name "foo"
 :variants ((entity w3c.http.core.FileResource
             :name "foo.gif"
             :type "image/gif"
             :quality 0.9)
            (entity w3c.http.core.FileResource
             :name "foo.png"
             :type "image/x-png"
             :quality 1.0)))

In brief, the NegotiatedResource class had a variants attribute describing all the variants among which negotiation was allowed for the given resource. Each variant would get its quality from a quality attribute. Note how this time, all the information required to run the content negotiation algorithm would be readily available to the server. Note also how the content-type and other meta-information would be accessed directly once a resource was restored from its serialized form.
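
With all variants in memory, the negotiation itself is a small computation. The paper does not spell out the algorithm, so the sketch below assumes a common simplified rule: pick the variant that maximises the product of its quality attribute and the q value the client gave for its media type. The class and method names are hypothetical, and wildcards such as image/* are ignored.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch of negotiation over in-memory variants.
final class NegotiationSketch {
    /** Hypothetical in-memory variant, as the variants attribute might hold it. */
    static final class Variant {
        final String name;
        final String type;
        final double quality;

        Variant(String name, String type, double quality) {
            this.name = name;
            this.type = type;
            this.quality = quality;
        }
    }

    /**
     * Picks the variant maximising (variant quality x client preference).
     * acceptQ maps a media type to the q value from the request's Accept header.
     * Returns null when the client accepts none of the variants.
     */
    static Variant select(List<Variant> variants, Map<String, Double> acceptQ) {
        Variant best = null;
        double bestScore = 0.0;
        for (Variant v : variants) {
            double score = v.quality * acceptQ.getOrDefault(v.type, 0.0);
            if (score > bestScore) {
                best = v;
                bestScore = score;
            }
        }
        return best;
    }
}
```

No file-system access or file parsing appears anywhere in select(), which is the efficiency property this design was after.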

However, a set of new problems arose in this design, which made up a new lesson.

Lessons

First of all, we were quite happy with this new design. It had solved the meta-information store problem in a nice way, but most interestingly, each exported resource would now be encapsulated in its own object, bringing up a natural server extension API: to extend the server with new functionality (such as CGI script handling), you would just write a new resource sub-class matching your needs. One nice side-effect of this new design was that agents (e.g. downloading code to the server) could be implemented really easily: instead of serializing a resource to the disk, you would just serialize it to a socket connected to the target server. The implementation of this design indeed came with an Agent class, a sub-class of the generic Resource class, whose instances could go from one site to another by the use of a go(URL url) method.

The filter concept was reintegrated in the new design. This was not so simple: the previous design allowed you to filter a whole area (as defined by its prefix); what we wanted now was the ability to plug a filter on one single resource, while at the same time keeping a way to tell that a whole area should be filtered. To solve the conflict, we wrote a new FilteredResource class, whose lookup method would invoke any of its filters. By using this trick, we were able to filter, for example, all accesses to some /foo/bar/* area: the resource exporting the bar directory would be a sub-class of the FilteredResource, and whatever filters were plugged on it would be called each time the bar resource was crossed by the lookup process.

Compared to other servers, this design allowed for an efficient implementation, since the cost of recomputing meta-information would totally disappear. There were still, however, some drawbacks.

The implementation complexity reached a new order of magnitude: this time, the server process would have to host one Java object per exported resource. This meant that it might have to keep in memory hundreds of thousands of these objects. Fortunately, because these objects were persistent, it was quite easy to fix this problem by using some appropriate caching mechanism. Once a resource was detected idle for a given duration of time, it could be serialized back to disk (if needed), freeing the memory it occupied before. It would then be brought up to memory later if requested.

The configuration process was now much more difficult to handle. We experimented with ways of making it easier. The idea here was to put the complexity in the indexer program: when the indexer was run in some directory, it would match all files found against a global configuration database, describing how files should be wrapped into resources, based on their extensions. A typical statement in this configuration database would look like:

(extension ".html"
 :class w3c.http.core.FileResource
 :type "text/html"
 :quality 1.0
 :icon "html.gif")

When encountering a .html file, the indexer would then know that by default (i.e. if no other statement was made about the file in the per-directory configuration file), it should build a FileResource to export the file.
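
The indexer's default rule can be sketched as a table keyed by file extension. The names below (ExtensionDatabase, Defaults, defaultsFor) are hypothetical, and only three of the attributes from the statement above are modelled.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the global extension database consulted by the indexer.
final class ExtensionDatabase {
    /** Defaults attached to one extension, mirroring part of an extension statement. */
    static final class Defaults {
        final String resourceClass;
        final String type;
        final double quality;

        Defaults(String resourceClass, String type, double quality) {
            this.resourceClass = resourceClass;
            this.type = type;
            this.quality = quality;
        }
    }

    private final Map<String, Defaults> byExtension = new HashMap<>();

    void register(String extension, Defaults defaults) {
        byExtension.put(extension, defaults);
    }

    /** Returns the defaults for a file name, or null when its extension is unknown. */
    Defaults defaultsFor(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? null : byExtension.get(fileName.substring(dot));
    }
}
```

An indexer would call defaultsFor() for every file that has no explicit entry in the per-directory configuration file, and build the resource from the returned class name and attribute values.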

Another problem with this design was the requirement to run the indexer in each directory before running the server. This definitely broke with our "run straight out of the box" goal. More importantly, the fact that meta-information was stored both in an ASCII file, and in a binary version (the compiled form) meant that the server had to deal with two kinds of inconsistencies:

As the server would cache the binary files, the cached copies might become out of synchronization with the ASCII files, or even with the binary files themselves: we lacked a way of notifying the server that it should flush some cached binary file (well, it was possible, but added as a hack, instead of being thought of right from the beginning).
We wanted our server to handle PUT of new documents. This would mean that the server itself would have to edit the binary version of the per-directory configuration file. This, by itself was easy, but changes to the binary file would be superseded by the recompilation of the ASCII per-directory configuration file.

Summary

Except for the cache coherency problem mentioned above, this design was quite nice for a read-only server. In particular, it sold us on the idea of a server of persistent objects, as opposed to the simple file-based design that we built first. This was a big step forward.

However, when it came to making the server writable, things hurt. The real lesson from this design was twofold:

Persistent objects were good.
If you were to have your server serve an editable space, then simply dumping the body of a PUT method into some file was not enough: you ought to provide a way of editing the meta-information associated to each resource through the server itself.

As a side effect, the second point would solve our problem of caching the per-directory information: only the server should write this information, in order to avoid the conflicts created by having two potential sources of edits to a single resource. The implications of these two lessons led us to a new redesign of Jigsaw.

The object oriented design

The last design presented here is the current Jigsaw design, as of version 1.0a. The main lessons it takes into account can be summarized as:

All exported resources are persistent objects, and they can be edited only through the server.
All resources should be able to support filters.
The configuration process must be changed in order to be done through the server, so that it can keep its cache of persistent resources up-to-date.

As usual now, we are going to walk through the configuration process, trying to highlight the parts of implementation that are relevant to this discussion.

Configuration process

As of this release, the configuration process can be run entirely through HTML forms. To handle this in some generic fashion, the resource objects have been extended with a way to describe themselves. Each resource supports a getAttribute() method, which is responsible for returning the full list of attributes it supports, along with various type information. More precisely, each declared attribute of a resource has a piece of meta-information associated with it, which describes how values for this attribute can be edited and saved. Based on this description, the Jigsaw configuration module is able to dynamically generate a form, suitable to edit any resource attribute. This is how Jigsaw's configuration is done today.
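
To illustrate how a form can be derived from attribute descriptions alone, here is a minimal sketch. It does not reproduce Jigsaw's attribute classes: FormSketch, its Attribute type and the render() method are hypothetical, each description is reduced to a name, an HTML input type and a current value, and HTML escaping is omitted for brevity.

```java
import java.util.List;

// Hypothetical sketch: generate an edit form from self-describing attributes.
final class FormSketch {
    /** Hypothetical attribute description. */
    static final class Attribute {
        final String name;
        final String inputType;
        final String value;

        Attribute(String name, String inputType, String value) {
            this.name = name;
            this.inputType = inputType;
            this.value = value;
        }
    }

    /** Renders one input field per declared attribute (values are not escaped here). */
    static String render(String action, List<Attribute> attributes) {
        StringBuilder html = new StringBuilder();
        html.append("<form method=\"POST\" action=\"").append(action).append("\">\n");
        for (Attribute a : attributes) {
            html.append(a.name).append(": <input type=\"").append(a.inputType)
                .append("\" name=\"").append(a.name)
                .append("\" value=\"").append(a.value).append("\">\n");
        }
        html.append("<input type=\"submit\">\n</form>");
        return html.toString();
    }
}
```

The form generator knows nothing about any particular resource class, so a newly written resource sub-class becomes editable without any extra configuration code.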

Resource filters are handled as a specific attribute of the FilteredResource class, whose instances can be edited dynamically. As a side effect of this resource model, it appeared that any piece of data (be it configuration or not) that needs to be edited through the server could now be turned into some specific kind of resource. The current Jigsaw implementation takes full advantage of this, and things like authentication realms are typically implemented as special classes of resources. This not only means that we get authentication realms for free, it also means that any caching scheme applied to resources also applies to authentication realms. Jigsaw keeps an LRU list of resources loaded in memory; this list can include things like authentication realms (which are kept in memory only when they are really needed), authentication realm users, etc. All these data structures, by inheriting from the Resource class, get the caching behavior for free.
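
An LRU list of loaded resources can be sketched with the standard LinkedHashMap class in access order. This is an assumption made for illustration: the paper does not describe Jigsaw's actual cache structure, and the ResourceLru name is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of an LRU cache of loaded resources.
final class ResourceLru<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;

    private final int capacity;

    ResourceLru(int capacity) {
        super(16, 0.75f, true); // true = keep entries in access order
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // A persistent store would serialize the evicted resource to disk here if needed.
        return size() > capacity;
    }
}
```

Because the cache is generic in its value type, anything stored in it (a file resource, an authentication realm, a realm user) gets the same eviction behavior.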

Another place where resources are used is in the global configuration database. Remember how the previous design maintained a list of extensions, in order to give the indexer program the knowledge of how files should be indexed (wrapped into resource instances) by default. This last design of Jigsaw maintains a similar database, with the indexer now being part of the server itself. For example, each extension property is defined through a special extension resource, which has attributes for storing the class of resource used to export the matching files and the default attribute values (such as the expiry date, etc.).

Because the indexer is now part of Jigsaw, the server can now be used straight out of the box: you can just run Jigsaw in some root directory, and it will dynamically create the queried resource instances as it runs. Each time a resource is created, it is made persistent, so that next time the server runs, it doesn't have to recreate them. Resources are serialized into per-directory binary files, which are totally under the control of the server (these files are usually named .jigidx files, with their backup version being .jigidx.bak).
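
The idea of making a resource persistent and resurrecting it later can be sketched with the standard java.io object serialization. This is a stand-in chosen for illustration: the paper does not describe the actual .jigidx file format, and the PersistenceSketch and StoredResource names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.HashMap;

// Hypothetical sketch: dump a resource to bytes and restore it.
final class PersistenceSketch {
    /** Hypothetical resource state: a name and its attribute values. */
    static final class StoredResource implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final HashMap<String, String> attributes = new HashMap<>();

        StoredResource(String name) {
            this.name = name;
        }
    }

    /** Serializes the resource; a server would write these bytes to its store. */
    static byte[] dump(StoredResource resource) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(resource);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Rebuilds a resource, with its attributes, from its serialized form. */
    static StoredResource restore(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (StoredResource) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The restored object already carries its content-type and other attributes, so nothing has to be recomputed or parsed from an ASCII file when the server restarts.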

The usual stop/edit/restart cycle needed to change the server configuration has now totally disappeared: you can edit the resource configuration (defined through some of their attribute values) dynamically, while the server is running.

Lessons

Jigsaw's current design seems to fulfill all our needs. The implementation still needs some tuning, and some features need to be added, but the APIs seem pretty stable now.

The implementation is now fairly complex. It uses a number of caching mechanisms (who said a server was just a series of caches ?), which do impact the current extension APIs (although not the one that will be commonly used, such as the resource class). Improvement to the implementation will be made by turning most of the design into interfaces and by making the current implementations (the actual classes) a set of sample implementations for these interfaces.

Jigsaw also lacks some important features: support for virtual hosts, server-side markup and this kind of thing. Most of these features (if not all) can be implemented as either new resource classes, or new filters.

Next step

So what will the next design of Jigsaw be? Well, as said above, there will probably not be a redesign of Jigsaw for some time. Once the current implementation is cleanly separated between a set of interfaces and a set of implementations, it will be possible to add functionality in a clean way.

Among the number of things that might happen, there are a few that I would like to emphasize:

As a matter of fact, Jigsaw can be considered as some sort of object oriented database. It would be nice to be able to reuse available technology here. The ResourceStore interface can be implemented to fetch serialized objects from any kind of database, rather than from a per-directory configuration file. This could improve both the robustness and the performance.
The configuration process, as described above, suffers from a lack of user friendliness. This can be remedied by making the client smarter. The use of HTML forms as a general means to edit the object attributes limits the range of usable UI metaphors. Work will be done to first provide access to the server through RMI, and then, using this remote interface, to write a real server administration application (probably composed of a set of applets).
Use Java's ability to move code around. Throughout the paper we have insisted on the fact that resources are self-contained: they know their own description, their configuration and their behavior. This means that if you were able to move these objects on the network, you could achieve a nice replication environment, bringing computation closer to the clients. More precisely, today the world is divided into two big categories: the CGI world, which emphasizes computation happening on the server side, and the Applet world, which emphasizes computation happening on the client side. We would like to experiment with something in between those, where computation would happen in proxy servers.

Conclusion

We have walked through three different server designs. The first one was a simple file-based server; we showed why the lack of a meta-information store led to an inefficient server implementation. The second design corrected this by using per-directory configuration files. This effectively solved the first problem; however, it brought up a new set of issues when it came to having a writable server. Our last design is an attempt at solving all the problems encountered so far; it has now been stable for several months.

It is interesting to note how our requirements on the server have changed during the whole year; in particular, the emphasis on a server that both serves and stores documents has taken an increasing priority in the design.

To conclude, I would like to highlight how the three problems we mentioned in the introduction are now solved:

Efficient content negotiation: We showed that this first problem is related to the fact that our initial design didn't efficiently support a meta-information store. The current version of Jigsaw supports content negotiation and is able to run it without touching the disk in most cases.
PUTed document meta-information: Jigsaw supports PUT: the first time a document is PUT to the server (it is in fact created), Jigsaw creates an appropriate empty resource instance, which will capture all the meta-information that comes from the PUT request. This means that you can safely PUT a document whose name is foo.bar, and whose content-type is text/html.
Editing the information space configuration: Adding authentication to an area of information exported by the server is just a matter of adding an authentication filter to the appropriate DirectoryResource. This can be done entirely through forms, by editing the resource's attributes.

Jigsaw Team
$Id: wp.html,v 1.11 2017/10/02 11:00:09 denis Exp $