Jigsaw
Internal design of Jigsaw 2.0



In Jigsaw, each served URI is bound to an object that generates content. This object is a FramedResource (FileResource, DirectoryResource...), which is associated with a ProtocolFrame instance (usually an HTTPFrame or one of its subclasses). Resources and their frames are created either manually (with JigAdmin 1.0 or 2.0) or automatically by an indexer.

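This document has the following sections:

  1. What is a Resource?
  2. What is a Frame?
  3. What is a Filter?
  4. What is an Indexer?
  5. The new Resource
  6. The new ResourceStoreManager
  7. The lookup and perform algorithm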

What is a Resource?

Resources are the objects exported by Jigsaw to the outside world. A resource can serve a raw data stream, such as a text file or an image file, or it can be an active object that generates its data stream on the fly depending on context, such as a servlet, a CGI script or a filtered resource.

Inside Jigsaw, a resource is a full Java object containing only the information that the raw Resource (a file, a directory...) can provide (e.g., for a file, the size, last modification date...).

To be available on a server, a resource MUST be associated with a ProtocolFrame implementing the protocol available on that particular server. An instance of Jigsaw (i.e., the Java process) can support multiple servers. Each server can implement a different protocol, and servers may or may not share the same set of resources.

The list of all Resources is available.

What is a Frame?

Since Jigsaw 2.0, Resources are very basic and contain only the intrinsic knowledge of the resource; for a file, this is the size, last modification date... The 2.0 Resources have protocol frames, which handle the different protocols used to fetch a particular resource.

A Frame is a full Java object containing all the information needed to serve this Resource using a specific protocol (e.g., HTTPFrame for HTTP).

The list of all Frames is available.
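The split between a resource and its frames can be sketched as follows. This is only an illustration under stated assumptions: SketchResource, SketchFrame and SketchHttpFrame are hypothetical, simplified stand-ins invented for this example and do not reproduce the real Jigsaw classes or their signatures (in particular, the sketch does not model frames being resources themselves).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins; these are NOT the real Jigsaw classes.
class SketchResource {
    final String name;
    final long size;          // intrinsic information only
    final long lastModified;  // milliseconds since the epoch
    private final List<SketchFrame> frames = new ArrayList<>();

    SketchResource(String name, long size, long lastModified) {
        this.name = name;
        this.size = size;
        this.lastModified = lastModified;
    }

    // Attaches a frame and tells the frame which resource it describes.
    void registerFrame(SketchFrame frame) {
        frame.resource = this;
        frames.add(frame);
    }

    // Returns the first attached frame of the given type, or null if there is none.
    <T extends SketchFrame> T getFrame(Class<T> type) {
        for (SketchFrame frame : frames) {
            if (type.isInstance(frame)) {
                return type.cast(frame);
            }
        }
        return null;
    }
}

abstract class SketchFrame {
    SketchResource resource;  // set when the frame is attached
}

class SketchHttpFrame extends SketchFrame {
    String contentType = "text/html";  // protocol-specific information lives here

    String describe() {
        return "HTTP " + resource.name + " (" + resource.size + " bytes, " + contentType + ")";
    }
}

class FrameSketchDemo {
    public static void main(String[] args) {
        SketchResource page = new SketchResource("index.html", 1024L, 0L);
        page.registerFrame(new SketchHttpFrame());
        System.out.println(page.getFrame(SketchHttpFrame.class).describe());
    }
}
```

The point of the design is visible in the fields: the resource knows nothing about HTTP, so the same resource could carry a frame for another protocol without being changed.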

[Figure: Resources and frames]

What is a Filter?

Filters alter the way resources are served, for example by requesting authentication. Filters are attached to resources. In the 2.0 version, they are attached to the protocol frame of the resource rather than to the resource itself, because many filtering schemes depend on the protocol used: HTTP authentication, for instance, is specific to HTTP and cannot be used with other protocols. Filters are called before and after the resource is served. A filter may act only before the resource is served, like an authentication filter, or only after, like a "find and replace" filter.

A Filter is a full Java object, associated with a Frame, that can modify the Request and/or the Reply. For example, authentication is handled by a special filter, the GenericAuthFilter (note that, as this filter is protocol-dependent, it is placed on the protocol frame).

The list of all Filters is available.
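As a rough illustration of the "before" and "after" roles described above, here is a minimal sketch. All type names (DemoRequest, DemoReply, DemoFilter, DemoAuthFilter, DemoReplaceFilter) are hypothetical and much simpler than the real Jigsaw Request, Reply and filter classes; the credential check is a placeholder string comparison, not a real authentication scheme.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified types; not the real Jigsaw Request, Reply or filter classes.
class DemoRequest {
    final Map<String, String> headers = new HashMap<>();
}

class DemoReply {
    final int status;
    String body;

    DemoReply(int status, String body) {
        this.status = status;
        this.body = body;
    }
}

interface DemoFilter {
    // Called before the resource is served; a non-null reply stops processing.
    DemoReply ingoingFilter(DemoRequest request);

    // Called after the resource is served; may modify or replace the reply.
    DemoReply outgoingFilter(DemoRequest request, DemoReply reply);
}

// "Before" filter: rejects requests that lack the expected credentials.
class DemoAuthFilter implements DemoFilter {
    private final String expectedCredentials;

    DemoAuthFilter(String expectedCredentials) {
        this.expectedCredentials = expectedCredentials;
    }

    public DemoReply ingoingFilter(DemoRequest request) {
        String given = request.headers.get("Authorization");
        return expectedCredentials.equals(given) ? null : new DemoReply(401, "Unauthorized");
    }

    public DemoReply outgoingFilter(DemoRequest request, DemoReply reply) {
        return reply;  // nothing to do after serving
    }
}

// "After" filter: find and replace in the reply body.
class DemoReplaceFilter implements DemoFilter {
    private final String find;
    private final String replacement;

    DemoReplaceFilter(String find, String replacement) {
        this.find = find;
        this.replacement = replacement;
    }

    public DemoReply ingoingFilter(DemoRequest request) {
        return null;  // never blocks a request
    }

    public DemoReply outgoingFilter(DemoRequest request, DemoReply reply) {
        reply.body = reply.body.replace(find, replacement);
        return reply;
    }
}

class FilterSketchDemo {
    public static void main(String[] args) {
        DemoRequest anonymous = new DemoRequest();
        DemoReply rejected = new DemoAuthFilter("secret").ingoingFilter(anonymous);
        System.out.println(rejected.status + " " + rejected.body);
    }
}
```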

What is an Indexer?

There are two ways of configuring resources: adding a specific resource directly at a specific place, or letting indexers create the resources in the server hierarchy. Manual tuning can of course be combined with indexers, and this combination is the most common way to configure Jigsaw.

An indexer, placed on a Container, is in charge of creating the container's son resources. It creates a Resource of a particular kind depending, for example, on the extension of the filename ("html" for an HTML page, "png" for a PNG image file...). You can also place a specific indexer for CGI on a directory named "cgi-bin". In Jigsaw 2.0, the indexer is not only in charge of creating the resource; it also has to put the right protocol frames (and other frames if necessary) on the created resource.

An indexer has two main parts:

  1. The Directories.
  2. The Extensions.

In the Directories, you specify how to index directories with a specific name. The default name is "*default*"; in the default indexer, the resource created for it is a DirectoryResource. In the 2.0 version, it creates a DirectoryResource with a default HTTPFrame.

In the Extensions, you specify how to index files or leaf Resources. The default extensions (html, gif, png, txt...) are mapped to FileResource. In the 2.0 version, an HTTPFrame is added to the Resource.
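The two parts of an indexer can be pictured as two lookup tables. The sketch below is an assumption-laden illustration, not the Jigsaw indexer API: IndexRule and SketchIndexer are hypothetical names, rules are plain strings instead of real classes, and the choice to return null for a file whose extension has no rule is an assumption of this sketch.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// What to create for a matching name: a resource class plus the frames to attach.
// Hypothetical type; class names are kept as plain strings for illustration.
class IndexRule {
    final String resourceClass;
    final List<String> frameClasses;

    IndexRule(String resourceClass, String... frameClasses) {
        this.resourceClass = resourceClass;
        this.frameClasses = Arrays.asList(frameClasses);
    }
}

class SketchIndexer {
    static final String DEFAULT_DIRECTORY = "*default*";

    private final Map<String, IndexRule> directories = new HashMap<>();
    private final Map<String, IndexRule> extensions = new HashMap<>();

    void addDirectory(String name, IndexRule rule) {
        directories.put(name, rule);
    }

    void addExtension(String extension, IndexRule rule) {
        extensions.put(extension, rule);
    }

    // A directory without a rule of its own falls back to the "*default*" rule.
    IndexRule indexDirectory(String name) {
        IndexRule rule = directories.get(name);
        return rule != null ? rule : directories.get(DEFAULT_DIRECTORY);
    }

    // Assumption of this sketch: a file whose extension has no rule is not indexed (null).
    IndexRule indexFile(String filename) {
        int dot = filename.lastIndexOf('.');
        if (dot < 0) {
            return null;
        }
        return extensions.get(filename.substring(dot + 1));
    }

    // Mirrors the defaults described in the text: directories and a few
    // extensions, each created with an HTTPFrame attached.
    static SketchIndexer defaultIndexer() {
        SketchIndexer indexer = new SketchIndexer();
        indexer.addDirectory(DEFAULT_DIRECTORY, new IndexRule("DirectoryResource", "HTTPFrame"));
        for (String extension : new String[] {"html", "gif", "png", "txt"}) {
            indexer.addExtension(extension, new IndexRule("FileResource", "HTTPFrame"));
        }
        return indexer;
    }
}

class IndexerSketchDemo {
    public static void main(String[] args) {
        SketchIndexer indexer = SketchIndexer.defaultIndexer();
        IndexRule rule = indexer.indexFile("index.html");
        System.out.println(rule.resourceClass + " with " + rule.frameClasses);
    }
}
```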

A tutorial about the setup of indexers is available; it helps in understanding how they work.

The new Resource

In the previous version of Jigsaw (1.0), the inheritance tree of Resources was:
[Figure: Sketches of the old design]

All the basic Resources, such as FileResource and DirectoryResource, were heavily linked to HTTP, as all the resources served were extensions of HTTPResource. To support protocols that are not HTTP-related, we propose this new version of the Resource:

[Figure: basic resource with frames]

Here (1) and (2) are ResourceFrames. A Resource is now a very basic thing, containing only information that the raw Resource can provide (e.g., for a file, the size, last modification date, creation date if available, etc.). Attached to that Resource are the ResourceFrames, which extend Resource (they are handled the same way) and contain information about the Resource they are attached to.

To serve a resource using a protocol -- for instance, HTTP -- the Resource will have a protocol ResourceFrame, HTTPFrame, that contains all the information needed to serve this Resource using HTTP. This frame is like the old version of HTTPResource, but it contains more information than the previous version.

The filters are now divided into two categories: filters on the Resource and filters on the protocol Frames. The usual filtering scheme used in the previous version of Jigsaw is still valid. The main difference is that these filters are no longer attached to the Resource itself but to its protocol frame. ResourceFrames can also have frames.

Other kinds of frames can be attached, such as an RDF frame for metadata or a PICS frame to rate the resource.

The new inheritance tree is:

[Figure: new design (inheritance). Classes shown: basic resource, framed resource, resource frame, abstract container, file resource, container resource, protocol frame, external container, resource filter, directory resource, http frame]

This tree is more complex, but more flexible than the previous version.

The new ResourceStoreManager

In order to share all the Resources amongst different servers efficiently, we created a new central ResourceStoreManager. In the previous version the Resources were handled by other Resources. For example, the FileResource was handled by its DirectoryResource. This induced a number of bugs and was not very well-adapted to the new way of sharing Resources. There is now only one manager for the server handler so that each server will talk to this sole manager.
[Figure: resource manager drawing]

This RSM contains a hashtable associating a key (the unique identifier of a ResourceContainer) with a StoreEntry. The StoreEntry contains the store of the resource's sons and a hashtable associating the identifier of each son with its ResourceReference.

The ResourceReference is used like this:

 ResourceReference rr;
 ....
 try {
     Resource res = rr.lock();
     ....
 } catch (InvalidResourceException ex) {
     /* InvalidResourceException means that the resource has been deleted */
     ....
 } finally {
     rr.unlock();
 }
 ...

If the resource has been garbage-collected, rr.lock() loads it again. While the lock is held, the resource is guaranteed not to be deleted, unloaded or modified by anyone else, which allows safe concurrent access to this resource.

The container is no longer responsible for the management of its sons; it only holds a key to the StoreEntry that contains them. To get its own store, a resource has to ask its parent for the StoreEntry that contains it.
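The structure described above (a manager holding StoreEntries by container key, each entry holding references to the sons) can be sketched as below. Everything here is hypothetical and simplified for illustration: SketchStoreManager, SketchStoreEntry, SketchReference and SketchInvalidResourceException are invented names, the "resource" is just a string, loading is faked, and the sketch does not protect the resource against modification by other threads. It only shows how a lock can reload an unloaded resource and keep it pinned until it is unlocked.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the exception thrown when a resource no longer exists.
class SketchInvalidResourceException extends Exception {
    SketchInvalidResourceException(String id) {
        super("resource deleted: " + id);
    }
}

class SketchReference {
    private final String id;
    private String resource;   // null while the resource is unloaded
    private int lockCount;
    private boolean deleted;

    SketchReference(String id) {
        this.id = id;
    }

    // Loads the resource if needed and pins it until unlock() is called.
    synchronized String lock() throws SketchInvalidResourceException {
        if (deleted) {
            throw new SketchInvalidResourceException(id);
        }
        if (resource == null) {
            resource = "loaded:" + id;  // stands in for reading from the store
        }
        lockCount++;
        return resource;
    }

    // Safe to call from a finally block even if lock() failed.
    synchronized void unlock() {
        if (lockCount > 0) {
            lockCount--;
        }
    }

    // Unloading is refused while someone holds a lock.
    synchronized boolean unload() {
        if (lockCount > 0) {
            return false;
        }
        resource = null;
        return true;
    }

    // Deleting is refused while someone holds a lock.
    synchronized boolean delete() {
        if (lockCount > 0) {
            return false;
        }
        deleted = true;
        resource = null;
        return true;
    }
}

// Holds the references to the sons of one container.
class SketchStoreEntry {
    private final Map<String, SketchReference> references = new HashMap<>();

    synchronized SketchReference lookup(String sonId) {
        return references.computeIfAbsent(sonId, SketchReference::new);
    }
}

// The single manager: one StoreEntry per container key.
class SketchStoreManager {
    private final Map<Integer, SketchStoreEntry> entries = new HashMap<>();

    synchronized SketchStoreEntry entryFor(Integer containerKey) {
        return entries.computeIfAbsent(containerKey, key -> new SketchStoreEntry());
    }
}

class StoreSketchDemo {
    public static void main(String[] args) {
        SketchReference rr = new SketchStoreManager().entryFor(1).lookup("index.html");
        try {
            System.out.println(rr.lock());
        } catch (SketchInvalidResourceException ex) {
            System.out.println(ex.getMessage());
        } finally {
            rr.unlock();
        }
    }
}
```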

The lookup and perform algorithm

This part describes the lookup and perform algorithms used by Jigsaw.

The following picture shows a Jigsaw resource tree (relative to the URL /archives/index.html), where root and archives are DirectoryResources (root is the root resource) and index.html is a FileResource. F1, F2 and F3 are filters (instances of ResourceFilter subclasses).

[Figure: graphical description of request handling]

In the following description, Jigsaw receives an HTTP GET request for /archives/index.html. To handle the incoming request, Jigsaw goes through the following steps:

  1. Lookup for /archives/index.html
  2. Call the ingoingFilter method of filters
  3. Perform the request
  4. Call the outgoingFilter method of filters
  5. Emit the reply

1) Lookup for /archives/index.html. The LookupState (ls) keeps the state info, and the LookupResult (lr) is the result of the lookup algorithm.
root lookup(ls,lr)
 -> HTTPFrame1 lookup(ls,lr)
 -> F1 lookup(ls,lr)
 -> HTTPFrame1 lookupDirectory(ls,lr)
 -> archives lookup(ls,lr)
 -> HTTPFrame2 lookup(ls,lr)
 -> F2 lookup(ls,lr)
 -> HTTPFrame2 lookupDirectory(ls,lr)
 -> index.html lookup(ls,lr)
 -> HTTPFrame3 lookup(ls,lr)
 -> F3 lookup(ls,lr)
 -> HTTPFrame3 lookupFile(ls,lr) => index.html
[Figure: the lookup algorithm]
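The lookup trace above walks the URL one path component at a time, passing the filters attached along the way. A minimal sketch of that walk follows. It is a simplification under stated assumptions: LookupNode and LookupSketch are hypothetical names, filters are represented by their names only, and the real LookupState and LookupResult objects are replaced by a plain path string and a list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical tree node: a named resource with filter names and children.
class LookupNode {
    final String name;
    final List<String> filters = new ArrayList<>();
    final Map<String, LookupNode> children = new LinkedHashMap<>();

    LookupNode(String name, String... filterNames) {
        this.name = name;
        this.filters.addAll(Arrays.asList(filterNames));
    }

    LookupNode add(LookupNode child) {
        children.put(child.name, child);
        return this;
    }
}

class LookupSketch {
    // Walks the path one component at a time, collecting the filters met on the way.
    // Returns the target node, or null if a component does not exist.
    static LookupNode lookup(LookupNode root, String path, List<String> collectedFilters) {
        LookupNode current = root;
        collectedFilters.addAll(root.filters);
        for (String part : path.split("/")) {
            if (part.isEmpty()) {
                continue;  // skips the empty component before the leading '/'
            }
            current = current.children.get(part);
            if (current == null) {
                return null;
            }
            collectedFilters.addAll(current.filters);
        }
        return current;
    }

    public static void main(String[] args) {
        LookupNode root = new LookupNode("root", "F1")
                .add(new LookupNode("archives", "F2")
                        .add(new LookupNode("index.html", "F3")));
        List<String> filters = new ArrayList<>();
        LookupNode target = lookup(root, "/archives/index.html", filters);
        System.out.println(target.name + " via " + filters);
    }
}
```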

2) Call the ingoingFilter method of filters. Request is the incoming request.

F1 ingoingFilter(Request)
F2 ingoingFilter(Request)
F3 ingoingFilter(Request)

Note that if any filter answers with a non-null Reply, the process is stopped and that Reply is sent back to the client directly (as the GenericAuthFilter does).

3) Perform the request (GET) on the resource found at lookup time.

index.html perform(Request)
 -> HTTPFrame3 perform(Request)
 -> HTTPFrame3 get(Request)
 -> HTTPFrame3 getFileResource(Request) => Reply

4) Call the outgoingFilter method of filters. Request is the incoming request, Reply is the reply created by HTTPFrame3.
F3 outgoingFilter(Request, Reply)
F2 outgoingFilter(Request, Reply)
F1 outgoingFilter(Request, Reply)
[Figure: the perform algorithm, with filters]

5) Emit the reply created by HTTPFrame3.
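Steps 2 to 4 can be summarized in a short sketch: ingoing filters run in lookup order, a non-null reply from any of them stops the process, and outgoing filters run in reverse order on the reply. The names StepFilter, TracingFilter and RequestPipeline are hypothetical, and requests and replies are reduced to strings; this is an illustration of the ordering described above, not the Jigsaw implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical filter contract, with strings standing in for Request and Reply.
interface StepFilter {
    // A non-null return value is an early reply that stops the process.
    String ingoingFilter(String request);

    String outgoingFilter(String request, String reply);
}

// Records each call so the ordering can be observed.
class TracingFilter implements StepFilter {
    private final String name;
    private final List<String> log;
    private final String earlyReply;  // non-null makes ingoingFilter stop the request

    TracingFilter(String name, List<String> log, String earlyReply) {
        this.name = name;
        this.log = log;
        this.earlyReply = earlyReply;
    }

    public String ingoingFilter(String request) {
        log.add(name + " in");
        return earlyReply;
    }

    public String outgoingFilter(String request, String reply) {
        log.add(name + " out");
        return reply;
    }
}

class RequestPipeline {
    static String handle(String request, List<StepFilter> filters,
                         Function<String, String> perform) {
        // Step 2: ingoing filters, in the order found at lookup time.
        for (StepFilter filter : filters) {
            String early = filter.ingoingFilter(request);
            if (early != null) {
                return early;  // sent back to the client directly
            }
        }
        // Step 3: perform the request on the resource found at lookup time.
        String reply = perform.apply(request);
        // Step 4: outgoing filters, in reverse order.
        for (int i = filters.size() - 1; i >= 0; i--) {
            reply = filters.get(i).outgoingFilter(request, reply);
        }
        return reply;
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        List<StepFilter> filters = new ArrayList<>();
        filters.add(new TracingFilter("F1", log, null));
        filters.add(new TracingFilter("F2", log, null));
        filters.add(new TracingFilter("F3", log, null));
        String reply = handle("GET /archives/index.html", filters, request -> "index content");
        System.out.println(log + " => " + reply);
    }
}
```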