W3C Jigsaw

Jigsaw configuration tutorial

This tutorial walks through the whole Jigsaw configuration process. It assumes only that you have unpacked the distribution, and then takes you through the following steps: running Jigsaw, global configuration, resources configuration, editing, and authentication.

It is recommended, however, that before reading this tutorial, you read the brief overview of Jigsaw architecture.

Note also that a new configuration tool, JigAdmin, has been provided to make the configuration of Jigsaw easier, and more understandable. Of course, reading this tutorial is still very useful!

Running Jigsaw

The Jigsaw distribution comes with a sample root directory. The very first thing to do is to check that it runs properly in this directory. For this purpose, just run the following command (it should work on any platform that supports Java):

For version 1.0alpha5 and up:

java w3c.jigsaw.Main -root root

For version 1.0alpha4 and lower:

java w3c.jigsaw.http.httpd -host host -port port -root root

You should substitute your own values for host, port and root.

If everything runs smoothly, then Jigsaw will print the following message:

loading properties from: /afs/w3.org/usr/abaird/Jigsaw/config/httpd.props
Jigsaw[1.0alpha3]: serving at http://www24.w3.org:8888

This means that the server is ready, and listening at the given URL. If things do not run as nicely, here is a list of possible hints:

We are now assuming that Jigsaw is running in its sample root directory. If you want to run Jigsaw in some place other than its sample root directory, you just need to copy that directory to the appropriate location (using a recursive copy program, so that you get the sub-directories too).
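If no recursive copy program is at hand, the copy can be done from Java itself. The following is a minimal sketch, not part of the Jigsaw distribution: the class name RootCopy and both paths are placeholders, and it assumes the target location does not already contain the files.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

// Sketch only: copies a directory tree, sub-directories included.
class RootCopy {
    static void copyTree(Path source, Path target) {
        try (Stream<Path> paths = Files.walk(source)) {
            // Files.walk visits a directory before its contents, so every
            // parent directory exists by the time its children are copied.
            paths.forEach(p -> {
                Path dest = target.resolve(source.relativize(p).toString());
                try {
                    if (Files.isDirectory(p)) {
                        Files.createDirectories(dest);
                    } else {
                        Files.copy(p, dest);
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: java RootCopy <sample-root> <new-root>");
            return;
        }
        copyTree(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

Once the copy is done, pass the new location as the -root argument when starting the server.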

The sample root directory provides a set of resources (located in the Admin directory) that allow you to configure the server. As this directory doesn't come protected by default, you should continue to read this tutorial until you have protected it (you can jump directly to the authentication section, but it is recommended that you keep reading until reaching it).

Global configuration

Tired of writing all these command line arguments when running Jigsaw? In this section, we will explain how to edit the server's global configuration, which will allow you to save your default settings.

Point your favorite browser to /Admin/Properties. This page has a number of links, each pointing to a Jigsaw module. At this stage, the first thing you want to do is to save the actual settings. The command line flags all end up modifying the properties, so what you see is the correct set of properties (since you have provided the appropriate command line flags, e.g. -host, -port, etc.), but these are not consistent with the default property files. Save the properties by clicking the Save link. You can now play around, and change your settings as needed.

Note that some settings require a server restart (e.g. changing the server's port number). In such cases, an appropriate message will notify you that a restart is needed. You need not (and shouldn't) save the properties before restarting the server, so that you can check them before actually saving them. Once the server has restarted, if you find these settings convenient, you should save them by going back to the properties editor, and clicking the Save button.

Let's say, for example, that you want to turn the server into trace mode. You will first set the server's trace flag to true, then go to the connections property sheet, and turn the connection trace flag to true too. Don't save the properties yet, but restart the server. During its restart the server will first emit a message saying that it has shut itself down, and then emit a second message saying that it has initialized itself. The server now runs in trace mode (i.e. it will display requests and replies). If you want this change to be persistent (i.e. always run the server in trace mode), then go back to the properties editor, and Save the properties.

In some cases, after you change some properties, the server won't be able to restart (because the new settings are inconsistent). In this case, just restart the server manually (through the command line): this will make it read its settings from the unchanged properties.
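The relation between the running settings and the saved ones can be pictured with the following sketch. It is only a model of the behaviour described in this section, not Jigsaw code: the class ServerSettings and the property keys used with it are hypothetical.

```java
import java.util.Properties;

// Sketch only: "saved" stands for the property files on disk, "live" for
// the settings of the running server. Key names are hypothetical.
class ServerSettings {
    private final Properties saved = new Properties();
    private Properties live = new Properties();

    // A command line flag or an edit in the browser changes the live
    // settings only.
    void set(String key, String value) {
        live.setProperty(key, value);
    }

    String get(String key) {
        return live.getProperty(key);
    }

    // The Save link: the live settings become the stored defaults.
    void save() {
        saved.clear();
        saved.putAll(live);
    }

    // A manual restart from the command line: settings are read again
    // from the saved properties, so unsaved edits are discarded.
    void restartFromSaved() {
        live = new Properties();
        live.putAll(saved);
    }
}
```

This is why it pays to restart before saving: an inconsistent edit that was never saved is simply dropped by the next manual restart.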

Resources configuration

At this point, Jigsaw should be running, and its global configuration should be up to date. You may want to check that everything is fine by visiting some of the documents under the Jigsaw space directory (the WWW directory by default). You can read the documentation, for example, starting from /User/Overview.html.

This section will explain how you can declare new file extensions (the usual AddType directive of servers), and how you can create directory templates.

Describing files by extensions

To export any piece of information, Jigsaw wraps it into an HTTPResource object (see the Jigsaw resource factory documentation for more information on this). This resource factory maintains a database to keep track of how files should be wrapped into resources. This database, known as the extensions database, describes, for each extension, an (optional) mapping to some resource class, along with a set of default attribute values for the newly created resource that will encapsulate the file.

Let's say we want all files having the png extension to be exported by a FileResource instance (the FileResource class is the class that knows how to send back files as replies to requests). We also want to state that the content-type attribute value for these file resources should be image/x-png. We first point our favorite browser to /Admin/Extensions, which displays the sorted list of registered extensions. What we want to do is to add a new extension, so we click on the Add extension link. This brings up a new form, asking us for an extension name, and an (optional) class. We fill in the name of the extension (png, without the leading dot), and the extension class (w3c.jigsaw.resources.FileResource), and then press the Create button. This brings up yet another form, which allows us to enter default attribute values for resources that will be created through this extension. We want to specify the content-type, so we fill this field with image/x-png, and - just for fun - we may want the default icon for these resources to be image.gif, so we fill in this field too. We then press the OK button, which brings us back to the list of extensions (in which the new png extension has been inserted).

If you have a .png file under the Jigsaw space directory, you can now query it: the resource factory will know how to wrap it into a FileResource, and will send it back to you properly.

In some cases, you may just want to state that for all files having a particular extension, some specific attribute should default to some specific value. A typical example of this is to state that files having the .fr extension have been written in French. To handle such a case, we just follow the above procedure: we follow the Add extension link, and fill in the name of the extension (fr here). This time, however, we leave the class name empty. This means that by itself, the fr extension will not cause a file to be exported, but if a file having the fr extension is to be exported (because, for example, it has both the fr and html extensions), then we are willing to provide some default values for some of its generic attributes. We press the Create button to register the extension, which brings up a form containing the generic attributes (i.e. attributes that apply to all HTTPResource instances). We just fill in the content-language field, stating that its value should be fr, then we press the OK button, which brings us back to the list of registered extensions (in which fr has been inserted).
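The two kinds of extension records described above (with and without a class) can be modelled as follows. This is a sketch for illustration only, not the resource factory's code: the class ExtensionsDb, its method names, and the rule that a later extension overrides an earlier one are all assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only, not Jigsaw's resource factory. Names are hypothetical.
class ExtensionsDb {
    // One record per extension: an optional class plus default attributes.
    static final class Entry {
        final String className;              // null when no class was given
        final Map<String, String> defaults;

        Entry(String className, Map<String, String> defaults) {
            this.className = className;
            this.defaults = defaults;
        }
    }

    private final Map<String, Entry> entries = new LinkedHashMap<>();

    void register(String ext, String className, Map<String, String> defaults) {
        entries.put(ext, new Entry(className, defaults));
    }

    // Returns the merged default attributes, with the chosen class stored
    // under the key "class", or null when none of the file's extensions
    // names a class (the file is then not exported by extension).
    Map<String, String> describe(String fileName) {
        String[] parts = fileName.split("\\.");
        String className = null;
        Map<String, String> attrs = new LinkedHashMap<>();
        for (int i = 1; i < parts.length; i++) {   // parts[0] is the base name
            Entry e = entries.get(parts[i]);
            if (e == null) continue;
            if (e.className != null) className = e.className;
            attrs.putAll(e.defaults);              // assumption: later wins
        }
        if (className == null) return null;
        attrs.put("class", className);
        return attrs;
    }
}
```

With png registered with w3c.jigsaw.resources.FileResource and fr registered without a class, a hypothetical file picture.fr.png would get both the content-type and the content-language defaults, while a file named notes.fr alone would not be exported.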

You should now be able to register as many extensions as you want. You should also check that the default extensions meet your expectations, by carefully viewing them. You may also want to remove some of the default extensions, or the one we have created above. To delete an extension, go to the page displaying the list of registered extensions, mark the ones you want to unregister by clicking on the checkbox at their right, and press the Delete button.

At this point, you might be surprised that we haven't saved any of our changes. Jigsaw uses a complex caching scheme, and it decides by itself when it is best to save changes to disk. However, you should remember that because of this, the server should be killed only by getting the /Admin/Exit resource (which will shut down the server properly, saving to disk what needs to be saved at this point). If you really want your changes to be saved right now, you can still click on the /Admin/Properties checkpoint link, which will ensure all cached data are written back to the disk (it's usually a good idea to do that after editing the configuration).

Directory templates

The previous section has explained how to map files to resources based on their extensions. This section now explains how to map directories to resources.

As for files, mapping directories to resources is done by the resource factory. However, the mapping is done per directory name, rather than per extension. Directory templates are records that describe how directories of a given name should be mapped to resource instances. The first rule to be aware of is that if no directory template is available for a specific directory, the file-system directory will be exported through a simple DirectoryResource instance.

Back to directory templates. Directory templates associate a directory name with a resource class along with a set of default attribute values for the directory. Directory templates may be of two kinds: they can be generic or normal (depending on the value of their generic attribute). Generic directory templates apply to all directories having, as part of their path, the directory template name, while normal directory templates apply only to directories whose last name is the template name. Generic directory templates allow you to use some specific directory resource to export a whole hierarchy of information.
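The difference between generic and normal templates comes down to which part of the path is compared with the template name. Here is a minimal sketch of that rule, with the DirectoryResource fallback; the class DirectoryTemplates and its methods are hypothetical names, and the first-match lookup order is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only, not Jigsaw's resource factory. Names are hypothetical.
class DirectoryTemplates {
    static final class Template {
        final String name;
        final String className;
        final boolean generic;

        Template(String name, String className, boolean generic) {
            this.name = name;
            this.className = className;
            this.generic = generic;
        }

        boolean matches(String directoryPath) {
            List<String> parts = new ArrayList<>();
            for (String part : directoryPath.split("/")) {
                if (!part.isEmpty()) parts.add(part);
            }
            if (parts.isEmpty()) return false;
            // Generic: the name may appear anywhere in the path.
            if (generic) return parts.contains(name);
            // Normal: only the last component of the path is compared.
            return parts.get(parts.size() - 1).equals(name);
        }
    }

    private final List<Template> templates = new ArrayList<>();

    void add(String name, String className, boolean generic) {
        templates.add(new Template(name, className, generic));
    }

    // Returns the class used to export the directory; without a matching
    // template, a plain DirectoryResource is used.
    String classFor(String directoryPath) {
        for (Template t : templates) {
            if (t.matches(directoryPath)) return t.className;
        }
        return "DirectoryResource";
    }
}
```

A normal template named CVS would match /src/CVS but not /src/CVS/sub, while a generic template named Writable would match /pub/Writable and everything below it.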

To access the directory templates database, point your favorite browser to /Admin/DirectoryTemplates. This will list the currently defined directory templates. Let's add a new directory template, for all CVS directories, which we want to export through the CvsDirectoryResource (this resource provides a form-based interface to CVS). To do this, we follow the Add directory link, which brings up a form that prompts us for a directory name and a directory class. We want this directory template to apply to all CVS directories, so the name is just CVS, and we want these directories to be exported through a w3c.jigsaw.cvs.CvsDirectoryResource instance. We fill in the fields, and press the OK button. This brings up a new form, with the template's attributes themselves. We don't want this directory template to be generic, so we leave these fields alone. The template link brings up a form to edit the default attribute values for the resource to be created. The fields in this form depend on the class you have given when you created the directory template. In the case of the CvsDirectoryResource, none of these attributes need default values, so we just skip them. We are all set: now, when the resource factory is queried to wrap a directory whose name is CVS, instead of creating a directory resource, it will create a resource that will allow you to control the CVS status of the files in the directory. Neat, isn't it?

Another typical usage of directory templates is to use them to provide writable areas. The PutableDirectory class extends the basic DirectoryResource class with the ability to create new resources to handle the HTTP PUT method (a bunch of browsers now support this PUT method, you can try for example Amaya, or GNN). Let's define a directory template for some directory named Writable and all its children directories. As before, we click on the Add directory link; the name of our directory template is simply Writable and its associated class is w3c.jigsaw.resources.PutableDirectory. We press the OK button, and this time, we state that the template is generic, by setting the generic attribute to true. This time we want to edit the template, so we follow the link. The first thing we may want to do is to provide a fancy default icon, so that putable directories can be distinguished from the others; let's use burst.gif. Then, if using GNN, you might want to turn the browsable attribute to true. This will make the resource handle the GNN specific BROWSE method. We press the OK button, and we are done. Don't forget to read the authentication section, which will explain how to set up authentication for these kinds of directories.

At some point, you may want to delete directory templates. In this case, go to the /Admin/DirectoryTemplates location, and mark each of the templates you want to delete by clicking on the checkbox to the right of their names. Then press the Delete button, and you are done.

Editing

At this point, you should be able to run Jigsaw, to configure its properties and to edit the resource factory configuration databases. In this section, we will concentrate on Jigsaw's ability to edit existing resources.

In some circumstances, you may want to customize a single resource of your whole information space. This may be because you want it to have a specific icon, or because it's a special resource that needs some specific configuration, etc.

Just for fun, let's say that although the document /User/Overview.html has the html extension, we want its content type to be advertised as text. To do this, we need to edit the content-type attribute of the appropriate resource and change it to text/plain. Jigsaw comes with a generic resource editor (which is itself a resource) that allows you to edit any specific resource exported by the server. To edit a resource, we just append its path to the /Admin/Editor resource. So, to edit our /User/Overview.html resource, we just point our browser to /Admin/Editor/User/Overview.html. This brings up a form containing the specific attributes of the resource. The one we are interested in here is its content-type, which we change to text/plain. We press the OK button, et voilà! Try loading /User/Overview.html: instead of displaying the HTML document itself, your browser should now display the HTML text making up the document. Enough fun, let's turn the resource content-type back to text/html, and check that the resource is back to normal... good!
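The path rule used above (append the resource's path to /Admin/Editor) is simple enough to write down. The helper below is a hypothetical illustration, not something shipped with Jigsaw.

```java
// Sketch only: builds the editor location for a given resource path.
class EditorPath {
    static final String EDITOR = "/Admin/Editor";

    static String editorPathFor(String resourcePath) {
        // Tolerate a path given without its leading slash.
        String path = resourcePath.startsWith("/") ? resourcePath : "/" + resourcePath;
        return EDITOR + path;
    }

    public static void main(String[] args) {
        System.out.println(editorPathFor("/User/Overview.html"));
    }
}
```

The same rule applies to directories: the editor for /User is found at /Admin/Editor/User.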

All resources are editable in the same way. Again for fun only, let's say we want to add the ThreadStat resource (which displays the threads running in the server) to the /User directory resource. To do this, we launch the generic editor on the /User directory, by pointing our browser to the /Admin/Editor/User location. This brings up a form with one field per directory resource attribute. This time, what we want to do is to add a resource, so we go straight down the page, and follow the Add resource link. We are prompted for a resource name, and a resource class. The name of the resource is the name under which it will be retrieved from its directory; let's use threads. The full class name of the ThreadStat resource is w3c.jigsaw.status.ThreadStat. We fill in the two fields, and press the Create button. Then we follow the Existing resources link. This shows a sorted list of all the resources contained in the /User directory resource, among which is the threads resource. Follow the threads link: this pops up an editor to edit the ThreadStat resource attributes. You may want to change the refresh rate, by changing the value of the refresh attribute (which gives the refresh period in seconds). Press the OK button. Now point your browser to /User/threads, and you will see the threads that the server is running.
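To get an idea of the kind of information such a resource has to collect, here is a small sketch that lists the threads of the current Java virtual machine. It is not the ThreadStat implementation; the class ThreadList is a hypothetical name, and it relies only on the standard Thread.getAllStackTraces call.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch only: lists the names of the threads currently alive in this JVM.
class ThreadList {
    static List<String> liveThreadNames() {
        List<String> names = new ArrayList<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            names.add(t.getName());
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        for (String name : liveThreadNames()) {
            System.out.println(name);
        }
    }
}
```

Run standalone, it prints one thread name per line; the exact list depends on the Java runtime in use.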

Now that we have had all this fun, it's time to do some clean-up: let's remove the threads resource from the /User directory resource. Point your browser to the /Admin/Editor/User location. Mark the threads resource by clicking on the checkbox to the right of its name, select the Remove command and press the OK button. By the way, the Reindex button here will reindex the marked resources (this is useful if you change your resource factory configuration), and the Update button will update any of the resource attributes that are computed.

Authentication

At this point, you know nearly everything about Jigsaw. This section will provide you with a basic explanation of filters, and how to use them to set up authentication.

Resource filters are attached to specific resources in order to filter accesses to them. These filters are called once at lookup time, and once at reply time. On the way in (lookup time), they allow you to manipulate the request before the target resource handles it, and on the way out, they allow you to manipulate the target's reply before it is emitted back to the browser.
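The two calls made to each filter can be sketched as follows. The interfaces below are simplified, hypothetical stand-ins that use plain strings for requests and replies; they are not Jigsaw's filter API, and running the filters in reverse order on the way out is an assumption.

```java
import java.util.List;

// Sketch only: simplified stand-ins, not Jigsaw's filter interfaces.
class FilterChainSketch {
    interface Filter {
        String ingoing(String request);                 // on the way in
        String outgoing(String request, String reply);  // on the way out
    }

    interface Target {
        String handle(String request);
    }

    static String perform(List<Filter> filters, Target target, String request) {
        String req = request;
        // Way in: each filter may rewrite the request before the target sees it.
        for (Filter f : filters) {
            req = f.ingoing(req);
        }
        String reply = target.handle(req);
        // Way out: each filter may rewrite the reply (assumed reverse order).
        for (int i = filters.size() - 1; i >= 0; i--) {
            reply = filters.get(i).outgoing(req, reply);
        }
        return reply;
    }
}
```

An authentication filter fits this shape naturally: all its work happens on the way in, where it can refuse the request before the target resource is ever reached.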

Although Jigsaw provides a number of filters, we will focus here on the authentication filter, which authenticates requests before they are handled by their target resources. The GenericAuthFilter is currently the only available authentication filter.

To illustrate its usage, we will go through the steps required to protect the /Admin directory. The first thing you want to do is to define an authentication realm. An authentication realm is a database that will contain the description of a set of users, along with their passwords and/or IP addresses. Let's first define an admin authentication realm, describing all the users allowed to access the /Admin resources. For this we point our favorite browser to /Admin/Realms. This displays the set of authentication realms the server knows about (this will be empty at this time).

Let's create our realm: we click on the Add realm link, which brings up a form prompting us for the realm name (the server wide identifier for the realm), and the realm repository. Let's fill in the realm name with admin. The realm repository is the name of the file that the server will use to store the users database for the realm; it's good practice to put these files under the config/auth directory of Jigsaw. So we fill in the repository field with .../config/auth/admin.db (where ... is to be substituted by the absolute path of the root directory of the server). Once both fields are filled in, we press the Create button to create the realm. This brings up a form containing the list of users defined in the realm (which is, of course, initially empty).

We follow the AddUser link, which prompts us for a user name. Let's say that admin is the user we want to describe. We fill in the name, and press the Create button. This again brings up a new form, prompting us for a bunch of information about the admin user. The email address is currently unused (but it might be used in the future for email notification); you can either fill it in or leave it empty. You can type in any comments in the comments field, which is used only for informational purposes. The ipaddress field allows you to state from which machines the user is allowed to connect. This field is not mandatory: if left blank, only the password will be used for authentication (be warned that the password authentication scheme used by HTTP is very weak; for protecting such a critical space as the Admin space, you should always specify both a password and some IP addresses). If you decide to fill in the ipaddress field, you can enter multiple addresses for the same user (one per line). In my case, I fill it with my office IP address and my home IP address, so I enter the following two lines of text:

18.52.0.144
18.23.1.195

You can also use * in the ip address field, meaning that any user connecting from the given set of IP addresses is to be authenticated as the admin user. We then enter some password, and any optional groups we want the user to belong to (groups are a way of tagging a set of users belonging to the same realm; we will see some sample usage of groups below).

Once all the fields are filled in, we press the OK button: the admin user is now registered as part of the admin realm. You can define as many users as you want, by following these same instructions.
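The warning above about the weakness of HTTP password authentication is easy to verify. Assuming the scheme in question is HTTP Basic authentication, the user name and password travel in a request header that is merely base64 encoded, so anyone who can read the request can recover them. The sketch below shows the round trip; the class BasicCredentials and the sample password are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Sketch only: shows that a Basic Authorization header value is reversible.
class BasicCredentials {
    // Builds the value of the Authorization header for user and password.
    static String encode(String user, String password) {
        String pair = user + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(pair.getBytes(StandardCharsets.UTF_8));
    }

    // Recovers { user, password } from such a header value.
    static String[] decode(String headerValue) {
        String encoded = headerValue.substring("Basic ".length());
        String pair = new String(Base64.getDecoder().decode(encoded),
                StandardCharsets.UTF_8);
        int colon = pair.indexOf(':');
        return new String[] { pair.substring(0, colon), pair.substring(colon + 1) };
    }

    public static void main(String[] args) {
        String header = encode("admin", "secret");
        System.out.println(header);
        System.out.println(decode(header)[1]);
    }
}
```

No key is involved in either direction, which is why restricting the admin user to a few IP addresses is worth the extra typing.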

Once you are satisfied with your realm, it is time to actually protect the /Admin directory. For this purpose we launch the generic resource editor on the /Admin resource by pointing our browser to the /Admin/Editor/Admin location. What we want to do is add the authentication filter as one of the filters for the /Admin resource: we click on the Add filter link, and enter the following filter class: w3c.jigsaw.auth.GenericAuthFilter. We press the OK button to register the filter. This adds two new links to the editor: let's follow the filter-0 link. This pops up a form-based editor to edit the filter's attributes. The first field, identifier, need not be filled in (you can change it to auth in order to have more relevant link names); you can just skip it. The method field allows you to describe what methods are to be protected. As /Admin is such a critical resource, we want to protect it against any method, which is the default when the field is left blank (otherwise, we would specify one method per line). The realm name field must be filled in: it should give the identifier of the realm we want to use for authenticating the /Admin space. We use here the realm that we have created above: admin. The users field allows you to restrict access further to only a set of users in the realm, and the group field allows you to restrict access to a set of groups of the realm. In both cases you should enter one user name (or group name) per line. If left blank, all users within the realm will be allowed in (which is what we want since we have defined a specific realm to describe the users that are allowed to configure the server). We can now press the OK button: the /Admin directory is now protected. To check this, just point your browser to /Admin; this should now prompt you for a user name and a password.
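The way the filter's fields combine can be summarized by the sketch below. It is a simplified model, not the GenericAuthFilter code: all class and field names are hypothetical, and it assumes that a filled-in ipaddress list is checked in addition to the password, and that a user passes when listed by name or through one of the listed groups. The * form of the ipaddress field is not modelled.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch only: a simplified model of realm based authentication.
class AuthSketch {
    static final class User {
        final String password;
        final List<String> ipAddresses;   // empty: password only
        final List<String> groups;

        User(String password, List<String> ipAddresses, List<String> groups) {
            this.password = password;
            this.ipAddresses = ipAddresses;
            this.groups = groups;
        }
    }

    static final class AuthFilter {
        final Map<String, User> realm = new HashMap<>();  // the realm's users
        final List<String> methods;   // empty: every method is protected
        final List<String> users;     // empty: no restriction by user name
        final List<String> groups;    // empty: no restriction by group

        AuthFilter(List<String> methods, List<String> users, List<String> groups) {
            this.methods = methods;
            this.users = users;
            this.groups = groups;
        }

        boolean allows(String method, String name, String password, String clientIp) {
            // A method that is not listed is not protected by this filter.
            if (!methods.isEmpty() && !methods.contains(method)) return true;
            User u = realm.get(name);
            if (u == null || !u.password.equals(password)) return false;
            if (!u.ipAddresses.isEmpty() && !u.ipAddresses.contains(clientIp)) return false;
            // Both fields blank: every user of the realm is allowed in.
            if (users.isEmpty() && groups.isEmpty()) return true;
            if (users.contains(name)) return true;
            for (String g : u.groups) {
                if (groups.contains(g)) return true;
            }
            return false;
        }
    }
}
```

For the /Admin setup described above, all three lists of the filter are empty, so any request needs a user of the admin realm, with the right password, connecting from one of that user's addresses.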

You are now all set: Jigsaw is configured, enjoy!


Jigsaw Team
$Id: configuration.html,v 1.14 1997/07/31 08:26:25 ylafon Exp $