Jigsaw

Indexers
How to create resources automatically




What is an Indexer?

Goal of an indexer

The main goal of an indexer is to create and set up resources automatically. Resources can be created according to their name or their extension. Once a resource has been created, the indexer is also in charge of attaching the right frames to it, such as the HTTP frame, filters and so on.

Each DirectoryResource (and each of its subclasses) is associated with an indexer. If no indexer is specified, the DirectoryResource is associated with the "default" indexer.

Description of an indexer

  1. Class and attributes of an indexer
    Class
    Usually, the indexer's class is org.w3c.tools.resources.indexer.SampleResourceIndexer
    Identifier
    The name of the indexer, e.g. "icons".
    Last Modified
    Unused, but present because, internally, an indexer is a resource.
    Super Indexer
    The name of the parent indexer used when the current indexer fails to index. By default, the super indexer is the "default" indexer.
  2. The children of an indexer
    directories
    Used to index files whose name matches exactly; mainly used to index directories. For example, you can specify that an "Icons" directory will always be negotiable. The default name (i.e. the one matching all directory names) is "*default*".
    extensions
    Used to index files with a specific extension. For example, "html" can be defined as a FileResource with an HTTPFrame that gives the file the "text/html" content type. All "foo.html" files will then be indexed as "text/html" objects when accessed by HTTP. The default extension (i.e. the one matching all extension names) is "*default*". To index files with no extension, you must use the name "*noextension*".
    content-types (only for the Content Type Indexer)
    In some cases the file extension is not the only criterion. For example, when a PUT request occurs, the indexer should use the Content-Type header sent with the request (if there is one). This is the job of the Content Type Indexer. The Content Type Indexer (org.w3c.jigsaw.indexer.ContentTypeIndexer) has one more child, the content-types node, which stores the associations between MIME types and resources.

    Since 2.0.2, the ContentTypeIndexer accepts generic MIME types like text:*, *:xml or even *:*. For example, if you define text:* as a FileResource using an HTTPFrame (with a content-type set to *none*), all content types like text/html, text/plain and text/xml will be accepted.

    Note: The MIME types stored in the indexer are not "real" MIME types: the '/' has been replaced by a ':'. We made this choice because the '/' can conflict with URLs in Jigsaw.
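The generic MIME type matching described above can be illustrated with a minimal Java sketch. This is not the actual ContentTypeIndexer code: the class MimePatternSketch and its methods toIndexerName and matches are hypothetical names introduced only for illustration, and stripping parameters such as "; charset=..." from the header value is an assumption.

```java
import java.util.Locale;

/** Hypothetical sketch; NOT the real org.w3c.jigsaw.indexer.ContentTypeIndexer. */
class MimePatternSketch {

    /**
     * Convert a Content-Type value such as "text/html; charset=utf-8"
     * to the indexer form "text:html" ('/' replaced by ':').
     * Dropping the parameters after ';' is an assumption of this sketch.
     */
    static String toIndexerName(String contentType) {
        int semi = contentType.indexOf(';');
        String bare = (semi >= 0) ? contentType.substring(0, semi) : contentType;
        return bare.trim().toLowerCase(Locale.ROOT).replace('/', ':');
    }

    /** True if a stored pattern such as "text:*" accepts the given Content-Type. */
    static boolean matches(String pattern, String contentType) {
        String[] p = pattern.split(":", 2);
        String[] m = toIndexerName(contentType).split(":", 2);
        if (p.length != 2 || m.length != 2) {
            return false; // malformed pattern or content type
        }
        boolean typeOk = p[0].equals("*") || p[0].equals(m[0]);
        boolean subtypeOk = p[1].equals("*") || p[1].equals(m[1]);
        return typeOk && subtypeOk;
    }

    public static void main(String[] args) {
        System.out.println(toIndexerName("text/html; charset=utf-8")); // text:html
        System.out.println(matches("text:*", "text/plain"));           // true
        System.out.println(matches("*:xml", "text/xml"));              // true
        System.out.println(matches("*:*", "image/png"));               // true
        System.out.println(matches("text:*", "image/png"));            // false
    }
}
```

The sketch only decides whether one pattern accepts one content type. Which entry wins when several patterns match (for example text:html and text:*) is not stated here, so no precedence rule is modelled.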

You can find a sample indexer configuration on this page.
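As a rough illustration of how the directories and extensions nodes, the "*default*" and "*noextension*" names and the super indexer fit together, here is a minimal Java sketch. It is not the SampleResourceIndexer implementation: the class IndexerLookupSketch, its methods and the template description strings are all hypothetical. It also assumes that an indexer tries its own "*default*" entry before asking its super indexer, and that only the last extension of a file name is considered.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; NOT the real SampleResourceIndexer. */
class IndexerLookupSketch {
    static final String DEFAULT = "*default*";
    static final String NO_EXTENSION = "*noextension*";

    final String identifier;                 // e.g. "icons"
    final IndexerLookupSketch superIndexer;  // null for the top-level indexer
    // Each node maps a name to a plain-text description of a resource template.
    final Map<String, String> directories = new HashMap<>();
    final Map<String, String> extensions = new HashMap<>();

    IndexerLookupSketch(String identifier, IndexerLookupSketch superIndexer) {
        this.identifier = identifier;
        this.superIndexer = superIndexer;
    }

    /** Exact name first, then "*default*" (the ordering is an assumption). */
    private static String lookup(Map<String, String> node, String key) {
        String found = node.get(key);
        return (found != null) ? found : node.get(DEFAULT);
    }

    /** Template for a file, falling back to the super indexer on failure. */
    String templateForFile(String fileName) {
        int dot = fileName.lastIndexOf('.');
        String key = (dot < 0) ? NO_EXTENSION : fileName.substring(dot + 1);
        String found = lookup(extensions, key);
        if (found == null && superIndexer != null) {
            return superIndexer.templateForFile(fileName);
        }
        return found;
    }

    /** Template for a directory, falling back to the super indexer on failure. */
    String templateForDirectory(String dirName) {
        String found = lookup(directories, dirName);
        if (found == null && superIndexer != null) {
            return superIndexer.templateForDirectory(dirName);
        }
        return found;
    }

    public static void main(String[] args) {
        IndexerLookupSketch def = new IndexerLookupSketch("default", null);
        def.extensions.put("html", "FileResource + HTTPFrame (text/html)");
        def.extensions.put(DEFAULT, "FileResource");
        def.directories.put("Icons", "negotiable DirectoryResource");
        def.directories.put(DEFAULT, "DirectoryResource");

        IndexerLookupSketch icons = new IndexerLookupSketch("icons", def);
        icons.extensions.put("gif", "FileResource + HTTPFrame (image/gif)");

        System.out.println(icons.templateForFile("logo.gif"));   // found in "icons"
        System.out.println(icons.templateForFile("foo.html"));   // from "default"
        System.out.println(icons.templateForFile("README"));     // "*default*" of "default"
        System.out.println(icons.templateForDirectory("Icons")); // from "default"
    }
}
```

In this sketch the "icons" indexer only knows about "gif" files; every other lookup fails locally and is delegated to the "default" indexer, which mirrors the role of the Super Indexer attribute.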

Indexers in JigAdmin


The Indexers Space is exactly the same as the Documents Space, except that indexer classes are available in the "Available Resources" window. You can still add, delete and configure resources and frames, but only in the indexer nodes (directories, extensions and sometimes content-types). Of course, you can also create new indexers (under the Indexers node).