W3C PICS

Push-based Web Filtering Using PICS Profiles



Download source code .zip file / .tar.gz file

Download API .zip file / .tar.gz file

Browse the API


This page is the documentation for the source code that was developed as a part of this thesis. The Java source code available from this page is a complete working prototype of a push-based PICS filtering system using the RSACi rating service. Only information specific to the source code is found on this page. For more general information about this project, see the actual thesis document.


Installation

NOTE: This software requires a Java 1.1 Virtual Machine, such as Sun's JDK 1.1.x.

To install this system, first download the source code (.zip file / .tar.gz file).

After uncompressing the file, you should have a directory named w3c, and a file named rsaci.rat. From the installation directory, you must create four new directories, and name them: profiles, labels, csrc, and output. After creating the four required directories, you should have the following directory structure:

INSTALLDIR--w3c--pics--db--parser
     |
     |
     -------profiles
     |
     |
     -------labels
     |
     |
     -------csrc
     |
     | 
     -------output
     |
     |
     rsaci.rat

Add the INSTALLDIR to your CLASSPATH. You should now be ready to go!
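The four directories described above can also be created programmatically. The following is only a minimal sketch and is not part of the distributed source: the class name SetupDirs and the method createRequiredDirs are hypothetical, and the sketch assumes the installation directory is passed as the first argument (defaulting to the current directory).

```java
import java.io.File;

// Hypothetical helper for illustration; not part of the w3c.pics source.
class SetupDirs {
    // Directory names taken from the installation instructions above.
    static final String[] REQUIRED = { "profiles", "labels", "csrc", "output" };

    // Creates any missing required directory under installDir.
    // Returns true if all four exist as directories afterwards.
    static boolean createRequiredDirs(File installDir) {
        boolean ok = true;
        for (int i = 0; i < REQUIRED.length; i++) {
            File dir = new File(installDir, REQUIRED[i]);
            if (!dir.isDirectory() && !dir.mkdirs()) {
                ok = false;
            }
        }
        return ok;
    }

    public static void main(String[] args) {
        // Assumption: INSTALLDIR is given as the first argument.
        File installDir = new File(args.length > 0 ? args[0] : ".");
        System.out.println(createRequiredDirs(installDir)
                ? "All required directories are present."
                : "Could not create one or more directories.");
    }
}
```

Running the sketch a second time is harmless, because directories that already exist are left alone.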

Starting the system

The system is run through a command-line interface provided by the CDatabase class.
To start the system, type:

java w3c.pics.db.CDatabase -service rsaci.rat

The exact command will differ slightly if you are using a Java VM other than the JDK.

The following command line arguments are available:

After invoking the CDatabase, the following commands are available:

Unless otherwise noted, all labels that are read or created by this program must reside in the labels subdirectory, and all profiles that are read or created by this program must reside in the profiles subdirectory. All converted profiles and labels will be placed in the csrc subdirectory.

Convert labels

This command will prompt for two inputs: a labelcount and a basename. The labelcount is the number of labels to be converted. The basename is the base file name that the labels are stored in. For example, if the labels are stored in the files mylabel1.txt, mylabel2.txt, mylabel3.txt, and so on, the basename is mylabel. Note that this means that the label files must end in the extension .txt.

All converted labels will be saved as individual files in the csrc subdirectory. The names of these files will be: labeldata0, labeldata1, labeldata2, ... Note that these files should NEVER be edited or renamed. They should only be read or created by other commands within this system.
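To make the input naming convention concrete, the following sketch lists the label file names implied by a basename and a labelcount. It is an illustration only: the class LabelFileNames and the method expectedNames are hypothetical, and the sketch assumes numbering starts at 1, as in the mylabel example above.

```java
// Hypothetical helper for illustration; not part of the w3c.pics source.
class LabelFileNames {
    // Builds basename1.txt ... basenameN.txt.
    // Assumption: numbering starts at 1, following the mylabel example.
    static String[] expectedNames(String basename, int labelcount) {
        String[] names = new String[labelcount];
        for (int i = 0; i < labelcount; i++) {
            names[i] = basename + (i + 1) + ".txt";
        }
        return names;
    }

    public static void main(String[] args) {
        // Lists the files expected in the labels subdirectory.
        String[] names = expectedNames("mylabel", 3);
        for (int i = 0; i < names.length; i++) {
            System.out.println("labels/" + names[i]);
        }
    }
}
```

A list like this can be compared against the contents of the labels subdirectory before running the Convert labels command.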

Convert profiles

This command will prompt for three inputs: a usercount, a basename, and a labelcount. The usercount is the number of profiles to be converted. The basename is the base file name that the profiles are stored in. For example, if the profiles are stored in the files myprofile1.rlz, myprofile2.rlz, myprofile3.rlz, and so on, the basename is myprofile. Note that this means that the profile files must end in the extension .rlz. The labelcount is the number of labels that are going to be processed against these profiles. This number can be changed later, because it is placed into the generated C code as a constant.

All converted profiles will be saved in C source files in the csrc subdirectory. The names of these files should NEVER be edited or renamed.

Along with the C source file, a Makefile will be created. Compiling with this Makefile will build the entire Profile Store, ready to be run against converted labels. The generated Makefile targets Windows machines and is compatible with the nmake utility provided with MSVC. Users of Unix machines will have to modify this Makefile to make it work with their compilers.

Create labels

This command will prompt for two inputs: a labelcount and a basename. The labelcount is simply the number of labels to be created. The basename is the base file name that the labels will be stored in. For example, if three labels are constructed, with the basename of testlabel, the files created will be: testlabel1.txt, testlabel2.txt, testlabel3.txt.

Create profiles

This command will prompt for two inputs: a usercount and a basename. The usercount is simply the number of profiles to be created. The basename is the base file name that the profiles will be stored in. For example, if three profiles are constructed, with the basename of testprof, the files created will be: testprof1.rlz, testprof2.rlz, testprof3.rlz.

Using Labels and Profiles from the Web

To use Labels and Profiles that were either created by hand or acquired from third parties, you have to do the following:
  1. Place the labels/profiles in the labels/profiles directory.
  2. Make sure that the names of the files follow the naming standard given above. Labels should be of the form: [name][number].txt. Profiles should be of the form: [name][number].rlz. To prevent overwriting existing profiles, profiles should be numbered consecutively without reusing numbers. Label numbers can be reused, unless you want to re-evaluate profiles against those labels again at some later time.
  3. Use the Convert labels/profiles command.
  4. If profiles were added, they must be compiled using the Makefile.
  5. Run main.exe.
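
The naming rule in step 2 can be checked before conversion. The sketch below is one possible check, not the system's own validation: the class FileNameCheck and its methods are hypothetical, and the regular expression is an assumed reading of the [name][number].txt and [name][number].rlz forms.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the w3c.pics source.
class FileNameCheck {
    // Assumed reading of "[name][number].txt" / "[name][number].rlz":
    // a non-empty name, then one or more digits, then the extension.
    private static final Pattern NAME =
            Pattern.compile("^(.+?)(\\d+)\\.(txt|rlz)$");

    // Returns {basename, number, extension}, or null if the name does not fit.
    static String[] parse(String fileName) {
        Matcher m = NAME.matcher(fileName);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    static boolean isLabelName(String fileName) {
        String[] parts = parse(fileName);
        return parts != null && parts[2].equals("txt");
    }

    static boolean isProfileName(String fileName) {
        String[] parts = parse(fileName);
        return parts != null && parts[2].equals("rlz");
    }

    public static void main(String[] args) {
        // Reports, for each file name given on the command line,
        // whether it fits the label or profile naming form.
        for (int i = 0; i < args.length; i++) {
            String kind = isLabelName(args[i]) ? "label"
                    : isProfileName(args[i]) ? "profile" : "not recognized";
            System.out.println(args[i] + ": " + kind);
        }
    }
}
```

The check only covers the shape of the name. It does not verify that profile numbers are consecutive or unused.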

Example Session

The following is an example of the steps needed to run this system for the first time, using labels and profiles gathered from outside sources.
  1. Rename all the profiles and labels. The labels are given the names label1.txt, label2.txt, and so on. The profiles are given the names user1.rlz, user2.rlz, and so on. All profiles are placed in the profiles subdirectory. All labels are placed in the labels subdirectory.
  2. The system is started with the command:
    java w3c.pics.db.CDatabase -service rsaci.rat -without
    This command starts the converter in the mode which will enable HTML output for the results of processing.
  3. Use the convert labels command. This command will convert all of the labels into machine readable format. When prompted, enter the total number of labels and the base file name used for the labels (in this example, "label").
  4. Use the convert profiles command. This command will convert all of the profiles into C source code. When prompted, enter the total number of labels and profiles, and the base file name used for the profiles (in the example, "user").
  5. Compile the C code using the Makefile that was generated in the csrc subdirectory in the previous step.
  6. Run main.exe.
  7. The results of running the system will appear as HTML files in the output subdirectory. Each profile will have its own HTML file. Each file will be a list of links to the pages whose labels matched that profile's settings.

Class notes

Below is a brief description of the various Java classes included in this software. For more details, consult the API.


dshapiro@w3.org
16 April 98