Tips for the Log Validator
Scheduling the LogValidator
If you run the LogValidator on a Unix-like system, you can schedule its use with cron.
Add this line to your crontab config file to run it every Sunday (at 02:12), for example:
12 2 * * 7 root /path/to/logprocess.pl -q -f /path/to/config-file.cf
The output should be sent by mail to the user 'root'. Replace it with 'webadmin' or any valid username that you'd like. If you want to send results for several sites to their respective owners, you don't need to wait for the mail feature in the log validator, just use:
12 2 * * 7 root /path/to/logprocess.pl -q -f /path/to/config-sitefoo.cf | mail firstname.lastname@example.org
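With more than a handful of sites, one crontab line per site becomes tedious to maintain. The following is a minimal sketch of a wrapper that reads "config-file owner-address" pairs and mails each report separately. The function name validate_sites, the LOGPROCESS and MAILER variables, and the list format are assumptions made for this illustration; only the -q and -f options and the use of mail come from the lines above.

```shell
#!/bin/sh
# Hypothetical wrapper: run the Log Validator once per site and mail each report.
# LOGPROCESS and MAILER are assumptions; override them to match your installation.
LOGPROCESS="${LOGPROCESS:-/path/to/logprocess.pl}"
MAILER="${MAILER:-mail}"

# Reads "config-file owner-address" pairs, one per line, from standard input.
validate_sites() {
    while read -r config owner; do
        # Skip blank lines and comment lines.
        case "$config" in ''|'#'*) continue ;; esac
        # -q and -f are the options used in the crontab examples.
        # </dev/null keeps the validator from consuming the rest of the list.
        "$LOGPROCESS" -q -f "$config" </dev/null | "$MAILER" "$owner"
    done
}

# Example: validate_sites < /path/to/sites.list
```

A single crontab entry can then call a script built around this function, so adding a site only means adding a line to the list file.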
Hook the Log Validator to CVS or other versioning systems
If your web site runs under a versioning system such as CVS (or Subversion), it is generally possible to look at the most recently modified documents in CVS to build a list of potential candidates for checking. Below is the recipe used at W3C (www.w3.org uses CVS for versioning) for checking a specific user's latest commits of HTML documents.
- Get the recent-commits.pl Perl script and adapt it to your server settings. If you do not have Unix's textutils, replace the textutils command the script calls with tail -r; that should do the job.
- Run the script above (on the CVS server) to generate a list of recently modified documents.
- Then configure the logvalidator to use that list as a source, and batch-validate all the recent changes made by each user.
- Automate this process: run logprocess.pl --email -s email@example.com as a cron job and get mail sent to each user regularly. The QuietIfNoReport 1 logvalidator option can also be used to make sure the mail is only sent if documents need fixing.
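Between generating the list and feeding it to the logvalidator, it can help to tidy the list first. The sketch below keeps only HTML documents and drops repeated paths. It assumes the list has one document path per line, which may not match what your adapted script prints; html_candidates is a hypothetical helper name, not part of the Log Validator.

```shell
#!/bin/sh
# Assumption: the input has one document path per line; adapt if your list differs.
# Keep only HTML documents (.htm, .html, .xhtml, any case) and drop repeated
# paths, preserving the order in which they first appear.
html_candidates() {
    grep -E -i '\.(html?|xhtml)$' | awk '!seen[$0]++'
}

# Example: html_candidates < /path/to/raw-list.txt > /path/to/html-list.txt
```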
Using the HTMLValidator module
When you are using the LogValidator with the HTML validator modules,
you can be discouraged by the apparent amount of work needed.
Don't be afraid! The work seems huge at the beginning, but it's just a matter of strategy.
For an individual resource, a huge number of errors doesn't necessarily mean you have a huge number of mistakes in your document's markup. For example, if you have a wrong doctype, the validator may try to validate markup it doesn't recognize and will give you a long list of errors. Fix the doctype and the number of errors will decrease or even drop to zero.
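A quick local pre-check can flag one common doctype problem before you validate anything. The sketch below lists files that have no doctype declaration near the top. It does not detect a doctype that is present but wrong, and the helper name missing_doctype and the five-line window are choices made for this illustration.

```shell
#!/bin/sh
# Print each given file whose first lines contain no <!DOCTYPE declaration.
# Checking only the first 5 lines is an arbitrary choice for this sketch.
missing_doctype() {
    for file in "$@"; do
        if ! head -n 5 "$file" | grep -q -i '<!DOCTYPE'; then
            printf '%s\n' "$file"
        fi
    done
}

# Example: missing_doctype /path/to/site/*.html
```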
When you have the list of invalid pages, it may appear that a large number of similar pages have the same number of errors. There's a strong possibility that your template engine and/or the library which creates your HTML files is at fault. Fix your template first and see if the problem is solved.
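One way to spot such template problems is to count how many pages share each error count. The sketch below assumes you have reduced the report to lines of the form "ERROR_COUNT ADDRESS". That layout is invented for this example and is not the Log Validator's own report format, and group_by_error_count is a hypothetical helper name.

```shell
#!/bin/sh
# Assumption: input lines look like "ERROR_COUNT ADDRESS" (a made-up layout).
# Prints "PAGES ERROR_COUNT", with the error counts shared by the most pages first.
group_by_error_count() {
    awk '{ pages[$1]++ } END { for (count in pages) print pages[count], count }' |
        sort -k1,1nr -k2,2n
}

# Example: group_by_error_count < /path/to/error-counts.txt
```

An error count that many pages share is a good first candidate for a template fix.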
If you want to use the HTML validator module intensively because you often need to check a lot of pages (in case you have a very big site, for example), you may want to install the HTML validator locally on your network and change the logvalidator config to use this validation service instead of the one at W3C: it should dramatically improve the speed of the script.