From sscprick@horus.sara.nl Fri Oct 6 15:34:52 1995
Article: 25400 of comp.infosystems.www.authoring.html
From: sscprick@horus.sara.nl (Rick Jansen)
Subject: Webxref new version
Date: Wed, 4 Oct 95 13:55:10 MET
Organization: sara
Webxref is a Perl program that builds cross references for an HTML document
and the HTML documents linked from it. The links found in the starting
document are checked for missing files or targets, then the links in each
linked document are checked in turn, and so on. Webxref is a quick and easy
way to check a local tree of HTML documents.
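
For readers who want to see the idea rather than the Perl, the walk described
above can be sketched in a few lines of Python. This is only an illustration
under assumptions, not webxref's actual code: the names LinkCollector and
check_tree are made up here, only href and src attributes are looked at, and
a link starting with "/" is treated as a filesystem path rather than being
mapped to a server root.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urldefrag, urlparse


class LinkCollector(HTMLParser):
    """Collect href/src attribute values from one HTML document (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def check_tree(start_file):
    """Follow local links from start_file; return (seen_files, missing_files)."""
    seen, missing = set(), set()
    todo = [os.path.abspath(start_file)]
    while todo:
        path = todo.pop()
        if path in seen:
            continue  # already inspected, avoids looping on circular links
        seen.add(path)
        collector = LinkCollector()
        with open(path, encoding="latin-1") as handle:
            collector.feed(handle.read())
        for link in collector.links:
            target, _fragment = urldefrag(link)
            if not target or urlparse(target).scheme:
                continue  # same-page anchor, or http:, mailto:, ftp: and so on
            local = os.path.normpath(os.path.join(os.path.dirname(path), target))
            if os.path.isdir(local):
                continue  # directories are not inspected in this sketch
            if not os.path.isfile(local):
                missing.add(local)
            elif local.endswith((".html", ".htm")):
                todo.append(local)  # inspect the linked document later
    return seen, missing
```

Calling check_tree("file.html") returns the set of documents inspected and the
set of local files that were referenced but not found. The "seen" set is what
keeps pages that link back to each other from being visited twice.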
As of version 0.1.1 a one-level check of external URLs is done, i.e. each
http:// link encountered is checked to see whether it exists and works. This
can be switched off with the -nohttp option. Other new options are -htmlonly
and -avoid; see below.
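
A rough Python sketch of such a one-level check follows. It reads "one-level"
as: fetch each external URL once and do not follow the links on the remote
page. How webxref itself does the request is not stated in this post; the
HEAD request, the function names check_url and check_external, and the
10-second timeout are all assumptions made for the example.

```python
import urllib.error
import urllib.request


def check_url(url, opener=urllib.request.urlopen, timeout=10):
    """Return True if url can be fetched, False on any HTTP or network error."""
    try:
        # HEAD is an assumption; some servers only answer GET properly.
        request = urllib.request.Request(url, method="HEAD")
        with opener(request, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # URLError covers HTTP errors too; ValueError covers malformed URLs.
        return False


def check_external(urls, opener=urllib.request.urlopen):
    """Split urls into (ok, failed) lists; remote pages are not parsed further."""
    ok, failed = [], []
    for url in sorted(set(urls)):  # check each distinct URL only once
        (ok if check_url(url, opener) else failed).append(url)
    return ok, failed
```

The opener parameter exists only so the sketch can be tried without network
access by passing in a stand-in function; by default it uses
urllib.request.urlopen.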
When finished, webxref prints a list (with direct and indirect references) of:
-all html files
-directories
-images
-mailto's
-news
-ftp
-telnet
-gopher
-external URLs
-cgi-bin forms/scripts
-named anchors
-files and images that could not be found
-files that are not world readable
-directories that could not be found
-named anchors that could not be found
-http:// ok references
-http:// failed references
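
To give an idea of how links might be sorted into categories like these, here
is a small Python sketch. The rules are guesses for illustration only (for
example which suffixes count as images, and treating any path with a cgi-bin
component as a script); the function name classify is hypothetical and the
rules webxref really applies may differ.

```python
from urllib.parse import urlparse


def classify(link):
    """Return a rough category name for one link (illustrative rules only)."""
    parts = urlparse(link)
    scheme = parts.scheme.lower()
    path = parts.path.lower()
    if scheme in ("mailto", "news", "ftp", "telnet", "gopher"):
        return scheme
    if scheme in ("http", "https"):
        return "external URL"
    if "cgi-bin" in path.split("/"):
        return "cgi-bin"
    if parts.fragment:
        return "named anchor"  # e.g. page.html#section
    if path.endswith("/"):
        return "directory"
    if path.endswith((".gif", ".jpg", ".jpeg", ".xbm")):  # assumed image suffixes
        return "image"
    if path.endswith((".html", ".htm")):
        return "html file"
    return "other file"
```

For instance, classify("mailto:rick@sara.nl") gives "mailto" and
classify("page.html#top") gives "named anchor".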
usage: webxref [-help -nohttp -htmlonly -avoid regexp] file.html
-nohttp: do not check external URLs
-htmlonly: only inspect files with the .html suffix
-avoid regexp: avoid files with names matching regexp for inspection
Examples

  webxref file.html
      checks file.html and the files/URLs referenced from file.html

  webxref -nohttp file.html
      checks file.html and its references, but not external URLs

  webxref -htmlonly file.html
      checks file.html, but only inspects files with the .html extension

  webxref -avoid '.*Archive.*' file.html
      checks file.html, but skips files with names containing 'Archive'

  webxref -avoid '.*Archive.*|.*Distribution.*' file.html
      same as above, but files with names containing 'Distribution'
      are skipped as well
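
The effect of -htmlonly and -avoid on a single file name can be pictured with
the following Python sketch. It assumes the regexp is applied as an unanchored
search on the file name; whether webxref anchors the pattern is not stated
here. The function should_inspect and its parameters are made up for the
example.

```python
import re


def should_inspect(filename, avoid=None, htmlonly=False):
    """Decide whether a file would be inspected under the given options."""
    if htmlonly and not filename.endswith(".html"):
        return False  # -htmlonly: skip anything without the .html suffix
    if avoid is not None and re.search(avoid, filename):
        return False  # -avoid: skip names matching the regexp
    return True


# The patterns from the examples above:
print(should_inspect("Archive/old.html", avoid=".*Archive.*"))
print(should_inspect("docs/new.html", avoid=".*Archive.*|.*Distribution.*"))
```

With an unanchored search as used in this sketch, the shorter pattern
'Archive|Distribution' would select the same names as
'.*Archive.*|.*Distribution.*'.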
Webxref was written as part of the SURFACE project (SURFnet Advanced
Communication Environment).
Webxref is available from:
http://www.sara.nl/cgi-bin/rick_acc_webxref
Rick Jansen
__
rick@sara.nl http://www.sara.nl/Rick.Jansen
_____________________________________________
S&H's a module and s&h's looking good