From sscprick@horus.sara.nl Fri Oct  6 15:34:52 1995
Article: 25400 of comp.infosystems.www.authoring.html
From: sscprick@horus.sara.nl (Rick Jansen)
Subject: Webxref new version
Date: Wed, 4 Oct 95 13:55:10 MET
Organization: sara

Webxref is a Perl program that builds cross references from an html document 
and the html documents linked from it. That is, the links found in the 
starting document are checked for missing targets (files, directories, named 
anchors), then the links in each linked document are checked, and so on. 
Webxref is a quick and easy tool for checking a local tree of html documents.
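
To picture the recursive check described above, here is a minimal Python 
sketch. It is not Webxref's code (Webxref is written in Perl); the names 
find_links and crawl are made up for illustration, and the regular expression 
only catches simple double-quoted HREF attributes.

```python
import os
import re

# Hypothetical helper pattern: matches href="..." in any letter case.
HREF_RE = re.compile(r'href\s*=\s*"([^"]*)"', re.IGNORECASE)
# Anything starting with "scheme:" (http:, mailto:, ...) is not a local file.
SCHEME_RE = re.compile(r'[a-zA-Z][a-zA-Z0-9+.-]*:')


def find_links(html_text):
    """Return the raw href values found in a piece of html text."""
    return HREF_RE.findall(html_text)


def crawl(start_file):
    """Follow local links from start_file; return (seen, missing) path sets."""
    seen, missing = set(), set()
    todo = [os.path.normpath(start_file)]
    while todo:
        path = todo.pop()
        if path in seen or path in missing:
            continue  # already handled, this also breaks link cycles
        if not os.path.isfile(path):
            # Simplification: anything that is not an existing regular
            # file is recorded as missing.
            missing.add(path)
            continue
        seen.add(path)
        with open(path, encoding="latin-1") as fh:
            text = fh.read()
        for link in find_links(text):
            if SCHEME_RE.match(link) or link.startswith("#"):
                continue  # external URL or anchor within the same file
            target = link.split("#", 1)[0]  # drop any "#anchor" part
            if target:
                base = os.path.dirname(path)
                todo.append(os.path.normpath(os.path.join(base, target)))
    return seen, missing
```

Calling crawl("file.html") returns the set of files that were visited and the 
set of link targets that could not be found. The sketch makes no attempt at 
the direct/indirect reference bookkeeping that Webxref reports.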

As of version 0.1.1 a 1-level check of external URLs is done, i.e. the 
http:// links encountered are checked to see whether they exist and work. This 
can be switched off with the -nohttp option. Other new options are -htmlonly 
and -avoid; see below. 
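
How Webxref performs the external check is not described in this post. As a 
rough illustration of a 1-level check, the Python sketch below requests only 
the URL itself and follows nothing it links to. The function name check_url, 
the use of a HEAD request, and the "status below 400 means ok" rule are all 
assumptions made for this example.

```python
import urllib.error
import urllib.request


def check_url(url, opener=urllib.request.urlopen, timeout=10):
    """Return True if the URL answers without an error, False otherwise.

    Only the URL itself is requested (1 level); linked pages are ignored.
    The opener parameter exists so the network call can be swapped out.
    """
    try:
        # Assumption: a HEAD request is enough to tell whether the link works.
        request = urllib.request.Request(url, method="HEAD")
        with opener(request, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # URLError/OSError: host unreachable, HTTP error status, timeout.
        # ValueError: the string is not a usable URL at all.
        return False
```

A caller would sort each http:// link into the "ok" or "failed" list 
depending on the return value.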

When finished, Webxref prints a list (with direct and indirect references) of: 

    - all html files 
    - directories 
    - images 
    - mailto's 
    - news 
    - ftp 
    - telnet 
    - gopher 
    - external URLs 
    - cgi-bin forms/scripts 
    - named anchors 
    - files and images that could not be found 
    - files that are not world readable 
    - directories that could not be found 
    - named anchors that could not be found 
    - http:// ok references
    - http:// failed references
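
The categories above suggest that each link is sorted by its kind. The Python 
sketch below shows one way such sorting could work; it is not taken from 
Webxref. The function name classify, the list of image suffixes, and the rule 
that a trailing slash means a directory are assumptions for illustration.

```python
import os
from urllib.parse import urlsplit

# Assumed set of image suffixes; the post does not say which ones count.
IMAGE_SUFFIXES = {".gif", ".jpg", ".jpeg", ".xbm"}


def classify(link):
    """Return a category name for one link, based on its text only."""
    scheme = urlsplit(link).scheme.lower()
    if scheme in ("mailto", "news", "ftp", "telnet", "gopher"):
        return scheme
    if scheme == "http":
        return "external URL"
    if link.startswith("#"):
        return "named anchor"
    path = link.split("#", 1)[0]  # local reference; drop any "#anchor" part
    if "cgi-bin" in path:
        return "cgi-bin"
    if path.endswith("/"):
        return "directory"
    if os.path.splitext(path)[1].lower() in IMAGE_SUFFIXES:
        return "image"
    return "file"
```

For example, classify("mailto:rick@sara.nl") gives "mailto" and 
classify("pics/logo.gif") gives "image". Whether a file or directory actually 
exists, or is world readable, has to be checked against the file system 
separately.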

 usage: webxref [-help -nohttp -htmlonly -avoid regexp] file.html 

 -nohttp: do not check external URLs
 -htmlonly: only inspect files with the .html suffix
 -avoid regexp: avoid files with names matching regexp for inspection

 Examples
   webxref file.html
             checks file.html and files/URLs referenced from file.html
   webxref -nohttp file.html
             checks file.html and references, but not external URLs
   webxref -htmlonly file.html
             checks file.html, but inspects only referenced files with the
             .html extension
   webxref -avoid '.*Archive.*' file.html
             checks file.html but avoids files with names containing
             'Archive'
   webxref -avoid '.*Archive.*|.*Distribution.*' file.html
             same as above, but files with names containing
             'Distribution' are also skipped
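
The effect of -htmlonly and -avoid on which files get inspected can be 
sketched in Python as follows. This is an illustration, not Webxref's code: 
the function name should_inspect is made up, and the sketch assumes the 
regexp may match anywhere in the file name. For the simple patterns used in 
the examples above, Perl and Python regular expressions behave the same.

```python
import re


def should_inspect(filename, avoid=None, htmlonly=False):
    """Decide whether a referenced file would be opened and inspected."""
    if htmlonly and not filename.endswith(".html"):
        return False  # -htmlonly: skip everything without the .html suffix
    if avoid is not None and re.search(avoid, filename):
        return False  # -avoid: skip names matching the regexp
    return True


names = ["index.html", "Archive/1994.html", "Distribution/readme.html",
         "notes.txt"]
pattern = ".*Archive.*|.*Distribution.*"
kept = [n for n in names if should_inspect(n, avoid=pattern, htmlonly=True)]
print(kept)
```

Under the match-anywhere assumption, the shorter pattern 
'Archive|Distribution' would select the same files.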

Webxref was written as part of the SURFACE project (SURFnet Advanced 
Communication Environment). 

Webxref is available from: 

http://www.sara.nl/cgi-bin/rick_acc_webxref

Rick Jansen
__
rick@sara.nl   http://www.sara.nl/Rick.Jansen
_____________________________________________
S&H's a module and s&h's looking good