This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 31 - checklink: Try GET if server responds 501 (or 405) to HEAD
Summary: checklink: Try GET if server responds 501 (or 405) to HEAD
Status: ASSIGNED
Alias: None
Product: LinkChecker
Classification: Unclassified
Component: checklink
Version: unspecified
Hardware: Other other
Importance: P2 normal
Target Milestone: ---
Assignee: This bug has no owner yet - up for the taking
QA Contact: qa-dev tracking
Reported: 2002-10-25 22:51 UTC by Ville Skyttä
Modified: 2013-11-03 07:35 UTC

Description Ville Skyttä 2002-10-25 22:51:57 UTC
The link checker should do a GET request when the server replies 501 to a HEAD
request.
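
The requested behaviour can be sketched roughly as follows. This is an illustrative Python sketch, not checklink's actual code: the names check_link, fetch_status and RETRY_WITH_GET are hypothetical, and treating exactly 501 and 405 as "retry with GET" is an assumption taken from the bug title.

```python
import urllib.error
import urllib.request

# Statuses treated as "HEAD not supported here". 501 and 405 come from the
# bug title; using exactly this set is an assumption of the sketch.
RETRY_WITH_GET = {405, 501}


def fetch_status(method, url, timeout=10):
    """Return the HTTP status code of one request (hypothetical helper)."""
    request = urllib.request.Request(url, method=method)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx responses; the code is still what we want.
        return error.code


def check_link(url, fetch=fetch_status):
    """Try HEAD first and fall back to GET if the server rejects HEAD."""
    status = fetch("HEAD", url)
    if status in RETRY_WITH_GET:
        status = fetch("GET", url)
    return status
```

The fetch parameter exists only so the fallback decision can be exercised without network access. Connection failures (urllib.error.URLError) are deliberately left unhandled here.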
Comment 1 Frank Ellermann 2003-06-23 06:01:37 UTC
JFTR, some servers don't send 501 if they don't like HEAD, e.g. 
http://www.rsasecurity.com/rsalabs/challenges/factoring/numbers.html

Trying to write my own checklink (REXX + rxsock.dll on OS/2)
I obviously caused some serious trouble for some servers. My
new strategy is to never test any given host again if it sent
400, 405, or 5??.  Until now I haven't seen any 501, and if I
understand RFC 2616 correctly 501 *_should not_* be used in
replies to HEAD requests.
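
The "never test that host again" strategy described in this comment might look something like the sketch below. It is written in Python rather than the REXX mentioned above, and HostSkipList and is_trouble_status are hypothetical names; only the status list (400, 405, any 5xx) comes from the comment.

```python
from urllib.parse import urlsplit


def is_trouble_status(status):
    """True for 400, 405, or any 5xx, the statuses listed in the comment."""
    return status in (400, 405) or 500 <= status <= 599


class HostSkipList:
    """Remembers hosts that answered with a trouble status (hypothetical)."""

    def __init__(self):
        self._skipped = set()

    def should_check(self, url):
        # A host is checked until it has produced one trouble status.
        return urlsplit(url).hostname not in self._skipped

    def record(self, url, status):
        if is_trouble_status(status):
            self._skipped.add(urlsplit(url).hostname)
```

A checker would call should_check() before each request and record() after it, so one bad response silences every later link to the same host.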
Comment 2 Ville Skyttä 2003-09-15 13:10:45 UTC
You're right, the results with servers that do not support HEAD vary a lot;
almost any status can be the result.  For example, see the report at
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=188123

But RFCs 1945, 2068 and 2616 (section 5.1.1) all mention 501 as a "SHOULD" for
unrecognized methods; 2068 and 2616 additionally define the possibility of a 405.

I have always thought that support for HEAD is a "MUST" in HTTP 1.1, but RFC 2616
(and 2068, FWIW) say:

   "The methods GET
   and HEAD MUST be supported by all general-purpose servers."

Note "general-purpose".  *sigh*
Comment 3 Frank Ellermann 2003-09-15 14:37:52 UTC
Oops... tnx for this "general purpose" hint. That's a bit
like the many lines explaining GMT in date-headers, and
ending with the conclusion that a server can omit this
header if it has difficulty determining Zulu time... ;-)

Actually this makes sense (routers without clock etc.),
but not supporting HEAD is hard. Better stay away from GET,
at least as long as you don't support robots.txt

BTW, do you have admin rights on this Bugzilla?  If yes,
please (re)enable the option to modify mail addresses.
It exists, I've seen it on the distributed bugzilla, where
I could change my address.
Comment 4 Ville Skyttä 2003-09-15 14:50:01 UTC
I haven't seen the option to change a mail address in any Bugzilla.  Not to say
such a thing doesn't exist, but I believe it's a recent addition, most likely
available in newer versions than this one.  And no, my rights aren't up to the
task anyway; I suggest contacting Terje Bless <link@pobox.com> and/or Olivier
Thereaux <ot@w3.org> on Bugzilla issues, as I believe they have the necessary admin
rights.  Hm, maybe a component named "Bugzilla" wouldn't be a bad idea here... :)