This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
Often I want to use the recursive link checker to check the links on my pages, but the link checker goes off checking other people's links. There should be an option to scan only the pages in a given domain. While doing this it would still check that the pages linked to outside the domain exist, but it would not scan the links on those pages.
Could you provide an example URI that reproduces this?
http://validator.w3.org/checklink?uri=tacvek.tripod.com&summary=on&hide_type=all&recursive=on&depth=3&check=Check — the above URI scans my web page. The third page scanned for bad links is one that is not on my site. This is not what I want; I want to recursively check the pages on my site only. The feature I am suggesting is to restrict the recursive scanning to the original domain/subdomain/folder within the domain.
Yes, I get your point. The way checklink should work at the moment is to restrict the recursion scope not only to the same domain or host but to the same *base URI*. Base URI means that if you're checking links for a resource at <http://foo.bar.com/quux/something.html>, checklink *should* check only resources whose URI starts with "http://foo.bar.com/quux/". This is probably what you mean by a given "folder". Obviously, your example URI reveals a reproducible bug in checklink; I have logged it as bug 115 and already found a workaround. Check it out, and thanks for the sample URI.
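The base-URI restriction described above can be sketched roughly as follows. This is a minimal illustration in Python, not checklink's actual implementation (which is in Perl); the function name `in_scope` is hypothetical:

```python
from urllib.parse import urlsplit

def in_scope(uri, base_uri):
    """Return True if `uri` falls under the directory of `base_uri`.

    A sketch of the base-URI recursion scope: only URIs that start
    with the base resource's scheme, host, and leading path (up to
    the last slash) are scanned recursively.
    """
    parts = urlsplit(base_uri)
    # Reduce the path to its "directory": everything up to and
    # including the last slash.
    directory = parts.path.rsplit("/", 1)[0] + "/"
    prefix = f"{parts.scheme}://{parts.netloc}{directory}"
    return uri.startswith(prefix)

print(in_scope("http://foo.bar.com/quux/other.html",
               "http://foo.bar.com/quux/something.html"))  # True
print(in_scope("http://elsewhere.example/page.html",
               "http://foo.bar.com/quux/something.html"))  # False
```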
The bug described in bug 115 appears to be the same bug as this one, at least if Comment 4 is correct. However, the link checker still does not appear to work properly.
Whoops... I meant comment 3
I think bug 115 is not a duplicate; it is about the current recursion restrictions being broken. If I understand correctly from your initial comment and the bug summary, this one is about a new feature, which would actually /broaden/ the scope of recursion rather than restrict it to the same base URI. I apologize if I misunderstood, but the domain/host "restriction" feature has been requested before, so I'm leaving this one open. OTOH, it might be that bug 115 has caused the feature requests... :) The version at validator.w3.org/checklink hasn't been updated yet; it's still the old one with the recursion bug. It'll be updated shortly. The fix is only in CVS for now; get version 3.6.2.2 if you want to try it locally.
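For contrast with the base-URI rule above, the broader domain/host scope requested in this bug could be sketched like this (again a hypothetical Python illustration, not checklink code; `same_host` is an invented name):

```python
from urllib.parse import urlsplit

def same_host(uri, base_uri):
    """Host-level recursion scope: scan any page on the same host,
    regardless of its path. This is the broader restriction the
    feature request asks for, compared with the base-URI rule."""
    return urlsplit(uri).netloc == urlsplit(base_uri).netloc

print(same_host("http://tacvek.tripod.com/other.html",
                "http://tacvek.tripod.com/index.html"))  # True
print(same_host("http://example.org/page.html",
                "http://tacvek.tripod.com/index.html"))  # False
```

Under this rule, a page anywhere on tacvek.tripod.com would be scanned recursively, while off-site links would only be checked for existence.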
Well, actually that would be even better, so leave this bug open. I think this bug should be left open as a possibility and perhaps put on a todo list in an 'optional' section. I have no real way to try a CVS version, as my site will probably not allow it to be run (Tripod has a very picky CGI policy) and I don't have a personal web server.
Yes, it's already kinda on the todo list; it's logged as an enhancement in Bugzilla :) Anyway, I think the public service will be updated soon (I have no direct control over it). And in case you didn't know, checklink can also be run on the command line...
Just a quick followup: the current version running at http://validator.w3.org/checklink should no longer have the recursion bug.