<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://www.w3.org/Bugs/Public/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4"
          urlbase="https://www.w3.org/Bugs/Public/"
          
          maintainer="sysbot+bugzilla@w3.org"
>

    <bug>
          <bug_id>1182</bug_id>
          
          <creation_ts>2005-03-28 10:00:23 +0000</creation_ts>
          <short_desc>Add an exclude-links option analogous to exclude-docs</short_desc>
          <delta_ts>2006-10-19 20:59:25 +0000</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>LinkChecker</product>
          <component>checklink</component>
          <version>4.0</version>
          <rep_platform>Other</rep_platform>
          <op_sys>Linux</op_sys>
          <bug_status>RESOLVED</bug_status>
          <resolution>DUPLICATE</resolution>
          <dup_id>689</dup_id>
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>enhancement</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Stefan Ruppert">stefan</reporter>
          <assigned_to name="Ville Skyttä">ville.skytta</assigned_to>
          <cc>bruce</cc>
          
          <qa_contact name="qa-dev tracking">www-validator-cvs</qa_contact>

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>3955</commentid>
    <comment_count>0</comment_count>
    <who name="Stefan Ruppert">stefan</who>
    <bug_when>2005-03-28 10:00:23 +0000</bug_when>
    <thetext>Add an option --exclude-links, analogous to --exclude-docs, that excludes a link
from checking instead of excluding a document from parsing. For example, when running
checklink locally, a regexp of &quot;^http:&quot; would exclude all remote links from checking.

Add the second line below immediately after the existing mailto check:

next if ($u =~ m/^mailto:/);

next if (defined $Opts{Exclude_Links} and $u =~ $Opts{Exclude_Links});</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>9982</commentid>
    <comment_count>1</comment_count>
    <who name="Bruce Altmann">bruce</who>
    <bug_when>2006-06-13 06:00:53 +0000</bug_when>
    <thetext>Hello Ville,

Did you have a chance to add this?
I was thinking of adding a --staywithin option.
(I see this as a way to say: stay on our web servers, so outbound links are checked but not followed during recursion.)

e.g. --staywithin *.amd.com

For the 4.2.1 code, I assume this goes in
sub in_recursion_scope().
Is that correct?

(As I see it, the mailto filter in 4.2.1 is applied when the list of broken links is built.)

-Bruce
</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>10027</commentid>
    <comment_count>2</comment_count>
    <who name="Ville Skyttä">ville.skytta</who>
    <bug_when>2006-06-14 17:48:28 +0000</bug_when>
    <thetext>No, this has not been implemented yet; I&apos;ll look into it.

Regarding --staywithin, the recursion scope is already limited by default to the base URI of the initial document and everything below it, and can be controlled using the --location option.  I&apos;m considering improving that by allowing --location to be given more than once, so that multiple recursion bases can be specified.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>10040</commentid>
    <comment_count>3</comment_count>
    <who name="Bruce Altmann">bruce</who>
    <bug_when>2006-06-15 10:00:25 +0000</bug_when>
    <thetext>Yes, I ran into this base and location issue while testing http://www.amd.com/us-en/

This main page points to other key AMD URIs:
enterprise.amd.com
amdlive.amd.com
search.amd.com

I would like the recursive check to follow any link found under amd.com/us-en/
but stay within *.amd.com

Is there a way with the current --location option (4.2.1) to say - 
check all links (as a page) that have *.amd.com in them?
(so it does not get stuck under http://www.amd.com/us-en/)

-Bruce
</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>10044</commentid>
    <comment_count>4</comment_count>
    <who name="Ville Skyttä">ville.skytta</who>
    <bug_when>2006-06-15 17:30:20 +0000</bug_when>
    <thetext>(In reply to comment #3)
&gt; Is there a way with the current --location option (4.2.1) to say - 
&gt; check all links (as a page) that have *.amd.com in them?

I&apos;m afraid there isn&apos;t.

This is getting off-topic for this particular bug/RFE, and Bugzilla is not a good tool for facilitating discussion in the first place.  Please use the www-validator mailing list for discussions, and open new bugs for new issues. Thanks in advance.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>12553</commentid>
    <comment_count>5</comment_count>
    <who name="Ville Skyttä">ville.skytta</who>
    <bug_when>2006-10-19 20:59:25 +0000</bug_when>
    <thetext>Bug 689 is actually the same as this one - marking this one as a duplicate because the other has some votes on it already.

*** This bug has been marked as a duplicate of bug 689 ***</thetext>
  </long_desc>
      
      

    </bug>

</bugzilla>