<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://www.w3.org/Bugs/Public/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4"
          urlbase="https://www.w3.org/Bugs/Public/"
          
          maintainer="sysbot+bugzilla@w3.org"
>

    <bug>
          <bug_id>6286</bug_id>
          
          <creation_ts>2008-12-06 22:32:08 +0000</creation_ts>
          <short_desc>XML comments are incorrectly counted in HTTPXHTMLResource.java</short_desc>
          <delta_ts>2008-12-06 22:34:13 +0000</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>mobileOK Basic checker</product>
          <component>Java Library</component>
          <version>unspecified</version>
          <rep_platform>PC</rep_platform>
          <op_sys>Linux</op_sys>
          <bug_status>RESOLVED</bug_status>
          <resolution>FIXED</resolution>
          
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="fd">fd</reporter>
          <assigned_to name="Abel Rionda">abel.rionda</assigned_to>
          
          
          

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>22701</commentid>
    <comment_count>0</comment_count>
    <who name="fd">fd</who>
    <bug_when>2008-12-06 22:32:08 +0000</bug_when>
    <thetext>The regular expression used to match comments in HTTPXHTMLResource.java is defined as:
 Pattern.compile(&quot;(&lt;!-- .* --&gt;)&quot;, Pattern.MULTILINE);

This is incorrect because:
 1. &quot;.&quot; does not match line terminators unless the Pattern.DOTALL flag is also set, so comments that span several lines are never matched.
 2. Quantifiers are greedy by default in Java, so if there is one comment at the beginning of the document and one at the end, the expression matches everything from the start of the first comment to the end of the second and counts it as a single comment.
 3. The expression requires a space after the opening delimiter and before the closing delimiter, but neither is mandatory, i.e. &quot;&lt;!--comment--&gt;&quot; is a valid comment.
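
<![CDATA[For illustration, the three problems above can be reproduced with a small, self-contained sketch. The class name CommentCountSketch, the count helper and the sample strings are assumptions made for this example and are not taken from HTTPXHTMLResource.java; the two patterns are the ones quoted in this comment.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the mobileOK checker sources.
class CommentCountSketch {
    // Pattern quoted in this report as the one currently in use.
    static final Pattern ORIGINAL =
        Pattern.compile("(<!-- .* -->)", Pattern.MULTILINE);

    // Pattern proposed in this report as the replacement.
    static final Pattern PROPOSED =
        Pattern.compile("(<!--.*?-->)", Pattern.MULTILINE | Pattern.DOTALL);

    // Hypothetical helper: counts non-overlapping matches of a pattern.
    static int count(Pattern pattern, String input) {
        Matcher matcher = pattern.matcher(input);
        int n = 0;
        while (matcher.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        // Problem 1: a comment spanning two lines.
        String multiLine = "<!-- first\nsecond -->";
        System.out.println(count(ORIGINAL, multiLine)); // prints 0
        System.out.println(count(PROPOSED, multiLine)); // prints 1

        // Problem 2: two comments on the same line.
        String twoComments = "<!-- a --><p>x</p><!-- b -->";
        System.out.println(count(ORIGINAL, twoComments)); // prints 1
        System.out.println(count(PROPOSED, twoComments)); // prints 2

        // Problem 3: a comment without surrounding spaces.
        String noSpaces = "<!--comment-->";
        System.out.println(count(ORIGINAL, noSpaces)); // prints 0
        System.out.println(count(PROPOSED, noSpaces)); // prints 1
    }
}
```
]]>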

The regular expression should instead be:
 Pattern.compile(&quot;(&lt;!--.*?--&gt;)&quot;, Pattern.MULTILINE | Pattern.DOTALL);</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>22702</commentid>
    <comment_count>1</comment_count>
    <who name="fd">fd</who>
    <bug_when>2008-12-06 22:34:13 +0000</bug_when>
    <thetext>Fixed regular expression in HTTPXHTMLResource.java</thetext>
  </long_desc>
      
      

    </bug>

</bugzilla>