9448 – Some bugs with the FTScope in full text

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Summary: Some bugs with the FTScope in full text

Status:	CLOSED FIXED

Alias:	None

Product:	XPath / XQuery / XSLT
Classification:	Unclassified
Component:	Full Text 1.0
Version:	Candidate Recommendation
Hardware:	PC Windows XP

Importance:	P2 normal
Target Milestone:	---
Assignee:	Jim Melton
QA Contact:	Mailing list for public feedback on specs from XSL and XML Query WGs

Reported:	2010-04-08 13:33 UTC by xero_123
Modified:	2011-01-06 01:21 UTC
CC List:	1 user

Description xero_123 2010-04-08 13:33:14 UTC

I think there are some bugs in FTScope.
Let us look at the function fts:ApplyFTScopeDifferentSentence in the XQuery and XPath Full Text 1.0 W3C Candidate Recommendation of 28 January 2010.

I think that if a match contains a $stringInclude1 for which tokenInfo/@startSent != tokenInfo/@endSent, then that $stringInclude1 will always be right, no matter what $stringInclude2 is. Is that right or not?

On the other hand, if a match contains a $stringInclude1 for which tokenInfo/@startSent == tokenInfo/@endSent, then that $stringInclude1 will always be wrong, because $stringInclude2 may be the same as $stringInclude1.

So I think we should change the 'where' clause to 'where every $stringInclude in $match/fts:stringInclude satisfies ($stringInclude/fts:tokenInfo/@startSent != $stringInclude/fts:tokenInfo/@endSent)'.

I am a Chinese student, so there may be some grammatical errors, for which I apologize.

My email: guoxiaolei_neu@163.com

I look forward to your reply!
 

Function Content:

declare function fts:ApplyFTScopeDifferentSentence (
      $allMatches as element(fts:allMatches) ) 
   as element(fts:allMatches) 
{
   <fts:allMatches stokenNum="{$allMatches/@stokenNum}">
   {
      for $match in $allMatches/fts:match
      where every $stringInclude1 in $match/fts:stringInclude,
                  $stringInclude2 in $match/fts:stringInclude  
            satisfies ($stringInclude1/fts:tokenInfo/@startSent !=  
                       $stringInclude2/fts:tokenInfo/@startSent 
                    or $stringInclude1/fts:tokenInfo/@startSent !=  
                       $stringInclude1/fts:tokenInfo/@endSent 
                    or $stringInclude2/fts:tokenInfo/@startSent !=  
                       $stringInclude2/fts:tokenInfo/@endSent ) 
                   and $stringInclude1/fts:tokenInfo/@startSent > 0 
                   and $stringInclude2/fts:tokenInfo/@endSent > 0
      return 
         <fts:match>
         {
            $match/fts:stringInclude,
            for $stringExcl in $match/fts:stringExclude
            where every $stringIncl in $match/fts:stringInclude
                  satisfies ($stringIncl/fts:tokenInfo/@startSent !=  
                             $stringExcl/fts:tokenInfo/@startSent 
                          or $stringIncl/fts:tokenInfo/@startSent !=  
                             $stringIncl/fts:tokenInfo/@endSent 
                          or $stringExcl/fts:tokenInfo/@startSent !=  
                             $stringExcl/fts:tokenInfo/@endSent ) 
                         and $stringIncl/fts:tokenInfo/@startSent > 0 
                         and $stringExcl/fts:tokenInfo/@endSent > 0
            return $stringExcl
         }
         </fts:match>
   }
   </fts:allMatches>
};

Comment 1 Michael Dyck 2010-05-16 04:32:28 UTC

> I think that if a match contains a $stringInclude1 for which
> tokenInfo/@startSent != tokenInfo/@endSent, then that $stringInclude1 will
> always be right, no matter what $stringInclude2 is.

That depends on what you mean by "right". The condition in the QuantifiedExpr will be satisfied for that $stringInclude1 no matter what $stringInclude2 is. 

> On the other hand, if a match contains a $stringInclude1 for which
> tokenInfo/@startSent == tokenInfo/@endSent, then that $stringInclude1 will
> always be wrong, because $stringInclude2 may be the same as $stringInclude1.

Hmm. So when an "input" match contains a stringInclude that starts and ends within the same sentence, that match cannot succeed, because the QuantifiedExpr's condition will return false when $stringInclude1 and $stringInclude2 are both bound to that stringInclude. Yeah, that's a bug.
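The failure described above can be reproduced with a small model of the quantified expression. The following Python sketch is only an illustration, not the specification's code: the StringInclude class and the function names are hypothetical, and it assumes each fts:stringInclude carries exactly one fts:tokenInfo, so that the general comparisons reduce to plain integer comparisons.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class StringInclude:
    # Hypothetical stand-in for one fts:stringInclude with a single fts:tokenInfo.
    start_sent: int
    end_sent: int


def original_condition(s1, s2):
    # Mirrors the 'satisfies' condition of the function as reported in this bug.
    return ((s1.start_sent != s2.start_sent
             or s1.start_sent != s1.end_sent
             or s2.start_sent != s2.end_sent)
            and s1.start_sent > 0
            and s2.end_sent > 0)


def match_survives_original(includes):
    # 'every $s1 in X, $s2 in X' ranges over all ordered pairs,
    # including the pairs where both variables are bound to the same item.
    return all(original_condition(a, b) for a, b in product(includes, repeat=2))


# Two stringIncludes in different sentences, each contained in one sentence.
cat = StringInclude(start_sent=1, end_sent=1)
dog = StringInclude(start_sent=2, end_sent=2)
print(match_survives_original([cat, dog]))
```

Under these assumptions the match is rejected even though the two stringIncludes lie in different sentences, because the pair (cat, cat) makes all three inequalities false.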

(The condition used to begin with
    $stringInclude1 is $stringInclude2 or ...
to "skip over" the cases where the two variables were bound to the same stringInclude, but that had a bug too: a match containing a single stringInclude would always succeed.)
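The earlier bug can be sketched the same way. In this Python model (hypothetical names throughout), the elided remainder of the old condition is passed in as a parameter rather than guessed, since the point holds whatever it was: with a single stringInclude the only pair is the item with itself, so the identity test alone decides the outcome.

```python
from itertools import product


class StringInclude:
    """Hypothetical stand-in for one fts:stringInclude element."""

    def __init__(self, start_sent, end_sent):
        self.start_sent = start_sent
        self.end_sent = end_sent


def match_survives_earlier(includes, rest_of_condition):
    # Model of: every $s1 in X, $s2 in X satisfies ($s1 is $s2 or <rest>).
    # Python's 'is' plays the role of XQuery's node-identity operator.
    return all(a is b or rest_of_condition(a, b)
               for a, b in product(includes, repeat=2))


def never(a, b):
    # Worst case for the elided remainder: it rejects every pair.
    return False


lone = StringInclude(start_sent=1, end_sent=1)
single_include_survives = match_survives_earlier([lone], never)
print(single_include_survives)
```

Even with a remainder that rejects everything, the single-stringInclude match passes, which is the behaviour described above.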

We'll have to take another crack at writing that function.

Comment 2 Michael Dyck 2010-09-07 04:08:31 UTC

At meeting #437 on 2010-05-18, the Working Groups agreed to change the 'where' clause in fts:ApplyFTScopeDifferentSentence() as follows:

      where
            count($match/fts:stringInclude) > 1
            and
            every $stringInclude1 in $match/fts:stringInclude,
                  $stringInclude2 in $match/fts:stringInclude  
            satisfies
               $stringInclude1 is $stringInclude2
               or
               (
                  [pre-existing 'satisfies' condition]
               )

We believe this fixes the bug you discovered. Therefore, I am marking this bug resolved-FIXED. If you agree with this resolution, please mark it CLOSED.
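As a rough check of the agreed change, the revised 'where' clause can be modelled in Python. This is a sketch under assumptions, not the normative text: the StringInclude class and function names are hypothetical, each fts:stringInclude is assumed to carry one fts:tokenInfo, and the pre-existing 'satisfies' condition is taken to be the one in the function as reported in this bug.

```python
from itertools import product


class StringInclude:
    """Hypothetical stand-in for one fts:stringInclude with one fts:tokenInfo."""

    def __init__(self, start_sent, end_sent):
        self.start_sent = start_sent
        self.end_sent = end_sent


def pre_existing_condition(s1, s2):
    # The 'satisfies' condition from the function as originally reported.
    return ((s1.start_sent != s2.start_sent
             or s1.start_sent != s1.end_sent
             or s2.start_sent != s2.end_sent)
            and s1.start_sent > 0
            and s2.end_sent > 0)


def match_survives_fixed(includes):
    # count(...) > 1 rules out the single-stringInclude case;
    # 'a is b' skips pairs where both variables are bound to the same item.
    return len(includes) > 1 and all(
        a is b or pre_existing_condition(a, b)
        for a, b in product(includes, repeat=2))


different = [StringInclude(1, 1), StringInclude(2, 2)]  # sentences 1 and 2
same = [StringInclude(1, 1), StringInclude(1, 1)]       # both in sentence 1
single = [StringInclude(1, 1)]                          # only one stringInclude

for name, includes in (("different", different), ("same", same), ("single", single)):
    print(name, match_survives_fixed(includes))
```

In this model only the match whose stringIncludes lie in different sentences survives, which covers both the bug reported here and the earlier single-stringInclude bug.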

Comment 3 Michael Dyck 2010-12-20 20:28:11 UTC

(By the way, the function fts:ApplyFTScopeDifferentParagraph had the same bug as fts:ApplyFTScopeDifferentSentence, and was also fixed by the change in comment #2.)