This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 5620 - fts:FormCombinations is incorrect
Summary: fts:FormCombinations is incorrect
Status: CLOSED FIXED
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: Full Text 1.0 (show other bugs)
Version: Working drafts
Hardware: PC Windows XP
: P2 normal
Target Milestone: ---
Assignee: Jim Melton
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-04-03 18:01 UTC by Thomas Baby
Modified: 2008-04-15 02:51 UTC (History)
0 users

See Also:


Attachments

Description Thomas Baby 2008-04-03 18:01:59 UTC
The XQuery Full-Text specification uses the function fts:FormCombinations (shown below) to implement the FTTimes operator:

declare function fts:FormCombinations (
      $sms as element(fts:match)*,
      $times as xs:integer )
   as element(fts:match)*
{
   if ( $times eq 1 ) then $sms
   else if (fn:count($sms) lt $times) then ()
   else if (fn:count($sms) eq $times) then
      <fts:match>{$sms/*}</fts:match>
   else (
      fts:FormCombinations(fn:subsequence($sms, 2), $times),
      for $combination in
         fts:FormCombinations(fn:subsequence($sms, 2), $times - 1)
      return
      <fts:match>
      {
         $sms[1]/*,
         $combination/*
      }
      </fts:match>
   )
};

The implementation of fts:ApplyFTTimesAtLeast simply invokes fts:FTCombinations, which suggests that the above function implements the "at least $times" semantic.

declare function fts:ApplyFTTimesAtLeast (
      $allMatches as element(fts:allMatches),
      $n as xs:integer )
   as element(fts:allMatches)
{
   <fts:allMatches stokenNum="{$allMatches/@stokenNum}">
   {fts:FormCombinations($allMatches/fts:match, $n)}
   </fts:allMatches>
};


However, looking at the body of fts:FormCombinations, the assumption that it implements the "at least $times" semantic does not appear to be correct. For example, fts:FTCombinations invoked with a value of 2 for $times produces only 2-way combinations, and it does not produce 3-way combinations, 4-way combinations, etc. (Step 3 of the example in Section 4.3.3 should have produced 3-way combinations, but it does not.)
Comment 1 Thomas Baby 2008-04-10 22:17:21 UTC
Michael Dyck and I have fixed this issue and other issues in the FTTimes neighborhood. We have added a new function called fts:FormCombinationsAtLeast, which generates $k-or-more combinations of a sequence, by repeatedly invoking fts:FormCombinations. fts:FormCombinations is an existing function that generates $k combinations of a sequence. Some of the base case checks in fts:FormCombinations have also been fixed. The new functions are included below:

declare function fts:FormCombinations (
      $sms as element(fts:match)*, 
      $k as xs:integer ) 
   as element(fts:match)*
(:
   Find all combinations of exactly $k elements from $sms, and
   for each such combination, construct a match whose children are
   copies of all the children of all the elements in the combination.
   Return the sequence of all such matches.
:)
{
   if ($k eq 0) then <fts:match/>
   else if (fn:count($sms) lt $k) then ()
   else if (fn:count($sms) eq $k) then <fts:match>{$sms/*}</fts:match>
   else
      let $first := $sms[1],
          $rest  := fn:subsequence($sms, 2)
      return (
         (: all the combinations that don't involve $first :)
         fts:FormCombinations($rest, $k),

         (: and all the combinations that do involve $first :)
         for $combination in fts:FormCombinations($rest, $k - 1)
         return
            <fts:match>
            {
               $first/*,
               $combination/*
            }
            </fts:match>
      )
};

declare function fts:FormCombinationsAtLeast (
      $sms as element(fts:match)*,
      $times as xs:integer)
   as element(fts:match)*
(:
   Find all combinations of $times or more elements from $sms, and
   for each such combination, construct a match whose children are
   copies of all the children of all the elements in the combination.
   Return the sequence of all such matches.
:)
{
   for $k in $times to fn:count($sms)
   return fts:FormCombinations($sms, $k)
};
Comment 2 Thomas Baby 2008-04-10 22:19:17 UTC
Some of the existing functions have also been changed to now invoke the new function fts:FormCombinationsAtLeast, instead of fts:FormCombinations. These functions are listed below:

declare function fts:FormRange (
      $sms as element(fts:match)*, 
      $l as xs:integer, 
      $u as xs:integer, 
      $stokenNum as xs:integer ) 
   as element(fts:allMatches)
{
   if ($l > $u) then ()
   else 
      let $am1 := <fts:allMatches stokenNum="{$stokenNum}">
                     {fts:FormCombinationsAtLeast($sms, $l)}
                  </fts:allMatches>
      let $am2 := <fts:allMatches stokenNum="{$stokenNum}">
                     {fts:FormCombinationsAtLeast($sms, $u+1)}
                  </fts:allMatches>
      return fts:ApplyFTAnd($am1,
                            fts:ApplyFTUnaryNot($am2))
};
            

declare function fts:ApplyFTTimesAtLeast (
      $allMatches as element(fts:allMatches),
      $n as xs:integer ) 
   as element(fts:allMatches) 
{
   <fts:allMatches stokenNum="{$allMatches/@stokenNum}"> 
   {fts:FormCombinationsAtLeast($allMatches/fts:match, $n)} 
   </fts:allMatches>
};

Comment 3 Thomas Baby 2008-04-15 02:50:29 UTC
Okay with the resolution.