This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 6105 - [FT] Test Suite cont.
Summary: [FT] Test Suite cont.
Status: CLOSED FIXED
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: Full Text 1.0
Version: Candidate Recommendation
Hardware: All All
Importance: P2 normal
Target Milestone: ---
Assignee: Jim Melton
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-09-22 02:24 UTC by Christian Gruen
Modified: 2008-12-23 01:56 UTC

Description Christian Gruen 2008-09-22 02:24:14 UTC
Hello,

I have encountered some new inconsistencies in the current test suite...


[1] FTPrimary-FTWords-q1.xq
    FTPrimary-FTWords-FTTimes-q1.xq,
    FTPrimary-FTExtensionSelection-q1.xq
    FTPrimary-FTSelection-q1.xq

The child step "/div2" could be rewritten to "//div2" to match the result files.
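
The difference between the two steps can be illustrated outside XQuery as well. The following is only a minimal sketch in Python, assuming a hypothetical miniature document in which the div2 elements are nested below the context node rather than being its direct children; the element names other than div2 and the id values are made up for illustration and are not taken from the test sources.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature of a test source; the real document is larger
# and its exact nesting is not shown in this report.
doc = ET.fromstring(
    "<spec><body><div1><div2 id='a'/><div2 id='b'/></div1></body></spec>"
)

# Child step: only direct children of the context node are considered.
child_hits = doc.findall("./div2")

# Descendant step: the whole subtree below the context node is searched.
descendant_hits = doc.findall(".//div2")

print(len(child_hits), [e.get("id") for e in descendant_hits])
```

With this assumed structure the child step finds nothing, while the descendant step finds both div2 elements, which is the kind of mismatch against the result files described above.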


[2] FTPrimary-FTExtensionSelection-q1.xq

As far as I know, the pragma (# exq:classifier with class 'Antonyms' #) cannot be specified at the current location. Moreover, by using pragmas, it might be that not all implementations will yield correct results.


[3] FTSelection-Weight-q1a.xq
    FTSelection-Weight-q1b.xq
    ...

Again, the location steps "/div[//a/@id" won't yield any results. The alternative "//div2[@id=" should suffice.


[5] Examples/2.2.2/ft-222-examples-results-q1.txt

The correct result for "examples-222-q1.xq" should be:

<author>Melton Mowbray</author>


[6] FTPrimary-FTWords-q1_result.xml

Here I get another result as well (it's based on the query rewriting of [1])..

<paragraphs>
  <p>
Note that, along with the syntax rules above, there is an extra-grammatical 
constraint,
<loc xmlns:xlink="http://www.w3.org/1999/xlink" href="#parse-note-multiple-match-options" xlink:type="simple" xlink:show="replace" xlink:actuate="onRequest">multiple-match-options
      </loc>,
which needs to be considered, if multiple match options are specified.
It states that within a single <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTMatchOptions" xlink:type="simple">FTMatchOptions</nt>
at most one match option of any given 
<termref def="dt-match-option-group">match option group</termref> may
be specified. 
For example, if the <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTCaseOption" xlink:type="simple">FTCaseOption</nt> "lowercase" 
is specified, then "uppercase" cannot also be specified as part of the same 
<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTMatchOptions" xlink:type="simple">FTMatchOptions</nt>.
</p>
  <p>
<termdef id="dt-match-option-order" term="match option application order">
The order in which effective match options for an 
<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWords" xlink:type="simple">FTWords</nt> are applied 
 is called the <term>match option application order</term>.</termdef>
This order is significant
because match options are not always commutative.
For example,
    synonym(stem(word))
is not always the same as
    stem(synonym(word)).
</p>
</paragraphs>


[7] XQFTTSCatalog.xml

=> "Testsources" -> "TestSources"

=> "FTSelection-Weight-1a.xq" -> "FTSelection-Weight-q1a.xq"
...same operation for all files in this directory

=> Query FTSelection-Weight-q1d
 an <expected-error> element should be added here as implementations can raise errors for negative values

=> Query ft-34-examples-q2 - ft-342-examples-q4 -> don't exist yet


  
Thanks,

Christian, BaseX Team 
http://www.basex.org
Comment 1 Jim Melton 2008-10-30 23:08:45 UTC
Christian,

Many thanks for your comments!  Please keep them coming!  (For tracking purposes, it would be slightly better for you to put each separate comment into a separate Bugzilla bug, if that's convenient.)

I have responded to each of your comments below:

[1] FTPrimary-FTWords-q1.xq
    FTPrimary-FTWords-FTTimes-q1.xq,
    FTPrimary-FTExtensionSelection-q1.xq
    FTPrimary-FTSelection-q1.xq

The child step "/div2" could be rewritten to "//div2" to match the result
files.

<reply>
You are correct and we have updated the tests to correct the error. 
</reply>


[2] FTPrimary-FTExtensionSelection-q1.xq

As far as I know, the pragma (# exq:classifier with class 'Antonyms' #) cannot
be specified at the current location. Moreover, by using pragmas, it might be
that not all implementations will yield correct results.

<reply>
I have run the query that you identified above through the Full Text parser applet (see http://www.w3.org/2007/01/applets/xquery-fulltextApplet.html) and it appears to parse without error.  Therefore, I conclude that it is valid to use the pragma where it appears in the example.  If you still believe that it should not parse correctly, please help me understand why. 

It is, of course, true that "not all implementations" will return any given result when pragmas are used.  However, there is no way to test whether implementations are tolerant of pragmas without writing examples that contain pragmas.  And that is the purpose of this test -- to ensure that implementations will tolerate the presence of a pragma that they do not understand.  (I do not expect that any implementation will recognize, or implement, the "exq:classifier" pragma.)
</reply>


[3] FTSelection-Weight-q1a.xq
    FTSelection-Weight-q1b.xq
    ...

Again, the location steps "/div[//a/@id" won't yield any results. The
alternative "//div2[@id=" should suffice.

<reply>
You are, of course, correct.  I am mystified why I wrote the predicate the way that I did, even though I recall doing it deliberately.  I have corrected these errors. 
</reply>


[5] Examples/2.2.2/ft-222-examples-results-q1.txt

The correct result for "examples-222-q1.xq" should be:

<author>Melton Mowbray</author>

<reply>
Actually, I think the result should be:

<author>Heavy Rain</author>
<author>Melton Mowbray</author>

That is, I believe that both books satisfy the ftcontains conditions (particularly  because of "with stemming" applied to "dog").  I have changed the result to reflect that belief.  If you still believe that only one author should be returned, please help me understand why. 
</reply>


[6] FTPrimary-FTWords-q1_result.xml

Here I get another result as well (it's based on the query rewriting of [1])..

<paragraphs>
  <p>
Note that, along with the syntax rules above, there is an extra-grammatical 
constraint,
<loc xmlns:xlink="http://www.w3.org/1999/xlink"
href="#parse-note-multiple-match-options" xlink:type="simple"
xlink:show="replace" xlink:actuate="onRequest">multiple-match-options
      </loc>,
which needs to be considered, if multiple match options are specified.
It states that within a single <nt xmlns:xlink="http://www.w3.org/1999/xlink"
def="doc-xquery-FTMatchOptions" xlink:type="simple">FTMatchOptions</nt>
at most one match option of any given 
<termref def="dt-match-option-group">match option group</termref> may
be specified. 
For example, if the <nt xmlns:xlink="http://www.w3.org/1999/xlink"
def="doc-xquery-FTCaseOption" xlink:type="simple">FTCaseOption</nt> "lowercase" 
is specified, then "uppercase" cannot also be specified as part of the same 
<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTMatchOptions"
xlink:type="simple">FTMatchOptions</nt>.
</p>
  <p>
<termdef id="dt-match-option-order" term="match option application order">
The order in which effective match options for an 
<nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWords"
xlink:type="simple">FTWords</nt> are applied 
 is called the <term>match option application order</term>.</termdef>
This order is significant
because match options are not always commutative.
For example,
    synonym(stem(word))
is not always the same as
    stem(synonym(word)).
</p>
</paragraphs>

<reply>
I do not see any difference between your result given above and the result that I see in FTPrimary-FTWords-q1_result.xml.  Can you tell me what the differences are?  There are some minor white space differences, but the XML comparison ought to normalize those away. 
</reply>
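
As an illustration of what "normalize those away" could mean, here is a minimal sketch of a white-space-insensitive comparison in Python. It is not the comparator used by the test suite (which is not described in this report); the helper names `normalize` and `equivalent` are hypothetical, and the rule of collapsing every run of white space to a single blank is an assumption.

```python
import xml.etree.ElementTree as ET

def normalize(text):
    # Collapse runs of white space and trim both ends; None becomes "".
    return " ".join((text or "").split())

def equivalent(a, b):
    # Elements match if tag, attributes, normalized text and tail agree,
    # and their children match pairwise in document order.
    if a.tag != b.tag or a.attrib != b.attrib:
        return False
    if normalize(a.text) != normalize(b.text):
        return False
    if normalize(a.tail) != normalize(b.tail):
        return False
    if len(a) != len(b):
        return False
    return all(equivalent(x, y) for x, y in zip(a, b))

expected = ET.fromstring(
    "<p>This order is significant\nbecause match options are not always commutative.</p>"
)
actual = ET.fromstring(
    "<p>This order is   significant because match options\n  are not always commutative.  </p>"
)

print(equivalent(expected, actual))
```

Under these assumptions the two paragraphs compare as equal, since they differ only in line breaks and spacing.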


[7] XQFTTSCatalog.xml

=> "Testsources" -> "TestSources"

<reply>
Fixed, thanks!
</reply>

=> "FTSelection-Weight-1a.xq" -> "FTSelection-Weight-q1a.xq"
...same operation for all files in this directory

<reply>
Fixed, thanks!
</reply>

=> Query FTSelection-Weight-q1d
 an <expected-error> element should be added here as implementations can raise
errors for negative values

<reply>
Fixed, thanks!
</reply>

=> Query ft-34-examples-q2 - ft-342-examples-q4 -> don't exist yet

<reply>
I'm not sure that I know what this report means.  In XQFTTSCatalog.xml, I have found mention of ft-34-examples-q1 and ft-34-examples-q2, but no others.  Sure enough, there are no files of those names in directory 3.4-MatchOptions.  That's because the person responsible for writing the tests for that section has created the catalog information but has not yet provided the actual tests.  This will be resolved as test development proceeds. 
</reply>

Again, many thanks for your comments, and we hope this means that you plan to run the final test suite and report your results!

I am marking this bug as FIXED based on my responses above.  If you disagree, feel free to REOPEN the bug; otherwise, please mark the bug CLOSED. 

Jim
Comment 2 Christian Gruen 2008-10-31 06:50:35 UTC
Hi Jim,

thanks for your detailed replies!

> (For tracking purposes, it would be slightly better for you to put each 
> separate comment into a separate Bugzilla bug, if that's convenient.)

Sure, no problem.. I have just created a new list, and I'll create some
more entries this time. - Just two last additions to this post:


[2] FTPrimary-FTExtensionSelection-q1.xq

> I have run the query that you identified above through the Full Text parser
> applet (see http://www.w3.org/2007/01/applets/xquery-fulltextApplet.html) and
> it appears to parse without error.

Yes.. my fault, our parser ignored the Pragma expression.. Still, I have some problems with the current version of this query - even with the parser applet - as the namespace is declared after the variable. If I switch these two lines..

 declare namespace exq = "http://example.org/examples/pragmas";  (2)
 declare variable $input-context external;  (1)

..everything works fine for me.


[5] Examples/2.2.2/ft-222-examples-results-q1.txt

> Actually, I think the result should be:
> 
> <author>Heavy Rain</author>
> <author>Melton Mowbray</author>
> 
> That is, I believe that both books satisfy the ftcontains conditions
> (particularly  because of "with stemming" applied to "dog").  I have changed
> the result to reflect that belief.  If you still believe that only one author
> should be returned, please help me understand why.

..yes, I think that "<title>Dogs and Cats</title>" should be excluded
from the result as only the dogs are stemmed, but not the cats.
The where clause was:

  where $b/title ftcontains ("dog" with stemming) ftand "cat"  

Just tell me if I got it wrong.
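
To make the scoping argument concrete, here is a rough sketch in Python. It is not how any XQuery Full Text processor works: `toy_stem` is a hypothetical stemmer that only strips one trailing "s" (real stemming is implementation-defined), tokenization is a plain white-space split, case-insensitive matching is assumed, and the second title is made up for contrast.

```python
def toy_stem(token):
    # Hypothetical stemmer: strips a single trailing "s".
    return token[:-1] if token.endswith("s") and len(token) > 1 else token

def tokens(text):
    # Simplistic tokenizer: split on white space and lowercase.
    return [t.lower() for t in text.split()]

def contains(text, word, stemming=False):
    # With stemming, compare stems; without it, compare tokens as they are.
    toks = tokens(text)
    if stemming:
        return toy_stem(word.lower()) in [toy_stem(t) for t in toks]
    return word.lower() in toks

def matches(title):
    # Mirrors: title ftcontains ("dog" with stemming) ftand "cat"
    # The stemming option applies to "dog" only, not to "cat".
    return contains(title, "dog", stemming=True) and contains(title, "cat")

# The second title is a hypothetical counterexample.
for title in ["Dogs and Cats", "The dog and the cat"]:
    print(title, matches(title))
```

In this sketch "Dogs and Cats" fails because the unstemmed "cat" does not equal the token "cats", while the made-up second title passes.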


Thanks,

Christian, BaseX Team 
http://www.basex.org



Comment 3 Jim Melton 2008-12-22 23:50:44 UTC
Thanks for your further response and analysis.  I agree with both of your positions on items [2] and [5] and have made the appropriate changes to the relevant test and to the relevant result. 

I believe that this completes addressing all of your points.  When you have the time, please mark this bug report CLOSED. Thanks!
Comment 4 Christian Gruen 2008-12-23 01:56:40 UTC
Thanks..