This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 12807 - [XQFTTS] xquery-xpath-composability-queries-results-q9* tests assume stemming algorithm
Status: NEW
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: Full Text 1.0
Version: Candidate Recommendation
Hardware: All All
Importance: P2 normal
Target Milestone: ---
Assignee: Jim Melton
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
Reported: 2011-05-28 14:25 UTC by Paul J. Lucas
Modified: 2011-12-29 16:21 UTC

Description Paul J. Lucas 2011-05-28 14:25:47 UTC
These queries contain the words "successfully", "completing", and "tasks" and use stemming.  Our implementation returns the expected result and, additionally, the "Usability Basics" title.

I think the expected result is due to the test assuming that the stemming algorithm stems "successfully" to "successful".  Our implementation stems "successfully" to "success".  Accordingly, the following XML fragment in the test document:

      <p>Users are asked to complete tasks which
            measure the success of the information
            architecture and navigational elements of the
            site.</p>

matches and, correspondingly, the entire title is matched and returned.  Since the stem of a token is "implementation-defined", this test is incorrect in assuming that "successfully" stems to "successful" and not "success".
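
To illustrate the discrepancy described above, here is a minimal Python sketch. The two stem tables are hypothetical (neither is taken from the test suite or from a real stemmer), and the "every query word must match" rule is an assumption made for illustration, not the actual semantics of the q9 queries.

```python
# Hypothetical stem tables: they differ only in how "successfully" is stemmed.
STEMS_A = {"successfully": "successful", "completing": "complete", "tasks": "task"}
STEMS_B = {"successfully": "success", "completing": "complete", "tasks": "task"}

# The paragraph from the test document quoted above.
PARAGRAPH = ("Users are asked to complete tasks which measure the success of "
             "the information architecture and navigational elements of the site.")

QUERY = ["successfully", "completing", "tasks"]

def stem(word, table):
    """Look the word up in the table; words without an entry are left as-is."""
    word = word.lower()
    return table.get(word, word)

def tokenize(text):
    """Split on whitespace and drop trailing punctuation (deliberately naive)."""
    return [w.strip(".,") for w in text.split()]

def matches_all(query_words, text, table):
    """Assumed rule: every stemmed query word must occur among the stemmed tokens."""
    text_stems = {stem(t, table) for t in tokenize(text)}
    return all(stem(q, table) in text_stems for q in query_words)

print(matches_all(QUERY, PARAGRAPH, STEMS_A))  # "successful" is absent from the paragraph
print(matches_all(QUERY, PARAGRAPH, STEMS_B))  # "success" is present in the paragraph
```

Under these assumptions the paragraph matches only with the second table, which is the extra hit reported here.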
Comment 1 Tim Mills 2011-06-28 11:33:28 UTC
Personal response:

The test xquery-xpath-composability-queries-results-q9 specifies the stemming dictionary in the catalog:

        <aux-URI role="stemming-dictionary">english</aux-URI>

which corresponds to the "english-stems.txt" file.

This file specifies no rule to stem "successfully".  According to the guidelines for running the test suite:

"The stopwords, thesaurus, and stemming-dictionary sources are not intended to be used directly in the form in which they are given, but to provide information to those running the test suite about the expectations a particular test has about various implementation-specific aspects of the execution context. Implementations are expected to provide equivalent information to the query, but in whatever form is appropriate in their context."

This means that to pass the test, the stemmer you use in your test harness mustn't stem "successfully".
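
As a rough sketch of what such a harness stemmer could look like, the Python below stems by dictionary lookup and leaves any word without a rule unchanged. The dictionary entries are hypothetical; the actual contents and format of "english-stems.txt" are not reproduced here.

```python
# Hypothetical entries standing in for the rules of the stemming dictionary.
STEMMING_DICTIONARY = {
    "completing": "complete",
    "tasks": "task",
}

def stem(word, dictionary=STEMMING_DICTIONARY):
    """Return the dictionary stem, or the word itself when no rule exists."""
    key = word.lower()
    return dictionary.get(key, key)

print(stem("tasks"))         # has a rule in this hypothetical dictionary
print(stem("successfully"))  # no rule, so the word is returned unstemmed
```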
Comment 2 Paul J. Lucas 2011-12-22 18:29:03 UTC
Since stemming is implementation-defined, I don't see how a test can dictate what ought or ought not to be stemmed.  In this case in particular, requiring that "successfully" not be stemmed when it obviously has stems is ridiculous.
Comment 3 Tim Mills 2011-12-29 16:21:08 UTC
True, but if you want to pass the test suite, then you have to follow the guidelines for running the test suite.

(In reply to comment #2)
> Since stemming is implementation-defined, I don't see how a test can dictate
> what ought or ought not to be stemmed.  In this case in particular, requiring
> that "successfully" not be stemmed when it obviously has stems is ridiculous.