This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 6668 - [FT] Stemming files
Summary: [FT] Stemming files
Status: CLOSED FIXED
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: Full Text 1.0
Version: Candidate Recommendation
Hardware: All All
Importance: P2 normal
Target Milestone: ---
Assignee: Jim Melton
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
URL: http://basex.org
Reported: 2009-03-09 14:27 UTC by Christian Gruen
Modified: 2009-03-13 14:22 UTC

Description Christian Gruen 2009-03-09 14:27:54 UTC
Sorry, another one: the stemming file "english-stems.txt" seems to have some inconsistencies.

[...]
test tests testing tested testers
picture pictures
use user
users user
[...]

"tests", "testing", etc. are stemmed to "test", but the same approach won't work for the "user" entries.

Thanks,

Christian, BaseX Team 
http://www.basex.org
Comment 1 Pat Case 2009-03-13 12:27:51 UTC
Hi Christian,

I have reordered each line in the stemming file to begin with the simplest form of the word.

I have combined and added forms of words to the use/users line. It is now:
use uses using used user users

I have reviewed every test case that calls the stemming file and whose query contains "use" or "user" to make sure that this change neither nullifies the test cases nor changes their results.

If this result is acceptable, please close the bug.

Pat Case
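
For illustration, the inconsistency reported above and the effect of the revised line can be checked with a small script. This is only a sketch: it assumes that the first word on each line is the stem for every word on that line (the convention described in Comment 1), and the function name `build_stem_map` and the two sample lists are hypothetical. The sample lists contain only the lines quoted in this bug, not the full "english-stems.txt" file, and the "test" and "picture" lines are assumed to be unchanged in the revised file.

```python
def build_stem_map(lines):
    """Map every word to the first word on its line and collect conflicts.

    Assumption: the first word on a line is the stem for that whole line.
    A conflict is a word that ends up with more than one candidate stem.
    """
    stems = {}
    conflicts = {}
    for line in lines:
        words = line.split()
        if not words:
            continue  # skip blank lines
        stem = words[0]
        for word in words:
            known = stems.setdefault(word, stem)
            if known != stem:
                # The same word was already assigned a different stem.
                conflicts.setdefault(word, {known}).add(stem)
    return stems, conflicts


# Lines quoted in the bug description.
ORIGINAL = [
    "test tests testing tested testers",
    "picture pictures",
    "use user",
    "users user",
]

# Same excerpt with the two "user" lines replaced by the line from Comment 1.
REVISED = [
    "test tests testing tested testers",
    "picture pictures",
    "use uses using used user users",
]

_, original_conflicts = build_stem_map(ORIGINAL)
revised_stems, revised_conflicts = build_stem_map(REVISED)

print(original_conflicts)      # words with more than one candidate stem
print(revised_conflicts)       # conflicts left after the fix
print(revised_stems["users"])  # stem now assigned to "users"
```

Under this reading, the original excerpt assigns "user" to two different stems ("use" and "users"), while the revised line maps all six forms to "use".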