This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
I've run into a number of issues with the FT use cases when testing the MXQuery implementation against the test suite:

- The use case document and the test suite seem to be out of sync for the ignore (14.2) cases:
  1) 14.2.1: The test suite expects empty results; the document expects non-empty results.
  2) 14.2.2: The queries in the document specify that all chapters are included, not just the ones containing the search terms. The expected results in both the UC document and the suite list only the chapter containing the terms, not all chapters. ignore-queries-q2 (the XQuery solution in the suite) differs from the UC document (and from the XPath solution): it lists all books, not just the one with the relevant chapter.
  3) 14.2.3: The same discrepancy between the XQuery and XPath solutions as in 2) shows up in the UC document and the test suite: all titles are listed in the XQuery case (with just the relevant chapter), while the XPath case returns the matching books with all chapters.
  4) 14.2.4: The same error as in 2) and 3) is present in the test suite, but corrected in the UC document.

- In 15.2.4 Q4, two issues are present:
  1) The ignore part of the second FTContains uses .//footnote, even though no preceding expression has set a context item (unless FTContains sets a context item, which seems to contradict the description and a number of other test cases).
  2) $au contains text ftnot should not become true for book number=2, since $au is () there, and the description of the FTContains semantics (4.3 in FT CR) seems to imply that an empty search context always yields false. Several tests in the test suite support this.

- In 16.2.9 Q9 (xquery-xpath-composability-queries-results-q9a/9b), the keywords "successfully", "completing" and "tasks" are queried using stemming, and within an unordered window of 4*#chapters in a book. The MXQuery implementation uses a Porter-based stemmer and turns these keywords into the stems "success", "complet" and "task".
As a result, the following paragraph also matches this query:

<p>Users are asked to complete tasks which measure the success of the information architecture and navigational elements of the site.</p>

(fewer than 12 words, unordered), which in turn means that the book with shortTitle "Improving Web Site Usability" is part of the generated result. If there is no error in my reasoning (and implementation), I'd suggest fixing the use case by either
  + requesting "ordered", or
  + not using stemming, or
  + adding the second book to the result.

- In 17.2.6 Q6, the boolean output of an FTContains is itself queried using an FTContains, and is returned as part of the result:

let $booktext := $book/content contains text ...
let score $s := $booktext contains text ...
return ($book/metadata/title, $booktext)

Probably the correct query is:

let $booktext := $book/content [. contains text ... ]
let score $s := $booktext contains text ...
return ($book/metadata/title, $booktext)
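The 17.2.6 Q6 point can be illustrated outside XQuery. What follows is a minimal Python sketch, not code from the use cases or the test suite: `contains_text` is a hypothetical stand-in for FTContains that does plain substring matching (no tokenization, stemming or scoring), and the sample strings are invented. It only shows why binding the boolean result loses the content nodes, while a predicate filter keeps them.

```python
def contains_text(items, phrase):
    """Hypothetical stand-in for FTContains: true if any item mentions the phrase."""
    return any(phrase.lower() in item.lower() for item in items)


# Invented sample data standing in for the $book/content nodes.
content = [
    "How to conduct usability tests",
    "Glossary of terms",
]

# Analogue of: let $booktext := $book/content contains text ...
# The variable is bound to a boolean, so the nodes are no longer available.
booktext_bool = contains_text(content, "usability tests")

# Analogue of: let $booktext := $book/content [. contains text ... ]
# The predicate filters the sequence, so the variable still holds nodes.
booktext_nodes = [c for c in content if contains_text([c], "usability tests")]

print(booktext_bool)
print(booktext_nodes)
```

In the first form, a later full-text test or a return clause only sees true/false; in the second form it sees the matching content.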
Peter,

Thank you for reporting these errors in the test suite. We very much appreciate your assistance in getting the test suite right. I report below the action we took on each item. Please review them and, if they appear satisfactory to you, please mark this bug closed. If not, please let us know what continues to be problematic.

1) 14.2.1: Test Suite expects empty results, document non-empty.

--We corrected the XQuery solution in the Use Cases to:

for $book in doc("http://bstore1.example.com/full-text.xml")
   /books/book
let $p := $book//p[. contains text "testing" ftand "guidance" ftor "correct" distance at most 60 words without content *]
return $book

--We replaced the test suite result with the use cases result.

2) 14.2.2: Queries in document specify to include all chapters, not just the ones containing the search terms. The expected result in both the UC document and the suite list only the chapter containing the terms, not all. ignore-queries-q2 (XQuery solution in suite) is different from UC document (and the XPath solution), lists all books, not just the one with the relevant chapter.

--We corrected the XQuery solution in the Use Cases to:

for $book in doc("http://bstore1.example.com/full-text.xml")
   /books/book
let $chap := $book//chapter[ . contains text "users can be tested at any computer workstation or in a lab" without content .//footnote]
return ($book/metadata/title, $chap)

--We replaced the test suite result with the use cases result.

3) 14.2.3: The same discrepancy between XQuery and XPath solution as in 2) shows up in the UC document and the test suite, so all titles are listed for the XQuery case (and just the relevant chapter), while the matching books with all chapters are in the XPath case

--We replaced the XQuery and XPath solutions in both documents.
For brevity, I provide only the XQuery solution from the use cases:

for $chapter in doc("http://bstore1.example.com/full-text.xml")
   /books/book//chapter
where $chapter contains text "at any computer workstation or in a lab" without content .//footnote[. contains text "workstation." using wildcards]
return ($chapter/ancestor::book/metadata/title, $chapter)

--We replaced the test suite result with the use cases result.

4) 14.2.4: Same error as in 2) and 3) present in the test suite, but corrected in the UC document

--We replaced the XQuery and XPath solutions in both documents. For brevity, I provide only the XQuery solution from the use cases:

for $chapter in doc("http://bstore1.example.com/full-text.xml")
   /books/book//chapter
where $chapter contains text "workstation" ftand "lab" distance at most 6 words without content .//footnote[. contains text "workstation." using wildcards]
return ($chapter/ancestor::book/metadata/title, $chapter/(p|p/footnote))

--We replaced the test suite result with the use cases result.

- In 15.2.4 Q4, two issues are present
  1) the ignore part of the second FTContains uses .//footnote, even though no preceding expression has set a context item (unless FTContains sets a context item, which seems to contradict the description and a number of other test cases).
  2) $au contains text ftnot should not become true for book number=2, since $au is () there, and the description of the FTContains semantics (4.3 in FT CR) seems to imply that an empty search context always yields false. Several tests in the test suite support this

--We replaced the XQuery solutions in both documents.
For brevity, I provide only the XQuery solution from the use cases:

for $book in doc("http://bstore1.example.com/full-text.xml")
   /books/book
let $au := $book/metadata/author
let $co := $book/content
where $au contains text ftnot ("montana" ftand "marigold")
   and $co contains text "correct" ftor "comment" using stemming ftor "guidance" ftor "assistance" ftor "help" ftand "usability test.*" using wildcards window 80 words without content $co//footnote
return <book number="{$book/@number}"> {$book/metadata/title, $co} </book>

--We changed the results in the use cases to No Results Found, following our convention in that document, and changed the results in the test suite to an empty document.

- In 16.2.9 Q9 (xquery-xpath-composability-queries-results-q9a/9b), the keywords "successfully", "completing" and "tasks" are queried using stemming, and within an unordered window of 4*#chapters in a book. The MXQuery implementation uses a Porter-based stemmer and turns these keywords into the stems "success", "complet" and "task". As a result, the following paragraph also matches this query:

<p>Users are asked to complete tasks which measure the success of the information architecture and navigational elements of the site.</p>

(less than 12 words, unordered) which in turn means that the book with shortTitle "Improving Web Site Usability" is part of the generated result. If there is no error in my reasoning (and implementation), I'd suggest fixing the use case either by
  + requesting "ordered" or
  + not using stemming or
  + adding the second book to the result

--No change. We recommend you use the stemming dictionary provided for testing. If you prefer to use a different stemmer and get different results, you may explain why, but we encourage you to follow the instructions in the test suite quoted here:

When the catalog entry for a query references a stemming dictionary, the implementation is expected to provide stemming equivalent to the rules given in the stemming dictionary.
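The match reported for 16.2.9 Q9 can be reproduced with a small Python sketch. It is an illustration under stated assumptions, not the MXQuery implementation and not the test suite's stemming dictionary: the `STEMS` table is hand-written from the stems named in the report, `smallest_window` is a hypothetical helper, tokenization is a plain whitespace split, and the 12-word limit is taken from the report's "less than 12 words".

```python
# Stems as named in the report for a Porter-based stemmer.
# This table is a hand-written stand-in, not a real stemmer.
STEMS = {
    "successfully": "success",
    "success": "success",
    "completing": "complet",
    "complete": "complet",
    "tasks": "task",
}


def stem(word):
    """Lower-case the word, drop trailing punctuation, look up its stem."""
    word = word.lower().strip(".,")
    return STEMS.get(word, word)


def smallest_window(text, keywords):
    """Length in words of the shortest span holding every keyword stem,
    in any order; None if the stems never occur together."""
    wanted = {stem(k) for k in keywords}
    tokens = [stem(t) for t in text.split()]
    best = None
    for start in range(len(tokens)):
        if tokens[start] not in wanted:
            continue
        seen = set()
        for end in range(start, len(tokens)):
            if tokens[end] in wanted:
                seen.add(tokens[end])
            if seen == wanted:
                length = end - start + 1
                if best is None or length < best:
                    best = length
                break
    return best


paragraph = ("Users are asked to complete tasks which measure the success of "
             "the information architecture and navigational elements of the site.")
keywords = ["successfully", "completing", "tasks"]

span = smallest_window(paragraph, keywords)
print(span, span is not None and span <= 12)
```

With these stems the three keywords fall inside a 6-word span ("complete tasks which measure the success"), so an unordered 12-word window accepts the paragraph. A stemming dictionary that does not map "complete" and "completing" to the same stem would give a different answer, which is why the choice of stemmer matters here.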
- In 17.2.6 Q6 the boolean output of an FTContains is again queried using an FTContains, and part of the result:

let $booktext := $book/content contains text ...
let score $s := $booktext contains text ...
return ($book/metadata/title, $booktext)

Probably the correct query is

let $booktext := $book/content [. contains text ... ]
let score $s := $booktext contains text ...
return ($book/metadata/title, $booktext)

--We used your rewrite and replaced the XQuery solutions in both documents. For brevity, I provide only the XQuery solution from the use cases:

for $book in doc("http://bstore1.example.com/full-text.xml")
   /books/book
let $booktext := $book/content [. contains text ("conduct" ftand "usability" ftand "tests" distance at most 10 words) using stemming]
let score $s := $booktext contains text (("measuring" ftand "success" distance at most 4 words) weight {0.8}) using stemming
order by $s
return ($book/metadata/title, $booktext)

Again, thanks for your assistance!!!!

Pat Case, Library of Congress, for the Full Text Task Force
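The empty-search-context point raised for 15.2.4 Q4 can be sketched in Python as well. This is a simplified model that follows the reading of section 4.3 put forward in the report, not the normative semantics: `ft_contains` and `ftnot_montana_marigold` are hypothetical names, matching is on whitespace-split lower-cased words, and the author string is invented.

```python
def ft_contains(search_context, selection):
    """Simplified model of FTContains: true only if at least one item
    in the search context satisfies the selection."""
    return any(selection(item) for item in search_context)


def ftnot_montana_marigold(text):
    """Analogue of: ftnot ("montana" ftand "marigold")."""
    words = text.lower().split()
    return not ("montana" in words and "marigold" in words)


# A book with one (invented) author: the negated selection holds for that item.
with_author = ft_contains(["Some Author"], ftnot_montana_marigold)

# A book where $au is (): there is no item to test, so the result is false
# even though the selection is a negation.
without_author = ft_contains([], ftnot_montana_marigold)

print(with_author, without_author)
```

Under this model the ftnot is evaluated per item, so it cannot turn an empty search context into a match.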
A while after the previous comment was entered, the Task Force determined that there were problems with some of the specific queries given therein. Revised queries will appear in the next public draft of the Use Cases document. In the meantime, the revisions have already been propagated to the Full Text Test Suite. E.g., the revised 'Solution in XQuery' for Use Case 14.2.1 can be seen at http://dev.w3.org/cvsweb/~checkout~/2007/xpath-full-text-10-test-suite/TestSuiteStagingArea/Queries/XQuery/UseCase/UseCase-IGNORE/ignore-queries-results-q1.xq (modulo the test suite's use of $input-context instead of fn:doc).