8403 – Several issues on use cases, including consistency with XQFTTS

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Status:	CLOSED FIXED

Alias:	None

Product:	XPath / XQuery / XSLT
Classification:	Unclassified
Component:	Full Text 1.0 Use Cases
Version:	Candidate Recommendation
Hardware:	All All

Importance:	P2 normal
Target Milestone:	---
Assignee:	Pat Case
QA Contact:	Mailing list for public feedback on specs from XSL and XML Query WGs

Reported:	2009-11-30 17:09 UTC by Peter M. Fischer
Modified:	2011-01-05 23:11 UTC

Description Peter M. Fischer 2009-11-30 17:09:07 UTC

I've run into a number of issues with the FT use cases when testing the MXQuery implementation against the test suite:

- The use case document and the test suite seem to be out of sync for the ignore (14.2) cases:
1) 14.2.1: The test suite expects an empty result, but the document shows a non-empty one.

2) 14.2.2: Queries in document specify to include all chapters, not just the ones containing the search terms. The expected result in both the UC document and the suite list only the chapter containing the terms, not all. ignore-queries-q2 (XQuery solution in suite) is different from UC document (and the XPath solution), lists all books, not just the one with the relevant chapter.

3) 14.2.3: The same discrepancy between XQuery and XPath solution as in 2) shows up in the UC document and the test suite, so all titles are listed for the XQuery case (and just the relevant chapter), while the matching books with all chapters are in the XPath case

4) 14.2.4: Same error as in 2) and 3) present in the test suite, but corrected in the UC document

- In 15.2.4 Q4, two issues are present
1) the ignore part of the second FTContains uses .//footnote, even though no preceding expression has set a context item (unless FTContains sets a context item, which seems to contradict the description and a number of other test cases).
2) $au contains text ftnot should not become true for book number=2, since $au is () there, and the description of the FTContains semantics (4.3 in FT CR) seems to imply that an empty search context always yields false. Several tests in the test suite support this.
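
A minimal sketch of the reading described in 2), written in Python rather than XQuery so it can be run without a Full Text processor. It assumes that "contains text" is true only if some item in the search context satisfies the selection; the helper names and the sample author string are hypothetical and are not taken from the use case data.

```python
def contains_text(search_context, selection):
    """Toy model: true only if some item in the search context satisfies the selection."""
    return any(selection(item) for item in search_context)

def ftnot(selection):
    """Negate a selection for a single item."""
    return lambda item: not selection(item)

def has_words(*words):
    """Selection that requires every given word to occur in the item."""
    return lambda item: all(w in item.lower().split() for w in words)

selection = ftnot(has_words("montana", "marigold"))

no_authors = []                 # models $au being () for book number=2
some_author = ["Example Name"]  # hypothetical author, not from the data

print(contains_text(no_authors, selection))   # False: nothing to satisfy the selection
print(contains_text(some_author, selection))  # True
```

Under this assumption the negation is evaluated per item, so an empty search context cannot make the whole expression true.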

- In 16.2.9 Q9 (xquery-xpath-composability-queries-results-q9a/9b), the keywords "successfully", "completing" and "tasks" are queried using stemming, and within an unordered window of 4*#chapters in a book. The MXQuery implementation uses a Porter-based stemmer and turns these keywords into the stems "success", "complet" and "task". As a result, the following paragraph also matches this query:

<p>Users are asked to complete tasks which 
            measure the success of the information 
            architecture and navigational elements of the 
            site.</p>

(less than 12 words, unordered)

which in turn means that the book with shortTitle "Improving Web Site Usability" is part of the generated result. If there is no error in my reasoning (and implementation), I'd suggest fixing the use case either by 
+ requesting "ordered" or 
+ not using stemming or
+ adding the second book to the result
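
The reasoning above can be checked with a small Python sketch (Python rather than XQuery, so it runs without a Full Text processor). The stems are hard-coded from this report instead of running a real Porter stemmer, and the window of 12 words is taken from the "(less than 12 words)" remark above; both are assumptions of the sketch.

```python
import re

# Stems as reported above for a Porter-based stemmer; hard-coded here
# as an assumption instead of running a real stemmer.
STEMS = {
    "successfully": "success", "success": "success",
    "completing": "complet", "complete": "complet",
    "tasks": "task",
}

def stem(word):
    return STEMS.get(word, word)

def smallest_window(text, query_words):
    """Return the size in words of the smallest span containing every
    query stem in any order, or None if some stem never occurs."""
    tokens = [stem(t) for t in re.findall(r"[a-z]+", text.lower())]
    wanted = {stem(w) for w in query_words}
    best = None
    for start, tok in enumerate(tokens):
        if tok not in wanted:
            continue
        seen = set()
        for end in range(start, len(tokens)):
            if tokens[end] in wanted:
                seen.add(tokens[end])
            if seen == wanted:
                size = end - start + 1
                best = size if best is None else min(best, size)
                break
    return best

paragraph = ("Users are asked to complete tasks which measure the success "
             "of the information architecture and navigational elements "
             "of the site.")
size = smallest_window(paragraph, ["successfully", "completing", "tasks"])
print(size, size is not None and size <= 12)  # 6 True
```

With these stems the three terms fall within six consecutive words ("complete tasks which measure the success"), so an unordered window of 12 words matches.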

- In 17.2.6 Q6 the boolean output of an FTContains is again queried using an FTContains, and is part of the result:

let $booktext := $book/content contains text ...
let score $s := $booktext contains text ...
return ($book/metadata/title, $booktext)

Probably the correct query is 

let $booktext := $book/content [. contains text ... ]
let score $s := $booktext contains text ...
return ($book/metadata/title, $booktext)
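
The difference between the two forms can be illustrated with a rough Python analogy (not XQuery, and not a model of the exact Full Text semantics). The contains_text helper and the sample content string are hypothetical.

```python
def contains_text(items, word):
    """Toy stand-in for 'contains text': true if any item has the word."""
    return any(word in str(item).lower().split() for item in items)

content = ["we conduct usability tests and measure success"]  # hypothetical

# Original form: the let binds the boolean result of the first test.
booktext_bool = contains_text(content, "usability")

# Suggested form: the predicate filters, so the binding keeps the content.
booktext_nodes = [c for c in content if contains_text([c], "usability")]

print(contains_text([booktext_bool], "success"))  # False: only the boolean is searched
print(contains_text(booktext_nodes, "success"))   # True: the content itself is searched
```

In the first form the second test and the returned value see only a boolean; in the second form they see the matching content.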

Comment 1 Pat Case 2009-12-09 19:54:12 UTC

Peter,

Thank you for reporting these errors in the test suite. We very much appreciate your assistance in getting the test suite right.

I report below the action we took on each item. Please review them and if they appear to you to be satisfactory, please mark this bug closed. If not, please let us know what continues to be problematic.

1) 14.2.1: The test suite expects an empty result, but the document shows a non-empty one.
--We corrected the XQuery solution in the Use Cases to:
for $book in doc("http://bstore1.example.com/full-text.xml")
   /books/book
let $p := $book//p[. contains text "testing" ftand "guidance" ftor 
   "correct" distance at most 60 words without content *]
return $book
--We replaced the test suite result with the use cases result.

2) 14.2.2: Queries in document specify to include all chapters, not just the
ones containing the search terms. The expected result in both the UC document
and the suite list only the chapter containing the terms, not all.
ignore-queries-q2 (XQuery solution in suite) is different from UC document (and
the XPath solution), lists all books, not just the one with the relevant
chapter.
--We corrected the XQuery solution in the Use Cases to:
for $book in doc("http://bstore1.example.com/full-text.xml")
   /books/book
let $chap := $book//chapter[
   . contains text "users can be tested at any 
   computer workstation or in a lab" without content 
   .//footnote]
return ($book/metadata/title, $chap)
--We replaced the test suite result with the use cases result.

3) 14.2.3: The same discrepancy between XQuery and XPath solution as in 2)
shows up in the UC document and the test suite, so all titles are listed for
the XQuery case (and just the relevant chapter), while the matching books with
all chapters are in the XPath case
--We replaced the XQuery and XPath solutions in both documents. For brevity, I provide only the XQuery solution from the use cases:
for $chapter in doc("http://bstore1.example.com/full-text.xml")
   /books/book//chapter
where $chapter contains text "at any computer 
   workstation or in a lab" without content 
   .//footnote[. contains text "workstation." using wildcards]
return ($chapter/ancestor::book/metadata/title, $chapter)
--We replaced the test suite result with the use cases result.

4) 14.2.4: Same error as in 2) and 3) present in the test suite, but corrected
in the UC document
--We replaced the XQuery and XPath solutions in both documents. For brevity, I provide only the XQuery solution from the use cases:
for $chapter in doc("http://bstore1.example.com/full-text.xml")
   /books/book//chapter
where $chapter contains text "workstation" ftand "lab" 
   distance at most 6 words without content .//footnote[. 
   contains text "workstation." using wildcards]   
return ($chapter/ancestor::book/metadata/title, $chapter/(p|p/footnote))
--We replaced the test suite result with the use cases result.

- In 15.2.4 Q4, two issues are present
1) the ignore part of the second FTContains uses .//footnote, even though no
preceding expression has set a context item (unless FTContains sets a context
item, which seems to contradict the description and a number of other test
cases).
2) $au contains text ftnot should not become true for book number=2, since $au
is () there, and the description of the FTContains semantics (4.3 in FT CR)
seems to imply that an empty search context always yields false. Several tests
in the test suite support this.
--We replaced the XQuery solutions in both documents. For brevity, I provide only the XQuery solution from the use cases:
for $book in doc("http://bstore1.example.com/full-text.xml")
   /books/book
let $au := $book/metadata/author
let $co := $book/content
where $au contains text ftnot ("montana" ftand "marigold")
   and $co contains text "correct" ftor "comment" 
   using stemming ftor "guidance" ftor "assistance" 
   ftor "help" ftand "usability test.*" using wildcards 
   window 80 words without content $co//footnote
return <book number="{$book/@number}"> 
          {$book/metadata/title, $co}
          </book>
--We changed the results in the use cases to "No Results Found", following our convention in that document, and changed the results in the test suite to an empty document.

- In 16.2.9 Q9 (xquery-xpath-composability-queries-results-q9a/9b), the
keywords "successfully", "completing" and "tasks" are queried using stemming,
and within an unordered window of 4*#chapters in a book. The MXQuery
implementation uses a Porter-based stemmer and turns these keywords into the
stems "success", "complet" and "task". As a result, the following paragraph
also matches this query:

<p>Users are asked to complete tasks which 
            measure the success of the information 
            architecture and navigational elements of the 
            site.</p>

(less than 12 words, unordered)

which in turn means that the book with shortTitle "Improving Web Site
Usability" is part of the generated result. If there is no error in my
reasoning (and implementation), I'd suggest fixing the use case either by 
+ requesting "ordered" or 
+ not using stemming or
+ adding the second book to the result
--No change. We recommend you use the stemming dictionary provided for testing. If you prefer to use a different stemmer and get different results, you may explain why, but we encourage you to follow the instructions in the test suite quoted here:
     When the catalog entry for a query references a stemming dictionary,
     the implementation is expected to provide stemming equivalent to the
     rules given in the stemming dictionary.

- In 17.2.6 Q6 the boolean output of an FTContains is again queried using an
FTContains, and is part of the result:

let $booktext := $book/content contains text ...
let score $s := $booktext contains text ...
return ($book/metadata/title, $booktext)

Probably the correct query is 

let $booktext := $book/content [. contains text ... ]
let score $s := $booktext contains text ...
return ($book/metadata/title, $booktext)
--We used your rewrite and replaced the XQuery solutions in both documents. For brevity, I provide only the XQuery solution from the use cases:
for $book in doc("http://bstore1.example.com/full-text.xml")
   /books/book
let $booktext := $book/content [. contains text ("conduct" 
   ftand "usability" ftand "tests" distance at most 
   10 words) using stemming] 
let score $s := $booktext contains text 
   (("measuring" ftand "success" distance
   at most 4 words) weight {0.8}) using stemming 
order by $s
return ($book/metadata/title, $booktext)

Again, thanks for your assistance!!!!

Pat Case, Library of Congress, for the Full Text Task Force

Comment 2 Michael Dyck 2010-01-20 02:24:50 UTC

A while after the previous comment was entered, the Task Force determined that there were problems with some of the specific queries given therein. Revised queries will appear in the next public draft of the Use Cases document. In the meantime, the revisions have already been propagated to the Full Text Test Suite.

E.g., the revised 'Solution in XQuery' for Use Case 14.2.1 can be seen at
http://dev.w3.org/cvsweb/~checkout~/2007/xpath-full-text-10-test-suite/TestSuiteStagingArea/Queries/XQuery/UseCase/UseCase-IGNORE/ignore-queries-results-q1.xq
(modulo the test suite's use of $input-context instead of fn:doc).