[Bug 5838] New: [FT] Possible Inconsistencies

http://www.w3.org/Bugs/Public/show_bug.cgi?id=5838

           Summary: [FT] Possible Inconsistencies
           Product: XPath / XQuery / XSLT
           Version: Candidate Recommendation
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Full Text 1.0 Use Cases
        AssignedTo: pcase@crs.loc.gov
        ReportedBy: christian.gruen@gmail.com
         QAContact: public-qt-comments@w3.org


Dear all,

I had a closer look at the test suite and the XQFT Use Cases and
tried to answer them with our BaseX full-text implementation. That's
when I came across some possible inconsistencies.

=== XQFT Use Cases, 16 May 2008 ===

- 2.2.7: ....[. ftcontains "improv.* the ....  testing" entire content]

The query contains the token "improv.*", but the wildcard option is
not specified here. If I add "with wildcards", I get the expected
results.

- 4.2.1: ....ftcontains "improve" ftand "web" ftand "usability" with
stemming....

If I read the grammar correctly, the stemming option is only applied
to the last token in this example; I get the correct results if I
parenthesize all the search tokens:

 ....ftcontains ("improve" ftand "web" ftand "usability") with stemming....

- 5.2.1: Getting pedantic.. The "s" in the "solution in XQuery" header
is written in lower case (I wrote a simple XQuery script to extract
the queries, and this one was left out).

- 16.2.9: ...for $cont := $book/content...

"for" should probably be replaced with "let" (or ":=" with "in").

- 17.2.4:    ...filter( $e/node.... / ....return filter($book....

I can parse this one if I precede the function call with the "local:" prefix.
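
For illustration, the two syntax fixes for 16.2.9 and 17.2.4 can be
sketched in one small query. This is only a minimal sketch, not the
use-case queries themselves (those are elided above): the sample $book
element and the body of local:filter are invented stand-ins, and no
full-text operators are used, so it should run on a plain XQuery 1.0
processor.

```xquery
xquery version "1.0";

(: Hypothetical stand-in for the use case's filter function.
   An unprefixed name would fall into the default function namespace,
   which is why the declaration and the call carry the "local:" prefix. :)
declare function local:filter($nodes as node()*) as node()* {
  $nodes[self::element()]
};

(: Hypothetical sample data, not taken from the use-case documents. :)
let $book := <book><content><p>Usability testing</p>plain text</content></book>

(: 16.2.9: a binding uses "let ... :=";
   iteration would be "for $cont in $book/content". :)
let $cont := $book/content

(: 17.2.4: call the user-declared function with its prefix. :)
return local:filter($cont/node())
```

With this sample data the query returns only the <p> element, since
the stand-in function keeps element nodes and drops the text node.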

=== XQFT Test Suite ===

Here I mainly stumbled across some minor serialization issues:

- As far as I know, line breaks inside attribute values are not
preserved while parsing XML documents (attribute-value normalization
replaces each one with a space), so I expected the attribute..

 ....url="http://www.useit.com/papers/heuristic
 /heuristic_list.html">Ten Usability....

...to yield..

 ....url="http://www.useit.com/papers/heuristic
/heuristic_list.html">....

The same applies to two other attributes:

 ....url="http://usability.gov .... /guidelines/index.html"....
 ....shortTitle="Usabilityguy Manuscript .... Guide">....

Next - another bagatelle - the attribute

 ....normalize=
 "1990/1999"....

spans two lines, whereas Saxon, Qizx, and BaseX keep it on one line:

 ....<componentDate normalize="1990/1999">1990-1999....

Last but not least, the two test-cases

 element-queries-results-q7.xq  and
 element-queries-results-q7b.xq

use the wildcard and "entire content" options from the above-mentioned
use-case query (2.2.7), so the same remark about "with wildcards"
applies there.


I've noticed another possible inconsistency in the Test Suite queries
and Use Cases: many examples, esp. the XPath examples, use the count()
function to check if an ftcontains operator yields results. As
ftcontains returns a boolean value, I assume that the count function
will always return 1..

 count( 'abc' ftcontains 'def' ) > 0  -> true
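
The effect can be reproduced without full-text support. In the sketch
below a general comparison stands in for ftcontains (an assumption made
only so that the example runs on any XQuery 1.0 processor); both
expressions yield a single xs:boolean, and count() counts items, not
truth values.

```xquery
xquery version "1.0";

(: Stand-in for 'abc' ftcontains 'def': also a single xs:boolean. :)
let $match := ('abc' = 'def')
return (
  count($match),      (: 1: a singleton boolean is still one item :)
  count($match) > 0,  (: true, whatever the match result is :)
  $match              (: false: the value the test presumably wants :)
)
```

If a count is really intended, one option might be to count the nodes
selected by a predicate, along the lines of
count($books[. ftcontains 'def']) > 0, where $books is a hypothetical
node sequence; otherwise the boolean can be used directly.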


That's all I found for now - thanks for listening.

Regards,
Christian



Received on Monday, 7 July 2008 13:59:56 UTC