This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 6223 - [XQFTTS] cont.
Summary: [XQFTTS] cont.
Status: CLOSED FIXED
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: Full Text 1.0
Version: Candidate Recommendation
Hardware: All All
Importance: P1 normal
Target Milestone: ---
Assignee: Jim Melton
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
URL: http://basex.org
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-11-12 20:31 UTC by Christian Gruen
Modified: 2009-02-04 19:53 UTC
CC List: 3 users

See Also:


Attachments

Description Christian Gruen 2008-11-12 20:31:37 UTC
Hello,

below are some more test suite corrections. Due to time constraints I have put all of them into one bug report - if you think that a single item needs more discussion, feel free to create an extra entry for it:


[1] RESULT ft-222-examples-results-q1.txt

OLD: <author>Heavy Rain</author><author>Melton Mowbray</author>
NEW: <author>Melton Mowbray</author>
...still to be fixed (see bug 6105)


[2] Path Mismatches
OLD: ExpectedResults/.../2.3-ScoreVariables
NEW: ExpectedResults/.../2.3.1-UsingWeightsWithinAScoredFTContainsExpr
...the queries and expected results are found in different directories
(the XQFTTSCatalog uses "2.3.1 ..." as directory)


[3] QUERY ft-3.2-examples-q5.xq

OLD: $book/title
NEW: $book/title/@shortTitle
...text content does not contain 'Web Site Usability'

OLD: return $book/@number
NEW: return data($book/@number)
...or string() or number() ...


[4] QUERY ft-3.3-examples-q1.xq

OLD: ...at least 2 times]/@number
NEW: ...at least 2 times]


[5] RESULTS ft-3.4-examples-... / ft-3.4.2-...

OLD: <?xml... / -
NEW: true
...all queries return boolean values instead of nodes (true/false)


[6] RESULTS ...global

OLD: <paragraphs></paragraphs>
NEW: <paragraphs/>
...inconsistent serialization of empty elements


[7] SOURCES xpath-full-text-10.xml

OLD: FTAnyallOptions
NEW: FTAnyallOption


[8] RESULT FTPrimary-FTWords-anyword-q2b_result.xml

NEW: ... <p>In general ... literal </p> ...
...expected as 2nd result


[9] RESULT FTPrimary-FTWords-any-q4b_result.xml

NEW: <p>FTWords ... FTAnyAllOption ...</p>, <p>...
...I would expect all <p> elements as results which contain the token
'FTAnyAllOption'. I might be wrong - if that's the case, just tell me why.


[10] QUERIES FTSelection-Weight-q...xq

OLD: ... /div ...
NEW: ... //div2 ...


[11] QUERIES FTSelection-FTTimes-q...xq

OLD: ... /div2 ...
NEW: ... //div2 ...

OLD: occurences
NEW: occurrence


[12] RESULTS FTSelection-FTTimes-q1a/-q2a/-q3a/-q4a_result.xml

OLD: <nt def=:doc-xquery-FTWords">
NEW: <nt xmlns:xlink="http://www.w3.org/1999/xlink" def="doc-xquery-FTWords" xlink:type="simple">
..to avoid too many alternative results, it might be useful to rewrite the
queries to get results w/o namespaces


[13] RESULT proximity-queries-results-q1.xml

OLD: <book number="1">...
NEW: <book number="3">...


[14] RESULT proximity-queries-results-q5.xml

OLD: <book number="1">...
NEW: alternative empty result
...if an implementation defines paragraph boundaries different from <p>
(such as e.g. new lines, 0x0A), this query won't yield any results.


[15] RESULT axes-queries-results-q1.txt

OLD: <p>Users can be tested..
NEW: <p>This is a basic ...</p> ... <p>Users can be tested..
...in this query, $para contains all <p> elements and not only
the one which contains the query terms


[16] RESULT axes-queries-results-q2.txt

...I would expect <title shortTitle="Improving Web Site Usability...">...
as first result


[17] RESULT axes-queries-results-q4.txt

...I would expect <title>Heuristic Evaluation</title>...
as second result.


[18] RESULT full-text-composability-queries-results-q4.txt

...I would expect <title>Heuristic Evaluation</title> ...
as second result.


[18] QUERY full-text-composability-queries-results-q5.xq

OLD: &gt;, ...
NEW: >, ...
...mysterious syntax


[19] RESULT xquery-xpath-composability-queries-results-q7.txt

...wrong results.. would be correct if where clause was specified
as predicate after $book//subject


[20] RESULT score-queries-results-q1.txt

OLD: ... </book><book> ...
NEW: ... </book>, <book> ...
...no commas between <book> elements. and, more important, as long as
no reference scoring model is defined, it might be better to avoid score
values in the test result, as was done here


[21] QUERY score-queries-results-q2.xq

OLD: ... >$result/metadata/title</
NEW: ... >{ $result/metadata/title }</
...missing brackets. next, the order of the results will once more
be implementation dependent, so it might be recommendable to omit
the "order" clause


[22] RESULT score-queries-results-q3.txt

...once more different results, dep. on the implementation, and same
problem for nearly all scoring queries. Is something like a default
scoring model planned for XQFT or the test suite?


[23] QUERY score-queries-results-q3b.xq

OLD: &gt;
NEW: >


[24] QUERY ft-3.4.1-examples-q1.xq

OLD: ... language "en"
NEW: ... language "en"]/..


[25] QUERIES examples-348-q1, examples-348-q2.xq,
  FTExtensionSelection-q1.xq, Extension1.xq

...XQFT grammar: namespace and ftoption declaration must precede
variable declaration


[26] RESULT Extension1.txt

...seems to be not included yet


[27] RESULT full-text-composability-queries-results-q2.txt
...language support should be optional (alternative result: FTST0009)


[28] QUERIES full-text-composability-queries-results-q2/...

OLD: ... at most 3 words ftand ...
NEW: ... (... at most 3 words) ftand
...no ftand allowed after FTRange, other bugs


Sorry, running out of time. There might be some more issues.

Christian, BaseX Team 
http://www.basex.org
Comment 1 Mary Holstege 2008-11-24 22:02:32 UTC
Items 2, 25, 26 resolved as indicated.
Item 6: XML results should follow canonicalization and use <this></this> form.
I changed everything I could find that didn't do this.
Comment 2 Christian Gruen 2008-11-24 22:17:16 UTC
just tampering.. "XML results should follow canonicalization and use <this></this> form. I changed everything I could find that didn't do this."

..I would expect all results to be serialized as "empty elements", i.e. using <this/> as syntax; this would comply with the XQTS. If I'm wrong, just tell me.

Thanks,

Christian, BaseX Team 
http://www.basex.org

Comment 3 Mary Holstege 2008-11-24 23:35:41 UTC
Interesting. If you read the instructions on XQTS, they say to canonicalize
also, and then do the comparison. OTOH, given that it is an XML comparison,
it really shouldn't matter.
Comment 4 Christian Gruen 2008-11-24 23:57:53 UTC
..I'm just confused as I don't know any XQuery implementation (Saxon, Qizx, Kawa, Zorba, BaseX, ..), which will output empty elements as start-/end tag-pairs. All of them choose to serialize the nodes as empty elements. The same can be said about the XQTS results in the "ExpectedTestResults" directory: All "empty elements" which I remember are in fact represented and serialized in the empty elements notation. So I would assume that this representation will be the preferred one for the XQFT-TS users, but, once again, tell me if I'm wrong.

Christian, BaseX Team 
http://www.basex.org

Comment 5 Mary Holstege 2008-11-25 18:46:47 UTC
(Trying to clarify, and be a little more strictly correct about what I say.)
I agree, I don't know anyone who prefers <element></element> serialization to <element/> either.  We don't. Either way is fine as a way to serialize results, and we're not saying you have to do otherwise. As far as XML comparison in the testsuite is concerned, it doesn't matter, because you're supposed to be comparing the canonicalized results of both what is in the expected test results and your own results. The canonicalization of those results will use the start-tag/end-tag form.  So the net result of all my changes really amounts to nothing more than pre-canonicalizing the expected test results, and is therefore, at some level, pointless and unnecessary.  In retrospect, sticking with the preferred serialization on the ground is probably simpler for all of us. 

The important technical point is that you should not be performing a text comparison of your results directly to the expected output, however you (or we) choose to serialize: that would be an incorrect way to run the tests.
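
The point above can be sketched in a few lines of Python. This is only an illustration, not the XQTS harness itself: the helper name `same_xml` and the dummy `<r>` wrapper are assumptions made here, and the standard library's `canonicalize` (C14N 2.0) stands in for whatever canonicalization a real test driver uses. It assumes the result fragments carry no XML declaration.

```python
from xml.etree.ElementTree import canonicalize


def same_xml(expected, actual):
    """Compare two serialized results after canonicalization."""
    # Hypothetical wrapper element, so that a sequence of several
    # top-level elements still parses as one document.
    def wrap(fragment):
        return "<r>" + fragment + "</r>"

    return canonicalize(wrap(expected)) == canonicalize(wrap(actual))


expected = "<paragraphs></paragraphs>"
actual = "<paragraphs/>"

print(expected == actual)          # plain text comparison: False
print(same_xml(expected, actual))  # canonicalized comparison: True
```

Both serializations canonicalize to the start-tag/end-tag form, so the choice made in the expected-results files does not affect the verdict.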
Comment 6 Christian Gruen 2008-11-25 19:52:15 UTC
..thank you for the clarification and sorry for persisting. I got your point, I agree that the canonicalization of both inputs will lead to the same result.
Comment 7 Pat Case 2008-11-29 17:24:24 UTC
Hi Christian,

Some more responses. Still figuring out how to handle the score queries #20-22.

[5] RESULTS ft-3.4-examples-... / ft-3.4.2-...

OLD: <?xml... / -
NEW: true
...all queries return boolean values instead of nodes (true/false)

--Yes. These have been changed to return T/F.

[13] RESULT proximity-queries-results-q1.xml

OLD: <book number="1">...
NEW: <book number="3">...

--Yes. expected results corrected to Bk 3.

[14] RESULT proximity-queries-results-q5.xml

OLD: <book number="1">...
NEW: alternative empty result
...if an implementation defines paragraph boundaries different from <p>
(such as e.g. new lines, 0x0A), this query won't yield any results.

--Yes. Added alternative result.


[15] RESULT axes-queries-results-q1.txt

OLD: <p>Users can be tested..
NEW: <p>This is a basic ...</p> ... <p>Users can be tested..
...in this query, $para contains all <p> elements and not only
the one which contains the query terms

--Yes, I corrected the expected results.


[16] RESULT axes-queries-results-q2.txt

...I would expect <title shortTitle="Improving Web Site Usability...">...
as first result

--Only the scored queries order results. Other results may be returned in any order. No change made.


[17] RESULT axes-queries-results-q4.txt

...I would expect <title>Heuristic Evaluation</title>...
as second result.

--Only the scored queries order results. Other results may be returned in any order. No change made.


[18] RESULT full-text-composability-queries-results-q4.txt

...I would expect <title>Heuristic Evaluation</title> ...
as second result.

--Only the scored queries order results. Other results may be returned in any order. No change made.

[18] QUERY full-text-composability-queries-results-q5.xq

OLD: &gt;, ...
NEW: >, ...
...mysterious syntax

--Yes, changed to >.


[19] RESULT xquery-xpath-composability-queries-results-q7.txt

...wrong results.. would be correct if where clause was specified
as predicate after $book//subject

--Corrected query as recommended.


[23] QUERY score-queries-results-q3b.xq

OLD: &gt;
NEW: >

--Yes, changed to >.


[24] QUERY ft-3.4.1-examples-q1.xq

OLD: ... language "en"
NEW: ... language "en"]/..

--Done. Also made numerous other changes to this query.


[27] RESULT full-text-composability-queries-results-q2.txt
...language support should be optional (alternative result: FTST0009)

--Added alternative result.


[28] QUERIES full-text-composability-queries-results-q2/...

OLD: ... at most 3 words ftand ...
NEW: ... (... at most 3 words) ftand
...no ftand allowed after FTRange, other bugs

--Added parens as proposed.

Continued thanks for helping us clean up the test suite.

Pat
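
If one accepts the position taken for [16]-[18] above (unscored results may come back in any order), a comparison would have to ignore item order. The following Python sketch shows one way to do that, under stated assumptions: the helper names, the dummy `<r>` wrapper, and the choice to treat each top-level element as one result item are all hypothetical and are not taken from the test suite tooling.

```python
from collections import Counter
from xml.etree.ElementTree import canonicalize, fromstring, tostring


def items(result):
    """Split a serialized result into canonicalized top-level elements."""
    root = fromstring("<r>" + result + "</r>")  # hypothetical wrapper
    found = Counter()
    for child in root:
        child.tail = None  # ignore whitespace between top-level items
        found[canonicalize(tostring(child, encoding="unicode"))] += 1
    return found


def same_items_any_order(expected, actual):
    # Multiset comparison: same items, same counts, order ignored.
    return items(expected) == items(actual)


print(same_items_any_order("<a>1</a><b/>", "<b></b>\n<a>1</a>"))
print(same_items_any_order("<a/>", "<a/><a/>"))
```

A multiset rather than a set is used so that a missing or duplicated item still counts as a mismatch.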
Comment 8 Christian Gruen 2008-11-29 18:45:13 UTC
Pat,

thanks once more for the detailed bug fix comments! A small addition to [16]-[18]: Sorry, I wanted to say that I expected other results in these queries. For example, in [16] I would expect the following result (the title "Improving Web Site Usability" is not included in the XQFT version.):

<title shortTitle="Improving Web Site Usability">Improving 
      the Usability of a Web Site Through Expert Reviews and 
      Usability Testing</title>
<title shortTitle="Usability Basics">Usability 
      Basics: How to Plan for and Conduct Usability Tests 
      on Web Site Thereby Improving the Usability of Your 
      Web Site</title>
<step number="1">Clarify and 
            articulate the goal of the usability testing.</step>
<step number="2">Identify tasks which 
            are critical for users to be able to complete 
            successfully.</step>


By the way, we now have released version 5 of our BaseX XQuery processor, and we would be pleased to see the project applied as a test tool for the completion of the XQFT test suite. It should be simple to run, and the frontend allows for an interactive XQuery input and result output. Any comments and questions are more than welcome.

Thank you,

Christian, BaseX Team 
http://www.basex.org


Comment 9 Jim Melton 2008-12-23 01:05:49 UTC
[1] I agree with your argument in http://www.w3.org/Bugs/Public/show_bug.cgi?id=6105#2 and have revised the result in ft-222-examples-results-q1.txt to remove the "Heavy Rain" author element. 

[4] I agree that there is a problem with the test ft-3.3-examples-q1, but I don't agree with your proposed fix.  You proposed:
   OLD: ...at least 2 times]/@number
   NEW: ...at least 2 times]
However, the purpose of the query, as stated in Section 3.3 Cardinality Selection, is "The following expression returns the example book element's number...".  Therefore, the result currently provided as ft-3.3-examples-q1.xml is incorrect -- it should not contain the <book> element, but merely the value of the number attribute.  I have replaced ft-3.3-examples-q1.xml with ft-3.3-examples-q1.txt (containing the single character "1") and updated the catalog accordingly. 

[7] We had previously discovered this error and corrected it in the files we use to generate the Full Text spec.  I have now taken the additional step of copying the corrected file into the test suite space. 

[8] I agree and have made the appropriate change to the result file. 

[9] I agree and have made the appropriate change to the result file. 

[10] I agree and have made the proposed change to the queries. 

[11] I agree to the part of the comment about the use of // and have made that change to the queries.  I agree that, in the queries that use the variable, there is a spelling discrepancy between the variable declaration and its use and have made appropriate changes to the queries to correct that. 

[12] By "too many results", do you mean "several results that differ only in the relative placement of the attributes"?  If so, then (a) eliminating namespaces won't make a difference and (b) I'm pretty sure (but haven't confirmed) that the XML comparison (compare="XML") "normalizes" the attributes to avoid this problem.  If you mean something else, I would be grateful for an explanation. 

Until you respond to my question in [12], I am not marking this bug report RESOLVED.  However, I believe that all other items have been resolved. 
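
As a rough check of the expectation in [12], the Python sketch below canonicalizes two serializations of the `<nt>` element from item [12] that differ only in where the attributes and the namespace declaration are written. It uses the standard library's C14N 2.0 `canonicalize`, which is an assumption standing in for the suite's compare="XML" step; the variable names are made up for the illustration.

```python
from xml.etree.ElementTree import canonicalize

# Same element, attributes and namespace declaration in different order.
first = ('<nt xmlns:xlink="http://www.w3.org/1999/xlink" '
         'def="doc-xquery-FTWords" xlink:type="simple"/>')
second = ('<nt xlink:type="simple" def="doc-xquery-FTWords" '
          'xmlns:xlink="http://www.w3.org/1999/xlink"/>')

# Canonicalization writes attributes in a fixed order, so the two
# canonical forms can be compared as plain strings.
attr_order_ignored = canonicalize(first) == canonicalize(second)
print(attr_order_ignored)
```

If the comparison step behaves like this, a separate alternative result for each attribute ordering is not needed.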
Comment 10 Christian Gruen 2008-12-23 02:18:54 UTC
Hi Jim,

> [12] By "too many results", do you mean "several results that differ only in
> the relative placement of the attributes"?  If so, then (a)eliminating
> namespaces won't make a difference and (b)I'm pretty sure (but haven't
> confirmed) that the XML comparison (compare="XML") "normalizes" the attributes
> to avoid this problem.  If you mean something else, I would be grateful for an
> explanation. 

Yes, sorry for that; XML comparison should solve these problems.

> Until you respond to my question in [12], I am not marking this bug report
> RESOLVED.  However, I believe that all other items have been resolved.

Just to be sure: I think that [16]-[18] and [20]-[22] are still to be handled.


Thanks,

Christian, BaseX Team 
http://www.basex.org
Comment 11 Jim Melton 2008-12-23 22:05:28 UTC
You're right.  I actually meant to say "completed except for [20]-[22]", but got overly enthusiastic.  I'd thought that [16]-[18] had been completed, because http://www.w3.org/Bugs/Public/show_bug.cgi?id=6223#c7 said:

<response>
[16] RESULT axes-queries-results-q2.txt

...I would expect <title shortTitle="Improving Web Site Usability...">...
as first result

--Only the scored queries order results. Other results may be returned in any
order. No change made.


[17] RESULT axes-queries-results-q4.txt

...I would expect <title>Heuristic Evaluation</title>...
as second result.

--Only the scored queries order results. Other results may be returned in any
order. No change made.


[18] RESULT full-text-composability-queries-results-q4.txt

...I would expect <title>Heuristic Evaluation</title> ...
as second result.

--Only the scored queries order results. Other results may be returned in any
order. No change made.
</response>

However, further reading revealed that you stated in http://www.w3.org/Bugs/Public/show_bug.cgi?id=6223#c8 that you're not satisfied with those responses and want us to consider [16]-[18] further.  We will, of course, do so. 

Thanks!
Comment 12 Pat Case 2009-02-03 20:14:57 UTC
Hi Christian,

Sorry to be so long in responding. 

This response concerns the test cases produced from the score examples in the Full Text Use Cases document. While many of these score examples make great use cases, I agree some make lousy test cases. 

The Full Text Task Force however wants all the use cases in the test suite to avoid questions about completeness of the test suite. 

So realizing that scoring is implementation-dependent and that any of these examples could theoretically return almost any result set, I have provided a few alternative results for each such query, but more importantly, have set the compare parameter to Inspect and noted in the test description that any result is acceptable.

I did this to all 6 score use case queries.

I also:

[20] RESULT score-queries-results-q1.txt

OLD: ... </book><book> ...
NEW: ... </book>, <book> ...
no commas between <book> elements. and, more important, as long as
no reference scoring model is defined, it might be better to avoid score
values in the test result, as was done here

--I must have removed commas between book elements previously. I don't see them now.

[21] QUERY score-queries-results-q2.xq

OLD: ... >$result/metadata/title</
NEW: ... >{ $result/metadata/title }</
...missing brackets. next, the order of the results will once more
be implementation dependent, so it might be recommendable to omit
the "order" clause

--Added the missing brackets.

[22] RESULT score-queries-results-q3.txt

...once more different results, dep. on the implementation, and same
problem for nearly all scoring queries. Is something like a default
scoring model planned for XQFT or the test suite?

I am hoping this addresses items 20-22. 

Please let me know if you disagree.

Thanks again for helping us make the test suite better.

Pat
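
The approach described above (several alternative results per query, plus a compare setting of Inspect for the score queries) could be handled by a test driver roughly as follows. This is a simplified Python sketch: the function name `check`, its return values, and the restriction to the three modes "XML", "Text" and "Inspect" are assumptions for illustration and do not describe the actual catalog schema or any existing harness.

```python
from xml.etree.ElementTree import canonicalize


def check(actual, alternatives, compare):
    """Return 'pass', 'fail' or 'inspect' for one test case."""
    if compare == "Inspect":
        # No automatic verdict: a person has to look at the result.
        return "inspect"
    if compare == "Text":
        def norm(s):
            return s
    elif compare == "XML":
        def norm(s):
            # Hypothetical wrapper so multi-element results still parse.
            return canonicalize("<r>" + s + "</r>")
    else:
        raise ValueError("unsupported compare mode: " + compare)
    # The test passes if any one of the alternative results matches.
    matched = any(norm(actual) == norm(alt) for alt in alternatives)
    return "pass" if matched else "fail"


print(check("<a/>", ["<b/>", "<a></a>"], "XML"))
print(check("<a/>", ["<a></a>"], "Text"))
print(check("<book>...</book>", [], "Inspect"))
```

With Inspect, the alternative results serve only as guidance for the person reviewing the output, which matches the note that any result is acceptable for the score queries.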
Comment 13 Pat Case 2009-02-03 21:02:21 UTC
One more Christian,

I am revisiting items 16-18 as requested in comment 8.

[16] RESULT axes-queries-results-q2.txt

Your additional comment:
thanks once more for the detailed bug fix comments! A small addition to
[16]-[18]: Sorry, I wanted to say that I expected other results in these
queries. For example, in [16] I would expect the following result (the title
"Improving Web Site Usability" is not included in the XQFT version.):

...I would expect <title shortTitle="Improving Web Site Usability...">...
as first result
<title shortTitle="Improving Web Site Usability">Improving 
      the Usability of a Web Site Through Expert Reviews and 
      Usability Testing</title>
<title shortTitle="Usability Basics">Usability 
      Basics: How to Plan for and Conduct Usability Tests 
      on Web Site Thereby Improving the Usability of Your 
      Web Site</title>
<step number="1">Clarify and 
            articulate the goal of the usability testing.</step>
<step number="2">Identify tasks which 
            are critical for users to be able to complete 
            successfully.</step>

--Fair enough. I added your expected result as an alternative result for axes-queries-results-q2.txt

[17] RESULT axes-queries-results-q4.txt

...I would expect <title>Heuristic Evaluation</title>...
as second result.

--<title>Heuristic Evaluation</title>'s <p> doesn't contain "usability testing" so even if it has the same following sibling as <title>Usability Testing</title>'s <p>, I don't see it as a possible result. Sure I built these to work one way and that might make it harder for me to see alternatives. Let me know if you disagree.

[18] RESULT full-text-composability-queries-results-q4.txt

...I would expect <title>Heuristic Evaluation</title> ...
as second result.

--<title>Heuristic Evaluation</title> is as I expect you know a chapter title, so I think you are recommending also returning Book 1: Improving, which to my surprise does come close to being a valid result, except I think it is excluded by the ftnot ("montana" ftand "marigold"). Same disclaimer as above. Let me know if you disagree.

Pat
Comment 14 Christian Gruen 2009-02-04 11:53:14 UTC
Dear Pat, 

thank you for the updates on my comments. Yes, I think it is a good idea to use the 'inspect' property for handling scoring queries. - Just a last note to comment #17:

> [17] ...I would expect <title>Heuristic Evaluation</title>...
> as second result.
> 
> --<title>Heuristic Evaluation</title>'s <p> doesn't contain "usability 
> testing" so even if it has the same following sibling as <title>Usability
> Testing</title>'s <p>, I don't see it as a possible result. Sure I built 
> these to work one way and that might make it harder for me to see 
> alternatives. Let me know if you disagree.

I just tested this one again, and indeed it was wrong of me to talk about a "second result". Instead, the query is supposed to return all chapters, and only the one is shown in the result which contains the keywords. Below I've attached the current and the expected result:

=== QUERY =========================================================

for $book in $input-context/books/book
let $chapters := $book//chapter
where $chapters[./p ftcontains "usability 
   testing" and ./p/following-sibling::p ftcontains 
   "information architecture"]
return ($book/metadata/title, $chapters)

--- CURRENT -------------------------------------------------------

<title shortTitle="Improving Web Site Usability">Improving 
the Usability of a Web Site Through Expert Reviews and 
Usability Testing</title>
 <chapter>
     <title>Usability Testing</title>
     <p>Once the problems identified by expert 
     reviews have been corrected, it is time to 
     conduct some tests of the site with your unique 
     audience or audiences by conducting usability 
     testing.</p>
     <p>Users are asked to complete tasks which 
     measure the success of the information 
     architecture and navigational elements of the 
     site.</p>
     <p>Then changes are made to improve service to 
     users.</p>
</chapter>

--- EXPECTED ------------------------------------------------------

<title shortTitle="Improving Web Site Usability">Improving 
      the Usability of a Web Site Through Expert Reviews and 
      Usability Testing</title>
<chapter>
  <title>Heuristic Evaluation</title>
  <p>Expert reviewers critique an interface to 
            determine conformance with recognized 
            usability principles.<footnote>One of the 
            best known lists of heuristics is<citation url="http://www.useit.com/papers/heuristic/heuristic_list.html">Ten Usability 
            Heuristics by Jacob Nielson</citation>. Another 
            is<citation url="http://usability.gov/guidelines/index.html">Research-Based Web 
            Design and Usability Guidelines</citation>
    </footnote>
  </p>
</chapter>
<chapter>
  <title>Cognitive Walk-Through</title>
  <p>Expert reviewers evaluate Web site 
            understandability and ease of learning while 
            performing specified tasks. They walk through 
            the site answering questions such as "Would a 
            user know by looking at the screen how to 
            complete the first step of the task?" and "If 
            the user completed the first step, would the 
            user know what to do next?," with the goal of 
            identifying any obstacles to completing the 
            task and assessing whether the user would 
            cognitively be aware that he was successful in 
            completing a step in the process.</p>
</chapter>
<chapter>
  <title>Usability Testing</title>
  <p>Once the problems identified by expert 
            reviews have been corrected, it is time to 
            conduct some tests of the site with your unique 
            audience or audiences by conducting usability 
            testing.</p>
  <p>Users are asked to complete tasks which 
            measure the success of the information 
            architecture and navigational elements of the 
            site.</p>
  <p>Then changes are made to improve service to 
            users.</p>
</chapter>



Thanks,

Christian, BaseX Team 
http://www.basex.org
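
The reason the query in comment 14 returns every chapter is that the where clause only decides whether a book is kept; it does not narrow the sequence already bound to $chapters. The Python sketch below mimics this with a small made-up document. The sample data, the function names, and the substring test used in place of ftcontains are all assumptions for illustration and do not reproduce real full-text semantics.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature input, not the test suite's source document.
DOC = """<books><book>
  <metadata><title>Book</title></metadata>
  <chapter><title>A</title><p>nothing relevant</p></chapter>
  <chapter><title>B</title><p>about usability
     testing</p><p>the information architecture</p></chapter>
</book></books>"""


def has(p, phrase):
    # Crude stand-in for ftcontains: case-insensitive phrase match
    # on whitespace-normalized text.
    text = " ".join("".join(p.itertext()).lower().split())
    return phrase in text


def matches(chapter):
    ps = chapter.findall("p")
    # ./p/following-sibling::p is every <p> with an earlier <p> sibling.
    return (any(has(p, "usability testing") for p in ps)
            and any(has(p, "information architecture") for p in ps[1:]))


def run(root, filter_sequence):
    out = []
    for book in root.findall("book"):
        chapters = list(book.iter("chapter"))
        if any(matches(c) for c in chapters):  # the where clause
            kept = [c for c in chapters if matches(c)] if filter_sequence else chapters
            out.append([c.findtext("title") for c in kept])
    return out


root = ET.fromstring(DOC)
print(run(root, filter_sequence=False))  # [['A', 'B']]
print(run(root, filter_sequence=True))   # [['B']]
```

The first call corresponds to the query as written (all chapters of a qualifying book); the second corresponds to the earlier expected result, which would need the predicate applied to $chapters in the return clause as well.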

Comment 15 Pat Case 2009-02-04 19:41:30 UTC
Christian,

re: Just a last note to comment #17:

[17] RESULT axes-queries-results-q4.txt

I just tested this one again, and indeed it was wrong of me to talk about a
"second result". Instead, the query is supposed to return all chapters, and
only the one is shown in the result which contains the keywords. 

--Yes. Of course. Fixed. Pat

--I think this is the final item in this bug to be addressed, so I am going to mark the bug fixed. If you agree, please mark the bug closed.

Pat
Comment 16 Christian Gruen 2009-02-04 19:53:02 UTC
..thanks; I've closed this bug (and added some others some days ago.. http://www.w3.org/Bugs/Public/show_bug.cgi?id=6469).

Christian, BaseX Team 
http://www.basex.org