This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 6470 - [FT] ordered and/or queries
Status: CLOSED WORKSFORME
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: Full Text 1.0
Version: Candidate Recommendation
Hardware: All All
Importance: P2 normal
Target Milestone: ---
Assignee: Jim Melton
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
URL: http://basex.org
Reported: 2009-01-25 04:20 UTC by Christian Gruen
Modified: 2009-02-09 20:17 UTC

Description Christian Gruen 2009-01-25 04:20:55 UTC
Hi there,

I've come across another interesting query in the test suite, FTOrder-andor1.xq:

  $input-context/books/book[para ftcontains
    ("swift" ftand "persuasion") ftor
    ("ninja" ftand "software") ordered]/title 

As the "ordered" examples in the XQFT Specs are very simple, I would like to know what is supposed to happen here:

 Query: ('A' & 'B') | ('C' & 'D') ordered

  a) ("A" and "B") has to be ordered, OR ("C" and "D") has to be ordered?
  b) ("A" and "C"), ("A" and "D"), ("B" and "C") OR ("B" and "D") has to be ordered?
  c) "A" and "B" and "C" and "D" has to be ordered?

Assuming case a), is it equivalent to the following query?

  d) ("A" & "B" ordered) | ("C" & "D" ordered)

Next, what about this query?..

  e) ("A" & "B" ordered) | ("C" & "D" ordered) ordered

To put it differently, does the "ordered" selection really evaluate AND/OR combinations? I am still trying to find solutions for the following queries - can you help me out?..

  f) ('A' | 'B') ordered
  g) ('A' | 'B') & ('C' | 'D') ordered
  h) ('A' & !'B') ordered


Thanks for all,

Christian, BaseX Team 
http://www.basex.org
Comment 1 Jim Melton 2009-02-06 23:16:48 UTC
In this comment, I'm going to respond to some more of your bug reports:

[1] ft-3.2-examples-q5.xq:

You're right about having only a minimum of queries with scoring, because it's very difficult to predict what the results might be.  The XML Query WG and XSL WG have a sort of policy that requires us to place every example in the spec into the test suite, and this is one of those.  I'm not entirely sure how to resolve this dilemma other than to either (1) change the comparison specified in the catalog to "inspect" or (2) add a second possible result of "no file at all".  The TF will discuss this issue. 


[2] ft-3.3-examples-q1.xq:
You're right. This should be changed in the spec and in the test. 


[7] FTPrimary-FTWords-any-q4b.xq:
You are obviously correct.  Fixed.  (Note that this result had all 9 paragraphs and should have had none, but the result of the next test you cite had no paragraphs and should have had all 9.  I suspect the result files got reversed.) 


[8] FTPrimary-FTWords-anyword-q4b.xq
Again, obviously correct. Fixed.  (Note that this result had no paragraphs and should have had all 9, but the result of the previous test you cite had all 9 paragraphs and should have had none.  I suspect the result files got reversed.)



[9] FTPrimary-FTWords-anyword-q2b_result.xml
Good catch! Fixed. 


[10] FTWords/*.xml
I appreciate the comment, but it would be extremely helpful if you could identify the specific locations. 


[11] FTPrimary-FTWords-phrase-q3a.xq - -q4a.xq
I agree.  I've fixed the problems in each of the tests in slightly different ways. 
I also think that FTPrimary-FTWords-phrase-q4b.xq had the wrong result; I believe that its result file should contain only an empty <paragraphs> element. Do you agree?


[12] FTOr-badexpr1.xq
Good question; the TF will discuss this. 


[13] FTNot-q1.xq
I think I agree with you, but the TF should discuss this.  Of course, the question is 'Does book[para ftcontains ftnot "Ninja"] mean "find all books that have a para element that doesn't contain 'Ninja'" or "Find all books that do not have a para element that contains 'Ninja'"?'  You believe that it's the former, and I think you're probably right. 


[15] FTNot-q2.xq / -q3.xq / -q4.xq / -q5.xq
I agree; I have fixed q2 and q3, but will ask the test author to fix q4 and q5 (because the result will change from .xml to .txt and the comparison from XML to Fragment). 


And, unfortunately, that's all I have time to address today. With luck, other TF members will be able to work on the remaining issues soon. 
Comment 2 Jim Melton 2009-02-06 23:18:31 UTC
Jeez...please ignore comment 1.  I pasted my responses to bug 6469 into the wrong window.  Apologies!!
Comment 3 Michael Dyck 2009-02-07 01:02:26 UTC
[Personal response. All statements should be taken as if prefixed by
"I believe".]

(In reply to comment #0)
> 
> As the "ordered" examples in the XQFT Specs are very simple, I would like
> to know what is supposed to happen here:
> 
>  Query: ('A' & 'B') | ('C' & 'D') ordered
> 
>   a) ("A" and "B") has to be ordered, OR ("C" and "D") has to be ordered?
>   b) ("A" and "C"), ("A" and "D"), ("B" and "C") OR ("B" and "D")
>      has to be ordered?
>   c) "A" and "B" and "C" and "D" has to be ordered?

(a). Note that the 'ordered' filter operates on the matches
generated by the preceding FTOr, and it operates on each match
independently of the others. In this case, each of those matches will
have two StringIncludes, either one for 'A' and one for 'B', or else one
for 'C' and one for 'D'. So there's no way that the filter could enforce
the constraints described in (b) and (c).
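
A minimal Python sketch of that per-match check, under a simplified model
that is an assumption of this sketch, not the spec's data model: a
StringInclude is a (query_pos, token_pos) pair, and a match is a list of
them. The search context 'B A C D' and the name is_ordered are made up for
illustration.

```python
def is_ordered(match):
    """True if, within this one match, token positions ascend with query positions."""
    by_query = sorted(match)                     # sort by position in the query
    token_positions = [tok for _, tok in by_query]
    return token_positions == sorted(token_positions)

# Query: ('A' & 'B') | ('C' & 'D') ordered   (query positions A=1, B=2, C=3, D=4)
# Hypothetical search context: 'B A C D'     (token positions 1..4)
match_ab = [(1, 2), (2, 1)]   # 'A' found at token 2, 'B' at token 1
match_cd = [(3, 3), (4, 4)]   # 'C' found at token 3, 'D' at token 4

# The filter looks at each match on its own; it never compares 'A' with 'C'.
surviving = [m for m in (match_ab, match_cd) if is_ordered(m)]
result = bool(surviving)      # one surviving match is enough
```

Here only the 'C'/'D' match survives, and that is enough for the selection
to succeed, which is reading (a).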

> Assuming case a), is it equivalent to the following query?
> 
>   d) ("A" & "B" ordered) | ("C" & "D" ordered)

Yes.

> Next, what about this query?..
> 
>   e) ("A" & "B" ordered) | ("C" & "D" ordered) ordered

That one is also equivalent.

> To put it differently, does the "ordered" selection really evaluate
> AND/OR combinations? I am still trying to find solutions for the
> following queries - can you help me out?..
> 
>   f) ('A' | 'B') ordered

The addition of the 'ordered' filter has no effect on the result
of the FTSelection.

>   g) ('A' | 'B') & ('C' | 'D') ordered

The search context must contain [an 'A' or a 'B'] that is followed by
[a 'C' or a 'D'].

>   h) ('A' & !'B') ordered

The search context must contain an 'A' that is not followed by a 'B'.
Comment 4 Christian Gruen 2009-02-07 03:19:07 UTC
Hi Michael,

thanks for your easy-to-grasp answers. I have come across some other queries for which I have no definite answers. If you or someone else has some time, I'd be grateful for any answer.

 a) 'A B C' ftcontains ('C' | 'A') & ('B') ordered

The intuitive answer might be: 'A' is followed by 'B', so the query will yield true. But I'm wondering how this query is to be evaluated, as the 'ordered' position filter and the 'ftand' connective are still out of scope if the 'ftor' operator is being processed. The problem might not occur for this query..

 b) 'A B C' ftcontains ('C' & 'B') | ('A' & 'B') ordered

..assuming that the 'ftor' operator evaluates the position filters.

A similar problem arises for query h) from my last post:

>   h) ('A' & !'B') ordered
> The search context must contain an 'A' that is not followed by a 'B'.

Your answer seems intuitive to me, but again I'm wondering what the query evaluation could look like. Let's take the following query:

 c) 'B A' ftcontains ('A' & !'B') ordered

This query should probably return 'true' as 'A' is not followed by a 'B'. But the last query:

 d) 'B A' ftcontains ('A' & !'B')

will return false as the source string contains 'A' and 'B'. 

I hope I'm not stuck too deep in my own logic..

Christian, BaseX Team 
http://www.basex.org


Comment 5 Michael Dyck 2009-02-07 04:19:16 UTC
[personal response again:]

(In reply to comment #4)
> 
>  a) 'A B C' ftcontains ('C' | 'A') & ('B') ordered
>
> The intuitive answer might be: 'A' is followed by 'B', so the query will
> yield true.

Correct, it returns true.

> But I'm wondering how this query is to be evaluated, as the
> 'ordered' position filter and the 'ftand' connective are still
> out of scope if the 'ftor' operator is being processed.

By "are still out of scope", I'm guessing you mean something like "have
yet to be processed". If so, I'm not sure why that's a problem.

- When the 'ftor' operator is processed, it generates two matches, one with
  a StringInclude (SI) for 'C' and one with an SI for 'A'.
  
- When the 'ftand' operator is processed, it also generates two matches:
  one has an SI for 'C' and an SI for 'B'; the other has an SI for 'A' and
  an SI for 'B'.

- When the 'ordered' filter is processed, it rejects the first match
  (because the 'C' and 'B' don't have that order in the search context)
  and accepts the second (because 'A' and 'B' do have that order in the
  search context).
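
The three steps above can be traced with a small Python sketch. It assumes
a simplified model: a match is a list of (query_pos, token_pos) pairs, and
the helper names word, ft_or, ft_and and ordered are hypothetical, not
taken from the spec.

```python
from itertools import product

def word(tokens, term, query_pos):
    """One match per occurrence of term; each match holds a single include."""
    return [[(query_pos, i)] for i, tok in enumerate(tokens, start=1) if tok == term]

def ft_or(left, right):
    """ftor: the matches of both operands, side by side."""
    return left + right

def ft_and(left, right):
    """ftand: every pairing of a left match with a right match."""
    return [l + r for l, r in product(left, right)]

def ordered(matches):
    """Keep the matches whose token positions ascend with query positions."""
    def ok(match):
        toks = [tok for _, tok in sorted(match)]
        return toks == sorted(toks)
    return [m for m in matches if ok(m)]

tokens = "A B C".split()
# ('C' | 'A') & ('B') ordered, with query positions C=1, A=2, B=3
after_or = ft_or(word(tokens, "C", 1), word(tokens, "A", 2))
after_and = ft_and(after_or, word(tokens, "B", 3))
after_ordered = ordered(after_and)
print(len(after_or), len(after_and), len(after_ordered), bool(after_ordered))
# prints: 2 2 1 True
```

The counts follow the description: two matches after 'ftor', two after
'ftand', and one left after 'ordered', so the query returns true.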

> The problem might not occur for this query..
> 
>  b) 'A B C' ftcontains ('C' & 'B') | ('A' & 'B') ordered
> 
> ..assuming that the 'ftor' operator evaluates the position filters.

Returns true.

> A similar problem arises for query h) from my last post:
> 
> >   h) ('A' & !'B') ordered
> > The search context must contain an 'A' that is not followed by a 'B'.
> 
> Your answer seems intuitive to me, but again I'm wondering how the query
> evaluation could look like. Let's take the following query:
> 
>  c) 'B A' ftcontains ('A' & !'B') ordered
>
> This query should probably return 'true' as 'A' is not followed by a 'B'.

Correct, returns true.

> But the last query:
> 
>  d) 'B A' ftcontains ('A' & !'B')
> 
> will return false as the source string contains 'A' and 'B'. 

Correct, returns false.
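
One way to reproduce the answers to c) and d) is sketched below in Python.
Two things are assumptions of this sketch and are not stated in this
thread: a match carries StringIncludes and StringExcludes as
(query_pos, token_pos) pairs, and the 'ordered' filter drops any
StringExclude that is not in query order relative to every StringInclude.
A match counts as satisfied only if no StringExclude remains. The function
names are hypothetical.

```python
def ordered_pair(a, b):
    """True if two (query_pos, token_pos) entries appear in query order."""
    return (a[0] <= b[0] and a[1] <= b[1]) or (a[0] >= b[0] and a[1] >= b[1])

def apply_ordered(match):
    """Filter one match; returns None if its includes are out of order."""
    includes, excludes = match
    if not all(ordered_pair(a, b) for a in includes for b in includes):
        return None
    # Assumption: an exclude survives only if it is in query order
    # relative to every include; otherwise it is dropped.
    kept = [e for e in excludes if all(ordered_pair(e, i) for i in includes)]
    return (includes, kept)

def satisfied(match):
    """A match succeeds when it exists and carries no excludes."""
    return match is not None and not match[1]

# 'B A' ftcontains ('A' & !'B'): 'A' is query pos 1 / token 2,
# 'B' is query pos 2 / token 1 and is recorded as an exclude.
match = ([(1, 2)], [(2, 1)])
without_ordered = satisfied(match)                  # query d)
with_ordered = satisfied(apply_ordered(match))      # query c)

# For comparison, search context 'A B': the 'B' does follow the 'A'.
followed = satisfied(apply_ordered(([(1, 1)], [(2, 2)])))
```

Under these assumptions d) is false because the exclude for 'B' is still
attached, while c) is true because 'ordered' discards that exclude: the
'B' precedes the 'A' and so cannot be "an 'A' followed by a 'B'".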
Comment 6 Christian Gruen 2009-02-07 07:42:06 UTC
..thanks Michael, this has helped me further. I'll have a second look at the StringExclude instances.
Comment 7 Jim Melton 2009-02-09 19:39:17 UTC
Christian, we believe that Michael Dyck's answers have answered your questions and we believe that no changes are required to the spec.  On that basis, I am marking this bug WORKSFORME.  If you are satisfied, please mark it CLOSED. 
Comment 8 Christian Gruen 2009-02-09 20:17:31 UTC
Thank you Jim, I have closed the bug.