This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 7271 - XQuery Full Text 1.0 grammar is not LL(k) because of "case" and "with" use
Summary: XQuery Full Text 1.0 grammar is not LL(k) because of "case" and "with" use
Status: CLOSED FIXED
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: Full Text 1.0 (show other bugs)
Version: Candidate Recommendation
Hardware: PC Linux
: P2 normal
Target Milestone: ---
Assignee: Jim Melton
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
URL:
Whiteboard:
Keywords:
: 7907 (view as bug list)
Depends on:
Blocks:
 
Reported: 2009-08-14 02:08 UTC by Nikolay Ognyanov
Modified: 2009-10-15 23:24 UTC (History)
2 users (show)

See Also:


Attachments

Description Nikolay Ognyanov 2009-08-14 02:08:12 UTC
Reuse of  "case" in TypeSwitchExpr and FTMatchOption makes the Full Text grammar non LL(k) for any fixed k. Consider for example the following two fragments of a TypeswitchExpr :

  case a return b ftcontains "c" case sensitive 

  case a return b ftcontains "c" 
  case sensitive return ExprSingle


For the grammar to be LL(k), it must be possible to choose correct alternative for FTSelection with k symbols of lookahead when looking at the fragment 

b ftcontains "c" case sensitive

and beyond. At first glance the lookahead "return" solves easily the problem. Besides CaseClause though "return" also appears in FLWORExpr (and TransformExpr if Update is added to the grammar) where a fragment 

b ftcontains "c" case sensitive return ExprSingle

can appear in trailing position. Choice of correct alternative does depend on whether parsing happens within typeswitch or FLWOR. Therefore it is only possible to distinguish between the alternatives in consideration by looking beyond "return ExprSingle" (e.g. for typeswitch default clause). This can not be done with any fixed amount of lookahead due to the recursive nature of ExprSingle.

Analysis of problems caused by "case sensitive" applies also to "case insensitive".

If Full Text is combined in the same grammar with Update then grammar becomes non LL(k) also due to reuse of "with" in ReplaceExpr and FTMatchOption. For example based on syntax only, the following are valid expressions :

replace node a ftcontains "b" with wildcards
replace node a ftcontains "b" with wildcards with something

It is very easy here to decide between the 2 alternatives for parsing of the fragment

a ftcontains "b" with wildcards

by fixed amount of lookahead (assuming that at most 1 FTWildCardOption per FTPrimaryWithOptions is allowed). The logic needed however only applies if the fragment is embedded in ReplaceExpr and fails otherwise. On the other hand the logic which has to be applied outside ReplaceExpr fails within it. The problem with LL(k) then is that no amount of lookahead can resolve whether the fragment is within or out of ReplaceExpr.

Analysys of problems caused by "with wildcards" applies also to "with stemming" and "with thesaurus". It does not apply to "with stop words" because "stop words" by itself is not a valid expression.

Regards
Nikolay
Comment 1 Jim Melton 2009-09-11 02:18:18 UTC
The Full Text Task Force has determined to fix this problem by replacing the word "with" with the word "using", and slightly regularizing the grammar so that "using" is used consistently throughout the various match options. 

I am not yet marking this bug RESOLVED/FIXED because some grammar details are still being hashed out. 
Comment 2 Jim Melton 2009-10-08 22:19:04 UTC
In its teleconference of 2009-09-28, the Full Text Task Force resolved to adopt the solution mentioned in http://www.w3.org/Bugs/Public/show_bug.cgi?id=7271#c1.  As a result of this change to the grammar, I am marking this big RESOLVED/FIXED.  If you agree with the solution, please mark it CLOSED. 
Comment 3 Michael Dyck 2009-10-13 15:05:34 UTC
*** Bug 7907 has been marked as a duplicate of this bug. ***
Comment 4 Nikolay Ognyanov 2009-10-15 13:40:48 UTC
Sorry, I can not agree or disagree with the solution because is not described in sufficient detail here.
Comment 5 Michael Dyck 2009-10-15 18:01:41 UTC
To resolve the conflicts you pointed out, we made the following changes to the "match options" portion of the grammar:

1) In the production for FTMatchOptions, introduce the keyword "using" before the symbol FTMatchOption, i.e.:
    [166] FTMatchOptions ::= ("using" FTMatchOption)+

2) Remove all occurrences of the "with" keyword.

3) Change each occurrence of "without" to "no".
   (Note that this doesn't apply to the "without" in FTIgnoreOption.)

To improve consistency, we also made the following change:

4) In FTStopWordOption, change the "default" syntax from
       default stop words ...
   to
       stop words default ...
Comment 6 Nikolay Ognyanov 2009-10-15 23:24:43 UTC
Thank you for the detailed description. I agree that these grammar changes solve the problems described above.