[Bug 7271] New: XQuery Full Text 1.0 grammar is not LL(k) because of "case" and "with" use

http://www.w3.org/Bugs/Public/show_bug.cgi?id=7271

           Summary: XQuery Full Text 1.0 grammar is not LL(k) because of
                    "case" and "with" use
           Product: XPath / XQuery / XSLT
           Version: Candidate Recommendation
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Full Text 1.0
        AssignedTo: jim.melton@acm.org
        ReportedBy: nikolay.ognyanov@gmail.com
         QAContact: public-qt-comments@w3.org


Reuse of  "case" in TypeSwitchExpr and FTMatchOption makes the Full Text
grammar non LL(k) for any fixed k. Consider for example the following two
fragments of a TypeswitchExpr :

  case a return b ftcontains "c" case sensitive 

  case a return b ftcontains "c" 
  case sensitive return ExprSingle


For the grammar to be LL(k), it must be possible to choose correct alternative
for FTSelection with k symbols of lookahead when looking at the fragment 

b ftcontains "c" case sensitive

and beyond. At first glance the lookahead "return" solves easily the problem.
Besides CaseClause though "return" also appears in FLWORExpr (and TransformExpr
if Update is added to the grammar) where a fragment 

b ftcontains "c" case sensitive return ExprSingle

can appear in trailing position. Choice of correct alternative does depend on
whether parsing happens within typeswitch or FLWOR. Therefore it is only
possible to distinguish between the alternatives in consideration by looking
beyond "return ExprSingle" (e.g. for typeswitch default clause). This can not
be done with any fixed amount of lookahead due to the recursive nature of
ExprSingle.

Analysis of problems caused by "case sensitive" applies also to "case
insensitive".

If Full Text is combined in the same grammar with Update then grammar becomes
non LL(k) also due to reuse of "with" in ReplaceExpr and FTMatchOption. For
example based on syntax only, the following are valid expressions :

replace node a ftcontains "b" with wildcards
replace node a ftcontains "b" with wildcards with something

It is very easy here to decide between the 2 alternatives for parsing of the
fragment

a ftcontains "b" with wildcards

by fixed amount of lookahead (assuming that at most 1 FTWildCardOption per
FTPrimaryWithOptions is allowed). The logic needed however only applies if the
fragment is embedded in ReplaceExpr and fails otherwise. On the other hand the
logic which has to be applied outside ReplaceExpr fails within it. The problem
with LL(k) then is that no amount of lookahead can resolve whether the fragment
is within or out of ReplaceExpr.

Analysys of problems caused by "with wildcards" applies also to "with stemming"
and "with thesaurus". It does not apply to "with stop words" because "stop
words" by itself is not a valid expression.

Regards
Nikolay


-- 
Configure bugmail: http://www.w3.org/Bugs/Public/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug.

Received on Friday, 14 August 2009 02:08:22 UTC