This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 3737 - [FT] EBNF snippets confusing
Summary: [FT] EBNF snippets confusing
Status: CLOSED FIXED
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: Full Text 1.0
Version: Working drafts
Hardware: PC Windows XP
Importance: P2 normal
Target Milestone: ---
Assignee: Jim Melton
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-09-18 19:14 UTC by Mary Holstege
Modified: 2007-03-16 19:35 UTC
CC List: 0 users

See Also:


Attachments

Description Mary Holstege 2006-09-18 19:14:03 UTC
There is a general problem with the expression part of the EBNF, which is that
it makes the sections look unconnected to each other and (ironically enough)
obscures the actual syntax of each individual construct.  For example, look at
the EBNF at the top of section 3. FTSelection refers to FTOr. When we look
at the next section, there is an EBNF for FTWords, but what is the relation of
that to FTSelection? Dunno.  What is FTOr? Well, its EBNF talks about FTAnd,
which talks about FTUnaryNot, which talks about FTWordSelection
which... um... is nowhere to be found.  So one is left trawling through the
appendix to figure out the connection between FTWords and FTSelection.
Since FTWords is the workhorse, common case of FTSelection, this is a real
problem in specification clarity.  Fixing this in the grammar will be tricky, I
recognize.

Look in particular at sections 3.1.3/3.1.4/3.1.5 (FTAnd/FTMildNot/FTUnaryNot)

What does FTMildNot have to do with FTAnd? Having FTMildNot reference FTUnaryNot is really really confusing, because you don't (can't) put
a real unary not in that expression.  And in 3.1.5 we go RIGHT off the deep end
and claim that the "!" is optional for FTUnaryNot, which is baffling for people
not deeply familiar with how the grammar is constructed.  For this specific
case I think we are better off making the grammar a teeny bit more complex and
not trying to put FTWords syntactically under FTUnaryNot.  

Some other minor tweaks may be possible to make the snippets more meaningful.
Comment 1 Jochen Doerre 2006-10-13 12:58:15 UTC
I agree that we have a problem in the exposition of the grammar and description of the constructs in Section 3, but I do not think that this is a problem of the EBNF, nor that it could be fixed by "tweaking" the EBNF.
When XQuery-FT adopted the TeXQuery proposal, we had simple grammar rules, and the language constructs could be explained 1-1 with the grammar rules. To adapt to the style of the XQuery grammar we had to change that. Now we have a grammar that is LL(k) parsable and that also reflects the operator precedences in the grammar rules, like XQuery base. Thus a parser can be built automatically from the grammar without further ado.
What we failed to change is the way we describe the language constructs in terms of the grammar rules. For instance, in the Spec we talk about FTAnd as if it were an &&-expression, meaning an FTSelection that is composed of the && operator plus operand FTSelections. But that is not the case. FTAnd has this grammar rule now:

FTAnd ::= FTMildnot ( "&&" FTMildnot )*

Hence, it is an abstract grammar symbol that merely has the potential to expand to an &&-expression, but it need not do so. Thus, when explaining the && operation we should not confuse the &&-expression with the FTAnd grammar symbol, as in 3.1.3:
"FTAnd finds matches that satisfy both of the selection criteria."
The same applies to many other places in Section 3. For instance, all the proximities (FTOrder, FTDistance, FTWindow, FTContent, FTScope) are explained as if the grammar symbol (for instance FTOrderedIndicator) represents a full FTSelection involving that operator. But the grammar symbol in these cases only expands to the operator itself ("ordered" in this case). 
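
To make the distinction between the FTAnd grammar symbol and an actual &&-expression concrete, here is a minimal sketch of a recursive-descent parser for a toy subset of the precedence chain. Only the FTAnd rule is quoted in this thread; the shapes of the other rules, the "||" and "not in" operator spellings, the tuple-based tree, and the `parse` function itself are assumptions made for illustration, not the grammar or any implementation from the specification.

```python
# Toy grammar (only the FTAnd rule is quoted in this thread; the rest is assumed):
#   FTOr       ::= FTAnd ( "||" FTAnd )*
#   FTAnd      ::= FTMildnot ( "&&" FTMildnot )*
#   FTMildnot  ::= FTUnaryNot ( "not" "in" FTUnaryNot )*
#   FTUnaryNot ::= "!"? Word

def parse(tokens):
    """Return (tree, visited); visited lists every grammar symbol entered."""
    pos = 0
    visited = []

    def peek(offset=0):
        i = pos + offset
        return tokens[i] if i < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        return tok

    def ft_or():
        visited.append("FTOr")
        node = ft_and()
        while peek() == "||":
            take()
            node = ("or", node, ft_and())
        return node

    def ft_and():
        visited.append("FTAnd")
        node = ft_mildnot()
        # An "and" node is built only if a "&&" operator is actually present.
        while peek() == "&&":
            take()
            node = ("and", node, ft_mildnot())
        return node

    def ft_mildnot():
        visited.append("FTMildnot")
        node = ft_unary_not()
        while peek() == "not" and peek(1) == "in":
            take()
            take()
            node = ("mildnot", node, ft_unary_not())
        return node

    def ft_unary_not():
        visited.append("FTUnaryNot")
        if peek() == "!":  # the "!" is optional in the rule
            take()
            return ("not", ("word", take()))
        return ("word", take())

    tree = ft_or()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree, visited


tree, visited = parse(["dog"])
print(tree)     # ('word', 'dog')
print(visited)  # ['FTOr', 'FTAnd', 'FTMildnot', 'FTUnaryNot']
```

In this sketch a single word passes through FTOr, FTAnd, FTMildnot and FTUnaryNot without any of those operations taking place, whereas `parse(["dog", "&&", "cat"])` yields an "and" node. The grammar symbol is only a level in the precedence chain; the operator, not the symbol, makes the &&-expression.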

In the XQuery spec there is the same mismatch between grammar symbols and the language constructs you would like to explain, but the editors there do a good job of keeping those apart where necessary, e.g. they talk about an "or-expression" to describe the expression involving the logical "or" operator and do not confuse that with the grammar symbol "OrExpr". Also in that Spec the grammar of expressions is explained by first giving the top-level rules Expr and ExprSingle, but then introducing the different kinds of expressions bottom-up starting with all the PrimaryExpressions. There is also no obvious relation between Expr and PrimaryExpr at first. I don't think this is a problem for our Spec either. It is ok to start out by talking about FTSelections in general and then gradually introduce the different kinds bottom-up. Of course, it is not ideal from a pedagogical point of view that, when explaining the "&&", we use a grammar rule containing an FTMildnot that is totally unmotivated in this place. But that is how the grammar is built, for reasons of encoding the operator precedences. The same is actually true for XQuery base, e.g. when introducing Intersection the grammar rule mentions InstanceOfExpr!

But we need to change our exposition so that it does not confuse the grammar symbols of that particular LL(k) grammar with the general language constructs.

/Jochen


Comment 2 Jim Melton 2007-03-16 19:00:29 UTC
The Task Force has carefully considered this issue and has agreed to make changes in the exposition of the grammar and associated rules.  ACTION FTTF-136-14 was assigned and has been completed. 

This bug is being marked as FIXED in the belief that our action satisfied your comment.  If you disagree, you may reopen the bug.  If you agree, we would be grateful if you would mark this bug CLOSED. 
Comment 3 Mary Holstege 2007-03-16 19:35:12 UTC
Looks good.