This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 1374 - [XQuery] some editorial comments on A.1 EBNF (notation)
Summary: [XQuery] some editorial comments on A.1 EBNF (notation)
Status: CLOSED FIXED
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: XQuery 1.0
Version: Last Call drafts
Hardware: All All
Importance: P2 normal
Target Milestone: ---
Assignee: Scott Boag
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
URL:
Whiteboard: grammar
Keywords:
Depends on:
Blocks:
 
Reported: 2005-05-11 07:26 UTC by Michael Dyck
Modified: 2005-09-29 10:59 UTC

Description Michael Dyck 2005-05-11 07:26:50 UTC
A.1 EBNF (notation)

"The following term definitions ..."
    Change "term definitions" to just "definitions".

--------------------
symbol

"Each rule in the grammar defines one symbol, in the form ..."
    This isn't a good definition for 'symbol'. But I'm not convinced it needs
    one. (Note that the corresponding chunk of the XML spec doesn't bother
    trying to define it.)  Maybe just drop the surrounding "[Definition:", "]".

--------------------
terminal

"A terminal is a single unit of the grammar that can not be further subdivided,
and is specified in the EBNF by a character or characters in quotes, or a
regular expression."
    This isn't a good definition for 'terminal'...

    "that can not be further subdivided"
        In order for this to be a useful component of a definition, you would
        have to answer the question "Subdivided how?".

        Note that anything consisting of two or more characters can at the
        very least be subdivided into its component characters. Stepping up
        a level, many of the terminals enumerated in A.2.1 can be conceptually
        subdivided, e.g.:
            a DoubleLiteral into its mantissa-part and exponent-part;
            a StringLiteral into its delimiter chars and content chars;
            a CommentContents into 'top-level' content and 'nested' content; and
            a QName into its prefix and local-part.

        It's possible you're thinking of "subdivided by insertion of
        whitespace", but that would lead to a circular definition:
            Q: Where can I insert whitespace?
            A: Between the terminals.
            Q: But what are the terminals?
            A: They're the things you can't insert whitespace into.

        So I recommend you don't try to answer the question of "Subdivided
        how?", because:
        (a) I don't think it has a good answer, and
        (b) I don't think 'non-subdividability' is an essential part of
            defining 'terminal'.

    "and is specified in the EBNF by a character or characters in quotes,
    or a regular expression."
        But that's everything. The RHS of every production is a regular
        expression over the symbols of the grammar.

        Again, you could try to refine the existing phrasing to more accurately
        convey what you have in mind, but I think the result would be messy,
        and unnecessary.

    In fact, I'm pretty sure that this section has no need to define or use the
    word "terminal".  (It has no bearing on the "meaning" of the EBNF notation.)
    And if/when you *do* need to define it (in A.2), it's much simpler to just
    enumerate the symbols that you want to be considered terminals.

"The following expressions"
    Maybe change "expressions" to "constructs" or "patterns".
    (Yes, the XML spec calls them 'expressions', but the XML spec doesn't use
    the word for anything else. The XQuery spec certainly does.)

"are used to match strings of one or more characters in a terminal:"
    Delete "in a terminal". These constructs are also used in productions for
    symbols that *aren't* listed as terminals in A.2.1.

#xN
[#xN-#xN]
[#xN#xN#xN]
[^#xN-#xN]
[^#xN#xN#xN]
    The XQuery grammar doesn't use any of these constructs. Delete them.

[^a-z]
    The XQuery grammar doesn't use this either.

"[abc] Enumerations and ranges can be mixed in one set of brackets."
"[^abc] Enumerations and ranges of forbidden values can be mixed in one set of
brackets."
    The XQuery grammar doesn't mix enumerations and ranges.

"matches a literal string matching that given inside the double/single quotes."
    Throughout this section, the usage of "matches" is:
        {construct in the grammar} matches {characters in a query}
    (As such, it means roughly the same as "derives" in its technical sense.)
    However, this sentence adds the usage
        {characters in a query} matching {characters in the grammar}
    which reverses the sense.

    How about this:
        "matches the sequence of characters that appear inside the double/single
        quotes"

"matches a production"
    No, matches any string matched by that production.

"For the purposes of this secificiation"
    Fix typo: "specification"

"the entire unit is defined as a terminal."
    Actually, you should probably delete this sentence.  Rather than saying the
    [http:...] construct is a terminal, it's simpler and presumably equivalent
    to just designate the production's LHS symbol as a terminal.  (According to
    A.2.1, CharRef, QName, NCName, and S have already been handled this way.
    PITarget and Char haven't.)

--------------------
production

"[Definition: A production combines symbols to form more complex patterns.]
The following productions ..."
    Ack! This is a gross misuse of the term "production". Not only does it
    conflict with standard usage, it conflicts with other (correct) usage
    within the very same spec!
        1 Introduction: "The following example production"
        3 Expressions: "the left side of the grammar production"
        A.1 EBNF: "comments on grammar productions"

    If you need a word for these constructs, call them "patterns".

"serve as examples, where A and B represent simple expressions:"
    There's no definition of "simple expressions".  In fact, the word "simple"
    is unjustified, since some of the constructs that stand in for A and B
    can be fairly complicated. (e.g., see
        [21] SchemaImport,
        [95] DirAttributeList,  and
        [140] DoubleLiteral)

So, covering the last two points, you could replace the paragraph with
something like:
    "Patterns (including the above constructs) can be combined with
    grammatical operators to form more complex patterns, matching more
    complex sets of character strings. In the examples that follow,
    A and B represent (sub-)patterns."

"(expression)"
    Change to "(A)" -- you've already set up A as a placeholder, so why not
    use it.

A B
"This operator has higher precedence than alternation; thus A B | C D is
identical to (A B) | (C D)."
    As far as I can tell, constructs such as 'A B | C D' do not occur in the
    XQuery grammar, so it is unnecessary to define the relative precedence
    of juxtaposition and '|'. Delete the sentence.

A+
"thus A+ | B+ is identical to (A+) | (B+)"
    No such constructs occur. Delete the sentence.

A*
"thus A* | B* is identical to (A*) | (B*)"
    No such constructs occur. Delete the sentence.

(angle-bracket groups)
    Since the angle-bracket group is a notation used in the grammar, it should
    be defined here. I suggest putting it right after "(expression)", since
    it has a similar flavour. (In each case, the grouped construct matches the
    same thing as the ungrouped construct.)
Comment 1 Scott Boag 2005-07-08 15:14:13 UTC
> A.1 EBNF (notation)
> 
> "The following term definitions ..."
>     Change "term definitions" to just "definitions".

Done.

> 
> --------------------
> symbol
> 
> "Each rule in the grammar defines one symbol, in the form ..."
>     This isn't a good definition for 'symbol'. But I'm not convinced it needs
>     one. (Note that the corresponding chunk of the XML spec doesn't bother
>     trying to define it.)  Maybe just drop the surrounding "[Definition:", 
> "]".

Done, un-termed.


> 
> --------------------
> terminal
> 
> "A terminal is a single unit of the grammar that can not be further subdivided,
> and is specified in the EBNF by a character or characters in quotes, or a
> regular expression."
>     This isn't a good definition for 'terminal'...
...

Redefined as part of the work to divide the terminals from the main grammar.

> 
> "The following expressions"
>     Maybe change "expressions" to "constructs" or "patterns".
>     (Yes, the XML spec calls them 'expressions', but the XML spec doesn't use
>     the word for anything else. The XQuery spec certainly does.)

Changed to "constructs".

> 
> "are used to match strings of one or more characters in a terminal:"
>     Delete "in a terminal". These constructs are also used in productions for
>     symbols that *aren't* listed as terminals in A.2.1.
> 
> #xN
> [#xN-#xN]
> [#xN#xN#xN]
> [^#xN-#xN]
> [^#xN#xN#xN]
>     The XQuery grammar doesn't use any of these constructs. Delete them.

Removed (or at least hidden via the XML, in case they're needed later).

> 
> [^a-z]
>     The XQuery grammar doesn't use this either.

Removed.

> 
> "[abc] Enumerations and ranges can be mixed in one set of brackets."
> "[^abc] Enumerations and ranges of forbidden values can be mixed in one set of
> brackets."
>     The XQuery grammar doesn't mix enumerations and ranges.

Removed.

> 
> "matches a literal string matching that given inside the double/single quotes."
>     Throughout this section, the usage of "matches" is:
>         {construct in the grammar} matches {characters in a query}
>     (As such, it means roughly the same as "derives" in its technical sense.)
>     However, this sentence adds the usage
>         {characters in a query} matching {characters in the grammar}
>     which reverses the sense.
> 
>     How about this:
>         "matches the sequence of characters that appear inside the double/single
>         quotes"

Text adapted.

> 
> "matches a production"
>     No, matches any string matched by that production.
> 

Fixed.

> "For the purposes of this secificiation"
>     Fix typo: "specification"

Fixed.

> 
> "the entire unit is defined as a terminal."
>     Actually, you should probably delete this sentence.  Rather than saying the
>     [http:...] construct is a terminal, it's simpler and presumably equivalent
>     to just designate the production's LHS symbol as a terminal.  (According to
>     A.2.1, CharRef, QName, NCName, and S have already been handled this way.
>     PITarget and Char haven't.)

Sentence deleted.

> So, covering the last two points, you could replace the paragraph with
> something like:
>     "Patterns (including the above constructs) can be combined with
>     grammatical operators to form more complex patterns, matching more
>     complex sets of character strings. In the examples that follow,
>     A and B represent (sub-)patterns."

Text adapted.

> 
> "(expression)"
>     Change to "(A)" -- you've already set up A as a placeholder, so why not
>     use it.

Done.

> 
> A B
> "This operator has higher precedence than alternation; thus A B | C D is
> identical to (A B) | (C D)."
>     As far as I can tell, constructs such as 'A B | C D' do not occur in the
>     XQuery grammar, so it is unnecessary to define the relative precedence
>     of juxtaposition and '|'. Delete the sentence.

No.  That construct occurs often if you accept that angle-bracket groups have no
definitional significance.

Example:

[38] OrderByClause ::= (<"order" "by"> | <"stable" "order" "by">) OrderSpecList

In the main exposition, where angle brackets are not used:

[38] OrderByClause ::= ("order" "by" | "stable" "order" "by") OrderSpecList

Otherwise, we need to define <A B> in terms of (A B), and in that case, the
parens need to be put in the exposition, which would look pretty ugly and
unnecessary in many places.
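
To make the precedence point concrete, here is a rough sketch (not part of the spec or of the WG's tooling) that encodes the two possible readings of "order" "by" | "stable" "order" "by" as Python regular expressions. Python's `re` is only a stand-in for the EBNF notation; it happens to share the rule that concatenation binds tighter than '|'. The helpers `seq` and `accepts` are hypothetical names, and separating terminals with `\s+` is an assumption made purely for illustration.

```python
import re

# Hypothetical helper: a regex that matches the given quoted terminals in
# order. Separating them with whitespace is an assumption for this sketch.
def seq(*terminals):
    return r"\s+".join(re.escape(t) for t in terminals)

# Reading 1: juxtaposition binds tighter than '|':
#   ("order" "by") | ("stable" "order" "by")
tight = re.compile(
    rf"(?:{seq('order', 'by')})|(?:{seq('stable', 'order', 'by')})"
)

# Reading 2: '|' binds tighter than juxtaposition:
#   "order" ("by" | "stable") "order" "by"
loose = re.compile(
    rf"order\s+(?:by|stable)\s+{seq('order', 'by')}"
)

# Hypothetical helper: does the pattern match the whole text?
def accepts(pattern, text):
    return pattern.fullmatch(text) is not None

for text in ["order by", "stable order by", "order stable order by"]:
    print(text, accepts(tight, text), accepts(loose, text))
```

Only the first reading accepts both "order by" and "stable order by", which is the behaviour the un-bracketed form of [38] relies on.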

> 
> A+
> "thus A+ | B+ is identical to (A+) | (B+)"
>     No such constructs occur. Delete the sentence.
> 
> A*
> "thus A* | B* is identical to (A*) | (B*)"
>     No such constructs occur. Delete the sentence.

Given that I think I've established that we need the precedence for A B, I would
prefer to go ahead and define the precedence for these, even if not used.

> 
> (angle-bracket groups)
>     Since the angle-bracket group is a notation used in the grammar, it should
>     be defined here. I suggest putting it right after "(expression)", since
>     it has a similar flavour. (In each case, the grouped construct matches the
>     same thing as the ungrouped construct.)

Done.
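
As an illustration of the claim that an angle-bracket group matches the same thing as the ungrouped construct, the following sketch drops '<' and '>' from a production's right-hand side while leaving quoted terminals untouched. The function name `strip_angle_groups` is hypothetical, and the sketch assumes terminals are delimited by matching single or double quotes; it is not how the spec's exposition is actually generated.

```python
def strip_angle_groups(production: str) -> str:
    """Remove angle-bracket grouping outside quoted terminals (sketch only)."""
    out = []
    quote = None  # the quote character we are currently inside, if any
    for ch in production:
        if quote:
            out.append(ch)
            if ch == quote:
                quote = None  # closing quote of the current terminal
        elif ch in "\"'":
            quote = ch  # opening quote of a terminal
            out.append(ch)
        elif ch in "<>":
            continue  # grouping bracket outside a terminal: drop it
        else:
            out.append(ch)
    return "".join(out)

grouped = '(<"order" "by"> | <"stable" "order" "by">) OrderSpecList'
print(strip_angle_groups(grouped))
```

Applied to the right-hand side of [38] with angle brackets, this yields the form quoted above from the main exposition, which reads correctly only because juxtaposition binds tighter than '|'.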
Comment 2 Scott Boag 2005-07-22 19:28:04 UTC
A joint meeting of the Query and XSLT working groups considered this comment on 
July 20, 2005.  

The WGs agreed to resolve these editorial issues as listed in my previous comment.

If you do not agree with this resolution, please add a comment explaining why.
If you wish to appeal the WG's decision to the Director, then change the Status
of the record to Reopened. If we do not hear from you in the next two weeks, we
will assume you agree with the WG decision.