This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 13298 - Clean the grammar of unnecessary trivial productions
Summary: Clean the grammar of unnecessary trivial productions
Status: RESOLVED WONTFIX
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: XQuery 3.0
Version: Working drafts
Hardware: All All
Importance: P2 trivial
Target Milestone: ---
Assignee: Jonathan Robie
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
 
Reported: 2011-07-19 12:55 UTC by Gabriel Petrovay
Modified: 2011-09-06 15:11 UTC

Description Gabriel Petrovay 2011-07-19 12:55:43 UTC
Hi,

The XQuery 3.0 grammar is full of unnecessary productions. One such example (out of many) is:
[150] CompElemConstructor ::= "element" (EQName | ("{" Expr "}")) "{" ContentExpr? "}"
[151] ContentExpr ::= Expr

ContentExpr simply expands to Expr and is not referenced anywhere else in the grammar. The Computed Element description only mentions "the content expression". That alone does not justify an additional production.

Additional unnecessary productions:
- make the grammar too verbose and harder to read
- lead implementers to deviate from the standard because of performance issues when implementing parsers
- create more maintenance work for both the W3C and implementers

This should be like in the already existing case of:
[152] CompAttrConstructor ::= "attribute" (EQName | ("{" Expr "}")) "{" Expr? "}"
where a ContentExpr is not necessary


The only exceptions from this should be the following two because of their importance and extensive usage:
[38]   QueryBody         ::= Expr
[125]  VarName           ::= EQName
[168]  AtomicOrUnionType ::= EQName
[184]  AttributeName     ::= EQName
[185]  ElementName       ::= EQName
[186]  TypeName          ::= EQName
[191]  URILiteral        ::= StringLiteral


The same request to remove the trivial production applies to the following:
[29]   VarValue             ::= ExprSingle
[30]   VarDefaultValue      ::= ExprSingle
-> the description of [28] can easily make the distinction

[35]   FunctionBody         ::= EnclosedExpr

[56]   CurrentItem          ::= EQName
[57]   PreviousItem         ::= EQName
[58]   NextItem             ::= EQName
-> [55] is explicit enough to only have EQName references

[79]   TryTargetExpr        ::= Expr

[151]  ContentExpr          ::= Expr
-> make [150] uniform with [152]

[154]  Prefix               ::= NCName
[155]  PrefixExpr           ::= Expr
[156]  URIExpr              ::= Expr

[179]  AttributeDeclaration ::= AttributeName
-> totally redundant


All this leaves the grammar with at least 12 fewer productions, which is not bad.
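
To illustrate the kind of cleanup requested, here is a minimal Python sketch, not part of the specification or of any existing tool. It treats a production as trivial when its right-hand side is a single nonterminal, and inlines every reference to it. The `GRAMMAR` excerpt, the `KEEP` set and the function names are assumptions made for this example. The sketch also assumes terminals are written in double quotes, and it does not check whether the prose refers to the removed names.

```python
import re

# Hypothetical excerpt: four productions copied from this report.
GRAMMAR = {
    "CompElemConstructor": '"element" (EQName | ("{" Expr "}")) "{" ContentExpr? "}"',
    "ContentExpr": "Expr",
    "CompAttrConstructor": '"attribute" (EQName | ("{" Expr "}")) "{" Expr? "}"',
    "QueryBody": "Expr",
}

# Productions to keep even though they are trivial (the proposed exceptions).
KEEP = frozenset({"QueryBody"})

NAME = r"[A-Za-z_]\w*"


def trivial_productions(grammar, keep=frozenset()):
    """Return {name: target} for productions whose body is one nonterminal."""
    return {
        name: rhs.strip()
        for name, rhs in grammar.items()
        if name not in keep and re.fullmatch(NAME, rhs.strip())
    }


def inline_trivial(grammar, keep=frozenset()):
    """Drop trivial productions and replace references to them by their target."""
    trivial = trivial_productions(grammar, keep)

    def resolve(name):
        # Follow chains such as A ::= B, B ::= C; stop if a cycle is detected.
        seen = set()
        while name in trivial and name not in seen:
            seen.add(name)
            name = trivial[name]
        return name

    def substitute(match):
        token = match.group(0)
        if token.startswith('"'):
            return token  # quoted terminal: leave untouched
        return resolve(token)

    pattern = re.compile(r'"[^"]*"|' + NAME)
    return {
        name: pattern.sub(substitute, rhs)
        for name, rhs in grammar.items()
        if name not in trivial
    }


cleaned = inline_trivial(GRAMMAR, KEEP)
for name, rhs in cleaned.items():
    print(f"{name} ::= {rhs}")
```

With this excerpt, ContentExpr disappears and CompElemConstructor ends in `"{" Expr? "}"`, the same shape as CompAttrConstructor, while QueryBody is kept because it is listed in `KEEP`.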


PS: Or at least make the grammar uniform. Either:
1. Remove all unnecessary productions or
2. Add all missing unnecessary productions
Comment 1 Gabriel Petrovay 2011-07-19 12:59:37 UTC
correction:

...
The only exceptions from this should be the following *seven* because of their
importance and extensive usage:
...
Comment 2 Michael Kay 2011-07-19 13:21:10 UTC
Some of these production rules were introduced to create a "handle" allowing the description of the semantics to refer to constructs in the grammar. Certainly I remember VarValue and VarDefaultValue being introduced expressly for this purpose. Similarly, AtomicOrUnionType is referenced in the description of error XPST0051. There might be some productions that are not referenced in the prose, but they would need to be checked on a case-by-case basis. We would also need to check that the production names are not referenced from another specification in the family, such as the XSLT specification.

At best these additional production names add clarity and readability. At worst, they are harmless.

I have always argued that the grammar should be designed primarily for the benefit of users of the language rather than implementors. There are some who disagree with me on this, but your arguments in my view take too much of an implementor viewpoint. Removing a few redundant productions is very easy compared with the other demands we place on implementors of this grammar. 

(personal response)
Comment 3 Gabriel Petrovay 2011-07-19 13:49:11 UTC
In my opinion either of the two solutions at the end of my initial comment is fine:
1. Remove all unnecessary productions or
2. Add the missing ones

For 1 (which I favour), unnecessary does not mean *all*. Taking a few examples:
[55] is explicit enough on its own and does not need more productions.
[150] should be like [152] (or the other way round if solution 2 is the decision)

But yes, each one has to be checked individually to see whether:
- they make sense
- they are used in other specs
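
The second check could be partly automated. The following Python sketch is only an illustration: the function name, the document titles and the prose snippets in `DOCS` are invented placeholders, not quotations from any specification. It reports which candidate production names occur as whole words in a set of texts.

```python
import re


def referenced_productions(names, documents):
    """Map each production name to the titles of documents that mention it."""
    hits = {name: [] for name in names}
    for title, text in documents.items():
        # Whole-word match, so "Expr" does not count as a hit for "ContentExpr".
        words = set(re.findall(r"[A-Za-z_]\w*", text))
        for name in names:
            if name in words:
                hits[name].append(title)
    return hits


# Placeholder prose, invented for this sketch; not quoted from any spec.
DOCS = {
    "prose-a": "If the VarValue is absent, the VarDefaultValue is used.",
    "prose-b": "The content expression is evaluated first.",
}
CANDIDATES = ["VarValue", "VarDefaultValue", "ContentExpr"]

hits = referenced_productions(CANDIDATES, DOCS)
unreferenced = [name for name in CANDIDATES if not hits[name]]
print(unreferenced)
```

A name that shows up in `unreferenced` would still need a manual look, since prose can refer to a construct without using the production name verbatim.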
Comment 4 Jonathan Robie 2011-07-19 17:02:03 UTC
(In reply to comment #3)
> In my opinion any of the two solutions in the end of my initial comment is
> good:
> 1. Remove all unnecessary productions or
> 2. Add the missing ones

I believe this is largely a matter of editorial taste. Accordingly, I am marking this as editorial.

When the text is difficult to write without a clearly named production, it makes sense to add such a production for the sake of the description of that particular expression. If an expression can be clearly described without adding such a production, there is no need to add one.

Jonathan
Comment 5 Jonathan Robie 2011-09-06 15:10:55 UTC
I am closing this, per comment #4.

Jonathan