Bug 17452 - WebIDL: at some places in the grammar you probably intend mandatory whitespace
Alias: None
Product: WebAppsWG
Classification: Unclassified
Component: WebIDL
Version: unspecified
Hardware: PC All
Importance: P2 major
Target Milestone: ---
Assignee: Cameron McCormack
QA Contact: public-webapps-bugzilla
Depends on:
Reported: 2012-06-09 10:22 UTC by Wolfgang Keller
Modified: 2012-06-22 06:24 UTC

Description Wolfgang Keller 2012-06-09 10:22:58 UTC
To quote from the WebIDL specification:

"Implicitly, the whitespace terminal is allowed between every terminal in the input text being parsed. Such whitespace terminals, which actually encompass both whitespace and comments, are ignored while parsing."

I believe that in some places the grammar needs mandatory whitespace between terminals, because otherwise it would probably be ambiguous.

An example of such a rule is
[25]	ImplementsStatement	→	identifier "implements" identifier ";"

Here you surely want mandatory whitespace between identifier and "implements", because otherwise we could not tell whether

"fooimplementsbarimplementsbluv" stands for
"fooimplementsbar implements bluv" or
"foo implements barimplementsbluv".

You should mark all places in the grammar where whitespace is mandatory.
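
The concern above can be made concrete with a small sketch. It assumes a simplified identifier pattern and a hypothetical helper, naive_readings, neither of which comes from the specification; it only enumerates the ways the run-together text could be read as identifier "implements" identifier if no tokenisation rule settled the split.

```python
import re

# Simplified identifier pattern (an assumption for illustration only).
IDENTIFIER = re.compile(r"[A-Za-z_][0-9A-Za-z_]*")


def naive_readings(text, keyword="implements"):
    """List every (left, right) split around `keyword` where both sides
    would be valid identifiers, ignoring any longest-match rule."""
    readings = []
    start = text.find(keyword)
    while start != -1:
        left = text[:start]
        right = text[start + len(keyword):]
        if IDENTIFIER.fullmatch(left) and IDENTIFIER.fullmatch(right):
            readings.append((left, right))
        # Look for the next occurrence of the keyword.
        start = text.find(keyword, start + 1)
    return readings


print(naive_readings("fooimplementsbarimplementsbluv"))
```

Under these assumptions the helper reports two candidate readings, which is the ambiguity described in this report.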
Comment 1 Cameron McCormack 2012-06-22 06:24:45 UTC
"fooimplementsbarimplementsbluv" must be tokenised as a single identifier token, because of the rule that says to tokenise the longest thing it can:

  When tokenizing, the longest possible match MUST be used. For example, if the
  input text is “a1”, it is tokenized as a single identifier, and not as a
  separate identifier and integer. If the longest possible match could match both
  an identifier and one of the quoted terminal symbols from the grammar, it MUST
  be tokenized as the quoted terminal symbol. Thus, the input text “long” is
  tokenized as the quoted terminal symbol "long" rather than an identifier called

So I don't think there is a need to annotate in the grammar where explicit white space is required.
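
As a rough illustration of the longest-match rule quoted above, here is a minimal tokeniser sketch. The token patterns, the QUOTED_TERMINALS subset and the tokenize function are simplified assumptions made for this example, not the specification's actual token definitions.

```python
import re

# Simplified token patterns (assumptions; not the full WebIDL token set).
TOKEN_PATTERNS = [
    ("integer", re.compile(r"[0-9]+")),
    ("identifier", re.compile(r"[A-Za-z_][0-9A-Za-z_]*")),
    ("whitespace", re.compile(r"[\t\n\r ]+")),
    ("other", re.compile(r"[^\t\n\r 0-9A-Za-z]")),
]

# A small, illustrative subset of the grammar's quoted terminal symbols.
QUOTED_TERMINALS = {"implements", "long", ";"}


def tokenize(text):
    """Tokenise `text`, always taking the longest match at each position."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_kind, best_lexeme = None, ""
        for kind, pattern in TOKEN_PATTERNS:
            match = pattern.match(text, pos)
            if match and len(match.group()) > len(best_lexeme):
                best_kind, best_lexeme = kind, match.group()
        if best_kind is None:
            raise ValueError(f"cannot tokenise input at position {pos}")
        pos += len(best_lexeme)
        if best_kind == "whitespace":
            continue  # whitespace is ignored between terminals
        # If the longest match is also a quoted terminal, the terminal wins.
        if best_lexeme in QUOTED_TERMINALS:
            best_kind = "terminal"
        tokens.append((best_kind, best_lexeme))
    return tokens


print(tokenize("fooimplementsbarimplementsbluv"))
print(tokenize("foo implements bar;"))
```

With these assumed patterns, the run-together text comes out as a single identifier token, while the spaced form yields identifier, "implements", identifier, ";". Whitespace therefore separates tokens through the tokenisation rule alone, without any annotation in the grammar.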