This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 11859 - "^" is a valid regular expression
Summary: "^" is a valid regular expression
Status: CLOSED FIXED
Alias: None
Product: XML Schema
Classification: Unclassified
Component: Datatypes: XSD Part 2
Version: 1.1 only
Hardware: PC Windows NT
Importance: P2 normal
Target Milestone: ---
Assignee: David Ezell
QA Contact: XML Schema comments list
URL:
Whiteboard:
Keywords: resolved
Depends on:
Blocks:
 
Reported: 2011-01-25 09:12 UTC by Michael Kay
Modified: 2011-03-20 17:36 UTC

Description Michael Kay 2011-01-25 09:12:56 UTC
G.1, in a Note, states:

The string '^' is unambiguous: the grammar recognizes it as a positive character group containing the character '^'.  But the grammatical derivation of the string violates the rule just given, so the string '^' must not be accepted as a regular expression.

The last phrase should read "must not be accepted as a character group". The string "^" is fine as a regular expression; the problem being discussed is the interpretation of "[^]".
Comment 1 C. M. Sperberg-McQueen 2011-03-08 02:10:50 UTC
I think a simpler change would be to replace the initial and final references to '^' with references to '[^]'. 
Hmm.  Maybe not simpler (it requires other rewordings), but perhaps easier to follow in context.

To make it a formal proposal, I'll say:  make the second paragraph of the note read:

    The string '[^]' is unambiguous: the grammar recognizes it as a character 
    class expression containing a positive character group containing just 
    the character '^'.  But the grammatical derivation of the string violates 
    the rule just given, so the string '[^]' must not be accepted as a regular 
    expression.

In addition to changing '^' to '[^]' twice, this changes 'as a positive character group containing' to 'as a character class expression containing a positive character group containing just'.

The change suggested by MK would also work; I suggest an alternative because I'm a little unhappy, now that my attention has been drawn to it, by the note's saying "the grammar" recognizes '^' as a positive character group.  It's true (or true-ish) in context, but if '^' is parsed against the regExp non-terminal (and regExp is after all the natural start symbol, if we are going to refer to "the grammar"), then, no, it won't be recognized as a positive character group.
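
To make the distinction concrete, here is a minimal sketch in Python. It is not the XSD grammar: toy_accepts is a hypothetical helper that models only letters, digits, a top-level '^', and simple bracket expressions without escapes, ranges, or subtraction. It assumes that a '^' immediately after '[' is always read as the negation marker, so that '[^]' is left with no positive character group, and it raises an error for anything it does not model.

```python
def toy_accepts(pattern: str) -> bool:
    """Toy acceptance check for a tiny, hypothetical fragment of the regex grammar."""
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "[":
            close = pattern.find("]", i + 1)
            if close == -1:
                return False  # unterminated character class expression
            body = pattern[i + 1:close]
            # Assumption: a leading '^' is the negation marker, not a member.
            members = body[1:] if body.startswith("^") else body
            if members == "":
                return False  # no positive character group left: '[]' or '[^]'
            if not members.isalnum():
                raise ValueError(f"not modelled by this sketch: {body!r}")
            i = close + 1
        elif ch == "^" or ch.isalnum():
            i += 1  # outside brackets, '^' is treated as an ordinary character
        else:
            raise ValueError(f"not modelled by this sketch: {ch!r}")
    return True


if __name__ == "__main__":
    for candidate in ["^", "[^]", "[^a]"]:
        print(repr(candidate), toy_accepts(candidate))
```

Under these assumptions the sketch accepts '^' and '[^a]' and rejects '[^]', which is the outcome the reworded note is meant to describe.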

This proposal has not been reviewed by the other editors but I'm going to take the liberty of marking the issue needsReview anyway.
Comment 2 Dave Peterson 2011-03-08 15:35:22 UTC
(In reply to comment #1)

> This proposal has not been reviewed by the other editors but I'm going to take
> the liberty of marking the issue needsReview anyway.

+1 (another editor)
Comment 3 David Ezell 2011-03-18 15:14:48 UTC
RESOLVED: adopt the proposal in comment #1.
Comment 4 C. M. Sperberg-McQueen 2011-03-20 14:43:45 UTC
The proposal in comment 1 has now been integrated into the status quo document.

Michael, would you indicate your satisfaction with this result by closing the bug, or your dissatisfaction by reopening it?  Thanks.