11859 – "^" is a valid regular expression

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Summary: "^" is a valid regular expression

Status:	CLOSED FIXED

Alias:	None

Product:	XML Schema
Classification:	Unclassified
Component:	Datatypes: XSD Part 2
Version:	1.1 only
Hardware:	PC Windows NT

Importance:	P2 normal
Target Milestone:	---
Assignee:	David Ezell
QA Contact:	XML Schema comments list

Keywords:	resolved

Reported:	2011-01-25 09:12 UTC by Michael Kay
Modified:	2011-03-20 17:36 UTC
CC List:	2 users

Description Michael Kay 2011-01-25 09:12:56 UTC

G.1, in a Note, states:

The string '^' is unambiguous: the grammar recognizes it as a positive character group containing the character '^'.  But the grammatical derivation of the string violates the rule just given, so the string '^' must not be accepted as a regular expression.

The last phrase should read "must not be accepted as a character group". The string "^" is fine as a regular expression; the problem being discussed is the interpretation of "[^]".

Comment 1 C. M. Sperberg-McQueen 2011-03-08 02:10:50 UTC

I think a simpler change would be to replace the initial and final references to '^' with references to '[^]'. 
Hmm.  Maybe not simpler (it requires other rewordings), but perhaps easier to follow in context.

To make it a formal proposal, I'll say:  make the second paragraph of the note read:

    The string '[^]' is unambiguous: the grammar recognizes it as a character 
    class expression containing a positive character group containing just 
    the character '^'.  But the grammatical derivation of the string violates 
    the rule just given, so the string '[^]' must not be accepted as a regular 
    expression.

In addition to changing '^' to '[^]' twice, this changes 'as a positive character group containing' to 'as a character class expression containing a positive character group containing just'.

The change suggested by MK would also work; I suggest an alternative because, now that my attention has been drawn to it, I'm a little unhappy with the note's saying that "the grammar" recognizes '^' as a positive character group.  It's true (or true-ish) in context, but if '^' is parsed against the regExp non-terminal (and regExp is, after all, the natural start symbol if we are going to refer to "the grammar"), then no, it won't be recognized as a positive character group.
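
To illustrate the distinction between parsing '^' at the top level and parsing it inside brackets, here is a minimal Python sketch. It is not the XSD grammar: it assumes a deliberately tiny subset (normal characters plus bracket expressions whose members are single literal characters, with no ranges, escapes, subtraction, or quantifiers), and the names METACHARS and accepts are hypothetical, chosen only for this example.

```python
# Hypothetical sketch of a tiny subset of the regular expression grammar.
# Anything outside the subset raises ValueError instead of being judged.
METACHARS = set(".\\?*+{}()|[]")


def accepts(pattern: str) -> bool:
    """Return True if pattern is accepted by this simplified subset."""
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "[":
            close = pattern.find("]", i + 1)
            if close == -1:
                return False  # unterminated bracket expression
            members = pattern[i + 1:close]
            if any(m in METACHARS or m == "-" for m in members):
                raise ValueError("outside the subset handled by this sketch")
            if members.startswith("^"):
                # A leading '^' can only open a negative character group.
                members = members[1:]
            if not members:
                # A group needs at least one member after any leading '^'.
                return False
            i = close + 1
        elif ch in METACHARS:
            raise ValueError("outside the subset handled by this sketch")
        else:
            # Normal character; at the top level this includes '^'.
            i += 1
    return True


print(accepts("^"))     # top-level '^' is an ordinary character
print(accepts("[^]"))   # leading '^' with nothing after it
print(accepts("[^^]"))  # negative group containing '^'
```

Under these assumptions the sketch accepts "^" and "[^^]" and rejects "[^]", which is the behaviour the reworded note is meant to describe.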

This proposal has not been reviewed by the other editors but I'm going to take the liberty of marking the issue needsReview anyway.

Comment 2 Dave Peterson 2011-03-08 15:35:22 UTC

(In reply to comment #1)

> This proposal has not been reviewed by the other editors but I'm going to take
> the liberty of marking the issue needsReview anyway.

+1 (another editor)

Comment 3 David Ezell 2011-03-18 15:14:48 UTC

RESOLVED: adopt the proposal in comment #1.

Comment 4 C. M. Sperberg-McQueen 2011-03-20 14:43:45 UTC

The proposal in comment 1 has now been integrated into the status quo document.

Michael, would you please indicate your satisfaction with this result by closing the bug, or your dissatisfaction by reopening it?  Thanks.