This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 3260 - Use of "-" in regular expressions
Summary: Use of "-" in regular expressions
Status: RESOLVED FIXED
Alias: None
Product: XML Schema
Classification: Unclassified
Component: Datatypes: XSD Part 2
Version: 1.1 only
Hardware: PC Windows XP
Importance: P1 normal
Target Milestone: ---
Assignee: C. M. Sperberg-McQueen
QA Contact: XML Schema comments list
URL:
Whiteboard: cluster: regex
Keywords: resolved
Depends on:
Blocks:
 
Reported: 2006-05-09 11:21 UTC by Michael Kay
Modified: 2008-06-13 19:09 UTC

Description Michael Kay 2006-05-09 11:21:22 UTC
QT approved comment

(General comment) in regular expressions defining permitted values for data types, the character "-" is sometimes preceded by "\" and sometimes not. The schema regex syntax, I believe, does not permit the "\".
Comment 1 Dave Peterson 2006-05-11 02:07:16 UTC
(In reply to comment #0)

> (General comment) in regular expressions defining permitted values for data
> types, the character "-" is sometimes preceded by "\" and sometimes not. The
> schema regex syntax, I believe, does not permit the "\".

There was a time when there was a question of how to reconcile the use of '-' as
a metacharacter in an seRange, and during that time some REs were written with
escaped '-' characters.  They need to be cleaned up, but they are essentially typos
so we did not hold up Last Call to fix them.
Comment 2 C. M. Sperberg-McQueen 2008-05-30 04:05:21 UTC
I believe the premise of the comment is faulty:  there is a single-character
escape for "-" which can be used whenever "-" itself must be denoted,
whether inside a character-class expression or outside it.  Consistency
and clarity need to be weighed, sometimes against each other.  

So I believe the correct resolution of this issue is as WORKSFORME.
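
As a minimal sketch of the point about the single-character escape, the following uses Python's `re` module as a stand-in. This is an assumption made for illustration: Python's dialect is not the XSD regex language, though it treats `\-` the same way in these cases. The pattern names and sample strings are hypothetical and are not taken from the specification.

```python
import re

# Outside a character class, "-" and "\-" denote the same literal hyphen.
UNESCAPED = r"[0-9]{4}-[0-9]{2}-[0-9]{2}"
ESCAPED = r"[0-9]{4}\-[0-9]{2}\-[0-9]{2}"

# Inside a character class, an unescaped "-" between two characters forms a
# range, while "\-" is always a literal hyphen.
RANGE_CLASS = r"[a-z]"      # any lowercase ASCII letter
LITERAL_CLASS = r"[a\-z]"   # exactly one of "a", "-", "z"


def matches(pattern: str, text: str) -> bool:
    """Return True if the whole of `text` matches `pattern`."""
    return re.fullmatch(pattern, text) is not None


if __name__ == "__main__":
    for pattern in (UNESCAPED, ESCAPED):
        print(pattern, matches(pattern, "2008-06-13"))
    for pattern in (RANGE_CLASS, LITERAL_CLASS):
        print(pattern, matches(pattern, "m"), matches(pattern, "-"))
```

Both spellings accept the same strings outside a class, so the choice between them there is a matter of consistency and clarity; inside a class the escape changes the meaning.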
Comment 3 C. M. Sperberg-McQueen 2008-06-13 19:09:37 UTC
The XML Schema WG discussed this issue at its telcon of 13 June 2008,
and agreed that in view of the work we have recently done on 
regular expressions, which has removed some of the variation in the
use of - in the lexical-space regular expressions, this issue should
be closed as FIXED.

We reminded ourselves that bug 5321 on the EBNF notation used to 
define the regex language is still open; MK pointed out that some
confusion arises from the fact that the grammar notation uses some
regex-like constructions, but not always exactly the same regex
constructions as the regex language we are defining.  The editors
expressed a hope that we could reduce the opportunities for confusion
by resolving 5321, which is classed editorial.

Michael, if you would report this resolution to QT and indicate their
acceptance or objection in the usual way, it would be helpful.  Thank you.