Re: [Issue-67] [Action-385] Work on regex for validating regex subset proposal

On 08.04.13 18:28, Jirka Kosek wrote:
> On 8.4.2013 18:15, Felix Sasaki wrote:
>
>> Trying to move this forward:
>> Would this ABNF make sense to you
>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Apr/0027.html
>>
>> ("BMP+escapes" still needs to be defined)
> I'm not sure whether this ABNF does what it should do. For example this
> grammar allows ^ almost anywhere but I think that in most RE engines ^
> should directly follow [ if it's meant as a negation.

Agreed. You could resolve that by removing neg from
char = [neg] BMP+escapes
and changing
allowedCharacters = start 1*range end ["+"]
to
allowedCharacters = start [neg] 1*range end ["+"]
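
As a rough sketch of what the changed production would accept, here is a
small Python check. It is not the ABNF from the linked message: it assumes
start is "[", end is "]", neg is "^", and range is char ["-" char]. Since
"BMP+escapes" is still undefined, the CHAR pattern below is only a
placeholder (any single character other than [ ] ^ - \, or a backslash
followed by one character). The function name is made up for illustration.

```python
import re

# Placeholder for the still-undefined "BMP+escapes" (assumption):
# one character other than [ ] ^ - \, or a backslash plus any one character.
CHAR = r"(?:[^\[\]\^\-\\]|\\.)"

# Assumed shape of the range rule: char ["-" char]
RANGE = rf"{CHAR}(?:-{CHAR})?"

# allowedCharacters = start [neg] 1*range end ["+"]
# with the assumed terminals start = "[", neg = "^", end = "]".
ALLOWED_CHARACTERS = re.compile(rf"\[\^?(?:{RANGE})+\]\+?", re.DOTALL)


def is_valid_allowed_characters(expr: str) -> bool:
    """Return True if expr matches the sketched production as a whole."""
    return ALLOWED_CHARACTERS.fullmatch(expr) is not None
```

With neg moved out of char, an unescaped ^ is only accepted directly after
the opening bracket, so "[^a-z]+" passes while "[a^b]" is rejected.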

>
> Maybe starting with grammar in W3C XML Schema spec and forbidding some
> rules would be easier.

Currently, in the spec at
http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#allowedchars-definition
we reference the XML Schema grammar at
http://www.w3.org/TR/xmlschema-2/#charcter-classes
but not a specific production in the grammar. Which one would you 
choose, e.g.
http://www.w3.org/TR/xmlschema-2/#nt-charClassExpr
?

I'm fine with the "XML Schema disallowing" approach. But ending up with
a means to validate the regex, rather than leaving that to the regex
engine, seems crucial to resolving the issue. From previous discussions,
it seems that pointing people to XML Schema with some additional
information (e.g. "assume that this is not allowed") won't help:
implementers will just use their own (non-XML Schema) engine.

>
>> P.S.: different topic - I had the same issues as Pablo with the
>> validation with the testsuite: I had to use my local copy of jing, the
>> one in github didn't work.
> It works for me. Anyway I synced versions of Jing, so you can give it
> another try.

Thanks, will do.

Best,

Felix

Received on Monday, 8 April 2013 16:48:59 UTC