This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 879 - Possible regex bugs when moving to character strings
Summary: Possible regex bugs when moving to character strings
Status: REOPENED
Alias: None
Product: Validator
Classification: Unclassified
Component: check
Version: 0.7.0
Hardware: Other other
Importance: P2 normal
Target Milestone: ---
Assignee: Terje Bless
QA Contact: qa-dev tracking
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2004-09-13 06:16 UTC by Bj
Modified: 2010-01-19 16:39 UTC

Description Bj 2004-09-13 06:16:25 UTC
When we move to using proper Perl character strings, a number of regular
expressions will behave differently. For example, \s will suddenly match
U+0085, U+2028 and U+2029 instead of just [\r\n\t ], and similarly for \w and
other character classes. We need to check where this is desired and where it
might lead to problems, and add proper test cases for both.
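
A minimal sketch of the kind of test case meant here, not code from the
validator itself. The helper name ws_matches is hypothetical, and the /a
modifier assumes Perl 5.14 or later. It compares plain \s with two ways of
keeping the old ASCII-only behaviour on a character string:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the validator): report which whitespace
# patterns match a single character taken from a character string.
sub ws_matches {
    my ($char) = @_;
    return {
        unicode_s => ($char =~ /\s/        ? 1 : 0),  # plain \s
        ascii_s   => ($char =~ /\s/a       ? 1 : 0),  # /a restricts \s to ASCII (Perl 5.14+)
        explicit  => ($char =~ /[\r\n\t ]/ ? 1 : 0),  # explicit class from the report
    };
}

# U+0020 SPACE, U+2028 LINE SEPARATOR, U+2029 PARAGRAPH SEPARATOR
for my $cp (0x20, 0x2028, 0x2029) {
    my $r = ws_matches(chr $cp);
    printf "U+%04X  \\s=%d  \\s/a=%d  [\\r\\n\\t ]=%d\n",
        $cp, @{$r}{qw(unicode_s ascii_s explicit)};
}
```

Where only ASCII whitespace is wanted, the explicit class or /a keeps the
old behaviour. U+0085 is left out of the loop on purpose: whether \s matches
it can depend on how the string is stored internally and on which rules are
in effect, so it deserves a separate test.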
Comment 1 Bj 2005-08-18 03:47:45 UTC
I've added notes in the code where this might be an issue.