This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
When we move to using proper Perl character strings, a number of regular expressions will behave differently. For example, \s will suddenly match U+0085, U+2028 and U+2029 (among other Unicode whitespace) instead of just ASCII whitespace such as [\r\n\t ]; the same applies to \w and other character classes. We need to check where this is desired and where it could lead to problems, and add proper test cases for both.
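A rough sketch of the difference, to make it easier to write the test cases. It assumes Perl 5.14 or later for the /u and /a regex modifiers; the helper name whitespace_semantics is made up for illustration and is not part of the existing code.

```perl
use strict;
use warnings;
use 5.014;    # the /u and /a regex modifiers need Perl 5.14+

# Hypothetical helper (not from the code base): report whether a single
# code point matches \s under Unicode rules (/u) and ASCII-only rules (/a).
sub whitespace_semantics {
    my ($codepoint) = @_;
    my $char = chr($codepoint);
    return {
        unicode => ( $char =~ /\A\s\z/u ? 1 : 0 ),
        ascii   => ( $char =~ /\A\s\z/a ? 1 : 0 ),
    };
}

# Compare a few ASCII whitespace characters with the code points
# mentioned above.
for my $cp ( 0x20, 0x09, 0x85, 0x2028, 0x2029 ) {
    my $r = whitespace_semantics($cp);
    printf "U+%04X unicode=%d ascii=%d\n", $cp, $r->{unicode}, $r->{ascii};
}
```

Where the old byte-oriented behaviour is what we actually want, adding /a to the regular expression (or spelling out the character class explicitly) would be one way to keep it; whether that is appropriate needs to be decided case by case.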
I've added notes in the code where this might be an issue.