This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
The atoms contain some treatment for those characters: https://github.com/SeleniumHQ/selenium/blob/master/javascript/atoms/dom.js#L1208 This change seems to have been made in August 2013, after the algorithm for the WebDriver W3C spec was written. So I think the README.md of the WebDriver project should have a note that anyone who changes the API of the remote end should also file a bug against the WebDriver spec (or, better, file that bug before making the change), so that the WebDriver spec and Selenium don't get out of sync. P.S.: I also haven't noticed any code in the lines near L1208 that removes \f, \v
(In reply to Andrey Botalov from comment #0)
> P.S.: Also I haven't noticed in lines near L1208 code that removes \f, \v

Step 2 -> 1 -> 2nd bullet -> 1 handles this scenario.
https://dvcs.w3.org/hg/webdriver/rev/4e8c789c7f54
There are other whitespace and BiDi characters listed in http://www.unicode.org/Public/6.3.0/ucd/PropList.txt and http://en.wikipedia.org/wiki/Space_(punctuation)#Spaces_in_Unicode. If getElementText() should remove only \u200b, \u200e, \u200f, \v and \f from the string, then I think the spec should also contain an explanation (a note) of what makes those characters special and why other invisible "spaces" shouldn't be removed. I don't know much about Unicode, but IMO these "spaces" also look zero-width: U+180E, U+200C, U+2060, U+061C, etc. I also found this line in the gecko-dev repository, at https://github.com/mozilla/gecko-dev/blob/master/browser/base/content/browser.js#L2205:
> value = value.replace(/[\u00ad\u034f\u061c\u115f-\u1160\u17b4-\u17b5\u180b-\u180d\u200b\u200e-\u200f\u202a-\u202e\u2060-\u206f\u3164\ufe00-\ufe0f\ufeff\uffa0\ufff0-\ufff8]|\ud834[\udd73-\udd7a]|[\udb40-\udb43][\udc00-\udfff]/g, encodeURIComponent);

It seems that the implementation in Firefox is a bit more complicated.
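To make the question concrete, here is a minimal sketch of what removing only the five characters named in this thread could look like. It is neither the Selenium atom nor the spec algorithm; the names `STRIPPED`, `NOT_STRIPPED` and `stripListedCharacters` are made up for illustration, and the sketch assumes the removal is a plain global replace.

```javascript
// Hypothetical helper for illustration only; not taken from dom.js or the spec.
// Matches the five characters discussed here: U+200B, U+200E, U+200F, \v, \f.
const STRIPPED = /[\u200b\u200e\u200f\v\f]/g;

function stripListedCharacters(text) {
  // Remove every occurrence of the listed characters, leave everything else.
  return text.replace(STRIPPED, "");
}

// Invisible code points mentioned above that this pattern does NOT touch:
// U+180E, U+200C, U+2060, U+061C.
const NOT_STRIPPED = ["\u180e", "\u200c", "\u2060", "\u061c"];

console.log(JSON.stringify(stripListedCharacters("a\u200bb\u200ec\u200fd\ve\ff"))); // "abcdef"
for (const ch of NOT_STRIPPED) {
  // Each of these strings keeps its invisible character, so the length stays 3.
  console.log(stripListedCharacters("x" + ch + "y").length); // 3
}
```

With such a pattern, text containing U+200C or U+2060 would come back unchanged, which is the inconsistency a note in the spec would need to explain.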
I would rather have this mimic the current implementation, which is what this bug was initially about. Adding other spaces would need to go in a new bug, with specific use cases, so that we can discuss it. I suggest bringing this issue up on the mailing list once you have filed that bug.