Definition of whitespace for the TrimTextNodes parameter in C14N 2.0

In C14N 2.0, the TrimTextNodes parameter removes leading and trailing whitespace from text nodes, but we have not precisely defined what counts as whitespace.

 

I propose that we define it the same way the XML specification does: space (#x20), carriage return (#xD), line feed (#xA), or tab (#x9).
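
To make the proposal concrete, here is a minimal sketch of what trimming would look like under this definition. The names XML_WHITESPACE and trim_text_node are hypothetical and only for illustration; they do not come from the C14N 2.0 draft.

```python
# Hypothetical helper, not part of the C14N 2.0 draft: trims only the four
# characters that the XML specification treats as white space.
XML_WHITESPACE = "\x20\x09\x0D\x0A"  # space, tab, carriage return, line feed


def trim_text_node(text: str) -> str:
    """Remove leading and trailing XML whitespace from a text node's value."""
    # Passing the character set explicitly matters: str.strip() with no
    # argument would also remove other characters Python considers
    # whitespace, such as U+00A0 (no-break space).
    return text.strip(XML_WHITESPACE)


if __name__ == "__main__":
    print(repr(trim_text_node(" \t\r\n hello world \n")))
    print(repr(trim_text_node("\u00a0padded\u00a0")))
```

The point of spelling out the four characters is that a language's default notion of whitespace is usually broader than XML's, so two implementations relying on their defaults could produce different canonical output.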

 

Here is the excerpt from the XML specification: http://www.w3.org/TR/xml/#NT-S

 

----------------------------------------------------------------

S (white space) consists of one or more space (#x20) characters, carriage returns, line feeds, or tabs.

 

White Space

[3]    S    ::=    (#x20 | #x9 | #xD | #xA)+ 

 

Note:

 

The presence of #xD in the above production is maintained purely for backward compatibility with the First Edition. As explained in 2.11 End-of-Line Handling, all #xD characters literally present in an XML document are either removed or replaced by #xA characters before any other processing is done. The only way to get a #xD character to match this production is to use a character reference in an entity value literal.

-----------------------------------------------------------
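
The note about #xD is the reason I think carriage return still needs to be in our definition: a text node can contain it when the document uses a character reference. Here is a small sketch using Python's standard xml.etree.ElementTree, on the assumption that its underlying parser applies the end-of-line handling described in the XML specification. The helper name text_of is hypothetical.

```python
import xml.etree.ElementTree as ET


def text_of(xml_source: str) -> str:
    """Return the text content of the root element (hypothetical helper)."""
    return ET.fromstring(xml_source).text or ""


# A literal CR LF pair in the document is normalized by the parser before
# the application sees it.
literal = text_of("<a>x\r\n</a>")

# A carriage return written as a character reference is not subject to
# end-of-line normalization, so it reaches the text node as #xD.
referenced = text_of("<a>x&#xD;</a>")

if __name__ == "__main__":
    print(repr(literal))
    print(repr(referenced))
```

So a trimming step that only looked for space, tab, and line feed would leave the referenced carriage return in place.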

 

Pratik

Received on Tuesday, 26 April 2011 21:46:17 UTC