ISSUE-63

White-Space Canonicalization of XML Literals

State:
CLOSED
Product:
RDFa
Raised by:
Ben Adida
Opened on:
2007-11-09
Description:
There is a question as to what white-space canonicalization of literals we
should do, as per this thread:

http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2007Nov/0000.html

The XPath normalize-space() function can be used on plain literals, but there
are issues to discuss for XML Literals, as per Ivan's point:

http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2007Oct/0173.html
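
For reference, what normalize-space() would do to a plain literal can be sketched in a few lines of Python. This is only an illustration, not text from the RDFa drafts: the helper name normalize_space is made up here, and the sketch assumes the XPath 1.0 definition (strip leading and trailing white space, collapse each run of space, tab, CR, or LF into a single space).

```python
import re

# XML white space is exactly these four characters; other Unicode
# spaces such as U+00A0 are deliberately left alone.
_XML_WS = re.compile(r"[ \t\r\n]+")


def normalize_space(text: str) -> str:
    """Hypothetical helper mimicking XPath 1.0 normalize-space()."""
    collapsed = _XML_WS.sub(" ", text)  # collapse each run to one space
    return collapsed.strip(" ")         # drop leading/trailing space


if __name__ == "__main__":
    print(repr(normalize_space("  White-Space \n\t Canonicalization  ")))
```

Applying the same function to the serialized form of an XML Literal would also rewrite white space inside markup, which is part of why XML Literals need separate discussion.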
Related emails:
  1. ISSUE-63: White-Space Canonicalization of XML Literals (from dean+cgi@w3.org on 2007-11-09)

Related notes:

2008-02-19: 2008-02-14 [RRS] http://www.w3.org/2008/02/14-rdfa-minutes.html RESOLUTION: RDFa will state that whitespace is preserved and note that some implementations might not behave this way

2008-02-19: Manu tested several Javascript implementations for XML Literal and reported his results in http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2007Nov/0022.html [[ In short - Firefox's implementation allows you to retrieve the original whitespace and line breaks using Javascript. IE7 does not. IE7 normalizes all of the whitespace before inserting it into the DOM, which means that Javascript does not have access to the original text in the XHTML file. ]]

2008-02-19: [RRS] Niklas Lindström followed up Manu's tests in IE7 in more detail and reported in http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2007Nov/0027.html [[ I just did these quick tests in case the capabilities of IE7 is what would put an end to any hope of keeping non-canonicalized XMLLiterals. There seems to be some possibilities, but perhaps not stable enough? So if nothing else, it should be noted that requiring normalized space from RDFa parsers in such a case would require manual processing (DOM walking + normalizing) in some (at least non-XHTML-aware..) client implementations. ]]
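
The "DOM walking + normalizing" that Niklas mentions could look roughly like the sketch below. It is an assumption-laden illustration, written in Python with xml.dom.minidom rather than browser Javascript: the function name normalize_text_nodes and the sample markup are invented here, and collapsing white space per text node (without trimming) is just one possible reading of "normalizing" an XML Literal.

```python
import re
from xml.dom import Node, minidom

_XML_WS = re.compile(r"[ \t\r\n]+")


def normalize_text_nodes(node):
    """Collapse white-space runs in every text node below `node`, in place.

    Text nodes are not trimmed: trimming each node separately would glue
    words to adjacent inline elements.
    """
    for child in node.childNodes:
        if child.nodeType == Node.TEXT_NODE:
            child.data = _XML_WS.sub(" ", child.data)
        else:
            normalize_text_nodes(child)


# Hypothetical element content that would become an XML Literal.
doc = minidom.parseString(
    '<span property="dc:title">  E = mc<sup>2</sup>:\n'
    "   The Most   Urgent Problem  </span>"
)
normalize_text_nodes(doc.documentElement)
result = doc.documentElement.toxml()
print(result)
```

A parser that is only handed an already-normalized DOM (the IE7 case above) cannot undo this step, whereas a parser with access to the original text has to perform it explicitly if normalization is required.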

2008-02-19: [2007-11-15] telecon discussion record in http://www.w3.org/2007/11/15-rdfa-minutes.html#item04 - RESOLVED plain literals are normalized using XPath normalize-space() ... [[ Ben: for XMLLiterals, I'd like us to preserve them if we can ... but if browsers do canonicalize, I think we have to allow it ... I don't expect canonicalization to change how the literal is rendered ... conclusion; we'll normalize plain literals and element content but not normalize @content value except perhaps for special rule about leading newlines depending on what browsers do ]]