This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 5441 - [SER] Text output method and normalization-form
Summary: [SER] Text output method and normalization-form
Status: CLOSED FIXED
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: Serialization 1.0
Version: Recommendation
Hardware: PC Windows NT
Importance: P2 normal
Target Milestone: ---
Assignee: Henry Zongaro
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
Reported: 2008-01-29 17:34 UTC by Tim Mills
Modified: 2009-03-24 17:10 UTC

Description Tim Mills 2008-01-29 17:34:38 UTC
Consider the serialization of:

<element>
  <first>&#x0063;</first>
  <second>&#x0327;</second>
</element>

with parameters:
method 'text'
normalization-form 'nfc'

Let D be the document created by sequence normalization.  It's not clear from "8.1.8 Text Output Method: the normalization-form Parameter" whether normalization is applied:

A. to the string value of D (mentioned in 8 Text Output Method), or
B. to the text nodes in D, as described in 4 Phases of Serialization, step 3, part d (which states that "For the output methods defined in this specification, these phases are carried out sequentially as follows...")

Depending on whether it is A or B, the answer will be different.

Note that in step 3, the specification states that:

"For each text and attribute node, the following rules are applied in sequence."

however, a string value is clearly not a text node.
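
The difference between readings A and B can be sketched in a few lines of Python. This is only an illustration, not anything taken from the specification: it assumes the whitespace-only text nodes between the elements have been stripped (as XQuery boundary-space stripping or xsl:strip-space would do), and the helper names are hypothetical.

```python
import unicodedata

# Text nodes of D in document order, assuming whitespace-only text nodes
# were stripped: U+0063 (c) in <first>, U+0327 (combining cedilla) in <second>.
text_nodes = ["\u0063", "\u0327"]


def normalize_string_value(nodes, form="NFC"):
    """Reading A: normalize the string value of D as a whole."""
    return unicodedata.normalize(form, "".join(nodes))


def normalize_each_text_node(nodes, form="NFC"):
    """Reading B: normalize each text node separately, then concatenate."""
    return "".join(unicodedata.normalize(form, node) for node in nodes)


reading_a = normalize_string_value(text_nodes)    # single code point U+00E7
reading_b = normalize_each_text_node(text_nodes)  # still U+0063 U+0327

print([hex(ord(ch)) for ch in reading_a])
print([hex(ord(ch)) for ch in reading_b])
```

Under reading B the output is not itself in NFC, because the base character and the combining mark only meet after each node has been normalized on its own.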
Comment 1 Henry Zongaro 2008-02-04 19:02:41 UTC
This is a personal response, not on behalf of the XSL and XQuery working groups.

Thank you for your comment.  I agree that the text is ambiguous.  Clearly the only useful result for the purposes of the normalization-form parameter would be one in which the serialized result is in the specified normalization form.

I propose that section 2 of the specification [1] should be revised to add an eighth step to sequence normalization that applies only in the case of the text output method.  In that step a new document node with a single text node child would be created, where the string value of the result of the seventh step would be the string value of the new text node.  Then the first paragraph of section 8 [2] could be revised to remove the words, "by outputting the string value of the document node"

[1] http://www.w3.org/TR/xslt-xquery-serialization/#serdm
[2] http://www.w3.org/TR/xslt-xquery-serialization/#text-output
Comment 2 Henry Zongaro 2008-03-13 19:28:13 UTC
At the XSL WG teleconference of 2008-03-13,[3] Michael Kay proposed that, rather than introduce method-specific behaviour in the sequence normalization process of Section 2, we should modify the "Markup generation" step of the "Phases of Serialization" described in section 4 [4] to produce a new document node with a single text node child in the case of the text output method.  The following change is needed to implement that proposal:

In section 4, in the second item in the outermost list, change "In the case of the text output method, this phase has no effect" to "In the case of the text output method, this phase replaces the single document node produced by sequence normalization with a new document node that has exactly one child that is a text node.  The string value of the new text node is the string value of the document node that was produced by sequence normalization."

In section 8,[2] in the first paragraph change "the document node created by sequence normalization" to "the document node created by the 'markup generation step' of the phases of serialization...."
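
As a rough model of the proposed wording, the sketch below collapses the tree to a single text value in a stand-in for the "markup generation" phase and only then applies Unicode normalization. It assumes Python's xml.etree.ElementTree as the tree model and omits the whitespace between elements; the function names are hypothetical and the real serializer has further phases that are not modelled here.

```python
import unicodedata
import xml.etree.ElementTree as ET


def markup_generation_text(root):
    """Stand-in for the revised phase: return the content of the single
    text node, i.e. the string value of the normalized document."""
    return "".join(root.itertext())


def serialize_as_text(root, normalization_form="NFC"):
    """Collapse to one text node first, then normalize that node."""
    single_text_node = markup_generation_text(root)
    return unicodedata.normalize(normalization_form, single_text_node)


doc = ET.fromstring(
    "<element><first>&#x0063;</first><second>&#x0327;</second></element>"
)
result = serialize_as_text(doc)
print([hex(ord(ch)) for ch in result])
```

Because normalization now sees one text node, the per-text-node rule in step 3 and the string-value reading give the same answer.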

[3] http://lists.w3.org/Archives/Member/w3c-xsl-wg/2008Mar/0015.html
(Members-only link)
[4] http://www.w3.org/TR/xslt-xquery-serialization/#serphases
Comment 3 Henry Zongaro 2008-03-31 13:50:53 UTC
This will be published as Serialization erratum SE.E5.
Comment 4 Henry Zongaro 2008-03-31 13:53:10 UTC
Sorry - I applied comment #3 to the wrong bug report.  The response to this bug has not been accepted.
Comment 5 Henry Zongaro 2009-02-05 17:39:53 UTC
At its teleconference of 2009-02-05, the XSL WG considered and approved the substantive changes proposed in comment #2.  XQuery WG consideration of the bug is still pending.
Comment 6 Henry Zongaro 2009-02-10 18:39:12 UTC
At the joint teleconference of 2009-02-10, the XQuery WG concurred with the decision of the XSL WG.  This will be erratum SE.E8.
Comment 7 Tim Mills 2009-02-11 09:58:56 UTC
Thanks.
Comment 8 Henry Zongaro 2009-03-24 17:10:30 UTC
Michael Kay suggested the following correction in the wording:

In SE.E8, I would change "that" to "which". Specifically: "this phase
replaces the single document node produced by sequence normalization with a
new document node that has exactly one child that is a text node." becomes
"this phase replaces the single document node produced by sequence
normalization with a new document node that has exactly one child, which is
a text node.". The current wording suggests
"count(child::node()[self::text()])=1" rather than "count(child::node())=1
and child::node()[1][self::text()]".

I will apply that change.
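
The two readings in the correction above can be told apart with a small check. The sketch below uses xml.dom.minidom and tests the children of an element rather than of a document node, purely as a stand-in; the function names are hypothetical.

```python
from xml.dom import Node, minidom


def has_one_text_child(parent):
    """count(child::node()[self::text()]) = 1
    True when exactly one of the children is a text node."""
    text_children = [n for n in parent.childNodes if n.nodeType == Node.TEXT_NODE]
    return len(text_children) == 1


def has_only_child_which_is_text(parent):
    """count(child::node()) = 1 and child::node()[1][self::text()]
    True when there is exactly one child and it is a text node."""
    children = parent.childNodes
    return len(children) == 1 and children[0].nodeType == Node.TEXT_NODE


# One text child plus an element child: satisfies only the looser reading.
mixed = minidom.parseString("<d>text<e/></d>").documentElement
# A single text child: satisfies both readings.
only_text = minidom.parseString("<d>text</d>").documentElement

print(has_one_text_child(mixed), has_only_child_which_is_text(mixed))
print(has_one_text_child(only_text), has_only_child_which_is_text(only_text))
```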