This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 2463 - [XDM] String value of a document node
Status: CLOSED FIXED
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: Data Model 1.0
Version: Candidate Recommendation
Hardware: PC Windows XP
Importance: P2 normal
Target Milestone: ---
Assignee: Norman Walsh
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
Reported: 2005-11-04 17:02 UTC by Michael Kay
Modified: 2006-11-16 18:47 UTC

Description Michael Kay 2005-11-04 17:02:27 UTC
When a document is constructed from an Infoset, or from a PSVI, or using a
document node constructor in XQuery, it ends up having a string value that is
the concatenation of all the descendant text nodes in the document. However,
there appears to be no constraint in XDM that this is true of documents
constructed in other ways, for example "synthetically" by an application. It
appears to be possible for such document nodes to have a string value that is
quite unrelated to the textual content of the document. The same is true (in a
more complicated way) of element nodes. 

I'm sure that this was never intended.

XSLT, unlike XQuery, does not say what the string value of a newly constructed
document is. We all assumed that this was specified in XDM, but it seems that it
isn't.

Proposal: add a constraint to XDM.

Comment 1 Norman Walsh 2006-01-20 17:32:57 UTC
I think Mike is right. I propose to add the following constraint to 6.1.1:

4. Regardless of how a document node is constructed, its string value must
always be the concatenation of the string-values of all its Text Node
descendants in document order or, if the document has no such descendants, the
zero-length string.
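
A minimal sketch of what the proposed constraint requires, assuming a DOM tree
stands in for an XDM instance. It uses Python's standard xml.dom.minidom; the
function name string_value is hypothetical and is not taken from any
specification.

```python
from xml.dom import Node, minidom

# Node kinds whose character data contributes to the string value.
_TEXT_KINDS = (Node.TEXT_NODE, Node.CDATA_SECTION_NODE)


def string_value(node):
    """Concatenate the descendant text nodes of `node` in document order.

    Comments and processing instructions are skipped. A node with no text
    descendants yields the zero-length string.
    """
    if node.nodeType in _TEXT_KINDS:
        return node.data
    parts = []
    for child in node.childNodes:
        if child.nodeType == Node.ELEMENT_NODE or child.nodeType in _TEXT_KINDS:
            parts.append(string_value(child))
    return "".join(parts)


doc = minidom.parseString("<a>New <b>York</b><!-- note --> State</a>")
empty_doc = minidom.parseString("<a><b/></a>")
```

Here string_value(doc) gives "New York State" and string_value(empty_doc)
gives the zero-length string, matching the two branches of the constraint.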

It's less clear what we should do in the element case. I'm inclined toward
something less crisp. In 6.2.1:

14. The string-value of an element node must be consistent with its typed value.

That at least prevents some random construction process from creating an element
with a typed value of 3.0 and a string-value of "New York State".
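
One possible reading of "consistent", sketched for a decimal-typed element
only: the string value must parse, under the type's lexical rules, to the
stored typed value. The helper name is_consistent is hypothetical, and
Python's Decimal parser is only an approximation of the xs:decimal lexical
space.

```python
from decimal import Decimal, InvalidOperation


def is_consistent(string_value: str, typed_value: Decimal) -> bool:
    """Return True if `string_value` parses to exactly `typed_value`."""
    try:
        return Decimal(string_value.strip()) == typed_value
    except InvalidOperation:
        # The string is not a decimal literal at all.
        return False
```

Under this reading, a typed value of 3.0 is consistent with the string values
"3.0" and "3" but not with "New York State".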

Comment 2 Michael Kay 2006-01-20 18:01:17 UTC
I think we should probably define the same constraint for element nodes (that
is, the string value is the concatenation of the descendant text nodes). This
seems to cover several cases:

(a1) if the element has simple content and what the implementation actually
stores is the string value, then it must behave as if it had a text node with
that value

(a2) if the element has simple content and what the implementation actually
stores is the text node, then it must behave as if it had a string value with
the same content as the text node

(a3) if the element has simple content and what the implementation actually
stores is the typed value, then it must generate the string value and the text
node from this typed value in the same way - it can't present a string value of
"3" and a text node of "003".

(b) if the element has mixed content, then the string value must be the same as
the typed value

(c) if the element has element-only content, then it has no typed value, but the
string value must be the same as the concatenation of text nodes, as in the case
for document nodes.
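
Case (a3) can be illustrated with a small sketch, assuming an implementation
that stores only the typed value. The class TypedElement and its members are
hypothetical. The point is that the text node content and the string value are
both derived from a single serialization routine, so they cannot disagree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypedElement:
    """An element with simple content that stores only its typed value."""

    name: str
    typed_value: int

    def _lexical(self) -> str:
        # Single source of truth for the lexical form of the typed value.
        return str(self.typed_value)

    @property
    def text_children(self) -> list:
        # The element behaves as if it had one text node child.
        return [self._lexical()]

    @property
    def string_value(self) -> str:
        # Defined as the concatenation of the text node children.
        return "".join(self.text_children)


count = TypedElement(name="count", typed_value=3)
```

With this design, count.string_value and count.text_children[0] are always the
same string, so the mismatch between "3" and "003" described in (a3) cannot
arise.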