This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 10982 - [DM11] RFE: Retrieving the typed-value that a Text Node represents
Status: RESOLVED WONTFIX
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: Data Model 3.0
Version: Working drafts
Hardware: PC Windows NT
Importance: P2 enhancement
Target Milestone: ---
Assignee: Norman Walsh
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
Reported: 2010-10-06 02:19 UTC by Rick Yorgason
Modified: 2011-03-08 21:50 UTC

Description Rick Yorgason 2010-10-06 02:19:49 UTC
For implementations that store the typed-values of Text Nodes (as described in 3.3.1.3 Relationship Between Typed-Value and String-Value) it would be nice if there were a way to retrieve the typed-value directly from the Text Node.

Currently, dm:typed-value returns content as an xs:untypedAtomic, which is fairly useless since content can be retrieved in a more useful format via dm:string-value.

For implementations that *only* store the typed-value, simply using parent->typed-value is not sufficient. In a typical implementation, the parent itself needs to query the Text Node to discover the typed-value, so the implementation is forced to add another accessor to retrieve the actual typed-value, which is as silly as it sounds.

This feature would also simplify serializing from the data model to binary XML formats.

In section 6.7.2 (Text Nodes; Accessors) I request that you replace:

    dm:type-name
        Returns xs:untypedAtomic.

    dm:typed-value
        Returns the value of the *content* property as an xs:untypedAtomic.

with:

    dm:type-name
        If the string represents a typed-value, *may* return the represented type-name; otherwise, returns xs:untypedAtomic.

    dm:typed-value
        If the string represents a typed-value, *may* return the represented typed-value; otherwise, returns the value of the *content* property as an xs:untypedAtomic.
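The requested accessor behaviour can be sketched in a few lines of Python. This is only an illustration of the proposal, not any real implementation: the `TextNode` class and its `stored_type` and `stored_value` fields are hypothetical names standing in for whatever an implementation keeps internally.

```python
from dataclasses import dataclass
from typing import Any, Optional

UNTYPED_ATOMIC = "xs:untypedAtomic"


@dataclass
class TextNode:
    """Hypothetical Text Node that may carry a stored typed-value."""

    content: str                        # the *content* property
    stored_type: Optional[str] = None   # assumption: type name kept by the implementation
    stored_value: Any = None            # assumption: the value in its native form

    def type_name(self) -> str:
        # Proposed rule: *may* return the represented type-name,
        # otherwise fall back to xs:untypedAtomic.
        return self.stored_type if self.stored_type is not None else UNTYPED_ATOMIC

    def typed_value(self) -> Any:
        # Proposed rule: *may* return the represented typed-value,
        # otherwise return the content as an untyped atomic (a plain string here).
        if self.stored_type is not None:
            return self.stored_value
        return self.content


plain = TextNode("10")
typed = TextNode("10", stored_type="xs:integer", stored_value=10)
```

Here `plain` behaves as the current text requires, while `typed` shows the optional behaviour an implementation that stores typed-values could offer.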
Comment 1 Michael Kay 2010-10-06 08:17:21 UTC
This seems a bad idea to me.

There are a lot of queries that do things like

employee/salary/text()

(This is particularly prevalent in the XQuery community, because the style has been encouraged by tutorials, often for poor reasons.)

There are two reasons people might write this:

(a) because they actually want the text nodes, as distinct from the typed value of the salary (for example, because they want to control the impact of comments and processing instructions appearing among the text nodes)

(b) because they are copying blindly from a book and didn't realise that writing employee/salary would have been more appropriate.

Either way, your proposed change would break this query.

Quite apart from compatibility, I think there are good reasons for the spec being the way it is. Only elements and attributes have a type annotation; text nodes do not. Your proposal would raise many questions about what happens, for example, when a text node is copied from one element to another.

(But personally, I think the idea of storing the typed value rather than the string value is usually a bad idea anyway.)
Comment 2 Rick Yorgason 2010-10-06 14:17:13 UTC
It shouldn't break text(), since dm:node-kind would continue to return 'text'.

And as far as what happens when a text node is copied, it actually *simplifies* things.  If the typed-value is stored in the text node and the implementation queries the text node to see its typed-value, then when that text node is copied or moved to another element, the element's dm:typed-value will automatically work.

If, on the other hand, the typed-value is stored in the parent element, then that value has to be recomputed every time a child is added or removed.
Comment 3 Michael Kay 2010-10-06 14:56:08 UTC
>And as far as what happens when a text node is copied, it actually *simplifies* things.

Not so. If you retain the typed value, this is likely to be inconsistent with the schema for the new document into which you have inserted the node, so there's a whole set of rules needed for resolving conflicts.
Comment 4 Rick Yorgason 2010-10-06 16:39:03 UTC
The document model doesn't care if your document is schema-validated, *especially* if you're using direct construction :)

(And we *are* talking about direct construction now, right?  Otherwise, I'm not sure how a mutable operation like moving a node could occur.)

In the example you provided, your new document might be invalid according to the schema, but it would be perfectly valid according to the document model.

Also, you could just as easily construct a document that doesn't validate against your schema by moving around ordinary, untyped text nodes.

Anyway, it sounds less like you disagree with my RFE and more like you disagree with clause 3.3.1.3.

I can certainly understand why this feature would be an inconvenience for some implementations; for instance, if you store the schema-validated type in the element and use that to cast the text node to the appropriate typed-value on request.  NOT having this feature is just as inconvenient for implementers who store the typed-value and generate the string-value on request (which is a perfectly reasonable thing to do, and even the standard says so).

That's why I think this feature should be optional.  If the implementation doesn't have the data available, falling back on untypedAtomic is a perfectly reasonable thing to do.
Comment 5 Andrew Eisenberg 2011-03-08 21:50:18 UTC
The XML Query WG discussed this issue at its Feb./March 2011 F2F meeting.

We reaffirmed the decision that we made long ago, that type names should be associated only with element and attribute nodes.

We made this decision partly because we recognize that a text node is not always the entire content of an element. Consider a document that contains the following element:

<a xsi:type="xs:integer">1<!-- comment -->0</a>

This element has a typed value of 10. It also contains two text nodes separated by a comment node.
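The node structure of that element can be checked with Python's standard `xml.dom.minidom`, as a rough sketch. The `xsi:type` attribute is left out because minidom does no schema validation, so the `int()` cast below is only an assumed stand-in for the xs:integer typing a schema-aware processor would apply.

```python
from xml.dom import minidom

# Same content as the example element, without the xsi:type attribute.
doc = minidom.parseString("<a>1<!-- comment -->0</a>")
a = doc.documentElement

# Split the children by node kind.
text_nodes = [n for n in a.childNodes if n.nodeType == n.TEXT_NODE]
comment_nodes = [n for n in a.childNodes if n.nodeType == n.COMMENT_NODE]

# The element's string value is the concatenation of its text nodes;
# the comment contributes nothing.
string_value = "".join(n.data for n in text_nodes)

# Stand-in for typing the element as xs:integer.
typed_value = int(string_value)

print([n.data for n in text_nodes], typed_value)
```

Neither text node on its own ("1" or "0") represents the element's value; only the element as a whole does.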

I would ask you to close this bug report if you agree with the WG decision.