This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 3758 - [FS] technical: 4.7.1: losing type information
Summary: [FS] technical: 4.7.1: losing type information
Status: CLOSED FIXED
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: Formal Semantics 1.0
Version: Candidate Recommendation
Hardware: All All
Importance: P2 normal
Target Milestone: ---
Assignee: Michael Dyck
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-09-21 03:57 UTC by Michael Dyck
Modified: 2007-10-01 08:00 UTC

See Also:

Description Michael Dyck 2006-09-21 03:57:37 UTC
4.7.1 / Norm
"In general, we do not want to convert all atomic values to text nodes,
especially when performing static-type analysis, because we lose useful
type information. [For example,
    <date>{ xs:date("2003-03-18") }</date>
should be normalized to
    element date { xs:date("2003-03-18") }
rather than
    element date { text { "2003-03-18" } }
because the latter loses useful type info.] To preserve useful type
information, we distinguish between direct element constructors that
contain one element-content unit and those that contain more than one..."

    If you perform STA on these two CompElemConstructors (using
    4.7.3.1 / STA / rule (1|2)), you find that
        element date { xs:date("2003-03-18") }
    fails at premise 4, because xs:date is not a subtype of
        attribute *, (element | text | comment | processing-instruction)*

    And even if you fix those premises to handle atomic types, you wind
    up with the same type for the two CompElemConstructors, i.e. either
        element date of type xs:anyType
    or
        element date of type xs:untyped
    depending only on statEnv.constructionMode. The type of the content
    expression doesn't have any effect.

    Thus, although converting all atomic values to text nodes may lose
    type information, it's information that will be lost anyway. So there
    doesn't seem to be any reason to handle the n=1 case differently.

    (The word "especially" suggests that there are occasions *other* than
    STA when "we do not want to convert all atomic values to text
    nodes". What did you have in mind there?)

    (Section 4.7.1.1 has a similar split between the n=1 and n>1 cases,
    which is even less justified.)
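
The argument above can be sketched as a toy model. This is only an illustration, not the Formal Semantics rules themselves: the names NODE_KINDS, content_is_allowed, and element_type are hypothetical, the subtype check is reduced to a set lookup, and the mapping of construction mode to type annotation (preserve gives xs:anyType, strip gives xs:untyped) follows XQuery 1.0.

```python
# Toy model of the STA argument; all names here are hypothetical.

# Item kinds accepted by the content model
#   attribute *, (element | text | comment | processing-instruction)*
NODE_KINDS = {"attribute", "element", "text", "comment", "processing-instruction"}


def content_is_allowed(content_type: str) -> bool:
    """Stand-in for premise 4: the content type must be a node kind."""
    return content_type in NODE_KINDS


def element_type(name: str, content_type: str, construction_mode: str) -> str:
    """Static type of `element name { ... }` if atomic content were tolerated.

    content_type is deliberately ignored: the resulting type depends only
    on the construction mode, which is the point being illustrated.
    """
    if construction_mode == "preserve":
        annotation = "xs:anyType"
    elif construction_mode == "strip":
        annotation = "xs:untyped"
    else:
        raise ValueError(f"unknown construction mode: {construction_mode}")
    return f"element {name} of type {annotation}"
```

In this model, content_is_allowed("xs:date") is false while content_is_allowed("text") is true, and element_type returns the same string for both content types under a given construction mode.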
Comment 1 Jerome Simeon 2006-09-26 15:06:31 UTC
I think it is too late in the process to revisit those inference rules, which are pretty complex. I suggest fixing the text by removing the part about 'preserving useful type information'.

- Jerome
Comment 2 Jerome Simeon 2006-09-26 15:11:19 UTC
I think it is too late in the process to revisit those inference rules, which are pretty complex. I suggest fixing the text by removing the part about 'preserving useful type information'.

- Jerome
Comment 3 Michael Dyck 2006-09-27 07:14:54 UTC
How can you choose not to revisit rules that clearly don't work? (Or do you think they do work? In which case, you should refute my example.)

Removing the text about "preserving useful type information" is not a fix. It does not make the STA failure go away, and it leaves the (completely unmotivated) division of normalization into n=1 and n>1 cases.
Comment 4 Jerome Simeon 2006-10-10 19:11:54 UTC
I misread the report and thought the problem was that the two alternatives would yield the same type. The line about the bug was somewhat buried in the middle of the example. It would be useful to state the problem up front in future reports.

About the problem itself, I think we indeed must fix this. I think the most natural proposal is to treat all cases homogeneously, i.e., remove the normalization rule (and the corresponding text) for the case of a single ElementContentUnit.

- Jerome
Comment 5 Michael Dyck 2006-10-17 04:40:15 UTC
(In reply to comment #4)
>
> I think the most natural proposal is to treat all cases homogeneously,
> i.e., remove the normalization rule (and the corresponding text)
> for the case of a single ElementContentUnit.

I agree. And similarly for AttributeContentUnit.
Comment 6 Jerome Simeon 2006-10-17 12:48:10 UTC
Agreed
Comment 7 Jerome Simeon 2006-10-19 19:34:42 UTC
The XML Query and XSLT working groups have decided to adopt the proposed fix to solve that bug, as per comments #4 and #5.

Best,
- Jerome Simeon, on behalf of the XSLT and XML Query WGs
Comment 8 Michael Dyck 2007-05-12 21:08:08 UTC
In the Rec, the corresponding changes for AttributeContentUnit (Comment #5) have not been applied.

Also, 4.7.1 / Norm / para 2 contains a sentence that is no longer true:
    We distinguish between direct element constructors
    that contain only one element-content unit and those
    that contain more than one element-content unit.
It should just be deleted.
Comment 9 Michael Dyck 2007-09-30 19:56:23 UTC
Although Comment #4 spoke of removing the normalization rule for the n=1 case, it's still there in both 4.7.1 and 4.7.1.1.

Moreover, neither section appears to recognize the n=0 case
(e.g., consider <e a=''></e>).

I propose to retain just the general rule, and explicitly say that it covers the n=0 and n=1 cases.
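
A loose sketch of what "just the general rule" means, assuming content units are already normalized to strings. The function name normalize and the output shape are hypothetical and much simpler than the actual FS normalization; the only point is that one rule with no special cases handles n=0, n=1, and n>1 alike.

```python
# Hypothetical illustration: a single rule for any number of content units.
from typing import List


def normalize(name: str, units: List[str]) -> str:
    """Wrap all content units in one parenthesized sequence.

    No branch on len(units): zero units yield the empty sequence `()`,
    one unit yields a one-item sequence, and so on.
    """
    content = ", ".join(units)
    return f"element {name} {{ ({content}) }}"
```

For instance, normalize("e", []) gives `element e { () }`, and a single xs:date unit is carried through as a one-item sequence rather than being routed to a separate n=1 rule.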
Comment 10 Michael Dyck 2007-10-01 08:00:30 UTC
This issue has been entered as FS erratum E011. The changes proposed in Comment #9 have been committed to the source files for the next edition of the FS document. Consequently, I'm marking this issue resolved-FIXED, and CLOSED.