This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 10236 - key/keyref/unique fields having a complex type with simple content
Status: CLOSED FIXED
Alias: None
Product: XML Schema
Classification: Unclassified
Component: Structures: XSD Part 1
Version: 1.0/1.1 both
Hardware: PC Windows NT
Importance: P2 normal
Target Milestone: ---
Assignee: David Ezell
QA Contact: XML Schema comments list
Keywords: resolved
Reported: 2010-07-25 22:40 UTC by Michael Kay
Modified: 2010-11-03 21:00 UTC

Description Michael Kay 2010-07-25 22:40:38 UTC
In XSD 1.0, it was specified that the element acting as the target of the field expression in key/keyref/unique "must" have a simple type; having a complex type with simple content was not good enough. In XSD 1.1 the specification has changed to allow a complex type with simple content.
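
To make the case concrete, here is a minimal sketch of the kind of schema at issue. The schema and the element names (items, item, code) are hypothetical and are not taken from any test suite. The helper only inspects the schema document with Python's standard library to confirm that the field's target element is declared with a complex type having simple content; it does not perform schema validation.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema: <code> has a complex type with simple content
# (a string value plus a "lang" attribute) and is the target of the
# xs:field in an xs:unique constraint.
SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="items">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="item" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="code">
                <xs:complexType>
                  <xs:simpleContent>
                    <xs:extension base="xs:string">
                      <xs:attribute name="lang" type="xs:string"/>
                    </xs:extension>
                  </xs:simpleContent>
                </xs:complexType>
              </xs:element>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
    <xs:unique name="uniqueCode">
      <xs:selector xpath="item"/>
      <xs:field xpath="code"/>
    </xs:unique>
  </xs:element>
</xs:schema>
"""


def has_complex_type_with_simple_content(schema_text, element_name):
    """Return True if the named element is declared inline with
    xs:complexType/xs:simpleContent (rough check, inline types only)."""
    root = ET.fromstring(schema_text)
    for decl in root.iter(f"{{{XS}}}element"):
        if decl.get("name") == element_name:
            path = f"{{{XS}}}complexType/{{{XS}}}simpleContent"
            return decl.find(path) is not None
    raise KeyError(element_name)
```

Calling has_complex_type_with_simple_content(SCHEMA, "code") reports that the field target is a complex type with simple content, which is the situation a processor has to accept or reject depending on how it reads the rule.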

There appears to be some outstanding admin associated with this change.

(a) there seems to be no bug report that I can find that triggered the change.

(b) the change is not listed in appendix G of the specification

(c) in consequence, the change does not appear in the list of changes that need to be tested by the XSD 1.1 test suite

(d) there is no open bug report against XSD 1.0 that needs to be considered when reviewing which changes to retrofit as XSD 1.0 errata.
Comment 1 C. M. Sperberg-McQueen 2010-08-12 23:41:36 UTC
For the record (to save others the task of reconstructing this history from scratch).

The crucial change seems to be in clause 3 of Validation Rule: Identity-constraint Satisfied, which in 1.0 reads (in part):

  3 For each node in the ·target node set· all of the {fields}, with that node as the context node, 
    evaluate to either an empty node-set or a node-set with exactly one member, which must 
    have a simple type. 

  http://www.w3.org/TR/xmlschema-1/#d0e13819

In the current public draft the corresponding sentence reads

  3 For each node in the ·target node set· all of the {fields}, with that node as the context node, 
    evaluates to a sequence of nodes (as defined in XPath Evaluation (§3.13.4.2)) that only 
    contains ·skipped· nodes and at most one node whose ·governing· type definition is either 
    a simple type definition or a complex type definition with {variety} simple.

  http://www.w3.org/TR/xmlschema11-1/#sec-cvc-identity-constraint
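
The difference between the two wordings can be sketched as a toy model. Everything here is an assumption made for illustration: the Node class and its kind labels are hypothetical stand-ins for a real post-schema-validation infoset, and the first function encodes only the literal reading of the 1.0 sentence, not what 1.0 processors necessarily implemented.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Toy stand-in for a node selected by a field expression."""
    kind: str              # "simple", "complex-simple", or "complex-other"
    skipped: bool = False  # True if the node was skipped during assessment


def clause3_xsd10_literal(nodes):
    """Literal reading of the 1.0 text: an empty node-set, or exactly
    one member, which has a simple type."""
    if not nodes:
        return True
    return len(nodes) == 1 and nodes[0].kind == "simple"


def clause3_xsd11(nodes):
    """1.1 wording: only skipped nodes, plus at most one node whose
    governing type is a simple type or a complex type with simple content."""
    assessed = [n for n in nodes if not n.skipped]
    if len(assessed) > 1:
        return False
    return all(n.kind in ("simple", "complex-simple") for n in assessed)
```

Under this model a single node of kind "complex-simple" fails clause3_xsd10_literal but passes clause3_xsd11, and skipped nodes are ignored only by the 1.1 check.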

Several change proposals are involved in this sentence; others appear nearby but do not seem relevant to the point raised in the bug report.  In chronological order, the changes appear to be:

  - modals (approved 18 February 2005), which changed "must have a simple type" to
    "has a simple type" since there is already a "must" in the introductory prose at 
    the beginning of the list.

  - idc (approved 17 November 2006), which was submitted to the WG in document
    http://www.w3.org/XML/Group/2004/06/xmlschema-1/structures.idc.200611.html
    and which is a fix to bug 1937 and bug 2219, which ask for clarification of issues relating
    to the interaction of identity constraints with skipped subtrees and xsi:nil='true'.

    Diff group idc changed 

        has a simple type

     to 

        either is ·skipped·, or has [nil] true,  or has a ·non-absent· [schema actual value].

  - idc1 (approved 1 December 2006), which contains amendments to the changes 
    adopted 17 November 2006 from an email sent by Sandy Gao on 21 Nov 2006
    http://lists.w3.org/Archives/Member/w3c-xml-schema-ig/2006Nov/0051.html

    In particular, the proposal adopted by the WG changed 

        evaluate to either an empty node-set or a node-set with exactly one member, 
        which either is ·skipped·, or has [nil] true, or has a ·non-absent· [schema actual 
        value]

    to 

        evaluate to a node set that only contains ·skipped· nodes and at most one 
        node whose ·governing· type definition is either a simple type definition 
        or a complex type definition with {variety} simple.

  - b4416-3 (approved 3 August 2007), which changed 

        a node set

    to 

        a sequence of nodes (as defined in XPath Evaluation (§3.12.4))

It seems to me on first examination that the wording in question was introduced by change idc1, but that the substantive change (of including elements whose type is a complex type with simple content) was introduced by change idc and the introduction of the phrase "non-absent [schema actual value]".

The minutes of 17 November and 1 December 2006 are at 

  http://lists.w3.org/Archives/Member/w3c-xml-schema-ig/2006Nov/0047.html
  http://lists.w3.org/Archives/Member/w3c-xml-schema-ig/2006Dec/0006.html

The latter notes in section 6.1.5:

    NM pointed out in email that there is a substantive change to allow both simpletype 
    and complextype with simple content, but is now satisfied with the response that 
    most developers already do it this way.

This appears to refer to the discussion thread started by SG's email cited above.  In that thread, in turn, SG points to yet earlier discussions that have a bearing here.

Schema comment R-206 ("pfiIdConstrFields: Can fields identity nodes with types having simpleContent?" http://www.w3.org/2001/05/xmlschema-rec-comments.html#pfiIdConstrFields, migrated to Bugzilla in 2005 as bug 2198) directly raises the question at issue here.  The issue (raised in February 2003) was discussed in the call of 29 August 2003:

  http://lists.w3.org/Archives/Member/w3c-xml-schema-ig/2003Sep/0002.html

The WG seems to have been of divided mind whether the text of 1.0 was clear and needed a substantive change, or unclear and in need of a clarification.  A test was constructed:

  http://lists.w3.org/Archives/Member/w3c-xml-schema-ig/2003Aug/0035.html

Empirical research showed that all of the then-available processors tested (Tibco, Oracle C, Oracle Java, xsv, Xerces C, Xerces J) interpreted the rules in the spec as covering (i.e. allowing) both simple types and complex types with simple values.  The WG then reached the formal conclusion that the text of 1.0 was not clear, rather than that it was clear and said the wrong thing.  So the official view of the WG (at least, the WG of 2003) is that the change at issue here is not a substantive change but only a clarification.

From the initial description of this issue, I infer that opinions may still be divided on whether the 1.0 text is clear or not.  But in any case, I agree with the implicit suggestion that this should probably be listed explicitly among the changes, if we can find wording we can agree on.
Comment 2 Michael Kay 2010-08-13 08:38:41 UTC
Wow! Great to have this thorough analysis of the history. At the very least, it gives me the confidence to remove the Saxon code that raises an error if the field has a complex type with simple content (which at present is subject to an "if xsdVersion = 1.0" test).

I'm adding the referenced test case to the test suite (as saxonica/Unique/unique001) and making the results unconditional: anyone who fails the tests with a 1.0 processor can always raise a challenge.

Quite how the WG came to decide that the XSD 1.0 spec didn't mean what it plainly says is somewhat beyond me, but I guess it was under pressure at the time.
Comment 3 Michael Kay 2010-08-13 09:17:07 UTC
Some deeper digging revealed also bug #5780 against XSD 1.0, which is still open, and bug #4060 against the test suite. I found these referenced from the Saxon code that handles the question.
Comment 4 David Ezell 2010-08-13 16:19:14 UTC
RESOLUTION: change the change list to call out the change as indicated at the end of comment #1.  This issue is a clarification and should be so listed.  But bug 4060 gives the "official" story that 1.0 was underspecified.
Comment 5 Sandy Gao 2010-11-03 18:08:26 UTC
At the 2010-10-29 telecon, the WG adopted a fix for this bug shown in the following proposal:
 
http://www.w3.org/XML/Group/2004/06/xmlschema-1/structures.omni.20101029.html
  (member-only link)

The fix added a new entry in appendix G as a change since 1.0.

Accordingly, I'm marking this issue as resolved.  Michael Kay, as the
originator of the bug report, you are invited to indicate either that you are
happy with the resolution of the issue (by changing the bug's status to CLOSED)
or else that you are not happy (by reopening it and explaining what's wrong). 
If we don't hear from you in two weeks, the WG will assume you are happy.