This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 5074 - substitution groups not that strictly limited
Summary: substitution groups not that strictly limited
Status: CLOSED FIXED
Alias: None
Product: XML Schema
Classification: Unclassified
Component: Structures: XSD Part 1
Version: 1.1 only
Hardware: Macintosh All
Importance: P4 normal
Target Milestone: ---
Assignee: C. M. Sperberg-McQueen
QA Contact: XML Schema comments list
URL: http://www.w3.org/TR/2007/WD-xmlschem...
Whiteboard: clarification cluster
Keywords: resolved
Depends on:
Blocks: 5469
Reported: 2007-09-26 00:28 UTC by Xan Gregg
Modified: 2008-02-09 04:49 UTC
CC List: 0 users

See Also:


Attachments
wording changes for 5072, 5074, 5076 (88.06 KB, text/html)
2008-02-08 23:00 UTC, C. M. Sperberg-McQueen

Description Xan Gregg 2007-09-26 00:28:54 UTC
The substitution group definition (below) says member names can vary widely but that member types are "strictly limited". This may be technically true, but it seems to suggest a false sense of security in the use of substitution groups. By default, with no derivation constraints, a substitution group can transitively include wildly different types. That is, a restriction step can remove all optional content and an extension step can then introduce completely different content.

If my analysis is correct, please consider a milder statement and/or a warning about unintentional extensibility.


---------------------
2.2.2.2 Element Substitution Group
All such members must have type definitions which are either the same as the head's type definition or restrictions or extensions of it. Therefore, although the names of elements can vary widely as new namespaces and members of the ·substitution group· are defined, the content of member elements is strictly limited according to the type definition of the ·substitution group· head.
Comment 1 C. M. Sperberg-McQueen 2007-12-27 00:12:02 UTC
Your point seems to me a good one:  the constraint on element content imposed
by the type rules for substitution groups is too loose for it to be helpful
to say

    Therefore, although the names of elements can vary widely as new namespaces 
    and members of the ·substitution group· are defined, the content of member 
    elements is strictly limited according to the type definition of the 
    ·substitution group· head.

I propose to change "is strictly limited according to" to "is constrained by".
Does that seem to you an improvement?
Comment 2 Xan Gregg 2008-01-01 22:23:18 UTC
Sounds better. I realize you don't want to get into all the details such as disallowed substitutions in section 2.

However, I would like to know whether my example is valid. The validators I have handy (XSV and an old Xerces) say it is (schema below), but the first statement of 2.2.2.2 suggests otherwise, unless "such members" restricts the statement to direct members only.

The issue is that eb is in the substitution group headed by ea, but eb's type is neither a restriction nor an extension of ea's type.

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  
  <xs:complexType name="ta">
    <xs:sequence>
      <xs:element name="a" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
  
  <xs:complexType name="t0">
    <xs:complexContent>
      <xs:restriction base="ta">
        <xs:sequence/>
      </xs:restriction>
    </xs:complexContent>
  </xs:complexType>
  
  <xs:complexType name="tb">
    <xs:complexContent>
      <xs:extension base="t0">
        <xs:sequence>
          <xs:element name="b" minOccurs="0"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
  
  <xs:element name="ea" type="ta"/>
  
  <xs:element name="e0" substitutionGroup="ea"/>
  
  <xs:element name="eb" substitutionGroup="e0"/>
  
</xs:schema>
Comment 3 C. M. Sperberg-McQueen 2008-01-03 04:52:33 UTC
I believe the schema you provide is valid, but as written it doesn't test 
whether types of elements in the substitution group can be derived indirectly 
from the head of the substitution group, because types are not given on the 
element declarations for e0 or eb.  They default, therefore, to type ta.  (The 
form in which I tested it differs slightly from the one in comment #2; it's on 
the Web at 
http://www.w3.org/XML/2008/xsdl-exx/sgd1.xsd if you wish to inspect it.)

A modified form of the example (http://www.w3.org/XML/2008/xsdl-exx/sgd2.xsd)
with explicit association of types t0 and tb with elements e0 and eb is also
valid, conforming, and correct, as far as I can tell. 
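
For illustration, a schema along the lines described for sgd2.xsd might look
like the following. This is a sketch reconstructed from the description above
and from the schema in comment #2, not a copy of the file at that URL; the only
intended difference from comment #2 is the explicit type attribute on e0 and eb.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- head type: an optional a -->
  <xs:complexType name="ta">
    <xs:sequence>
      <xs:element name="a" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>

  <!-- restriction step: removes the optional a, leaving empty content -->
  <xs:complexType name="t0">
    <xs:complexContent>
      <xs:restriction base="ta">
        <xs:sequence/>
      </xs:restriction>
    </xs:complexContent>
  </xs:complexType>

  <!-- extension step: adds an optional b to the empty content of t0 -->
  <xs:complexType name="tb">
    <xs:complexContent>
      <xs:extension base="t0">
        <xs:sequence>
          <xs:element name="b" minOccurs="0"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <xs:element name="ea" type="ta"/>

  <!-- explicit types, so that e0 and eb no longer default to ta -->
  <xs:element name="e0" type="t0" substitutionGroup="ea"/>

  <xs:element name="eb" type="tb" substitutionGroup="e0"/>

</xs:schema>
```

In this form tb is reached from ta by one restriction step followed by one
extension step, which is the mixed derivation the original question was about.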

I've tested these two schemas with some instance documents (in the same
directory), with varying results.  

  - libxml / xmllint does not complain about the schema, but it rejects 
    most of the test instances; I wonder if substitution
    groups have not yet been implemented.  
  - MSV rejects schema sgd1.xsd, with the message 'Unimplemented feature: 
    "omitting type attribute in <element> element with substitutionGroup 
    attribute"'.
  - Saxon, Xerces C, and Xerces J all accept the schema and pronounce the 
    test instances valid or invalid in a way consistent with the type 
    definitions.
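
As a hypothetical example of such a test instance (not one of the files in
that directory), and assuming eb is declared with type tb as in the explicitly
typed form of the schema, a document like the following would be expected to
be valid:

```xml
<!-- eb has type tb: the empty content of t0, extended with an optional b -->
<eb>
  <b/>
</eb>
```

By the same reasoning, an eb element containing an a child would be expected
to be invalid, since the restriction step to t0 removed a before tb was
derived by extension.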

I believe these implementations are taking the rule in question to mean
that if the head of a substitution group has declared type T, then each 
element in the substitution group must have a declared type derived from
T (subject to the blocks and exclusions and whatnot).  That is, I think they
are not interpreting the rule as meaning the types of member elements must
be derived either exclusively by a series of restriction steps, or exclusively
by a series of extension steps.

Speaking for myself, I think the implementations' interpretation makes sense;
at least, it matches what I always thought the spec meant to say.  But you
have made me aware that the wording can bear a rather different interpretation;
I think both that the 1.1 text should be revised and that an erratum should
be issued for 1.0.
Comment 4 Michael Kay 2008-01-03 09:44:56 UTC
It's also worth noting that the general tone of section 2.2.2 is explanatory and introductory: section 2 is headed "Conceptual Framework", and I think most readers would not expect to find detailed rules about the validity of element declarations in that section unless they are repeated elsewhere. In this case I think it's fairly clear that the statement in 2.2.2.2 is intended as a summary of the intent of rule 4 of Schema Component Constraint: Element Declaration Properties Correct. 

This kind of thing is a minefield when writing specifications: it's often helpful to give a high-level overview of the rules, but it's difficult to ensure that this isn't mistaken for the real thing.

As for the original point: yes, a member of a substitution group can have wildly different content from that of the substitution group head; but only to the extent that the definition of the head element permits this. 
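
One way a head declaration can narrow what is accepted in its place, as far as I understand the block attribute on element declarations, is sketched below. The type tx and the element ex are hypothetical names introduced only for this example; they do not appear in the schemas discussed above.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:complexType name="ta">
    <xs:sequence>
      <xs:element name="a" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>

  <!-- hypothetical type derived from ta by extension -->
  <xs:complexType name="tx">
    <xs:complexContent>
      <xs:extension base="ta">
        <xs:sequence>
          <xs:element name="x" minOccurs="0"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <!-- block="extension": members whose types involve an extension step
       are not meant to be accepted where ea is expected -->
  <xs:element name="ea" type="ta" block="extension"/>

  <!-- ex can still be declared as a member of the group; the block on
       the head is expected to take effect when instances are validated -->
  <xs:element name="ex" type="tx" substitutionGroup="ea"/>

</xs:schema>
```

Without the block attribute on ea, nothing in the head declaration would prevent ex from standing in for ea, which is the default openness the original report describes.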
Comment 5 Xan Gregg 2008-01-03 13:54:48 UTC
Thanks for correcting my schema -- yours is what I intended. Now I see I was reading things too literally, and that saying a type is a restriction or extension of another type was just meant as shorthand for the full definition of valid derivation.

Though the section is introductory, I believe it is normative. The sentence in question uses "MUST", so I appreciate any tightening you can add.
Comment 6 David Ezell 2008-01-25 21:10:18 UTC
WG instructs editors to make required changes to 2.2.2.2:
In "All such members must have type definitions which are either the same as the head's type definition or restrictions or extensions of it." replace "or restrictions or extensions of it" with "or derived from it."
Comment 7 C. M. Sperberg-McQueen 2008-02-08 02:20:03 UTC
A wording proposal including changes for this issue went to the WG
on 7 February 2008:

  http://www.w3.org/XML/Group/2004/06/xmlschema-1/structures.consent.200801.html#composition

(member-only link).
Comment 8 C. M. Sperberg-McQueen 2008-02-08 23:00:02 UTC
Created attachment 517
wording changes for 5072, 5074, 5076

I'm attaching a diff file showing the changes adopted today by the
Working Group for the issues in bug 5072, bug 5074, and bug 5076.
Comment 9 C. M. Sperberg-McQueen 2008-02-08 23:03:02 UTC
The XML Schema Working Group today accepted the proposal mentioned in
comment #7.  The changes are shown in the attachment just placed in 
this bug report.

With this change, the WG believes we have resolved this issue fully
for XSD 1.1.

Accordingly, I am going to 

   - change the status of this issue to RESOLVED - FIXED
   - clone this issue to track the corresponding problem in 1.0
   - set the status of that new issue accordingly, and add Daniel
     Barclay to the CC list for the new issue, as the originator of 
     this issue

Xan, as the originator of this comment, you should receive from
Bugzilla an email notification of this decision.  Please accept our
thanks for the bug report.

Please let us know if you agree with this resolution of your issue, by
adding a comment to the issue record and changing the Status of the
issue to Closed. Or, if you do not agree with this resolution, please
add a comment explaining why. If you wish to appeal the WG's decision
to the Director, then also change the Status of the record to
Reopened. If you wish to record your dissent, but do not wish to
appeal the decision to the Director, then change the Status of the
record to Closed. If we do not hear from you in the next two weeks, we
will assume you agree with the WG decision.
Comment 10 Xan Gregg 2008-02-09 04:49:49 UTC
Closing. Thanks.