This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 5507 - Symbol spaces need clearer description
Summary: Symbol spaces need clearer description
Status: CLOSED FIXED
Alias: None
Product: XML Schema
Classification: Unclassified
Component: Structures: XSD Part 1
Version: 1.1 only
Hardware: Macintosh All
Importance: P2 minor
Target Milestone: ---
Assignee: C. M. Sperberg-McQueen
QA Contact: XML Schema comments list
URL:
Whiteboard: terminology cluster
Keywords: editorial, resolved
Depends on:
Blocks: 5584
Reported: 2008-02-27 18:22 UTC by C. M. Sperberg-McQueen
Modified: 2009-10-23 21:35 UTC

Description C. M. Sperberg-McQueen 2008-02-27 18:22:42 UTC
The discussion of bug 5157 suggests that there are some aspects of
symbol spaces which could usefully be made clearer in the spec.  Among
them:

(1) Symbol spaces contain expanded names (namespace name + local name
pairs), not just local names.

(2) Symbol spaces are used to enforce uniqueness constraints: no two
components can have the same name within the same symbol space, so
each name within a symbol space uniquely denotes a component.

(3) Symbol spaces are also relevant to QName resolution, although not
mentioned explicitly in the rules for it.  The resolution of QName
references to components involves looking for a component with the
given expanded name, within the appropriate symbol space.

(4) Symbol spaces are NOT used, however, in attributing element or
attribute instances to particular particles and declarations in the
schema.  An element with a given expanded name will match (other
conditions being propitious) any element declaration with that
expanded name in the content model; it does not matter whether that
element declaration is global or local.  Similarly for
attributes.
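
Points (1) through (3) can be illustrated with a minimal Python sketch. It is not taken from the spec: the `SymbolSpace` class, its method names, and the example.com namespaces are all hypothetical, chosen only to show a symbol space keyed by expanded names rather than by local names.

```python
from typing import Dict, Optional, Tuple

# An expanded name is a (namespace name, local name) pair; None means no namespace.
ExpandedName = Tuple[Optional[str], str]


class SymbolSpace:
    """Hypothetical model of one symbol space, e.g. for top-level element declarations."""

    def __init__(self, kind: str) -> None:
        self.kind = kind
        self._components: Dict[ExpandedName, object] = {}

    def add(self, name: ExpandedName, component: object) -> None:
        # Point (2): a name denotes at most one component per symbol space.
        if name in self._components:
            raise ValueError(f"duplicate {self.kind} name {name!r}")
        self._components[name] = component

    def resolve(self, name: ExpandedName) -> object:
        # Point (3): QName resolution looks up the expanded name
        # in the symbol space appropriate to the kind of reference.
        return self._components[name]


# Point (1): one symbol space per kind of component, spanning all namespaces.
elements = SymbolSpace("element declaration")
types = SymbolSpace("type definition")
elements.add(("http://example.com/ns1", "a"), "element a in ns1")
elements.add(("http://example.com/ns2", "a"), "element a in ns2")
types.add(("http://example.com/ns1", "a"), "type a in ns1")
```

In this model the same local name can appear under two namespaces, and the same expanded name can appear in two different symbol spaces, but adding a second element declaration with an expanded name already present raises an error.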

Proposition (1) appears to contradict the current text of section 2.5,
which does its best to suggest that symbol spaces are somehow situated
within target namespaces.  It says, for example:

    There is a single distinct symbol space within a given target
    namespace for each kind of definition and declaration component
    ...

But this description contradicts the usage of the term 'symbol space'
in the spec.  Section 3.1.1, for example, refers to

    ... equality of names (including target namespaces) within symbol
    spaces.
  
If equality of names, including equality of target namespaces, can be
tested for 'within' symbol spaces, then there cannot be distinct
symbol spaces for top-level elements in distinct target namespaces.

Another example:  section 3.11.1 says

    Each constraint declaration has a name, which exists in a single
    symbol space for constraints. 

If there is a single symbol space for names of identity constraints, 
then it cannot be located within any single target namespace.

So: first, section 2.5 needs to be corrected to agree with the spec's
actual usage, and second, if possible the relation of symbol spaces to
various name matching tasks (QName resolution, instance attribution,
etc.) should be made clearer.

This problem exists both in 1.1 and in 1.0.
Comment 1 C. M. Sperberg-McQueen 2008-03-21 18:04:17 UTC
The XML Schema WG agreed today (21 March 2008) that this needs to be
fixed in both 1.0 (see bug 5584) and 1.1 (this bug).  To the extent that the
distinction matters, we believe that this is a clarification, not a 
substantive change, both in 1.0 and in 1.1.
Comment 2 C. M. Sperberg-McQueen 2008-05-27 13:31:38 UTC
In view of the WG's expressed view that this is a task of clarifying
the spec's prose rather than changing the rules for conforming processors,
I am adding the keyword 'editorial' to this issue.  A consequence of this
will be that the issue may be dealt with after, not before, the publication
of the next working drafts.
Comment 3 C. M. Sperberg-McQueen 2009-10-08 21:43:55 UTC
I believe that symbol spaces are used to enforce uniqueness, in the sense that (as section 3.1.1 puts it) "multiple copies of components with the same name in the same ·symbol space· must not exist".  Or at least, I believe that that is what the spec believes, and what the wg believes the spec to say.

And I believe that the spec and wg believe that (as section 2.5 puts it) "Every complex type definition defines its own local attribute and element declaration symbol spaces."

By what casuistry, then, do we explain that the following complex type definition is legal?

  <complexType name="upa-demo">
    <sequence>
      <element name="a"/>
      <element name="a" minOccurs="0"/>
    </sequence>
  </complexType>

Two element declarations named tns:a, both in the same local symbol space?

I think the way out of this quandary is to say, not that complex type definitions create their own local symbol space for element declarations, but only that the names of local element declarations don't go into the symbol space for top-level declarations.  
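
The reading proposed above can be sketched in Python with the standard library's `xml.etree.ElementTree`. This is only an illustration under that reading, not the behavior of any particular processor; the helper name `top_level_element_names` is hypothetical. Only direct children of the schema element enter the symbol space, so the two local declarations of `a` never collide.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"


def top_level_element_names(schema_text: str) -> set:
    """Collect top-level element names, raising on a duplicate (hypothetical helper)."""
    root = ET.fromstring(schema_text)
    target_ns = root.get("targetNamespace")  # None when the attribute is absent
    seen = set()
    # Only direct children of <schema> are top-level; local declarations
    # nested inside complex types are deliberately not visited.
    for decl in root.findall(f"{{{XS}}}element"):
        name = (target_ns, decl.get("name"))
        if name in seen:
            raise ValueError(f"duplicate top-level element {name!r}")
        seen.add(name)
    return seen


UPA_DEMO = """
<schema xmlns="http://www.w3.org/2001/XMLSchema">
  <complexType name="upa-demo">
    <sequence>
      <element name="a"/>
      <element name="a" minOccurs="0"/>
    </sequence>
  </complexType>
</schema>
"""

# The local declarations are never entered, so no uniqueness check applies to them.
names = top_level_element_names(UPA_DEMO)
```

Here `names` is the empty set: the schema document declares no top-level elements, and the repeated local name is simply outside the scope of the check.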

Comment 4 Noah Mendelsohn 2009-10-08 23:06:38 UTC
Regarding the example:

  <complexType name="upa-demo">
    <sequence>
      <element name="a"/>
      <element name="a" minOccurs="0"/>
    </sequence>
  </complexType>

Doesn't this bug just remind us once again that our notion of component identity is murky?  Specifically, http://www.w3.org/TR/xmlschema11-1/#dcl.elt.local says:

"If the <element> element information item has <complexType> or <group> as an ancestor, and the ref [attribute] is absent, and it does not have minOccurs=maxOccurs=0, then it maps both to a Particle and to a local Element Declaration which is the {term}  of that Particle. "

The obvious difference between the two name="a" lines is in the minOccurs, which maps to the particle, not the element declaration.  That then raises the question of whether the element declaration that is the term of the respective particles is the "same" or not.  Turning the argument around, the statement that 

"Every complex type definition defines its own local attribute and element declaration symbol spaces."

can be taken as at least indirect evidence that the answer is: they are the same.  In any case, this suggests another possible resolution, in addition to the one suggested by MSM.  We could attempt to make clear that, at least in cases like this, all local element declaration markup in the transfer syntax that shares a complexType ancestor and that declares elements of the same expanded name does indeed map to a single element declaration.  I think this would be my preferred casuistry.

Noah
Comment 5 C. M. Sperberg-McQueen 2009-10-09 00:28:45 UTC
Noah's suggestion in comment 4 seems not to handle cases like

  <complexType name="upa-demo">
    <sequence>
      <element name="a"/>
      <element name="a" nillable="true" minOccurs="0"/>
    </sequence>
  </complexType>

It would also seem to entail that the following schema document should give rise to a legal schema: 

  <schema xmlns="http://www.w3.org/2001/XMLSchema">
    <element name="a"/>
    <element name="a"/>
  </schema>

That would be a feasible rule (although it's rather late for such a dramatic clarification), but all the processors I've tested reject it.

Comment 6 Noah Mendelsohn 2009-10-09 00:39:10 UTC
Michael Sperberg-McQueen writes:

> That would be a feasible rule (although it's rather late for such a dramatic
> clarification), but all the processors I've tested reject it.

OK, I'm convinced (I don't know why I thought we allowed that).  Thanks.

Noah
Comment 7 Michael Kay 2009-10-09 06:12:18 UTC
>It would also seem to entail that the following schema document should give rise to a legal schema: 

  <schema xmlns="http://www.w3.org/2001/XMLSchema">
    <element name="a"/>
    <element name="a"/>
  </schema>

>That would be a feasible rule (although it's rather late for such a dramatic clarification), but all the processors I've tested reject it.

They reject it, I think, because they have made the decision to base component identity on the identity of nodes in the schema document. I think that's the only approach that is likely to work in practice, but it's not mandated by the spec. I think our rules for component identity are so weak that a processor could legally construct a schema from the above schema document.
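
The two notions of component identity contrasted here can be sketched in Python. Everything in this block is an assumption made for illustration: the `build` function and its `fold_identical` flag are hypothetical, and the attribute dictionary is only a stand-in for a component's properties, not the spec's component model.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

DUPLICATE = """
<schema xmlns="http://www.w3.org/2001/XMLSchema">
  <element name="a"/>
  <element name="a"/>
</schema>
"""


def build(schema_text: str, fold_identical: bool) -> dict:
    """Map top-level element names to declarations under one of two identity policies."""
    space = {}
    for node in ET.fromstring(schema_text).findall(f"{{{XS}}}element"):
        name = node.get("name")
        props = dict(node.attrib)  # stand-in for the component's properties
        if name in space:
            if fold_identical and space[name] == props:
                continue  # structural identity: same properties, same component
            # Node-based identity: every source element yields a distinct component.
            raise ValueError(f"two distinct components named {name!r}")
        space[name] = props
    return space


# Under structural identity the two declarations fold into one component.
folded = build(DUPLICATE, fold_identical=True)
```

With `fold_identical=False`, which models identity based on nodes in the schema document, the same input raises an error. Under either policy, two declarations that differ in a property such as `nillable` would still be rejected.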

Comment 8 Noah Mendelsohn 2009-10-12 15:03:32 UTC
MK wrote:

> I think our rules for component identity are so weak that a processor
> could legally construct a schema from the above schema document.

That's what I thought.  Indeed, I think it's the case that EDC ensures that such folding of local references can always or usually be done.

Still, regardless of what may have been possible or desirable in principle, I can't see changing the spec in this area if it would cause most or even all widely deployed implementations to become nonconforming.  So, I'm convinced by MSM's argument that constructs like this should be disallowed, if only for the reason he cites.

Noah
Comment 9 C. M. Sperberg-McQueen 2009-10-16 15:08:34 UTC
A wording proposal for bug 5507 is at

http://www.w3.org/XML/Group/2004/06/xmlschema-1/structures.b5507.html
Comment 10 C. M. Sperberg-McQueen 2009-10-23 21:35:35 UTC
The change proposal mentioned in comment 9 was adopted with amendments by the XML Schema WG at its telcon today and has been integrated into the status-quo documents.