This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 11076 - Element Declarations Consistent: comparing type tables
Summary: Element Declarations Consistent: comparing type tables
Status: CLOSED FIXED
Alias: None
Product: XML Schema
Classification: Unclassified
Component: Structures: XSD Part 1
Version: 1.1 only
Hardware: PC Windows NT
Importance: P2 normal
Target Milestone: ---
Assignee: David Ezell
QA Contact: XML Schema comments list
Keywords: resolved
Reported: 2010-10-16 22:10 UTC by Michael Kay
Modified: 2011-03-24 08:06 UTC

Description Michael Kay 2010-10-16 22:10:49 UTC
Element Declarations Consistent (section 3.8.6.3) contains the rule:

All their {type table}s are either all ·absent· or else all are present and have the same sequence of {alternatives} and the same {default type definition}.

What does "same" mean? Is this an appeal to "component identity" covered by G.2 clause 4: "The identity of components is still underspecified (although a number of points have been clarified, e.g. by the specification of the {scope} property), with the result that some schemas can be interpreted either as conformant or as non-conformant, depending on the interpretation of the specification's appeals to component identity."

If so, it appears to be different from other cases where the specification relies on component identity. In most such cases, a minimal definition of identity is that two components are identical if they are derived from the same declaration in a particular schema document. In this case, there is no way that the type tables for two different element declarations can be derived from the same source. So it seems this rule is inviting a "deep equality" test of some kind. If we are expecting any kind of interoperability, it would seem necessary to articulate the way in which this test is carried out. It's not easy: do we mandate, for example, that XPath expressions are compared in their lexical form as written, with no normalization? Do we require that the base URI is the same even if the processor knows it is not used?

In the interests of interoperability, I would be inclined to replace the above rule by "All their {type table}s are ·absent·".
Comment 1 Michael Kay 2010-10-16 23:53:35 UTC
Having complained about this, I should now report that I have produced a reasonable implementation of code that compares two type tables, subject to interpretation of what it means for two XPath expressions to be the same, and what it means for two types to be the same (in the latter case, it means "originating from the same declaration in a schema document"). Comparing two namespace contexts turned out to be tedious and slow but not especially difficult. Whether the results are usable is another question - users are going to be greatly puzzled if their type tables differ only by one irrelevant in-scope namespace declaration.
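
Editorial illustration (not the implementation described in the comment above): a minimal Python sketch of what such a deep-equality comparison might look like. All class and function names here (XPathTest, TypeAlternative, TypeTable, make_test, same_type_table) are hypothetical. The sketch assumes that XPath expressions are compared in their lexical form as written, that the namespace context and base URI take part in the comparison, and that a plain string identifier stands in for "originating from the same declaration in a schema document". Modelling the default type definition as a bare identifier is a simplification.

```python
from dataclasses import dataclass
from typing import Mapping, Optional, Tuple


@dataclass(frozen=True)
class XPathTest:
    expression: str                          # lexical form, no normalization
    namespaces: Tuple[Tuple[str, str], ...]  # sorted (prefix, URI) pairs in scope
    base_uri: Optional[str] = None


@dataclass(frozen=True)
class TypeAlternative:
    test: Optional[XPathTest]  # None models an alternative with no test
    type_id: str               # assumed stand-in for declaration identity


@dataclass(frozen=True)
class TypeTable:
    alternatives: Tuple[TypeAlternative, ...]  # order is significant
    default_type_id: str


def make_test(expression: str, namespaces: Mapping[str, str],
              base_uri: Optional[str] = None) -> XPathTest:
    # Sort the bindings so that declaration order does not affect equality.
    return XPathTest(expression, tuple(sorted(namespaces.items())), base_uri)


def same_type_table(a: Optional[TypeTable], b: Optional[TypeTable]) -> bool:
    # None models an absent type table: both absent, or both present and equal.
    if a is None or b is None:
        return a is None and b is None
    # Dataclass equality compares field by field, including sequence order.
    return a == b


xs = {"xs": "http://www.w3.org/2001/XMLSchema"}
extra = dict(xs, unused="http://example.com/unused")

t1 = TypeTable((TypeAlternative(make_test("@kind = 'a'", xs), "typeA"),), "typeD")
t2 = TypeTable((TypeAlternative(make_test("@kind = 'a'", xs), "typeA"),), "typeD")
t3 = TypeTable((TypeAlternative(make_test("@kind = 'a'", extra), "typeA"),), "typeD")

assert same_type_table(t1, t2)
assert not same_type_table(t1, t3)  # differs only by an unused namespace binding
```

Under these assumptions, t1 and t3 compare as different even though the extra binding cannot affect the result of the test, which is the puzzling outcome the comment warns about.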
Comment 2 David Ezell 2010-11-05 11:39:14 UTC
In Lyon:

The WG discussed this issue.

The minimum an implementation must do is check for identity.  
Editors to draft text describing how type tables should be compared, which is at minimum some kind of deep equality, but should allow implementations to ignore variations that make no difference to the outcome.
Comment 3 C. M. Sperberg-McQueen 2011-03-07 17:39:59 UTC
A wording proposal intended to implement the WG's instructions in comment 2 is now available for review and comment; it's on the server at

  http://www.w3.org/XML/Group/2004/06/xmlschema-1/structures.b11076.html
  (member-only link)

Essentially, it defines a simple mechanical comparison of type tables and type alternatives which should (a) be safe (it should recognize type tables as equivalent only when they really are) and (b) be straightforward to implement.  Additionally, it defines equivalence in purely declarative terms, requires processors to detect equivalence in all cases covered by the simple mechanical comparison, and allows processors to detect equivalence in other cases. 

Note that it poses a choice for the WG: should implementations be encouraged to detect equivalence more aggressively than the minimum defined, or merely allowed to do so without being encouraged?
Comment 4 Michael Kay 2011-03-07 23:55:25 UTC
In the proposal, the phrase "and T1.{type definition} and T2.{type definition} are valid for the same set of input element information items." seems curious - I don't think a type definition can be valid for some information items and invalid for others.

But rather than fix this in the obvious way, I don't think it's helpful to define equivalence using a condition that isn't actually computable by a universal Turing machine. Better to stick with the mechanistic definition, and allow processors to relax it if they are able to. I would suggest wording along the lines:

T1 and T2 are equivalent if all the following conditions are true:

...

A processor MAY also treat T1 and T2 as equivalent in cases where not all the above conditions are true, provided it can determine that T1.{test} and T2.{test} will always evaluate to the same result for any possible element information item.
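
Editorial illustration: the two-level rule suggested above (a mandatory mechanical check, plus an optional relaxation) could be structured roughly as in the Python sketch below. The names Alternative, mechanically_equivalent, equivalent and same_modulo_outer_whitespace are hypothetical. The body of mechanically_equivalent is only a placeholder, because the actual list of conditions is elided in the comment; the example prover rests on the assumption that leading and trailing whitespace in an XPath expression is not significant.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Alternative:
    test: str     # XPath expression as written
    type_id: str  # hypothetical identifier for the type definition


# A prover returns True only if it can show that two tests always agree.
Prover = Callable[[str, str], bool]


def mechanically_equivalent(t1: Alternative, t2: Alternative) -> bool:
    # Placeholder for the elided conditions; NOT the wording of the proposal.
    return t1.test == t2.test and t1.type_id == t2.type_id


def equivalent(t1: Alternative, t2: Alternative,
               prover: Optional[Prover] = None) -> bool:
    # Mandatory part: every processor performs the mechanical check.
    if mechanically_equivalent(t1, t2):
        return True
    # Optional part: a processor MAY accept more cases if it can prove them.
    if prover is not None and t1.type_id == t2.type_id:
        return prover(t1.test, t2.test)
    return False


def same_modulo_outer_whitespace(a: str, b: str) -> bool:
    # Deliberately conservative: only outer whitespace is ignored.
    return a.strip() == b.strip()


a = Alternative("@kind = 'a'", "typeA")
b = Alternative("  @kind = 'a' ", "typeA")

assert not equivalent(a, b)                            # minimum behaviour
assert equivalent(a, b, same_modulo_outer_whitespace)  # relaxed behaviour
```

The point of the structure is that a prover can only add equivalences on top of the mechanical check; it never overrides a positive mechanical result, and a processor with no prover still conforms.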
Comment 5 Michael Kay 2011-03-08 18:19:41 UTC
I don't think the equivalence of two XPath expressions is computable. Actually I may be wrong here, because I suspect that unlike XQuery, XPath is not Turing-complete. But it's certainly not realistic in practice to recognize all equivalent expressions that return the same result. Even recognizing whether or not ($i + 1) is equivalent to ($i - -1) is not trivial (it requires careful study of the rounding and overflow rules).
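
Editorial illustration of why rounding rules matter when deciding whether two arithmetic expressions are equivalent. The Python sketch below uses Python floats (IEEE 754 doubles), which are comparable to xs:double but do not model XPath's xs:integer or xs:decimal rules, so it is an analogy rather than a statement about XPath itself. The function names are hypothetical. Checking a handful of sample values shows agreement on those values only; it is not a proof of equivalence.

```python
def plus_one(i: float) -> float:
    return i + 1


def minus_negative_one(i: float) -> float:
    return i - -1


def add_then_subtract_one(i: float) -> float:
    return (i + 1) - 1


# On these samples the two forms from the comment agree ...
samples = [0.0, -0.0, 1.5, 2.0 ** 53, -(2.0 ** 53), 1e308, float("inf")]
assert all(plus_one(s) == minus_negative_one(s) for s in samples)

# ... but an equally "obvious" identity, (i + 1) - 1 = i, fails under rounding:
# 2**53 + 1 is not representable as a double and rounds back to 2**53,
# while 2**53 - 1 is representable exactly.
x = 2.0 ** 53
assert add_then_subtract_one(x) != x
assert add_then_subtract_one(x) == x - 1
```

An optimizer that rewrote (i + 1) - 1 to i would therefore change results for some inputs, which is why even simple-looking rewrites need to be checked against the arithmetic rules of the language.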
Comment 6 David Ezell 2011-03-18 16:01:45 UTC
<cmsmcq> change "are valid for the same set of input element information items" 
<cmsmcq> to "accept the same set of input element information items as valid"
* ht is back
<ht> Agreed
<dezell> RESOLUTION:  adopt the proposed wording as amended.
Comment 7 C. M. Sperberg-McQueen 2011-03-23 19:29:15 UTC
The decision reported in comment 6 has now been integrated into the status-quo version of the spec, with the amendment mentioned.  Accordingly, I'm marking this issue resolved.

Michael, would you please indicate your agreement or disagreement with this resolution by opening or reopening the bug?  Thanks.