7242 – Type inconsistencies introduced by inheritable attributes

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Status:	CLOSED FIXED

Alias:	None

Product:	XML Schema
Classification:	Unclassified
Component:	Structures: XSD Part 1
Version:	1.1 only
Hardware:	PC Windows NT

Importance:	P2 normal
Target Milestone:	---
Assignee:	David Ezell
QA Contact:	XML Schema comments list

Keywords:	decided

Reported:	2009-08-07 16:51 UTC by Peter.Geraghty
Modified:	2010-11-10 17:30 UTC

Description Peter.Geraghty 2009-08-07 16:51:04 UTC

Inheritable attributes as specified allow schemas to be produced which weaken typed information checking and processing. By this I mean that for someone formulating a processing rule (e.g., XPath expression), or producing a binding for a particular complex type, the possibility of inheritable attributes is a difficulty when it comes to creating a definition which can be known to be valid prior to receipt of any particular instance message.

If I understand it correctly, the type and the existence of an inherited attribute are dependent on the ancestor in the instance document within which an element of a particular type appears, not on the type of the element itself. Most of the examples I have seen are considering xml:lang etc. and assume that there is a single governing attribute definition which will apply to all inheritors in any context, but I see nowhere that states this is required to be the case.

For example, an element of a particular type could inherit an attribute called required which in one situation referred to a boolean saying whether or not something was required, but in another situation an element of the same type could inherit an attribute also called "required" which was a date by which something was required. This is true even if the type itself specified a third attribute also called "required" which was of type integer.

The semantics of an inherited attribute may also be misleading if applied to all descendant elements willy-nilly, even if there is no type difference. For example, an attribute called version could in one context refer to the version of the software which produced a document but in another context could refer to the version of the document which was produced.

A document which had passed validation and was then processed using XPath based on an XDM which in turn was based on the PSVI could produce unexpected and problematic results in these situations.

The problems above could be avoided if inheritability was limited to attributes which are declared at top-level.

There is also a different semantic problem, which is in the applicability of inheritable attributes to descendant elements.

I have seen an example along the following lines. Suppose start-time is declared as inheritable.

<Conference start-time="08:00:00">
  <Meeting>
    <Beverage>Juice</Beverage>
  </Meeting>
</Conference>

Here start-time may have meaning for Meeting but presumably not for Beverage. If I am producing a Java binding for Beverage, or providing GUI tooling to guide a user through a rule relating to Beverage, how do I know whether start-time is relevant or irrelevant? Shouldn't the schema author be able to specify this?

A possible solution to this is to say that inheritable attributes should be applied only to descendants whose type definitions include compatible attribute uses or attribute wildcards. The defaultAttributeGroup enhancement makes it easy for such attribute uses to be applied across the board if that is what the schema author intends.
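
The restriction proposed above can be sketched in a few lines of Python. This is only an illustration of the idea, not of any XSD processor: the ACCEPTS table is a hypothetical stand-in for what a schema would say about each element's type, and the function name is invented for the example.

```python
import xml.etree.ElementTree as ET

DOC = """
<Conference start-time="08:00:00">
  <Meeting>
    <Beverage>Juice</Beverage>
  </Meeting>
</Conference>
""".strip()

# Hypothetical stand-in for schema knowledge: for each element name, the
# attribute names its type declares or admits through a wildcard.
ACCEPTS = {"Conference": {"start-time"}, "Meeting": {"start-time"}, "Beverage": set()}


def inherited_attributes(root, inheritable, accepts=None):
    """Return {element name: inherited attributes} for every element under root.

    With accepts=None every descendant inherits; otherwise an element
    inherits only the attribute names that its type accepts.
    """
    result = {}

    def walk(elem, scope):
        if accepts is None:
            visible = dict(scope)
        else:
            allowed = accepts.get(elem.tag, set())
            visible = {k: v for k, v in scope.items() if k in allowed}
        result[elem.tag] = visible
        # Children see this element's own inheritable attributes, which
        # override values of the same name coming from further up.
        child_scope = dict(scope)
        child_scope.update(
            (k, v) for k, v in elem.attrib.items() if k in inheritable
        )
        for child in elem:
            walk(child, child_scope)

    walk(root, {})
    return result


root = ET.fromstring(DOC)
unrestricted = inherited_attributes(root, {"start-time"})
restricted = inherited_attributes(root, {"start-time"}, ACCEPTS)
```

Here unrestricted["Beverage"] carries start-time, whereas restricted["Beverage"] is empty and restricted["Meeting"] still carries it. Results are keyed by element name only because each name occurs once in this instance.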

In summary, the intention to support xml:lang etc. could be supported in a more controlled and appropriate way if
(a) Only top-level attributes are inheritable and
(b) Inheritance only happens if the inheriting element's type would produce a corresponding attribute attribution were the attribute to appear explicitly.

On a more specific point, section 3.12 specifies that the XDM be constructed to include inheritable attributes. Section 3.13 does not specify this, but I think that, from what has been said elsewhere, there is an implication that when an XDM is constructed from a PSVI, inheritable attributes are to be included. However, the referenced edition of XDM is the current Recommendation, which obviously does not specify this. Perhaps this needs to be said.

Comment 1 C. M. Sperberg-McQueen 2009-08-07 19:20:42 UTC

[Speaking for myself]

You are correct that defining several inheritable attributes with the same name and different types and semantics is likely to cause confusion among unwary users; personally I expect good designers to avoid that practice for that reason.  But it's worth noting that for purposes of conditional type assignment, the attributes and element in the XDM instance are not typed, so whatever the various types assigned to attributes named 'required' by different complex types, any XDM instance used for conditional type assignment will have the attribute bearing the special 'type' xsd:untypedAtomic.   

It would be difficult though not impossible to change this and create an XDM instance with typed attributes for conditional type assignment, because there is no guarantee that all the alternative types being considered will assign the same type to like-named attributes.  We did consider the possibility of  validating all attributes in the instance against the declared type of the element, constructing the XDM, evaluating the test expressions, and then re-validating the attributes again if necessary; that proposal did not generate consensus.

The fact that the attributes are untyped for CTA and typed for assertions may also be responsible for the asymmetry between the two as to whether inherited attributes are present in the XDM at all.  In recent discussions, various people including me have spoken as if inherited attributes are present in the XDM instances used to evaluate assertions, but re-reading the spec more closely I see that I was wrong.  The XDM used for assertions does not include inherited attributes.  I don't remember the discussion of this  point in the WG, if any; it could be just a failure to make the two cases consistent, or it could be a way of avoiding the kind of variation in type assignment that you describe.  

It was a key, and controversial, design decision in the development of assertions to specify that all attributes and children should be typed when the assertion is evaluated. Some members of the WG wanted the element itself to be typed, before they were eventually persuaded that that could lead to logical inconsistency in the XDM model (or possibly they were never persuaded, but gave up because the rest of the WG was adamant).  It's not difficult to see that in that context some WG members would deplore, as you do, the idea of allowing an attribute to show up in the XDM instance bearing now one type and now another unrelated type.

For similar reasons, the fear that attribute inheritance might make XPath-based processing unreliable seems to me ungrounded.  You write

    A document which had passed validation and was then processed using XPath 
    based on an XDM which in turn was based on the PSVI could produce 
    unexpected and problematic results in these situations. 

Well, you are right that it could, if the XDM instance treated inherited attributes as if they were normal attributes, and made them accessible through the attribute axis on elements which inherited them.  

The XDM instances created for the special purposes of checking type-alternative tests do in fact map inherited attributes in the PSVI into XDM attributes.  But the canonical mapping defined by the XDM spec does not do so, and it is not to be expected that it will do so in any future revision, for the reasons you cite, among others.   It's hard to make guarantees about what others will do in the future, but the XSL and XML Query working groups responsible for a revision of XDM would need to be signally bereft of technical judgement in order to make a mistake like that.  And if anyone proposes it, you and I will be on the same side in arguing against it.  The only situation in which it makes sense to map inherited attributes in the XML instance, as currently specified, into normal attributes in the XDM instance is one in which for reasons good or bad it is not feasible to traverse the ancestor axis in order to find the attribute instance one is interested in.  That's the case for conditional type assignment and assertions, in order to preserve the property that elements can be validated against a given type in isolation from their context.  That's not likely to be the case in normal XML processing.
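
As a rough illustration of the "traverse the ancestor axis" alternative, the following Python sketch looks an attribute up on an element or its nearest ancestor without ever copying it onto the descendant. The function name and the sample document are invented for the example; ElementTree has no parent pointers, so a parent map is built first.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the XML namespace in Clark notation.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def nearest_attribute(root, target, name):
    """Return the value of name on target or its nearest ancestor, else None."""
    parents = {child: parent for parent in root.iter() for child in parent}
    node = target
    while node is not None:
        if name in node.attrib:
            return node.attrib[name]
        node = parents.get(node)  # None once the root has been passed
    return None


doc = ET.fromstring(
    '<doc xml:lang="en"><sec><p xml:lang="ja">a</p><p>b</p></sec></doc>'
)
first, second = doc.findall("./sec/p")
lang_first = nearest_attribute(doc, first, XML_LANG)
lang_second = nearest_attribute(doc, second, XML_LANG)
```

The second p element resolves to the value on doc, yet its own attribute set (second.attrib) stays empty, which is the separation between inherited and normal attributes argued for above.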

For the same reasons, I would not expect a data binding tool to do anything particular about inherited attributes; specifically, I would not expect inherited attributes to show up in the binding as if they had been locally present.   (But I am not a designer or regular user of data binding tools, so my expectations are not particularly important.)

I would not want to restrict attribute inheritance to top-level attributes, because attribute inheritance is well established in languages like TEI and HTML, which use it in non-top-level cases.

In sum:  in the special case (CTA) where inherited attributes are visible in an XDM instance, they are untyped, so type inconsistencies of the kind you describe cannot, strictly speaking, occur.  Inherited attributes are not (as I understand it) visible in XDM instances used to check assertions, as the spec is written today.  And it is not expected or recommended that inherited attributes should be treated by other specs or processing tools as if they were normal attributes.  

The semantic inconsistencies you describe remain possible, but it's not clear that it's possible to prevent people from writing confusing schemas, or wise to try, by any means other than designing a language in which it's easier to be clear.  ISO 8879 went to great lengths to prohibit specific markup practices it described as "obfuscatory", but in the most widely discussed cases any prohibited obfuscation became legal as soon as one additional level of indirection was added by embedding the prohibited construct within the declaration of a parameter entity.   The result:  some obfuscation was made illegal, at the cost of making people who needed it use even more heavily obfuscated constructs.  A net gain?

Comment 2 Peter.Geraghty 2009-08-10 10:03:52 UTC

I see that I have jumped to some conclusions that the "inheritable attributes" feature was more significant than it is.

The idea of an "inheritable attribute" in a wider sense is fairly common (e.g., as used in the conference / meeting start-time example), and it is probably confusing that "inheritable attributes" are described as a new feature in Schema 1.1 when in fact this is just a constituent of the CTA feature. At the present time there are quite a lot of schemas which have an accompanying explanation that semantically some attribute (e.g., currency) is to be treated as inherited if not explicitly specified on descendant elements. I had (incorrectly) assumed the "inheritable attributes" feature was to be a means of formally expressing this within the schema. You could imagine the schema-for-schemas written differently using such a feature instead of the current arrangement of a "blockDefault" attribute defined inside schema-for-schemas plus an explanation elsewhere that this was to be considered in the absence of a block attribute on a complex type definition.

Inheritable attributes with a rather narrow and esoteric significance don't leave a comfortable feeling though and I would like to make a few more points.

1. The motivation for CTA seems to be support for Atom. Are inheritable attributes needed for Atom? I.e., should Atom type determination consider some attributes of ancestors but not others?

2. Is there an alternative whereby the spec just says that CTA expression evaluation requires ancestral attributes to be considered?

3. If in fact the Schema 1.1 intention is to be able to use inherited attribute values in "assert" expressions, but no intention to expect XPath XDM to be changed we would be in a confusing situation where a schema assert behaved one way but the exact same expression in Schematron behaved differently.

4. If there is pressure in future to make this feature more general than just CTA the issues already discussed will have to be considered, and the fact that there may be schemas out there by then will make it more difficult to rein back on the latitude allowed.

5. There are implications for inherited attributes with regard to document fragments and XML canonical form. Is this a concern that needs to be thought through?

Comment 3 C. M. Sperberg-McQueen 2009-08-21 22:55:41 UTC

[Again, speaking only for myself]

Regarding the questions asked in comment #2:

    1. The motivation for CTA seems to be support for Atom.  Are
    inheritable attributes needed for Atom?  I.e., should Atom
    type determination consider some attributes of ancestors but
    not others?

Whether Atom uses any inheritable attributes I don't know.  But
while it's a favorite example to illustrate the utility of the
construct, it's not the only motivation; see the work by Vitali
et al. on SchemaPath for further examples of the utility of
conditional type assignment.  (Note, however, that in specifying
CTA the working group added a number of restrictions to the
construct, which have the effect of making it less powerful and
less useful.  So not all the examples in the SchemaPath paper are
examples of what can be done with CTA.)

    2. Is there an alternative whereby the spec just says that
    CTA expression evaluation requires ancestral attributes to be
    considered?

No.  This was proposed as a resolution of bug 5003 but the
working group was deeply divided on the question: there was
enough opposition to prevent the proposal being adopted.

    3. If in fact the Schema 1.1 intention is to be able to use
    inherited attribute values in "assert" expressions, but no
    intention to expect XPath XDM to be changed we would be in a
    confusing situation where a schema assert behaved one way but
    the exact same expression in Schematron behaved differently.

Yes.  This is already the case owing to the special methods of
XDM instance construction specified for checking assertions.  A
Schematron rule can be applied to xhtml:input, for example,
requiring that "ancestor::xhtml:form" be true.  The same rule can
be attached to xhtml:input in an XSD schema, but it will not have
the desired effect as the assertion will always be false.  (The
assertion is checked against an XDM instance in which the
xhtml:input element is the root element and has neither ancestors
nor siblings.)
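
The difference can be mimicked in Python. This is a loose sketch, not an XPath or XSD implementation: the has_ancestor helper is invented for the example, the XHTML namespace is dropped for brevity, and a deep copy of the element stands in for the XDM instance in which the validated element is the root.

```python
import copy
import xml.etree.ElementTree as ET


def has_ancestor(tree_root, target, tag):
    """Rough stand-in for the XPath test ancestor::tag evaluated at target."""
    parents = {child: parent for parent in tree_root.iter() for child in parent}
    node = parents.get(target)
    while node is not None:
        if node.tag == tag:
            return True
        node = parents.get(node)
    return False


page = ET.fromstring(
    "<html><body><form><p><input name='q'/></p></form></body></html>"
)
inp = page.find(".//input")

# Schematron-style view: the rule sees the element in its document context.
in_document = has_ancestor(page, inp, "form")

# Assertion-style view: the element is copied out and is itself the root,
# so there is no ancestor to find.
isolated = copy.deepcopy(inp)
in_isolation = has_ancestor(isolated, isolated, "form")
```

The same test therefore yields True in the document context and False in isolation.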

    4. If there is pressure in future to make this feature more
    general than just CTA the issues already discussed will have
    to be considered, and the fact that there may be schemas out
    there by then will make it more difficult to rein back on the
    latitude allowed.

That may be a reason to require some form of type consistency.

Perhaps, for example, Schema Information Set Contribution:
Inherited Attributes in section 3.3.5.6 might be augmented with a
clause 4:

  4 If T is the locally declared type T of A within the declared
    type of E, then T is also the governing type definition of A.

This would mean that the values of inherited attributes are
inherited only by elements which don't have conflicting
alternative attribute declarations, and would in turn mean that
there would be less bizarre cruft in the way of any future
attempt to make the inherited attribute feature significantly
more general -- in particular, any attempt to allow attribute
declarations to specify a default value which is inherited from
some other location in the document instance.

But the history of attempts to model attribute inheritance in XSD
leads me to doubt the likely success of any such attempt, so it's
not clear to me that the additional clause is worth the effort.
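
One reading of the type guard in the suggested clause 4 can be sketched as follows. The function name and the dictionary representation are hypothetical, and comparing types by name is a simplification of comparing type definitions.

```python
def passes_type_guard(attr_name, governing_type, local_declarations):
    """Apply one reading of the proposed clause 4 to a candidate inherited attribute.

    local_declarations maps attribute names to the types that the inheriting
    element's declared type assigns to them locally (hypothetical representation).
    """
    local_type = local_declarations.get(attr_name)
    # No local declaration: nothing to conflict with, so the value is inherited.
    # A local declaration must name the same type as the inherited attribute's
    # governing type definition; otherwise the value is not inherited.
    return local_type is None or local_type == governing_type


# An ancestor carries required="true", governed by xs:boolean.
no_local = passes_type_guard("required", "xs:boolean", {})
same_type = passes_type_guard("required", "xs:boolean", {"required": "xs:boolean"})
conflict = passes_type_guard("required", "xs:boolean", {"required": "xs:integer"})
```

Under this reading, only the element whose type declares "required" as xs:integer fails to inherit the boolean value.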

    5. There are implications for inherited attributes with
    regard to document fragments and XML canonical form.  Is this
    a concern that needs to be thought through?

The explicit declaration of inheritability makes it easier to see
from the schema whether taking an element out of context is going
to cause problems or not.  As has been pointed out, inheritable
attributes are already specified by a number of widespread
vocabularies; providing some documented way of identifying them
seems to me a modest step forward.

And now, a question of my own for those interested in this
issue: is the type guard described in the answer to question 4
worth introducing or not?

Comment 4 C. M. Sperberg-McQueen 2009-09-04 16:43:50 UTC

The XML Schema WG discussed this issue again at its call this morning, and concluded (a) that the current spec does not actually have the kind of type inconsistency feared in the original report, and (b) that the additional type check described in point 4 of comment #3 is probably not worth introducing. We were led to the second conclusion by several observations. The only benefit of the type check is to minimize complications for any future attempt to devise a more general mechanism in which default values are derived from the XML-instance context. It currently seems very unlikely that any such mechanism will ever be devised, proposed, or adopted in this or any future version of the XSD spec which binds itself to full backwards and forwards compatibility. So the likelihood of any benefit being realized appears to be small, and the benefit, if realized, will be rather modest. Finally, it seems very late in the day for this kind of design reconsideration: the spec is not a Last Call draft but a Candidate Recommendation.

In sum, making the change described in comment #3 appears to be a case of some pain for no gain.

Accordingly, the WG has asked me to record here our intention to class this bug report as WORKSFORME, reflecting the fact that there aren't in fact any type inconsistencies introduced by inheritable attributes, and to close the bug without further action. I'm changing the status of the bug to RESOLVED.

The addition of this comment should cause email to be sent to the originator of the bug report, to whom the following remarks are addressed. Please consider the issues raised in the discussion of the bug, and decide whether you are satisfied that the WG has given the bug report serious consideration and disposed of it appropriately. If you are content to accept the WG's proposed resolution, please indicate so by changing the status of the bug to CLOSED. If you are unhappy with the proposed resolution and wish to appeal the WG's decision to the Director of the W3C, please so indicate by changing the status to REOPENED (and, of course, explaining in a comment why you are unhappy and what changes might satisfy your requirements).

If we don't hear from you in the next two weeks, we will assume that you are satisfied with the resolution of the issue. Thank you very much for commenting on the spec.

Comment 5 Peter.Geraghty 2009-09-07 10:17:49 UTC

(In reply to comment #4)

If I understand the situation correctly, this has been closed because inheritable attributes are in fact only considered in relation to CTA, and in this context the issues I raised are not material.

If that is the case, there should at least be a correction to appendix G.1 which says....

Attribute declarations can now be marked {inheritable} (see Inherited Attributes (§3.3.5.6)) and the values of inherited attributes are accessible in the XDM data model instance constructed for checking assertions (see Assertions (§3.13)) and for conditional type assignment (see Type Alternatives (§3.12)).
Among other consequences, this allows conditional type assignment and assertions to be sensitive to the inherited value of the xml:lang attribute

I think it is fair to say that the above statement is commonly believed to be the case even though it does not tally with the normative sections of the document. For example, the recent PowerPoint presentation circulated by Roger Costello on xmlschema-dev@w3.org explains inheritable attributes with examples which are in fact spurious since they don't relate to CTA, but no one replied to point this out.

I would like to ask again whether inheritable attributes considered in CTA are of practical benefit in any of the areas that 1.1 is trying to address. I.e., are there important known situations, whether Atom or otherwise, where inheritable attributes would be useful in CTA? I think that adding a feature which is likely to be commonly misunderstood and has no known practical benefit would be a bad idea.

If there are practical benefits and the feature is to remain as stated in the normative section then I accept that a change to appendix G.1 would be sufficient to close this issue.

Comment 6 David Ezell 2009-09-11 16:53:29 UTC

The WG discussed this reopened bug on the telcon today.

1) the WG agrees that the suggested changes to Appendix G are in order, and will make those changes as suggested in comment #5.

2) the WG would like to reference bug 5003, introduced by i18n, which was the basis for the "inheritable attribute" feature.  This bug outlines the kinds of use cases addressed by the feature.

3) many in the WG hope that languages like Atom will find this feature useful, but saying more than that is not possible.

The WG thanks Peter Geraghty for his comments and we hope that you will find this resolution satisfactory.

We are marking the bug as "decided" reflecting the need for further drafting, to be closed by the commenter if he agrees with the resolution.

Comment 7 Peter.Geraghty 2009-09-14 11:03:27 UTC

I sympathise with the WG's struggle with the request from i18n as revealed in bug 5003 :(. Personally I think the closure of that request was actually the correct decision, as the solution which has emerged is unsatisfactory from many points of view.

If I could make a general point it would be a plea to consider automated information flows in the round when designing new features. For example, the use case of 5003 is:

<xs:element name="annotation" type="annotationType">
<xs:alternative test="@xml:lang='ja'" type="rubyType"/>
<xs:alternative test="@xml:lang='en'" type="glossingType"/>
...</xs:element>

This schema implies that when an instance document is created the creator will include different content depending on what value of xml:lang is specified on an ancestral node. Behind the new feature request is an assumption that for the creator it is significantly easier to include the appropriate content without an explicit type identifier than it would be to include the appropriate content with an xsi:type. However, whilst this would save keyboard work for a human keying in the document, in practice the creator will be a computer program, and so the assumption that it is easier for the producer to omit the xsi:type is spurious. On the other hand, the new feature involves significant extra complexity for a receiving system.
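
To make the receiving side concrete, here is a small Python sketch of what a consumer has to do to work out which type applies to each annotation element. It models only the two alternatives shown in the fragment above and assumes that xml:lang is inheritable, so that the test sees a value set on an ancestor; the function names and the sample instance are invented for the example.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# (xml:lang value tested, type selected), in the order of the two
# xs:alternative elements shown in the schema fragment.
ALTERNATIVES = [("ja", "rubyType"), ("en", "glossingType")]
DECLARED_TYPE = "annotationType"


def select_type(lang):
    """Return the type chosen by the first alternative whose test is true."""
    for tested, type_name in ALTERNATIVES:
        if lang == tested:
            return type_name
    return DECLARED_TYPE  # no test matched: fall back to the declared type


def assign_types(elem, lang=None, out=None):
    """Collect the selected type of every annotation element, in document order."""
    out = [] if out is None else out
    lang = elem.get(XML_LANG, lang)  # a local xml:lang overrides the inherited one
    if elem.tag == "annotation":
        out.append(select_type(lang))
    for child in elem:
        assign_types(child, lang, out)
    return out


doc = ET.fromstring(
    '<doc xml:lang="ja"><annotation/>'
    '<sec xml:lang="en"><annotation/></sec>'
    '<sec xml:lang="fr"><annotation/></sec></doc>'
)
types = assign_types(doc)
```

The three annotation elements are identical in the instance, yet they resolve to rubyType, glossingType and annotationType respectively, purely because of their position in the document.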

My experience is that when standards in a particular area (e.g., schema) are complicated and do not integrate easily with other standards and tools there is an adverse effect on the ease of defining and implementing automated information flows, not a positive one, however sophisticated or powerful the features may be in isolation.

It is true that producers of information are not obliged to use any particular feature allowed by schema, but when the designer of the production of the information has several allowed possibilities s/he may, through ignorance or time pressures, choose options which cause difficulties for receiving systems.

Unlike a programming language, where the decision to use particular language facilities within a project affects a single community working on that project, XSDL is inherently involved with separate producers (who choose which language constructs to use) and consumers who are obliged to go with the producers' choice. These asymmetric roles affect the economics whereby the consumers bear costs relating to producers' decisions. They also affect overall cost and elapsed time to production if consumers request amendments and there are feedback iterations. Don't forget that information flows do not begin or end as xml documents - those documents have to be produced by something and they have to be processed by something.

In a nutshell, why allow multiple different ways of defining the same business information content? Consistency and standardisation are more important enablers of information flow than elegance or sophistication.

Here endeth the general plea.

I have not changed the status of the fault as the current status is REOPENED not DECIDED and I do not have an option for CLOSED - apologies for Bugzilla ignorance if I've misunderstood comments 4 or 6.

However I understand the WG's position and I am satisfied with the response given.

Comment 8 C. M. Sperberg-McQueen 2009-10-10 01:40:43 UTC

The WG today accepted a wording proposal to change the description of inherited attributes in the list of changes since 1.0, deleting the erroneous references to assertions.

We thank the originator of the issue for calling the problem to our attention; we regret we weren't able to make you happier about things.

Since the originator has already signaled his assent, I'm going to close this, after first marking it resolved.

Comment 9 David Ezell 2010-11-10 17:30:01 UTC

The WG reported this bug as FIXED on 2009-10-10.  We are closing this bug
as requiring no further work.  If there are issues remaining, you can reopen
this bug and enter a comment to indicate the problem.  Thanks very much for the
feedback.