5003 – Applicability of <alternative> element to xml:lang

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Status:	CLOSED FIXED

Alias:	None

Product:	XML Schema
Classification:	Unclassified
Component:	Structures: XSD Part 1
Version:	1.1 only
Hardware:	PC Windows XP

Importance:	P1 minor
Target Milestone:	---
Assignee:	C. M. Sperberg-McQueen
QA Contact:	XML Schema comments list

URL:
Whiteboard:	XPath cluster
Keywords:	resolved

Depends on:
Blocks:	5907

Reported:	2007-09-04 14:46 UTC by Felix Sasaki
Modified:	2009-01-06 03:18 UTC
CC List:	2 users

Description Felix Sasaki 2007-09-04 14:46:18 UTC

(See also the mail at http://lists.w3.org/Archives/Member/member-i18n-core/2007Sep/0005.html .)

This comment is about the <alternative> element used for the assignment of conditional types, described at 
http://www.w3.org/TR/2007/WD-xmlschema11-1-20070830/#element-alternative

It would be useful to choose types relying on 
xml:lang, e.g.

<xs:element name="annotation" type="annotationType">
  <xs:alternative test="@xml:lang='ja'" type="rubyType"/>
  <xs:alternative test="@xml:lang='en'" type="glossingType"/>
...</xs:element>

however, this only makes sense if the information of xml:lang is inherited 
for type-checking, that is: test="@xml:lang='ja'"  needs to be 
interpreted as test="ancestor-or-self::@xml:lang[last()]='ja'" . Could you make sure that this is possible?
Thank you,
Felix Sasaki (on behalf of the i18n core WG)
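A minimal sketch of the interpretation requested above, offered as an illustration only. The expression `ancestor-or-self::@xml:lang[last()]` is not well-formed XPath as written (an axis name cannot be followed by `@`); the intended meaning appears to be "the xml:lang of the nearest ancestor-or-self element that carries one", which could be spelled `ancestor-or-self::*[@xml:lang][1]/@xml:lang`. The Python below emulates that lookup with the standard library; the function name `effective_lang` and the sample document are assumptions made for this sketch.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def effective_lang(root, target):
    """Return the xml:lang of the nearest ancestor-or-self of target, or None."""
    # ElementTree keeps no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    node = target
    while node is not None:
        lang = node.get(XML_LANG)
        if lang is not None:
            return lang
        node = parent_of.get(node)
    return None


# Hypothetical instance: one annotation inherits "ja", the other declares "en".
doc = ET.fromstring(
    '<doc xml:lang="ja"><section><annotation/></section>'
    '<annotation xml:lang="en"/></doc>'
)
inherited, explicit = doc.findall(".//annotation")
```

With this reading, the first `annotation` would select `rubyType` and the second `glossingType`, even though only the second carries the attribute itself.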

Comment 1 Noah Mendelsohn 2007-09-05 13:52:39 UTC

From the original bug report:

> however, this only makes sense if the information
> of xml:lang is inherited for type-checking, that
> is: test="@xml:lang='ja'" needs to be interpreted
> as test="ancestor-or-self::@xml:lang[last()]='ja'"
> . Could you make sure that this is possible?

I think this is asking schema to resolve a problem that should in fact be resolved at the Infoset level.  The request is basically that the values of xml:lang be inherited through the document tree, much as xml namespace prefix bindings are.  The Infoset does not currently provide for such inheritance;  xml:lang attribute values are attached only to elements on which they appear as children, not to descendents of those elements.

While it might be appealing from a user's point of view to allow navigation to ancestor elements in <alternative> XPaths (or in assertions for that matter), doing so would make element declarations (or in the case of assertions type definitions) context dependent in a manner that I think is undesirable.  For example, it would complicate the definition of what it means to start validating an arbitrary element information item against a corresponding element declaration:  would one need to navigate to that element's ancestors to determine the right values for xml:lang?  Would it be necessary to generalize this mechanism to other attributes?

I think the right way to handle this, if the need is there, is to specify a transform on the Infoset that would take as input the obvious Infoset that results from parsing an XML document per today's specification, and produces an infoset that decorates descendent elements with xml:lang as appropriate.  This transform could be applied in places where the user wishes to have that semantic (or a different property could be invented, and we could then consider extending XPath to address that additional property).  I think this is a better solution architecturally than complicating schema;  presumably the reason that this feature is requested is that the instance itself is intended to have inheritance semantics that the Infoset is failing to reflect.

Noah
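The Infoset transform described in this comment might look roughly like the following. This is a sketch under stated assumptions, not anything specified by the Infoset or XSD: an ElementTree tree stands in for the Infoset, and the name `decorate_with_lang` is hypothetical. The input tree is left untouched and a decorated copy is returned.

```python
import copy
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def decorate_with_lang(root):
    """Return a copy of the tree in which every element in the scope of an
    xml:lang attribute carries that attribute explicitly."""
    decorated = copy.deepcopy(root)

    def push_down(element, inherited):
        # An element's own xml:lang wins over the inherited value.
        lang = element.get(XML_LANG, inherited)
        if lang is not None:
            element.set(XML_LANG, lang)
        for child in element:
            push_down(child, lang)

    push_down(decorated, None)
    return decorated


# Hypothetical input: only <a> and <c> carry xml:lang in the source.
source = ET.fromstring('<a xml:lang="ja"><b><c xml:lang="en"><d/></c></b><e/></a>')
result = decorate_with_lang(source)
```

After the transform, a test such as `@xml:lang='ja'` can be evaluated against the element alone, without looking at its ancestors.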

Comment 2 C. M. Sperberg-McQueen 2007-09-05 16:53:02 UTC

Comment #1 seeks to argue that handling the value of xml:lang as
specified in the XML spec, and working with it as the infoset is
currently specified, are both outside the scope of XSDL.  This seems
to me an astonishing bit of special pleading.

It also ignores the fact that the XML Schema Working Group is on
record as accepting as in-scope the task of describing xml:lang and other
attributes which inherit their values.  We were unable to agree on
a mechanism for such inheritance, but I think the record will show
that there was no question in the Working Group about its being out
of scope for XSDL.

The alternative solution proposed in comment #1 is to make a special 
case of xml:lang.  I think XSDL already has too many special cases,
and requires too many ad hoc special-case processes from users of other
specs in the XML area; the proposal to add another to the list should
be rejected firmly and decisively. 

(I speak here only on my own behalf.)

Comment 3 Michael Kay 2007-09-07 15:38:21 UTC

Given that XSDL already augments the Infoset with values for defaulted attributes, it seems entirely reasonable to me to extend the defaulting mechanism to allow the defaulted value to be inherited from an ancestor element rather than defined as a constant in the schema. (However, this reopens the question of whether defaulted attribute values are visible to the XPath expressions used in CTA.)

Comment 4 Noah Mendelsohn 2007-09-07 18:13:07 UTC

Mike Kay writes:

> Given that XSDL already augments the Infoset with
> values for defaulted attributes, it seems entirely
> reasonable to me to extend the defaulting
> mechanism to allow the defaulted value to be
> inherited from an ancestor element rather than
> defined as a constant in the schema. (However,
> this reopens the question of whether defaulted
> attribute values are visible to the XPath
> expressions used in CTA.)

Very interesting approach.  I remain opposed to having the XPath's "look" outside the tree of the element being validated, but I am not necessarily opposed to saying that the Infoset transform about which I speculated could be included, perhaps as an option (another point of incompatibility?  Ugh. Anyway...) in XSD itself.  So then the XSD model would be:

* Augment infoset for defaults and inherited attributes (not sure how we'd specify which ones inherit, vs. special-casing just xml:lang, which seems very ugly).  

* State that validation uses the inherited attribute values, either for all purposes, or specifically for XPath data model construction.

More complexity, but if it makes users happy I might live with it.  One thing I like about this is that it makes streaming a bit less of a transformative exercise.  You just inherit the xml:lang values in the Infoset as you go.  If XPath has an ancestor axis, you need to notice it, and do the sort of things Fabio's team has proposed to turn it into a forward processing model.

So, I'm not strongly in favor, but not strongly opposed.
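The streaming remark ("you just inherit the xml:lang values in the Infoset as you go") can be illustrated with a forward-only pass that keeps a stack of in-scope languages. This is a minimal sketch, assuming ElementTree's `iterparse` as the event source; the name `stream_langs` is hypothetical and nothing here reflects an actual XSD processor.

```python
import io
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def stream_langs(xml_bytes):
    """Yield (tag, effective xml:lang) for each start tag, in document order."""
    stack = []  # effective language of every currently open element
    events = ET.iterparse(io.BytesIO(xml_bytes), events=("start", "end"))
    for event, element in events:
        if event == "start":
            inherited = stack[-1] if stack else None
            # Attributes are already available when the start event fires.
            stack.append(element.get(XML_LANG, inherited))
            yield element.tag, stack[-1]
        else:
            stack.pop()


# Hypothetical input document.
sample = b'<a xml:lang="ja"><b/><c xml:lang="en"><d/></c><e/></a>'
seen = list(stream_langs(sample))
```

The effective language is known as soon as each start tag is read, so no backward navigation is needed.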

Comment 5 Felix Sasaki 2007-09-13 13:40:39 UTC

(In reply to comment #4)
> Mike Kay writes:
> 
> > Given that XSDL already augments the Infoset with
> > values for defaulted attributes, it seems entirely
> > reasonable to me to extend the defaulting
> > mechanism to allow the defaulted value to be
> > inherited from an ancestor element rather than
> > defined as a constant in the schema. (However,
> > this reopens the question of whether defaulted
> > attribute values are visible to the XPath
> > expressions used in CTA.)
> 
> Very interesting approach.  I remain opposed to having the XPath's "look"
> outside the tree of the element being validated,

I said in the initial issue description
[the information of xml:lang .... needs to be 
interpreted as test="ancestor-or-self::@xml:lang[last()]='ja'" .]
For our use case, it does not matter if XPath is really applied or if we achieve the necessary interpretation by infoset augmentation.
So I think what you describe below would respond to our use case.

> but I am not necessarily
> opposed to saying that the Infoset transform about which I speculated could be
> included, perhaps as an option (another point of incompatibility?  Ugh.
> Anyway...) in XSD itself.  So then the XSD model would be:
> 
> * Augment infoset for defaults and inherited attributes (not sure how we'd
> specify which ones inherit, vs. special-casing just xml:lang, which seems very
> ugly).  
> 
> * State that validation uses the inherited attribute values, either for all
> purposes, or specifically for XPath data model construction.
> 
> More complexity, but if it makes users happy I might live with it.  One thing I
> like about this is that it makes streaming a bit less of a transformative
> exercise.  You just inherit the xml:lang values in the Infoset as you go.  If
> XPath has an ancestor axis, you need to notice it, and do the sort of things
> Fabio's team has proposed to turn it into a forward processing model.
> 
> So, I'm not strongly in favor, but not strongly opposed.
>

Comment 6 Noah Mendelsohn 2008-01-14 16:37:40 UTC

On the call last week, Henry asked me to send an email clarifying my thoughts on this issue. In researching the question, I realize that much of my position has been stated in the comments above [1,2]. Since Henry seems to want another try at netting things out, I'll try that here:

There are at least three fundamental questions, I think:

Q1. If xml:lang semantics are to be inherited for CTA, is this best viewed as an infoset augmentation in the spirit of attribute defaulting, or is it better to achieve this by allowing type assignments to depend on the context in which the element appears, i.e. to allow <xs:alternative> XPaths to look up the tree?

Q2. To what extent is the necessary support best provided in the XSD specification as opposed to in, say, a new version of the Infoset Recommendation?

Q3. If the preferred answer for CTA is to relax tree trimming rules, what would we do for assertions? It seems that those might also want to be sensitive to the inherited semantics of xml:lang.

I'll briefly comment on these in order.

Q1 Comments:

xml:lang inheritance is not used just for validation. One can easily imagine lots of use cases, in databinding for example, in which an application consuming the PSVI would welcome the availability of the inherited xml:lang value, even if that value played no role in determining validity or type assignment. XQuery and XSLT also seem to be use cases in which inheriting the value might be useful. Those are key reasons that I prefer to think about this in terms of infoset augmentation. I also welcome the fact that augmentation reduces the need to reopen the discussion of a hard-won compromise, which was to include CTA, but with the "attributes-only" tree trimming.

Q2 Comments (including sketch of a design proposal):

Michael took my comment #1 to imply that I was against changes to XSD to support this requirement. Not so, though the changes I think I'd prefer would not be to the tree trimming rules.

As an aside, I think the case can be made that xml:lang inheritance, and perhaps other attribute inheritance, would architecturally have been better handled in XML itself. Perhaps some alternate syntax like <e attr!="..."/> might have been used. Though a bit ugly syntactically, this would have made the inheritance work consistently regardless of whether a validation was being performed at all, regardless of what schema language was used, etc. Presumably, the Infoset and XQuery data models would have reflected such inheritance.

Nonetheless, I think XSD is a reasonable second-best or alternative for signalling the inheritance of attributes, and I'm willing to invest a bit of effort in the needed changes.

Design proposal sketch:

I certainly haven't considered this carefully, but what if we allowed on either an attribute declaration or an attribute use (not sure which), something like:

This could be a signal to augment the PSVI of descendent elements with the inherited value. Exactly which PSVI properties to set would be TBD, but it would probably be similar to attribute defaulting.

We would state that both assertion and CTA alternative XPaths would honor these inherited defaults. Perhaps we'd extend identity constraints to do the same. We would also have to state what the rules are in the case that the element info item at which validation starts has a [parent]. Our precedent with ID/IDREF is never to look outside the root of the validation for anything. If something like an editor is doing incremental revalidation of, say, some particular element within a larger document, we'd have to decide whether that processor MAY, MUST, SHOULD, or MUST NOT inherit from any xml:langs that might be above the validation root.

Q3. As noted above, my first choice is not to reopen tree trimming at all. Getting the current design was a tough compromise for all concerned. Nonetheless, some ne members of the group favor exploring that option. Regarding that, my position is:

I would "lie down in the road" against:

* Letting assertion XPaths look at ancestors or siblings of the element on which the assertion occurs. I don't want the validation of a complex type to be context dependent.

* Letting CTA XPaths look at descendents, other than attribute children, of the element on which the type assignment is being done. I'm worried about deferring the assignment of a type until content that may be well beyond the start tag is seen.

I would with some serious concerns about complexity consider:

* Letting CTA alternative XPaths look at ancestors.

That last bit is what we'd need to allow the test="ancestor-or-self::@xml:lang[last()]='ja'" that's been discussed here.

Summary:

Independent of my concerns about tree trimming, doing this as an infoset augmentation seems right to me. I have some concern about scope creep, but this seems like a good requirement against CTA, and if we can meet it without serious delay to our schedule I'd be willing to support that. I ask the group to consider proposals like the one I've made above.

If others in the group near consensus that loosening tree trimming is a better approach to this particular requirement, I'd with serious concerns consider allowing ancestor references for CTA. Unlike the infoset approach, this would not provide similar function for assertions, and I remain very much against allowing ancestor references on assertions, as that would make complex types context-dependent. That said, I hope the group will not use this issue as an opening to reconsider our compromise on tree trimming.

I note that in comment #5 [3] the original commentator states:

"For our use case, it does not matter if XPath is really applied or if we achieve the necessary interpretation by infoset augmentation. So I think what you describe below would respond to our use case."

So, as I understand it, either approach would be acceptable.

Noah

[1] http://www.w3.org/Bugs/Public/show_bug.cgi?id=5003#c1
[2] http://www.w3.org/Bugs/Public/show_bug.cgi?id=5003#c4
[3] http://www.w3.org/Bugs/Public/show_bug.cgi?id=5003#c5

Comment 7 Michael Kay 2008-01-14 17:03:35 UTC

I'm even more strongly opposed than Noah to making validation of an element depend in any way on its ancestors. I think that would have pretty devastating implications for XSLT/XQuery, which I believe rely heavily on the fact that if an element has been validated, then copying it (changing its ancestry but not its content) cannot change the validity (or assigned type) of that element.

There are some similar difficulties with inheritance-based defaulting of attribute values, but I suspect that these can be solved. However, like Noah, I fear that this is a substantial piece of work that we are not well-resourced to tackle.

Comment 8 C. M. Sperberg-McQueen 2008-01-15 03:19:21 UTC

Michael Kay writes:

    XSLT/XQuery ... rely heavily on the fact that if an element has 
    been validated, then copying it (changing its ancestry but not
    its content) cannot change the validity (or assigned type) of 
    that element.

Thank you; that's valuable information.  

Unfortunately, I'm not sure that either part of the proposition is a 
fact for XSDL 1.0 or 1.1.

If we define three element declarations with the same name and
different fixed values, or different types, in different contexts,
then creating and validating such an element in one context and then
moving it to another will certainly affect its schema-validity,
and the governing type it will have in any PSVI.  

An example, using three elements named 'pi', one at the top
level with the fixed value 3.141592, one in the context of an element
named 'indiana-almost' with the fixed value 4 (to commemorate the
occasion on which the Senate of Indiana passed a bill specifying that
the area of a circle equals the area of a square whose side is 1/4 
the circumference of the circle, which works out algebraically to saying
that pi = 4), and one in the context of an element named 'II_Chronicles'
with the fixed value of 3 (to remember II Chron 4:2-5, which effectively 
gives a value of 3 for pi), may be found, with a document instance, at

  http://www.w3.org/XML/2008/xsdl-exx/b5003.xsd
  http://www.w3.org/XML/2008/xsdl-exx/b5003.xml
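The point of the 'pi' example can be modelled in a few lines. This is a toy stand-in, not the contents of the two files linked above: the dictionary `FIXED_PI` and the function `pi_is_valid` are hypothetical, and only the three fixed values are taken from the description in this comment.

```python
import xml.etree.ElementTree as ET

# Toy model of three 'pi' declarations, keyed by the enclosing element's name.
# None stands for the top-level (global) declaration.
FIXED_PI = {None: "3.141592", "indiana-almost": "4", "II_Chronicles": "3"}


def pi_is_valid(pi, parent=None):
    """Check a <pi> element against the fixed value declared for its context."""
    context = parent.tag if parent is not None else None
    return pi.text == FIXED_PI.get(context)


# Hypothetical instances: a top-level pi and one nested in 'indiana-almost'.
top = ET.fromstring("<pi>3.141592</pi>")
indiana = ET.fromstring("<indiana-almost><pi>4</pi></indiana-almost>")
chronicles = ET.Element("II_Chronicles")
```

The same `<pi>4</pi>` element is accepted inside `indiana-almost` but rejected at the top level or inside `II_Chronicles`, because a different element declaration governs it in each context.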

An argument against allowing assertions or CTA to look upwards can 
perhaps be constructed nevertheless:  if QT relies upon the invariant 
that an element E valid against type T (independent of whether it's
valid against a particular *element declaration* or not) is valid against
T without regard for its ancestry, then changing that invariant could
be dangerous, and should be discussed extensively with QT.

But if QT relies on the proposition that elements of the same name are 
bound to the same element declaration, and to the same type, regardless
of context, then it relies on a proposition not actually guaranteed by
XSDL.  (And we would have a massive failure of inter-working-group 
coordination.  But I hope that Michael simply misspoke slightly, and QT
doesn't rely on this broader and less reliable proposition.)

Comment 9 Felix Sasaki 2008-01-15 04:43:12 UTC

(In reply to comment #6)
[snip]

> 
> I note that in comment #5 [3] the original commentator states:
> 
> "For our use case, it does not matter if XPath is really applied or if we
> achieve the necessary interpretation by infoset augmentation.  So I think what
> you describe below would respond to our use case."
> 
> So, as I understand it, either approach would be acceptable.

this is still the case.

Felix

> 
> Noah
> 
> [1] http://www.w3.org/Bugs/Public/show_bug.cgi?id=5003#c1
> [2] http://www.w3.org/Bugs/Public/show_bug.cgi?id=5003#c4
> [3] http://www.w3.org/Bugs/Public/show_bug.cgi?id=5003#c5
>

Comment 10 Michael Kay 2008-01-15 12:11:22 UTC

OK, I should have been more precise.

What QT depends on is that once an element has been validated and a type has been assigned, that type remains an accurate description of the content of the element when the ancestry of the element changes or when other changes are made to the containing document provided they do not affect the element's content. In general there is no requirement that a revalidation of the element after such changes would assign the same type again, but there is a requirement that revalidation against the same type would "succeed": that is, the type annotation of a node is always trustworthy.
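The invariant stated here can be contrasted with a context-sensitive rule in a short sketch. Both checks below are hypothetical stand-ins for "validation against a type" and are not XSD semantics: `valid_context_free` looks only at the element's own subtree, while `valid_context_sensitive` also consults xml:lang inherited from a supplied list of ancestors (outermost first).

```python
import copy
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def valid_context_free(payment):
    """Hypothetical type check that inspects only the element's own subtree."""
    amount = payment.find("amount")
    return amount is not None and (amount.text or "").isdigit()


def valid_context_sensitive(payment, ancestors):
    """Hypothetical check that also requires the in-scope language to be 'de'."""
    langs = [a.get(XML_LANG) for a in ancestors if a.get(XML_LANG) is not None]
    # The nearest ancestor is last in the list; the element's own value wins.
    lang = payment.get(XML_LANG, langs[-1] if langs else None)
    return valid_context_free(payment) and lang == "de"


# Copy a payment from one parent to another, changing ancestry but not content.
old_parent = ET.fromstring(
    '<order xml:lang="de"><payment><amount>10</amount></payment></order>'
)
payment = old_parent.find("payment")
new_parent = ET.fromstring('<invoice xml:lang="en"/>')
copied = copy.deepcopy(payment)
new_parent.append(copied)
```

The context-free result survives the copy unchanged, which is the property described above; the context-sensitive result flips even though the element's content is identical.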

Comment 11 Fabio Vitali 2008-01-17 18:17:12 UTC

Hello, 

My reading of Noah's comment [6] is that we have three possible ways to solve the xml:lang issue proposed by Felix [1]: 

* We could augment the Infoset to include room for xml:lang
* We could augment the Infoset to include room for inherited attribute values
* We could relax the tree trimming rules adopted for CTA to allow for upward sneaking. 

Please allow me to give you my .5 cents on the issue (i.e., half a cent, which, although being euro cents, so somewhat higher than .7 US cents, is still way lower than 5 cents even US). 

I first will note my strong opposition to any augmentation of the Infoset to deal with specific instance-level attributes, even when they belong to privileged namespaces such as xml. It is just wrong both to grant special status to some namespaces and to cut out exceptions for this or that special case. I think that the important lesson to remember whenever these situations are known is that they exemplify issues that may be found in any vocabulary, and that they are more prominent and urgent, but not intrinsically special and different, when they happen to important vocabularies. Extending the infoset for xml:lang would provide a solution for this specific case, but not for the myriads of other identical situations of lesser vocabularies. 

On the second bullet, inherited attribute values, I have deep concerns about introducing a new, untested concept out of the blue just to provide support for the issue at hand. I see at least three problems with this:
- First of all, let me point out similarities and differences with precedents, such as the #CURRENT specification in SGML attributes, which would require a missing attribute to assume the value of an attribute with the same name appearing somewhere before (rather than upward from) the current element; several reasons, mostly having to do with the visibility and necessity of such a feature, have made it disappear in XML, and I don't recall damp eyes about this: let us think about this carefully.
- Secondly, it introduces an unjustified differentiation between elements and attributes, which were carefully drafted out of the previous versions of the XSDL language: it would now become a relevant deciding factor in the perennial element/attribute discussion to have some special properties applicable to attributes and not to elements: let us think about this carefully.
- Finally, I don't think we are ready to face the impact of inherited attribute values on the rest of the language. A few examples: do we allow inheritance on attributes of type ID? On attributes on which uniqueness constraints exist? On local attributes? On nillable attributes? Is inheritance shared only among elements that explicitly define such an attribute, or would it be available on elements that do not define it? On elements that block it? Would an intermediate element that does not allow the inherited attribute block inheritance or not? Who wins between an inherited value and a default value? If I have a global and a local definition of an attribute with the same name and the same type, would they participate in the inheritance if they are nested? Would two local attributes inherit from each other? I could go on.  Let us think about this carefully.
In short, we don't have time to think about it carefully, and it seems to me that the only real justification for it is the desire to not mess again with the tree trimming compromise. A weak reason, it seems to me.

Finally: relax tree trimming rules. As Noah notices, 

> some ne members of the group favor exploring that option. 
> Regarding that, my position is:
> 
> I would "lie down in the road" against:
> 
> * Letting assertion XPaths look at ancestors or siblings of the element on
> which the assertion occurs.  I don't want the validation of a complex type to
> be context dependent.
> 
> * Letting CTA XPaths look at descendents, other than attribute children, of the
> element on which the type assignment is being done.  I'm worried about
> deferring the assignment of a type until content that may be well beyond the
> start tag is seen.
> 
> I would with some serious concerns about complexity consider:
> 
> * Letting CTA alternative XPaths look at ancestors.  
> 
> That last bit is what we'd need to allow the
> test="ancestor-or-self::@xml:lang[last()]='ja'" that's been discussed here.

It will not come as a surprise to any of you that I am among those who would favor exploring that option. Let me summarize: 
* Does not create a special case for a specific attribute or a specific namespace. 
* Does not create a spurious distinction between elements and attributes. 
* Does not create ambiguous interaction patterns with other features of the language, including local/global definitions, defaults, uniqueness, nillability, etc. 
* Does not require new syntax
* Does not require new text in the draft
* Does not provide a better narrative than "we weren't able to think of a more elegant solution so we decided to work around the specific use case"
* Does provide a reasonable solution for this and many similar cases
* Does show a way forward towards the removal of artificial constraints in XSDL constructs
* Does provide a reasonable narrative in terms both of reuse of existing constructs and of general applicability to a class of use cases. 

In a separate message my own take at the idea of context-free validation. 

Ciao

Fabio

[1] http://www.w3.org/Bugs/Public/show_bug.cgi?id=5003#c1
[6] http://www.w3.org/Bugs/Public/show_bug.cgi?id=5003#c6

Comment 12 Michael Kay 2008-01-17 18:29:34 UTC

>It will not come as a surprise to any of you that I am among those who would favor exploring that option [relax the tree trimming rules]. let me summarize: 
* Does not ....

But what it does do is to destroy the property that you can validate an element, label it with a type annotation, and then copy the element to a new tree knowing that the type annotation will still be sound. That property is extremely important to QT.

I do sympathize with the argument that attribute inheritance will introduce a lot of new complexity. After all, that's the way namespaces work, and we know how much they have to answer for in terms of adding complexity.

Comment 13 Fabio Vitali 2008-01-18 00:21:56 UTC

Dear Michael,

> But what it does do is to destroy the property that you can validate an
> element, label it with a type annotation, and then copy the element to a new
> tree knowing that the type annotation will still be sound. That property is
> extremely important to QT.

I have been trying to come to grips with this and other comments along the same line for a while now, and I haven't been able to understand their point. Context-free validation can be considered as a property or a goal in the same manner that (I'll steal the example from MSM in [1]) setting PI = 3 can be considered a property or a goal: it would surely speed computations, but the results will not be as precise as they should be, and we will need a lot of post-process adjustments to fit observed data with the computed theory. 

A schema is a contract between a data producer and a data consumer about the range of acceptable data set that the former promises to deliver and the latter promises to accept. Acceptability is NOT an abstract quality of documents and it is in general NOT constrainable to context-free content models. Constraining validity to content models only (i.e. to the subtree of each element) is a simplification that separates validity from acceptability.
  
The acceptability of some elements in some document types does OFTEN depend on circumstances that exist outside of the subtree they head. That is unfortunate but is true, as the xml:lang example can testify, as well as the HTML form/input example mentioned by MSM in [2], the <alternative> without test attribute in XSDL 1.1, or for that matter even ID/IDREF pairs. I can provide as many examples as you care to receive: once you know what to look for, you start finding literally thousands of such situations in basically any XML language.   

Requiring validation to be context-free does not change these facts, nor does it provide an alternative and brilliant workaround to them: it merely redefines "validity" to mean something less than "acceptability". It leaves downstream applications with the hard choice between implementing post-validation code to verify the inexpressible constraints and blindly accepting possibly wrong documents. 

To draw from our current use case, if you have a fragment with no xml:lang attribute set that is down a tree that has an xml:lang set above, and move this fragment to another tree that has no xml:lang attribute anywhere, then you can cover your eyes and pretend that the fragment maintains type annotation and validity, but in fact it does not. Or at least, it maintains them if you tweak the definition of validity, but it still becomes "unacceptable" in a wider sense, since the actual language of the fragment is not reflected in any explicit attribute of the new tree. This has nothing to do with types, XML Schema, assertions and acceptable XPath syntax. It is just a wrong (or, shall we say, "simplified") assumption about what happens when you move data from one position (where it was acceptable) to another (where it is not). 

We can decide that we may live with such simplification, and stick to it despite the proposed use cases, but let's not tell ourselves it is a design goal or a theoretical property chosen as a feature: it is really just a sad admission of implementation difficulties. Reasons for choosing this or other simplifications need to be grounded on engineering concerns, and not on abstract principles that have no actual correspondence in reality. 

Ciao

Fabio

[1] http://www.w3.org/Bugs/Public/show_bug.cgi?id=5003#c1
[2] http://www.w3.org/Bugs/Public/show_bug.cgi?id=5297

Comment 14 Michael Kay 2008-01-18 09:11:53 UTC

I'm fine having a rule that says "if the language is DE then the currency must be EUR". But what is the object to which this rule relates? I don't think it's a rule about the validity of the currency, I think it's a rule about the validity of some object that contains both a language and a currency.

Now we could have done this differently, of course. The principle outlined above is not an absolute. But it happens to be an assumption that's built pretty deeply into QT's adoption of a type system based on XML Schema. When in XSLT I do:

<invoice xml:lang="en">
  <xsl:copy-of select="$payment" validation="preserve"/>
</invoice>

I'm relying on the fact that if the payment was valid in one context, then it is going to remain valid in a different context.

So I don't think there's anything philosophically absurd about the idea that the validity of an object should be context-sensitive, I'm just saying that changing the rules of the game in this way is going to be unacceptable to two of the important stakeholders.

Comment 15 Fabio Vitali 2008-01-19 09:13:50 UTC

Michael, 

> So I don't think there's anything philosophically absurd about the idea that
> the validity of an object should be context-sensitive, I'm just saying that
> changing the rules of the game in this way is going to be unacceptable to two
> of the important stakeholders.

In fact, that was my point: context-independence of type is more a political issue than a technical one, and it is definitely not a design issue.  

Ciao

Fabio

Comment 16 Noah Mendelsohn 2008-01-21 19:53:17 UTC

Michael Kay writes:

> I'm fine having a rule that says "if the language is DE then the currency must
> be EUR". But what is the object to which this rule relates? I don't think it's
> a rule about the validity of the currency, I think it's a rule about the
> validity of some object that contains both a language and a currency.
> 
> Now we could have done this differently, of course. The principle outlined
> above is not an absolute. But it happens to be an assumption that's built
> pretty deeply into QT's adoption of a type system based on XML Schema. When in
> XSLT I do:
> 
> <invoice xml:lang="en">
>   <xsl:copy-of select="$payment" validation="preserve"/>
> </invoice>
> 
> I'm relying on the fact that if the payment was valid in one context, then it
> is going to remain valid in a different context.
> 
> So I don't think there's anything philosophically absurd about the idea that
> the validity of an object should be context-sensitive, I'm just saying that
> changing the rules of the game in this way is going to be unacceptable to two
> of the important stakeholders.

Yes, exactly.

Fabio Vitali writes:

> In fact, that was my point: context-independence of type is more a political
> issue than a technical one, and it is definitely not a design issue.  

I really don't think it's fair to say that.  

I can't speak for others who were involved or for the working group as a whole, but I think it's fair to say that I was among those who played a significant role in suggesting [1,2] the "Tag/Type" distinction, and what's now known as Complex Types.  I can tell you that the ability to use them in the manner that XQuery later chose to do was intentional on my part in proposing them.  I was not otherwise involved in the design of XQuery, but as Michael Kay says, they seem to have used context-independent types in exactly the manner that I (and I presume others, though perhaps not all others in the working group) expected when the design for types was proposed.

So, as Michael says, while there are other technically plausible ways that both XML Schema and/or XQuery could have been designed, I certainly don't think that makes the decisions political.  As far as I'm concerned, technical trade-offs were made in the design of Schema.  We bit off the conceptual and design complexity of separating complex types from element declarations, and we built a lot of machinery to make them first class abstractions and sufficiently context-insensitive to use in the manner that Michael describes.  The XQuery working group decided to use the abstraction in what I consider the manner intended: I.e. to ensure that one has assignment compatibility between two "element"s that have the same complex types, and thus to be able to use schema types for functions, etc.

Surely there are trade-offs embodied in those choices.  I don't think it's fair to imply that those who made the trade-offs did so for primarily political reasons.  The costs and benefits are primarily technical, IMO.

Furthermore, as Michael also says:  changing this now would be far more disruptive than changing it before XQuery had come to depend on it.  I strongly favor keeping our types context-independent, to the degree that they currently are.

Noah

Comment 17 Noah Mendelsohn 2008-01-21 19:58:23 UTC

Hmm. For some reason, the references I made in comment #16 didn't get included.  They were supposed to be (quoting the pertinent part of that posting):

...

> I can't speak for others who were involved
> or for the workgroup as a whole, but I think
> it's fair to say that I was among those who
> played a significant role in suggesting [1,2]
> the "Tag/Type" distinction, and what's now known
> as Complex Types in particular.

...

Noah


[1] http://lists.w3.org/Archives/Member/w3c-xml-schema-wg/1999Mar/0029.parts/tagtype.html
[2] http://www.w3.org/XML/Group/1999/03/10-schema-notes.html#ID4.4.2

Comment 18 Michael Kay 2008-01-22 15:13:39 UTC

A technical requirement is one that you meet because your users need it; a political requirement is one that you meet because your paymasters think they want it. This is a technical requirement.

Comment 19 Henry S. Thompson 2008-01-23 11:40:35 UTC

(In reply to comment #13)
> Dear Michael,
>
>> But what it does do is to destroy the property that you can
>> validate an element, label it with a type annotation, and then copy
>> the element to a new tree knowing that the type annotation will
>> still be sound. That property is extremely important to QT.
>
> I have been trying to come to grip to this and other comments along
> the same line for a while now, and I haven't been able to understand
> their point.  Context-free validation can be considered as a
> property or a goal . . .

Not 'considered as a property', it _is_ a property of validity as
defined in XSDL 1.0 as published and XSDL 1.1 as currently spec'd.  As
Mike Kay points out, it's a property QT depends on.  So we don't change
it without careful thought and good reason.

> . . .

> The acceptability of some elements in some document types does OFTEN
> depend on circumstances that exist outside of the subtree they
> head. That is unfortunate but is true, as the xml:lang example can
> testify, as well as the HTML form/input example mentioned by MSM in
> [2], the <alternative> without test attribute in XSDL 1.1, or for
> that matters even ID/IDREF pairs.

XSDL 1.0 provides two design patterns for addressing these cases:
local declarations and the notion of choosing where best to 'stand'.
By the latter I mean our approach to Identity Constraints and the
legacy ID/IDREF types: these are, correctly in my view, described as
determining the validity of the appropriate scoping subtree, not the
immediate locus of one or the other of the items which duplicate or
fail to match.

Local declarations, which enable the tag-type distinction, are a very
powerful way of encapsulating and managing context dependence which
has worked well, as far as I can tell.  Not only in the sForS, but in
major production schemas (judging from my random sample in 2004), this
mechanism gets a lot of use.

So my inclination in looking for ways to address the examples which
currently cannot be handled by XSDL is to look to our existing
architecture.

Let's consider the three cases Fabio mentions.

 1) That last <alternative>: Feels right to me to address this from
    the "where best to stand" perspective -- just as the
    spec. currently states this as a constraint on xs:element, I would
    choose to enforce this in the sForS at that point, with an
    assertion along the lines of

  <xs:assert test="not(xs:alternative[not(@test) and
                                      position() &lt; last()])"/>

  or

  <xs:assert test="every $a in xs:alternative satisfies ($a/@test or $a is xs:alternative[last()])"/>


 2) HTML form/input: In principle I think again that this is a
    question of where best to stand.  On balance I would rather say
    "x:form defines a scope within which x:input may occur" than
    "x:input must only occur within x:form".  To achieve this within
    XSDL, we would need to allow type definitions to be dynamically
    scoped.  I think that would be a great feature, but not for 1.1
    :-(.  In its absence, I'll say "Either do the hard graft to use
    the local element declaration approach outlined in [1], or accept
    that this is a constraint which XSDL can't express without more
    complexity than you're prepared to accept."

 3) xml:lang: I stated elsewhere [2] that I prefer the defaulting
    approach to this rather than the inheritance approach, but however
    we do it, just implementing inheritance won't allow us to solve
    the problem put to us, i.e. to choose a type for an element based
    on its xml:lang value.  The reason for this is that whichever
    approach we take, we will need to know the type of the element in
    question _before_ we know whether to use the inherited value for
    its xml:lang or not, since that type might itself _specify_ a
    value for that attribute.  But we want to appeal to the value of
    xml:lang to _determine_ the type.  Result: impasse.

    Note that this impasse doesn't go away if we loosen the
    restrictions of looking up the tree when choosing alternatives.

    My preferred solution, which I nearly thought of during our
    initial discussion of tree trimming, would be something along the
    lines of saying that _if_ an element declaration includes an
    explicit value for {type definition}, that

     a) all alternatives must be derived from it and
     b) its attribute declarations are used to type attribute values
        in the XDM which is used for alternative selection.

    I think this idea has independent value, and I hope the group will
    consider it.  It contributes to a solution of the xml:lang
    problem, by allowing the relevant declaration for xml:lang to be
    moved out of the alternatives and into the {type definition}, thus
    breaking the circularity identified above.

[1] http://www.w3.org/2000/04/26-csrules.html
[2] http://lists.w3.org/Archives/Member/w3c-xml-schema-ig/2008Jan/0024.html
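
A minimal sketch of the schema shape that points (a) and (b) of comment 19 seem to call for, using the type names from the original report. The attributes "reading" and "gloss" are hypothetical, added only so that the derived types differ from their base. The sketch shows only the declarations; whether a processor would actually use annotationType to type xml:lang before choosing an alternative is exactly what the proposal leaves for the WG to decide.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Brings in the declaration of xml:lang. -->
  <xs:import namespace="http://www.w3.org/XML/1998/namespace"
             schemaLocation="http://www.w3.org/2001/xml.xsd"/>

  <!-- Declared {type definition}: the only place where xml:lang is declared. -->
  <xs:complexType name="annotationType">
    <xs:simpleContent>
      <xs:extension base="xs:string">
        <xs:attribute ref="xml:lang"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>

  <!-- (a) Every alternative type is derived from the declared type,
       so none of them needs to redeclare xml:lang. -->
  <xs:complexType name="rubyType">
    <xs:simpleContent>
      <xs:extension base="annotationType">
        <!-- hypothetical attribute, for illustration only -->
        <xs:attribute name="reading" type="xs:string"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>

  <xs:complexType name="glossingType">
    <xs:simpleContent>
      <xs:extension base="annotationType">
        <!-- hypothetical attribute, for illustration only -->
        <xs:attribute name="gloss" type="xs:string"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>

  <!-- (b) The tests refer only to an attribute declared by annotationType. -->
  <xs:element name="annotation" type="annotationType">
    <xs:alternative test="@xml:lang = 'ja'" type="rubyType"/>
    <xs:alternative test="@xml:lang = 'en'" type="glossingType"/>
  </xs:element>

</xs:schema>
```

Because xml:lang is declared once, on the declared type, no alternative can specify its own value for it, which is the circularity the comment describes.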

Comment 20 David Ezell 2008-01-24 18:13:17 UTC

We discussed at the f2f, and there are still deep divisions within the WG about how to proceed.  The problem as stated in the original issue suggests various approaches, each with detractors and some with proponents.

Comment 21 C. M. Sperberg-McQueen 2008-02-04 16:17:19 UTC

In an effort to make better use of Bugzilla, we are going to use the
'severity' field to classify issues by perceived difficulty.  This 
bug is getting severity=minor to reflect the existing whiteboard note
'easy'.

Comment 22 Michael Kay 2008-02-04 19:37:15 UTC

I wonder if I can put forward another suggestion: parameterized validation. Currently we say that the validity and type of an element depend only on (a) the content of the subtree rooted at that element, and (b) the input conditions for the validation, such as an optional element declaration and type. One way forward might be to allow additional parameters to the validation, and to allow these parameters to be referenced as XPath variables in expressions such as the alternative and assertion expressions. Such parameters could be externally defined by the agent invoking the validation episode, or they could be bound during the validation of some containing tree. So $language might be such a parameter, and it might be set either by the user invoking the validation with $language=de, or it might be set by the appearance of xml:lang="de" on some containing element.

There would seem to be many potential use cases for this facility. For example, I often cite the case of a document in a workflow where the criteria for passing on to the next stage become increasingly strict as the workflow proceeds. One could represent this by a validation parameter "stage", and assess whether the document is "valid given stage=3".

This sounds rather beyond the scope of 1.1, but it feels to me like an interesting way forward.

Comment 23 Noah Mendelsohn 2008-02-04 21:11:07 UTC

Michael Kay writes:

> I wonder if I can put forward another suggestion:
> parameterized validation.  Currently we say that
> the validity and type of an element depend only on
> (a) the content of the subtree rooted at that
> element, and (b) the input conditions for the
> validation, such as an optional element
> declaration and type. One way forward might be to
> allow additional parameters to the validation, and
> to allow these parameters to be referenced as
> XPath variables in expressions such as the
> alternative and assertion expressions.

> [...]

> This sounds rather beyond the scope of 1.1, but it
> feels to me like an interesting way forward.

Yes, interesting, but definitely beyond 1.1 IMO.  

Beyond that, I have to say that my intuition is still to be quite conservative about things like this.   One of my conclusions about XML Schema in a presentation I gave years ago [1] was: "There's no such thing as a simple feature."  Like so many other things in our language that seem simple in isolation (e.g. locally scoped elements), I'm concerned that things like this add conceptual as well as implementation complexity.  In the case of these parameterized types, I think there's a sense in which we'd now be saying:  a type now validates not just the subtree to which it applies, but the combination of the subtree and zero or more parameters.  If I'm building a databinding system that creates rather than consumes XML, what's my data model?  Does it include the parameters?  If not, what data do I collect?  Presumably, if I want the resulting XML to be valid, I need to be able to not just build up the XML itself, but also the corresponding parameters that will be necessary for its validation.  One way or the other, there's significant conceptual and practical complexity there, at least in some scenarios.

I certainly agree that there are cool things you could do with a system like this, and so I agree with what I take to be the essence of your proposal:  I.e. that when and if we start planning the next non-bugfix release of Schema, this is an interesting idea to consider.  I'm just saying that I can also see reasons why it might not work out from a cost/benefit point of view.

Noah

[1] http://www.intertwingly.net/slides/2002/devcon/SchemaSecrets.ppt

Comment 24 Pete Cordell 2008-02-19 12:01:00 UTC

(In reply to comment #22)
> ...it might be set either by the user invoking the validation with
> $language=de, or it might be set by the appearance of xml:lang="de" on some
> containing element.

I sense this is being put on the back burner, so I'm just dumping this here as a place to store the idea (just in case what's suggested isn't implicit to you guys from what has been mentioned).  Merging Noah's comment #6 proposal and this from Michael Kay might lend itself to a syntax along the lines of:

<xs:attribute ref='xml:lang' validationParameterName='$language'/>

This would instruct the processor to put the value of xml:lang into some kind of dictionary (possibly analogous to the ID/IDREF dictionary) such that descendants could access the value by using $language in assertions. 

Naturally this does suffer from all the problems of descendant validation being context-dependent.

Comment 25 C. M. Sperberg-McQueen 2008-06-13 16:58:10 UTC

The XML Schema WG discussed this issue at its telcon of 13 June 2008,
and noted that, as in the past, the WG has no consensus on how to
resolve this issue.  As a group, therefore, we agreed to close the
issue with the notation LATER, in the hopes of being able to return to the
topic at some future date (but as far as can be predicted this will not
be in time for XSD 1.1).  The W3C representative dissented, on the grounds
that the correct way to resolve the issue is to do as suggested and
eliminate the restriction on upward pointing references in conditional
type assignment tests, or failing that to find some other way to support
xml:lang and other constructs that use the same idiom.

Felix, as the originator of the issue and as the representative of the
i18n Core WG, we ask that you report this result back to the I18n Core
WG, consider this disposition of the issue, and let us know by closing 
the issue or by reopening it whether you are satisfied with the XML 
Schema WG's efforts to resolve it and with the ultimate disposition 
of the issue.  If we do not hear from you before the end of August, we
will assume that the I18n Core WG is willing to accept this disposition
and does not wish to reopen it or to appeal it to the Director.

Thank you.

Comment 26 Felix Sasaki 2008-06-26 12:42:42 UTC

Dear XML Schema Working Group,

this is a response on behalf of the i18n Core Working Group.

We discussed your resolution
http://www.w3.org/2008/06/18-core-minutes.html#item05
We are concerned that there is no agreeable way to implement our
requirement of taking inheritance of xml:lang into account for
conditional type assignment. Due to the importance of xml:lang we would
at least like to have a health warning explaining the problem and an
example of how to address it, that is by spelling out xml:lang attributes
at elements which need conditional type assignment. If you agree on this
proposal we are very willing to draft text for this proposal.

Thank you,

Felix

Comment 27 David Ezell 2008-08-04 14:37:00 UTC

ACTION 2008-08-01: MSM to add note on 5003 noting request by W3C staff that we continue discussion

Comment 28 Felix Sasaki 2008-08-11 12:35:41 UTC

Hello,

this is an additional comment from the i18n core Working Group.
See comment 5 at http://www.w3.org/International/reviews/0807-xsd11/
The I18N Core Working Group has discussed this topic again and decided to ask you again to consider a technical solution for the issue. Looking into the discussion of http://www.w3.org/Bugs/Public/show_bug.cgi?id=5003 , we are aware of the problems which arise, but think that xml:lang is important enough to warrant a solution.
Thank you,
Felix

Comment 29 C. M. Sperberg-McQueen 2008-10-29 18:23:39 UTC

I have been instructed to record that the considered opinion of the
responsible W3C staff, at an Architecture domain meeting last summer,
is that this issue needs to be reopened and resolved in a way that allows
conditional type selection to use xml:lang.  I took an action some time
ago to record this fact, but have not yet done so; I do so now.

Comment 30 Sandy Gao 2009-01-05 20:44:29 UTC

During its 2008-12-12 telecon, the schema WG adopted a proposal to address this issue.

The proposal can be found at (member-only):
http://www.w3.org/XML/Group/2004/06/xmlschema-1/structures.b5003c.html

Changes include:
1. Certain attributes can be declared "inheritable"
2. When an inheritable attribute is available (either specified or defaulted) on an ancestor element, then it's included in the [inherited attributes] PSVI property of descendant elements.
3. When evaluating type alternative XPaths, [inherited attributes] are accessible as if they were also specified on the current element.
4. Restriction rules are also updated so that the {inheritable} property of an attribute in a restriction type has the same value as the same-named attribute in the base type, if any.

For example, to support the xml:lang scenario raised in this bug, the xml:lang attribute can be declared "inheritable":

<xs:attribute name="lang" type="xs:language" inheritable="true">
...
</xs:attribute>

Then the schema snippet in the original bug report will work.

Note the linked proposal shows "heritable" as the name of the properties. The WG agreed to use "inheritable" instead.

With these changes, the WG believes that the issue raised in this bug report
is fully addressed. I'm marking this RESOLVED accordingly.

Felix and Michael, as the persons who opened and reopened this issue, if you would indicate your concurrence with or dissent from the WG's disposition of the comment by closing or reopening the issue, we'll be grateful. If we don't hear from you in the next two weeks, we'll assume that silence implies consent.
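
As a rough illustration of changes 2 and 3 listed in comment 30, consider an instance fragment validated against the "annotation" declaration from the original report, with xml:lang declared inheritable. The "doc" wrapper element is hypothetical. The second case also assumes that a value specified on the element itself takes precedence over an inherited one, which matches the usual xml:lang semantics but is not spelled out in the summary.

```xml
<doc xml:lang="ja">

  <!-- No xml:lang of its own. The value "ja" from <doc> appears in its
       [inherited attributes], so the test @xml:lang='ja' is satisfied
       and rubyType is selected. -->
  <annotation>kanji</annotation>

  <!-- xml:lang is specified locally. Assuming the local value wins,
       the test @xml:lang='en' is satisfied and glossingType is selected. -->
  <annotation xml:lang="en">Chinese characters</annotation>

</doc>
```

Without the inheritable declaration, the first annotation would match neither test and would keep the declared type, annotationType.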

Comment 31 Felix Sasaki 2009-01-06 03:18:55 UTC

(In reply to comment #30)
> During its 2008-12-12 telecon, the schema WG adopted a proposal to address this
> issue.
> 
> The proposal can be found at (member-only):
>   http://www.w3.org/XML/Group/2004/06/xmlschema-1/structures.b5003c.html
> 
> Changes include:
> 1. Certain attributes can be declared "inheritable"
> 2. When an inheritable attribute is available (either specified or defaulted)
> on an ancestor element, then it's included in the [inherited attributes] PSVI
> property of descendant elements.
> 3. When evaluating type alternative XPaths, [inherited attributes] are
> accessible as if they were also specified on the current element.
> 4. Restriction rules are also updated so that the {inheritable} property of an
> attribute in a restriction type has the same value as the same-named attribute
> in the base type, if any.
> 
> For example, to support the xml:lang scenario raised in this bug, the xml:lang
> attribute can be declared "inheritable":
> 
> <xs:attribute name="lang" type="xs:language" inheritable="true">
>   ...
> </xs:attribute>
> 
> Then the schema snippet in the original bug report will work.
> 
> Note the linked proposal shows "heritable" as the name of the properties. The
> WG agreed to use "inheritable" instead.
> 
> With these changes, the WG believes that the issue raised in this bug report
> is fully addressed. I'm marking this RESOLVED accordingly.
> 
> Felix and Michael, as the persons who opened and reopened this issue, if you
> would indicate your concurrence with or dissent from the WG's disposition of
> the comment by closing or reopening the issue, we'll be grateful. If we don't
> hear from you in the next two weeks, we'll assume that silence implies consent.
> 

Hello Sandy,

I am very happy about this resolution and will convey it to the i18n core Working Group. Thank you very much!

Felix