This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
In our design of assertions and conditional type assignment, we have for some time taken the position that the XPath expressions used should not point up or left (or right) from the element on which the assertions are checked (or the type of which is being calculated). This was not a unanimous decision, if I recall correctly, but one to which some WG members assented reluctantly.
It seemed clear, when we reached agreement on this point, that the prohibition was inconvenient, but as far as we could tell it did not actually affect the expressive power of assertions. One example used at the time (see http://lists.w3.org/Archives/Member/w3c-xml-schema-ig/2006Mar/0041.html for an appearance of this example in WG discussion) was that of the HTML 'input' and 'form' elements: 'input' is legal only within a 'form' element. A natural way to express the constraint is to write an assertion of the form
    ancestor::html:form
or
    count(ancestor::html:form) > 0
in the declaration of the type assigned to the input element.
We came to the conclusion that upward-pointing assertions were not essential to this scenario, however, when we realized that one could place the assertion on the HTML element, and formulate it as
    count(.//input) = count(.//form//input)
Recently I have become aware of a fallacy in our reasoning, which leads me to conclude that we reasoned our way to a false conclusion and that we need to reconsider the decisions based in part on that conclusion.
HTML is defined as a modular vocabulary, and the forms module is intended to be usable in arbitrary XML vocabularies, not only in (X)HTML itself. Placing the assertion on the HTML element does not guarantee that it will be enforced wherever 'form' and 'input' are used. There is no element on which the assertion can be placed so that it reliably enforces the constraint.
It appears, then, that we were wrong to believe that the current form of assertions in XSDL 1.1 could be used to handle the HTML form/input use case.
Under the circumstances, I no longer believe it plausible to forbid assertions to point upward in the tree, and believe that we should drop the restriction (and with it, our mechanisms for 'tree trimming' prior to XPath evaluation).
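To make the gap concrete, here is a minimal sketch in Python. It is not a schema processor: it uses the standard library's ElementTree as a stand-in, ignores namespaces, and the helper names (descendants_named, downward_assertion_holds, violations) and the 'report' vocabulary are invented for illustration. It evaluates the downward-pointing assertion only on elements named 'html', which is where the assertion would be declared.

```python
import xml.etree.ElementTree as ET

def descendants_named(elem, name):
    """All proper descendants of elem with the given tag, like .//name."""
    return [d for d in elem.iter(name) if d is not elem]

def downward_assertion_holds(elem):
    """count(.//input) = count(.//form//input), with elem as context node."""
    all_inputs = descendants_named(elem, "input")
    # Collect by identity so an input under nested forms is counted once.
    inside_forms = {id(i)
                    for f in descendants_named(elem, "form")
                    for i in descendants_named(f, "input")}
    return len(all_inputs) == len(inside_forms)

def violations(doc_root, host="html"):
    """Elements named `host` (where the assertion is declared) that fail it."""
    return [e for e in doc_root.iter(host) if not downward_assertion_holds(e)]

# A stray input inside (X)HTML: the assertion on 'html' catches it.
xhtml = ET.fromstring("<html><body><input/></body></html>")
# The same stray input in a hypothetical host vocabulary: no 'html'
# element is present, so the assertion is never evaluated.
other = ET.fromstring("<report><section><input/></section></report>")

print(len(violations(xhtml)))
print(len(violations(other)))
```

Under these assumptions the first document yields one violation and the second yields none, although both contain an 'input' with no enclosing 'form'.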
I'm against allowing upward pointing XPaths. Let's try and remember the 80/20 principle. I don't think we have to scour the world looking for every conceivable use-case and then accommodate it.
I think it's a good principle that an element is valid or invalid based only on its content, and not on its context in some larger document. XSLT and XQuery rely on this principle, allowing you to assemble a document bottom-up, knowing that if a component is valid in itself, then it will still be valid when inserted into a larger tree. We do have one exception to this, the document-level constraints on ID/IDREF, but that's manageable.
In the use case cited, if we argued that an input element can never exist without a containing form, then it would be impossible for XSLT to construct the form. There are definitely contexts where it's quite reasonable for an input element to exist without the containing form, for example in the results of a query performing introspection on the form. This truly is a constraint on valid HTML elements, not a constraint on input elements.
There is one usability issue with constraints like
    count(.//input) = count(.//form//input)
- it's very hard to produce an error message that identifies the offending input element. If users can be persuaded to write such constraints in the form
    every $i in .//input satisfies ($i intersect .//form//input)
then it might be possible to do better. At present our restricted XPath subset discourages this.
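The diagnostic difference between the two formulations can be sketched in Python. This is only an illustration using the standard library's ElementTree, not an XPath 2.0 evaluation; the function name inputs_outside_forms and the 'name' attributes are assumptions made for the example, and namespaces are ignored. A count comparison yields only true or false, whereas a per-item check can return the items that fail.

```python
import xml.etree.ElementTree as ET

def inputs_outside_forms(context):
    """Inputs under `context` that fail
    `every $i in .//input satisfies ($i intersect .//form//input)`."""
    # Identity set of every input that lies under some descendant form.
    covered = {id(i)
               for f in context.iter("form") if f is not context
               for i in f.iter("input")}
    return [i for i in context.iter("input")
            if i is not context and id(i) not in covered]

doc = ET.fromstring(
    "<html><body>"
    "<form><input name='a'/></form>"
    "<input name='b'/>"
    "</body></html>"
)

offenders = inputs_outside_forms(doc)
print([i.get("name") for i in offenders])  # ['b']
```

Because the check walks the inputs one at a time, a processor following this pattern has a specific node to report, which is the improvement the quantified form makes possible.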
I agree with both Pete and Mike: we don't have to handle every use case, and there are good architectural reasons for types (and thus elements) to have context-independent validation requirements. Also: I assume I'm correct that the original bug report took a shortcut in asking for XPaths that could point up, left or right. Presumably what's wanted is that, along with relaxation of the tree trimming rules that would make such added XPath capability useful? I remain opposed, but it's certainly useful to have the proposal be clear in any case. Did I correctly infer what was intended? Thank you. Noah
Comments #1, #2, and #3 all argue that we should not make any change here, first because there is no need to support every conceivable use case and second because there are good reasons for validation to be context independent. These arguments are not wholly satisfying.
The first presents a generally accepted principle as if it were an argument: we cannot support all conceivable use cases. This is not in dispute, in part because some of us are well aware that it is not possible to do so even in theory -- but it does not address the salient question, which is: should we seek to support *this* use case?
This particular example is not a discovery made after long search: the context awareness of HTML form and input elements has been a topic of conversation, in the context of schema languages and their expressive power, for about as long as HTML has had DTDs. The last I heard, HTML was not an obscure language of no particular interest for the world at large; the ability to specify features built into widely used markup languages is, I think, something the designer of any mechanism for specifying markup vocabularies should be thinking hard about, if the mechanism being designed is intended to be of general use.
The case for taking this use case seriously is simple: HTML is the flagship markup language of W3C, and the feature makes it possible to define a well known part of HTML precisely and concisely. Our earlier decision not to support it was based on a false premise.
The case against considering this use case seriously appears to be: the W3C's schema language doesn't really have any need to support the features needed to define W3C's markup languages, or any other obscure and little-used languages. These arguments are most charitably described as laughable; perhaps there are others.
The second argument offered is that there are good reasons for validation to be context-free. If those reasons exist, deciding on this issue will involve weighing them carefully against the reasons for making it possible to write an XSDL schema that expresses an important constraint in (X)HTML. Such a careful weighing of one point against another will be easier if the reasons for context-free validation are identified. Comment #3 does not identify any arguments, architectural or otherwise; it only says they exist.
Michael Kay does identify a specific reason: bottom-up construction. I think he is correct that the locality of validity may make it easier to guarantee the validity of larger constructs made from smaller ones, and I expect that it will be useful to distinguish different kinds of validity which depend on different parts of the document, just as we do now in distinguishing local validity from 'deep' validity.
It's not entirely clear to me how to weigh against each other the two situations
(A) Validity is simple to calculate but does not include some simple mechanically checkable constraints imposed by the definition of the vocabulary. So "validity" is less useful as a concept than it might be.
(B) "Validity" includes more of the constraints imposed by the definition of the vocabulary, so it's a more useful concept than in (A). But it's more complex to calculate.
But it does seem clear to me that dismissing the requirements of HTML out of hand, as if they were a corner case, is not the right way to weigh them. And assertions that there are good architectural reasons for something do not carry, in a careful Working Group, the same weight as the architectural reasons themselves.
Comment #3 asks for clarification of the issue, but I cannot provide the desired clarification because I do not understand the question. The Working Group has entertained several different formulations of the spec intended to achieve the goal of ensuring that neither assertions nor conditional type assignment depend on nodes outside the subtree rooted in the item to which the assertions or conditional type assignment are attached by declaration. In some of those formulations, specific XPath axes have been made illegal in the XPath expressions. In others, the input tree has been truncated so that any expression which attempts to refer to nodes outside the subtree will evaluate to the empty set. Other formulations are possible which involve neither syntactic restrictions on XPath expressions nor tree surgery. The details of the spec prose are not of interest here; what is at issue is the goal the WG has been trying to achieve, which the description of the issue argues should be revisited. If the goal is dropped, then I think the implications for the spec prose are obvious enough.
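As an illustration of the tree-truncation formulation only (not of any wording in the spec), the following Python sketch shows why an upward-pointing expression finds nothing once the subtree is detached. It uses the standard library's ElementTree, which has no parent pointers, so a parent map is built by hand; the helper name ancestors is hypothetical and namespaces are ignored.

```python
import copy
import xml.etree.ElementTree as ET

def ancestors(root, target):
    """Tags of target's ancestors within the tree rooted at root, nearest first."""
    # ElementTree keeps no parent links, so derive them from the tree.
    parent = {child: p for p in root.iter() for child in p}
    chain = []
    node = target
    while node in parent:
        node = parent[node]
        chain.append(node.tag)
    return chain

doc = ET.fromstring("<html><form><input/></form></html>")
inp = doc.find(".//input")

# Evaluated against the whole document, the ancestor axis sees 'form'.
full = ancestors(doc, inp)

# "Trimmed" evaluation: only the subtree rooted at the assessed element
# is available, so there is nothing above it to select.
trimmed_root = copy.deepcopy(inp)
trimmed = ancestors(trimmed_root, trimmed_root)

print(full)     # ['form', 'html']
print(trimmed)  # []
```

In this sketch an assertion such as ancestor::html:form would be satisfied against the full document and would select nothing against the trimmed copy, which is the behaviour the truncation approach is meant to produce.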
> In some of those formulations, specific XPath axes
> have been made illegal in the XPath expressions.
> In others, the input tree has been truncated so
> that any expression which attempts to refer to
> nodes outside the subtree will evaluate to the
> empty set. Other formulations are possible which
> involve neither syntactic restrictions on XPath
> expressions nor tree surgery. The details of the
> spec prose are not of interest here; what is at
> issue is the goal the WG has been trying to
> achieve, which the description of the issue argues
> should be revisited. If the goal is dropped, then
> I think the implications for the spec prose are
> obvious enough.
OK. That's what I wanted to be sure I understood. Thank you. Noah
HST calls the attention of the WG to his proposal for resolving the form/input issue, in comment 19 on issue 5003:
http://www.w3.org/Bugs/Public/show_bug.cgi?id=5003#c19
WG asserts that it's too late for that.
Propose that we close this issue.
Approved. Dissent from W3C. Dissent from NACS.