This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
This comment appears in http://lists.w3.org/Archives/Public/www-xml-schema-comments/2009JanMar/. This bugzilla entry is so that the WG can track whether or not it has been addressed. It may be a duplicate. Apologies if it is. http://lists.w3.org/Archives/Public/www-xml-schema-comments/2009JanMar/0189.html Michael Kay responded: http://lists.w3.org/Archives/Public/www-xml-schema-comments/2009JanMar/0190.html
The XML Schema WG discussed this issue on today's telcon. The originator of the comment has been informed of the result in email archived at http://lists.w3.org/Archives/Public/www-xml-schema-comments/2009AprJun/0111.html which reads: Thank you for the comment. The XML Schema WG discussed this issue on its telcon this morning, and concurred with Michael Kay's analysis of the situation, which is that this is not actually an error in the spec but just one instance of a general fact about the descriptions of our lexical spaces: satisfying the descriptions is typically a necessary but not a sufficient condition of type validity; the examples of byte, short, int, etc. are perhaps the most obvious cases of this. Please let us know if you agree with this resolution of your issue, by adding a comment to the issue record at http://www.w3.org/Bugs/Public/show_bug.cgi?id=6707 and changing the Status of the issue to Closed. Or, if you do not agree with this resolution, please add a comment explaining why. If you wish to appeal the WG's decision to the Director, then also change the Status of the record to Reopened. If you wish to record your dissent, but do not wish to appeal the decision to the Director, then change the Status of the record to Closed. (It's slightly more convenient for the WG if you respond in Bugzilla, but if necessary you can reply to this email, instead.) If we do not hear from you in the next ten days or so, we will assume you agree with the WG decision.
I agree that the lexical space definition is not sufficient for validation. As an example, the earlier email from Michael Kay mentioned the inability of the lexical space definition to enforce range constraints on integers. But it is a well-known limitation that reasonable regular expressions cannot encode range constraints, which is why I did not ask for this. Yet, because of the close association of "lexical" and "regular expression" commonly used in computing, it is reasonable to expect that some editing effort would be expended to correct the descriptions of the lexical spaces of data types when there are simple and obvious regular expressions that do a better job of characterizing the lexical space. I agree it is an editorial fix, i.e. that the correctness of a schema validator is not ultimately affected here, but it's not clear why this feedback would receive less attention than a grammatical error, a missing word or a misspelled word. It seems to be more of a technical error than any of those other editorial examples, which would surely get addressed. I am not really sure how to change the status of this issue so that I may find out the response of the working group to this additional comment.
> it is reasonable to expect that some editing effort would be expended to correct the descriptions of the lexical spaces of data types when there are simple and obvious regular expressions that do a better job of characterizing the lexical space.
(1) The descriptions are not incorrect; they do not need to be corrected. Restricting the base type both by using minInclusive/maxInclusive and by using a pattern would be unnecessary redundancy. We don't need to say the same thing in two different ways, and doing so is always dangerous because of the risk of inconsistency.
(2) Providing a precise pattern for some subtypes of integer and not for others would be inconsistent, and would make readers ask the reason for the inconsistency. It is always possible to define a numeric range by means of a regular expression, but in general the regular expressions that result are extremely unwieldy (which is why XSD provides minInclusive and maxInclusive as a more convenient way of doing it). There is no logic to treating this one as a special case just because the regular expression in this one case is moderately readable.
Michael Kay
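To make the trade-off in the comment above concrete, here is a minimal Python sketch. The regular expressions are hand-written for illustration and are not taken from the spec or from the original email; the function names are hypothetical. The sketch assumes the pattern inherited from integer is `[\-+]?[0-9]+` and that the ranges are 1 and above for positiveInteger and -128 to 127 for byte.

```python
import re

# Assumed pattern facet inherited from integer: optional sign, then digits.
INTEGER = re.compile(r"[\-+]?[0-9]+")

# Illustrative pattern for positiveInteger: optional '+', digits,
# at least one of which is not '0'. Short and moderately readable.
POSITIVE_INTEGER = re.compile(r"\+?[0-9]*[1-9][0-9]*")

# Illustrative pattern for byte (-128..127), allowing leading zeros.
# Already much harder to read, and byte is the smallest of these ranges.
BYTE = re.compile(
    r"-0*(12[0-8]|1[01][0-9]|[0-9]{1,2})"
    r"|\+?0*(12[0-7]|1[01][0-9]|[0-9]{1,2})"
)


def positive_by_regex(s: str) -> bool:
    """Check positiveInteger using only a regular expression."""
    return POSITIVE_INTEGER.fullmatch(s) is not None


def positive_by_bounds(s: str) -> bool:
    """Check positiveInteger as integer pattern plus a lower bound of 1."""
    return INTEGER.fullmatch(s) is not None and int(s) >= 1


def byte_by_regex(s: str) -> bool:
    """Check byte using only a regular expression."""
    return BYTE.fullmatch(s) is not None


def byte_by_bounds(s: str) -> bool:
    """Check byte as integer pattern plus bounds of -128 and 127."""
    return INTEGER.fullmatch(s) is not None and -128 <= int(s) <= 127
```

Both approaches accept the same strings in each case; the difference is only in how readable the definition is, which is the point under discussion.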
To the question of how to make sure you are informed of WG actions on this issue: you're now in the CC list, so you should be getting Bugzilla mail whenever the issue is updated. (I'll mention now that you can adjust your user options to exclude some or all of Bugzilla's notices.) On the issue of positiveInteger and its lexical space, it occurs to me that there are three things (at least!) that may be attracting your disapprobation and which the WG might fix (or not). We may have been overhasty in assuming we understood your comment in the first place. Since the bug report was raised on your behalf by David Ezell, you may well be right that you don't have the permissions needed to change its status. I believe that your message in comment 2 amounts to providing pushback and that if you had had the buttons in front of you you would have reopened the issue, so I am reopening it on your behalf. (If I've misunderstood let me know.) (1) If you mean that the prose description of the lexical space in 3.4.25.1 is unnecessarily loose and sloppy, I think I agree. The current text of the section is: positiveInteger has a lexical representation consisting of an optional positive sign ('+') followed by a non-empty finite-length sequence of decimal digits (#x30-#x39). For example: 1, 12678967543233, +100000. I don't think it's terribly misleading, but I also don't think it would hurt very much to make it a little tighter; we could insert "at least one of which should be a digit other than '0'" at the end of the first sentence. Would that help? If (as feared in comment 3) readers ask why the description of the lexical space in descriptions of short and byte and so on are not similarly precise and exact, I think the only answer we can give is "there is a point at which additional precision in the prose costs more in syntactic obscurity than it provides in semantic clarity. It's a judgement call where the line lies; our judgement is reflected in our text; ymmv". 
(2) If it is the value of the 'pattern' facet given in 3.4.25.3 that is the source of dissatisfaction, it's a bit harder. That value is inherited from 'integer' and for the reasons given by Michael Kay I wouldn't like (and I don't expect the WG to wish) to change it. But since you didn't actually mention the possible leading minus sign mentioned in the pattern facet, I'm guessing it's not the pattern facet that is at issue here. (3) If it is the definition of positiveInteger in appendix C.2 that attracted your attention, then (a) I'm reluctant to change it, again for the same reasons, but (b) congratulations, you may be the first documented reader of that mass of gray undigestible text outside (or possibly inside) the WG in the last several years. If all you mean is (1), then I apologize on behalf of the WG for having been a bit dense and slow on the uptake, and I endorse the editorial change suggested above.
Proposal: In 3.4 (Other Built-in Datatypes), after the paragraph beginning "This section gives", add a new paragraph: "Each of the atomic datatypes described below has a ·lexical mapping· which is a subset of the ·lexical mapping· of its ·base type·, determined by the defining facets. The descriptions below of the resulting value space and ·lexical space· are only approximations, and, if taken alone, may well not exactly describe the real value space and/or lexical space of the derived datatype, which are in fact the domain and range of the ·lexical mapping· so restricted." In 3.4.25.1, at the end of the first sentence add ", one of which must be non-zero".
The WG decided to make this change: In 3.4.25.1, at the end of the first sentence add ", at least one of which must be a digit other than '0'" We believe this change resolves this issue.
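As a rough illustration of what the added clause changes, the following Python sketch transcribes the prose of 3.4.25.1 before and after the amendment into two predicates. The function names are hypothetical, and the predicates model only the prose description, not a schema validator.

```python
ASCII_DIGITS = "0123456789"  # decimal digits #x30-#x39


def matches_original_prose(s: str) -> bool:
    """Optional '+' followed by a non-empty sequence of decimal digits."""
    body = s[1:] if s.startswith("+") else s
    return len(body) > 0 and all(c in ASCII_DIGITS for c in body)


def matches_amended_prose(s: str) -> bool:
    """As above, and at least one of the digits is a digit other than '0'."""
    body = s[1:] if s.startswith("+") else s
    return matches_original_prose(s) and any(c != "0" for c in body)
```

Under this reading, the examples given in the spec text (1, 12678967543233, +100000) satisfy both versions, while strings such as 0 or +000 satisfy only the original wording.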
The changes agreed upon at yesterday's call have been made in the status-quo version of the spec at http://www.w3.org/XML/Group/2004/06/xmlschema-2/datatypes.html (member-only link) I have unilaterally applied the same wording change to the corresponding sentence in the description of negativeInteger (which must similarly be non-zero). Accordingly, I'm marking this issue resolved. It would be helpful if John Boyer, as the originator of the issue, could close it to indicate your assent, or reopen it to indicate dissent. If we don't hear otherwise from you in the next two weeks, we'll assume you are happy.
(In reply to comment #7)
> The changes agreed upon at yesterday's call have been made in the status-quo
> version of the spec at
> http://www.w3.org/XML/Group/2004/06/xmlschema-2/datatypes.html
> (member-only link)
> I have unilaterally applied the same wording change to the corresponding
> sentence in the description of negativeInteger (which must similarly be
> non-zero).
> Accordingly, I'm marking this issue resolved. It would be helpful if John
> Boyer, as the originator of the issue, could close it to indicate your assent,
> or reopen it to indicate dissent. If we don't hear otherwise from you in the
> next two weeks, we'll assume you are happy.
The wording itself looks good to me. I've noticed, though, that somehow this issue was tagged as a Schema 1.1 only issue, whereas my original post [1] stated an erratum for Schema 1.0, which would of course also need to be applied to 1.1. This means that fixing only 1.1 does not fix [1].
[1] http://lists.w3.org/Archives/Public/www-xml-schema-comments/2009JanMar/0189.html
It's not clear how to change the status of the issue; I'll try a few more things after committing this comment to see if the way to change the status presents itself.
The WG reported this bug as FIXED on 2009-05-11. We are closing this bug as requiring no further work. If there are issues remaining, you can reopen this bug and enter a comment to indicate the problem. Thanks very much for the feedback.