This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
Hello, this is a comment on behalf of the i18n core Working Group. We suggest that you use a non-static reference to the Unicode database, and that you turn the Note starting with "[Unicode Database] is subject to future revision" into normative text. Locations: References http://www.w3.org/TR/2008/WD-xmlschema11-2-20080620/#UnicodeDB and G1 http://www.w3.org/TR/2008/WD-xmlschema11-2-20080620/#charcter-classes Thank you, Felix
I would be more inclined to agree with this if the Unicode Database offered better guarantees of backwards compatibility. But recent releases have, for example, renamed one of the character groups from "Greek" to "Greek and Coptic", and an implementation that followed that blindly would cause any schema using the regular expression \P{IsGreek} to become invalid overnight. Similarly there are characters that have changed category, which also changes the semantics of regular expressions, causing a message that is validated by the sender to be rejected by the recipient even though both use the same schema. I think we need to offer XML Schema users better stability than this.
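To make the version dependence concrete, here is a minimal sketch in Python. It assumes the standard `unicodedata` module, which exposes both the database version bundled with the interpreter and a frozen 3.2.0 snapshot (`ucd_3_2_0`). The helper names `category_by_version` and `matches_class` are illustrative only, and the sample code point is one that was unassigned in 3.2 rather than one of the recategorised characters mentioned above. The effect on a category-based match is the same in kind: the answer depends on which database version the processor consults.

```python
import unicodedata


def category_by_version(ch):
    """Return the general category of ch under UCD 3.2.0 and under
    the UCD version bundled with this Python build."""
    return {
        "3.2.0": unicodedata.ucd_3_2_0.category(ch),
        unicodedata.unidata_version: unicodedata.category(ch),
    }


def matches_class(ch, category, ucd=unicodedata):
    """Rough stand-in for a category escape: true if ch has exactly
    that general category in the given database object."""
    return ucd.category(ch) == category


sample = "\U0001F600"  # a code point assigned after Unicode 3.2

# Same character, same test, two database versions.
print(category_by_version(sample))
print(matches_class(sample, "Cn", unicodedata.ucd_3_2_0))  # unassigned in 3.2.0
print(matches_class(sample, "Cn"))                         # assigned in later versions
```

A sender and a recipient that differ only in the database version behind `ucd` would disagree about this character, which is the interoperability risk described in the comment above.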
Hello Michael, we discussed your comment http://www.w3.org/Bugs/Public/show_bug.cgi?id=5948#c1 at http://www.w3.org/2008/09/17-core-minutes#item05 We are now contacting the Unicode Technical Committee asking for stability of block names. See the thread at http://lists.w3.org/Archives/Member/member-i18n-core/2008Sep/0013.html The short summary is that there is a trade off between having a stable reference to the Unicode database and allowing for more characters in "yet to come" versions of Unicode. We think a good compromise would be to make it implementation-defined which version of the database is used. This proposal is also based on Bug http://www.w3.org/Bugs/Public/show_bug.cgi?id=5818 Mark Davis pointed out at http://lists.w3.org/Archives/Member/member-i18n-core/2008Sep/0025.html that the property Alias file http://unicode.org/Public/UNIDATA/PropertyValueAliases.txt provides information about previous block names, so you might want to take that into account as well. Regards, Felix.
WG agreed at the Face to Face:

<MSM> We think the text is mostly ok, but some tweaks are desirable:
<MSM> - replace references to 'current version' with references to 'the version cited in the References' (or to the actual number of that version -- we won't change it)
<MSM> - change references to 'future' versions to 'later' versions
<MSM> - add prose to the reference re-stating the conformance rules (ceiling/floor)
AND: <MSM> - Make those two Notes normative text!
(In reply to comment #3)
> WG agreed at the Face to Face:
>
> <MSM> We think the text is mostly ok, but some tweaks are desirable:
> <MSM> - replace references to 'current version' with references to 'the version cited in the References' (or to the actual number of that version -- we won't change it)
> <MSM> - change references to 'future' versions to 'later' versions
> <MSM> - add prose to the reference re-stating the conformance rules (ceiling/floor)

Hello David, all, it is a bit hard to see what the actual change will be. Could you give a link to the updated draft later? Also, FYI and maybe of importance for this issue: the i18n core WG has talked to the Unicode Consortium about various stability policies, including character class names, which resulted in an update of these policies. See http://lists.w3.org/Archives/Member/member-i18n-core/2008Oct/0038.html

Felix
I don't see anything in that Unicode stability policy about names of blocks such as "Greek", which is my biggest concern here. Did I miss something? Also, a policy for the future doesn't solve the problem for the past - I do think we need to say something explicit to make sure that a schema using <pattern value="\p{IsGreek}*"/> continues to work.
(In reply to comment #6)
> I don't see anything in that Unicode stability policy about names of blocks such as "Greek", which is my biggest concern here. Did I miss something? Also, a policy for the future doesn't solve the problem for the past - I do think we need to say something explicit to make sure that a schema using <pattern value="\p{IsGreek}*"/> continues to work.

From a mail exchange with Mark Davis on the topic:

[Soon] "there will be a publicly available stability provision for all of the property aliases and property value aliases on
* http://unicode.org/Public/UNIDATA/PropertyValueAliases.txt
* http://unicode.org/Public/UNIDATA/PropertyAliases.txt
with the exception of Contributory properties listed on <http://www.unicode.org/Public/UNIDATA/UCD.html#Properties>.

This is not completely final yet, since the exact wording has to be formulated by the editorial committee, and it actually requires approval by the officers, but I don't anticipate any problems.

So that will include block names. Note that the set of characters having a given property or property value may change (subject to the stability policies). What the above means is that the identifiers will always remain valid, so \p{script=Greek} or equivalent syntax will remain valid. That should address your concerns.

Mark"
A wording proposal intended to resolve bug 5948 and bug 5950 is at http://www.w3.org/XML/Group/2004/06/xmlschema-2/datatypes.b5948.html This should make it easier to see what changes, exactly, are proposed.
In the wording proposal, the Note after the table of block names refers to PropertyAliases.txt and PropertyValueAliases.txt. The block names are actually defined in Blocks.txt.
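As an illustration of how a processor could keep an older block name such as "Greek" working after the rename, here is a minimal Python sketch. It assumes the `start..end; name` line format of Blocks.txt and embeds a two-row excerpt instead of reading the real file. The alias table `LEGACY_BLOCK_NAMES` and the helpers `normalize`, `parse_blocks`, and `block_range` are hypothetical names introduced for this example, not part of any specification or library.

```python
import re

# Two rows in the Blocks.txt line format ("start..end; name").
# Only an excerpt; a real processor would read the whole file.
BLOCKS_SAMPLE = """\
# excerpt for illustration
0370..03FF; Greek and Coptic
0400..04FF; Cyrillic
"""

# Hypothetical alias table: older block name -> current block name.
LEGACY_BLOCK_NAMES = {"Greek": "Greek and Coptic"}


def normalize(name):
    """Loose matching: ignore case, whitespace, hyphens and underscores."""
    return re.sub(r"[\s_-]+", "", name).lower()


def parse_blocks(text):
    """Map normalized block names to (first, last) code points."""
    blocks = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        span, name = (part.strip() for part in line.split(";", 1))
        first, last = (int(h, 16) for h in span.split(".."))
        blocks[normalize(name)] = (first, last)
    return blocks


def block_range(name, blocks, aliases=LEGACY_BLOCK_NAMES):
    """Resolve a block name, falling back to the legacy alias table."""
    key = normalize(name)
    if key not in blocks:
        for old, new in aliases.items():
            if normalize(old) == key:
                key = normalize(new)
                break
    return blocks.get(key)  # None if the name is unknown


blocks = parse_blocks(BLOCKS_SAMPLE)
print(block_range("Greek", blocks))           # resolved through the alias table
print(block_range("GreekandCoptic", blocks))  # resolved directly
```

With a fallback of this kind, a pattern written against the older name resolves to the same code point range as one written against the newer name, so existing schemas stay valid.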
The wording proposal mentioned in comment #8 was approved by the XML Schema WG at its telcon of 19 December 2008, with minor amendments. We discussed the point raised in comment #9; some WG members took comment #7 to mean that the two files in question are in fact relevant to (changes in) block assignment, or will be. In the end, recalling that those two files are already mentioned in the first note in the change proposal, not very far above this location in the text, the WG decided just to delete that sentence. The changes have been integrated into the status quo document at the usual location. The WG decided not to close the issue, however. Since XSD 1.0 refers to version 3.1 of the Unicode database, the current draft of XSD 1.1 to 4.1, and the Unicode Consortium's web site now carries version 5.1, we discussed briefly which version to require of XSD 1.1 processors: 3.1 (for compatibility with XSD 1.0)?, 4.1 (for compatibility with earlier drafts of 1.1)?, or 5.1 (to be current)? We decided to require 5.1, and instructed the editors to check the block and property information and update the reference. I'm marking this needsDrafting, accordingly.
On 16 January 2009, the XML Schema WG adopted the proposal at http://www.w3.org/XML/Group/2004/06/xmlschema-2/datatypes.b5948b.html with amendments proposed by Michael Kay in http://lists.w3.org/Archives/Member/w3c-xml-schema-ig/2009Jan/0011.html (both of these are member-only links). Felix, if you could convey this decision to the i18n WG and let us know whether it resolves the issue to your and their satisfaction, we'd be grateful. Close the issue if you're satisfied, reopen it if not. If we don't hear from you in the next two weeks, we will assume you and the i18n core WG are satisfied with the resolution of the issue.
*** Bug 10008 has been marked as a duplicate of this bug. ***
The WG reported this bug as FIXED on 2010-06-24. We are closing this bug as requiring no further work. If there are issues remaining, you can reopen this bug and enter a comment to indicate the problem. Thanks very much for the feedback.