I18n WG comments on XQuery 1.0 and XPath 2.0 Functions and Operators

Version reviewed

http://www.w3.org/TR/2005/WD-xpath-functions-20050404/

Main reviewer

Felix Sasaki fsasaki@w3.org

Notes

These are so far personal comments, NOT on behalf of the I18N WG.

Comments

ID Location Comment Additional information Accepted
1 Sec. 10.1 on Date/time datatype values You write "For comparing or subtracting xs:dateTime values, this local value is translated or normalized to UTC (timezone Z)." This should read "Before comparing... this local value should (must?) be normalized to UTC." Please also change "This value is referred to as the local or localized value" to "This value is referred to as the local value", because "localized" has a specific meaning in localization. Based on Martin's review, comment [12]. See Bugzilla -
2 Sec. 1.1, and in general Please refer to IRI instead of URI, e.g. "Namespace URI" -> "namespace IRI", "The URIs of the namespaces" -> "The IRIs of the namespaces". We don't ask you to change your data type, because we know that you are relying on xs:anyURI, but please add a reference to IRI in a note. Based on Martin's review, comment [17]. See Bugzilla -
3 Sec. 3 on the error function The indication of the language of the description may be useful. Having additional information on the language may be important for rendering. We also need a way to provide messages in different languages. Based on Martin's review, comment [21]. See Bugzilla -
4 Sec. 6.4.4 on rounding In addition to the current rounding, please add a facility for user-defined rounding. See http://www.xencraft.com/resources/multi-currency.html#rounding for the need for this facility. Based on Martin's review, comment [30]. See Bugzilla -
5 Sec. 7.2 Please make sure that there are tests for these functions that include surrogate pairs/codepoints above #xFFFF. Based on Martin's review, comment [34]. See Bugzilla -
6 Sec. 7.3 on collations and the rules for ordering of collations The current rules, which allow only one collation to be specified, raise an error if the collation is not supported, and use anyURIs to identify collations without any mechanism for giving anyURIs to well-known collations, are bound to lead to interoperability problems. Collations should not be the major source of interoperability problems. With the current design, even vendors who want to be interoperable have no chance of doing so. It will often be the case that e.g. a user wants just 'a French collation'. How can this be indicated? The static context defines a single collation. But it may often be desirable to have two 'default collations', in two different senses:
  • different collations for matching (less precision) and sorting (best precision possible)
  • different collations for internal operations and user-oriented operations
Based on Martin's review, comments [42] and [43]. See Bugzilla -
7 General: References to the character model specifications You make several references to the character model specifications. Please be aware that there are several specifications which have been created out of one, e.g. Fundamentals, Normalization and Resource Identifiers. Be careful to point to the right one, e.g. when you are talking about normalization. See Bugzilla -
8 Sec. 7.4.1. on fn:concat and other operations like string-join Is there a concat operation that includes normalization splicing at the contact point? This would be very helpful, and ideally should be the default, because this may be the most efficient way to maintain a certain normalization. The same applies to string-join and potentially other operations. Based on Martin's review, comment [57]. See Bugzilla -
9 Functions for case mappings, e.g. for upper case You might think about other mappings, e.g. katakana<->hiragana. Based on Martin's review, comment [57] (accidentally the number is the same as above, but it is a different comment). -
10 Sec. 7.4.6 on fn:normalize-unicode There should be a reference to Unicode Standard Annex #15 for the various normalization forms. Based on Martin's review, comment [60]. See Bugzilla -
11 Sec. 7.4.10 on fn:escape-uri "The semantics of this function are aligned with the URI escaping semantics in [XML Linking Language (XLink) Version 1.0]." Please refer to the IRI specification instead. Based on Martin's review, comment [67]. See Bugzilla -
12 Sec. 7.6.1.1. on Flags for regular expressions "Regular expression matching is defined on the basis of Unicode code-points; it takes no account of collations." It seems somewhat inadequate that fn:contains and other functions use collations, but regular expressions do not. However, we are not sure which is the right solution. Section 7.6.1.1 on flags defines an i flag that makes use of case mapping tables, so this is an example where you already use additional information, i.e. information beyond code points. Based on Martin's review, comment [73]. See Bugzilla -
13 Sec. 7.6.3 The text says: An error is raised ("Invalid replacement string") if the value of $replacement contains a "$" character that is not immediately followed by a digit 1-9 and not immediately preceded by a "/". In this sentence, "/" must be "\" (backslash). Based on Martin's review, comment [74]. See Bugzilla -
14 Sec. 7.6.3 We have been told that there is a provision to replace character strings with markup, but this is not discussed here. We want to make sure this is available, because it helps to move people away from using the private use area of Unicode. Based on Martin's review, comment [75]. See Bugzilla -
15 Sec. 7.1, note at the end Your description of "code point" and "characters" is a little unclear. You might rewrite it, taking into account section 4.1 of the character model specification. See Bugzilla -
16 Sec. 7.6.1 In sec. 7.6.1 (on Regular Expression Syntax) you should refer to Unicode Technical Standard #18, "Unicode Regular Expressions", which gives guidelines for adapting regular expression engines to Unicode. See Bugzilla -
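
Illustration for comment 1: the following is a minimal sketch of what "normalize to UTC before comparing or subtracting" means in practice. It uses Python's datetime module as a stand-in for xs:dateTime; the helper name to_utc and the sample values are assumptions made for this example and do not come from the reviewed specification.

```python
from datetime import datetime, timedelta, timezone


def to_utc(value: datetime) -> datetime:
    """Normalize a timezone-aware value to UTC (timezone Z)."""
    if value.tzinfo is None:
        # A value without a timezone cannot be normalized without
        # assuming an implicit timezone, so this sketch rejects it.
        raise ValueError("expected a timezone-aware value")
    return value.astimezone(timezone.utc)


# Two different local values that denote the same instant.
tokyo = datetime(2005, 4, 4, 21, 0, tzinfo=timezone(timedelta(hours=9)))
boston = datetime(2005, 4, 4, 8, 0, tzinfo=timezone(timedelta(hours=-4)))

# Comparison and subtraction are carried out on the normalized values.
same_instant = to_utc(tokyo) == to_utc(boston)
difference = to_utc(tokyo) - to_utc(boston)
```

The local values differ in every field except the date, yet both normalize to 12:00 UTC, which is why the normalization step has to come before the comparison.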
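
Illustration for comment 5: a sketch of the kind of test data we have in mind, written in Python rather than XQuery. The two helper functions only imitate the behaviour expected of code-point-based functions; their names and the sample string are assumptions for this example. The point is that a character above #xFFFF counts as one code point even though it occupies two UTF-16 code units (a surrogate pair).

```python
from typing import List


def string_to_codepoints(text: str) -> List[int]:
    """Return the Unicode code points of text, one integer per character."""
    return [ord(ch) for ch in text]


def codepoints_to_string(codepoints: List[int]) -> str:
    """Build a string from a sequence of Unicode code points."""
    return "".join(chr(cp) for cp in codepoints)


def utf16_code_units(text: str) -> int:
    """Count the 16-bit code units needed to encode text in UTF-16."""
    return len(text.encode("utf-16-le")) // 2


# "a", then U+10000 (outside the Basic Multilingual Plane), then "b".
sample = "a\U00010000b"

codepoints = string_to_codepoints(sample)
round_trip = codepoints_to_string(codepoints)
```

A test suite that only uses characters up to #xFFFF cannot tell an implementation that counts code points from one that counts UTF-16 code units; for the sample above the two counts are 3 and 4.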
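
Illustration for comment 8: a small Python sketch showing why plain concatenation does not preserve normalization. The function nfc_concat is a hypothetical name, and for simplicity it renormalizes the whole result; an efficient implementation would only need to look at the characters around the contact point.

```python
import unicodedata


def nfc_concat(*parts: str) -> str:
    """Concatenate parts and return the result in Normalization Form C.

    Simplification: the whole result is renormalized, not just the
    region around each contact point.
    """
    return unicodedata.normalize("NFC", "".join(parts))


left = "e"          # already in NFC
right = "\u0301"    # COMBINING ACUTE ACCENT, also in NFC on its own

plain = left + right             # two code points, not in NFC
spliced = nfc_concat(left, right)  # one code point, U+00E9
```

Both inputs are normalized, but the plain concatenation is not, because the combining accent composes with the preceding "e" across the boundary. The same effect can occur at each join in string-join.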
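
Illustration for comment 9: a sketch of a katakana<->hiragana mapping in Python. It relies on the fact that the katakana letters U+30A1 to U+30F6 sit exactly 0x60 above the corresponding hiragana letters U+3041 to U+3096. The function names are assumptions for this example, and characters outside these ranges (for instance the prolonged sound mark) are left unchanged.

```python
# Translation tables keyed by code point, as expected by str.translate.
KATAKANA_TO_HIRAGANA = {cp: cp - 0x60 for cp in range(0x30A1, 0x30F7)}
HIRAGANA_TO_KATAKANA = {cp: cp + 0x60 for cp in range(0x3041, 0x3097)}


def to_hiragana(text: str) -> str:
    """Map katakana letters to hiragana; other characters are kept."""
    return text.translate(KATAKANA_TO_HIRAGANA)


def to_katakana(text: str) -> str:
    """Map hiragana letters to katakana; other characters are kept."""
    return text.translate(HIRAGANA_TO_KATAKANA)


hiragana = to_hiragana("カタカナ")
katakana = to_katakana("ひらがな")
```

Like case mapping, this is a simple one-to-one mapping between two parallel sets of characters, which is why it could be offered alongside the upper-case and lower-case functions.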
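
Illustration for comment 13: a literal Python reading of the quoted rule, with "\" as the escape character. The function name is hypothetical, the digit set is taken as quoted in comment 13, and the sketch checks only the "$" condition; it is not a complete validator for replacement strings.

```python
def has_invalid_dollar(replacement: str, digits: str = "123456789") -> bool:
    """Return True if some "$" is neither followed by a digit
    nor immediately preceded by a backslash."""
    for i, ch in enumerate(replacement):
        if ch != "$":
            continue
        followed_by_digit = (
            i + 1 < len(replacement) and replacement[i + 1] in digits
        )
        preceded_by_backslash = i > 0 and replacement[i - 1] == "\\"
        if not followed_by_digit and not preceded_by_backslash:
            return True
    return False


group_reference = has_invalid_dollar("$1-$2")       # references to groups
escaped_dollar = has_invalid_dollar("cost: \\$")    # "\$" is a literal "$"
bare_dollar = has_invalid_dollar("cost: $")         # error case
slash_dollar = has_invalid_dollar("cost: /$")       # "/" does not escape
```

The last case shows why the typo matters: with "/" in the rule, a replacement string such as "cost: /$" would be accepted although the "$" is not escaped.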

Version: $Id: xq-func-op-review.html,v 1.9 2005/05/25 06:36:37 fsasaki Exp $