This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 26549 - [f+o 3.1] Non-ascii character in spec rendered incorrectly
Summary: [f+o 3.1] Non-ascii character in spec rendered incorrectly
Status: CLOSED FIXED
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: Functions and Operators 3.1
Version: Working drafts
Hardware: PC All
Importance: P2 minor
Target Milestone: ---
Assignee: Michael Kay
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-08-10 08:33 UTC by Michael Kay
Modified: 2014-11-17 09:48 UTC

See Also:


Attachments

Description Michael Kay 2014-08-10 08:33:51 UTC
In section 5.5 of the 3.1 specification at http://www.w3.org/TR/xpath-functions-31, the character "a umlaut" in "Jäger" is incorrectly rendered as "square root, section sign" in my browser. It is correctly rendered in the 3.0 version of the spec.
Comment 1 Michael Kay 2014-08-10 09:27:23 UTC
Another corruption occurs in 9.8.4.1 (format-date) where the width modifier syntax appears as

   ","  min-width ("-" max-width)?

and another in 9.8.5 where the German for 31st is given as

einunddreißigste

However, pi and theta appear correctly in the specs of math functions, and the Islamic and Thai dates in 9.8.5 also appear correct, as do the Arabic-Indic digits and the Italian ordinal indicators in 4.6.1.
Comment 2 Michael Kay 2014-08-10 19:48:39 UTC
Looking at the CVS log, it seems the corruptions go back to the first version of the 3.1 xpath-functions.xml, which was apparently copied incorrectly from the 3.0 version of the document. (But it's hard to be sure, because my CVS client has a compare utility that is itself not showing these characters correctly.) Moreover, the corruptions appear to be in some sense cumulative, in that different CVS commits show different variations of the character.

Probably the best solution is to replace all non-ASCII characters in the source by character references or entity references, to reduce the risk of further corruption if someone uses a non-UTF-8 editor to edit the file.
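
A minimal sketch of the character-reference approach suggested above, assuming Python and numeric references. The function name to_ascii_with_char_refs is hypothetical and not part of any spec tooling; the fix eventually applied to the spec used entities defined in the DTD rather than numeric references.

```python
def to_ascii_with_char_refs(xml_text: str) -> str:
    """Replace every non-ASCII character with a numeric character reference.

    Hypothetical helper for illustration only. It relies on Python's built-in
    "xmlcharrefreplace" error handler, which emits decimal references.
    """
    return xml_text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    # Sample strings taken from the sections mentioned in this bug.
    for sample in ["Jäger", "einunddreißigste", "A=a=Â=â"]:
        print(to_ascii_with_char_refs(sample))
```

Once the source is pure ASCII, an editor that assumes a different 8-bit encoding can no longer change the meaning of these characters when the file is saved.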
Comment 3 Michael Kay 2014-08-10 22:27:54 UTC
The non-ASCII characters in the xpath-functions.xml file are in the following sections:

namespace-prefixes - mdash, OK.
defining-decimal-format - per mille sign - seems OK
uca-collations - wrong (strength) A=a=√Ç=√¢, should be A=a=Â=â
substring.functions - Jaeger should contain "a umlaut" (several times)
date-time-duration-conformance - mdash, codepoint 8212 (twice), in deleted text.
date-picture-string -    ","  min-width ("-" max-width)? should use NBSP 
formatting-timezones - uses NBSP twice, written as entity ref, OK.
date-time-examples - German example einunddreißigste Dezember is wrong, but the Hebrew, Arabic and Thai examples look OK.
casting-to-float - wrong, √ó should be <=
casting-to-double - wrong, √ó should be <=
ISO10967 - mdash, OK.
ISO15924 - mdash, OK.
ISO15924_register - mdash, OK.
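
The wrong sequences listed above are consistent with UTF-8 bytes being decoded as Mac OS Roman. This is only an assumption: the thread never identifies the encoding or the tool responsible. The sketch below, using a hypothetical helper named misread_utf8, reproduces the reported glyphs under that assumption.

```python
def misread_utf8(text: str, wrong_encoding: str = "mac_roman") -> str:
    """Encode text as UTF-8, then decode the bytes with a different encoding.

    Hypothetical helper. The default of Mac OS Roman is an assumption chosen
    because it matches the glyphs reported in this bug, not a confirmed cause.
    """
    return text.encode("utf-8").decode(wrong_encoding)


if __name__ == "__main__":
    # Characters named in the list of affected sections.
    for original in ["ä", "Â", "â", "×"]:
        print(repr(original), "->", repr(misread_utf8(original)))
```

Applying the same misreading to text that is already corrupted produces a longer, different sequence each time, which would fit the observation that the corruptions look cumulative across commits.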
Comment 4 Michael Kay 2014-09-16 19:46:18 UTC
Correction: the symbol in casting-to-float and casting-to-double should be a multiplication sign.
Comment 5 Michael Kay 2014-09-16 19:48:37 UTC
These problems have been fixed by use of entities defined in the DTD.