This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 7202 - [XSLT 2.0] Traditional Hebrew Numbering
Summary: [XSLT 2.0] Traditional Hebrew Numbering
Status: RESOLVED WONTFIX
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: XSLT 2.0
Version: Recommendation
Hardware: PC Windows NT
Importance: P2 normal
Target Milestone: ---
Assignee: Michael Kay
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
Reported: 2009-08-03 13:31 UTC by Michael Kay
Modified: 2012-02-16 16:08 UTC

Description Michael Kay 2009-08-03 13:31:00 UTC
See message from Efraim Feinstein on the saxon-help list:

http://markmail.org/message/ogwvahwep44p2weh

Extracts:

(A) it's unclear to me how numbers divisible by 1000 should be differentiated from the numbers 1-9. One way would be to remove the geresh mark from 1-9.

(B) numbers-as-words functionality ... would be difficult in Hebrew -- the language is gendered and there are at least two forms of correctly specifying a number in words (Biblical and Modern styles). There's also the variable of whether the result should be pointed (with vowels) or unpointed (without vowels).

(C) The XSLT 2.0 specification is unclear about the representation of numbers in the traditional Hebrew system. There are two examples. The one given as a numerical sequence in section 12.3: א, ב, ג, ד, ה, ו, ז, ח, ט, י, יא, יב, יג, יד, טו, טז, יז, יח, יט, כ has no markings. The one given for a date (section 16.5.3): כ״ו טבת תשס״ג shows both numbers with the gershayim symbol (״).
A third possibility is to mark all letters that represent numbers with an overdot. Presumably, these should be options...
Comment 1 Efraim Feinstein 2009-08-27 15:16:24 UTC
This feature request really comes down to a request for an additional standard way for a user to provide the XSLT processor with nonstandard language-specific options.

The specific issues for "traditional Hebrew" are:

Sometimes numbers are printed with additional marks to indicate that they are numbers, and sometimes they aren't. The specification uses both conventions: once in the example for dates, once in the example for numbering. The types of additional marks also vary. In modern texts, numbers are sometimes marked with a geresh following the number (1=א׳) and sometimes with a gershayim (21=כ״א); in archaic texts, overdots are sometimes used to indicate that the value is numeric and not a word (21=כׄאׄ).
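
The marking conventions described above can be sketched in Python as follows. This is only an illustration, not anything defined by XSLT or implemented by a particular processor: the names `hebrew_letters` and `mark_numeral` and the style names `"none"`, `"modern"`, and `"overdot"` are hypothetical. The sketch is limited to 1-999, so it sidesteps the open question about thousands, and it assumes the common rule that a single letter takes a trailing geresh while a longer numeral takes a gershayim before its last letter.

```python
# Letters are written as escapes so the table is unambiguous in any editor.
UNITS = "\u05D0\u05D1\u05D2\u05D3\u05D4\u05D5\u05D6\u05D7\u05D8"  # alef..tet = 1..9
TENS = "\u05D9\u05DB\u05DC\u05DE\u05E0\u05E1\u05E2\u05E4\u05E6"   # yod..tsadi = 10..90
HUNDREDS = "\u05E7\u05E8\u05E9\u05EA"                              # qof..tav = 100..400

GERESH = "\u05F3"      # HEBREW PUNCTUATION GERESH
GERSHAYIM = "\u05F4"   # HEBREW PUNCTUATION GERSHAYIM
OVERDOT = "\u05C4"     # HEBREW MARK UPPER DOT


def hebrew_letters(n):
    """Return the unmarked letter sequence for 1 <= n <= 999."""
    if not 1 <= n <= 999:
        raise ValueError("this sketch only covers 1-999")
    hundreds, rest = divmod(n, 100)
    # 500-900 are written additively using tav (400).
    out = HUNDREDS[3] * (hundreds // 4)
    if hundreds % 4:
        out += HUNDREDS[hundreds % 4 - 1]
    if rest in (15, 16):
        # Traditionally written as 9+6 and 9+7 rather than 10+5 and 10+6.
        out += UNITS[8] + UNITS[rest - 10]
    else:
        tens, units = divmod(rest, 10)
        if tens:
            out += TENS[tens - 1]
        if units:
            out += UNITS[units - 1]
    return out


def mark_numeral(letters, style="none"):
    """Apply one of the marking conventions to a letter sequence."""
    if style == "none":
        return letters
    if style == "modern":
        # Assumed rule: geresh after a single letter, gershayim before the last letter otherwise.
        if len(letters) == 1:
            return letters + GERESH
        return letters[:-1] + GERSHAYIM + letters[-1]
    if style == "overdot":
        # Archaic convention: an upper dot on every letter.
        return "".join(letter + OVERDOT for letter in letters)
    raise ValueError("unknown marking style: " + style)
```

With these definitions, `mark_numeral(hebrew_letters(21), "modern")` gives כ״א, and `[hebrew_letters(i) for i in range(1, 21)]` reproduces the unmarked sequence quoted from section 12.3.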

When the number is represented as words, it could be masculine or feminine, in both ordinal and cardinal forms.  There's currently no way to specify masculine or feminine for cardinal forms.

Also, there are two conventions for how to specify a number in words: the modern convention (the equivalent of representing 1234 as "one thousand two hundred thirty four") and an archaic convention ("four and thirty and two hundred and one thousand").
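
To make the difference in word order concrete, here is a small Python sketch that uses English words as a stand-in for Hebrew. It is an assumption-laden illustration only: the names `number_parts` and `number_in_words` and the `order` values are hypothetical, the range is limited to 1-9999, and gender and pointing are not modelled at all.

```python
ONES = ["", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_parts(n):
    """Split 1 <= n <= 9999 into its spoken components, largest first."""
    if not 1 <= n <= 9999:
        raise ValueError("this sketch only covers 1-9999")
    parts = []
    thousands, n = divmod(n, 1000)
    if thousands:
        parts.append(ONES[thousands] + " thousand")
    hundreds, n = divmod(n, 100)
    if hundreds:
        parts.append(ONES[hundreds] + " hundred")
    if n >= 20:
        tens, n = divmod(n, 10)
        parts.append(TENS[tens])
    if n:
        parts.append(ONES[n])
    return parts


def number_in_words(n, order="modern"):
    """Join the components in modern (largest first) or archaic (smallest first) order."""
    parts = number_parts(n)
    if order == "modern":
        return " ".join(parts)
    if order == "archaic":
        return " and ".join(reversed(parts))
    raise ValueError("unknown order: " + order)
```

Here `number_in_words(1234)` returns "one thousand two hundred thirty four" and `number_in_words(1234, "archaic")` returns "four and thirty and two hundred and one thousand". The same components are emitted in both cases; only the ordering and the joining word change, which is why a per-language option rather than a new numbering scheme would suffice.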

It's probably beyond the scope of XSLT to define all possible options for all possible languages. In the interests of interoperability between processors, however, the set of options for each language could ideally be expanded through community consensus, perhaps as a list of non-normative best practices for each language.

A model for how to do it might be in the @ordinal attribute of xsl:number (XSLT 2.0, sec 12.3):
"For inflected languages that vary the ending of the word, the preferred approach is to indicate the required ending, preceded by a hyphen: for example in German, appropriate values are -e, -er, -es, -en."

Similarly, a new @options attribute to indicate language-specific options could be added to the spec.
Comment 2 Anders Berglund 2010-12-01 22:26:35 UTC
The XSL Working Group approved this response at its November 18, 2010 teleconference.

Your comment highlights a number of the difficulties in supporting
the wide variety of conventions used over time in many languages.

Some of these cannot even be represented in a "linear sequence" of
Unicode characters - such as a "titlo" that is "stretched" to cover
all the characters in an Old Slavic number.

For most of the variations that you list, the intention was that they
be covered by the following note (which appears just before the
description of "grouping-separator"):

  Note:
  Implementations may use extension attributes on xsl:number
  to provide additional control over the way in which numbers
  are formatted.

Do you think this should be augmented and clarified to get the point
across? One could - if you would volunteer to organize a "community"
design effort - add a specific example designed to fully cover a
particular language. I feel that a SINGLE attribute will be too
limiting in the general case so a "standard" @options is probably
not the way to go. The namespace of the extension attributes should
probably clearly identify the language and "authority" of the
definition.
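
As a rough illustration of the namespaced extension attributes suggested here, the Python sketch below parses an `xsl:number` element and collects the options that belong to one language-specific namespace. Everything specific in it is an assumption made for the example: the namespace URI `http://example.org/number-options/he`, the `he:marking` and `he:word-order` attributes, and the helper `extension_options` are hypothetical and are not defined by XSLT or by any processor.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace; a real one would identify the language and the
# authority responsible for the definition.
HEBREW_OPTIONS_NS = "http://example.org/number-options/he"

SNIPPET = """
<xsl:number xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
            xmlns:he="http://example.org/number-options/he"
            value="21" format="&#x5D0;" letter-value="traditional"
            he:marking="gershayim" he:word-order="archaic"/>
"""


def extension_options(element, namespace):
    """Return {local-name: value} for attributes in the given namespace."""
    prefix = "{" + namespace + "}"
    return {
        name[len(prefix):]: value
        for name, value in element.attrib.items()
        if name.startswith(prefix)
    }


number_element = ET.fromstring(SNIPPET.strip())
options = extension_options(number_element, HEBREW_OPTIONS_NS)
```

Because each option is a separate attribute, a processor that does not recognise the namespace can ignore all of them, and a language community can add further options without any change to the standard attributes.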

Could you provide examples of where the cardinal form varies based on
e.g. gender? Preferably in more than one language.
Comment 3 Sharon Adler 2012-02-16 16:08:43 UTC
Since we have not heard back from the commenter for over a year, we are closing this bug. If you feel this is in error, please supply the requested examples and reopen. Thank you!