Re: [i18n review comment] BP3 should recommend locale-neutral representation #187

Hi Addison,

Thanks for your response; it does make sense. What I am still missing
is whether there is guidance we can point to on how to represent
"locale-neutral" data so that existing tools can most easily make it
locale-specific. You mention "pre-made standards for the basic data
types". Is there a recommended list we could reference?
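
To make the question concrete, here is a minimal sketch in Python
(standard library only) of the flow I have in mind: the data is
exchanged with a period as the decimal separator, no grouping, an
ISO 4217 currency code and an xsd:date, and the consumer formats it for
display. The names WIRE, CONVENTIONS, SYMBOLS, parse_record and
format_price are hypothetical, and the small conventions table is
hand-written and simplified purely for illustration; a real consumer
would presumably take those rules from locale data such as CLDR rather
than hard-code them.

```python
import json
from datetime import date
from decimal import Decimal

# Locale-neutral record as it might be exchanged (assumed shape):
# period as decimal separator, no grouping, ISO 4217 code, xsd:date.
WIRE = '{"price": {"value": 2000.00, "currency": "EUR"}, "date": "2016-08-04"}'

# Hypothetical, simplified display conventions, for illustration only.
CONVENTIONS = {
    "en-US": {"decimal": ".", "group": ",", "pattern": "{symbol}{number}"},
    "fr-FR": {"decimal": ",", "group": " ", "pattern": "{number} {symbol}"},
    "de-DE": {"decimal": ",", "group": ".", "pattern": "{number} {symbol}"},
}
SYMBOLS = {"EUR": "€", "USD": "$"}


def parse_record(text):
    """Parse the locale-neutral record into typed values."""
    # parse_float=Decimal keeps the amount exact instead of a binary float.
    raw = json.loads(text, parse_float=Decimal)
    return {
        "value": raw["price"]["value"],
        "currency": raw["price"]["currency"],
        "date": date.fromisoformat(raw["date"]),
    }


def format_price(value, currency, locale_tag):
    """Render a parsed amount using the conventions of one locale."""
    conv = CONVENTIONS[locale_tag]
    # Format with "," grouping and "." decimal, then swap in local symbols.
    whole, frac = f"{value:,.2f}".split(".")
    number = whole.replace(",", conv["group"]) + conv["decimal"] + frac
    symbol = SYMBOLS.get(currency, currency)
    return conv["pattern"].format(symbol=symbol, number=number)


record = parse_record(WIRE)
for tag in CONVENTIONS:
    print(tag, format_price(record["value"], record["currency"], tag))
```

The point of the sketch is only that the separators, grouping and
symbol position are applied at display time, so the exchanged record
itself never has to change per locale.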

Thanks for your help!
-Annette

On 8/4/16 12:31 PM, Phillips, Addison wrote:
> Hi Annette,
>
> Thanks for the note. This is a personal reply not on behalf of the WG.
>
> Locale-neutral formats are quite common on the Web and the Internet in general. One familiar format referenced by your document, for example, is XML Schema. While the representations of numbers, dates, and the like in XML Schema would be "more appropriate" for some languages/locales than others if given as plain text, what distinguishes them is that they are all machine-readable and intended to be read by machines for later processing. The display of values is a separate, local concern for the data's consumer. This necessarily means choosing specific separators (such as decimal separators) over other, more localized values. Save for "free text" (natural language) data, most data formats are locale-neutral, and these include things like JSON-LD, XML Schema, CSV, and so forth.
>
> Not every possible data structure or data value is, of course, covered fully. For example, in my day job (I work at Amazon), we have many different common measurement units defined internally. To transmit these in a locale-neutral manner, we need to construct our own data schemas and identifiers. There are profoundly many ways to measure shoes, dresses, auto parts, hats, drone propellers, and so forth. But it would be a nightmare to have to deal with localized presentation formats on top of that.
>
> But there are pre-made standards for the basic data types and these are what are needed to build almost any data structure necessary for global interchange of data.
>
> Does that make sense?
>
> Addison
>
> Addison Phillips
> Principal SDE, I18N Architect (Amazon)
> Chair (W3C I18N WG)
>
> Internationalization is not a feature.
> It is an architecture.
>
>
>
>
>> -----Original Message-----
>> From: Annette Greiner [mailto:amgreiner@lbl.gov]
>> Sent: Thursday, August 04, 2016 12:04 PM
>> To: ishida@w3.org; public-dwbp-comments@w3.org
>> Cc: www International <www-international@w3.org>
>> Subject: Re: [i18n review comment] BP3 should recommend locale-neutral
>> representation #187
>>
>> Hello on behalf of the DWBP WG,
>>
>> We're interested in pursuing this concept in our best practice document, but
>> we would like some clarification of the practice of locale neutrality. You
>> mention the variation across locales in decimal symbol, grouping symbol,
>> number of grouping digits, digit shapes, etc., and you give an example of a
>> locale-neutral data structure for monetary values.
>> But this structure alone does not appear to address differences in decimal
>> symbol, grouping symbol, number of grouping digits, or digit shapes. It does
>> provide a mechanism to separately specify the units, and the example uses
>> an ISO-4217 currency code, both of which we agree are good ideas. Is there a
>> broad standard (beyond just monetary) for addressing the other
>> symbol/representation issues you raised that we can address briefly in our
>> best practice? Do you consider SI units consistent with a locale-neutral
>> approach? Is there a locale-neutral standard for representing decimal
>> numbers (perhaps using a period and no grouping, as in your example)?
>>
>> -Annette
>>
>>
>> On 7/22/16 5:32 AM, ishida@w3.org wrote:
>>> [raised by aphillips]
>>>
>>> https://www.w3.org/TR/dwbp/#LocaleParametersMetadata
>>>
>>> Best practice #3 introduces itself as:
>>>
>>>> Providing locale parameters helps humans and computer applications
>>> to work accurately with things like dates, currencies and numbers that
>>> may look similar but have different meanings in different locales.
>>>
>>> But the actual best practice is to use **locale-neutral**
>>> representations that are interpreted/displayed to end-users in a
>>> locale-appropriate manner. For example, instead of storing the string
>>> "€2000.00", exchanging a data structure like the following is strongly
>>> preferred:
>>>
>>> ```
>>> "price": {
>>>     "value": 2000.00,
>>>     "currency": "EUR"
>>> }
>>> ```
>>>
>>> The date examples given are all in xsd:date format, which is an
>>> excellent example of using a locale-neutral format.
>>>
>>> Many things are dependent on locale: decimal symbol, grouping symbol,
>>> number of grouping digits, digit shapes, etc. Because there can be
>>> wide variation (sometimes open to misinterpretation), sending a
>>> locale-neutral format is preferred for data values. Note also that
>>> the position of the currency symbol is dependent on the locale. In
>>> France it would be normal to write 2000.00 € rather than €2000.00.
>>> The same holds for USD written with $, i.e. 2000.00 $.
>>>
>>>
>> --
>> Annette Greiner
>> NERSC Data and Analytics Services
>> Lawrence Berkeley National Laboratory
>>

-- 
Annette Greiner
NERSC Data and Analytics Services
Lawrence Berkeley National Laboratory

Received on Thursday, 4 August 2016 21:27:28 UTC