Re: [i18n review comment] BP3 should recommend locale-neutral representation #187

I took an action on today's call to try to address this in BP3. You can 
see the results at
http://philarcher1.github.io/dwbp/bp.html#LocaleParametersMetadata

This uses some of Addison's text directly and highlights the value of 
the xsd datatypes - but retains enough of the original BP for it to be 
an amendment rather than a whole new one - I hope.
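
As a rough illustration of why the xsd datatypes help (a sketch of my 
own, not text from the BP): the lexical forms of xsd:date and 
xsd:decimal parse the same way wherever the consumer is, and display is 
a separate step. The helper name parse_xsd_values and the sample values 
are invented for this example, and it only handles the plain forms with 
no timezone suffix.

```python
from datetime import date
from decimal import Decimal


def parse_xsd_values(date_text: str, decimal_text: str) -> tuple[date, Decimal]:
    """Parse simple xsd:date and xsd:decimal lexical forms (hypothetical helper).

    Assumes a plain YYYY-MM-DD date with no timezone suffix, and a decimal
    written with '.' as the separator and no grouping characters.
    """
    return date.fromisoformat(date_text), Decimal(decimal_text)


published, amount = parse_xsd_values("2016-08-19", "2000.00")

# Display is the consumer's concern: one stored value, two presentation orders.
day_first = published.strftime("%d/%m/%Y")
month_first = published.strftime("%m/%d/%Y")
```

The stored value never changes; only the strings produced at display 
time differ.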

This addresses most of the resolution taken today [1] but I have not 
moved the BP to the formats section. I leave that to the editors who may 
want to make further changes - or argue for it to be left where it is, 
or add references from the formats section or, or, or...

I've created the Pull Request https://github.com/w3c/dwbp/pull/447

Phil.

[1] https://www.w3.org/2016/08/19-dwbp-minutes#resolution02

On 15/08/2016 17:28, Bernadette Farias Lóscio wrote:
> Dear Ishida,
>
> This comment [1] is still under discussion [4] and we'd like to ask your
> opinion about two of our proposals:
>
> 1. to include locale-neutral representation ideas as part of BP3 [2], or
> 2. to include a paragraph at the introduction of Section 8.8 Data Formats
> [3] to discuss the relevance of having locale-neutral representations.
>
> We also discussed the proposal of having a new BP and we agreed that we
> won't have a lot of time for a broader review of the new BP and to collect
> feedback from the community.
>
> Thanks a lot!
> DWBP editors
>
> [1] https://lists.w3.org/Archives/Public/public-dwbp-comments/2016Jul/0028.html
> [2] http://agreiner.github.io/dwbp/bp.html#LocaleParametersMetadata
> [3] https://www.w3.org/TR/dwbp/#dataFormats
> [4] https://lists.w3.org/Archives/Public/public-dwbp-wg/2016Aug/0009.html
>
>
> 2016-08-04 23:26 GMT+02:00 Annette Greiner <amgreiner@lbl.gov>:
>
>> Hi Addison,
>>
>> Thanks for your response, and it does make sense. I think what I am still
>> missing is whether there is guidance we can point to as to how to represent
>> the "locale-neutral" data so that it can most easily be made locale
>> specific by existing tools. You mention "pre-made standards for the basic
>> data types". Is there a recommended list we could reference?
>>
>> Thanks for your help!
>> -Annette
>>
>>
>> On 8/4/16 12:31 PM, Phillips, Addison wrote:
>>
>>> Hi Annette,
>>>
>>> Thanks for the note. This is a personal reply not on behalf of the WG.
>>>
>>> Locale neutral formats are quite common on the Web and the Internet in
>>> general. One familiar format referenced by your document, for example, is
>>> XML Schema. While the representations of numbers, dates, and the like in
>>> XML Schema would be "more appropriate" for some languages/locales than
>>> others if given as plain text, what distinguishes them is that they are all
>>> machine readable and intended to be read by machines for later processing.
>>> The display of values is a separate, local, concern for the data's
>>> consumer. This necessarily means choosing specific separators (such as
>>> decimal separators) over other, more localized values. Save for "free text"
>>> (natural language) data, most data formats are locale neutral and these
>>> include things like JSON-LD, XML Schema, CSV, and so forth.
>>>
>>> Not every possible data structure or data value is, of course, covered
>>> fully. For example, in my day job (I work at Amazon), we have many
>>> different common measurement units defined internally. To transmit these in
>>> a locale-neutral manner, we need to construct our own data schemas and
>>> identifiers. There are profoundly many ways to measure shoes, dresses, auto
>>> parts, hats, drone propellers, and so forth. But it would be a nightmare to
>>> have to deal with localized presentation formats on top of that.
>>>
>>> But there are pre-made standards for the basic data types and these are
>>> what are needed to build almost any data structure necessary for global
>>> interchange of data.
>>>
>>> Does that make sense?
>>>
>>> Addison
>>>
>>> Addison Phillips
>>> Principal SDE, I18N Architect (Amazon)
>>> Chair (W3C I18N WG)
>>>
>>> Internationalization is not a feature.
>>> It is an architecture.
>>>
>>>
>>>
>>>
>>> -----Original Message-----
>>>> From: Annette Greiner [mailto:amgreiner@lbl.gov]
>>>> Sent: Thursday, August 04, 2016 12:04 PM
>>>> To: ishida@w3.org; public-dwbp-comments@w3.org
>>>> Cc: www International <www-international@w3.org>
>>>> Subject: Re: [i18n review comment] BP3 should recommend locale-neutral
>>>> representation #187
>>>>
>>>> Hello on behalf of the DWBP WG,
>>>>
>>>> We're interested in pursuing this concept in our best practice document,
>>>> but we would like some clarification of the practice of locale neutrality.
>>>> You mention the variation across locales in decimal symbol, grouping
>>>> symbol, number of grouping digits, digit shapes, etc., and you give an
>>>> example of a locale-neutral data structure for monetary values.
>>>> But this structure alone does not appear to address differences in
>>>> decimal symbol, grouping symbol, number of grouping digits, or digit
>>>> shapes. It does provide a mechanism to separately specify the units, and
>>>> the example uses an ISO-4217 currency code, both of which we agree are
>>>> good ideas. Is there a broad standard (beyond just monetary) for
>>>> addressing the other symbol/representation issues you raised that we can
>>>> address briefly in our best practice? Do you consider SI units consistent
>>>> with a locale-neutral approach? Is there a locale-neutral standard for
>>>> representing decimal numbers (perhaps using a period and no grouping, as
>>>> in your example)?
>>>>
>>>> -Annette
>>>>
>>>>
>>>> On 7/22/16 5:32 AM, ishida@w3.org wrote:
>>>>
>>>>> [raised by aphillips]
>>>>>
>>>>> https://www.w3.org/TR/dwbp/#LocaleParametersMetadata
>>>>>
>>>>> Best practice #3 introduces itself as:
>>>>>
>>>>>> Providing locale parameters helps humans and computer applications
>>>>>> to work accurately with things like dates, currencies and numbers that
>>>>>> may look similar but have different meanings in different locales.
>>>>>
>>>>> But the actual best practice is to use **locale-neutral**
>>>>> representations that are interpreted/displayed to end-users in a
>>>>> locale-appropriate manner. For example, instead of storing the string
>>>>> "€2000.00", exchanging a data structure like the following is strongly
>>>>> preferred:
>>>>>
>>>>> ```
>>>>> "price": {
>>>>>     "value": 2000.00,
>>>>>     "currency": "EUR"
>>>>> }
>>>>> ```
>>>>>
>>>>> The date examples given are all in xsd:date format, which is an
>>>>> excellent example of using a locale-neutral format.
>>>>>
>>>>> Many things are dependent on locale: decimal symbol, grouping symbol,
>>>>> number of grouping digits, digit shapes, etc. It's because there can
>>>>> be wide variation (sometimes open to misinterpretation) that sending a
>>>>> locale neutral format is preferred for data values. Note also btw that
>>>>> the position of the currency symbol is dependent on the locale. In
>>>>> France it would be normal to write 2000.00 € rather than €2000.00.
>>>>> Same even when talking about USD when using $, i.e. 2000.00 $.
>>>>>
>>>>>
>>>>> --
>>>> Annette Greiner
>>>> NERSC Data and Analytics Services
>>>> Lawrence Berkeley National Laboratory
>>>>
>>>>
>> --
>> Annette Greiner
>> NERSC Data and Analytics Services
>> Lawrence Berkeley National Laboratory
>>
>>
>>
>
>
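
To make the point in the thread above concrete, here is a minimal 
sketch of how a consumer might turn the locale-neutral price structure 
into locale-specific text. Everything named here is an assumption for 
illustration: the CONVENTIONS and SYMBOLS tables are hand-written (a 
real application would take them from locale data such as CLDR), and 
format_price is a made-up helper, not anything from the BP.

```python
import json
from decimal import Decimal

# Hypothetical, hand-written conventions for illustration only.
CONVENTIONS = {
    "en-US": {"decimal": ".", "group": ",", "symbol_first": True},
    "fr-FR": {"decimal": ",", "group": "\u202f", "symbol_first": False},
}
SYMBOLS = {"EUR": "€", "USD": "$"}


def format_price(price: dict, locale_tag: str) -> str:
    """Render a locale-neutral price for display (hypothetical helper)."""
    conv = CONVENTIONS[locale_tag]
    # Start from a fixed, neutral rendering such as "2,000.00" ...
    whole, frac = f"{price['value']:,.2f}".split(".")
    # ... then swap in the separators the target locale expects.
    number = whole.replace(",", conv["group"]) + conv["decimal"] + frac
    symbol = SYMBOLS[price["currency"]]
    if conv["symbol_first"]:
        return f"{symbol}{number}"
    return f"{number}\u00a0{symbol}"


# The exchanged data stays locale-neutral; parse_float keeps it exact.
data = json.loads(
    '{"price": {"value": 2000.00, "currency": "EUR"}}', parse_float=Decimal
)
us_text = format_price(data["price"], "en-US")
fr_text = format_price(data["price"], "fr-FR")
```

The same exchanged value yields different display strings, which is 
why the separators and symbol position are kept out of the data itself.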

-- 


Phil Archer
W3C Data Activity Lead
http://www.w3.org/2013/data/

http://philarcher.org
+44 (0)7887 767755
@philarcher1

Received on Friday, 19 August 2016 15:34:59 UTC