[i18n review comment] BP3 should recommend locale-neutral representation #187

22 messages.

[i18n review comment] BP3 should recommend locale-neutral representation #187
ishida@w3.org   Fri, 22 Jul 2016 13:32:52 +0100

Received on Friday, 22 July 2016 12:33:03 UTC

Sent to: public-dwbp-comments@w3.org
Copied to: www-international@w3.org.

[raised by aphillips]

https://www.w3.org/TR/dwbp/#LocaleParametersMetadata

Best practice #3 introduces itself as:

> Providing locale parameters helps humans and computer applications to work accurately with things like dates, currencies and numbers that may look similar but have different meanings in different locales.

But the actual best practice is to use **locale-neutral** representations that are interpreted/displayed to end-users in a locale-appropriate manner. For example, instead of storing the string "€2000.00", exchanging a data structure like the following is strongly preferred:

```
"price": {
  "value": 2000.00,
  "currency": "EUR"
}
```

The date examples given are all in xsd:date format, which is an excellent example of using a locale-neutral format.

Many things are dependent on locale: decimal symbol, grouping symbol, number of grouping digits, digit shapes, etc. It's because there can be wide variation (sometimes open to misinterpretation) that sending a locale-neutral format is preferred for data values.

Note also, btw, that the position of the currency symbol is dependent on the locale. In France it would be normal to write 2000.00 € rather than €2000.00. The same holds even when talking about USD and using $, i.e. 2000.00 $.
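To make the split between interchange and display concrete, here is a minimal Python sketch. It assumes a hand-written `CONVENTIONS` table and a hypothetical `format_price` helper, neither of which comes from the thread or from any library; a real consumer would more likely take these conventions from a locale database.

```python
import json
from decimal import Decimal

# Hypothetical display conventions, hard-coded for illustration only.
CONVENTIONS = {
    "en-IE": {"decimal": ".", "group": ",", "symbol_first": True},
    "fr-FR": {"decimal": ",", "group": "\u202f", "symbol_first": False},
}
SYMBOLS = {"EUR": "€", "USD": "$"}


def format_price(price: dict, locale_tag: str) -> str:
    """Render a locale-neutral price structure for one display locale."""
    conv = CONVENTIONS[locale_tag]
    # Start from a fixed, known layout ("2,000.00"), then swap in the
    # separators the target locale expects.
    whole, frac = f"{price['value']:,.2f}".split(".")
    number = whole.replace(",", conv["group"]) + conv["decimal"] + frac
    symbol = SYMBOLS[price["currency"]]
    if conv["symbol_first"]:
        return symbol + number
    return number + "\u00a0" + symbol


# The interchange form carries a plain number and a currency code.
# parse_float=Decimal keeps the value exact instead of using binary floats.
record = json.loads(
    '{"price": {"value": 2000.00, "currency": "EUR"}}', parse_float=Decimal
)

for tag in CONVENTIONS:
    print(tag, format_price(record["price"], tag))
```

The stored record never changes; only the consumer's choice of locale tag decides where the symbol goes and which separators appear.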
Re: [i18n review comment] BP3 should recommend locale-neutral representation #187
Annette Greiner   Thu, 4 Aug 2016 12:04:13 -0700

Received on Thursday, 4 August 2016 19:08:00 UTC

Sent to: ishida@w3.org, public-dwbp-comments@w3.org
Copied to: www-international@w3.org.

Hello on behalf of the DWBP WG,

We're interested in pursuing this concept in our best practice document, but we would like some clarification of the practice of locale neutrality. You mention the variation across locales in decimal symbol, grouping symbol, number of grouping digits, digit shapes, etc., and you give an example of a locale-neutral data structure for monetary values. But this structure alone does not appear to address differences in decimal symbol, grouping symbol, number of grouping digits, or digit shapes. It does provide a mechanism to separately specify the units, and the example uses an ISO-4217 currency code, both of which we agree are good ideas.

Is there a broad standard (beyond just monetary) for addressing the other symbol/representation issues you raised that we can address briefly in our best practice? Do you consider SI units consistent with a locale-neutral approach? Is there a locale-neutral standard for representing decimal numbers (perhaps using a period and no grouping, as in your example)?

-Annette

--
Annette Greiner
NERSC Data and Analytics Services
Lawrence Berkeley National Laboratory
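One existing convention that matches the "period and no grouping" description is the lexical form of xsd:decimal: an optional sign, ASCII digits, and a period as the only decimal indicator. The sketch below is an illustration under that assumption, not a schema validator; the names `is_xsd_decimal` and `parse_neutral` are hypothetical.

```python
import re
from decimal import Decimal

# Optional sign, ASCII digits only, period as decimal indicator, no grouping.
# [0-9] is used instead of \d so that other digit shapes are rejected.
XSD_DECIMAL = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)")


def is_xsd_decimal(text: str) -> bool:
    """Return True if text looks like an xsd:decimal lexical value."""
    return XSD_DECIMAL.fullmatch(text) is not None


def parse_neutral(text: str) -> Decimal:
    """Parse a locale-neutral decimal, refusing localized spellings."""
    if not is_xsd_decimal(text):
        raise ValueError(f"not a locale-neutral decimal: {text!r}")
    return Decimal(text)


for sample in ["2000.00", "-0.5", "2,000.00", "2.000,00", "٢٠٠٠"]:
    print(repr(sample), is_xsd_decimal(sample))
```

Rejecting the localized spellings outright, rather than guessing, avoids the case where "2.000" is read as two thousand by one consumer and as two by another.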
RE: [i18n review comment] BP3 should recommend locale-neutral representation #187
"Phillips, Addison"   Thu, 4 Aug 2016 19:31:47 +0000

Received on Thursday, 4 August 2016 19:32:16 UTC

Sent to: amgreiner@lbl.gov, ishida@w3.org, public-dwbp-comments@w3.org
Copied to: www-international@w3.org.

Hi Annette,

Thanks for the note. This is a personal reply not on behalf of the WG.

Locale neutral formats are quite common on the Web and the Internet in general. One familiar format referenced by your document, for example, is XML Schema. While the representations of numbers, dates, and the like in XML Schema would be "more appropriate" for some languages/locales than others if given as plain text, what distinguishes them is that they are all machine readable and intended to be read by machines for later processing. The display of values is a separate, local, concern for the data's consumer. This necessarily means choosing specific separators (such as decimal separators) over other, more localized values. Save for "free text" (natural language) data, most data formats are locale neutral and these include things like JSON-LD, XML Schema, CSV, and so forth.

Not every possible data structure or data value is, of course, covered fully. For example, in my day job (I work at Amazon), we have many different common measurement units defined internally. To transmit these in a locale-neutral manner, we need to construct our own data schemas and identifiers. There are profoundly many ways to measure shoes, dresses, auto parts, hats, drone propellers, and so forth. But it would be a nightmare to have to deal with localized presentation formats on top of that.

But there are pre-made standards for the basic data types and these are what are needed to build almost any data structure necessary for global interchange of data.

Does that make sense?

Addison

Addison Phillips
Principal SDE, I18N Architect (Amazon)
Chair (W3C I18N WG)

Internationalization is not a feature.
It is an architecture.
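The point that display is a separate concern for the consumer can be sketched with dates. The example assumes the interchange value is an xsd:date-style string without a timezone suffix, and it uses a hand-written `PATTERNS` table and a hypothetical `display_date` helper purely for illustration.

```python
from datetime import date

# Hypothetical display patterns; a real consumer would take these from
# a locale database rather than hard-coding them.
PATTERNS = {
    "en-US": "{m}/{d}/{y}",
    "en-GB": "{d:02d}/{m:02d}/{y}",
    "de-DE": "{d:02d}.{m:02d}.{y}",
}


def display_date(neutral: str, locale_tag: str) -> str:
    """Parse a YYYY-MM-DD value once, then render it for one locale."""
    parsed = date.fromisoformat(neutral)
    return PATTERNS[locale_tag].format(
        d=parsed.day, m=parsed.month, y=parsed.year
    )


sent = "2016-08-04"  # the machine-readable form that travels on the wire
for tag in PATTERNS:
    print(tag, display_date(sent, tag))
```

Had the producer sent "04/08/2016" instead, a US reader and a British reader would disagree about the month; the neutral form leaves no room for that.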
Re: [i18n review comment] BP3 should recommend locale-neutral representation #187
Annette Greiner   Thu, 4 Aug 2016 14:26:20 -0700

Received on Thursday, 4 August 2016 21:27:28 UTC

Sent to: addison@lab126.com, ishida@w3.org, public-dwbp-comments@w3.org
Copied to: www-international@w3.org.

Hi Addison,

Thanks for your response, and it does make sense. I think what I am still missing is whether there is guidance we can point to as to how to represent the "locale-neutral" data so that it can most easily be made locale specific by existing tools. You mention "pre-made standards for the basic data types". Is there a recommended list we could reference?

Thanks for your help!
-Annette
Re: [i18n review comment] BP3 should recommend locale-neutral representation #187
Bernadette Farias Lóscio   Mon, 15 Aug 2016 18:28:37 +0200

Received on Monday, 15 August 2016 16:29:29 UTC

Sent to: amgreiner@lbl.gov
Copied to: addison@lab126.com, ishida@w3.org, public-dwbp-comments@w3.org, www-international@w3.org.

Dear Ishida,

This comment [1] is still under discussion [4] and we'd like to ask your opinion about two of our proposals:

1. to include locale-neutral representation ideas as part of BP3 [2], or
2. to include a paragraph at the introduction of Section 8.8 Data Formats [3] to discuss the relevance of having locale-neutral representations.

We also discussed the proposal of having a new BP, but we agreed that we would not have enough time for a broader review of a new BP or to collect feedback from the community.

Thanks a lot!
DWBP editors

[1] https://lists.w3.org/Archives/Public/public-dwbp-comments/2016Jul/0028.html
[2] http://agreiner.github.io/dwbp/bp.html#LocaleParametersMetadata
[3] https://www.w3.org/TR/dwbp/#dataFormats
[4] https://lists.w3.org/Archives/Public/public-dwbp-wg/2016Aug/0009.html

--
Bernadette Farias Lóscio
Centro de Informática
Universidade Federal de Pernambuco - UFPE, Brazil
Re: [i18n review comment] BP3 should recommend locale-neutral representation #187
Phil Archer   Fri, 19 Aug 2016 16:37:26 +0100

Received on Friday, 19 August 2016 15:34:59 UTC

Sent to: bfl@cin.ufpe.br, amgreiner@lbl.gov
Copied to: addison@lab126.com, ishida@w3.org, public-dwbp-comments@w3.org, www-international@w3.org.

I took an action on today's call to try and address this in BP3. You can see the results at http://philarcher1.github.io/dwbp/bp.html#LocaleParametersMetadata

This uses some of Addison's text directly and highlights the value of the xsd datatypes - but retains enough of the original BP for it to be an amendment rather than a whole new one - I hope.

This addresses most of the resolution taken today [1] but I have not moved the BP to the formats section. I leave that to the editors who may want to make further changes - or argue for it to be left where it is, or add references from the formats section or, or, or...

I've created the Pull Request https://github.com/w3c/dwbp/pull/447

Phil.

[1] https://www.w3.org/2016/08/19-dwbp-minutes#resolution02

--
Phil Archer
W3C Data Activity Lead
http://www.w3.org/2013/data/

http://philarcher.org
+44 (0)7887 767755
@philarcher1
RE: [i18n review comment] BP3 should recommend locale-neutral representation #187
"Phillips, Addison"   Fri, 19 Aug 2016 16:39:36 +0000

Received on Friday, 19 August 2016 16:40:08 UTC

Sent to: phila@w3.org, bfl@cin.ufpe.br, amgreiner@lbl.gov
Copied to: ishida@w3.org, public-dwbp-comments@w3.org, www-international@w3.org.

Hi Phil,

Thanks for starting on this. I think the pull request is a good start. I have some comments on it.

My main concern is that this BP is really backwards. It recommends providing "locale parameter metadata" and then says that the simplest way to do this is to use locale-neutral formats. The recommendation should be more like "use locale-neutral formats or provide locale/language information where that's not possible". The pull request captures the use of locale-neutral, but doesn't really explain when to provide locale and language information.

I would change this:

--
<p class="practicedesc">Provide metadata about locale parameters (date, time, and number formats, language).</p>
--

To say:

--
<p class="practicedesc">Use locale-neutral data structures and values, or, where that is not possible, provide metadata about the locale used by data values.</p>
--

I would change:

--
<p>The simplest method is to use local-neutral representations of the actual data, and then add metadata to provide relevant locale information. For example, rather than storing "€2000.00" as a string, it's strongly preferred to exchange a data structure such as:</p>
--

To say:

--
<p>Most common data representations are locale neutral. For example, XML Schema types such as xsd:integer and xsd:date are intended for locale-neutral data interchange. Using locale-neutral representations allows the data values to be processed accurately without complex parsing or misinterpretation and also allows the data to be presented in the format most comfortable for the consumer of the data. For example, rather than storing "€2000,00" as a string, it's strongly preferred to exchange a data structure such as:</p>
--

Also, note the misspelling of "locale-neutral" in the pull request.

I would then go on to add some text about when locale parameters are needed. Something like:

--
Some datasets contain values that are not or cannot be rendered into a locale-neutral format. This is particularly true of any natural language text values. For each data field that can contain locale-affected or natural language text, there should be an associated language tag used to indicate the language and locale of the data. This locale information can be used in parsing the data or to ensure proper presentation and processing of the value by the consumer.
--

(Sorry for not generating a pull request of my own)

Addison
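To make the distinction in the message above concrete, here is a minimal Python sketch that is not taken from the thread. The interchange value stays locale-neutral (a plain decimal number plus an ISO 4217 code, as in the "price" example), and only the consumer formats it for display. The `CONVENTIONS` and `SYMBOLS` tables and the `format_price` helper are hypothetical and heavily simplified; a real consumer would more likely rely on CLDR-backed tooling than on a hand-written table.

```python
import json
from decimal import Decimal

# Locale-neutral interchange form: period as decimal separator, no grouping,
# and the currency carried separately as an ISO 4217 code.
wire = '{"price": {"value": 2000.00, "currency": "EUR"}}'

# parse_float=Decimal keeps the amount exact instead of converting to float.
price = json.loads(wire, parse_float=Decimal)["price"]

# Hypothetical display conventions, for illustration only.
CONVENTIONS = {
    "en-US": {"decimal": ".", "group": ",", "pattern": "{symbol}{number}"},
    "fr-FR": {"decimal": ",", "group": "\u202f", "pattern": "{number}\u00a0{symbol}"},
}
SYMBOLS = {"EUR": "€", "USD": "$"}


def format_price(value: Decimal, currency: str, locale_tag: str) -> str:
    """Render a locale-neutral amount using one locale's display conventions."""
    conv = CONVENTIONS[locale_tag]
    # Format with neutral separators first, then swap in the local ones.
    neutral = f"{value:,.2f}"
    table = str.maketrans({",": conv["group"], ".": conv["decimal"]})
    number = neutral.translate(table)
    return conv["pattern"].format(symbol=SYMBOLS[currency], number=number)


print(format_price(price["value"], price["currency"], "en-US"))
print(format_price(price["value"], price["currency"], "fr-FR"))
```

The same stored value yields different display strings per locale, which is why the thread argues for exchanging the neutral structure rather than a preformatted string such as "€2000.00".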
Re: [i18n review comment] BP3 should recommend locale-neutral representation #187
Annette Greiner   Fri, 19 Aug 2016 13:23:34 -0700

Received on Friday, 19 August 2016 20:24:36 UTC

Sent to: phila@w3.org, bfl@cin.ufpe.br
Copied to: addison@lab126.com, ishida@w3.org, public-dwbp-comments@w3.org, www-international@w3.org.

Thanks, Phil, for giving this a try. I think in light of Addison's comments, we will need to make a more substantial change. We had discussed in today's call changing the sense of the BP to primarily suggest using locale-neutral representations and to offer metadata only as a fallback if that wasn't workable. The version at http://agreiner.github.io/dwbp/bp.html#LocaleParametersMetadata goes a little further in that direction, but even that doesn't go far enough.

I think we need to write a new BP, "use locale-neutral data representations", and only mention the metadata approach in the implementation section as a fallback. There are usable pieces of text in the three versions of BP3 floating around, though I think this calls for a little new text as well, to get the angle right.

-Annette

--
Annette Greiner
NERSC Data and Analytics Services
Lawrence Berkeley National Laboratory
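For values that cannot be made locale-neutral, such as natural-language text, the fallback discussed in this thread is to attach language metadata. The following is a minimal Python sketch under stated assumptions: the record and the `pick_text` helper are hypothetical, the `@value`/`@language` pairing follows the JSON-LD convention, and the lookup is a simplified stand-in for full BCP 47 language-tag matching.

```python
# Hypothetical record: "issued" is already locale-neutral (xsd:date style),
# while "title" is natural-language text, so each value carries a language tag.
record = {
    "issued": "2016-08-19",
    "title": [
        {"@value": "Bus stops", "@language": "en"},
        {"@value": "Arrêts de bus", "@language": "fr"},
    ],
}


def pick_text(values, preferred, default="en"):
    """Return the text whose language tag best matches `preferred`."""
    by_lang = {v["@language"].lower(): v["@value"] for v in values}
    wanted = preferred.lower()
    # Exact tag first, then the primary language subtag, then the default.
    for tag in (wanted, wanted.split("-")[0], default):
        if tag in by_lang:
            return by_lang[tag]
    raise KeyError(preferred)


print(pick_text(record["title"], "fr-CA"))
print(pick_text(record["title"], "de"))
```

The date needs no locale metadata at all, while the title is usable only because each string says which language it is in.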
Re: [i18n review comment] BP3 should recommend locale-neutral representation #187
Deirdre Lee   Mon, 22 Aug 2016 08:52:52 +0100

Received on Monday, 22 August 2016 07:53:28 UTC

Sent to: addison@lab126.com, phila@w3.org, bfl@cin.ufpe.br, amgreiner@lbl.gov
Copied to: ishida@w3.org, public-dwbp-comments@w3.org, www-international@w3.org.

Hi,

Thank you for your comments, Addison. I think they make sense and should be straightforward to incorporate.

The title of the BP should probably also be updated to something like 'Provide locale-neutral data'.

Phil and DWBP editors, in Friday's meeting we also agreed to move BP3 to the Data Formats section from the Metadata section, which would make it BP14, right?

Kind regards,

Deirdre

--
Deirdre Lee, CEO & Founder
Derilinx - Linked & Open Data Solutions
Web: www.derilinx.com
Email: deirdre@derilinx.com
Address: 11/12 Baggot Court, Dublin 2, D02 F891
Tel: +353 (0)1 254 4316
Mob: +353 (0)87 417 2318
Linkedin: ie.linkedin.com/in/leedeirdre/
Twitter: @deirdrelee
Re: [i18n review comment] BP3 should recommend locale-neutral representation #187
Phil Archer   Mon, 22 Aug 2016 10:33:36 +0100

Received on Monday, 22 August 2016 09:31:06 UTC

Sent to: deirdre@derilinx.com, addison@lab126.com, bfl@cin.ufpe.br, amgreiner@lbl.gov
Copied to: ishida@w3.org, public-dwbp-comments@w3.org, www-international@w3.org.

Dear all,

I have taken further steps on this. The result can be seen at http://philarcher1.github.io/dwbp/bp.html#LocaleParametersMetadata

1. Addison's text used more or less verbatim;
1a. taken account of Annette's suggestion;
1b. replaced inline links to BCP47 and CLDR with references;
2. title of the BP changed to "Use locale-neutral data representations";
3. moved to Data Formats section as resolved in WG meeting on Friday;
4. added R-FormatMachineRead to list of evidence and thereby updated the UCR cross matching;
5. updated the Challenges SVG diagram;
6. updated my Pull Request.

NB, I *retained* the old ID for the BP so that any links to #LocaleParametersMetadata will still work. I know there are some of these, for example, in the Share-PSI project.

HTH

Phil.

On 22/08/2016 08:52, Deirdre Lee wrote: > HI, > > Thank you for your comments Addison. I think they make sense and should > be straight-forward to incorporate. > > The title of the BP should probably also be updated to something like > 'Provide locale-neutral data' > > Phil and DWBP editors, in Friday's meeting we also agreed to move BP3 to > the Data Formats section from the Metadata section, which would make it > BP14, right? > > Kind regards, > > Deirdre > > > > On 19/08/2016 17:39, Phillips, Addison wrote: >> Hi Phil, >> >> Thanks for starting on this. I think the pull request is a good start. >> I have some comments on it. >> >> My main concern is that this BP is really backwards. It recommends to >> "locale parameter metadata" and then says that the simplest way to do >> this is to use locale-neutral formats. The recommendation should be >> more like "use locale-neutral formats or provide locale/language >> information where that's not possible". The pull request captures the >> use of locale-neutral, but doesn't really explain about when to >> provide locale and language information. 
>> >> I would change this: >> >> -- >> <p class="practicedesc">Provide metadata about locale parameters >> (date, time, and number formats, language).</p> >> -- >> >> To say: >> >> -- >> <p class="practicedesc">Use locale-neutral data structures and values, >> or, where that is not possible, provide metadata about the locale used >> by data values.</p> >> -- >> >> I would change: >> >> -- >> <p>The simplest method is to use local-neutral representations of the >> actual data, and then add metadata to provide relevant locale >> information. For example, rather than storing "€2000.00" as a string, >> it's strongly preferred to exchange a data structure such as:</p> >> -- >> >> To say: >> >> -- >> <p>Most common data representations are locale neutral. For example, >> XML Schema types such as xsd:integer and xsd: date are intended for >> locale-neutral data interchange. Using locale-neutral representations >> allows the data values to be processed accurately without complex >> parsing or misinterpretation and also allows the data to be presented >> in the format most comfortable for the consumer of the data. For >> example, rather than storing "€2000,00" as a string, it's strongly >> preferred to exchange a data structure such as:</p> >> -- >> >> Also, note the misspelling of "locale-neutral" in the pull request. >> >> I would then go on to add some text about when locale parameters are >> needed. Something like: >> >> -- >> Some datasets contain values that are not or cannot be rendered into a >> locale-neutral format. This is particularly true of any natural >> language text values. For each data field that can contain locale >> affected or natural language text, there should be an associated >> language tag used to indicate the language and locale of the data. >> This locale information can be used in parsing the data or to ensure >> proper presentation and processing of the value by the consumer. 
>> -- >> >> (Sorry for not generating a pull request of my own) >> >> Addison >> >>> -----Original Message----- >>> From: Phil Archer [mailto:phila@w3.org] >>> Sent: Friday, August 19, 2016 8:37 AM >>> To: Bernadette Farias Lóscio <bfl@cin.ufpe.br>; Annette Greiner >>> <amgreiner@lbl.gov> >>> Cc: Phillips, Addison <addison@lab126.com>; ishida@w3.org; public-dwbp- >>> comments@w3.org; www International <www-international@w3.org> >>> Subject: Re: [i18n review comment] BP3 should recommend locale-neutral >>> representation #187 >>> >>> I took an action on today's call to try and address this in BP3. You >>> can see the >>> results at >>> http://philarcher1.github.io/dwbp/bp.html#LocaleParametersMetadata >>> >>> This uses some of Addison's text directly and highlights the value of >>> the xsd >>> datatypes - but retains enough of the original BP for it to be an >>> amendment >>> rather than a whole new one - I hope. >>> >>> This addresses most of the resolution taken today [1] but I have not >>> moved >>> the BP to the formats section. I leave that to the editors who may >>> want to >>> make further changes - or argue for it to be left where it is, or add >>> references >>> from the formats section or, or, or... >>> >>> I've created the Pull Request https://github.com/w3c/dwbp/pull/447 >>> >>> Phil. >>> >>> [1] https://www.w3.org/2016/08/19-dwbp-minutes#resolution02 >>> >>> On 15/08/2016 17:28, Bernadette Farias Lóscio wrote: >>>> Dear Ishida, >>>> >>>> This comment [1] is still under discussion [4] and we'd like to ask >>>> your opinion about two of our proposals: >>>> >>>> 1. to include locale-neutral representation ideas as part of BP3 [2], >>>> or 2. to include a paragraph at the introduction of Section 8.8 Data >>>> Formats [3] to discuss the relevance of having local-neutral >>>> representations. 
>>>> >>>> We also discussed the proposal of having a new BP and we agreed that >>>> we won't have a lot of time for a broader review of the new BP and to >>>> collect feedback from the community. >>>> >>>> Thanks a lot! >>>> DWBP editors >>>> >>>> [1] https://lists.w3.org/Archives/Public/public-dwbp-comments/ >>>> 2016Jul/0028.html >>>> [2]http://agreiner.github.io/dwbp/bp.html#LocaleParametersMetadata >>>> [3] https://www.w3.org/TR/dwbp/#dataFormats >>>> [4] >>>> https://lists.w3.org/Archives/Public/public-dwbp-wg/2016Aug/0009.html >>>> >>>> >>>> 2016-08-04 23:26 GMT+02:00 Annette Greiner <amgreiner@lbl.gov>: >>>> >>>>> Hi Addison, >>>>> >>>>> Thanks for your response, and it does make sense. I think what I am >>>>> still missing is whether there is guidance we can point to as to how >>>>> to represent the "locale-neutral" data so that it can most easily be >>>>> made locale specific by existing tools. You mention "pre-made >>>>> standards for the basic data types". Is there a recommended list we >>>>> could >>> reference? >>>>> Thanks for your help! >>>>> -Annette >>>>> >>>>> >>>>> On 8/4/16 12:31 PM, Phillips, Addison wrote: >>>>> >>>>>> Hi Annette, >>>>>> >>>>>> Thanks for the note. This is a personal reply not on behalf of the >>>>>> WG. >>>>>> >>>>>> Locale neutral formats are quite common on the Web and the Internet >>>>>> in general. One familiar format referenced by your document, for >>>>>> example, is XML Schema. While the representations of numbers, dates, >>>>>> and the like in XML Schema would be "more appropriate" for some >>>>>> languages/locales than others if given as plain text, what >>>>>> distinguishes them is that they are all machine readable and >>>>>> intended to >>> be read by machines for later processing. >>>>>> The display of values is a separate, local, concern for the data's >>>>>> consumer. This necessarily means choosing specific separators (such >>>>>> as decimal separators) over other, more localized values. 
Save for >>>>>> "free >>> text" >>>>>> (natural language) data, most data formats are locale neutral and >>>>>> these include things like JSON-LD, XML Schema, CSV, and so forth. >>>>>> >>>>>> Not every possible data structure or data value is, of course, >>>>>> covered fully. For example, in my day job (I work at Amazon), we >>>>>> have many different common measurement units defined internally. To >>>>>> transmit these in a locale-neutral manner, we need to construct our >>>>>> own data schemas and identifiers. There are profoundly many ways to >>>>>> measure shoes, dresses, auto parts, hats, drone propellers, and so >>>>>> forth. But it would be a nightmare to have to deal with localized >>> presentation formats on top of that. >>>>>> But there are pre-made standards for the basic data types and these >>>>>> are what are needed to build almost any data structure necessary for >>>>>> global interchange of data. >>>>>> >>>>>> Does that make sense? >>>>>> >>>>>> Addison >>>>>> >>>>>> Addison Phillips >>>>>> Principal SDE, I18N Architect (Amazon) Chair (W3C I18N WG) >>>>>> >>>>>> Internationalization is not a feature. >>>>>> It is an architecture. >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -----Original Message----- >>>>>>> From: Annette Greiner [mailto:amgreiner@lbl.gov] >>>>>>> Sent: Thursday, August 04, 2016 12:04 PM >>>>>>> To: ishida@w3.org; public-dwbp-comments@w3.org >>>>>>> Cc: www International <www-international@w3.org> >>>>>>> Subject: Re: [i18n review comment] BP3 should recommend >>>>>>> locale-neutral representation #187 >>>>>>> >>>>>>> Hello on behalf of the DWBP WG, >>>>>>> >>>>>>> We're interested in pursuing this concept in our best practice >>>>>>> document, but we would like some clarification of the practice of >>>>>>> locale neutrality. 
>>>>>>> You >>>>>>> mention the variation across locales in decimal symbol, grouping >>>>>>> symbol, number of grouping digits, digit shapes, etc., and you give >>>>>>> an example of a locale-neutral data structure for monetary values. >>>>>>> But this structure alone does not appear to address differences in >>>>>>> decimal symbol, grouping symbol, number of grouping digits, or >>>>>>> digit shapes. It does provide a mechanism to separately specify the >>>>>>> units, and the example uses an ISO-4217 currency code, both of >>>>>>> which we agree are good ideas. Is there a broad standard (beyond >>>>>>> just monetary) for addressing the other symbol/representation >>>>>>> issues you raised that we can address briefly in our best practice? >>>>>>> Do you consider SI units consistent with a locale-neutral approach? >>>>>>> Is there a locale-neutral standard for representing decimal numbers >>>>>>> (perhaps using a period and no grouping, as in your example)? >>>>>>> >>>>>>> -Annette >>>>>>> >>>>>>> >>>>>>> On 7/22/16 5:32 AM, ishida@w3.org wrote: >>>>>>> >>>>>>>> [raised by aphillips] >>>>>>>> >>>>>>>> https://www.w3.org/TR/dwbp/#LocaleParametersMetadata >>>>>>>> >>>>>>>> Best practice #3 introduces itself as: >>>>>>>> >>>>>>>> Providing locale parameters helps humans and computer applications >>>>>>>> to work accurately with things like dates, currencies and numbers >>>>>>>> that may look similar but have different meanings in different >>>>>>>> locales. >>>>>>>> >>>>>>>> But the actual best practice is to use **locale-neutral** >>>>>>>> representations that are interpreted/displayed to end-users in a >>>>>>>> locale-appropriate manner. 
For example, instead of storing the >>>>>>>> string "€2000.00", exchanging a data structure like the following >>>>>>>> is strongly >>>>>>>> preferred: >>>>>>>> >>>>>>>> ``` >>>>>>>> "price" { >>>>>>>> "value": 2000.00, >>>>>>>> "currency": "EUR" >>>>>>>> } >>>>>>>> ``` >>>>>>>> >>>>>>>> The date examples given are all in xsd:date format, which is an >>>>>>>> excellent example of using a locale-neutral format. >>>>>>>> >>>>>>>> Many things are dependent on locale: decimal symbol, grouping >>>>>>>> symbol, number of grouping digits, digit shapes, etc. It's because >>>>>>>> there can be wide variation (sometimes open to misinterpretation) >>>>>>>> that sending a locale neutral format is preferred for data values. >>>>>>>> Note also btw that the position of the currency symbol is >>>>>>>> dependent on the locale. In France it would be normal to write >>> 2000.00 € rather than €2000.00. >>>>>>>> Same even when talking about USD when using $, ie. 2000.00 $. >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>> Annette Greiner >>>>>>> NERSC Data and Analytics Services >>>>>>> Lawrence Berkeley National Laboratory >>>>>>> >>>>>>> >>>>> -- >>>>> Annette Greiner >>>>> NERSC Data and Analytics Services >>>>> Lawrence Berkeley National Laboratory >>>>> >>>>> >>>>> >>>> >>> -- >>> >>> >>> Phil Archer >>> W3C Data Activity Lead >>> http://www.w3.org/2013/data/ >>> >>> http://philarcher.org >>> +44 (0)7887 767755 >>> @philarcher1 > -- Phil Archer W3C Data Activity Lead http://www.w3.org/2013/data/ http://philarcher.org +44 (0)7887 767755 @philarcher1
Re: [i18n review comment] BP3 should recommend locale-neutral representation #187
Deirdre Lee   Mon, 22 Aug 2016 13:48:04 +0100

Received on Monday, 22 August 2016 12:48:41 UTC

Sent to: phila@w3.org, addison@lab126.com, bfl@cin.ufpe.br, amgreiner@lbl.gov
Copied to: ishida@w3.org, public-dwbp-comments@w3.org, www-international@w3.org.

Looks good, thanks Phil.

--
Deirdre Lee, CEO & Founder
Derilinx - Linked & Open Data Solutions

Web: www.derilinx.com
Email: deirdre@derilinx.com
Address: 11/12 Baggot Court, Dublin 2, D02 F891
Tel: +353 (0)1 254 4316
Mob: +353 (0)87 417 2318
Linkedin: ie.linkedin.com/in/leedeirdre/
Twitter: @deirdrelee
RE: [i18n review comment] BP3 should recommend locale-neutral representation #187
"Phillips, Addison"   Mon, 22 Aug 2016 17:36:43 +0000

Received on Monday, 22 August 2016 17:37:14 UTC

Sent to: phila@w3.org, deirdre@derilinx.com, bfl@cin.ufpe.br, amgreiner@lbl.gov
Copied to: ishida@w3.org, public-dwbp-comments@w3.org, www-international@w3.org.

Hi Phil,

This looks good. A few comments.

1. Rather than providing your own definition for 'locale', you might make use of the one we provide in LTLI [1].

2. The "why" is still missing something. I would suggest adding a new first paragraph explaining locale-neutral first. Something like:

--
Data values that are machine-readable and not specific to any particular language or culture are more durable and less open to misinterpretation than values that use one of the many different cultural representations. By using a locale-neutral format, systems avoid the need to establish specific interchange rules that vary according to the language or location of the user. When the data is already in a locale-specific format, providing locale parameters... <rest of existing text>
--

Hope that helps,

Addison

[1] https://www.w3.org/TR/ltli/#locale

> -----Original Message-----
> From: Phil Archer [mailto:phila@w3.org]
> Sent: Monday, August 22, 2016 2:34 AM
> To: Deirdre Lee <deirdre@derilinx.com>; Phillips, Addison <addison@lab126.com>; Bernadette Farias Lóscio <bfl@cin.ufpe.br>; Annette Greiner <amgreiner@lbl.gov>
> Cc: ishida@w3.org; public-dwbp-comments@w3.org; www International <www-international@w3.org>
> Subject: Re: [i18n review comment] BP3 should recommend locale-neutral representation #187
>
> Dear all,
>
> I have taken further steps on this. The result can be seen at
> http://philarcher1.github.io/dwbp/bp.html#LocaleParametersMetadata
>
> 1. Addison's text used more or less verbatim;
> 1a. taken account of Annette's suggestion;
> 1b. replaced inline links to BCP47 and CLDR with references
> 2. title of the BP changed to Use locale-neutral data representations
> 3. moved to Data Formats section as resolved in WG meeting on Friday;
> 4. added R-FormatMachineRead to list of evidence and thereby updated the UCR cross matching;
> 5. updated the Challenges SVG diagram;
> 6. updated my Pull request.
> > NB, I *retained* the old ID for the BP so that any links to > #LocaleParametersMetadata will still work. I know there are some of these, > for example, in the Share-PSI project. > > HTH > > Phil. > > > > On 22/08/2016 08:52, Deirdre Lee wrote: > > HI, > > > > Thank you for your comments Addison. I think they make sense and > > should be straight-forward to incorporate. > > > > The title of the BP should probably also be updated to something like > > 'Provide locale-neutral data' > > > > Phil and DWBP editors, in Friday's meeting we also agreed to move BP3 > > to the Data Formats section from the Metadata section, which would > > make it BP14, right? > > > > Kind regards, > > > > Deirdre > > > > > > > > On 19/08/2016 17:39, Phillips, Addison wrote: > >> Hi Phil, > >> > >> Thanks for starting on this. I think the pull request is a good start. > >> I have some comments on it. > >> > >> My main concern is that this BP is really backwards. It recommends to > >> "locale parameter metadata" and then says that the simplest way to do > >> this is to use locale-neutral formats. The recommendation should be > >> more like "use locale-neutral formats or provide locale/language > >> information where that's not possible". The pull request captures the > >> use of locale-neutral, but doesn't really explain about when to > >> provide locale and language information. > >> > >> I would change this: > >> > >> -- > >> <p class="practicedesc">Provide metadata about locale parameters > >> (date, time, and number formats, language).</p> > >> -- > >> > >> To say: > >> > >> -- > >> <p class="practicedesc">Use locale-neutral data structures and > >> values, or, where that is not possible, provide metadata about the > >> locale used by data values.</p> > >> -- > >> > >> I would change: > >> > >> -- > >> <p>The simplest method is to use local-neutral representations of the > >> actual data, and then add metadata to provide relevant locale > >> information. 
For example, rather than storing "€2000.00" as a string, > >> it's strongly preferred to exchange a data structure such as:</p> > >> -- > >> > >> To say: > >> > >> -- > >> <p>Most common data representations are locale neutral. For example, > >> XML Schema types such as xsd:integer and xsd: date are intended for > >> locale-neutral data interchange. Using locale-neutral representations > >> allows the data values to be processed accurately without complex > >> parsing or misinterpretation and also allows the data to be presented > >> in the format most comfortable for the consumer of the data. For > >> example, rather than storing "€2000,00" as a string, it's strongly > >> preferred to exchange a data structure such as:</p> > >> -- > >> > >> Also, note the misspelling of "locale-neutral" in the pull request. > >> > >> I would then go on to add some text about when locale parameters are > >> needed. Something like: > >> > >> -- > >> Some datasets contain values that are not or cannot be rendered into > >> a locale-neutral format. This is particularly true of any natural > >> language text values. For each data field that can contain locale > >> affected or natural language text, there should be an associated > >> language tag used to indicate the language and locale of the data. > >> This locale information can be used in parsing the data or to ensure > >> proper presentation and processing of the value by the consumer. 
> >> --
> >>
> >> (Sorry for not generating a pull request of my own)
> >>
> >> Addison
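The language-tag suggestion quoted above ("for each data field that can contain ... natural language text, there should be an associated language tag") can be sketched roughly as follows. This is only an illustration in Python: the `TaggedText` type, the `pick_text` helper and the sample titles are hypothetical and do not come from the DWBP draft. The fallback is a simplified truncation lookup in the spirit of BCP 47 matching, not a complete implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedText:
    """A natural-language value paired with a BCP 47 language tag."""
    value: str
    lang: str  # for example "en", "fr" or "pt-BR"


def pick_text(candidates, preferred):
    """Return the candidate whose tag best matches `preferred`.

    Simplified lookup: try the full tag, then drop trailing subtags
    one at a time ("fr-CA" -> "fr"). Returns None if nothing matches.
    """
    by_tag = {c.lang.lower(): c for c in candidates}
    parts = preferred.lower().split("-")
    while parts:
        match = by_tag.get("-".join(parts))
        if match is not None:
            return match
        parts.pop()
    return None


# Hypothetical dataset titles, each carrying its own language tag.
titles = [
    TaggedText("Bus stops", "en"),
    TaggedText("Arrêts de bus", "fr"),
    TaggedText("Paradas de ônibus", "pt-BR"),
]
```

A consumer asking for "fr-CA" would fall back to the "fr" title, while a request for "de" finds nothing and leaves the choice of default to the caller.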
Re: [i18n review comment] BP3 should recommend locale-neutral representation #187
Phil Archer   Tue, 23 Aug 2016 11:28:38 +0100

Received on Tuesday, 23 August 2016 10:26:06 UTC

Sent to: addison@lab126.com, deirdre@derilinx.com, bfl@cin.ufpe.br, amgreiner@lbl.gov
Copied to: ishida@w3.org, public-dwbp-comments@w3.org, www-international@w3.org.

Thanks again Addison,

Pls see below.

On 22/08/2016 18:36, Phillips, Addison wrote:
> Hi Phil,
>
> This looks good. A few comments.
>
> 1. Rather than providing your own definition for 'locale', you might make use of the one we provide in LTLI [1].

Done
http://w3c.github.io/dwbp/bp.html#locale_parameter

> 2. The "why" is still missing something. I would suggest adding a new first paragraph explaining locale-neutral first. Something like:
>
> --
> Data values that are machine-readable and not specific to any particular language or culture are more durable and less open to misinterpretation than values that use one of the many different cultural representations. By using a locale-neutral format, systems avoid the need to establish specific interchange rules that vary according to the language or location of the user.
>
> When the data is already in a locale-specific format, providing locale parameters... <rest of existing text>

Done, exactly as you suggest
http://w3c.github.io/dwbp/bp.html#LocaleParametersMetadata

With luck... the doc gets a green light from you?

Thanks again

Phil.

> --
>
> Hope that helps,
>
> Addison
>
> [1] https://www.w3.org/TR/ltli/#locale

--

Phil Archer
W3C Data Activity Lead
http://www.w3.org/2013/data/

http://philarcher.org
+44 (0)7887 767755
@philarcher1
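As a rough illustration of the locale-neutral approach discussed in this message, the Python sketch below keeps the exchanged values in a neutral form (an xsd:date-style date, a plain decimal string and an ISO 4217 currency code, as in the price example from the original comment) and applies locale conventions only at display time. The `DISPLAY_RULES` table, the `SYMBOLS` map and both function names are hypothetical and hand-written for this example; a real application would more likely take such rules from CLDR data, and digit grouping is deliberately left out.

```python
from datetime import date
from decimal import Decimal

# Locale-neutral record: ISO 8601 / xsd:date string, plain decimal
# string with "." as separator, and an ISO 4217 currency code.
record = {"date": "2016-08-23", "price": {"value": "2000.00", "currency": "EUR"}}

# Hypothetical display rules for two locales (illustration only).
DISPLAY_RULES = {
    "en-IE": {"decimal": ".", "symbol_first": True},
    "fr-FR": {"decimal": ",", "symbol_first": False},
}
SYMBOLS = {"EUR": "€", "USD": "$"}


def parse_record(raw):
    """Parse the neutral values without consulting any locale."""
    return {
        "date": date.fromisoformat(raw["date"]),
        "amount": Decimal(raw["price"]["value"]),
        "currency": raw["price"]["currency"],
    }


def display_price(amount, currency, locale_tag):
    """Format an amount for presentation only, using the rules above."""
    rules = DISPLAY_RULES[locale_tag]
    number = f"{amount:.2f}".replace(".", rules["decimal"])
    symbol = SYMBOLS[currency]
    if rules["symbol_first"]:
        return f"{symbol}{number}"
    return f"{number} {symbol}"


parsed = parse_record(record)
print(display_price(parsed["amount"], parsed["currency"], "en-IE"))
print(display_price(parsed["amount"], parsed["currency"], "fr-FR"))
```

The point of the split is that `parse_record` never needs to know who produced the data or where, while everything locale-specific is confined to `display_price`.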
RE: [i18n review comment] BP3 should recommend locale-neutral representation #187
"Phillips, Addison"   Tue, 23 Aug 2016 14:11:50 +0000

Received on Tuesday, 23 August 2016 14:12:38 UTC

Sent to: phila@w3.org, deirdre@derilinx.com, bfl@cin.ufpe.br, amgreiner@lbl.gov
Copied to: ishida@w3.org, public-dwbp-comments@w3.org, www-international@w3.org.

Hi Phil,

Thanks. This looks good to me.

Addison

> -----Original Message-----
> From: Phil Archer [mailto:phila@w3.org]
> Sent: Tuesday, August 23, 2016 3:29 AM
> To: Phillips, Addison <addison@lab126.com>; Deirdre Lee <deirdre@derilinx.com>; Bernadette Farias Lóscio <bfl@cin.ufpe.br>; Annette Greiner <amgreiner@lbl.gov>
> Cc: ishida@w3.org; public-dwbp-comments@w3.org; www International <www-international@w3.org>
> Subject: Re: [i18n review comment] BP3 should recommend locale-neutral representation #187
>
> Thanks again Addison,
>
> Pls see below.
>
> On 22/08/2016 18:36, Phillips, Addison wrote:
> > Hi Phil,
> >
> > This looks good. A few comments.
> >
> > 1. Rather than providing your own definition for 'locale', you might make use of the one we provide in LTLI [1].
>
> Done
> http://w3c.github.io/dwbp/bp.html#locale_parameter
>
> > 2. The "why" is still missing something. I would suggest adding a new first paragraph explaining locale-neutral first. Something like:
> >
> > --
> > Data values that are machine-readable and not specific to any particular language or culture are more durable and less open to misinterpretation than values that use one of the many different cultural representations. By using a locale-neutral format, systems avoid the need to establish specific interchange rules that vary according to the language or location of the user.
> >
> > When the data is already in a locale-specific format, providing locale parameters... <rest of existing text>
>
> Done, exactly as you suggest
> http://w3c.github.io/dwbp/bp.html#LocaleParametersMetadata
>
> With luck... the doc gets a green light from you?
>
> Thanks again
>
> Phil.
> > > -- > > > > Hope that helps, > > > > Addison > > > > [1] https://www.w3.org/TR/ltli/#locale > > > >> -----Original Message----- > >> From: Phil Archer [mailto:phila@w3.org] > >> Sent: Monday, August 22, 2016 2:34 AM > >> To: Deirdre Lee <deirdre@derilinx.com>; Phillips, Addison > >> <addison@lab126.com>; Bernadette Farias Lóscio <bfl@cin.ufpe.br>; > >> Annette Greiner <amgreiner@lbl.gov> > >> Cc: ishida@w3.org; public-dwbp-comments@w3.org; www International > >> <www-international@w3.org> > >> Subject: Re: [i18n review comment] BP3 should recommend > >> locale-neutral representation #187 > >> > >> Dear all, > >> > >> I have taken further steps on this. The result can be seen at > >> http://philarcher1.github.io/dwbp/bp.html#LocaleParametersMetadata > >> > >> 1. Addision's text used more or less verbatim; 1a. taken account of > >> Annette's suggestion; 1b. replaced inline links to BCP47 and CLDR with > references 2. > >> title of the BP changed to Use locale-neutral data representations 3. > >> moved to Data Formats section as resolved in WG meeting on Friday; 4. > >> added R- FormatMachineRead to list of evidence and thereby updated > >> the UCR cross matching; 5. updated the Challenges SVG diagram; 6. > >> updated my Pull request. > >> > >> NB, I *retained* the old ID for the BP so that any links to > >> #LocaleParametersMetadata will still work. I know there are some of > >> these, for example, in the Share-PSI project. > >> > >> HTH > >> > >> Phil. > >> > >> > >> > >> On 22/08/2016 08:52, Deirdre Lee wrote: > >>> HI, > >>> > >>> Thank you for your comments Addison. I think they make sense and > >>> should be straight-forward to incorporate. > >>> > >>> The title of the BP should probably also be updated to something > >>> like 'Provide locale-neutral data' > >>> > >>> Phil and DWBP editors, in Friday's meeting we also agreed to move > >>> BP3 to the Data Formats section from the Metadata section, which > >>> would make it BP14, right? 
> >>> > >>> Kind regards, > >>> > >>> Deirdre > >>> > >>> > >>> > >>> On 19/08/2016 17:39, Phillips, Addison wrote: > >>>> Hi Phil, > >>>> > >>>> Thanks for starting on this. I think the pull request is a good start. > >>>> I have some comments on it. > >>>> > >>>> My main concern is that this BP is really backwards. It recommends > >>>> to "locale parameter metadata" and then says that the simplest way > >>>> to do this is to use locale-neutral formats. The recommendation > >>>> should be more like "use locale-neutral formats or provide > >>>> locale/language information where that's not possible". The pull > >>>> request captures the use of locale-neutral, but doesn't really > >>>> explain about when to provide locale and language information. > >>>> > >>>> I would change this: > >>>> > >>>> -- > >>>> <p class="practicedesc">Provide metadata about locale parameters > >>>> (date, time, and number formats, language).</p> > >>>> -- > >>>> > >>>> To say: > >>>> > >>>> -- > >>>> <p class="practicedesc">Use locale-neutral data structures and > >>>> values, or, where that is not possible, provide metadata about the > >>>> locale used by data values.</p> > >>>> -- > >>>> > >>>> I would change: > >>>> > >>>> -- > >>>> <p>The simplest method is to use local-neutral representations of > >>>> the actual data, and then add metadata to provide relevant locale > >>>> information. For example, rather than storing "€2000.00" as a > >>>> string, it's strongly preferred to exchange a data structure such > >>>> as:</p> > >>>> -- > >>>> > >>>> To say: > >>>> > >>>> -- > >>>> <p>Most common data representations are locale neutral. For > >>>> example, XML Schema types such as xsd:integer and xsd: date are > >>>> intended for locale-neutral data interchange. 
Using locale-neutral > >>>> representations allows the data values to be processed accurately > >>>> without complex parsing or misinterpretation and also allows the > >>>> data to be presented in the format most comfortable for the > >>>> consumer of the data. For example, rather than storing "€2000,00" > >>>> as a string, it's strongly preferred to exchange a data structure > >>>> such as:</p> > >>>> -- > >>>> > >>>> Also, note the misspelling of "locale-neutral" in the pull request. > >>>> > >>>> I would then go on to add some text about when locale parameters > >>>> are needed. Something like: > >>>> > >>>> -- > >>>> Some datasets contain values that are not or cannot be rendered > >>>> into a locale-neutral format. This is particularly true of any > >>>> natural language text values. For each data field that can contain > >>>> locale affected or natural language text, there should be an > >>>> associated language tag used to indicate the language and locale of the > data. > >>>> This locale information can be used in parsing the data or to > >>>> ensure proper presentation and processing of the value by the > consumer. > >>>> -- > >>>> > >>>> (Sorry for not generating a pull request of my own) > >>>> > >>>> Addison > >>>> > >>>>> -----Original Message----- > >>>>> From: Phil Archer [mailto:phila@w3.org] > >>>>> Sent: Friday, August 19, 2016 8:37 AM > >>>>> To: Bernadette Farias Lóscio <bfl@cin.ufpe.br>; Annette Greiner > >>>>> <amgreiner@lbl.gov> > >>>>> Cc: Phillips, Addison <addison@lab126.com>; ishida@w3.org; > >>>>> public-dwbp- comments@w3.org; www International > >>>>> <www-international@w3.org> > >>>>> Subject: Re: [i18n review comment] BP3 should recommend > >>>>> locale-neutral representation #187 > >>>>> > >>>>> I took an action on today's call to try and address this in BP3. 
> >>>>> You can see the results at > >>>>> > http://philarcher1.github.io/dwbp/bp.html#LocaleParametersMetadata > >>>>> > >>>>> This uses some of Addison's text directly and highlights the value > >>>>> of the xsd datatypes - but retains enough of the original BP for > >>>>> it to be an amendment rather than a whole new one - I hope. > >>>>> > >>>>> This addresses most of the resolution taken today [1] but I have > >>>>> not moved the BP to the formats section. I leave that to the > >>>>> editors who may want to make further changes - or argue for it to > >>>>> be left where it is, or add references from the formats section or, or, > or... > >>>>> > >>>>> I've created the Pull Request https://github.com/w3c/dwbp/pull/447 > >>>>> > >>>>> Phil. > >>>>> > >>>>> [1] https://www.w3.org/2016/08/19-dwbp-minutes#resolution02 > >>>>> > >>>>> On 15/08/2016 17:28, Bernadette Farias Lóscio wrote: > >>>>>> Dear Ishida, > >>>>>> > >>>>>> This comment [1] is still under discussion [4] and we'd like to > >>>>>> ask your opinion about two of our proposals: > >>>>>> > >>>>>> 1. to include locale-neutral representation ideas as part of BP3 > >>>>>> [2], or 2. to include a paragraph at the introduction of Section > >>>>>> 8.8 Data Formats [3] to discuss the relevance of having > >>>>>> local-neutral representations. > >>>>>> > >>>>>> We also discussed the proposal of having a new BP and we agreed > >>>>>> that we won't have a lot of time for a broader review of the new > >>>>>> BP and to collect feedback from the community. > >>>>>> > >>>>>> Thanks a lot! > >>>>>> DWBP editors > >>>>>> > >>>>>> [1] https://lists.w3.org/Archives/Public/public-dwbp-comments/ > >>>>>> 2016Jul/0028.html > >>>>>> > >> [2]http://agreiner.github.io/dwbp/bp.html#LocaleParametersMetadata > >>>>>> [3] https://www.w3.org/TR/dwbp/#dataFormats > >>>>>> [4] > >>>>>> https://lists.w3.org/Archives/Public/public-dwbp-wg/2016Aug/0009. 
> >>>>>> ht > >>>>>> ml > >>>>>> > >>>>>> > >>>>>> 2016-08-04 23:26 GMT+02:00 Annette Greiner <amgreiner@lbl.gov>: > >>>>>> > >>>>>>> Hi Addison, > >>>>>>> > >>>>>>> Thanks for your response, and it does make sense. I think what I > >>>>>>> am still missing is whether there is guidance we can point to as > >>>>>>> to how to represent the "locale-neutral" data so that it can > >>>>>>> most easily be made locale specific by existing tools. You > >>>>>>> mention "pre-made standards for the basic data types". Is there > >>>>>>> a recommended list we could > >>>>> reference? > >>>>>>> Thanks for your help! > >>>>>>> -Annette > >>>>>>> > >>>>>>> > >>>>>>> On 8/4/16 12:31 PM, Phillips, Addison wrote: > >>>>>>> > >>>>>>>> Hi Annette, > >>>>>>>> > >>>>>>>> Thanks for the note. This is a personal reply not on behalf of > >>>>>>>> the WG. > >>>>>>>> > >>>>>>>> Locale neutral formats are quite common on the Web and the > >>>>>>>> Internet in general. One familiar format referenced by your > >>>>>>>> document, for example, is XML Schema. While the > representations > >>>>>>>> of numbers, dates, and the like in XML Schema would be "more > >>>>>>>> appropriate" for some languages/locales than others if given as > >>>>>>>> plain text, what distinguishes them is that they are all > >>>>>>>> machine readable and intended to > >>>>> be read by machines for later processing. > >>>>>>>> The display of values is a separate, local, concern for the > >>>>>>>> data's consumer. This necessarily means choosing specific > >>>>>>>> separators (such as decimal separators) over other, more > >>>>>>>> localized values. Save for "free > >>>>> text" > >>>>>>>> (natural language) data, most data formats are locale neutral > >>>>>>>> and these include things like JSON-LD, XML Schema, CSV, and so > forth. > >>>>>>>> > >>>>>>>> Not every possible data structure or data value is, of course, > >>>>>>>> covered fully. 
For example, in my day job (I work at Amazon), > >>>>>>>> we have many different common measurement units defined > internally. > >>>>>>>> To transmit these in a locale-neutral manner, we need to > >>>>>>>> construct our own data schemas and identifiers. There are > >>>>>>>> profoundly many ways to measure shoes, dresses, auto parts, > >>>>>>>> hats, drone propellers, and so forth. But it would be a > >>>>>>>> nightmare to have to deal with localized > >>>>> presentation formats on top of that. > >>>>>>>> But there are pre-made standards for the basic data types and > >>>>>>>> these are what are needed to build almost any data structure > >>>>>>>> necessary for global interchange of data. > >>>>>>>> > >>>>>>>> Does that make sense? > >>>>>>>> > >>>>>>>> Addison > >>>>>>>> > >>>>>>>> Addison Phillips > >>>>>>>> Principal SDE, I18N Architect (Amazon) Chair (W3C I18N WG) > >>>>>>>> > >>>>>>>> Internationalization is not a feature. > >>>>>>>> It is an architecture. > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> -----Original Message----- > >>>>>>>>> From: Annette Greiner [mailto:amgreiner@lbl.gov] > >>>>>>>>> Sent: Thursday, August 04, 2016 12:04 PM > >>>>>>>>> To: ishida@w3.org; public-dwbp-comments@w3.org > >>>>>>>>> Cc: www International <www-international@w3.org> > >>>>>>>>> Subject: Re: [i18n review comment] BP3 should recommend > >>>>>>>>> locale-neutral representation #187 > >>>>>>>>> > >>>>>>>>> Hello on behalf of the DWBP WG, > >>>>>>>>> > >>>>>>>>> We're interested in pursuing this concept in our best practice > >>>>>>>>> document, but we would like some clarification of the practice > >>>>>>>>> of locale neutrality. > >>>>>>>>> You > >>>>>>>>> mention the variation across locales in decimal symbol, > >>>>>>>>> grouping symbol, number of grouping digits, digit shapes, > >>>>>>>>> etc., and you give an example of a locale-neutral data > >>>>>>>>> structure for monetary > >> values. 
> >>>>>>>>> But this structure alone does not appear to address > >>>>>>>>> differences in decimal symbol, grouping symbol, number of > >>>>>>>>> grouping digits, or digit shapes. It does provide a mechanism > >>>>>>>>> to separately specify the units, and the example uses an > >>>>>>>>> ISO-4217 currency code, both of which we agree are good ideas. > >>>>>>>>> Is there a broad standard (beyond just monetary) for > >>>>>>>>> addressing the other symbol/representation issues you raised > >>>>>>>>> that we can address > >> briefly in our best practice? > >>>>>>>>> Do you consider SI units consistent with a locale-neutral > approach? > >>>>>>>>> Is there a locale-neutral standard for representing decimal > >>>>>>>>> numbers (perhaps using a period and no grouping, as in your > >> example)? > >>>>>>>>> > >>>>>>>>> -Annette > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> On 7/22/16 5:32 AM, ishida@w3.org wrote: > >>>>>>>>> > >>>>>>>>>> [raised by aphillips] > >>>>>>>>>> > >>>>>>>>>> https://www.w3.org/TR/dwbp/#LocaleParametersMetadata > >>>>>>>>>> > >>>>>>>>>> Best practice #3 introduces itself as: > >>>>>>>>>> > >>>>>>>>>> Providing locale parameters helps humans and computer > >>>>>>>>>> applications to work accurately with things like dates, > >>>>>>>>>> currencies and numbers that may look similar but have > >>>>>>>>>> different meanings in different locales. > >>>>>>>>>> > >>>>>>>>>> But the actual best practice is to use **locale-neutral** > >>>>>>>>>> representations that are interpreted/displayed to end-users > >>>>>>>>>> in a locale-appropriate manner. 
For example, instead of > >>>>>>>>>> storing the string "€2000.00", exchanging a data structure > >>>>>>>>>> like the following is strongly > >>>>>>>>>> preferred: > >>>>>>>>>> > >>>>>>>>>> ``` > >>>>>>>>>> "price" { > >>>>>>>>>> "value": 2000.00, > >>>>>>>>>> "currency": "EUR" > >>>>>>>>>> } > >>>>>>>>>> ``` > >>>>>>>>>> > >>>>>>>>>> The date examples given are all in xsd:date format, which is > >>>>>>>>>> an excellent example of using a locale-neutral format. > >>>>>>>>>> > >>>>>>>>>> Many things are dependent on locale: decimal symbol, > grouping > >>>>>>>>>> symbol, number of grouping digits, digit shapes, etc. It's > >>>>>>>>>> because there can be wide variation (sometimes open to > >>>>>>>>>> misinterpretation) that sending a locale neutral format is > >> preferred for data values. > >>>>>>>>>> Note also btw that the position of the currency symbol is > >>>>>>>>>> dependent on the locale. In France it would be normal to > >>>>>>>>>> write > >>>>> 2000.00 € rather than €2000.00. > >>>>>>>>>> Same even when talking about USD when using $, ie. 2000.00 $. > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> -- > >>>>>>>>> Annette Greiner > >>>>>>>>> NERSC Data and Analytics Services Lawrence Berkeley National > >>>>>>>>> Laboratory > >>>>>>>>> > >>>>>>>>> > >>>>>>> -- > >>>>>>> Annette Greiner > >>>>>>> NERSC Data and Analytics Services Lawrence Berkeley National > >>>>>>> Laboratory > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>> > >>>>> -- > >>>>> > >>>>> > >>>>> Phil Archer > >>>>> W3C Data Activity Lead > >>>>> http://www.w3.org/2013/data/ > >>>>> > >>>>> http://philarcher.org > >>>>> +44 (0)7887 767755 > >>>>> @philarcher1 > >>> > >> > >> -- > >> > >> > >> Phil Archer > >> W3C Data Activity Lead > >> http://www.w3.org/2013/data/ > >> > >> http://philarcher.org > >> +44 (0)7887 767755 > >> @philarcher1 > > -- > > > Phil Archer > W3C Data Activity Lead > http://www.w3.org/2013/data/ > > http://philarcher.org > +44 (0)7887 767755 > @philarcher1
Re: [i18n review comment] BP3 should recommend locale-neutral representation #187
Annette Greiner   Tue, 23 Aug 2016 10:30:03 -0700

Received on Tuesday, 23 August 2016 17:31:27 UTC

Sent to: addison@lab126.com, phila@w3.org, deirdre@derilinx.com, bfl@cin.ufpe.br
Copied to: ishida@w3.org, public-dwbp-comments@w3.org, www-international@w3.org.

Hi folks,

Sorry I haven't been able to jump in before now. Since this has been changing a bunch, let me say that this comment is on the version at http://w3c.github.io/dwbp/bp.html#dataFormats as of 9:37am PDT August 23.

The "Why" still devotes more text to the metadata approach than to the locale-neutral approach, though a little reshuffling would fix that. Here's a suggested rewrite:

"Data values that are machine-readable and not specific to any particular language or culture are more durable and less open to misinterpretation than values that use one of the many different cultural representations. Things like dates, currencies and numbers may look similar but have different meanings in different locales. For example, the 'date' 4/7 can be read as 7th of April or the 4th of July depending on where the data was created. Similarly, €2,000 is either two thousand Euros or an over-precise representation of two Euros. By using a locale-neutral format, systems avoid the need to establish specific interchange rules that vary according to the language or location of the user. When the data is already in a locale-specific format, making the locale and language explicit by providing locale parameters <http://w3c.github.io/dwbp/bp.html#locale_parameter> allows users to determine how readily they can work with the data and may enable automated translation services."

I also don't believe this is true: "Most common data representations are locale neutral." I would say most common data serialization formats are locale neutral, but it seems to me quite common to see them used in locale-specific ways.

Finally, the example marked prominently as Example 13 looks like the primary suggestion for implementing the BP, which it isn't anymore. I think the 2000 Euro example should be at least as prominently marked.

-Annette

On 8/23/16 7:11 AM, Phillips, Addison wrote:
> Hi Phil,
>
> Thanks. This looks good to me.
> > Addison > >> -----Original Message----- >> From: Phil Archer [mailto:phila@w3.org] >> Sent: Tuesday, August 23, 2016 3:29 AM >> To: Phillips, Addison <addison@lab126.com>; Deirdre Lee >> <deirdre@derilinx.com>; Bernadette Farias Lóscio <bfl@cin.ufpe.br>; >> Annette Greiner <amgreiner@lbl.gov> >> Cc: ishida@w3.org; public-dwbp-comments@w3.org; www International >> <www-international@w3.org> >> Subject: Re: [i18n review comment] BP3 should recommend locale-neutral >> representation #187 >> >> Thanks again Addison, >> >> Pls see below. >> >> On 22/08/2016 18:36, Phillips, Addison wrote: >>> Hi Phil, >>> >>> This looks good. A few comments. >>> >>> 1. Rather than providing your own definition for 'locale', you might make >> use of the one we provide in LTLI [1]. >> >> Done >> http://w3c.github.io/dwbp/bp.html#locale_parameter >> >>> 2. The "why" is still missing something. I would suggest adding a new first >> paragraph explaining locale-neutral first. Something like: >>> -- >>> Data values that are machine-readable and not specific to any particular >> language or culture are more durable and less open to misinterpretation than >> values that use one of the many different cultural representations. By using a >> locale-neutral format, systems avoid the need to establish specific >> interchange rules that vary according to the language or location of the user. >>> When the data is already in a locale-specific format, providing locale >>> parameters... <rest of existing text> >> >> Done, exactly as you suggest >> http://w3c.github.io/dwbp/bp.html#LocaleParametersMetadata >> >> With luck... the doc gets a green light from you? >> >> Thanks again >> >> Phil. 
>> >>> -- >>> >>> Hope that helps, >>> >>> Addison >>> >>> [1] https://www.w3.org/TR/ltli/#locale >>> >>>> -----Original Message----- >>>> From: Phil Archer [mailto:phila@w3.org] >>>> Sent: Monday, August 22, 2016 2:34 AM >>>> To: Deirdre Lee <deirdre@derilinx.com>; Phillips, Addison >>>> <addison@lab126.com>; Bernadette Farias Lóscio <bfl@cin.ufpe.br>; >>>> Annette Greiner <amgreiner@lbl.gov> >>>> Cc: ishida@w3.org; public-dwbp-comments@w3.org; www International >>>> <www-international@w3.org> >>>> Subject: Re: [i18n review comment] BP3 should recommend >>>> locale-neutral representation #187 >>>> >>>> Dear all, >>>> >>>> I have taken further steps on this. The result can be seen at >>>> http://philarcher1.github.io/dwbp/bp.html#LocaleParametersMetadata >>>> >>>> 1. Addision's text used more or less verbatim; 1a. taken account of >>>> Annette's suggestion; 1b. replaced inline links to BCP47 and CLDR with >> references 2. >>>> title of the BP changed to Use locale-neutral data representations 3. >>>> moved to Data Formats section as resolved in WG meeting on Friday; 4. >>>> added R- FormatMachineRead to list of evidence and thereby updated >>>> the UCR cross matching; 5. updated the Challenges SVG diagram; 6. >>>> updated my Pull request. >>>> >>>> NB, I *retained* the old ID for the BP so that any links to >>>> #LocaleParametersMetadata will still work. I know there are some of >>>> these, for example, in the Share-PSI project. >>>> >>>> HTH >>>> >>>> Phil. >>>> >>>> >>>> >>>> On 22/08/2016 08:52, Deirdre Lee wrote: >>>>> HI, >>>>> >>>>> Thank you for your comments Addison. I think they make sense and >>>>> should be straight-forward to incorporate. >>>>> >>>>> The title of the BP should probably also be updated to something >>>>> like 'Provide locale-neutral data' >>>>> >>>>> Phil and DWBP editors, in Friday's meeting we also agreed to move >>>>> BP3 to the Data Formats section from the Metadata section, which >>>>> would make it BP14, right? 
>>>>> >>>>> Kind regards, >>>>> >>>>> Deirdre >>>>> >>>>> >>>>> >>>>> On 19/08/2016 17:39, Phillips, Addison wrote: >>>>>> Hi Phil, >>>>>> >>>>>> Thanks for starting on this. I think the pull request is a good start. >>>>>> I have some comments on it. >>>>>> >>>>>> My main concern is that this BP is really backwards. It recommends >>>>>> to "locale parameter metadata" and then says that the simplest way >>>>>> to do this is to use locale-neutral formats. The recommendation >>>>>> should be more like "use locale-neutral formats or provide >>>>>> locale/language information where that's not possible". The pull >>>>>> request captures the use of locale-neutral, but doesn't really >>>>>> explain about when to provide locale and language information. >>>>>> >>>>>> I would change this: >>>>>> >>>>>> -- >>>>>> <p class="practicedesc">Provide metadata about locale parameters >>>>>> (date, time, and number formats, language).</p> >>>>>> -- >>>>>> >>>>>> To say: >>>>>> >>>>>> -- >>>>>> <p class="practicedesc">Use locale-neutral data structures and >>>>>> values, or, where that is not possible, provide metadata about the >>>>>> locale used by data values.</p> >>>>>> -- >>>>>> >>>>>> I would change: >>>>>> >>>>>> -- >>>>>> <p>The simplest method is to use local-neutral representations of >>>>>> the actual data, and then add metadata to provide relevant locale >>>>>> information. For example, rather than storing "€2000.00" as a >>>>>> string, it's strongly preferred to exchange a data structure such >>>>>> as:</p> >>>>>> -- >>>>>> >>>>>> To say: >>>>>> >>>>>> -- >>>>>> <p>Most common data representations are locale neutral. For >>>>>> example, XML Schema types such as xsd:integer and xsd: date are >>>>>> intended for locale-neutral data interchange. 
Using locale-neutral >>>>>> representations allows the data values to be processed accurately >>>>>> without complex parsing or misinterpretation and also allows the >>>>>> data to be presented in the format most comfortable for the >>>>>> consumer of the data. For example, rather than storing "€2000,00" >>>>>> as a string, it's strongly preferred to exchange a data structure >>>>>> such as:</p> >>>>>> -- >>>>>> >>>>>> Also, note the misspelling of "locale-neutral" in the pull request. >>>>>> >>>>>> I would then go on to add some text about when locale parameters >>>>>> are needed. Something like: >>>>>> >>>>>> -- >>>>>> Some datasets contain values that are not or cannot be rendered >>>>>> into a locale-neutral format. This is particularly true of any >>>>>> natural language text values. For each data field that can contain >>>>>> locale affected or natural language text, there should be an >>>>>> associated language tag used to indicate the language and locale of the >> data. >>>>>> This locale information can be used in parsing the data or to >>>>>> ensure proper presentation and processing of the value by the >> consumer. >>>>>> -- >>>>>> >>>>>> (Sorry for not generating a pull request of my own) >>>>>> >>>>>> Addison >>>>>> >>>>>>> -----Original Message----- >>>>>>> From: Phil Archer [mailto:phila@w3.org] >>>>>>> Sent: Friday, August 19, 2016 8:37 AM >>>>>>> To: Bernadette Farias Lóscio <bfl@cin.ufpe.br>; Annette Greiner >>>>>>> <amgreiner@lbl.gov> >>>>>>> Cc: Phillips, Addison <addison@lab126.com>; ishida@w3.org; >>>>>>> public-dwbp- comments@w3.org; www International >>>>>>> <www-international@w3.org> >>>>>>> Subject: Re: [i18n review comment] BP3 should recommend >>>>>>> locale-neutral representation #187 >>>>>>> >>>>>>> I took an action on today's call to try and address this in BP3. 
>>>>>>> You can see the results at >>>>>>> >> http://philarcher1.github.io/dwbp/bp.html#LocaleParametersMetadata >>>>>>> This uses some of Addison's text directly and highlights the value >>>>>>> of the xsd datatypes - but retains enough of the original BP for >>>>>>> it to be an amendment rather than a whole new one - I hope. >>>>>>> >>>>>>> This addresses most of the resolution taken today [1] but I have >>>>>>> not moved the BP to the formats section. I leave that to the >>>>>>> editors who may want to make further changes - or argue for it to >>>>>>> be left where it is, or add references from the formats section or, or, >> or... >>>>>>> I've created the Pull Request https://github.com/w3c/dwbp/pull/447 >>>>>>> >>>>>>> Phil. >>>>>>> >>>>>>> [1] https://www.w3.org/2016/08/19-dwbp-minutes#resolution02 >>>>>>> >>>>>>> On 15/08/2016 17:28, Bernadette Farias Lóscio wrote: >>>>>>>> Dear Ishida, >>>>>>>> >>>>>>>> This comment [1] is still under discussion [4] and we'd like to >>>>>>>> ask your opinion about two of our proposals: >>>>>>>> >>>>>>>> 1. to include locale-neutral representation ideas as part of BP3 >>>>>>>> [2], or 2. to include a paragraph at the introduction of Section >>>>>>>> 8.8 Data Formats [3] to discuss the relevance of having >>>>>>>> local-neutral representations. >>>>>>>> >>>>>>>> We also discussed the proposal of having a new BP and we agreed >>>>>>>> that we won't have a lot of time for a broader review of the new >>>>>>>> BP and to collect feedback from the community. >>>>>>>> >>>>>>>> Thanks a lot! >>>>>>>> DWBP editors >>>>>>>> >>>>>>>> [1] https://lists.w3.org/Archives/Public/public-dwbp-comments/ >>>>>>>> 2016Jul/0028.html >>>>>>>> >>>> [2]http://agreiner.github.io/dwbp/bp.html#LocaleParametersMetadata >>>>>>>> [3] https://www.w3.org/TR/dwbp/#dataFormats >>>>>>>> [4] >>>>>>>> https://lists.w3.org/Archives/Public/public-dwbp-wg/2016Aug/0009. 
>>>>>>>> ht >>>>>>>> ml >>>>>>>> >>>>>>>> >>>>>>>> 2016-08-04 23:26 GMT+02:00 Annette Greiner <amgreiner@lbl.gov>: >>>>>>>> >>>>>>>>> Hi Addison, >>>>>>>>> >>>>>>>>> Thanks for your response, and it does make sense. I think what I >>>>>>>>> am still missing is whether there is guidance we can point to as >>>>>>>>> to how to represent the "locale-neutral" data so that it can >>>>>>>>> most easily be made locale specific by existing tools. You >>>>>>>>> mention "pre-made standards for the basic data types". Is there >>>>>>>>> a recommended list we could >>>>>>> reference? >>>>>>>>> Thanks for your help! >>>>>>>>> -Annette >>>>>>>>> >>>>>>>>> >>>>>>>>> On 8/4/16 12:31 PM, Phillips, Addison wrote: >>>>>>>>> >>>>>>>>>> Hi Annette, >>>>>>>>>> >>>>>>>>>> Thanks for the note. This is a personal reply not on behalf of >>>>>>>>>> the WG. >>>>>>>>>> >>>>>>>>>> Locale neutral formats are quite common on the Web and the >>>>>>>>>> Internet in general. One familiar format referenced by your >>>>>>>>>> document, for example, is XML Schema. While the >> representations >>>>>>>>>> of numbers, dates, and the like in XML Schema would be "more >>>>>>>>>> appropriate" for some languages/locales than others if given as >>>>>>>>>> plain text, what distinguishes them is that they are all >>>>>>>>>> machine readable and intended to >>>>>>> be read by machines for later processing. >>>>>>>>>> The display of values is a separate, local, concern for the >>>>>>>>>> data's consumer. This necessarily means choosing specific >>>>>>>>>> separators (such as decimal separators) over other, more >>>>>>>>>> localized values. Save for "free >>>>>>> text" >>>>>>>>>> (natural language) data, most data formats are locale neutral >>>>>>>>>> and these include things like JSON-LD, XML Schema, CSV, and so >> forth. >>>>>>>>>> Not every possible data structure or data value is, of course, >>>>>>>>>> covered fully. 
For example, in my day job (I work at Amazon), >>>>>>>>>> we have many different common measurement units defined >> internally. >>>>>>>>>> To transmit these in a locale-neutral manner, we need to >>>>>>>>>> construct our own data schemas and identifiers. There are >>>>>>>>>> profoundly many ways to measure shoes, dresses, auto parts, >>>>>>>>>> hats, drone propellers, and so forth. But it would be a >>>>>>>>>> nightmare to have to deal with localized >>>>>>> presentation formats on top of that. >>>>>>>>>> But there are pre-made standards for the basic data types and >>>>>>>>>> these are what are needed to build almost any data structure >>>>>>>>>> necessary for global interchange of data. >>>>>>>>>> >>>>>>>>>> Does that make sense? >>>>>>>>>> >>>>>>>>>> Addison >>>>>>>>>> >>>>>>>>>> Addison Phillips >>>>>>>>>> Principal SDE, I18N Architect (Amazon) Chair (W3C I18N WG) >>>>>>>>>> >>>>>>>>>> Internationalization is not a feature. >>>>>>>>>> It is an architecture. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -----Original Message----- >>>>>>>>>>> From: Annette Greiner [mailto:amgreiner@lbl.gov] >>>>>>>>>>> Sent: Thursday, August 04, 2016 12:04 PM >>>>>>>>>>> To: ishida@w3.org; public-dwbp-comments@w3.org >>>>>>>>>>> Cc: www International <www-international@w3.org> >>>>>>>>>>> Subject: Re: [i18n review comment] BP3 should recommend >>>>>>>>>>> locale-neutral representation #187 >>>>>>>>>>> >>>>>>>>>>> Hello on behalf of the DWBP WG, >>>>>>>>>>> >>>>>>>>>>> We're interested in pursuing this concept in our best practice >>>>>>>>>>> document, but we would like some clarification of the practice >>>>>>>>>>> of locale neutrality. >>>>>>>>>>> You >>>>>>>>>>> mention the variation across locales in decimal symbol, >>>>>>>>>>> grouping symbol, number of grouping digits, digit shapes, >>>>>>>>>>> etc., and you give an example of a locale-neutral data >>>>>>>>>>> structure for monetary >>>> values. 
>>>>>>>>>>> But this structure alone does not appear to address >>>>>>>>>>> differences in decimal symbol, grouping symbol, number of >>>>>>>>>>> grouping digits, or digit shapes. It does provide a mechanism >>>>>>>>>>> to separately specify the units, and the example uses an >>>>>>>>>>> ISO-4217 currency code, both of which we agree are good ideas. >>>>>>>>>>> Is there a broad standard (beyond just monetary) for >>>>>>>>>>> addressing the other symbol/representation issues you raised >>>>>>>>>>> that we can address >>>> briefly in our best practice? >>>>>>>>>>> Do you consider SI units consistent with a locale-neutral >> approach? >>>>>>>>>>> Is there a locale-neutral standard for representing decimal >>>>>>>>>>> numbers (perhaps using a period and no grouping, as in your >>>> example)? >>>>>>>>>>> -Annette >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 7/22/16 5:32 AM, ishida@w3.org wrote: >>>>>>>>>>> >>>>>>>>>>>> [raised by aphillips] >>>>>>>>>>>> >>>>>>>>>>>> https://www.w3.org/TR/dwbp/#LocaleParametersMetadata >>>>>>>>>>>> >>>>>>>>>>>> Best practice #3 introduces itself as: >>>>>>>>>>>> >>>>>>>>>>>> Providing locale parameters helps humans and computer >>>>>>>>>>>> applications to work accurately with things like dates, >>>>>>>>>>>> currencies and numbers that may look similar but have >>>>>>>>>>>> different meanings in different locales. >>>>>>>>>>>> >>>>>>>>>>>> But the actual best practice is to use **locale-neutral** >>>>>>>>>>>> representations that are interpreted/displayed to end-users >>>>>>>>>>>> in a locale-appropriate manner. 
For example, instead of >>>>>>>>>>>> storing the string "€2000.00", exchanging a data structure >>>>>>>>>>>> like the following is strongly >>>>>>>>>>>> preferred: >>>>>>>>>>>> >>>>>>>>>>>> ``` >>>>>>>>>>>> "price" { >>>>>>>>>>>> "value": 2000.00, >>>>>>>>>>>> "currency": "EUR" >>>>>>>>>>>> } >>>>>>>>>>>> ``` >>>>>>>>>>>> >>>>>>>>>>>> The date examples given are all in xsd:date format, which is >>>>>>>>>>>> an excellent example of using a locale-neutral format. >>>>>>>>>>>> >>>>>>>>>>>> Many things are dependent on locale: decimal symbol, >> grouping >>>>>>>>>>>> symbol, number of grouping digits, digit shapes, etc. It's >>>>>>>>>>>> because there can be wide variation (sometimes open to >>>>>>>>>>>> misinterpretation) that sending a locale neutral format is >>>> preferred for data values. >>>>>>>>>>>> Note also btw that the position of the currency symbol is >>>>>>>>>>>> dependent on the locale. In France it would be normal to >>>>>>>>>>>> write >>>>>>> 2000.00 € rather than €2000.00. >>>>>>>>>>>> Same even when talking about USD when using $, ie. 2000.00 $. 
>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>> Annette Greiner >>>>>>>>>>> NERSC Data and Analytics Services Lawrence Berkeley National >>>>>>>>>>> Laboratory >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>> -- >>>>>>>>> Annette Greiner >>>>>>>>> NERSC Data and Analytics Services Lawrence Berkeley National >>>>>>>>> Laboratory >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>> -- >>>>>>> >>>>>>> >>>>>>> Phil Archer >>>>>>> W3C Data Activity Lead >>>>>>> http://www.w3.org/2013/data/ >>>>>>> >>>>>>> http://philarcher.org >>>>>>> +44 (0)7887 767755 >>>>>>> @philarcher1 >>>> -- >>>> >>>> >>>> Phil Archer >>>> W3C Data Activity Lead >>>> http://www.w3.org/2013/data/ >>>> >>>> http://philarcher.org >>>> +44 (0)7887 767755 >>>> @philarcher1 >> -- >> >> >> Phil Archer >> W3C Data Activity Lead >> http://www.w3.org/2013/data/ >> >> http://philarcher.org >> +44 (0)7887 767755 >> @philarcher1 -- Annette Greiner NERSC Data and Analytics Services Lawrence Berkeley National Laboratory
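The ambiguity described in the suggested rewrite above ('4/7' and '€2,000') can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the helper names `read_slash_date` and `read_amount` are hypothetical, the year 2016 is supplied only so that a full date can be built, and the record at the end mirrors the value/currency structure from the original comment, with the value carried as a string so that it can be read as an exact decimal.

```python
from datetime import date
from decimal import Decimal


def read_slash_date(text: str, day_first: bool, year: int) -> date:
    """Interpret 'a/b' as day/month or month/day, depending on the reader's convention."""
    first, second = (int(part) for part in text.split("/"))
    day, month = (first, second) if day_first else (second, first)
    return date(year, month, day)


def read_amount(text: str, decimal_sep: str) -> Decimal:
    """Interpret a formatted number, given which character the reader treats as the decimal separator."""
    group_sep = "," if decimal_sep == "." else "."
    return Decimal(text.replace(group_sep, "").replace(decimal_sep, "."))


# The same locale-specific strings yield different values for different readers.
july_4 = read_slash_date("4/7", day_first=True, year=2016)
april_7 = read_slash_date("4/7", day_first=False, year=2016)
two_thousand = read_amount("2,000", decimal_sep=".")
two = read_amount("2,000", decimal_sep=",")

# A locale-neutral record needs no such guesswork: an xsd:date-style string,
# a plain decimal with "." and no grouping, and an ISO 4217 currency code.
# (Hypothetical record layout, assumed for illustration only.)
record = {"date": "2016-07-04", "price": {"value": "2000.00", "currency": "EUR"}}
parsed_date = date.fromisoformat(record["date"])
parsed_value = Decimal(record["price"]["value"])
```

With the locale-specific strings, the consumer has to know or guess the producer's conventions before parsing. With the locale-neutral record, parsing is unambiguous and locale-appropriate display can be left to the consumer.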
Re: [i18n review comment] BP3 should recommend locale-neutral representation #187
Phil Archer   Wed, 24 Aug 2016 11:44:09 +0100

Received on Wednesday, 24 August 2016 10:41:35 UTC

Sent to: amgreiner@lbl.gov, addison@lab126.com, deirdre@derilinx.com, bfl@cin.ufpe.br
Copied to: ishida@w3.org, public-dwbp-comments@w3.org, www-international@w3.org.

Thanks Annette,

As time is tight - I want to put the CR doc in place - I've gone ahead and responded to this as indicated inline below:

On 23/08/2016 18:30, Annette Greiner wrote:
> Hi folks,
>
> Sorry I haven't been able to jump in before now. Since this has been
> changing a bunch, let me say that this comment is on the version at
> http://w3c.github.io/dwbp/bp.html#dataFormats as of 9:37am PDT August 23.
>
> The "Why" still devotes more text to the metadata approach than to the
> locale-neutral approach, though a little reshuffling would fix that.
> Here's a suggested rewrite:
>
> "Data values that are machine-readable and not specific to any
> particular language or culture are more durable and less open to
> misinterpretation than values that use one of the many different
> cultural representations. Things like dates, currencies and numbers may
> look similar but have different meanings in different locales. For
> example, the 'date' 4/7 can be read as 7th of April or the 4th of July
> depending on where the data was created. Similarly, €2,000 is either two
> thousand Euros or an over-precise representation of two Euros. By using
> a locale-neutral format, systems avoid the need to establish specific
> interchange rules that vary according to the language or location of the
> user. When the data is already in a locale-specific format, making the
> locale and language explicit by providing locale
> <http://w3c.github.io/dwbp/bp.html#locale_parameter> parameters allows
> users to determine how readily they can work with the data and may
> enable automated translation services."

No problem AFAICT - text changed to this. I very much doubt Addison will object.

> I also don't believe this is true: "Most common data representations are
> locale neutral." I would say most common data serialization formats are
> locale neutral, but it seems to me quite common to see them used in
> locale-specific ways.

OK, text changed, Pull request made and merged.

> Finally, the example marked prominently as Example 13 looks like the
> primary suggestion for implementing the BP, which it isn't anymore. I
> think the 2000 Euro example should be at least as prominently marked.

I sympathise but I'm going to have to leave that to the editors. It can be done by simply adding class="example" to the <pre> element. But doing that then means that the example numbers will be out of step with the BP numbers from that point on, which I *think* editors have been anxious to avoid?

Berna, Newton, Carol - can you look at this today?

Cheers

Phil

>
> -Annette
>
>
> On 8/23/16 7:11 AM, Phillips, Addison wrote:
>> Hi Phil,
>>
>> Thanks. This looks good to me.
>>
>> Addison
>>
>>> -----Original Message-----
>>> From: Phil Archer [mailto:phila@w3.org]
>>> Sent: Tuesday, August 23, 2016 3:29 AM
>>> To: Phillips, Addison <addison@lab126.com>; Deirdre Lee
>>> <deirdre@derilinx.com>; Bernadette Farias Lóscio <bfl@cin.ufpe.br>;
>>> Annette Greiner <amgreiner@lbl.gov>
>>> Cc: ishida@w3.org; public-dwbp-comments@w3.org; www International
>>> <www-international@w3.org>
>>> Subject: Re: [i18n review comment] BP3 should recommend locale-neutral
>>> representation #187
>>>
>>> Thanks again Addison,
>>>
>>> Pls see below.
>>>
>>> On 22/08/2016 18:36, Phillips, Addison wrote:
>>>> Hi Phil,
>>>>
>>>> This looks good. A few comments.
>>>>
>>>> 1. Rather than providing your own definition for 'locale', you might
>>>> make use of the one we provide in LTLI [1].
>>>
>>> Done
>>> http://w3c.github.io/dwbp/bp.html#locale_parameter
>>>
>>>> 2. The "why" is still missing something. I would suggest adding a
>>>> new first paragraph explaining locale-neutral first. Something like:
>>>> --
>>>> Data values that are machine-readable and not specific to any
>>>> particular language or culture are more durable and less open to
>>>> misinterpretation than values that use one of the many different
>>>> cultural representations.
>>> By using a >>> locale-neutral format, systems avoid the need to establish specific >>> interchange rules that vary according to the language or location of >>> the user. >>>> When the data is already in a locale-specific format, providing locale >>>> parameters... <rest of existing text> >>> >>> Done, exactly as you suggest >>> http://w3c.github.io/dwbp/bp.html#LocaleParametersMetadata >>> >>> With luck... the doc gets a green light from you? >>> >>> Thanks again >>> >>> Phil. >>> >>>> -- >>>> >>>> Hope that helps, >>>> >>>> Addison >>>> >>>> [1] https://www.w3.org/TR/ltli/#locale >>>> >>>>> -----Original Message----- >>>>> From: Phil Archer [mailto:phila@w3.org] >>>>> Sent: Monday, August 22, 2016 2:34 AM >>>>> To: Deirdre Lee <deirdre@derilinx.com>; Phillips, Addison >>>>> <addison@lab126.com>; Bernadette Farias Lóscio <bfl@cin.ufpe.br>; >>>>> Annette Greiner <amgreiner@lbl.gov> >>>>> Cc: ishida@w3.org; public-dwbp-comments@w3.org; www International >>>>> <www-international@w3.org> >>>>> Subject: Re: [i18n review comment] BP3 should recommend >>>>> locale-neutral representation #187 >>>>> >>>>> Dear all, >>>>> >>>>> I have taken further steps on this. The result can be seen at >>>>> http://philarcher1.github.io/dwbp/bp.html#LocaleParametersMetadata >>>>> >>>>> 1. Addision's text used more or less verbatim; 1a. taken account of >>>>> Annette's suggestion; 1b. replaced inline links to BCP47 and CLDR with >>> references 2. >>>>> title of the BP changed to Use locale-neutral data representations 3. >>>>> moved to Data Formats section as resolved in WG meeting on Friday; 4. >>>>> added R- FormatMachineRead to list of evidence and thereby updated >>>>> the UCR cross matching; 5. updated the Challenges SVG diagram; 6. >>>>> updated my Pull request. >>>>> >>>>> NB, I *retained* the old ID for the BP so that any links to >>>>> #LocaleParametersMetadata will still work. I know there are some of >>>>> these, for example, in the Share-PSI project. 
>>>>> >>>>> HTH >>>>> >>>>> Phil. >>>>> >>>>> >>>>> >>>>> On 22/08/2016 08:52, Deirdre Lee wrote: >>>>>> HI, >>>>>> >>>>>> Thank you for your comments Addison. I think they make sense and >>>>>> should be straight-forward to incorporate. >>>>>> >>>>>> The title of the BP should probably also be updated to something >>>>>> like 'Provide locale-neutral data' >>>>>> >>>>>> Phil and DWBP editors, in Friday's meeting we also agreed to move >>>>>> BP3 to the Data Formats section from the Metadata section, which >>>>>> would make it BP14, right? >>>>>> >>>>>> Kind regards, >>>>>> >>>>>> Deirdre >>>>>> >>>>>> >>>>>> >>>>>> On 19/08/2016 17:39, Phillips, Addison wrote: >>>>>>> Hi Phil, >>>>>>> >>>>>>> Thanks for starting on this. I think the pull request is a good >>>>>>> start. >>>>>>> I have some comments on it. >>>>>>> >>>>>>> My main concern is that this BP is really backwards. It recommends >>>>>>> to "locale parameter metadata" and then says that the simplest way >>>>>>> to do this is to use locale-neutral formats. The recommendation >>>>>>> should be more like "use locale-neutral formats or provide >>>>>>> locale/language information where that's not possible". The pull >>>>>>> request captures the use of locale-neutral, but doesn't really >>>>>>> explain about when to provide locale and language information. >>>>>>> >>>>>>> I would change this: >>>>>>> >>>>>>> -- >>>>>>> <p class="practicedesc">Provide metadata about locale parameters >>>>>>> (date, time, and number formats, language).</p> >>>>>>> -- >>>>>>> >>>>>>> To say: >>>>>>> >>>>>>> -- >>>>>>> <p class="practicedesc">Use locale-neutral data structures and >>>>>>> values, or, where that is not possible, provide metadata about the >>>>>>> locale used by data values.</p> >>>>>>> -- >>>>>>> >>>>>>> I would change: >>>>>>> >>>>>>> -- >>>>>>> <p>The simplest method is to use local-neutral representations of >>>>>>> the actual data, and then add metadata to provide relevant locale >>>>>>> information. 
For example, rather than storing "€2000.00" as a >>>>>>> string, it's strongly preferred to exchange a data structure such >>>>>>> as:</p> >>>>>>> -- >>>>>>> >>>>>>> To say: >>>>>>> >>>>>>> -- >>>>>>> <p>Most common data representations are locale neutral. For >>>>>>> example, XML Schema types such as xsd:integer and xsd: date are >>>>>>> intended for locale-neutral data interchange. Using locale-neutral >>>>>>> representations allows the data values to be processed accurately >>>>>>> without complex parsing or misinterpretation and also allows the >>>>>>> data to be presented in the format most comfortable for the >>>>>>> consumer of the data. For example, rather than storing "€2000,00" >>>>>>> as a string, it's strongly preferred to exchange a data structure >>>>>>> such as:</p> >>>>>>> -- >>>>>>> >>>>>>> Also, note the misspelling of "locale-neutral" in the pull request. >>>>>>> >>>>>>> I would then go on to add some text about when locale parameters >>>>>>> are needed. Something like: >>>>>>> >>>>>>> -- >>>>>>> Some datasets contain values that are not or cannot be rendered >>>>>>> into a locale-neutral format. This is particularly true of any >>>>>>> natural language text values. For each data field that can contain >>>>>>> locale affected or natural language text, there should be an >>>>>>> associated language tag used to indicate the language and locale >>>>>>> of the >>> data. >>>>>>> This locale information can be used in parsing the data or to >>>>>>> ensure proper presentation and processing of the value by the >>> consumer. 
>>>>>>> -- >>>>>>> >>>>>>> (Sorry for not generating a pull request of my own) >>>>>>> >>>>>>> Addison >>>>>>> >>>>>>>> -----Original Message----- >>>>>>>> From: Phil Archer [mailto:phila@w3.org] >>>>>>>> Sent: Friday, August 19, 2016 8:37 AM >>>>>>>> To: Bernadette Farias Lóscio <bfl@cin.ufpe.br>; Annette Greiner >>>>>>>> <amgreiner@lbl.gov> >>>>>>>> Cc: Phillips, Addison <addison@lab126.com>; ishida@w3.org; >>>>>>>> public-dwbp- comments@w3.org; www International >>>>>>>> <www-international@w3.org> >>>>>>>> Subject: Re: [i18n review comment] BP3 should recommend >>>>>>>> locale-neutral representation #187 >>>>>>>> >>>>>>>> I took an action on today's call to try and address this in BP3. >>>>>>>> You can see the results at >>>>>>>> >>> http://philarcher1.github.io/dwbp/bp.html#LocaleParametersMetadata >>>>>>>> This uses some of Addison's text directly and highlights the value >>>>>>>> of the xsd datatypes - but retains enough of the original BP for >>>>>>>> it to be an amendment rather than a whole new one - I hope. >>>>>>>> >>>>>>>> This addresses most of the resolution taken today [1] but I have >>>>>>>> not moved the BP to the formats section. I leave that to the >>>>>>>> editors who may want to make further changes - or argue for it to >>>>>>>> be left where it is, or add references from the formats section >>>>>>>> or, or, >>> or... >>>>>>>> I've created the Pull Request https://github.com/w3c/dwbp/pull/447 >>>>>>>> >>>>>>>> Phil. >>>>>>>> >>>>>>>> [1] https://www.w3.org/2016/08/19-dwbp-minutes#resolution02 >>>>>>>> >>>>>>>> On 15/08/2016 17:28, Bernadette Farias Lóscio wrote: >>>>>>>>> Dear Ishida, >>>>>>>>> >>>>>>>>> This comment [1] is still under discussion [4] and we'd like to >>>>>>>>> ask your opinion about two of our proposals: >>>>>>>>> >>>>>>>>> 1. to include locale-neutral representation ideas as part of BP3 >>>>>>>>> [2], or 2. 
to include a paragraph at the introduction of Section >>>>>>>>> 8.8 Data Formats [3] to discuss the relevance of having >>>>>>>>> local-neutral representations. >>>>>>>>> >>>>>>>>> We also discussed the proposal of having a new BP and we agreed >>>>>>>>> that we won't have a lot of time for a broader review of the new >>>>>>>>> BP and to collect feedback from the community. >>>>>>>>> >>>>>>>>> Thanks a lot! >>>>>>>>> DWBP editors >>>>>>>>> >>>>>>>>> [1] https://lists.w3.org/Archives/Public/public-dwbp-comments/ >>>>>>>>> 2016Jul/0028.html >>>>>>>>> >>>>> [2]http://agreiner.github.io/dwbp/bp.html#LocaleParametersMetadata >>>>>>>>> [3] https://www.w3.org/TR/dwbp/#dataFormats >>>>>>>>> [4] >>>>>>>>> https://lists.w3.org/Archives/Public/public-dwbp-wg/2016Aug/0009. >>>>>>>>> ht >>>>>>>>> ml >>>>>>>>> >>>>>>>>> >>>>>>>>> 2016-08-04 23:26 GMT+02:00 Annette Greiner <amgreiner@lbl.gov>: >>>>>>>>> >>>>>>>>>> Hi Addison, >>>>>>>>>> >>>>>>>>>> Thanks for your response, and it does make sense. I think what I >>>>>>>>>> am still missing is whether there is guidance we can point to as >>>>>>>>>> to how to represent the "locale-neutral" data so that it can >>>>>>>>>> most easily be made locale specific by existing tools. You >>>>>>>>>> mention "pre-made standards for the basic data types". Is there >>>>>>>>>> a recommended list we could >>>>>>>> reference? >>>>>>>>>> Thanks for your help! >>>>>>>>>> -Annette >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 8/4/16 12:31 PM, Phillips, Addison wrote: >>>>>>>>>> >>>>>>>>>>> Hi Annette, >>>>>>>>>>> >>>>>>>>>>> Thanks for the note. This is a personal reply not on behalf of >>>>>>>>>>> the WG. >>>>>>>>>>> >>>>>>>>>>> Locale neutral formats are quite common on the Web and the >>>>>>>>>>> Internet in general. One familiar format referenced by your >>>>>>>>>>> document, for example, is XML Schema. 
While the >>> representations >>>>>>>>>>> of numbers, dates, and the like in XML Schema would be "more >>>>>>>>>>> appropriate" for some languages/locales than others if given as >>>>>>>>>>> plain text, what distinguishes them is that they are all >>>>>>>>>>> machine readable and intended to >>>>>>>> be read by machines for later processing. >>>>>>>>>>> The display of values is a separate, local, concern for the >>>>>>>>>>> data's consumer. This necessarily means choosing specific >>>>>>>>>>> separators (such as decimal separators) over other, more >>>>>>>>>>> localized values. Save for "free >>>>>>>> text" >>>>>>>>>>> (natural language) data, most data formats are locale neutral >>>>>>>>>>> and these include things like JSON-LD, XML Schema, CSV, and so >>> forth. >>>>>>>>>>> Not every possible data structure or data value is, of course, >>>>>>>>>>> covered fully. For example, in my day job (I work at Amazon), >>>>>>>>>>> we have many different common measurement units defined >>> internally. >>>>>>>>>>> To transmit these in a locale-neutral manner, we need to >>>>>>>>>>> construct our own data schemas and identifiers. There are >>>>>>>>>>> profoundly many ways to measure shoes, dresses, auto parts, >>>>>>>>>>> hats, drone propellers, and so forth. But it would be a >>>>>>>>>>> nightmare to have to deal with localized >>>>>>>> presentation formats on top of that. >>>>>>>>>>> But there are pre-made standards for the basic data types and >>>>>>>>>>> these are what are needed to build almost any data structure >>>>>>>>>>> necessary for global interchange of data. >>>>>>>>>>> >>>>>>>>>>> Does that make sense? >>>>>>>>>>> >>>>>>>>>>> Addison >>>>>>>>>>> >>>>>>>>>>> Addison Phillips >>>>>>>>>>> Principal SDE, I18N Architect (Amazon) Chair (W3C I18N WG) >>>>>>>>>>> >>>>>>>>>>> Internationalization is not a feature. >>>>>>>>>>> It is an architecture. 
>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -----Original Message----- >>>>>>>>>>>> From: Annette Greiner [mailto:amgreiner@lbl.gov] >>>>>>>>>>>> Sent: Thursday, August 04, 2016 12:04 PM >>>>>>>>>>>> To: ishida@w3.org; public-dwbp-comments@w3.org >>>>>>>>>>>> Cc: www International <www-international@w3.org> >>>>>>>>>>>> Subject: Re: [i18n review comment] BP3 should recommend >>>>>>>>>>>> locale-neutral representation #187 >>>>>>>>>>>> >>>>>>>>>>>> Hello on behalf of the DWBP WG, >>>>>>>>>>>> >>>>>>>>>>>> We're interested in pursuing this concept in our best practice >>>>>>>>>>>> document, but we would like some clarification of the practice >>>>>>>>>>>> of locale neutrality. >>>>>>>>>>>> You >>>>>>>>>>>> mention the variation across locales in decimal symbol, >>>>>>>>>>>> grouping symbol, number of grouping digits, digit shapes, >>>>>>>>>>>> etc., and you give an example of a locale-neutral data >>>>>>>>>>>> structure for monetary >>>>> values. >>>>>>>>>>>> But this structure alone does not appear to address >>>>>>>>>>>> differences in decimal symbol, grouping symbol, number of >>>>>>>>>>>> grouping digits, or digit shapes. It does provide a mechanism >>>>>>>>>>>> to separately specify the units, and the example uses an >>>>>>>>>>>> ISO-4217 currency code, both of which we agree are good ideas. >>>>>>>>>>>> Is there a broad standard (beyond just monetary) for >>>>>>>>>>>> addressing the other symbol/representation issues you raised >>>>>>>>>>>> that we can address >>>>> briefly in our best practice? >>>>>>>>>>>> Do you consider SI units consistent with a locale-neutral >>> approach? >>>>>>>>>>>> Is there a locale-neutral standard for representing decimal >>>>>>>>>>>> numbers (perhaps using a period and no grouping, as in your >>>>> example)? 
>>>>>>>>>>>> -Annette >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 7/22/16 5:32 AM, ishida@w3.org wrote: >>>>>>>>>>>> >>>>>>>>>>>>> [raised by aphillips] >>>>>>>>>>>>> >>>>>>>>>>>>> https://www.w3.org/TR/dwbp/#LocaleParametersMetadata >>>>>>>>>>>>> >>>>>>>>>>>>> Best practice #3 introduces itself as: >>>>>>>>>>>>> >>>>>>>>>>>>> Providing locale parameters helps humans and computer >>>>>>>>>>>>> applications to work accurately with things like dates, >>>>>>>>>>>>> currencies and numbers that may look similar but have >>>>>>>>>>>>> different meanings in different locales. >>>>>>>>>>>>> >>>>>>>>>>>>> But the actual best practice is to use **locale-neutral** >>>>>>>>>>>>> representations that are interpreted/displayed to end-users >>>>>>>>>>>>> in a locale-appropriate manner. For example, instead of >>>>>>>>>>>>> storing the string "€2000.00", exchanging a data structure >>>>>>>>>>>>> like the following is strongly >>>>>>>>>>>>> preferred: >>>>>>>>>>>>> >>>>>>>>>>>>> ``` >>>>>>>>>>>>> "price" { >>>>>>>>>>>>> "value": 2000.00, >>>>>>>>>>>>> "currency": "EUR" >>>>>>>>>>>>> } >>>>>>>>>>>>> ``` >>>>>>>>>>>>> >>>>>>>>>>>>> The date examples given are all in xsd:date format, which is >>>>>>>>>>>>> an excellent example of using a locale-neutral format. >>>>>>>>>>>>> >>>>>>>>>>>>> Many things are dependent on locale: decimal symbol, >>> grouping >>>>>>>>>>>>> symbol, number of grouping digits, digit shapes, etc. It's >>>>>>>>>>>>> because there can be wide variation (sometimes open to >>>>>>>>>>>>> misinterpretation) that sending a locale neutral format is >>>>> preferred for data values. >>>>>>>>>>>>> Note also btw that the position of the currency symbol is >>>>>>>>>>>>> dependent on the locale. In France it would be normal to >>>>>>>>>>>>> write >>>>>>>> 2000.00 € rather than €2000.00. >>>>>>>>>>>>> Same even when talking about USD when using $, ie. 2000.00 $. 
>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> -- >>>>>>>>>>>> Annette Greiner >>>>>>>>>>>> NERSC Data and Analytics Services Lawrence Berkeley National >>>>>>>>>>>> Laboratory >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> Annette Greiner >>>>>>>>>> NERSC Data and Analytics Services Lawrence Berkeley National >>>>>>>>>> Laboratory >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>> -- >>>>>>>> >>>>>>>> >>>>>>>> Phil Archer >>>>>>>> W3C Data Activity Lead >>>>>>>> http://www.w3.org/2013/data/ >>>>>>>> >>>>>>>> http://philarcher.org >>>>>>>> +44 (0)7887 767755 >>>>>>>> @philarcher1 >>>>> -- >>>>> >>>>> >>>>> Phil Archer >>>>> W3C Data Activity Lead >>>>> http://www.w3.org/2013/data/ >>>>> >>>>> http://philarcher.org >>>>> +44 (0)7887 767755 >>>>> @philarcher1 >>> -- >>> >>> >>> Phil Archer >>> W3C Data Activity Lead >>> http://www.w3.org/2013/data/ >>> >>> http://philarcher.org >>> +44 (0)7887 767755 >>> @philarcher1 > -- Phil Archer W3C Data Activity Lead http://www.w3.org/2013/data/ http://philarcher.org +44 (0)7887 767755 @philarcher1
Re: [i18n review comment] BP3 should recommend locale-neutral representation #187
Bernadette Farias Lóscio   Wed, 24 Aug 2016 08:56:41 -0300

Received on Wednesday, 24 August 2016 11:57:37 UTC

Sent to: phila@w3.org
Copied to: amgreiner@lbl.gov, addison@lab126.com, deirdre@derilinx.com, ishida@w3.org, public-dwbp-comments@w3.org, www-international@w3.org.

Hi all,

@Phil, thanks a lot for making the updates! @Annette and Addison, thanks for the comments and suggestions!

I agree with the changes made so far, but I'd like to answer the following comment:

>> Finally, the example marked prominently as Example 13 looks like the
>> primary suggestion for implementing the BP, which it isn't anymore. I
>> think the 2000 Euro example should be at least as prominently marked.
>
> I sympathise but I'm going to have to leave that to the editors. It can be
> done by simply adding class="example" to the <pre> element. But, doing that
> then means that the example numbers will be out of step with the BP numbers
> from that point on, which I *think* editors have been anxious to avoid?

I don't think it's a good idea to change the numbers of the examples. One solution could be to make some changes to Example 13. Example 13 shows both the use of a locale-neutral representation and locale-parameters metadata. We have the datatype "xsd:date" in 'dct:issued "2015-05-05"^^xsd:date', but we also have 'dct:conformsTo <http://www.iso.org/iso/home/standards/iso8601.htm>' to indicate the standard adopted as the date format.

    :stops-2015-05-05 a dcat:Dataset ;
        dct:title "Bus stops of MyCity" ;
        dcat:keyword "transport","mobility","bus" ;
        dct:issued "2015-05-05"^^xsd:date ;
        dcat:contactPoint <http://data.mycity.example.com/transport/contact> ;
        dct:temporal <http://reference.data.gov.uk/id/year/2015> ;
        dct:spatial <http://www.geonames.org/3399415> ;
        dct:publisher :transport-agency-mycity ;
        dct:accrualPeriodicity <http://purl.org/linked-data/sdmx/2009/code#freq-A> ;
        dcat:theme :mobility ;
        dcat:distribution :stops-2015-05-05.csv ;
        dct:language <http://id.loc.gov/vocabulary/iso639-1/en> ,
                     <http://id.loc.gov/vocabulary/iso639-1/pt> ;
        dct:conformsTo <http://www.iso.org/iso/home/standards/iso8601.htm> ;
        .

Should we have both or just xsd:date? If I understood correctly, I think we should keep just xsd:date.

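As a rough illustration of why the xsd:date value can stand on its own, here is a minimal Python sketch. It is not part of the BP document. The string "05/06/2015" and the two reading conventions are made up for illustration, and the sketch ignores the optional timezone suffix that xsd:date allows.

```python
from datetime import date, datetime

# xsd:date uses the ISO 8601 lexical form YYYY-MM-DD, so every consumer
# can parse it the same way without extra format metadata.
issued = date.fromisoformat("2015-05-05")

# Hypothetical locale-specific string: its meaning depends on which
# convention the reader assumes.
ambiguous = "05/06/2015"
day_first = datetime.strptime(ambiguous, "%d/%m/%Y").date()    # day/month/year reading
month_first = datetime.strptime(ambiguous, "%m/%d/%Y").date()  # month/day/year reading

print(issued)       # 2015-05-05
print(day_first)    # 2015-06-05
print(month_first)  # 2015-05-06
```

The same locale-specific string yields two different dates, which is the kind of ambiguity a locale-neutral value avoids.
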
In this case, we can also change the example description to mention that we are using a locale-neutral representation for the date and locale-parameter metadata (dct:language) to specify the languages in which the dataset is published. See the suggestion below:

    The example below shows the use of xsd:date to provide a locale-neutral representation of the issue date of the bus stops dataset (stops-2015-05-05). Because the data in the bus stops dataset is already in a locale-specific format, the property dct:language is used to declare the languages the dataset is published in. If the dataset is available in multiple languages, use multiple values for this property.

    :stops-2015-05-05 a dcat:Dataset ;
        dct:title "Bus stops of MyCity" ;
        dcat:keyword "transport","mobility","bus" ;
        dct:issued "2015-05-05"^^xsd:date ;
        dcat:contactPoint <http://data.mycity.example.com/transport/contact> ;
        dct:temporal <http://reference.data.gov.uk/id/year/2015> ;
        dct:spatial <http://www.geonames.org/3399415> ;
        dct:publisher :transport-agency-mycity ;
        dct:accrualPeriodicity <http://purl.org/linked-data/sdmx/2009/code#freq-A> ;
        dcat:theme :mobility ;
        dcat:distribution :stops-2015-05-05.csv ;
        dct:language <http://id.loc.gov/vocabulary/iso639-1/en> ,
                     <http://id.loc.gov/vocabulary/iso639-1/pt> ;
        .

Please let me know what you think about this.

Thanks!
Berna

--
Bernadette Farias Lóscio
Centro de Informática
Universidade Federal de Pernambuco - UFPE, Brazil
Re: [i18n review comment] BP3 should recommend locale-neutral representation #187
Annette Greiner   Wed, 24 Aug 2016 10:32:04 -0700

Received on Wednesday, 24 August 2016 17:33:07 UTC

Sent to: bfl@cin.ufpe.br, phila@w3.org
Copied to: addison@lab126.com, deirdre@derilinx.com, ishida@w3.org, public-dwbp-comments@w3.org, www-international@w3.org.

I think it's a fine idea to put both the locale-neutral and the metadata approaches into example 13, but the metadata is not an example of a locale-neutral representation of the *data*. We need to show that in the data itself.

-Annette

On 8/24/16 4:56 AM, Bernadette Farias Lóscio wrote:
> Hi all,
>
> @Phil, thanks a lot for making the updates! @Annette and Addison thanks for the comments and suggestions!
>
> I agree with the changes made until now, but I'd like to answer the following comment:
>
> > > Finally, the example marked prominently as Example 13 looks like the primary suggestion for implementing the BP, which it isn't anymore. I think the 2000 Euro example should be at least as prominently marked.
> >
> > I sympathise but I'm going to have to leave that to the editors. It can be done by simply adding class="example" to the <pre> element. But, doing that then means that the example numbers will be out of step with the BP numbers from that point on, which I *think* editors have been anxious to avoid?
>
> I don't think it's a good idea to change the numbers of the examples. One solution could be to make some changes to example 13.
>
> Example 13 shows both the use of locale-neutral representation and locale-parameters metadata. We have the tag "xsd:date" in 'dct:issued "2015-05-05"^^xsd:date', but we also have 'dct:conformsTo <http://www.iso.org/iso/home/standards/iso8601.htm>' to indicate the standard adopted as date format.
>
> :stops-2015-05-05
>     a dcat:Dataset ;
>     dct:title "Bus stops of MyCity" ;
>     dcat:keyword "transport","mobility","bus" ;
>     dct:issued "2015-05-05"^^xsd:date ;
>     dcat:contactPoint <http://data.mycity.example.com/transport/contact> ;
>     dct:temporal <http://reference.data.gov.uk/id/year/2015> ;
>     dct:spatial <http://www.geonames.org/3399415> ;
>     dct:publisher :transport-agency-mycity ;
>     dct:accrualPeriodicity <http://purl.org/linked-data/sdmx/2009/code#freq-A> ;
>     dcat:theme :mobility ;
>     dcat:distribution :stops-2015-05-05.csv ;
>     dct:language <http://id.loc.gov/vocabulary/iso639-1/en> ,
>                  <http://id.loc.gov/vocabulary/iso639-1/pt> ;
>     dct:conformsTo <http://www.iso.org/iso/home/standards/iso8601.htm> ;
>     .
>
> Should we have both or just xsd:date? If I understood correctly, I think we should keep just xsd:date. In this case, we can also change the example description to mention that we are using a locale-neutral representation for date and a locale-parameter metadata (dct:language) to specify the languages in which the dataset is published. See the suggestion below:
>
> The example below shows the use of xsd:date providing a locale-neutral representation for the issue date of the bus stops dataset (stops-2015-05-05). Considering that the data from the bus stops dataset is already in a locale-specific format, then the property dct:language is used to declare the languages the dataset is published in. If the dataset is available in multiple languages, use multiple values for this property.
>
> :stops-2015-05-05
>     a dcat:Dataset ;
>     dct:title "Bus stops of MyCity" ;
>     dcat:keyword "transport","mobility","bus" ;
>     dct:issued "2015-05-05"^^xsd:date ;
>     dcat:contactPoint <http://data.mycity.example.com/transport/contact> ;
>     dct:temporal <http://reference.data.gov.uk/id/year/2015> ;
>     dct:spatial <http://www.geonames.org/3399415> ;
>     dct:publisher :transport-agency-mycity ;
>     dct:accrualPeriodicity <http://purl.org/linked-data/sdmx/2009/code#freq-A> ;
>     dcat:theme :mobility ;
>     dcat:distribution :stops-2015-05-05.csv ;
>     dct:language <http://id.loc.gov/vocabulary/iso639-1/en> ,
>                  <http://id.loc.gov/vocabulary/iso639-1/pt> ;
>
> Please, let me know what do you think about this.
>
> Thanks!
>
> Berna

--
Annette Greiner
NERSC Data and Analytics Services
Lawrence Berkeley National Laboratory
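The distinction drawn in this message, between a locale-neutral value held in the data itself and its locale-specific display, can be sketched in Python. This is only an illustration under stated assumptions: the payload field names, the `STYLES` table, and the helper names `load_record` and `format_price` are hypothetical and do not come from the DWBP document, and the two display conventions are hand-written approximations where a real system would draw on locale data such as CLDR.

```python
import json
from datetime import date
from decimal import Decimal

# Locale-neutral payload, modelled on the thread's price example and the
# xsd:date value "2015-05-05". The field names are assumptions.
PAYLOAD = '{"price": {"value": 2000.00, "currency": "EUR"}, "issued": "2015-05-05"}'

# Hypothetical display conventions, written by hand for illustration only.
STYLES = {
    "en-IE": {"decimal": ".", "group": ",", "symbol_first": True},
    "fr-FR": {"decimal": ",", "group": " ", "symbol_first": False},
}
SYMBOLS = {"EUR": "€", "USD": "$"}


def load_record(text: str) -> dict:
    """Parse the neutral payload into typed values; no locale is involved."""
    raw = json.loads(text, parse_float=Decimal)  # keeps 2000.00 exact
    return {
        "value": raw["price"]["value"],
        "currency": raw["price"]["currency"],
        "issued": date.fromisoformat(raw["issued"]),
    }


def format_price(value: Decimal, currency: str, style_name: str) -> str:
    """Render a neutral value for display under one of the STYLES above."""
    style = STYLES[style_name]
    neutral = f"{value:,.2f}"  # neutral grouping first, e.g. "2,000.00"
    swap = str.maketrans({",": style["group"], ".": style["decimal"]})
    local = neutral.translate(swap)
    symbol = SYMBOLS.get(currency, currency)
    return f"{symbol}{local}" if style["symbol_first"] else f"{local} {symbol}"


record = load_record(PAYLOAD)
for name in STYLES:
    print(name, format_price(record["value"], record["currency"], name))
```

The stored record never changes; only the final formatting step depends on the reader's conventions, which is the separation between data and presentation that the thread argues for.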
Re: [i18n review comment] BP3 should recommend locale-neutral representation #187
Bernadette Farias Lóscio   Wed, 24 Aug 2016 15:38:00 -0300

Received on Wednesday, 24 August 2016 18:38:53 UTC

Sent to: amgreiner@lbl.gov
Copied to: phila@w3.org, addison@lab126.com, deirdre@derilinx.com, ishida@w3.org, public-dwbp-comments@w3.org, www-international@w3.org.

Hi Annette,

thanks for your answer! Just a brief explanation: I understood that "2015-05-05" is also *data* and because of this I said that it is an example of locale-neutral representation.

I have just two more questions:

- I am still not sure if we should keep "dct:conformsTo". Should we keep it?
- I understand that to have an example of locale-neutral representation we should present some instances from the dataset. But I don't see how to do this considering the dataset attributes and their corresponding data types [1]. Could you please give me an example?

cheers,
Bernadette

[1] http://w3c.github.io/dwbp/dwbp-example.html#dataset-structural-metadata

2016-08-24 14:32 GMT-03:00 Annette Greiner <amgreiner@lbl.gov>:

> I think it's a fine idea to put both the locale-neutral and the metadata approaches into example 13, but the metadata is not an example of a locale-neutral representation of the *data*. We need to show that in the data itself.
>
> -Annette
>
> On 8/24/16 4:56 AM, Bernadette Farias Lóscio wrote:
>
> Hi all,
>
> @Phil, thanks a lot for making the updates! @Annette and Addison thanks for the comments and suggestions!
>
> I agree with the changes made until now, but I'd like to answer the following comment:
>
>>> Finally, the example marked prominently as Example 13 looks like the primary suggestion for implementing the BP, which it isn't anymore. I think the 2000 Euro example should be at least as prominently marked.
>>
>> I sympathise but I'm going to have to leave that to the editors. It can be done by simply adding class="example" to the <pre> element. But, doing that then means that the example numbers will be out of step with the BP numbers from that point on, which I *think* editors have been anxious to avoid?
>
> I don't think it's a good idea to change the numbers of the examples. One solution could be to make some changes to example 13.
>
> Example 13 shows both the use of locale-neutral representation and locale-parameters metadata. We have the tag "xsd:date" in 'dct:issued "2015-05-05"^^xsd:date', but we also have 'dct:conformsTo <http://www.iso.org/iso/home/standards/iso8601.htm>' to indicate the standard adopted as date format.
>
> :stops-2015-05-05
>     a dcat:Dataset ;
>     dct:title "Bus stops of MyCity" ;
>     dcat:keyword "transport","mobility","bus" ;
>     dct:issued "2015-05-05"^^xsd:date ;
>     dcat:contactPoint <http://data.mycity.example.com/transport/contact> ;
>     dct:temporal <http://reference.data.gov.uk/id/year/2015> ;
>     dct:spatial <http://www.geonames.org/3399415> ;
>     dct:publisher :transport-agency-mycity ;
>     dct:accrualPeriodicity <http://purl.org/linked-data/sdmx/2009/code#freq-A> ;
>     dcat:theme :mobility ;
>     dcat:distribution :stops-2015-05-05.csv ;
>     dct:language <http://id.loc.gov/vocabulary/iso639-1/en> ,
>                  <http://id.loc.gov/vocabulary/iso639-1/pt> ;
>     dct:conformsTo <http://www.iso.org/iso/home/standards/iso8601.htm> ;
>     .
>
> Should we have both or just xsd:date? If I understood correctly, I think we should keep just xsd:date. In this case, we can also change the example description to mention that we are using a locale-neutral representation for date and a locale-parameter metadata (dct:language) to specify the languages in which the dataset is published. See the suggestion below:
>
> The example below shows the use of xsd:date providing a locale-neutral representation for the issue date of the bus stops dataset (stops-2015-05-05). Considering that the data from the bus stops dataset is already in a locale-specific format, then the property dct:language is used to declare the languages the dataset is published in. If the dataset is available in multiple languages, use multiple values for this property.
>
> :stops-2015-05-05
>     a dcat:Dataset ;
>     dct:title "Bus stops of MyCity" ;
>     dcat:keyword "transport","mobility","bus" ;
>     dct:issued "2015-05-05"^^xsd:date ;
>     dcat:contactPoint <http://data.mycity.example.com/transport/contact> ;
>     dct:temporal <http://reference.data.gov.uk/id/year/2015> ;
>     dct:spatial <http://www.geonames.org/3399415> ;
>     dct:publisher :transport-agency-mycity ;
>     dct:accrualPeriodicity <http://purl.org/linked-data/sdmx/2009/code#freq-A> ;
>     dcat:theme :mobility ;
>     dcat:distribution :stops-2015-05-05.csv ;
>     dct:language <http://id.loc.gov/vocabulary/iso639-1/en> ,
>                  <http://id.loc.gov/vocabulary/iso639-1/pt> ;
>
> Please, let me know what do you think about this.
>
> Thanks!
>
> Berna
>
>>> -Annette
>>>
>>> On 8/23/16 7:11 AM, Phillips, Addison wrote:
>>>
>>>> Hi Phil,
>>>>
>>>> Thanks. This looks good to me.
>>>>
>>>> Addison
>>>>
>>>>> -----Original Message-----
>>>>> From: Phil Archer [mailto:phila@w3.org]
>>>>> Sent: Tuesday, August 23, 2016 3:29 AM
>>>>> To: Phillips, Addison <addison@lab126.com>; Deirdre Lee <deirdre@derilinx.com>; Bernadette Farias Lóscio <bfl@cin.ufpe.br>; Annette Greiner <amgreiner@lbl.gov>
>>>>> Cc: ishida@w3.org; public-dwbp-comments@w3.org; www International <www-international@w3.org>
>>>>> Subject: Re: [i18n review comment] BP3 should recommend locale-neutral representation #187
>>>>>
>>>>> Thanks again Addison,
>>>>>
>>>>> Pls see below.
>>>>>
>>>>> On 22/08/2016 18:36, Phillips, Addison wrote:
>>>>>
>>>>>> Hi Phil,
>>>>>>
>>>>>> This looks good. A few comments.
>>>>>>
>>>>>> 1. Rather than providing your own definition for 'locale', you might make use of the one we provide in LTLI [1].
>>>>>
>>>>> Done
>>>>> http://w3c.github.io/dwbp/bp.html#locale_parameter
>>>>>
>>>>>> 2. The "why" is still missing something. I would suggest adding a new first paragraph explaining locale-neutral first. Something like:
>>>>>>
>>>>>> --
>>>>>> Data values that are machine-readable and not specific to any particular language or culture are more durable and less open to misinterpretation than values that use one of the many different cultural representations. By using a locale-neutral format, systems avoid the need to establish specific interchange rules that vary according to the language or location of the user.
>>>>>>
>>>>>> When the data is already in a locale-specific format, providing locale parameters... <rest of existing text>
>>>>>
>>>>> Done, exactly as you suggest
>>>>> http://w3c.github.io/dwbp/bp.html#LocaleParametersMetadata
>>>>>
>>>>> With luck... the doc gets a green light from you?
>>>>>
>>>>> Thanks again
>>>>>
>>>>> Phil.
>>>>>
>>>>>> --
>>>>>>
>>>>>> Hope that helps,
>>>>>>
>>>>>> Addison
>>>>>>
>>>>>> [1] https://www.w3.org/TR/ltli/#locale
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Phil Archer [mailto:phila@w3.org]
>>>>>>> Sent: Monday, August 22, 2016 2:34 AM
>>>>>>> To: Deirdre Lee <deirdre@derilinx.com>; Phillips, Addison <addison@lab126.com>; Bernadette Farias Lóscio <bfl@cin.ufpe.br>; Annette Greiner <amgreiner@lbl.gov>
>>>>>>> Cc: ishida@w3.org; public-dwbp-comments@w3.org; www International <www-international@w3.org>
>>>>>>> Subject: Re: [i18n review comment] BP3 should recommend locale-neutral representation #187
>>>>>>>
>>>>>>> Dear all,
>>>>>>>
>>>>>>> I have taken further steps on this. The result can be seen at
>>>>>>> http://philarcher1.github.io/dwbp/bp.html#LocaleParametersMetadata
>>>>>>>
>>>>>>> 1. Addison's text used more or less verbatim;
>>>>>>>    1a. taken account of Annette's suggestion;
>>>>>>>    1b. replaced inline links to BCP47 and CLDR with references
>>>>>>> 2. title of the BP changed to Use locale-neutral data representations
>>>>>>> 3. moved to Data Formats section as resolved in WG meeting on Friday;
>>>>>>> 4. added R-FormatMachineRead to list of evidence and thereby updated the UCR cross matching;
>>>>>>> 5. updated the Challenges SVG diagram;
>>>>>>> 6. updated my Pull request.
>>>>>>>
>>>>>>> NB, I *retained* the old ID for the BP so that any links to #LocaleParametersMetadata will still work. I know there are some of these, for example, in the Share-PSI project.
>>>>>>>
>>>>>>> HTH
>>>>>>>
>>>>>>> Phil.
>>>>>>>
>>>>>>> On 22/08/2016 08:52, Deirdre Lee wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Thank you for your comments Addison. I think they make sense and should be straight-forward to incorporate.
>>>>>>>>
>>>>>>>> The title of the BP should probably also be updated to something like 'Provide locale-neutral data'
>>>>>>>>
>>>>>>>> Phil and DWBP editors, in Friday's meeting we also agreed to move BP3 to the Data Formats section from the Metadata section, which would make it BP14, right?
>>>>>>>>
>>>>>>>> Kind regards,
>>>>>>>>
>>>>>>>> Deirdre
>>>>>>>>
>>>>>>>> On 19/08/2016 17:39, Phillips, Addison wrote:
>>>>>>>>
>>>>>>>>> Hi Phil,
>>>>>>>>>
>>>>>>>>> Thanks for starting on this. I think the pull request is a good start. I have some comments on it.
>>>>>>>>>
>>>>>>>>> My main concern is that this BP is really backwards. It recommends to "locale parameter metadata" and then says that the simplest way to do this is to use locale-neutral formats. The recommendation should be more like "use locale-neutral formats or provide locale/language information where that's not possible". The pull request captures the use of locale-neutral, but doesn't really explain about when to provide locale and language information.
>>>>>>>>>
>>>>>>>>> I would change this:
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> <p class="practicedesc">Provide metadata about locale parameters (date, time, and number formats, language).</p>
>>>>>>>>> --
>>>>>>>>>
>>>>>>>>> To say:
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> <p class="practicedesc">Use locale-neutral data structures and values, or, where that is not possible, provide metadata about the locale used by data values.</p>
>>>>>>>>> --
>>>>>>>>>
>>>>>>>>> I would change:
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> <p>The simplest method is to use local-neutral representations of the actual data, and then add metadata to provide relevant locale information. For example, rather than storing "€2000.00" as a string, it's strongly preferred to exchange a data structure such as:</p>
>>>>>>>>> --
>>>>>>>>>
>>>>>>>>> To say:
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> <p>Most common data representations are locale neutral. For example, XML Schema types such as xsd:integer and xsd:date are intended for locale-neutral data interchange. Using locale-neutral representations allows the data values to be processed accurately without complex parsing or misinterpretation and also allows the data to be presented in the format most comfortable for the consumer of the data. For example, rather than storing "€2000,00" as a string, it's strongly preferred to exchange a data structure such as:</p>
>>>>>>>>> --
>>>>>>>>>
>>>>>>>>> Also, note the misspelling of "locale-neutral" in the pull request.
>>>>>>>>>
>>>>>>>>> I would then go on to add some text about when locale parameters are needed. Something like:
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Some datasets contain values that are not or cannot be rendered into a locale-neutral format. This is particularly true of any natural language text values.
For each data field that can contain >>>>>>>>> locale affected or natural language text, there should be an >>>>>>>>> associated language tag used to indicate the language and locale >>>>>>>>> of the >>>>>>>>> >>>>>>>> data. >>>>> >>>>>> This locale information can be used in parsing the data or to >>>>>>>>> ensure proper presentation and processing of the value by the >>>>>>>>> >>>>>>>> consumer. >>>>> >>>>>> -- >>>>>>>>> >>>>>>>>> (Sorry for not generating a pull request of my own) >>>>>>>>> >>>>>>>>> Addison >>>>>>>>> >>>>>>>>> -----Original Message----- >>>>>>>>>> From: Phil Archer [mailto:phila@w3.org] >>>>>>>>>> Sent: Friday, August 19, 2016 8:37 AM >>>>>>>>>> To: Bernadette Farias Lóscio <bfl@cin.ufpe.br>; Annette Greiner >>>>>>>>>> <amgreiner@lbl.gov> >>>>>>>>>> Cc: Phillips, Addison <addison@lab126.com>; ishida@w3.org; >>>>>>>>>> public-dwbp- comments@w3.org; www International >>>>>>>>>> <www-international@w3.org> >>>>>>>>>> Subject: Re: [i18n review comment] BP3 should recommend >>>>>>>>>> locale-neutral representation #187 >>>>>>>>>> >>>>>>>>>> I took an action on today's call to try and address this in BP3. >>>>>>>>>> You can see the results at >>>>>>>>>> >>>>>>>>>> http://philarcher1.github.io/dwbp/bp.html#LocaleParametersMe >>>>> tadata >>>>> >>>>>> This uses some of Addison's text directly and highlights the value >>>>>>>>>> of the xsd datatypes - but retains enough of the original BP for >>>>>>>>>> it to be an amendment rather than a whole new one - I hope. >>>>>>>>>> >>>>>>>>>> This addresses most of the resolution taken today [1] but I have >>>>>>>>>> not moved the BP to the formats section. I leave that to the >>>>>>>>>> editors who may want to make further changes - or argue for it to >>>>>>>>>> be left where it is, or add references from the formats section >>>>>>>>>> or, or, >>>>>>>>>> >>>>>>>>> or... >>>>> >>>>>> I've created the Pull Request https://github.com/w3c/dwbp/pull/447 >>>>>>>>>> >>>>>>>>>> Phil. 
>>>>>>>>>> >>>>>>>>>> [1] https://www.w3.org/2016/08/19-dwbp-minutes#resolution02 >>>>>>>>>> >>>>>>>>>> On 15/08/2016 17:28, Bernadette Farias Lóscio wrote: >>>>>>>>>> >>>>>>>>>>> Dear Ishida, >>>>>>>>>>> >>>>>>>>>>> This comment [1] is still under discussion [4] and we'd like to >>>>>>>>>>> ask your opinion about two of our proposals: >>>>>>>>>>> >>>>>>>>>>> 1. to include locale-neutral representation ideas as part of BP3 >>>>>>>>>>> [2], or 2. to include a paragraph at the introduction of Section >>>>>>>>>>> 8.8 Data Formats [3] to discuss the relevance of having >>>>>>>>>>> local-neutral representations. >>>>>>>>>>> >>>>>>>>>>> We also discussed the proposal of having a new BP and we agreed >>>>>>>>>>> that we won't have a lot of time for a broader review of the new >>>>>>>>>>> BP and to collect feedback from the community. >>>>>>>>>>> >>>>>>>>>>> Thanks a lot! >>>>>>>>>>> DWBP editors >>>>>>>>>>> >>>>>>>>>>> [1] https://lists.w3.org/Archives/Public/public-dwbp-comments/ >>>>>>>>>>> 2016Jul/0028.html >>>>>>>>>>> >>>>>>>>>>> [2]http://agreiner.github.io/dwbp/bp.html#LocaleParametersMe >>>>>>> tadata >>>>>>> >>>>>>>> [3] https://www.w3.org/TR/dwbp/#dataFormats >>>>>>>>>>> [4] >>>>>>>>>>> https://lists.w3.org/Archives/Public/public-dwbp-wg/2016Aug/0009 >>>>>>>>>>> . >>>>>>>>>>> ht >>>>>>>>>>> ml >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> 2016-08-04 23:26 GMT+02:00 Annette Greiner <amgreiner@lbl.gov>: >>>>>>>>>>> >>>>>>>>>>> Hi Addison, >>>>>>>>>>>> >>>>>>>>>>>> Thanks for your response, and it does make sense. I think what I >>>>>>>>>>>> am still missing is whether there is guidance we can point to as >>>>>>>>>>>> to how to represent the "locale-neutral" data so that it can >>>>>>>>>>>> most easily be made locale specific by existing tools. You >>>>>>>>>>>> mention "pre-made standards for the basic data types". Is there >>>>>>>>>>>> a recommended list we could >>>>>>>>>>>> >>>>>>>>>>> reference? >>>>>>>>>> >>>>>>>>>>> Thanks for your help! 
>>>>>>>>>>>> -Annette >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 8/4/16 12:31 PM, Phillips, Addison wrote: >>>>>>>>>>>> >>>>>>>>>>>> Hi Annette, >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks for the note. This is a personal reply not on behalf of >>>>>>>>>>>>> the WG. >>>>>>>>>>>>> >>>>>>>>>>>>> Locale neutral formats are quite common on the Web and the >>>>>>>>>>>>> Internet in general. One familiar format referenced by your >>>>>>>>>>>>> document, for example, is XML Schema. While the >>>>>>>>>>>>> >>>>>>>>>>>> representations >>>>> >>>>>> of numbers, dates, and the like in XML Schema would be "more >>>>>>>>>>>>> appropriate" for some languages/locales than others if given as >>>>>>>>>>>>> plain text, what distinguishes them is that they are all >>>>>>>>>>>>> machine readable and intended to >>>>>>>>>>>>> >>>>>>>>>>>> be read by machines for later processing. >>>>>>>>>> >>>>>>>>>>> The display of values is a separate, local, concern for the >>>>>>>>>>>>> data's consumer. This necessarily means choosing specific >>>>>>>>>>>>> separators (such as decimal separators) over other, more >>>>>>>>>>>>> localized values. Save for "free >>>>>>>>>>>>> >>>>>>>>>>>> text" >>>>>>>>>> >>>>>>>>>>> (natural language) data, most data formats are locale neutral >>>>>>>>>>>>> and these include things like JSON-LD, XML Schema, CSV, and so >>>>>>>>>>>>> >>>>>>>>>>>> forth. >>>>> >>>>>> Not every possible data structure or data value is, of course, >>>>>>>>>>>>> covered fully. For example, in my day job (I work at Amazon), >>>>>>>>>>>>> we have many different common measurement units defined >>>>>>>>>>>>> >>>>>>>>>>>> internally. >>>>> >>>>>> To transmit these in a locale-neutral manner, we need to >>>>>>>>>>>>> construct our own data schemas and identifiers. There are >>>>>>>>>>>>> profoundly many ways to measure shoes, dresses, auto parts, >>>>>>>>>>>>> hats, drone propellers, and so forth. 
But it would be a >>>>>>>>>>>>> nightmare to have to deal with localized >>>>>>>>>>>>> >>>>>>>>>>>> presentation formats on top of that. >>>>>>>>>> >>>>>>>>>>> But there are pre-made standards for the basic data types and >>>>>>>>>>>>> these are what are needed to build almost any data structure >>>>>>>>>>>>> necessary for global interchange of data. >>>>>>>>>>>>> >>>>>>>>>>>>> Does that make sense? >>>>>>>>>>>>> >>>>>>>>>>>>> Addison >>>>>>>>>>>>> >>>>>>>>>>>>> Addison Phillips >>>>>>>>>>>>> Principal SDE, I18N Architect (Amazon) Chair (W3C I18N WG) >>>>>>>>>>>>> >>>>>>>>>>>>> Internationalization is not a feature. >>>>>>>>>>>>> It is an architecture. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> -----Original Message----- >>>>>>>>>>>>> >>>>>>>>>>>>>> From: Annette Greiner [mailto:amgreiner@lbl.gov] >>>>>>>>>>>>>> Sent: Thursday, August 04, 2016 12:04 PM >>>>>>>>>>>>>> To: ishida@w3.org; public-dwbp-comments@w3.org >>>>>>>>>>>>>> Cc: www International <www-international@w3.org> >>>>>>>>>>>>>> Subject: Re: [i18n review comment] BP3 should recommend >>>>>>>>>>>>>> locale-neutral representation #187 >>>>>>>>>>>>>> >>>>>>>>>>>>>> Hello on behalf of the DWBP WG, >>>>>>>>>>>>>> >>>>>>>>>>>>>> We're interested in pursuing this concept in our best practice >>>>>>>>>>>>>> document, but we would like some clarification of the practice >>>>>>>>>>>>>> of locale neutrality. >>>>>>>>>>>>>> You >>>>>>>>>>>>>> mention the variation across locales in decimal symbol, >>>>>>>>>>>>>> grouping symbol, number of grouping digits, digit shapes, >>>>>>>>>>>>>> etc., and you give an example of a locale-neutral data >>>>>>>>>>>>>> structure for monetary >>>>>>>>>>>>>> >>>>>>>>>>>>> values. >>>>>>> >>>>>>>> But this structure alone does not appear to address >>>>>>>>>>>>>> differences in decimal symbol, grouping symbol, number of >>>>>>>>>>>>>> grouping digits, or digit shapes. 
It does provide a mechanism >>>>>>>>>>>>>> to separately specify the units, and the example uses an >>>>>>>>>>>>>> ISO-4217 currency code, both of which we agree are good ideas. >>>>>>>>>>>>>> Is there a broad standard (beyond just monetary) for >>>>>>>>>>>>>> addressing the other symbol/representation issues you raised >>>>>>>>>>>>>> that we can address >>>>>>>>>>>>>> >>>>>>>>>>>>> briefly in our best practice? >>>>>>> >>>>>>>> Do you consider SI units consistent with a locale-neutral >>>>>>>>>>>>>> >>>>>>>>>>>>> approach? >>>>> >>>>>> Is there a locale-neutral standard for representing decimal >>>>>>>>>>>>>> numbers (perhaps using a period and no grouping, as in your >>>>>>>>>>>>>> >>>>>>>>>>>>> example)? >>>>>>> >>>>>>>> -Annette >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 7/22/16 5:32 AM, ishida@w3.org wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> [raised by aphillips] >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> https://www.w3.org/TR/dwbp/#LocaleParametersMetadata >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Best practice #3 introduces itself as: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Providing locale parameters helps humans and computer >>>>>>>>>>>>>>> applications to work accurately with things like dates, >>>>>>>>>>>>>>> currencies and numbers that may look similar but have >>>>>>>>>>>>>>> different meanings in different locales. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> But the actual best practice is to use **locale-neutral** >>>>>>>>>>>>>>> representations that are interpreted/displayed to end-users >>>>>>>>>>>>>>> in a locale-appropriate manner. 
For example, instead of >>>>>>>>>>>>>>> storing the string "€2000.00", exchanging a data structure >>>>>>>>>>>>>>> like the following is strongly >>>>>>>>>>>>>>> preferred: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> ``` >>>>>>>>>>>>>>> "price" { >>>>>>>>>>>>>>> "value": 2000.00, >>>>>>>>>>>>>>> "currency": "EUR" >>>>>>>>>>>>>>> } >>>>>>>>>>>>>>> ``` >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> The date examples given are all in xsd:date format, which is >>>>>>>>>>>>>>> an excellent example of using a locale-neutral format. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Many things are dependent on locale: decimal symbol, >>>>>>>>>>>>>>> >>>>>>>>>>>>>> grouping >>>>> >>>>>> symbol, number of grouping digits, digit shapes, etc. It's >>>>>>>>>>>>>>> because there can be wide variation (sometimes open to >>>>>>>>>>>>>>> misinterpretation) that sending a locale neutral format is >>>>>>>>>>>>>>> >>>>>>>>>>>>>> preferred for data values. >>>>>>> >>>>>>>> Note also btw that the position of the currency symbol is >>>>>>>>>>>>>>> dependent on the locale. In France it would be normal to >>>>>>>>>>>>>>> write >>>>>>>>>>>>>>> >>>>>>>>>>>>>> 2000.00 € rather than €2000.00. >>>>>>>>>> >>>>>>>>>>> Same even when talking about USD when using $, ie. 2000.00 $. 
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>> >>>>>>>>>>>>>> Annette Greiner >>>>>>>>>>>>>> NERSC Data and Analytics Services Lawrence Berkeley National >>>>>>>>>>>>>> Laboratory >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -- >>>>>>>>>>>> Annette Greiner >>>>>>>>>>>> NERSC Data and Analytics Services Lawrence Berkeley National >>>>>>>>>>>> Laboratory >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Phil Archer >>>>>>>>>> W3C Data Activity Lead >>>>>>>>>> http://www.w3.org/2013/data/ >>>>>>>>>> >>>>>>>>>> http://philarcher.org >>>>>>>>>> +44 (0)7887 767755 >>>>>>>>>> @philarcher1 >>>>>>>>>> >>>>>>>>> -- >>>>>>> >>>>>>> >>>>>>> Phil Archer >>>>>>> W3C Data Activity Lead >>>>>>> http://www.w3.org/2013/data/ >>>>>>> >>>>>>> http://philarcher.org >>>>>>> +44 (0)7887 767755 <%2B44%20%280%297887%20767755> >>>>>>> @philarcher1 >>>>>>> >>>>>> -- >>>>> >>>>> >>>>> Phil Archer >>>>> W3C Data Activity Lead >>>>> http://www.w3.org/2013/data/ >>>>> >>>>> http://philarcher.org >>>>> +44 (0)7887 767755 <%2B44%20%280%297887%20767755> >>>>> @philarcher1 >>>>> >>>> >>> >> -- >> >> >> Phil Archer >> W3C Data Activity Lead >> http://www.w3.org/2013/data/ >> >> http://philarcher.org >> +44 (0)7887 767755 <%2B44%20%280%297887%20767755> >> @philarcher1 >> > > > > -- > Bernadette Farias Lóscio > Centro de Informática > Universidade Federal de Pernambuco - UFPE, Brazil > ------------------------------------------------------------ > ---------------- > > > -- > Annette Greiner > NERSC Data and Analytics Services > Lawrence Berkeley National Laboratory > > > -- Bernadette Farias Lóscio Centro de Informática Universidade Federal de Pernambuco - UFPE, Brazil ----------------------------------------------------------------------------
Re: [i18n review comment] BP3 should recommend locale-neutral representation #187
Annette Greiner   Wed, 24 Aug 2016 17:37:10 -0700

Received on Thursday, 25 August 2016 00:38:21 UTC

Sent to: bfl@cin.ufpe.br
Copied to: phila@w3.org, addison@lab126.com, deirdre@derilinx.com, ishida@w3.org, public-dwbp-comments@w3.org, www-international@w3.org.

Hi Bernadette,

Of course you're right that "2015-05-05" can also be data, but in this example it is metadata so it doesn't really function as an example of what the BP is mostly about. We could use the same example as is already in there from Addison, maybe just call it a bus fare.

I'll defer to others more familiar with vocabularies for the dct:conformsTo question.

-Annette

On 8/24/16 11:38 AM, Bernadette Farias Lóscio wrote:
> Hi Annette,
>
> thanks for your answer! Just a brief explanation: I understood that "2015-05-05" is also *data* and because of this I said that it is an example of locale-neutral representation.
>
> I have just two more questions:
>
> - I am still not sure if we should keep "dct:conformsTo". Should we keep it?
>
> - I understand that to have an example of locale-neutral representation we should present some instances from the dataset. But I don't see how to do this considering the dataset attributes and their corresponding data types [1]. Could you please give me an example?
>
> cheers,
> Bernadette
>
> [1] http://w3c.github.io/dwbp/dwbp-example.html#dataset-structural-metadata
>
> 2016-08-24 14:32 GMT-03:00 Annette Greiner <amgreiner@lbl.gov>:
>
> I think it's a fine idea to put both the locale-neutral and the metadata approaches into example 13, but the metadata is not an example of a locale-neutral representation of the *data*. We need to show that in the data itself.
>
> -Annette
>
> On 8/24/16 4:56 AM, Bernadette Farias Lóscio wrote:
>> Hi all,
>>
>> @Phil, thanks a lot for making the updates! @Annette and Addison thanks for the comments and suggestions!
>>
>> I agree with the changes made until now, but I'd like to answer the following comment:
>>
>> Finally, the example marked prominently as Example 13 looks like the primary suggestion for implementing the BP, which it isn't anymore. I think the 2000 Euro example should be at least as prominently marked.
>>
>> I sympathise but I'm going to have to leave that to the editors. It can be done by simply adding class="example" to the <pre> element. But doing that then means that the example numbers will be out of step with the BP numbers from that point on, which I *think* editors have been anxious to avoid?
>>
>> I don't think it's a good idea to change the numbers of the examples. One solution could be to make some changes to Example 13.
>>
>> Example 13 shows both the use of locale-neutral representation and locale-parameters metadata. We have the tag "xsd:date" in 'dct:issued "2015-05-05"^^xsd:date', but we also have 'dct:conformsTo <http://www.iso.org/iso/home/standards/iso8601.htm>' to indicate the standard adopted as date format.
>>
>> :stops-2015-05-05
>>     a dcat:Dataset ;
>>     dct:title "Bus stops of MyCity" ;
>>     dcat:keyword "transport","mobility","bus" ;
>>     dct:issued "2015-05-05"^^xsd:date ;
>>     dcat:contactPoint <http://data.mycity.example.com/transport/contact> ;
>>     dct:temporal <http://reference.data.gov.uk/id/year/2015> ;
>>     dct:spatial <http://www.geonames.org/3399415> ;
>>     dct:publisher :transport-agency-mycity ;
>>     dct:accrualPeriodicity <http://purl.org/linked-data/sdmx/2009/code#freq-A> ;
>>     dcat:theme :mobility ;
>>     dcat:distribution :stops-2015-05-05.csv ;
>>     dct:language <http://id.loc.gov/vocabulary/iso639-1/en> ,
>>         <http://id.loc.gov/vocabulary/iso639-1/pt> ;
>>     dct:conformsTo <http://www.iso.org/iso/home/standards/iso8601.htm> ;
>>     .
>>
>> Should we have both or just xsd:date? If I understood correctly, I think we should keep just xsd:date. In this case, we can also change the example description to mention that we are using a locale-neutral representation for date and a locale-parameter metadata (dct:language) to specify the languages in which the dataset is published. See the suggestion below:
>>
>> The example below shows the use of xsd:date providing a locale-neutral representation for the issue date of the bus stops dataset (|stops-2015-05-05|). Considering that the data from the bus stops dataset is already in a locale-specific format, then the property |dct:language| is used to declare the languages the dataset is published in. If the dataset is available in multiple languages, use multiple values for this property.
>>
>> :stops-2015-05-05
>>     a dcat:Dataset ;
>>     dct:title "Bus stops of MyCity" ;
>>     dcat:keyword "transport","mobility","bus" ;
>>     dct:issued "2015-05-05"^^xsd:date ;
>>     dcat:contactPoint <http://data.mycity.example.com/transport/contact> ;
>>     dct:temporal <http://reference.data.gov.uk/id/year/2015> ;
>>     dct:spatial <http://www.geonames.org/3399415> ;
>>     dct:publisher :transport-agency-mycity ;
>>     dct:accrualPeriodicity <http://purl.org/linked-data/sdmx/2009/code#freq-A> ;
>>     dcat:theme :mobility ;
>>     dcat:distribution :stops-2015-05-05.csv ;
>>     dct:language <http://id.loc.gov/vocabulary/iso639-1/en> ,
>>         <http://id.loc.gov/vocabulary/iso639-1/pt> ;
>>
>> Please let me know what you think about this.
>>
>> Thanks!
>>
>> Berna
>>
>> -Annette
>>
>> On 8/23/16 7:11 AM, Phillips, Addison wrote:
>>
>> Hi Phil,
>>
>> Thanks. This looks good to me.
>> >> Addison >> >> -----Original Message----- >> From: Phil Archer [mailto:phila@w3.org >> <mailto:phila@w3.org>] >> Sent: Tuesday, August 23, 2016 3:29 AM >> To: Phillips, Addison <addison@lab126.com >> <mailto:addison@lab126.com>>; Deirdre Lee >> <deirdre@derilinx.com >> <mailto:deirdre@derilinx.com>>; Bernadette Farias >> Lóscio <bfl@cin.ufpe.br <mailto:bfl@cin.ufpe.br>>; >> Annette Greiner <amgreiner@lbl.gov >> <mailto:amgreiner@lbl.gov>> >> Cc: ishida@w3.org <mailto:ishida@w3.org>; >> public-dwbp-comments@w3.org >> <mailto:public-dwbp-comments@w3.org>; www >> International >> <www-international@w3.org >> <mailto:www-international@w3.org>> >> Subject: Re: [i18n review comment] BP3 should >> recommend locale-neutral >> representation #187 >> >> Thanks again Addison, >> >> Pls see below. >> >> On 22/08/2016 18:36, Phillips, Addison wrote: >> >> Hi Phil, >> >> This looks good. A few comments. >> >> 1. Rather than providing your own definition >> for 'locale', you might >> make >> >> use of the one we provide in LTLI [1]. >> >> Done >> http://w3c.github.io/dwbp/bp.html#locale_parameter >> <http://w3c.github.io/dwbp/bp.html#locale_parameter> >> >> 2. The "why" is still missing something. I >> would suggest adding a >> new first >> >> paragraph explaining locale-neutral first. >> Something like: >> >> -- >> Data values that are machine-readable and not >> specific to any >> particular >> >> language or culture are more durable and less open to >> misinterpretation than >> values that use one of the many different >> cultural representations. >> By using a >> locale-neutral format, systems avoid the need to >> establish specific >> interchange rules that vary according to the >> language or location of >> the user. >> >> When the data is already in a locale-specific >> format, providing locale >> parameters... 
<rest of existing text> >> >> >> Done, exactly as you suggest >> http://w3c.github.io/dwbp/bp.html#LocaleParametersMetadata >> <http://w3c.github.io/dwbp/bp.html#LocaleParametersMetadata> >> >> With luck... the doc gets a green light from you? >> >> Thanks again >> >> Phil. >> >> -- >> >> Hope that helps, >> >> Addison >> >> [1] https://www.w3.org/TR/ltli/#locale >> <https://www.w3.org/TR/ltli/#locale> >> >> -----Original Message----- >> From: Phil Archer [mailto:phila@w3.org >> <mailto:phila@w3.org>] >> Sent: Monday, August 22, 2016 2:34 AM >> To: Deirdre Lee <deirdre@derilinx.com >> <mailto:deirdre@derilinx.com>>; Phillips, >> Addison >> <addison@lab126.com >> <mailto:addison@lab126.com>>; Bernadette >> Farias Lóscio <bfl@cin.ufpe.br >> <mailto:bfl@cin.ufpe.br>>; >> Annette Greiner <amgreiner@lbl.gov >> <mailto:amgreiner@lbl.gov>> >> Cc: ishida@w3.org <mailto:ishida@w3.org>; >> public-dwbp-comments@w3.org >> <mailto:public-dwbp-comments@w3.org>; www >> International >> <www-international@w3.org >> <mailto:www-international@w3.org>> >> Subject: Re: [i18n review comment] BP3 >> should recommend >> locale-neutral representation #187 >> >> Dear all, >> >> I have taken further steps on this. The >> result can be seen at >> http://philarcher1.github.io/dwbp/bp.html#LocaleParametersMetadata >> <http://philarcher1.github.io/dwbp/bp.html#LocaleParametersMetadata> >> >> 1. Addision's text used more or less >> verbatim; 1a. taken account of >> Annette's suggestion; 1b. replaced inline >> links to BCP47 and CLDR with >> >> references 2. >> >> title of the BP changed to Use >> locale-neutral data representations 3. >> moved to Data Formats section as resolved >> in WG meeting on Friday; 4. >> added R- FormatMachineRead to list of >> evidence and thereby updated >> the UCR cross matching; 5. updated the >> Challenges SVG diagram; 6. >> updated my Pull request. >> >> NB, I *retained* the old ID for the BP so >> that any links to >> #LocaleParametersMetadata will still >> work. 
I know there are some of >> these, for example, in the Share-PSI project. >> >> HTH >> >> Phil. >> >> >> >> On 22/08/2016 08:52, Deirdre Lee wrote: >> >> HI, >> >> Thank you for your comments Addison. >> I think they make sense and >> should be straight-forward to >> incorporate. >> >> The title of the BP should probably >> also be updated to something >> like 'Provide locale-neutral data' >> >> Phil and DWBP editors, in Friday's >> meeting we also agreed to move >> BP3 to the Data Formats section from >> the Metadata section, which >> would make it BP14, right? >> >> Kind regards, >> >> Deirdre >> >> >> >> On 19/08/2016 17:39, Phillips, >> Addison wrote: >> >> Hi Phil, >> >> Thanks for starting on this. I >> think the pull request is a good >> start. >> I have some comments on it. >> >> My main concern is that this BP >> is really backwards. It recommends >> to "locale parameter metadata" >> and then says that the simplest way >> to do this is to use >> locale-neutral formats. The >> recommendation >> should be more like "use >> locale-neutral formats or provide >> locale/language information where >> that's not possible". The pull >> request captures the use of >> locale-neutral, but doesn't really >> explain about when to provide >> locale and language information. >> >> I would change this: >> >> -- >> <p class="practicedesc">Provide >> metadata about locale parameters >> (date, time, and number formats, >> language).</p> >> -- >> >> To say: >> >> -- >> <p class="practicedesc">Use >> locale-neutral data structures and >> values, or, where that is not >> possible, provide metadata about the >> locale used by data values.</p> >> -- >> >> I would change: >> >> -- >> <p>The simplest method is to use >> local-neutral representations of >> the actual data, and then add >> metadata to provide relevant locale >> information. 
For example, rather >> than storing "€2000.00" as a >> string, it's strongly preferred >> to exchange a data structure such >> as:</p> >> -- >> >> To say: >> >> -- >> <p>Most common data >> representations are locale >> neutral. For >> example, XML Schema types such as >> xsd:integer and xsd: date are >> intended for locale-neutral data >> interchange. Using locale-neutral >> representations allows the data >> values to be processed accurately >> without complex parsing or >> misinterpretation and also allows the >> data to be presented in the >> format most comfortable for the >> consumer of the data. For >> example, rather than storing >> "€2000,00" >> as a string, it's strongly >> preferred to exchange a data >> structure >> such as:</p> >> -- >> >> Also, note the misspelling of >> "locale-neutral" in the pull request. >> >> I would then go on to add some >> text about when locale parameters >> are needed. Something like: >> >> -- >> Some datasets contain values that >> are not or cannot be rendered >> into a locale-neutral format. >> This is particularly true of any >> natural language text values. For >> each data field that can contain >> locale affected or natural >> language text, there should be an >> associated language tag used to >> indicate the language and locale >> of the >> >> data. >> >> This locale information can be >> used in parsing the data or to >> ensure proper presentation and >> processing of the value by the >> >> consumer. 
--
Annette Greiner
NERSC Data and Analytics Services
Lawrence Berkeley National Laboratory
Re: [i18n review comment] BP3 should recommend locale-neutral representation #187
Annette Greiner   Fri, 26 Aug 2016 10:31:30 -0700

Received on Friday, 26 August 2016 17:32:05 UTC

Sent to: addison@lab126.com
Copied to: phila@w3.org, deirdre@derilinx.com, bfl@cin.ufpe.br, ishida@w3.org, public-dwbp-comments@w3.org, www-international@w3.org.

Addison, or anyone in i18n that knows,

I have a question about how to make data locale neutral when all the values are of the same format. It seems a hard sell to tell people to expand every datetime value into a value and a format when the format is always the same. For example, if I have taken a terabyte worth of data (not at all uncommon where I work) that consists of pairs of a datetime plus a sensor reading, and the time is always in UNIX format (seconds since January 1, 1970), is it still considered locale-neutral if I simply indicate in the column metadata that the column holds UNIX time values?

UNIX time, sensor reading (mV)
1471995721,4.7
1471995731,7.5
1471995721,6.2

If not, is there a practical way to make it locale neutral without repeating the format for each value? I’m concerned that, for scientific datasets especially, it makes little sense to inflate the size of the dataset with repeated information. For large data, that becomes impractical, which tempts me to suggest that we say in the BP that locale parameters should be used only when a locale-neutral representation is not *practical* (rather than not *possible*), unless my example above would qualify as locale neutral.
-Annette

> On Aug 22, 2016, at 10:36 AM, Phillips, Addison <addison@lab126.com> wrote:
>
> Hi Phil,
>
> This looks good. A few comments.
>
> 1. Rather than providing your own definition for 'locale', you might make use of the one we provide in LTLI [1].
>
> 2. The "why" is still missing something. I would suggest adding a new first paragraph explaining locale-neutral first. Something like:
>
> --
> Data values that are machine-readable and not specific to any particular language or culture are more durable and less open to misinterpretation than values that use one of the many different cultural representations. By using a locale-neutral format, systems avoid the need to establish specific interchange rules that vary according to the language or location of the user.
>
> When the data is already in a locale-specific format, providing locale parameters... <rest of existing text>
> --
>
> Hope that helps,
>
> Addison
>
> [1] https://www.w3.org/TR/ltli/#locale
>
>> -----Original Message-----
>> From: Phil Archer [mailto:phila@w3.org]
>> Sent: Monday, August 22, 2016 2:34 AM
>> To: Deirdre Lee <deirdre@derilinx.com>; Phillips, Addison
>> <addison@lab126.com>; Bernadette Farias Lóscio <bfl@cin.ufpe.br>;
>> Annette Greiner <amgreiner@lbl.gov>
>> Cc: ishida@w3.org; public-dwbp-comments@w3.org; www International
>> <www-international@w3.org>
>> Subject: Re: [i18n review comment] BP3 should recommend locale-neutral
>> representation #187
>>
>> Dear all,
>>
>> I have taken further steps on this. The result can be seen at
>> http://philarcher1.github.io/dwbp/bp.html#LocaleParametersMetadata
>>
>> 1. Addision's text used more or less verbatim; 1a. taken account of Annette's
>> suggestion; 1b. replaced inline links to BCP47 and CLDR with references 2.
>> title of the BP changed to Use locale-neutral data representations 3. moved
>> to Data Formats section as resolved in WG meeting on Friday; 4. added R-
>> FormatMachineRead to list of evidence and thereby updated the UCR cross
>> matching; 5. updated the Challenges SVG diagram; 6. updated my Pull
>> request.
>>
>> NB, I *retained* the old ID for the BP so that any links to
>> #LocaleParametersMetadata will still work. I know there are some of these,
>> for example, in the Share-PSI project.
>>
>> HTH
>>
>> Phil.
>>
>> On 22/08/2016 08:52, Deirdre Lee wrote:
>>> HI,
>>>
>>> Thank you for your comments Addison. I think they make sense and
>>> should be straight-forward to incorporate.
>>>
>>> The title of the BP should probably also be updated to something like
>>> 'Provide locale-neutral data'
>>>
>>> Phil and DWBP editors, in Friday's meeting we also agreed to move BP3
>>> to the Data Formats section from the Metadata section, which would
>>> make it BP14, right?
>>>
>>> Kind regards,
>>>
>>> Deirdre
>>>
>>> On 19/08/2016 17:39, Phillips, Addison wrote:
>>>> Hi Phil,
>>>>
>>>> Thanks for starting on this. I think the pull request is a good start.
>>>> I have some comments on it.
>>>>
>>>> My main concern is that this BP is really backwards. It recommends to
>>>> "locale parameter metadata" and then says that the simplest way to do
>>>> this is to use locale-neutral formats. The recommendation should be
>>>> more like "use locale-neutral formats or provide locale/language
>>>> information where that's not possible". The pull request captures the
>>>> use of locale-neutral, but doesn't really explain about when to
>>>> provide locale and language information.
>>>>
>>>> I would change this:
>>>>
>>>> --
>>>> <p class="practicedesc">Provide metadata about locale parameters
>>>> (date, time, and number formats, language).</p>
>>>> --
>>>>
>>>> To say:
>>>>
>>>> --
>>>> <p class="practicedesc">Use locale-neutral data structures and
>>>> values, or, where that is not possible, provide metadata about the
>>>> locale used by data values.</p>
>>>> --
>>>>
>>>> I would change:
>>>>
>>>> --
>>>> <p>The simplest method is to use local-neutral representations of the
>>>> actual data, and then add metadata to provide relevant locale
>>>> information. For example, rather than storing "€2000.00" as a string,
>>>> it's strongly preferred to exchange a data structure such as:</p>
>>>> --
>>>>
>>>> To say:
>>>>
>>>> --
>>>> <p>Most common data representations are locale neutral. For example,
>>>> XML Schema types such as xsd:integer and xsd: date are intended for
>>>> locale-neutral data interchange. Using locale-neutral representations
>>>> allows the data values to be processed accurately without complex
>>>> parsing or misinterpretation and also allows the data to be presented
>>>> in the format most comfortable for the consumer of the data. For
>>>> example, rather than storing "€2000,00" as a string, it's strongly
>>>> preferred to exchange a data structure such as:</p>
>>>> --
>>>>
>>>> Also, note the misspelling of "locale-neutral" in the pull request.
>>>>
>>>> I would then go on to add some text about when locale parameters are
>>>> needed. Something like:
>>>>
>>>> --
>>>> Some datasets contain values that are not or cannot be rendered into
>>>> a locale-neutral format. This is particularly true of any natural
>>>> language text values. For each data field that can contain locale
>>>> affected or natural language text, there should be an associated
>>>> language tag used to indicate the language and locale of the data.
>>>> This locale information can be used in parsing the data or to ensure
>>>> proper presentation and processing of the value by the consumer.
>>>> --
>>>>
>>>> (Sorry for not generating a pull request of my own)
>>>>
>>>> Addison
>>>>
>>>>> -----Original Message-----
>>>>> From: Phil Archer [mailto:phila@w3.org]
>>>>> Sent: Friday, August 19, 2016 8:37 AM
>>>>> To: Bernadette Farias Lóscio <bfl@cin.ufpe.br>; Annette Greiner
>>>>> <amgreiner@lbl.gov>
>>>>> Cc: Phillips, Addison <addison@lab126.com>; ishida@w3.org;
>>>>> public-dwbp-comments@w3.org; www International
>>>>> <www-international@w3.org>
>>>>> Subject: Re: [i18n review comment] BP3 should recommend
>>>>> locale-neutral representation #187
>>>>>
>>>>> I took an action on today's call to try and address this in BP3. You
>>>>> can see the results at
>>>>> http://philarcher1.github.io/dwbp/bp.html#LocaleParametersMetadata
>>>>>
>>>>> This uses some of Addison's text directly and highlights the value
>>>>> of the xsd datatypes - but retains enough of the original BP for it
>>>>> to be an amendment rather than a whole new one - I hope.
>>>>>
>>>>> This addresses most of the resolution taken today [1] but I have not
>>>>> moved the BP to the formats section. I leave that to the editors who
>>>>> may want to make further changes - or argue for it to be left where
>>>>> it is, or add references from the formats section or, or, or...
>>>>>
>>>>> I've created the Pull Request https://github.com/w3c/dwbp/pull/447
>>>>>
>>>>> Phil.
>>>>>
>>>>> [1] https://www.w3.org/2016/08/19-dwbp-minutes#resolution02
>>>>>
>>>>> On 15/08/2016 17:28, Bernadette Farias Lóscio wrote:
>>>>>> Dear Ishida,
>>>>>>
>>>>>> This comment [1] is still under discussion [4] and we'd like to ask
>>>>>> your opinion about two of our proposals:
>>>>>>
>>>>>> 1. to include locale-neutral representation ideas as part of BP3
>>>>>> [2], or 2. to include a paragraph at the introduction of Section
>>>>>> 8.8 Data Formats [3] to discuss the relevance of having
>>>>>> local-neutral representations.
>>>>>>
>>>>>> We also discussed the proposal of having a new BP and we agreed
>>>>>> that we won't have a lot of time for a broader review of the new BP
>>>>>> and to collect feedback from the community.
>>>>>>
>>>>>> Thanks a lot!
>>>>>> DWBP editors
>>>>>>
>>>>>> [1] https://lists.w3.org/Archives/Public/public-dwbp-comments/2016Jul/0028.html
>>>>>> [2] http://agreiner.github.io/dwbp/bp.html#LocaleParametersMetadata
>>>>>> [3] https://www.w3.org/TR/dwbp/#dataFormats
>>>>>> [4] https://lists.w3.org/Archives/Public/public-dwbp-wg/2016Aug/0009.html
>>>>>>
>>>>>> 2016-08-04 23:26 GMT+02:00 Annette Greiner <amgreiner@lbl.gov>:
>>>>>>
>>>>>>> Hi Addison,
>>>>>>>
>>>>>>> Thanks for your response, and it does make sense. I think what I
>>>>>>> am still missing is whether there is guidance we can point to as
>>>>>>> to how to represent the "locale-neutral" data so that it can most
>>>>>>> easily be made locale specific by existing tools. You mention
>>>>>>> "pre-made standards for the basic data types". Is there a
>>>>>>> recommended list we could reference?
>>>>>>> Thanks for your help!
>>>>>>> -Annette
>>>>>>>
>>>>>>> On 8/4/16 12:31 PM, Phillips, Addison wrote:
>>>>>>>
>>>>>>>> Hi Annette,
>>>>>>>>
>>>>>>>> Thanks for the note. This is a personal reply not on behalf of
>>>>>>>> the WG.
>>>>>>>>
>>>>>>>> Locale neutral formats are quite common on the Web and the
>>>>>>>> Internet in general. One familiar format referenced by your
>>>>>>>> document, for example, is XML Schema. While the representations
>>>>>>>> of numbers, dates, and the like in XML Schema would be "more
>>>>>>>> appropriate" for some languages/locales than others if given as
>>>>>>>> plain text, what distinguishes them is that they are all machine
>>>>>>>> readable and intended to be read by machines for later processing.
>>>>>>>> The display of values is a separate, local, concern for the
>>>>>>>> data's consumer. This necessarily means choosing specific
>>>>>>>> separators (such as decimal separators) over other, more
>>>>>>>> localized values. Save for "free text"
>>>>>>>> (natural language) data, most data formats are locale neutral and
>>>>>>>> these include things like JSON-LD, XML Schema, CSV, and so forth.
>>>>>>>>
>>>>>>>> Not every possible data structure or data value is, of course,
>>>>>>>> covered fully. For example, in my day job (I work at Amazon), we
>>>>>>>> have many different common measurement units defined internally.
>>>>>>>> To transmit these in a locale-neutral manner, we need to
>>>>>>>> construct our own data schemas and identifiers. There are
>>>>>>>> profoundly many ways to measure shoes, dresses, auto parts, hats,
>>>>>>>> drone propellers, and so forth. But it would be a nightmare to
>>>>>>>> have to deal with localized presentation formats on top of that.
>>>>>>>> But there are pre-made standards for the basic data types and
>>>>>>>> these are what are needed to build almost any data structure
>>>>>>>> necessary for global interchange of data.
>>>>>>>>
>>>>>>>> Does that make sense?
>>>>>>>>
>>>>>>>> Addison
>>>>>>>>
>>>>>>>> Addison Phillips
>>>>>>>> Principal SDE, I18N Architect (Amazon) Chair (W3C I18N WG)
>>>>>>>>
>>>>>>>> Internationalization is not a feature.
>>>>>>>> It is an architecture.
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>>> From: Annette Greiner [mailto:amgreiner@lbl.gov]
>>>>>>>>> Sent: Thursday, August 04, 2016 12:04 PM
>>>>>>>>> To: ishida@w3.org; public-dwbp-comments@w3.org
>>>>>>>>> Cc: www International <www-international@w3.org>
>>>>>>>>> Subject: Re: [i18n review comment] BP3 should recommend
>>>>>>>>> locale-neutral representation #187
>>>>>>>>>
>>>>>>>>> Hello on behalf of the DWBP WG,
>>>>>>>>>
>>>>>>>>> We're interested in pursuing this concept in our best practice
>>>>>>>>> document, but we would like some clarification of the practice
>>>>>>>>> of locale neutrality.
>>>>>>>>> You
>>>>>>>>> mention the variation across locales in decimal symbol, grouping
>>>>>>>>> symbol, number of grouping digits, digit shapes, etc., and you
>>>>>>>>> give an example of a locale-neutral data structure for monetary values.
>>>>>>>>> But this structure alone does not appear to address differences
>>>>>>>>> in decimal symbol, grouping symbol, number of grouping digits,
>>>>>>>>> or digit shapes. It does provide a mechanism to separately
>>>>>>>>> specify the units, and the example uses an ISO-4217 currency
>>>>>>>>> code, both of which we agree are good ideas. Is there a broad
>>>>>>>>> standard (beyond just monetary) for addressing the other
>>>>>>>>> symbol/representation issues you raised that we can address briefly in our best practice?
>>>>>>>>> Do you consider SI units consistent with a locale-neutral approach?
>>>>>>>>> Is there a locale-neutral standard for representing decimal
>>>>>>>>> numbers (perhaps using a period and no grouping, as in your example)?
>>>>>>>>>
>>>>>>>>> -Annette
>>>>>>>>>
>>>>>>>>> On 7/22/16 5:32 AM, ishida@w3.org wrote:
>>>>>>>>>
>>>>>>>>>> [raised by aphillips]
>>>>>>>>>>
>>>>>>>>>> https://www.w3.org/TR/dwbp/#LocaleParametersMetadata
>>>>>>>>>>
>>>>>>>>>> Best practice #3 introduces itself as:
>>>>>>>>>>
>>>>>>>>>> Providing locale parameters helps humans and computer
>>>>>>>>>> applications to work accurately with things like dates,
>>>>>>>>>> currencies and numbers that may look similar but have different
>>>>>>>>>> meanings in different locales.
>>>>>>>>>>
>>>>>>>>>> But the actual best practice is to use **locale-neutral**
>>>>>>>>>> representations that are interpreted/displayed to end-users in
>>>>>>>>>> a locale-appropriate manner. For example, instead of storing
>>>>>>>>>> the string "€2000.00", exchanging a data structure like the
>>>>>>>>>> following is strongly
>>>>>>>>>> preferred:
>>>>>>>>>>
>>>>>>>>>> ```
>>>>>>>>>> "price" {
>>>>>>>>>> "value": 2000.00,
>>>>>>>>>> "currency": "EUR"
>>>>>>>>>> }
>>>>>>>>>> ```
>>>>>>>>>>
>>>>>>>>>> The date examples given are all in xsd:date format, which is an
>>>>>>>>>> excellent example of using a locale-neutral format.
>>>>>>>>>>
>>>>>>>>>> Many things are dependent on locale: decimal symbol, grouping
>>>>>>>>>> symbol, number of grouping digits, digit shapes, etc. It's
>>>>>>>>>> because there can be wide variation (sometimes open to
>>>>>>>>>> misinterpretation) that sending a locale neutral format is preferred for data values.
>>>>>>>>>> Note also btw that the position of the currency symbol is
>>>>>>>>>> dependent on the locale. In France it would be normal to write 2000.00 € rather than €2000.00.
>>>>>>>>>> Same even when talking about USD when using $, ie. 2000.00 $.
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>> Annette Greiner
>>>>>>>>> NERSC Data and Analytics Services Lawrence Berkeley National
>>>>>>>>> Laboratory
>>>>>>>>>
>>>>>>> --
>>>>>>> Annette Greiner
>>>>>>> NERSC Data and Analytics Services
>>>>>>> Lawrence Berkeley National Laboratory
>>>>>>>
>>>>> --
>>>>>
>>>>> Phil Archer
>>>>> W3C Data Activity Lead
>>>>> http://www.w3.org/2013/data/
>>>>>
>>>>> http://philarcher.org
>>>>> +44 (0)7887 767755
>>>>> @philarcher1
>>>
>>
>> --
>>
>> Phil Archer
>> W3C Data Activity Lead
>> http://www.w3.org/2013/data/
>>
>> http://philarcher.org
>> +44 (0)7887 767755
>> @philarcher1
Re: [i18n review comment] BP3 should recommend locale-neutral representation #187
Martin J. Dürst   Sat, 27 Aug 2016 14:44:19 +0900

Received on Saturday, 27 August 2016 05:45:06 UTC

Sent to: amgreiner@lbl.gov, addison@lab126.com
Copied to: phila@w3.org, deirdre@derilinx.com, bfl@cin.ufpe.br, ishida@w3.org, public-dwbp-comments@w3.org, www-international@w3.org.

Hello Annette,

I think the data in your example is locale-neutral, because Unix time is the same in all locales. It's not necessarily the most frequent locale-neutral representation, and it's not human readable, but these are separate issues.

Regards, Martin.

On 2016/08/27 02:31, Annette Greiner wrote:
> Addison, or anyone in i18n that knows,
>
> I have a question about how to make data locale neutral when all the values are of the same format. It seems a hard sell to tell people to expand every datetime value into a value and a format when the format is always the same. For example, if I have taken a terabyte worth of data (not at all uncommon where I work) that consists of pairs of a datetime plus a sensor reading, and the time is always in UNIX format (seconds since January 1, 1970), is it still considered locale-neutral if I simply indicate in the column metadata that the column holds UNIX time values?
>
> UNIX time, sensor reading (mV)
> 1471995721,4.7
> 1471995731,7.5
> 1471995721,6.2
>
> If not, is there a practical way to make it locale neutral without repeating the format for each value? I’m concerned that, for scientific datasets especially, it makes little sense to inflate the size of the dataset with repeated information. For large data, that becomes impractical, which tempts me to suggest that we say in the BP that locale parameters should be used only when a locale-neutral representation is not *practical* (rather than not *possible*), unless my example above would qualify as locale neutral.
> -Annette
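
As a rough illustration of the exchange above (a sketch only, not something proposed in the thread): the format of the time column can be declared once at the column level and applied to every row, so nothing is repeated per value. The `COLUMN_METADATA` dictionary, its key names, and the `parse_rows` helper are hypothetical, made up for this example; they do not come from the DWBP document or from any metadata standard. The three rows are the ones in Annette's example.

```python
import csv
import io
from datetime import datetime, timezone

# Hypothetical column-level metadata, declared once for the whole file.
# The key names are illustrative assumptions, not part of any standard.
COLUMN_METADATA = {
    "time": {"format": "unix-seconds"},
    "reading": {"unit": "mV"},
}

# The three rows from the sensor example in the thread.
RAW = "1471995721,4.7\n1471995731,7.5\n1471995721,6.2\n"


def parse_rows(text, metadata=COLUMN_METADATA):
    """Parse rows, interpreting the time column via the column metadata."""
    if metadata["time"]["format"] != "unix-seconds":
        raise ValueError("this sketch only handles Unix time in seconds")
    rows = []
    for timestamp, reading in csv.reader(io.StringIO(text)):
        # int() and float() read ASCII digits with "." as the decimal
        # separator; they do not consult the process locale.
        moment = datetime.fromtimestamp(int(timestamp), tz=timezone.utc)
        rows.append((moment, float(reading)))
    return rows


rows = parse_rows(RAW)
for moment, reading in rows:
    # ISO 8601 output is itself a locale-neutral text form of the instant.
    print(moment.isoformat(), reading, COLUMN_METADATA["reading"]["unit"])
```

Rendering these values in a locale-appropriate way (month names, digit grouping, and so on) would then be a separate step on the consumer's side, which is the separation of concerns described earlier in the thread.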