Warning:
This wiki has been archived and is now read-only.

Status of comments about the last call working draft

From Data on the Web Best Practices
No. Subject Author Message Comment Proposal Resolution and Implementation
1 Metadata - Structural Ivan Herman https://lists.w3.org/Archives/Public/public-dwbp-comments/2016May/0015.html In example four it is probably a good practice to use datatypes that are as specific as possible to allow for data checks, for example. The one that comes to my mind is the "stop_url"; [2] includes the "anyURI" datatype, it is probably worth using it. There may be other such cases in the listed example. proposal: https://github.com/w3c/dwbp/pull/408/commits/5707d4a5417934021d72c49fd5c189dd159b83ea - inclusion of anyURI datatype https://github.com/w3c/dwbp/pull/408/commits/5707d4a5417934021d72c49fd5c189dd159b83ea

message to the author: https://lists.w3.org/Archives/Public/public-dwbp-comments/2016Jun/0010.html
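The kind of data check that a specific datatype makes possible (comment 1) might look like the following minimal Python sketch. The column description imitates the CSVW `datatype` property; the sample values and the `looks_like_absolute_uri` and `check_column` helpers are hypothetical. The check is deliberately loose and is not a full `anyURI` validation, which would also admit relative references.

```python
from urllib.parse import urlparse

# Column description in the style of CSVW metadata; "anyURI" is the
# specific datatype suggested in the comment (instead of a plain string).
stop_url_column = {"name": "stop_url", "datatype": "anyURI"}


def looks_like_absolute_uri(value):
    """Loose check: the value has both a scheme and a network location."""
    parts = urlparse(value)
    return bool(parts.scheme) and bool(parts.netloc)


def check_column(column, values):
    """Return the values that fail the check implied by the column datatype."""
    if column["datatype"] != "anyURI":
        return []  # this sketch only knows how to check anyURI
    return [v for v in values if not looks_like_absolute_uri(v)]


# Hypothetical sample values for the stop_url column.
bad_values = check_column(
    stop_url_column,
    ["http://example.org/stops/200", "not a url"],
)
print(bad_values)
```

With a generic string datatype, neither value could be flagged automatically.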

2 Metadata - Structural Ivan Herman https://lists.w3.org/Archives/Public/public-dwbp-comments/2016May/0015.html B.t.w., it may be a good idea to refer to the CSVW primer[3] in this context. Maybe it goes too far to include it in this document, but [4] also includes an example on how to incorporate geospatial data into the metadata (which may be more appropriate for the bus stop example); it may be worth at least mentioning it in the text. proposal:

https://github.com/w3c/dwbp/pull/408/commits/16adbb13caeb2793d90df8f247c26c978aa5f6e9 - inclusion of CSV primer reference

https://github.com/w3c/dwbp/pull/418/commits/7a62b3a7f77513dc564ee48591fb289c16069f08

message to the author: https://lists.w3.org/Archives/Public/public-dwbp-comments/2016Jun/0010.html

3 Metadata - Locale parameters Michel Dumontier https://lists.w3.org/Archives/Public/public-dwbp-comments/2016May/0024.html Best practice 3 uses dct:conformsTo, but the range of this is a URI of type dct:Standard, so it should be a URI for the ISO spec. proposal: https://github.com/w3c/dwbp/pull/408/commits/35fe8437dc7ffc6f5dca66ad0ba8da983899d617 https://github.com/w3c/dwbp/pull/408/commits/35fe8437dc7ffc6f5dca66ad0ba8da983899d617

message to the author: https://lists.w3.org/Archives/Public/public-dwbp-comments/2016Jun/0009.html

4 Data Enrichment Michel Dumontier https://lists.w3.org/Archives/Public/public-dwbp-comments/2016May/0024.html One social point: if one wants to follow BP31 (enrich data by generating new data) but is not the original data provider, I'd recommend that they make their contribution public (rather than republishing the whole dataset), with a machine-readable provenance description of the work, and contribute the enrichment back to the original data provider. proposal: https://github.com/w3c/dwbp/commit/d2e60fb062e96524d5efa6a933b2d51597bbdf6c https://github.com/w3c/dwbp/commit/d2e60fb062e96524d5efa6a933b2d51597bbdf6c

message to the author: https://lists.w3.org/Archives/Public/public-dwbp-comments/2016Jun/0009.html

5 Data versioning Andrea Perego https://lists.w3.org/Archives/Public/public-dwbp-comments/2016May/0027.html The issue is about a specific metadata field, namely the date of last modification of a dataset (dct:modified in DCAT).

(see message for more details)

proposal:

https://github.com/w3c/dwbp/pull/408/commits/f18044394f04e8495b427667bff9122f1ee194fb https://github.com/w3c/dwbp/pull/408/commits/2a8089e45d9aece41c68b0f947c270f78d9cb726

https://github.com/w3c/dwbp/pull/408/commits/f18044394f04e8495b427667bff9122f1ee194fb

https://github.com/w3c/dwbp/pull/408/commits/2a8089e45d9aece41c68b0f947c270f78d9cb726

message to the author: https://lists.w3.org/Archives/Public/public-dwbp-comments/2016Jun/0008.html

6 Data access Andrea Perego https://lists.w3.org/Archives/Public/public-dwbp-comments/2016May/0027.html There's a scenario that I'm not sure is addressed, at least explicitly. This concerns data that users must register to access. This is different from data that can be accessed only by authorised users.

(see message for more details)

proposal: https://github.com/w3c/dwbp/pull/409/commits/e2d938a27efe93c523624f1aa8eeae23313ecf6d https://github.com/w3c/dwbp/commit/64d71cb6bd4a3761cf61e3eb8670b834be6d3119

message to the author: https://lists.w3.org/Archives/Public/public-dwbp-comments/2016Jun/0008.html

7 Numerical data Frans Knibbe https://lists.w3.org/Archives/Public/public-dwbp-comments/2016May/0022.html One thing I miss is the advice to use significant figures in numerical data. It is an easy way to make the data match their uncertainty, and in many cases it helps to compact data too. Numerical data with the wrong number of significant digits is a very common problem in geographical data (e.g. geographic coordinates with nanometre precision). proposal: This kind of advice is out of the scope of the DWBP working group. resolution and commit: This kind of advice is out of the scope of the DWBP working group.

message to the author: https://lists.w3.org/Archives/Public/public-dwbp-comments/2016Jun/0012.html
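Although the working group considered this advice out of scope, the idea in comment 7 is easy to illustrate. The following is a minimal Python sketch of rounding to a fixed number of significant figures; the function name and the sample coordinate are hypothetical, and the appropriate number of digits depends on the actual measurement uncertainty.

```python
def round_significant(value, digits):
    """Round value to the given number of significant figures (digits >= 1)."""
    # The "g" format keeps `digits` significant figures; float() parses it back.
    return float(f"{value:.{digits}g}")


# Hypothetical longitude carrying far more digits than its accuracy justifies.
longitude = 4.895167912345678
print(round_significant(longitude, 6))
```

The shortened value is also more compact when serialized, which is the second benefit the comment mentions.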

8 Data Enrichment David (Annette's colleague) https://lists.w3.org/Archives/Public/public-dwbp-comments/2016Jun/0000.html IMO Topic 8.13 is a little too focused on automated methods for "filling in missing values". I like the summary: Enrich your data by generating new data from the raw data when doing so will enhance its value. but the text does not really address the "enhancement of value" part. It also seems weighted toward interpolation of data values as opposed to "generating new data".

Do you think it's worth emphasizing that enrichment should be demonstrable? I see this as a QA issue. Other examples include visual inspection to identify features in spatial data and cross-reference to external databases for demographic information. [ Lastly, generation of new data may be demand-driven, where missing values are calculated or otherwise determined by direct means. Measured application of these techniques informs the degree and direction of data enrichment]

proposal: https://github.com/w3c/dwbp/pull/412/commits/ce1b1a8c03cd1b6017f029ad77f41c86f8f9c86e

https://github.com/w3c/dwbp/pull/412/commits/540ed3b236068858936d9a03d7c8218945f609d7

resolution and commit: the author agreed with the proposal

message to the author: https://lists.w3.org/Archives/Public/public-dwbp-wg/2016Jun/0042.html

9 Scope aphillips https://lists.w3.org/Archives/Public/public-dwbp-comments/2016Jul/0024.html The scope section of the document lists criteria for inclusion in Section 3. The document makes no mention of character encoding later. While an exhaustive survey of best practices such as [Charmod-Norm](https://www.w3.org/TR/charmod-norm) is not desirable here, it is a somewhat fundamental BP to always choose a Unicode encoding. Would this be appropriate to this document? proposal: We are not sure if it is possible to include another BP at this moment. Should we include it as a general guideline in Section 1? https://github.com/w3c/dwbp/commit/e1dcf19f21896be2eab53892fc0657c1974c2620

10 Descriptive Metadata aphillips https://lists.w3.org/Archives/Public/public-dwbp-comments/2016Jul/0025.html In **Example 2**, the media type of the file is given as:

> dcat:mediaType "text/csv" ;

The media type should include the charset parameter, e.g. "text/csv;charset=UTF-8" since the default for text/* is ASCII and since UTF-8 should be preferred.

proposal: We will change it on Github. https://github.com/w3c/dwbp/commit/ce7c3fc30fa08b5804151333e408da3cde3a3f78

https://github.com/w3c/dwbp/commit/706e98a9004cc872f20026c8c3ac5b35cc069b8d

message to the author: https://lists.w3.org/Archives/Public/public-dwbp-comments/2016Aug/0003.html
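To show why the charset parameter discussed in comment 10 matters to a consumer, here is a minimal Python sketch that reads the parameter from a media type string and uses it to decode the bytes. It relies on the standard library's `email.message.Message` for parameter parsing; the helper names and the ASCII fallback are assumptions made for illustration.

```python
from email.message import Message


def split_media_type(media_type):
    """Return (type/subtype, charset or None) for a media type string."""
    msg = Message()
    msg["Content-Type"] = media_type
    # get_content_charset() returns the charset lowercased, or None if absent.
    return msg.get_content_type(), msg.get_content_charset()


def decode_csv_bytes(raw, media_type, fallback="ascii"):
    """Decode raw bytes using the declared charset, else the assumed fallback."""
    _, charset = split_media_type(media_type)
    return raw.decode(charset or fallback)


raw = "name\ncafé\n".encode("utf-8")
print(split_media_type("text/csv;charset=UTF-8"))
print(decode_csv_bytes(raw, "text/csv;charset=UTF-8"))
```

Without the parameter, the same bytes would be decoded with the fallback and the non-ASCII character would raise a `UnicodeDecodeError`.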

11 Locale parameters Metadata aphillips https://lists.w3.org/Archives/Public/public-dwbp-comments/2016Jul/0026.html In BP3 the word 'locale' is misspelled once as 'local', which could be confusing:

> Check if the metadata for the dataset itself includes information about local parameters (i.e. data, time, number formats, and language) in a human-readable format.

proposal: We will change it on Github. https://github.com/w3c/dwbp/commit/9234a97853270d0d506055394d01f63cb1a02366

message to the author: https://lists.w3.org/Archives/Public/public-dwbp-comments/2016Aug/0001.html

12 Data formats aphillips https://lists.w3.org/Archives/Public/public-dwbp-comments/2016Jul/0027.html Most machine-readable standardized formats also happen to be locale-neutral (by design, because it is a best practice). Mention that fact as one of the benefits here? proposal: Should we include a new benefit or just mention that fact in the section's introduction? https://github.com/w3c/dwbp/commit/d2eb11eab9751f56c6bfab2a4c6d07b92c3d08aa

https://github.com/w3c/dwbp/commit/69c1debe0a2064b8605052c73d4fe3090a9318f7

message to the author: https://lists.w3.org/Archives/Public/public-dwbp-comments/2016Aug/0004.html

13 Locale parameters Metadata aphillips https://lists.w3.org/Archives/Public/public-dwbp-comments/2016Jul/0028.html Best practice #3 introduces itself as:

> Providing locale parameters helps humans and computer applications to work accurately with things like dates, currencies and numbers that may look similar but have different meanings in different locales.

But the actual best practice is to use **locale-neutral** representations that are interpreted/displayed to end-users in a locale-appropriate manner. For example, instead of storing the string "€2000.00", exchanging a data structure like the following is strongly preferred:

```
"price": {
  "value": 2000.00,
  "currency": "EUR"
}
```

The date examples given are all in xsd:date format, which is an excellent example of using a locale-neutral format.

Many things are dependent on locale: decimal symbol, grouping symbol, number of grouping digits, digit shapes, etc. It's because there can be wide variation (sometimes open to misinterpretation) that sending a locale-neutral format is preferred for data values. Note also that the position of the currency symbol is dependent on the locale. In France it would be normal to write 2000.00 € rather than €2000.00. The same applies even to USD when using $, i.e. 2000.00 $.

proposal: Annette will contact the I18N WG to ask more details about this issue. https://github.com/w3c/dwbp/pull/447/commits/4646b69b39f2d0080e8aa075749678f8e4b726e7

https://github.com/w3c/dwbp/pull/447/commits/0843bfd664a5a6c3b7af71ef62700a16311528e3 https://github.com/w3c/dwbp/commit/e1dcf19f21896be2eab53892fc0657c1974c2620
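The approach described in comment 13 (exchange a locale-neutral value, apply locale conventions only at display time) can be sketched in Python as follows. The `DISPLAY_RULES` table, the `CURRENCY_SYMBOLS` map and `format_price` are hypothetical and hand-written for illustration; a real application would take such rules from CLDR data rather than hard-coding them.

```python
from decimal import Decimal

# Hypothetical, simplified display rules for two locales (not CLDR data).
DISPLAY_RULES = {
    "en-US": {"decimal": ".", "group": ",", "symbol_first": True},
    "fr-FR": {"decimal": ",", "group": " ", "symbol_first": False},
}
CURRENCY_SYMBOLS = {"EUR": "\u20ac", "USD": "$"}


def format_price(price, locale_tag):
    """Render a locale-neutral price record for the given display locale."""
    rules = DISPLAY_RULES[locale_tag]
    amount = Decimal(str(price["value"])).quantize(Decimal("0.01"))
    # Format with neutral separators first, then swap in the locale's own.
    whole, fraction = f"{amount:,.2f}".split(".")
    number = whole.replace(",", rules["group"]) + rules["decimal"] + fraction
    symbol = CURRENCY_SYMBOLS[price["currency"]]
    if rules["symbol_first"]:
        return f"{symbol}{number}"
    return f"{number} {symbol}"


# The exchanged data stays the same regardless of who reads it.
price = {"value": 2000.00, "currency": "EUR"}
print(format_price(price, "en-US"))
print(format_price(price, "fr-FR"))
```

The stored record never changes; only the rendering differs, which avoids having to parse strings such as "€2000.00" back into numbers.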

14 DUV aphillips & fsasaki https://lists.w3.org/Archives/Public/public-dwbp-comments/2016Jul/0029.html https://www.w3.org/TR/2016/WD-vocab-duv-20160519/#class-usage

In the section above, there are a number of natural language field types that should have language and direction metadata associated with them.

proposal: To change DUV. resolution and commit
15 Locale parameters metadata aphillips https://lists.w3.org/Archives/Public/public-dwbp-comments/2016Jul/0030.html Best Practice 3 includes as an example the use of `dct:language` to indicate locale and refers loosely to the need to indicate the language and locale of data values. The standards that embody locale and language identification on the Web (and on the Internet more generally) are IETF BCP 47 [RFC5646/RFC4647](https://tools.ietf.org/html/bcp47) and [CLDR](http://cldr.unicode.org).

The I18N WG recognizes that these standards do not have a linked data representation currently, but the current representations given as examples in this document are incomplete and have a variety of limitations or problems. This is recognized, for example, by the fact that Dublin Core's language element was defined to reference RFC3066, which was the current BCP47 when that standard was published. However, BCP 47 has been updated since and the current formulation, while fully compatible with RFC3066, is the preferred reference.

The WG feels that BP3 should include a reference or recommendation to consistently use BCP47 as the standard for language and locale identification and, informatively, to CLDR as the source for both representing specific localized formats and as a reference for specific locale data values.

Please note that this is in addition to the need to recommend locale-neutral representations[1].

proposal: To change BP 3. We need some help to make this change. https://github.com/w3c/dwbp/pull/447/commits/4646b69b39f2d0080e8aa075749678f8e4b726e7

https://github.com/w3c/dwbp/pull/447/commits/0843bfd664a5a6c3b7af71ef62700a16311528e3

message to the author: https://lists.w3.org/Archives/Public/public-dwbp-comments/2016Aug/0002.html
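As a rough illustration of the consistent use of BCP 47 tags that comment 15 asks for, the following Python sketch checks a tag against a simplified pattern and applies the conventional casing. It is an assumption-laden simplification: only the `language[-Script][-REGION]` shape is covered, and variants, extensions, private-use and grandfathered tags from full BCP 47 are deliberately ignored. The function names are hypothetical.

```python
import re

# Simplified shape only: 2-3 letter language, optional 4-letter script,
# optional 2-letter or 3-digit region. Not the full BCP 47 grammar.
SIMPLE_TAG = re.compile(
    r"[a-z]{2,3}(-[a-z]{4})?(-([a-z]{2}|[0-9]{3}))?",
    re.IGNORECASE,
)


def is_simple_language_tag(tag):
    """True if the whole tag matches the simplified pattern above."""
    return SIMPLE_TAG.fullmatch(tag) is not None


def conventional_case(tag):
    """Apply conventional casing: language lower, Script title, REGION upper."""
    parts = tag.split("-")
    cased = [parts[0].lower()]
    for part in parts[1:]:
        cased.append(part.title() if len(part) == 4 else part.upper())
    return "-".join(cased)


for candidate in ["pt-BR", "zh-Hant-TW", "es-419", "en_US"]:
    print(candidate, is_simple_language_tag(candidate))
```

Tags are case-insensitive, so the casing step is cosmetic; the underscore form `en_US` is rejected because BCP 47 uses hyphens as subtag separators.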

16 Locale parameters metadata aphillips https://lists.w3.org/Archives/Public/public-dwbp-comments/2016Jul/0031.html BP3 covers locale-formats but does not suggest the need to tag natural language text with a language tag and with base direction metadata. Natural language can be indicated via the existing RDF representation. However, there currently exists no good mechanism to indicate base direction. This is seen as a gap in formats such as JSON-LD for which no good generalized solution exists. However, it is an unsolved problem that should be acknowledged (and which might be overcome via other means, such as providing Unicode controls or adding field-specific metadata to document formats). proposal: Annette will prepare a text to be included in the DWBP doc. https://github.com/w3c/dwbp/pull/447/commits/4646b69b39f2d0080e8aa075749678f8e4b726e7

https://github.com/w3c/dwbp/pull/447/commits/0843bfd664a5a6c3b7af71ef62700a16311528e3
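The two workarounds mentioned in comment 16 (field-specific metadata and Unicode controls) can be sketched together in Python. The `LocalizedText` class and its field names are hypothetical, not part of any format discussed above; the control characters are the Unicode directional isolates (U+2066 LRI, U+2067 RLI, U+2069 PDI).

```python
from dataclasses import dataclass

# Unicode directional isolate controls.
LRI, RLI, PDI = "\u2066", "\u2067", "\u2069"


@dataclass
class LocalizedText:
    """A natural-language string with explicit language and base direction."""
    value: str
    lang: str       # BCP 47 language tag
    direction: str  # "ltr" or "rtl"


def with_isolate_controls(text):
    """Wrap the string in isolate controls matching its base direction."""
    opener = RLI if text.direction == "rtl" else LRI
    return f"{opener}{text.value}{PDI}"


title = LocalizedText(value="\u0645\u0631\u062d\u0628\u0627", lang="ar", direction="rtl")
print(repr(with_isolate_controls(title)))
```

Keeping the direction as separate metadata leaves the stored string untouched, while the wrapped form is a fallback for formats that cannot carry such metadata.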

17 Vocabularies aphillips https://lists.w3.org/Archives/Public/public-dwbp-comments/2016Jul/0032.html https://www.w3.org/TR/dwbp/#ReuseVocabularies

Example 15 contains this citation:

> The Library of Congress publishes lists of ISO 639 country codes as Linked Data (see [ISO639-1-LOC] for two-letter codes):

ISO 639-1 is a list of language codes, not country codes. The standard for country codes is ISO3166-1. Please either change the reference to ISO3166-1 (the change I18N WG would prefer) or change the text to say "language codes".

proposal: We will change on github. We should validate the update with Antoine. https://github.com/w3c/dwbp/commit/157713fbdd16092eb9a9967984c277607167efe4

message to the author: https://lists.w3.org/Archives/Public/public-dwbp-comments/2016Aug/0005.html

18 Metadata fsasaki https://lists.w3.org/Archives/Public/public-dwbp-comments/2016Jul/0033.html Section 9.2 Metadata

http://www.w3.org/TR/2015/WD-dwbp-20150625/#metadata

The section says "Best Practice 1: Provide metadata": "Metadata must be provided for both human users and computer applications"

This best practice does not talk about metadata in multiple languages. One should require that metadata is provided in the language of the user at least.

The same comment holds for "Best Practice 2: Provide descriptive metadata" in the same section, and in section 9.14 on enrichment http://www.w3.org/TR/2015/WD-dwbp-20150625/#enrichment

There is "Best Practice 3: Provide locale parameters metadata … Information about locale parameters (date, time, and number formats, language) should be described by metadata." but this best practice does not talk about metadata in multiple languages.

proposal: We will change on github. Should we include a general guideline in the introduction of Metadata Section? https://github.com/w3c/dwbp/commit/95d6df13f67da21beb2f8b8ab1aa211d0600dea3

message to the author: https://lists.w3.org/Archives/Public/public-dwbp-comments/2016Aug/0000.html