Warning:
This wiki has been archived and is now read-only.

Best practices notes

From Data on the Web Best Practices

notes on best practices

URI design and management for persistence

[See this page of working group notes https://www.w3.org/2013/dwbp/wiki/Data_on_the_Web_URI_Best_Practices]

From Andy Mabbett's talk on OpenStreetMap at our Face-to-face in London: // This may be too much for this section

OpenStreetMap tags essentially form triples (attributes of nodes), although they are not expressed that way.
Reference external sites
- For example, openplaques_plaque = 1563. The site is not referenced with a URL, though they'd like it to be: http://openplaques.org/plaques/1563
- english_heritage_ref 468884. The URL could go to a text-based article or images.
  - key:english_heritage_ref isn't defined. The community may have minted their own key (english_heritage_number, for example). Some editing tools, but not all, will autosuggest existing keys, and users can ignore the suggestions.
- They do include the URL of a website, but not necessarily structured data.
- Store the Wikidata value
  - key:wikidata isn't documented. It's still new, and the community hasn't agreed that it should be used. Small community and slow consensus process.
- Store the reference terms needed to build a Wikipedia page URL, but don't explicitly build it because
  - a) it's long,
  - b) there is more than one Wikipedia URL per article (http, https, etc.)
People are reluctant; they don't understand why they should. Volunteers think entering the number into a table is enough. We could run a script to replace the data with a URL, or we could make the fact that the data is part of a URL clearer on the (human readable) wiki.
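The "run a script to replace the data with a URL" idea could look roughly like the sketch below. It is only an illustration: the function names and the `URL_TEMPLATES` table are hypothetical, and the only URL pattern taken from these notes is the Open Plaques one. Keys without a known pattern are left alone.

```python
from typing import Dict, Optional

# URL templates keyed by OSM tag key. Only the Open Plaques pattern appears
# in the notes; any other entry would need confirming with the target site.
URL_TEMPLATES: Dict[str, str] = {
    "openplaques_plaque": "http://openplaques.org/plaques/{value}",
}


def tag_to_url(key: str, value: str) -> Optional[str]:
    """Return a URL for a tag value, or None when no template is known."""
    template = URL_TEMPLATES.get(key)
    if template is None:
        return None
    return template.format(value=value.strip())


def link_tags(tags: Dict[str, str]) -> Dict[str, str]:
    """Map each tag key that has a known template to its expanded URL."""
    urls = {}
    for key, value in tags.items():
        url = tag_to_url(key, value)
        if url is not None:
            urls[key] = url
    return urls
```

For example, `link_tags({"openplaques_plaque": "1563", "ref": "0141"})` would yield a URL for the plaque only, since nothing in the table says where a bare `ref` should point.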
Subkeys: wikipedia:architect. or wikidata:architect.
Community conflicts: members in different parts of the world will mint different terms meaning the same thing. They slowly come together and conflict.
Unique reference number per node (in this case, for the way) is in the URL. http://openstreetmap.org/way/32892093
Tree (node 2642471054). Attribute: Ref = 0141. Whose ref is that? In this case, it's a local council reference. No URL can be made there; there is nothing to link to.
When a node is replaced with a new node or new data, there is no way to know what happened to the old node. (Building destroyed? Data was just there in error?) The old one just returns an HTTP 200 (OK) response, which gives no indication of what happened.
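To make the "tags essentially form triples" observation concrete, here is a minimal sketch that writes an element's tags as (subject, predicate, object) tuples. It assumes the element URL follows the same pattern as the way URL quoted above, and it uses the raw tag key as the predicate; both function names are hypothetical, and OpenStreetMap itself does not express tags this way.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def osm_url(element_type: str, element_id: int) -> str:
    """Build the element URL, following the pattern of the way URL in the notes."""
    return f"http://openstreetmap.org/{element_type}/{element_id}"


def tags_to_triples(element_type: str, element_id: int,
                    tags: Dict[str, str]) -> List[Triple]:
    """Express each tag as a triple: element URL, tag key, tag value."""
    subject = osm_url(element_type, element_id)
    # Sort by key so the output order is stable.
    return [(subject, key, value) for key, value in sorted(tags.items())]
```

Calling `tags_to_triples("node", 2642471054, {"ref": "0141"})` gives one triple whose object is still the bare string `0141`, which illustrates the problem in the tree example: the value has no URL form.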

use of core vocabularies to improve interoperability

Day 1: A discussion was held on whether or not vocabularies should be standardized. Challenges for vocabularies include:

common vocabularies are not used
should vocabularies be open?
what about licensing constraints on vocabularies?
- ISO, Apache, and open licenses were used as examples.
- Vocabularies can be licensed and used.

We need a mechanism to identify new vocabularies:
- Linked Open Vocabularies http://lov.okfn.org/dataset/lov/
- Web Schemas

Standardized and referenced vocabularies were discussed:

- If reference vocabularies are not available or suitable, try to:
  - extend existing vocabularies,
  - propose creating a vocabulary in a cooperative setting, or
  - create your own.

RESOLVED DAY 1:

Existing reference vocabularies should be reused where possible (resolved)
Reference vocabularies should be shared in an open way (resolved)

guidance on the provision of metadata

Day 1 discussion on the requirements for metadata:

The initial focus was on metadata being machine readable.
This requirement was expanded to metadata being both machine and human readable.
A question was raised whether human readable and machine readable should be separate discussions.
Use cases for metadata that perhaps have not been considered thus far include:
- Streaming data: XSLT 3.0 includes transformations for streaming data; see http://www.w3.org/TR/xslt-30/
- Measurements from traffic, pollution etc.

Another question was whether machine readable meant not “natural language”, using PDF as an example.
- PDF on its own is only usable by humans. If your PDF includes tables, you could use metadata to explain them. One recommendation was that data scraped from a PDF should have metadata.
- The following use case was identified:
  - User A publishes a PDF file. User B reads the PDF file over the next 3 weeks and decides to create a table based on the PDF. You need metadata to refer back to the original source that you generated the table from.
  - From the CSVW Working Group: if User B scrapes a table from the PDF file, the PDF file is considered an external source, such as a database. The concern of the CSVW Working Group is the form in which it is parsed and how the resulting tabular data is organized. From the Data Activity perspective, this is an interesting use case because the DWBP Working Group could present the best practices for using metadata to associate the derived table with the original PDF file.
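As a rough illustration of the PDF use case above, the sketch below builds a small machine-readable record that points a derived table back at the PDF it came from. The key names borrow the Dublin Core terms `source` and `created`; the `dc:` prefix, the `url` key, the function names, and the example.org addresses are all assumptions made for this example, not something the working group agreed on.

```python
import json
from datetime import date
from typing import Dict


def derived_table_metadata(table_url: str, source_url: str,
                           derived_on: date) -> Dict[str, str]:
    """Describe a derived table and the original document it was scraped from."""
    return {
        "url": table_url,                      # where the derived table lives
        "dc:source": source_url,               # the original PDF (assumed key name)
        "dc:created": derived_on.isoformat(),  # when the table was derived
    }


def to_json(metadata: Dict[str, str]) -> str:
    """Serialize the record so that machines, and patient humans, can read it."""
    return json.dumps(metadata, indent=2, sort_keys=True)


record = derived_table_metadata(
    "http://example.org/table.csv",   # hypothetical derived table
    "http://example.org/report.pdf",  # hypothetical original PDF
    date(2014, 1, 1),                 # placeholder date
)
print(to_json(record))
```

Whatever vocabulary is eventually chosen, the point of the record is the same: anyone who finds the table can follow the source link back to User A's PDF.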

Discussion on metadata continued about the value of metadata being understood, described, and well defined.
- Dublin Core was given as an example of metadata that is understood by some users and difficult for others. Having sufficient documentation makes metadata more understandable by intended audiences.
- Some useful links on this topic include:

RESOLVED DAY 1:

There should be metadata (resolved)
Metadata vocabulary, or values if vocabulary is not standardised, should be well-documented (resolved)

publishing and accessing versions of datasets

making controlled vocabularies accessible as URI sets

technical factors for consideration when choosing data sets for publication

technical factors affecting potential use of open data for innovation, efficiency and commercial exploitation

data preservation

Retrieved from "https://www.w3.org/2013/dwbp/wiki/index.php?title=Best_practices_notes&oldid=593"
