Dataset usage vocab workspace

From Data on the Web Best Practices

Purpose

This web page provides background material and documents the process used to create the Dataset Usage Vocabulary. The work on the vocabulary in 2015 is in some ways very different from the work done in 2014. Nonetheless, it is important to capture the discussions and ideas shared.


Early Work (2013-2014)

A wiki page was established to capture some of the early work that came out of the London Face to Face meeting. Breakout sessions were used to explore what data usage means. This work coincided with the development of the use case requirements, so many of the initial requirements for the vocabulary were still being formulated. Part of the initial struggle with the data usage vocabulary was separating the traditional, processing-oriented view of data usage (web service chains, workflows, software pipelines) from a more person-centered vocabulary describing how consumers use, access, discuss, and share information relating to data.

Data usage notes

Data usage schedule

Initial Data Usage Model April 2015

File:DataUsageVocabulary04122015.pdf

This Dataset Usage Model was first introduced at the F2F3 Meeting April 2015. The model is color coded to denote action items coming out of the F2F meeting.

Dataset Usage Vocabulary Model version 0.0 Introduced at the F2F3 Meeting April 2015

Post F2F3 Dataset Usage Vocabulary Discussions

Following the F2F3 Meeting, a number of email threads covered a variety of topics. This section attempts to capture the comments, observations, and concerns expressed over the past month. Input from the working group has been taken into account for the dataset usage vocabulary. The major themes were: citations, the relationship between DQG and DU, dataset reuse, dataset consumers/producers, usage annotation, feedback versus usage, and illustrative examples we could begin to use.

Relationship Between the Dataset Usage Vocab and Data Quality Vocab

  • Mail List Documented Discussion
  • Decision: The Quality and Granularity Vocabulary will support objective metrics, subjective metrics, and qualified opinions.
  • Decision: The DQGV will not support activities such as “rating the raters”.
  • Decision: The DUV will support objective metrics.
  • Decision: The DQGV will support subjective metrics.
  • Decision: There could be some synergistic activities between DQGV and DUV for opinions.

Discussions on Dataset Citations

At the F2F3 Meeting the HCLS document was reviewed for possible referenced citation vocabularies. The following decisions and discussions were held:

Feedback Models

  • At the F2F3 Meeting the Semantically-Interlinked Online Communities (SIOC) ontology specification was suggested as a means of modeling feedback and discussions about the dataset. SIOC is an extensive vocabulary that captures detailed information about discussions, but it was felt to be too heavily oriented toward chat room paradigms.
  • Inquiries to a related SIOC Google chat group brought about a suggestion to also look at a lighter-weight Review Vocab. The Review Vocab provided simple classes that the vocabulary could leverage.
  • In addition to the Review Vocab classes, significant discussions were held by the group about using the Annotation Model.
  • Decision: Use classes from the Review Vocab for a lightweight way to describe feedback related concepts.
  • Decision: Inherit from the Annotation Model class oa:Annotation and its properties.
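
The two decisions above can be illustrated with a minimal sketch. It is not the vocabulary the group adopted: the `ex:` namespace, the class name `UsageFeedback`, the function `feedback_triples`, and the sample dataset IRI are all hypothetical and exist only for illustration. The sketch assumes the Open Annotation terms `oa:Annotation` and `oa:hasTarget` and the Review Vocab terms `rev:Review`, `rev:rating`, and `rev:text`, and it represents triples as plain Python tuples rather than using an RDF library.

```python
# OA and REV are assumed to be the Open Annotation and Review Vocab namespaces.
OA = "http://www.w3.org/ns/oa#"
REV = "http://purl.org/stuff/rev#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
# Hypothetical namespace, used only for this sketch.
EX = "http://example.org/duv-sketch#"


def feedback_triples(feedback_id, dataset_iri, rating, text):
    """Return (subject, predicate, object) triples for one piece of feedback."""
    subject = EX + feedback_id
    return [
        # Hypothetical feedback class, declared as a kind of oa:Annotation.
        (EX + "UsageFeedback", RDFS + "subClassOf", OA + "Annotation"),
        # The feedback instance is typed with the sketch class and rev:Review.
        (subject, RDF + "type", EX + "UsageFeedback"),
        (subject, RDF + "type", REV + "Review"),
        # The annotation target is the dataset being commented on.
        (subject, OA + "hasTarget", dataset_iri),
        # Lightweight feedback content borrowed from the Review Vocab.
        (subject, REV + "rating", str(rating)),
        (subject, REV + "text", text),
    ]


triples = feedback_triples(
    "fb1",
    "http://example.org/dataset/bus-stops",  # hypothetical dataset IRI
    4,
    "Useful, but stop names are inconsistent.",
)
for s, p, o in triples:
    print(s, p, o)
```

Typing the feedback as both an annotation subclass and a review keeps the annotation target and the review content on one resource; whether the final model combines them this way is not stated on this page.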

Examples of Dataset Usage

May 09 2015 Model

Based on the discussions a new model was introduced.

Duv-model05082015.png

São Paulo Meeting

The diagram below attempts to combine the DUV and DQV vocabs. Classes 'on the edge' of both models have been elided to focus on the areas of common interest.

Bothvocabs.png

Post F2F São Paulo DUV

New thoughts on citations

Housekeeping

Issues Tagged Incorrectly