Data Usage Vocabulary Meetings
From Data on the Web Best Practices
May 20, 2014 Kickoff Meeting
Attendees: Bernadette Loscio, Eric Stephan
- On May 16, 2014, at the DWBP Group telecon, Bernadette and Eric agreed to be co-editors.
- Data Usage Vocabulary documents will be hosted at: https://github.com/w3c/dwbp/
- Bernadette is already working with her students on aspects of data usage.
During our meeting, we discussed different aspects of the data usage vocabulary:
- Leveraging the existing W3C PROV vocabulary.
- The vocabulary is the culmination of over a decade of data provenance research.
- It was developed by an international team of researchers to track data lineage, event history, and human interaction with data and systems.
- Provenance is concerned only with the past tense (what happened); data usage also covers the present tense (what you can do) and the future tense (what is possible).
- Since the focus is “Data on the Web”, we want to concentrate on using distributed datasets.
- Datasets are format independent.
- It doesn’t matter whether it is RDF or XML; it’s just a “collection of data”.
- Think more in terms of mathematical representations: Graph, Tree, Table
- Two interesting interviews with Peter Buneman on the role of mathematics and data modeling
- Possible operations on such structures.
- Providing usage design patterns
- Identify usage patterns and illustrate them with examples, much as design patterns are used in software engineering.
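The point about thinking in terms of Graph, Tree, and Table rather than a specific format can be illustrated with a small sketch. It is only an assumption about what was meant: the sample values, the `rowN` identifiers, and the helper names `table_to_tree` and `table_to_graph` are hypothetical and are not part of any vocabulary discussed in the meeting.

```python
from collections import defaultdict

# Table: a list of rows with the same columns (illustrative values only).
table = [
    {"station": "A", "day": 1, "temp": 20},
    {"station": "A", "day": 2, "temp": 22},
    {"station": "B", "day": 1, "temp": 18},
]


def table_to_tree(rows, key):
    """Group rows under one column to get a two-level tree."""
    tree = defaultdict(list)
    for row in rows:
        child = {k: v for k, v in row.items() if k != key}
        tree[row[key]].append(child)
    return dict(tree)


def table_to_graph(rows):
    """Turn each cell into a (subject, predicate, object) edge."""
    edges = set()
    for i, row in enumerate(rows):
        for column, value in row.items():
            edges.add((f"row{i}", column, value))
    return edges


def mean(values):
    values = list(values)
    return sum(values) / len(values)


tree = table_to_tree(table, "station")
graph = table_to_graph(table)

# One operation (average temperature) expressed against each structure.
mean_from_table = mean(r["temp"] for r in table)
mean_from_tree = mean(c["temp"] for kids in tree.values() for c in kids)
mean_from_graph = mean(o for (_, p, o) in graph if p == "temp")
```

The same operation gives the same answer on all three structures, which is the sense in which the dataset is just a "collection of data" regardless of its serialization.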
June 3, 2014 Meeting
We should consider two types of datasets: a general concept of a dataset, and one for specific structures. How can this abstract model be mapped to specific implementation representations (JSON, RDF, etc.)?
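One way to picture the mapping question is a single abstract description that is serialized to more than one concrete representation. The sketch below is an illustration only: the class `AbstractDataset`, its fields, and the `example.org` IRI are assumptions, and the N-Triples escaping handles only backslashes and double quotes. The `dcat:Dataset` and `dcterms:title` IRIs are existing terms, used here without implying that the group chose them.

```python
import json
from dataclasses import dataclass

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DCAT_DATASET = "http://www.w3.org/ns/dcat#Dataset"
DCT_TITLE = "http://purl.org/dc/terms/title"


@dataclass
class AbstractDataset:
    """Hypothetical general concept of a dataset, independent of format."""
    iri: str
    title: str

    def to_json(self):
        """Serialize the abstract description as a JSON object."""
        return json.dumps({"id": self.iri, "title": self.title}, sort_keys=True)

    def to_ntriples(self):
        """Serialize the same description as two N-Triples statements."""
        # Minimal escaping: backslash and double quote only.
        literal = self.title.replace("\\", "\\\\").replace('"', '\\"')
        lines = [
            f"<{self.iri}> <{RDF_TYPE}> <{DCAT_DATASET}> .",
            f'<{self.iri}> <{DCT_TITLE}> "{literal}" .',
        ]
        return "\n".join(lines)


# Hypothetical dataset used only for illustration.
ds = AbstractDataset("http://example.org/dataset/1", "Bus stops")
print(ds.to_json())
print(ds.to_ntriples())
```

Both outputs are derived from the same object, so the abstract model stays the single source of truth and each representation is only a view of it.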
Topics for Data Usage:
- Defining data processing steps (PROV, or new information)
- Datasets composed of many datasets
- Datasets defining discoverability of applications that a user can leverage for the dataset.
- The feedback relationship among data usage, data publishers, and data consumers.
- Is this an overlap with the data quality vocabulary? Is there anything not considered feedback we should represent?
- Reproducibility and repeatability are aspects that should already be covered by the topics above.
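As a rough illustration of the first topic, data processing steps could be recorded with the existing PROV properties `prov:used` and `prov:wasGeneratedBy`, and the recorded history then walked backwards to find what a dataset was produced from. This is a sketch under assumptions: the `ex:` identifiers, the sample history, and the function `sources_of` are hypothetical, and plain tuples stand in for a real RDF store.

```python
PROV = "http://www.w3.org/ns/prov#"
USED = PROV + "used"
GENERATED_BY = PROV + "wasGeneratedBy"

# Hypothetical processing history, as (subject, predicate, object) triples.
triples = [
    ("ex:cleaned", GENERATED_BY, "ex:cleaningStep"),
    ("ex:cleaningStep", USED, "ex:raw"),
    ("ex:report", GENERATED_BY, "ex:aggregationStep"),
    ("ex:aggregationStep", USED, "ex:cleaned"),
    ("ex:aggregationStep", USED, "ex:lookup"),
]


def sources_of(dataset, triples):
    """Return every dataset that `dataset` was directly or indirectly produced from."""
    found = set()
    pending = [dataset]
    while pending:
        current = pending.pop()
        for s, p, activity in triples:
            if s == current and p == GENERATED_BY:
                # Collect everything the generating step consumed.
                for s2, p2, used in triples:
                    if s2 == activity and p2 == USED and used not in found:
                        found.add(used)
                        pending.append(used)
    return found


lineage = sources_of("ex:report", triples)
print(sorted(lineage))
```

A record like this covers only the past-tense view; it is a starting point for repeatability (which steps and inputs to rerun), not a description of what a consumer may do with the data next.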
Path forward for first week of June 2014:
- Define what data usage means to us that can be the foundation of our work.
- What are the intersections between the data quality vocabulary and the best practices document?
- Familiarize ourselves with use cases from the DWBP and CSV working groups. Find linkages between data usage and these use cases.
- Follow up with those who might be interested in contributing to the vocabulary.
- Put these notes on the DWBP data usage notes wiki.