Use Cases Document Outline

From Data on the Web Best Practices

Headings

Abstract

Status of this document

Table of contents

1. Introduction

The aim of this document is to present concrete use cases of publishing and using data on the Web. The use cases are provided as a means to guide the definition of the set of best practices for Data on the Web, as envisioned by the Data on the Web Best Practices Working Group.

2. Terminology

(maybe we can reuse the terminology used in other working groups)

Data on the Web
Publisher
Consumer
Metadata
...

3. Use cases

3.1 Original use cases

3.2 General use cases

Data publication scenario

Data usage scenario

4. Challenges (or Data on the Web Dimensions)

...for each challenge (dimension) we provide a brief description based on the description on Google Drive. The challenges below are the ones from Google Drive.

Metadata
Static/Real-time (or Real-time data access)
Privacy/security
Archiving
Provenance
Licenses
Granularity
Quality
Formats
Vocabularies
APIs (or Data access)
URIs (or Identification)
Industry-reuse
Usability (or Usage)
Feedback (instead of Processing)
Policy (together with Licenses?)
Data selection

I'm not sure if the following challenges are in the scope of the best practices or the vocabularies. What do you think?

Tools
Skills/Expertise
Revenue

5. Requirements

The use cases may give rise to some requirements. In the case of the best practices, those requirements may specify required best practices.

Metadata
- RM1: There should be metadata
- reference to supporting use case (“Motivation:”, list)
- RM2: Metadata should be machine-readable
- reference to supporting use case (“Motivation:”, list)
- RM3: Metadata vocabulary, or values if vocabulary is not standardised, should be well-documented
- reference to supporting use case (“Motivation:”, list)
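
As an illustration of RM2, a minimal sketch of a machine-readable metadata record serialised as JSON. The field names (`title`, `publisher`, `license`, and so on) and the sample values are assumptions made for this example only; they are not taken from any vocabulary agreed by the working group.

```python
import json


def describe_dataset(title, description, publisher, license_url, keywords):
    """Return a JSON metadata record for a dataset.

    The field names are hypothetical and chosen only for illustration.
    """
    record = {
        "title": title,
        "description": description,
        "publisher": publisher,
        "license": license_url,
        "keywords": sorted(keywords),
    }
    # sort_keys gives a stable, diff-friendly serialisation.
    return json.dumps(record, indent=2, sort_keys=True)


# Hypothetical dataset used only to exercise the function.
metadata_json = describe_dataset(
    title="City rainfall measurements",
    description="Daily rainfall totals per weather station.",
    publisher="Example City Council",
    license_url="https://example.org/licenses/open",
    keywords=["weather", "rainfall"],
)
print(metadata_json)
```

A consumer can load the record with `json.loads` and read individual fields without parsing free text, which is the point of RM2. Following RM3, the meaning of each field would still need to be documented.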

Vocabularies
- RV1: Existing reference vocabularies should be reused where possible
- reference to supporting use case (“Motivation:”, list)
- RV2: Reference vocabularies should be shared in an open way
- reference to supporting use case (“Motivation:”, list)

APIs (or Data access)
- RDA1: Collaboration between API providers and API users is necessary.
- RDA2: APIs should be well documented.

Provenance
- RP1: Data provenance should be provided.
- RP2: Standard vocabularies should be used to describe data provenance.

SLA
- RSLA1: SLAs should be provided in a machine-readable format.
- RSLA2: Standard vocabularies should be used to describe SLAs.

Formats
- RF1: Data should be provided in several formats
- RF2: Data should be provided in a machine-readable format
- RF3: Standard data formats should be adopted
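
To make RF1 to RF3 concrete, a minimal sketch that publishes the same records in two standard, machine-readable formats (CSV and JSON). The sample records and the helper names `to_csv` and `to_json` are hypothetical and exist only for this example.

```python
import csv
import io
import json


def to_csv(rows, fieldnames):
    """Serialise a list of dicts as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows):
    """Serialise the same list of dicts as JSON text."""
    return json.dumps(rows, indent=2)


# Hypothetical sample data.
rows = [
    {"station": "A1", "reading": 12.5},
    {"station": "B2", "reading": 9.0},
]

csv_text = to_csv(rows, ["station", "reading"])
json_text = to_json(rows)
print(csv_text)
print(json_text)
```

Generating every format from one in-memory representation keeps the published variants consistent with each other.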

Quality
- RQ1: Information about data quality should be provided
- RQ2: Standard vocabularies should be used to describe the quality of the data
- RQ3: Data quality should be verified before data release
- RQ4: Information about data quality should be provided in a machine-readable format.

Static/Real-time (or Real-time data access)
- RRT1: Real-time data should be provided when possible
- RRT2: Bulk data access should be provided when possible

Versioning
- RV1: Release schedule should be provided in metadata
- RV2: Guidance about how to keep different versions of the same data should be provided

Licenses
- RL1: Data licenses should be interoperable
- RL2: Data licenses should be provided in a machine-readable format
- RL3: Standard vocabularies should be used to describe licenses
- RL4: Guidance about license combination should be provided

Granularity
- RG1: Guidance about how to define data granularity should be provided.

Data Selection
- RDS1: Guidance about how to select data to be published should be provided.

- APIs can be too clunky/rich in their functionality, which may increase the number of calls necessary and the size of the data transferred, reducing performance.
- Collaboration between API providers and users is necessary to agree on 'useful' calls.
- API key agreements could restrict the openness of Open Data?
- Documentation accompanying APIs can be lacking.
- What is best practice for publishing streams of real-time data (with/without APIs)?
- For accessing numerous datasets, scientists will be accessing the archive directly using other protocols such as sftp, rsync and scp, and access techniques such as: http://www.psc.edu/index.php/hpn-ssh
- For accessing individual datasets, a REST GET interface to the archive should be provided.
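
As a rough illustration of the REST GET interface for individual datasets mentioned in the notes above, a minimal sketch that builds (but does not send) such a request. The base URL, the URL layout and the `build_dataset_request` helper are assumptions made for this example; the notes do not specify how the archive would be addressed.

```python
from urllib.parse import quote, urljoin
from urllib.request import Request

# Hypothetical archive endpoint; the real archive URL is not given in the notes.
ARCHIVE_BASE_URL = "https://archive.example.org/datasets/"


def build_dataset_request(dataset_id, media_type="text/csv"):
    """Build a GET request for one dataset, identified by its ID.

    The Accept header lets the client ask for a preferred format.
    """
    # Percent-encode the ID so it is safe to use as a single path segment.
    url = urljoin(ARCHIVE_BASE_URL, quote(dataset_id, safe=""))
    return Request(url, headers={"Accept": media_type}, method="GET")


request = build_dataset_request("rainfall 2013")
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would perform the actual retrieval. Bulk access to many datasets would still go through the other protocols listed in the notes.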

...

Candidate best practices

6. Conclusions

References

Acknowledgements

Retrieved from "https://www.w3.org/2013/dwbp/wiki/index.php?title=Use_Cases_Document_Outline&oldid=926"
