Data Usage Use Cases
Data Usage Description Vocabulary
The Data Usage vocabulary will embody the different ways humans and machines interact with data on the Web. The vocabulary is intended to be extensible, recognizing that data usage will vary across Web communities and will require customization for specific implementations. While extensibility is important for customization, we envision that communities relying on the Data Usage Vocabulary will also gain an important advantage: the ability to compare and cross-reference data usage techniques used in different domain implementations.
Data Usage Activities
The Data Usage Description Vocabulary will describe the use made of one or more datasets. Its scope could cover data usage in different activities, including data analysis and visualization, collaboration, data exploration and discovery, and scientific and B2B data handling.
Data Analysis and Data Visualization Applications: Where data is used in an application, the vocabulary will facilitate a description of what the application does and what problem it helps to solve. This will improve the discoverability of the application.
Collaborations: The vocabulary will describe the role data plays in the collaboration and how the data is contributed, published, and shared by the collaborators. Collaborative data usage can include data that is governed, data handled in projects and studies that have a limited lifespan, and events that are prompted by spontaneous or rare occurrences.
Data exploration and discovery: The vocabulary will describe the role data plays in data exploration and discovery, including how the data is browsed and navigated and how exploratory experiences are shared with others.
Scientific and B2B data handling: The vocabulary will describe data usage in workflows, pipelines, and situations that rely on complex processing involving communication between multiple applications, use of intermediate results, and relationships to provenance.
Data Usage Trending and Metrics
The Data Usage Description Vocabulary will describe the trends and metrics relating to data usage. This might include, but is not limited to:
- How is data advertised?
- How frequently is the data being accessed?
- Is the data up to date or irrelevant?
- What is the data’s perceived intrinsic value?
  - Not reproducible/Expensive to reproduce/Easily reproducible.
  - Not repeatable/Expensive to repeat/Easily repeatable.
  - Reflects expertise.
- Data consumer stats
  - extremely useful
  - easy to use
  - useful as a benchmark
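As a rough illustration of how the trend and metric questions above might be captured in a record, the following Python sketch models access frequency, reproducibility, and consumer feedback. All names (UsageMetrics, Reproducibility, accesses_per_day, record_feedback) are hypothetical and are not terms of the vocabulary; the averaging rule is an assumption made only for this example.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import Dict, List, Optional


class Reproducibility(Enum):
    """Hypothetical scale mirroring the reproducibility list item."""
    NOT_REPRODUCIBLE = "not reproducible"
    EXPENSIVE = "expensive to reproduce"
    EASY = "easily reproducible"


@dataclass
class UsageMetrics:
    """Illustrative metrics record for one dataset (not a normative model)."""
    dataset: str
    access_dates: List[date] = field(default_factory=list)
    reproducibility: Optional[Reproducibility] = None
    consumer_feedback: Dict[str, int] = field(default_factory=dict)

    def accesses_per_day(self) -> float:
        """Average accesses per day between the first and last access (inclusive)."""
        if not self.access_dates:
            return 0.0
        span_days = (max(self.access_dates) - min(self.access_dates)).days + 1
        return len(self.access_dates) / span_days

    def record_feedback(self, label: str) -> None:
        """Count one consumer rating such as 'easy to use'."""
        self.consumer_feedback[label] = self.consumer_feedback.get(label, 0) + 1
```

A record like this would answer "How frequently is the data being accessed?" with a single number and keep consumer ratings as simple counts; a real implementation would likely need time windows and per-consumer detail.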
Data Usage Scenarios
The information gathered from the scenarios is helpful for defining the data usage vocabulary and for illustrating how it should be used.
The general description of a data usage scenario is given by:
- Short description:
- Detailed description:
- What type of data usage activity is performed (e.g., analysis, visualization, collaboration, exploration and discovery, scientific data handling, B2B)?
- What is the main purpose of the data usage activity?
- If the activity is data analysis and data visualization, what types of problems does the application solve?
- If the activity is scientific data handling, what is the experimental procedure being performed?
- If the activity is collaboration, what is the purpose of the collaboration?
- If the activity is B2B data handling, what is the business process model being performed?
- If the activity is exploration and discovery, what is the purpose of the exploration and discovery?
- What user communities or user-types perform the data usage activity?
- What user communities or user-types use the results of the data usage activity?
- What datasets are used?
- How were the datasets obtained? What is the origin of the datasets? (e.g., data portal, web service)
- What types of datasets are used?
- What tools are used for data processing?
- If more than one dataset is used:
  - How are the datasets combined?
- If datasets are used for collaboration:
  - How do collaborating organizations use and share data?
- If datasets are used for data exploration and discovery:
  - How is the data discovery process described?
- If datasets are used for data analysis and visualization applications:
  - What tools are used for data analysis and visualization?
- If datasets are used for scientific and B2B data handling:
  - Are there raw or intermediate results that need to be retained?
  - How is data processing (or data usage) described?
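The scenario template above can be read as a simple record structure. The following Python sketch is one possible encoding, offered only as an illustration: the class and field names (Activity, DatasetUse, UsageScenario, unanswered) are hypothetical, and only a subset of the template questions is modeled.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Activity(Enum):
    """Activity types named in the template (labels are illustrative)."""
    ANALYSIS_AND_VISUALIZATION = "data analysis and visualization"
    COLLABORATION = "collaboration"
    EXPLORATION_AND_DISCOVERY = "exploration and discovery"
    SCIENTIFIC_DATA_HANDLING = "scientific data handling"
    B2B_DATA_HANDLING = "B2B data handling"


@dataclass
class DatasetUse:
    """One dataset used in a scenario."""
    name: str
    origin: str        # e.g. "data portal" or "web service"
    dataset_type: str


@dataclass
class UsageScenario:
    """Hypothetical record holding answers to the template questions."""
    short_description: str
    detailed_description: str
    activity: Activity
    purpose: str
    performed_by: List[str]
    results_used_by: List[str]
    datasets: List[DatasetUse] = field(default_factory=list)
    tools: List[str] = field(default_factory=list)
    combination_method: Optional[str] = None

    def unanswered(self) -> List[str]:
        """Return the template questions that still lack an answer."""
        missing = []
        if not self.datasets:
            missing.append("What datasets are used?")
        if not self.tools:
            missing.append("What tools are used for data processing?")
        # The combination question only applies when several datasets are used.
        if len(self.datasets) > 1 and not self.combination_method:
            missing.append("How are the datasets combined?")
        return missing
```

Calling unanswered() on a partially filled scenario lists the questions still open, which mirrors the conditional structure of the template: the question about combining datasets is raised only when more than one dataset is recorded.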