Proposed structure
Table of Contents DWBP
Audience (Editor's responsibility - contributors are welcome!)
Scope (Editor's responsibility - contributors are welcome!)
Background (context - Editor's responsibility - contributors are welcome!)
- actors: data publishers and data consumers
- why are best practices necessary?
- Technical factors for consideration when choosing data sets for publication (Nathalia, Flavio)
Data on the Web Lifecycle
- Data Selection
- Data Organization
- Data Publication
- Data Usage
- Feedback
Best Practices Themes (challenges)
Data Selection
- Data Selection (the requirements to be reviewed)
Data Organization
- Metadata
- Licenses
- Data quality
- Provenance
- Data vocabulary
- Data granularity
- Data identification
Data Publication
- Data formats: Best practices on data formats include making data available in multiple open, machine-readable formats, for example RDF and JSON. This section will also give guidance on preferred representations for values such as dates, strings and numbers, and on expressing data in multiple languages. Although the use case document mentions it as a challenge, the best practices document should not make recommendations about the formats of source data, for example database dumps or spreadsheets. Data sources for data on the web are expected to remain in multiple formats.
- Data access
- Sensitive data (privacy)
- Preservation
- Data versioning
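As a rough illustration of the data formats item above, the sketch below serializes one record to JSON, one of the machine-readable formats named there. The field names (`title`, `issued`, `population`), the sample values, and the function name `to_json_record` are hypothetical. Using ISO 8601 for the date and language tags as keys for the multilingual title are assumptions made for the example, not choices this outline has settled.

```python
import json
from datetime import date


def to_json_record(title_by_lang, issued, population):
    """Serialize one hypothetical dataset record to JSON text."""
    record = {
        # Labels keyed by language tag: one possible way to express
        # data in multiple languages (an assumption for this sketch).
        "title": dict(title_by_lang),
        # ISO 8601 calendar date (assumed here, not prescribed by the outline).
        "issued": issued.isoformat(),
        # Kept as a JSON number rather than a display string such as "1,234".
        "population": population,
    }
    # ensure_ascii=False keeps non-ASCII characters readable in the output.
    return json.dumps(record, ensure_ascii=False, sort_keys=True)


example = to_json_record(
    {"en": "Bus stops", "pt": "Pontos de ônibus"},
    date(2014, 5, 1),
    1234,
)
print(example)
```

The same record could also be published as RDF; the point of the sketch is only that dates, strings and numbers each get a predictable, machine-readable representation.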
Data Usage
- Data usage
- Data enrichment
Feedback(?)
- Data usage feedback
- Data to be published
- Lifecycle
Conclusions
- Technical factors affecting potential use of open data for innovation, efficiency and commercial exploitation (Vagner, Nathalia, Hadley, Yaso)
Each theme will have:
- a description
- one or more best practices
- link between best practices and requirements
Mapping of Themes[1]
Theme | Current sections | Contributors |
---|---|---|
Metadata: What kind of metadata should be considered? How to describe: | Guidance on the Provision of Metadata | Makx, Carlos, Laufer, Bernadette |
Data Vocabularies: How to provide semantic interoperability between datasets? | Use of core vocabularies to improve interoperability | Bart, Eric K, Giancarlo, João Paulo, Ig, Antoine, Mark |
Data formats: What kind of data formats should be considered when publishing data on the Web? To provide guidance about different types of data formats and machine-readable formats. | Eric K, SumitPurohit | |
Data granularity (meaning still to be clarified): What kind of data granularity should be considered when publishing data on the Web? To provide guidance about different types of data granularity. | TBD | |
Data access: What kind of data access should be considered when publishing data on the Web? How to decide whether to publish a dump or offer an API (SPARQL, etc.)? What requirements to take into account (reliability, time to query the dataset, etc.)? | TBD | |
Sensitive data (privacy): How to deal with sensitive data so as not to infringe a person's right to privacy or an organization's security? | TBD | |
Data identification: To provide guidance on URI Design and Management for Persistence. | Compact Uniform Resource Identifier (COMURI) - mirror | Tomas Carrasco, Phil, Newton, Flavio, Carlos Iglesias |
Data persistence: To provide guidance on how data could be preserved. | Data preservation | Phil, Cristophe |
Data versioning: How to track/manage different versions of a dataset? | Publishing and accessing versions of datasets | Newton, Flavio |
Data enrichment: How to enrich data before consuming? | Adriano Machado, Adriano Veloso, Wagner Meira | |
Data usage: How to track/gather data usage information? | TBD | |
Feedback: How to track/gather user feedback? | TBD | |
[1] The themes were based on the Use Case Challenges and Requirements: http://www.w3.org/TR/dwbp-ucr/, https://www.w3.org/2013/dwbp/wiki/Best_practices_guidelines
Scope Thoughts:
- Is it unique to publishing data on the web?
- Does addressing it encourage people to publish/reuse data on the web (or remove barriers to it)?
- Is it testable?
- Is it about publishing data on the web?
- Requirements/issues that are in scope and testable will go in the recommendation
- Requirements/issues that are in scope but not testable will go in a note
- Requirements/issues that are out of scope will be deleted
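
The last three bullets amount to a small decision rule. A minimal sketch follows, with the function name `triage` and the returned labels chosen only for illustration:

```python
def triage(in_scope: bool, testable: bool) -> str:
    """Decide where a requirement/issue ends up under the scope rules above."""
    if not in_scope:
        # Out-of-scope items are dropped, whether or not they are testable.
        return "deleted"
    # In-scope items go to the recommendation only when they can be tested.
    return "recommendation" if testable else "note"


print(triage(in_scope=True, testable=False))
```

How "in scope" is judged (for example, from the questions at the top of this list) is left open here; the sketch takes it as a given input.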