Proposed structure

From Data on the Web Best Practices

Table of Contents DWBP

Audience (Editor's responsibility - contributors are welcome!)

Scope (Editor's responsibility - contributors are welcome!)

Background (context - Editor's responsibility - contributors are welcome!)

Data on the Web Lifecycle

  • Data Selection
  • Data Organization
  • Data Publication
  • Data Usage
  • Feedback

Best Practices Themes (challenges)

Data Selection

  • Data Selection (the requirements to be reviewed)

Data Organization

  • Metadata
    • Licenses
    • Data quality
    • Provenance
  • Data vocabulary
  • Data granularity
  • Data identification

Data Publication

  • Data formats: Best practices on data formats include making data available in multiple open, machine-readable formats, for example RDF and JSON. This section will also include guidance on preferred formats for specific data types, for example dates, strings and numbers, and on expressing data in multiple languages. Although the use case document mentions it as a challenge, the best practices document should not make recommendations regarding the formats of source data, for example database dumps or spreadsheets. Data sources for data on the Web are expected to remain in multiple formats.

  • Data access
  • Sensitive data (privacy)
  • Preservation
  • Data versioning
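
To illustrate the data formats item above, here is a minimal sketch of publishing one record in two machine-readable formats with a single, predictable date format. It is an illustration under stated assumptions, not a recommendation from the working group: the record, its field names and the helper functions are hypothetical, ISO 8601 is assumed as the date format, and CSV stands in for a second format because RDF serialization would need a third-party library.

```python
import csv
import io
import json
from datetime import date

# Hypothetical record; the field names are illustrative only.
record = {"station": "A1", "measured_on": date(2014, 5, 20), "value": 12.5}


def normalise(rec):
    """Return a copy of rec with dates rendered as ISO 8601 strings."""
    return {
        key: (val.isoformat() if isinstance(val, date) else val)
        for key, val in rec.items()
    }


def to_json(rec):
    """Serialize the record as JSON with keys in sorted order."""
    return json.dumps(normalise(rec), ensure_ascii=False, sort_keys=True)


def to_csv(rec):
    """Serialize the record as CSV: a header row plus one data row."""
    row = normalise(rec)
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=sorted(row), lineterminator="\n")
    writer.writeheader()
    writer.writerow(row)
    return buf.getvalue()


if __name__ == "__main__":
    print(to_json(record))
    print(to_csv(record), end="")
```

Both outputs are produced from the same normalised record, so the date is written identically in every format offered.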

Data Usage

  • Data usage
  • Data enrichment

Feedback

  • Data usage feedback
  • Data to be published
  • Lifecycle


Each theme will have:

  • a description
  • one or more best practices
  • links between the best practices and the requirements

Mapping of Themes[1]

Each theme is listed with its current sections and contributors, where known.

Metadata

What kind of metadata should be considered?

How to describe:

  • Licenses
  • Data quality
  • Provenance

Current sections: Guidance on the Provision of Metadata

Contributors: Makx, Carlos, Laufer, Bernadette

Data Vocabularies

How to provide semantic interoperability between datasets?

Current sections:

  • Use of core vocabularies to improve interoperability
  • Making controlled vocabularies accessible as URI sets

Contributors: Bart, Eric K, Giancarlo, João Paulo, Ig; Antoine, Mark

Data formats

What kind of data formats should be considered when publishing data on the Web?

To provide guidance about different types of data formats and machine-readable formats.

Contributors: Eric K, SumitPurohit

Data granularity (meaning still to be clarified)

What kind of data granularity should be considered when publishing data on the Web?

To provide guidance about different types of data granularity.

Data access

What kind of data access should be considered when publishing data on the Web? How to decide whether to publish a dump or offer an API (SPARQL, etc.)?

What requirements should be taken into account (reliability, time to query the dataset, etc.)?

Sensitive data (privacy)

How to deal with sensitive data so as not to infringe a person's right to privacy or an organization's security?

Data identification

To provide guidance on URI Design and Management for Persistence.

Current sections:

  • Compact Uniform Resource Identifier (COMURI) - mirror
  • URI Design and Management for Persistence
  • URIs versus APIs

Contributors: Tomas Carrasco, Phil, Newton, Flavio, Carlos Iglesias

Data persistence

To provide guidance on how data could be preserved.

Current sections: Data preservation

Contributors: Phil, Cristophe

Data versioning

How to track/manage different versions of a dataset?

Current sections: Publishing and accessing versions of datasets

Contributors: Newton, Flavio

Data enrichment

How to enrich data before consuming it?

Contributors: Adriano Machado, Adriano Veloso, Wagner Meira

Data usage

How to track/gather data usage information?


How to track/gather user feedback?


[1] The themes were based on the Use Case Challenges and Requirements.

Scope Thoughts:

  • Is it unique to publishing data on the web?
  • Does addressing it encourage people to publish/reuse data on the web (or remove barriers to it)?
    • Is it testable?
    • Is it about publishing data on the web?
  • Requirements/issues that are in scope and testable will go in the recommendation.
  • Requirements/issues that are in scope but not testable will go in a note.
  • Requirements/issues that are out of scope will be deleted.