Best practices document guidelines

From Data on the Web Best Practices
General Guidelines
  • A Best Practice implements one or more UC Requirements.
  • A UC Requirement is motivated by one or more Use Cases.
  • A Best Practice may be a General Best Practice or a Specific Best Practice.
  • A General Best Practice applies to all resources, while a Specific Best Practice applies to a specific Resource.
  • A Resource may be a Dataset, a License or a Vocabulary. A Resource is "something" that may be published or consumed.
  • A Best Practice has a title, a description and one or more How to Sections.
  • A How to Section specifies one possible way of implementing a Best Practice.
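The relationships listed above can be sketched as a small data model. This is only an illustrative sketch: the class and field names (`UseCase`, `Requirement`, `HowTo`, `BestPractice`, `applies_to`) are assumptions made for this example and are not defined by the group, and the sample use case and how-to text are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class ResourceType(Enum):
    """The three kinds of Resource named in the guidelines."""
    DATASET = "Dataset"
    LICENSE = "License"
    VOCABULARY = "Vocabulary"


@dataclass
class UseCase:
    name: str


@dataclass
class Requirement:
    identifier: str
    description: str
    use_cases: List[UseCase]  # motivated by one or more Use Cases

    def __post_init__(self) -> None:
        if not self.use_cases:
            raise ValueError("a UC Requirement needs at least one Use Case")


@dataclass
class HowTo:
    text: str  # one possible way of implementing a Best Practice


@dataclass
class BestPractice:
    title: str
    description: str
    requirements: List[Requirement]  # implements one or more UC Requirements
    how_tos: List[HowTo]             # one or more How to Sections
    applies_to: Optional[ResourceType] = None  # None means a General Best Practice

    def __post_init__(self) -> None:
        if not self.requirements:
            raise ValueError("a Best Practice implements at least one UC Requirement")
        if not self.how_tos:
            raise ValueError("a Best Practice needs at least one How to Section")

    @property
    def is_general(self) -> bool:
        return self.applies_to is None


# Hypothetical sample data, for illustration only.
uc = UseCase("Hypothetical city data portal")
req = Requirement(
    "R-UniqueIdentifier",
    "Each data resource should be associated with a unique identifier",
    [uc],
)
gbp2 = BestPractice(
    title="GBP2. Resources should have a unique identifier",
    description="Illustrative description text.",
    requirements=[req],
    how_tos=[HowTo("Illustrative how-to text.")],
)
```

Leaving `applies_to` unset models a General Best Practice; setting it to one `ResourceType` models a Specific Best Practice.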


General Best Practices
  • GBP1. Resources should be available in a machine-readable format
  • GBP2. Resources should have a unique identifier
  • GBP3. Resources should be available in an open format
  • GBP4. Resources should be described by metadata
  • GBP5. Standard vocabularies should be used to describe resources (to define metadata)
  • GBP6. Provenance information (about resources?) should be available
  • GBP7. Quality information (about resources?) should be available
  • GBP8. Usage information (about resources?) should be available
  • GBP9. Versioning information (about resources?) should be available
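As a rough illustration of how several of these practices could show up in a single description of a resource, the sketch below builds a JSON-LD-style record in Python. Everything in it is an assumption made for this example: the `example.org` URIs, the dataset itself, the choice of DCAT, Dublin Core Terms, PROV and OWL as vocabularies, and the simple key-presence checks. The group has not endorsed any of these choices here.

```python
import json

# Hypothetical metadata record; all example.org URIs are placeholders.
record = {
    "@context": {
        "dct": "http://purl.org/dc/terms/",
        "dcat": "http://www.w3.org/ns/dcat#",
        "prov": "http://www.w3.org/ns/prov#",
        "owl": "http://www.w3.org/2002/07/owl#",
    },
    "@id": "http://example.org/dataset/bus-stops",  # GBP2: unique identifier
    "@type": "dcat:Dataset",
    "dct:title": "Bus stops (hypothetical example)",  # GBP4: metadata
    "dct:license": {"@id": "http://example.org/license/open"},
    "prov:wasDerivedFrom": {"@id": "http://example.org/source/transit-db"},  # GBP6
    "owl:versionInfo": "1.0",  # GBP9: versioning information
    "dcat:distribution": {
        "@type": "dcat:Distribution",
        "dcat:downloadURL": {"@id": "http://example.org/dataset/bus-stops.csv"},
        "dcat:mediaType": "text/csv",  # GBP1/GBP3: machine-readable, open format
    },
}

# Very rough checks: each one only tests that a key is present.
CHECKS = {
    "GBP2": lambda r: "@id" in r,
    "GBP4": lambda r: "dct:title" in r,
    "GBP6": lambda r: "prov:wasDerivedFrom" in r,
    "GBP9": lambda r: "owl:versionInfo" in r,
}


def covered_practices(rec):
    """Return the General Best Practices whose key-presence check passes."""
    return [bp for bp, check in CHECKS.items() if check(rec)]


print(json.dumps(record, indent=2))
print(covered_practices(record))
```

The checks are deliberately naive. A real assessment would need to validate the values, not just the presence of the properties.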


Specific Best Practices
  • Dataset Best Practices
  • Licenses Best Practices
  • Vocabularies Best Practices


Mapping between Best Practices and Proposed Chapters
General Best Practice | Section
GBP1. Resources should be available in a machine-readable format |
GBP2. Resources should have a unique identifier | URI Best Practice for Web Data URI (DURI), URI Design and Management for Persistence, URIs versus APIs
GBP3. Resources should be available in an open format |
GBP4. Resources should be described by metadata | Guidance on the Provision of Metadata
GBP5. Standard vocabularies should be used to describe resources | Use of core vocabularies to improve interoperability
GBP6. Provenance information should be available |
GBP7. Quality information should be available | Data quality vocabulary
GBP8. Usage information should be available | Data usage vocabulary
GBP9. Versioning information should be available | Publishing and accessing versions of datasets

To discuss with the group (how to map?)
  • Making controlled vocabularies accessible as URI sets: Mark, Antoine
  • Technical factors for consideration when choosing data sets for publication: Nathalia, Flavio
  • Technical factors affecting potential use of open data for innovation, efficiency and commercial exploitation: Vagner, Nathalia, Hadley, Yaso
  • Data preservation: Phil, Christophe


Mapping between UC Requirements and Best Practices
Requirement | Requirement Description | Best Practice
R-MetadataAvailable | Metadata should be available | GBP4. Resources should be described by metadata
R-MetadataMachineRead | Metadata should be machine-readable | GBP1. Resources should be available in a machine-readable format
R-MetadataStandardized | Metadata should be standardized | GBP5. Standard vocabularies should be used to describe resources (to define metadata)
R-MetadataDocum | Metadata vocabulary, or values if vocabulary is not standardized, should be well-documented | VBP1. Vocabularies should be well documented
R-MetadataInteroperable | Metadata should be interoperable | GBP5. Standard vocabularies should be used to describe resources (to define metadata)
R-GranularityLevels | Data available at different levels of granularity should be accessible and modelled in a common way | DBP1. Data should be available at different levels of granularity
R-FormatMachineRead | Data should be available in a machine-readable format | GBP1. Resources should be available in a machine-readable format
R-FormatStandardized | Data should be available in a standardized format | DBP2. Data should be available in multiple formats (standard format)
R-FormatOpen | Data should be available in an Open format | GBP3. Resources should be available in an open format
R-FormatMultiple | Data should be available in multiple formats | DBP2. Data should be available in multiple formats
R-FormatLocalize | It should be possible to localize data on the Web | ???
R-VocabReference | Existing reference vocabularies should be reused where possible | GBP5. Standard vocabularies should be used to describe resources (to define metadata)
R-VocabDocum | Vocabularies should be clearly documented | GBP4. Resources should be described by metadata
R-VocabOpen | Vocabularies should be shared in an Open way | GBP3. Resources should be available in an open format
R-VocabVersion | Vocabularies should include versioning information | GBP9. Versioning information (about resources) should be available
R-LicenseAvailable | Data should be associated with a license | GBP4. Resources should be described by metadata (???)
R-LicenseMachineRead | Data licenses should be provided in a machine-readable format | GBP1. Resources should be available in a machine-readable format
R-LicenseStandardized | Standard vocabularies should be used to describe licenses | GBP5. Standard vocabularies should be used to describe resources (to define metadata)
R-LicenseInteroperable | Data licenses should be interoperable | LBP1. Data licenses should be interoperable
R-LicenseLiability | Liability terms associated with usage of Data on the Web should be clearly outlined | LBP2. Liability terms associated with usage of Data on the Web should be clearly outlined
R-ProvAvailable | Data provenance information should be available | GBP6. Provenance information (about resources) should be available
R-SelectHighValue | Datasets selected for publication should be of high-value | DBP3. Datasets selected for publication should be of high-value
R-SelectDemand | Datasets selected for publication should be in demand by potential users | DBP3. Datasets selected for publication should be of high-value
R-AccessBulk | Data should be available for bulk download | DBP4. Data should be accessible in different ways
R-AccessRealTime | Where data is produced in real-time, it should be available on the Web in real-time | DBP4. Data should be accessible in different ways
R-AccessUptodate | Data should be available in an up-to-date manner | DBP5. Data should be available in an up-to-date manner
R-SensitivePrivacy | Data should not infringe on a person's right to privacy | ???
R-SensitiveSecurity | Data should not infringe on national security | ???
R-UniqueIdentifier | Each data resource should be associated with a unique identifier | GBP2. Resources should have a unique identifier
R-MultipleRepresentations | A data resource may have multiple representations, e.g. xml/html/json/rdf | DBP2. Data should be available in multiple formats
R-DynamicGeneration | Dynamic generation of Data on the Web from non-Web data resources | ???
R-AutomaticUpdate | Automatic update of Data on the Web when original data source is updated | DBP5. Data should be available in an up-to-date manner
R-CoreRegister | Core registers should be accessible | DBP3. Datasets selected for publication should be of high-value
R-IndustryReuse | Data should be suitable for industry reuse | DBP6. Data should be suitable for industry reuse
R-SLAAvailable | Service Level Agreements (SLAs) for industry reuse of the data should be available if requested | DBP6. Data should be suitable for industry reuse
R-SLAMachineRead | SLAs should be provided in a machine-readable format | GBP1. Resources should be available in a machine-readable format
R-SLAStandardized | Standard vocabularies should be used to describe SLAs | GBP5. Standard vocabularies should be used to describe resources (to define metadata)
R-PotentialRevenue | Potential revenue streams from data should be described | DBP6. Data should be suitable for industry reuse
R-PersistentIdentification | Data should be persistently identifiable | GBP2. Resources should have a unique identifier
R-Archiving | It should be possible to archive data | DBP7. Data should be archived
R-QualityAvailable | Quality information should be available | GBP7. Quality information (about resources) should be available
R-UsageAvailable | Usage information should be available | GBP8. Usage information (about resources) should be available
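
Because this mapping is many-to-one, it can help to look at it from the Best Practice side and to list the requirements that still have no mapping (the "???" rows). The sketch below does both for a handful of rows copied from the table. The dictionary layout, the function names and the use of `None` for "???" are assumptions made for this example.

```python
from collections import defaultdict

# A few rows from the table above; None stands for an unresolved "???" entry.
REQUIREMENT_TO_BP = {
    "R-MetadataAvailable": "GBP4",
    "R-MetadataMachineRead": "GBP1",
    "R-FormatMachineRead": "GBP1",
    "R-FormatLocalize": None,
    "R-SensitivePrivacy": None,
    "R-UniqueIdentifier": "GBP2",
    "R-PersistentIdentification": "GBP2",
    "R-Archiving": "DBP7",
}


def requirements_by_best_practice(mapping):
    """Invert the mapping: Best Practice id -> list of requirement ids."""
    grouped = defaultdict(list)
    for requirement, best_practice in mapping.items():
        if best_practice is not None:
            grouped[best_practice].append(requirement)
    return dict(grouped)


def unmapped_requirements(mapping):
    """Requirements that do not yet point to any Best Practice."""
    return [req for req, bp in mapping.items() if bp is None]


by_bp = requirements_by_best_practice(REQUIREMENT_TO_BP)
print(by_bp["GBP1"])
print(unmapped_requirements(REQUIREMENT_TO_BP))
```

The inverted view makes it easy to see which Best Practices carry the most requirements, and the unmapped list gives the group a checklist of open items.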