Best practices document guidelines
From Data on the Web Best Practices
General Guidelines
- A Best Practice implements one or more Use Case (UC) Requirements.
- A UC Requirement is motivated by one or more Use Cases.
- A Best Practice may be a General Best Practice or a Specific Best Practice.
- A General Best Practice applies to all resources, while a Specific Best Practice applies to a specific Resource.
- A Resource may be a Dataset, a License or a Vocabulary. A Resource is "something" that may be published or consumed.
- A Best Practice has a title, a description and one or more How to Sections.
- A How to Section specifies one possible way of implementing a Best Practice.
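The relationships in the guidelines above can be sketched as a small data model. This is only an illustrative reading of the list, not something the wiki defines: the class names, field names, and the convention that `applies_to=None` marks a General Best Practice are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class ResourceType(Enum):
    """The three kinds of Resource named in the guidelines."""
    DATASET = "Dataset"
    LICENSE = "License"
    VOCABULARY = "Vocabulary"


@dataclass
class UseCase:
    name: str


@dataclass
class Requirement:
    identifier: str
    description: str
    use_cases: List[UseCase]

    def __post_init__(self) -> None:
        # A UC Requirement is motivated by one or more Use Cases.
        if not self.use_cases:
            raise ValueError("a UC Requirement needs at least one Use Case")


@dataclass
class HowTo:
    """One possible way of implementing a Best Practice."""
    text: str


@dataclass
class BestPractice:
    identifier: str
    title: str
    description: str
    requirements: List[Requirement]
    how_to: List[HowTo]
    # Assumption for this sketch: None means the practice is General,
    # a ResourceType means it is Specific to that kind of Resource.
    applies_to: Optional[ResourceType] = None

    def __post_init__(self) -> None:
        # A Best Practice implements one or more UC Requirements
        # and has one or more How to Sections.
        if not self.requirements:
            raise ValueError("a Best Practice needs at least one Requirement")
        if not self.how_to:
            raise ValueError("a Best Practice needs at least one How to Section")

    @property
    def is_general(self) -> bool:
        return self.applies_to is None
```

Enforcing the "one or more" constraints in `__post_init__` keeps an incomplete Best Practice from being constructed at all, which mirrors the wording of the guidelines more closely than optional fields would.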
General Best Practices
- GBP1. Resources should be available in a machine-readable format
- GBP2. Resources should have a unique identifier
- GBP3. Resources should be available in an open format
- GBP4. Resources should be described by metadata
- GBP5. Standard vocabularies should be used to describe resources (to define metadata)
- GBP6. Provenance information (about resources?) should be available
- GBP7. Quality information (about resources?) should be available
- GBP8. Usage information (about resources?) should be available
- GBP9. Versioning information (about resources?) should be available
Specific Best Practices
- Dataset Best Practices
- Licenses Best Practices
- Vocabularies Best Practices
Mapping between Best Practices and Proposed Chapters
General Best Practice | Section |
---|---|
GBP1. Resources should be available in a machine-readable format | |
GBP2. Resources should have a unique identifier | URI Best Practice for Web Data URI (DURI), URI Design and Management for Persistence, URIs versus APIs |
GBP3. Resources should be available in an open format | |
GBP4. Resources should be described by metadata | Guidance on the Provision of Metadata |
GBP5. Standard vocabularies should be used to describe resources | Use of core vocabularies to improve interoperability |
GBP6. Provenance information should be available | |
GBP7. Quality information should be available | Data quality vocabulary |
GBP8. Usage information should be available | Data usage vocabulary |
GBP9. Versioning information should be available | Publishing and accessing versions of datasets |
To discuss with the group (how to map?):
- Making controlled vocabularies accessible as URI sets: Mark, Antoine
- Technical factors for consideration when choosing data sets for publication: Nathalia, Flavio
- Technical factors affecting potential use of open data for innovation, efficiency and commercial exploitation: Vagner, Nathalia, Hadley, Yaso
- Data preservation: Phil, Christophe
Mapping between UC Requirements and Best Practices
Requirement | Requirement Description | Best Practice |
---|---|---|
R-MetadataAvailable | Metadata should be available | GBP4. Resources should be described by metadata |
R-MetadataMachineRead | Metadata should be machine-readable | GBP1. Resources should be available in a machine-readable format |
R-MetadataStandardized | Metadata should be standardized | GBP5. Standard vocabularies should be used to describe resources (to define metadata) |
R-MetadataDocum | Metadata vocabulary, or values if vocabulary is not standardized, should be well-documented | VBP1. Vocabularies should be well documented |
R-MetadataInteroperable | Metadata should be interoperable | GBP5. Standard vocabularies should be used to describe resources (to define metadata) |
R-GranularityLevels | Data available at different levels of granularity should be accessible and modelled in a common way | DBP1. Data should be available at different levels of granularity |
R-FormatMachineRead | Data should be available in a machine-readable format | GBP1. Resources should be available in a machine-readable format |
R-FormatStandardized | Data should be available in a standardized format | DBP2. Data should be available in multiple formats (standard format) |
R-FormatOpen | Data should be available in an Open format | GBP3. Resources should be available in an open format |
R-FormatMultiple | Data should be available in multiple formats | DBP2. Data should be available in multiple formats |
R-FormatLocalize | It should be possible to localize data on the Web | ???? |
R-VocabReference | Existing reference vocabularies should be reused where possible | GBP5. Standard vocabularies should be used to describe resources (to define metadata) |
R-VocabDocum | Vocabularies should be clearly documented | GBP4. Resources should be described by metadata |
R-VocabOpen | Vocabularies should be shared in an Open way | GBP3. Resources should be available in an open format |
R-VocabVersion | Vocabularies should include versioning information | GBP9. Versioning information (about resources) should be available |
R-LicenseAvailable | Data should be associated with a license | GBP4. Resources should be described by metadata (???) |
R-LicenseMachineRead | Data licenses should be provided in a machine-readable format | GBP1. Resources should be available in a machine-readable format |
R-LicenseStandardized | Standard vocabularies should be used to describe licenses | GBP5. Standard vocabularies should be used to describe resources (to define metadata) |
R-LicenseInteroperable | Data licenses should be interoperable | LBP1. Data licenses should be interoperable |
R-LicenseLiability | Liability terms associated with usage of Data on the Web should be clearly outlined | LBP2. Liability terms associated with usage of Data on the Web should be clearly outlined |
R-ProvAvailable | Data provenance information should be available | GBP6. Provenance information (about resources) should be available |
R-SelectHighValue | Datasets selected for publication should be of high-value | DBP3. Datasets selected for publication should be of high-value |
R-SelectDemand | Datasets selected for publication should be in demand by potential users | DBP3. Datasets selected for publication should be of high-value |
R-AccessBulk | Data should be available for bulk download | DBP4. Data should be accessible in different ways |
R-AccessRealTime | Where data is produced in real-time, it should be available on the Web in real-time | DBP4. Data should be accessible in different ways |
R-AccessUptodate | Data should be available in an up-to-date manner | DBP5. Data should be available in an up-to-date manner |
R-SensitivePrivacy | Data should not infringe on a person's right to privacy | ??? |
R-SensitiveSecurity | Data should not infringe on national security | ??? |
R-UniqueIdentifier | Each data resource should be associated with a unique identifier | GBP2. Resources should have a unique identifier |
R-MultipleRepresentations | A data resource may have multiple representations, e.g. xml/html/json/rdf | DBP2. Data should be available in multiple formats |
R-DynamicGeneration | Dynamic generation of Data on the Web from non-Web data resources | ??? |
R-AutomaticUpdate | Automatic update of Data on the Web when original data source is updated | DBP5. Data should be available in an up-to-date manner |
R-CoreRegister | Core registers should be accessible | DBP3. Datasets selected for publication should be of high-value |
R-IndustryReuse | Data should be suitable for industry reuse | DBP6. Data should be suitable for industry reuse |
R-SLAAvailable | Service Level Agreements (SLAs) for industry reuse of the data should be available if requested | DBP6. Data should be suitable for industry reuse |
R-SLAMachineRead | SLAs should be provided in a machine-readable format | GBP1. Resources should be available in a machine-readable format |
R-SLAStandardized | Standard vocabularies should be used to describe SLAs | GBP5. Standard vocabularies should be used to describe resources (to define metadata) |
R-PotentialRevenue | Potential revenue streams from data should be described | DBP6. Data should be suitable for industry reuse |
R-PersistentIdentification | Data should be persistently identifiable | GBP2. Resources should have a unique identifier |
R-Archiving | It should be possible to archive data | DBP7. Data should be archived |
R-QualityAvailable | Quality information should be available | GBP7. Quality information (about resources) should be available |
R-UsageAvailable | Usage information should be available | GBP8. Usage information (about resources) should be available |
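A mapping table like the one above is easier to audit when it can be queried in both directions. The following is a minimal sketch, assuming the table is held as a plain dictionary. It covers only a subset of the rows, and the function names and the use of `None` for rows marked "???" are choices made for this example.

```python
from collections import defaultdict
from typing import Dict, List, Optional

# Subset of the table: requirement id -> best practice id.
# None stands for rows whose Best Practice column is "???".
REQUIREMENT_TO_BP: Dict[str, Optional[str]] = {
    "R-MetadataAvailable": "GBP4",
    "R-MetadataMachineRead": "GBP1",
    "R-FormatMachineRead": "GBP1",
    "R-FormatLocalize": None,
    "R-LicenseMachineRead": "GBP1",
    "R-SensitivePrivacy": None,
    "R-SensitiveSecurity": None,
    "R-UniqueIdentifier": "GBP2",
    "R-PersistentIdentification": "GBP2",
    "R-DynamicGeneration": None,
}


def requirements_by_best_practice(
    mapping: Dict[str, Optional[str]],
) -> Dict[str, List[str]]:
    """Invert the table: best practice id -> requirements it implements."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for requirement, best_practice in mapping.items():
        if best_practice is not None:
            grouped[best_practice].append(requirement)
    return dict(grouped)


def unmapped_requirements(mapping: Dict[str, Optional[str]]) -> List[str]:
    """List requirements that do not yet have a Best Practice."""
    return [req for req, bp in mapping.items() if bp is None]


if __name__ == "__main__":
    for bp, reqs in requirements_by_best_practice(REQUIREMENT_TO_BP).items():
        print(bp, "implements", ", ".join(reqs))
    print("Still to map:", ", ".join(unmapped_requirements(REQUIREMENT_TO_BP)))
```

The inverted view makes it easy to check the first guideline (every Best Practice implements at least one requirement), while the unmapped list surfaces the open "???" rows for group discussion.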