Warning:
This wiki has been archived and is now read-only.
Quality Aspects In Use Cases
Which parts of the UC descriptions are relevant for the Quality and Granularity vocabulary?
Parent page: https://www.w3.org/2013/dwbp/wiki/Quality_Requirements_From_UCR
Source: http://www.w3.org/TR/2015/NOTE-dwbp-ucr-20150224/#use-cases-1
The table below has:
- UC: the use case (UC) from which inspiration has been taken;
- Quotes: *all* sentences about quality and granularity (Q&G) aspects in the description of the UC;
- Requirements: requirements (reqs) for the UC that are potentially in scope for the vocabulary.
NB: right now, reqs are generally listed only the first time they appear.
UC | Quotes | Requirements |
---|---|---|
1 ASO: Airborne Snow Observatory | Quality: Available in a number of scientific formats to customers and stakeholders based on customer requirements |
|
2 BBC | Quality: High level and domain vocabularies adapted to BBC applications. |
R-MetadataStandardized |
3 Bio2RDF | Quality: Bio2RDF scripts generate provenance records using VoID, PROV and DC. A date-specific dataset IRI is linked to a unique dataset IRI using the PROV predicate wasDerivedFrom such that one can retrieve all provenance records for datasets created on different dates. Each resource in the dataset is linked to the date-unique dataset IRI that is part of the provenance record using the VoID inDataset predicate. Provenance indicates the time at which the RDF was generated, licensing (if available from the data source provider), dc:creator link to the script on GitHub that was used to generate a dataset, the void:sparqlEndpoint to point to the dataset SPARQL endpoint, and void:dataDump to point to the data download URL.
Dataset metrics: total number of triples; number of unique subjects; number of unique predicates; number of unique objects; number of unique types; unique predicate-object links and their frequencies; unique predicate-literal links and their frequencies; unique subject type-predicate-object type links and their frequencies; unique subject type-predicate-literal links and their frequencies; total number of references to a namespace; total number of inter-namespace references; total number of inter-namespace-predicate references |
|
4 BuildingEye: SME use of public data | Quality: standardized, interoperable across local authorities |
|
5 Dados.gov.br | Quality: Authoritative, clean data, vetted and guaranteed. |
R-QualityOpinions |
6 Digital archiving of Linked Data |
R-PersistentIdentification (if we decide some info about preservation and persistence should be part of quality info) | |
7 Dutch Base Registers | Governmental data has to be traceable/trustable as such. |
|
8 GS1 Digital | Quality: Very important to have trustworthy authoritative data from respective organizations.
Challenges: An organization (e.g. retailer) might embed authoritative data asserted by another organization (e.g. brand owner) and there is the risk that such embedded information becomes stale if it is not continuously synchronized. Potential Requirements:
|
Q: is the first 'potential req' related to granularity? |
9 ISO GEO Story | Challenges: A unified way to have access to each record within the catalog at different levels: local, regional, national or EU level. |
R-GranularityLevels |
10 The Land Portal | Quality: Every sort of data, from high quality to unverified. |
R-GranularityLevels R-QualityCompleteness R-QualityMetrics |
11 LA Times' Reporting of Ron Galperin's Infographic | The methodology used is not explained - making it hard to assess trustworthiness. How can provenance be described? |
R-DataProductionContext R-GeographicalContext R-QualityMetrics R-UniqueIdentifier (if relevant for quality) |
12 LusTRE: Linked Thesaurus fRamework for Environment |
Quality: Largely variable. Challenges: Assessment and documentation of dataset and linkset quality with domain-dependent quality metrics. LusTRE considers the heterogeneity in scope and levels of abstraction of existing environmental thesauri as an asset. It includes a review of thesauri and their characteristics in terms of multilingualism, openness and quality. Expressing dataset and linkset quality would be needed to make the quality assessment of thesauri accessible. Quality of thesauri and linksets is not necessarily limited to the initial review of thesauri; it should be monitored and promptly documented. http://www.edbt.org/Proceedings/2013-Genova/papers/workshops/a8-albertoni.pdf presents measures for quality of linksets. |
R-Citable, R-DataEnrichment, R-DataVersion, R-ProvAvailable, R-QualityComparable, R-QualityCompleteness, R-QualityMetrics, R-QualityOpinions, etc. |
13 Machine-readability and Interoperability of Licenses | ||
14 Mass Spectrometry Imaging (MSI) | Quality: varies with the mass spectrometry instrument used and the preparation of the sample. Note (AI): this is quality of content (images)! |
R-DataEnrichment R-QualityCompleteness R-QualityMetrics |
15 OKFN Transport WG | Perceived liability risks, often associated with data quality issues, prevent operators from opening up their data. |
R-DataMissingIncomplete R-DataProductionContext R-GeographicalContext R-QualityComparable R-QualityCompleteness R-QualityMetrics |
16 Open City Data Pipeline | Challenges:
|
R-DataMissingIncomplete R-DataProductionContext R-GeographicalContext R-QualityComparable R-QualityCompleteness |
17 Open Experimental Field Studies | For measurements to be considered useful and comparable to other findings scientists need to track every aspect of their laboratory and field experiments. This can include: background describing the purpose of the experiment, [...] quality assurance, problem reporting [...] quality control codes selected...
Quality: Housekeeping data, problem reporting, maintenance history, calibration history. Negative aspects: When data is published on the Web there is no mechanism for users to rate and review data. Challenges:
|
R-AccessRealTime R-DataIrreproducibility R-DataLifecycleStage R-DataProductionContext R-QualityOpinions |
18 Resource Discovery for Extreme Scale Collaboration (RDESC) | Quality: it is important to maintain the correctness and quality of search results.
Challenges:
Potential Requirements:
|
R-AccessRealTime R-DataMissingIncomplete R-ProvAvailable R-SLAAvailable |
19 Recife Open Data Portal | Quality: Verified and clean data.
Challenges: Automate the data publishing process to keep data up to date and accurate. |
R-QualityComparable, R-QualityCompleteness |
20 Retrato da Violência (Violence Map) | Quality: not guaranteed. (!)
Negative Aspects: the data is already outdated (in 2014) |
R-AccessUpToDate R-QualityCompleteness |
21 Share-PSI 2.0 | Report from which many requirements can be derived http://www.w3.org/2013/share-psi/workshop/samos/report |
R-AccessRealTime, R-AccessUpToDate, R-GeographicalContext, R-ProvAvailable, R-QualityComparable, R-QualityOpinions |
22 Tabulae - how to get value out of data | Quality: The information must be at least semi-structured (for instance, a spreadsheet).
Challenges:
|
R-AccessUpToDate R-FormatLocalize R-ProvAvailable R-QualityComparable R-QualityCompleteness |
23 UK Open Research Data Forum | Quality: Variable - often empirical, often messy. Some of the data may not be repeatable. |
R-ProvAvailable |
24 Uruguay Open Data Catalog | Quality: Most of the data is realized properly, with complete or near complete metadata.
Challenges: Automated publication process using harvesting or similar tools. Alerts or control panels to keep data updated. |
R-DataMissingIncomplete R-AccessLevel |
25 Web Observatory | Quality: Variable, depends on the data source, can be structured or not.
Challenges: Data velocity; Data variety |
R-DataEnrichment R-GranularityLevels R-ProvAvailable |
26 Wind Characterization Scientific Study | The DMF will record all processing history, quality assurance work, problem reporting, and maintenance activities for both instrumentation and data. |
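
The dataset metrics quoted for the Bio2RDF use case (UC 3) are simple counts over the triples of a dataset. As an illustration only, here is a minimal Python sketch of a few of them. It is not the Bio2RDF implementation: the function names, the representation of triples as plain string tuples, the convention that literals start with a double quote, the choice to count link frequencies per predicate, and the sample data are all assumptions made for this example.

```python
from collections import Counter

# Assumption: the typing predicate is written as this prefixed name.
RDF_TYPE = "rdf:type"


def is_literal(term):
    """Assumption of this sketch: literals are strings that start with a double quote."""
    return term.startswith('"')


def dataset_metrics(triples):
    """Compute a subset of the metrics listed for UC 3 from (subject, predicate, object) tuples."""
    triples = list(triples)
    return {
        "triples": len(triples),
        "unique_subjects": len({s for s, _, _ in triples}),
        "unique_predicates": len({p for _, p, _ in triples}),
        "unique_objects": len({o for _, _, o in triples}),
        # Types are the objects of rdf:type statements.
        "unique_types": len({o for _, p, o in triples if p == RDF_TYPE}),
        # Frequency of each predicate when it links to a non-literal object.
        "predicate_object_links": Counter(
            p for _, p, o in triples if not is_literal(o)
        ),
        # Frequency of each predicate when it links to a literal.
        "predicate_literal_links": Counter(
            p for _, p, o in triples if is_literal(o)
        ),
    }


# Hypothetical sample data, invented for this example.
SAMPLE = [
    ("ex:a", "rdf:type", "ex:Gene"),
    ("ex:a", "rdfs:label", '"gene a"'),
    ("ex:b", "rdf:type", "ex:Gene"),
    ("ex:b", "ex:interactsWith", "ex:a"),
]

if __name__ == "__main__":
    for name, value in dataset_metrics(SAMPLE).items():
        print(name, value)
```

The namespace-level metrics (references to a namespace, inter-namespace references) are left out because they depend on how IRIs are mapped to namespaces, which the quote does not specify.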