From Data on the Web Best Practices

This questionnaire is meant to obtain more precise information from Use Case owners about the quality features and requirements of their use cases. This is a work in progress.

For each of the following quality dimensions, is assessing this dimension a requirement for your Use Case? Can you rank the required dimensions in order of priority? How do you assess your dataset against the quality dimensions? Are you using measures, or assigning one element from a controlled list of values or a ranking system (stars, levels...)? Are there any other quality dimensions and metrics that are required for the Use Case?

  • accuracy;
  • availability;
  • completeness;
  • conformance;
  • consistency;
  • credibility;
  • processability;
  • relevance;
  • timeliness.

Source for the quality dimensions: slide 8 of this presentation (CC-BY Makx Dekkers/Open Data Support/PwC European Commission)

Some definitions from http://eis-bonn.github.io/Luzzu/papers/ldow2014.pdf

  • A Quality Dimension is a characteristic of a dataset relevant to the consumer (e.g. Availability of a dataset).
  • A Quality Metric is a procedure for measuring a data quality dimension, which is abstract, by observing a concrete quality indicator. There are usually multiple metrics per dimension; e.g., availability can be indicated by the accessibility of a SPARQL endpoint, or of an RDF dump. The value of a metric can be numeric (e.g., for the metric “human-readable labelling of classes, properties and entities”, the percentage of entities having an rdfs:label or rdfs:comment) or boolean (e.g. whether or not a SPARQL endpoint is accessible).
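
To make the numeric/boolean distinction concrete, here is a minimal Python sketch, not taken from the cited paper. It assumes the dataset is already available as plain (subject, predicate, object) tuples of strings, and it treats every distinct subject as an "entity"; both are simplifications. The function names `labelling_ratio` and `endpoint_accessible`, and the caller-supplied `probe` callable, are hypothetical.

```python
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
RDFS_COMMENT = "http://www.w3.org/2000/01/rdf-schema#comment"


def labelling_ratio(triples):
    """Numeric metric: fraction of distinct subjects that carry an
    rdfs:label or rdfs:comment (assumption: subjects stand in for entities)."""
    subjects = set()
    labelled = set()
    for subject, predicate, _obj in triples:
        subjects.add(subject)
        if predicate in (RDFS_LABEL, RDFS_COMMENT):
            labelled.add(subject)
    if not subjects:
        return 0.0  # empty dataset: report 0 rather than divide by zero
    return len(labelled) / len(subjects)


def endpoint_accessible(probe):
    """Boolean metric: True if the caller-supplied probe (for example, a
    function that sends a trivial query to a SPARQL endpoint) succeeds."""
    try:
        return bool(probe())
    except OSError:
        # Network-level failures count as "not accessible".
        return False


sample = [
    ("http://example.org/a", RDFS_LABEL, "Entity A"),
    ("http://example.org/b", RDFS_COMMENT, "A comment on B"),
    ("http://example.org/c", "http://example.org/relatedTo", "http://example.org/a"),
]
ratio = labelling_ratio(sample)
```

Multiplying `ratio` by 100 gives the percentage form mentioned in the definition. How the probe actually contacts an endpoint is deliberately left to the caller.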

Another example: a dimension could be "multilinguality" and two metrics could be "ratio of literals with language tags" and "number of different language tags".
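
The two multilinguality metrics can be sketched in Python as follows. This is an illustration under stated assumptions: literals are given as (text, language tag) pairs with `None` for untagged literals, and tags are compared case-insensitively. The function name `multilinguality_metrics` is hypothetical.

```python
def multilinguality_metrics(literals):
    """Return (ratio of literals with a language tag,
    number of different language tags).

    `literals` is an iterable of (text, lang) pairs; lang is None or ""
    when the literal has no language tag (assumed input shape).
    """
    total = 0
    tagged = 0
    tags = set()
    for _text, lang in literals:
        total += 1
        if lang:
            tagged += 1
            tags.add(lang.lower())  # "EN" and "en" count as the same tag
    ratio = tagged / total if total else 0.0
    return ratio, len(tags)


literals = [
    ("Hello", "en"),
    ("Bonjour", "fr"),
    ("Hi", "EN"),
    ("42", None),
]
tag_ratio, tag_count = multilinguality_metrics(literals)
```

For this sample, three of the four literals are tagged and two distinct tags occur, so one dimension yields two independent numeric metric values.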