Best Practices/Linked Data Approach for Publishing Statistical Data

From Share-PSI EC Project

Outline

Linked Data Approach for Publishing Statistical Data

Management summary

Challenge

Statistical data is the foundation for policy prediction, planning and adjustment, and therefore has a significant impact on society, from citizens to businesses to governments. The process of collecting and monitoring socio-economic indicators can be considerably improved if the data produced by government organizations such as statistical offices, national banks and employment services is published in Linked Data format.

Solution

The Linked Data approach enables datasets to be linked together through references to common concepts. A dataset is represented as a graph, using the Resource Description Framework (RDF) as a general-purpose language. The Linked Data publication process is the set of activities needed to publish RDF datasets on the Web from different sources (e.g., databases): extraction, transformation, validation, exploration and publication. Ready-to-use RDF datasets can either be stored locally or registered in a metadata catalog, for example one built with the open-source CKAN tool.
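To illustrate how two datasets become linked through a reference to a common concept, here is a minimal sketch in plain Python. It is not the tooling used by the project: the graphs are held as sets of (subject, predicate, object) triples, every `http://example.org/` URI is a hypothetical placeholder, and the literal values are made-up toy numbers.

```python
from typing import List, NamedTuple, Set, Tuple, Union


class Literal(NamedTuple):
    """A plain string literal; any bare str in a triple is treated as a URI."""
    value: str


Term = Union[str, Literal]
Triple = Tuple[str, str, Term]

EX = "http://example.org/"        # hypothetical namespace
REGION_R1 = EX + "region/R1"      # shared concept referenced by both publishers

# Two independently published graphs (toy values, for illustration only).
statistical_office: Set[Triple] = {
    (EX + "stat/obs1", EX + "prop/refArea", REGION_R1),
    (EX + "stat/obs1", EX + "prop/population", Literal("1000")),
}
employment_service: Set[Triple] = {
    (EX + "empl/obs7", EX + "prop/refArea", REGION_R1),
    (EX + "empl/obs7", EX + "prop/unemploymentRate", Literal("7.5")),
}


def term_to_nt(term: Term) -> str:
    """Render one term in N-Triples syntax."""
    if isinstance(term, Literal):
        escaped = term.value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    return f"<{term}>"


def to_ntriples(graph: Set[Triple]) -> str:
    """Serialize a graph as sorted N-Triples lines."""
    return "\n".join(
        sorted(f"{term_to_nt(s)} {term_to_nt(p)} {term_to_nt(o)} ." for s, p, o in graph)
    )


def subjects_about(graph: Set[Triple], concept: str) -> List[str]:
    """Return every subject that points at the given concept URI."""
    return sorted(s for s, _, o in graph if o == concept)


# Merging RDF graphs is a set union; the shared URI is what links the data.
merged = statistical_office | employment_service
print(to_ntriples(merged))
print(subjects_about(merged, REGION_R1))
```

Because both publishers use the same URI for the region, a consumer can find observations from either source with a single lookup, without any prior agreement on table layouts.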

Best Practice identification

Why is this a Best Practice?

  • The approach contributes to the standardization of the process of publishing and re-use of multi-dimensional data on the Web.
  • The approach is based on the RDF Data Cube vocabulary, which is mature enough for publishing statistical data: it improves interoperability and allows data from different statistical sources to be compared.
  • The vocabulary is compatible with the cube model that underlies SDMX (Statistical Data and Metadata eXchange), an ISO standard for exchanging and sharing statistical data and metadata among organizations. It provides a layer on top of the data to describe domain semantics, dataset metadata, and other information needed in the process of statistical data exchange.
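As a rough illustration of what data described with the RDF Data Cube vocabulary looks like, the sketch below generates a small Turtle document in plain Python. The `qb:`, `sdmx-dimension:` and `sdmx-measure:` terms come from the published vocabularies. The `ex:` namespace, the identifiers, the values, and the use of an `xsd:gYear` literal for the reference period are assumptions made for this example only.

```python
PREFIXES = """\
@prefix qb: <http://purl.org/linked-data/cube#> .
@prefix sdmx-dimension: <http://purl.org/linked-data/sdmx/2009/dimension#> .
@prefix sdmx-measure: <http://purl.org/linked-data/sdmx/2009/measure#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex: <http://example.org/> .
"""


def observation_turtle(obs_id: str, dataset_id: str, area_id: str,
                       year: int, value: str) -> str:
    """Render one qb:Observation: two dimensions (area, period) and one measure."""
    return (
        f"ex:{obs_id} a qb:Observation ;\n"
        f"    qb:dataSet ex:{dataset_id} ;\n"
        f"    sdmx-dimension:refArea ex:{area_id} ;\n"
        f'    sdmx-dimension:refPeriod "{year}"^^xsd:gYear ;\n'
        f'    sdmx-measure:obsValue "{value}"^^xsd:decimal .\n'
    )


def build_document() -> str:
    """Assemble prefixes, the dataset declaration, and two toy observations."""
    parts = [
        PREFIXES,
        "ex:unemployment a qb:DataSet .\n",
        # Hypothetical identifiers and made-up values, for illustration only.
        observation_turtle("obs1", "unemployment", "R1", 2013, "7.5"),
        observation_turtle("obs2", "unemployment", "R1", 2014, "7.1"),
    ]
    return "\n".join(parts)


document = build_document()
print(document)
```

A complete cube would also declare a `qb:DataStructureDefinition` listing the dimensions and measures; that part is left out here to keep the sketch short.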

Links to the PSI Directive

Publication and deployment of information

Why is there a need for this Best Practice?

  • To share experience and encourage government organizations to follow existing approaches.

What do you need for this Best Practice?

This best practice relies on a set of tools for automating the data extraction and publication process. The EU research community has delivered many open-source tools for publishing statistical data in Linked Data format, for example the LOD2 Statistical Workbench.

Applicability to other Member States

The approach is applicable to any Member State.

Contact info

Valentina Janev, Institute Mihajlo Pupin, valentina.janev@institutepupin.com