Data Cube Implementations

From Government Linked Data (GLD) Working Group Wiki

Implementer's note: The WG is considering a change to the normalization algorithm to improve the coverage of the integrity checking rules. Please see [1] for details and respond if this would cause problems.

A summary of known implementations of Data Cube is given below, followed by a table of conformance results that have been formally reported.


Data Cube implementations
Reporter Description Link (if public)
Dave Reynolds Environment Agency, Bathing water quality. Data Cubes are used to represent both current and historic weekly and annual assessments of water quality at bathing locations in England and Wales. http://environment.data.gov.uk
Dave Reynolds COINS. The UK Treasury Combined Online Information System dataset for 2010 was released as Linked Data using the Data Cube vocabulary. http://data.gov.uk/resources/coins
Dave Reynolds Local government payments. A number of UK Local Authorities have published Linked Data describing payments made to suppliers. This was achieved via the payments ontology, an extension of the Data Cube vocabulary. http://data.gov.uk/resources/payments
Dave Reynolds Weather forecasts. The UK Met Office has developed a beta service to publish site-specific weather forecasts, using the Data Cube vocabulary to represent the forecast values.
Peter Winstanley [2] Consumption data. Scottish Government have made use of the RDF Data Cube vocabulary for the publication of utilities consumption data. http://cofog01.data.scotland.gov.uk/
Leigh Dodds [3] NHS performance statistics
Leigh Dodds [4] DOPA project. Used Data Cube to define how to surface Linked Data from a statistical data platform.
Bill Roberts [5] opendatacommunities.org Contains over 100 different data cube datasets on housing, planning, deprivation and departmental business metrics from the UK Department for Communities and Local Government. http://opendatacommunities.org/themes
Ørnulf Risnes [6] Nesstar visualization. Under development by Norwegian Social Science Data Services (NSD).
Sarven Capadisli [7] Linked SDMX Data. Linked data publication of statistical data from IMF, OECD, FAO, BFS, ECB, World Bank, Transparency International. http://imf.270a.info/ http://oecd.270a.info/ http://bfs.270a.info/ http://fao.270a.info/ http://ecb.270a.info/ http://worldbank.270a.info/ http://transparency.270a.info/
Benedikt Kämpgen [8] Eurostat Linked Data Wrapper. Linked data publication of statistical SDMX data from Eurostat using the RDF Data Cube Vocabulary. http://estatwrap.ontologycentral.com/
Benedikt Kämpgen [9] SEC Edgar Linked Data Wrapper. Data Cubes are used to represent XBRL data from the U.S. Securities and Exchange Commission. For example, those cubes are used in the Financial Information Observation System (FIOS) [10]. http://edgarwrap.ontologycentral.com/
Benedikt Kämpgen [11] Global Health Observatory Dataset. Data Cubes are used to represent WHO's Global Health Observatory (http://apps.who.int/ghodata/) dataset (http://gho.aksw.org/). http://gho.aksw.org/
Benedikt Kämpgen [12] ISTAT Immigration. This dataset collects official statistical data about immigration in Italy, provided by the Italian National Institute of Statistics (dati.istat.it). Data is represented by means of the Data Cube vocabulary. http://www.linkedopendata.it/datasets/istat-immigration
Michael Martin [13] CubeViz. An RDF Data Cube browser: a tool for exploring statistics represented with the RDF Data Cube vocabulary. http://aksw.org/Projects/CubeViz.html
Jose Emilio Labra Gayo [14] Computex. This implementation is called Computex (Computational Statistical Indexes). It can be seen as an extension of RDF Data Cube to represent statistical indexes that can be automatically computed using SPARQL queries. https://github.com/weso/computex
Jose Emilio Labra Gayo [15] Computex validator. Validator, which will also validate RDF Data Cube data, is in development. https://github.com/weso/computex
(via Phil Archer) TabLinker. Converts manually annotated Microsoft Excel workbooks to the RDF Data Cube vocabulary. https://github.com/Data2Semantics/TabLinker
Phil Archer, Valentina Janev [16] LOD2 Validator: validates the integrity constraints defined in the W3C RDF Data Cube specification and automatically repairs the errors it identifies. Part of the LOD2 Statistical workbench. http://ict-act.org/proceedings/2013/htmls/papers/icti2013_submission_01.pdf
Ghislain Atemezing [17] [18] Eurecom, School statistics application: provides stats on French schools using a vocabulary that extends Data Cube http://semantics.eurecom.fr/datalift/PerfectSchool/#school/
Benedikt Kämpgen [19] Linked Data Cubes Explorer (LDCX): LDCX takes in the URI of a dataset, loads it from Linked Data, executes the normalisation algorithm and integrity checks, and allows users to create pivot tables from the dataset. As such, LDCX may be useful to two groups: 1) Users of statistical Linked Data that want to explore a dataset and 2) Publishers of statistical Linked Data that want to validate their publication. http://www.ldcx.linked-data-cubes.org:8000/ldcx-trunk/ldcx/ld-cubes-explorer.html
Dave Reynolds [20] Clinical trials: Non-public application of Data Cube to represent aggregate clinical trial results, including adverse events. Not available
George Papastefanatos [21] Greek Census Data: linked-statistics.gr is using the Data Cube Vocabulary to publish statistical data from the Hellenic Statistical Authority (ELSTAT) as Linked Open Data (LOD). http://linked-statistics.gr
Søren Roug [22] European Environment Agency: is using the Data Cube Vocabulary to be able to import SDMX datasets from other organisations into our triple store for further work. This work typically involves combining two or more datasets. Activity includes data conversion and visualization. Example dataset: http://rdfdata.eionet.europa.eu/eurostat/data/env_air_gge.rdf.gz
Dictionary for a dimension: http://rdfdata.eionet.europa.eu/eurostat/dic/ai.rdf
Valentina Janev [23] Serbia Statistics: In the LOD2 framework (http://lod2.eu), the Institute Mihajlo Pupin, Belgrade, Serbia is working with data from the Statistical Office of the Republic of Serbia. http://rs.ckan.net/group/rzs
Bill Roberts [24] Hampshire County Council : A dataset of predicted numbers of houses to be built at various locations in Hampshire, used to be transparent about their building planning process. http://linkeddata.hants.gov.uk/datasets/net-additional-dwellings
Bill Roberts [25] Open data Scotland : Includes approximately 12 separate data cube datasets on deprivation and education. http://data.opendatascotland.org - not yet live but due for release in December 2013.
Bill Roberts [26] OpenCube FP7 Project : EU project to produce tools for production and consumption of Data Cubes. Just started November 2013, no implementations yet. http://www.opencube-project.eu/
Florian Stegmaier [27] CODE FP7 project: Data Cube used for data integration of primary research data. http://code-research.eu/
Sarven Capadisli Linked Statistical Data Analysis is a human- and machine-friendly Web-based application which uses statistical linked dataspaces (i.e., data modelled with QB) for federated queries, and generates analyses and visualisations. http://stats.270a.info/

Conformance reports

Reporter Name Conformance
Jose Emilio Labra Gayo [28] Computex All IC pass [29]
Ghislain Atemezing [30] Eurecom School statistics application All IC pass [31]
Dave Reynolds [32] EA Bathing Water Quality All IC pass [33]
Dave Reynolds [34] Clinical trials All IC pass
Valentina Janev [35] Serbia Statistics Office All IC pass [36]
Florian Stegmaier [37] CODE Project early stage report IC-1, IC-3 and IC-4 fail correctly. These failures do not reflect errors in the rules; they reflect limitations of the early stage data (IC-1, IC-4) and the lack of normalization in the validation process used (IC-3). [38]
Sarven Capadisli [39] Linked SDMX IC-4 fails correctly [40]. The implementation report explains that the data is automatically generated by an SDMX-ML to RDF transformation which is not yet able to infer the range statements needed for IC-4.
Sarven Capadisli [41] Linked SDMX IC-19a fails correctly [42]. This correctly identified an error in the transformation.
Sarven Capadisli [43] Linked SDMX IC-1, IC-4 and IC-8 fail correctly. This was synthetic data designed with a deliberate omission which should have been detected by IC-8 and was.
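
To make "fail correctly" concrete, the following is a minimal sketch of two of the checks named in the table, written in plain Python over a list of triples. It is not the normative form of the constraints and not the method used in any of the reports above; the checks are paraphrased on the assumption that IC-1 requires every qb:Observation to have exactly one qb:dataSet and that IC-4 requires every qb:DimensionProperty to declare an rdfs:range. The example.org resources and the helper function names are hypothetical.

```python
# Namespace terms written out as plain strings; no RDF library is assumed.
QB = "http://purl.org/linked-data/cube#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_RANGE = "http://www.w3.org/2000/01/rdf-schema#range"


def subjects_of_type(triples, rdf_class):
    """Return every subject explicitly typed as rdf_class."""
    return {s for s, p, o in triples if p == RDF_TYPE and o == rdf_class}


def objects_of(triples, subject, predicate):
    """Return the distinct objects of (subject, predicate, ?o)."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def ic1_violations(triples):
    """Observations without exactly one qb:dataSet (IC-1 as paraphrased above)."""
    observations = subjects_of_type(triples, QB + "Observation")
    return sorted(
        obs for obs in observations
        if len(objects_of(triples, obs, QB + "dataSet")) != 1
    )


def ic4_violations(triples):
    """Dimension properties with no rdfs:range (IC-4 as paraphrased above)."""
    dimensions = subjects_of_type(triples, QB + "DimensionProperty")
    return sorted(
        dim for dim in dimensions
        if not objects_of(triples, dim, RDFS_RANGE)
    )


# Hypothetical data: one complete observation, one observation with no
# data set, one dimension with a range and one dimension without.
EX = "http://example.org/"
triples = [
    (EX + "obs1", RDF_TYPE, QB + "Observation"),
    (EX + "obs1", QB + "dataSet", EX + "dataset"),
    (EX + "obs2", RDF_TYPE, QB + "Observation"),
    (EX + "refArea", RDF_TYPE, QB + "DimensionProperty"),
    (EX + "refArea", RDFS_RANGE, EX + "Area"),
    (EX + "refPeriod", RDF_TYPE, QB + "DimensionProperty"),
]

print(ic1_violations(triples))
print(ic4_violations(triples))
```

In this sketch a non-empty result means the check has flagged the data. The checks only look at explicitly stated rdf:type triples, so data that relies on inferred types would first need the normalization step, which is consistent with the IC-3 note in the CODE Project row.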

Optional property usage

Property Usage
qb:ObservationGroup, qb:observationGroup Environment Agency Bathing Water (experimental)
qb:concept Eurecom School Statistics application
SDMX Concepts (see http://www.w3.org/TR/vocab-data-cube/#dsd-cog)
Linked SDMX
qb:measureDimension, qb:measureType Guillaume Duffes - implementation report offered [44]
Greek Census data (project ongoing)
OpenDataCommunities
qb:componentRequired Computex
qb:order Computex
Linked SDMX
qb:HierarchicalCodeList, qb:parentChildProperty, qb:hierarchyRoot OpenDataCommunities - experimental use, planned to move to public site
Linked SDMX