This wiki has been archived and is now read-only.

Data Cube PR transition

From Government Linked Data (GLD) Working Group Wiki

Transition to Proposed Recommendation

This page is for editors to organize the documentation and evidence necessary to transition a document to Proposed Recommendation. The page's content will be used for the transition request and to inform the transition meeting for that document.

This is a working page for the Government Linked Data working group. It may be subject to change/revision at any time.

Structure taken from W3C Technical Report Development Process.

Data Cube Timetable

Data Cube CR transition request - for reference


The RDF Data Cube Vocabulary


Current CR version: http://www.w3.org/TR/2013/CR-vocab-data-cube-20130625/

Proposed PR version: https://dvcs.w3.org/hg/gld/raw-file/default/data-cube/static-pr.html

Diff with CR version: https://dvcs.w3.org/hg/gld/raw-file/default/data-cube/static-pr-diff.html

Document Abstract

There are many situations where it would be useful to be able to publish multi-dimensional data, such as statistics, on the web in such a way that it can be linked to related data sets and concepts. The Data Cube vocabulary provides a means to do this using the W3C RDF (Resource Description Framework) standard. The model underpinning the Data Cube vocabulary is compatible with the cube model that underlies SDMX (Statistical Data and Metadata eXchange), an ISO standard for exchanging and sharing statistical data and metadata among organizations. The Data Cube vocabulary is a core foundation which supports extension vocabularies to enable publication of other aspects of statistical data flows or other multi-dimensional data sets.

The namespace for all terms in this ontology is: http://purl.org/linked-data/cube#

The vocabulary defined in this document is also available in these non-normative formats: Turtle.

Status of the document

See: https://dvcs.w3.org/hg/gld/raw-file/default/data-cube/static-pr.html

This vocabulary was originally developed and published outside of W3C, but has been extended and further developed within the Government Linked Data Working Group.

This document was published by the Government Linked Data Working Group as a Proposed Recommendation. This document is intended to become a W3C Recommendation. The W3C Membership and other interested parties are invited to review the document and send comments to public-gld-comments@w3.org (subscribe, archives) through 12 January 2014. Advisory Committee Representatives should consult their WBS questionnaires. Note that substantive technical comments were expected during the Last Call review period that ended 08 April 2013.

Please see the Working Group's implementation report.

Publication as a Proposed Recommendation does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

Changes to the document

Substantive changes

There have been no changes that the working group regards as substantive.

Noteworthy changes

There have been two changes to the document that should be highlighted.

ISSUE-68: RDFS closure rule completeness

A case was reported [1] where a data set passed the Integrity Constraint (IC) rules but in fact contained an error. The IC rules are not claimed to be complete; however, the error could be detected by a small extension of the partial RDFS closure rules included in the spec. The working group determined that the extension was a non-substantive correction, since the intent of this part of the specification was clear. The specification already allowed implementations to use full RDFS closure in place of the minimal closure rules provided. The proposed change would not affect published data, only validation tools. The working group published a detailed description of the change [2] and linked it to the implementations reporting page to make sure implementors were aware of it. No concerns were raised and the WG closed the issue.
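To illustrate why the choice of closure rules matters to a validator, the following is a minimal sketch of a partial RDFS closure computed to a fixpoint over plain tuples. It is not the rule set from the specification or from the change description [2]; the four rules shown (sub-property, sub-class, domain and range) are chosen only for illustration, and every `eg:` name is hypothetical.

```python
# Illustrative sketch only: a toy partial RDFS closure over (s, p, o) tuples.
# The rule selection is an assumption and does not restate the spec's rules.
RDF_TYPE = "rdf:type"
SUBPROP = "rdfs:subPropertyOf"
SUBCLASS = "rdfs:subClassOf"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"


def partial_rdfs_closure(triples):
    """Return the input triples plus those entailed by four RDFS rules."""
    closed = set(triples)
    while True:
        new = set()
        for s, p, o in closed:
            for s2, p2, o2 in closed:
                # (p subPropertyOf q), (s p o) => (s q o)
                if p2 == SUBPROP and s2 == p:
                    new.add((s, o2, o))
                # (c subClassOf d), (s type c) => (s type d)
                if p == RDF_TYPE and p2 == SUBCLASS and s2 == o:
                    new.add((s, RDF_TYPE, o2))
                # (p domain c), (s p o) => (s type c)
                if p2 == DOMAIN and s2 == p:
                    new.add((s, RDF_TYPE, o2))
                # (p range c), (s p o) => (o type c)
                if p2 == RANGE and s2 == p:
                    new.add((o, RDF_TYPE, o2))
        if new <= closed:
            # Nothing new was derived, so the fixpoint has been reached.
            return closed
        closed |= new


# Hypothetical data: the type of eg:wales is only visible after closure.
data = {
    ("eg:refArea", SUBPROP, "eg:area"),
    ("eg:area", RANGE, "eg:Region"),
    ("eg:obs1", "eg:refArea", "eg:wales"),
}
closure = partial_rdfs_closure(data)
```

In this toy data the triple stating that `eg:wales` is an `eg:Region` is derived only by chaining the sub-property rule with the range rule. A check that inspects types would therefore see different facts depending on which closure rules were applied first.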

ISSUE-69: typographical error in IC-8 rule

The WG noted a typographical error in the IC-8 rule, tracked as ISSUE-69. The rule works as intended in the normal situation where the test graph contains only one Data Cube, but fails when multiple Data Cubes are presented simultaneously. The WG determined that the intent of the rule is clear, that the published query works for the normal (single cube) case, and that the correction would not affect any current implementation reports. On this basis the WG resolved to treat this as an editorial correction.

Minor changes

  • Added namespace, and links to vocabulary file, to the Abstract for ease of reference.
  • Corrected mis-statement of domain of qb:sliceKey in the reference section (was correct in the rest of the specification).
  • Minor typographical corrections.
  • Updated references.
  • Removed CR specific text on implementation feedback and At Risk features.

Evidence of wide review

Agreed exit criteria were: Two independently developed data sources have been demonstrated which comprise either well-formed abbreviated Data Cubes or well-formed Data Cubes, which pass all retained integrity checks.

An on-line validation tool that implements the well-formedness checks (it runs the Normalization Algorithm and the Integrity Checking rules) was provided at http://www.w3.org/2011/gld/validator/qb/qb-validator. Implementors were not required to use this tool.

A summary of the implementation reports is provided at Data Cube Implementations.

The summary covers 29 reports of usage of the vocabulary, including some work in progress. These include publication of several hundred data cubes ranging in individual size and complexity from 500 MTriples (COINS) down to spreadsheet-scale cubes. Implementations include two general visualizers, two validators, one general data converter and some vocabulary extensions, in addition to actual data publication. Internal commercial applications as well as open data publications were reported. Further publicly funded projects are developing additional Data Cube tools, including further data converters and visualizers.

In terms of the exit criteria, we have 5 formal conformance reports, from 4 independent groups, which pass all integrity checks, together with a further 4 reports in which integrity checks correctly fail.

The formal exit criteria have thus been met.

At Risk

Two parts of the specification were marked At Risk at CR: the Normalization Algorithm and the Integrity Checking (IC) rules.

All formal conformance reports received were able to pass all the IC rules after normalization and no cases of incorrect failure of an IC rule have been reported. Note the changes reported above under ISSUE-68 and ISSUE-69.

We thus propose to proceed with these features retained.

Optional usage

In addition, at CR transition, we were asked to check that the terms in the vocabulary that might be regarded as optional (those without which a well-formed data cube can still be assembled) had actually been used. This was not part of the formal exit criteria. The table at Data Cube Implementations#Optional property usage shows that there has been at least one report of successful usage of each such term.

Evidence that the document satisfies group's requirements

This is unchanged since CR.

The charter requirement states:

Statistical "Cube" Data. The group will produce a vocabulary, compatible with SDMX, for expressing some kinds of statistical data. This need not be as expressive as all of SDMX, but may provide a subset as in the RDF Data Cube vocabulary. It may also include ways to annotate data to indicate its assumptions and comparability.

The specification is based directly on the previously published RDF Data Cube vocabulary and so directly meets the charter requirement. It provides URIs for individual observations and groups of observations and so offers a basis for annotating data. It does not provide additional vocabulary to describe "assumptions and comparability" and so does not fully address the final optional aspect of the charter specification.

Evidence that issues have been formally addressed

All issues raised before CR were addressed and reviewed at that stage.

During the implementation phase only two substantive issues were raised. These were tracked as ISSUE-68 and ISSUE-69 and are discussed above.

In addition two implementors [3] [4] noted that the non-normative examples in the specification include Turtle prefixed names with a leading digit in the localname part. This is legal under the current version of the Turtle specification [5] (and has been for some time) but was not legal under the original Team Submission for Turtle, which is still the default for some tools.
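As a rough illustration of the kind of name involved, the sketch below flags prefixed names whose local part begins with a digit. The pattern is a deliberate simplification and not the Turtle grammar; the function name and the `eg:` examples are hypothetical and are not taken from the specification.

```python
import re

# Simplified shape of a prefixed name: optional prefix, colon, local part.
# This is an assumption for illustration, not the full Turtle grammar.
PNAME = re.compile(r"^(?P<prefix>[A-Za-z][\w.-]*)?:(?P<local>\S*)$")


def local_name_starts_with_digit(pname):
    """Return True if the local part of a prefixed name begins with 0-9."""
    match = PNAME.match(pname)
    if match is None:
        raise ValueError(f"not a prefixed name: {pname!r}")
    # An empty local part (for example "eg:") never starts with a digit.
    return re.match(r"[0-9]", match.group("local")) is not None


# Hypothetical names: the first is the kind that older parsers may reject.
examples = ["eg:2004", "eg:y2004", "eg:"]
flags = [local_name_starts_with_digit(name) for name in examples]
```

A check like this could be run over example data before feeding it to a tool that still follows the original Team Submission grammar.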

A consolidated summary of comments disposition for both CR and LC periods is provided at Data Cube comments.

Formal objections

No formal objections received.

Dependencies on other groups

The specification uses Turtle syntax for its examples and thus references the Turtle CR document non-normatively.

Record the group's decision to request advancement


Expected date of publication

17 December 2013