[QB] ISSUE-31 (Aggregation hierarchies) Discussion and proposal

ISSUE 31 [1] concerns support for hierarchies other than SKOS. I know 
this one is controversial ...

In Data Cube as it stands, a cube component (qb:ComponentProperty, 
especially a dimension) can be coded using codes from a 
skos:ConceptScheme. This roughly corresponds to the SDMX notion of a 
Concept Scheme [2] where codes are arranged in a parent/child hierarchy. 
The normal skos:narrower/skos:broader relationships are used to express 
such parent/child relationships.

Often in statistical publishing values will be given for different 
levels in such a concept scheme - e.g. the population statistics for the 
UK as a whole as well as those for each region, though the value for the 
parent may not simply be the sum of those for its children.

Especially when dealing with geographic hierarchies, some limitations of 
the current approach have come up in practice, e.g. see [3].

There are 3 problems here and a non-problem.

The non-problem is that it is perfectly reasonable to create a 
skos:Concept to represent a geographic region and to use 
skos:narrower[4] to represent a relevant containment hierarchy. That's 
not why ISSUE-31 is on the list.

The problems are:

(a) Publishers would like to reuse existing geographic hierarchies 
already published as linked data (e.g. [5]) but where that data uses 
predicates other than skos:narrower to represent the hierarchy.

(b) The same geographic regions can participate in multiple hierarchies. 
As well as spatial containment there is also administrative containment. 
We can only use skos:narrower for one of these.

(c) In cases like [3] people also wish to state when the child concepts 
are disjoint so that aggregation might be possible (so long as the 
measures themselves can be aggregated), or more strongly that the parent 
concepts are a disjoint union of the child concepts.
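
To illustrate (c), here is a minimal sketch in Python of what 
aggregating up a disjoint-union hierarchy would amount to. The region 
names, the counts and the aggregate function are all made up for 
illustration; none of this is part of Data Cube.

```python
# Hypothetical parent -> children map. It is ASSUMED to be a disjoint
# union: the children do not overlap and together cover the parent.
CHILDREN = {
    "UK": ["England", "Wales", "Scotland", "NorthernIreland"],
}

# Made-up counts, given for the leaf regions only.
COUNTS = {"England": 50, "Wales": 3, "Scotland": 5, "NorthernIreland": 2}


def aggregate(region, children=CHILDREN, counts=COUNTS):
    """Sum an additive measure (e.g. a simple count) up the hierarchy."""
    kids = children.get(region, [])
    if not kids:
        # Leaf region: use its own observed value.
        return counts[region]
    # Parent region: valid only because the children are assumed disjoint.
    return sum(aggregate(k, children, counts) for k in kids)
```

This only works when the measure itself is additive. A rate or a median 
could not be combined this way, even over disjoint children.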

It is possible to work around (a) by publishing your own skos:narrower 
assertions about the existing published linked data. However, that is 
problematic to keep up to date, may conflict with someone who wants to 
use skos:narrower in a different sense over the same concepts (see b) 
and can be socially/politically problematic.

Problems like (b) are handled in SDMX through the use of hierarchical 
code schemes [6] which can be used to create multiple different 
generalized hierarchies over the same code list.

PROPOSAL.  The proposed approach is a vocabulary extension:

qb:hierarchy (domain: qb:CodedProperty, range: qb:Hierarchy)
    Indicates a specification of the hierarchy used for coding this 
property (typically a DimensionProperty). Where a skos:ConceptScheme 
exists with appropriate broader/narrower relations then that should be 
used and should be specified using qb:codeList. The qb:hierarchy 
declaration is only needed for situations where a suitable 
skos:ConceptScheme is not available.

qb:Hierarchy (owl:Class)
    Specifies a hierarchy which can be used for coding. The same 
concepts may be members of multiple hierarchies provided that different 
qb:[narrowing/broadening]Property values are used for each hierarchy.

qb:AggregatableHierarchy (sub class of: qb:Hierarchy)
     Indicates a hierarchy in which each parent concept is a disjoint 
union of its child concepts, so that measures such as simple counts 
*may* be aggregated up the hierarchy.

qb:hierarchyRoot (domain: qb:Hierarchy, range: skos:Concept)
    Specifies a root of the hierarchy. A hierarchy may have multiple 
roots but must have at least one.[7]

qb:narrowingProperty (domain: qb:Hierarchy, range: rdf:Property)
    Specifies a property which relates a parent concept in the hierarchy 
to a child concept. One of qb:narrowingProperty or qb:broadeningProperty 
must be given but it is not necessary to have both. Note that a child 
may have more than one parent.

qb:broadeningProperty (domain: qb:Hierarchy, range: rdf:Property)
    Specifies a property which relates a child concept in the hierarchy 
to a parent concept. One of qb:narrowingProperty or 
qb:broadeningProperty must be given but it is not necessary to have 
both. Note that a child may have more than one parent.
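
As a rough sketch of how a consumer might use these declarations, the 
Python below walks a hierarchy from its roots using either a narrowing 
or a broadening property. The triples, the ex: names and the 
children/walk functions are hypothetical, chosen only to show two 
hierarchies over the same concepts; this is not a reference 
implementation.

```python
# Made-up (subject, predicate, object) triples; all ex: names are hypothetical.
TRIPLES = {
    # A spatial hierarchy, expressed parent -> child.
    ("ex:UK", "ex:contains", "ex:England"),
    ("ex:UK", "ex:contains", "ex:Wales"),
    ("ex:England", "ex:contains", "ex:Bristol"),
    # A second, administrative hierarchy over the same concepts,
    # expressed child -> parent.
    ("ex:Bristol", "ex:administeredBy", "ex:SouthWest"),
    ("ex:SouthWest", "ex:administeredBy", "ex:UK"),
}


def children(triples, parent, narrowing=None, broadening=None):
    """Children of `parent` via a narrowing and/or a broadening property."""
    if narrowing is None and broadening is None:
        raise ValueError("one of narrowing or broadening must be given")
    found = set()
    for s, p, o in triples:
        if narrowing is not None and p == narrowing and s == parent:
            found.add(o)
        if broadening is not None and p == broadening and o == parent:
            found.add(s)
    return sorted(found)


def walk(triples, roots, narrowing=None, broadening=None):
    """Return (concept, depth) pairs, depth-first from each root."""
    out = []

    def visit(node, depth, path):
        out.append((node, depth))
        for child in children(triples, node, narrowing, broadening):
            if child not in path:  # guard against cycles in the data
                visit(child, depth + 1, path | {child})

    for root in roots:
        visit(root, 0, {root})
    return out
```

Calling walk(TRIPLES, ["ex:UK"], narrowing="ex:contains") follows the 
spatial hierarchy, while walk(TRIPLES, ["ex:UK"], 
broadening="ex:administeredBy") follows the administrative one over the 
same concepts, without either needing skos:narrower.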

Dave

[1] http://www.w3.org/2011/gld/track/issues/31

[2] http://sdmx.org/docs/2_0/SDMX_2_0%20SECTION_02_InformationModel.pdf p.39

[3] 
https://groups.google.com/forum/#!topic/publishing-statistical-data/3I1-Ix1Hk14/discussion

[4] I'm using skos:narrower as a shorthand for 
skos:narrower/skos:broader, no slight intended to skos:broader

[5] http://data.ordnancesurvey.co.uk/.html

[6] http://sdmx.org/docs/2_0/SDMX_2_0%20SECTION_02_InformationModel.pdf p.96

[7] This is to match SDMX 2.0 [6] which also supports multiple roots.

Received on Thursday, 28 February 2013 11:31:16 UTC