Washington Agenda Technical Item: SKOS Mapping Properties -- Data Model and Usage Conventions (Issues 71, 73, 74, 75) (@@TODO mins)

Issues tracker: ISSUE-71, ISSUE-73, ISSUE-74, ISSUE-75

If we keep skos:exactMatch, skos:broadMatch, skos:narrowMatch and skos:relatedMatch, the main questions are:

Usage Conventions: what do you use them for?
Data Model: what are the formal property definitions, and in particular, what are the dependencies on skos:broader, skos:narrower and skos:related?

These two questions are bound up together, because usage conventions interact strongly with formal definitions (ideally, you want the formal definitions to support and reinforce the usage conventions).

Usage Conventions

Taking the question of usage first, there appear to be four relevant scenarios identified so far. I'll try to describe them here, to help structure the discussion...

Usage scenario A, describing the structure of a concept scheme (a.k.a. kos description)

This is the common scenario, where SKOS is used to describe the conceptual links (hierarchical, associative) within a knowledge organisation system such as a thesaurus.

There should be little controversy here. The main usage option is: use skos:broader, skos:narrower and skos:related, with skos:inScheme on all the concepts.

E.g.

<A> skos:prefLabel "animals"; skos:inScheme <S>.
<B> skos:prefLabel "mammals"; skos:broader <A>; skos:inScheme <S>.

Usage scenario B, mapping between two distinct concept schemes with substantial overlap in scope (a.k.a. kos mapping)

This is the typical scenario of creating links between two distinct knowledge organisation systems that have lots in common, e.g. DDC and LCSH.

If we have the mapping properties, then the main usage option is: use skos:exactMatch, skos:broadMatch, skos:narrowMatch and skos:relatedMatch as appropriate, with skos:inScheme on all the concepts pointing to the scheme the concept is in.

E.g.

# concepts from scheme S
<A> skos:prefLabel "animals"; skos:inScheme <S>.
<B> skos:prefLabel "mammals"; skos:inScheme <S>.

# concepts from scheme T
<C> skos:prefLabel "animals"; skos:inScheme <T>.
<D> skos:prefLabel "zoology"; skos:inScheme <T>.

# some possible mapping links from S to T
<A> skos:exactMatch <C>.
<A> skos:relatedMatch <D>.
<B> skos:broadMatch <C>.

Usage scenario C, linking two concept schemes to make one "big" concept scheme, where the two concept schemes have little or no overlap in scope (a.k.a. kos extension)

This usage scenario is less well explored.

An example of this scenario is where a community uses an "upper level" kos such as LCSH to provide a general framework, then creates some more specific concepts for their own domain as an extension to LCSH.

There are at least two usage options here.

One usage option is: use skos:broader, skos:narrower and skos:related, and use skos:inScheme to "include" concepts from one scheme into the other. This is the pattern described in http://www.w3.org/TR/2008/WD-skos-primer-20080221/#secextension

E.g.

# more general concept scheme S
<S> rdf:type skos:ConceptScheme. 
<A> skos:prefLabel "Animals"; skos:inScheme <S>.
<B> skos:prefLabel "Mammals"; skos:broader <A>; skos:inScheme <S>.

# extension concept scheme T, "including" concepts from S
<T> rdf:type skos:ConceptScheme.
<A> skos:inScheme <T>.
<B> skos:inScheme <T>.
<C> skos:prefLabel "Cats"; skos:broader <B>; skos:inScheme <T>.
<D> skos:prefLabel "Domestic cats"; skos:broader <C>; skos:inScheme <T>.

Another, similar, usage option is: use skos:broader, skos:narrower and skos:related, both within each scheme and to link them together, then coin a URI for the "virtual" linked concept scheme that includes both the parent and the extension.

E.g.

# more general concept scheme S
<S> rdf:type skos:ConceptScheme. 
<A> skos:prefLabel "Animals"; skos:inScheme <S>.
<B> skos:prefLabel "Mammals"; skos:broader <A>; skos:inScheme <S>.

# extension concept scheme T
<T> rdf:type skos:ConceptScheme.
<C> skos:prefLabel "Cats"; skos:inScheme <T>.
<D> skos:prefLabel "Domestic cats"; skos:broader <C>; skos:inScheme <T>.

# virtual combined concept scheme with linked parts U
<U> rdf:type skos:ConceptScheme.
<A> skos:inScheme <U>.
<B> skos:inScheme <U>.
<C> skos:inScheme <U>; skos:broader <B>.
<D> skos:inScheme <U>.

An application can use this data to present a view of the parent concept scheme only, the extension scheme only, or the combined scheme with extension links.
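As a rough illustration of those three views, here is a minimal Python sketch. It assumes the triples from the example are held as plain (subject, predicate, object) tuples; that representation and the function name scheme_view are hypothetical, not anything defined by SKOS.

```python
# Triples from the "virtual combined scheme" example, as plain tuples.
# This tuple representation is an assumption made for illustration only.
TRIPLES = [
    ("A", "skos:inScheme", "S"),
    ("B", "skos:inScheme", "S"),
    ("B", "skos:broader", "A"),
    ("C", "skos:inScheme", "T"),
    ("D", "skos:inScheme", "T"),
    ("D", "skos:broader", "C"),
    ("A", "skos:inScheme", "U"),
    ("B", "skos:inScheme", "U"),
    ("C", "skos:inScheme", "U"),
    ("D", "skos:inScheme", "U"),
    ("C", "skos:broader", "B"),
]


def scheme_view(triples, scheme):
    """Return the skos:broader links whose two ends are both asserted in `scheme`."""
    members = {s for s, p, o in triples if p == "skos:inScheme" and o == scheme}
    return sorted(
        (s, o)
        for s, p, o in triples
        if p == "skos:broader" and s in members and o in members
    )


parent_view = scheme_view(TRIPLES, "S")      # links inside the parent scheme
extension_view = scheme_view(TRIPLES, "T")   # links inside the extension
combined_view = scheme_view(TRIPLES, "U")    # parent, extension and the link between them
```

Under these assumptions, the link from C to B only shows up in the view for U, because C is not asserted to be in S and B is not asserted to be in T.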

A third usage option is: use skos:broadMatch, skos:narrowMatch, skos:relatedMatch and skos:exactMatch to assert the extension links between the kos. This treats the linking (extension) scenario as if it were a mapping scenario.

Note that the boundaries between this scenario and the previous kos mapping scenario can get blurred in some instances.

Usage scenario D, asserting semantic links within someone else's concept scheme (a.k.a. "kos enrichment")

As above, this usage scenario is less well explored.

An example of this scenario is where a concept scheme is published with little or no internal structure, and a third party asserts some links between the concepts within that scheme, even though they don't "own" the scheme.

As with scenario C, there are at least two usage options.

One usage option is: use semantic relation properties (skos:broader, skos:narrower and skos:related) and use graph provenance to distinguish between "authoritative" and "third party" assertions.

E.g. "authoritative" data...

<A> skos:prefLabel "animals"; skos:inScheme <S>.
<B> skos:prefLabel "mammals"; skos:inScheme <S>.

... and "third party" data ...

<B> skos:broader <A>.
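One way to picture "graph provenance" is to keep a fourth element on each triple naming the graph it came from. The Python sketch below assumes exactly that; the graph labels "authoritative" and "third-party" and the function links_by_source are hypothetical names chosen for illustration.

```python
# Each statement carries the name of the graph it was published in.
# The graph labels are hypothetical; real data would use graph URIs.
QUADS = [
    ("A", "skos:inScheme", "S", "authoritative"),
    ("B", "skos:inScheme", "S", "authoritative"),
    ("B", "skos:broader", "A", "third-party"),
]


def links_by_source(quads, predicate="skos:broader"):
    """Group the (subject, object) pairs for `predicate` by the graph that asserts them."""
    by_source = {}
    for s, p, o, graph in quads:
        if p == predicate:
            by_source.setdefault(graph, []).append((s, o))
    return by_source


broader_links = links_by_source(QUADS)
```

An application could then choose to show only the links found under the source it trusts, without needing a separate property to mark third-party assertions.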

An alternative usage option is: use mapping properties (skos:broadMatch, skos:narrowMatch, skos:relatedMatch). This is the pattern described in the final paragraph of http://www.w3.org/TR/2008/WD-skos-primer-20080221/#secmapping

E.g. "authoritative" data...

<A> skos:prefLabel "animals"; skos:inScheme <S>.
<B> skos:prefLabel "mammals"; skos:inScheme <S>.

... and "third party" data ...

<B> skos:broadMatch <A>.

I think Antoine's rationale for recommending this second usage option in the primer was the idea that the SKOS mapping properties generally carry less "authority", and therefore are appropriate for making "non-authoritative" assertions like this. However, this goes against the intuitive notion that mapping properties by nature link concepts from different schemes.

Usage from the Application Point of View

Another way of looking at the usage question is from the point of view of an application. What assumptions (if any) can an application make about each of the six following graphs...

# Graph 1: semantic relation link between two concepts asserted in same scheme
<A> skos:inScheme <S>.
<B> skos:inScheme <S>; skos:broader <A>.

# Graph 2: semantic relation link between two concepts, not asserted in same scheme
<B> skos:broader <A>.

# Graph 3: mapping link between two concepts asserted in same scheme
<A> skos:inScheme <S>.
<B> skos:inScheme <S>; skos:broadMatch <A>.

# Graph 4: mapping link between two concepts, not asserted in same scheme
<B> skos:broadMatch <A>.

# Graph 5: both semantic relation link and mapping link between two concepts, asserted in same scheme
<A> skos:inScheme <S>.
<B> skos:broadMatch <A>; skos:broader <A>; skos:inScheme <S>.

# Graph 6: both semantic relation link and mapping link between two concepts, not asserted in same scheme
<B> skos:broadMatch <A>; skos:broader <A>.

...?

Remember that under the open world assumption, the fact that two concepts are not asserted to be in the same scheme does not allow us to infer that they are in different schemes.

One difficulty with this approach is that we haven't clearly defined what an "application" is, and different applications might want to make different assumptions. E.g. some applications might be "mapping aware", others not.
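The open-world point can be made concrete with a small Python sketch. It assumes triples held as plain tuples, and the function asserted_same_scheme is a hypothetical helper: it answers True when a shared scheme is asserted, and None (unknown) otherwise, but never False.

```python
# Graphs 1 and 2 from the list above, as plain tuples (an assumed representation).
GRAPH_1 = [
    ("A", "skos:inScheme", "S"),
    ("B", "skos:inScheme", "S"),
    ("B", "skos:broader", "A"),
]
GRAPH_2 = [
    ("B", "skos:broader", "A"),
]


def asserted_same_scheme(triples, a, b):
    """Return True if `a` and `b` share an asserted scheme, else None (unknown)."""
    schemes = {}
    for s, p, o in triples:
        if p == "skos:inScheme":
            schemes.setdefault(s, set()).add(o)
    if schemes.get(a, set()) & schemes.get(b, set()):
        return True
    # Open world: missing skos:inScheme statements prove nothing,
    # so the answer is "unknown" rather than False.
    return None
```

For graph 1 the helper returns True; for graph 2 it returns None, which is all an application can safely conclude from that graph alone.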

Data Model

Coming to the question of definitions, the primary decision is how the semantic relation and mapping properties are related. There are four main options, which can be characterised as:

skos:broadMatch is a sub-property of skos:broader
skos:broadMatch is disjoint with skos:broader
no relationship between skos:broadMatch and skos:broader is asserted (i.e. say nothing)
skos:broadMatch is a sub-property of skos:broaderTransitive
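To see what these options would mean in practice, here is a minimal Python sketch of the sub-property and disjointness options. It assumes triples held as plain tuples; the two function names are hypothetical, and the rules are the usual readings of sub-property (every link by the sub-property is also a link by the super-property) and of property disjointness (no pair may be linked by both).

```python
# Graphs 4 and 6 from the application-view list, as plain tuples.
GRAPH_4 = [("B", "skos:broadMatch", "A")]
GRAPH_6 = [("B", "skos:broadMatch", "A"), ("B", "skos:broader", "A")]


def entail_subproperty(triples, sub, sup):
    """Sub-property rule: every (x sub y) also entails (x sup y)."""
    return set(triples) | {(s, sup, o) for s, p, o in triples if p == sub}


def violates_disjointness(triples, p1, p2):
    """Pairs linked by both p1 and p2; a non-empty result means the graph
    would be inconsistent if p1 and p2 were declared disjoint."""
    pairs1 = {(s, o) for s, p, o in triples if p == p1}
    pairs2 = {(s, o) for s, p, o in triples if p == p2}
    return pairs1 & pairs2


# Sub-property of skos:broader: graph 4 would entail a skos:broader link.
with_broader = entail_subproperty(GRAPH_4, "skos:broadMatch", "skos:broader")

# Disjoint with skos:broader: graph 6 would be inconsistent, graph 4 would not.
clash_6 = violates_disjointness(GRAPH_6, "skos:broadMatch", "skos:broader")
clash_4 = violates_disjointness(GRAPH_4, "skos:broadMatch", "skos:broader")
```

The sub-property of skos:broaderTransitive option can be tried with the same entail_subproperty call by passing "skos:broaderTransitive" as the super-property. Under the "say nothing" option neither function applies.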

Summary & Discussion

As you can see, there's quite a complex interaction here between usage conventions and formal definitions. The answers aren't at all obvious to me, I think in part because we have little experience with scenarios C and D. This is one of the main reasons why I am sensitive to Graham Klyne's proposal that the mapping vocabulary be declared as an extension outside the scope of the SKOS Reference, decoupling its development from the rest of SKOS and giving it a bit more time to evolve.

FWIW, I *think* I prefer the following choices re usage conventions:

Stick with the intuitive notion that mapping properties always link concepts in *different* schemes. This means: for kos mapping (scenario B) use mapping properties; for all other scenarios use something else.
For kos description (scenario A) use semantic relation properties.
For kos extension (scenario C) use semantic relation properties.
For kos enrichment (scenario D) use semantic relation properties, and if necessary use graph provenance to distinguish between "authoritative" and "third party" assertions.

This also fits with what the SKOS Reference currently says in section 10.6.6, "By convention, the SKOS semantic relation properties are only used to state links between conceptual resources within the same concept scheme, and the SKOS concept mapping properties are only used to state links between conceptual resources in different concept schemes." -- although that statement is slightly stronger.
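A convention like this could be checked mechanically, at least in one direction. The Python sketch below is only an illustration under stated assumptions: triples are plain tuples, and mapping_links_within_one_scheme is a hypothetical helper. Because of the open world assumption it can only flag mapping links whose two ends are both asserted in some common scheme; it cannot confirm that two concepts are in different schemes.

```python
MAPPING_PROPERTIES = {
    "skos:exactMatch",
    "skos:broadMatch",
    "skos:narrowMatch",
    "skos:relatedMatch",
}

# Graph 3 from the application-view list: a mapping link inside one scheme.
GRAPH_3 = [
    ("A", "skos:inScheme", "S"),
    ("B", "skos:inScheme", "S"),
    ("B", "skos:broadMatch", "A"),
]

# The kos mapping example (scenario B): links between schemes S and T.
SCENARIO_B = [
    ("A", "skos:inScheme", "S"),
    ("B", "skos:inScheme", "S"),
    ("C", "skos:inScheme", "T"),
    ("D", "skos:inScheme", "T"),
    ("A", "skos:exactMatch", "C"),
    ("A", "skos:relatedMatch", "D"),
    ("B", "skos:broadMatch", "C"),
]


def mapping_links_within_one_scheme(triples):
    """Return mapping links whose subject and object share an asserted scheme."""
    schemes = {}
    for s, p, o in triples:
        if p == "skos:inScheme":
            schemes.setdefault(s, set()).add(o)
    return sorted(
        (s, p, o)
        for s, p, o in triples
        if p in MAPPING_PROPERTIES and schemes.get(s, set()) & schemes.get(o, set())
    )
```

Graph 3 yields one flagged link, while the scenario B data yields none. Whether a flagged link should count as a formal inconsistency or just as a departure from convention is the open question.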

Re formal definitions, I'm not so sure.

The first question is: is graph 3 inconsistent? I.e. do we formally define mapping properties (e.g. skos:broadMatch) such that they cannot link concepts in the same scheme? Or do we say nothing formal about that?

The current SKOS Reference says nothing formal on this point, and so example 68 (analogous to graph 3 above) is formally consistent, even though it contradicts the usage conventions.

The second question is: are graphs 5 and 6 inconsistent? I.e. do we formally define skos:broadMatch as disjoint with skos:broader? If not, how should applications handle these two graphs, which assert both skos:broadMatch and skos:broader between the same pair of concepts and so contradict the usage convention? (This question is not discussed in the current SKOS Reference.)

Beyond these two questions, there are quite a large number of further consistency and entailment questions regarding the mapping vocabulary. However, rather than go into them here, I've captured them as a series of test cases in a separate wiki page (see SKOS/TestSuite). Looking at test cases, asking what data we want to be consistent and what we want to be entailed, then working back to formal definitions which fit the test cases, might be the only way to break this problem down. However, I found 36 consistency tests and 23 entailment tests for which a decision needs to be made -- which is quite a lot of work! (Hence my leaning towards pushing mapping into an extension.)

Links

References:

Dependencies:

Discussion & comments: