What are the semantics of SKOS, and how can they be specified?

N.B. This page uses the Turtle syntax to express RDF graphs throughout. The prefix ex: is used as an abbreviation for the URI http://www.example.org/eg# and all other prefixes are used according to convention.

Editor's Note (Alistair Miles): Be warned, I am at best an amateur logician - this page may contain errors and improper use of terminology.

Valid Inference

The semantics of SKOS require a notion of valid inference.

For example, the graph:

ex:a skos:broader ex:b.

... should imply the graph:

ex:b skos:narrower ex:a.

... and vice versa (i.e. the extensions of skos:broader and skos:narrower define inverse relations).

For example, the graph:

ex:c skos:related ex:d.

... should imply the graph:

ex:d skos:related ex:c.

... and vice versa (i.e. the extension of skos:related defines a symmetric relation).
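
The two inference rules above can be sketched in a few lines of Python. This is only an illustration, not a definitive implementation: the representation of a graph as a set of (subject, predicate, object) string tuples and the function name skos_closure are assumptions made for this sketch, not anything defined by SKOS.

```python
# A graph is modelled as a set of (subject, predicate, object) string tuples.
# This representation is an assumption of the sketch, not part of SKOS.

def skos_closure(graph):
    """Return the graph plus the triples entailed by the two rules above."""
    inferred = set(graph)
    for s, p, o in graph:
        if p == "skos:broader":
            # broader and narrower are inverse relations
            inferred.add((o, "skos:narrower", s))
        elif p == "skos:narrower":
            inferred.add((o, "skos:broader", s))
        elif p == "skos:related":
            # related is symmetric
            inferred.add((o, "skos:related", s))
    return inferred


graph = {
    ("ex:a", "skos:broader", "ex:b"),
    ("ex:c", "skos:related", "ex:d"),
}
closure = skos_closure(graph)
```

A single pass is enough here, because applying either rule to an inferred triple only yields a triple that is already in the graph.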

Consistency

The semantics of SKOS require a notion of consistency (and inconsistency).

For example, the following graph should be inconsistent:

ex:a skos:prefLabel "foo"@en.
ex:a skos:prefLabel "bar"@en.

(I.e. for any resource, no two literal values of skos:prefLabel may have the same language tag.)

For example, the following graph should be inconsistent:

ex:a skos:prefLabel "foo"@en.
ex:a skos:altLabel "foo"@en.

(I.e. for any resource, the extensions of skos:prefLabel and skos:altLabel must be disjoint.)
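
As a rough sketch, the two consistency conditions above could be checked as follows. The tuple representation (a literal is written as a (lexical form, language tag) pair), the function name label_inconsistencies and the problem codes are all hypothetical, chosen only for this example.

```python
from collections import defaultdict

# Label triples are modelled as (subject, predicate, (lexical_form, language_tag)).
# This representation is an assumption of the sketch.

def label_inconsistencies(graph):
    """Return a sorted list of (problem, resource, language_tag) tuples."""
    pref = defaultdict(set)  # (resource, language_tag) -> lexical forms
    alt = set()              # (resource, (lexical_form, language_tag))
    for s, p, o in graph:
        if p == "skos:prefLabel":
            pref[(s, o[1])].add(o[0])
        elif p == "skos:altLabel":
            alt.add((s, o))

    problems = []
    # Condition 1: at most one preferred label per resource and language tag.
    for (s, lang), forms in pref.items():
        if len(forms) > 1:
            problems.append(("multiple-prefLabel", s, lang))
    # Condition 2: preferred and alternative labels of a resource are disjoint.
    for s, (form, lang) in alt:
        if form in pref.get((s, lang), ()):
            problems.append(("pref-alt-clash", s, lang))
    return sorted(problems)
```

An empty result means that neither condition is violated; it says nothing about any other condition.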

Closed World Assumption

Some SKOS applications will produce and consume RDF graphs that express fragments of a concept scheme. For such applications, the open-world assumption is appropriate, because they may be handling incomplete data.

However, some SKOS applications will knowingly produce and consume RDF graphs that express the entirety of a concept scheme. For such applications, making a closed-world assumption will allow the detection of inconsistencies that could not otherwise be detected - an important factor in quality control.

For example, under a closed-world assumption, the following graph should be inconsistent:

ex:a skos:altLabel "foo"@en.

(I.e. for any resource, a literal value in the extension of skos:altLabel implies the existence of a literal value in the extension of skos:prefLabel with the same language tag.)
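
A closed-world check of this kind might look like the sketch below. It treats the given set of triples as the entirety of the concept scheme, which is exactly the assumption under discussion. The tuple representation and the function name alt_labels_without_pref are hypothetical.

```python
# Label triples are modelled as (subject, predicate, (lexical_form, language_tag)).
# The graph is assumed to be complete, i.e. a closed world.

def alt_labels_without_pref(graph):
    """Return sorted (resource, language_tag) pairs that have an
    alternative label but no preferred label in that language."""
    pref_langs = {(s, o[1]) for s, p, o in graph if p == "skos:prefLabel"}
    missing = {
        (s, o[1])
        for s, p, o in graph
        if p == "skos:altLabel" and (s, o[1]) not in pref_langs
    }
    return sorted(missing)
```

Under an open-world assumption the same result would only be a warning, since the missing preferred label could be stated in some other graph.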

Optional Conditions

In order for a SKOS concept scheme to be compatible with the ISO 2788 thesaurus standard (and hence usable in applications that conform to this standard), the sets of lexical labels of the concepts in the scheme must all be pairwise disjoint. I.e. no two concepts may share a lexical label.

For example, if we assume that ex:a and ex:b denote different concepts, and if the following graph expresses the entirety of a concept scheme, then it is not compatible with ISO 2788:

ex:a skos:prefLabel "foo"@en.
ex:b skos:prefLabel "foo"@en.

For example, if we make the same assumptions as above, then the following graph is not compatible with ISO 2788:

ex:a skos:prefLabel "bar"@en.
ex:b skos:altLabel "bar"@en.
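
The pairwise-disjointness condition illustrated by the two graphs above could be tested along the following lines. This is a sketch under the same assumptions as the examples: different names denote different concepts, and the graph expresses the entire scheme. Only skos:prefLabel and skos:altLabel are counted as lexical labels here, and the function name shared_labels is hypothetical.

```python
from collections import defaultdict

# Label triples are modelled as (subject, predicate, (lexical_form, language_tag)).
# Only the two label properties used in the examples are considered.
LABEL_PROPERTIES = {"skos:prefLabel", "skos:altLabel"}

def shared_labels(graph):
    """Map each label carried by more than one concept to those concepts."""
    owners = defaultdict(set)
    for s, p, o in graph:
        if p in LABEL_PROPERTIES:
            owners[o].add(s)
    return {label: concepts for label, concepts in owners.items() if len(concepts) > 1}
```

An empty dictionary means that no label is shared, so this particular condition is met.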

However, other types of controlled vocabulary do not impose the same constraints. For example, in some classification schemes and taxonomies, it is perfectly reasonable for two concepts to have the same "caption" (i.e. preferred label) and for users to disambiguate meaning through context (other labels, notations, definitions, semantic relations etc.). For applications that expect this type of controlled vocabulary, the two example graphs given in this section are perfectly acceptable.

Therefore, there may be additional semantic conditions that are specifically required for conformance with ISO 2788, but which are otherwise optional.

(This does not address the problem of how to actually express the semantic conditions of compatibility with ISO 2788, which may also require a closed-world assumption.)

Hierarchies

In RDFS, every class is a sub-class of itself (i.e. the extension of rdfs:subClassOf is reflexive). It is perfectly reasonable for an RDF graph to assert a circular class hierarchy, for example:

ex:e rdfs:subClassOf ex:f.
ex:f rdfs:subClassOf ex:g.
ex:g rdfs:subClassOf ex:f.

I believe this is tantamount to asserting that ex:f and ex:g are equivalent (i.e. their extensions are identical), with ex:e a sub-class of both.

Compare this with skos:broader. I can think of no application for which it is of any value to assume that every concept is broader than itself.

From an operational point of view, it may be of greater value to application developers to specify that no concept may be broader than itself (i.e. that the extension of skos:broader is irreflexive). If this were the case, then the following graph would be inconsistent:

ex:a skos:broader ex:a.

From an operational point of view, it may also be of greater value to application developers to specify that there may be no circularities whatsoever involving skos:broader (i.e. that the transitive closure of skos:broader is irreflexive). If this were the case, then the following graph would be inconsistent:

ex:a skos:broader ex:b.
ex:b skos:broader ex:a.
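
If the stronger condition were adopted, an application could detect violations with a simple reachability search. The sketch below is one possible approach and not a prescribed algorithm. The tuple representation and the function name has_broader_cycle are assumptions of this example.

```python
# A graph is modelled as a set of (subject, predicate, object) string tuples.

def has_broader_cycle(graph):
    """Return True if some concept can reach itself by following
    one or more skos:broader links."""
    broader = {}
    for s, p, o in graph:
        if p == "skos:broader":
            broader.setdefault(s, set()).add(o)

    def reaches_itself(start):
        # Depth-first search over everything broader than start.
        seen = set()
        stack = list(broader.get(start, ()))
        while stack:
            node = stack.pop()
            if node == start:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(broader.get(node, ()))
        return False

    return any(reaches_itself(concept) for concept in broader)
```

The weaker condition (plain irreflexivity) is the special case in which the search succeeds after a single step.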

A further question is whether skos:broader should itself be specified as a transitive property. If it were, for example, the following graph:

ex:a skos:broader ex:b.
ex:b skos:broader ex:c.

... would imply the graph:

ex:a skos:broader ex:c.

This may be useful for some applications. However, for other applications that perform graph-based reasoning, the number of steps in a hierarchical path is an important metric, which can be lost if skos:broader is transitive.
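
The loss of path-length information can be seen in a small sketch. Here the step count is computed by breadth-first search, first over the asserted graph and then over the graph with the transitive closure added. The tuple representation and the function names broader_steps and transitive_closure are hypothetical, introduced only for this illustration.

```python
from collections import deque

# A graph is modelled as a set of (subject, predicate, object) string tuples.

def broader_steps(graph, start):
    """Map each concept reachable from start via skos:broader
    to the minimum number of steps needed to reach it."""
    broader = {}
    for s, p, o in graph:
        if p == "skos:broader":
            broader.setdefault(s, set()).add(o)
    steps = {}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for parent in broader.get(node, ()):
            if parent not in steps:
                steps[parent] = dist + 1
                queue.append((parent, dist + 1))
    return steps

def transitive_closure(graph):
    """Return the skos:broader triples implied if the property were transitive."""
    subjects = {s for s, p, o in graph if p == "skos:broader"}
    return {(s, "skos:broader", o) for s in subjects for o in broader_steps(graph, s)}


asserted = {
    ("ex:a", "skos:broader", "ex:b"),
    ("ex:b", "skos:broader", "ex:c"),
}
closed = transitive_closure(asserted)

steps_asserted = broader_steps(asserted, "ex:a")  # ex:c is two steps away
steps_closed = broader_steps(closed, "ex:a")      # ex:c now appears one step away
```

Once the inferred triple is merged into the graph, nothing distinguishes it from an asserted one, so the original distance from ex:a to ex:c cannot be recovered from the closed graph alone.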