This is an archive of an inactive wiki and cannot be modified.

Indexing properties in SKOS

This page is about ISSUE-48 IndexingRelationship and ISSUE-77 SubjectIndexing

The SKOS model should contain mechanisms to attach a given resource (e.g. corresponding to a document) to a concept the resource is about, e.g. to query for the resources described by a given concept.

Note: there has recently been a discussion of this on the SKOS mailing list, starting from [2].

Current situation

The deprecated SKOS WD [3] includes skos:subject, skos:isSubjectOf, skos:primarySubject and skos:isPrimarySubjectOf as "subject indexing properties". There, skos:subject is defined as a sub-property of dc:subject.

The current Reference does not mention such a property for now. The current Primer proposes to use dc:subject to associate an arbitrary resource with a skos:Concept [1].
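
As a rough illustration of the Primer's suggestion, the sketch below models RDF triples as plain Python tuples and looks up the resources attached to a concept through dc:subject. The ex: identifiers and the helper name resources_about are hypothetical; a real application would use an RDF store and a query language rather than this toy in-memory set.

```python
# Toy triple store: each triple is (subject, predicate, object).
# All ex: identifiers are made up for illustration.
DC_SUBJECT = "dc:subject"

triples = {
    ("ex:doc1", DC_SUBJECT, "ex:concept/cats"),
    ("ex:doc2", DC_SUBJECT, "ex:concept/cats"),
    ("ex:doc2", DC_SUBJECT, "ex:concept/dogs"),
    ("ex:concept/cats", "rdf:type", "skos:Concept"),
    ("ex:concept/dogs", "rdf:type", "skos:Concept"),
}


def resources_about(concept, graph, prop=DC_SUBJECT):
    """Return the sorted subjects linked to `concept` through `prop`."""
    return sorted(s for (s, p, o) in graph if p == prop and o == concept)


print(resources_about("ex:concept/cats", triples))
```

Note that the link is stored only on the resource side; the "concept to resources" view is computed by the lookup, not asserted on the concept.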


The issue, as Alistair has reformulated it, is:

Should these properties be carried forward to the SKOS Reference? Or should they be dropped?

Note: In the following I focus on one property, skos:subject, as all the others are related to it.

1. Which function/semantics are we talking about?

The "subject" name and function are actually questioned [4].

The deprecated SKOS vocabulary introduces the "subject" property to represent that a document is about one or several concepts: its subjects/topics.

But the SKOS Use cases and requirements WD has a more functional formulation:

the link between a given resource (e.g. corresponding to a document) to a concept the resource is about, e.g. to query for the resources described by a given concept.

Some mails on the SKOS mailing list go further:

SKOS specification should stress and explain what the *functional* semantics of this property are, and are not. Simply to *retrieve resources* indexed on a concept. Not to infer any specific semantics on the indexing link. Just : "If you are interested in this concept, here are resources dealing about it in some way". No more, no less. If you want to be more specific, use a specific subproperty. [2]

Is it too simple to say: "Would someone seeking information about concept A wish to have their attention drawn to resource B?" If so, it is useful to create a link between the concept and the resource, normally called "assigning the indexing term used as a label for concept A to resource B".
A resource can be assigned lots of indexing terms - this is not the same as saying that any one or more of these terms is "the subject" of the resource. [10]
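
The "retrieve everything via the generic property, refine via subproperties" idea from [2] can be sketched as follows. This is only an illustration under stated assumptions: the ex: properties (ex:mainTopic, ex:depicts) and the ex: resources are hypothetical, skos:subject stands in for whichever generic indexing property would be chosen, and the subproperty closure is computed by hand rather than by an RDFS reasoner.

```python
# Hypothetical data: two specific properties are declared as subproperties
# of a generic indexing property (written skos:subject here).
SUBPROPERTY_OF = "rdfs:subPropertyOf"

graph = {
    ("ex:mainTopic", SUBPROPERTY_OF, "skos:subject"),
    ("ex:depicts", SUBPROPERTY_OF, "skos:subject"),
    ("ex:report", "ex:mainTopic", "ex:concept/floods"),
    ("ex:photo", "ex:depicts", "ex:concept/floods"),
    ("ex:memo", "skos:subject", "ex:concept/floods"),
    ("ex:memo", "dc:creator", "ex:alice"),
}


def subproperties(prop, graph):
    """Return `prop` plus every property reachable via rdfs:subPropertyOf."""
    found = {prop}
    frontier = [prop]
    while frontier:
        current = frontier.pop()
        for s, p, o in graph:
            if p == SUBPROPERTY_OF and o == current and s not in found:
                found.add(s)
                frontier.append(s)
    return found


def indexed_on(concept, graph, generic="skos:subject"):
    """Resources linked to `concept` by `generic` or any of its subproperties."""
    props = subproperties(generic, graph)
    return sorted(s for s, p, o in graph if p in props and o == concept)


# Generic question: "which resources deal with this concept in some way?"
print(indexed_on("ex:concept/floods", graph))
# More specific question, using one subproperty only.
print(indexed_on("ex:concept/floods", graph, generic="ex:depicts"))
```

The generic query says nothing about how each resource relates to the concept; only the subproperty carries that more specific meaning.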

A first choice is thus whether we want to consider (one of) the following properties:

2. Is (subject) indexing in the scope of SKOS?

The basic choice [15] is

  1. SKOS is about KOS representation only, and it is not feasible or desirable to have SKOS represent the indexing link between concepts and (possibly very numerous) resources. The functions are just different.
  2. Accessing the resources that are the "extension" of a concept can be very interesting for manipulating the concepts, e.g. for designing standard applications.

Arguments for out

A first reason is representation adequacy: an indexing property, in addition to making the vocabulary larger, may fall outside of what is considered to be the core job of SKOS.

Concept schemes are usually independent of the resources they are used to organise. The concepts usually appear in the records of the indexed objects; the objects do not appear in the descriptions of the concepts (from Leonard Will, [12]).
We may have millions of resources from various collections pointing to a particular concept URI. It makes sense that metadata of the resource should point to the SKOS concept - but not the opposite. (from Aida Slavic, [14])

A second reason is the redundancy with other vocabularies. If we consider subject indexing, there is no clear motivation to distinguish skos:subject from dc:subject [6]. Other vocabularies, especially in the work related to resource tagging, also propose properties that could represent links between resources and concepts, e.g. sioc:topic (which is, incidentally, a subproperty of dc:subject).

Arguments for in

The first argument for having an indexing property in SKOS is that it matches an identified requirement:

if you want to federate all that in an open world and ask "how is this concept used, by all means" (e.g., on the Web)? If you have a generic pointer, the query is much simpler to write.[13]

It is indeed already used in significant use cases (e.g. DBpedia [5]).

From the vocabulary design perspective, it makes sense that SKOS includes all that is important for the SKOS application domain:

vocabularies should be well-rounded  packages that address all the typical needs of a particular domain. Some duplication with existing vocabularies is acceptable if it makes the new vocabulary more complete. (Of course it's still important to declare subclass- and subproperty relationship to existing vocabularies.) [9]

As many SKOS applications are indeed about accessing documents via concepts, it may be helpful for SKOS to have minimal control over the "interface" between conceptual knowledge bases and documentary systems, while of course allowing application designers to coin their own properties (subproperties of skos:subject) for this. [15]

Note that this concern can be dealt with without introducing a new property, if we find a convenient one elsewhere:

We can write a guideline that says which property from another vocabulary should be used as a standard. dc:subject is of course a natural candidate, the problem being perhaps that it is too "subject" oriented, as was criticized in this thread. [15]

One of the indexing property contenders finished half convinced ;-)

In the light of the above - it appears to me that the possibility to manipulate concepts under the influence of resources is a valuable addition. [16]

Alistair's additional reasons

Additional Reasons for in:

Additional Reasons for out:

Formal semantics of indexing property

A final discussion concerns the class of objects that should be the domain of a SKOS indexing property. A first thought is to use foaf:Document, but there is a common request to go beyond that [5]. [7] proposes rdfs:Resource as the domain. Note that we could also have foaf:Document as the domain, while assuming a very wide understanding of what a document is [8].
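
The choice of domain matters because, under RDFS semantics, rdfs:domain is not a validation constraint but an entailment: any resource used as the subject of the property is inferred to be an instance of the domain class. The following is a minimal sketch of that rule, assuming a hypothetical indexing property ex:indexedOn and made-up ex: resources; it is hand-rolled for illustration and is not a full RDFS reasoner.

```python
def infer_types_from_domain(graph):
    """Apply the RDFS domain rule: (p rdfs:domain C), (s p o) => (s rdf:type C).

    Returns only the newly inferred triples.
    """
    domains = [(s, o) for s, p, o in graph if p == "rdfs:domain"]
    inferred = set()
    for prop, cls in domains:
        for s, p, o in graph:
            if p == prop:
                inferred.add((s, "rdf:type", cls))
    return inferred - set(graph)


# Hypothetical data: ex:indexedOn is declared with foaf:Document as domain.
graph = {
    ("ex:indexedOn", "rdfs:domain", "foaf:Document"),
    ("ex:report", "ex:indexedOn", "ex:concept/floods"),
    ("ex:painting", "ex:indexedOn", "ex:concept/floods"),
}

for triple in sorted(infer_types_from_domain(graph)):
    print(triple)
```

With foaf:Document as the domain, the painting is inferred to be a document, which only makes sense under the wide reading of "document" mentioned above. With rdfs:Resource as the domain, the rule adds no new information, since every resource is already an rdfs:Resource.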