SKOS Core Guide

Editor's Draft 3 September 2004

This version: http://www.w3.org/2004/02/skos/core/guide/2004-09-03
Latest version: http://www.w3.org/2004/02/skos/core/guide/
Previous version: No previous version.
Editors: Alistair Miles, CCLRC; Dan Brickley, W3C

Status of this Document

This section describes the status of this document at the time of its publication.

This document is an Editor's Draft for review by the Semantic Web Best Practices and Deployment Working Group (hereafter 'the Working Group') and the participants of the public-esw-thes@w3.org mailing list and is subject to change without notice. This document has no formal standing within W3C. Please consult the Working Group's home page and the W3C technical reports index for information about the latest publications by this group. This document may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document is published by the Semantic Web Best Practices and Deployment Working Group, part of the W3C Semantic Web Activity. The Working Group intends the SKOS Core Guide to become a W3C Working Group Note. However, other outcomes are possible within the framework of the W3C process and will be considered in response to deployment experience and feedback from the W3C membership. The Working Group has discussed the potential for SKOS Core to evolve into possible future W3C Recommendation Track work items, and would value feedback on the level of formal standardization that is appropriate.

We encourage public comments. Please send comments to public-esw-thes@w3.org [archive] and start the subject line of the message with "comment:".

Publication as a Working Draft does not imply endorsement by the W3C Membership.

Part 1 - Basic Features

Introduction

SKOS Core is an RDF vocabulary for describing language-oriented knowledge organisation systems (KOS) such as thesauri, glossaries, terminologies and other types of controlled vocabulary. It is also well suited to hierarchically organised KOS such as taxonomies and classification schemes, where the hierarchy of categories or concepts is not necessarily a logically consistent class subsumption hierarchy. In other words, you can use SKOS Core to describe a set of concepts, categories or subjects of interest, and organise them into a hierarchy, without needing to know anything about logic.

SKOS Core is an RDF vocabulary, which means it gives you a set of basic building blocks for creating an RDF description of your thesaurus, glossary, blog category scheme etc. By publishing an RDF description of your thesaurus on the web, you become a part of the semantic web, which means that not only can other people access and re-use the content of your thesaurus, but so can computer programs and applications.

This guide tries to assume the reader knows next to nothing about the Resource Description Framework (RDF) or the Semantic Web. A selection of resources introducing and explaining these ideas is included in the references, if you would like to explore them further.

All the examples in this guide use the RDF/XML serialisation of RDF. It is worth noting that RDF can be written down (serialised) in a number of different formats - see the references for more information.

This guide is a complement to the SKOS Core Vocabulary Specification. The Specification provides an overview and reference guide to the SKOS Core vocabulary. This document tries to explain how to use it.

Declaring Concepts

SKOS Core talks about Concept and Concept Scheme. These terms were chosen because thesauri, terminologies, controlled vocabularies, glossaries etc. can all be modelled as consisting fundamentally of a set of concepts (i.e. ideas, notions). In this guide, Concept is used to refer to the fundamental unit of a thesaurus or controlled vocabulary, and Concept Scheme refers to a set of concepts, optionally including a set of relationships between those concepts.

When it comes to using SKOS Core for things like blog category schemes and web directories, it gets a little hazy as to whether the fundamental units of these things should be called 'concepts' or not. ... @@TODO clarify ... If a thesaurus gets used in a similar way to a blog category scheme or a web directory, then it can be useful to treat their fundamental units in a similar way. So if it helps you to think of a blog category as a 'concept' then that's great. If it doesn't, don't worry about it.

I'm going to use a set of examples to illustrate how to use SKOS Core (and other vocabularies) to build up an RDF description of a concept and a concept scheme.

This snippet of RDF/XML basically says '<http://isegserv.itd.rl.ac.uk/semwebtopics/102> is a skos:Concept'.
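The snippet itself is not reproduced in this copy of the draft. A minimal sketch of RDF/XML that makes exactly this statement, assuming the standard rdf and skos namespace declarations, might look like this:

```xml
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:skos="http://www.w3.org/2004/02/skos/core#">

  <!-- States that the resource named by rdf:about is a skos:Concept -->
  <skos:Concept rdf:about="http://isegserv.itd.rl.ac.uk/semwebtopics/102"/>

</rdf:RDF>
```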

It is worth noting that the above example is in fact shorthand for the following:
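The longer form is also missing from this copy. Assuming the example being abbreviated is a typed skos:Concept element for <http://isegserv.itd.rl.ac.uk/semwebtopics/102>, the expanded form can be sketched as an rdf:Description with an explicit rdf:type property:

```xml
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">

  <!-- The same single statement, written without the typed-node shorthand -->
  <rdf:Description rdf:about="http://isegserv.itd.rl.ac.uk/semwebtopics/102">
    <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
  </rdf:Description>

</rdf:RDF>
```

Both forms produce the same RDF statement; the typed-node form is simply more compact to write.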

Declaring Concept Schemes

Labelling Concepts

A concept is quite useless without any labels that help us (as people) identify what it means. If you are only interested in adding a single label to a concept, you can use the rdfs:label property, for example:
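The original example is not reproduced in this copy. As a rough sketch, a single label could be attached as shown below. The concept URI and the label text are hypothetical and chosen purely for illustration:

```xml
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  xmlns:skos="http://www.w3.org/2004/02/skos/core#">

  <!-- Hypothetical concept URI and label, for illustration only -->
  <skos:Concept rdf:about="http://www.example.org/concepts/0023">
    <rdfs:label>Economic cooperation</rdfs:label>
  </skos:Concept>

</rdf:RDF>
```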

However, it can often be useful to assign a preferred label, and then a set of alternative labels. This can be done using the skos:prefLabel and skos:altLabel properties, for example:
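The original example is not reproduced in this copy. A minimal sketch, again using a hypothetical concept URI and hypothetical label text, might give a concept one preferred label and two alternative labels:

```xml
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:skos="http://www.w3.org/2004/02/skos/core#">

  <!-- Hypothetical concept URI and labels, for illustration only -->
  <skos:Concept rdf:about="http://www.example.org/concepts/0023">
    <!-- The one label you would normally display for this concept -->
    <skos:prefLabel>Economic cooperation</skos:prefLabel>
    <!-- Other labels that someone might search for -->
    <skos:altLabel>Economic co-operation</skos:altLabel>
    <skos:altLabel>Cooperation in economics</skos:altLabel>
  </skos:Concept>

</rdf:RDF>
```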

The purpose of adding alternative labels can be to help another person find the concept they are looking for. It can also help to further clarify the meaning of a complex concept or subject of interest, for example:

Using SKOS Core for Thesauri

The above example has been adapted from the [@@TODO some thesaurus]. Thesauri such as [whatever] loosely conform to the standards ISO 2788 and ANSI Z39.19, which dictate conventions for thesaurus construction and structure. (If you are not familiar with thesauri of this kind, they are quite different from 'Roget's' or the thesaurus that comes with MS Word.) These thesauri are modelled as consisting of a set of 'Preferred Terms' and a set of 'Non-Preferred Terms'. Preferred terms are mapped to non-preferred terms via a relation called 'Use For (UF)' and non-preferred terms are mapped to preferred terms via an imperative called 'Use (USE)'. An example from [whatever] is:

SKOS Core takes the perspective that such a thesaurus consists of a set of concepts. A preferred term becomes the preferred label for a concept, and its non-preferred relatives become the alternative labels for that concept. So an RDF description of the above example using SKOS Core is as follows:
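Neither the thesaurus entry nor its RDF description is reproduced in this copy. To illustrate the mapping only, the sketch below assumes a hypothetical entry ('Shrubs UF Bushes', 'Bushes USE Shrubs') and a hypothetical concept URI. It is not taken from any real thesaurus:

```xml
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:skos="http://www.w3.org/2004/02/skos/core#">

  <!-- Hypothetical thesaurus entry:
         Shrubs  UF  Bushes
         Bushes  USE Shrubs -->
  <skos:Concept rdf:about="http://www.example.org/concepts/0042">
    <!-- The preferred term becomes the preferred label -->
    <skos:prefLabel>Shrubs</skos:prefLabel>
    <!-- The non-preferred term becomes an alternative label -->
    <skos:altLabel>Bushes</skos:altLabel>
  </skos:Concept>

</rdf:RDF>
```

Note that the two terms end up as labels on a single concept, rather than as two separate resources linked by USE and UF.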

Symbolic Labels for Concepts

Scope Notes, Definitions, Examples and Depictions

Concept Hierarchies

Associative Relationships Between Concepts

Unique Identifiers for Concepts

The simplest way to talk about a specific concept, category or subject of interest on the semantic web is to give it a Uniform Resource Identifier (URI). (@@TODO examples of URIS)

URIs can look a lot like URLs (Uniform Resource Locators), but there is an important difference - a URI is an identifier for something, whereas a URL tells you how to retrieve something. This can obviously get a bit confusing. However, try to bear in mind that, when we are talking about 'the concept with URI <http://www.example.org/concepts/0023>' we are not talking about a web page - we are just using <http://www.example.org/concepts/0023> as a unique identifier for some idea we have in our heads.

You don't have to use HTTP based URIs for your concepts - you can use the INFO URI scheme (@@TODO e.g.), or even go so far as to register a new URN namespace (@@TODO e.g.). However, the convenient thing about using HTTP based URIs for concepts is that you can arrange it so that, when you plug that URI into a web browser, it does actually fetch and display something for you (i.e. the URI resolves to some web resource). So for example, I could describe a concept with URI <http://isegserv.itd.rl.ac.uk/semwebtopics/124> and then put either a web page at that address, or a service that redirects to a web page. If that web page then tells you about the concept itself (what it is, how it is defined, how it should be used etc.), that can be really useful for other people wanting to use your concept scheme.

Ideally, you would like to have a service that supplies a content-negotiable representation of your concept. What this means is that, if a web client program asks for <http://isegserv.itd.rl.ac.uk/semwebtopics/124> and says (via the HTTP Accept header) that it accepts content-type 'text/html' (as a normal web browser such as IE or Mozilla typically does), then the service returns an HTML description of the concept. If a different web client program asks for <http://isegserv.itd.rl.ac.uk/semwebtopics/124> and says that it accepts content-type 'application/rdf+xml', then the service returns an RDF/XML description of the concept.

Anyway, the choice is up to you. But if you do decide to use URIs as identifiers for concepts, you are going to have to decide upon a URI naming convention for your concept scheme and all of your concepts. The convention used throughout the examples in this guide was chosen for purely practical reasons. It is described here as a suggestion only - for a full discussion and recommendation of URI naming conventions, see the upcoming VM TF Note (@@TODO).

Base namespace: An HTTP URI base was chosen because it is easy to deploy a service for resolution. You must use a domain name that you own or control.

Scheme identifier: Here, the same as the base URI. It could also be something appended to the base URI.

Concept identifiers: Defined by appending some unique number or string to the base URI. A slash is used as the separator throughout; the alternative is a hash. The slash was chosen because fragment identifiers (the part of a URI after a hash) are not passed to HTTP servers in HTTP requests, which means that hash-based concept URIs cannot resolve to something unique to each concept. Resolving each concept separately can be desirable for large schemes.
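As a rough sketch of this convention, the fragment below uses a hypothetical base URI, <http://www.example.org/concepts/>, as the scheme identifier and appends a number after a slash for each concept. The URIs and the use of skos:inScheme to link each concept to its scheme are assumptions made for illustration:

```xml
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:skos="http://www.w3.org/2004/02/skos/core#">

  <!-- Scheme identifier: here the same as the (hypothetical) base URI -->
  <skos:ConceptScheme rdf:about="http://www.example.org/concepts/"/>

  <!-- Concept identifiers: base URI + slash-separated unique number -->
  <skos:Concept rdf:about="http://www.example.org/concepts/0023">
    <skos:inScheme rdf:resource="http://www.example.org/concepts/"/>
  </skos:Concept>

  <skos:Concept rdf:about="http://www.example.org/concepts/0024">
    <skos:inScheme rdf:resource="http://www.example.org/concepts/"/>
  </skos:Concept>

  <!-- Hash alternative (not used in this guide):
         http://www.example.org/concepts#0023
       An HTTP client requesting this URI sends only
       http://www.example.org/concepts to the server. -->

</rdf:RDF>
```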

For alternatives to using URIs, see part 2 (Advanced Features) section on reference by description.