W3C

SKOS Use Cases and Requirements

W3C Editors' Draft 06 March 2007

This version:
http://www.w3.org/2006/07/SWD/???/usecases/200703??
Latest version:
http://www.w3.org/2006/07/SWD/???/usecases
Previous version:
<previous version uri>
Editors:
Antoine Isaac, Vrije Universiteit Amsterdam, aisaac@few.vu.nl
Jon Phipps, Cornell University, jphipps@madcreek.com
Daniel Rubin, Stanford Medical Informatics, dlrubin@stanford.edu

Abstract

This document provides use cases for SKOS.

Status of this document

This is an internal draft produced by the Semantic Web Deployment Working Group [SWD].

This document is for internal review only and is subject to change without notice. This document has no formal standing within the W3C.

Table of contents


1 Introduction

Knowledge organization systems play a fundamental role in information structuring and access, e.g. for asset description or website organization. Such vocabularies, coming in the form of thesauri, classification schemes, subject heading lists, taxonomies or even folksonomies, are developed and used worldwide, by institutions as well as individuals. However, these very important knowledge resources are still mostly isolated from the outside world and, according to professionals, are not used to their full potential by the individual systems they are part of.

The development of new information technologies and infrastructures, such as the Web, calls for new ways to create, manage, publish and use these knowledge organization systems. It is especially expected that conceptual schemes will benefit from greater shareability, e.g. by being published via web services. In the meantime, the documentary systems which use them will turn to advanced information retrieval techniques that make the most of their semantic structure and lexical content.

SKOS (Simple Knowledge Organization System) [SWBP-SKOS-CORE-GUIDE] provides a model to represent and use these vocabularies in the framework of the Semantic Web. A first version has been produced by the Semantic Web Best Practices and Deployment Working Group [SWBPD], and is already used in some research projects. The Semantic Web Deployment Working Group [SWD] has been chartered to continue this work, namely to "produce guidelines and an RDF vocabulary (SKOS) for transforming an existing vocabulary representation into an RDF/OWL representation" [SWD-Charter].

In order to delimit the scope and elicit the required features for SKOS, the SWD Working Group has issued a call for use cases, asking for descriptions of existing or planned SKOS applications. Following the gathering of these use cases, the Working Group has elicited a number of requirements for SKOS which are motivated by previous work on SKOS or by the contributions received after the call for use cases.

This document gives an account of this process. First, section 2 presents summaries of selected contributions, and pointers to the complete set of cases which were sent to the Working Group. Second, section 3 lists the requirements the Working Group has elicited so far.

@@ TODO: reference to call form, maybe adding it in annex @@

2 Use Cases

2.1 Use Case #1 — An integrated view to medieval illuminated manuscripts

(Contributed by Antoine Isaac. Complete description available at http://www.w3.org/2006/07/SWD/wiki/EucManuscriptsDetailed and at http://www.w3.org/2006/07/SWD/wiki/EucIconclassDetailed)

The purpose of this application is to provide the user with access to two collections of illuminated manuscripts from the Dutch and French national libraries, Medieval Illuminated Manuscripts and Mandragore (accessible online at http://www.kb.nl/manuscripts and http://mandragore.bnf.fr). The descriptions of images from these two collections follow different metadata schemes, and contain values from different controlled vocabularies for subject indexing. The user should however be able to search for items from the two collections using their preferred point of view, either using vocabulary from collection 1 or vocabulary from collection 2.

The main feature of the application is collection browsing, which uses hierarchical links in vocabularies: if a concept matching a query has subconcepts, the documents indexed against these subconcepts should be returned. The application also uses mapping links between the concepts from the two vocabularies. For example, if an equivalence link is found between a query concept from one vocabulary and another concept from the second one, documents indexed by this other concept shall also be included in the query results.

Requires: R-ConceptualRelations, R-IndexingRelationship
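As an illustration only, the browsing behaviour described above can be sketched in a few lines of Python. The dictionaries below are hypothetical toy data: the notations loosely follow the Iconclass and Mandragore examples of this section, while the hierarchy shape, the `ic:`/`mg:` prefixes and the image identifiers are invented. This is not the application's actual implementation.

```python
from collections import deque

# Hypothetical toy data, not taken from the real vocabularies or collections.
NARROWER = {
    "ic:25F": ["ic:25F7"],
    "ic:25F7": ["ic:25F72", "ic:25F711"],
}
EQUIVALENT = {
    "ic:25F72": ["mg:mollusques"],
    "mg:mollusques": ["ic:25F72"],
}
INDEX = {
    "ic:25F711": ["kb-image-1"],
    "ic:25F72": ["kb-image-2"],
    "mg:mollusques": ["bnf-image-7"],
}


def expand(concept):
    """Return the concept, its transitive subconcepts, and their equivalents."""
    seen = {concept}
    queue = deque([concept])
    while queue:
        current = queue.popleft()
        neighbours = NARROWER.get(current, []) + EQUIVALENT.get(current, [])
        for nxt in neighbours:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def search(concept):
    """Collect documents indexed against any concept in the expansion."""
    hits = set()
    for c in expand(concept):
        hits.update(INDEX.get(c, []))
    return sorted(hits)
```

With this toy data, `search("ic:25F")` returns images from both collections, because the equivalence link from `ic:25F72` to `mg:mollusques` is followed in addition to the hierarchical links.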

Additionally, the application enables search based on free text queries: documents can be retrieved based on free-text querying of the different fields used to describe the documents (creator, place, subject, etc.). For the subject indexing, if a text query matches the label of a controlled vocabulary concept, the documents indexed against this concept will be returned.

The two collections use respectively the Iconclass and Mandragore analysis vocabularies.

Iconclass (http://www.iconclass.nl) contains 28000 items used to describe the subjects of an image (persons, events, abstract ideas). Complete versions are available for English, German, French, Italian, and partial translations for Finnish and Norwegian.

Requires: R-MultilingualLexicalInformation

The main building blocks of Iconclass are subjects, used to describe the subjects of images. An Iconclass subject consists of a notation (an alphanumeric identifier used for annotation) and a textual correlate (e.g. “25F9 mis-shapen animals; monsters”). Subjects are organized in hierarchical trees, as in the following extract:

2 Nature

25 earth, world as celestial body

25F animals

25F(+) KEY

25F1 groups of animals

….

25F9 mis-shapen animals; monsters

25FF fabulous animals (sometimes wrongly called 'grotesques'); 'Mostri' (Ripa)

Subjects can have associative cross-reference links between them (systematic references) and are linked to keywords that are used to search for them in Iconclass tools. Keywords form a network of their own, featuring see links (from one non-preferred keyword, not attached to any subject, to a preferred one), see also links (between keywords that are semantically or iconographically related) and translation links (between keywords in different languages).

Requires: R-LabelRepresentation, R-RelationshipsBetweenLabels

Iconclass additionally provides auxiliary mechanisms for subject specialisation at indexing time. These actually allow for collection-specific vocabulary extension:

Requires: R-ConceptSchemeExtension, R-SkosSpecialization, R-IndexingAndNonIndexingConcepts, R-ConceptCoordination

Maintenance of the vocabulary is done via manual editing of semi-structured source files. As a general rule, the standard version shall only be changed in a conservative way, not modifying the existing subjects.

Mandragore contains 16000 subjects. 15800 are descriptors, which are used to describe the illuminations and form a flat list. Additional structure is given by 200 abstract topic classes which form a hierarchy organizing the descriptors according to general domains, but cannot themselves be used to describe documents:

ZOOLOGIE

.zoologie (généralités)

.mollusques

.mammifères

cochon [mammifère ongulé]

girafe [mammifère ongulé]

A descriptor is specified by a French label (“cochon”, for pig), optional rejected forms (“porc”), an optional definition (“mammifère ongulé”, hoofed mammal) and a reference to one or more topic classes (“.mammifères”, mammals). A note can sometimes be found as a complementary definition.
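The descriptor structure just described could be held, for instance, in a small record type. The following Python sketch is an assumption made for illustration: the class name `Descriptor`, its field names and the `build_lookup` helper are hypothetical and do not come from the Mandragore system.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Descriptor:
    label: str                     # preferred French label
    topic_classes: list            # one or more topic classes
    rejected_forms: list = field(default_factory=list)
    definition: Optional[str] = None


def build_lookup(descriptors):
    """Map preferred labels and rejected forms to their descriptor."""
    lookup = {}
    for d in descriptors:
        lookup[d.label] = d
        for form in d.rejected_forms:
            lookup[form] = d
    return lookup


COCHON = Descriptor("cochon", [".mammifères"], ["porc"], "mammifère ongulé")
LOOKUP = build_lookup([COCHON])
```

Here a search on the rejected form "porc" resolves to the descriptor labelled "cochon", which mirrors the role of rejected forms as entry points to preferred labels.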

To enable integrated browsing, elements from Mandragore and Iconclass vocabularies must be linked together using equivalence or specialization links as in the following:

25F72 molluscs (Iconclass) is equivalent to mollusques (Mandragore)

25F711 insects (Iconclass) is more specific than autres invertébrés (vers,arachnides,insectes...) ("other invertebrates (worms, arachnida, insects...)", Mandragore)

11U4 Mary and John the Baptist together with (e.g. kneeling before) the judging Christ, 'Deesis' ~ Last Judgement (Iconclass) is equivalent to the combination of subjects s.marie, s.jean.baptiste, christ and jugement.dernier (Mandragore)

25F(+441) herd, group of animals (Iconclass) is equivalent to troupeau (Mandragore)

Requires: R-ConceptualMappingLinks
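The mapping examples above include one-to-one links as well as an equivalence to a combination of subjects. A minimal Python sketch of how such links might be recorded and used follows. The `Mapping` class, the relation names and the image descriptions are hypothetical, chosen for illustration; only the notations and descriptor names come from the examples above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    source: str      # Iconclass notation
    relation: str    # "equivalent" or "more_specific" (assumed names)
    targets: tuple   # one Mandragore descriptor, or several to be combined


MAPPINGS = [
    Mapping("25F72", "equivalent", ("mollusques",)),
    Mapping("25F711", "more_specific",
            ("autres invertébrés (vers,arachnides,insectes...)",)),
    Mapping("11U4", "equivalent",
            ("s.marie", "s.jean.baptiste", "christ", "jugement.dernier")),
]

# Invented Mandragore image descriptions, for illustration only.
DESCRIPTIONS = {
    "img-a": {"s.marie", "s.jean.baptiste", "christ", "jugement.dernier"},
    "img-b": {"christ", "jugement.dernier"},
    "img-c": {"mollusques"},
}


def images_for(notation):
    """Images whose descriptors cover every target of an equivalence mapping."""
    result = []
    for m in MAPPINGS:
        if m.source == notation and m.relation == "equivalent":
            wanted = set(m.targets)
            result += [img for img, subjects in DESCRIPTIONS.items()
                       if wanted <= subjects]
    return sorted(result)
```

In this sketch a combined equivalence is read as a conjunction: an image matches 11U4 only if it carries all four Mandragore subjects. Other readings are possible, and the use case does not prescribe one.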

2.2 Use Case #2 — Bio-zen ontology framework for representing scientific discourse in life science

(Contributed by Matthias Samwald, Medizinische Universität Wien. Complete description available at http://www.w3.org/2006/07/SWD/wiki/EucBiozenDetailed)

Bio-zen (http://neuroscientific.net/index.php?id=43) allows the description of biological systems and the representation of scientific discourse on the web in a highly distributed manner. It is intended to be used by researchers and developers in the life sciences.

SKOS is used in bio-zen for the representation of many existing life sciences vocabularies, taxonomies and ontologies coming from the "Open Biomedical Ontologies" (OBO) collection (http://www.fruitfly.org/~cjm/obo-download/). The size of all converted taxonomies taken together is in the order of millions of concepts. Typical examples are the Gene Ontology or Medical Subject Headings (MeSH), an entry of which is displayed here:

id MESH:A.01.047.025
name abdominal_cavity
def "The region in the abdomen extending from the thoracic DIAPHRAGM to the plane of the superior pelvic aperture (pelvic inlet). The abdominal cavity contains the PERITONEUM and abdominal VISCERA\, as well as the extraperitoneal space which includes the RETROPERITONEAL SPACE." [MESH:A.01.047.025]
synonym abdominal_cavity
synonym cavitas_abdominis
is_a MESH:A.01.047 ! abdomen
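To make the shape of such an entry concrete, the following Python sketch parses the tag/value lines shown above and renames them to SKOS-style properties. The tag-to-property mapping is an assumption made for illustration and is not necessarily the conversion used by bio-zen; the `def` line is left out of the sample for brevity.

```python
def parse_entry(text):
    """Parse 'tag value' lines into a dict of lists (tags may repeat)."""
    entry = {}
    for line in text.strip().splitlines():
        tag, _, value = line.strip().partition(" ")
        entry.setdefault(tag, []).append(value.strip())
    return entry


def to_skos_like(entry):
    """Rename OBO-style tags to SKOS-style properties (an assumed mapping)."""
    name = entry["name"][0]
    # "is_a" values carry a trailing "! label" comment, which is dropped here.
    broader = [v.split("!")[0].strip() for v in entry.get("is_a", [])]
    return {
        "id": entry["id"][0],
        "skos:prefLabel": name,
        "skos:altLabel": [s for s in entry.get("synonym", []) if s != name],
        "skos:broader": broader,
    }


SAMPLE = """
id MESH:A.01.047.025
name abdominal_cavity
synonym abdominal_cavity
synonym cavitas_abdominis
is_a MESH:A.01.047 ! abdomen
"""
record = to_skos_like(parse_entry(SAMPLE))
```

The synonym that merely repeats the name is not emitted as an alternative label, so that the preferred and alternative labels of the resulting record stay distinct.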

To represent such vocabulary elements as well as other types of information, the existing SKOS model has been integrated into a single OWL ontology, together with the DOLCE foundational ontology and the Dublin Core metadata model. In the process, the SKOS model has been extended with special types of concepts, e.g. biozen:sequence-concept. Importantly, to enable efficient reasoning with the available dataset, the existing constructs have been made compatible with the OWL-DL language.

Requires: R-CompatibilityWithOWL-DL

The bio-zen framework will consist of several applications, especially Semantic Wikis. The bio-zen ontology incorporates constructs to make statements about digital information resources, that is, to create "concept tags". This concept-tagging is an important feature of bio-zen, because it eases the integration of information from different sources.

Requires: R-IndexingRelationship

2.3 Use Case #3 — Semantic search service across mapped multilingual thesauri in the agriculture domain

(Contributed by Margherita Sini and Johannes Keizer, Food and Agriculture Organization. Complete description available at http://www.w3.org/2006/07/SWD/wiki/EucAimsDetailed)

This application, coming from the AIMS project (http://www.fao.org/aims), is a semantic search service that makes use of mapped agriculture thesauri. It allows users to search any available terminology in any of the languages in which the thesauri are provided, and to retrieve information from resources which may have been indexed by one of the mapped vocabularies. Typical functions are navigating resources, helping to build boolean searches via concept identification, or expanding given searches with extra languages or synonyms.

Requires: R-IndexingRelationship

The service builds on several agriculture vocabularies: the Agrovoc Thesaurus (http://www.fao.org/aims/ag_intro.htm), the Agris/Caris Classification Scheme (ASC), the FAO Technical Knowledge Classification Scheme (TKCS), the subjects from the FAOTERM vocabulary, etc.

Agrovoc contains 35000 terms in 12 languages (not all the languages feature the same translated terms, however), while ASC, TKCS and FAOTERM range between 100 and 200 categories coming in the 5 official FAO languages. Agrovoc terms consist of one or more words and always represent one and the same concept. Terms are divided into descriptors and non-descriptors, the former being the only ones currently used for indexing. For each descriptor, a word block is displayed showing the relation to other terms: BT (broader term), NT (narrower term), RT (related term), UF (non-descriptor). There are also scope notes, used to clarify the meaning of both descriptors and non-descriptors.

Term code 1939
Term label EN : Cows, FR : Vache, ES : Vaca, AR : بقرات , ZH : 母牛 , PT : Vaca, CS : krávy, JA : 雌牛 , TH : แม่โค , SK : kravy, DE : KUH
BT Cattle (code 1391)
NT Suckler cows, Dairy cows (26767, 36875)
RT Heifers, Cow milk, Milk yielding animals, Females (3535, 4833, 15969, 16080)
SNR Females (15969)
Scope Note Use only for cattle and zebu cattle; for other species use "Females" (15969) plus the descriptor for the species

Requires: R-ConceptualRelations, R-LabelRepresentation, R-TextualDescriptionsForConcepts, R-MultilingualLexicalInformation
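The expansion of a search with extra languages can be sketched as follows in Python. The labels and codes are a subset of the Agrovoc extract above; the dictionary layout and the function names are assumptions made for illustration and do not describe the AIMS service itself.

```python
# Labels and codes come from the Agrovoc extract above (a subset);
# the data layout itself is a hypothetical choice.
TERMS = {
    1939: {
        "labels": {"EN": "Cows", "FR": "Vache", "ES": "Vaca", "DE": "KUH"},
        "BT": [1391],
        "NT": [26767, 36875],
    },
    1391: {"labels": {"EN": "Cattle"}, "BT": [], "NT": [1939]},
}


def find_codes(text):
    """Codes of the terms having a label equal to text in any language."""
    needle = text.casefold()
    return [code for code, term in TERMS.items()
            if any(label.casefold() == needle
                   for label in term["labels"].values())]


def expand_query(text, languages):
    """Translate a query into the requested languages via shared term codes."""
    expanded = set()
    for code in find_codes(text):
        labels = TERMS[code]["labels"]
        expanded.update(labels[lang] for lang in languages if lang in labels)
    return sorted(expanded)
```

For example, the French query "vache" is resolved to term code 1939 and can then be expanded to the English and Spanish labels. A language with no translated label for that term contributes nothing, which reflects the remark that not all languages feature the same translated terms.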

Actually, the AIMS project includes some more specific links, presented in http://www.fao.org/aims/cs_relationships.htm: Concept-to-Concept relationships (subclass of; caused by; member of; part of), Term-to-Term relationships (related term; synonym; translation) and String-to-String relationships (spelling variant; acronym). Examples of such links are:

synonym bucket pail
abbreviation_of Corp. Corporation
acronym Food and Agriculture Organization FAO
spelling_variant organisation organization
translation vache cow
scientific_taxonomic_name African violet Saintpaulia

Requires: R-SkosSpecialization, R-RelationshipsBetweenLabels

Currently the Agrovoc management system lacks distributed maintenance, but it is expected that a new system will soon solve this problem, which is crucial since changes are made by experts from all over the world.

For AIMS, Agrovoc has been converted into SKOS (ftp://ftp.fao.org/gi/gil/gilws/aims/kos/agrovoc_formats/skos/2006) and is being mapped to two other vocabularies: the Chinese Agricultural Thesaurus (CAT) and the National Agricultural Library thesaurus (NAL). This mapping uses links inspired by the SKOS mapping vocabulary [SWBP-SKOS-MAPPING], as below:

CAT-ID CAT-EN Map AG-ID AG-EN AG-ID AG-EN
30854 Senta flammea Exact 9748 Cheena
50008 Mayetola destructor Exact-OR 24260 Triticale (gramineae) 7949 Triticales (product)
1160 Two-shear sheep NT1 3662 Hordeum vulgare

Requires: R-ConceptualMappingLinks

2.4 Use Case #4 — Supporting product life cycle

(Contributed by Sean Barker, BAE Systems. Complete description available at http://www.w3.org/2006/07/SWD/wiki/EucProductLifeCycleSupportDetailed)

The problem addressed by the Product Life Cycle Support (PLCS) application is the integration of a network of interconnected supply chains, with multiple large customers buying a wide range of products (from shoes to aircraft), each dictating their own standards, and with every supplier being part of multiple supply chains. Each customer wants to maintain a common approach over all its supply chains, and each supplier wants to maintain the same system for each of the supply chains it works in.

The aim of this application is to propose a data exchange mechanism for managing the life support of complex products (http://www.oasis-open.org), including configuration definition, maintenance definition, maintenance planning and scheduling, maintenance and usage recording (including configuration change).

To that end, an upper ontology of several hundred items for the description of the product life cycle will be defined. There is no chance of the entire supply system (tens of thousands of businesses) developing a single detailed model. However, given the upper ontology, they will be free to specialize individual ontology terms (playing the role of placeholders for local extension) to meet their precise needs.

PLCS is conceptually a co-operatively developed web in XML, with the live version being a set of run time views assembled from files submitted by a dozen or so contributors. It may be useful, where ontologies diverge, to map terms between the diverging branches, either to indicate where terms can be harmonized to their equivalent, or to identify that there is no exact equivalence.

Requires: R-ConceptualRelations, R-ConceptSchemeExtension, R-ConceptualMappingLinks

PLCS vocabulary addresses hundreds of separate functions, including classification of items, classification of information usages (e.g. types of part identifier), classification of entity roles (e.g. date as start date) or classification of relationships (e.g. supersedes). Typical examples of terms are:

Identification_code An Identification_code is an identifier_type which is encoded according to some convention. Typically but not necessarily concatenated from parts each with a meaning. E.g. tag number, serial number, package number and document number.
Part_identification_code A Part_identification_code is an Identification_code that identifies the types of parts. For example, a part number.

CONSTRAINT: An Identification_assignment classified as a Part_identification_code can only be assigned to Part Organization_name

Owner_of An Owner_of is an Organization_or_person_in_organization_assignment that is assigning a person or organization to something in the role of owner.

For example, the owner of the car.

The vocabulary has been encoded using OWL, and is managed via the Protege OWL editor.

Requires: R-TextualDescriptionsForConcepts

2.5 Use Case #5 — CHOICE@CATCH ranking of candidate terms for description of radio and TV programs

(Contributed by Véronique Malaisé and Hennie Brugman, Vrije Universiteit Amsterdam and Max Planck Institute for Psycholinguistics. Complete description available at http://www.w3.org/2006/07/SWD/wiki/EucRankingForDescriptionDetailed and at http://www.w3.org/2006/07/SWD/wiki/EucGtaaBrowser)

Radio and television programs at the Dutch national broadcasting archive (Sound and Vision) are typically associated with contextual text descriptions: web site texts, subtitles, program guide texts, texts from the production process, etc. These context documents are used by documentalists at Sound and Vision who manually describe programs using concepts from the GTAA thesaurus (Gemeenschappelijke Thesaurus Audiovisuele Archieven - Common Thesaurus for Audiovisual Archives).

The CHOICE project (part of the Dutch CATCH research programme) uses natural language processing techniques to automatically extract candidate GTAA terms from the context documents. The application described in this section takes these candidate terms as input, and ranks them on the basis of the structure of the GTAA thesaurus. For example, the fact that "Voting" and "Democratization" are related in GTAA by a two-step path (via the "Election" term and two "related-to" links) will positively influence the ranking of these terms. Ranked terms will be presented to documentalists to speed up their description work.

The GTAA vocabulary covers a wide range of topics, as it is meant to describe anything that can be broadcast on TV or radio. It contains approximately 160,000 terms, divided into 6 disjoint facets: Keywords, Locations, Person Names, Organization-Group-Other Names, Maker Names and Genres.

The thesaurus mainly uses constructs from the ISO 2788 standard, like Broader Term, Narrower Term, Related Term and Scope Notes. Terms from all facets of the GTAA may have Related Terms, Use/Use for and Scope Notes, but only Keywords and Genres can also have Broader Term/Narrower Term relations, organizing them into a set of hierarchies. In addition to these standard features, Keyword terms are thematically classified in 88 subcategories of 16 top Categories.

Preferred Term ambachten (crafts)
Related Terms ondernemingen (ventures) , beroepen (professions), artistieke beroepen (artistic professions)
Broader Term beroepen (professions)
Narrower Terms boekbinders (bookbinders), bouwvakkers (building workers), glasblazers (glassblowers)
Scope Note niet voor afzonderlijke ambachten maar alleen als verzamelbegrip, bijv. voor (markten van) oude ambachten (not for specific crafts, only in general meaning, e.g. (markets of) old crafts)
Categories 05 economie (economy), 09 techniek (technique)

Requires: R-ConceptualRelations, R-LabelRepresentation, R-SkosSpecialization

The application, envisioned as a SOAP web service, uses a Sesame RDF web repository containing the SKOS version of the GTAA thesaurus to retrieve the 'term contexts' of the terms in the input list, which is stored in a local RDF repository.

This term context includes, for one given term, all terms that are directly connected to it by broader term, narrower term or related term relations. This includes pre-computed inter-facets links that are not part of the ISO standard, though allowed by the GTAA data model. For example, one can link a "King" in the Person facet to the general subject "Kings" and the country which this King rules.

For the ranking, it is now assumed that candidate terms that are mutually connected by thesaurus relations (directly or indirectly) are more likely to be good descriptions than isolated candidate terms. Later on, it might be interesting to differentiate between types of thesaurus relations, or to use more complex patterns of these relations.
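The ranking assumption stated above can be illustrated with a simple connectivity score. The Python sketch below is one possible reading, not the CHOICE algorithm: it counts, for each candidate, how many other candidates lie within a fixed number of thesaurus links. The link data follows the "Voting" example given earlier; "Weather" is an invented, unconnected candidate, and the function names are hypothetical.

```python
from collections import deque

# Toy related-term graph based on the "Voting" example; "Weather" is invented.
LINKS = {
    "Voting": {"Election"},
    "Election": {"Voting", "Democratization"},
    "Democratization": {"Election"},
    "Weather": set(),
}


def within(term, max_steps):
    """Terms reachable from term in at most max_steps thesaurus links."""
    distance = {term: 0}
    queue = deque([term])
    while queue:
        current = queue.popleft()
        if distance[current] == max_steps:
            continue
        for nxt in LINKS.get(current, ()):
            if nxt not in distance:
                distance[nxt] = distance[current] + 1
                queue.append(nxt)
    return set(distance) - {term}


def rank(candidates, max_steps=2):
    """Order candidates by how many other candidates lie within max_steps."""
    pool = set(candidates)
    score = {c: len(within(c, max_steps) & pool) for c in candidates}
    return sorted(candidates, key=lambda c: (-score[c], c))
```

With a two-step limit, "Voting" and "Democratization" support each other through "Election" even though "Election" is not itself a candidate, and both are ranked ahead of the isolated "Weather". Differentiating between relation types, as suggested above, would amount to weighting the links instead of counting them uniformly.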

The thesaurus-based recommendation system can also be integrated with a recommendation system that is based on co-occurrences between terms that are used in previously existing descriptions of programs.

2.6 Use Case #6 — BIRNLex: a lexicon for neurosciences

(Contributed by William Bug, Drexel University College of Medicine. Complete description available at http://www.w3.org/2006/07/SWD/wiki/EucBirnLexDetailed)

Application

General purpose and services to the end user

BIRNLex is an integrated ontology+lexicon used for various purposes - some end-user/interactive, others back-end/infrastructure - within the BIRN Project to support semantically-formal data annotation, semantic data integration, and semantically-driven, federated query resolution.

Requires: R-ConceptualMappingLinks, R-IndexingRelationship, R-LexicalMappingLinks

Functionality examples

Here are a few examples of BIRNLex class definitions that illustrate the need for lexical support and links to external knowledge sources. Our general design goals have been to use both the Dublin Core MD elements and SKOS wherever possible. Preferably we'd like to use SKOS for all lexical qualities. There are certain annotation properties that should be shared across all biomedical knowledge resources. There are other required elements specific to our needs in BIRN.

@@ TODO?: condense information (removing some fields already present, or choose 2 classes having all the fields present in these 4?) @@

Class Anterior_ascending_limb_of_lateral_sulcus
birn_annot:birnlexCurator Bill Bug
birn_annot:birnlexExternalSource NeuroNames
birn_annot:bonfireID C0262186
birn_annot:curationStatus raw import
birn_annot:neuronamesID 49
birn_annot:UmlsCui C0262186
obo_annot:createdDate "2006-10-08"^^http://www.w3.org/2001/XMLSchema#date
obo_annot:modifiedDate "2006-10-08"^^http://www.w3.org/2001/XMLSchema#date
skos:prefLabel Anterior_ascending_limb_of_lateral_sulcus
skos:scopeNote human-only

Class Medium_spiny_neuron
birn_annot:birnlexCurator Maryann Martone
birn_annot:birnlexDefinition The main projection neuron found in caudate nucleus, putamen and nucleus accumbens...
birn_annot:bonfireID BF_C000100
birn_annot:curationStatus pending final vetting
dc:source Maryann Martone
obo_annot:createdDate "2006-07-15"^^http://www.w3.org/2001/XMLSchema#date
obo_annot:modifiedDate "2006-09-28"^^http://www.w3.org/2001/XMLSchema#date
skos:prefLabel Medium_spiny_neuron

Class Fear
birn_annot:birnlexCurator Jessica Turner
birn_annot:birnlexExternalSource UMLS
birn_annot:bonfireID C0015726
birn_annot:curationStatus uncurated
birn_annot:UmlsCui C0015726
obo_annot:externallySourcedDefinition Unpleasant but normal emotional response to genuine external danger or threats; compare with ANXIETY and CLINICAL ANXIETY. (CSP)
obo_annot:externallySourcedDefinition The affective response to an actual current external danger which subsides with the elimination of the threatening condition. (MeSH)
obo_annot:createdDate "2006-06-01"^^http://www.w3.org/2001/XMLSchema#date
obo_annot:modifiedDate "2006-10-11"^^http://www.w3.org/2001/XMLSchema#date
skos:prefLabel Fear

Class Forebrain
birn_annot:birnlexCurator Allan MacKenzie-Graham
birn_annot:birnlexExternalSource NeuroNames
birn_annot:bonfireID C0085140
birn_annot:curationStatus pending final vetting
birn_annot:UmlsCui C0085140
birn_annot:birnlexDefinition The part of the brain developed from the most rostral of the three primary vesicles of the embryonic neural tube and consisting of the Diencephalon and Telencephalon.
birn_annot:neuronamesID 8
obo_annot:synonym prosencephalon
obo_annot:createdDate "2006-07-15"^^http://www.w3.org/2001/XMLSchema#date
obo_annot:modifiedDate "2006-09-28"^^http://www.w3.org/2001/XMLSchema#date
skos:prefLabel Forebrain

Requires: R-CompatibilityWithDC, R-CompatibilityWithOWL-DL, R-ConceptualRelations, R-LabelRepresentation, R-ConceptSchemeExtension

Application architecture

The following is a subset of tools either extant or in the offing:

In all of these applications, it is critical to have a clear, distinct, and shared representation for the associated lexicon. For instance, when integrating BIRN segmented brain images with those from other projects across the net, use of lexical variants from a variety of public terminologies and thesauri such as SNOMED and MeSH can provide a powerful means to largely automate semantic integration of like entities - e.g., corresponding brain region, equivalent behavioral assays described using different preferred labels/names. In providing a community-shared formalism for representing the associated lexicon, SKOS can greatly simplify this task. If, for instance, the lexical repository (collection of LUIs) contained in UMLS were represented according to SKOS, this would provide an extremely valuable resource to the community of semantically-oriented bioinformatics researchers, as well as a powerful tool to support LSI/NLP when linking to unstructured text.

Vocabularies

Titles of Vocabularies

The following is the collection of terminologies and ontologies we are linking into BIRNLex: Neuronames, Brainmap.org classification schemes, RadLex, Gene Ontology, Reactome, OBI, PATO, Subcellular Anatomy Ontology (CCDB - http://ccdb.ucsd.edu/), MeSH.

General characteristics of the vocabularies

Neuronames: brain anatomy (~750 classes and 1000s of associated lexical variants).

Brainmap.org classification: hierarchies to describe neuroanatomy, subject variables, stimulus conditions, and experimental paradigms associated with functional MRI of the nervous system.

Subcellular Anatomy Ontology: designed to describe the subcellular entities associated with ultrastructural and histological imaging of neural tissue.

Language(s) in which the vocabulary is provided

We currently are only dealing with English.

Software applications used to create and/or maintain the vocabulary, features lacking for the case

Protege-OWL.

Requires: R-CompatibilityWithOWL-DL

Standards and guidelines considered during the design and construction of the vocabulary

We have been working closely with the NCBO to adopt the OBO Foundry recommendations in the construction of our ontology. Use of SKOS elements has been a big help to us here, so that, for instance, we can create software applications specifically designed to draw on "skos:prefLabel", "obo_annot:synonym", "obo_annot:definition", etc.

Management of changes

Currently we are doing this manually in Protege-OWL, but, as mentioned above, we are moving toward a client-server infrastructure that will create an RDF-based backend store and support both curation of the ontology and annotation using the ontology via Java Portlet-based applications. BIRN has a core infrastructure staff dedicated to use of the GridSphere Java Portlet implementation framework (www.gridsphere.org).

2.7 Use Case #7 — Radlex: a lexicon for radiology

(Contributed by Curt Langlotz. Complete description available at http://www.w3.org/2006/07/SWD/wiki/EucRadLexDetailed)

Application

General purpose and services to the end user

RadLex provides a structured vocabulary of terms used in the field of radiology. Currently completed are listings of anatomic terms and "findings", which includes things that can be seen on or inferred from images produced by radiologists. These two sets include a total of about 7500 terms. A list of the terms used to describe the creation of such images, including information about the equipment used and the various imaging sequences performed, will be complete by the end of 2007.

Functionality examples

An example application demonstrating functionality is an image annotation program that reads in RadLex and gives users the ability to search for particular RadLex terms and associate them with images, post-coordinating them if necessary. Users would want to be able to retrieve RadLex terms by name or synonym.

Requires: R-ConceptualRelations, R-LabelRepresentation, R-TextualDescriptionsForConcepts, R-ConceptCoordination

Vocabularies

Titles of Vocabularies

RadLex: a lexicon for radiology

General characteristics of the vocabularies

RadLex is a taxonomy currently built predominantly using isa relations, but there are also part-of and other relations (especially for anatomy), and new relations will be added as RadLex expands. Each term has a rich set of metadata fields to include provenance information and terminological data such as synonyms, definition, and related terms from other vocabularies.

Requires: R-ConceptualRelations, R-AnnotationOnLabel

Structure of the Vocabulary

The vocabulary can be searched and browsed online at www.radlex.org.

Each term has metadata, including:

and optionally, any

Requires: R-ConceptualRelations, R-AnnotationOnLabel, R-RelationshipsBetweenLabels, R-LexicalMappingLinks

There are 9 separate hierarchies in the vocabulary: Treatment; Image acquisition, Processing and Display; Modifier; Finding; Anatomic Location; Uncertainty (to be renamed Certainty); Teaching Attribute; Relationship; and Image Quality (as seen in the screenshots above).

Each term is given a numerical ID with no inherent semantics. There are currently no relations holding between terms in different hierarchies, though this could be developed in future (e.g. linking of particular Findings to potential Anatomic Locations.)

RadLex will be available in OWL-DL.

Requires: R-CompatibilityWithOWL-DL

 

The relationships used among terms include:

For instance, in Example 3, “nervous system” has a part called “brain”, and “nervous system” contains “nervous system spaces”. The view of the hierarchy itself does not reveal the relationships among the terms; this information is found within the term features, shown in this format on the right-hand side. In this framework, the hierarchy is generated from the different relationships amongst terms, using either SPARQL or a custom interface to an application that consumes the terminology.
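As a rough illustration of how a hierarchy view can be derived from typed relationships, the following Python sketch builds an indented listing from relation triples held in memory. It stands in for the SPARQL or custom-interface route mentioned above and is not RadLex tooling. The relation names `has_part` and `contains`, and the decision that both feed the hierarchy, are assumptions. The sketch does not guard against cyclic relations.

```python
from collections import defaultdict

# Triples taken from the "nervous system" example; relation names are assumed.
RELATIONS = [
    ("nervous system", "has_part", "brain"),
    ("nervous system", "contains", "nervous system spaces"),
]
HIERARCHICAL = {"has_part", "contains"}


def build_children(relations, hierarchical=HIERARCHICAL):
    """Group (relation, child) pairs under each parent term."""
    children = defaultdict(list)
    for subject, relation, obj in relations:
        if relation in hierarchical:
            children[subject].append((relation, obj))
    return children


def render(root, children, depth=0):
    """Return the hierarchy below root as a list of indented lines."""
    lines = ["  " * depth + root]
    for relation, child in sorted(children.get(root, []), key=lambda rc: rc[1]):
        sub = render(child, children, depth + 1)
        # Annotate the child line with the relation that placed it there.
        sub[0] = "  " * (depth + 1) + f"{child} [{relation}]"
        lines.extend(sub)
    return lines
```

Calling `render("nervous system", build_children(RELATIONS))` yields the root followed by its two children, each tagged with the relationship that links it to the parent, so the information that the plain hierarchy view hides remains visible.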

Language(s) in which the vocabulary is provided

English, with plans to include other languages (e.g., German)

Requires: R-MultilingualLexicalInformation

Machine-readable representation of the vocabulary

Protégé and XML versions of the vocabulary are available at http://www.radlex.org/radlex/docs/downloads.html

Software applications used to create and/or maintain the vocabulary, features lacking for the case

Protege

Standards and guidelines considered during the design and construction of the vocabulary

We used basic guidelines from Cimino and Chute, such as ensuring that a term only corresponds to one concept. As we are developing the terminology into a more structured form, with more types of relationships, we are allowing different parents, as long as the relationship type is different. E.g. one ISA parent, one PART-OF parent, etc. We relied on SNOMED and the American College of Radiology Index as a starting point for terminology development.
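The rule that a term may have several parents only if each is reached through a different relationship type lends itself to a simple check. The Python sketch below is a hypothetical helper, not part of the RadLex tooling; the parent terms used in the usage note are invented.

```python
from collections import Counter


def duplicate_parent_types(parent_links):
    """Return relation types that occur more than once among a term's parents.

    parent_links is a list of (relation_type, parent_term) pairs.
    """
    counts = Counter(relation for relation, _ in parent_links)
    return sorted(r for r, n in counts.items() if n > 1)
```

A term with one ISA parent and one PART-OF parent passes the check (the result is an empty list), whereas a term with two ISA parents would be reported with "ISA" in the result.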

Management of changes

Potential changes are submitted to the chair of the RadLex Steering Committee of the Radiological Society of North America, who consults with the relevant lexicon development committee. Accepted changes are periodically incorporated into the vocabulary. The first release was made public in November 2006.

Vocabulary mappings

We are developing a mapping to the corresponding terms/codes in SNOMED (Systematized Nomenclature of Medicine) and the ACR (American College of Radiology) Index.

From a representational point of view, this mapping shall consist of equivalence and specialization links. Later, we expect people to compose atomic terms (post-coordination) to describe composite entities.

Requires: R-ConceptCoordination

2.8 Use Case #8 — NSDL Metadata Registry

(Contributed by Jon Phipps, Cornell University. Complete description available at http://www.w3.org/2006/07/SWD/wiki/RucMetadataRegistryExtended)

The NSDL Registry is intended to provide a complete environment for the development and management of controlled vocabularies. Services are primarily directed at vocabulary owners and include provisions for:

The registry currently has a number of vocabularies registered. A sample entry, showing a vocabulary/scheme and a single concept, is given below.

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
    xmlns="http://www.w3.org/2004/02/skos/core#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:skos="http://www.w3.org/2004/02/skos/core#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
<!-- Scheme: NSDL Education Level Vocabulary -->
    <skos:ConceptScheme rdf:about="http://metadataregistry.org/uri/NSDLEdLvl">
        <dc:title>NSDL Education Level Vocabulary</dc:title>
        <skos:hasTopConcept rdf:resource="http://metadataregistry.org/uri/NSDLEdLvl/1000"/>
        <skos:hasTopConcept rdf:resource="http://metadataregistry.org/uri/NSDLEdLvl/1018"/>
    </skos:ConceptScheme>

<!-- Concept: Pre-Kindergarten  -->
    <skos:Concept rdf:about="http://metadataregistry.org/uri/NSDLEdLvl/1002">
        <skos:inScheme rdf:resource="http://metadataregistry.org/uri/NSDLEdLvl"/>
        <skos:prefLabel>Pre-Kindergarten</skos:prefLabel>
        <skos:altLabel>Nursery School</skos:altLabel>
        <skos:altLabel>Pre-K</skos:altLabel>
        <skos:altLabel>Preschool</skos:altLabel>
        <skos:broader rdf:resource="http://metadataregistry.org/uri/NSDLEdLvl/1001"/>
        <skos:definition>Activities and/or experiences that are intended to effect developmental changes in children, from birth to entrance in kindergarten (or grade 1 when kindergarten is not attended) [ERIC]</skos:definition>
        <skos:historyNote>Term source: http://www.ed.gov</skos:historyNote>
    </skos:Concept>
</rdf:RDF>
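
To show how a consuming application might read such an entry, here is a minimal Python sketch that extracts the labels and broader links of each concept. It uses only the standard library and works on a trimmed copy of the concept entry above. The function name `read_concepts` is an assumption, and the code handles only this particular RDF/XML layout; a general-purpose RDF parser would be needed for other serializations of the same graph.

```python
import xml.etree.ElementTree as ET

SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Trimmed, well-formed copy of the concept entry from the registry sample.
SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://metadataregistry.org/uri/NSDLEdLvl/1002">
    <skos:inScheme rdf:resource="http://metadataregistry.org/uri/NSDLEdLvl"/>
    <skos:prefLabel>Pre-Kindergarten</skos:prefLabel>
    <skos:altLabel>Nursery School</skos:altLabel>
    <skos:altLabel>Pre-K</skos:altLabel>
    <skos:altLabel>Preschool</skos:altLabel>
    <skos:broader rdf:resource="http://metadataregistry.org/uri/NSDLEdLvl/1001"/>
  </skos:Concept>
</rdf:RDF>"""

def read_concepts(xml_text):
    """Map each concept URI to its preferred label, alternative labels and broader links."""
    root = ET.fromstring(xml_text)
    concepts = {}
    for node in root.findall(f"{{{SKOS}}}Concept"):
        uri = node.get(f"{{{RDF}}}about")
        concepts[uri] = {
            "prefLabel": node.findtext(f"{{{SKOS}}}prefLabel"),
            "altLabels": [e.text for e in node.findall(f"{{{SKOS}}}altLabel")],
            "broader": [e.get(f"{{{RDF}}}resource")
                        for e in node.findall(f"{{{SKOS}}}broader")],
        }
    return concepts

concepts = read_concepts(SAMPLE)
```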

Other use cases

The SWD Working Group maintains on its wiki site the complete list of descriptions that were sent following its call for use cases:

@@ TODO: update/remove this list, depending on final decision wrt. these (still unedited) contributions @@

Other case descriptions, in original format:

Requirements

The use cases presented in the previous section motivate a number of requirements that the SKOS specification must or should meet in order to fulfill its aim as a standard model for porting simple concept schemes to the Semantic Web. Depending on the level of consensus they have reached in the Working Group, these requirements are categorized as either accepted or candidate.

Note: in the following, to avoid ambiguities, vocabulary will be used to refer to the SKOS vocabulary, that is, the set of constructs (classes, properties) introduced in the SKOS model. Concept Scheme will be used to refer to the objects built with SKOS, that is, the application-specific collections of concepts that are mentioned in SKOS use cases.

Accepted requirements

R-ConceptualRelations. Representation of relationships between concepts

The SKOS model shall provide semantic relations that can hold between concepts, for display or search purposes. Typical examples are the hierarchical relations broader than (BT) and narrower than (NT), and the non-hierarchical associative relation related to (RT).

Motivation: Tgn, Manuscripts, Aims, ProductLifeCycleSupport, RankingForDescription, etc.

R-LabelRepresentation. Representation of basic lexical values (labels) associated with concepts

The SKOS model shall provide means to represent the labels (preferred or not) of a concept, for display or search purposes.

Motivation: Tgn, Manuscripts, Aims, RankingForDescription, etc.

R-TextualDescriptionsForConcepts. Representation of textual descriptions attached to concepts

The SKOS model shall provide means to represent all kinds of descriptive notes that could help in understanding the elements of concept schemes, e.g. scope notes explaining the way concepts are used to describe documents.

Motivation: Aims, ProductLifeCycleSupport, TacticalSituationObject, BirnLexDetailed, etc.

R-MultilingualLexicalInformation. Representation of lexical information in multiple natural languages

The lexical information specified in concept schemes (labels, but also definitions and notes) could come in different natural languages. A typical example is the case of a multilingual concept scheme with concepts having labels translated (possibly only partially) into several languages.

Motivation: Manuscripts, Aims, RadLex

R-ConceptSchemeExtension. Extension of concept schemes

A concept scheme might be locally extended with new concepts referring to existing ones, e.g. as specializations of these.

Motivation: Manuscripts, BirnLex, ProductLifeCycleSupport

R-SkosSpecialization. Local specialization of SKOS vocabulary

For particular situations, the designer of a SKOS concept scheme should be able to introduce new model-level classes and properties, and link them to existing SKOS constructs. Possible cases include the creation of specific kinds of textual definitions or notes for concepts, or the specification of new types of concepts.

Motivation: Manuscripts, Tgn, Aims, Biozen, RankingForDescription

@@ Linked to SKOS-I-extension-6, SKOS-I-SpecializationOfRelationships @@

@@ TODO: shall it be moved to candidate, as there are 2 linked issues? @@

In order to build links between concepts coming from different concept schemes, SKOS should provide appropriate semantic relationships. Possible links, similar to those found in the existing SKOS and SKOS mapping [SWBP-SKOS-MAPPING] vocabularies, include concept equivalence and specialization/generalization relations.

Motivation: Manuscripts, Aims, ProductLifeCycleSupport, BirnLex

Candidate requirements

R-RelationshipsBetweenLabels. Representation of links between labels associated with concepts

The SKOS model shall provide means to represent relationships between the terms associated with concepts. Typical examples are translation links between labels from different languages, or the link between a label and its abbreviation, when the latter serves as an alternative label for the concept.

Motivation: Manuscripts, Aims, RadLex

R-AnnotationOnLabel. Ability to represent annotations on lexical items

Labels, which are currently modeled as literals in SKOS, as well as possibly other literals, are valid subjects of discourse when modelling concept schemes, e.g. when recording the dates during which a particular label was in common use. However, in RDF, only resources may be subjects of statements, and literals may only be objects of statements. The question then arises of how to annotate labels and other literals, that is, how to relate them, as subjects, to other entities.

Motivation: RadLex

@@ Linked to SKOS-I-AnnotationOnLabel @@
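
The constraint behind R-AnnotationOnLabel can be made concrete with a small Python sketch that models statements as triples and rejects literal subjects. The second half shows one possible workaround, in which the label is promoted to a resource of its own. That workaround is an assumption for illustration only, not a decision of the Working Group, and all `ex:` names and the sample values are hypothetical.

```python
from typing import NamedTuple, Union

class Literal(NamedTuple):
    """An RDF-style literal: allowed only in the object position."""
    value: str
    lang: str = ""

class Resource(NamedTuple):
    """An RDF-style resource identified by a (hypothetical) name."""
    uri: str

Node = Union[Resource, Literal]

def statement(subject: Node, predicate: str, obj: Node):
    """Build a triple, rejecting literal subjects as RDF does."""
    if isinstance(subject, Literal):
        raise TypeError("a literal cannot be the subject of a statement")
    return (subject, predicate, obj)

concept = Resource("ex:concept1")
label = Literal("Pre-K", "en")

# Annotating the literal directly is not expressible.
try:
    statement(label, "ex:usedUntil", Literal("1990"))
    literal_subject_allowed = True
except TypeError:
    literal_subject_allowed = False

# Possible workaround (assumption): give the label its own resource node,
# which can then carry both the literal form and the annotation.
label_node = Resource("ex:label1")
triples = [
    statement(concept, "ex:hasLabel", label_node),
    statement(label_node, "ex:literalForm", label),
    statement(label_node, "ex:usedUntil", Literal("1990")),
]
```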

R-IndexingAndNonIndexingConcepts. Ability to distinguish between concepts to be used for indexing and for non-indexing

SKOS should provide different classes for the conceptual entities that can be used for indexing resources and for those that cannot be used for such a purpose (e.g. specific qualifiers that can only be used to narrow down the meaning of an existing concept).

Motivation: Manuscripts, UDC, Rameau

@@ Linked to SKOS-I-IndexingAndNonIndexingConcepts, SKOS-I-coordination-8 @@

R-GroupingInConceptHierarchies. Ability to include grouping constructs in concept hierarchies in thesauri

Concept schemes can contain elements (arrays, guide terms, etc.) used to group normal concepts together, e.g. based on a shared semantic property. While these special elements cannot be used for description purposes, they can be introduced in a concept scheme's hierarchy by means of generalization and specialization links.

@@ Linked to SKOS-I-GroupingInConceptHierarchies, SKOS-I-collections-5 @@

R-ConceptCoordination. Coordination of concepts

SKOS should provide the ability to create new concepts from existing ones, e.g. by using special qualifiers that add a shade of meaning to a normal concept.

Motivation: Manuscripts, RadLex, UDC, Rameau

R-IndexingRelationship. Ability to represent the indexing relationship between a resource and a concept that indexes it

The SKOS model should contain mechanisms to attach a given resource (e.g. corresponding to a document) to a concept the resource is about, e.g. to query for the resources described by a given concept.

Motivation: Manuscripts, Biozen, Aims, BirnLex

@@ Linked to SKOS-I-IndexingRelationship @@

In the process of mapping different concept schemes, it should be possible to identify correspondence links not only between concepts from these concept schemes, but also between the labels that can be attached to these concepts.

Motivation: RadLex, BirnLex

@@ Linked to SKOS-I-LexicalMappingLinks @@

R-CompatibilityWithDC. Compatibility between SKOS and Dublin Core Abstract Model

Use of the SKOS model shall be compatible with use of the Dublin Core Abstract Model [DCAM]. Where there are links between SKOS features and Dublin Core ones, these shall be specified.

Motivation: BirnLex

@@ Linked to SKOS-I-CompatibilityWithDC @@

R-CompatibilityWithISO11179. Compatibility between SKOS and ISO 11179 (Part 3)

The SKOS model shall be compatible with Part 3 of the ISO 11179 specification [ISO11179-3].

@@ Linked to SKOS-I-CompatibilityWithISO11179 @@

R-CompatibilityWithISO2788. Compatibility between SKOS and ISO 2788

The SKOS model shall be compatible with the ISO 2788 specification [ISO2788].

@@ Linked to SKOS-I-CompatibilityWithISO2788 @@

R-CompatibilityWithISO5964. Compatibility between SKOS and ISO 5964

The SKOS model shall be compatible with the ISO 5964 specification [ISO5964].

@@ Linked to SKOS-I-CompatibilityWithISO5964 @@

R-CompatibilityWithOWL-DL. OWL-DL compatibility

The SKOS model should comply with OWL-DL. This may require OWL to allow subproperties of annotation properties, but there are other solutions.

Motivation: Biozen, BirnLex, RadLex

@@ Linked to SKOS-I-owlImport-7, SKOS-I-Semantics-10 @@

R-ConsistencyChecking. Checking the consistency of a concept scheme

Some SKOS applications might require testing the integrity of their concept scheme data. For example, conceptual relationships should only hold between individuals of type skos:Concept, and not, for example, between the (non-preferred) labels of concepts.

Motivation: GtaaBrowser, MetadataRegistry

@@ Linked issue: SKOS-I-Semantics-10 @@
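
The kind of integrity test described for R-ConsistencyChecking might look like the following minimal Python sketch. It treats statements as plain (subject, predicate, object) tuples and flags every conceptual relation whose subject or object is not explicitly typed as skos:Concept. The function name `find_violations` and the `ex:` data are hypothetical.

```python
RDF_TYPE = "rdf:type"
SKOS_CONCEPT = "skos:Concept"
CONCEPTUAL_RELATIONS = {"skos:broader", "skos:narrower", "skos:related"}

def find_violations(triples):
    """Return conceptual-relation triples that link a node not typed as skos:Concept."""
    concepts = {s for s, p, o in triples if p == RDF_TYPE and o == SKOS_CONCEPT}
    return [
        (s, p, o)
        for s, p, o in triples
        if p in CONCEPTUAL_RELATIONS and not (s in concepts and o in concepts)
    ]

# Hypothetical data: the last statement relates a concept to a label string.
TRIPLES = [
    ("ex:cats", RDF_TYPE, SKOS_CONCEPT),
    ("ex:animals", RDF_TYPE, SKOS_CONCEPT),
    ("ex:cats", "skos:broader", "ex:animals"),
    ("ex:cats", "skos:related", "felines"),
]

violations = find_violations(TRIPLES)
```

This is a closed-world check: it assumes that every concept carries an explicit type statement. Under RDF's open-world reading, a missing type is not by itself an inconsistency, so an application would have to decide which interpretation it needs.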

R-ConceptSchemeContainment. Ability to explicitly represent the containment of any SKOS individual or statement within a concept scheme

It shall be possible to explicitly represent the containment within a concept scheme of any individual that is an instance of a SKOS class (e.g. skos:Concept), or of any statement that uses a SKOS property as its predicate (e.g. skos:broader).

@@ Linked to SKOS-I-ConceptSchemeContainment @@

R-MappingProvenanceInformation. Ability to record provenance information on mappings between concepts in different concept schemes

It shall be possible to record provenance information on mappings between concepts in different concept schemes.

Motivation: MetadataRegistry

@@ Linked to SKOS-I-MappingProvenanceInformation @@

References

[DCAM]
DCMI Abstract Model, A. Powell, M. Nilsson, A. Naeve, P. Johnston, 7 March 2005.
[ISO11179-3]
ISO/IEC 11179-3: 2003(E), Information Technology – Metadata Registries (MDR) – Part 3: Registry metamodel and basic attributes, Second edition. R. Gates, Editor, 15 February 2003.
[ISO2788]
ISO 2788:1986 Documentation - Guidelines for the establishment and development of monolingual thesauri. Second edition. ISO TC 46/SC 9, 1986.
[ISO5964]
ISO 5964:1985 Documentation - Guidelines for the establishment and development of multilingual thesauri. First edition. ISO TC 46/SC 9, 1985.
[SWBP-SKOS-CORE-GUIDE]
SKOS Core Guide, A. Miles, D. Brickley, Editors, W3C Working Draft (work in progress), 2 November 2005. Latest version available at http://www.w3.org/TR/swbp-skos-core-guide.
[SWBP-SKOS-CORE-SPEC]
SKOS Core Vocabulary Specification, A. Miles, D. Brickley, Editors, W3C Working Draft (work in progress), 2 November 2005. Latest version available at http://www.w3.org/TR/swbp-skos-core-spec.
[SWBP-SKOS-MAPPING]
SKOS Mapping Vocabulary Specification, A. Miles, D. Brickley, Editors, W3C Working Draft (work in progress), 11 November 2004. Latest version available at http://www.w3.org/2004/02/skos/mapping/spec.
[SWBPD]
The Semantic Web Best Practices and Deployment Working Group
[SWD]
The Semantic Web Deployment Working Group
[SWD-Charter]
Semantic Web Deployment Working Group (SWDWG) Charter

Acknowledgments

The editors gratefully acknowledge contributions from:

@@ TODO @@