Use cases for a thesaurus service SWAD-Europe

Project name:

W3C Semantic Web Advanced Development for Europe (SWAD-Europe)

Project Number:

IST-2001-34732

Workpackage name:

8. Thesaurus Research Prototype

Workpackage description:

http://www.w3.org/2001/sw/Europe/plan/workpackages/live/esw-wp-8.html

Deliverable title:

N/A

URI:

N/A

Authors:

Nikki Rogers

Abstract:

This document contributes to Workpackage 8. It presents the results of our initial efforts to collect a public list of use cases for applications of thesaurus and similar services in distributed environments. These use cases will serve as the basis for defining a web service API for access to a thesaurus service. That API definition will in turn form the basis for work on Deliverable 8.7, the development of a public Research Prototype Demonstrating RDF Thesaurus Technology. We intend the development of these use cases to be conducted as a public discussion.

Section 1: Context/Scope

Section 2: Human End-user Use Cases

Section 3: Machine-to-Machine Use Cases

Section 4: Towards an API specification: is there a common 'set of questions' arising from the above Use Cases?

GENERAL TODOS

Leaving the multilingual problem aside for now, I suggest the following as a core API:


getConcept(URI uniqueidentifier)

getConcept(Literal descriptor, URI thesaurus)

getConcept(Literal externalID, URI thesaurus)

	--> Returns a single 'Concept' data structure, including all the labels.
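As a rough illustration of the three getConcept variants, the following Python sketch uses an in-memory store. The field names on `Concept`, the `ThesaurusService` class, and the use of keyword arguments in place of overloading are all assumptions made for this example; they are not part of the proposed API or of the 8.1 schemas.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Concept:
    """Hypothetical 'Concept' data structure carrying all of its labels."""
    uri: str                           # unique identifier of the concept
    thesaurus: str                     # URI of the thesaurus it belongs to
    descriptor: str                    # preferred label
    external_id: Optional[str] = None  # identifier used by the source thesaurus
    alt_labels: tuple = ()             # non-preferred labels

    def labels(self):
        """Return the descriptor followed by the non-preferred labels."""
        return (self.descriptor,) + tuple(self.alt_labels)


class ThesaurusService:
    """In-memory stand-in for a thesaurus service (illustrative only)."""

    def __init__(self, concepts):
        self._by_uri = {c.uri: c for c in concepts}
        self._by_descriptor = {(c.thesaurus, c.descriptor): c for c in concepts}
        self._by_external_id = {
            (c.thesaurus, c.external_id): c
            for c in concepts
            if c.external_id is not None
        }

    def get_concept(self, uri=None, *, descriptor=None, external_id=None,
                    thesaurus=None):
        """Look up one concept by URI, or by descriptor/external ID within a thesaurus."""
        if uri is not None:
            return self._by_uri.get(uri)
        if thesaurus is None:
            raise ValueError("a thesaurus URI is required without a concept URI")
        if descriptor is not None:
            return self._by_descriptor.get((thesaurus, descriptor))
        if external_id is not None:
            return self._by_external_id.get((thesaurus, external_id))
        raise ValueError("give a uri, a descriptor, or an external_id")
```

Returning `None` for an unknown concept is one possible design choice; a real service might signal a fault instead.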


matchConcepts(String regularExpression)

matchConcepts(String regexp, URI thesaurus)

	--> Returns a list of possible concepts, ordered according to likelihood of match.
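The proposal does not say how "likelihood of match" would be computed. The sketch below assumes one simple heuristic (a whole-label match ranks above a match at the start of a label, which ranks above a match elsewhere in the label); the function name, the shape of the input dictionary, and the scoring are hypothetical.

```python
import re


def match_concepts(pattern, labels_by_concept, thesaurus=None):
    """Return concept URIs whose labels match `pattern`, best matches first.

    `labels_by_concept` maps a concept URI to a (thesaurus URI, labels) pair.
    The scoring is an assumed heuristic, not part of the proposed API.
    """
    regex = re.compile(pattern, re.IGNORECASE)
    scored = []
    for uri, (scheme, labels) in labels_by_concept.items():
        if thesaurus is not None and scheme != thesaurus:
            continue  # restrict the search to one thesaurus if requested
        best = 0
        for label in labels:
            if regex.fullmatch(label):
                score = 3  # the whole label matches
            elif regex.match(label):
                score = 2  # the label starts with a match
            elif regex.search(label):
                score = 1  # a match occurs somewhere in the label
            else:
                score = 0
            best = max(best, score)
        if best:
            scored.append((best, uri))
    # Highest score first; ties are broken by URI to keep the order stable.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [uri for _, uri in scored]
```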


getSupportedSemanticRelations()

getSupportedSemanticRelations(URI thesaurus)

	--> Returns a list of supported semantic relations (e.g. broader, narrower, is-a, etc.), each with a unique URI and a description of its meaning.
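A minimal sketch of what the returned list might look like, assuming each relation is a small record with a URI and a description. The `SemanticRelation` fields and the `supported` registry (thesaurus URI mapped to a list of relations) are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SemanticRelation:
    """Hypothetical record for one supported relation."""
    uri: str
    description: str


def get_supported_semantic_relations(supported, thesaurus=None):
    """List relations for one thesaurus, or for all thesauri when none is given.

    `supported` maps a thesaurus URI to the relations that thesaurus uses.
    """
    if thesaurus is not None:
        return list(supported.get(thesaurus, []))
    seen = {}
    for relations in supported.values():
        for rel in relations:
            seen.setdefault(rel.uri, rel)  # keep each relation URI once
    return list(seen.values())
```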


getConceptRelatives(URI conceptURI)

getConceptRelatives(Literal descriptor, URI thesaurus)

getConceptRelatives(Literal externalID, URI thesaurus)

getConceptRelatives(URI conceptURI, SemanticRelation rel)

getConceptRelatives(Literal descriptor, URI thesaurus, SemanticRelation rel)

getConceptRelatives(Literal externalID, URI thesaurus, SemanticRelation rel)

	--> Returns a list of relatives of the concept, as specified.
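The getConceptRelatives variants differ only in how the concept is identified and in whether a relation filter is given. The sketch below covers the URI-based variants with an in-memory edge list; `RelationStore` and its `add` method are hypothetical names, and the descriptor and external-ID variants are assumed to resolve to a concept URI first.

```python
from collections import defaultdict


class RelationStore:
    """In-memory store of (relation, target) edges per concept (illustrative only)."""

    def __init__(self):
        # concept URI -> list of (relation URI, related concept URI)
        self._edges = defaultdict(list)

    def add(self, source, relation, target):
        """Record that `source` is linked to `target` by `relation`."""
        self._edges[source].append((relation, target))

    def get_concept_relatives(self, concept_uri, rel=None):
        """Return related concept URIs, optionally limited to one relation."""
        return [
            target
            for relation, target in self._edges.get(concept_uri, [])
            if rel is None or relation == rel
        ]
```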


This API is designed to be consistent with the way thesaurus data is modelled in the schemas in

http://www.w3c.rl.ac.uk/SWAD/deliverables/8.1_0_3.html

Section 1: Context/Scope

We consider the use cases both in terms of human end-user requirements of an online thesaurus (or similar) service, and also of machine-to-machine (M2M) requirements.

Section 2: Human End-user Use Cases

1 - Marking up general web resources for exposure in 'the semantic web'

2 - Marking up resources for a specific user community. Similar to 1, but where, for example, the user is a SOSIG (social sciences) cataloguer

3 - Alistair Miles' use case: tool support for better searching and also browsing using web search engines such as Google. By "better searching" we tend to mean improved query recall (i.e. the user's search term is expanded with synonyms/partial equivalents). By browsing we tend to mean support for the user in narrowing down/refining their search term(s) in order to produce greater accuracy/relevancy in search results.

4 - Similar to 3. but in a specialist community environment such as that of a SOSIG end user

5 - Dan Brickley's use case: multilingual IMAGE retrieval (This is the case where a user expresses a query to recall images with embedded metadata, but needs their query term to be translated into different languages. Are there overlaps here with the SIMILE use cases?)

6 - Charles McCathieNevile - Multilingual support. Similar to 5, but for a specific community: an end-user requires translation services e.g. for the W3C glossary (this is a requirement for term mappings across languages in specific contexts)

Note 1: At the User Interface level, the user may in many cases require visualisation tools for cross-walking multiple thesauri, for example a tool like Protege. We will make a design decision for deliverable 8.7 regarding whether to use such tools with the demonstrator or to keep the User Interface level out of scope (noting that, as this is a prototype web service, it would be nice to be able to browse thesaurus data online).

Note 2: Again at the User Interface level, we are aware that the sort of browse and search support indicated by use cases 3 and 4 above might confuse the user. For example, when using a browse facility to refine a search term, an end-user may not realise that they are browsing terms, and may instead think they are browsing resources. Likewise, when using thesaurus-enabled search support, the user may find result sets confusing because their original search term will often not appear in the results (synonyms or partial equivalents appear instead). As with Note 1, we will make a design decision for deliverable 8.7 regarding whether to keep User Interface issues in scope.
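The query expansion described in use case 3 can be sketched as follows. This is a minimal illustration that assumes the synonyms have already been fetched from a thesaurus service into a plain dictionary; the function names and the quoted OR-query output are hypothetical and do not reflect any particular search engine's syntax.

```python
def expand_query(term, synonyms):
    """Return the user's term followed by its synonyms, without duplicates.

    `synonyms` maps a lower-cased term to a list of synonyms or partial
    equivalents, as a thesaurus service might supply them.
    """
    expanded = [term]
    seen = {term.lower()}
    for alternative in synonyms.get(term.lower(), []):
        if alternative.lower() not in seen:
            seen.add(alternative.lower())
            expanded.append(alternative)
    return expanded


def to_or_query(terms):
    """Join the terms into one query string of quoted alternatives."""
    return " OR ".join('"{}"'.format(t) for t in terms)
```

Keeping the user's original term first in the expanded list is one way a user interface could make clear which results come from the original term and which from its synonyms, which bears on the concern raised in Note 2.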

Section 3: Machine-to-Machine Use Cases

1 - Cross-search support to give ('invisibly') better query recall across a set of data repositories, e.g. this would extend a tool like the JISC-funded Subject Portal Project (SPP) cross-search. [Note 2, above, applies]

2 - Cross-browse support to allow end-users to "seamlessly" browse a hierarchy of categories represented across a set of available data repositories in order to refine their search terms, for example when two or more KOSs have been "federated". This would complement an online subject-specialist service, for example the JISC-funded Subject Portal Project (SPP) cross-search.

3 - "Finding the right thesaurus" (Dave Reynolds suggested this use case), for use in a semantic portal (see "layers on the thesaurus service", point 2 below)

[Note: JISC IE Geo-spatial centralised service - related scenarios?]

4 - Dan Brickley's suggestion (re Bized/Sosig trials and the Desire project): take two different data services (which for the Desire project were a couple of internet catalogues at ILRT), each using different schemes, and exploit mappings between the taxonomies to merge data into a single environment.
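As a rough sketch of use case 4, the following Python function merges two catalogues that classify resources under different schemes, using a one-way mapping between category identifiers. The data shapes (resource URL mapped to a set of category identifiers) and the function name are assumptions for illustration; they do not describe how the Desire project catalogues were actually structured.

```python
def merge_catalogues(catalogue_a, catalogue_b, b_to_a):
    """Merge catalogue B into catalogue A's classification scheme.

    Each catalogue maps a resource URL to a set of category identifiers.
    `b_to_a` maps a category in B's scheme to its equivalent in A's scheme.
    Returns the merged catalogue and the set of B categories with no mapping.
    """
    merged = {url: set(categories) for url, categories in catalogue_a.items()}
    unmapped = set()
    for url, categories in catalogue_b.items():
        target = merged.setdefault(url, set())
        for category in categories:
            if category in b_to_a:
                target.add(b_to_a[category])
            else:
                unmapped.add(category)  # report rather than guess a mapping
    return merged, unmapped
```

Reporting unmapped categories separately, rather than dropping them silently, leaves room for a human cataloguer to extend the mapping.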

SWAD-Europe: Use Cases for a Thesaurus Service

Contents

Section 1: Context/Scope

Section 2: Human End-user Use Cases

Section 3: Machine-to-Machine Use Cases

Section 4: Towards an API specification: is there a common 'set of questions' arising from the above Use Cases?

Section 5: Are there layers of functionality we would want to specify as part of the Thesaurus Service API specification?