Case studies

From Best Practices for Multilingual Linked Open Data Community Group

This space documents "real" examples in which a set of best practices could help in dealing with multilingualism and linked data.

In principle, the template is the same as the one used in the Use Cases definition.

Generation of Multilingual Linked Data for Libraries

Title Generation of Multilingual Linked Data for Libraries
Identifier GENLIB
Contributor/s Daniel Vila-Suero, Gordon Dunsire
Topics covered
Additional Remarks

International Product Classification Systems

Title Product Classification Systems
Identifier PCS
Contributor/s Jose María Álvarez Rodríguez, Jose Emilio Labra Gayo
Description Product classification schemes that can be used in e-Commerce, e-Procurement, etc.
Example CPV (Common Procurement Vocabulary)
Topics covered Content Publishing
Additional Remarks

Translations of multilingual terminologies for libraries

Title Translations of multilingual terminologies for libraries
Identifier TRANMUL
Contributor/s Gordon Dunsire
Description The Universal Machine-Readable Catalogue (UNIMARC) format for bibliographic records [1] contains a set of codes for the type of score in notated music resources. The attribute is encoded in UNIMARC as tag 125, subfield a, character position 0. Each code is equated with an English term. The full format includes a definition and usage instruction as appropriate, and a list of terms in English, French, German, Italian, Spanish, Hungarian, and Russian introduced by "Use for the following musical presentation statements:".
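As a minimal sketch of how a consumer might read this attribute, the Python below takes the content of subfield a of tag 125 and looks up the code at character position 0. The function name and the lookup table are hypothetical. Only the code "a", which this case study equates with "full score", is filled in, and the rest of the code list is deliberately left out.

```python
from typing import Optional

# Hypothetical lookup table. Only code "a" is taken from this case study;
# the remaining codes of the UNIMARC list are intentionally not reproduced.
SCORE_TYPE_TERMS = {"a": "full score"}


def type_of_score(subfield_a: str) -> Optional[str]:
    """Return the English term equated with the code at character position 0.

    `subfield_a` is the raw content of tag 125, subfield a.
    Returns None when the subfield is empty or the code is not in the table.
    """
    if not subfield_a:
        return None
    return SCORE_TYPE_TERMS.get(subfield_a[0])
```

Keeping the table separate from the lookup means a translated term list could be swapped in without touching the parsing step.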

In general, a musical presentation statement may be transcribed from the resource or constructed according to local cataloguing rules. The code list itself notes "Terms used here as examples are suggestive, not exclusive or restrictive".

IFLA (International Federation of Library Associations and Institutions) bibliographic standards, including UNIMARC, are intended for use in a multilingual environment, but translations of the original English text are developed ad hoc. The concise UNIMARC bibliographic format [2], which excludes the multilingual terms, is available in Italian [3] and Portuguese [4] translations.

The value vocabulary for the code [5] is represented in RDF/SKOS following standard IFLA practice, and a language attribute is assigned to every appropriate value. In particular, the equated term is an instance of skos:prefLabel. The multilingual terms are instances of skos:altLabel, using the standard interpretation of the "Use for ..." relationship.

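This results in a semantic collision when a translation is developed: the translated equated term becomes a skos:prefLabel, but the same term may already be recorded as a skos:altLabel, and a preferred label and an alternative label in the same language cannot have the same value.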
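The collision can be illustrated with a short Python sketch. It models the labels of one concept as plain (property, value, language) tuples, which is an assumption made for illustration and not the registry's actual data model, and reports any value that appears under more than one label property in the same language. The function name find_collisions is hypothetical.

```python
from collections import defaultdict

# Labels of the concept for code "a", as (property, value, language) tuples.
labels = [
    ("skos:prefLabel", "full score", "en"),
    ("skos:altLabel", "score", "en"),
    # From the multilingual "Use for ..." list in the English full format:
    ("skos:altLabel", "partitura, parte con guida", "it"),
    # From the Italian translation of the concise format:
    ("skos:prefLabel", "partitura, parte con guida", "it"),
]


def find_collisions(labels):
    """Return (value, language) pairs used by more than one label property."""
    properties_by_label = defaultdict(set)
    for prop, value, lang in labels:
        properties_by_label[(value, lang)].add(prop)
    return sorted(
        key for key, props in properties_by_label.items() if len(props) > 1
    )


collisions = find_collisions(labels)
```

Running a check of this kind before a translation is published would flag the Italian term, while the English labels pass because "full score" and "score" are different values.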

There is no guarantee that a translation will be developed, or when, so the best-practice resolution is to deprecate the skos:altLabel before creating the skos:prefLabel.

A further issue arises with examples of multiple terms in the same language, for example "English: score, full score, performance score, playing score". Each of these is treated as a separate skos:altLabel, except for "full score", which is already the skos:prefLabel for the code. Where no translation supplies the equated term, no term is excepted: in "German: Partitur, Orchesterpartitur, Spielpartitur", for example, every term is a skos:altLabel. If a German translation is developed and one of these terms becomes the equated term, then a similar semantic collision will require action.
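The treatment of multiple terms can be sketched in Python as follows. The function name, the language-tag mapping, and the input line format are assumptions made for illustration. The sketch splits a "Language: term, term, ..." line into separate alternative labels and skips a term only when it is already the preferred label for that language.

```python
# Assumed mapping from the language names used in the term lists to tags.
LANGUAGE_TAGS = {"English": "en", "German": "de"}


def alt_labels_from_terms(line, pref_labels):
    """Split a 'Language: term, term, ...' line into (value, language) pairs.

    `pref_labels` maps a language tag to the existing skos:prefLabel, if any.
    A term equal to that preferred label is left out of the result.
    """
    language, _, terms = line.partition(":")
    lang = LANGUAGE_TAGS[language.strip()]
    pref = pref_labels.get(lang)
    result = []
    for term in terms.split(","):
        term = term.strip()
        if term and term != pref:
            result.append((term, lang))
    return result


prefs = {"en": "full score"}  # no German preferred label exists yet
english = alt_labels_from_terms(
    "English: score, full score, performance score, playing score", prefs
)
german = alt_labels_from_terms(
    "German: Partitur, Orchesterpartitur, Spielpartitur", prefs
)
```

With these inputs the English list yields three alternative labels, because "full score" is skipped, and the German list yields all three terms.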

Example The code "a" is equated to the English term "full score".

The Italian translation equates the code with the same term or set of terms that the full format gives among its examples: the translation has "a = partitura, parte con guida", and the full format has "Italian: partitura, parte con guida". The English full format therefore yields skos:altLabel "partitura, parte con guida"@it, and the Italian concise format yields skos:prefLabel "partitura, parte con guida"@it, but skos:altLabel and skos:prefLabel are disjoint.

So the skos:altLabel "partitura, parte con guida"@it should be deprecated before publishing skos:prefLabel "partitura, parte con guida"@it.
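The order of operations can be sketched in Python as below. How deprecation is actually recorded is not stated here, so the status field, the Label class, and the publish_pref_label function are all hypothetical. The sketch marks a matching alternative label as deprecated first and only then adds the preferred label.

```python
from dataclasses import dataclass


@dataclass
class Label:
    prop: str                  # "skos:prefLabel" or "skos:altLabel"
    value: str
    lang: str
    status: str = "published"  # hypothetical flag: "published" or "deprecated"


def publish_pref_label(labels, value, lang):
    """Deprecate any matching published altLabel, then add the prefLabel."""
    for label in labels:
        if (
            label.prop == "skos:altLabel"
            and label.value == value
            and label.lang == lang
            and label.status == "published"
        ):
            label.status = "deprecated"
    labels.append(Label("skos:prefLabel", value, lang))
    return labels


labels = [
    Label("skos:prefLabel", "full score", "en"),
    Label("skos:altLabel", "partitura, parte con guida", "it"),
]
publish_pref_label(labels, "partitura, parte con guida", "it")

# Only published labels count when checking for a collision.
published = [(l.prop, l.value, l.lang) for l in labels if l.status == "published"]
```

Keeping the deprecated label in the list, instead of deleting it, preserves the history of the vocabulary while leaving only one published label with that value in Italian.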

Topics covered 4.1 Monolingual vs Multilingual vocabularies
Additional Remarks