Terminology


The present document describes the current state of the intended Terminology module extension for the Lexicon Model for Ontologies (OntoLex-lemon) as a result of the work of members of the Ontology Lexicon community group (OntoLex). The module is targeted at the representation of language data included in terminologies and how to relate those data to existing models for lexical data, mainly OntoLex-lemon and the associated LexInfo vocabulary.



Background and motivation

Originally, terminologies were developed to improve communication among experts in a certain domain. The first terminologists, like Eugen Wüster (1898-1977, see also Wüster, 1979), pursued the standardization and classification of terms and their translations, to avoid vagueness and ambiguity in language and to ease (technical) communication at an international level. Terminologies and other domain-specific language resources were manually elaborated and published in physical, closed and proprietary formats. But as early as 1936, standardization work for terminology and language resources was initiated by a predecessor of ISO, the International Federation of Standardizing Associations (ISA), under the name of the ISA/TC 37 technical committee. This standardization work resumed in 1952, after the break caused by the Second World War, in the context of ISO, with the ISO/TC 37 technical committee. With the aim of structuring and interchanging the knowledge represented in such terminological and language resources, several standardization initiatives based on XML were more recently launched, such as the Text Encoding Initiative (Ide & Véronis, 1995) and, subsequently, the TermBase eXchange format (Melby, 2015).

Other initiatives have modeled terminological works using the Semantic Web specifications. The OTR model (Reymonet et al., 2007) was conceived to represent terminologies in OWL format. The SKOS vocabulary and its extension SKOS-XL (Miles & Bechhofer, 2009) were mainly designed to express the basic structure and content of taxonomies and thesauri, but also of concept schemes embedded in glossaries and terminologies. The representation of terminological resources and thesauri in RDF has also been tackled in initiatives like the EuroVoc thesaurus, containing terms related to the European Union (Alvite-Díez et al., 2010); the AgroVoc thesaurus, from the agricultural domain (Caracciolo et al., 2013) and the TheSoz thesaurus of social sciences (Zapilko et al., 2013), all of them modeled in SKOS.

A more comprehensive model was later proposed to enrich ontologies with linguistic information, the lemon model (McCrae et al., 2012), which became the basis for a W3C Community Group, the Ontology-Lexicon Community Group. Under the auspices of this community group, lemon evolved into Ontolex-lemon with the same purpose, to provide a principled way to describe how “ontology entities, i.e. properties, classes, individuals, etc. can be realized in natural language” (https://www.w3.org/2016/05/ontolex/). Ontolex-lemon is a concise and descriptive model that does not contain a complete collection of linguistic categories, but relies on external vocabularies and ontologies, such as Lexinfo (https://www.lexinfo.net/) or OLiA (http://www.acoli.informatik.uni-frankfurt.de/resources/olia/). In its current form, as published in the community report of May 2016 (https://www.w3.org/2016/05/ontolex/), Ontolex-lemon consists of five modules, each one dedicated to a particular aspect of linguistic description.

A conversion of the InterActive Terminology for Europe (IATE) and the European Migration Network glossary into the first lemon model was implemented in the context of the TBX2RDF service (Cimiano et al., 2015). The Terminoteca RDF initiative (Bosque-Gil et al., 2016) makes use of the Ontolex-lemon specifications to support the editing of terminological glossaries in the Linguistic Linked Data cloud.

Further modules for Ontolex-lemon have been discussed and implemented after the release of its final specification in May 2016 (see the Lexicog module, dedicated to the modelling of lexicographic data: https://www.w3.org/community/ontolex/wiki/Lexicography), and Ontolex is currently an active community group in which further extensions and new modules are discussed.

Ontolex-lemon introduces a link to SKOS concepts, in the form of a subclass of skos:Concept labeled "LexicalConcept" that "represents a mental abstraction, concept or unit of thought that can be lexicalized by a given collection of senses" (https://www.w3.org/2016/05/ontolex/#lexical-concept). This class thus offers a bridge between the representation of lexical data and the language data used in the context of concepts in SKOS data sets. But no details are given on how this "bridge" can be modeled and implemented, especially for the case of terminologies, which make extensive use of language data.

Additionally, the current specification of Ontolex-lemon and related efforts fall short of covering a number of representation needs related to terminologies, especially when generated from multiple sources:

  • Provenance information has to be provided for multiple information items. The current specification does not show how to attribute authorship to a definition, for instance.
  • Provenance information cannot be reduced to a simple statement, and multiple properties may be necessary: author of a definition, creation date, etc.
  • Chained provenance information may need to be captured: we may want to declare that a definition comes from IATE, which, in turn, has taken the text from a piece of European regulation.

The main objective of the additional classes and/or properties is to represent the origin (provenance, source, reference) of certain information items traditionally contained in terminologies. The rationale for including the references or sources from which the terms themselves, the definitions or the usage contexts have been obtained or harvested is that terminology users consider this to give credibility to the information contained in the resource. Moreover, this would also make it possible to account for the origin or source of descriptions obtained from resources in the Linguistic Linked Open Data cloud. As claimed in the recently updated IATE User’s Handbook (https://iate.europa.eu/assets/IATE_Handbook_public.pdf) regarding “credibility of entries”: “A well thought-out IATE entry tries to give users as much information as possible to allow them to judge whether the proposed solution is appropriate and credible. It must also allow other terminologists wishing to work on the entry to delimit the concept clearly, by providing references to the relevant sources consulted”.

Additionally, users are encouraged to provide “authoritative, credible sources” (see section 4.3.5), since this demonstrates the reliability of the term in question and is a parameter used later on to assess the reliability of the information contained in the resource.

Specifically for the case of IATE, it is compulsory to include the source of information of certain entries in the resource. See what the Handbook states in the case of Definitions: “4.2.4 DEFINITION REFERENCE(S) This tells the user where the Definition has come from. It is mandatory if the Definition field has been populated (a definition cannot be stored in the new version of IATE without a definition reference).”

This may also be related to the fact that terminological resources have traditionally been created by teams of translators working in parallel, so the source of the information had to be thoroughly documented. Additionally, by proposing such complementary classes for the current version of the Ontolex-lemon vocabulary, we would also make it possible to account for the origin or source of descriptions obtained from resources in the Linguistic Linked Open Data cloud when creating terminological resources (or any other type of linguistic resource) in a semi-automatic mode and reusing available data in the cloud.

Aim and Scope

The main purpose of this proposal is to complement the Ontolex-lemon vocabulary with the classes and properties needed for two goals: to represent the information usually contained in traditional terminological resources and thesauri (IATE, TermCoord glossaries, EuroVoc, etc.), and to model modern terminological resources created in a semi-automatic fashion, leveraging resources available in the Linguistic Linked Open Data cloud and themselves exposed as linked data.

In this scenario, terminological definitions and their sources gain relevance, and a single triple does not suffice to capture all the key information. Considering that neither the specification of skos:definition nor the specification of dct:source restricts the object to be a literal, we recommend definitions and sources to be given as resources (possibly blank nodes), with further attributed properties.

     <subject_resource> skos:definition
                  [
                  rdf:value "This is an example of definition" ;
                  dct:source "Dictionary X"
                  ] .

The current spec of Ontolex-lemon (https://www.w3.org/2016/05/ontolex/, “A definition can be added to a lexical concept as a gloss by using the skos:definition property”) is somewhat nonspecific, and we believe the specification should make explicit that implementors of Ontolex-lemon should expect the text of a definition to be given either as a literal or as the rdf:value attributed to the definition object.

With the purpose of harmonizing the properties to be used for definition and source, we believe a few classes and properties should be specified, thus favouring the interoperability of Ontolex-lemon implementations.

Alternative designs to represent these key information elements have been considered. First design: a flat structure where new properties are attributed to the definiendum. This solution is discarded outright, because we want different definitions from different sources to be representable for a single entity, and the link between each definition and its source cannot be lost. Second design: the most recent form of reification, RDF* (https://w3c.github.io/rdf-star/), might have been suitable, but the spec is not final and its adoption is not guaranteed.
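
As a minimal sketch of why the resource-based pattern is preferred over a flat structure, the following Turtle attaches two definitions, each with its own source, to a single entity. The ex: namespace, the resource ex:entity1 and the literal values are hypothetical placeholders and are not part of the proposal.

     @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
     @prefix skos: <http://www.w3.org/2004/02/skos/core#> .
     @prefix dct: <http://purl.org/dc/terms/> .
     @prefix ex: <http://example.org/> .

     # Hypothetical entity with two definitions; each blank node keeps
     # the definition text together with the source it came from.
     ex:entity1 skos:definition
         [ rdf:value "First definition text"@en ;
           dct:source "Dictionary X" ] ,
         [ rdf:value "Second definition text"@en ;
           dct:source "Glossary Y" ] .

If both texts and both sources were attached directly to ex:entity1, it would no longer be possible to tell which source belongs to which definition.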

The same issue can be found in other pieces of information such as usage contexts and term notes.

Namespaces

The list of namespaces to be (re)used in this proposal is shown below. The list includes the new properties and classes proposed, with the prefix termlex, the Ontolex modules, and other reused vocabularies.

Termlex:

                  @prefix termlex: <http://www.w3.org/ns/lemon/termlex#> .

Ontolex (core) and other lemon modules:

                  @prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
                  @prefix vartrans: <http://www.w3.org/ns/lemon/vartrans#> .
                  @prefix lime: <http://www.w3.org/ns/lemon/lime#> .
                  @prefix lexicog: <http://www.w3.org/ns/lemon/lexicog#> .

Other models:

                  @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
                  @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
                  @prefix skos: <http://www.w3.org/2004/02/skos/core#> .
                  @prefix lexinfo: <http://www.lexinfo.net/ontology/3.0/lexinfo#> .
                  @prefix dc: <http://purl.org/dc/elements/1.1/> .
                  @prefix dcterms: <http://purl.org/dc/terms/> .
                  @prefix prov: <http://www.w3.org/ns/prov#> .
                  @prefix schema: <http://schema.org/> .
                  @prefix foaf: <http://xmlns.com/foaf/0.1/> .

Core proposal

The diagram below shows the classes and properties that are suggested in this proposal. Boxes represent classes and arrows represent properties (of any kind). In this modelling approach, each term is represented by a LexicalConcept, which is therefore the main component of the model. The left part of the diagram, in black, shows the Ontolex core, while the right part, in blue, shows the new Termlex elements.

Note that in this proposal for the representation of terminologies as Linked Data it is recommended to reuse, whenever possible, the existing models to represent linguistic data: Lexinfo, Ontolex-lemon and the Lexicog module, amongst others. This proposal is intended to cover existing gaps while avoiding redundancy. Consequently, we propose the following list of classes and properties:

Termlex core (5).png


Termlex Classes

termlex:Definition

URI: http://www.w3.org/ns/lemon/termlex#Definition
A definition is a sentence or clause that explains the meaning of a concept. As mentioned earlier in Section 2 - Aim and Scope, we propose the class Definition because a definition is much more than a literal: for its correct usage, it is vital to know its validity, its provenance, its authorship, etc. This means that additional information about the definition usually needs to be represented. Also, following Sager's guidelines (Sager, 1990), we would like to distinguish between two types of definitions:

  • Terminological Definitions: accurate and precise descriptions of specialised terms that require a deep knowledge of the domain.
  • Lexicographic Definitions: more general descriptions that do not require such a high level of precision, since they refer to units of the general language, and not to a specific area of knowledge (Pérez Hernández, M. C., 2002).

We propose two subclasses accordingly:

DC.png

termlex:Source

URI: http://www.w3.org/ns/lemon/termlex#Source
Just like definitions, sources play a very important role in this modelling approach. Especially when terminologies are generated from multiple resources, as explained in Section 1 - Background and Motivation, it is crucial to maintain the traceability of the different terminological data (be they definitions, term notes, term contexts, etc.). With the automation of the terminology creation process, we may distinguish between two types of sources:

  • Intermediate sources: not direct sources but information providers, such as existing linguistic resources from which the information is retrieved (IATE, for instance) or applications (a Definition Extractor).
  • Original sources: direct sources, meaning corpora (e.g. European Legislation), organisations (e.g. European Commission) or individuals (e.g. John Doe, European terminologist).

We propose a single class Source to cover both, since a term frequently has an intermediate source that, in turn, has an original source (see diagram below). We do not, however, make a formal distinction between the two types of sources, since it would be too restrictive. To model each type of data, we reuse the property dc:source and the class dct:BibliographicResource from Dublin Core, and the classes prov:Entity, prov:Agent, prov:Person and prov:Organization from the PROV ontology. The overlap between termlex:Source and the Attestation class from the FRAC module (https://www.w3.org/community/ontolex/wiki/Frequency,_Attestation_and_Corpus_Information) is also being analysed.

Source Class.png
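
The chain from a definition to an intermediate source and on to an original source could be sketched as follows. This is only an illustration: the ex: resources and the labels are hypothetical, and typing the sources with prov:Entity and dct:BibliographicResource is one possible reading of the reuse described above, not a settled design.

     @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
     @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
     @prefix dc: <http://purl.org/dc/elements/1.1/> .
     @prefix dct: <http://purl.org/dc/terms/> .
     @prefix prov: <http://www.w3.org/ns/prov#> .
     @prefix termlex: <http://www.w3.org/ns/lemon/termlex#> .
     @prefix ex: <http://example.org/> .

     # Hypothetical definition pointing at the resource it was retrieved from.
     ex:def1 a termlex:Definition ;
         rdf:value "Example definition text"@en ;
         dc:source ex:iate .

     # Intermediate source: an information provider, not the origin of the text.
     ex:iate a termlex:Source , prov:Entity ;
         rdfs:label "IATE" ;
         dc:source ex:regulation .

     # Original source: the document the intermediate source took the text from.
     ex:regulation a termlex:Source , dct:BibliographicResource ;
         rdfs:label "A piece of European legislation" .

Because both kinds of source are instances of the same class, the chain can be extended by further dc:source links without changing the model.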

termlex:Note

URI: http://www.w3.org/ns/lemon/termlex#Note
Notes are key elements of traditional terminology cards, providing additional information such as usage recommendations and domain data. Some modern language resources no longer use term notes, but others still keep them; we therefore consider them valuable pieces of knowledge for language professionals that need to be preserved. We propose the class Note because current models, such as SKOS, Ontolex and Lexinfo, use properties to represent notes (skos:note, ontolex:usage, lexinfo:note). This means that, if we collected term notes from different language resources, we would not be able to model their provenance.

We found examples in the TBX specification containing notes at different levels:

  • Term Notes
  • Definition Notes
  • Context Notes

Since the skos:note property has neither domain nor range, we suggest reusing it to link the class Note to the classes ontolex:LexicalConcept, termlex:Definition and lexicog:UsageExample. Thus, we do not need to create a subclass for each of the three types of note, since the type of a note can easily be inferred from the resource it is attached to.


Note Class (2).png
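
A minimal sketch of the three note levels, all using skos:note with a termlex:Note as object, is given below. The ex: resources and the note texts are hypothetical and serve only to illustrate the pattern.

     @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
     @prefix skos: <http://www.w3.org/2004/02/skos/core#> .
     @prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
     @prefix lexicog: <http://www.w3.org/ns/lemon/lexicog#> .
     @prefix termlex: <http://www.w3.org/ns/lemon/termlex#> .
     @prefix ex: <http://example.org/> .

     # Term note: attached to the LexicalConcept.
     ex:lc1 a ontolex:LexicalConcept ;
         skos:note ex:termNote .

     # Definition note: attached to the Definition.
     ex:def1 a termlex:Definition ;
         skos:note ex:defNote .

     # Context note: attached to the UsageExample.
     ex:ue1 a lexicog:UsageExample ;
         skos:note ex:ctxNote .

     ex:termNote a termlex:Note ; rdf:value "Hypothetical term note"@en .
     ex:defNote a termlex:Note ; rdf:value "Hypothetical definition note"@en .
     ex:ctxNote a termlex:Note ; rdf:value "Hypothetical context note"@en .

The kind of note is not encoded in the Note itself; it follows from the type of the subject of the skos:note triple.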

termlex:ReliabilityCode

URI: http://www.w3.org/ns/lemon/termlex#ReliabilityCode
Previous work (TBX2RDF ontology) proposes the property tbx:reliabilityCode to represent the kind of confidence rating that terminologists give to terms. However, its domain is ontolex:LexicalEntry, and the property admits any type of rating. For the sake of standardisation, we propose a ReliabilityCode class, pointing at a fixed set of numerical values, 1-5, following IATE guidelines:


ReliabilityCodeClassLine.png
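
One way to fix the set of admissible values is to enumerate the instances of the class. The following sketch does so with owl:oneOf; the use of owl:oneOf and of rdf:value for the numeric rating are assumptions made for illustration, and the instance IRIs simply follow the pattern used in Example 5 of this document.

     @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
     @prefix owl: <http://www.w3.org/2002/07/owl#> .
     @prefix termlex: <http://www.w3.org/ns/lemon/termlex#> .

     # Closed enumeration: only these five individuals are reliability codes.
     termlex:ReliabilityCode a owl:Class ;
         owl:oneOf (
             <http://www.w3.org/ns/lemon/termlex/ReliabilityCode#1>
             <http://www.w3.org/ns/lemon/termlex/ReliabilityCode#2>
             <http://www.w3.org/ns/lemon/termlex/ReliabilityCode#3>
             <http://www.w3.org/ns/lemon/termlex/ReliabilityCode#4>
             <http://www.w3.org/ns/lemon/termlex/ReliabilityCode#5>
         ) .

     # Assumed representation of the numeric rating carried by one code.
     <http://www.w3.org/ns/lemon/termlex/ReliabilityCode#1> a termlex:ReliabilityCode ;
         rdf:value 1 .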

Termlex Properties

termlex:lexicalizedConcept

URI: http://www.w3.org/ns/lemon/termlex#lexicalizedConcept
Domain: skos:Concept
Range: ontolex:LexicalConcept
Inverse: termlex:isLexicalizedConceptOf

termlex:functionalIsEvokedBy

Following traditional terminology theories, this property is used to indicate that a LexicalConcept can only have one LexicalEntry, with the aim of reducing ambiguity.

URI: http://www.w3.org/ns/lemon/termlex#functionalIsEvokedBy
Domain: ontolex:LexicalConcept
Range: ontolex:LexicalEntry
Inverse: termlex:functionalEvokes
Subproperty of: ontolex:isEvokedBy
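
Read as OWL axioms, the declaration above could look like the following sketch. Declaring the property as an owl:FunctionalProperty is our reading of "can only have one LexicalEntry"; it is an assumption rather than something stated in the proposal.

     @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
     @prefix owl: <http://www.w3.org/2002/07/owl#> .
     @prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
     @prefix termlex: <http://www.w3.org/ns/lemon/termlex#> .

     # At most one LexicalEntry per LexicalConcept (assumed functional axiom).
     termlex:functionalIsEvokedBy a owl:ObjectProperty , owl:FunctionalProperty ;
         rdfs:subPropertyOf ontolex:isEvokedBy ;
         rdfs:domain ontolex:LexicalConcept ;
         rdfs:range ontolex:LexicalEntry ;
         owl:inverseOf termlex:functionalEvokes .

Note that OWL makes no unique name assumption: if a LexicalConcept were linked to two entries through a functional property, a reasoner would infer that the two entries denote the same resource instead of reporting an error, unless they are explicitly declared different.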

termlex:functionalLexicalizedSense

Following traditional terminology theories, this property is used to indicate that a LexicalConcept can only have one LexicalSense, with the aim of reducing ambiguity.

URI: http://www.w3.org/ns/lemon/termlex#functionalLexicalizedSense
Domain: ontolex:LexicalConcept
Range: ontolex:LexicalSense
Inverse: termlex:functionalIsLexicalizedSenseOf
Subproperty of: ontolex:lexicalizedSense

termlex:functionalUsageExample

Following traditional terminology theories, this property is used to indicate that a LexicalConcept can only have one context, with the aim of reducing ambiguity. The definition of the class lexicog:UsageExample fits this purpose perfectly. The domain of the corresponding property (lexicog:usageExample), however, is LexicalSense, so it cannot be used in this modelling approach. Therefore, to link the lexicog:UsageExample class with the LexicalConcept, we propose the property termlex:functionalUsageExample.

URI: http://www.w3.org/ns/lemon/termlex#functionalUsageExample
Domain: ontolex:LexicalConcept
Range: lexicog:UsageExample
Inverse: termlex:isfunctionalUsageExampleof

termlex:reliabilityCode

Previous work (TBX2RDF ontology) proposes the property tbx:reliabilityCode to represent the kind of confidence rating that terminologists give to terms. However, it is a data property whose domain is ontolex:LexicalEntry, which does not work for our purposes.

URI: http://www.w3.org/ns/lemon/termlex#reliabilityCode
Domain: ontolex:LexicalConcept
Range: termlex:ReliabilityCode

termlex:normativeAuthorization

In order to represent the recommended usage of a term, we consider reusing the class lexinfo:NormativeAuthorization, which has a fixed list of values (preferredTerm, deprecatedTerm, admittedTerm...). The corresponding property proposed by Lexinfo, lexinfo:normativeAuthorization, has domain LexicalSense, since it is a subproperty of ontolex:usage. Since we are working with LexicalConcepts, we propose the property termlex:normativeAuthorization.

URI: http://www.w3.org/ns/lemon/termlex#normativeAuthorization
Domain: ontolex:LexicalConcept
Range: lexinfo:NormativeAuthorization
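
The two quality-related properties, termlex:reliabilityCode and termlex:normativeAuthorization, could be declared and used as in the following sketch. Typing them as owl:ObjectProperty is an assumption, the resource ex:lc1 is hypothetical, and the Lexinfo and reliability code IRIs follow those used in the examples of this document.

     @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
     @prefix owl: <http://www.w3.org/2002/07/owl#> .
     @prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
     @prefix lexinfo: <http://www.lexinfo.net/ontology/3.0/lexinfo#> .
     @prefix termlex: <http://www.w3.org/ns/lemon/termlex#> .
     @prefix ex: <http://example.org/> .

     termlex:reliabilityCode a owl:ObjectProperty ;
         rdfs:domain ontolex:LexicalConcept ;
         rdfs:range termlex:ReliabilityCode .

     termlex:normativeAuthorization a owl:ObjectProperty ;
         rdfs:domain ontolex:LexicalConcept ;
         rdfs:range lexinfo:NormativeAuthorization .

     # Hypothetical term marked as preferred, with reliability code 4.
     ex:lc1 a ontolex:LexicalConcept ;
         termlex:normativeAuthorization lexinfo:preferredTerm ;
         termlex:reliabilityCode <http://www.w3.org/ns/lemon/termlex/ReliabilityCode#4> .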

The IATE Scenario

Throughout this section, we show representation examples of the different elements contained in the terminological entries of IATE. We work with the term "train path" (IATE ID 1443648), for which every piece of information is available.

Example 1: Representing terms

With the property termlex:lexicalizedConcept, we represent all terms that describe the same skos:Concept as LexicalConcepts. In terminological resources we contemplate just one LexicalSense for each LexicalConcept, represented by the functional property termlex:functionalLexicalizedSense. Similarly, each LexicalConcept must have only one LexicalEntry, which is why we use the property termlex:functionalIsEvokedBy.

1.Terms (3).png
                    <http://www.w3.org/ns/lemon/termlex#IATE_1443648> rdf:type <http://www.w3.org/2004/02/skos/core#Concept> ;
                    <http://www.w3.org/ns/lemon/termlex#lexicalizedConcept> <http://www.w3.org/ns/lemon/termlex#1443648_LC1> ,
                                                                            <http://www.w3.org/ns/lemon/termlex#1443648_LC2> .
                    
                    <http://www.w3.org/ns/lemon/termlex#transport> rdf:type <http://www.w3.org/2004/02/skos/core#Collection> ;
                    <http://www.w3.org/2004/02/skos/core#member> <http://www.w3.org/ns/lemon/termlex#IATE_1443648> .
                    
                    <http://www.w3.org/ns/lemon/termlex#1443648_LC1> rdf:type <http://www.w3.org/ns/lemon/ontolex#LexicalConcept> ;
                    <http://www.w3.org/ns/lemon/termlex#functionalIsEvokedBy> <http://www.w3.org/ns/lemon/termlex#1443648_LC1_LEN> ;
                    <http://www.w3.org/ns/lemon/termlex#functionalLexicalizedSense> <http://www.w3.org/ns/lemon/termlex#1443648_LS1> ;
                    <http://www.w3.org/ns/lemon/termlex#isLexicalizedConceptOf> <http://www.w3.org/ns/lemon/termlex#IATE_1443648> .
                    
                    <http://www.w3.org/ns/lemon/termlex#1443648_LC1_LEN> rdf:type <http://www.w3.org/ns/lemon/ontolex#LexicalEntry> ;
                    <http://www.w3.org/ns/lemon/ontolex#lexicalForm> <http://www.w3.org/ns/lemon/termlex#1443648_LC1_LEN_FRM> ;
                    <http://www.w3.org/ns/lemon/termlex#functionalEvokes> <http://www.w3.org/ns/lemon/termlex#1443648_LC1> .
                    
                    <http://www.w3.org/ns/lemon/termlex#1443648_LC1_LEN_FRM> rdf:type <http://www.w3.org/ns/lemon/ontolex#Form> ;
                    <http://www.w3.org/ns/lemon/ontolex#writtenRep> "surco ferroviario"@es .
                    
                    <http://www.w3.org/ns/lemon/termlex#1443648_LC2> rdf:type <http://www.w3.org/ns/lemon/ontolex#LexicalConcept> ;
                    <http://www.w3.org/ns/lemon/termlex#functionalIsEvokedBy> <http://www.w3.org/ns/lemon/termlex#1443648_LC2_LEN> ;
                    <http://www.w3.org/ns/lemon/termlex#functionalLexicalizedSense> <http://www.w3.org/ns/lemon/termlex#1443648_LS2> ;
                    <http://www.w3.org/ns/lemon/termlex#isLexicalizedConceptOf> <http://www.w3.org/ns/lemon/termlex#IATE_1443648> .
                    
                    <http://www.w3.org/ns/lemon/termlex#1443648_LC2_LEN> rdf:type <http://www.w3.org/ns/lemon/ontolex#LexicalEntry> ;
                    <http://www.w3.org/ns/lemon/ontolex#lexicalForm> <http://www.w3.org/ns/lemon/termlex#1443648_LC2_LEN_FRM> ;
                    <http://www.w3.org/ns/lemon/termlex#functionalEvokes> <http://www.w3.org/ns/lemon/termlex#1443648_LC2> .
                    
                    <http://www.w3.org/ns/lemon/termlex#1443648_LC2_LEN_FRM> rdf:type <http://www.w3.org/ns/lemon/ontolex#Form> ;
                    <http://www.w3.org/ns/lemon/ontolex#writtenRep> "franja ferroviaria"@es .
                    
                    <http://www.w3.org/ns/lemon/termlex#1443648_LS1> rdf:type <http://www.w3.org/ns/lemon/ontolex#LexicalSense> ;
                    <http://www.w3.org/ns/lemon/termlex#functionalIsLexicalizedSenseOf> <http://www.w3.org/ns/lemon/termlex#1443648_LC1> .
                    
                    <http://www.w3.org/ns/lemon/termlex#1443648_LS2> rdf:type <http://www.w3.org/ns/lemon/ontolex#LexicalSense> ;
                    <http://www.w3.org/ns/lemon/termlex#functionalIsLexicalizedSenseOf> <http://www.w3.org/ns/lemon/termlex#1443648_LC2> .

Example 2: Representing language and synonymy

To represent that the meanings of two terms are similar, we do not need any new properties, since we can apply the property lexinfo:synonym from the Lexinfo vocabulary. Likewise, to indicate that two terms are in the same language, we can create a lexicon using Lime.


2. Synonymy (3).png


                    <http://www.w3.org/ns/lemon/termlex#1443648_LS1> rdf:type  <http://www.w3.org/ns/lemon/ontolex#LexicalSense> ;
                    <http://www.w3.org/ns/lemon/termlex#functionalIsLexicalizedSenseOf> <http://www.w3.org/ns/lemon/termlex#1443648_LC1> ;
                    <http://www.lexinfo.net/ontology/3.0/lexinfo#synonym> <http://www.w3.org/ns/lemon/termlex#1443648_LS2> .
                    
                    <http://www.w3.org/ns/lemon/termlex#1443648_lexicon_ES> rdf:type <http://www.w3.org/ns/lemon/lime#Lexicon> ;
                    <http://www.w3.org/ns/lemon/lime#entry> <http://www.w3.org/ns/lemon/termlex#1443648_LC1_LEN> ,
                                                            <http://www.w3.org/ns/lemon/termlex#1443648_LC2_LEN> ;
                    <http://www.w3.org/ns/lemon/lime#language> "es" .

Example 3: Representing definition, notes and sources

In this case, we have the same definition for both LexicalConcepts. This is very common amongst terms that are considered synonyms. We therefore represent it with the class termlex:Definition, which is linked to both LexicalConcepts with the property skos:definition. Likewise, we collect provenance-related data within the class termlex:Source, which is linked to termlex:Definition with the property dc:source. In this class we find the name and the identifier of the document from which the definition has been retrieved. In this terminological entry, we also observe a note about the definition, which is represented with the class termlex:Note. This class is linked to the definition with the property skos:note, which does not have any domain/range restrictions. In this class we find the value of the note and a link to its source.

3 Definition Source DefNote (1).png


                    <http://www.w3.org/ns/lemon/termlex#1443648_DEF1> rdf:type <http://www.w3.org/ns/lemon/termlex#Definition> ;
                    <http://www.w3.org/2004/02/skos/core#note> <http://www.w3.org/ns/lemon/termlex#1443648_DEF1_NT> ;
                    <http://purl.org/dc/elements/1.1/source> <http://www.w3.org/ns/lemon/termlex#1443648_DEF1_SRC> ;
                    rdf:value "capacidad de infraestructura necesaria para que un tren circule entre dos puntos en un momento dado."@es .
                    
                    <http://www.w3.org/ns/lemon/termlex#1443648_DEF1_NT> rdf:type <http://www.w3.org/ns/lemon/termlex#Note> ;
                    rdf:value "Se trata de unidades cuya disponibilidad depende de factores como el número de vías disponibles, el sistema de señalización o la diferencia de velocidad entre trenes. El Administrador de Infraestructuras Ferroviarias (Adif) concede a los operadores el derecho de explotar un tramo de vía en un día, una hora y un sentido determinado"@es ;
                    <http://purl.org/dc/elements/1.1/source> <https://iate.europa.eu/search/result/1626968015909/1> .
                    
                    <http://www.w3.org/ns/lemon/termlex#1443648_DEF1_SRC> rdf:type <http://www.w3.org/ns/lemon/termlex#Source> ;
                    rdf:value "Directiva 2001/14/CE relativa a la adjudicación de la capacidad de infraestructura ferroviaria y la aplicación de cánones por su utilización"@es ;
                    <http://purl.org/dc/elements/1.1/identifier> "CELEX:32001L0014/ES" .
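
The listing above does not spell out the links from the two LexicalConcepts to the shared definition. As a sketch, and assuming the resource identifiers of Example 1, they could be stated with skos:definition as follows:

                    <http://www.w3.org/ns/lemon/termlex#1443648_LC1> <http://www.w3.org/2004/02/skos/core#definition> <http://www.w3.org/ns/lemon/termlex#1443648_DEF1> .

                    <http://www.w3.org/ns/lemon/termlex#1443648_LC2> <http://www.w3.org/2004/02/skos/core#definition> <http://www.w3.org/ns/lemon/termlex#1443648_DEF1> .

Since both triples share the same object, the definition text, its note and its source are stated only once.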

Example 4: Representing term contexts

Contextual information about a term is a very valuable piece of information. The Lexicog module already addresses this need: the class UsageExample represents a textual example of the usage of a sense in a given lexicographic record. A usage example can group several string values, in which case they will encode the same meaning. Although this class is applied to LexicalSenses, we propose reusing it with LexicalConcepts as well to avoid class redundancies, since the purpose of this class fits our needs. To this end, we use the property termlex:functionalUsageExample, with domain LexicalConcept and range UsageExample, restricted to one UsageExample per term.

4 TermContexts.png
                    <http://www.w3.org/ns/lemon/termlex#1443648_LC1> rdf:type <http://www.w3.org/ns/lemon/ontolex#LexicalConcept> ;
                    <http://www.w3.org/ns/lemon/termlex#functionalIsEvokedBy> <http://www.w3.org/ns/lemon/termlex#1443648_LC1_LEN> ;
                    <http://www.w3.org/ns/lemon/termlex#functionalLexicalizedSense> <http://www.w3.org/ns/lemon/termlex#1443648_LS1> ;
                    <http://www.w3.org/ns/lemon/termlex#functionalUsageExample> <http://www.w3.org/ns/lemon/termlex#1443648_LC1_UE> .
                    
                    <http://www.w3.org/ns/lemon/termlex#1443648_LC1_UE> rdf:type <http://www.w3.org/ns/lemon/lexicog#UsageExample> ;
                    <http://purl.org/dc/elements/1.1/source> <http://www.w3.org/ns/lemon/termlex#1443648_LC1_UE_SRC> ;
                    rdf:value "Un surco ferroviario es, por tanto, el derecho que el ADIF concede a un operador determinado, para explotar un tramo de vía determinado en un día, hora y sentido determinados. A partir de aquí, resulta fácil imaginar que la competencia entre los operadores que entren en el negocio consistirá fundamentalmente en la obtención de cuántos más surcos mejor"@es .
                    
                    <http://www.w3.org/ns/lemon/termlex#1443648_LC1_UE_SRC> rdf:type <http://www.w3.org/ns/lemon/termlex#Source> ;
                    rdf:value "Hacia la liberalización efectiva del ferrocarril, Miquel Llevat, Director General de Comsa Rail"@es ;
                    <http://purl.org/dc/elements/1.1/identifier> <http://www.cel-logistica.org/subidasArticulos/90.pdf> .
                    
                    <http://www.w3.org/ns/lemon/termlex#1443648_LC2> rdf:type <http://www.w3.org/ns/lemon/ontolex#LexicalConcept> ;
                    <http://www.w3.org/ns/lemon/termlex#functionalIsEvokedBy> <http://www.w3.org/ns/lemon/termlex#1443648_LC2_LEN> ;
                    <http://www.w3.org/ns/lemon/termlex#functionalLexicalizedSense> <http://www.w3.org/ns/lemon/termlex#1443648_LS2> ;
                    <http://www.w3.org/ns/lemon/termlex#functionalUsageExample> <http://www.w3.org/ns/lemon/termlex#1443648_LC2_UE> .
                    
                    <http://www.w3.org/ns/lemon/termlex#1443648_LC2_UE> rdf:type <http://www.w3.org/ns/lemon/lexicog#UsageExample> ;
                    <http://purl.org/dc/elements/1.1/source> <http://www.w3.org/ns/lemon/termlex#1443648_LC2_UE_SRC> ;
                    rdf:value "Capacidad de infraestructura: la capacidad para programar las franjas ferroviarias solicitadas para un segmento de la infraestructura durante un periodo determinado"@es .
                                         
                    <http://www.w3.org/ns/lemon/termlex#1443648_LC2_UE_SRC> rdf:type <http://www.w3.org/ns/lemon/termlex#Source> ;
                    rdf:value "Ley 39/2003, de 17 de noviembre, del Sector Ferroviario ( [nov-2009])"@es ;
                    <http://purl.org/dc/elements/1.1/identifier> <http://www.fomento.es/NR/rdonlyres/432E286F-0172-4227-95DC-FA8F7B5F776C/12102/leysectorferroviario.pdf> .

Example 5: Representing term usage and reliability

There are two important term quality indicators that are often present in IATE entries: reliability and term usage. Reliability indicates the confidence level given by the terminologist who added the entry. To represent it, we propose the class termlex:ReliabilityCode, with a limited number of instances, from 1 to 5. Term usage indicates how acceptable a term is. To represent it, we use the class lexinfo:NormativeAuthorization, also with a fixed number of instances: admittedTerm, deprecatedTerm, legalTerm, preferredTerm, regulatedTerm, standardizedTerm and supersededTerm.

[Figure: term usage and reliability]


                    <http://www.w3.org/ns/lemon/termlex#1443648_LC2> rdf:type <http://www.w3.org/ns/lemon/ontolex#LexicalConcept> ;
                    <http://www.w3.org/ns/lemon/termlex#normativeAuthorization> <http://www.lexinfo.net/ontology/3.0/lexinfo#deprecatedTerm> ;
                    <http://www.w3.org/ns/lemon/termlex#functionalIsEvokedBy> <http://www.w3.org/ns/lemon/termlex#1443648_LC2_LEN> ;
                    <http://www.w3.org/ns/lemon/termlex#functionalLexicalizedSense> <http://www.w3.org/ns/lemon/termlex#1443648_LS2> ;
                    <http://www.w3.org/ns/lemon/termlex#functionalUsageExample> <http://www.w3.org/ns/lemon/termlex#1443648_LC2_UE> ;
                    <http://www.w3.org/ns/lemon/termlex#reliabilityCode> <http://www.w3.org/ns/lemon/termlex/ReliabilityCode#2> .
                                         
                    <http://www.w3.org/ns/lemon/termlex#1443648_LC1> rdf:type <http://www.w3.org/ns/lemon/ontolex#LexicalConcept> ;
                    <http://www.w3.org/ns/lemon/termlex#functionalIsEvokedBy> <http://www.w3.org/ns/lemon/termlex#1443648_LC1_LEN> ;
                    <http://www.w3.org/ns/lemon/termlex#functionalLexicalizedSense> <http://www.w3.org/ns/lemon/termlex#1443648_LS1> ;
                    <http://www.w3.org/ns/lemon/termlex#functionalUsageExample> <http://www.w3.org/ns/lemon/termlex#1443648_LC1_UE> ;
                    <http://www.w3.org/ns/lemon/termlex#reliabilityCode> <http://www.w3.org/ns/lemon/termlex/ReliabilityCode#4> ;
                    <http://www.w3.org/ns/lemon/termlex#normativeAuthorization> <http://www.lexinfo.net/ontology/3.0/lexinfo#preferredTerm> .
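
The two closed value sets used in this example can also be written out explicitly. The following Turtle is only a minimal sketch and not part of the module definition. It assumes that the reliability values are individuals of termlex:ReliabilityCode named with the IRI pattern used in the example above, and that the seven term usage values are individuals of lexinfo:NormativeAuthorization in the LexInfo 3.0 namespace.

```turtle
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix termlex: <http://www.w3.org/ns/lemon/termlex#> .
@prefix lexinfo: <http://www.lexinfo.net/ontology/3.0/lexinfo#> .

# Assumption: one individual per reliability value, from 1 to 5,
# following the IRI pattern that appears in Example 5.
<http://www.w3.org/ns/lemon/termlex/ReliabilityCode#1> rdf:type termlex:ReliabilityCode .
<http://www.w3.org/ns/lemon/termlex/ReliabilityCode#2> rdf:type termlex:ReliabilityCode .
<http://www.w3.org/ns/lemon/termlex/ReliabilityCode#3> rdf:type termlex:ReliabilityCode .
<http://www.w3.org/ns/lemon/termlex/ReliabilityCode#4> rdf:type termlex:ReliabilityCode .
<http://www.w3.org/ns/lemon/termlex/ReliabilityCode#5> rdf:type termlex:ReliabilityCode .

# Assumption: the term usage values listed in the text are typed
# with the class lexinfo:NormativeAuthorization.
lexinfo:admittedTerm     rdf:type lexinfo:NormativeAuthorization .
lexinfo:deprecatedTerm   rdf:type lexinfo:NormativeAuthorization .
lexinfo:legalTerm        rdf:type lexinfo:NormativeAuthorization .
lexinfo:preferredTerm    rdf:type lexinfo:NormativeAuthorization .
lexinfo:regulatedTerm    rdf:type lexinfo:NormativeAuthorization .
lexinfo:standardizedTerm rdf:type lexinfo:NormativeAuthorization .
lexinfo:supersededTerm   rdf:type lexinfo:NormativeAuthorization .
```

Under these assumptions, a concept such as 1443648_LC1 points to exactly one individual from each set, which makes it straightforward to check that an entry only uses values from the closed lists.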

Contributors

  • Patricia Martín-Chozas, Ontology Engineering Group, Universidad Politécnica de Madrid
  • Thierry Declerck, Multilinguality and Language Technology, Deutsches Forschungszentrum für Künstliche Intelligenz
  • Elena Montiel-Ponsoda, Ontology Engineering Group, Universidad Politécnica de Madrid
  • Víctor Rodríguez-Doncel, Ontology Engineering Group, Universidad Politécnica de Madrid

Acknowledgements

This work has been supported by the European Union’s Horizon 2020 research and innovation programme through the Prêt-à-LLOD project, with grant agreement No. 825182, and by the NexusLinguarum COST Action CA18209 - European network for Web-centred linguistic data science.

References

Alvite-Díez, M. L., Pérez-León, B., Martínez-González, M. M., & Vicente Blanco, D.-J. (2010). A proposal for representing the Eurovoc thesaurus with SKOS for its integration in juridical information systems.

Bosque-Gil, J., Gracia, J., Aguado-de-Cea, G., & Montiel-Ponsoda, E. (2016). Applying the OntoLex Model to a Multilingual Terminological Resource. European Semantic Web Conference.

Bosque-Gil, J., Gracia, J., & Montiel-Ponsoda, E. (2017). Towards a Module for Lexicography in OntoLex. LDK Workshops.

Bosque-Gil, J., Gracia, J., Montiel-Ponsoda, E., & Aguado-de-Cea, G. (2016). Modelling multilingual lexicographic resources for the Web of Data: The K Dictionaries case. GLOBALEX. Lexicographic Resources for Human Language Technology Workshop Programme.

Bosque-Gil, J., Montiel-Ponsoda, E., Gracia, J., & Aguado-de-Cea, G. (2016). Terminoteca RDF: a gathering point for multilingual terminologies in Spain. Term Bases and Linguistic Linked Open Data.

Caracciolo, C., Stellato, A., Morshed, A., Johannsen, G., Rajbhandari, S., Jaques, Y., & Keizer, J. (2013). The AGROVOC Linked Dataset. Semantic Web.

Cimiano, P., Buitelaar, P., McCrae, J., & Sintek, M. (2011). LexInfo: A declarative model for the lexicon-ontology interface. Journal of Web Semantics.

Cimiano, P., McCrae, J. P., Rodríguez-Doncel, V., Gornostay, T., Gómez-Pérez, A., Siemoneit, B., & Lagzdins, A. (2015). Linked Terminologies: Applying Linked Data Principles to Terminological Resources. Proceedings of the eLex 2015 Conference.

Ide, N., & Véronis, J. (1995). Text encoding initiative: Background and contexts. Springer Science & Business Media.

McCrae, J., Aguado-de-Cea, G., Buitelaar, P., Cimiano, P., Declerck, T., Gómez-Pérez, A., Gracia, J., Hollink, L., Montiel-Ponsoda, E., Spohr, D., & Wunner, T. (2012). Interchanging lexical resources on the Semantic Web. Language Resources and Evaluation.

Melby, A. (2015). TBX: A terminology exchange format for the translation and localization industry. Handbook of Terminology, 393-424.

Miles, A., & Bechhofer, S. (2009). SKOS simple knowledge organization system reference. W3C recommendation.

Pérez Hernández, M. C. (2002). Explotación de los córpora textuales informatizados para la creación de bases de datos terminológicas basadas en el conocimiento. Estudios de lingüística del español, 18, 000-0.

Reymonet, A., Thomas, J., & Aussenac-Gilles, N. (2007). Modelling ontological and terminological resources in OWL DL. Proceedings of ISWC, 7.

Sager, J. C. (1990). Practical course in terminology processing. John Benjamins Publishing.

Wüster, E. (1979). Einführung in die Allgemeine Terminologielehre und Terminologische Lexikographie 1979, Springer in Komm. (3rd edition in 1991, Romanistischer Verlag).

Zapilko, B., Schaible, J., Mayr, P., & Mathiak, B. (2013). TheSoz: A SKOS representation of the thesaurus for the social sciences. Semantic Web.