Linking to SKOS

From Ontology-Lexica Community Group

List of Proposals

We have been discussing two alternative proposals for linking the ontolex model to the SKOS (XL) model:

  1. ontolex:LexicalEntry rdfs:subClassOf skosxl:Label
  2. ontolex:Form rdfs:subClassOf skosxl:Label

Semantic Issues

Clearly, both proposals have issues with respect to preserving the "intended meaning" of the mapped objects:

  • issue with the LexicalEntry case: SKOS-XL labels have exactly one literalForm, which is not really compatible with our view of a lexical entry as an equivalence class of forms with the same syntactic and semantic behaviour.
  • issue with the Form case: SKOS-XL labels have a literalForm (skosxl:literalForm), and saying that an ontolex:Form has a literalForm would seem rather disturbing.

Objectives

Nevertheless, the mapping should take two basic aspects into account:

  • On the one hand, we should specify what querying an ontolex model with the SKOS (XL) vocabulary should deliver, e.g. what the result of the query ?x rdf:type skosxl:Label should be. So this is about inference possibilities, or at least about combining inference with a few additional statements in an ontolex-described dataset, so that a SKOSXL description of that dataset is easily obtained too.
  • On the other hand, the question is how our model conceptually interoperates with SKOS-XL, in the sense that some people might have an existing SKOS concept scheme and would want to enrich it with linguistic information from an ontolex model (possibly in addition to the labels already available in the SKOSXL namespace).


To illustrate the problems, let's consider a SKOSXL concept scheme that models self-propelled vehicles, and in it the skos:Concept ex:car.

ex:car rdf:type skos:Concept .
ex:car skosxl:prefLabel ex:label1 .
ex:car skosxl:altLabel   ex:label2 .

ex:label1 rdf:type skosxl:Label .
ex:label1 skosxl:literalForm "car" .

ex:label2 rdf:type skosxl:Label .
ex:label2 skosxl:literalForm "automobile" .

Now imagine that the designers of this SKOSXL scheme want to link it to a proper lexicon, specifying form variants of these labels. With ontolex:Form rdfs:subClassOf skosxl:Label, this could look as follows:

ex:car rdf:type skos:Concept .
ex:car skosxl:prefLabel ex:label1 .
ex:car skosxl:altLabel   ex:label2 .

ex:label1 rdf:type ontolex:Form.
ex:label1 ontolex:writtenForm "car".

ex:lex1 rdf:type ontolex:LexicalEntry.
ex:lex1 ontolex:canonicalForm ex:label1.
ex:lex1 ontolex:otherForm ex:label3.

ex:label3 rdf:type ontolex:Form.
ex:label3 ontolex:writtenForm "cars".
ex:label3 ling:numerus "plural".

In this way, the alternative form "cars" would not be a label of the concept, but would be indirectly accessible through the lexical entry.

By the following axioms, querying for the prefLabel of ex:car would give the canonical form only.

  • ontolex:Form rdfs:subClassOf skosxl:Label
  • ontolex:writtenForm rdfs:subPropertyOf skosxl:literalForm

With this solution, we get the right inference that the writtenForm of the Form is the literalForm of the corresponding label. While there are other forms "floating" around, this should not be an issue: they are formally labels, but they are not linked to the concept, so they are not labels of any concept.
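The inference described above can be checked with a small sketch. It is not a real reasoner: triples are modelled as plain Python tuples of strings, and only the two RDFS entailment rules relevant here (subclass typing and subproperty propagation) are implemented. The function name `rdfs_closure` and the string encoding of the data are assumptions made for illustration.

```python
# Minimal RDFS-style forward chaining over (subject, predicate, object) tuples.
# Only two rules are implemented: rdfs9 (subclass typing) and rdfs7
# (subproperty propagation). Literals are encoded as bare strings for brevity.

RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
SUBPROP = "rdfs:subPropertyOf"


def rdfs_closure(triples):
    """Return the input triples plus everything derivable by rdfs7 and rdfs9."""
    closed = set(triples)
    while True:
        new = set()
        for (s, p, o) in closed:
            for (a, q, b) in closed:
                # rdfs9: s rdf:type A . A rdfs:subClassOf B  =>  s rdf:type B
                if p == RDF_TYPE and q == SUBCLASS and a == o:
                    new.add((s, RDF_TYPE, b))
                # rdfs7: s P o . P rdfs:subPropertyOf Q  =>  s Q o
                if q == SUBPROP and a == p:
                    new.add((s, b, o))
        if new <= closed:  # fixpoint reached, nothing new was derived
            return closed
        closed |= new


data = {
    # the two mapping axioms of the Form proposal
    ("ontolex:Form", SUBCLASS, "skosxl:Label"),
    ("ontolex:writtenForm", SUBPROP, "skosxl:literalForm"),
    # the example data from the text
    ("ex:car", RDF_TYPE, "skos:Concept"),
    ("ex:car", "skosxl:prefLabel", "ex:label1"),
    ("ex:label1", RDF_TYPE, "ontolex:Form"),
    ("ex:label1", "ontolex:writtenForm", "car"),
    ("ex:lex1", RDF_TYPE, "ontolex:LexicalEntry"),
    ("ex:lex1", "ontolex:canonicalForm", "ex:label1"),
    ("ex:lex1", "ontolex:otherForm", "ex:label3"),
    ("ex:label3", RDF_TYPE, "ontolex:Form"),
    ("ex:label3", "ontolex:writtenForm", "cars"),
}

inferred = rdfs_closure(data)

# Every resource that is a skosxl:Label after inference
labels = sorted(s for (s, p, o) in inferred
                if p == RDF_TYPE and o == "skosxl:Label")

# Literal forms of the labels actually attached to ex:car
car_forms = sorted(
    lit
    for (c, p, lab) in inferred if c == "ex:car" and p == "skosxl:prefLabel"
    for (s, q, lit) in inferred if s == lab and q == "skosxl:literalForm"
)

print(labels)
print(car_forms)
```

Under these assumptions both forms end up typed as skosxl:Label, but only the canonical form is reachable from the concept through skosxl:prefLabel.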

The other solution would be to use a LexicalEntry as a Label, and thus as a possible class in the range of pref/alt/hiddenLabel.

This would thus look as follows:

ex:car rdf:type skos:Concept.
ex:car skosxl:prefLabel ex:lex1.

ex:lex1 rdf:type ontolex:LexicalEntry.
ex:lex1 ontolex:canonicalForm ex:label1.
ex:lex1 ontolex:otherForm ex:label3.

ex:label1 rdf:type ontolex:Form.
ex:label1 ontolex:writtenForm "car".

ex:label3 rdf:type ontolex:Form.
ex:label3 ontolex:writtenForm "cars".
ex:label3 ling:numerus "plural".


Now our (not uncritical) intuition was that the literalForm of the LexicalEntry, taken as a label, should be "car". We cannot infer this, as we would need an axiom of the sort:

canonicalForm o writtenForm -> literalForm

which cannot be expressed in OWL (yet), because the chain would have to end in a datatype property.
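Outside OWL, the chain can still be materialized by a simple rule. The following is a minimal sketch under stated assumptions: triples are plain Python tuples of strings, the function name `materialize_literal_forms` is hypothetical, and the rule is hard-coded rather than expressed in any rule language.

```python
# Rule applied procedurally (not an OWL axiom):
#   ?e ontolex:canonicalForm ?f . ?f ontolex:writtenForm ?l
#     =>  ?e skosxl:literalForm ?l


def materialize_literal_forms(triples):
    """Return the skosxl:literalForm triples implied by the chain rule."""
    # (entry, form) pairs linked by canonicalForm
    canonical = {(s, o) for (s, p, o) in triples
                 if p == "ontolex:canonicalForm"}
    # form -> set of written representations
    written = {}
    for (s, p, o) in triples:
        if p == "ontolex:writtenForm":
            written.setdefault(s, set()).add(o)
    return {
        (entry, "skosxl:literalForm", lit)
        for (entry, form) in canonical
        for lit in written.get(form, ())
    }


data = {
    ("ex:car", "skosxl:prefLabel", "ex:lex1"),
    ("ex:lex1", "ontolex:canonicalForm", "ex:label1"),
    ("ex:lex1", "ontolex:otherForm", "ex:label3"),
    ("ex:label1", "ontolex:writtenForm", "car"),
    ("ex:label3", "ontolex:writtenForm", "cars"),
}

derived = materialize_literal_forms(data)
print(derived)
```

Only the canonical form contributes a literalForm here; the otherForm "cars" is deliberately left out of the rule.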

In favor of OntolexForm

Given these examples, I think that modelling LexicalEntry as a subclass of skosxl:Label has three disadvantages, compared to none for modelling Form as a subclass of skosxl:Label. The disadvantages are:

1) LexicalEntries in our view are equivalence classes and should not themselves have a literalForm (which they would have if we decide that LexicalEntry rdfs:subClassOf skosxl:Label)

2) Assumption that the canonical form is the literalForm is unwarranted (could be otherwise)

3) Needed inference cannot be expressed in OWL

In favor of LexicalEntry

Premise: neither of the two candidates perfectly represents a label. The issue is that a label is a label (Gertrude Stein in the Semantic Web..), and is just a shallower notion than the one represented by the combination of LexicalEntry and Form, though the latter may also serve the purpose of a label.

Respect of Intended Meaning of both vocabularies (Ontolex and SKOSXL)

The fact that an ontolex:LexicalEntry has many forms while a skosxl:Label has only one literalForm (criticized in #Semantic Issues) is explained by this: in thesauri we usually find only the *lemmas* (or in any case one chosen form, which indeed may be problematic in both cases) of the many lexical entries. So the most appropriate form is chosen, and that one is the label. Different forms are useful only in that they describe the language itself (though yes, they represent useful lexical anchors for a text-to-ontology process). Two notes here:

  1. there is no semantic inconsistency: formally, ontolex:LexicalEntry would be a subclass of skosxl:Label, and the skosxl:literalForm may be exactly one of the many (reified ontolex) forms which represent the label; leaving formalism aside and talking about "intended meaning", the "closeness" is really better than in the Form case (see "issue with Form case" in #Semantic Issues)
  2. seeing all the above forms as labels may well be undesired (for the reasons described above in this paragraph).

Opposition to Form

  1. Regarding point 1 in #Objectives: in the Form case we just get a listing of labels, but there is really no OWL inference for connecting them to concepts. So this is even worse than in the LexicalEntry case, which invalidates point 3 of the #In favor of OntolexForm section
  2. This has been criticized as a "pragmatic" approach, but all my considerations point to the fact that semantically (both formally, and interpretatively), Form is no better than LexicalEntry (see above)

Concrete Advantages

Very nicely compatible use of pref/alt/hiddenLabels in SKOSXL in combination with Ontolex

Regarding point 1 in #Objectives: since the implication is that only the lemmas (canonicalForms) of ontolex:LexicalEntries are de facto automatically inferred (ok, still not possible because of the datatype property ending the chain, but that's the *sole* limitation, and this kind of inference is used in SKOSXL too to materialize SKOS triples), one could freely use:

skosxl:prefLabel, skosxl:altLabel etc..

with any ontolex:LexicalEntry, as it is also a skosxl:Label! And that would be *really* nice for easily meeting the needs of thesauri makers/publishers who would embrace our vocabulary. By making skosxl:pref/alt/hiddenLabel subproperties of the inverse of ontolex:denotes, you also get in turn the possibility of using one property only, and of obtaining ontolex:denotes by inference.

Thus, we can derive from:

    <?c a skos:Concept>  --> skosxl:prefLabel --> <?x a ontolex:LexicalEntry>  --> ontolex:canonicalForm --> <?f a ontolex:Form>  --> ontolex:writtenRep  --> <?l a literal>

The following:

    <?c a skos:Concept>  --> skosxl:prefLabel --> <?x a skosxl:Label>  --> skosxl:literalForm --> <?l a literal>

As previously said, we cannot infer anything about the otherForms of ?x for concept ?c, but again:

  1. people interested in an ontolex-->skosxl export could easily decide not to export the otherForms, because they only want the lemmas, and this representation could provide a useful knife for deciding what to cut in the export, by limiting the axiomatized inference to the lemmas.
  2. another (non-exclusive) solution is to generate altLabels from the otherForms (yes, even from the otherForms of an ontolex:LexicalEntry) or, why not, skosxl:hiddenLabels! Hidden labels represent exactly this kind of further information, which does not need to be shown (at the level of representation of skosxl) but which should be kept for application-dependent purposes. The choice between altLabel and hiddenLabel could also depend on (possible) Form2Form relationships or on further qualification of Forms through properties other than canonicalForm/otherForm (maybe subproperties of otherForm).
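The two export options above could be sketched as follows. This is only an illustration under several assumptions: triples are plain Python tuples of strings, the function `export_skosxl` and its `other_forms_as` parameter are hypothetical, the lemma is taken from ontolex:canonicalForm and ontolex:writtenRep as in the chain shown earlier, and, as one possible design choice, the Form resource itself is reused as the skosxl:Label for an exported otherForm.

```python
def export_skosxl(triples, other_forms_as=None):
    """Produce SKOS-XL triples from ontolex data.

    other_forms_as: None to export lemmas only, or a property name such as
    "skosxl:altLabel" or "skosxl:hiddenLabel" to also export the otherForms.
    """
    def objects(subject, predicate):
        return sorted(o for (s, p, o) in triples
                      if s == subject and p == predicate)

    out = set()
    for (concept, p, entry) in triples:
        if p != "skosxl:prefLabel":
            continue
        # The lexical entry itself plays the role of the preferred label.
        out.add((concept, "skosxl:prefLabel", entry))
        out.add((entry, "rdf:type", "skosxl:Label"))
        for form in objects(entry, "ontolex:canonicalForm"):
            for lit in objects(form, "ontolex:writtenRep"):
                out.add((entry, "skosxl:literalForm", lit))
        if other_forms_as is None:
            continue  # option 1: cut everything except the lemma
        # Option 2: expose each otherForm as an additional label
        # (assumption: the Form IRI is reused as the label resource).
        for form in objects(entry, "ontolex:otherForm"):
            out.add((concept, other_forms_as, form))
            out.add((form, "rdf:type", "skosxl:Label"))
            for lit in objects(form, "ontolex:writtenRep"):
                out.add((form, "skosxl:literalForm", lit))
    return out


data = {
    ("ex:car", "skosxl:prefLabel", "ex:lex1"),
    ("ex:lex1", "ontolex:canonicalForm", "ex:label1"),
    ("ex:lex1", "ontolex:otherForm", "ex:label3"),
    ("ex:label1", "ontolex:writtenRep", "car"),
    ("ex:label3", "ontolex:writtenRep", "cars"),
}

lemma_only = export_skosxl(data)
with_hidden = export_skosxl(data, other_forms_as="skosxl:hiddenLabel")
print(sorted(lemma_only))
print(sorted(with_hidden - lemma_only))
```

Passing "skosxl:altLabel" instead would expose the same forms as alternative labels; which of the two is appropriate is exactly the open choice discussed in the second item.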

Issues from the "pro" side

To me, the only real issue for the LexicalEntry case among the three listed in #In favor of OntolexForm is:

  • Assumption that the canonical form is the literalForm is unwarranted (could be otherwise)

We would need some kind of marker for the desired form, which clearly demands much more complex reasoning support (and more effort from the dataset developer, as OWL does not support defaults because of its open-world assumption).


A longer discussion is in http://lists.w3.org/Archives/Public/public-ontolex/2013Oct/0030.html

A third way?

Possible alternatives (or solutions supporting either of the two main approaches discussed above):

  • we could decide that a mapping vocabulary (even for establishing rdfs:subClassOf relations) could be a separate module, and there could indeed be different mapping vocabularies, depending on what the dataset publisher prefers to see inferred
  • Abandon any form of purely OWL-based inference and rely on SWRL/SPIN/whatever, just to provide some shareable form for content production. Still, if we want OntoLex information to be transformable into SKOSXL information:
    • we should be able to provide enough information to produce this transformation...
    • ...and possibly in a clean and direct way, without requiring too many additional markers. That is, there should be some implication between classes/properties in Ontolex and classes/properties in SKOSXL. In this sense, the proposal for connecting SKOSXL terminological labels is an example.