Real Scenarios for Use Cases

From Ontology-Lexica Community Group

AGROVOC

requirements covered: most of the vocabulary modules we are developing for OntoLex

refer to: Armando Stellato

AGROVOC is a controlled vocabulary covering all areas of interest to FAO, including food, nutrition, agriculture, fisheries, forestry, and the environment. To date, AGROVOC contains over 32,000 concepts organized in a hierarchy. Each concept may have labels in up to 25 languages: Arabic, Chinese, Czech, English, French, German, Hindi, Hungarian, Italian, Japanese, Korean, Lao, Persian, Polish, Portuguese, Russian, Slovak, Spanish, Thai, Turkish, Malaysian, Moldavian, Telugu, Ukrainian.

Current metadata about AGROVOC is published here: http://aims.fao.org/aos/agrovoc/void.ttl

As a member of the FAO group maintaining AGROVOC, Armando Stellato can introduce the OntoLex model to that group and, if OntoLex moves in a promising direction, we could adopt it early for modeling the linguistic aspects of the thesaurus.

The property vocabulary of VOCBENCH is http://aims.fao.org/aos/agrontology. There is a wide margin for improvement in this vocabulary: it has only recently been separated from the thesaurus data (previously, data and vocabulary were kept together under the common umbrella of AGROVOC), and it still includes many modeling choices made in the past, before SKOS and SKOS-XL were even developed (and before many popular ontologies were distributed). As you can see by inspecting Agrontology, it defines several lexical relations that link lexical entries to one another. Lexical entries are currently modeled as SKOS-XL labels and are thus first-class citizens, for which at least editorial data is always present, in addition to domain attributes and relationships.
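As a rough illustration of what "labels as first-class citizens" means, the sketch below represents a handful of SKOS-XL style statements as plain Python tuples. The `skosxl:prefLabel`, `skosxl:altLabel` and `skosxl:literalForm` properties come from SKOS-XL; every `ex:` identifier, the sample label texts, the creation date, and the `ex:hasAcronym` relation are hypothetical placeholders and are not taken from AGROVOC or Agrontology.

```python
# Triples as (subject, predicate, object) tuples; literals are (text, lang) pairs.
# All ex: identifiers, the sample values, and ex:hasAcronym are hypothetical.
TRIPLES = [
    # The concept points to label resources rather than to plain strings.
    ("ex:c_1", "skosxl:prefLabel", "ex:xl_en_1"),
    ("ex:c_1", "skosxl:altLabel", "ex:xl_en_2"),
    ("ex:c_1", "skosxl:prefLabel", "ex:xl_it_1"),
    # Each label resource carries its literal form.
    ("ex:xl_en_1", "skosxl:literalForm", ("genetically modified organisms", "en")),
    ("ex:xl_en_2", "skosxl:literalForm", ("GMO", "en")),
    ("ex:xl_it_1", "skosxl:literalForm", ("organismi geneticamente modificati", "it")),
    # A lexical relation links two label resources directly, not the concept.
    ("ex:xl_en_1", "ex:hasAcronym", "ex:xl_en_2"),
    # Editorial data attached to the label itself (invented sample value).
    ("ex:xl_en_2", "dct:created", "2012-01-01"),
]


def objects(subject, predicate, triples=TRIPLES):
    """Return all objects of triples matching the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def literal_form(label, triples=TRIPLES):
    """Return the (text, lang) pair of a label resource, or None if absent."""
    forms = objects(label, "skosxl:literalForm", triples)
    return forms[0] if forms else None


def related_forms(label, relation, triples=TRIPLES):
    """Follow a label-to-label relation and return the targets' literal forms."""
    return [literal_form(target, triples) for target in objects(label, relation, triples)]
```

Because the relation is stated between the two label resources, `related_forms("ex:xl_en_1", "ex:hasAcronym")` can be answered without going through the concept at all, which would not be possible if labels were plain literals.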

VOCBENCH

requirements covered: most of the vocabulary modules we are developing for OntoLex

VOCBENCH is the tool originally developed (under the name AGROVOC Workbench Server) to edit AGROVOC. Today, it is a collaborative web application for editing thesauri. Semantic Turkey (an RDF framework from the University of Tor Vergata) and VocBench (both listed here: http://www.w3.org/2001/sw/wiki/SKOS) are being integrated into a new version (2.0) of VocBench, which will hopefully, and likely, be released before the end of this spring. VocBench is currently used by FAO for collaborative (multiuser) maintenance of their thesauri (AGROVOC, Biotech, etc.), and we know of other parties adopting it (e.g. http://taskman.eionet.europa.eu/projects/gemet/wiki/VocBench for the GEMET thesaurus). The modeling vocabulary used by VocBench is SKOS-XL.

SemaGrow EU Project

use cases covered: IE, OA, LLD and SAOM

refer to: Armando Stellato

"SemaGrow envisages to develop the scalable, efficient, and robust data services needed to take full advantage of the data-intensive and inter-disciplinary Science of 2020 and to re-shape the way that data analysis techniques are applied to the heterogeneous data cloud"

To obtain real use cases anchored to partners and users available in the project, in WP2 we are lifting structured data available from partners to RDF, while in WP3 we are:

  • eliciting knowledge from unstructured content, focusing on architectures and frameworks for doing so in a systematic way
  • implementing scalable solutions for ontology matching in real distributed scenarios

The first objective could benefit from the results of specifications related to the IE scenario. The second objective fits the SAOM and OA use cases, with requirements related to LLD too.

The work will start from the observation that, beyond the many studies conducted on examples that are in most cases pre-elaborated, no OM system is really able to sustain the ontology matching task in an open, heterogeneous and distributed environment. The reason is that nothing really coordinates automatic, independent agents in a mediation process that has to deal with heterogeneity in both the natural languages (e.g. Italian vs. English, and to what extent? with which coverage of the resource?) and the formal languages (e.g. which formalism is used to describe the resource?) used to linguistically characterize an RDF dataset.
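The "to what extent" and "which coverage" questions can be made concrete with a small sketch. The code below is only an illustration under stated assumptions, not part of SemaGrow or of any OntoLex specification: it assumes each dataset can be reduced to (resource, label text, language tag) tuples, and the function names, sample data, and the 0.5 threshold are all hypothetical.

```python
from collections import defaultdict


def language_coverage(labels):
    """Map each language tag to the fraction of resources labeled in it.

    `labels` is an iterable of (resource, text, lang) tuples. Coverage is
    computed over the resources that have at least one label; resources
    with no label at all are not visible to this function.
    """
    resources = set()
    by_lang = defaultdict(set)
    for resource, _text, lang in labels:
        resources.add(resource)
        by_lang[lang].add(resource)
    if not resources:
        return {}
    return {lang: len(covered) / len(resources) for lang, covered in by_lang.items()}


def shared_languages(cov_a, cov_b, threshold=0.5):
    """Languages in which both datasets reach at least `threshold` coverage."""
    return sorted(
        lang
        for lang in cov_a.keys() & cov_b.keys()
        if cov_a[lang] >= threshold and cov_b[lang] >= threshold
    )


# Hypothetical sample data for two datasets to be matched.
DATASET_A = [
    ("a:1", "maize", "en"), ("a:1", "mais", "it"),
    ("a:2", "rice", "en"),
]
DATASET_B = [
    ("b:1", "mais", "it"),
    ("b:2", "riso", "it"), ("b:2", "rice", "en"),
]

cov_a = language_coverage(DATASET_A)
cov_b = language_coverage(DATASET_B)
```

An independent matching agent that had access to such per-language figures for both datasets could decide, before matching, whether a monolingual comparison in a shared language is viable or whether translation resources are needed.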