Section 0. Contact and confidentiality

Contact e-mail:

Do you mind your use case being made public on the working group website and documents?

No, you are welcome to use it.

Section 1. Application

In this section we ask you to provide some information about the application for which the vocabulary(ies) and/or vocabulary mappings are being used. Please note:

1.1. What is the title of the application?

Extended Metadata Registry (XMDR) Prototype
This is a prototype implementation of the metadata design specifications proposed
for edition 3 of ISO/IEC 11179 part 3.
See http://xmdr.org/, http://xmdr.org/software/ and http://xmdr.lbl.gov/xmdr

1.2. What is the general purpose of the application?

Extensions to ISO/IEC 11179 Metadata Registry Standard

• Register metadata, including concept systems such as terminologies and ontologies, as well as data elements and value domains (codesets).
• Register and manage any semantic information that is useful in data management, data administration, data analysis, and linkage of concepts to data.
• Provide semantics services for semantic computing such as the Semantic Web, semantics service oriented architectures, and semantic grids.
• Interrelate concept systems with other concept systems.
• Interrelate concept systems with data held in databases, terminologies, and metadata deriving from natural language text understanding systems.
• Enable use of new services for semantic computing: Semantics Service Oriented Architecture, Semantic Grids, semantics-based workflows, Semantic Web, etc.
• Capture semantics with more formal techniques (in addition to natural language): First Order Logic, Description Logic, Common Logic, OWL.
• Encourage and enable the sharing of concept systems and traditional metadata through means that reduce the cost of accessing, obtaining, and interacting with the broadest range of content.
• Provide semantic services needed to support semantic computing, such as dereferencing the URIs used in creating RDF statements, by providing relevant information describing the referenced concept and its authoritative standing within some community of interest.

1.3. Provide some examples of the functionality of the application. Try to illustrate all of the functionalities in which the vocabulary(ies) and/or vocabulary mappings are involved.

See http://hpcrd.lbl.gov/SDM/XMDR/use-cases.html

1.4. What is the architecture of the application?

See http://hpcrd.lbl.gov/SDM/XMDR/arch.html

REST (Representational State Transfer) architecture, Metamodel (in OWL) and data format (XML), RegistryStore (Persistence and Versioning), Metadata Content Validation, Indexing (text, asserted, and logical inference), Mapping, Authentication, and (Human) User Interface

Not at present, but in the future we hope that content data, as well as extended metadata registry software, might be so distributed.

1.5. Briefly describe any special strategy involved in the processing of user actions, e.g. query expansion using the vocabulary structure.

Users may choose to expand queries to include inferred as well as asserted information. Users may draw inferences based on the XMDR metamodel (ISO/IEC 11179) as well as on the specific content and relationships of individual sets of metadata.
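
The difference between asserted and inferred query expansion can be sketched in a few lines of Python. This is a minimal illustration, not the XMDR implementation: the concept names, the `ASSERTED_NARROWER` table, and the `expand_query` function are all hypothetical, and the only inference shown is transitivity over a "narrower concept" relationship, which is an assumption made for the example.

```python
from collections import deque

# Hypothetical asserted "narrower concept" links; not taken from XMDR content.
ASSERTED_NARROWER = {
    "vehicle": ["car", "truck"],
    "car": ["sedan"],
}


def expand_query(term, include_inferred=True):
    """Return the set of concepts that a query for `term` should match."""
    if not include_inferred:
        # Asserted only: the term plus its directly stated narrower concepts.
        return {term, *ASSERTED_NARROWER.get(term, [])}

    # Inferred: follow narrower links transitively (breadth-first search).
    seen = {term}
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for child in ASSERTED_NARROWER.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


asserted_only = expand_query("vehicle", include_inferred=False)
with_inference = expand_query("vehicle")
```

In this sketch a query for "vehicle" reaches "sedan" only when inference is switched on, because that link is never asserted directly.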

1.6. Are the functionalities associated with the controlled vocabulary(ies) integrated in any way with functionalities provided by other means? (For example, search and browse using a structured vocabulary might be integrated with free-text searching and/or some sort of social bookmarking or recommender system.)

One of the main purposes of the XMDR Prototype is to demonstrate how concept systems (including vocabularies, terminologies, thesauri, and ontologies) can be used to help integrate, search, and harmonize more traditional metadata registry information about data elements, valid value sets, etc. The functionality for humans and computers is to enable linkage of concept systems and data. This can be utilized in many ways, including finding data elements that are related to particular concepts or sets of concepts.

The application combines text and inference searching.

1.7. Any additional information, references and/or hyperlinks.

See http://xmdr.org/

As noted elsewhere, we have been working closely with Harold Solbrig, and using LexGrid to facilitate import of concept systems into XMDR whenever possible. As SKOS gains wider acceptance and software tools to work with it, we can envision eventually using SKOS and related tools for many of the same purposes for which we are currently using LexGrid. We thus hope that SKOS will be able to incorporate many of the current features of LexGrid so that we can easily use concept systems that use SKOS. LexGrid and XMDR may prove to be useful tools for working with content expressed in SKOS. In the meantime, it might be very useful to have software that could translate from SKOS to LexGrid and vice-versa.

Section 2. Vocabulary(ies)

In this section we ask you to provide some information about the vocabulary or vocabularies you would like to be able to represent using SKOS. Please note:

2.1. What is the title of the vocabulary? If you're describing multiple vocabularies, please provide as many titles as you can.

XMDR has loaded a number of different concept systems in order to demonstrate different kinds of capabilities, particularly for large, complex concept systems. For the list of the current concept systems included and proposed for the XMDR Prototype, and a summary of their respective characteristics, see the table at http://hpcrd.lbl.gov/SDM/XMDR/contentlist.html

2.2. Briefly describe the general characteristics of the vocabulary, e.g. scope, size...

See http://hpcrd.lbl.gov/SDM/XMDR/contentlist.html, a portion of which is included below...

XMDR terminology and concept system content varies greatly in size, from small to hundreds of thousands of concepts, millions of terms, and millions of relations between concepts.

2.3. In which language(s) is the vocabulary provided?

XMDR is intended to input concept systems in their entirety from any format.

Wherever possible, we have used LexGrid as an intermediate step in loading content. Content expressed in SKOS would make it easier for XMDR to load additional content from diverse fields and sources.

2.5. Describe the structure of the vocabulary.

2.6. Is a machine-readable representation of the vocabulary already available (e.g. as an XML document)? If so, we would be grateful if you could provide some example data or point us to a hyperlink.

There is substantial content available in two prototype implementation instances on our web site at xmdr.org. We have permission to make the content accessible on the web but may not be able to re-distribute content in bulk.

2.7. Are any software applications used to create and/or maintain the vocabulary?

Some parts of the XMDR architecture are not yet implemented (e.g., mapping). We are working actively to add such capabilities and welcome collaboration that might expedite that work.

2.8. If a database application is used to store and/or manage the vocabulary, how is the database structured? Illustration by means of some table sample is welcome.

Concept systems are translated into XML files that conform to the XMDR metamodel, as described at https://xmdr.lbl.gov/mediawiki/index.php/11179_Diagrams

2.9. Were any published standards, textbooks or written guidelines followed during the design and construction of the vocabulary?

We are trying to coordinate our work with the development of ISO/IEC 11179 edition 3 and other standards efforts, particularly ISO TC37 and the W3C Semantic Web Working Groups (XML, RDF, OWL and SKOS). Other related ISO standards include 639, 704, 3166, 11179 and 12620, and UML (Unified Modeling Language).

We also have used the LexGrid specification (http://LexGrid.org/) because it bridges the SKOS/OWL boundary.

2.10. How are changes to the vocabulary managed?

Changes to vocabularies are the responsibility of the different organizations from which we obtain them. How to keep the experimental XMDR Prototype updated with respect to changing external sources is an active research and development topic.

2.11. Any additional information, references and/or hyperlinks.

Section 3. Vocabulary Mappings

In this section we ask you to provide some information about the mappings or links between vocabularies you would like to be able to represent using SKOS. Please note:

Although the XMDR Prototype does not yet include facilities for mapping between different concept systems, that is one of our important goals.

See http://hpcrd.lbl.gov/SDM/XMDR/arch.html section C.4.

XMDR MappingEngine
The part of the MDR specification that requires the most work for XMDR is the support for registration and use of mappings between pairs of classification systems, ontologies, schemas, and value domains. Thus far, we have identified four general approaches to mapping being used today:

Translation Tables
The simplest approach is to build a table of pairs (an unlabeled bipartite graph) between the classification scheme items, concepts, or values that have corresponding or overlapping meaning. Such tables are also sometimes called "correspondence tables"; see, for example, the mappings provided with NAICS 2002. A one-to-many matching indicates ambiguity. Translation tables are sometimes qualified with a confidence and/or completeness scale or measure, which is necessarily direction-dependent.
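
As a rough illustration of a translation table, the Python sketch below stores pairs with a confidence value and flags one-to-many matches as ambiguous. The codes, confidence values, and function names are hypothetical and are not taken from NAICS or from XMDR.

```python
from collections import defaultdict

# Hypothetical correspondence rows: (source code, target code, confidence).
# The codes and confidence values are made up for illustration.
ROWS = [
    ("A1", "X10", 0.9),
    ("A1", "X11", 0.6),
    ("A2", "X20", 1.0),
]


def build_table(rows):
    """Index the pairs by source code (one direction of the bipartite graph)."""
    table = defaultdict(list)
    for source, target, confidence in rows:
        table[source].append((target, confidence))
    return table


def translate(table, code):
    """Return (candidates sorted by descending confidence, is_ambiguous)."""
    candidates = sorted(
        table.get(code, []), key=lambda pair: pair[1], reverse=True
    )
    # More than one candidate means a one-to-many match, i.e. ambiguity.
    return candidates, len(candidates) > 1


table = build_table(ROWS)
candidates, ambiguous = translate(table, "A1")
```

Because the confidence measure is direction-dependent, a table for the reverse direction would need its own confidence values rather than reusing these.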

DL-based Translation
A more powerful approach is to use a description logic (such as OWL) to express mappings, which provides more precision. However, some non-trivial tool is still needed to "apply" a DL mapping as a transformation.

FOL-based Translation
First-order logic (FOL) is more powerful than description logics, and so it supports the definition of more complicated mappings. The trade-off is that DLs are typically decidable and tractable, while full FOL is neither. Still, some communities have been using full FOL for some time and found that these theoretical problems rarely (if ever) materialize in practice. Additionally, there is a great variety of DLs, many of which cannot be combined without breaking the decidability and tractability conditions which motivate the use of a DL in the first place, and so any system attempting to leverage knowledge which is expressed in two different DLs will generally be forced to use a substantial subset of FOL anyway. An emerging ISO standard for exchange of FOL axiom sets is called Simple Common Logic (SCL).

Rule- or Query-based Translation
In the relational database community, translations are commonly described as views. Euzenat observes that an equivalent level of expressivity is provided by SWRL for OWL/RDF [Euzenat, 2004]. It is not immediately apparent whether rule-based translation is (in theory) any less powerful (or more tractable) than FOL-based translation.
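
A view-based translation can be sketched with Python's built-in sqlite3 module. The table names, column names, and codes below are hypothetical; the sketch only shows the general idea that a view joins source data with a mapping table, so that queries against the view see translated codes.

```python
import sqlite3

# In-memory database with hypothetical tables and made-up codes.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE establishment (name TEXT, old_code TEXT);
    CREATE TABLE code_map (old_code TEXT, new_code TEXT);

    INSERT INTO establishment VALUES ('Acme', 'A1'), ('Bolt', 'A2');
    INSERT INTO code_map VALUES ('A1', 'X10'), ('A2', 'X20');

    -- The view expresses the translation: each establishment is
    -- presented with the code from the newer classification.
    CREATE VIEW establishment_new AS
        SELECT e.name, m.new_code
        FROM establishment AS e
        JOIN code_map AS m ON m.old_code = e.old_code;
    """
)

rows = conn.execute(
    "SELECT name, new_code FROM establishment_new ORDER BY name"
).fetchall()
```

The view is recomputed on each query, so changes to the mapping table take effect without rewriting the underlying data.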

3.1. Which vocabularies are you linking/mapping from/to?

We have just begun this part of our research and development efforts, starting with mappings between the old Standard Industrial Classification (SIC) codes and their successor, the North American Industry Classification System (NAICS) codes.

See forthcoming work by Fred Gey, who is a member of our XMDR team at LBNL, a preliminary copy of which is attached.

3.3. Describe the different types of mapping used, with reference to the examples given in paragraph 3.2.

3.4. Any additional information, references and/or hyperlinks.