From W3C Wiki
- Ratnesh Sahay, DERI, Galway
- Eric Prud'hommeaux, W3C
This document provides a set of guidelines for ontologising HL7 Version 3, which is based on a central conceptual model called the Reference Information Model (RIM). The objectives of lifting HL7 Version 3 to an ontological format are: (1) to express the RIM in a standard logical model for processing with generic tools; (2) to use the OWL-based RIM to define extensions to the RIM, especially for HL7's Fast Healthcare Interoperability Resources (FHIR) instances; and (3) to use OWL tooling to address interoperability between HL7 and medical terminologies such as SNOMED.
Health Level Seven (HL7) Standard
Health Level Seven (HL7) was founded in 1987. It is a non-profit, ANSI-accredited standards development organisation that produces healthcare information models and schemas for storing and exchanging information across the healthcare industry. The HL7 standard covers both clinical and administrative aspects of the healthcare industry, including laboratory, medical records, patient care, pharmacy, public health reporting, regulated studies, accounts and billing, claims and reimbursement, patient administration, personnel management and scheduling. There are two major versions of the HL7 standard, Version 2 and Version 3. HL7 Version 2 is based on an Electronic Data Interchange (EDI) format, although its specifications are also available in XML format. The Version 3 specification, as opposed to Version 2 EDI messaging, is based on object-oriented principles. This document specifically discusses HL7 Version 3.
HL7 Version 3
The HL7 Version 3 specifications are centred around a static conceptual model, called the Reference Information Model (RIM), which covers all domains of the healthcare industry. The scope of the RIM is global, which means it is inherited by all healthcare institutions that comply with Version 3. In addition to the RIM, there are two interrelated types of local information model: the Domain Message Information Model (DMIM) and the Refined Message Information Model (RMIM). The DMIM is a local model (a refined subset of the RIM) and is used for modelling a particular domain (e.g., Lab, Hospital Admin). The RMIM is a subset of a DMIM and is used to express the information content for a message or set of messages (e.g., Lab Observation Order Message). All three interrelated models use the same notation and have the same basic structure but differ from each other in their information content, scope (context), and intended use. The RMIM is exported as XML Schema, and an implementation technology such as XML is used for the actual construction and exchange of clinical messages.
As noted, the RIM is an upper model shared by all users of the Version 3 system. The figure below shows six core RIM classes:
Entity: A physical thing (living or non-living), a group of physical things, or an organisation capable of participating in Acts while in a Role. The Entity class further specialises into classes such as EntityPerson, EntityPlaces, EntityDevices, etc. Each specialised class then introduces additional attributes.
Role: An Entity playing the Role as identified, defined, or acknowledged by the Entity that scopes the Role. For example, an Entity (e.g., Sean) plays a Role (i.e., patient) and participates (as subject) in a clinical Act (e.g., Lab Observation). The Role class further specialises into RoleMember, RoleIngredients, RolePart, etc.
Act: A record of something that is being done, has been done, can be done, or is intended or requested to be done. The Act class further specialises into ActObservation, ActClinicalTrial, ActSpecimenOrder, etc.
Participation: An association between an Act and a Role with an Entity playing that Role. Each Entity (in a Role) involved in an Act in a certain way is linked to the Act by one Participation. The Participation class further specialises into ParticipationHolder, ParticipationReceiver, ParticipationConsultant, etc.
RoleLink: A connection between two Roles expressing a dependency between those Roles.
ActRelationship: A directed association between a source Act and a target Act.
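The relationships among the six core classes can be sketched as plain data structures. The following is only an illustrative sketch: the field names (name, kind, player, type_code and so on) are hypothetical and are not RIM attribute names. It reuses the example above of Sean playing the patient Role and participating as subject in a Lab Observation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Entity:
    name: str  # hypothetical field, not a RIM attribute


@dataclass
class Role:
    kind: str       # e.g. "patient"
    player: Entity  # the Entity playing this Role


@dataclass
class Participation:
    type_code: str  # how the Role takes part in the Act, e.g. "subject"
    role: Role


@dataclass
class Act:
    description: str
    participations: List[Participation] = field(default_factory=list)


@dataclass
class RoleLink:
    source: Role  # a dependency between two Roles
    target: Role


@dataclass
class ActRelationship:
    source: Act  # a directed association between two Acts
    target: Act


# Sean (Entity) plays the patient Role and is the subject of a Lab Observation.
sean = Entity(name="Sean")
patient = Role(kind="patient", player=sean)
lab_observation = Act(description="Lab Observation")
lab_observation.participations.append(
    Participation(type_code="subject", role=patient)
)
```

Note that an Act never points at an Entity directly: the path always runs through a Participation and a Role.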
As noted, in addition to the RIM upper model, the Version 3 messaging framework allows the construction of local UML models that are specific to a healthcare institution. Local UML models inherit from the upper model to specialise the six core classes (Entity, Act, Role, Participation, RoleLink, ActRelationship) for constructing clinical messages.
The RIM is organised in three categories: (i) Data Type, (ii) Structure (i.e., hierarchy of classes and attributes), and (iii) Vocabulary. The RIM data types are further divided into two groups (Foundational and Basic), where foundational data types are designed to construct the other data types. Basic data types are built by extending foundational data types for clinical message construction. Given the UML semantic specifications for these three categories, an Implementable Technology Specification (ITS) provides a mapping between the constructs of an implementation technology and the RIM semantics. The UML models are stored using the Model Interchange Format (MIF). In Version 3, the separation of semantics from implementation technology aids in the consistent use of data types and vocabularies, irrespective of the implementation technology.
UML-RIM to OWL-RIM
Recent work on transforming a UML model to an ontology, and vice versa, falls into two categories. The first, which includes some of the initial work in this area, proposes a conceptual mapping between UML and ontology constructs. This work has revealed that the task is not straightforward and that domain-specific customisation is required to preserve the semantics of both conceptual models. The second category creates mappings between UML and ontology serialisation languages. In standard settings, UML models are serialised and stored in the XML Metadata Interchange (XMI) format. However, in the case of HL7 Version 3 the serialisation format is different: HL7 repositories store UML models in the Model Interchange Format (MIF). Though several tools are available to transform (partial) XMI to an ontology language such as OWL, none of them supports MIF to OWL conversion. We now present a set of lifting (i.e., transforming) guidelines to build an OWL-RIM from the UML-RIM.
UML-Class to OWL-Class
A class is the most fundamental concept in UML and ontological models. There are some differences between the traditional UML or OO programming class and the ontology class. For instance, UML uses namespaces to create self-contained classes along with attributes and methods that vary depending on the scope of use. For example, each class has its own namespace for its attributes, associations and methods. This is further enriched by the ability to specify private, protected and public scopes. We consider that, if the type of a UML model element is a class or a component, then the model element corresponds to owl:Class. For example, from the RIM class figure above we can take four classes and create the corresponding ontology classes: Entity, Role, Participation, and Act.
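The class-lifting rule above can be sketched as a small function that emits Turtle. This is a minimal sketch under stated assumptions: the input is a hypothetical list of dictionaries (not the MIF format), and the rim namespace IRI is a placeholder.

```python
OWL_PREFIX = "@prefix owl: <http://www.w3.org/2002/07/owl#> ."
RIM_PREFIX = "@prefix rim: <http://example.org/rim#> ."  # hypothetical namespace


def lift_classes(elements):
    """Return Turtle declaring an owl:Class for each UML class or component."""
    lines = [OWL_PREFIX, RIM_PREFIX, ""]
    for element in elements:
        # Only classes and components are lifted; attributes are skipped here.
        if element["type"] in ("class", "component"):
            lines.append(f"rim:{element['name']} a owl:Class .")
    return "\n".join(lines)


# Hypothetical, simplified view of the model elements in the RIM figure.
uml_elements = [
    {"name": "Entity", "type": "class"},
    {"name": "Role", "type": "class"},
    {"name": "Participation", "type": "class"},
    {"name": "Act", "type": "class"},
    {"name": "classCode", "type": "attribute"},
]

class_turtle = lift_classes(uml_elements)
print(class_turtle)
```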
UML-Attribute to OWL-Property
One of the major foundational differences between UML and ontology languages is the ontology notion of a relation, predicate, or property. An ontology property appears, at first glance, to be the same as a UML association or attribute. However, properties in an ontology are first-class modelling elements, while a UML association or attribute is attached to the UML classes where it is described. This means a UML association or attribute cannot exist in isolation or as a self-describing entity defining relationships such as inheritance. More precisely, in an ontology a relation can exist without specifying any classes to which it might relate. In UML, an association is defined in terms of association ends, which must be related to classes, and an attribute is always within the context of a single class.
OWL distinguishes two kinds of properties: object properties and data type properties. Their domain is always a class. The range of an object property is a class, while the range of a data type property is a data type. For example, in the RIM class figure above the classCode attribute within the Entity class has the type CS, which is a class. Additionally, the classCode attribute has a cardinality constraint restricting it to exactly one value from the CS type (CS here represents a set of codes described in the RIM vocabulary domain). There are two design options for representing the classCode attribute as an OWL object property. The first option uses the restriction feature of the OWL language:
Class: Entity
    SubClassOf: entity.classCode max 1 CS, entity.classCode min 1 CS
In this case the restriction is applied to the Entity class. Note that, in addition to this restriction, classCode must first be declared as an object property with domain Entity and range CS. The second option relies on OWL 2 allowing restrictions (i.e., minimum and maximum cardinality) within the property definition, as below:
ObjectProperty: entity.classCode
    Domain: Entity
    Range: entity.classCode max 1 CS, entity.classCode min 1 CS
Note that in the first option the cardinality restriction is local to the Entity class, whereas in the second option the restriction on the entity.classCode object property is universal. Similarly, other attributes (e.g., determinerCode, code) of the Entity class can be lifted as OWL object properties. Additionally, the two lower elements (ActRelationship and RoleLink) in the RIM class figure are UML classes as per the standard specification. However, the semantics of ActRelationship and RoleLink are much closer to those of an ontology property, as their purpose is to relate instances of the Act and Role classes.
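For readers who prefer an RDF serialisation, the first option (the restriction local to the Entity class) could be written in Turtle roughly as generated below. This is a sketch, not the authors' tooling: the rim namespace IRI is a placeholder, and the helper names object_property and qualified_cardinality are hypothetical. The OWL 2 terms owl:minQualifiedCardinality, owl:maxQualifiedCardinality and owl:onClass are the standard RDF encoding of "min n C" and "max n C".

```python
PREFIXES = """\
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix rim:  <http://example.org/rim#> .
"""  # the rim namespace is a hypothetical placeholder


def object_property(prop, domain, range_):
    """Turtle declaring an object property with a domain and a range."""
    return (
        f"{prop} a owl:ObjectProperty ;\n"
        f"    rdfs:domain {domain} ;\n"
        f"    rdfs:range {range_} ."
    )


def qualified_cardinality(subject, prop, kind, n, filler):
    """Turtle for `subject SubClassOf: prop <kind> n filler` (kind: min or max)."""
    predicate = {
        "min": "owl:minQualifiedCardinality",
        "max": "owl:maxQualifiedCardinality",
    }[kind]
    return (
        f"{subject} rdfs:subClassOf [\n"
        f"    a owl:Restriction ;\n"
        f"    owl:onProperty {prop} ;\n"
        f'    {predicate} "{n}"^^xsd:nonNegativeInteger ;\n'
        f"    owl:onClass {filler}\n"
        f"] ."
    )


axioms = [
    # The property must be declared before it is restricted.
    object_property("rim:entity.classCode", "rim:Entity", "rim:CS"),
    qualified_cardinality("rim:Entity", "rim:entity.classCode", "max", 1, "rim:CS"),
    qualified_cardinality("rim:Entity", "rim:entity.classCode", "min", 1, "rim:CS"),
]
restriction_turtle = PREFIXES + "\n" + "\n\n".join(axioms)
print(restriction_turtle)
```

A min 1 and max 1 pair on the same property and filler is equivalent to a single exact cardinality of 1; the two axioms are kept separate here only to mirror the Manchester syntax above.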
UML-Association to OWL-Property
A UML association represents a relationship between classes. If a property is a part of the ends of an association and the type of the property is a class, then the property is mapped to owl:ObjectProperty. For example, in the RIM class figure above, playedRoleIn is an association between the Role and the Participation classes. In this case, the “Domain” and “Range” will be the classes related to the association.
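The association rule can be sketched in the same style. The helper lift_association is hypothetical, and the direction chosen for playedRoleIn (domain Role, range Participation) is an assumption, since the text only states that the association relates the two classes. Prefix declarations for owl, rdfs and a placeholder rim namespace are assumed to be emitted elsewhere.

```python
def lift_association(name, source_class, target_class, prefix="rim"):
    """Return Turtle declaring a UML association as an owl:ObjectProperty.

    The class at the source end becomes the domain and the class at the
    target end becomes the range.
    """
    return (
        f"{prefix}:{name} a owl:ObjectProperty ;\n"
        f"    rdfs:domain {prefix}:{source_class} ;\n"
        f"    rdfs:range {prefix}:{target_class} ."
    )


# Assumed direction: from Role to Participation.
played_role_in = lift_association("playedRoleIn", "Role", "Participation")
print(played_role_in)
```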
UML-RIM Vocabulary to OWL-RIM Vocabulary
As discussed, for properties such as classCode the range is CS. Each coded attribute in the RIM (i.e., one of type CD, CE, CR, CS or CV) is associated with one vocabulary domain. Approximately 1500-1800 types of codes and concepts are described within HL7 Version 3. This vocabulary domain should be built as a separate ontology so that OWL-RIM object properties (like classCode) can use it for range definitions.
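One possible shape for such a separate vocabulary ontology is sketched below. The modelling choice is an assumption, not something this document prescribes: each vocabulary domain becomes a subclass of CS and each code becomes an individual of that class. The voc namespace and the function name are hypothetical, and the three EntityClass codes are only a small illustrative sample.

```python
VOC_NS = "http://example.org/hl7-vocabulary#"  # hypothetical namespace


def vocabulary_ontology(domain, codes):
    """Return Turtle for one vocabulary domain, modelled as a subclass of CS.

    `codes` maps each code to a human-readable label; every code becomes
    an individual of the domain class.
    """
    lines = [
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
        f"@prefix voc: <{VOC_NS}> .",
        "",
        "voc:CS a owl:Class .",
        f"voc:{domain} a owl:Class ; rdfs:subClassOf voc:CS .",
    ]
    for code, label in codes.items():
        lines.append(f'voc:{domain}_{code} a voc:{domain} ; rdfs:label "{label}" .')
    return "\n".join(lines)


# Illustrative sample of codes for the EntityClass vocabulary domain.
entity_class_codes = {"PSN": "person", "ORG": "organization", "PLC": "place"}
vocab_turtle = vocabulary_ontology("EntityClass", entity_class_codes)
print(vocab_turtle)
```

With this layout, the range of an OWL-RIM property such as classCode could point at the class in the vocabulary ontology rather than at a class defined inside the OWL-RIM itself.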
UML-DMIM/RMIM to OWL-DMIM/RMIM
As with the OWL-RIM, we need to define a methodology for building DMIM/RMIM equivalent ontologies. The scope of a DMIM/RMIM ontology would be local to the domain or institution implementing it. The methodology should describe (1) how to reuse existing models through automated methods (e.g., XSLT rules); and (2) how to build local DMIM/RMIM ontologies from scratch.
HL7 Version 3 Ontologies: Envisioned Importing Structure
The figure below shows an arrangement of HL7 Version 3 ontologies into global and local spaces.
The binding arrow in the above importing structure and its implications are discussed separately in the terminfo use cases.
Based on the guidelines discussed above, the next steps are:
- Lifting mechanism for MIF/XSD to OWL ontologies
- Layering mechanism to arrange and integrate global (RIM) and local ontologies (DMIM, RMIM)
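A layering mechanism of this kind could be expressed with owl:imports. The sketch below is an assumption about one plausible arrangement (a local RMIM ontology imports its DMIM, which imports the global RIM, which imports the vocabulary); it is not a transcription of the figure. The base IRI and the ontology names are hypothetical.

```python
BASE = "http://example.org/hl7v3/"  # hypothetical base IRI

# Assumed layering: each ontology lists the ontologies it imports directly.
IMPORTS = {
    "vocabulary": [],
    "rim": ["vocabulary"],                       # global space
    "dmim-lab": ["rim"],                         # local space
    "rmim-lab-observation-order": ["dmim-lab"],  # local space
}


def ontology_header(name):
    """Return the Turtle ontology header with one owl:imports per direct import."""
    lines = [f"<{BASE}{name}> a owl:Ontology"]
    for imported in IMPORTS[name]:
        lines.append(f"    ; owl:imports <{BASE}{imported}>")
    return "\n".join(lines) + " ."


def import_closure(name):
    """All ontologies reachable from `name` through owl:imports, nearest first."""
    seen = []
    queue = list(IMPORTS[name])
    while queue:
        current = queue.pop(0)
        if current not in seen:
            seen.append(current)
            queue.extend(IMPORTS[current])
    return seen


print(ontology_header("rim"))
print(import_closure("rmim-lab-observation-order"))
```

Because owl:imports is transitive, a tool loading the local RMIM ontology would also load the DMIM, the RIM and the vocabulary, which is what import_closure computes.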