HCLSIG/cabig/protocol matching

caBIG Protocol Matching Components

This page describes a set of components that utilize Semantic Web technology and can be composed to support use cases articulated by members of the caBIG Clinical Trials Management Systems (CTMS) workspace.

Components Overview

Components

  • Eligibility Criteria Representation
  • Eligibility Criteria Authoring Environment
  • Semantic Query Translator
  • ORM2RDF Adapter
  • caBIG SPARQL & Linked Data Plugin
  • SPARQL Access Control
  • Trial Matching Component
  • Eligibility Criteria Service and Repository
  • Knowledge Base Service

Eligibility Criteria Representation (ECR)

A representation of patient eligibility criteria that

  • can be effectively queried using SPARQL
  • can be efficiently evaluated over realistically-sized sets of patient and protocol data
  • enables validation and explanation of matches
  • allows integration with multiple terminologies (e.g. local drug terminologies)
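As a rough illustration of what "queried using SPARQL" could look like for a single criterion, the sketch below models one numeric criterion as a small data structure and renders it as a SPARQL SELECT query. This is only a sketch: the `ex:` namespace, the `ageInYears` property, and the `NumericCriterion` class are hypothetical and are not taken from any caBIG or BRIDG model.

```python
from dataclasses import dataclass

# Hypothetical namespace, used only for illustration.
PREFIX = "PREFIX ex: <http://example.org/trial#>"

ALLOWED_OPERATORS = {"<", "<=", ">", ">=", "="}


@dataclass(frozen=True)
class NumericCriterion:
    """One criterion of the form: observation <operator> threshold."""
    observation: str   # local name of a hypothetical ex: property
    operator: str      # one of ALLOWED_OPERATORS
    threshold: float

    def to_sparql(self) -> str:
        """Render the criterion as a query that selects matching patients."""
        if self.operator not in ALLOWED_OPERATORS:
            raise ValueError(f"unsupported operator: {self.operator}")
        return (
            f"{PREFIX}\n"
            "SELECT ?patient WHERE {\n"
            f"  ?patient ex:{self.observation} ?value .\n"
            f"  FILTER (?value {self.operator} {self.threshold})\n"
            "}"
        )


criterion = NumericCriterion("ageInYears", ">=", 18)
print(criterion.to_sparql())
```

Keeping the criterion as data rather than as a hand-written query string is what would make validation and explanation of matches feasible, since the same structure can be inspected as well as executed.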

Eligibility Criteria Authoring Environment (ECAE)

A Web-based UI that supports an NLP-assisted method for translating textual eligibility criteria into the structured Eligibility Criteria Representation.

Features of the authoring environment include:

  • criteria translation: translating text to structured representation
  • criteria validation: checking that the criteria are sound and complete and do not conflict with other criteria; testing against sample data sets
  • criteria query & navigation: finding existing criteria based on concepts or relationships between criteria
  • terminology mapping: selecting or creating terminology mappings to be used during validation
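One simple form of conflict checking can be sketched as follows, assuming criteria have already been translated into numeric ranges over named variables. The `RangeCriterion` class and the variable names are hypothetical; real criteria would be richer, and this is not a description of how the authoring environment performs validation.

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class RangeCriterion:
    """Hypothetical structured criterion: low <= variable <= high."""
    variable: str
    low: float = -math.inf
    high: float = math.inf


def find_conflicts(criteria):
    """Return the variables whose combined inclusion ranges are empty."""
    bounds = {}
    for c in criteria:
        low, high = bounds.get(c.variable, (-math.inf, math.inf))
        # Intersect the running range with this criterion's range.
        bounds[c.variable] = (max(low, c.low), min(high, c.high))
    return sorted(v for v, (low, high) in bounds.items() if low > high)


criteria = [
    RangeCriterion("age", low=18),           # age must be at least 18
    RangeCriterion("age", high=16),          # age must be at most 16
    RangeCriterion("creatinine", high=1.5),  # no conflicting criterion
]
print(find_conflicts(criteria))  # ['age']
```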

Semantic Query Translator (SQT)

A component that translates SPARQL queries between RDF-based ontologies using mapping rules that are specified as SPARQL CONSTRUCT queries. This component could potentially be exposed as a service, allowing clients to specify mapping rules on demand.
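The rewriting idea can be sketched in a few lines, under strong simplifying assumptions. Each rule below mirrors a CONSTRUCT query: `head` stands for the CONSTRUCT template (one triple pattern in the source vocabulary) and `body` for the WHERE clause (patterns in the target vocabulary). Triple patterns are plain tuples rather than parsed SPARQL, and the `bridg:` and `c3pr:` property names are hypothetical.

```python
from itertools import count

# A triple pattern is a (subject, predicate, object) tuple of strings;
# variables start with "?". All property names here are hypothetical.
RULES = [
    {
        "head": ("?s", "bridg:birthDate", "?o"),
        "body": [("?s", "c3pr:demographics", "?d"), ("?d", "c3pr:dob", "?o")],
    },
]


def is_var(term):
    return term.startswith("?")


def translate(patterns, rules):
    """Rewrite source-vocabulary patterns into target-vocabulary patterns."""
    fresh = count(1)
    out = []
    for pattern in patterns:
        for rule in rules:
            head = rule["head"]
            if head[1] != pattern[1]:
                continue
            # Bind the head's subject and object to the query's terms.
            binding = {head[0]: pattern[0], head[2]: pattern[2]}
            # Give body-only variables fresh names to avoid capture.
            for triple in rule["body"]:
                for term in triple:
                    if is_var(term) and term not in binding:
                        binding[term] = f"?_m{next(fresh)}"
            out.extend(
                tuple(binding.get(t, t) for t in triple)
                for triple in rule["body"]
            )
            break
        else:
            # No rule applies; keep the pattern unchanged.
            out.append(pattern)
    return out


query = [("?p", "bridg:birthDate", "?dob"), ("?p", "rdf:type", "bridg:Person")]
for triple in translate(query, RULES):
    print(" ".join(triple), ".")
```

The sketch matches rules on the predicate only and assumes the head's subject and object are variables. A full translator would also need to handle filters, optional patterns, and terminology mappings.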

ORM2RDF Adapter

Provides an RDF adapter layer for systems that use object-relational mapping tools for data management. The initial implementation should enable translation from SPARQL to Hibernate Query Language (HQL) and serialization of the materialized Java objects to RDF using a flexible mapping of a UML-based, object-oriented model to an RDF-based ontology. Ideally, the adapter would translate SPARQL to an intermediate form (e.g. the monoid comprehension calculus or the ODMG's Object Query Language) and then to HQL so that other persistence strategies (e.g. XML) that have object-oriented mappings (e.g. Castor, JAXB) can be accommodated in the future.
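The serialization half of the adapter can be illustrated with a minimal sketch. It is written in Python for brevity, whereas the component described above targets Java objects managed by Hibernate. The `Participant` class, the `MAPPING` table, and all `example.org` IRIs are hypothetical stand-ins for a UML-derived class and its mapping to an ontology; only the `rdf:type` and `xsd:integer` IRIs are standard.

```python
from dataclasses import dataclass

BASE = "http://example.org/"  # hypothetical base IRI
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


@dataclass
class Participant:
    """Stand-in for an object materialized by the ORM layer."""
    id: int
    family_name: str
    birth_year: int


# Hypothetical mapping from class attributes to ontology terms.
MAPPING = {
    Participant: {
        "class_iri": BASE + "onto#Participant",
        "properties": {
            "family_name": BASE + "onto#familyName",
            "birth_year": BASE + "onto#birthYear",
        },
    },
}


def literal(value):
    """Render a Python value as an N-Triples literal (ints and strings only)."""
    if isinstance(value, int):
        return f'"{value}"^^<{XSD_INTEGER}>'
    escaped = (
        str(value)
        .replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def to_ntriples(obj):
    """Serialize one mapped object as N-Triples, one statement per line."""
    mapping = MAPPING[type(obj)]
    subject = f"<{BASE}{type(obj).__name__.lower()}/{obj.id}>"
    lines = [f"{subject} <{RDF_TYPE}> <{mapping['class_iri']}> ."]
    for attr, prop in mapping["properties"].items():
        lines.append(f"{subject} <{prop}> {literal(getattr(obj, attr))} .")
    return "\n".join(lines)


print(to_ntriples(Participant(id=7, family_name="Doe", birth_year=1970)))
```

Keeping the mapping in a separate table, rather than in the classes themselves, is one way to get the flexibility the description asks for: the same objects could be serialized against a different ontology by swapping the table.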

<<reference SPOON and EMFTriple work>>

caBIG SPARQL & Linked Data Plugin (SW Plugin)

Uses the ORM2RDF Adapter to provide SPARQL and Linked Data interfaces to caBIG systems that have been built using the caCORE SDK. This SDK provides O-R mapping artifacts and mappings of UML models to NCI Thesaurus concepts, which could potentially be used to generate the mapping information needed by the ORM2RDF Adapter. It should be possible to drop this plugin into any existing caBIG data source and have it work (perhaps at limited capacity) with no additional configuration.

SPARQL Access Control

<<Need more information about how Eric's approach works. Ideally, it would leverage an XACML-based architecture (e.g. PDP, PEP, etc.). Refer to the XSPA XACML profile documentation.>>

Trial Matching Component (TMC)

A Java component that implements the functionality defined by the HL7 Clinical Research Filtered Query Service Functional Model (CRFQ SFM). It should be easy to wrap this component in a SOAP-based implementation of the CRFQ SFM. It uses the Eligibility Criteria Service and the Knowledge Base Service to implement most of its functionality.

Eligibility Criteria Service and Repository (ECS)

This service supports management of a sharable, secure repository of eligibility criteria. It supports the query and navigation features that the Eligibility Criteria Authoring Environment exposes. Ideally, the initial implementation would be populated with the CDISC ASPIRE criteria.

Knowledge Base (KB) Service

This service extends the SPARQL and OWLlink service interfaces to enable provisioning of on-demand triple-store and reasoner resources on the Web. Other components (e.g. the Eligibility Criteria Authoring Environment, the Trial Matching Component, and the Eligibility Criteria Service and Repository) use it to create dynamic RDF data warehouses that have configurable query and reasoning capabilities.

Usage Scenarios

  1. The ECAE retrieves protocols from caBIG data sources, for example C3PR and NCI Enterprise Services (NES). These data sources publish their information models as RDF-based ontologies, and each uses a Hibernate-based object-relational mapping (ORM) layer to manage its data. They therefore need the ORM2RDF Adapter to translate SPARQL queries to HQL and to serialize Java objects to RDF that conforms to their information models. The ECAE, however, presents only a BRIDG-based view to the user, so the SQT service must translate SPARQL queries from BRIDG to the C3PR and NES information models. The mappings must also accommodate different terminology bindings, and so must use terminology mappings when translating queries.
  2. The ECAE negotiates with the KB Service Factory to allocate KB resources for storing protocol information. A KB service resource is allocated, and the protocol data is stored in it.
  3. <<working>>