HCLS/ClinicalObservationsInteroperability/FDATherapeuticAreaOntologies/TAFile

From W3C Wiki

Use

.TA (pronounced "dot T A") files capture the definition of a Therapeutic Area Ontology or of a library of functions shared between Therapeutic Area Ontologies. The compiler (e.g. TAnode, www.w3.org/2013/12/FDA-TA/util/TAnode.js) parses the .TA file and a definitions table to produce an ontology written in Turtle:

NODE_PATH=util node util/TAnode.js qualityOfLife.ta qualityOfLife-definitions.xlsx > qualityOfLife.ttl

Definitions file

Accompanying each .ta file is a definitions table. TAnode (invoked above) reads the file extension to decide which of the following parsers to use to read the supplied definitions table:

  • .csv Comma-separated values
  • .tsv Tab-separated values
  • .xlsx Office Open XML spreadsheet, a zipped package of XML files.

In the comma- or tab-separated files, newlines and tabs may be escaped as \n or \t, but escaping is not required.
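The extension-based choice of parser and the optional escapes could look roughly like the following. This is a minimal sketch, not TAnode's actual code: the names PARSERS, pickParser and unescapeCell are hypothetical, and matching the extension case-insensitively is an assumption.

```javascript
// Hypothetical mapping from file extension to parser name.
const PARSERS = { ".csv": "csv", ".tsv": "tsv", ".xlsx": "xlsx" };

// Pick a parser name from the extension of the definitions table.
// Assumption: the extension is matched case-insensitively.
function pickParser(filename) {
  const dot = filename.lastIndexOf(".");
  const ext = dot === -1 ? "" : filename.slice(dot).toLowerCase();
  if (!Object.prototype.hasOwnProperty.call(PARSERS, ext)) {
    throw new Error("unsupported definitions table: " + filename);
  }
  return PARSERS[ext];
}

// Turn the two-character escapes \n and \t in a cell back into
// a real newline and a real tab.
function unescapeCell(cell) {
  return cell.replace(/\\([nt])/g, (_, c) => (c === "n" ? "\n" : "\t"));
}

console.log(pickParser("qualityOfLife-definitions.xlsx")); // xlsx
```

Cells without escapes pass through unescapeCell unchanged, which matches the statement that escaping is optional.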

The table must include one row with the following headers:

Name | Definition | Definition Source | Code | Code System | Code Extension | See Also

where Code Extension is either empty or one of:

  • < create a subclass of the Code/Code System concept.
  • = use the Code/Code System concept directly.
  • > create a superclass of the Code/Code System concept.

(The last option, >, has not been tested.)
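A definitions table can be sanity-checked against these rules before it is handed to the compiler. The sketch below is illustrative only: missingHeaders and codeExtensionKind are hypothetical helpers, and the labels returned for each Code Extension value are names chosen here, not terms used by TAnode.

```javascript
// The headers the definitions table must provide, as listed above.
const REQUIRED_HEADERS = [
  "Name", "Definition", "Definition Source",
  "Code", "Code System", "Code Extension", "See Also",
];

// Labels for the permitted Code Extension values (labels are hypothetical).
const CODE_EXTENSIONS = {
  "": "unspecified", // empty cell
  "<": "subclass",   // create a subclass of the Code/Code System concept
  "=": "direct",     // use the Code/Code System concept directly
  ">": "superclass", // create a superclass (reported as untested)
};

// Return the required headers that are absent from a header row.
function missingHeaders(headerRow) {
  const present = new Set(headerRow.map((h) => h.trim()));
  return REQUIRED_HEADERS.filter((h) => !present.has(h));
}

// Map a Code Extension cell to its label, rejecting anything else.
function codeExtensionKind(value) {
  const key = value.trim();
  if (!Object.prototype.hasOwnProperty.call(CODE_EXTENSIONS, key)) {
    throw new Error("invalid Code Extension: " + JSON.stringify(value));
  }
  return CODE_EXTENSIONS[key];
}

console.log(missingHeaders(["Name", "Definition", "Code"]).join(", "));
// Definition Source, Code System, Code Extension, See Also
```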

Structure

The TA grammar begins with a declaration of the Therapeutic Area Ontology:

TA: Anticoagulants

or shared library:

LIBRARY: hematapoietic

This ontology may import any number of other ontologies, defined either in .ta files or in Turtle:

IMPORT: "core.ttl" AS: core:
IMPORT: "hematapoietic.ta" AS: hema:

The first argument is the name of the source and the second is the namespace prefix to which it will be assigned.
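To make the two arguments concrete, an IMPORT line can be split with a regular expression. The pattern below is inferred from the two example lines only and may be stricter or looser than the real TA grammar; parseImport and the kind field are hypothetical.

```javascript
// Assumed shape, inferred from the examples: IMPORT: "<source>" AS: <prefix>:
const IMPORT_LINE = /^IMPORT:\s*"([^"]+)"\s+AS:\s*([A-Za-z][\w-]*):\s*$/;

// Split an IMPORT line into its source and namespace prefix.
// Returns null when the line does not look like an import.
function parseImport(line) {
  const m = IMPORT_LINE.exec(line.trim());
  if (!m) return null;
  const source = m[1];
  // Guess the kind of source from its extension (an assumption of this sketch).
  const kind = source.endsWith(".ta") ? "ta"
    : source.endsWith(".ttl") ? "turtle"
    : "unknown";
  return { source: source, prefix: m[2], kind: kind };
}

const imp = parseImport('IMPORT: "hematapoietic.ta" AS: hema:');
console.log(imp.source + " -> " + imp.prefix + " (" + imp.kind + ")");
// hematapoietic.ta -> hema (ta)
```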

Next the grammar captures the information flow of data conforming to a Therapeutic Area Ontology, in particular:

  1. A Therapeutic Area Ontology includes a set of safety and efficacy endpoints.
  2. Each endpoint is a template for analysis of the impact of an intervention (a study material or procedure) on a corpus of subjects. The impact is captured in an outcome assessment which references observations from before and after the intervention.
  3. Each measured observation may be another assessment or a single observation, which is one of: diagnostic procedure, quantitative measurement, or sign or symptom.
  4. Assessments in turn reference other assessments or single observations. Every chain of observations is expected to end with a single observation.
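The expectation that every chain ends with a single observation can be pictured as a walk over a small reference graph. The sketch below is a toy model under stated assumptions: the graph is written by hand rather than read from a .ta file, namespace prefixes are dropped, and the function name leaves is hypothetical.

```javascript
// Toy model: each assessment maps to the names it references.
const graph = new Map([
  ["NonCNSSystemEmbolismOutcomeAssessment", ["NonCNSSystemEmbolismAssessment"]],
  ["NonCNSSystemEmbolismAssessment", ["ScintigraphyObservation", "AngiographyObservation"]],
]);
const singles = new Set(["ScintigraphyObservation", "AngiographyObservation"]);

// Return the single observations reachable from `name`.
// Throws if a chain loops back on itself or ends in something
// that is neither a known assessment nor a single observation.
function leaves(name, graph, singles, path = []) {
  if (singles.has(name)) return [name];
  if (path.includes(name)) {
    throw new Error("cycle: " + path.concat(name).join(" -> "));
  }
  const refs = graph.get(name);
  if (!refs || refs.length === 0) {
    throw new Error("chain does not end in a single observation: " + name);
  }
  return refs.flatMap((r) => leaves(r, graph, singles, path.concat(name)));
}

console.log(leaves("NonCNSSystemEmbolismOutcomeAssessment", graph, singles).join(", "));
// ScintigraphyObservation, AngiographyObservation
```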

Declarations of the types named above can be seen in examples from two files:

Anticoagulants.ta:

EFFICACY: TimeToNonCNSSystemEmbolismEndpoint @"Time to the first occurrence of Non-CNS systemic embolism (SEE)"
   NonCNSSystemEmbolismOutcomeAssessment({hema:NonCNSSystemEmbolismAssessment})

hematapoietic.ta:

ASSESSMENT: NonCNSSystemEmbolismAssessment @"CRF: Non-CNS systemic embolism (SEE)"
                                           ScintigraphyObservation
                                           AngiographyObservation

DIAGPROC: ScintigraphyObservation "Scintigraphy" """Nuclear medicine imaging procedure"""
DIAGPROC: AngiographyObservation

In the above example, Anticoagulants imports the hematapoietic library and references an assessment called NonCNSSystemEmbolismAssessment. The declaration for TimeToNonCNSSystemEmbolismEndpoint uses a definition in the accompanying definitions file, as indicated by @"Time to … (SEE)". Each declaration may include a reference to external definitions, include the definitions inline, or omit definitions entirely. The declaration for ScintigraphyObservation includes a label (in double quotes) and a definition (in triple double quotes), while the declaration for AngiographyObservation has neither.

Because assessments and single observations can both appear in the same places, references to assessments are enclosed in {}s.
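The three definition styles and the {} convention can be recognised with simple patterns. This is a rough sketch based only on the example declarations above, not on the TA grammar itself; definitionStyle and assessmentRefs are hypothetical names.

```javascript
// Classify how a declaration supplies its definition:
//   "external" - an @"..." reference into the definitions table
//   "inline"   - a quoted label and/or a triple-quoted definition
//   "none"     - no definition at all
function definitionStyle(declaration) {
  if (/@"[^"]*"/.test(declaration)) return "external";
  if (/"""[\s\S]*?"""/.test(declaration) || /"[^"]*"/.test(declaration)) {
    return "inline";
  }
  return "none";
}

// Collect assessment references, i.e. the names enclosed in {}s.
function assessmentRefs(text) {
  return Array.from(text.matchAll(/\{([^{}]+)\}/g), (m) => m[1].trim());
}

console.log(definitionStyle("DIAGPROC: AngiographyObservation")); // none
console.log(assessmentRefs(
  "NonCNSSystemEmbolismOutcomeAssessment({hema:NonCNSSystemEmbolismAssessment})"
).join(", "));
// hema:NonCNSSystemEmbolismAssessment
```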

Creation process

When confronted with a new CMap (or analogous source description of the interactions of concepts), one must

  • Identify the roles (one or more) represented by each node in the CMap.
  • Identify the terms likely to be shared with other Therapeutic Area Ontologies and add or find those declarations in the appropriate library files.
  • Add the SAFETY: and EFFICACY: declarations, including all of the referenced declarations until all the leaves are single observations.
  • Run the compiler and look for errors and warnings. These messages will include the line number on which the condition was found.
  • Edit and iterate until the remaining warnings are acceptable (e.g. unreferenced definitions for concepts which don't appear in the CMap).

The TAnode compiler will complain about:

  • Parsing errors. These are unrecoverable errors indicating that the compiler was not able to parse the supplied .TA file or definitions file.
  • Terms that don't explicitly include their role, or that include a role contrary to what the grammar implies (e.g. a reference inside {}s that doesn't end with "Assessment"). Note that Endpoints, OutcomeAssessments, Assessments, etc. are labeled above. The subtypes of single observation need not be labeled.
  • Definition references (@"...") with no corresponding Name in the definitions table.
  • Unreferenced definitions.
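The last three checks amount to set comparisons and a suffix test. The following is a minimal sketch of that idea, not the compiler's implementation; checkDefinitions and bracedRefHasAssessmentRole are hypothetical names, and stripping a namespace prefix before the suffix test is an assumption.

```javascript
// Compare the @"..." references found in a .ta file with the Name
// column of the definitions table.
function checkDefinitions(referenced, tableNames) {
  const refs = new Set(referenced);
  const names = new Set(tableNames);
  return {
    // references with no corresponding Name in the table
    missing: Array.from(refs).filter((r) => !names.has(r)),
    // table rows that nothing refers to
    unreferenced: Array.from(names).filter((n) => !refs.has(n)),
  };
}

// Role check for a reference inside {}s: the local name (after any
// namespace prefix) is expected to end with "Assessment".
function bracedRefHasAssessmentRole(ref) {
  const colon = ref.indexOf(":");
  const local = colon === -1 ? ref : ref.slice(colon + 1);
  return local.endsWith("Assessment");
}

const report = checkDefinitions(["A", "B"], ["B", "C"]);
console.log(report.missing.join(", "));      // A
console.log(report.unreferenced.join(", ")); // C
```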