HCLS/UncertaintyUseCases

From W3C Wiki

Uncertainty Use Cases Encountered in HCLS

  • Hypothesis Uncertainty
    • Mutations in the alpha-synuclein gene could cause Parkinson's disease
    • Hypotheses of relationships based on statistical analysis of microarray data associated with p-values, confidence intervals, etc.
    • Gene Ontology Evidence codes in support of a particular GO annotation of a gene
    • Evidence classes in the OBO Evidence Ontology
  • Interpretation/Classification Uncertainty
    • The patient has elevated cholesterol based on his reading of X mg/dl
    • Given the same set of symptoms, Doctors X and Y arrive at diagnoses of mild and severe disease, respectively
    • True/False Positive/Negative rates of patient classifications and diagnoses. Use of measures such as Precision, Recall, PPV, NPV, etc.
  • Prediction-oriented Uncertainty
    • A person with a BRCA1 gene mutation has a 70% probability of developing Breast Cancer in the future
  • Belief-oriented Uncertainty
    • It is believed to the best of our knowledge that a particular gene is not implicated in a particular disease
    • Associated non-monotonicity with the above, i.e., if more knowledge becomes available, the statement could be proven false.
  • Data Source based Uncertainty
    • Samples from the same patient are analyzed by different labs. Lab 1 results show an 80% probability of Disease 1, whereas Lab 2 results show a 90% probability of the same disease.
    • If the Cleveland Clinic says that Avandia is bad for Diabetes, the statement carries a higher degree of certainty than the same statement made by an individual Dr. X
  • Data Uncertainty
    • Approximate location of a clinical feature, e.g., the spatial location of a tumor in the human body as captured in a radiological image or any other digital artifact
    • Data inconsistency and incompleteness encountered in Healthcare and Drug Databases
    • Data uncertainty introduced due to sampling errors, sampling rates, etc.
    • Data uncertainty introduced due to the limitations (least count error?) of the device measuring patient characteristics (e.g., temperature)
    • Data uncertainty introduced due to limitation of instruments used to collect experimental data, e.g., micro-arrays
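
The classification measures named under Interpretation/Classification Uncertainty (Precision, Recall, PPV, NPV) can all be derived from the four true/false positive/negative counts. The following is a minimal sketch only: the class name, function names, and the patient counts are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of classification outcomes for a diagnostic test (hypothetical type)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def _ratio(numerator: int, denominator: int) -> Optional[float]:
    # Return None instead of dividing by zero when a measure is undefined.
    return numerator / denominator if denominator else None


def classification_measures(c: ConfusionCounts) -> Dict[str, Optional[float]]:
    return {
        # Precision and PPV are the same quantity: TP / (TP + FP).
        "precision_ppv": _ratio(c.tp, c.tp + c.fp),
        # Recall (sensitivity): TP / (TP + FN).
        "recall_sensitivity": _ratio(c.tp, c.tp + c.fn),
        # Specificity: TN / (TN + FP).
        "specificity": _ratio(c.tn, c.tn + c.fp),
        # NPV: TN / (TN + FN).
        "npv": _ratio(c.tn, c.tn + c.fn),
    }


# Illustrative counts only, not real data.
counts = ConfusionCounts(tp=80, fp=20, tn=890, fn=10)
measures = classification_measures(counts)
```

With these made-up counts, precision (PPV) is 80/100 and NPV is 890/900; unlike recall and specificity, both depend on how common the disease is in the tested population.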

Patterns identified within the Use Cases

  • Belief statements made by researchers: interpretations, hypotheses, classification models
  • Data analytic uncertainties: sampling or machine induced
  • Data and metadata omission (too open-world): absence of relevant time and location information

Proposed Solutions

  • Thresholding issues
  • Modal logics of belief and knowledge: KD45 and S5
  • Named Graphs and RDF-based approaches
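
As a rough illustration of the Named Graphs idea, each source's assertion can be placed in its own graph, and the uncertainty can then be stated about the graph rather than about the triple itself. The sketch below uses plain Python tuples instead of an RDF library; the `ex:` identifiers, the `ex:assertedBy` and `ex:probability` properties, and the `ex:meta` graph are all assumptions made up for this example, using the two-lab scenario from the Data Source based Uncertainty use case.

```python
from typing import Dict, List, NamedTuple, Tuple


class Quad(NamedTuple):
    """A triple plus the name of the graph that contains it."""
    subject: str
    predicate: str
    obj: str
    graph: str


META_GRAPH = "ex:meta"  # hypothetical graph holding statements about other graphs

quads = [
    # The same diagnosis asserted in two different named graphs, one per lab.
    Quad("ex:patient1", "ex:hasDiagnosis", "ex:Disease1", "ex:graphLab1"),
    Quad("ex:patient1", "ex:hasDiagnosis", "ex:Disease1", "ex:graphLab2"),
    # Provenance and probability attached to each graph.
    Quad("ex:graphLab1", "ex:assertedBy", "ex:Lab1", META_GRAPH),
    Quad("ex:graphLab1", "ex:probability", "0.8", META_GRAPH),
    Quad("ex:graphLab2", "ex:assertedBy", "ex:Lab2", META_GRAPH),
    Quad("ex:graphLab2", "ex:probability", "0.9", META_GRAPH),
]


def graph_metadata(all_quads: List[Quad], graph: str) -> Dict[str, str]:
    """Collect the metadata statements made about one named graph."""
    return {
        q.predicate: q.obj
        for q in all_quads
        if q.graph == META_GRAPH and q.subject == graph
    }


def assertions_with_uncertainty(
    all_quads: List[Quad], subject: str, predicate: str, obj: str
) -> List[Tuple[str, float]]:
    """Return (source, probability) for every graph asserting the given triple."""
    results = []
    for q in all_quads:
        if (q.subject, q.predicate, q.obj) != (subject, predicate, obj):
            continue
        meta = graph_metadata(all_quads, q.graph)
        if "ex:assertedBy" in meta and "ex:probability" in meta:
            results.append((meta["ex:assertedBy"], float(meta["ex:probability"])))
    return results


lab_views = assertions_with_uncertainty(
    quads, "ex:patient1", "ex:hasDiagnosis", "ex:Disease1"
)
```

Here `lab_views` keeps both labs' probabilities side by side instead of forcing them into a single value, which leaves the question of how to reconcile them to the consumer of the data.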

Some Thoughts on the Above

(Please feel free to delete these or add them to the main body)

  • Much of clinical research only produces uncertain knowledge. Clinical trials, especially of treatments, produce statistical associations between treatment and outcome. The resulting knowledge base contains conflicts and is defeasible: later knowledge may lead to a different conclusion.
  • One solution is to harness existing evidence-based medicine (EBM) approaches to ranking evidence. Knowing that a study is a randomized controlled trial (RCT) then allows one to infer greater certainty about its conclusions than if it were a case-control study.
  • Reification (in a general, non-RDF sense) takes place at (at least) two points: some knowledge engineer claims that this study claims that X -> Y. Keeping both levels explicit may be important for trust (where we may be concerned about conflicts of interest).
  • The OBO evidence ontology seems to lack terms for clinical evidence. I may be wrong, but I couldn't find any. This seems to be an easy fix.
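
The EBM ranking and the two levels of reification described above could be combined roughly as follows. This is a sketch under stated assumptions: the numeric ranks, the `StudyClaim` type, and the study and curator identifiers are hypothetical, and the only ordering taken from the text is that an RCT ranks above a case-control study.

```python
from dataclasses import dataclass
from typing import List

# Illustrative ordinal ranks; only "RCT above case-control" comes from the text.
# Further study designs would need to be added from an actual EBM hierarchy.
EVIDENCE_RANK = {"case-control": 1, "RCT": 2}


@dataclass(frozen=True)
class StudyClaim:
    """A curator's claim that a study claims something (two levels of reification)."""
    statement: str     # what the study is said to claim, e.g. "X -> Y"
    study_id: str      # the study making the claim
    study_design: str  # key into EVIDENCE_RANK
    curated_by: str    # the knowledge engineer who recorded the claim


def evidence_rank(claim: StudyClaim) -> int:
    return EVIDENCE_RANK[claim.study_design]


def strongest_claim(claims: List[StudyClaim]) -> StudyClaim:
    """Pick the claim backed by the highest-ranked study design."""
    return max(claims, key=evidence_rank)


# Two conflicting claims; all identifiers are made up.
claims = [
    StudyClaim("X -> Y", "study-A", "case-control", "curator-1"),
    StudyClaim("not (X -> Y)", "study-B", "RCT", "curator-2"),
]
best = strongest_claim(claims)
```

Because `curated_by` is kept alongside each claim, a consumer concerned about conflict of interest can still filter or down-weight claims by curator before ranking them.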