Semantic Web Health Care and Life Sciences Interest Group Teleconference -- 29 Aug 2017


I2B2 mapping experience (Ken Lord)

eric: Diff between RDF mindset and I2B2 approach is that in RDF you initially uniquify everything, then harmonize in that process.
... There's a continuum of vocabularies, from one that everyone on the planet speaks to one that only you and your neighbor speak.
... In RDF you can just merge graphs without anything additional. But with the I2B2 approach, you have a can where people can dump their data, and then you provide infrastructure for after-the-fact harmonization.
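
To illustrate the point about merging graphs, here is a minimal sketch that models RDF graphs as plain Python sets of (subject, predicate, object) tuples. All URIs and the `merge_graphs` helper are made up for illustration, and the sketch assumes no blank nodes (which would need renaming in a real merge).

```python
# Triples are modeled as (subject, predicate, object) tuples.
# Every URI below is hypothetical and exists only for this example.
EX = "http://example.org/"

site_a = {
    (EX + "patient/1", EX + "hasDiagnosis", EX + "dx/diabetes"),
    (EX + "patient/1", EX + "birthYear", "1960"),
}
site_b = {
    (EX + "patient/1", EX + "hasDiagnosis", EX + "dx/diabetes"),  # same fact, same URIs
    (EX + "patient/2", EX + "hasDiagnosis", EX + "dx/asthma"),
}

def merge_graphs(*graphs):
    """Merge graphs by set union; identical triples collapse into one."""
    merged = set()
    for g in graphs:
        merged |= g
    return merged

merged = merge_graphs(site_a, site_b)
print(len(merged))
```

Because both sites used the same URIs for the shared fact, the union needs no extra reconciliation step; with site-local identifiers, harmonization would still be required afterwards.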

ken: Problem was that ONC (if I'm remembering right after 4-5 years) ... Partners spent considerable time dev CDA factor: all of their patient-centric info.
... Separate entity (Partners Research) does purely research, taking patient-centric models, based on I2B2 schema, and it is cohort-centric.
... Therefore the data model is very different. They wanted to explore, instead of writing procedural code, model-driven approach to consume CCDA docs.
... We also extended to Boston Childrens, which is using HL7 v2.
... to do research from these. That was the problem, and we set about doing this using our model-driven methodology.
... Community doing this research was using proprietary bindings to their concepts. That's a fundamental issue still.
... Might be helpful if they normalized, but we didn't get into it.

eric: Not combining two datasets?

ken: They had something that could export CCDA -- an XML store.

eric: Could have done XML queries on it?

ken: Yes, one could have done that.
... I2B2 exists because at the time they needed high performance for these researchers, and they used a star schema on relational DB to get at the data.
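
As a rough sketch of the star-schema idea, the following uses Python's `sqlite3` with a heavily simplified fact table and one dimension table. The table and column names follow I2B2 naming, but the reduced column set, the concept codes, the sample rows, and the `cohort` helper are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patient_dimension (patient_num INTEGER PRIMARY KEY, sex_cd TEXT);
CREATE TABLE observation_fact (
    patient_num INTEGER,   -- points at patient_dimension
    concept_cd  TEXT,      -- what was observed (made-up codes below)
    start_date  TEXT,
    nval_num    REAL       -- numeric value, if any
);
""")
conn.executemany("INSERT INTO patient_dimension VALUES (?, ?)",
                 [(1, "F"), (2, "M"), (3, "F")])
conn.executemany("INSERT INTO observation_fact VALUES (?, ?, ?, ?)", [
    (1, "DX:diabetes", "2016-01-10", None),
    (1, "LAB:creatinine", "2016-01-10", 1.4),
    (2, "DX:diabetes", "2016-03-02", None),
    (2, "LAB:creatinine", "2016-03-02", 0.9),
    (3, "LAB:creatinine", "2016-05-20", 1.6),
])

def cohort(conn, dx_code, lab_code, threshold):
    """Patients with the diagnosis AND at least one lab value above threshold."""
    rows = conn.execute("""
        SELECT DISTINCT d.patient_num
        FROM observation_fact d
        JOIN observation_fact l ON l.patient_num = d.patient_num
        WHERE d.concept_cd = ? AND l.concept_cd = ? AND l.nval_num > ?
        ORDER BY d.patient_num
    """, (dx_code, lab_code, threshold)).fetchall()
    return [r[0] for r in rows]

print(cohort(conn, "DX:diabetes", "LAB:creatinine", 1.0))
```

Every observation lands in the same fact table regardless of its kind, so a cohort question becomes a self-join on one table rather than a join across many entity-specific tables.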

eric: There are some normalized SQL representations of clinical data. MIMIC2, for example, captures a good fraction of what an EMR does.
... That's used around Boston area. Then there's another SQL representation from EPIC, which is probably all of the EMR.
... Is it more productive to put it into a star schema, where you have a fact table but have to manage the context for the facts on your own, versus a highly normalized schema like MIMIC2?

ken: But even if it is highly normalized you have issues around how it is structured -- primary/secondary keys.
... Also they've built a lot of tooling around their star schema.

luis: Putting clinical data into a DWH. I2B2 is very useful to implement queries, because it provides the tech to do it.
... We were pumping data into a "sinker", an openEHR DB.
... Also for NLP that would have to be a plug-in, as preprocessing. Whereas I2B2 gives you some of the tooling.

eric: Therefore for an individual org it comes down to a question of whether the queries you want to do would work well with I2B2, right?

luis: right.

eric: cohort vs patient-centric?

ken: We do transformations. I2B2 has a wrench. Some might want a hammer. Big issue, besides terminology, is that it becomes a many-to-many transformation. You need to put in rules to do that.
... You have many observations that will be many facts that will have many observation pieces.

eric: If I have a CCDA record for Bob, Alice, etc. and I want to stick them into I2B2, and want to look at medications, diagnoses and outcomes.
... And I want to put the whole longitudinal record into I2B2?

ken: In I2B2 I have 3 containers: patient, medication, etc. Need to understand what I have on this side and how I'm organizing it, and on the other side what we want, and how to map them into completely different containers.

eric: For example, a blood panel, with creatinine and lipids. Star schema has facts in the middle, patient encounters, etc.
... When you put them in the facts table . . . .

ken: They all go into a table-oriented view of the world. I have 5-6 objects that need to populate 5-6 objects on the other side.
... For object 1 I populate object A and C, and for object 2 I populate object B, C and D.
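
The many-to-many routing Ken describes can be sketched as a small rule table. This is only an illustration: the `RULES` mapping, the object and container names, and the `apply_rules` helper are hypothetical, not the actual model-driven tooling discussed on the call.

```python
# Hypothetical rule table: source object kind -> target containers it populates.
RULES = {
    "object_1": ["A", "C"],
    "object_2": ["B", "C", "D"],
}

def apply_rules(source_objects, rules):
    """Route each source object into every target container its rule names."""
    targets = {}
    for kind, payload in source_objects:
        for container in rules.get(kind, []):
            targets.setdefault(container, []).append(payload)
    return targets

result = apply_rules([("object_1", "x"), ("object_2", "y")], RULES)
print(sorted(result.items()))
```

Container C receives data from both source objects, which is the case where a one-to-one field mapping stops being sufficient and explicit rules are needed.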

eric: Diff between longitudinal and population-centric is having multiple people in the DB?
... Advantage to I2B2 is that it is easy to ask population-based questions?

ken: right.
... But they haven't tried to make progress on semantics. Everybody works independently. No interop between I2B2 sites.

harold: I2B2 followed the traditional research model (with spreadsheet or whatever), and you lay out your own data items.
... Each org makes their own studies. But there's an effort now for a shared-study data model for I2B2.
... But another approach we've been taking: Using FHIR RDF work to map RDF URIs for subjects and predicates into I2B2 concept and modifier entries, and then take I2B2 data and convert it to RDF as entries in the I2B2 fact table.
... The challenge of mapping data structures and terminologies then becomes a FHIR problem rather than an I2B2 problem.
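
One way to picture the FHIR RDF to I2B2 idea is sketched below. This is not the encoding Harold's group used: the `FHIR:` code prefix, the choice to derive `concept_cd` from the resource type and `modifier_cd` from the predicate, and all helper names are assumptions. Only the `http://hl7.org/fhir/` namespace and the I2B2 column names are taken as given.

```python
FHIR_NS = "http://hl7.org/fhir/"
PREFIX = "FHIR:"  # hypothetical code prefix for this sketch

def to_code(uri):
    """Shorten a FHIR URI into a 'FHIR:<local name>' code."""
    if not uri.startswith(FHIR_NS):
        raise ValueError("not a FHIR URI: " + uri)
    return PREFIX + uri[len(FHIR_NS):]

def to_uri(code):
    """Inverse of to_code: expand a 'FHIR:<local name>' code to a URI."""
    if not code.startswith(PREFIX):
        raise ValueError("not a FHIR code: " + code)
    return FHIR_NS + code[len(PREFIX):]

def triple_to_fact(patient_num, resource_type_uri, predicate_uri, value):
    """Build a simplified fact row: resource type -> concept, predicate -> modifier."""
    return {
        "patient_num": patient_num,
        "concept_cd": to_code(resource_type_uri),
        "modifier_cd": to_code(predicate_uri),
        "tval_char": str(value),
    }

def fact_to_triple(fact, subject_uri):
    """Rebuild an RDF-style triple from a fact row for a known subject URI."""
    return (subject_uri, to_uri(fact["modifier_cd"]), fact["tval_char"])

fact = triple_to_fact(1, FHIR_NS + "Observation",
                      FHIR_NS + "Observation.status", "final")
print(fact["concept_cd"], fact["modifier_cd"])
```

Under these assumptions the mapping is mechanical in both directions, which is the sense in which the hard structural and terminology decisions move to the FHIR side.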

ken: We believe we can automate the process of creating an I2B2 map.
... Objective is to provide better tooling on top of FHIR, for I2B2 community?

harold: Combination. For FHIR community it is to leverage I2B2 tooling. But the other element that is useful: if we can get FHIR-I2B2 idiom developed, you have this ont (info model), and you call something hematocrit, and if we represent the data in consistent and close-to-native FHIR, it should be cheaper to create studies and share/exchange studies.
... SHRINE is the I2B2 federation project.
... But we could make the data interop that way.
... Shawn Murphy published mapping info on I2B2 that allowed us to do this.

eric: If we could get Grahame interested in I2B2 and start imposing structure, then the same thing that is happening in FHIR could have been done in I2B2.
... But maybe more natural to do it in a less normalized way.

harold: I2B2 cannot model devices, and cannot handle workflow. Also very focused on anonymization.
... only a subset of FHIR makes sense in I2B2. But relatively atomic model from converting FHIR into RDF . . .

<hsolbrig> build.fhir.org/fhir.ttl

harold: That is the complete set of FHIR resource models, in RDF.
... "FHIR Metadata Vocabulary"

ken: On our referent index, we've done the work to create a classification scheme based on 11179 semantic concepts.
... We're using 11179 because it all comes down to mapping eventually.
... We can check for unique concepts using it.

luis: Do you mean a terminology like SNOMED-CT?

ken: No, we have semantic concepts that have value domains, and they can be bound to SNOMED-CT concepts. We expect RDF to be able to do that.
... We focus on interop and info model domain.

luis: Makes sense to use FHIR in the middle at this point.

eric: part of what makes it exciting is that folks are excited by it (coupled with the fact that a normalized structure is probably more intuitive for folks).

harold: At Mayo exciting: large group mapping the EMR to FHIR, and another group mapping the EMR to I2B2. If we can reduce this redundant effort, that will save a lot of money, and researchers get the advantage of all of the FHIR work.

<scribe> ACTION: David to schedule Ken to present about I2B2 sometime after HL7 meeting in San Diego (which is Sept 9-15) [recorded in http://www.w3.org/2017/08/29-hcls-minutes.html#action01]


ADJOURNED
