HCLS Health Care Domain

27 Mar 2012

See also: IRC log


Charlie Mead
Special Guest Star
Cecil Lynch
Follow-up Meeting
Fri 30 March 11AM EST on #hcls


[slide 3]

<Joanne_Luciano> slides aren't numbered :-(

<Joanne_Luciano> ah, but the browser numbers them!

<egombocz> If you look at them not in show mode, you can see the numbers on the side thumbnails

Cecil: antibiotic-resistant airline passenger prompted review of Tuberculosis Information Management System (TIMS)
... reporting a TB case required passing a brittle set of messaging and business rules

[slide 4: Message Processing Integration]

Joanne_Luciano: each state wanted their own standard?

Cecil: CDC wanted a standard
... states would take anything which makes reporting easier
... [re: slide 4]
... choices about how to import messages to CDC
... .. after message had some processing
... .. as a Web Service RPC

[slide 5: Deployment Architecture]

Cecil: going with existing CDC infrastructure
... starting from left:
... .. some source, usually state or large counties (53 jurisdictions) reports

<Joanne_Luciano> is going with the CDC one of those three options on slide 4 or is it another one (not listed on slide 4)?

Cecil: .. goes into data messaging broker, which validates syntax

<Joanne_Luciano> looks like it's option 1 on slide 4

Cecil: .. if a valid TB message, off to content validation queue
... .. also split into components for e.g. line listing of incoming cases
... .. after validation, email with contents of alert sent to CDC's TB group
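A minimal sketch of the flow described for slide 5, assuming a toy syntax check and an in-memory queue. The function names, the "starts with MSH" rule, and the "tb-group" recipient are illustrative assumptions, not the CDC implementation.

```python
from collections import deque

def is_syntactically_valid(message: str) -> bool:
    # Assumption: a stand-in for the broker's syntax validation.
    return message.startswith("MSH|")

def run_pipeline(messages):
    """Route messages: syntax check, content validation queue, then alert."""
    content_queue = deque()  # stand-in for the content validation queue
    rejected = []
    for msg in messages:
        if is_syntactically_valid(msg):
            content_queue.append(msg)
        else:
            rejected.append(msg)

    alerts = []
    while content_queue:
        msg = content_queue.popleft()
        # Split into components (here: segment names) for a line listing.
        segment_names = [seg.split("|")[0] for seg in msg.split("\r") if seg]
        # Assumption: the "email" is modelled as a plain dict.
        alerts.append({"to": "tb-group", "segments": segment_names})
    return alerts, rejected
```

Calling `run_pipeline` with one well-formed and one malformed message returns one alert and one rejected message.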

Joanne_Luciano: this is slide 3 option 1?

Cecil: this is option 3 (RPC)
... we had tried driving real-time alerting from BioSense
... we took messages off the first transport, never queued in DMB [slide 5 left]
... the HL7 2.x standard is fairly loose
... flexible, can take any payload
... can be structured in any way
... segments are well-defined, but segment structure requires point to point negotiation
... p2p neg is a guideline

charlie: HL7 2.x is a syntactic standard and a semantics guideline
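To illustrate the "syntactic standard" point: HL7 2.x messages are carriage-return-separated segments with `|`-delimited fields and `^`-delimited components, so pulling out observation (OBX) segments takes only string splitting. The sketch below assumes that OBX-3 carries the observation identifier and OBX-5 the value. The sample message and its codes (`DOB`, `RESIST`) are made up for illustration.

```python
def parse_segments(message: str):
    """Split an HL7 v2 message into segments, each a list of fields."""
    return [line.split("|") for line in message.split("\r") if line]

def extract_obx(message: str):
    """Return {observation identifier: value} taken from OBX segments.

    Only the first component (before '^') of OBX-3 is kept as the key.
    """
    facts = {}
    for fields in parse_segments(message):
        if fields[0] == "OBX" and len(fields) > 5:
            identifier = fields[3].split("^")[0]
            facts[identifier] = fields[5]
    return facts

# Hypothetical sample message; the codes are not from any real guide.
SAMPLE = "\r".join([
    "MSH|^~\\&|STATE_LAB|XX|CDC|TB|201203270000||ORU^R01|0001|P|2.5",
    "PID|1||12345",
    "OBX|1|DT|DOB^Date of birth^L||19700101",
    "OBX|2|CE|RESIST^Drug resistance^L||ISONIAZID",
])
```

What the identifiers mean is exactly the part that the syntax does not fix, which is why the semantics still need point-to-point agreement.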

[slide 6: Message Content Validation Architecture]

<Joanne_Luciano> JMS?

Cecil: after leaving broker, falls into JMS interface
... because this has the 2.5 validation, we don't need the 2.x syntactic validation
... so we don't do the validation
... before we went live, we validated and found 2 errors in HL7 messaging
... (was a benefit of 2-tier validation)
... once live, we don't do syntactic validation
... but we do parse out components
... questions like birthday and date of problem were found via OBX extractions
... an OWL ontology tells us how to process a message
... the ontology links all the knowledge
... it guides parsing the message by aligning the OBX-extracted facts with an RDF graph
... we can then use the JESS reasoner for evaluating these facts
... JESS (Java Expert System Shell) is a FW/BW (forward/backward) chaining rules engine
... has a Protégé plugin, interprets SWRL
... good commercial tool for high-volume processing
... paid for by tax dollars, only free for government use
... $75K otherwise

<Stuart> Drools

<iker> DROOLS

<mr_sticky> Drools is from JBoss

<mr_sticky> http://www.jboss.org/drools

Cecil: we tried Drools, which has FW/BW chaining and similar fact structure
... use JESS if you're processing millions of facts
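For readers unfamiliar with forward chaining, here is a naive sketch of the idea in plain Python. It is not JESS or Drools, which use far more efficient matching. The fact strings and the two rules are hypothetical.

```python
def forward_chain(facts, rules):
    """Apply rules repeatedly until no new facts can be derived.

    facts: iterable of fact strings.
    rules: list of (premises, conclusion), premises being a set of facts
           that must all be present for the conclusion to be added.
    """
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived

# Hypothetical content-validation rules.
RULES = [
    ({"has:birth_date", "has:problem_date"}, "content:complete"),
    ({"content:complete", "syntax:valid"}, "message:accepted"),
]
```

The second rule fires only after the first has added `content:complete`, which is the chaining step. Backward chaining would start from `message:accepted` and work back to the facts needed.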

Joanne_Luciano: and Jena?

Cecil: no experience with it
... at OTR, we pass what we expect to see and what we got as two graphs
... the choreography of the OTR framework works out that something is a question about e.g. an antibiotic resistance pattern
... we have a set of "listeners" (patterns)
... we built this on V3 semantics, but mapped back to V2 syntax
... once we've matched the graph against the patterns, we pass it to JESS
... we give JESS the profile for e.g. a normal patient, an MDR (multi-drug-resistant) patient, an XDR (extensively drug-resistant) patient (potential super-spreader)
... the reasoning framework decides if an event needs action
... another listener strains through alerts from JESS for outbound messaging
... we also use the output for visualization
... folks don't need to use SAS to extract this data from mid-tier, instead just using graph representations
... with agreement from CDC, we could have sent output messages back to reporters
... output:
... .. drug resistant
... .. appropriateness of drugging (per WHO codes)
... .. predictive analysis of whether someone is likely to fall off treatment based on patient history
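A rough sketch of the "expected graph vs. received graph" comparison and the pattern "listeners", assuming graphs are plain sets of (subject, predicate, object) triples and a pattern uses None as a wildcard. All names and triples here are invented for illustration. The real system used an OWL ontology and JESS rather than this code.

```python
def diff_graphs(expected, received):
    """Return (missing, unexpected) triples between two triple sets."""
    return expected - received, received - expected

def matches(pattern, triple):
    """True if every non-None position of the pattern equals the triple's."""
    return all(p is None or p == t for p, t in zip(pattern, triple))

def dispatch(received, listeners):
    """Map each listener name to the sorted triples its pattern matches."""
    return {
        name: sorted(t for t in received if matches(pattern, t))
        for name, pattern in listeners.items()
    }

# Hypothetical data.
EXPECTED = {
    ("case1", "hasBirthDate", "1970-01-01"),
    ("case1", "resistantTo", "isoniazid"),
}
RECEIVED = {
    ("case1", "resistantTo", "isoniazid"),
    ("case1", "resistantTo", "rifampicin"),
}
LISTENERS = {"resistance": (None, "resistantTo", None)}
```

Here `diff_graphs(EXPECTED, RECEIVED)` reports the missing birth date and the unexpected rifampicin resistance. `dispatch(RECEIVED, LISTENERS)` collects the resistance triples that would then be handed to the rules engine.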

[slide 7: Types of problems that could be solved by extending the TB framework]

Cecil: had to bend to time and budget limitations
... we could have added a d2rq interface to retrofit the pre-existing data
... a lot we could have done

[slide 8: The use of an OWL ontology]


[slide 9: HL7 Message Artifact Taxonomy]

Cecil: this is how we mapped the OBX structure to the ontology

[slide 11: Rule Processing]

[slide 12: Message Content Validation Rule Implementation]

Cecil: this demonstrates the advantage of using OWL
... the blue is what we deleted
... (from TIMS)
... went from 358 to 175
... reduces frustration of reporters facing conflicting rules
... beyond OWL being able to do syntax, vocabulary, rule processing, we see the advantage of declarative rules

[slide 13: Message Content Validation Rules]

Cecil: with tons of volume and response time requirements, you need a more efficient bw-chaining system (JESS)

[slide 14: Message Content Validation Results View]

Cecil: sample output

[slide 15: Processing Results]

Cecil: average processing time 3.5s round trip
... far faster than a human, and more accurate
... scales up to ~350k messages/day
... ~300K TB messages/year
... could scale to influenza
... at worst case (4 month window), 50-75M, so ~ 200K messages/day
... in a surveillance, you're also looking at folks who don't have it
... feeds from 800 VA hospitals, + labs such as Quest and LabCorp, ...
... congress says we need response in 2 mins
... had to put everything in memory
... BioSense lost funding

mscottm: summary of SemWeb advantages is very different from our usual tech demos in HCLS
... what are your SemWeb wins?
... what could be improved?

charlie: would like formal continuation
... to help us find focal points in HCLS

Cecil: SemWeb is a flexible way to extract knowledge
... we were given a TB messaging system and a deadline
... 7 days before deadline, CDC said "we'd like to upgrade to 1.2 of our implementation guideline"
... had around 35 new rules and 100 terminology changes
... because everything CDC gave us was in the OWL, expected to do it in 4 days
... made it in 4 days with no additional charge to CDC
... big commercial motivation is the flexibility at responding to rapidly changing knowledge
... at NCI, I wanted to build an EMR system
... ONC SHARP projects kind of get to this
... win 1: rapid software engineering
... win 2: rule validation
... win 3: can infer things that a human has problems inspecting

<mscottm> Nice to hear that experience in the field confirms my main sales pitch about advantage of SemWeb tech for software: easier maintenance and change, agile development, effectively lower cost.

Cecil: .. (large systems (e.g. BRIDG's UML) hard to swap into a brain)

Minutes formatted by David Booth's scribe.perl version 1.136 (CVS log)
$Date: 2012/03/27 21:57:08 $