HCLS F2F Day 1 -- 8 Nov 2007

<jar> hello rrsagent, I was saying there are about 31 people here, and they are introducing themselves.

<jar> ha

<jar> restrictions on new logo are described on w3 web site

<jar> intros done. eric n is introducing agenda

<jar> eric n: what is hcls?

<jar> (i'm reporting on what's on his slide, not asking)

<jar> i am self-appointed scribe, apparently

<Susie> The agenda for the meeting is at: http://esw.w3.org/topic/HCLS/F2F/2007/11/Agenda

<Susie> EricN introduces HCLS - http://www.w3.org/2001/sw/hcls/

<jar> excellent, thanks for the uris

<jar> or are they iris?

<Susie> There will be a HCLS Workshop at WWW 2008 in Beijing

<Karen> Invite RRSAgent

<jar> next issue of scientific american has semantic web theme with hcls article

<jar> karen, rrsagent is already on.

<jar> karen, I'm scribing

<Karen> thanks!

<zzz> scribe: jar

not writing anything down now because rrsagent has the url to eric n's powerpoint

i think scribing here on irc makes most sense. maybe we can trade off, should i continue now or do you want to continue?

<chimezie> i can continue here

<Karen> Please continue and let me know when you're tired

excellent, go ahead, let me know when you get tired.

<chimezie> alrighty

maybe you can paste in your previous notes or give a uri

<chimezie> Initial notes on ESW Wiki

<chimezie> TonyaH: Partners HC is a strong supporter of this W3C IG

<scribe> scribe: Karen

Next speaker: Tonya Hongsermeier

<chimezie> ... Most EHRs lack knowledge (under-nourished).

<jar> do we have a uri to karen's slides?

Don't know; I'll ask EricP when he returns

<chimezie> Tonya's slides

<jar> oops tonya

Partners legacy challenges; roadmap to leverage SW technologies; eager for tools and approaches

<zzz> scribe: chimezie

TonyaH: There is nothing today for solving dense problems for knowledge acquisition and maintenance
... we want to get rid of / reduce the vertical silos
... Content Management Servers (Documentum) are used for document management

<alanr> who just joined?

TonyaH: a lot of ontology content is being handled / modeled as rules. The intersection and the tools that support the intersection are not yet mature

Next Speaker: VipulK

VipulK: we are looking at the intersections between the biological / clinical spectrum and research / practice
... if there is one thing common across these it is the sharing of clinical observations

-> http://esw.w3.org/topic/HCLS/ClinicalObservationsInteroperability/November2007F2FAgenda.html?action=AttachFile&do=get&target=Vipul_F2F_Slides.ppt Vipul's slides

VipulK goes over blood pressure models in both DCM and SDTM

scribe: demonstrates how these models can be expressed in OWL

VipulK: patient data can be merged by making coherent connections between observations on the same patient

Question: was it difficult mapping terms and identifying criteria for definitions?

scribe: was OWL sufficiently expressive?

VipulK: Initially OWL was expressive enough
... eventually pragmatism must be considered in determining which languages to use

ScottM: Subsumption and equivalence may not be sufficient

Question: how were the mappings generated / created

VipulK: manually (across 2 models)
... one issue: we want real world data from healthcare providers ...

JoanneL: which tools were used?

VipulK: we mostly used spreadsheets and protege

Question (from audience): how are you validating

VipulK: 1) validation by experts (scribe not sure)
... 2) ultimate test is to run it in a system
... 3) use inference to test consistency

Next speaker: Rachel Richesson

-> http://esw.w3.org/topic/HCLS/ClinicalObservationsInteroperability/November2007F2FAgenda.html?action=AttachFile&do=get&target=Rachel_F2F_Slides.ppt RachelR's slides

<Karen> Rachel describes projects her group does

<Karen> ...NIH projects for example in diabetes

EMR: Electronic Medical Record

EHR: Electronic Health Record

Rachel: The goal is to identify / screen patients via data in the EMR/EHR

Rachel goes over a slide of the process and the data involved

<zzz> scribe: Karen

clinicaltrials.gov is source of sample protocol data

scribe: cites challenges around terminology of standards; reviews chart

CHI= Consolidated Healthcare [?]

Q: Does it have to include MedDRA?

Rachel: MedDRA would be in the findings range
... the chart information will change; but is a frame today
... CDISC = Clinical Data Interchange Standards Consortium
... people and activities overlap; however some different terminology in use
... e.g. CDISC uses NCI

Tonya: We're mapping some things to SNOMED, but have to create our versions sometimes

Rachel: we have use cases but need standards
... one level for mapping
... another is a bottom up approach to look at the data
... good context to know there is a variety of representation issues to deal with
... not there yet on delivery end or clinical end
... HL7 is one to watch for healthcare

<alanr> ping

Rachel: BRIDG trying to bring together HL7 and CDISC
... Detailed Clinical Models; lots of ways to represent this
... what we need to remember is what the models are and how we use them consistently

[visual cartoon]

<ericP> what is the biggest conduit for getting patient data into bytes and chars?

Fabien?

I have a message for you

<alanr> data entry personnel?

<ericP> i wonder if we can play with these machines/interfaces

<ericP> when someone says "taxonomy-directed input", i don't have a feel for it 'till i see the UI

<ericP> and the pain of the person inputting the data

<eneumann> For those virtually present (Bo and Kerstin), feel free to raise a question by typing 'q+'

Participants discuss legal issues with releasing data

scribe: challenges of anonymizing data, not releasing patient information

<alanr> http://browse.opengalen.org/

<alanr> in firefox

<alanr> example of taxonomic choice

<jar> breaking for coffee now, rrsagent.

<eneumann> scribe: eneumann

Marcus Collins from Novartis is attending...

Tom Oniki presentation...

Tom: working with Stan Huff and Joey Coyle from Intermountain
... not-for-profit serving Utah
... need clinical models, and computerized decision support, automated data analysis
... this requires standard data structure at a fine-granular level
... one use case in Clin Decision Support is CT recruitment
... Clinical Element Model (CEM): 12 yrs of effort; for data entry, decision logic, core services...

CEML is an XML model

scribe: Type, Key, and Value Choice
... Data and Item (also a CEM)
... use HL7 v.3 data types
... can make collection of Observation items
... qualifiers give more info on value choices (recursive items)
... do qualifiers qualify the data or the subject?

Joey: qualifiers of modifiers are more complicated, incl. negation

alanr: the qualifier item may vary depending on how you use the data

<ericP> alanr, how about { [ a :BunchOfBodyMeasurements ; :bodyPos [ a :BodyPos ; :bodyPosition :sitting ] ; :systolicBP "120"^^:mmHg ] } ?

Tom: data is usually 'controlled terminology codes'

<alanr> how do you know that bodypos is of the same body as the bp is taken from

<alanr> *you know*

Tom: eager to learn more about SW

<alanr> but we want this explicit

<alanr> could be a rule - that would be fine

<ericP> just makes the semantics of :bodyPos more obvious from its domain

<alanr> yes

Tom: need codes + structure: numbness and location??
... dryweight vs weight with type 'dry'
... try and define a single data storage model

eric: can all the groups decide on one clinical model standard in reasonable time, or should we be willing to work with multiple standards and mappings between them?

<ericP> writing it three ways: [ a :BunchOfBodyMeasurements ; :weight [ a :DryWeight ; :massValue "70"^^:kg ] ] or [ a :BunchOfBodyMeasurements ; :weight "70"^^:dryKg ] or [ a :BunchOfBodyMeasurements ; :dryWeight "70"^^si:kg ]

Tom: pre-coordinated single-name way of defining things (snomed style?), vs combinations of codes and values
... UI and how the data is actually stored; hiding the various levels from the user
... coded domains: measurements versus coded assertions: when to use one or the other
... things we seek: explicit model for CEM
... single way to store and access data elements
... interlingua for sharing modelled data between institutions
... extensibility
... apps that address data generally, i.e., new kinds of data
... Status: partnering with GE healthcare, with func requirement sheet
... would like to discuss creating the needed models
... no data stored yet against these models
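
The pre-coordinated versus post-coordinated point above ("dryweight vs weight with type 'dry'") can be illustrated with a minimal Python sketch. It assumes a hypothetical lookup table that expands single-name observations into a base observation plus qualifiers, so that both styles compare equal after normalization. The field names, the table, and the example values are illustrative assumptions, not part of CEM or any standard discussed here.

```python
# Hypothetical table: pre-coordinated name -> (base observation, implied qualifiers).
# These names are made up for illustration; they are not CEM or SNOMED identifiers.
PRECOORDINATED = {
    "dryWeight": ("weight", {"type": "dry"}),
    "sittingSystolicBP": ("systolicBP", {"bodyPosition": "sitting"}),
}


def normalize(observation):
    """Return the post-coordinated form of an observation dict."""
    name = observation["name"]
    # Unknown names are assumed to be post-coordinated already.
    base, implied = PRECOORDINATED.get(name, (name, {}))
    # Explicit qualifiers on the observation take precedence over implied ones.
    qualifiers = {**implied, **observation.get("qualifiers", {})}
    return {
        "name": base,
        "value": observation["value"],
        "unit": observation["unit"],
        "qualifiers": qualifiers,
    }


pre = {"name": "dryWeight", "value": 70, "unit": "kg"}
post = {"name": "weight", "value": 70, "unit": "kg", "qualifiers": {"type": "dry"}}

# Both modelling styles end up as the same stored record.
print(normalize(pre) == normalize(post))
```

A table like this only covers a single storage model; mapping between institutions would still need agreement on the base names and qualifier vocabulary.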

Ronen: how does the DCM relate to OpenEHR?

Tom: there are probably a lot of potential mappings
... HL7 RIM templates are another example

<ericP> i'm curious to learn how vipul's models differed consistently from HL7

Chimezie presentation

Chimezie: focus mainly on demo, but some background: SemanticDB is a CV registry for doing outcome research
... to provide knowledge for decision support systems
... more emphasis on longitudinal (life long) content, not just clinical records; how to avoid changing DB schemas each time?
... provide a more flexible system built on XML and SW technologies
... problems: fragmented data; inflexible to extend; idiosyncratic terms; overlapping content so queries are difficult
... director gave CC a mandate: use better models for data handling; what is the next set of technologies?

<ericP> alanr, sound like Exhibit + SPARQL to you?

<alanr> yup

Chimezie: solutions: extending to unanticipated domains is important; should not rely on dev for UI - automated; expressive languages and drawing inferences; highly distributed; scalable to large data volume; use standards!
... why CC joined W3C
... XML to RDF using GRDDL and OWL
... drop down driven terms for queries; use of xforms;

<ericP> CC = Cleveland Clinic [distinguished from Creative Commons for benefit of later readers]

thanks eric

scribe: quality of data is important for getting more use out of it
... CyCorp reasoner used
... queries: NL strings to parsed phrases to SPARQL
... extend into new areas of research; use of local terminology; high-fidelity integration
... Cyc Ontology- high level; several levels of logic including micro-theories
... patient record core onto, at center of other terminologies: heart rhythm, meds, surgery, events...
... separation of domain goals from DB optimization issues (joins, normal forms)
... demo now based at texas
... Cyc takes clauses and builds a SPARQL query from it; also optimizes it

<alanr> but they are not using a very fast database/sparql implementation. Based on Mysq

<alanr> mysql

<alanr> would be interesting to test this in virtuoso

Chintan's presentation:

Chintan: At Biomedical semantics dept at Columbia working with IBM
... ontological reasoning of Patient reasoning for CTs
... recruitment is a main bottleneck (50% of time)
... low participation rates
... lack of time by clinicians and patients
... idea is to automate this better
... data not well encoded digitally
... CT criteria are not well expressed -- semantic gap
... electronic records to eligibility criteria mapping
... our approach is to use knowledge in terminologies and ontologies: MED and SNOMED

MED = Medical Entities Dictionary

scribe: classification of therapies and diseases
... reason on data+terminology
... TB reasoning as infectious agent and cause for meningitis
... challenges: reasoning needs to scale to a large number of terms
... trials data is also large and queries tend to be very expressive
... SHER is a scalable highly expressive reasoner that uses a novel summarization approach
... mappings of MED to SNOMED; clinical data repository (CDR) ETL'ed to RDF ABox
... query formulator to input inclusion/exclusion criteria
... local + domain knowledge mapping (89% MED concepts map to SNOMED)
... Incomplete data handling
... increased response rate using sound but incomplete logic
... now trying to integrate with Columbia's CTM systems
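
The general idea described above (map local terms to a reference terminology, then test inclusion/exclusion criteria by subsumption) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the is-a hierarchy, the local-term mapping, and the concept names are invented for the example and are not real MED or SNOMED content, and the traversal is a plain ancestor walk, not the SHER reasoner or its summarization technique.

```python
# Toy is-a hierarchy (child -> set of parents). Concept names are illustrative only.
IS_A = {
    "tuberculous_meningitis": {"meningitis", "tuberculosis"},
    "meningitis": {"cns_infection"},
    "tuberculosis": {"bacterial_infection"},
    "cns_infection": {"infectious_disease"},
    "bacterial_infection": {"infectious_disease"},
    "type_2_diabetes": {"diabetes_mellitus"},
}

# Hypothetical mapping from local record terms to reference concepts.
LOCAL_TO_REFERENCE = {
    "TB mening": "tuberculous_meningitis",
    "DM2": "type_2_diabetes",
}


def ancestors(concept):
    """Collect every concept reachable by following is-a links upward."""
    seen = set()
    stack = [concept]
    while stack:
        for parent in IS_A.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def subsumed_by(concept, general):
    """True if `concept` is `general` or one of its descendants."""
    return concept == general or general in ancestors(concept)


def eligible(patient_terms, include, exclude):
    """Check a patient's local terms against inclusion and exclusion concepts."""
    # Unmapped local terms are silently dropped, so this check is incomplete by design.
    concepts = {LOCAL_TO_REFERENCE[t] for t in patient_terms if t in LOCAL_TO_REFERENCE}

    def has(criterion):
        return any(subsumed_by(c, criterion) for c in concepts)

    return all(has(c) for c in include) and not any(has(c) for c in exclude)


print(eligible(["TB mening"], ["infectious_disease"], ["diabetes_mellitus"]))
print(eligible(["TB mening", "DM2"], ["infectious_disease"], ["diabetes_mellitus"]))
```

In this sketch a record coded only as "TB mening" satisfies a criterion stated at the level of "infectious_disease" because of the hierarchy, which is the kind of semantic gap the presentation refers to.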

alanr: how much loss with incomplete reasoner?

Kavitha: we can combine both and get both benefits?

Jyotish: How does this compare to rule systems?

brainstorm session now...

<scribe> scribe: ericP

Brainstorming

Vipul: Goal: what can we do to move forward?
... is this an attractive use case?

chimezie: doesn't intersect with our data, but it addresses VMR content

rachel: have a foot in HC and research worlds
... CDISC folks not here, but appeared helpful to Brahn Kissler

june: need exists
... we created a forum for families of patients of alz
... we post results of about 7 studies
... federally mandated barrier between HC and research
... considered an ethical breach
... not sure this use case falls in the real world
... who would do it and how

vipul: ok if a machine refers folks to studies?

june: grey area

george allen: other way to do it ...

alanr: two use cases, right?
... 2nd where clinician refers patient is ruled out

aaron: my org is currently working on cohort selection and cohort identification

june: is there a way for research selection criteria to feed back to acquired data

alanr: issue is whether there are too many to eval without feedback criteria

vijay: good use case
... isn't last demo the same?

vipul: clearly aspect of matching is there
... but we are more ambitious 'cause we want to map to HL7

alanr: the use case description does not mention this

aaa: interesting to me
... inclusion/exclusion rules are very broad
... think there was an ex from rachel's presentation

rachel: e.g. ability to lie flat in this instrument

ronen: SOP: completely opaque
... manhandle data out of EPRs into flat files
... then search through flat files
... could be an excel spreadsheet

tom (IMH): qualified yes
... transformation from UI model to storage model and other model archetypes more interesting to us right now

joey: make sure you have examples for everything in the use case

george: yes
... have a masters informatics student now doing this
... the research gives us the variables of interest
... other vars tell us whether we want to approach this patient

alanr: you mean extra questions to ship to the interviewer?

george: yes -- clear up grey areas

susie: not in dev at Lilly
... but, happy to share this UC with folks in dev
... this seems like a good UC

don rucker (SIEMENS): yes, but we have a product that uses a totally different technology (hidden markov models, ...)
... need to id clinical DB and funder that will want to use that DB
... have specifics of mapping before you go too far

vipul: need to make sure it's consumer-driven

dan corwin: terrific
... good thing: mapping to a specific EMR system
... the person who will do most of that mapping is the clinical research coordinator

vipul: ok, so need intuitive interface

Karen: what do we need to show regulatory bodies

vipul: started with that, but realized that reg agency was not involved

marcus: will bring to development arm
... good use case
... similar use cases in social networks like telecom

vipul: not sure we'll have resources to follow these up

t. n. bhat (NIST): haven't heard metrics for success
... how much cross-over is possible

joanne: metric for success: integrator reliability

AdrianP: Different data mining techniques are used, including e.g. logic-based approaches such as inductive logic programming (ILP)

max: can see applicability of semweb here

dbooth: interested in feedback

alanr: blocking factor is whether we can get some patient data
... but yes to this UC

data issue

joanne: yes to UC

ACTION: Vipul to get some relevant patient data [recorded in http://www.w3.org/2007/11/08-hcls-minutes.html#action01]

alanr: rachel analyzed criteria
... might be easiest to release if you could pick out the criteria, reducing the amount you have to anonymize
... e.g. if it has no need of family history, we might be able to use deceased data

joanne: what are the inclusion criteria?

rachel: we sampled five protocols

joanne: might want to make something up

eneumann: have synthetic data on a web page

<alanr> http://www.cse.buffalo.edu/~szhong/papers/kanony.pdf

eneumann: is there confidence in K-anonymization?

ronen: doesn't work on small populations

<eneumann> see http://privacy.cs.cmu.edu/people/sweeney/kanonymity.html
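
For readers unfamiliar with the term: a data release is k-anonymous when every combination of quasi-identifier values is shared by at least k records. The Python sketch below measures that k for a set of records and shows why a small population is a problem, as ronen notes. The records, field names, and choice of quasi-identifiers are made-up assumptions for illustration, not data or methods discussed at the meeting.

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the same
    quasi-identifier values (0 for an empty release)."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values()) if groups else 0


# Synthetic example records; all values are invented.
records = [
    {"age_band": "60-69", "zip3": "021", "diagnosis": "alzheimers"},
    {"age_band": "60-69", "zip3": "021", "diagnosis": "diabetes"},
    {"age_band": "60-69", "zip3": "021", "diagnosis": "hypertension"},
    {"age_band": "30-39", "zip3": "945", "diagnosis": "rare_condition"},
]

# The single record in the second group makes the whole release only 1-anonymous.
print(k_anonymity(records, ["age_band", "zip3"]))      # 1
# Dropping (or further generalizing) that record raises k to 3.
print(k_anonymity(records[:3], ["age_band", "zip3"]))  # 3
```

With few patients, many quasi-identifier combinations occur only once, so reaching a useful k means generalizing or suppressing much of the data.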

vipul: how many do we need?

ericP: why do we need more than three?

vipul: need to cover all of the selection criteria

eneumann: recommend narrowing the first step
... see what surprises come up

resources

vipul: need to know time commitments
... george, your masters student

ACTION: george will try to get a masters student? [recorded in http://www.w3.org/2007/11/08-hcls-minutes.html#action03]

joey: not prepared to commit at this point

ronen: will try to build up over next 5-6 months
... would hope to get 1

scott: if we get involved, would be on a mapping issue
... using associated text resources
... 1 student if we get involved

eneumann: does your group interact with T&O?

scott: yes
... lots of folks in AMS involved in ontology mapping but i don't command their resources
... can ask them if they would join us
... lots of resources, but it's a young technology

ronen: time frame?

eneumann: have this done, say, in a year?

vipul: Banff project took a year

joanne: would like to capture the process

aaron: can't give time commitment

Michal Z: train 20 students, can probably persuade 2

Rachel: maybe 1day / week

chimezie: can't promise -- lack of time and divergent trajectories

susie: can participate
... if it becomes a well-scoped project, and Lilly dev folks are interested, i can probably fund a student

Dan: maybe 1day/week

marcus: happy to pass something more concrete to dev folks

adrian: interested in rule-based applications
... if it fits with ReactionRuleML, could get a student

concrete tasks

<AdrianP> Several projects, e.g. GoPubMed, Reaction RuleML, Rule Responder
