SV_MEETING_TITLE -- 28 Oct 2010

<matthias_samwald> i volunteer for scribing.

<michel> thanks

<matthias_samwald> scribenick: matthias_samwald

joanne: in the last telecon, chime summarized some of his work.
... the available information did not have enough detail to tell us which mapping would be right.

<ericP> 1 person with 3 bodies?

joanne: a sample from the same age/date/gender from a different location. we imagined a situation where supplies are short. we decided we did not have the authority to decide about identity.
... the patient data is ambiguous.
... the different options should be presented in the paper to the readers.

<michel> ericP is this a model problem, or a data input problem?

joanne: question: "when it happens, what are the possible causes?"

scott: a question when you are trying to map is whether you are starting with the correct term. as a recommendation, there should be disambiguation built into the interface. if a clinician is labeling a clinical report, they should be offered many choices to help them disambiguate.

joanne: another issue is when the categories are too narrow.
... if the annotator has to record something that has not been seen before, it will not be recorded/annotated.
... you have to expect that unusual / unexpected things will need to be captured.

<michel> http://esw.w3.org/HCLSIG/PharmaOntology/Data/MappingDataToSNOMED

michel: i have made a mapping to SNOMED terms on the wiki.

scott: alignment is a different process than data entry.
... but the underlying decisions that need to be made are probably similar.

michel: regarding the mapping table on the wiki -- what does our coverage look like?
... how many patient records do not have any matches, what are we proposing to do about it?
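
A minimal sketch of the coverage question raised here, assuming the mapping table can be read as "patient record id -> list of matched SNOMED codes". The record ids and codes are invented placeholders, not data from the wiki table, and `coverage` is a hypothetical helper.

```python
# Hypothetical mapping table: patient record id -> matched SNOMED codes.
# Ids and codes are invented placeholders, not data from the wiki.
mapping_table = {
    "patient-1": ["code-A"],
    "patient-2": [],
    "patient-3": ["code-B", "code-C"],  # mapped to more than one category
    "patient-4": [],
}


def coverage(table):
    """Return (fraction of records with at least one match, sorted ids of unmatched records)."""
    unmatched = sorted(rid for rid, codes in table.items() if not codes)
    matched = len(table) - len(unmatched)
    fraction = matched / len(table) if table else 0.0
    return fraction, unmatched


fraction, unmatched = coverage(mapping_table)
print(f"coverage: {fraction:.0%}, unmatched: {unmatched}")
```

The list of unmatched ids is returned alongside the fraction so that the records without any match can be inspected rather than just counted.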

joanne: the table... is the data from all patients combined?

michel: don't know, chime gave me the link.

<Bob> Michel: Summary of last week's meeting?

<Bob> Joanne: Went thru Chime's record, discussed what could not be matched

<Bob> ... decided that it was important not to make inferences where they didn't exist

<Bob> ... granularization didn't have enough detail

<Bob> ... Second case was where inference was not warranted

<Bob> ... influenza case had sample coming from same...but different parts of body

<Bob> ... different modes of taking samples, but we didn't have authority to infer

<Bob> Eric: Is this a modelling question?

<Bob> Joanne: Data are ambiguous

<Bob> ... this is what Chime was working on last week

<Bob> ... answering Eric: these are the kinds of things that do come up

<Bob> Eric: We get to resolve the ambiguities, port to Indivo, more modelling, bugs are in Indivo

<Bob> ... yes there's value in discussing in the paper, since people will encounter this

<Bob> ... but those are resolvable bugs in our model etc

<Bob> ... Indivo model carries these interpretations, gives a concrete example

<Bob> ... prepares people for encountering these, point out to reader that these can be fixed

<Bob> Scott: Disambiguation? There will always be cases where data themselves have not been disambiguated

<Bob> ... need to have disambiguation built into the interface, at the clinician front end

<Bob> Joanne: Two thumbs up!

<Bob> Scott: If you've done that, then it minimizes noise later on

<Bob> Joanne: If categories are too narrow, as in galaxies, people need to be able to record that things have not been seen before

<Bob> ... want to make sure that unusual things can be noted

<Bob> Scott: Yes, this also should be captured

<Bob> Michel: Q about mapping of terms

<Bob> Joanne: We can look at coverage

<Bob> (above section was at beginning of call!)

oh, i was scribing here as well!

okay, you take over!

<Bob> (I was off on hcls2!)

<Bob> Joanne: Map to more than one SNOMED category, but there weren't that many cases like this

<scribe> scribenick: Bob

Michel: Q is more technical. In the patient data conversion we were keeping track of labels

Eric: Going Indivo to TMO?
... Will the Indivo xml carry these IDs? Yes
... this is place where tests are not carefully done yet
... rest of these tests have interesting modeling questions
... Indivo writes down a code w coding system, units, etc.
... combination of coding and ids gives you SNOMED
... But we want to start picking away at these, modeling w TMO
... too many, so we bring them in slowly, elevate some of them to our first-class way
... can start putting in further information
... goal here is to take bp, figure out mechanics
... problem is that some test results are not structured
... also have very structured tests, but some are just values
... ex: had to make up a genetic test, result

Michel: Share the structure?

Eric: This came up early b/c w SNPs
... genotyping SNPs

<ericP> https://dvcs.w3.org/hg/TMO-Indivo/file/3334734509e9/syntheticPatients/AD_PCHR_1.xml#l702

Eric: these are tests w lots of structure

Discussion of what is deep in this representation...

<michel> http://www.ncbi.nlm.nih.gov/sites/entrez?db=snp&cmd=search&term=rs72547528

<michel> http://www.ncbi.nlm.nih.gov/projects/SNP/snp_ref.cgi?rs=72547528

Michel: See C/T, chromosome, position, etc
... This is analogous to term-mapping, in that there is no corresponding record in dbSNP
... needs to be formalization, but up to you whether to embed

Eric: Alan R. had raised Q, to what degree are we capturing all the information?
... there are laws re capturing data, etc, so we can't throw anything away

Michel: Next-gen seq, how to get back to raw data from annotation?

Scott: We can throw down challenge to Indivo re SNPs

Eric: Clinics have laws that say you have to do all this, but EHRs are not motivated to

Scott: Maybe a placeholder in our data to enable this capability
... going back to low level from high level, sort of a reverse use case

<michel> http://esw.w3.org/HCLSIG/PharmaOntology/Diagrams

Michel: Here is the diagram to represent elements in the first query
... describe visually the query and identify elements in our KB
... query is largely described in relation rather than in type
... could be domain and range restrictions, but we haven't done that
... we had discussion re specialization of relations
... Will RDF still have this structure?

Eric: Yes, roughly. If something better, then would use it
... would like to have the extra predicates so consumers don't have to constrain by type
... use where I can

Michel: We have in linked-data: specific predicates sitting in one namespace

Eric: We need some inference if we are over-specializing our predicates

Michel: Coarse relations, use them in different combinations
... we should create a mapping of relations

Eric: We can do this in specific use cases
... point out that there can be more general use cases, what can be queried w/o inference

Joanne: Use cases are what drive the TMO

Eric: Yes, this would be the prize if we can show

Michel: TMO isn't playing a big role if we don't really use predicate mappings

<michel> use case - query the dataset - use dataset specific types/relations

<michel> use case - query all datasets - use TMO types/relations

<michel> but requires mappings

Eric: Use case: get query across datasets, but can still ask intelligent questions

<michel> at the type and relation level
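
The two use cases above can be sketched as follows. This is an illustration only, assuming triples are held as plain tuples; the dataset names, the predicates, and the `general:hasDiagnosis` relation (standing in for a TMO-level relation) are all hypothetical.

```python
# All names below are hypothetical; "general:hasDiagnosis" stands in for a
# TMO-level relation and is not taken from the actual ontology.
PREDICATE_MAP = {
    "ds1:hasDiagnosis": "general:hasDiagnosis",
    "ds2:diagnosedWith": "general:hasDiagnosis",
}

DATASETS = {
    "ds1": [("patient-1", "ds1:hasDiagnosis", "condition-X")],
    "ds2": [
        ("patient-2", "ds2:diagnosedWith", "condition-X"),
        ("patient-2", "ds2:localNote", "free text"),  # has no mapping
    ],
}


def query_dataset(name, predicate):
    """Within-dataset query: uses the dataset's own predicate, no mapping needed."""
    return [(s, o) for s, p, o in DATASETS[name] if p == predicate]


def query_all(general_predicate):
    """Cross-dataset query: matches any predicate mapped to the general relation."""
    results = []
    for name, triples in DATASETS.items():
        for s, p, o in triples:
            if PREDICATE_MAP.get(p) == general_predicate:
                results.append((name, s, o))
    return results


print(query_dataset("ds1", "ds1:hasDiagnosis"))
print(query_all("general:hasDiagnosis"))
```

Predicates with no entry in the mapping, such as the hypothetical `ds2:localNote`, stay reachable only through the dataset-specific query, which is the cost of relying on mappings.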

Eric: Do we get to ask readers for more general queries?
... maybe say that these are questions for the community, OK as an interest group

Michel: Contrast two different queries w different predicate formulations

Scott: Chime's queries were from wiki, not aligned w TMO?
... from last week's call: should we change the queries, the TMO, or both?

Michel: Some of this is just in patient data, not in TMO
... before: mapping at type-level, we could query the type but could not do it w relation
... Scott, we want to contrast sparql within-dataset, against TMO types and relations across all datasets that are aligned
... LODD we could query irrespective of which dataset, we did not have relation mapping
... objects, individuals call for relations, we want to query on these sorts of predicates

Eric: LODD certain amount of coding to make things similar, so queries get similar units, etc

Michel: One query for a union across datasets

Eric: How general can you get before there is no useful unified query?
... Ex: BP needs normalization
... what query can we get in this non-normalized form? (excellent point) "Head-scratching time."
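
The BP normalization point can be illustrated with a small sketch, assuming two datasets that record systolic blood pressure in different units. The datasets, patient ids, and readings are invented for illustration; the only fixed fact used is the unit conversion (1 kPa is approximately 7.50062 mmHg).

```python
KPA_TO_MMHG = 7.50062  # 1 kPa is approximately 7.50062 mmHg


def to_mmhg(value, unit):
    """Convert a pressure reading to mmHg; raise on units we do not know."""
    if unit == "mmHg":
        return float(value)
    if unit == "kPa":
        return value * KPA_TO_MMHG
    raise ValueError(f"unsupported unit: {unit}")


# Hypothetical readings: (dataset, patient id, systolic value, unit).
readings = [
    ("ds1", "patient-1", 120, "mmHg"),
    ("ds2", "patient-2", 20.0, "kPa"),
]


def systolic_above(rows, threshold_mmhg):
    """Unified query: only meaningful once every value is in the same unit."""
    return [
        (ds, pid)
        for ds, pid, value, unit in rows
        if to_mmhg(value, unit) > threshold_mmhg
    ]


print(systolic_above(readings, 140))
```

Comparing the raw numbers without the conversion step would treat 20.0 as lower than 120, which is the kind of silent error a non-normalized unified query invites.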

Michel: Take the paper that we had, start adding in new work.

Eric: We can point out about EHR integration

Michel: Want to convey that linked-data play nicely in the whole business space

<Trish> agree, at least for now no need to include SNP data but point to source

Michel: Nobody is going to undertake the whole process to convert everything
... they can just take their own part. SNP is a good example.

<mscottm> hand of God?

<mscottm> ;)

<Trish> need to move to next call

<michel> yup

<mscottm> bye

<michel> bye

<matthias_samwald> (i need to leave, bye!)

<Trish> i can work on some MS items
