HCLSIG/LODD/Meetings/2010-07-07 Conference Call

Conference Details

Date of Call: Wednesday July 7, 2010
Time of Call: 11:00am Eastern Daylight Time (EDT), 16:00 British Summer Time (BST), 17:00 Central European Summer Time (CEST)
Dial-In #: +1.617.761.6200 (Cambridge, MA)
Dial-In #: +33.4.89.06.34.99 (Nice, France)
Dial-In #: +44.117.370.6152 (Bristol, UK)
Participant Access Code: 4257 ("HCLS").
IRC Channel: irc.w3.org port 6665 channel #HCLS (see W3C IRC page for details, or see Web IRC)
Duration: ~1h
Convener: Susie

Agenda

Mapping experimental data - All

- Feedback on Wiki site
- Learning from the BioOntologies Paper

Seed grants - Susie
Data updates - Egon, Matthias, Anja, Oktie
AOB

Minutes

Attendees: Bosse, Rich, Matthias, Oktie, Claus, Elgar, Susie

<Susie> http://esw.w3.org/HCLSIG/LODD/Mapping_Experimental_Data

<rboyce> agenda item 1: feedback on the wiki site

<rboyce> 10 questions -- are these good questions? Should there be other questions?

<rboyce> Discussion about the definition of experimental data...does it include EHR data?

<rboyce> EHR record could be an experimental dataset

<rboyce> aka "instance data"

<rboyce> main focus -- how do we best model what is often very complex data produced from experiments in RDF?

<rboyce> are these good questions to have answered in a "best practice" document?

<rboyce> bbalsa: questions: once data is published -- how can the original data be augmented with additional insights?

<rboyce> bbalsa: this question would be helpful for scientists publishing their data in RDF

<rboyce> http://en.wikipedia.org/wiki/Entity-attribute-value_model

<rboyce> question: it seems that experience with best practices for representing health data in the entity-attribute-value (EAV) model would be applicable to representing experimental data in RDF. Has anyone looked into this?

<Susie> http://esw.w3.org/HCLSIG/LODD/Mapping_Experimental_Data#Mapping_Experimental_Data_to_RDF

<rboyce> example paper on best practices for EAV modeling: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2110957/
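
The parallel between EAV rows and RDF triples raised above can be illustrated with a minimal sketch. It assumes each EAV row is an (entity, attribute, value) tuple and that every value is emitted as a plain string literal. The namespace http://example.org/, the function name eav_to_ntriples, and the sample rows are hypothetical and do not come from the call or from the cited paper.

```python
from urllib.parse import quote

# Hypothetical namespace, chosen only for illustration.
BASE = "http://example.org/"


def escape_literal(value):
    """Escape a value for use inside an N-Triples string literal."""
    text = str(value)
    # Backslash must be escaped first so later escapes are not doubled.
    for raw, escaped in (("\\", "\\\\"), ('"', '\\"'), ("\n", "\\n"), ("\r", "\\r")):
        text = text.replace(raw, escaped)
    return text


def eav_to_ntriples(rows, base=BASE):
    """Turn (entity, attribute, value) rows into N-Triples lines.

    The entity becomes the subject URI, the attribute becomes the
    predicate URI, and the value becomes a plain string literal.
    """
    lines = []
    for entity, attribute, value in rows:
        subject = f"<{base}entity/{quote(str(entity), safe='')}>"
        predicate = f"<{base}attribute/{quote(str(attribute), safe='')}>"
        lines.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return lines


# Invented sample rows, not real data.
rows = [
    ("patient1", "age", 72),
    ("patient1", "diagnosis", "MCI"),
]

for line in eav_to_ntriples(rows):
    print(line)
```

Typed literals, units, and provenance are left out here; those are among the modeling questions the group is discussing.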

<rboyce> some of the questions would be much harder to address than others

<rboyce> for example, determining URI namespaces might be less involved than some others

<rboyce> question: questions regarding tools -- e.g. how to get the tool to work with linked data and usable interfaces

<rboyce> Susie -- tools questions might not be appropriate for a 'best practices' paper

<rboyce> mapping should be independent of implementation

<rboyce> oktie -- as data sources grow, D2R mapping might not scale to allow efficient query responses

<rboyce> oktie -- how important is scalability?

<rboyce> Susie -- scalability is good to consider when making recommendations

<rboyce> Is this a general "best practice" document?

<rboyce> Susie -- the document should be applicable to any disease and patient population

<rboyce> The first questions would be very helpful to researchers new to using RDF; would save people time and confusion.

<rboyce> how well does using ADNI as a focus area generalize?

<rboyce> Susie -- it is a realistic data set that might be a very good starting point

<rboyce> There might be other useful data sources that are not in relational databases -- what about those?

<rboyce> Susie -- it is possible to convert (e.g. XML to triple store) but relational to RDF would be a good place to start.
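
As a rough illustration of the relational-to-RDF starting point, the sketch below turns each row of a table into triples: the primary key forms the subject URI and each remaining column becomes a predicate. This is one simple convention, not the mapping used by D2R or agreed on the call. The namespace http://example.org/, the function table_to_triples, and the "subject" table with its values are hypothetical.

```python
import sqlite3

# Hypothetical namespace, chosen only for illustration.
BASE = "http://example.org/"


def table_to_triples(conn, table, key_column, base=BASE):
    """Map every row of `table` to (subject, predicate, literal) tuples.

    The table name is interpolated into the SQL text, so it must come
    from a trusted source.
    """
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [description[0] for description in cursor.description]
    triples = []
    for row in cursor:
        record = dict(zip(columns, row))
        subject = f"{base}{table}/{record[key_column]}"
        for column, value in record.items():
            # Skip the key itself and NULL cells (no triple is emitted).
            if column == key_column or value is None:
                continue
            triples.append((subject, f"{base}{table}#{column}", str(value)))
    return triples


# Invented in-memory table standing in for a real experimental data set.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE subject (id INTEGER PRIMARY KEY, age INTEGER, sex TEXT)")
conn.execute("INSERT INTO subject VALUES (1, 72, 'F')")

for triple in table_to_triples(conn, "subject", "id"):
    print(triple)
```

Foreign keys, datatypes, and links to shared vocabularies are not handled; a real mapping would have to address them.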

<rboyce> clausstie: introduction -- experienced in medicinal chemistry, IT, knowledge management

<rboyce> Are we restricting this to experimental data?

<rboyce> Susie -- would like to start with an experimental data set because it is more complicated

<rboyce> Susie -- other types of datasets of interest?

<rboyce> How do we assign ids? Do we create our own and map other objects to the new ones?
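
One possible reading of "create our own and map other objects to the new ones" is sketched below: mint a local URI for each record and state equivalence links to identifiers in other sources. The call did not settle on this approach. The namespace http://example.org/lodd/, the function mint_and_link, and the external URIs are hypothetical; owl:sameAs is the standard OWL property, though a weaker link may be more appropriate in practice.

```python
# Hypothetical local namespace, chosen only for illustration.
BASE = "http://example.org/lodd/"

# Standard OWL property asserting that two URIs denote the same thing.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def mint_and_link(local_id, external_uris, base=BASE):
    """Mint a local URI and return triples linking it to external URIs."""
    local_uri = f"{base}{local_id}"
    return [(local_uri, OWL_SAME_AS, external) for external in external_uris]


# Invented external identifiers, not real resources.
links = mint_and_link(
    "compound/42",
    ["http://example.com/drugs/D001", "http://example.net/chem/C-17"],
)

for triple in links:
    print(triple)
```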

<rboyce> Mapping might be too general a term -- its meaning changes from application to application

<rboyce> we should be precise about what we mean by "mapping"

<rboyce> implementation should be last decision -- are we restricting this to RDF?

<rboyce> Should we focus on mapping concepts etc. first, then implementation?

<rboyce> Susie -- what questions would we want to ask of the data set --- this might influence the way we model the data

<rboyce> Susie -- the representation choice might be predetermined (as RDF) given the purpose of this SIG

<rboyce> Which entities should be classes and which ones should be instances?

<rboyce> Susie -- we should probably be thinking of some other data sources to include so that we could demonstrate federated queries

<rboyce> and aggregation

<rboyce> Susie -- will take the bioontologies paper (http://esw.w3.org/images/d/d0/ISMB2010_Final.pdf) and extract findings that help address some of the questions

<rboyce> Susie -- please document the steps one would take to map ADNI to RDF

<rboyce> Susie -- please be thinking about ADNI and potential complementary data sources (e.g. DrugBank, ClinicalTrials.gov)

<rboyce> Susie -- will create a wiki page with new/updated questions

<rboyce> ----------------

<rboyce> Seed grants...

<Susie> https://www.jnjcosat.com/cosat

<rboyce> --------------

<rboyce> Data updates

<rboyce> Anja mentioned (by email to Richard) that the DrugBank RDF mapping will be edited to address issues: <https://sourceforge.net/projects/loddproject/forums/forum/910130/topic/3719723/index/page/1>