HCLSIG/LODD/Meetings/2010-07-07 Conference Call
Conference Details
- Date of Call: Wednesday July 7, 2010
- Time of Call: 11:00am Eastern Daylight Time (EDT), 16:00 British Summer Time (BST), 17:00 Central European Summer Time (CEST)
- Dial-In #: +1.617.761.6200 (Cambridge, MA)
- Dial-In #: +33.4.89.06.34.99 (Nice, France)
- Dial-In #: +44.117.370.6152 (Bristol, UK)
- Participant Access Code: 4257 ("HCLS").
- IRC Channel: irc.w3.org port 6665 channel #HCLS (see W3C IRC page for details, or see Web IRC)
- Duration: ~1h
- Convener: Susie
Agenda
- Mapping experimental data - All
- Feedback on Wiki site - Learning from the BioOntologies Paper
- Seed grants - Susie
- Data updates - Egon, Matthias, Anja, Oktie
- AOB
Minutes
Attendees: Bosse, Rich, Matthias, Oktie, Claus, Elgar, Susie
<Susie> http://esw.w3.org/HCLSIG/LODD/Mapping_Experimental_Data
<rboyce> agenda item 1: feedback on the wiki site
<rboyce> 10 questions -- are these good questions? Should there be other questions?
<rboyce> Discussion about the definition of experimental data...does it include EHR data?
<rboyce> EHR record could be an experimental dataset
<rboyce> aka "instance data"
<rboyce> main focus -- how do we best model what is often very complex data produced from experiments in RDF?
<rboyce> are these good questions to have answered in a "best practice" document?
<rboyce> bbalsa: questions: once data is published -- how can the original data be augmented with additional insights?
<rboyce> bbalsa: this question would be helpful for scientists publishing their data in RDF
<rboyce> http://en.wikipedia.org/wiki/Entity-attribute-value_model
<rboyce> question: it seems that experience with best practices for representing health data in the entity-attribute-value (EAV) model would be applicable to representing experimental data in RDF. Has anyone looked into this?
<Susie> http://esw.w3.org/HCLSIG/LODD/Mapping_Experimental_Data#Mapping_Experimental_Data_to_RDF
<rboyce> example paper on best practices for EAV modeling: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2110957/
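Editorial sketch of the EAV-to-RDF parallel raised above: each entity-attribute-value row lines up with one subject-predicate-object triple. This is only an illustration under stated assumptions, not something presented on the call. The namespace http://example.org/lodd/, the sample rows, and the function names are all hypothetical.

```python
from urllib.parse import quote

# Hypothetical namespace; the group had not agreed on URI namespaces.
BASE = "http://example.org/lodd/"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"

# Hypothetical EAV rows: (entity, attribute, value).
EAV_ROWS = [
    ("subject/001", "age", 72),
    ("subject/001", "diagnosis", "MCI"),
    ("subject/002", "age", 68),
]


def literal(value):
    """Serialize a Python int or string as an N-Triples literal."""
    if isinstance(value, int):
        return f'"{value}"^^<{XSD_INTEGER}>'
    escaped = (
        str(value)
        .replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )
    return f'"{escaped}"'


def eav_to_ntriples(rows, base=BASE):
    """Turn each (entity, attribute, value) row into one N-Triples line."""
    lines = []
    for entity, attribute, value in rows:
        subject = f"<{base}{quote(entity)}>"
        predicate = f"<{base}attribute/{quote(attribute)}>"
        lines.append(f"{subject} {predicate} {literal(value)} .")
    return lines


if __name__ == "__main__":
    for line in eav_to_ntriples(EAV_ROWS):
        print(line)
```

The sketch keeps every attribute as a plain predicate in one namespace. A real mapping would more likely reuse terms from existing vocabularies, which is one of the open questions on the wiki page.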
<rboyce> some of the questions would be much harder to address than others
<rboyce> for example, determining URI namespaces might be less involved than some others
<rboyce> question: questions regarding tools -- e.g. how to get the tool to work with linked data and usable interfaces
<rboyce> Susie -- tools questions might not be appropriate for a 'best practices' paper
<rboyce> mapping should be independent of implementation
<rboyce> oktie -- as data sources grow, D2R mapping might not scale to allow efficient query responses
<rboyce> oktie -- how important is scalability?
<rboyce> Susie -- scalability is good to consider when making recommendations
<rboyce> Is this a general "best practice" document?
<rboyce> Susie -- the document should be applicable to any disease and patient population
<rboyce> The first questions would be very helpful to researchers new to using RDF; would save people time and confusion.
<rboyce> how well does using ADNI as a focus area generalize?
<rboyce> Susie -- it is a realistic data set that might be a very good starting point
<rboyce> There might be other useful data sources that are not in relational databases -- what about those?
<rboyce> Susie -- it is possible to convert (e.g. XML to triple store) but relational to RDF would be a good place to start.
<rboyce> clausstie: introduction -- experienced in medicinal chemistry, IT, knowledge management
<rboyce> Are we restricting this to experimental data?
<rboyce> Susie -- would like to start with an experimental data set because it is more complicated
<rboyce> Susie -- other types of datasets of interest?
<rboyce> How do we assign ids? Do we create our own and map other objects to the new ones?
<rboyce> Mapping might be too general a term -- it changes from application to application
<rboyce> we should be precise about what we mean by "mapping"
<rboyce> implementation should be last decision -- are we restricting this to RDF?
<rboyce> Should we focus on mapping concepts etc. first, then implementation?
<rboyce> Susie -- what questions would we want to ask of the data set --- this might influence the way we model the data
<rboyce> Susie -- the representation choice might be predetermined (as RDF) given the purpose of this SIG
<rboyce> Which entities should be classes and which ones should be instances?
<rboyce> Susie -- we should probably be thinking of some other data sources to include so that we could demonstrate federated queries and aggregation
<rboyce> Susie -- will take the bioontologies paper (http://esw.w3.org/images/d/d0/ISMB2010_Final.pdf) and extract findings that help address some of the questions
<rboyce> Susie -- please document the steps one would take to map ADNI to RDF
<rboyce> Susie -- please be thinking about ADNI and potential complementary data sources (e.g. DrugBank, ClinicalTrials.gov)
<rboyce> Susie -- will create a wiki page with new/updated questions
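Editorial sketch for the identifier and class/instance questions raised above. It shows one possible answer: mint a local URI for each record, type it as an instance of a class, and link it to an identifier from an external source. Both namespaces, the function names, and the sample identifiers are hypothetical. Using owl:sameAs for the link is an assumption, and a weaker link property may suit some data better.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Hypothetical namespaces, used only for illustration.
LOCAL_NS = "http://example.org/lodd/"
EXTERNAL_NS = "http://example.org/external/drug/"


def mint_uri(kind, local_id):
    """Build a local URI from a record kind and its local key."""
    return f"{LOCAL_NS}{kind}/{local_id}"


def describe(kind, local_id, external_id=None):
    """Return (subject, predicate, object) triples for one record.

    The record becomes an instance; its kind becomes the class.
    """
    subject = mint_uri(kind, local_id)
    triples = [(subject, RDF_TYPE, f"{LOCAL_NS}class/{kind.capitalize()}")]
    if external_id is not None:
        # Assumption: the external record denotes the same thing.
        triples.append((subject, OWL_SAME_AS, EXTERNAL_NS + external_id))
    return triples


if __name__ == "__main__":
    for triple in describe("drug", "42", external_id="X1"):
        print(triple)
```

Keeping local URIs separate from external ones means a mapping can be corrected later without changing the identifiers already published.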
<rboyce> ----------------
<rboyce> Seed grants...
<Susie> https://www.jnjcosat.com/cosat
<rboyce> --------------
<rboyce> Data updates
<rboyce> Anja mentioned (by email to Richard) that the DrugBank RDF mapping will be edited to address issues: <https://sourceforge.net/projects/loddproject/forums/forum/910130/topic/3719723/index/page/1>