HCLSIG/LODD/Meetings/2010-01-20 Conference Call
Conference Details
- Date of Call: Wednesday January 20, 2010
- Time of Call: 11:00am Eastern Standard Time (EST), 16:00 Greenwich Mean Time (GMT), 17:00 Central European Time (CET)
- Dial-In #: +1.617.761.6200 (Cambridge, MA)
- Dial-In #: +33.4.89.06.34.99 (Nice, France)
- Dial-In #: +44.117.370.6152 (Bristol, UK)
- Participant Access Code: 4257 ("HCLS").
- IRC Channel: irc.w3.org port 6665 channel #HCLS (see W3C IRC page for details, or see Web IRC)
- Duration: ~1h
- Convener: Susie
Agenda
- Open data follow up - all
- Data update - Anja, Jun, Matthias, Egon
- TCM special issue - Matthias, Jun
- Bio-Ontologies SIG - Susie
- ACS, AMIA - Egon, Richard
- AOB
Minutes
Attendees: EricP, Scott, Egon, Kei, Oktie, Susie, Jun
Apologies: Anja, Bosse
<kei> opendata follow-up
<kei> making data available in rdf
<kei> plan to send email to data providers to get their approval and permission
<kei> data update
<kei> egon reported on progress on rdf conversion of the chembl database at ebi
<egonw> http://pele.farmbio.uu.se/chembl/sparql
<kei> links of drugs to targets and clinicaltrials
<kei> not only drugs on market but also failed drugs
<egonw> http://chembl.blogspot.com/
<kei> who contributed some of the data to chembl?
<kei> susie: how to use sparql to enable meaningful searches (e.g., substructure)
<kei> egon: find small molecules (drugs) for particular targets
<ericP> http://pele.farmbio.uu.se/chembl/sparql endpoint seems to need a tickle
<kei> egon: use sparql to do substructure mining/substructure comparison
<ericP> [[ select distinct ?Concept where {[] a ?Concept} limit 2 ]] taking forever and it used to be instant
<kei> susie: this might be an interesting area to explore the feasibility of substructure search
<kei> susie: link chembl to other resources such as TCM and drugbank
<kei> egon: yes, this is important
<kei> scott: question of substructure search: different approaches such as chemblast by TN Bhat in the context of HIV
<kei> scott: characterize different types of structures using rdf
<mscottm> http://xpdb.nist.gov/pdb_chem_blast/help.html
<kei> egon: they tried to fingerprint molecules using chemblast
<kei> scott: start with basic structures and refine them in the hierarchy
<mscottm> http://bioinfo.nist.gov/SemanticWeb_pr2d/chemblast.do
<kei> egon: good start, but fingerprinting generates noise
<kei> egon: the noise problem needs to be addressed
<kei> susie: oktie has updated linkct dataset
<egonw> ericP: yeah, I think I might have messed up the index :)
<egonw> ericP: I need to check how many triples I have right now...
<kei> oktie: updated based on new data extracted from CT.gov
<egonw> but in RDF/XML format... it's about 2GB of data
<kei> oktie: interventions have erroneous links to trial data
<kei> oktie: CT has fixed the problem so the data quality is better
<ericP> egon, (checking my comprehension) i understood from what you said that more than one substructure leads to the same fingerprints; fingerprints are a first pass at restricting to substructures. correct?
<egonw> correct
<OktieH> http://queens.db.toronto.edu/~oktie/linkedct/
<kei> oktie: latest dump available
<kei> oktie: should update the HCLS KB endpoint at DERI
<egonw> ChEMBL-RDF example: http://pele.farmbio.uu.se/chembl/snapshot.php
<kei> oktie: after the update, many things have changed so old links may not work
<egonw> source code: http://github.com/egonw/chembl.rdf
<egonw> issues, comments, etc can be filed there too
<kei> oktie: Anja should look at her data that has links to CT
- mscottm doesn't think that the chemblast approach produces the kind of fingerprints that Egon described; rather, it characterizes substructures precisely. The main problem seems to be that this is difficult to do exhaustively.
<egonw> mscottm: I will read the paper and comment on it next meeting
<kei> susie: Anja sent her apologies for missing today
<kei> susie: asked oktie to contact Anja
<kei> oktie: links between drugs and interventions in CT
<kei> oktie: types of links should not be owl:sameAs because intervention is not just a drug (but drug dosage sometimes)
<kei> oktie: this link semantics problem should be addressed as soon as possible
<kei> susie: similarity or dissimilarity between drug and intervention
<kei> susie: would it help to use a more standard terminology/vocabulary?
<kei> susie: adoption of consistent naming in links
<kei> oktie: create another entity for drug (e.g., generated from interventions)
<kei> oktie: this might ease maintenance/management of links
<kei> susie: link to the standard definition of drug (in some ontology)
<kei> oktie: it's helpful
<kei> susie: tcm special issue
<SusieS> kei: deadline for tcm is the end of the month
<SusieS> kei: deadline may be extended, but very uncertain at the moment
<SusieS> kei: have been in touch with matthias
<SusieS> kei: matthias has been talking to Huajun
<SusieS> kei: but don't have a recent update
<SusieS> kei: Jun is also working on the paper
<SusieS> kei: but also haven't heard from her
<SusieS> kei: the special issue has long research papers and shorter commentary papers
<SusieS> kei: will mention to Matthias
<egonw> drug terms in the ChEBI ontology:
<egonw> http://www.ebi.ac.uk/chebi/searchId.do?chebiId=CHEBI:23888
<kei> egon: drug roles are described in chebi
<kei> susie: The drug concept was discussed in the TMO effort
<kei> susie: TMO could potentially be used for sharing information about drugs
<kei> susie: network of collaborations is key to future companies' success
<kei> susie: ontologies are important in the collaborative context
<kei> susie: information captured in chebi might be of interest
<kei> susie: there is a drug ontology at stanford
<kei> scott: saw Samson Tu who was involved in drug ontology
<kei> scott: Samson seemed to be willing to work with us
<kei> scott: a follow up might work
<kei> susie: Trish might also be able to follow up with Samson
<kei> susie: chebi focuses on small molecules at the moment, but we might need a broader context
<kei> susie: larger molecule, RNAi, ....
<kei> oktie: chembl and chebi are linked ...
<kei> susie: deadline for bio-ontology sig is approaching
<kei> susie: people have been busy
<kei> susie: upcoming meetings?
<kei> ACS meeting -- egon might be able to give update
<kei> end of march
<kei> susie: identify additional data sources for linking and mapping
<kei> susie: how people use and interact with the converted data is important
<kei> susie: visualization of data is also important
<kei> susie: can data providers convert their data in rdf or linked data?
<kei> susie: technologies are maturing/improving
<kei> susie: governments are more involved in the data sharing process
<kei> susie: invite representatives from government to give talks.
<kei> scott: uniprot available in rdf
<kei> scott: these are linked data
<kei> kei: data size and performance
<mscottm> kei: Next Generation Sequencing, etc. provide a challenge when trying to represent data in RDF
<kei> susie: data sources are very diverse and mappings are very complex
<kei> susie: experimental/clinical data contexts need to be captured in some way
<kei> susie: levels of granularities
<kei> susie: reasoning/inferencing using triplestores
<kei> eric: pre-processed inferencing
- ** zhaoj [chatzilla@129.67.24.116] has joined #hcls
<kei> susie: linked data framework use cases -- sparql queries
<egonw> http://saml.rilspace.com/content/initial-performance-comparison-pellet-vs-prolog-in-bioclipse
<egonw> my student Samuel very much likes to hear about time-consuming queries and to compare SPARQL and prolog there
<kei> susie: how to explore and aggregate information (e.g., entity-based)
<kei> susie: interesting issues to explore: inferencing, data mapping, user-interface paradigm
<kei> eric: warehouses are still prevalent, distributed/federated queries
<kei> scott: how to make use of the data in a meaningful way ...
<kei> susie: another area: how far the semantic mapping can go?
<kei> susie: federated queries: how they can be done?
<kei> eric: when semantics are not working, string matching is a backup approach ...
<kei> egon: reasoning: compare sparql and prolog
<egonw> http://saml.rilspace.com/
<kei> scott: prolog can be a nice underpinning to reasoning
<kei> scott: reasoning and rule integration is complicated
<kei> eric: new reasoning features may be added to sparql
<kei> scott: use of property chaining in sparql
<kei> susie: need to wrap up
<kei> susie: how to come up with best practices in terms of reasoning ...
<kei> susie: we can start with experimental data ...
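
Note on the fingerprint exchange in the log above (ericP's comprehension check, confirmed by egonw): the idea that fingerprints are a lossy first pass, followed by an exact check, can be illustrated with a small toy sketch. This is not the ChemBlast method and not code from the ChEMBL-RDF project. It assumes molecules are given as plain sets of hypothetical fragment names (`benzene_ring`, `hydroxyl`, and so on), whereas real cheminformatics toolkits derive fingerprints from the molecular graph. The tiny fingerprint width is also an assumption, chosen so that collisions, the "noise" egon mentions, are easy to provoke.

```python
import hashlib

FP_BITS = 16  # deliberately tiny (an assumption) so that bit collisions are likely


def fragment_bit(fragment: str) -> int:
    """Map a fragment name to one bit position using a stable hash."""
    digest = hashlib.md5(fragment.encode("utf-8")).digest()
    return digest[0] % FP_BITS


def fingerprint(fragments) -> int:
    """OR together one bit per fragment; different fragments may share a bit."""
    fp = 0
    for fragment in fragments:
        fp |= 1 << fragment_bit(fragment)
    return fp


def screen(query, library):
    """First pass: keep molecules whose fingerprint covers every query bit."""
    q = fingerprint(query)
    return [name for name, frags in library.items()
            if (fingerprint(frags) & q) == q]


def verify(query, library, candidates):
    """Second pass: exact subset check, which removes false positives."""
    return [name for name in candidates if set(query) <= library[name]]


# Hypothetical toy library: molecule name -> set of fragment names.
library = {
    "mol_a": {"benzene_ring", "hydroxyl", "methyl"},
    "mol_b": {"carboxyl", "amine"},
    "mol_c": {"benzene_ring", "hydroxyl", "amine"},
}
query = {"benzene_ring", "hydroxyl"}

candidates = screen(query, library)        # may include false positives
hits = verify(query, library, candidates)  # exact matches only
print(hits)
```

The screen can never miss a true match, because every bit set by the query is also set by any molecule containing all query fragments. It can, however, let through molecules that merely share bit positions, which is why the exact second pass is still needed.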