SV_MEETING_TITLE -- 27 May 2009

<scribe> scribe: matthias_samwald

TOPIC --- Datasets

jun: we created a mapping between TCM and Entrez Gene
... i did some manual correction of the mapping
... anja also sent me some links today
... she also linked to SIDER, drugbank
... dataset quite well interlinked
... anja encountered performance problems with running SILK over remote SPARQL endpoints
... we have a rough idea of an interesting use-case in this area
... we managed to submit an abstract to the DILS poster/demo session, we will get feedback at the end of this month
... we investigate aTags to enrich the knowledge base with new statements

anja: i did not put use-case on wiki, proposed another small use-case related to TCM today via mail, will put it on the wiki later
... for SILK, i had to put data into local SPARQL endpoint
... performance not really an issue, datasets are not updated that often, letting SILK run for some hours is not that bad in this case.
... had to use local SPARQL endpoint because public endpoints do not allow 7 million queries in a row...

<scribe> scribeNick: matthias_samwald

anja: egon willighagen was excited about this new dataset
... matthias and peter ansell were also working on SIDER in parallel.
... i might also look into linking to LinkedCT

susie: LinkedCT contains new drugs, will there be much side effects data?

anja: there are also marketed drugs in LinkedCT

matthias: i am almost finished with converting SIDER to aTags, i am re-using the DBpedia and OBO URIs directly, should be complementary to the conversion of Anja

anja: I looked at the conversion script from peter ansell, don't know about the results

susie: a next step will be to think about additional datasets

bosse: i have not made progress with working on a query and identifying necessary datasets

susie: we should be able to pose new interesting questions based on the LOD we created
... we are still lacking questions / necessary steps to utilize the data

jun: in the wiki page we are not giving enough demonstrations. we are just giving a description of how things are done by browsing different websites -- but we want to show how linked data can be used!

susie: bosse and i put together some top questions.

<Susie> http://esw.w3.org/topic/HCLSIG/LODD/Questions

susie: (going through questions)

anja: we can give a summarisation of all active ingredients a company is working on / marketing

susie: can we link to pathways?
... there is pathway information in some Bio2RDF datasets

matthias: the "Linked Life Data" datasets (LarKC / AstraZeneca) should also have a lot of pathway data

susie: genes and proteins are often linked interchangeably, this could be an entry point.

vassil: hi, i am from ontotext, we are working with Bosse on Linked Life Data
... we can work on interlinking / sharing identifiers
... we can also look at user interfaces
... just querying with SPARQL does not help

susie: i agree

vassil: we are planning to work on user interfaces
... we should start from the other side -- developing interfaces for concrete tasks by researchers

matthias: could we use Ontotext LifeSKIM?

vassil: this is a bit more specialized, you cannot put arbitrary RDF into it

susie: it's a chicken and egg problem. has anyone else thought about the right order?

vassil: we are looking at Exhibit
... the idea is to run SPARQL queries, get back JSON, and render on screen
... we hope to see results in 1 month
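
A minimal sketch of the "run SPARQL, get back JSON, render" idea, assuming the endpoint returns the standard W3C SPARQL JSON results format. The sample response, the example.org URIs, and the function names bindings_to_rows and render_lines are hypothetical illustrations, not something shown in the meeting; the actual Exhibit integration is not covered here.

```python
import json

# Hypothetical sample response in the W3C SPARQL JSON results format.
# The variable names and example.org URIs are made up for illustration.
SAMPLE_RESPONSE = """
{
  "head": {"vars": ["drug", "label"]},
  "results": {"bindings": [
    {"drug": {"type": "uri", "value": "http://example.org/drug/1"},
     "label": {"type": "literal", "value": "Example drug"}},
    {"drug": {"type": "uri", "value": "http://example.org/drug/2"}}
  ]}
}
"""


def bindings_to_rows(result_text):
    """Flatten SPARQL JSON results into a list of dicts, one per solution."""
    data = json.loads(result_text)
    variables = data["head"]["vars"]
    rows = []
    for binding in data["results"]["bindings"]:
        # An unbound variable is absent from the binding; map it to None.
        rows.append({v: binding.get(v, {}).get("value") for v in variables})
    return rows


def render_lines(rows):
    """Render each row as one tab-separated line; unbound values become '-'."""
    return ["\t".join("-" if v is None else v for v in row.values())
            for row in rows]


if __name__ == "__main__":
    for line in render_lines(bindings_to_rows(SAMPLE_RESPONSE)):
        print(line)
```

Flat rows like these could then be handed to whatever front end is chosen; the text rendering is only a stand-in for the on-screen display.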

susie: (describes query 2 on the wiki page)
... it seems to be a broad question... aggregate everything around a compound
... question 2 (How is this therapy/compound different from existing therapies/compounds) is too broad actually, i will move it down in the list
... question 3 (What are our patients saying about our drugs) has a strong text mining component
... another question: Who are the key opinion leaders for the therapeutic area?

vassil: we have pubmed, but not citation information

matthias: we could ask anita de waard (elsevier)

susie: i will ask anita
... question: Of drugs from either the same or different company for the same indication, are they approved in the same target region of interest – US or Global
... is this possible in SPARQL?

eric: not possible in a single SPARQL query

susie: this query seems to be quite tricky
... especially the part about geography

bosse: the information is proprietary, but i am not totally sure

<scribe> ACTION: bosse have a look whether WHO has useful data about geography of approved drugs [recorded in http://www.w3.org/2009/05/27-hcls-minutes.html#action01]

susie: in two weeks we will see how successful we were in looking into these questions
... question "are there natural alternatives to this drug?" -- can it be answered with the TCM dataset?

jun: i am a bit worried about the user interface for presenting query results, that might cost me more time
... we might need to add some new information (via aTags) to make it useful

anja: you can use associations between genes and diseases, if a drug and a TCM drug are working on the same genes, this might be a hint
... we can define similarity, but we need to have a critical look at the validity
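
One way the gene-overlap hint could be made concrete is a set-overlap score. The sketch below uses Jaccard similarity, which is an assumption chosen for illustration and not a measure the group agreed on; the gene identifiers and the function name gene_overlap are hypothetical. A high score would only be a hint and would still need the critical validity check mentioned above.

```python
def gene_overlap(genes_a, genes_b):
    """Jaccard similarity of two gene sets: |A & B| / |A | B|.

    Returns 0.0 when both sets are empty, so the result is always defined.
    """
    a, b = set(genes_a), set(genes_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


if __name__ == "__main__":
    # Hypothetical gene sets associated with a marketed drug and a TCM herb.
    drug_genes = {"GENE1", "GENE2", "GENE3"}
    herb_genes = {"GENE2", "GENE3", "GENE4"}
    print(gene_overlap(drug_genes, herb_genes))
```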

jun: the TCM people also used pathway information to validate

matthias: i can help in judging the validity

TOPIC --- triplification challenge

susie: maybe TCM?

anja: we need use-cases

bye!
