See also: IRC log
<scribe> scribe: matthias_samwald
TOPIC --- Datasets
jun: we created a mapping between TCM and Entrez gene
... i did some manual correction of the mapping
... anja also sent me some links today
... she also linked to SIDER, drugbank
... dataset quite well interlinked
... anja encountered performance problems with running SILK
over remote SPARQL endpoints
... we have a rough idea of an interesting use-case in this
area
... we managed to submit an abstract to the DILS poster/demo
session, we will get feedback at the end of this month
... we investigate aTags to enrich the knowledge base with new
statements
anja: i did not put use-case on
wiki, proposed another small use-case related to TCM today via
mail, will put it on the wiki later
... for SILK, i had to put data into local SPARQL
endpoint
... performance not really an issue, datasets are not updated
that often, letting SILK run for some hours is not that bad in
this case.
... had to use local SPARQL endpoint because public endpoints
do not allow 7 million queries in a row...
<scribe> scribeNick: matthias_samwald
anja: egon willighagen was
excited about this new dataset
... matthias and peter ansell were also working on SIDER in
parallel.
... i might also look into linking to LinkedCT
susie: LinkedCT contains new drugs, will there be much side effects data?
anja: there are also marketed drugs in LinkedCT
matthias: i am almost finished with converting SIDER to aTags, i am re-using the DBpedia and OBO URIs directly, should be complementary to the conversion of Anja
anja: I looked at the conversion script from peter ansell, don't know about the results
susie: a next step will be to think about additional datasets
bosse: i have not made progress with working on a query and identifying necessary datasets
susie: we should be able to pose
new interesting questions based on the LOD we created
... we are still lacking questions / necessary steps to utilize
the data
jun: in the wiki page we are not giving enough demonstrations. we are just giving a description of how things are done by browsing different websites -- but we want to show how linked data can be used!
susie: bosse and i put together some top questions.
<Susie> http://esw.w3.org/topic/HCLSIG/LODD/Questions
susie: (going through questions)
anja: we can give a summarisation of all active ingredients a company is working on / marketing
susie: can we link to
pathways?
... there is pathway information in some Bio2RDF datasets
matthias: the "Linked Life Data" datasets (LarKC / Astra Zeneca) should also have a lot of pathway data
susie: genes and proteins are often linked interchangeably, this could be an entry point.
vassil: hi, i am from ontotext,
we are working with Bosse on Linking Life Data
... we can work on interlinking / sharing identifiers
... we can also look at user interfaces
... just querying with SPARQL does not help
susie: i agree
vassil: we are planning to work
on user interfaces
... we should start from the other side -- developing
interfaces for concrete tasks by researchers
matthias: could we use Ontotext LifeSKIM?
vassil: this is a bit more specialized, you cannot put arbitrary RDF into it
susie: it's a chicken and egg problem. has anyone else thought about the right order?
vassil: we are looking at
Exhibit
... the idea is to run SPARQL queries, get back JSON, and
render on screen
... we hope to see results in 1 month
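The "run SPARQL queries, get back JSON, and render on screen" step vassil describes could look roughly like the following. This is only a sketch: it assumes the endpoint returns the standard SPARQL JSON results format, and the function name `bindings_to_rows`, the variable names and the example.org URIs are made up for illustration, not taken from the Ontotext work.

```python
import json


def bindings_to_rows(sparql_json_text):
    """Flatten a SPARQL JSON results document into a list of plain dicts."""
    doc = json.loads(sparql_json_text)
    variables = doc["head"]["vars"]
    rows = []
    for binding in doc["results"]["bindings"]:
        # Unbound variables are absent from a binding; represent them as None.
        rows.append(
            {v: binding[v]["value"] if v in binding else None for v in variables}
        )
    return rows


# Hypothetical response body; the URIs and label are invented for illustration.
EXAMPLE = json.dumps({
    "head": {"vars": ["drug", "label"]},
    "results": {"bindings": [
        {"drug": {"type": "uri", "value": "http://example.org/drug/1"},
         "label": {"type": "literal", "value": "Example drug"}},
        {"drug": {"type": "uri", "value": "http://example.org/drug/2"}},
    ]},
})

rows = bindings_to_rows(EXAMPLE)
```

Flat rows like these are easier to hand to a table or faceted-browsing widget than the nested bindings structure.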
susie: (describes query 2 on the
wiki page)
... it seems to be a broad question... aggregate everything
around a compound
... question 2 (How is this therapy/compound different from
existing therapies/compounds) is too broad actually, i will
move it down in the list
... question 3 (What are our patients saying about our drugs)
has a strong text mining component
... another question: Who are the key opinion leaders for the
therapeutic area?
vassil: we have pubmed, but not citation information
matthias: we could ask anita de waard (elsevier)
susie: i will ask anita
... question: Of drugs from either the same or different
company for the same indication, are they approved in the same
target region of interest – US or Global
... is this possible in SPARQL?
eric: not possible in a single SPARQL query
susie: this query seems to be
quite tricky
... especially the part about geography
bosse: the information is proprietary, but i am not totally sure
<scribe> ACTION: bosse have a look whether WHO has useful data about geography of approved drugs [recorded in http://www.w3.org/2009/05/27-hcls-minutes.html#action01]
susie: in two weeks we will see
how successful we were in looking into these questions
... question "are there natural alternatives to this drug?" --
can it be answered with the TCM dataset?
jun: i am a bit worried about the
user interface for presenting query results, that might cost me
more time
... we might need to add some new information (via aTags) to
make it useful
anja: you can use associations
between genes and diseases, if a drug and a TCM drug are
working on the same genes, this might be a hint
... we can define similarity, but we need to have a critical
look at the validity
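One possible way to make the "working on the same genes" hint concrete is a set-overlap score. The sketch below uses the Jaccard index, which is an assumption made here and not a measure agreed in the meeting. The gene identifiers are placeholders, and a high score would still need the critical look at validity mentioned above.

```python
def gene_overlap(genes_a, genes_b):
    """Jaccard index of two gene collections: |A & B| / |A | B|."""
    a, b = set(genes_a), set(genes_b)
    if not a and not b:
        # No gene associations on either side: treat as no evidence of similarity.
        return 0.0
    return len(a & b) / len(a | b)


# Placeholder gene identifiers, invented for illustration.
drug_genes = {"GENE1", "GENE2", "GENE3"}
tcm_genes = {"GENE2", "GENE3", "GENE4"}

score = gene_overlap(drug_genes, tcm_genes)
```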
jun: the TCM people also used pathway information to validate
matthias: i can help in judging the validity
TOPIC --- triplification challenge
susie: maybe TCM?
anja: we need use-cases
bye!
Date: 27 May 2009
Present: +049308385aaaa, +1.610.651.aabb, +03592490aacc, +46.4.63.3.aadd, EricP
People with action items: bosse