<kei> take up next agendum
<scribe> scribenick: matthias_samwald
kei: over the last few days i spent time editing the discussion section
... it is smoother and more readable now. i did not delete the old text, so others can have a look and possibly add to it. i highlighted it in yellow.
... we made a lot of good progress
... how far are we with the examples?
lena: all federated queries are now also on the demo
<Lena> http://ibl.mdanderson.org/~mhdeus/BioRDF/microarray/sparql_endpoint.html
lena: on that page, you can find the six demo queries
kei: is this the full gene list?
lena: this is a small subset
<Lena> http://mibupload.com/u0PSbD.xml
<Lena> (never mind this link)
kei: the first queries are querying the gene lists themselves
... what type of brain regions, disease etc.
... also something about the data themselves (same software package etc.)
... looks pretty good to me
... the last queries focus more on query federation
... these queries also make use of the origins of the
datasets
lena: see Q4
... as scott suggested, i used the VoID vocabulary.
... if you click on the query, the SPARQL query is
automatically entered.
... for the federated queries -- it does not work on some
browsers
michael: would it be possible to have a query that returns all the genes in the final gene list?
... i.e., the simple final gene list for a certain experiment.
... most of these queries do not return much information; it would be nice to know what basic information is available
lena: for the first gene list there are 162
... i can use VoID for that kind of information
egon: the vocabulary predicate you are using on diseasome, what are you doing with that?
lena: what we are doing for all datasets that have been annotated with that vocabulary.
... gives us certainty that we are finding the things we are looking for
scott: but are you referring to diseasome as a dataset? it is not.
<mscottm> I am talking about this part of a query: OPTIONAL { [ rdf:type void:Dataset ; void:sparqlEndpoint ?srvc2 ; void:vocabulary <http://www4.wiwiss.fu-berlin.de/diseasome/resource/diseasome/> ; dct:issued ?issued2 ] . FILTER (?issued2 > ?issued)
kei: is it a language or a vocabulary? how do you use VoID so that a machine knows that this dataset has to do with a certain set of diseases?
(sorry, audio quality is quite bad, hard to discern people)
<Lena> void guide: http://semanticweb.org/wiki/VoiD
kei: the concern is: when we are using VoID to describe data origins, how is the data provenance actually captured by VoID?
lena: you can describe subjects (e.g. "gene").
<kei> I can't hear what Lena said
<Lena> +351 21 4469852
<mscottm> Thank you Eric!
<ericP> the part of Lily Tomlin's "Operator" will be played by ericP today
lena: to find all datasets that have to do with genes, i would have to figure out the URI -- it is easier with the VoID vocabulary.
scott: diseasome is not a vocabulary, but a dataset
lena: you still have to say what is a disease by indicating a full URL
... we need a dataset of genes AND diseases
scott: VoID just gives you means to talk about a graph
kei: would it be a lot of work to switch to that use of subject?
... to make the semantics a bit more understandable
lena: okay. this is easy to change.
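(a minimal sketch of the change discussed above, not the query used on the demo page: it selects void:Dataset descriptions by topic via dcterms:subject instead of void:vocabulary. the helper name build_dataset_query, the example subject URI and the choice of dcterms:subject as the predicate are assumptions for illustration; void:sparqlEndpoint and dct:issued are taken from the fragment scott pasted.)

```python
# Hypothetical helper: builds a SPARQL query that finds dataset
# descriptions by topic rather than by the vocabulary they use.
def build_dataset_query(subject_uri: str) -> str:
    """Return a SPARQL SELECT for datasets annotated with subject_uri."""
    lines = [
        "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>",
        "PREFIX void: <http://rdfs.org/ns/void#>",
        "PREFIX dct: <http://purl.org/dc/terms/>",
        "SELECT ?dataset ?endpoint ?issued WHERE {",
        "  ?dataset rdf:type void:Dataset ;",
        "           void:sparqlEndpoint ?endpoint ;",
        # Assumption: the dataset's topic is stated with dct:subject.
        f"           dct:subject <{subject_uri}> .",
        "  OPTIONAL { ?dataset dct:issued ?issued }",
        "}",
    ]
    return "\n".join(lines)


# Assumed example subject URI; any URI that identifies the topic would do.
DISEASE = "http://dbpedia.org/resource/Disease"
query = build_dataset_query(DISEASE)
print(query)
```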
<ericP> uname -a
(lena and eric talk about issues with 32 vs. 64 bit version of federation software)
eric: but this is not critical for the paper
kei: for the paper the review period is quite short (in the next 3 weeks)
eric: for the second query we are working on, i got too many resolutions from one endpoint; we need to figure out how to solve this.
kei: the query that makes use of PharmGKB is a good example. we do not need to get into biological details for this paper, though.
lena: there would not be enough pages, also the reviewers would not understand.
scott: about the NCBO sparql endpoint: i don't know if there is a way with this microarray scenario. we would need an appropriate vocabulary (e.g. for diseases). but this is a level of provenance that is not fully formalized on the NCBO sparql endpoint.
... e.g., a query that finds all datasets about neurodegenerative diseases -- that would be possible
... another example: if you have a list of neurodegenerative diseases (based on ontology), then you can find data from other neurodegenerative diseases
lena: we could trim the list of diseases in Q4 to only neurodegenerative diseases
kei: in terms of the paper, how do we go about finalizing it?
lena: we have to calculate ~1 page for abstract, 1 page for references
... most of the results can be deferred to links to web pages
kei: lena, you are the person to do the first cut
... still, it is interesting to talk about the data model in the paper and give some examples
lena: i would say keep the diagram, lose the triples. i will make these changes.
kei: the deadline is friday. at some point we need to convert it to the IEEE format. when do we make that switch?
lena: my suggestion is to switch to IEEE on wednesday and have everyone read it.
... on thursday we can still have a conference call and see if we all agree
scott: i would like to have some slides about this work that i can present at Oxford Global Pharma conference in october
<mscottm> Uh oh - on hcls2 Zakim, I get "This passcode is not valid."
<mscottm> Can you help, Eric?
<scribe> Scribe: Matthias Samwald
<mscottm> ericP - can you help with hcls2 (Terminology)?
Got date from IRC log name: 30 Aug 2010
Guessing minutes URL: http://www.w3.org/2010/08/30-hcls-minutes.html