23 Nov 2009

See also: IRC log


Present: +1.410.720.aaaa, YolandaG, +1.414.456.aabb, Kei_Cheung, +1.937.775.aacc, [IPcaller], EricP, Oliver



Yolanda Gil introduction: chairing W3C provenance interest group

Yolanda: talked about semantic workflows

Christian Fritz intro: postdoc with Yolanda at USC, working on semantic workflows

<YolandaGil> The site of the W3C Provenance Group I mentioned is http://www.w3.org/2005/Incubator/prov/wiki

<mscottm> Welcome Yolanda, Christian, and Simon!

Simon Twigger, U. of Wisconsin, intro: interested in integrating semantic web technologies into the rat database they maintain

<Simon> Its the Medical College of Wisconsin, not actually part of the U.Wisconsin system :)

<mscottm> Simon: also one of NCBO's Research Application collaborations

Agenda item 1: how to represent microarray data in RDF/OWL and use this representation to support query federation over the web

query federation paper identified provenance and workflow as future directions; would like to explore microarray use cases in these areas

<kei> http://esw.w3.org/topic/HCLSIG_BioRDF_Subgroup/QueryFederation2?action=AttachFile&do=get&target=Microarray_Use_Case.pdf

<jun> can anyone pls post the link to the irc again? I lost my irc when kei posted the link. thanks

kei: linked doc contains description of example neuroscience microarray experiment

<Joshua> http://esw.w3.org/topic/HCLSIG_BioRDF_Subgroup/QueryFederation2?action=AttachFile&do=get&target=Microarray_Use_Case.pdf

kei: experiment compared gene expression profiles for multiple classes of patients to evaluate the relationship between NFT (neurofibrillary tangles) and Alzheimer's
... has been collaborating with another researcher at Yale to propose a structured digital abstract (not specifically RDF)
... 3 main elements of the structured digital abstract: 1) translation table listing all biological entities (e.g. genes, proteins) in the article (maps human names to db identifiers)
... 2) list of main results described as a simple ontology
... 3) standard evidence codes for how the results were obtained
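
A minimal sketch of the three SDA elements listed above, assuming plain Python dataclasses. The class names, field names, relation name, evidence code, and identifiers are all hypothetical placeholders, not taken from the FEBS SDA format. The check that a result only uses names present in the translation table is a design choice of this sketch.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    """One row of the translation table: human-readable name -> database identifier."""
    name: str
    database: str    # identifier scheme (placeholder, not a real database)
    identifier: str  # placeholder accession, not a real record


@dataclass(frozen=True)
class Result:
    """One main result, stated as a simple subject-relation-object assertion."""
    subject: str
    relation: str
    obj: str
    evidence_code: str  # how the result was obtained


@dataclass
class StructuredDigitalAbstract:
    translation_table: dict = field(default_factory=dict)
    results: list = field(default_factory=list)

    def add_entity(self, entity):
        self.translation_table[entity.name] = entity

    def add_result(self, result):
        # Reject results that mention a name absent from the translation table.
        for name in (result.subject, result.obj):
            if name not in self.translation_table:
                raise KeyError(f"{name!r} is missing from the translation table")
        self.results.append(result)


sda = StructuredDigitalAbstract()
sda.add_entity(Entity("geneA", "example-gene-db", "G0001"))
sda.add_entity(Entity("proteinB", "example-protein-db", "P0002"))
sda.add_result(Result("geneA", "interacts_with", "proteinB", "example-evidence-code"))
```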

<Simon> FEBS info on SDA: http://www.febsletters.org/content/sda_summary

kei: has been evaluated on pilot basis
... Question: can we do something similar for microarray data?

<Simon> FEBS special issue with SDAs: http://www.sciencedirect.com/science/issue/4938-2008-994179991-684107

kei: import the gene lists into a public database (Gene Expression Omnibus, etc.)
... leverage GO for representing the gene list
... also need to capture the context (the biological condition, the brain region); need to capture this contextual information as part of the gene list
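
One possible shape for a gene list that carries its context, sketched as N-Triples built with plain string formatting. The `http://example.org/microarray/` namespace, the property names (`condition`, `brainRegion`, `hasGene`), and the sample values are assumptions made for illustration; no vocabulary was agreed on the call.

```python
EX = "http://example.org/microarray/"  # hypothetical namespace


def literal(text):
    """Format a plain N-Triples string literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def gene_list_triples(list_id, genes, condition, brain_region):
    """Return N-Triples lines describing a gene list and its experimental context."""
    subject = f"<{EX}genelist/{list_id}>"
    lines = [
        f"{subject} <{EX}condition> {literal(condition)} .",
        f"{subject} <{EX}brainRegion> {literal(brain_region)} .",
    ]
    for gene in genes:
        lines.append(f"{subject} <{EX}hasGene> <{EX}gene/{gene}> .")
    return lines


# Placeholder values only; a real list would use database identifiers.
triples = gene_list_triples(
    "list1", ["geneA", "geneB"], "example condition", "example brain region"
)
print("\n".join(triples))
```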

<ssahoo2> Provenance related Experiment Context: http://esw.w3.org/topic/HCLSIG_BioRDF_Subgroup/QueryFederation2?action=AttachFile&do=get&target=NeuroscienceMicroarrayConsortium_ProvenanceTerms.txt

kei: this use case can help guide work on representing microarray data in RDF

<ssahoo2> http://esw.w3.org/topic/HCLSIG_BioRDF_Subgroup/MicroarrayProvenanceUseCase

ericp: NLP over the document will give you the set of keywords but won't give relationships between the keywords

kei: can use text mining to generate initial set of gene names but still need manual curation
... authors need to be convinced to enter gene lists in a standard format; it is currently up to the author whether to make it available

ericp: our work is how to map the SDA into a semantic web representation
... are the identifiers URLs?
... if not URLs, need to provide some additional scheme/context information that can eliminate identifier ambiguity
... are folks open to creating SDA info in RDF from the start?
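
To illustrate ericp's point about identifier ambiguity: a bare local identifier becomes unambiguous once it is paired with its scheme and turned into a URI. The sketch below assumes a hypothetical scheme-to-base-URI table under `example.org`; the scheme names and the `to_uri` helper are invented for illustration.

```python
from urllib.parse import quote

# Hypothetical scheme -> base URI table; a real mapping would use the
# URIs published by each database.
BASE_URIS = {
    "example-gene-db": "http://example.org/gene/",
    "example-protein-db": "http://example.org/protein/",
}


def to_uri(scheme, local_id):
    """Combine an identifier scheme and a local identifier into one URI."""
    try:
        base = BASE_URIS[scheme]
    except KeyError:
        raise ValueError(f"unknown identifier scheme: {scheme!r}") from None
    # Percent-encode the local part so it cannot alter the URI path.
    return base + quote(local_id, safe="")


# The same local identifier in two schemes yields two distinct URIs.
gene_uri = to_uri("example-gene-db", "1234")
protein_uri = to_uri("example-protein-db", "1234")
```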

kei: the key is having an easy interface/tools to create the metadata

mscottm: who are the folks that will be creating the SDA data?

kei: Mark Goldstein at Yale
... need to convince Mark and others to use sem web technology

rfrost: are all resources used in results ontology defined in translation table?

kei: yes

rfrost: what is the target expressiveness of the ontology?

kei: simple set of relations based on results reported in papers

<mscottm> http://www.ebi.ac.uk/gxa/qrs?gprop_0=&gval_0=&fexp_0=UP_DOWN&fact_0=&specie_0=&fval_0=brain&view=hm

mscottm: explore ways of representing gene list data in RDF

<mscottm> http://compbio.dfci.harvard.edu/genesigdb/

kei: explore bridge between paper and gene atlas

?: role of biopax?

<matthias_samwald> (need to leave now, bye)

<mscottm> bye Matthias

kei: you could eventually incorporate gene list representation into BioPAX

Yolanda/Kei: use of GeneSpring to generate results; how to reproduce results if not using that software package

Yolanda/Kei/Scott: providing workflow information and annotated data sufficient to reproduce results as well as apply methodology to other experiments

Yolanda/Kei/Scott: semantic annotation/description of workflow would enable the retrieval of data relevant to that workflow (i.e. data that could be used to populate that workflow for a different experimental scenario)

Scott: two data sources relevant to gene lists: genesigdb, gene expression atlas
... Helen indicated that we can get RDF data returned from gxa

rfrost: what would RDF look like from gxa?
... just URLs/linked data or RDF that references an ontology?

scott: would be RDF that references an ontology

scott/yolanda/kei: to do: look at application of workflow systems to microarray analysis

scott: experimental factor ontology (http://www.ebi.ac.uk/efo/)

kei: try to create a workflow process for generating the gene list?

scott: given analytical challenge, may be better to start with the gene list and look at workflows that start with the gene list

yolanda: want to focus on a workflow where the use of semantics has a clear goal

yolanda/satya: example: provenance of the microarray data, represented in RDF, would impact the execution of the workflow

yolanda: provenance would include information about how the microarray data was obtained (equipment, data processing steps, etc.)

satya: provenance data could reference OBI terms
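
A rough sketch of how provenance could affect workflow execution, as in the yolanda/satya example: steps already recorded in the provenance are skipped. The `Provenance` record, the step names, and the `plan_workflow` helper are hypothetical; in the RDF setting discussed here the fields could instead be ontology terms (for example from OBI).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provenance:
    """Hypothetical provenance record for one microarray data set."""
    equipment: str
    processing_steps: tuple  # steps already applied to the data, in order


def plan_workflow(provenance, required_steps):
    """Return the required steps that the provenance does not already record."""
    done = set(provenance.processing_steps)
    return [step for step in required_steps if step not in done]


raw = Provenance("example-scanner", ())
normalized = Provenance("example-scanner", ("background_correction", "normalization"))
required = ["background_correction", "normalization", "differential_expression"]

print(plan_workflow(raw, required))
print(plan_workflow(normalized, required))
```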

scott: some of this provenance data will already be in the MAGE files
... type of analysis may not be included

jun: good to review existing data files (MAGE) and evaluate to what extent necessary provenance data is contained and, if not, what ontologies define these concepts

kei: can start that process via a wiki page

scott: another idea for application of workflows: visualization of data related to differentially expressed genes

jun: should combine provenance data with domain-specific annotations

satya: will create wiki page with set of experimental contexts for evaluation


<mscottm> ericP - you still on?

Summary of Action Items

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.135 (CVS log)
$Date: 2009/11/23 17:26:26 $
