14:57:15 RRSAgent has joined #hcls
14:57:15 logging to http://www.w3.org/2010/05/24-hcls-irc
14:57:22 Zakim, this will be BioRDF
14:57:22 ok, kei; I see SW_HCLS(BioRDF)11:00AM scheduled to start in 3 minutes
14:57:25 matthias_samwald has joined #hcls
14:57:41 agenda+ intro [Kei]
14:58:20 agenda+ genelist representation [Jun, Lena, Scott, Satya]
14:58:33 agenda+ iphone app [Don]
14:58:42 Zakim, take up next agendum
14:58:42 agendum 1. "intro" taken up [from Kei]
15:00:39 SW_HCLS(BioRDF)11:00AM has now started
15:00:46 + +1.832.386.aaaa
15:00:50 jun has joined #hcls
15:01:07 LenaDeus has joined #hcls
15:01:17 +Kei_Cheung
15:01:39 +??P5
15:01:58 Zakim, ??P5 is matthias_samwald
15:02:03 +matthias_samwald; got it
15:03:17 scribenick matthias
15:04:52 +[IPcaller]
15:05:01 +??P11
15:05:04 mscottm has joined #hcls
15:05:52 +mscottm
15:06:22 Zakim, who is here?
15:06:22 On the phone I see +1.832.386.aaaa, Kei_Cheung, matthias_samwald, [IPcaller], ??P11, mscottm
15:06:24 rfrost has joined #hcls
15:06:25 On IRC I see mscottm, LenaDeus, jun, matthias_samwald, RRSAgent, Zakim, kei, trackbot, ericP
15:07:06 kei: we have been exchanging e-mails in the last few days, lena has been looking into genelist representation
15:07:14 http://esw.w3.org/HCLSIG_BioRDF_Subgroup/QueryFederation2
15:08:02 ... i made some edits to that page: first, giving high-level context
15:08:04 + +1.937.775.aabb
15:09:21 ... the more detail the metadata/annotations give, the more likely the user is to find relevant datasets
15:09:43 ssahoo2 has joined #HCLS
15:09:56 ... another potential benefit of adding metadata to microarray datasets is that it allows us to better compare experiments.
15:11:30 ... and, of course, another major benefit is integration. microarray results describe a lot of different gene expression patterns, but sample size is often limited. pooling could increase sample size, improving statistical power.
15:15:26 ... 
often you find a selective gene list in papers based on microarray results. a small number of genes is listed in the paper. they are often exposed as human-readable tables, but not in a machine-readable format.
15:16:05 ... i added all these points to the wiki page.
15:16:49 ... at the bottom of this page i also added several queries.
15:17:06 ... still quite general, but a start.
15:21:32 matthias: what is "fold change"?
15:21:55 kei: a ratio that gives information about up- or downregulation
15:23:21 scott: they use significance analysis to see if significant differential expression happened. fold change is intuitive, but it does not provide the whole picture.
15:23:34 lena: when we talked about gene lists, is there any other measure?
15:23:39 scott: the p value.
15:23:48 lena: i think the p value refers to the fold change?
15:24:00 scott: no. fold change refers to the raw signal.
15:24:53 scott: often it is indicative, but after normalisation and statistical analysis, some drastic changes in fold change turn out not to be caused by significant differential expression.
15:25:13 kei: i think this discussion is helpful.
15:26:54 ... we are not the ones who can judge the validity of results, but we can use metadata to expose context. this makes it possible for other researchers to judge validity and find possible improvements.
15:28:25 scott: people would like to know where a gene list came from, which algorithms were used etc.
15:28:36 ... knowing that can make data more valuable.
15:28:40 egonw has joined #hcls
15:30:32 scott: recently i had the chance to talk to chris stockert of MGED. i asked him how to push this forward. he suggested contacting publishers.
15:31:00 ... publishers could ask authors to submit information. going back to unstructured text is not feasible.
15:31:08 - +1.937.775.aabb
15:31:45 + +1.937.775.aacc
15:32:12 ... i also talked to [name] from gene atlas; interested in RDF representation, they are facing similar problems
15:32:19 ... 
we should ask him to join the call
15:33:35 kei: if we don't go through it ourselves, we will have problems with collaborating with others. trying it ourselves will give us a good understanding.
15:35:27 ... some of us will look at the spreadsheet and try to come up with an RDF representation
15:38:23 lena: the different gene lists are giving us different data. there is something missing in list 2.
15:39:55 lena: they are only giving us the signal, which is only meaningful in the context of the group and is not comparable
15:40:28 http://neuinfo.org/
15:40:46 scott: the main thing of interest is the list of genes itself. just knowing the lists of genes already enables queries.
15:41:32 ... the NIF project would be happy to support our work
15:41:47 ... they can provide vocabulary about the neuroscience domain
15:42:02 http://bioportal.bioontology.org/
15:42:03 ... BioPortal is also relevant
15:42:30 ... if you go to BioPortal and search for 'entorhinal cortex', you will find it in some of the ontologies.
15:43:30 -[IPcaller]
15:43:49 kei: i also recently talked to maryann and steven larson.
15:44:27 ... there will be a SPARQL endpoint for Neurolex (which is different from the NIF ontology)
15:45:41 scott: NCBO will also soon have a SPARQL endpoint.
15:46:37 It has one now: http://sparql.bioontology.org:8080/webui/
15:47:05 Also: http://sparql.bioontology.org/webui/
15:47:54 kei: we do not need to worry too much about the quantitative values at the moment. researchers mainly want to know about 'is it under/overexpressed?'. it is mostly a qualitative query.
15:49:20 lena: in some of the gene lists i cannot see the information about whether they were up- or downregulated
15:49:40 kei: the page usually gives some context.
15:50:32 kei: knowledge about some kind of differential expression could already help.
15:54:34 lena: fold change could give some hints on whether a gene is over- or underexpressed
15:55:04 scott: usually scientists do RT-PCR to see if a gene is over- or underexpressed
15:56:29 kei: of course we do not need to exclude those quantitative values when we have them
15:57:42 kei: satya, do you have any recommendations for provenance representation?
15:58:13 satya: we are collaborating with the sci-discourse group, have a shared OWL file, maybe lena could discuss with them as well.
15:58:39 lena: before we go to sci-discourse, we need to make sure it uses existing terms
15:59:17 satya: we are already using external terms, we make it clear that it undergoes changes.
16:00:33 scribenick: scott
16:01:14 Matthias: Yes, the HCLS KB contains the diseasome dataset and all of the datasets that came out of the LODD work.
16:02:37 matthias: nothing new about Science Commons knowledge base.
16:03:03 ... I haven't heard anything from Neurocommons. A new release is supposed to be on the way.
16:03:26 scribenick: matthias
16:04:45 Bye - I have to go to the Terminology call.
16:04:58 -mscottm
16:05:17 thanks, scott
16:05:31 satya: KEGG, BioCyc etc. expose pathway data as RDF (BioPAX)
16:08:40 kei: the group can also think about publication opportunities.
16:09:02 kei: i will be away between late june and late july.
16:11:14 kei: jun, maybe you could help out during july.
16:11:50 kei: we should try to keep the momentum going. is it okay for everyone to participate in weekly calls?
16:13:10 - +1.832.386.aaaa
16:13:12 - +1.937.775.aacc
16:13:17 -??P11
16:13:18 -Kei_Cheung
16:13:44 zakim, generate minutes
16:13:44 I don't understand 'generate minutes', matthias_samwald
16:14:04 -matthias_samwald
16:14:05 SW_HCLS(BioRDF)11:00AM has ended
16:14:07 Attendees were +1.832.386.aaaa, Kei_Cheung, matthias_samwald, [IPcaller], mscottm, +1.937.775.aabb, +1.937.775.aacc
16:16:03 rrsagent, please make logs world-visible
16:16:27 rrsagent, please create the minutes
16:16:27 I have made the request to generate http://www.w3.org/2010/05/24-hcls-minutes.html matthias_samwald
16:17:16 rrsagent, please make logs world-visible
16:18:18 scribenick: matthias_samwald
16:18:30 rrsagent, please create the minutes
16:18:30 I have made the request to generate http://www.w3.org/2010/05/24-hcls-minutes.html matthias_samwald
16:28:40 matthias_samwald has left #hcls
17:36:25 egonw has joined #hcls