14:55:53 RRSAgent has joined #hcls
14:55:53 logging to http://www.w3.org/2010/08/09-hcls-irc
14:56:04 Zakim, this will be BioRDF
14:56:04 ok, kei; I see SW_HCLS(BioRDF)11:00AM scheduled to start in 4 minutes
14:56:18 agenda+ introduction [Kei]
14:56:48 agenda+ rdf representation of genelists [lena, jun, satya, michael, scott]
14:57:05 agenda+ paper [All]
14:57:16 Zakim, take up next agendum
14:57:16 agendum 2. "RDF genelist representation" taken up [from Jun, Lena, Satya, Scott, Michael]
15:00:26 jun has joined #hcls
15:01:27 SW_HCLS(BioRDF)11:00AM has now started
15:01:28 matthias_samwald has joined #hcls
15:01:34 +Kei_Cheung
15:02:16 + +1.510.527.aaaa
15:02:20 ssahoo2 has joined #HCLS
15:02:41 LenaDeus has joined #hcls
15:02:56 +??P3
15:03:01 mscottm has joined #hcls
15:03:08 +[IPcaller]
15:03:11 + +1.832.386.aabb
15:03:25 zakim, [IPcaller] is jun
15:03:25 +jun; got it
15:03:27 +??P6
15:03:40 +mscottm
15:03:49 zakim, ??P6 is matthias_samwald
15:03:49 +matthias_samwald; got it
15:04:19 I can scribe
15:05:13 scribenick lena
15:05:23 - +1.832.386.aabb
15:06:13 mscottm has joined #hcls
15:08:11 io-informatics.com
15:08:46 + +1.832.386.aacc
15:09:14 Lena has joined #hcls
15:09:25 sorry... my internet connection was lost
15:09:57 IO informatics is moving towards integration of biomedical data;
15:10:14 mscottm: data sharing in bioinformatics
15:10:37 (sorry... who is talking?)
15:12:31 That was Chuck Raffi (sp?) from IO Informatics
15:12:35 kei: been working on data integration using rdf/owl to standardize in a machine-understandable way
15:13:22 kei: provenance, data used to describe experiments; how to link to existing ontologies
15:13:30 lena, you are causing noise while typing.
15:13:50 (thanks, just muted myself :-) )
15:14:29 kei: could recommend best practices to facilitate data integration
15:14:46 kei: could be applied to microarray data, but to other datasets as well
15:15:27 kei: capture the relationships to support semantic queries
15:16:31 kei: finalizing the genelist as soon as possible due to the approaching submission deadline
15:17:09 I think I should talk about Lena
15:17:38 s/about/after/
15:19:03 kei: introduction almost complete
15:20:35 kei: possible link to other ontologies (provenir?) that our structure could be related to
15:22:13 jun: provenance information is quite rich
15:22:28 jun: looking at what was used to generate the samples
15:23:21 kei: capture the version of software
15:23:42 satya: have not modified the ontology
15:25:26 jun: doap ( ?) has been used to describe software
15:26:16 jun: makes sense to start with the queries, decide what the rdf representation of the data looks like, and then represent it in the ontology
15:26:54 http://usefulinc.com/ns/doap#
15:27:02 kei: our work should be in line with other existing efforts
15:27:13 kei: examples are array express or mged group
15:27:54 ssahoo2 (breaking up): ncbo and obi ontologies
15:28:06 ssahoo2: making sure we are not re-inventing the wheel
15:28:32 ssahoo2: nci thesaurus and other ontologies, need to be careful to avoid re-creating the ontologies
15:29:08 mscottm: in contact with EBI who have said that they have not done this yet
15:29:43 mscottm: software ontologies have been used
15:30:27 kei: look at the rdf structure and see how well the queries can be answered
15:31:05 kei: decide what unique things we have contributed and how we can link to other groups' work
15:31:07 http://bioportal.bioontology.org/ontologies/42036
15:31:18 http://www.ebi.ac.uk/efo/swo
15:32:30 kei: other potential datasets that we can integrate with (pathway/protein/diseasome)
15:33:47 kei: scott mentions uniprot dataset
15:34:01 mscottm: uniprot has its own rdf representation - not in hcls kb because it is very large
15:34:23 mscottm: we can integrate relevant parts of the uniprot datasets
15:34:35 kei: interesting to integrate genomics and proteomics
15:35:10 mscottm: putting uniprot datasets behind a sparql endpoint has been done (but used their own flavor of rdf structure)
15:36:00 mscottm: sticky point is coordination with others in the community
15:36:10 mscottm: is there a way to coordinate with bio2rdf?
15:36:31 mscottm: while we are at it, why not integrate shared vocabularies?
15:37:05 mscottm: use some of the information in the protein records
15:37:14 http://www.stanford.edu/~coulet/material/ontology/phare.owl
15:37:31 mscottm: another source of rdf: it's a pharma ontology about genes, drugs and diseases
15:37:52 mscottm: put together at ncbo by Adrian Cullet (?)
15:38:02 mscottm: the source of information is nlp techniques
15:38:11 http://sparql.bioontology.org/webui
15:38:16 mscottm: (not a text miner)
15:38:18 http://www.stanford.edu/~coulet/material/sparql_queries
15:38:44 mscottm: already behind a sparql endpoint
15:39:43 scott agreed to coordinate with other sparql endpoints
15:40:35 matthias_samwald: annotating some of the text associated with the microarray studies
15:40:45 matthias_samwald: need to know which kinds of studies we chose
15:42:15 mscottm: use void
15:42:37 mscottm: void statement can be inserted into the graph itself
15:42:52 mscottm: can also put the statement in the second graph
15:43:19 mscottm: use void to refer to who created a particular statement
15:44:21 jun has joined #hcls
15:46:05 to follow up on Scott's idea - should we treat Lena's RDF file and Jun's RDF file as two separate sources?
15:46:28 satya - they are not separate representations, they are follow ups
15:46:56 the name at the end of each file is not its owner but the latest person who made the modifications
15:47:15 mscottm: idea is coming up with interesting provenance information
15:47:53 right - we add distinct named graph ids with each of the ttl files and issue a federated SPARQL query directed to each named graph
15:48:09 yes
15:48:23 ok
15:48:58 http://ibl.mdanderson.org/~mhdeus/sparql_federation/endpoint2.php
15:52:05 yes, sure
15:52:13 yes
15:53:35 matthias_samwald: if we focus on some example it is easier to connect to other sources
15:54:06 kei: think about some example queries that will give broader integration with other types of data
15:54:58 mscottm: will look into how niff is doing relative to its work with microarrays
15:55:56 Chuck has joined #HCLS
15:56:27 mscottm: problem is our focus queries are too directed to the neurosciences, but not so much towards provenance
15:56:46 kei: jun is adding the provenance dimension to the paper
15:58:16 mscottm: if we know where the genelist came from and the microarray experiment, we can come up with the experiments as results from previous provenance queries
15:58:35 mscottm: we can build a query that precedes the selection of the 3 datasets
15:58:48 kei: all the examples are affy platforms, but different statistical approaches
16:02:49 kei: main tasks - get the examples working!
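[editor's sketch] The suggestion at 15:47:53 (give each ttl file its own named graph id and direct a query at each named graph) could look roughly like the Python below. This is only an illustration under stated assumptions: the graph IRIs and the `hasGene` predicate are hypothetical placeholders, not names used by the group, and the query uses SPARQL 1.1 `VALUES`/`GRAPH` syntax against a single store. It does not cover federation across separate endpoints, which would need a `SERVICE` clause per endpoint.

```python
# Hypothetical named-graph IRIs, one per Turtle file (not taken from the log).
NAMED_GRAPHS = [
    "http://example.org/graphs/genelist-lena",
    "http://example.org/graphs/genelist-jun",
]

# Hypothetical predicate linking a gene list to a gene (an assumption).
HAS_GENE = "http://example.org/vocab#hasGene"


def build_named_graph_query(graph_iris, predicate_iri):
    """Return a SPARQL 1.1 SELECT that matches the same triple pattern in
    each listed named graph and reports the graph every row came from."""
    if not graph_iris:
        raise ValueError("at least one named graph IRI is required")
    # VALUES restricts ?g to the listed graphs; GRAPH ?g scopes the pattern.
    values = " ".join("<{}>".format(iri) for iri in graph_iris)
    return (
        "SELECT ?g ?list ?gene WHERE {\n"
        "  VALUES ?g { " + values + " }\n"
        "  GRAPH ?g { ?list <" + predicate_iri + "> ?gene }\n"
        "}"
    )


query = build_named_graph_query(NAMED_GRAPHS, HAS_GENE)
print(query)
```

Because `?g` is returned with every row, each result keeps a pointer to the file it came from, which is the kind of provenance information discussed above.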
16:03:29 -matthias_samwald
16:03:41 -jun
16:03:48 mscottmarshall@gmail.com
16:03:53 -??P3
16:04:07 - +1.832.386.aacc
16:04:11 -mscottm
16:04:14 RRSAgent, please draft minutes
16:04:14 I have made the request to generate http://www.w3.org/2010/08/09-hcls-minutes.html kei
16:04:15 - +1.510.527.aaaa
16:04:23 RRSAgent, please make log world-visible
16:09:14 disconnecting the lone participant, Kei_Cheung, in SW_HCLS(BioRDF)11:00AM
16:09:18 SW_HCLS(BioRDF)11:00AM has ended
16:09:20 Attendees were Kei_Cheung, +1.510.527.aaaa, +1.832.386.aabb, jun, mscottm, matthias_samwald, +1.832.386.aacc
16:21:07 matthias_samwald has joined #hcls