14:53:06 RRSAgent has joined #hcls
14:53:06 logging to http://www.w3.org/2010/06/14-hcls-irc
14:53:13 Zakim, this will be BioRDF
14:53:13 ok, kei; I see SW_HCLS(BioRDF)11:00AM scheduled to start in 7 minutes
14:53:31 agena+ introdution [Kei]
14:53:41 agenda+ introduction [Kei]
14:53:59 agenda+ NIF SPARQL endpoints [Jeff, Stephen]
14:54:24 agenda+ genelist rdf representation [Lena, Jun, Satya, Scott]
14:59:32 matthias_samwald has joined #hcls
15:00:30 slars0n has joined #hcls
15:01:13 SW_HCLS(BioRDF)11:00AM has now started
15:01:19 +Kei_Cheung
15:01:41 +??P18
15:01:56 +[IPcaller]
15:04:51 +??P0
15:05:27 mscottm has joined #hcls
15:06:18 mscottm2 has joined #hcls
15:06:23 + +1.650.331.aaaa
15:09:57 scribenick matthias
15:10:07 scribenick: matthias_samwald
15:10:20 Zakim, +1.650.331.aaaa is daniel_rubin
15:10:20 +daniel_rubin; got it
15:10:58 + +1.858.353.aabb
15:11:15 stephen: my name is stephen larson, i am a 5th year neuroscience / IT student, my advisor is maryanne martone
15:11:40 ... one of the resources i put together is Neurolex.org (building a standard ontology)
15:11:53 ... i also built the 'whole brain catalog', connects to neurolex
15:12:21 ... also built the 'multi-brain connectome browser', based on SPARQL queries of NIF and Neurolex.
15:13:20 ... i plan to finalize my work over the next year.
15:13:47 ... i am increasingly interested in the BioRDF world. Was excited about the LODD paper for triplification challenge.
15:14:01 ... thought about possibilities for re-using that content for the Wiki
15:14:45 Jeff: My name is Jeff Grethe, I am one of the co-PIs of NIF
15:15:13 ... previously, i have been involved with BIRN, worked on bringing MRI data on the web
15:16:09 ... i have worked with Maryanne on a user-centered portal, but also on the back-end. we began publishing RDF with Neurolex, we also support tools for publication in RDF format from other data sources.
15:17:21 daniel: My name is Daniel Rubin, i am from Stanford University, I am no longer affiliated with NCBO, now working in Radiology group at stanford (data integration in the context of imaging, "image-phenotype correlation").
15:17:48 ... i am still involved in NCBO, synergy with imaging projects of maryanne martone
15:18:20 ... why i joined this call: i am interested in linked data for imaging.
15:18:42 ... image-phenotype data connected to disease information on the web.
15:19:10 ... scott and eric have visited stanford, we have been working on setting up SPARQL endpoints.
15:20:07 kei: i want to give a bit of context. part of the agenda is to have jeff and stephen give a description of the new NIF sparql endpoints.
15:20:19 ssahoo2 has joined #HCLS
15:20:21 ... this is related to our broader query federation use-case.
15:21:13 ... more recently we also looked at a more specific use case, microarray data.
15:21:33 ... we have looked at some examples of microarray results in the area of neurological diseases.
15:21:52 ... from gene expression data we could also link to other kinds of data, including imaging.
15:22:12 + +1.937.775.aacc
15:22:41 zakim, +1.937.775.aacc is Satya Sahoo
15:22:41 I don't understand '+1.937.775.aacc is Satya Sahoo', matthias_samwald
15:23:07 zakim, +1.937.775.aacc is ssahoo2
15:23:07 +ssahoo2; got it
15:23:20 kei: let us start with the description of NIF endpoints.
15:24:26 jeff: we can divide it into two types of content: the entities in NIF, and the properties that are entered by the community.
15:24:27 NIF SPARQL Endpoint: https://confluence.crbs.ucsd.edu/display/NIF/Sparql+endpoint
15:24:47 ... this is available in several ways. first, a SPARQL endpoint.
15:25:46 ... second, extracted data from literature, and making it query-able. this data will also be available through the SPARQL endpoint.
15:26:48 ... this content will be available in September.
15:27:49 kei: for the microarray use-case, we have looked at some examples, such as Alzheimer's disease. Information about different types of neurons, brain regions etc. would be very helpful for annotation.
15:29:09 kei: you also mentioned the literature aspect. one of the challenges we encountered was extracting gene lists from papers.
15:30:19 stephen: to get a sense of the basic structure of what we are doing here: we are going through a loop between an OWL file which contains the NIF content, and a Semantic MediaWiki, which has every entity in that ontology rendered as a page.
15:30:37 ... an ontology engineer can track the changes in the wiki and update the OWL ontology.
15:30:53 http://neurolex.org/wiki/Main_Page
15:31:02 ... as the OWL file changes, the engineers will update the wiki.
15:31:28 ... neurolex is easily accessible through the web browser.
15:32:05 ... our goal with the ontology was to be very comprehensive. instead of linking out, we brought everything in.
15:32:19 ... now that semantic web is growing, we are evaluating ways of linking out.
15:32:47 ... the SPARQL endpoint i sent before contains a lot of OWL statements (restrictions etc.)
15:32:47 http://neurolex.org/wiki/SparqlEndPoint
15:33:03 ... the SPARQL endpoint i sent just now comes from the Semantic MediaWiki export.
15:33:22 ... this version has less OWL (restrictions etc.) in it.
15:33:41 ... the two endpoints are on different servers.
15:34:02 ... the ontology endpoint is on a virtuoso server. advantage: can do transitive queries.
15:34:41 ... the performance of transitive queries is good.
15:35:04 satya: did you run rules / pre-inferencing?
15:35:29 stephen: the transitive operation does not require rulesets as far as i know, you just add it to the query.
15:35:38 stephen: don't know about internals.
15:35:51 (sorry, scott)
15:36:14 s/satya/scott/
15:36:40 stephen: we used a cloud-based service that lets you do SPARQL
15:36:52 ... has well-documented update facilities.
15:37:24 ... you can even have a 'history' of updates.
15:38:13 http://n2.talis.com/wiki/Main_Page
15:38:39 ... (N2 by Talis)
15:40:13 kei: in HCLS we have two instances of Knowledge Bases: the one at DERI (based on Virtuoso), one at University of Berlin (based on AllegroGraph).
15:42:01 ... we have the endpoints, but users still need to know detailed graph structure. it would be helpful to have some high-level metadata that would help users know what information is contained in an endpoint, what information can be interrelated between endpoints...
15:42:20 ... at the moment we have to develop federated queries at a very low level.
15:43:25 scott: at the moment we have a few, nice, useful SPARQL endpoints, but in the future there could be thousands of endpoints to choose from
15:43:49 ... the ultimate form of federation would be asking the question at one place and having it automatically distributed to the right places.
15:45:07 ... OWL, SKOS? is it exposed via D2R or SWObjects? Licensing information?
15:45:37 ... you also need to know the contents. having very condensed information about what is contained in the named graph.
15:46:38 egonw has joined #hcls
15:46:41 jeff: we are extracting data from tables, we have a curator working on that.
15:47:13 ... e.g., how up- and down-regulation is represented. we use a mixture of automated tools and manual curation.
15:47:56 ... the tables usually come from HTML/PDF version of papers. sometimes also from supplemental material.
15:50:31 scott: another aspect (having spoken to chris stoeckert)... if we take this not only to MGED, but also the publishers, and try to get them to have researchers submit gene lists, that would solve this problem in the future.
15:51:21 kei: the NIF ontology will also be deposited in NCBO BioPortal
15:51:35 kei: BioPortal has its own SPARQL endpoint, too
15:51:55 kei: will there be redundancy? which endpoints / URIs will I use?
15:52:43 jeff: Neurolex is the 'working draft', before it goes through the rigours of ontology engineering.
15:52:59 egonw has joined #hcls
15:54:33 ... NCBO is a community place.
15:55:13 scott: i suppose that some of the data released in september will also contain the data that was annotated
15:55:17 jeff: yes
15:55:31 scott: you could also make that data available from NCBO
15:55:55 sorry I have to leave
15:56:01 bye
15:56:22 -ssahoo2
15:57:54 kei: another topic: gene lists. a number of us have been working on how to represent gene lists.
15:58:21 kei: we could look at Neurolex to see which neuroscience terms we can extract from these endpoints that would be relevant for annotation.
15:58:53 kei: matthias has also been working with aTags, used NCBO resources.
15:59:20 egonw_ has joined #hcls
16:01:35 kei: we need an iterated process of debugging, based on use-cases
16:02:56 kei: i will be away, jun will convene some of the calls
16:03:34 stephen: we would be happy to receive feedback, suggestions for links.
16:04:16 scott: one potential use-case would be EHRs, helping clinicians with certain tasks through integrated information.
16:05:08 -daniel_rubin
16:05:34 -??P18
16:05:37 -Kei_Cheung
16:05:40 kei: feel free to e-mail me while i am away throughout the next 5 weeks
16:05:41 -??P0
16:05:57 - +1.858.353.aabb
16:07:33 zakim, please make logs world-visible
16:07:33 I don't understand 'please make logs world-visible', matthias_samwald
16:09:10 RRSAgent, please draft minutes
16:09:10 I have made the request to generate http://www.w3.org/2010/06/14-hcls-minutes.html matthias_samwald
16:09:20 RRSAgent, please make log world-visible
16:09:58 -[IPcaller]
16:09:59 SW_HCLS(BioRDF)11:00AM has ended
16:10:01 Attendees were Kei_Cheung, [IPcaller], daniel_rubin, +1.858.353.aabb, ssahoo2
16:14:04 RRSAgent, please make log world-visible
16:14:25 RRSAgent, please draft minutes
16:14:25 I have made the request to generate http://www.w3.org/2010/06/14-hcls-minutes.html matthias_samwald
17:45:12 matthias_samwald has left #hcls
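
Note on the endpoint metadata discussed at 15:42 to 15:45: kei and scott ask for condensed, high-level descriptions of what each SPARQL endpoint contains, so that a question could be routed to the right endpoints without knowing each graph in detail. The following is only a rough sketch of that idea, not something presented on the call. The `EndpointDescription` class, the `select_endpoints` function, the registry contents and the `example.org` URLs are all hypothetical placeholders, and a real description would more likely be published in RDF than kept in a Python list.

```python
from dataclasses import dataclass

RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
SMW = "http://example.org/smw/property#"  # hypothetical wiki-export namespace


@dataclass(frozen=True)
class EndpointDescription:
    """Condensed, hypothetical metadata about one SPARQL endpoint."""
    url: str
    vocabularies: frozenset  # namespace URIs the endpoint's data uses
    license: str = "unknown"


# Hypothetical registry: one OWL-heavy ontology endpoint and one
# wiki-export endpoint, loosely mirroring the two kinds described above.
REGISTRY = [
    EndpointDescription("http://example.org/ontology/sparql",
                        frozenset({RDFS, OWL})),
    EndpointDescription("http://example.org/wiki-export/sparql",
                        frozenset({RDFS, SMW})),
]


def select_endpoints(required_vocabularies, registry=REGISTRY):
    """Return URLs of endpoints whose metadata covers every required namespace."""
    needed = frozenset(required_vocabularies)
    return [e.url for e in registry if needed <= e.vocabularies]


if __name__ == "__main__":
    # A query that relies on OWL restrictions only matches the ontology endpoint.
    print(select_endpoints({OWL}))
```

A federation layer could call `select_endpoints` with the namespaces a query uses and send the query only to the returned URLs. Matching on namespaces alone is a deliberate simplification; licensing, exposure method (e.g. D2R or SWObjects) and named-graph contents, all raised in the discussion, would need further fields.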