SV_MEETING_TITLE -- 14 Jun 2010

<kei> agenda+ introduction [Kei]

<kei> scribenick matthias

<scribe> scribenick: matthias_samwald

stephen: my name is stephen larson, i am a 5th year neuroscience / IT student, my advisor is maryanne martone
... one of the resources i put together is Neurolex.org (building a standard ontology)
... i also built the 'whole brain catalog', connects to neurolex
... also built the 'multi-brain connectome browser', based on SPARQL queries of NIF and Neurolex.
... i plan to finalize my work over the next year.
... i am increasingly interested in the BioRDF world. Was excited about the LODD paper for triplification challenge.
... thought about possibilities for re-using that content for the Wiki

Jeff: My name is Jeff Grethe, I am one of the co-PIs of NIF
... previously, i have been involved with BIRN, worked on bringing MRI data on the web
... i have worked with Maryanne on a user-centered portal, but also on the back-end. we began publishing RDF with Neurolex, we also support tools for publication in RDF format from other data sources.

daniel: My name is Daniel Rubin, i am from Stanford University, I am no longer affiliated with NCBO, now working in the Radiology group at stanford (data integration in the context of imaging, "image-phenotype correlation").
... i am still involved in NCBO, synergy with imaging projects of maryanne martone
... why i joined this call: i am interested in linked data for imaging.
... image-phenotype data connected to disease information on the web.
... scott and eric have visited stanford, we have been working on setting up SPARQL endpoints.

kei: i want to give a bit of context. part of the agenda is to have jeff and stephen give a description of the new NIF sparql endpoints.
... this is related to our broader query federation use-case.
... more recently we also looked at a more specific use case, microarray data.
... we have looked at some examples of microarray results in the area of neurological diseases.
... from gene expression data we could also link to other kinds of data, including imaging.
... let us start with the description of NIF endpoints.

jeff: we can divide it into two types of content: the entities in NIF, and the properties that are entered by the community.

<slars0n> NIF SPARQL Endpoint: https://confluence.crbs.ucsd.edu/display/NIF/Sparql+endpoint

jeff: this is available in several ways. first, a SPARQL endpoint.
... second, extracted data from literature, and making it query-able. this data will also be available through the SPARQL endpoint.
... this content will be available in September.
kei: for the microarray use-case, we have looked at some examples, such as Alzheimer's disease. Information about different types of neurons, brain regions etc. would be very helpful for annotation.

kei: you also mentioned the literature aspect. one of the challenges we encountered was extracting gene lists from papers.

stephen: to get a sense of the basic structure of what we are doing here: we are going through a loop between an OWL file which contains the NIF content, and a Semantic MediaWiki, which has every entity in that ontology rendered as a page.
... an ontology engineer can track the changes in the wiki and update the OWL ontology.

<slars0n> http://neurolex.org/wiki/Main_Page

stephen: as the OWL file changes, the engineers will update the wiki.
... neurolex is easily accessible through the web browser.
... our goal with the ontology was to be very comprehensive. instead of linking out, we brought everything in.
... now that semantic web is growing, we are evaluating ways of linking out.
... the SPARQL endpoint i sent before contains a lot of OWL statements (restrictions etc.)

<slars0n> http://neurolex.org/wiki/SparqlEndPoint

stephen: the SPARQL endpoint i sent just now comes from the Semantic MediaWiki export.
... this version has less OWL (restrictions etc.) in it.
... the two endpoints are on different servers.
... the ontology endpoint is on a virtuoso server. advantage: can do transitive queries.
... the performance of transitive queries is good.

scott: did you run rules / pre-inferencing?

stephen: the transitive operation does not require rulesets as far as i know, you just add it to the query.
... don't know about internals.
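
A minimal sketch of what a transitive query computes, assuming a toy set of subclass edges. The class names and the example.org URI are hypothetical and not taken from the NIF ontology. The query string uses standard SPARQL 1.1 property-path syntax for comparison; the Virtuoso-specific transitive option mentioned above is not shown.

```python
from collections import deque

# Hypothetical subClassOf edges (child -> parent); not taken from the NIF ontology.
SUBCLASS_OF = [
    ("PurkinjeCell", "Neuron"),
    ("GranuleCell", "Neuron"),
    ("Neuron", "Cell"),
]


def superclasses(term, edges):
    """Return every ancestor of `term` reachable over one or more edges."""
    parents = {}
    for child, parent in edges:
        parents.setdefault(child, set()).add(parent)

    seen = set()
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for parent in parents.get(current, ()):
            if parent not in seen:  # guards against cycles and repeated work
                seen.add(parent)
                queue.append(parent)
    return seen


# The same question as a SPARQL 1.1 property path ("+" means one or more steps).
# The subject URI is a placeholder.
TRANSITIVE_QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?ancestor WHERE {
  <http://example.org/PurkinjeCell> rdfs:subClassOf+ ?ancestor
}
"""
```

Calling `superclasses("PurkinjeCell", SUBCLASS_OF)` walks the edges breadth-first, so no ruleset or pre-computed inference is needed, which matches the point that the transitive step is simply part of the query.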

(sorry, scott)

stephen: we used a cloud-based service that lets you do SPARQL
... has well-documented update facilities.
... you can even have a 'history' of updates.

<slars0n> http://n2.talis.com/wiki/Main_Page

stephen: (N2 by Talis)

kei: in HCLS we have two instances of Knowledge Bases: the one at DERI (based on Virtuoso), one at University of Berlin (based on AllegroGraph).
... we have the endpoints, but users still need to know detailed graph structure. it would be helpful to have some high-level metadata that would help users know what information is contained in endpoint, what information can be interrelated between endpoints...
... at the moment we have to develop federated queries at a very low level.

scott: at the moment we have a few, nice, useful SPARQL endpoints, but in the future there could be thousands of endpoints to choose from.
... the ultimate form of federation would be asking the question at one place and having it automatically distributed to the right places.
... OWL, SKOS? is it exposed via D2R or SWObjects? Licensing information?
... you also need to know the contents. having very condensed information about what is contained in the named graph.
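
A rough sketch of the kind of high-level endpoint metadata discussed above, and of how a client might use it to decide where to send a question. The endpoint URLs, field names, and topic labels are all hypothetical placeholders, not an existing catalog or vocabulary.

```python
# Hypothetical endpoint descriptions; every URL and label here is a placeholder.
ENDPOINT_CATALOG = {
    "http://example.org/ontology/sparql": {
        "vocabularies": {"owl", "rdfs"},
        "topics": {"neuron", "brain region"},
    },
    "http://example.org/wiki/sparql": {
        "vocabularies": {"rdfs"},
        "topics": {"neuron", "brain region", "community annotation"},
    },
    "http://example.org/microarray/sparql": {
        "vocabularies": {"skos"},
        "topics": {"gene expression", "disease"},
    },
}


def select_endpoints(needed_topics, catalog):
    """Return the URLs of endpoints that declare at least one needed topic."""
    needed = set(needed_topics)
    return sorted(
        url for url, description in catalog.items()
        if needed & description["topics"]
    )


def plan_federation(needed_topics, catalog):
    """Map each needed topic to the endpoints that claim to cover it."""
    return {
        topic: select_endpoints([topic], catalog)
        for topic in needed_topics
    }
```

For example, `plan_federation(["gene expression", "neuron"], ENDPOINT_CATALOG)` would route the gene expression part of a question to the microarray endpoint and the neuron part to the other two. A real description would also need to carry the licensing and exposure details raised above.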

jeff: we are extracting data from tables, we have a curator working on that.
... e.g., how up- and down-regulation is represented. we use a mixture of automated tools and manual curation.
... the tables usually come from HTML/PDF version of papers. sometimes also from supplemental material.

scott: another aspect (having spoken to chris stoeckert)... if we take this not only to MGED, but also to the publishers, and try to get them to have researchers submit gene lists, that would solve this problem in the future.

kei: the NIF ontology will also be deposited in NCBO BioPortal
... BioPortal has its own SPARQL endpoint, too
... will there be redundancy? which endpoints / URIs will I use?

jeff: Neurolex is the 'working draft', before it goes through the rigours of ontology engineering.
... NCBO is a community place.

scott: i suppose that some of the data released in september will also contain the data that was annotated

jeff: yes

scott: you could also make that data available from NCBO

<ssahoo2> sorry I have to leave

<ssahoo2> bye

kei: another topic: gene lists. a number of us have been working on how to represent gene lists.
... we could look at Neurolex to see which neuroscience terms we can extract from these endpoints that would be relevant for annotation.
... matthias has also been working with aTags, used NCBO resources.
... we need an iterated process of debugging, based on use-cases
... i will be away, jun will convene some of the calls

stephen: we would be happy to receive feedback, suggestions for links.

scott: one potential use-case would be EHRs, helping clinicians with certain tasks through integrated information.

kei: feel free to e-mail me while i am away throughout the next 5 weeks
