HCLSIG BioRDF Subgroup/Meetings/2006-03-13 Conference Call

Conference Details

Date of Call: Monday March 13, 2006
Time of Call: 11:00am Eastern Time/16:00 UTC
Dial-In #: +1.617.761.6200 (Cambridge, MA)
Participant Access Code: 246733 ("BIORDF")
Duration: ~1 hour
Convener: Susie Stephens
Scribe: Roger Cutler

Draft Agenda

Discuss use case focus
Review progress on task templates and tasks

Meeting Minutes

Attendees: Karen Skinner, Susie Stephens, Don Doherty, John Barkley, Kei Cheung, Alan Ruttenberg, Scott Marshall, Joanne Luciano, Roger Cutler

Susie: John Wilbanks was going to give update on neuroscience, but unable to attend this week. Discussion of neuroscience resources is still important. Perhaps should invite several neuroscience experts and have a discussion. Shoot for next week. Ideally Tim Clark, John Wilbanks, June Kinoshita and others would participate.

Karen: Suggests Dan Gardner.

Susie: Eric Miller has a strong preference for scientists to join WG rather than having separate advisory board.

Tasks Progress:

Alan: Will start with a task based on reagents, etc. Already has code that scrapes an antigen Web site that displays information from a number of vendors. The Alzheimer site has information about antibodies, and a forum where people ask questions about antibodies. These questions could form a basis for queries. He'll write up the details of the task by next week.

Alan: Talked to Davide about whether he could apply company software to NLP cases – e.g. unstructured description of patient in brain atlas. Davide is nibbling at the hook. Will arrange demo. Use tool to get XML out of text parse, then transform XML to RDF.
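
Illustration of the XML-to-RDF step Alan mentions: a minimal Python sketch, not the vendor tool itself. The XML layout, the `patient` element names, and the `http://example.org/` namespace are all hypothetical stand-ins, since the minutes do not describe the tool's actual output. Each child element of a record is turned into one triple in N-Triples syntax.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML standing in for the output of a text parse;
# the real tool's format is not described in the minutes.
SAMPLE_XML = """
<patients>
  <patient id="p1">
    <age>67</age>
    <diagnosis>Alzheimer disease</diagnosis>
  </patient>
</patients>
"""

BASE = "http://example.org/"  # placeholder namespace (assumption)


def escape_literal(text):
    """Escape the characters that N-Triples requires to be escaped in literals."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def xml_to_ntriples(xml_text, base=BASE):
    """Map each <patient> child element to one N-Triples line."""
    root = ET.fromstring(xml_text)
    lines = []
    for patient in root.findall("patient"):
        subject = "<%spatient/%s>" % (base, patient.get("id"))
        for child in patient:
            predicate = "<%svocab#%s>" % (base, child.tag)
            value = escape_literal((child.text or "").strip())
            lines.append('%s %s "%s" .' % (subject, predicate, value))
    return lines


if __name__ == "__main__":
    for line in xml_to_ntriples(SAMPLE_XML):
        print(line)
```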

Karen: Will talk to Alan offline about antibodies.

Kei: Has started editing the description of the task on the Wiki. The general objective is to convert relational data in Oracle Database to RDF data in Oracle Database. The goal is to better understand how to map the relational structure to the RDF structure. Planning to use D2RQ for the conversion. Most neuroscience databases are in Oracle.
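
Illustration of one common relational-to-RDF mapping convention: table to class, row to subject, column to predicate. This is a sketch only. It uses Python's built-in `sqlite3` in place of Oracle and does not use D2RQ; the `neuron` table, its columns, and the `http://example.org/` namespace are hypothetical.

```python
import sqlite3

BASE = "http://example.org/"  # placeholder namespace (assumption)
RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"


def escape_literal(text):
    """Escape the characters that N-Triples requires to be escaped in literals."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def table_to_ntriples(conn, table, key_column, base=BASE):
    """Emit one rdf:type triple per row and one triple per non-NULL column."""
    # Table and column names are interpolated directly; they are trusted here.
    cursor = conn.execute("SELECT * FROM %s ORDER BY %s" % (table, key_column))
    columns = [description[0] for description in cursor.description]
    triples = []
    for row in cursor:
        record = dict(zip(columns, row))
        subject = "<%s%s/%s>" % (base, table, record[key_column])
        triples.append("%s %s <%svocab#%s> ." % (subject, RDF_TYPE, base, table))
        for column, value in record.items():
            if column == key_column or value is None:
                continue  # the key is already in the URI; NULL yields no triple
            predicate = "<%svocab#%s_%s>" % (base, table, column)
            triples.append('%s %s "%s" .' % (subject, predicate, escape_literal(str(value))))
    return triples


def demo():
    """Build a tiny hypothetical table in memory and convert it."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE neuron (id INTEGER PRIMARY KEY, name TEXT, region TEXT)")
        conn.execute("INSERT INTO neuron VALUES (1, 'Purkinje cell', 'cerebellum')")
        conn.execute("INSERT INTO neuron VALUES (2, 'pyramidal cell', NULL)")
        return table_to_ntriples(conn, "neuron", "id")
    finally:
        conn.close()


if __name__ == "__main__":
    for line in demo():
        print(line)
```

Dropping NULL values rather than emitting empty literals is one design choice among several; foreign keys, which would normally map to links between resources, are not handled in this sketch.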

Susie: Stanford University has built a pathways database in the Oracle RDF Data Model that includes KEGG, EcoCyc and Reactome. The database is available on the Web, and can be queried using Oracle's RDF_MATCH, or canned queries. Stanford used Jena to parse data from RDF/XML to N-Triples, so that it could be batch loaded into Oracle. Susie agreed to contact Stanford to see if they could participate in a future call, so that we could learn more.
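
Illustration of the RDF/XML to N-Triples step: a rough Python sketch, not Stanford's Jena pipeline. It handles only a small subset of RDF/XML (top-level `rdf:Description` elements with `rdf:about`, and property elements that carry either `rdf:resource` or plain text); a real conversion would use a full RDF parser such as Jena. The sample data and the `http://example.org/` URIs are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical sample; real pathway exports are far richer than this.
SAMPLE_RDFXML = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:ex="http://example.org/vocab#">
  <rdf:Description rdf:about="http://example.org/pathway/1">
    <ex:name>Glycolysis</ex:name>
    <ex:partOf rdf:resource="http://example.org/db/demo"/>
  </rdf:Description>
</rdf:RDF>"""


def escape_literal(text):
    """Escape the characters that N-Triples requires to be escaped in literals."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def rdfxml_subset_to_ntriples(xml_text):
    """Convert the restricted RDF/XML subset described above to N-Triples lines."""
    root = ET.fromstring(xml_text)
    lines = []
    for description in root.findall("{%s}Description" % RDF_NS):
        subject = description.get("{%s}about" % RDF_NS)
        if subject is None:
            continue  # blank nodes are outside this sketch
        for prop in description:
            # ElementTree spells qualified names as "{namespace}local".
            namespace, local = prop.tag[1:].split("}", 1)
            resource = prop.get("{%s}resource" % RDF_NS)
            if resource is not None:
                obj = "<%s>" % resource
            else:
                obj = '"%s"' % escape_literal(prop.text or "")
            lines.append("<%s> <%s%s> %s ." % (subject, namespace, local, obj))
    return lines


if __name__ == "__main__":
    for line in rdfxml_subset_to_ntriples(SAMPLE_RDFXML):
        print(line)
```

One triple per line is what makes N-Triples convenient for batch loading: the file can be split, streamed, or loaded in parallel without parsing any nested structure.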

Alan: BioPAX is in OWL. Stanford would have started with data that had been exported to BioPAX from its native format. Each data source has a custom piece of code specific to the underlying database.

Susie: Trying to provide a hosted instance of Oracle for people to experiment with. Is currently exploring whether it'd be possible for the Swiss Institute of Bioinformatics to host such a system.

Scott: No bites on Huntington’s Disease.

Susie: Huntington's Disease is an attractive area to focus on. But it appears that people don’t have a strong preference for which disease to focus on. The disease that wins will probably be the one where some scientists agree to participate.

Don: What would we want from a neuroscience expert? Data?

Susie: Scientists would help direct us towards the best data sets to enable answering scientifically valid questions. Formulate questions. Demonstrate value.

Don: I am a neuroscientist. A small group is putting data on the Web, including on the human brain. The vast majority, however, have no good way to publish data to the Web. Biggest challenge is that, for the most part, data is currently not available on the Web.

Karen: It’s really hard to get use cases out of neuroscientists. Showing demos such as BioDASH helps scientists to better understand what can be accomplished using the Semantic Web.

Susie: Other examples include the work that I did with Joanne and Siderean to integrate many bioinformatics data sets; and the high level use case I did with Cerebra for drug safety determination.

Don: Scientists have a real aversion to letting data get out. Institutions like NIH share data, but individual researchers less willing. Don’t have requirement to publish data in neuroscience. There are some cultural problems to overcome.

Karen: NIH resource/data sharing policies are helping. But don’t have good repositories.

Susie: Will try to get some neuroscientists on the call next week. Will start with John Wilbanks, and then Nancy Wexler.