HCLSIG BioRDF Subgroup/Tasks/Ruby On Rails and ActiveRDF
Task
The task is to produce a Ruby on Rails and ActiveRDF implementation of the Eclipse-based (Java) BioDash GUI, to show that the functionality of BioDash can be implemented using web-based technologies. We will work with members of the working group to take a number of common data formats and produce RDF/OWL documents from them through this web interface. More importantly, we will document how we produced these adapters in a "cookbook" that shows how the software was built, so that others can reproduce and reuse the artifacts from our efforts.
Task Objectives
- Research Ruby on Rails/ActiveRDF usage
- Produce a skeleton project using the Ruby on Rails generator scripts
- Produce Ruby code that parses and transforms formats -> RDF
- Start to produce GUI tools that allow people to use the transformers
- Produce Ruby code that takes RDF and shows connections between disparate data
- Produce a polished GUI showing data types based on ontologies chosen or produced by the working group
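As an illustration of the "formats -> RDF" objective, here is a minimal sketch that turns a CSV export (for example, a sheet saved from Excel) into N-Triples using only the Ruby standard library. It does not use ActiveRDF. The namespace `http://example.org/biordf/`, the `id` column, and the helper names `nt_literal` and `csv_to_ntriples` are all assumptions made up for this example, and it assumes that ids and column names are already safe to embed in a URI.

```ruby
require "csv"

# Hypothetical namespace, for illustration only.
BASE = "http://example.org/biordf/"

# Quote and escape a string as an N-Triples plain literal.
def nt_literal(value)
  escaped = value.gsub("\\") { "\\\\" }
                 .gsub('"') { '\\"' }
                 .gsub("\n") { "\\n" }
                 .gsub("\r") { "\\r" }
  %("#{escaped}")
end

# Each CSV row becomes one subject (named by its id column); every other
# non-empty column becomes a predicate with a literal object.
# Assumes ids and column names need no URI escaping.
def csv_to_ntriples(csv_text, id_column: "id")
  triples = []
  CSV.parse(csv_text, headers: true).each do |row|
    subject = "<#{BASE}record/#{row[id_column]}>"
    row.each do |column, value|
      next if column == id_column || value.nil? || value.empty?
      triples << "#{subject} <#{BASE}vocab##{column}> #{nt_literal(value)} ."
    end
  end
  triples
end

sample = <<~CSV
  id,symbol,organism
  g1,BRCA1,Homo sapiens
  g2,Trp53,Mus musculus
CSV

puts csv_to_ntriples(sample)
```

A real transformer would map columns onto terms from the ontologies chosen by the working group rather than minting one predicate per column header.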
From My E-mail
A: The BioRDF group is tasked with producing a set of documents that show how to produce RDF from common data formats. However, the community needs to know why RDF is needed above and beyond traditional RDBMS technology. I propose a web-based tool that shows this utility by taking common biological data formats (Excel, BioPAX, MAGE-ML, etc.), transforming them into RDF, and allowing query and storage through an intuitive user interface.
Q: What impact will this project have in terms of customer awareness/ community awareness around the problem/issue you are solving?
A: While I don't have any quantitative data to support my claim, an informal survey of top- and mid-level engineers and managers in the bioinformatics domain shows a low adoption rate of this technology. Blocking factors for these individuals are: 1) lack of experience with the technology; 2) uncertainty about how or why it supersedes or complements current RDBMS technology; 3) no public resource showing how to implement real systems using current technology; and 4) not understanding how adoption impacts development in terms of implementation, maintenance, security, and complexity (what is the impact on a project's execution timeline and cost?).
Q: How do you propose to implement this project?
A: I would like to work with the BioRDF group to first produce a Ruby on Rails application that mimics the BioDash thick client. I'd then like to work with people to take data sets that they already have available in RDF and link them into the web dashboard. From there (or in parallel) we can take other data sets that have been transformed into RDF and show how data just "snaps in" (Eric Miller likes to call this "recombinant data").
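The "snaps in" idea can be sketched in a few lines of plain Ruby: merging RDF graphs is a set union of their triples, and two data sets become connected wherever they use the same URI. This is a sketch under stated assumptions, not how BioDash or ActiveRDF does it. The `Triple` struct, the helpers `iri?`, `merge_graphs`, and `shared_resources`, and all `example.org` URIs are hypothetical.

```ruby
require "set"

# A triple of plain strings; URIs and literals are both stored as strings.
Triple = Struct.new(:subject, :predicate, :object)

# Crude check (an assumption for this sketch): a term is a URI if it
# starts with an http(s) scheme.
def iri?(term)
  term.start_with?("http://", "https://")
end

# Merging graphs is just the union of their triple sets.
def merge_graphs(*graphs)
  graphs.reduce(Set.new) { |merged, graph| merged | graph }
end

# URIs used as a subject or object in more than one named graph;
# these are the points where independent data sets join up.
def shared_resources(graphs_by_name)
  seen = Hash.new { |hash, key| hash[key] = Set.new }
  graphs_by_name.each do |name, graph|
    graph.each do |triple|
      [triple.subject, triple.object].each do |term|
        seen[term] << name if iri?(term)
      end
    end
  end
  seen.select { |_term, names| names.size > 1 }.keys
end

genes = Set[
  Triple.new("http://example.org/gene/BRCA1",
             "http://example.org/vocab#label", "BRCA1")
]
pathways = Set[
  Triple.new("http://example.org/pathway/dna-repair",
             "http://example.org/vocab#hasParticipant",
             "http://example.org/gene/BRCA1")
]

merged = merge_graphs(genes, pathways)
puts "merged graph has #{merged.size} triples"
puts shared_resources("genes" => genes, "pathways" => pathways)
```

No schema migration or join table is needed for the second data set to link to the first; the shared URI is the link.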
Q: What is the duration of the project (show metrics if you have any)?
A: This project should take less than 3 months, based on the metrics stated above.
Q: How many people are required to meet your timeline (show metrics/ data if any)?
A: I believe 3 people would be ideal for this project: 2 developers and 1 scientific lead to show the utility of the data sets that have been integrated. I have no data to support this claim.
Q: Are there any hurdles/risks/blocking factors you can identify upfront?
A: Yes. ActiveRDF is still under development, although I've produced a proof-of-concept application to help mitigate this risk. Ideally we'd use Oracle's RDF storage engine for this project; unfortunately, ActiveRDF does not support Oracle at the moment (I wonder if Susie Stevens could help me here?). I'm also not sure how many people, other than myself, are aware of Ruby on Rails or have experience implementing applications with it. This may extend the timeline suggested above.
Link for the uninitiated: http://www.glue.umd.edu/~billtj/ruby.html
Code