Version 1.0 of Bio2RDF and Chembl webapps released

Hi all,

The 1.0.1 version of the Bio2RDF server software has been released on
Sourceforge. The software is designed to be a Linked Data interface to
a range of RDF datasources, with the current examples being Bio2RDF
and Chembl. (A small bug fix between 1.0.0 and 1.0.1 was needed to
enable endpoint round-robin.)

The Bio2RDF version now contains 1655 namespaces, 111 query types, 455
data providers and 197 normalisation rules. In addition to the Bio2RDF
server interface [1], I made up a version suitable for browsing
through resources located in Egon Willighagen's Chembl RDF dataset [2]
from http://rdf.farmbio.uu.se/chembl/sparql . I picked this dataset as
it did not seem to have a linked data interface to go with its URIs.

If anyone has any other RDF or SPARQL datasets that they want to have
a linked data interface for, feel free to mention them and/or
experiment with the software yourself based on the Chembl sources (all
of the Bio2RDF and Chembl configuration files can be found in the
sourceforge Git repository at [3]). The Bio2RDF configuration files
may be easier to find examples in, but they are quite large, so you
may want to start by looking at how Chembl was implemented using the
software. The software is designed to merge data from different
locations, so if you want to do a mashup with the data you can make up
queries/providers/normalisation rules to do it and publish them as a
server configuration using the configuration interface in the server
(/admin/configuration/n3 , for example:
http://config.bio2rdf.org/admin/configuration/n3 ) so that others can
extend/integrate/re-use them easily in their own instances of the
software.

The HTML interface for the software is configurable using Apache
Velocity templates, so don't feel that you have to stick to the
current design if you want to reuse the software.

This version allows you to optionally perform 303 redirections along
with the in-built HTTP content negotiation using the Accept header.
This allows access to the results of queries in the following RDF
formats:

* RDF/XML - using /rdfxml/ or Accept:application/rdf+xml
** For example: http://bio2rdf.org/rdfxml/geneid:12334
* N3 - using /n3/ or Accept:text/rdf+n3
** For example: http://bio2rdf.org/n3/geneid:12334
* HTML and RDFa - using /page/ or Accept:text/html
** For example: http://bio2rdf.org/page/geneid:12334 (RDFa embedded in
the HTML representation)
* NTriples - using /ntriples/ or Accept:text/plain
** For example: http://bio2rdf.org/ntriples/geneid:12334 (this
and the following two formats will not work until we actually do the
upgrade on bio2rdf.org in the next few days.)
* NQuads - using /nquads/ or Accept:text/x-nquads
** For example: http://bio2rdf.org/nquads/geneid:12334 (The graph names match
the provider URI that the data came from, i.e., it is now possible to
see the provenance for each statement in a results document. The
downside to this is that it can sometimes cause duplicate statements
to appear if they are located in different graphs, even in the HTML
representation.)
* RDF/JSON - using /json/ or Accept:application/json
** For example: http://bio2rdf.org/json/geneid:12334
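As a rough illustration of the two access routes listed above, here is a minimal Python sketch. The path prefixes and MIME types are copied from the list; the function names are made up for this example, and it assumes that the plain resource URI (for example http://bio2rdf.org/geneid:12334) is the one that responds to the Accept header. It only builds the request and does not send anything.

```python
from urllib.request import Request

# Path prefix -> MIME type, copied from the format list above.
FORMATS = {
    "rdfxml": "application/rdf+xml",
    "n3": "text/rdf+n3",
    "page": "text/html",
    "ntriples": "text/plain",
    "nquads": "text/x-nquads",
    "json": "application/json",
}


def prefixed_url(base, fmt, resource):
    """Build the explicit-format URL, e.g. <base>/n3/geneid:12334."""
    if fmt not in FORMATS:
        raise ValueError(f"unknown format: {fmt}")
    return f"{base.rstrip('/')}/{fmt}/{resource}"


def negotiated_request(base, fmt, resource):
    """Build (but do not send) a request that asks for `fmt` via Accept.

    Assumption: the un-prefixed resource URI is the negotiated one.
    """
    if fmt not in FORMATS:
        raise ValueError(f"unknown format: {fmt}")
    url = f"{base.rstrip('/')}/{resource}"
    return Request(url, headers={"Accept": FORMATS[fmt]})
```

Passing the result of negotiated_request(...) to urllib.request.urlopen would actually perform the request; whether a 303 redirect is issued depends on how the server instance is configured.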

If you want to make changes to any of the datasources, you can
normalise results (and denormalise queries) using rules. This version
allows you to use XSLT and SPARQL in addition to the regular expressions
that were already supported. For example, this now means that if there
are XML datasources, you can convert them to RDF using the
normalisation rules in the server. The software now uses Sesame 2.4.0,
which contains support for SPARQL Query 1.1, so any SPARQL rules that
are applied to intermediate results can use the new
functions and language features. In particular, SPARQL 1.1 allows you
to create new URIs inside queries, so if you know that there are
literals in an RDF document that you could create a URI out of, you
can do it using a SPARQL normalisation rule for that datasource.
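To make the URI-from-literal idea concrete, here is a small Python sketch that builds the kind of SPARQL 1.1 CONSTRUCT query such a rule might contain. The helper name, the example predicate and the namespace are hypothetical and chosen only for illustration; how the query is wrapped in a normalisation rule in the server configuration is not shown, and I have not checked here which SPARQL 1.1 features (such as BIND) a given Sesame release accepts.

```python
def uri_minting_query(predicate, namespace):
    """Return a SPARQL 1.1 CONSTRUCT query (as a string).

    For every literal object of `predicate`, the query emits a triple
    whose object is a URI formed by appending the literal to `namespace`.
    Both arguments are illustrative inputs, not part of the software.
    """
    return f"""
CONSTRUCT {{ ?s <{predicate}> ?uri }}
WHERE {{
  ?s <{predicate}> ?value .
  FILTER(isLiteral(?value))
  BIND(IRI(CONCAT("{namespace}", STR(?value))) AS ?uri)
}}
""".strip()


# Hypothetical predicate and namespace, for illustration only.
query = uri_minting_query(
    "http://example.org/vocab/xGeneId",
    "http://bio2rdf.org/geneid:",
)
print(query)
```

The IRI(), CONCAT() and STR() functions and the BIND clause are standard SPARQL 1.1; everything else in the sketch is an assumption about how you might want to shape the output.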

In addition, if anyone wants to suggest a way to do any other
rule-based normalisation, I would be happy to extend the software to support
it. I have always had RIF rules in the back of my head, but have not
experimented with them yet. If anyone uses RIF rules in their work, it
would be great to get some example code to guide a future extension to
this software.

If you use the server software and/or Bio2RDF datasets, it would be
great if you could cite us using the publications on the wiki at [4].

Thanks,

Peter

[1] http://sourceforge.net/projects/bio2rdf/files/bio2rdf-server/bio2rdf-1.0.1/
[2] http://sourceforge.net/projects/bio2rdf/files/chembl-server/chembl-webapp-1.0.1/
[3] http://bio2rdf.git.sourceforge.net/git/gitweb.cgi?p=bio2rdf/bio2rdf;a=tree;f=bio2rdf-webapp/src/main/resources/config;hb=HEAD
[4] https://sourceforge.net/apps/mediawiki/bio2rdf/index.php?title=Publications

Received on Thursday, 30 June 2011 06:58:59 UTC