This page describes a joint effort between the BioRDF and LODD task forces of HCLSIG to connect knowledge about alternative medicines and western drugs. The aim is to help patients search for alternative medicines and to support biomedical researchers in drug discovery.

The goal of this exercise is to:

test and evaluate a novel approach to creating links between RDF datasets at large scale
demonstrate how Linked Data can be used to connect TCM and western medicine and to explore where the two types of medicine intersect. We demonstrate:
- how we can inform patients about possible side effects of an herb by finding the side effects reported in clinical trials of western drugs that share ingredients with the herb
- how we can inform researchers about possible targets of an alternative medicine by finding the known targets of western drugs that share ingredients with that medicine
- how we can verify that genes reported by TCM researchers as associated with alternative medicines used for Alzheimer's Disease (AD) are indeed AD genes, using studies of these genes in the context of western drugs
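
The first two use cases both come down to a join through shared ingredients. The following is a minimal Python sketch of that join, assuming small in-memory dictionaries in place of the real RDF-TCM, DrugBank and SIDER data. All herb, drug, ingredient and side-effect names are made-up placeholders, and `candidate_side_effects` is a hypothetical helper, not part of the deployed application.

```python
from collections import defaultdict

# Hypothetical toy data standing in for RDF-TCM, DrugBank and SIDER.
HERB_INGREDIENTS = {
    "herb_A": {"ingredient_1", "ingredient_2"},
    "herb_B": {"ingredient_3"},
}
DRUG_INGREDIENTS = {
    "drug_X": {"ingredient_1"},
    "drug_Y": {"ingredient_2", "ingredient_4"},
    "drug_Z": {"ingredient_5"},
}
DRUG_SIDE_EFFECTS = {
    "drug_X": {"nausea"},
    "drug_Y": {"headache", "nausea"},
    "drug_Z": {"rash"},
}


def candidate_side_effects(herb):
    """Map each possible side effect to the drugs that suggest it.

    A drug contributes its side effects only if it shares at least
    one ingredient with the given herb.
    """
    herb_ingredients = HERB_INGREDIENTS.get(herb, set())
    evidence = defaultdict(set)
    for drug, ingredients in DRUG_INGREDIENTS.items():
        if herb_ingredients & ingredients:
            for effect in DRUG_SIDE_EFFECTS.get(drug, set()):
                evidence[effect].add(drug)
    return dict(evidence)
```

Keeping the supporting drugs alongside each side effect lets a user trace a suggestion back to its evidence. The same join applies to drug targets if the side-effect table is replaced by a target table.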

Data Source

RDF-TCM: see http://code.google.com/p/junsbriefcase/wiki/RDFTCMData
DrugBank: http://www4.wiwiss.fu-berlin.de/drugbank/
SIDER: http://www4.wiwiss.fu-berlin.de/sider/
DailyMed: http://www4.wiwiss.fu-berlin.de/dailymed/
Diseasome: http://www4.wiwiss.fu-berlin.de/diseasome/
LinkedCT: http://www.linkedct.org/
aTags: http://hcls.deri.org/atag/data/tcm_atags.html

These source datasets are transformed into RDF using one of two approaches:

For DrugBank, DailyMed, Diseasome, SIDER and STITCH, the source datasets, available as tab-delimited or XML files, are imported into a relational database, and a D2R server is then set up over each relational database.
For TCMGeneDIT, customized Python scripts transform the tab-delimited data into RDF. The scripts can be found at: http://code.google.com/p/junsbriefcase/source/browse/#svn/trunk/biordf2009_query_federation_case/tcm-data
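
To illustrate the second approach, here is a minimal sketch of turning tab-delimited rows into N-Triples with the Python standard library. It is not the project's actual script. The column names `herb` and `gene`, the `http://example.org/tcm/` namespace and the `association` predicate are assumptions made for the example.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical namespace; the real RDF-TCM URIs are not given on this page.
BASE = "http://example.org/tcm/"


def uri(kind, label):
    """Build an N-Triples URI reference from a free-text label."""
    local = quote(label.strip().replace(" ", "_"), safe="")
    return "<%s%s/%s>" % (BASE, kind, local)


def tsv_to_ntriples(tsv_text):
    """Turn rows with assumed 'herb' and 'gene' columns into triples."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    predicate = "<%sassociation>" % BASE  # assumed predicate name
    lines = []
    for row in reader:
        herb = uri("medicine", row["herb"])
        gene = uri("gene", row["gene"])
        lines.append("%s %s %s ." % (herb, predicate, gene))
    return lines
```

For example, `tsv_to_ntriples("herb\tgene\nHerb one\tGENE1\n")` yields a single triple linking the two generated URIs.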

Interlinking of datasets

We used two approaches to create links between datasets at large scale:

Silk: http://www4.wiwiss.fu-berlin.de/bizer/silk/
Customized scripts that interlink RDF-TCM with the Entrez Gene data hosted at http://hcls.deri.org/sparql:
- First, search the SPARQL endpoint [1] for matching Entrez genes, using exact gene-name matches as filters
- Then manually correct many-to-one gene mappings using the Entrez and TCM database web pages
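
The two steps of the script-based approach can be sketched as follows. This is an illustration under stated assumptions, not the scripts that were actually used. It assumes gene names are exposed through `rdfs:label`, and it assumes "many-to-one" means several Entrez candidates for one TCM gene name, since the page does not state the direction. `build_query` and `split_mappings` are hypothetical helpers, and sending the query to the endpoint is left out.

```python
from collections import defaultdict


def build_query(gene_name):
    """Build a SPARQL query that filters on an exact gene name.

    Assumes the gene name is stored as rdfs:label in the endpoint.
    """
    escaped = gene_name.replace("\\", "\\\\").replace('"', '\\"')
    lines = [
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>",
        "SELECT ?gene WHERE {",
        "  ?gene rdfs:label ?label .",
        '  FILTER (str(?label) = "' + escaped + '")',
        "}",
    ]
    return "\n".join(lines)


def split_mappings(pairs):
    """Separate unambiguous matches from those needing manual review.

    pairs: iterable of (tcm_gene_name, entrez_uri) query results.
    """
    candidates = defaultdict(set)
    for tcm_name, entrez_uri in pairs:
        candidates[tcm_name].add(entrez_uri)
    accepted = {n: next(iter(u)) for n, u in candidates.items() if len(u) == 1}
    to_review = {n: sorted(u) for n, u in candidates.items() if len(u) > 1}
    return accepted, to_review
```

Names that end up in `to_review` correspond to the cases corrected by hand in the second step.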

The figure below shows the datasets that have been published so far and their interlinking paths.

Representation of Interlinks

For the set of links created between any two datasets: we define it as a voiD:LinkSet and an oddlinker:linkage_run.
For each link: we represent it as an oddlinker:interlink in order to provide additional metadata about the link, such as which data items are being linked and how much confidence we have in the link.
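
As a rough illustration of this representation, the sketch below emits N-Triples for one link and the link set it belongs to. The VoID and RDF namespaces are the published ones, and the published VoID vocabulary spells the class `void:Linkset`. The oddlinker namespace URI is a placeholder, and the property names `link_source`, `link_target` and `confidence` are assumptions, since this page only names `linkage_run` and `interlink`.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
VOID = "http://rdfs.org/ns/void#"
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"
# Placeholder namespace: the real oddlinker URI is not given on this page.
ODDLINKER = "http://example.org/oddlinker/"


def ref(uri):
    """Wrap a URI as an N-Triples URI reference."""
    return "<%s>" % uri


def interlink_triples(link_uri, run_uri, source_uri, target_uri, confidence):
    """Describe one link and its link set as a list of N-Triples lines."""
    link, run = ref(link_uri), ref(run_uri)
    score = '"%.2f"^^%s' % (confidence, ref(XSD_DECIMAL))
    return [
        # The set of links is both a VoID link set and a linkage run.
        "%s %s %s ." % (run, ref(RDF_TYPE), ref(VOID + "Linkset")),
        "%s %s %s ." % (run, ref(RDF_TYPE), ref(ODDLINKER + "linkage_run")),
        # The individual link carries its own metadata.
        "%s %s %s ." % (link, ref(RDF_TYPE), ref(ODDLINKER + "interlink")),
        "%s %s %s ." % (link, ref(ODDLINKER + "linkage_run"), run),
        # Assumed property names for the linked items and the confidence.
        "%s %s %s ." % (link, ref(ODDLINKER + "link_source"), ref(source_uri)),
        "%s %s %s ." % (link, ref(ODDLINKER + "link_target"), ref(target_uri)),
        "%s %s %s ." % (link, ref(ODDLINKER + "confidence"), score),
    ]
```

Giving each link its own resource, rather than asserting a bare triple between the two items, is what makes it possible to attach the confidence score and the linkage run to it.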

The application

The applications that support the motivating use cases are currently deployed at http://www.open-biomed.org.uk/admed/admedapps/searchInfoAboutTCM/.

Future directions

Incorporate additional data sources, e.g., herbal and/or TCM-related sources as well as genomic, clinical and drug data sources
Explore multilingual interlinking
Develop new use cases and user-facing applications
Automate notification of interlink updates between datasets
