https://www.w3.org/wiki/api.php?action=feedcontributions&feedformat=atom&user=RboyceW3C Wiki - User contributions [en]2024-03-19T11:43:50ZUser contributionsMediaWiki 1.41.0https://www.w3.org/wiki/index.php?title=HCLSIG/DDI&diff=97383HCLSIG/DDI2016-02-01T16:23:27Z<p>Rboyce: Created page with "One of the goals of the task force is to develop a minimal information model for drug interaction evidence and knowledge as part of an HIT standard like HL7. The task force is..."</p>
<hr />
<div>One of the goals of the task force is to develop a minimal information model for drug interaction evidence and knowledge as part of an HIT standard like HL7. The task force is volunteer-based, and formed within the Health Care and Life Sciences Interest Group that operates publicly through the World Wide Web Consortium (W3C).<br />
<br />
Documents, meeting information, and announcements for this task force are managed at this Google Site page: https://sites.google.com/site/ddikrandir/home/ddi_info_model_taskforce</div>Rboycehttps://www.w3.org/wiki/index.php?title=HCLSIG/Pharmacogenomics&diff=68262HCLSIG/Pharmacogenomics2013-09-10T15:58:11Z<p>Rboyce: /* Until March 1, 2014 */</p>
<hr />
<div>=Clinical Pharmacogenomics=<br />
<br />
Understanding how the genetic makeup of individual patients influences the response to pharmaceuticals is essential to the realization of more effective, personalized pharmacotherapy. However, the integration of drug, genotype and phenotype knowledge in medical information systems and its subsequent use in clinical decision making remains a key challenge. This task force is dedicated to developing a biomedical informatics infrastructure that leverages Semantic Web technologies to capture pharmacogenomic findings in such a way that they can be used to inform medical practitioners regarding approved drugs with pharmacogenomic labels. We suggest that such computational technologies are an essential part of personalized medicine, providing key information for translational medicine and clinical care.<br />
<br />
[[File:Patients are prescribed the same thing but are different.jpg|600px]]<br />
<br />
This Task Force is led by [[MichelDumontier|Michel Dumontier]] and [[MatthiasSamwald|Matthias Samwald]].<br />
<br />
== Teleconference ==<br />
This task force meets on the 1st and 3rd Wednesday at 10:15am EST. Be an active participant and join our next call:<br />
* [[/Meetings/2013-06-05_Conference_Call|Next Meeting]]<br />
* [http://www.w3.org/wiki/Category:HCLS_PGX Past Meetings]<br />
<br />
== Goals ==<br />
* Capture use cases for pharmacogenomic clinical research and medicine<br />
* Formalize the representation of pharmacogenomics-related information using Semantic Web technologies and ontologies<br />
* Demonstrate the integration of (genomic) patient data with biomedical resources (SNPs, drugs, genes, trials, treatments, adverse events)<br />
* Demonstrate a working interface to explore and query pharmacogenomic knowledge<br />
* Demonstrate how such representations can be used for clinical decision support<br />
<br />
== Participants ==<br />
* [[MichelDumontier|Michel Dumontier (co-chair)]]<br />
* [[MatthiasSamwald|Matthias Samwald (co-chair)]]<br />
* Eric Prud'hommeaux (W3C tech)<br />
* Robert Freimuth<br />
* Simon Lin<br />
* Richard D. Boyce<br />
* Bob Powers<br />
* Joanne S. Luciano<br />
* M. Scott Marshall<br />
* Elgar Pichler<br />
* Adrien Coulet<br />
* Ratnesh Sahay<br />
<br />
== Schedule ==<br />
<br />
=== Until September 1, 2013 ===<br />
* Matthias Samwald<br />
** Paper and presentation at OWL Reasoner workshop - ''DONE''<br />
** Have post-doc in Vienna working on the project full-time - ''DONE''<br />
** Finalize re-implementation of decision support service (behind http://safety-code.org/) in TrOWL. Need to establish collaboration with Jeff Pan and his group. - ''DONE''<br />
** Submit paper about Genomic CDS to the journal Bioinformatics - ''ALMOST DONE (submission expected mid-September).''<br />
** Tidy up / reorganize the W3C wiki - ''DONE''<br />
** Tidy up / reorganize http://www.genomic-cds.org/ - ''OVERDUE''<br />
* Richard Boyce<br />
** Semantically annotated clinical pharmacogenomics statements from product labels, published as valid Open Data Annotation with best-practice provenance (including IAA) - ''DONE'' (NOTE: for a subset of biomarkers/drugs, see http://www.youtube.com/watch?v=Te546vOiruo; the project is continuing)<br />
* Michel Dumontier<br />
** Describe work done this past term with curating class-level drug-drug interactions<br />
*** [https://docs.google.com/file/d/0B1-qT2rHHTkFX2VMRlJfNldHeG8/edit?usp=sharing poster] of curated interactions from drugbank; Honors Thesis Student Holly Surins<br />
** Curate and validate annotations for drug effects, drug indications and drug-drug indications on drug product labels (with Rich)<br />
*** installing DOMEO - in progress<br />
** Identify and develop priority datasets to be included in Bio2RDF that would benefit this task force <br />
*** rxnorm<br />
*** stitch<br />
* Bob Freimuth<br />
** Review Cerner's [http://www.cerner.com/about_cerner/clinical_bioinformatics_ontology/ CBO] (strengths/gaps for CDS) - ''ONGOING (had a look at it together with Matthias during Medinfo, Matthias to write down some impressions - a shared Google doc should be started)''<br />
** Discuss modeling approaches for genomic CDS (what's the trigger? integration with/maintenance of knowledge base; review existing implementations (TPP); feedback to CPIC?)<br />
** Discuss generalization of med safety code approach to NGS of PGx genes - ''DONE''<br />
** Apply generalized MSC to a real data set (demonstration)<br />
** Consider opportunities to partner with eMERGE and/or PGRN (user base for generalized MSC) - ''ONGOING''<br />
* Simon Lin<br />
** Shape a scientific manuscript out of the use-case document draft at docs.google.com/document/d/12leHdI-GT2dzRgvVIx13wsXAaCLj6eNGvQ-L2B749f4/edit<br />
<br />
=== Until December 1, 2013 ===<br />
<br />
* Matthias Samwald<br />
** Finalize overhaul and update of Genomic CDS ontology - ''ONGOING''<br />
** Need to establish more official collaboration with PharmGKB (Michel moving to Stanford might facilitate that further)<br />
** More official collaboration with the group at Mayo Clinic (Bob Freimuth?)<br />
** Finalize grant application for additional medical postdoc position -- for someone working on the biomedical details / curation / working and evaluating with doctors - ''DONE''<br />
** Enable loading VCF files into Genomic CDS - based decision support service. -- ''ONGOING''<br />
* Richard Boyce<br />
** Qualitative and possibly quantitative (e.g., task-oriented usability comparison) data on the value of integrating the semantic annotations into a prototype clinical pharmacogenomic information system. ''ONGOING'' <br />
** Acquisition of pilot funding (~25K) for development ''ONGOING''<br />
** a conference paper describing the above research activities ''IN PROGRESS''<br />
* Michel Dumontier<br />
** Be settled at Stanford University<br />
** Link drug-drug interactions with pharmacogenomic interactions<br />
** Identify and develop priority datasets to be included in Bio2RDF that would benefit this task force<br />
** Build relationships with associated partners on RDF data publication, use and analysis<br />
* Bob Freimuth<br />
** Apply ontological genomic CDS approach to key CPIC guidelines (demonstration, evaluation)<br />
** Demonstrate gene-gene-drug CDS, ensure approach will scale<br />
<br />
=== Until March 1, 2014 ===<br />
<br />
* Matthias Samwald<br />
** Genomic CDS ontology covers everything it needs to cover; the connection to Rich's annotation work is continuously kept up to date<br />
** Begin partnering with external organizations (clinics, doctors, pharmacies, pharmaceutical companies) to start evaluating and disseminating the ontology and decision support solutions (as a pure research project only, since it is still undecided how best to get the software certified as a medical device). Marshfield Clinic might be a good partner organization for pilot studies?<br />
* Richard Boyce<br />
** Demonstrate integration of triggers and recommendations present in the Genomic CDS ontology within the UPMC system using Quest lab results and focused meds (e.g., warfarin, clopidogrel, and some psych drugs). Further qualitative inquiry. - ''IN PROGRESS''<br />
** Submission of a journal article describing the results from all above research activities. - ''IN PROGRESS''<br />
** Submission of a journal paper presenting an analysis of pharmacogenomics statements in product labels, including the range of content, frequency of updates, contrasts with other sources, and recommendations for clinicians, drug information compendia, and the FDA - ''IN PROGRESS''<br />
* Michel Dumontier<br />
** Identify putative animal models for pharmacogenomic outcomes<br />
* Bob Freimuth<br />
** Consider application of gene-drug class and gene-drug-drug rules (leverages drug-drug interaction data)<br />
<br />
==Deliverables==<br />
===Resources===<br />
* [http://www.genomic-cds.org/ Genomic CDS ontology project]<br />
* [http://safety-code.org/ Medicine Safety Code project]<br />
* [[/Use Cases|Use Cases]]<br />
* [http://goo.gl/afRh2 Conceptual Model from Michel]<br />
* [http://goo.gl/IUGns Formalization ideas from Michel]<br />
* [[/Data Sources|Data Sources]]<br />
* [[/Queries|Competency Questions and Queries]]<br />
* Decision Support Rules<br />
<br />
===Selected publications and presentations===<br />
* M Samwald. „Semantically Enabling Genetic Medicine to Facilitate Patients and Guidelines Matching and Enhanced Clinical Decision Support“ Proceedings of the Conference on Semantics in Healthcare and Life Sciences 2013 (CSHALS 2013), February 28, 2013, Cambridge/Boston, Massachusetts, USA - [http://de.slideshare.net/matthiassamwald/samwald-cshals2013-20741065 Slides]<br />
* M Samwald, KP Adlassnig. „Pharmacogenomics in the pocket of every patient? A prototype based on Quick Response (QR) codes“ J Am Med Inform Assoc, Published Online First: 23 Jan 2013, http://dx.doi.org/10.1136/amiajnl-2012-001275 - [http://samwald.info/res/Pharmacogenomics%20in%20the%20pocket%20of%20every%20patient%20-%20A%20prototype%20based%20on%20Quick%20Response%20%28QR%29%20codes%20-%20PREPRINT.pdf Link to openly accessible preprint version]<br />
* An Ontology-based Formalism, Knowledge Base and Reasoning System for Clinical Genetics. Samwald, M., Freimuth, R., Powers, R., Luciano, J., Prud’hommeaux, E., Boyce, R., Marshall, M., Dumontier, M. Poster presentation at the 2013 AMIA Summit on Translational Bioinformatics. San Francisco, March, 2013.<br />
* Toward semantic modeling of pharmacogenomic knowledge for clinical and translational decision support. PE Boyce, RD., Freimuth, RR., Romagnoli, KM., Pummer. The 2013 AMIA Summit on Translational Bioinformatics, San Francisco, CA. PubMed Central- Pending. ([http://www.slideshare.net/boycer/pharmgx-annotationamiatbi2013 Summary on SlideShare])<br />
* M Samwald, A Coulet, I Huerga, RL Powers, JS Luciano, RR Freimuth, F Whipple, E Pichler, E Prud'hommeaux, M Dumontier, MS Marshall. Semantically enabling pharmacogenomic data for the realization of personalized medicine. Pharmacogenomics. 2012 Jan;13(2):201-12. [http://www.ncbi.nlm.nih.gov/pubmed/22256869 PubMed]<br />
* An informatics infrastructure for translating pharmacogenomic knowledge into clinical practice. Matthias Samwald, Adrien Coulet, Robert R. Freimuth, Iker Huerga, Joanne S. Luciano, Elgar Pichler, Robert L. Powers, Eric Prud’hommeaux, Frederick Whipple, M. Scott Marshall, Michel Dumontier. AMIA 2012.<br />
* M Samwald, H Stenzhorn, M Dumontier, MS Marshall, J Luciano, KP Adlassnig. Towards an interoperable information infrastructure providing decision support for genomic medicine. Stud Health Technol Inform. 2011;169:165-9. [http://www.ncbi.nlm.nih.gov/pubmed/21893735 PubMed]<br />
<br />
==Related==<br />
* [[HCLSIG/TranslationalMedicine/pharmacogenomics|pharmacogenomics resources]]<br />
* [http://informatics.mayo.edu/phont PHONT@Mayo]<br />
* [http://pgrn.org/display/pgrnwebsite/PGRN+Home PGRN - Pharmacogenomics Research Network]</div>Rboycehttps://www.w3.org/wiki/index.php?title=HCLSIG/Pharmacogenomics&diff=68261HCLSIG/Pharmacogenomics2013-09-10T15:57:56Z<p>Rboyce: /* Until September 1, 2013 */</p>
<hr />
<div>=Clinical Pharmacogenomics=<br />
<br />
Understanding how the genetic makeup of individual patients influences the response to pharmaceuticals is essential to the realization of more effective, personalized pharmacotherapy. However, the integration of drug, genotype and phenotype knowledge in medical information systems and its subsequent use in clinical decision making remains a key challenge. This task force is dedicated to developing a biomedical informatics infrastructure that leverages Semantic Web technologies to capture pharmacogenomic findings in such a way that they can be used to inform medical practitioners regarding approved drugs with pharmacogenomic labels. We suggest that such computational technologies are an essential part of personalized medicine, providing key information for translational medicine and clinical care.<br />
<br />
[[File:Patients are prescribed the same thing but are different.jpg|600px]]<br />
<br />
This Task Force is led by [[MichelDumontier|Michel Dumontier]] and [[MatthiasSamwald|Matthias Samwald]].<br />
<br />
== Teleconference ==<br />
This task force meets on the 1st and 3rd Wednesday at 10:15am EST. Be an active participant and join our next call:<br />
* [[/Meetings/2013-06-05_Conference_Call|Next Meeting]]<br />
* [http://www.w3.org/wiki/Category:HCLS_PGX Past Meetings]<br />
<br />
== Goals ==<br />
* Capture use cases for pharmacogenomic clinical research and medicine<br />
* Formalize the representation of pharmacogenomics-related information using Semantic Web technologies and ontologies<br />
* Demonstrate the integration of (genomic) patient data with biomedical resources (SNPs, drugs, genes, trials, treatments, adverse events)<br />
* Demonstrate a working interface to explore and query pharmacogenomic knowledge<br />
* Demonstrate how such representations can be used for clinical decision support<br />
<br />
== Participants ==<br />
* [[MichelDumontier|Michel Dumontier (co-chair)]]<br />
* [[MatthiasSamwald|Matthias Samwald (co-chair)]]<br />
* Eric Prud'hommeaux (W3C tech)<br />
* Robert Freimuth<br />
* Simon Lin<br />
* Richard D. Boyce<br />
* Bob Powers<br />
* Joanne S. Luciano<br />
* M. Scott Marshall<br />
* Elgar Pichler<br />
* Adrien Coulet<br />
* Ratnesh Sahay<br />
<br />
== Schedule ==<br />
<br />
=== Until September 1, 2013 ===<br />
* Matthias Samwald<br />
** Paper and presentation at OWL Reasoner workshop - ''DONE''<br />
** Have post-doc in Vienna working on the project full-time - ''DONE''<br />
** Finalize re-implementation of decision support service (behind http://safety-code.org/) in TrOWL. Need to establish collaboration with Jeff Pan and his group. - ''DONE''<br />
** Submit paper about Genomic CDS to the journal Bioinformatics - ''ALMOST DONE (submission expected mid-September).''<br />
** Tidy up / reorganize the W3C wiki - ''DONE''<br />
** Tidy up / reorganize http://www.genomic-cds.org/ - ''OVERDUE''<br />
* Richard Boyce<br />
** Semantically annotated clinical pharmacogenomics statements from product labels, published as valid Open Data Annotation with best-practice provenance (including IAA) - ''DONE'' (NOTE: for a subset of biomarkers/drugs, see http://www.youtube.com/watch?v=Te546vOiruo; the project is continuing)<br />
* Michel Dumontier<br />
** Describe work done this past term with curating class-level drug-drug interactions<br />
*** [https://docs.google.com/file/d/0B1-qT2rHHTkFX2VMRlJfNldHeG8/edit?usp=sharing poster] of curated interactions from drugbank; Honors Thesis Student Holly Surins<br />
** Curate and validate annotations for drug effects, drug indications and drug-drug indications on drug product labels (with Rich)<br />
*** installing DOMEO - in progress<br />
** Identify and develop priority datasets to be included in Bio2RDF that would benefit this task force <br />
*** rxnorm<br />
*** stitch<br />
* Bob Freimuth<br />
** Review Cerner's [http://www.cerner.com/about_cerner/clinical_bioinformatics_ontology/ CBO] (strengths/gaps for CDS) - ''ONGOING (had a look at it together with Matthias during Medinfo, Matthias to write down some impressions - a shared Google doc should be started)''<br />
** Discuss modeling approaches for genomic CDS (what's the trigger? integration with/maintenance of knowledge base; review existing implementations (TPP); feedback to CPIC?)<br />
** Discuss generalization of med safety code approach to NGS of PGx genes - ''DONE''<br />
** Apply generalized MSC to a real data set (demonstration)<br />
** Consider opportunities to partner with eMERGE and/or PGRN (user base for generalized MSC) - ''ONGOING''<br />
* Simon Lin<br />
** Shape a scientific manuscript out of the use-case document draft at docs.google.com/document/d/12leHdI-GT2dzRgvVIx13wsXAaCLj6eNGvQ-L2B749f4/edit<br />
<br />
=== Until December 1, 2013 ===<br />
<br />
* Matthias Samwald<br />
** Finalize overhaul and update of Genomic CDS ontology - ''ONGOING''<br />
** Need to establish more official collaboration with PharmGKB (Michel moving to Stanford might facilitate that further)<br />
** More official collaboration with the group at Mayo Clinic (Bob Freimuth?)<br />
** Finalize grant application for additional medical postdoc position -- for someone working on the biomedical details / curation / working and evaluating with doctors - ''DONE''<br />
** Enable loading VCF files into Genomic CDS - based decision support service. -- ''ONGOING''<br />
* Richard Boyce<br />
** Qualitative and possibly quantitative (e.g., task-oriented usability comparison) data on the value of integrating the semantic annotations into a prototype clinical pharmacogenomic information system. ''ONGOING'' <br />
** Acquisition of pilot funding (~25K) for development ''ONGOING''<br />
** a conference paper describing the above research activities ''IN PROGRESS''<br />
* Michel Dumontier<br />
** Be settled at Stanford University<br />
** Link drug-drug interactions with pharmacogenomic interactions<br />
** Identify and develop priority datasets to be included in Bio2RDF that would benefit this task force<br />
** Build relationships with associated partners on RDF data publication, use and analysis<br />
* Bob Freimuth<br />
** Apply ontological genomic CDS approach to key CPIC guidelines (demonstration, evaluation)<br />
** Demonstrate gene-gene-drug CDS, ensure approach will scale<br />
<br />
=== Until March 1, 2014 ===<br />
<br />
* Matthias Samwald<br />
** Genomic CDS ontology covers everything it needs to cover; the connection to Rich's annotation work is continuously kept up to date<br />
** Begin partnering with external organizations (clinics, doctors, pharmacies, pharmaceutical companies) to start evaluating and disseminating the ontology and decision support solutions (as a pure research project only, since it is still undecided how best to get the software certified as a medical device). Marshfield Clinic might be a good partner organization for pilot studies?<br />
* Richard Boyce<br />
** Demonstrate integration of triggers and recommendations present in the Genomic CDS ontology within the UPMC system using Quest lab results and focused meds (e.g., warfarin, clopidogrel, and some psych drugs). Further qualitative inquiry. - ''IN PROGRESS''<br />
** Submission of a journal article describing the results from all above research activities. - ''IN PROGRESS''<br />
** Submission of a journal paper presenting an analysis of pharmacogenomics statements in product labels, including the range of content, frequency of updates, contrasts with other sources, and recommendations for clinicians, drug information compendia, and the FDA - ''IN PROGRESS''<br />
* Michel Dumontier<br />
** Identify putative animal models for pharmacogenomic outcomes<br />
* Bob Freimuth<br />
** Consider application of gene-drug class and gene-drug-drug rules (leverages drug-drug interaction data)<br />
<br />
References: <br />
<br />
[1] atomoxetine, atorvastatin, boceprevir, capecitabine, carbamazepine, carvedilol, celecoxib, cisplatin, citalopram, clobazam, clomipramine, clopidogrel, clozapine, codeine, dapsone, desipramine, dexlansoprazole, diazepam, doxepin, esomeprazole, fluorouracil, fluoxetine, flurbiprofen, fluvoxamine, iloperidone, imipramine, irinotecan, ivacaftor, mercaptopurine, metoprolol, nefazodone, nortriptyline, omeprazole, pantoprazole, paroxetine, peginterferon, alfa-2b-il28b, perphenazine, phenytoin, pimozide, pravastatin, propafenone, propranolol, protriptyline, quinidine, rabeprazole, rasburicase, rifampin, risperidone, telaprevir, terbinafine, tetrabenazine, thioguanine, thioridazine, ticagrelor, tramadol, venlafaxine, and voriconazole<br />
<br />
==Deliverables==<br />
===Resources===<br />
* [http://www.genomic-cds.org/ Genomic CDS ontology project]<br />
* [http://safety-code.org/ Medicine Safety Code project]<br />
* [[/Use Cases|Use Cases]]<br />
* [http://goo.gl/afRh2 Conceptual Model from Michel]<br />
* [http://goo.gl/IUGns Formalization ideas from Michel]<br />
* [[/Data Sources|Data Sources]]<br />
* [[/Queries|Competency Questions and Queries]]<br />
* Decision Support Rules<br />
<br />
===Selected publications and presentations===<br />
* M Samwald. „Semantically Enabling Genetic Medicine to Facilitate Patients and Guidelines Matching and Enhanced Clinical Decision Support“ Proceedings of the Conference on Semantics in Healthcare and Life Sciences 2013 (CSHALS 2013), February 28, 2013, Cambridge/Boston, Massachusetts, USA - [http://de.slideshare.net/matthiassamwald/samwald-cshals2013-20741065 Slides]<br />
* M Samwald, KP Adlassnig. „Pharmacogenomics in the pocket of every patient? A prototype based on Quick Response (QR) codes“ J Am Med Inform Assoc, Published Online First: 23 Jan 2013, http://dx.doi.org/10.1136/amiajnl-2012-001275 - [http://samwald.info/res/Pharmacogenomics%20in%20the%20pocket%20of%20every%20patient%20-%20A%20prototype%20based%20on%20Quick%20Response%20%28QR%29%20codes%20-%20PREPRINT.pdf Link to openly accessible preprint version]<br />
* An Ontology-based Formalism, Knowledge Base and Reasoning System for Clinical Genetics. Samwald, M., Freimuth, R., Powers, R., Luciano, J., Prud’hommeaux, E., Boyce, R., Marshall, M., Dumontier, M. Poster presentation at the 2013 AMIA Summit on Translational Bioinformatics. San Francisco, March, 2013.<br />
* Toward semantic modeling of pharmacogenomic knowledge for clinical and translational decision support. PE Boyce, RD., Freimuth, RR., Romagnoli, KM., Pummer. The 2013 AMIA Summit on Translational Bioinformatics, San Francisco, CA. PubMed Central- Pending. ([http://www.slideshare.net/boycer/pharmgx-annotationamiatbi2013 Summary on SlideShare])<br />
* M Samwald, A Coulet, I Huerga, RL Powers, JS Luciano, RR Freimuth, F Whipple, E Pichler, E Prud'hommeaux, M Dumontier, MS Marshall. Semantically enabling pharmacogenomic data for the realization of personalized medicine. Pharmacogenomics. 2012 Jan;13(2):201-12. [http://www.ncbi.nlm.nih.gov/pubmed/22256869 PubMed]<br />
* An informatics infrastructure for translating pharmacogenomic knowledge into clinical practice. Matthias Samwald, Adrien Coulet, Robert R. Freimuth, Iker Huerga, Joanne S. Luciano, Elgar Pichler, Robert L. Powers, Eric Prud’hommeaux, Frederick Whipple, M. Scott Marshall, Michel Dumontier. AMIA 2012.<br />
* M Samwald, H Stenzhorn, M Dumontier, MS Marshall, J Luciano, KP Adlassnig. Towards an interoperable information infrastructure providing decision support for genomic medicine. Stud Health Technol Inform. 2011;169:165-9. [http://www.ncbi.nlm.nih.gov/pubmed/21893735 PubMed]<br />
<br />
==Related==<br />
* [[HCLSIG/TranslationalMedicine/pharmacogenomics|pharmacogenomics resources]]<br />
* [http://informatics.mayo.edu/phont PHONT@Mayo]<br />
* [http://pgrn.org/display/pgrnwebsite/PGRN+Home PGRN - Pharmacogenomics Research Network]</div>Rboycehttps://www.w3.org/wiki/index.php?title=HCLSIG/Pharmacogenomics&diff=68260HCLSIG/Pharmacogenomics2013-09-10T15:55:26Z<p>Rboyce: /* Until September 1, 2013 */</p>
<hr />
<div>=Clinical Pharmacogenomics=<br />
<br />
Understanding how the genetic makeup of individual patients influences the response to pharmaceuticals is essential to the realization of more effective, personalized pharmacotherapy. However, the integration of drug, genotype and phenotype knowledge in medical information systems and its subsequent use in clinical decision making remains a key challenge. This task force is dedicated to developing a biomedical informatics infrastructure that leverages Semantic Web technologies to capture pharmacogenomic findings in such a way that they can be used to inform medical practitioners regarding approved drugs with pharmacogenomic labels. We suggest that such computational technologies are an essential part of personalized medicine, providing key information for translational medicine and clinical care.<br />
<br />
[[File:Patients are prescribed the same thing but are different.jpg|600px]]<br />
<br />
This Task Force is led by [[MichelDumontier|Michel Dumontier]] and [[MatthiasSamwald|Matthias Samwald]].<br />
<br />
== Teleconference ==<br />
This task force meets on the 1st and 3rd Wednesday at 10:15am EST. Be an active participant and join our next call:<br />
* [[/Meetings/2013-06-05_Conference_Call|Next Meeting]]<br />
* [http://www.w3.org/wiki/Category:HCLS_PGX Past Meetings]<br />
<br />
== Goals ==<br />
* Capture use cases for pharmacogenomic clinical research and medicine<br />
* Formalize the representation of pharmacogenomics-related information using Semantic Web technologies and ontologies<br />
* Demonstrate the integration of (genomic) patient data with biomedical resources (SNPs, drugs, genes, trials, treatments, adverse events)<br />
* Demonstrate a working interface to explore and query pharmacogenomic knowledge<br />
* Demonstrate how such representations can be used for clinical decision support<br />
<br />
== Participants ==<br />
* [[MichelDumontier|Michel Dumontier (co-chair)]]<br />
* [[MatthiasSamwald|Matthias Samwald (co-chair)]]<br />
* Eric Prud'hommeaux (W3C tech)<br />
* Robert Freimuth<br />
* Simon Lin<br />
* Richard D. Boyce<br />
* Bob Powers<br />
* Joanne S. Luciano<br />
* M. Scott Marshall<br />
* Elgar Pichler<br />
* Adrien Coulet<br />
* Ratnesh Sahay<br />
<br />
== Schedule ==<br />
<br />
=== Until September 1, 2013 ===<br />
* Matthias Samwald<br />
** Paper and presentation at OWL Reasoner workshop - ''DONE''<br />
** Have post-doc in Vienna working on the project full-time - ''DONE''<br />
** Finalize re-implementation of decision support service (behind http://safety-code.org/) in TrOWL. Need to establish collaboration with Jeff Pan and his group. - ''DONE''<br />
** Submit paper about Genomic CDS to the journal Bioinformatics - ''ALMOST DONE (submission expected mid-September).''<br />
** Tidy up / reorganize the W3C wiki - ''DONE''<br />
** Tidy up / reorganize http://www.genomic-cds.org/ - ''OVERDUE''<br />
* Richard Boyce<br />
** Semantically annotated product label information for [1], published as valid Open Data Annotation with best-practice provenance (including IAA) - ''DONE'' (NOTE: for a subset of biomarkers/drugs, see http://www.youtube.com/watch?v=Te546vOiruo; the project is continuing)<br />
* Michel Dumontier<br />
** Describe work done this past term with curating class-level drug-drug interactions<br />
*** [https://docs.google.com/file/d/0B1-qT2rHHTkFX2VMRlJfNldHeG8/edit?usp=sharing poster] of curated interactions from drugbank; Honors Thesis Student Holly Surins<br />
** Curate and validate annotations for drug effects, drug indications and drug-drug indications on drug product labels (with Rich)<br />
*** installing DOMEO - in progress<br />
** Identify and develop priority datasets to be included in Bio2RDF that would benefit this task force <br />
*** rxnorm<br />
*** stitch<br />
* Bob Freimuth<br />
** Review Cerner's [http://www.cerner.com/about_cerner/clinical_bioinformatics_ontology/ CBO] (strengths/gaps for CDS) - ''ONGOING (had a look at it together with Matthias during Medinfo, Matthias to write down some impressions - a shared Google doc should be started)''<br />
** Discuss modeling approaches for genomic CDS (what's the trigger? integration with/maintenance of knowledge base; review existing implementations (TPP); feedback to CPIC?)<br />
** Discuss generalization of med safety code approach to NGS of PGx genes - ''DONE''<br />
** Apply generalized MSC to a real data set (demonstration)<br />
** Consider opportunities to partner with eMERGE and/or PGRN (user base for generalized MSC) - ''ONGOING''<br />
* Simon Lin<br />
** Shape a scientific manuscript out of the use-case document draft at docs.google.com/document/d/12leHdI-GT2dzRgvVIx13wsXAaCLj6eNGvQ-L2B749f4/edit<br />
<br />
=== Until December 1, 2013 ===<br />
<br />
* Matthias Samwald<br />
** Finalize overhaul and update of Genomic CDS ontology - ''ONGOING''<br />
** Need to establish more official collaboration with PharmGKB (Michel moving to Stanford might facilitate that further)<br />
** More official collaboration with the group at Mayo Clinic (Bob Freimuth?)<br />
** Finalize grant application for additional medical postdoc position -- for someone working on the biomedical details / curation / working and evaluating with doctors - ''DONE''<br />
** Enable loading VCF files into Genomic CDS - based decision support service. -- ''ONGOING''<br />
* Richard Boyce<br />
** Qualitative and possibly quantitative (e.g., task-oriented usability comparison) data on the value of integrating the semantic annotations into a prototype clinical pharmacogenomic information system. ''ONGOING'' <br />
** Acquisition of pilot funding (~25K) for development ''ONGOING''<br />
** a conference paper describing the above research activities ''IN PROGRESS''<br />
* Michel Dumontier<br />
** Be settled at Stanford University<br />
** Link drug-drug interactions with pharmacogenomic interactions<br />
** Identify and develop priority datasets to be included in Bio2RDF that would benefit this task force<br />
** Build relationships with associated partners on RDF data publication, use and analysis<br />
* Bob Freimuth<br />
** Apply ontological genomic CDS approach to key CPIC guidelines (demonstration, evaluation)<br />
** Demonstrate gene-gene-drug CDS, ensure approach will scale<br />
<br />
=== Until March 1, 2014 ===<br />
<br />
* Matthias Samwald<br />
** Genomic CDS ontology covers everything it needs to cover; the connection to Rich's annotation work is continuously kept up to date<br />
** Begin partnering with external organizations (clinics, doctors, pharmacies, pharmaceutical companies) to start evaluating and disseminating the ontology and decision support solutions (as a pure research project only, since it is still undecided how best to get the software certified as a medical device). Marshfield Clinic might be a good partner organization for pilot studies?<br />
* Richard Boyce<br />
** Demonstrate integration of triggers and recommendations present in the Genomic CDS ontology within the UPMC system using Quest lab results and focused meds (e.g., warfarin, clopidogrel, and some psych drugs). Further qualitative inquiry. - ''IN PROGRESS''<br />
** Submission of a journal article describing the results from all above research activities. - ''IN PROGRESS''<br />
** Submission of a journal paper presenting an analysis of pharmacogenomics statements in product labels, including the range of content, frequency of updates, contrasts with other sources, and recommendations for clinicians, drug information compendia, and the FDA - ''IN PROGRESS''<br />
* Michel Dumontier<br />
** Identify putative animal models for pharmacogenomic outcomes<br />
* Bob Freimuth<br />
** Consider application of gene-drug class and gene-drug-drug rules (leverages drug-drug interaction data)<br />
<br />
References: <br />
<br />
[1] atomoxetine, atorvastatin, boceprevir, capecitabine, carbamazepine, carvedilol, celecoxib, cisplatin, citalopram, clobazam, clomipramine, clopidogrel, clozapine, codeine, dapsone, desipramine, dexlansoprazole, diazepam, doxepin, esomeprazole, fluorouracil, fluoxetine, flurbiprofen, fluvoxamine, iloperidone, imipramine, irinotecan, ivacaftor, mercaptopurine, metoprolol, nefazodone, nortriptyline, omeprazole, pantoprazole, paroxetine, peginterferon, alfa-2b-il28b, perphenazine, phenytoin, pimozide, pravastatin, propafenone, propranolol, protriptyline, quinidine, rabeprazole, rasburicase, rifampin, risperidone, telaprevir, terbinafine, tetrabenazine, thioguanine, thioridazine, ticagrelor, tramadol, venlafaxine, and voriconazole<br />
<br />
==Deliverables==<br />
===Resources===<br />
* [http://www.genomic-cds.org/ Genomic CDS ontology project]<br />
* [http://safety-code.org/ Medicine Safety Code project]<br />
* [[/Use Cases|Use Cases]]<br />
* [http://goo.gl/afRh2 Conceptual Model from Michel]<br />
* [http://goo.gl/IUGns Formalization ideas from Michel]<br />
* [[/Data Sources|Data Sources]]<br />
* [[/Queries|Competency Questions and Queries]]<br />
* Decision Support Rules<br />
<br />
===Selected publications and presentations===<br />
* M Samwald. „Semantically Enabling Genetic Medicine to Facilitate Patients and Guidelines Matching and Enhanced Clinical Decision Support“ Proceedings of the Conference on Semantics in Healthcare and Life Sciences 2013 (CSHALS 2013), February 28, 2013, Cambridge/Boston, Massachusetts, USA - [http://de.slideshare.net/matthiassamwald/samwald-cshals2013-20741065 Slides]<br />
* M Samwald, KP Adlassnig. „Pharmacogenomics in the pocket of every patient? A prototype based on Quick Response (QR) codes“ J Am Med Inform Assoc, Published Online First: 23 Jan 2013, http://dx.doi.org/10.1136/amiajnl-2012-001275 - [http://samwald.info/res/Pharmacogenomics%20in%20the%20pocket%20of%20every%20patient%20-%20A%20prototype%20based%20on%20Quick%20Response%20%28QR%29%20codes%20-%20PREPRINT.pdf Link to openly accessible preprint version]<br />
* An Ontology-based Formalism, Knowledge Base and Reasoning System for Clinical Genetics. Samwald, M., Freimuth, R., Powers, R., Luciano, J., Prud’hommeaux, E., Boyce, R., Marshall, M., Dumontier, M. Poster presentation at the 2013 AMIA Summit on Translational Bioinformatics. San Francisco, March, 2013.<br />
* Toward semantic modeling of pharmacogenomic knowledge for clinical and translational decision support. PE Boyce, RD., Freimuth, RR., Romagnoli, KM., Pummer. The 2013 AMIA Summit on Translational Bioinformatics, San Francisco, CA. PubMed Central- Pending. ([http://www.slideshare.net/boycer/pharmgx-annotationamiatbi2013 Summary on SlideShare])<br />
* M Samwald, A Coulet, I Huerga, RL Powers, JS Luciano, RR Freimuth, F Whipple, E Pichler, E Prud'hommeaux, M Dumontier, MS Marshall. Semantically enabling pharmacogenomic data for the realization of personalized medicine. Pharmacogenomics. 2012 Jan;13(2):201-12. [http://www.ncbi.nlm.nih.gov/pubmed/22256869 PubMed]<br />
* An informatics infrastructure for translating pharmacogenomic knowledge into clinical practice. Matthias Samwald, Adrien Coulet, Robert R. Freimuth, Iker Huerga, Joanne S. Luciano, Elgar Pichler, Robert L. Powers, Eric Prud’hommeaux, Frederick Whipple, M. Scott Marshall, Michel Dumontier. AMIA 2012.<br />
* M Samwald, H Stenzhorn, M Dumontier, MS Marshall, J Luciano, KP Adlassnig. Towards an interoperable information infrastructure providing decision support for genomic medicine. Stud Health Technol Inform. 2011;169:165-9. [http://www.ncbi.nlm.nih.gov/pubmed/21893735 PubMed]<br />
<br />
==Related==<br />
* [[HCLSIG/TranslationalMedicine/pharmacogenomics|pharmacogenomics resources]]<br />
* [http://informatics.mayo.edu/phont PHONT@Mayo]<br />
* [http://pgrn.org/display/pgrnwebsite/PGRN+Home PGRN - Pharmacogenomics Research Network]</div>Rboycehttps://www.w3.org/wiki/index.php?title=HCLSIG/Pharmacogenomics&diff=68259HCLSIG/Pharmacogenomics2013-09-10T15:53:48Z<p>Rboyce: /* Until March 1, 2014 */</p>
<hr />
<div>=Clinical Pharmacogenomics=<br />
<br />
Understanding how the genetic makeup of individual patients influences the response to pharmaceuticals is essential to the realization of more effective, personalized pharmacotherapy. However, the integration of drug, genotype and phenotype knowledge in medical information systems and its subsequent use in clinical decision making remains a key challenge. This task force is dedicated to developing a biomedical informatics infrastructure that leverages Semantic Web technologies to capture pharmacogenomic findings in such a way that they can be used to inform medical practitioners regarding approved drugs with pharmacogenomic labels. We suggest that such computational technologies are an essential part of personalized medicine, providing key information for translational medicine and clinical care.<br />
<br />
[[File:Patients are prescribed the same thing but are different.jpg|600px]]<br />
<br />
This Task Force is led by [[MichelDumontier|Michel Dumontier]] and [[MatthiasSamwald|Matthias Samwald]].<br />
<br />
== Teleconference ==<br />
This task force meets on the 1st and 3rd Wednesday at 10:15am EST. Be an active participant and join our next call:<br />
* [[/Meetings/2013-06-05_Conference_Call|Next Meeting]]<br />
* [http://www.w3.org/wiki/Category:HCLS_PGX Past Meetings]<br />
<br />
== Goals ==<br />
* Capture use cases for pharmacogenomic clinical research and medicine<br />
* Formalize the representation of pharmacogenomics-related information using Semantic Web technologies and ontologies<br />
* Demonstrate the integration of (genomic) patient data with biomedical resources (SNPs, drugs, genes, trials, treatments, adverse events)<br />
* Demonstrate a working interface to explore and query pharmacogenomic knowledge<br />
* Demonstrate how such representations can be used for clinical decision support<br />
<br />
== Participants ==<br />
* [[MichelDumontier|Michel Dumontier (co-chair)]]<br />
* [[MatthiasSamwald|Matthias Samwald (co-chair)]]<br />
* Eric Prud'hommeaux (W3C tech)<br />
* Robert Freimuth<br />
* Simon Lin<br />
* Richard D. Boyce<br />
* Bob Powers<br />
* Joanne S. Luciano<br />
* M. Scott Marshall<br />
* Elgar Pichler<br />
* Adrien Coulet<br />
* Ratnesh Sahay<br />
<br />
== Schedule ==<br />
<br />
=== Until September 1, 2013 ===<br />
* Matthias Samwald<br />
** Paper and presentation at OWL Reasoner workshop - ''DONE''<br />
** Have post-doc in Vienna working on the project full-time - ''DONE''<br />
** Finalize re-implementation of decision support service (behind http://safety-code.org/) in TrOWL. Need to establish collaboration with Jeff Pan and his group. - ''DONE''<br />
** Submit paper about Genomic CDS to the journal Bioinformatics - ''ALMOST DONE (submission expected mid-September).''<br />
** Tidy up / reorganize the W3C wiki - ''DONE''<br />
** Tidy up / reorganize http://www.genomic-cds.org/ - ''OVERDUE''<br />
* Richard Boyce<br />
** Semantically annotated product label information for [1] published as valid Open Data Annotation and with best practice provenance (including IAA) - ''DONE'' (for a sub-set http://www.youtube.com/watch?v=Te546vOiruo, continuing the project)<br />
* Michel Dumontier<br />
** Describe work done this past term with curating class-level drug-drug interactions<br />
*** [https://docs.google.com/file/d/0B1-qT2rHHTkFX2VMRlJfNldHeG8/edit?usp=sharing poster] of curated interactions from drugbank; Honors Thesis Student Holly Surins<br />
** Curate and validate annotations for drug effects, drug indications and drug-drug indications on drug product labels (with Rich)<br />
*** installing DOMEO - in progress<br />
** Identify and develop priority datasets to be included in Bio2RDF that would benefit this task force <br />
*** rxnorm<br />
*** stitch<br />
* Bob Freimuth<br />
** Review Cerner's [http://www.cerner.com/about_cerner/clinical_bioinformatics_ontology/ CBO] (strengths/gaps for CDS) - ''ONGOING (had a look at it together with Matthias during Medinfo, Matthias to write down some impressions - a shared Google doc should be started)''<br />
** Discuss modeling approaches for genomic CDS (what's the trigger? integration with/maintenance of knowledge base; review existing implementations (TPP); feedback to CPIC?)<br />
** Discuss generalization of med safety code approach to NGS of PGx genes - ''DONE''<br />
** Apply generalized MSC to a real data set (demonstration)<br />
** Consider opportunities to partner with eMERGE and/or PGRN (user base for generalized MSC) - ''ONGOING''<br />
* Simon Lin<br />
** Shape a scientific manuscript out of the use-case document draft at docs.google.com/document/d/12leHdI-GT2dzRgvVIx13wsXAaCLj6eNGvQ-L2B749f4/edit<br />
<br />
=== Until December 1, 2013 ===<br />
<br />
* Matthias Samwald<br />
** Finalize overhaul and update of Genomic CDS ontology - ''ONGOING''<br />
** Need to establish more official collaboration with PharmGKB (Michel moving to Stanford might facilitate that further)<br />
** More official collaboration with the group at Mayo Clinic (Bob Freimuth?)<br />
** Finalize grant application for additional medical postdoc position -- for someone working on the biomedical details / curation / working and evaluating with doctors - ''DONE''<br />
** Enable loading VCF files into the Genomic CDS-based decision support service - ''ONGOING''<br />
* Richard Boyce<br />
** Qualitative and possibly quantitative (e.g., task-oriented usability comparison) data on the value of integrating the semantic annotations into a prototype clinical pharmacogenomic information system. ''ONGOING'' <br />
** Acquisition of pilot funding (~25K) for development ''ONGOING''<br />
** A conference paper describing the above research activities ''IN PROGRESS''<br />
* Michel Dumontier<br />
** Be settled at Stanford University<br />
** Link drug-drug interactions with pharmacogenomic interactions<br />
** Identify and develop priority datasets to be included in Bio2RDF that would benefit this task force<br />
** Build relationships with associated partners on RDF data publication, use and analysis<br />
* Bob Freimuth<br />
** Apply ontological genomic CDS approach to key CPIC guidelines (demonstration, evaluation)<br />
** Demonstrate gene-gene-drug CDS, ensure approach will scale<br />
<br />
=== Until March 1, 2014 ===<br />
<br />
* Matthias Samwald<br />
** Genomic CDS ontology covers everything it needs to cover; the connection to Rich's annotation work is continuously kept up to date<br />
** Begin partnering with external organizations (clinics, doctors, pharmacies, pharmaceutical companies) to start evaluating and disseminating the ontology and decision support solutions (as a pure research project only, since it is still undecided how best to get the software certified as a medical device). Marshfield Clinic might be a good partner organisation for pilot studies?<br />
* Richard Boyce<br />
** Demonstrate integration of triggers and recommendations present in the Genomic CDS ontology within the UPMC system using Quest lab results and focused meds (e.g., warfarin, clopidogrel, and some psych drugs). Further qualitative inquiry. - ''IN PROGRESS''<br />
** Submission of a journal article describing the results from all above research activities. - ''IN PROGRESS''<br />
** Submission of a journal paper presenting an analysis of pharmgx statements in product labels, including the range of content, frequency of updates, contrasts with other sources, and recommendations for clinicians, drug information compendia, and the FDA - ''IN PROGRESS''<br />
* Michel Dumontier<br />
** Identify putative animal models for pharmacogenomic outcomes<br />
* Bob Freimuth<br />
** Consider application of gene-drug class and gene-drug-drug rules (leverages drug-drug interaction data)<br />
<br />
References: <br />
<br />
[1] atomoxetine, atorvastatin, boceprevir, capecitabine, carbamazepine, carvedilol, celecoxib, cisplatin, citalopram, clobazam, clomipramine, clopidogrel, clozapine, codeine, dapsone, desipramine, dexlansoprazole, diazepam, doxepin, esomeprazole, fluorouracil, fluoxetine, flurbiprofen, fluvoxamine, iloperidone, imipramine, irinotecan, ivacaftor, mercaptopurine, metoprolol, nefazodone, nortriptyline, omeprazole, pantoprazole, paroxetine, peginterferon, alfa-2b-il28b, perphenazine, phenytoin, pimozide, pravastatin, propafenone, propranolol, protriptyline, quinidine, rabeprazole, rasburicase, rifampin, risperidone, telaprevir, terbinafine, tetrabenazine, thioguanine, thioridazine, ticagrelor, tramadol, venlafaxine, and voriconazole<br />
<br />
==Deliverables==<br />
===Resources===<br />
* [http://www.genomic-cds.org/ Genomic CDS ontology project]<br />
* [http://safety-code.org/ Medicine Safety Code project]<br />
* [[/Use Cases|Use Cases]]<br />
* [http://goo.gl/afRh2 Conceptual Model from Michel]<br />
* [http://goo.gl/IUGns Formalization ideas from Michel]<br />
* [[/Data Sources|Data Sources]]<br />
* [[/Queries|Competency Questions and Queries]]<br />
* Decision Support Rules<br />
<br />
===Selected publications and presentations===<br />
* M Samwald. „Semantically Enabling Genetic Medicine to Facilitate Patients and Guidelines Matching and Enhanced Clinical Decision Support“ Proceedings of the Conference on Semantics in Healthcare and Life Sciences 2013 (CSHALS 2013), February 28, 2013, Cambridge/Boston, Massachusetts, USA - [http://de.slideshare.net/matthiassamwald/samwald-cshals2013-20741065 Slides]<br />
* M Samwald, KP Adlassnig. „Pharmacogenomics in the pocket of every patient? A prototype based on Quick Response (QR) codes“ J Am Med Inform Assoc, Published Online First: 23 Jan 2013, http://dx.doi.org/10.1136/amiajnl-2012-001275 - [http://samwald.info/res/Pharmacogenomics%20in%20the%20pocket%20of%20every%20patient%20-%20A%20prototype%20based%20on%20Quick%20Response%20%28QR%29%20codes%20-%20PREPRINT.pdf Link to openly accessible preprint version]<br />
* An Ontology-based Formalism, Knowledge Base and Reasoning System for Clinical Genetics. Samwald, M., Freimuth, R., Powers, R., Luciano, J., Prud’hommeaux, E., Boyce, R., Marshall, M., Dumontier, M. Poster presentation at the 2013 AMIA Summit on Translational Bioinformatics. San Francisco, March, 2013.<br />
* Toward semantic modeling of pharmacogenomic knowledge for clinical and translational decision support. PE Boyce, RD., Freimuth, RR., Romagnoli, KM., Pummer. The 2013 AMIA Summit on Translational Bioinformatics, San Francisco, CA. PubMed Central- Pending. ([http://www.slideshare.net/boycer/pharmgx-annotationamiatbi2013 Summary on SlideShare])<br />
* M Samwald, A Coulet, I Huerga, RL Powers, JS Luciano, RR Freimuth, F Whipple, E Pichler, E Prud'hommeaux, M Dumontier, MS Marshall. Semantically enabling pharmacogenomic data for the realization of personalized medicine. Pharmacogenomics. 2012 Jan;13(2):201-12. [http://www.ncbi.nlm.nih.gov/pubmed/22256869 PubMed]<br />
* An informatics infrastructure for translating pharmacogenomic knowledge into clinical practice. Matthias Samwald, Adrien Coulet, Robert R. Freimuth, Iker Huerga, Joanne S. Luciano, Elgar Pichler, Robert L. Powers, Eric Prud’hommeaux, Frederick Whipple, M. Scott Marshall, Michel Dumontier. AMIA 2012.<br />
* M Samwald, H Stenzhorn, M Dumontier, MS Marshall, J Luciano, KP Adlassnig. Towards an interoperable information infrastructure providing decision support for genomic medicine. Stud Health Technol Inform. 2011;169:165-9. [http://www.ncbi.nlm.nih.gov/pubmed/21893735 PubMed]<br />
<br />
==Related==<br />
* [[HCLSIG/TranslationalMedicine/pharmacogenomics|pharmacogenomics resources]]<br />
* [http://informatics.mayo.edu/phont PHONT@Mayo]<br />
* [http://pgrn.org/display/pgrnwebsite/PGRN+Home PGRN - Pharmacogenomics Research Network]</div>Rboycehttps://www.w3.org/wiki/index.php?title=HCLSIG/Pharmacogenomics&diff=68258HCLSIG/Pharmacogenomics2013-09-10T15:53:37Z<p>Rboyce: /* Until March 1, 2014 */</p>
<hr />
<div>=Clinical Pharmacogenomics=<br />
<br />
Understanding how the genetic makeup of individual patients influences the response to pharmaceuticals is essential to the realization of more effective, personalized pharmacotherapy. However, the integration of drug, genotype and phenotype knowledge in medical information systems and its subsequent use in clinical decision making remains a key challenge. This task force is dedicated to developing a biomedical informatics infrastructure that leverages Semantic Web technologies to capture pharmacogenomic findings in such a way that they can be used to inform medical practitioners regarding approved drugs with pharmacogenomic labels. We suggest that such computational technologies are an essential part of personalized medicine, providing key information for translational medicine and clinical care.<br />
<br />
[[File:Patients are prescribed the same thing but are different.jpg|600px]]<br />
<br />
This Task Force is led by [[MichelDumontier|Michel Dumontier]] and [[MatthiasSamwald|Matthias Samwald]].<br />
<br />
== Teleconference ==<br />
This task force meets on the 1st and 3rd Wednesday at 10:15am EST. Be an active participant and join our next call:<br />
* [[/Meetings/2013-06-05_Conference_Call|Next Meeting]]<br />
* [http://www.w3.org/wiki/Category:HCLS_PGX Past Meetings]<br />
<br />
== Goals ==<br />
* Capture use cases for pharmacogenomic clinical research and medicine<br />
* Formalize the representation of pharmacogenomics-related information using Semantic Web technologies and ontologies<br />
* Demonstrate the integration of (genomic) patient data with biomedical resources (SNPs, drugs, genes, trials, treatments, adverse events)<br />
* Demonstrate a working interface to explore and query pharmacogenomic knowledge<br />
* Demonstrate how such representations can be used for clinical decision support<br />
<br />
== Participants ==<br />
* [[MichelDumontier|Michel Dumontier (co-chair)]]<br />
* [[MatthiasSamwald|Matthias Samwald (co-chair)]]<br />
* Eric Prud'hommeaux (W3C tech)<br />
* Robert Freimuth<br />
* Simon Lin<br />
* Richard D. Boyce<br />
* Bob Powers<br />
* Joanne S. Luciano<br />
* M. Scott Marshall<br />
* Elgar Pichler<br />
* Adrien Coulet<br />
* Ratnesh Sahay<br />
<br />
== Schedule ==<br />
<br />
=== Until September 1, 2013 ===<br />
* Matthias Samwald<br />
** Paper and presentation at OWL Reasoner workshop - ''DONE''<br />
** Have post-doc in Vienna working on the project full-time - ''DONE''<br />
** Finalize re-implementation of decision support service (behind http://safety-code.org/) in TrOWL. Need to establish collaboration with Jeff Pan and his group. - ''DONE''<br />
** Submit paper about Genomic CDS to the journal Bioinformatics - ''ALMOST DONE (submission expected mid-September).''<br />
** Tidy up / reorganize the W3C wiki - ''DONE''<br />
** Tidy up / reorganize http://www.genomic-cds.org/ - ''OVERDUE''<br />
* Richard Boyce<br />
** Semantically annotated product label information for [1] published as valid Open Data Annotation and with best practice provenance (including IAA) - ''DONE'' (for a sub-set http://www.youtube.com/watch?v=Te546vOiruo, continuing the project)<br />
* Michel Dumontier<br />
** Describe work done this past term with curating class-level drug-drug interactions<br />
*** [https://docs.google.com/file/d/0B1-qT2rHHTkFX2VMRlJfNldHeG8/edit?usp=sharing poster] of curated interactions from drugbank; Honors Thesis Student Holly Surins<br />
** Curate and validate annotations for drug effects, drug indications and drug-drug indications on drug product labels (with Rich)<br />
*** installing DOMEO - in progress<br />
** Identify and develop priority datasets to be included in Bio2RDF that would benefit this task force <br />
*** rxnorm<br />
*** stitch<br />
* Bob Freimuth<br />
** Review Cerner's [http://www.cerner.com/about_cerner/clinical_bioinformatics_ontology/ CBO] (strengths/gaps for CDS) - ''ONGOING (had a look at it together with Matthias during Medinfo, Matthias to write down some impressions - a shared Google doc should be started)''<br />
** Discuss modeling approaches for genomic CDS (what's the trigger? integration with/maintenance of knowledge base; review existing implementations (TPP); feedback to CPIC?)<br />
** Discuss generalization of med safety code approach to NGS of PGx genes - ''DONE''<br />
** Apply generalized MSC to a real data set (demonstration)<br />
** Consider opportunities to partner with eMERGE and/or PGRN (user base for generalized MSC) - ''ONGOING''<br />
* Simon Lin<br />
** Shape a scientific manuscript out of the use-case document draft at docs.google.com/document/d/12leHdI-GT2dzRgvVIx13wsXAaCLj6eNGvQ-L2B749f4/edit<br />
<br />
=== Until December 1, 2013 ===<br />
<br />
* Matthias Samwald<br />
** Finalize overhaul and update of Genomic CDS ontology - ''ONGOING''<br />
** Need to establish more official collaboration with PharmGKB (Michel moving to Stanford might facilitate that further)<br />
** More official collaboration with the group at Mayo Clinic (Bob Freimuth?)<br />
** Finalize grant application for additional medical postdoc position -- for someone working on the biomedical details / curation / working and evaluating with doctors - ''DONE''<br />
** Enable loading VCF files into the Genomic CDS-based decision support service - ''ONGOING''<br />
* Richard Boyce<br />
** Qualitative and possibly quantitative (e.g., task-oriented usability comparison) data on the value of integrating the semantic annotations into a prototype clinical pharmacogenomic information system. ''ONGOING'' <br />
** Acquisition of pilot funding (~25K) for development ''ONGOING''<br />
** A conference paper describing the above research activities ''IN PROGRESS''<br />
* Michel Dumontier<br />
** Be settled at Stanford University<br />
** Link drug-drug interactions with pharmacogenomic interactions<br />
** Identify and develop priority datasets to be included in Bio2RDF that would benefit this task force<br />
** Build relationships with associated partners on RDF data publication, use and analysis<br />
* Bob Freimuth<br />
** Apply ontological genomic CDS approach to key CPIC guidelines (demonstration, evaluation)<br />
** Demonstrate gene-gene-drug CDS, ensure approach will scale<br />
<br />
=== Until March 1, 2014 ===<br />
<br />
* Matthias Samwald<br />
** Genomic CDS ontology covers everything it needs to cover; the connection to Rich's annotation work is continuously kept up to date<br />
** Begin partnering with external organizations (clinics, doctors, pharmacies, pharmaceutical companies) to start evaluating and disseminating the ontology and decision support solutions (as a pure research project only, since it is still undecided how best to get the software certified as a medical device). Marshfield Clinic might be a good partner organisation for pilot studies?<br />
* Richard Boyce<br />
** Demonstrate integration of triggers and recommendations present in the Genomic CDS ontology within the UPMC system using Quest lab results and focused meds (e.g., warfarin, clopidogrel, and some psych drugs). Further qualitative inquiry. - ''IN PROGRESS 9/10/13''<br />
** Submission of a journal article describing the results from all above research activities. - ''IN PROGRESS''<br />
** Submission of a journal paper presenting an analysis of pharmgx statements in product labels, including the range of content, frequency of updates, contrasts with other sources, and recommendations for clinicians, drug information compendia, and the FDA - ''IN PROGRESS''<br />
* Michel Dumontier<br />
** Identify putative animal models for pharmacogenomic outcomes<br />
* Bob Freimuth<br />
** Consider application of gene-drug class and gene-drug-drug rules (leverages drug-drug interaction data)<br />
<br />
References: <br />
<br />
[1] atomoxetine, atorvastatin, boceprevir, capecitabine, carbamazepine, carvedilol, celecoxib, cisplatin, citalopram, clobazam, clomipramine, clopidogrel, clozapine, codeine, dapsone, desipramine, dexlansoprazole, diazepam, doxepin, esomeprazole, fluorouracil, fluoxetine, flurbiprofen, fluvoxamine, iloperidone, imipramine, irinotecan, ivacaftor, mercaptopurine, metoprolol, nefazodone, nortriptyline, omeprazole, pantoprazole, paroxetine, peginterferon, alfa-2b-il28b, perphenazine, phenytoin, pimozide, pravastatin, propafenone, propranolol, protriptyline, quinidine, rabeprazole, rasburicase, rifampin, risperidone, telaprevir, terbinafine, tetrabenazine, thioguanine, thioridazine, ticagrelor, tramadol, venlafaxine, and voriconazole<br />
<br />
==Deliverables==<br />
===Resources===<br />
* [http://www.genomic-cds.org/ Genomic CDS ontology project]<br />
* [http://safety-code.org/ Medicine Safety Code project]<br />
* [[/Use Cases|Use Cases]]<br />
* [http://goo.gl/afRh2 Conceptual Model from Michel]<br />
* [http://goo.gl/IUGns Formalization ideas from Michel]<br />
* [[/Data Sources|Data Sources]]<br />
* [[/Queries|Competency Questions and Queries]]<br />
* Decision Support Rules<br />
<br />
===Selected publications and presentations===<br />
* M Samwald. „Semantically Enabling Genetic Medicine to Facilitate Patients and Guidelines Matching and Enhanced Clinical Decision Support“ Proceedings of the Conference on Semantics in Healthcare and Life Sciences 2013 (CSHALS 2013), February 28, 2013, Cambridge/Boston, Massachusetts, USA - [http://de.slideshare.net/matthiassamwald/samwald-cshals2013-20741065 Slides]<br />
* M Samwald, KP Adlassnig. „Pharmacogenomics in the pocket of every patient? A prototype based on Quick Response (QR) codes“ J Am Med Inform Assoc, Published Online First: 23 Jan 2013, http://dx.doi.org/10.1136/amiajnl-2012-001275 - [http://samwald.info/res/Pharmacogenomics%20in%20the%20pocket%20of%20every%20patient%20-%20A%20prototype%20based%20on%20Quick%20Response%20%28QR%29%20codes%20-%20PREPRINT.pdf Link to openly accessible preprint version]<br />
* An Ontology-based Formalism, Knowledge Base and Reasoning System for Clinical Genetics. Samwald, M., Freimuth, R., Powers, R., Luciano, J., Prud’hommeaux, E., Boyce, R., Marshall, M., Dumontier, M. Poster presentation at the 2013 AMIA Summit on Translational Bioinformatics. San Francisco, March, 2013.<br />
* Toward semantic modeling of pharmacogenomic knowledge for clinical and translational decision support. PE Boyce, RD., Freimuth, RR., Romagnoli, KM., Pummer. The 2013 AMIA Summit on Translational Bioinformatics, San Francisco, CA. PubMed Central- Pending. ([http://www.slideshare.net/boycer/pharmgx-annotationamiatbi2013 Summary on SlideShare])<br />
* M Samwald, A Coulet, I Huerga, RL Powers, JS Luciano, RR Freimuth, F Whipple, E Pichler, E Prud'hommeaux, M Dumontier, MS Marshall. Semantically enabling pharmacogenomic data for the realization of personalized medicine. Pharmacogenomics. 2012 Jan;13(2):201-12. [http://www.ncbi.nlm.nih.gov/pubmed/22256869 PubMed]<br />
* An informatics infrastructure for translating pharmacogenomic knowledge into clinical practice. Matthias Samwald, Adrien Coulet, Robert R. Freimuth, Iker Huerga, Joanne S. Luciano, Elgar Pichler, Robert L. Powers, Eric Prud’hommeaux, Frederick Whipple, M. Scott Marshall, Michel Dumontier. AMIA 2012.<br />
* M Samwald, H Stenzhorn, M Dumontier, MS Marshall, J Luciano, KP Adlassnig. Towards an interoperable information infrastructure providing decision support for genomic medicine. Stud Health Technol Inform. 2011;169:165-9. [http://www.ncbi.nlm.nih.gov/pubmed/21893735 PubMed]<br />
<br />
==Related==<br />
* [[HCLSIG/TranslationalMedicine/pharmacogenomics|pharmacogenomics resources]]<br />
* [http://informatics.mayo.edu/phont PHONT@Mayo]<br />
* [http://pgrn.org/display/pgrnwebsite/PGRN+Home PGRN - Pharmacogenomics Research Network]</div>Rboycehttps://www.w3.org/wiki/index.php?title=HCLSIG/Pharmacogenomics&diff=68257HCLSIG/Pharmacogenomics2013-09-10T15:44:03Z<p>Rboyce: /* Until December 1, 2013 */</p>
<hr />
<div>=Clinical Pharmacogenomics=<br />
<br />
Understanding how the genetic makeup of individual patients influences the response to pharmaceuticals is essential to the realization of more effective, personalized pharmacotherapy. However, the integration of drug, genotype and phenotype knowledge in medical information systems and its subsequent use in clinical decision making remains a key challenge. This task force is dedicated to developing a biomedical informatics infrastructure that leverages Semantic Web technologies to capture pharmacogenomic findings in such a way that they can be used to inform medical practitioners regarding approved drugs with pharmacogenomic labels. We suggest that such computational technologies are an essential part of personalized medicine, providing key information for translational medicine and clinical care.<br />
<br />
[[File:Patients are prescribed the same thing but are different.jpg|600px]]<br />
<br />
This Task Force is led by [[MichelDumontier|Michel Dumontier]] and [[MatthiasSamwald|Matthias Samwald]].<br />
<br />
== Teleconference ==<br />
This task force meets on the 1st and 3rd Wednesday at 10:15am EST. Be an active participant and join our next call:<br />
* [[/Meetings/2013-06-05_Conference_Call|Next Meeting]]<br />
* [http://www.w3.org/wiki/Category:HCLS_PGX Past Meetings]<br />
<br />
== Goals ==<br />
* Capture use cases for pharmacogenomic clinical research and medicine<br />
* Formalize the representation of pharmacogenomics-related information using Semantic Web technologies and ontologies<br />
* Demonstrate the integration of (genomic) patient data with biomedical resources (SNPs, drugs, genes, trials, treatments, adverse events)<br />
* Demonstrate a working interface to explore and query pharmacogenomic knowledge<br />
* Demonstrate how such representations can be used for clinical decision support<br />
<br />
== Participants ==<br />
* [[MichelDumontier|Michel Dumontier (co-chair)]]<br />
* [[MatthiasSamwald|Matthias Samwald (co-chair)]]<br />
* Eric Prud'hommeaux (W3C tech)<br />
* Robert Freimuth<br />
* Simon Lin<br />
* Richard D. Boyce<br />
* Bob Powers<br />
* Joanne S. Luciano<br />
* M. Scott Marshall<br />
* Elgar Pichler<br />
* Adrien Coulet<br />
* Ratnesh Sahay<br />
<br />
== Schedule ==<br />
<br />
=== Until September 1, 2013 ===<br />
* Matthias Samwald<br />
** Paper and presentation at OWL Reasoner workshop - ''DONE''<br />
** Have post-doc in Vienna working on the project full-time - ''DONE''<br />
** Finalize re-implementation of decision support service (behind http://safety-code.org/) in TrOWL. Need to establish collaboration with Jeff Pan and his group. - ''DONE''<br />
** Submit paper about Genomic CDS to the journal Bioinformatics - ''ALMOST DONE (submission expected mid-September).''<br />
** Tidy up / reorganize the W3C wiki - ''DONE''<br />
** Tidy up / reorganize http://www.genomic-cds.org/ - ''OVERDUE''<br />
* Richard Boyce<br />
** Semantically annotated product label information for [1] published as valid Open Data Annotation and with best practice provenance (including IAA) - ''DONE'' (for a sub-set http://www.youtube.com/watch?v=Te546vOiruo, continuing the project)<br />
* Michel Dumontier<br />
** Describe work done this past term with curating class-level drug-drug interactions<br />
*** [https://docs.google.com/file/d/0B1-qT2rHHTkFX2VMRlJfNldHeG8/edit?usp=sharing poster] of curated interactions from drugbank; Honors Thesis Student Holly Surins<br />
** Curate and validate annotations for drug effects, drug indications and drug-drug indications on drug product labels (with Rich)<br />
*** installing DOMEO - in progress<br />
** Identify and develop priority datasets to be included in Bio2RDF that would benefit this task force <br />
*** rxnorm<br />
*** stitch<br />
* Bob Freimuth<br />
** Review Cerner's [http://www.cerner.com/about_cerner/clinical_bioinformatics_ontology/ CBO] (strengths/gaps for CDS) - ''ONGOING (had a look at it together with Matthias during Medinfo, Matthias to write down some impressions - a shared Google doc should be started)''<br />
** Discuss modeling approaches for genomic CDS (what's the trigger? integration with/maintenance of knowledge base; review existing implementations (TPP); feedback to CPIC?)<br />
** Discuss generalization of med safety code approach to NGS of PGx genes - ''DONE''<br />
** Apply generalized MSC to a real data set (demonstration)<br />
** Consider opportunities to partner with eMERGE and/or PGRN (user base for generalized MSC) - ''ONGOING''<br />
* Simon Lin<br />
** Shape a scientific manuscript out of the use-case document draft at docs.google.com/document/d/12leHdI-GT2dzRgvVIx13wsXAaCLj6eNGvQ-L2B749f4/edit<br />
<br />
=== Until December 1, 2013 ===<br />
<br />
* Matthias Samwald<br />
** Finalize overhaul and update of Genomic CDS ontology - ''ONGOING''<br />
** Need to establish more official collaboration with PharmGKB (Michel moving to Stanford might facilitate that further)<br />
** More official collaboration with the group at Mayo Clinic (Bob Freimuth?)<br />
** Finalize grant application for additional medical postdoc position -- for someone working on the biomedical details / curation / working and evaluating with doctors - ''DONE''<br />
** Enable loading VCF files into Genomic CDS - based decision support service. -- ''ONGOING''<br />
* Richard Boyce<br />
** Qualitative and possibly quantitative (e.g., task-oriented usability comparison) data on the value of integrating the semantic annotations into a prototype clinical pharmacogenomic information system. ''ONGOING'' <br />
** Acquisition of pilot funding (~25K) for development ''ONGOING''<br />
** A conference paper describing the above research activities ''IN PROGRESS''<br />
* Michel Dumontier<br />
** Be settled at Stanford University<br />
** Link drug-drug interactions with pharmacogenomic interactions<br />
** Identify and develop priority datasets to be included in Bio2RDF that would benefit this task force<br />
** Build relationships with associated partners on RDF data publication, use and analysis<br />
* Bob Freimuth<br />
** Apply ontological genomic CDS approach to key CPIC guidelines (demonstration, evaluation)<br />
** Demonstrate gene-gene-drug CDS, ensure approach will scale<br />
<br />
=== Until March 1, 2014 ===<br />
<br />
* Matthias Samwald<br />
** Genomic CDS ontology covers everything it needs to cover; the connection to Rich's annotation work is continuously kept up to date<br />
** Begin partnering with external organizations (clinics, doctors, pharmacies, pharmaceutical companies) to start evaluating and disseminating the ontology and decision support solutions (as a pure research project only, since it is still undecided how best to get the software certified as a medical device). Marshfield Clinic might be a good partnering organisation for pilot studies?<br />
* Richard Boyce<br />
** Demonstration of the integration of triggers and recommendations present in the Genomic CDS ontology within the UPMC system using Quest lab results and focused meds (e.g., warfarin, clopidogrel, and some psych drugs). Further qualitative inquiry. - ''IN PROGRESS 9/10/13''<br />
** Submission of a journal article describing the results from all above research activities. - ''IN PROGRESS 9/10/13''<br />
** Submission of a journal paper presenting an analysis of pharmgx statements in product labels, including the range of content, frequency of updates, contrasts with other sources, and recommendations for clinicians, drug information compendia, and the FDA - ''IN PROGRESS 9/10/13''<br />
* Michel Dumontier<br />
** Identify putative animal models for pharmacogenomic outcomes<br />
* Bob Freimuth<br />
** Consider application of gene-drug class and gene-drug-drug rules (leverages drug-drug interaction data)<br />
<br />
References: <br />
<br />
[1] atomoxetine, atorvastatin, boceprevir, capecitabine, carbamazepine, carvedilol, celecoxib, cisplatin, citalopram, clobazam, clomipramine, clopidogrel, clozapine, codeine, dapsone, desipramine, dexlansoprazole, diazepam, doxepin, esomeprazole, fluorouracil, fluoxetine, flurbiprofen, fluvoxamine, iloperidone, imipramine, irinotecan, ivacaftor, mercaptopurine, metoprolol, nefazodone, nortriptyline, omeprazole, pantoprazole, paroxetine, peginterferon, alfa-2b-il28b, perphenazine, phenytoin, pimozide, pravastatin, propafenone, propranolol, protriptyline, quinidine, rabeprazole, rasburicase, rifampin, risperidone, telaprevir, terbinafine, tetrabenazine, thioguanine, thioridazine, ticagrelor, tramadol, venlafaxine, and voriconazole<br />
<br />
==Deliverables==<br />
===Resources===<br />
* [http://www.genomic-cds.org/ Genomic CDS ontology project]<br />
* [http://safety-code.org/ Medicine Safety Code project]<br />
* [[/Use Cases|Use Cases]]<br />
* [http://goo.gl/afRh2 Conceptual Model from Michel]<br />
* [http://goo.gl/IUGns Formalization ideas from Michel]<br />
* [[/Data Sources|Data Sources]]<br />
* [[/Queries|Competency Questions and Queries]]<br />
* Decision Support Rules<br />
<br />
===Selected publications and presentations===<br />
* M Samwald. „Semantically Enabling Genetic Medicine to Facilitate Patients and Guidelines Matching and Enhanced Clinical Decision Support“ Proceedings of the Conference on Semantics in Healthcare and Life Sciences 2013 (CSHALS 2013), February 28, 2013, Cambridge/Boston, Massachusetts, USA - [http://de.slideshare.net/matthiassamwald/samwald-cshals2013-20741065 Slides]<br />
* M Samwald, KP Adlassnig. „Pharmacogenomics in the pocket of every patient? A prototype based on Quick Response (QR) codes“ J Am Med Inform Assoc, Published Online First: 23 Jan 2013, http://dx.doi.org/10.1136/amiajnl-2012-001275 - [http://samwald.info/res/Pharmacogenomics%20in%20the%20pocket%20of%20every%20patient%20-%20A%20prototype%20based%20on%20Quick%20Response%20%28QR%29%20codes%20-%20PREPRINT.pdf Link to openly accessible preprint version]<br />
* An Ontology-based Formalism, Knowledge Base and Reasoning System for Clinical Genetics. Samwald, M., Freimuth, R., Powers, R., Luciano, J., Prud’hommeaux, E., Boyce, R., Marshall, M., Dumontier, M. Poster presentation at the 2013 AMIA Summit on Translational Bioinformatics. San Francisco, March, 2013.<br />
* Toward semantic modeling of pharmacogenomic knowledge for clinical and translational decision support. PE Boyce, RD., Freimuth, RR., Romagnoli, KM., Pummer. The 2013 AMIA Summit on Translational Bioinformatics, San Francisco, CA. PubMed Central- Pending. ([http://www.slideshare.net/boycer/pharmgx-annotationamiatbi2013 Summary on SlideShare])<br />
* M Samwald, A Coulet, I Huerga, RL Powers, JS Luciano, RR Freimuth, F Whipple, E Pichler, E Prud'hommeaux, M Dumontier, MS Marshall. Semantically enabling pharmacogenomic data for the realization of personalized medicine. Pharmacogenomics. 2012 Jan;13(2):201-12. [http://www.ncbi.nlm.nih.gov/pubmed/22256869 PubMed]<br />
* An informatics infrastructure for translating pharmacogenomic knowledge into clinical practice. Matthias Samwald, Adrien Coulet, Robert R. Freimuth, Iker Huerga, Joanne S. Luciano, Elgar Pichler, Robert L. Powers, Eric Prud’hommeaux, Frederick Whipple, M. Scott Marshall, Michel Dumontier. AMIA 2012.<br />
* M Samwald, H Stenzhorn, M Dumontier, MS Marshall, J Luciano, KP Adlassnig. Towards an interoperable information infrastructure providing decision support for genomic medicine. Stud Health Technol Inform. 2011;169:165-9. [http://www.ncbi.nlm.nih.gov/pubmed/21893735 PubMed]<br />
<br />
==Related==<br />
* [[HCLSIG/TranslationalMedicine/pharmacogenomics|pharmacogenomics resources]]<br />
* [http://informatics.mayo.edu/phont PHONT@Mayo]<br />
* [http://pgrn.org/display/pgrnwebsite/PGRN+Home PGRN - Pharmacogenomics Research Network]</div>Rboycehttps://www.w3.org/wiki/index.php?title=HCLSIG/Pharmacogenomics&diff=68256HCLSIG/Pharmacogenomics2013-09-10T15:43:13Z<p>Rboyce: /* Until September 1, 2013 */</p>
<hr />
<div>=Clinical Pharmacogenomics=<br />
<br />
Understanding how the genetic makeup of individual patients influences the response to pharmaceuticals is essential to the realization of more effective, personalized pharmacotherapy. However, the integration of drug, genotype and phenotype knowledge in medical information systems and its subsequent use in clinical decision making remains a key challenge. This task force is dedicated to developing a biomedical informatics infrastructure that leverages Semantic Web technologies to capture pharmacogenomic findings in such a way that they can be used to inform medical practitioners regarding approved drugs with pharmacogenomic labels. We suggest that such computational technologies are an essential part of personalized medicine, providing key information for translational medicine and clinical care.<br />
<br />
[[File:Patients are prescribed the same thing but are different.jpg|600px]]<br />
<br />
This Task Force is led by [[MichelDumontier|Michel Dumontier]] and [[MatthiasSamwald|Matthias Samwald]].<br />
<br />
== Teleconference ==<br />
This task force meets on the 1st and 3rd Wednesday at 10:15am EST. Be an active participant and join our next call:<br />
* [[/Meetings/2013-06-05_Conference_Call|Next Meeting]]<br />
* [http://www.w3.org/wiki/Category:HCLS_PGX Past Meetings]<br />
<br />
== Goals ==<br />
* Capture use cases for pharmacogenomic clinical research and medicine<br />
* Formalize the representation of pharmacogenomics-related information using Semantic Web technologies and ontologies<br />
* Demonstrate the integration of (genomic) patient data with biomedical resources (SNPs, drugs, genes, trials, treatments, adverse events)<br />
* Demonstrate a working interface to explore and query pharmacogenomic knowledge<br />
* Demonstrate how such representations can be used for clinical decision support<br />
<br />
== Participants ==<br />
* [[MichelDumontier|Michel Dumontier (co-chair)]]<br />
* [[MatthiasSamwald|Matthias Samwald (co-chair)]]<br />
* Eric Prud'hommeaux (W3C tech)<br />
* Robert Freimuth<br />
* Simon Lin<br />
* Richard D. Boyce<br />
* Bob Powers<br />
* Joanne S. Luciano<br />
* M. Scott Marshall<br />
* Elgar Pichler<br />
* Adrien Coulet<br />
* Ratnesh Sahay<br />
<br />
== Schedule ==<br />
<br />
=== Until September 1, 2013 ===<br />
* Matthias Samwald<br />
** Paper and presentation at OWL Reasoner workshop - ''DONE''<br />
** Have post-doc in Vienna working on the project full-time - ''DONE''<br />
** Finalize re-implementation of decision support service (behind http://safety-code.org/) in TrOWL. Need to establish collaboration with Jeff Pan and his group. - ''DONE''<br />
** Submit paper about Genomic CDS to the journal Bioinformatics - ''ALMOST DONE (submission expected mid-September).''<br />
** Tidy up / reorganize the W3C wiki - ''DONE''<br />
** Tidy up / reorganize http://www.genomic-cds.org/ - ''OVERDUE''<br />
* Richard Boyce<br />
** Semantically annotated product label information for [1] published as valid Open Data Annotation and with best practice provenance (including IAA) - ''DONE'' (for a sub-set http://www.youtube.com/watch?v=Te546vOiruo, continuing the project)<br />
* Michel Dumontier<br />
** Describe work done this past term with curating class-level drug-drug interactions<br />
*** [https://docs.google.com/file/d/0B1-qT2rHHTkFX2VMRlJfNldHeG8/edit?usp=sharing poster] of curated interactions from drugbank; Honors Thesis Student Holly Surins<br />
** Curate and validate annotations for drug effects, drug indications and drug-drug indications on drug product labels (with Rich)<br />
*** installing DOMEO - in progress<br />
** Identify and develop priority datasets to be included in Bio2RDF that would benefit this task force <br />
*** rxnorm<br />
*** stitch<br />
* Bob Freimuth<br />
** Review Cerner's [http://www.cerner.com/about_cerner/clinical_bioinformatics_ontology/ CBO] (strengths/gaps for CDS) - ''ONGOING (had a look at it together with Matthias during Medinfo, Matthias to write down some impressions - a shared Google doc should be started)''<br />
** Discuss modeling approaches for genomic CDS (what's the trigger? integration with/maintenance of knowledge base; review existing implementations (TPP); feedback to CPIC?)<br />
** Discuss generalization of med safety code approach to NGS of PGx genes - ''DONE''<br />
** Apply generalized MSC to a real data set (demonstration)<br />
** Consider opportunities to partner with eMERGE and/or PGRN (user base for generalized MSC) - ''ONGOING''<br />
* Simon Lin<br />
** Shape a scientific manuscript out of the use-case document draft at docs.google.com/document/d/12leHdI-GT2dzRgvVIx13wsXAaCLj6eNGvQ-L2B749f4/edit<br />
<br />
=== Until December 1, 2013 ===<br />
<br />
* Matthias Samwald<br />
** Finalize overhaul and update of Genomic CDS ontology - ''ONGOING''<br />
** Need to establish more official collaboration with PharmGKB (Michel moving to Stanford might facilitate that further)<br />
** More official collaboration with the group at Mayo Clinic (Bob Freimuth?)<br />
** Finalize grant application for additional medical postdoc position -- for someone working on the biomedical details / curation / working and evaluating with doctors - ''DONE''<br />
** Enable loading VCF files into Genomic CDS - based decision support service. -- ''ONGOING''<br />
* Richard Boyce<br />
** Qualitative and possibly quantitative (e.g., task-oriented usability comparison) data on the value of integrating the semantic annotations into a prototype clinical pharmacogenomic information system. <br />
** Acquisition of pilot funding (~25K) for development <br />
** A conference paper describing the above research activities<br />
* Michel Dumontier<br />
** Be settled at Stanford University<br />
** Link drug-drug interactions with pharmacogenomic interactions<br />
** Identify and develop priority datasets to be included in Bio2RDF that would benefit this task force<br />
** Build relationships with associated partners on RDF data publication, use and analysis<br />
* Bob Freimuth<br />
** Apply ontological genomic CDS approach to key CPIC guidelines (demonstration, evaluation)<br />
** Demonstrate gene-gene-drug CDS, ensure approach will scale<br />
<br />
=== Until March 1, 2014 ===<br />
<br />
* Matthias Samwald<br />
** Genomic CDS ontology covers everything it needs to cover; the connection to Rich's annotation work is continuously kept up to date<br />
** Begin partnering with external organizations (clinics, doctors, pharmacies, pharmaceutical companies) to start evaluating and disseminating the ontology and decision support solutions (as a pure research project only, since it is still undecided how best to get the software certified as a medical device). Marshfield Clinic might be a good partnering organisation for pilot studies?<br />
* Richard Boyce<br />
** Demonstration of the integration of triggers and recommendations present in the Genomic CDS ontology within the UPMC system using Quest lab results and focused meds (e.g., warfarin, clopidogrel, and some psych drugs). Further qualitative inquiry. - ''IN PROGRESS 9/10/13''<br />
** Submission of a journal article describing the results from all above research activities. - ''IN PROGRESS 9/10/13''<br />
** Submission of a journal paper presenting an analysis of pharmgx statements in product labels, including the range of content, frequency of updates, contrasts with other sources, and recommendations for clinicians, drug information compendia, and the FDA - ''IN PROGRESS 9/10/13''<br />
* Michel Dumontier<br />
** Identify putative animal models for pharmacogenomic outcomes<br />
* Bob Freimuth<br />
** Consider application of gene-drug class and gene-drug-drug rules (leverages drug-drug interaction data)<br />
<br />
References: <br />
<br />
[1] atomoxetine, atorvastatin, boceprevir, capecitabine, carbamazepine, carvedilol, celecoxib, cisplatin, citalopram, clobazam, clomipramine, clopidogrel, clozapine, codeine, dapsone, desipramine, dexlansoprazole, diazepam, doxepin, esomeprazole, fluorouracil, fluoxetine, flurbiprofen, fluvoxamine, iloperidone, imipramine, irinotecan, ivacaftor, mercaptopurine, metoprolol, nefazodone, nortriptyline, omeprazole, pantoprazole, paroxetine, peginterferon, alfa-2b-il28b, perphenazine, phenytoin, pimozide, pravastatin, propafenone, propranolol, protriptyline, quinidine, rabeprazole, rasburicase, rifampin, risperidone, telaprevir, terbinafine, tetrabenazine, thioguanine, thioridazine, ticagrelor, tramadol, venlafaxine, and voriconazole<br />
<br />
==Deliverables==<br />
===Resources===<br />
* [http://www.genomic-cds.org/ Genomic CDS ontology project]<br />
* [http://safety-code.org/ Medicine Safety Code project]<br />
* [[/Use Cases|Use Cases]]<br />
* [http://goo.gl/afRh2 Conceptual Model from Michel]<br />
* [http://goo.gl/IUGns Formalization ideas from Michel]<br />
* [[/Data Sources|Data Sources]]<br />
* [[/Queries|Competency Questions and Queries]]<br />
* Decision Support Rules<br />
<br />
===Selected publications and presentations===<br />
* M Samwald. „Semantically Enabling Genetic Medicine to Facilitate Patients and Guidelines Matching and Enhanced Clinical Decision Support“ Proceedings of the Conference on Semantics in Healthcare and Life Sciences 2013 (CSHALS 2013), February 28, 2013, Cambridge/Boston, Massachusetts, USA - [http://de.slideshare.net/matthiassamwald/samwald-cshals2013-20741065 Slides]<br />
* M Samwald, KP Adlassnig. „Pharmacogenomics in the pocket of every patient? A prototype based on Quick Response (QR) codes“ J Am Med Inform Assoc, Published Online First: 23 Jan 2013, http://dx.doi.org/10.1136/amiajnl-2012-001275 - [http://samwald.info/res/Pharmacogenomics%20in%20the%20pocket%20of%20every%20patient%20-%20A%20prototype%20based%20on%20Quick%20Response%20%28QR%29%20codes%20-%20PREPRINT.pdf Link to openly accessible preprint version]<br />
* An Ontology-based Formalism, Knowledge Base and Reasoning System for Clinical Genetics. Samwald, M., Freimuth, R., Powers, R., Luciano, J., Prud’hommeaux, E., Boyce, R., Marshall, M., Dumontier, M. Poster presentation at the 2013 AMIA Summit on Translational Bioinformatics. San Francisco, March, 2013.<br />
* Toward semantic modeling of pharmacogenomic knowledge for clinical and translational decision support. PE Boyce, RD., Freimuth, RR., Romagnoli, KM., Pummer. The 2013 AMIA Summit on Translational Bioinformatics, San Francisco, CA. PubMed Central- Pending. ([http://www.slideshare.net/boycer/pharmgx-annotationamiatbi2013 Summary on SlideShare])<br />
* M Samwald, A Coulet, I Huerga, RL Powers, JS Luciano, RR Freimuth, F Whipple, E Pichler, E Prud'hommeaux, M Dumontier, MS Marshall. Semantically enabling pharmacogenomic data for the realization of personalized medicine. Pharmacogenomics. 2012 Jan;13(2):201-12. [http://www.ncbi.nlm.nih.gov/pubmed/22256869 PubMed]<br />
* An informatics infrastructure for translating pharmacogenomic knowledge into clinical practice. Matthias Samwald, Adrien Coulet, Robert R. Freimuth, Iker Huerga, Joanne S. Luciano, Elgar Pichler, Robert L. Powers, Eric Prud’hommeaux, Frederick Whipple, M. Scott Marshall, Michel Dumontier. AMIA 2012.<br />
* M Samwald, H Stenzhorn, M Dumontier, MS Marshall, J Luciano, KP Adlassnig. Towards an interoperable information infrastructure providing decision support for genomic medicine. Stud Health Technol Inform. 2011;169:165-9. [http://www.ncbi.nlm.nih.gov/pubmed/21893735 PubMed]<br />
<br />
==Related==<br />
* [[HCLSIG/TranslationalMedicine/pharmacogenomics|pharmacogenomics resources]]<br />
* [http://informatics.mayo.edu/phont PHONT@Mayo]<br />
* [http://pgrn.org/display/pgrnwebsite/PGRN+Home PGRN - Pharmacogenomics Research Network]</div>Rboycehttps://www.w3.org/wiki/index.php?title=HCLSIG/Pharmacogenomics&diff=68249HCLSIG/Pharmacogenomics2013-09-10T12:54:47Z<p>Rboyce: /* Until March 1, 2014 */</p>
<hr />
<div>=Clinical Pharmacogenomics=<br />
<br />
Understanding how the genetic makeup of individual patients influences the response to pharmaceuticals is essential to the realization of more effective, personalized pharmacotherapy. However, the integration of drug, genotype and phenotype knowledge in medical information systems and its subsequent use in clinical decision making remains a key challenge. This task force is dedicated to developing a biomedical informatics infrastructure that leverages Semantic Web technologies to capture pharmacogenomic findings in such a way that they can be used to inform medical practitioners regarding approved drugs with pharmacogenomic labels. We suggest that such computational technologies are an essential part of personalized medicine, providing key information for translational medicine and clinical care.<br />
<br />
[[File:Patients are prescribed the same thing but are different.jpg|600px]]<br />
<br />
This Task Force is led by [[MichelDumontier|Michel Dumontier]] and [[MatthiasSamwald|Matthias Samwald]].<br />
<br />
== Teleconference ==<br />
This task force meets on the 1st and 3rd Wednesday at 10:15am EST. Be an active participant and join our next call:<br />
* [[/Meetings/2013-06-05_Conference_Call|Next Meeting]]<br />
* [http://www.w3.org/wiki/Category:HCLS_PGX Past Meetings]<br />
<br />
== Goals ==<br />
* Capture use cases for pharmacogenomic clinical research and medicine<br />
* Formalize the representation of pharmacogenomics-related information using Semantic Web technologies and ontologies<br />
* Demonstrate the integration of (genomic) patient data with biomedical resources (SNPs, drugs, genes, trials, treatments, adverse events)<br />
* Demonstrate a working interface to explore and query pharmacogenomic knowledge<br />
* Demonstrate how such representations can be used for clinical decision support<br />
<br />
== Participants ==<br />
* [[MichelDumontier|Michel Dumontier (co-chair)]]<br />
* [[MatthiasSamwald|Matthias Samwald (co-chair)]]<br />
* Eric Prud'hommeaux (W3C tech)<br />
* Robert Freimuth<br />
* Simon Lin<br />
* Richard D. Boyce<br />
* Bob Powers<br />
* Joanne S. Luciano<br />
* M. Scott Marshall<br />
* Elgar Pichler<br />
* Adrien Coulet<br />
* Ratnesh Sahay<br />
<br />
== Schedule ==<br />
<br />
=== Until September 1, 2013 ===<br />
* Matthias Samwald<br />
** Paper and presentation at OWL Reasoner workshop - ''DONE''<br />
** Have post-doc in Vienna working on the project full-time - ''DONE''<br />
** Finalize re-implementation of decision support service (behind http://safety-code.org/) in TrOWL. Need to establish collaboration with Jeff Pan and his group. - ''DONE''<br />
** Submit paper about Genomic CDS to the journal Bioinformatics - ''ALMOST DONE (submission expected mid-September).''<br />
** Tidy up / reorganize the W3C wiki - ''DONE''<br />
** Tidy up / reorganize http://www.genomic-cds.org/ - ''OVERDUE''<br />
* Richard Boyce<br />
** Semantically annotated product label information for [1] published as valid Open Data Annotation and with best practice provenance (including IAA).<br />
* Michel Dumontier<br />
** Describe work done this past term with curating class-level drug-drug interactions<br />
*** [https://docs.google.com/file/d/0B1-qT2rHHTkFX2VMRlJfNldHeG8/edit?usp=sharing poster] of curated interactions from drugbank; Honors Thesis Student Holly Surins<br />
** Curate and validate annotations for drug effects, drug indications and drug-drug indications on drug product labels (with Rich)<br />
*** installing DOMEO - in progress<br />
** Identify and develop priority datasets to be included in Bio2RDF that would benefit this task force <br />
*** rxnorm<br />
*** stitch<br />
* Bob Freimuth<br />
** Review Cerner's [http://www.cerner.com/about_cerner/clinical_bioinformatics_ontology/ CBO] (strengths/gaps for CDS) - ''ONGOING (had a look at it together with Matthias during Medinfo, Matthias to write down some impressions - a shared Google doc should be started)''<br />
** Discuss modeling approaches for genomic CDS (what's the trigger? integration with/maintenance of knowledge base; review existing implementations (TPP); feedback to CPIC?)<br />
** Discuss generalization of med safety code approach to NGS of PGx genes - ''DONE''<br />
** Apply generalized MSC to a real data set (demonstration)<br />
** Consider opportunities to partner with eMERGE and/or PGRN (user base for generalized MSC) - ''ONGOING''<br />
* Simon Lin<br />
** Shape a scientific manuscript out of the use-case document draft at docs.google.com/document/d/12leHdI-GT2dzRgvVIx13wsXAaCLj6eNGvQ-L2B749f4/edit<br />
<br />
=== Until December 1, 2013 ===<br />
<br />
* Matthias Samwald<br />
** Finalize overhaul and update of Genomic CDS ontology - ''ONGOING''<br />
** Need to establish more official collaboration with PharmGKB (Michel moving to Stanford might facilitate that further)<br />
** More official collaboration with the group at Mayo Clinic (Bob Freimuth?)<br />
** Finalize grant application for additional medical postdoc position -- for someone working on the biomedical details / curation / working and evaluating with doctors - ''DONE''<br />
* Richard Boyce<br />
** Qualitative and possibly quantitative (e.g., task-oriented usability comparison) data on the value of integrating the semantic annotations into a prototype clinical pharmacogenomic information system. <br />
** Acquisition of pilot funding (~25K) for development <br />
** A conference paper describing the above research activities<br />
* Michel Dumontier<br />
** Be settled at Stanford University<br />
** Link drug-drug interactions with pharmacogenomic interactions<br />
** Identify and develop priority datasets to be included in Bio2RDF that would benefit this task force<br />
** Build relationships with associated partners on RDF data publication, use and analysis<br />
* Bob Freimuth<br />
** Apply ontological genomic CDS approach to key CPIC guidelines (demonstration, evaluation)<br />
** Demonstrate gene-gene-drug CDS, ensure approach will scale<br />
<br />
=== Until March 1, 2014 ===<br />
<br />
* Matthias Samwald<br />
** Genomic CDS ontology covers everything it needs to cover; the connection to Rich's annotation work is continuously kept up to date<br />
** Begin partnering with external organizations (clinics, doctors, pharmacies, pharmaceutical companies) to start evaluating and disseminating the ontology and decision support solutions (as a pure research project only, since it is still undecided how best to get the software certified as a medical device). Marshfield Clinic might be a good partnering organisation for pilot studies?<br />
* Richard Boyce<br />
** (IN PROGRESS 9/10/13) Demonstration of the integration of triggers and recommendations present in the Genomic CDS ontology within the UPMC system using Quest lab results and focused meds (e.g., warfarin, clopidogrel, and some psych drugs). Further qualitative inquiry.<br />
** (IN PROGRESS 9/10/13) Submission of a journal article describing the results from all above research activities.<br />
** (IN PROGRESS 9/10/13) Submission of a journal paper presenting an analysis of pharmgx statements in product labels, including the range of content, frequency of updates, contrasts with other sources, and recommendations for clinicians, drug information compendia, and the FDA<br />
* Michel Dumontier<br />
** Identify putative animal models for pharmacogenomic outcomes<br />
* Bob Freimuth<br />
** Consider application of gene-drug class and gene-drug-drug rules (leverages drug-drug interaction data)<br />
<br />
References: <br />
<br />
[1] atomoxetine, atorvastatin, boceprevir, capecitabine, carbamazepine, carvedilol, celecoxib, cisplatin, citalopram, clobazam, clomipramine, clopidogrel, clozapine, codeine, dapsone, desipramine, dexlansoprazole, diazepam, doxepin, esomeprazole, fluorouracil, fluoxetine, flurbiprofen, fluvoxamine, iloperidone, imipramine, irinotecan, ivacaftor, mercaptopurine, metoprolol, nefazodone, nortriptyline, omeprazole, pantoprazole, paroxetine, peginterferon, alfa-2b-il28b, perphenazine, phenytoin, pimozide, pravastatin, propafenone, propranolol, protriptyline, quinidine, rabeprazole, rasburicase, rifampin, risperidone, telaprevir, terbinafine, tetrabenazine, thioguanine, thioridazine, ticagrelor, tramadol, venlafaxine, and voriconazole<br />
<br />
==Deliverables==<br />
===Resources===<br />
* [http://www.genomic-cds.org/ Genomic CDS ontology project]<br />
* [http://safety-code.org/ Medicine Safety Code project]<br />
* [[/Use Cases|Use Cases]]<br />
* [http://goo.gl/afRh2 Conceptual Model from Michel]<br />
* [http://goo.gl/IUGns Formalization ideas from Michel]<br />
* [[/Data Sources|Data Sources]]<br />
* [[/Queries|Competency Questions and Queries]]<br />
* Decision Support Rules<br />
<br />
===Selected publications and presentations===<br />
* M Samwald. „Semantically Enabling Genetic Medicine to Facilitate Patients and Guidelines Matching and Enhanced Clinical Decision Support“ Proceedings of the Conference on Semantics in Healthcare and Life Sciences 2013 (CSHALS 2013), February 28, 2013, Cambridge/Boston, Massachusetts, USA - [http://de.slideshare.net/matthiassamwald/samwald-cshals2013-20741065 Slides]<br />
* M Samwald, KP Adlassnig. „Pharmacogenomics in the pocket of every patient? A prototype based on Quick Response (QR) codes“ J Am Med Inform Assoc, Published Online First: 23 Jan 2013, http://dx.doi.org/10.1136/amiajnl-2012-001275 - [http://samwald.info/res/Pharmacogenomics%20in%20the%20pocket%20of%20every%20patient%20-%20A%20prototype%20based%20on%20Quick%20Response%20%28QR%29%20codes%20-%20PREPRINT.pdf Link to openly accessible preprint version]<br />
* An Ontology-based Formalism, Knowledge Base and Reasoning System for Clinical Genetics. Samwald, M., Freimuth, R., Powers, R., Luciano, J., Prud’hommeaux, E., Boyce, R., Marshall, M., Dumontier, M. Poster presentation at the 2013 AMIA Summit on Translational Bioinformatics. San Francisco, March, 2013.<br />
* Toward semantic modeling of pharmacogenomic knowledge for clinical and translational decision support. PE Boyce, RD., Freimuth, RR., Romagnoli, KM., Pummer. The 2013 AMIA Summit on Translational Bioinformatics, San Francisco, CA. PubMed Central- Pending. ([http://www.slideshare.net/boycer/pharmgx-annotationamiatbi2013 Summary on SlideShare])<br />
* M Samwald, A Coulet, I Huerga, RL Powers, JS Luciano, RR Freimuth, F Whipple, E Pichler, E Prud'hommeaux, M Dumontier, MS Marshall. Semantically enabling pharmacogenomic data for the realization of personalized medicine. Pharmacogenomics. 2012 Jan;13(2):201-12. [http://www.ncbi.nlm.nih.gov/pubmed/22256869 PubMed]<br />
* An informatics infrastructure for translating pharmacogenomic knowledge into clinical practice. Matthias Samwald, Adrien Coulet, Robert R. Freimuth, Iker Huerga, Joanne S. Luciano, Elgar Pichler, Robert L. Powers, Eric Prud’hommeaux, Frederick Whipple, M. Scott Marshall, Michel Dumontier. AMIA 2012.<br />
* M Samwald, H Stenzhorn, M Dumontier, MS Marshall, J Luciano, KP Adlassnig. Towards an interoperable information infrastructure providing decision support for genomic medicine. Stud Health Technol Inform. 2011;169:165-9. [http://www.ncbi.nlm.nih.gov/pubmed/21893735 PubMed]<br />
<br />
==Related==<br />
* [[HCLSIG/TranslationalMedicine/pharmacogenomics|pharmacogenomics resources]]<br />
* [http://informatics.mayo.edu/phont PHONT@Mayo]<br />
* [http://pgrn.org/display/pgrnwebsite/PGRN+Home PGRN - Pharmacogenomics Research Network]</div>Rboycehttps://www.w3.org/wiki/index.php?title=HCLSIG/Tools&diff=67741HCLSIG/Tools2013-08-08T16:23:05Z<p>Rboyce: /* Tools that support converting non-relational data to RDF */</p>
<hr />
<div>Know of a tool? Tweet it using #hclstool! We will add the tool when we periodically aggregate tweets with that hashtag.<br />
<br />
= General support for Semantic Web applications =<br />
<br />
== [http://lod2.eu/WikiArticle/TechnologyStack.html LOD2 Stack] ==<br />
The LOD2 Stack is a collection of tools contributed by [http://lod2.eu/WikiArticle/Project.html LOD2] members.<br />
<br />
== [http://clarkparsia.com/pellet/icv Pellet Integrity Constraint Validator] ==<br />
* Pellet ICV is a modified version of Pellet that works under the Closed World Assumption. It lets you define an ontology that acts as a schema and then use Pellet to validate RDF data against it. An example using SKOS: http://weblog.clarkparsia.com/2010/04/14/pellet-icv-04-release-using-owl-integrity-constraints-to-validate-skos <br />
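<br />
The difference between open-world and closed-world reading can be illustrated with a minimal Python sketch. It is not Pellet ICV and does not use its API; the function name, the prefixed names used as plain strings, and the sample triples are all assumptions made for illustration. Under the Closed World Assumption, a typed resource that lacks a required property is reported as a violation instead of being treated as merely unknown.<br />
<br />
```python
def missing_required_property(triples, cls, required_property):
    """Return subjects typed as `cls` that have no value for `required_property`.

    `triples` is an iterable of (subject, predicate, object) string tuples.
    This is a closed-world check: absence of a statement counts as a violation.
    """
    triples = list(triples)
    typed = {s for s, p, o in triples if p == "rdf:type" and o == cls}
    having = {s for s, p, o in triples if p == required_property}
    return sorted(typed - having)


# Hypothetical sample data: ex:c2 is a skos:Concept without a skos:prefLabel.
sample = [
    ("ex:c1", "rdf:type", "skos:Concept"),
    ("ex:c1", "skos:prefLabel", "Warfarin"),
    ("ex:c2", "rdf:type", "skos:Concept"),
]
violations = missing_required_property(sample, "skos:Concept", "skos:prefLabel")
```
<br />
An OWL reasoner working under the Open World Assumption would not flag ex:c2, because a label might simply not have been stated yet.<br />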
<br />
== [http://jrdf.sourceforge.net/ JRDF - An RDF Library in Java] ==<br />
* Note from author: "From May the 8th 2011, I've set the status of the project to inactive. Mainly due to lack of interest and contribution - no further development is taking place." <br />
* JRDF is an attempt to create a standard set of APIs and base implementations for RDF (Resource Description Framework) using the latest version of the Java language. <br />
<br />
= Tools to make relational data accessible via SPARQL =<br />
<br />
== [http://www.capsenta.com/ Ultrawrap] ==<br />
<br />
* Ultrawrap makes legacy relational databases upward compatible with the Semantic Web<br />
* Allows SPARQL to be executed directly on the relational database, as well as ETL to RDF<br />
* Ultrawrap executes SPARQL as fast as SQL because it delegates all optimizations to the underlying RDBMS<br />
* Full support for W3C's R2RML and Direct Mapping<br />
* Augmented Direct Mapping which maps the SQL Schema to OWL<br />
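<br />
To give a feel for what a direct mapping produces, here is a minimal Python sketch loosely following the pattern of W3C's Direct Mapping (row IRI built from table and primary key, one predicate per column). It is not Ultrawrap code. The function name, the base IRI, and the Drug table are hypothetical, and the sketch is deliberately simplified: single-column primary key, every value emitted as a plain string literal, NULLs skipped, no foreign keys.<br />
<br />
```python
from urllib.parse import quote

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def direct_map_row(base, table, pk_column, row):
    """Map one relational row (a dict) to a list of N-Triples lines."""
    table_iri = f"{base}{quote(table)}"
    # Row IRI pattern: <base + table/pk_column=pk_value>
    subject = f"<{table_iri}/{quote(pk_column)}={quote(str(row[pk_column]))}>"
    triples = [f"{subject} <{RDF_TYPE}> <{table_iri}> ."]
    for column, value in row.items():
        if value is None:  # NULL columns produce no triple
            continue
        literal = (
            str(value)
            .replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
        )
        triples.append(f'{subject} <{table_iri}#{quote(column)}> "{literal}" .')
    return triples


# Hypothetical table and base IRI, for illustration only.
lines = direct_map_row(
    "http://example.org/db/", "Drug", "id", {"id": 7, "name": "warfarin", "note": None}
)
```
<br />
A full implementation would also emit typed literals for numeric and date columns and turn foreign keys into links between row IRIs.<br />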
<br />
== [http://sourceforge.net/apps/mediawiki/swobjects/index.php?title=Main_Page SWObjects] ([[/SWObjects|usage]]) ==<br />
* SWObjects uses a query rewriting approach to make SQL data accessible via a SPARQL endpoint.<br />
* SWObjects creates maps from SPARQL Construct statements that act as translation rules from SPARQL to SQL, as well as SPARQL to SPARQL.<br />
* Federation support: Maps will automatically dispatch queries to the appropriate graph/endpoint in federated applications.<br />
<br />
== [http://www4.wiwiss.fu-berlin.de/bizer/d2r-server/ D2R server] ==<br />
<br />
D2R Server is a tool for publishing relational databases on the Semantic Web.<br />
It enables RDF and HTML browsers to navigate the content of the database,<br />
and allows applications to query the database using the SPARQL query language.<br />
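<br />
As a rough illustration of how an application might address such a SPARQL endpoint, the Python sketch below builds a SPARQL Protocol GET URL by placing the query text in the query parameter. The endpoint address and the helper name are assumptions chosen for the example; the sketch only constructs the URL and does not contact any server.<br />
<br />
```python
from urllib.parse import urlencode


def sparql_get_url(endpoint, query):
    """Build a GET URL that carries a SPARQL query in the 'query' parameter."""
    # Append with '&' if the endpoint URL already has a query string.
    separator = "&" if "?" in endpoint else "?"
    return endpoint + separator + urlencode({"query": query})


# Hypothetical local endpoint; replace with the address of a real server.
EXAMPLE_QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"
url = sparql_get_url("http://localhost:2020/sparql", EXAMPLE_QUERY)
```
<br />
The resulting URL can be fetched with any HTTP client. The result format (for example SPARQL XML or JSON results) is usually negotiated through the Accept header.<br />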
<br />
== [http://mayor2.dia.fi.upm.es/oeg-upm/index.php/en/downloads/9-r2o-odemapster ODEMapster] ==<br />
<br />
* R2O & ODEMapster is an integrated framework for the formal specification, evaluation, verification and exploitation of the semantic mappings between ontologies and relational databases. <br />
* ODEMapster is a NeOn plugin that offers a GUI for building mappings between an RDBMS and an ontology. It can also execute such mappings and populate the ontology to create a Linked Data KB. <br />
<br />
= Tools that support converting non-relational data to RDF =<br />
<br />
== [http://www.io-informatics.com/products/index.html Sentient Knowledge Explorer] ==<br />
* interactive graphics for selection of desired output for automated SPARQL query builder<br />
* automatic import wizards for mapping from common formats to RDF<br />
<br />
== [http://topquadrant.com/products/TB_Composer.html TopBraid Composer] ==<br />
* has import with automated mappings to RDF from XML with provenance in the Maestro Edition<br />
* is a versatile tool with many features for building and inspecting RDF and OWL, as well as publishing SPARQL access to data<br />
<br />
== [https://github.com/srdc/ontmalizer Ontmalizer] ==<br />
* Performs comprehensive transformations of XML Schemas (XSD) and XML data to RDF/OWL automatically. Through this tool, it is possible to create RDF/OWL representation of XML Schemas, and XML instances that comply with such XML Schemas.<br />
* Tested on HL7 Clinical Document Architecture (CDA) R2.<br />
<br />
== [http://lab.linkeddata.deri.ie/2010/grefine-rdf-extension/ RDF plugin] to [http://code.google.com/p/google-refine/ Google Refine] ==<br />
* Good for getting a sense of what’s in the data (or a sample of the data where scale is large).<br />
* Enables reconciliation of data with Freebase, Sindice, and other SPARQL endpoints such as NCBO<br />
* Enables description of data in terms of predicates retrieved from prefix.cc<br />
* Possibility to specify which ontologies should be used to describe the data<br />
<br />
== [https://github.com/timrdf/csv2rdf4lod-automation/wiki/ CSV2RDF4LOD] ==<br />
* https://github.com/timrdf/csv2rdf4lod-automation/wiki/Examples <br />
* It is implemented to handle arbitrary row counts and was found to work with data that has 3,949,400 rows<br />
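<br />
The basic idea of tabular-to-RDF conversion can be sketched in a few lines of Python. This is not how CSV2RDF4LOD works internally; it is a naive illustration in which every row becomes a subject and every column a predicate. The function name, the IRI patterns, and the sample data are assumptions. The generator yields one triple at a time, so memory use does not grow with the number of rows.<br />
<br />
```python
import csv
import io
from urllib.parse import quote


def csv_to_ntriples(csv_text, base):
    """Yield N-Triples lines: one subject per row, one predicate per column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        subject = f"<{base}row/{row_number}>"
        for column, value in row.items():
            # Skip surplus fields (column is None) and empty or missing cells.
            if column is None or value is None or value == "":
                continue
            predicate = f"<{base}column/{quote(column.strip())}>"
            literal = (
                value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
            )
            yield f'{subject} {predicate} "{literal}" .'


# Hypothetical sample data, for illustration only.
SAMPLE = "drug,gene\nwarfarin,CYP2C9\nclopidogrel,\n"
triples = list(csv_to_ntriples(SAMPLE, "http://example.org/"))
```
<br />
Real converters add what this sketch leaves out: datatype detection, links to existing vocabularies, and provenance for each conversion run.<br />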
<br />
== [http://www.sysmo-db.org/rightfield Rightfield] == <br />
* from the [http://genetics-ecology.univie.ac.at/sysmo.html EU SysMO project]<br />
* Create spreadsheet templates for input using ontologies as controlled vocabularies. Spreadsheet entries then contain unambiguous identifiers and are easier to convert to RDF.<br />
<br />
== [http://dblab.cs.toronto.edu/project/xcurator/ xCurator ] ==<br />
<br />
* The xCurator project offers an end-to-end framework to transform a semi-structured (XML) source into high-quality Linked Data.<br />
* Used by the new LinkedCT (http://linkedct.org/). Thanks to xCurator, the data now comprises over 25 million triples (previously only 7 million), is of much higher quality, and is kept up to date at all times<br />
* Paper describing the framework and initial results: Linking Semistructured Data on the Web (WebDB2011 at SIGMOD)<br />
* A small demo is available online, but the code is still under development and has not been released yet: http://dblab.cs.toronto.edu/project/xcurator/<br />
<br />
= Management/Visualization =<br />
=== [http://distilbio.com/ DistilBio] from Metaome ===<br />
=== [http://triplemap.com/ TripleMap] from Entagen ===<br />
=== [http://www.io-informatics.com/products/sentient-KE.html Sentient Knowledge Explorer] from IO Informatics ===<br />
=== [http://www.fluidops.com/information-workbench/ Information Workbench] from Fluid Operations ===<br />
=== [http://www.revelytix.com/content/spyder Spyder] from Revelytix ===<br />
=== [http://bnowack.de/work#3 Paggr Prospect] from Benjamin Nowack ===<br />
=== [http://code.google.com/p/callimachus/ Callimachus] from 3 round stones ===</div>Rboycehttps://www.w3.org/wiki/index.php?title=HCLSIG/Pharmacogenomics/Meetings/2013-05-01_Conference_Call&diff=65831HCLSIG/Pharmacogenomics/Meetings/2013-05-01 Conference Call2013-05-01T14:28:49Z<p>Rboyce: /* Richard Boyce */</p>
<hr />
<div>[[category:HCLS_PGX]]<br />
[[category:HCLS_meeting]]<br />
Meeting: Clinical Pharmacogenomics<br />
Date: May 01, 2013<br />
Time: 10:15 Eastern Time (16:15 Central European Time)<br />
Frequency: 1st and 3rd Wednesday of the month<br />
Conveners: Michel Dumontier, Matthias Samwald<br />
Dial-In #: +1.617.761.6200 (Cambridge, MA)<br />
VoIP address: sip:zakim@voip.w3.org<br />
Participant Access Code: 4257 ("HCLS")<br />
IRC Channel: irc.w3.org port 6665 channel #HCLS<br />
<br />
'''Suggested Agenda'''<br />
<br />
* Planning and priorities meeting. Prepare for this meeting by jotting down specific outcomes you want to see in the next 3, 6, 12 months. We will discuss these and determine what specifically needs to be done to achieve them. Then we will prioritize tasks and identify who wants to work on what.<br />
* Submission about Genomic CDS (and other ontologies you have in the field of pharmacogenomics?) to the OWL Reasoner Evaluation Workshop 2013 http://ore2013.cs.manchester.ac.uk/ (Deadline 3rd of May!!)<br />
<br />
'''Meeting Minutes'''<br />
* TBD<br />
<br />
<br />
== Notepad for documenting 3-6-12 month plans ==<br />
<br />
=== Matthias Samwald ===<br />
<br />
3 months: <br />
* Paper and presentation at OWL Reasoner workshop<br />
* Finalize re-implementation of decision support service (behind safety-code.org) in TrOWL. Need to establish collaboration with Jeff Pan and his group!!<br />
* Submit paper about Genomic CDS to the journal Bioinformatics<br />
* Tidy up / reorganize the W3C wiki and http://www.genomic-cds.org/<br />
<br />
6 months: <br />
* Have post-doc in Vienna working on this full-time! <br />
* Finalize overhaul and update of Genomic CDS ontology<br />
* Need to establish more official collaboration with PharmGKB (Michel moving to Stanford might facilitate that further)<br />
* More official collaboration with the group at Mayo Clinic (Bob Freimuth?)<br />
* Finalize grant application for additional 3-year medical PhD position -- for someone working on the biomedical details / curation / working and evaluating with doctors<br />
<br />
12 months: <br />
* Genomic CDS ontology covers everything it needs to cover, connection to Rich's annotation work is continuously kept up-to-date<br />
* Begin partnering with external organizations (clinics, doctors, pharmacies, pharmaceutical companies) to start evaluating and disseminating the ontology and decision support solutions (as a pure research project only, since it is still undecided how best to get the software certified as a medical device). Marshfield Clinic might be a good partnering organisation for pilot studies?<br />
<br />
=== Richard Boyce ===<br />
<br />
3 months:<br />
* semantically annotated product label information for [1] published as valid Open Data Annotation and with best practice provenance (including IAA). <br />
<br />
6 months:<br />
* Qualitative and possibly quantitative (e.g., task-oriented usability comparison) data on the value of integrating the semantic annotations into a prototype clinical pharmacogenomic information system. <br />
* Acquisition of pilot funding (~25K) for development <br />
* a conference paper describing the above research activities<br />
<br />
9 months:<br />
* Demonstration integration of triggers and recommendations present in the Genomic CDS ontology within the UPMC system using Quest lab results and focused meds (e.g., warfarin, clopidogrel, and some psych drugs). Further qualitative inquiry. <br />
* Submission of a journal article describing the results from all above research activities. <br />
* Submission of a journal paper presenting an analysis of pharmgx statements in product labels, including the range of content, frequency of updates, contrasts with other sources, and recommendations for clinicians, drug information compendia, and the FDA<br />
<br />
References: <br />
<br />
1. atomoxetine, atorvastatin, boceprevir, capecitabine, carbamazepine, carvedilol, celecoxib, cisplatin, citalopram, clobazam, clomipramine, clopidogrel, clozapine, codeine, dapsone, desipramine, dexlansoprazole, diazepam, doxepin, esomeprazole, fluorouracil, fluoxetine, flurbiprofen, fluvoxamine, iloperidone, imipramine, irinotecan, ivacaftor, mercaptopurine, metoprolol, nefazodone, nortriptyline, omeprazole, pantoprazole, paroxetine, peginterferon, alfa-2b-il28b, perphenazine, phenytoin, pimozide, pravastatin, propafenone, propranolol, protriptyline, quinidine, rabeprazole, rasburicase, rifampin, risperidone, telaprevir, terbinafine, tetrabenazine, thioguanine, thioridazine, ticagrelor, tramadol, venlafaxine, and voriconazole</div>Rboycehttps://www.w3.org/wiki/index.php?title=HCLSIG/Pharmacogenomics&diff=64979HCLSIG/Pharmacogenomics2013-04-03T13:50:16Z<p>Rboyce: /* Members */</p>
<hr />
<div>==Clinical Pharmacogenomics==<br />
<br />
===Overview===<br />
Understanding how the genetic makeup of individual patients influences the response to pharmaceuticals is essential to the realization of more effective, personalized pharmacotherapy. However, the integration of drug, genotype and phenotype knowledge in medical information systems and its subsequent use in clinical decision making remains a key challenge. This task force is dedicated to developing a biomedical informatics infrastructure that leverages Semantic Web technologies to capture pharmacogenomic findings in such a way that they can be used to inform medical practitioners regarding approved drugs with pharmacogenomic labels. We suggest that such computational technologies are an essential part of personalized medicine, providing key information for translational medicine and clinical care.<br />
<br />
This Task Force is led by [[MichelDumontier|Michel Dumontier]] and [[MatthiasSamwald|Matthias Samwald]].<br />
<br />
=== Teleconference ===<br />
This task force meets on the 1st and 3rd Wednesday at 10:15am EST. Be an active participant and join our next call:<br />
* [[/Meetings/2013-04-03_Conference_Call|Next Meeting]]<br />
* [http://www.w3.org/wiki/Category:HCLS_PGX Past Meetings]<br />
<br />
===Tasks===<br />
* Capture use cases for pharmacogenomic clinical research and medicine<br />
* Formalize the representation of pharmacogenomics-related information using Semantic Web technologies and ontologies<br />
* Demonstrate the integration of (genomic) patient data with biomedical resources (SNPs, drugs, genes, trials, treatments, adverse events)<br />
* Demonstrate a working interface to explore and query pharmacogenomic knowledge<br />
* Demonstrate how such representations can be used for clinical decision support<br />
<br />
===Deliverables===<br />
'''Resources'''<br />
* [[/Use Cases|Use Cases]]<br />
* [http://goo.gl/afRh2 Conceptual Model]<br />
* [http://goo.gl/IUGns Formalization]<br />
* [http://www.w3.org/2001/sw/hcls/ns/transmed/tmo Translational Medicine Ontology]. View with Protege 4, OwlSight, TopBraid Composer<br />
* [[/Data Sources|Data Sources]]<br />
* [[/Queries|Competency Questions and Queries]]<br />
* Decision Support Rules<br />
'''Tools'''<br />
* Knowledge Base<br />
* Decision Support<br />
'''Selected Publications'''<br />
* M Samwald, KP Adlassnig. „Pharmacogenomics in the pocket of every patient? A prototype based on Quick Response (QR) codes“ JAMIA (to appear)<br />
* M Samwald, A Coulet, I Huerga, RL Powers, JS Luciano, RR Freimuth, F Whipple, E Pichler, E Prud'hommeaux, M Dumontier, MS Marshall. Semantically enabling pharmacogenomic data for the realization of personalized medicine. Pharmacogenomics. 2012 Jan;13(2):201-12. [http://www.ncbi.nlm.nih.gov/pubmed/22256869 pubmed]<br />
* M Samwald, H Stenzhorn, M Dumontier, MS Marshall, J Luciano, KP Adlassnig. Towards an interoperable information infrastructure providing decision support for genomic medicine. Stud Health Technol Inform. 2011;169:165-9. [http://www.ncbi.nlm.nih.gov/pubmed/21893735 pubmed]<br />
* Toward semantic modeling of pharmacogenomic knowledge for clinical and translational decision support. PE Boyce, RD., Freimuth, RR., Romagnoli, KM., Pummer. The 2013 AMIA Summit on Translational Bioinformatics, San Francisco, CA. PubMed Central- Pending. ([http://www.slideshare.net/boycer/pharmgx-annotationamiatbi2013 Summary on SlideShare])<br />
'''Presentations'''<br />
* An informatics infrastructure for translating pharmacogenomic knowledge into clinical practice. Matthias Samwald, Adrien Coulet, Robert R. Freimuth, Iker Huerga, Joanne S. Luciano, Elgar Pichler, Robert L. Powers, Eric Prud’hommeaux, Frederick Whipple, M. Scott Marshall, Michel Dumontier. AMIA 2012.<br />
* An Ontology-based Formalism, Knowledge Base and Reasoning System for Clinical Genetics. Samwald, M., Freimuth, R., Powers, R., Luciano, J., Prud’hommeaux, E., Boyce, R., Marshall, M., Dumontier, M. Poster presentation at the 2013 AMIA Summit on Translational Bioinformatics. San Francisco, March, 2013.<br />
<br />
===Members===<br />
* [[MichelDumontier|Michel Dumontier (co-chair)]]<br />
* [[MatthiasSamwald|Matthias Samwald (co-chair)]]<br />
* Eric Prud'hommeaux (W3C tech)<br />
<br />
* Adrien Coulet<br />
* Robert Freimuth<br />
* Iker Huerga<br />
* Joanne S. Luciano<br />
* M. Scott Marshall<br />
* Elgar Pichler<br />
* Bob Powers<br />
* Frederick Whipple<br />
* Simon Lin<br />
* Richard D. Boyce<br />
<br />
===Related===<br />
* [[HCLSIG/TranslationalMedicine/pharmacogenomics|pharmacogenomics resources]]<br />
* [[HCLSIG/PharmaOntology|Translational Medicine Task Force]]<br />
* [http://informatics.mayo.edu/phont PHONT@Mayo]<br />
* [http://pgrn.org/display/pgrnwebsite/PGRN+Home PGRN - Pharmacogenomics Research Network]<br />
* [http://www.genomic-cds.org/ Genomic CDS ontology project]<br />
* [http://safety-code.org/ Medicine Safety Code project]</div>Rboycehttps://www.w3.org/wiki/index.php?title=HCLSIG/Pharmacogenomics&diff=64978HCLSIG/Pharmacogenomics2013-04-03T13:49:44Z<p>Rboyce: /* Deliverables */</p>
<hr />
<div>==Clinical Pharmacogenomics==<br />
<br />
===Overview===<br />
Understanding how the genetic makeup of individual patients influences the response to pharmaceuticals is essential to the realization of more effective, personalized pharmacotherapy. However, the integration of drug, genotype and phenotype knowledge in medical information systems and its subsequent use in clinical decision making remains a key challenge. This task force is dedicated to developing a biomedical informatics infrastructure that leverages Semantic Web technologies to capture pharmacogenomic findings in such a way that they can be used to inform medical practitioners regarding approved drugs with pharmacogenomic labels. We suggest that such computational technologies are an essential part of personalized medicine, providing key information for translational medicine and clinical care.<br />
<br />
This Task Force is led by [[MichelDumontier|Michel Dumontier]] and [[MatthiasSamwald|Matthias Samwald]].<br />
<br />
=== Teleconference ===<br />
This task force meets on the 1st and 3rd Wednesday at 10:15am EST. Be an active participant and join our next call:<br />
* [[/Meetings/2013-04-03_Conference_Call|Next Meeting]]<br />
* [http://www.w3.org/wiki/Category:HCLS_PGX Past Meetings]<br />
<br />
===Tasks===<br />
* Capture use cases for pharmacogenomic clinical research and medicine<br />
* Formalize the representation of pharmacogenomics-related information using Semantic Web technologies and ontologies<br />
* Demonstrate the integration of (genomic) patient data with biomedical resources (SNPs, drugs, genes, trials, treatments, adverse events)<br />
* Demonstrate a working interface to explore and query pharmacogenomic knowledge<br />
* Demonstrate how such representations can be used for clinical decision support<br />
<br />
===Deliverables===<br />
'''Resources'''<br />
* [[/Use Cases|Use Cases]]<br />
* [http://goo.gl/afRh2 Conceptual Model]<br />
* [http://goo.gl/IUGns Formalization]<br />
* [http://www.w3.org/2001/sw/hcls/ns/transmed/tmo Translational Medicine Ontology]. View with Protege 4, OwlSight, TopBraid Composer<br />
* [[/Data Sources|Data Sources]]<br />
* [[/Queries|Competency Questions and Queries]]<br />
* Decision Support Rules<br />
'''Tools'''<br />
* Knowledge Base<br />
* Decision Support<br />
'''Selected Publications'''<br />
* M Samwald, KP Adlassnig. „Pharmacogenomics in the pocket of every patient? A prototype based on Quick Response (QR) codes“ JAMIA (to appear)<br />
* M Samwald, A Coulet, I Huerga, RL Powers, JS Luciano, RR Freimuth, F Whipple, E Pichler, E Prud'hommeaux, M Dumontier, MS Marshall. Semantically enabling pharmacogenomic data for the realization of personalized medicine. Pharmacogenomics. 2012 Jan;13(2):201-12. [http://www.ncbi.nlm.nih.gov/pubmed/22256869 pubmed]<br />
* M Samwald, H Stenzhorn, M Dumontier, MS Marshall, J Luciano, KP Adlassnig. Towards an interoperable information infrastructure providing decision support for genomic medicine. Stud Health Technol Inform. 2011;169:165-9. [http://www.ncbi.nlm.nih.gov/pubmed/21893735 pubmed]<br />
* Toward semantic modeling of pharmacogenomic knowledge for clinical and translational decision support. PE Boyce, RD., Freimuth, RR., Romagnoli, KM., Pummer. The 2013 AMIA Summit on Translational Bioinformatics, San Francisco, CA. PubMed Central- Pending. ([http://www.slideshare.net/boycer/pharmgx-annotationamiatbi2013 Summary on SlideShare])<br />
'''Presentations'''<br />
* An informatics infrastructure for translating pharmacogenomic knowledge into clinical practice. Matthias Samwald, Adrien Coulet, Robert R. Freimuth, Iker Huerga, Joanne S. Luciano, Elgar Pichler, Robert L. Powers, Eric Prud’hommeaux, Frederick Whipple, M. Scott Marshall, Michel Dumontier. AMIA 2012.<br />
* An Ontology-based Formalism, Knowledge Base and Reasoning System for Clinical Genetics. Samwald, M., Freimuth, R., Powers, R., Luciano, J., Prud’hommeaux, E., Boyce, R., Marshall, M., Dumontier, M. Poster presentation at the 2013 AMIA Summit on Translational Bioinformatics. San Francisco, March, 2013.<br />
<br />
===Members===<br />
* [[MichelDumontier|Michel Dumontier (co-chair)]]<br />
* [[MatthiasSamwald|Matthias Samwald (co-chair)]]<br />
* Eric Prud'hommeaux (W3C tech)<br />
<br />
* Adrien Coulet<br />
* Robert Freimuth<br />
* Iker Huerga<br />
* Joanne S. Luciano<br />
* M. Scott Marshall<br />
* Elgar Pichler<br />
* Bob Powers<br />
* Frederick Whipple<br />
* Simon Lin<br />
<br />
===Related===<br />
* [[HCLSIG/TranslationalMedicine/pharmacogenomics|pharmacogenomics resources]]<br />
* [[HCLSIG/PharmaOntology|Translational Medicine Task Force]]<br />
* [http://informatics.mayo.edu/phont PHONT@Mayo]<br />
* [http://pgrn.org/display/pgrnwebsite/PGRN+Home PGRN - Pharmacogenomics Research Network]<br />
* [http://www.genomic-cds.org/ Genomic CDS ontology project]<br />
* [http://safety-code.org/ Medicine Safety Code project]</div>Rboycehttps://www.w3.org/wiki/index.php?title=HCLSIG/Pharmacogenomics/Meetings/2013-03-06_Conference_Call&diff=64268HCLSIG/Pharmacogenomics/Meetings/2013-03-06 Conference Call2013-02-20T23:05:55Z<p>Rboyce: </p>
<hr />
<div>[[category:HCLS_PGX]]<br />
[[category:HCLS_meeting]]<br />
Meeting: Clinical Pharmacogenomics<br />
Date: March 06, 2013<br />
 Time: 10:15 Eastern Time (16:15 Central European Time)<br />
Frequency: 1st and 3rd Wednesday of the month<br />
Conveners: Michel Dumontier, Matthias Samwald<br />
Dial-In #: +1.617.761.6200 (Cambridge, MA)<br />
VoIP address: sip:zakim@voip.w3.org<br />
Participant Access Code: 4257 ("HCLS")<br />
IRC Channel: irc.w3.org port 6665 channel #HCLS<br />
<br />
'''Suggested Agenda'''<br />
<br />
* Report about CSHALS2013<br />
* Update to RxNorm RDF<br />
* Progress on research to report for Fall AMIA Symposium (Bob and Rich)<br />
* Add suggested agenda items here</div>Rboycehttps://www.w3.org/wiki/index.php?title=HCLSIG/Pharmacogenomics/Meetings/2013-02-06_Conference_Call&diff=63862HCLSIG/Pharmacogenomics/Meetings/2013-02-06 Conference Call2013-02-05T20:22:01Z<p>Rboyce: </p>
<hr />
<div>[[category:HCLS_PGX]]<br />
[[category:HCLS_meeting]]<br />
Meeting: HCLS Pharmacogenomics<br />
Date: February 06, 2013<br />
 Time: 10:15 Eastern Time (16:15 Central European Time)<br />
Frequency: 1st and 3rd Wednesday of the month<br />
 Conveners: Michel Dumontier, Matthias Samwald<br />
Dial-In #: +1.617.761.6200 (Cambridge, MA)<br />
VoIP address: sip:zakim@voip.w3.org<br />
Participant Access Code: 4257 ("HCLS")<br />
IRC Channel: irc.w3.org port 6665 channel #HCLS<br />
<br />
'''Suggested Agenda'''<br />
* AMIA Poster<br />
* PK Ontology<br />
* Discussing a potential HCLS IG / task force meetup around CSHALS 2013 in Boston this year<br />
* (add other agenda items here)</div>Rboycehttps://www.w3.org/wiki/index.php?title=HCLSIG/LLD&diff=63497HCLSIG/LLD2013-01-18T19:12:38Z<p>Rboyce: /* Participants */</p>
<hr />
<div>__NOTOC__<br />
== Linked Life Data ==<br />
''Best practices in creating, publishing, linking, querying and visualizing linked life data''<br />
<br />
With the advent of high-throughput experimentation, there has been an explosion of biomedical data on the Internet. While most of the data is available in Web accessible formats (e.g. HTML), many biomedical researchers rely on the use of Web browsers (e.g., Firefox and Internet Explorer) and search engines like Google to browse and search data on the Internet. Such manual browsing and keyword searching approaches are inadequate for large-scale integration of data on the Web. The Semantic Web transforms the Web into a global database or knowledge base that can be accessed by computer programs/agents through a standard data/ontology format. The Resource Description Framework (RDF) and the Web Ontology Language (OWL) are the W3C standards for encoding data/knowledge. This task group explores how to use RDF and OWL (as well as their enabling technologies) to better enable the computer to represent, identify, publish, query and integrate data/knowledge in the health care and life science domain.<br />
<br />
=== Coordinator ===<br />
The task coordinator is [[MScottMarshall]] (mscottmarshall@gmail.com)<br />
<br />
=== Objectives ===<br />
<br />
* Create a network of linked life data. <br />
* Develop methods to keep resources up to date.<br />
* Develop human-friendly user interfaces<br />
* Produce how-to W3C notes<br />
<br />
<br />
===Products===<br />
Currently there are two HCLS KBs: [http://esw.w3.org/topic/HCLSIG_BioRDF_Subgroup/DERI_HCLS_KB HCLS KB hosted by DERI] and [http://www.corporate-semantic-web.de/hcls.html HCLS KB hosted by Free University Berlin].<br />
<br />
<br />
===Publications===<br />
<br />
Helena Deus, Jun Zhao, Satya Sahoo, Matthias Samwald, Eric Prud'hommeaux, Michael Miller, M.Scott Marshall and Kei-Hoi Cheung. Provenance of Microarray Experiments for a Better Understanding of Experiment Results ([http://wiki.knoesis.org/index.php/SWPM-2010 SWPM2010 Workshop wiki page], [http://people.csail.mit.edu/pcm/tempISWC/workshops/SWPM2010/InvitedPaper_6.pdf pdf])<br />
<br />
Samwald, M.; Jentzsch, A.; Bouton, C.; Kallesoe, C.; Willighagen, E.; Hajagos, J.; Marshall, M.; Prud'hommeaux, E.; Hassanzadeh, O.; Pichler, E.; Stephens, S. Journal of Cheminformatics 2011, 3, 19 ([http://www.jcheminf.com/content/3/1/19 HTML]).<br />
<br />
Cheung KH, Frost HR, Marshall MS, Prud'hommeaux E, Samwald M, Zhao J, Paschke A. (2009). A Journey to Semantic Web Query Federation in Life Sciences. BMC Bioinformatics, 10(Suppl 10):S10 ([http://www.biomedcentral.com/1471-2105/10/S10/S10 HTML]).<br />
<br />
Zhao J, Jentzsch A, Samwald M, Cheung KH. Linked Data for Connecting Traditional Chinese Medicine and Western Medicine. Data Integration in the Life Sciences Workshops ([http://www.cs.manchester.ac.uk/DILS09/ DILS 09]), University of Manchester, UK ([http://www.cs.manchester.ac.uk/DILS09/PosterDemoProceedings.pdf poster presentation]). This work is in collaboration with the LoDD task force.<br />
<br />
=== Meetings ===<br />
* [[/Meetings/2013-01-14_Conference_Call|Next Meeting Jan 14, 2013]]<br />
* [http://goo.gl/ekhTH Past Meetings]<br />
<br />
=== Current Tasks ===<br />
* Query Federation<br />
** [http://esw.w3.org/topic/HCLSIG_BioRDF_Subgroup/QueryFederation Use case 1 -- receptors] <br />
** [http://esw.w3.org/topic/HCLSIG_BioRDF_Subgroup/QueryFederation2 Use case 2 -- microarrays]<br />
** [http://esw.w3.org/topic/HCLSIG_BioRDF_Subgroup/MinimalInformationAboutAGraph Minimal Information About a Graph (MIAG)]<br />
* [http://esw.w3.org/topic/HCLSIG_BioRDF_Subgroup/aTags BioSIOC / aTags]<br />
* [http://esw.w3.org/topic/Data/TCMGeneDIT Linking TCM with LoDD Datasets]<br />
* [http://esw.w3.org/topic/HCLSIG_BioRDF_Subgroup/MicroarrayExperimentContext Provenance]<br />
* [[/DeepCapture|Deep Knowledge Representation Challenge]]<br />
* [[/DatasetDescription|Dataset Description]]<br />
<br />
=== Participants ===<br />
* Kei Cheung (Yale University)<br />
* Matthias Samwald (Medical University of Vienna)<br />
* Rob Frost (Vector C)<br />
* Adrian Paschke (Freie Universität Berlin)<br />
* Eric Prud'hommeaux (W3C)<br />
* Don Doherty (Brainstage)<br />
* Susie Stephens (Johnson & Johnson Pharmaceutical Research & Development)<br />
* Scott Marshall (University of Amsterdam)<br />
* TN Bhat (NIST)<br />
* Huajun Chen (Zhejiang University)<br />
* Jun Zhao (Oxford University)<br />
* [[KingsleyIdehen|Kingsley Idehen]] ([[OpenLinkSoftware|OpenLink Software]])<br />
* Helena Deus (University of Texas)<br />
* Satya Sahoo (Wright State University)<br />
* Egon Willighagen (Maastricht University)<br />
* Richard D Boyce (University of Pittsburgh)<br />
<br />
We welcome new participants.<br />
<br />
=== Resources/References ===<br />
* [http://www.eswc2008.org/final-pdfs-for-web-site/qpII-1.pdf SemWIQ]<br />
* [http://www.eswc2008.org/final-pdfs-for-web-site/qpII-2.pdf Querying Distributed RDF Data Sources with SPARQL]<br />
* [http://agraph.franz.com/support/documentation/current/agraph-introduction.html#intro-federation AllegroGraph documentation]<br />
* [http://darq.sourceforge.net/ DARQ]<br />
* [http://www.faviki.com/ Faviki]<br />
* [http://www.corporate-semantic-web.de/tl_files/pub/RuleResponder_HCLS_eScience.pdf SPARQL Service Bus middleware]<br />
* [http://users.ox.ac.uk/~zool0770/presentations/HCLS-BioRDF-Feb-09.pdf Vocabulary of Interlinked Dataset]<br />
* [[Media:HCLSIG_BioRDF_Subgroup$TCMGeneDIT_RDF_Dataset_r1.zip|The RDF dump of the TCMGeneDIT Database]]<br />
* [[Media:HCLSIG_BioRDF_Subgroup$TCMGeneDIT_RDF_Dataset_r2.tar.gz|The new release of RDF dump of the TCMGeneDIT Database]]<br />
* [[VirtuosoUniversalServer|OpenLink Virtuoso]] - Quad Store and Linked Data Deployment<br />
* [http://lod.openlinksw.com/fct/facet.vsp Live Virtuoso Instance] - [[NeuroCommons]], [[Bio2Rdf]], Uniprot, Yago, and other data sets from the Linked Data Cloud (note: [http://lod.openlinksw.com/sparql Live SPARQL endpoint])<br />
* [http://www.iscb.org/cms_addon/conferences/cshals2009/ C-SHALS 2009]<br />
* [http://www.cs.manchester.ac.uk/DILS09/ DILS 2009]<br />
* [[Media:HCLSIG_BioRDF_Subgroup$biordf_f2f_2009.pdf|BioRDF update F2F 2009]]<br />
* [[Media:HCLSIG_BioRDF_Subgroup$biordf_f2f_breakout_2009.pdf|BioRDF breakout session F2F 2009]]<br />
=== Old Archives ===<br />
* [[HCLSIG/LODD]]<br />
* [[HCLSIG BioRDF Subgroup]]<br />
<br />
Categories: [[Category:Hclsig]]</div>Rboycehttps://www.w3.org/wiki/index.php?title=HCLSIG/LODD/Data&diff=63345HCLSIG/LODD/Data2012-12-28T12:46:22Z<p>Rboyce: /* NOTE: WORK-IN-PROGRESS fu-berlin datasets are being hosted by Bio2RDF. Several are already there. Updates to this page and CKAN Datahub are pending.. */</p>
<hr />
<div>__NOTOC__<br />
=== LODD-related datasets that the LODD group already made available as Linked Data ===<br />
<br />
== NOTE: WORK IN PROGRESS. The fu-berlin datasets are being hosted by [http://bio2rdf.org/ Bio2RDF]. Several are already there. Updates to this page and the CKAN Datahub are pending. ==<br />
<br />
{| border="1" cellpadding="2" cellspacing="0"<br />
| '''Name'''<br />
| '''Topic'''<br />
| '''Short Description'''<br />
| '''Size and coverage'''<br />
| '''Status / Activity'''<br />
| '''Example Instances'''<br />
| '''SPARQL Endpoint'''<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/drugbank/ DrugBank]<br />
| Drugs<br />
| [http://www.drugbank.ca/ Drugbank.ca] provides drug (i.e., chemical, pharmacological and pharmaceutical) data with comprehensive drug target (i.e., sequence, structure, and pathway) information ({{doi|10.1093/nar/gkj067}})<br />
| 766,920 triples; 4,800 drugs, 2,500 protein sequences<br />
| updated regularly<br />
| Varenicline [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugs/DB01273 via Marbles], [http://demo.openlinksw.com/ode/?uri=http%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fdrugbank%2Fresource%2Fdrugs%2FDB01273 via OpenLink Data Explorer] <br />
| http://www4.wiwiss.fu-berlin.de/drugbank/sparql <br />
|-<br />
| [http://linkedct.org/ LinkedCT]<br />
| Clinical Trials<br />
| Linked data source of trials from [http://clinicaltrials.gov ClinicalTrials.gov]<br />
| ~25 million triples, 106,000 trials (as of April 2011)<br />
| Updated automatically on an ongoing basis; refer to the [http://linkedct.org/faq/ FAQ] for more details.<br />
| [http://data.linkedct.org/resource/condition/breast-cancer/ Breast Cancer] (Condition), a [http://data.linkedct.org/resource/trial/nct00999557/ NCT00999557] (Trial), [http://data.linkedct.org/resource/city/toronto/ Toronto] (City). <br />
| http://data.linkedct.org/sparql<br />
|-<br />
| [http://purl.org/net/nlprepository/linkedSPLs DailyMed] <br />
| All FDA-approved Structured Product Labels (SPLs) for currently marketed drugs enhanced with indexing to pharmacogenomics information and NDF-RT drug class assignments<br />
| Data available via a D2R server (sample data), as an RDF dump (full data, N-Triples), or from a Virtuoso RDF store (contact the maintainer)<br />
| 1,604,893 triples, 36,000+ product labels<br />
| Updated every Thursday using information from the DailyMed RSS feed<br />
| [http://thedatahub.org/dataset/linked-structured-product-labels/resource/918071ae-f570-4728-98b7-5b447ab42ab8 SPL for Venlafaxine Hydrochloride (American Health Packaging)]<br />
| http://purl.org/net/nlprepository/linkedSPLs<br />
|-<br />
| [http://dbpedia.org/About DBpedia]<br />
| Drugs/ Diseases/ Proteins<br />
| RDF data about 2.49 million things that has been extracted from Wikipedia<br />
| 218 million RDF triples; 2,300 drugs, 2,200 proteins<br />
| updated every 3 months <br />
| [http://dbpedia.org/resource/Aspirin Aspirin], [http://dbpedia.org/resource/HIV HIV]<br />
| http://dbpedia.org/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/diseasome/ Diseasome]<br />
| Diseases / Genes<br />
| [http://www.nd.edu/~alb/Publication06/145-HumanDisease_PNAS-14My07-Proc/Suppl/ Diseasome] describes characteristics of disorders and disease genes linked by known disorder–gene associations<br />
| 91,182 triples; 2,600 genes<br />
| updated 2006<br />
| Alzheimer's [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/diseasome/resource/diseases/74 via Marbles], [http://demo.openlinksw.com/ode/?uri=http%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fdiseasome%2Fresource%2Fdiseases%2F74 via OpenLink Data Explorer]<br />
| http://www4.wiwiss.fu-berlin.de/diseasome/sparql<br />
|-<br />
| [http://dbmi-icode-01.dbmi.pitt.edu/dikb-evidence/front-page.html The Drug Interaction Knowledge Base] <br />
| Drugs / Metabolic Inhibition Drug-drug Interactions (DDIs) / Claims and Evidence for drug mechanisms and DDIs<br />
| A D2R server of more than 60 drugs currently in the DIKB<br />
| >41K<br />
| Updated 12/21/2012<br />
| [http://dbmi-icode-01.dbmi.pitt.edu:2020/page/Drugs/paroxetine paroxetine], [http://dbmi-icode-01.dbmi.pitt.edu:2020/page/Drugs/atorvastatin atorvastatin]<br />
| http://dbmi-icode-01.dbmi.pitt.edu:2020/<br />
|-<br />
| [http://code.google.com/p/junsbriefcase/wiki/TGDdataset RDF-TCM]<br />
| Genes / Diseases / Medicine / Ingredients<br />
| Traditional Chinese medicine, gene and disease association dataset and a linkset mapping TCM gene symbols to Entrez Gene IDs created by Neurocommons <br />
| 117,643 <br />
| updated August 2009 (stable)<br />
| [http://purl.org/net/tcm/tcm.lifescience.ntu.edu.tw/id/medicine/Ginkgo_biloba Ginkgo biloba] <br />
| http://www.open-biomed.org.uk/sparql/endpoint/tcm<br />
|-<br />
| [http://www.nlm.nih.gov/research/umls/rxnorm/ RxNorm]<br />
| Drugs<br />
| A linked version of the NLM's RxNorm database that connects prescription drugs, ingredients, and NDCs through the RXCUI, a concept unique identifier. RxNorm is a product developed by NIH’s National Library of Medicine. It currently interlinks 12 different drug vocabularies around a unique concept identifier. Due to licensing, only six of the drug vocabularies are made available as part of the LODD cloud: Medical Subject Headings, Metathesaurus FDA National Drug Code Directory, Metathesaurus FDA Structured Product Labels, National Drug File, RxNorm Vocabulary, and Veterans Health Administration National Drug File.<br />
Links are provided connecting RxNorm to DrugBank and to the UMLS.<br />
|over 7.7 million triples; 165,806 RXCUI (Concept Unique Identifiers) Unique drugs and ingredients; 332,754 RXAUI (Atomic Unique Identifiers) sourced terms<br />
| Based on 3/2010 Rxnorm Release; Last updated 5/2010<br />
| [http://link.informatics.stonybrook.edu/rxnorm/RXAUI/2994963 Singulair from the Metathesaurus FDA Structured Product Labels]<br />
| http://link.informatics.stonybrook.edu/sparql/<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/sider/ SIDER]<br />
| Diseases / Side Effects<br />
| [http://sideeffects.embl.de/ SIDER] contains information on marketed drugs and their adverse effects ({{doi|10.1038/msb.2009.98}})<br />
| 192,515 triples; 63,000 adverse effect reports, 1,737 genes<br />
| updated 2009<br />
| Confusion [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/sider/resource/side_effects/C0009676 via Marbles]<br />
| http://www4.wiwiss.fu-berlin.de/sider/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/stitch/ STITCH]<br />
| Chemicals / Proteins<br />
| [http://stitch.embl.de/ STITCH] contains information on chemicals, proteins, and their interactions ({{doi|10.1093/nar/gkm795}})<br />
| 7,500,000 chemicals; 500,000 proteins; 370 organisms <br />
| updated July 2009<br />
| Lactose [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/stitch/resource/chemicals/CID000000294 via Marbles]<br />
| http://www4.wiwiss.fu-berlin.de/stitch/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/medicare/ Medicare]<br />
| Medicare Formulary<br />
| xxx<br />
| xxx<br />
| xxx<br />
| xxx<br />
| http://www4.wiwiss.fu-berlin.de/medicare/sparql<br />
|-<br />
| [http://www.ebi.ac.uk/chembl/ ChEMBL]<br />
| Chemical / Assays (Proteins, Organisms) / Papers<br />
| [http://www.ebi.ac.uk/chembl/ ChEMBL] contains information on trial drugs, including their activity against targets such as, but not limited to, proteins. All data are backed by and linked to the literature. Includes links to Bio2RDF for ChEBI and UniProt. License: CC-BY-SA.<br />
| ~130M triples<br />
| Updated 2010-01<br />
| An [http://rdf.farmbio.uu.se/chembl/snorql/?describe=http://linkedchemistry.info/chembl/activity/a2642163 IC50 activity].<br />
| http://rdf.farmbio.uu.se/chembl/sparql<br />
|-<br />
| [http://apps.who.int/ghodata/ WHO's Global Health Observatory (GHO)]<br />
| Infectious Diseases /Demography / Socioeconomic Conditions / Environmental Factors<br />
| Data and statistics for infectious diseases at country, regional, and global levels<br />
| ~3M triples<br />
| Updated 2012-05<br />
| xxx<br />
| http://gho.aksw.org<br />
|-<br />
| [http://www.dbmi.pitt.edu/nlpfront University of Pittsburgh NLP Repository] <br />
| Drugs / Procedures / Diagnoses<br />
| A semantic index of concepts present in 800 full-text clinical notes from the University of Pittsburgh NLP Repository<br />
| 38.664<br />
| Proof of concept -- Updated 02/25/2011<br />
| [http://dbmi-icode-01.dbmi.pitt.edu:8080/sparql?default-graph-uri=&query=SELECT+DISTINCT+%3Fproperty+%3FhasValue+%3FisValueOf%0D%0AWHERE+{%0D%0A++{+%3Chttp%3A%2F%2Fpurl.org%2Fnet%2Fnlprepository%2Ftest%23report_1460%3E+%3Fproperty+%3FhasValue+}%0D%0A++UNION%0D%0A++{+%3FisValueOf+%3Fproperty+%3Chttp%3A%2F%2Fpurl.org%2Fnet%2Fnlprepository%2Ftest%23report_1460%3E+}%0D%0A}%0D%0AORDER+BY+%28!BOUND%28%3FhasValue%29%29+%3Fproperty+%3FhasValue+%3FisValueOf%0D%0A&format=text%2Fhtml&debug=on&timeout= Concepts from a sample radiology report]<br />
| [http://dbmi-icode-01.dbmi.pitt.edu:8080/sparql http://dbmi-icode-01.dbmi.pitt.edu:8080/sparql]<br />
|-<br />
|}<br />
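The SPARQL endpoints in the table are queried over HTTP by passing the query text in a <code>query</code> URL parameter, as the sample radiology report link above does. The following is a minimal sketch of building such a URL, not a client for any particular endpoint. The helper name <code>build_sparql_url</code> and the example query are illustrative assumptions; the <code>format</code> parameter mirrors the sample link and is server-specific rather than part of the SPARQL protocol. No request is sent, and several of the listed endpoints may no longer be reachable.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def build_sparql_url(endpoint, query, result_format="application/sparql-results+json"):
    """Return a GET URL that carries `query` in the `query` parameter.

    `result_format` is sent as a `format` parameter, which is an assumption
    about the server (it appears in the sample link in the table).
    """
    params = {"query": query, "format": result_format}
    return endpoint + "?" + urlencode(params)


# Illustrative query: list a few properties of the DBpedia Aspirin resource.
ASPIRIN_QUERY = """
SELECT ?property ?value
WHERE { <http://dbpedia.org/resource/Aspirin> ?property ?value }
LIMIT 10
"""

url = build_sparql_url("http://dbpedia.org/sparql", ASPIRIN_QUERY)

# Decoding the URL again recovers the original query text unchanged.
decoded = parse_qs(urlparse(url).query)
print(url)
print(decoded["query"][0] == ASPIRIN_QUERY)
```

Percent-encoding the query with <code>urlencode</code> avoids the hand-escaped characters seen in long query links and keeps the query text readable in the source.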
<br />
[[File:2010-12-04_lodd_cloud.png|600px]]<br />
<br />
A graph of some of the LODD datasets (dark grey), related biomedical datasets (light grey), related general-purpose datasets (white) and their interconnections. Line weights correspond to the number of links. The direction of an arrow indicates the dataset that contains the links, e.g., an arrow from A to B means that dataset A contains RDF triples that use identifiers from B. Bidirectional arrows usually indicate that the links are mirrored in both datasets.<br />
More on the interlinking methodology and statistics can be found on the [[../Interlinking|Interlinking]] page.<br />
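The arrow convention described above can be sketched as a small counting routine: an arrow from A to B is drawn when dataset A holds triples whose objects are identifiers from B, and the line weight is the number of such triples. This is only an illustration under stated assumptions, not the method used to produce the diagram. The function names, the prefix table, and the sample triples (including the <code>owl:sameAs</code> link) are hypothetical.

```python
from collections import Counter

# Assumed URI prefixes used to decide which dataset an identifier belongs to.
NAMESPACES = {
    "drugbank": "http://www4.wiwiss.fu-berlin.de/drugbank/",
    "dbpedia": "http://dbpedia.org/resource/",
}


def dataset_of(value):
    """Return the dataset whose prefix matches `value`, or None (e.g. for literals)."""
    for name, prefix in NAMESPACES.items():
        if value.startswith(prefix):
            return name
    return None


def count_links(triples_by_dataset):
    """Count (source, target) pairs: triples in `source` whose object is an identifier from `target`."""
    links = Counter()
    for source, triples in triples_by_dataset.items():
        for _subject, _predicate, obj in triples:
            target = dataset_of(obj)
            if target is not None and target != source:
                links[(source, target)] += 1
    return links


# Made-up sample triples, for illustration only.
sample = {
    "drugbank": [
        ("http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugs/DB01273",
         "http://www.w3.org/2002/07/owl#sameAs",
         "http://dbpedia.org/resource/Varenicline"),
        ("http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugs/DB01273",
         "http://www.w3.org/2000/01/rdf-schema#label",
         "Varenicline"),
    ],
    "dbpedia": [],
}

links = count_links(sample)
print(links)
```

In this sample only the first triple counts as a link, so the result holds a single arrow from <code>drugbank</code> to <code>dbpedia</code> with weight 1; the label triple is ignored because its object is a literal.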
<br />
The LODD datasets have been crawled by the SWSE Semantic Web search engine and can be accessed via a faceted browsing interface at [http://visinav.deri.org/hcls/] ([http://visinav.deri.org/hcls/list?keyword=varenicline Example query: Varenicline]).<br />
<br />
Most of the LODD datasets have also been integrated into the SPARQL endpoint of the HCLS Knowledge Base, see [http://esw.w3.org/topic/HCLSIG_BioRDF_Subgroup/DERI_HCLS_KB the wiki page of the HCLS KB] for further information.<br />
<br />
=== Bio2RDF Data Sets ===<br />
<br />
The [http://bio2rdf.org/ Bio2RDF project] has published 40 biology-, gene- and medical-related datasets (altogether 2.3 billion triples). <br />
The datasets are available via SPARQL endpoints and as Linked Data. It is recommended that you use the [http://sourceforge.net/project/platformdownload.php?group_id=142631 Bio2RDF Java Servlet], and optionally [http://quebec.bio2rdf.org/download/virtuoso/indexed/ download the databases] for efficient personal use. Running your own instance of the [http://virtuoso.openlinksw.com/wiki/main/Main/VirtEC2AMIBio2rdfInstall OpenLink Virtuoso AMI for EC2] is also an option. Basic URI resolution on that instance does not require the Java Servlet, although for advanced queries you should still download it and configure it to query your EC2 SPARQL endpoint.<br />
<br />
* [http://www.freebase.com/view/user/bio2rdf/public/sparql Bio2RDF sparql endpoint list] [http://rdf.freebase.com/rdf/user/bio2rdf/public/sparql Sparql endpoint list in RDF]<br />
* [http://linkeddata.openlinksw.com:8891/pubmed:10500064 Identification of an autoimmune enteropathy-related 75-kilodalton antigen], via an [[OpenLinkSoftware|OpenLink]] hosted edition of [[Bio2Rdf]] <br />
* [http://linkeddata.openlinksw.com:8891/pubmed:9636670 Structure of the gene encoding the human cyclin-dependent kinase inhibitor p18 and mutational analysis in breast cancer], via an [[OpenLinkSoftware|OpenLink]] hosted edition of [[Bio2Rdf]] <br />
* [http://beckr.org/marbles?uri=http%3A%2F%2Fbio2rdf.org%2Fpubmed%3A9626117 PubMed article] viewed using the Marbles Linked Data browser.<br />
* [http://beckr.org/marbles?lang=en&uri=http%3A%2F%2Fbio2rdf.org%2Ffoaf%3AClemens%2C_T_L PubMed author] viewed using the Marbles Linked Data browser.<br />
* [http://beckr.org/marbles?uri=http%3A%2F%2Fbio2rdf.org%2Fomim%3A161555 OMIM Killer Cell Lectin-Like Receptor] viewed using the Marbles Linked Data browser.<br />
* [http://iws.seu.edu.cn/services/falcons/objectsearch/queryresult.jsp?query=%22KILLER+CELL%22 Falcons Search for KILLER CELL]. The Bio2RDF data has been crawled by the Falcons Semantic Web Search engine. This is an example of how the data is accessed by humans using the search engine. Falcons also offers an API that can be used by applications to access the data.<br />
<br />
=== Chem2bio2RDF ===<br />
<br />
* Information about the [http://chem2bio2rdf.org/ chem2bio2rdf] data sets<br />
<br />
=== Data Sets for the LODD Task ===<br />
<br />
To complement the drug-related Web of Data built by the LODD effort, the following data sets could/should also be published as Linked Data.<br />
<br />
The LODD effort is currently gathering more information about relevant datasets. See also [[/DataSetEvaluation|Evaluation of LODD Data Sets]] for current evaluation results.<br />
<br />
* [http://library.dialog.com/bluesheets/html/bl0107.html Adis R&D Insight]<br />
* [http://www.ebi.ac.uk/chebi/ chEBI]<br />
* [http://xpdb.nist.gov/pdb/chemblast.html ChemBlast]<br />
* [http://www.chemspider.com/ ChemSpider]<br />
* [http://ClinicalTrials.gov ClinicalTrials.gov]<br />
* [http://www.citeline.com/trialtrove.html Citeline TrialTrove]<br />
* [http://dailymed.nlm.nih.gov/dailymed/about.cfm DailyMed]<br />
* [http://dbpedia.org/About DBpedia]<br />
* [http://www.nd.edu/~alb/Publication06/145-HumanDisease_PNAS-14My07-Proc/Suppl/ Diseasome]<br />
* [http://www.drugbank.ca/ Drug Bank]<br />
* [http://www.virtualref.com/abs/72.htm DrugDB]<br />
* [http://www.ncbi.nlm.nih.gov/pubmed/17921997 Drugome]<br />
* [http://lsdis.cs.uga.edu/projects/asdoc/ Drug Ontology]<br />
* [http://scientific.thomsonreuters.com/products/iddb/ Investigational Drug Database] - Proprietary<br />
* [http://www.ovid.com/site/catalog/DataBase/1244.jsp?top=2&mid=3&bottom=7&subsection=10 IMS]<br />
* [http://www.genome.jp/kegg/drug/ KEGG Drug]<br />
* [http://Lillytrials.com LillyTrials]<br />
* [http://www.fda.gov/medwatch/ MedWatch]<br />
* [http://www.fda.gov/cder/ndc/ National Drug Code]<br />
* [http://www.ncbi.nlm.nih.gov/omim/ OMIM]<br />
* [http://www.fda.gov/cder/ob/ Orange Book]<br />
* [http://www.pharmaprojects.com/ Pharmaprojects] - Proprietary<br />
* [http://pubchem.ncbi.nlm.nih.gov/ PubChem]<br />
* [http://www.nlm.nih.gov/research/umls/rxnorm/ RxNorm]<br />
* [http://www.ncbi.nlm.nih.gov/sites/entrez?Db=pubmed&Cmd=ShowDetailView&TermToSearch=15360858 VA NDF-RT]<br />
* Other data sources could include blogs, discussion boards, wikis, etc.<br />
* and.... <br />
** [http://www.who.int/globalatlas/ World Health Organization's Global Health Atlas]<br />
** [http://www.epispider.org/ EpiSPIDER]<br />
** [http://www.accessdata.fda.gov/Scripts/cder/DrugsatFDA/ Drugs@FDA - FDA Approved Drug Products]<br />
** [http://www.drugdigest.org/wps/portal/ddigest DrugDigest]<br />
** [http://humancyc.org/ HumanCyc: Encyclopedia of ''Homo sapiens'' Genes and Metabolism]<br />
** [http://www.alzforum.org/ Alzheimer Research Forum]<br />
** [http://wwwcf.nlm.nih.gov/umlslicense/rxtermApp/rxTerm.cfm RxTerms]<br />
** [http://hudine.neu.edu/ HuDiNe]<br />
** [http://wiki.medpedia.com/Cymbalta Medpedia]<br />
** [http://tcm.lifescience.ntu.edu.tw/ TCMGeneDIT] and [[Media:HCLSIG$$LODD$$Data$TCMGeneDIT_RDF_Dataset_r1.zip|RDF dump]]<br />
** [http://www.tuftsctsi.org/~/media/Files/CTSI/Library%20Files/FCC%20for%20CER%20Rpt%20to%20Pres%20and%20Congress_063009.ashx List of other possible data sources from page 66 onwards]<br />
<br />
=== Alternative Herbal Medicine use case ===<br />
* [[Data/TCMGeneDIT|TCMGeneDIT dataset]]<br />
<br />
=== Identifier-Based Linkage Points ===<br />
<br />
* InChIs<br />
* [[PubChem]] Compound ID (CID)<br />
* [[PubChem]] NSC<br />
* Chemical Abstract ID (CAS)<br />
* New Drug Application (NDA)<br />
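Identifiers such as these can serve as join keys between datasets. The following is a minimal sketch, assuming each record is a dictionary with an <code>id</code> field plus optional identifier fields. The function name, field names, and record identifiers are hypothetical, and the two small record lists are made up for illustration.

```python
def link_by_identifier(records_a, records_b, key):
    """Pair records from two datasets that share the same non-empty value for `key`."""
    # Index the second dataset by the identifier, skipping records that lack it.
    index = {}
    for record in records_b:
        value = record.get(key)
        if value is not None:
            index.setdefault(value, []).append(record)

    pairs = []
    for record in records_a:
        for match in index.get(record.get(key), []):
            pairs.append((record["id"], match["id"]))
    return pairs


# Illustrative records; "cas" holds a CAS registry number, "pubchem_cid" a PubChem Compound ID.
dataset_a = [
    {"id": "a:aspirin", "cas": "50-78-2", "pubchem_cid": 2244},
    {"id": "a:varenicline", "cas": None},
]
dataset_b = [
    {"id": "b:acetylsalicylic-acid", "cas": "50-78-2"},
]

pairs = link_by_identifier(dataset_a, dataset_b, "cas")
print(pairs)
```

Records without a value for the chosen identifier are left unlinked rather than matched to each other, which is why the second record in <code>dataset_a</code> produces no pair.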
<br />
=== Data Set Attributes ===<br />
<br />
* Licensing<br />
* Data Format<br />
* Identifiers</div>Rboycehttps://www.w3.org/wiki/index.php?title=HCLSIG/LODD/Data&diff=63130HCLSIG/LODD/Data2012-12-10T16:24:46Z<p>Rboyce: /* LODD-related datasets that the LODD group already made available as Linked Data */</p>
<hr />
<div>__NOTOC__<br />
=== LODD-related datasets that the LODD group already made available as Linked Data ===<br />
<br />
{| border="1" cellpadding="2" cellspacing="0"<br />
| '''Name'''<br />
| '''Topic'''<br />
| '''Short Description'''<br />
| '''Size and coverage'''<br />
| '''Status / Activity'''<br />
| '''Example Instances'''<br />
| '''SPARQL Endpoint'''<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/drugbank/ DrugBank]<br />
| Drugs<br />
| [http://www.drugbank.ca/ Drugbank.ca] provides drug (i.e., chemical, pharmacological and pharmaceutical) data with comprehensive drug target (i.e., sequence, structure, and pathway) information ({{doi|10.1093/nar/gkj067}})<br />
| 766,920 triples; 4,800 drugs, 2,500 protein sequences<br />
| updated regularly<br />
| Varenicline [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugs/DB01273 via Marbles], [http://demo.openlinksw.com/ode/?uri=http%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fdrugbank%2Fresource%2Fdrugs%2FDB01273 via OpenLink Data Explorer] <br />
| http://www4.wiwiss.fu-berlin.de/drugbank/sparql <br />
|-<br />
| [http://linkedct.org/ LinkedCT]<br />
| Clinical Trials<br />
| Linked data source of trials from [http://clinicaltrials.gov ClinicalTrials.gov]<br />
| ~25 million triples, 106,000 trials (as of April 2011)<br />
| Updated automatically on an ongoing basis; refer to the [http://linkedct.org/faq/ FAQ] for more details.<br />
| [http://data.linkedct.org/resource/condition/breast-cancer/ Breast Cancer] (Condition), a [http://data.linkedct.org/resource/trial/nct00999557/ NCT00999557] (Trial), [http://data.linkedct.org/resource/city/toronto/ Toronto] (City). <br />
| http://data.linkedct.org/sparql<br />
|-<br />
| [http://purl.org/net/nlprepository/linkedSPLs DailyMed] <br />
| All FDA-approved Structured Product Labels (SPLs) for currently marketed drugs enhanced with indexing to pharmacogenomics information and NDF-RT drug class assignments<br />
| Data available via a D2R server (sample data), as an RDF dump (full data, N-Triples), or from a Virtuoso RDF store (contact the maintainer)<br />
| 1,604,893 triples, 36,000+ product labels<br />
| Updated every Thursday using information from the DailyMed RSS feed<br />
| [http://thedatahub.org/dataset/linked-structured-product-labels/resource/918071ae-f570-4728-98b7-5b447ab42ab8 SPL for Venlafaxine Hydrochloride (American Health Packaging)]<br />
| http://purl.org/net/nlprepository/linkedSPLs<br />
|-<br />
| [http://dbpedia.org/About DBpedia]<br />
| Drugs/ Diseases/ Proteins<br />
| RDF data about 2.49 million things that has been extracted from Wikipedia<br />
| 218 million RDF triples; 2,300 drugs, 2,200 proteins<br />
| updated every 3 months <br />
| [http://dbpedia.org/resource/Aspirin Aspirin], [http://dbpedia.org/resource/HIV HIV]<br />
| http://dbpedia.org/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/diseasome/ Diseasome]<br />
| Diseases / Genes<br />
| [http://www.nd.edu/~alb/Publication06/145-HumanDisease_PNAS-14My07-Proc/Suppl/ Diseasome] describes characteristics of disorders and disease genes linked by known disorder–gene associations<br />
| 91,182 triples; 2,600 genes<br />
| updated 2006<br />
| Alzheimer's [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/diseasome/resource/diseases/74 via Marbles], [http://demo.openlinksw.com/ode/?uri=http%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fdiseasome%2Fresource%2Fdiseases%2F74 via OpenLink Data Explorer]<br />
| http://www4.wiwiss.fu-berlin.de/diseasome/sparql<br />
|-<br />
| [http://dbmi-icode-01.dbmi.pitt.edu/dikb-evidence/front-page.html The Drug Interaction Knowledge Base] <br />
| Drugs / Metabolic Inhibition Drug-drug Interactions (DDIs) / Claims and Evidence for drug mechanisms and DDIs<br />
| A D2R server of more than 60 drugs currently in the DIKB<br />
| >41K<br />
| Updated 10/01/2012<br />
| [http://dbmi-icode-01.dbmi.pitt.edu:2020/page/Drugs/paroxetine paroxetine], [http://dbmi-icode-01.dbmi.pitt.edu:2020/page/Drugs/atorvastatin atorvastatin]<br />
| http://dbmi-icode-01.dbmi.pitt.edu:2020/<br />
|-<br />
| [http://code.google.com/p/junsbriefcase/wiki/TGDdataset RDF-TCM]<br />
| Genes / Diseases / Medicine / Ingredients<br />
| Traditional Chinese medicine, gene and disease association dataset and a linkset mapping TCM gene symbols to Entrez Gene IDs created by Neurocommons <br />
| 117,643 <br />
| updated August 2009 (stable)<br />
| [http://purl.org/net/tcm/tcm.lifescience.ntu.edu.tw/id/medicine/Ginkgo_biloba Ginkgo biloba] <br />
| http://hcls.deri.org/sparql; graph name: http://hcls.deri.org/resource/graph/tcm<br />
|-<br />
| [http://www.nlm.nih.gov/research/umls/rxnorm/ RxNorm]<br />
| Drugs<br />
| A linked version of the NLM's RxNorm database that connects prescription drugs, ingredients, and NDCs through the RXCUI, a concept unique identifier. RxNorm is a product developed by NIH’s National Library of Medicine. It currently interlinks 12 different drug vocabularies around a unique concept identifier. Due to licensing, only six of the drug vocabularies are made available as part of the LODD cloud: Medical Subject Headings, Metathesaurus FDA National Drug Code Directory, Metathesaurus FDA Structured Product Labels, National Drug File, RxNorm Vocabulary, and Veterans Health Administration National Drug File.<br />
Links are provided connecting RxNorm to DrugBank and to the UMLS.<br />
|over 7.7 million triples; 165,806 RXCUI (Concept Unique Identifiers) Unique drugs and ingredients; 332,754 RXAUI (Atomic Unique Identifiers) sourced terms<br />
| Based on 3/2010 Rxnorm Release; Last updated 5/2010<br />
| [http://link.informatics.stonybrook.edu/rxnorm/RXAUI/2994963 Singulair from the Metathesaurus FDA Structured Product Labels]<br />
| http://link.informatics.stonybrook.edu/sparql/<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/sider/ SIDER]<br />
| Diseases / Side Effects<br />
| [http://sideeffects.embl.de/ SIDER] contains information on marketed drugs and their adverse effects ({{doi|10.1038/msb.2009.98}})<br />
| 192,515 triples; 63,000 adverse effect reports, 1,737 genes<br />
| updated 2009<br />
| Confusion [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/sider/resource/side_effects/C0009676 via Marbles]<br />
| http://www4.wiwiss.fu-berlin.de/sider/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/stitch/ STITCH]<br />
| Chemicals / Proteins<br />
| [http://stitch.embl.de/ STITCH] contains information on chemicals, proteins, and their interactions ({{doi|10.1093/nar/gkm795}})<br />
| 7,500,000 chemicals; 500,000 proteins; 370 organisms <br />
| updated July 2009<br />
| Lactose [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/stitch/resource/chemicals/CID000000294 via Marbles]<br />
| http://www4.wiwiss.fu-berlin.de/stitch/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/medicare/ Medicare]<br />
| Medicare Formulary<br />
| xxx<br />
| xxx<br />
| xxx<br />
| xxx<br />
| http://www4.wiwiss.fu-berlin.de/medicare/sparql<br />
|-<br />
| [http://www.ebi.ac.uk/chembl/ ChEMBL]<br />
| Chemical / Assays (Proteins, Organisms) / Papers<br />
| [http://www.ebi.ac.uk/chembl/ ChEMBL] contains information on trial drugs, including their activity against targets such as, but not limited to, proteins. All data are backed by and linked to the literature. Includes links to Bio2RDF for ChEBI and UniProt. License: CC-BY-SA.<br />
| ~130M triples<br />
| Updated 2010-01<br />
| An [http://rdf.farmbio.uu.se/chembl/snorql/?describe=http://linkedchemistry.info/chembl/activity/a2642163 IC50 activity].<br />
| http://rdf.farmbio.uu.se/chembl/sparql<br />
|-<br />
| [http://apps.who.int/ghodata/ WHO's Global Health Observatory (GHO)]<br />
| Infectious Diseases /Demography / Socioeconomic Conditions / Environmental Factors<br />
| Data and statistics for infectious diseases at country, regional, and global levels<br />
| ~3M triples<br />
| Updated 2012-05<br />
| xxx<br />
| http://gho.aksw.org<br />
|-<br />
| [http://www.dbmi.pitt.edu/nlpfront University of Pittsburgh NLP Repository] <br />
| Drugs / Procedures / Diagnoses<br />
| A semantic index of concepts present in 800 full-text clinical notes from the University of Pittsburgh NLP Repository<br />
| 38.664<br />
| Proof of concept -- Updated 02/25/2011<br />
| [http://dbmi-icode-01.dbmi.pitt.edu:8080/sparql?default-graph-uri=&query=SELECT+DISTINCT+%3Fproperty+%3FhasValue+%3FisValueOf%0D%0AWHERE+{%0D%0A++{+%3Chttp%3A%2F%2Fpurl.org%2Fnet%2Fnlprepository%2Ftest%23report_1460%3E+%3Fproperty+%3FhasValue+}%0D%0A++UNION%0D%0A++{+%3FisValueOf+%3Fproperty+%3Chttp%3A%2F%2Fpurl.org%2Fnet%2Fnlprepository%2Ftest%23report_1460%3E+}%0D%0A}%0D%0AORDER+BY+%28!BOUND%28%3FhasValue%29%29+%3Fproperty+%3FhasValue+%3FisValueOf%0D%0A&format=text%2Fhtml&debug=on&timeout= Concepts from a sample radiology report]<br />
| [http://dbmi-icode-01.dbmi.pitt.edu:8080/sparql http://dbmi-icode-01.dbmi.pitt.edu:8080/sparql]<br />
|-<br />
|}<br />
<br />
[[File:2010-12-04_lodd_cloud.png|600px]]<br />
<br />
A graph of some of the LODD datasets (dark grey), related biomedical datasets (light grey), related general-purpose datasets (white) and their interconnections. Line weights correspond to the number of links. The direction of an arrow indicates the dataset that contains the links, e.g., an arrow from A to B means that dataset A contains RDF triples that use identifiers from B. Bidirectional arrows usually indicate that the links are mirrored in both datasets.<br />
More on the interlinking methodology and statistics can be found on the [[../Interlinking|Interlinking]] page.<br />
<br />
The LODD datasets have been crawled by the SWSE Semantic Web search engine and can be accessed via a faceted browsing interface at [http://visinav.deri.org/hcls/] ([http://visinav.deri.org/hcls/list?keyword=varenicline Example query: Varenicline]).<br />
<br />
Most of the LODD datasets have also been integrated into the SPARQL endpoint of the HCLS Knowledge Base, see [http://esw.w3.org/topic/HCLSIG_BioRDF_Subgroup/DERI_HCLS_KB the wiki page of the HCLS KB] for further information.<br />
<br />
=== Bio2RDF Data Sets ===<br />
<br />
The [http://bio2rdf.org/ Bio2RDF project] has published 40 biology-, gene- and medical-related datasets (altogether 2.3 billion triples). <br />
The datasets are available via SPARQL endpoints and as Linked Data. It is recommended that you use the [http://sourceforge.net/project/platformdownload.php?group_id=142631 Bio2RDF Java Servlet], and optionally [http://quebec.bio2rdf.org/download/virtuoso/indexed/ download the databases] for efficient personal use. Running your own instance of the [http://virtuoso.openlinksw.com/wiki/main/Main/VirtEC2AMIBio2rdfInstall OpenLink Virtuoso AMI for EC2] is also an option. Basic URI resolution on that instance does not require the Java Servlet, although for advanced queries you should still download it and configure it to query your EC2 SPARQL endpoint.<br />
<br />
* [http://www.freebase.com/view/user/bio2rdf/public/sparql Bio2RDF sparql endpoint list] [http://rdf.freebase.com/rdf/user/bio2rdf/public/sparql Sparql endpoint list in RDF]<br />
* [http://linkeddata.openlinksw.com:8891/pubmed:10500064 Identification of an autoimmune enteropathy-related 75-kilodalton antigen], via an [[OpenLinkSoftware|OpenLink]] hosted edition of [[Bio2Rdf]] <br />
* [http://linkeddata.openlinksw.com:8891/pubmed:9636670 Structure of the gene encoding the human cyclin-dependent kinase inhibitor p18 and mutational analysis in breast cancer], via an [[OpenLinkSoftware|OpenLink]] hosted edition of [[Bio2Rdf]] <br />
* [http://beckr.org/marbles?uri=http%3A%2F%2Fbio2rdf.org%2Fpubmed%3A9626117 PubMed article] viewed using the Marbles Linked Data browser.<br />
* [http://beckr.org/marbles?lang=en&uri=http%3A%2F%2Fbio2rdf.org%2Ffoaf%3AClemens%2C_T_L PubMed author] viewed using the Marbles Linked Data browser.<br />
* [http://beckr.org/marbles?uri=http%3A%2F%2Fbio2rdf.org%2Fomim%3A161555 OMIM Killer Cell Lectin-Like Receptor] viewed using the Marbles Linked Data browser.<br />
* [http://iws.seu.edu.cn/services/falcons/objectsearch/queryresult.jsp?query=%22KILLER+CELL%22 Falcons Search for KILLER CELL]. The Bio2RDF data has been crawled by the Falcons Semantic Web Search engine. This is an example of how the data is accessed by humans using the search engine. Falcons also offers an API that can be used by applications to access the data.<br />
<br />
=== Chem2bio2RDF ===<br />
<br />
* Information about the [http://chem2bio2rdf.org/ chem2bio2rdf] data sets<br />
<br />
=== Data Sets for the LODD Task ===<br />
<br />
To complement the drug-related Web of Data built by the LODD effort, the following data sets could/should also be published as Linked Data.<br />
<br />
The LODD effort is currently gathering more information about relevant datasets. See also [[/DataSetEvaluation|Evaluation of LODD Data Sets]] for current evaluation results.<br />
<br />
* [http://library.dialog.com/bluesheets/html/bl0107.html Adis R&D Insight]<br />
* [http://www.ebi.ac.uk/chebi/ chEBI]<br />
* [http://xpdb.nist.gov/pdb/chemblast.html ChemBlast]<br />
* [http://www.chemspider.com/ ChemSpider]<br />
* [http://ClinicalTrials.gov ClinicalTrials.gov]<br />
* [http://www.citeline.com/trialtrove.html Citeline TrialTrove]<br />
* [http://dailymed.nlm.nih.gov/dailymed/about.cfm DailyMed]<br />
* [http://dbpedia.org/About DBpedia]<br />
* [http://www.nd.edu/~alb/Publication06/145-HumanDisease_PNAS-14My07-Proc/Suppl/ Diseasome]<br />
* [http://www.drugbank.ca/ Drug Bank]<br />
* [http://www.virtualref.com/abs/72.htm DrugDB]<br />
* [http://www.ncbi.nlm.nih.gov/pubmed/17921997 Drugome]<br />
* [http://lsdis.cs.uga.edu/projects/asdoc/ Drug Ontology]<br />
* [http://scientific.thomsonreuters.com/products/iddb/ Investigational Drug Database] - Proprietary<br />
* [http://www.ovid.com/site/catalog/DataBase/1244.jsp?top=2&mid=3&bottom=7&subsection=10 IMS]<br />
* [http://www.genome.jp/kegg/drug/ KEGG Drug]<br />
* [http://Lillytrials.com LillyTrials]<br />
* [http://www.fda.gov/medwatch/ MedWatch]<br />
* [http://www.fda.gov/cder/ndc/ National Drug Code]<br />
* [http://www.ncbi.nlm.nih.gov/omim/ OMIM]<br />
* [http://www.fda.gov/cder/ob/ Orange Book]<br />
* [http://www.pharmaprojects.com/ Pharmaprojects] - Proprietary<br />
* [http://pubchem.ncbi.nlm.nih.gov/ PubChem]<br />
* [http://www.nlm.nih.gov/research/umls/rxnorm/ RxNorm]<br />
* [http://www.ncbi.nlm.nih.gov/sites/entrez?Db=pubmed&Cmd=ShowDetailView&TermToSearch=15360858 VA NDF-RT]<br />
* Other data sources could include blogs, discussion boards, wikis, etc.<br />
* and.... <br />
** [http://www.who.int/globalatlas/ World Health Organization's Global Health Atlas]<br />
** [http://www.epispider.org/ EpiSPIDER]<br />
** [http://www.accessdata.fda.gov/Scripts/cder/DrugsatFDA/ Drugs@FDA - FDA Approved Drug Products]<br />
** [http://www.drugdigest.org/wps/portal/ddigest DrugDigest]<br />
** [http://humancyc.org/ HumanCyc: Encyclopedia of ''Homo sapiens'' Genes and Metabolism]<br />
** [http://www.alzforum.org/ Alzheimer Research Forum]<br />
** [http://wwwcf.nlm.nih.gov/umlslicense/rxtermApp/rxTerm.cfm RxTerms]<br />
** [http://hudine.neu.edu/ HuDiNe]<br />
** [http://wiki.medpedia.com/Cymbalta Medpedia]<br />
** [http://tcm.lifescience.ntu.edu.tw/ TCMGeneDIT] and [[Media:HCLSIG$$LODD$$Data$TCMGeneDIT_RDF_Dataset_r1.zip|RDF dump]]<br />
** [http://www.tuftsctsi.org/~/media/Files/CTSI/Library%20Files/FCC%20for%20CER%20Rpt%20to%20Pres%20and%20Congress_063009.ashx List of other possible data sources from page 66 onwards]<br />
<br />
=== Alternative Herbal Medicine use case ===<br />
* [[Data/TCMGeneDIT|TCMGeneDIT dataset]]<br />
<br />
=== Identifier-Based Linkage Points ===<br />
<br />
* InChIs<br />
* [[PubChem]] Compound ID (CID)<br />
* [[PubChem]] NSC<br />
* Chemical Abstract ID (CAS)<br />
* New Drug Application (NDA)<br />
<br />
=== Data Set Attributes ===<br />
<br />
* Licensing<br />
* Data Format<br />
* Identifiers</div>Rboycehttps://www.w3.org/wiki/index.php?title=HCLSIG/LODD/Data&diff=62437HCLSIG/LODD/Data2012-11-06T20:53:51Z<p>Rboyce: /* LODD-related datasets that the LODD group already made available as Linked Data */</p>
<hr />
<div>__NOTOC__<br />
=== LODD-related datasets that the LODD group already made available as Linked Data ===<br />
<br />
{| border="1" cellpadding="2" cellspacing="0"<br />
| '''Name'''<br />
| '''Topic'''<br />
| '''Short Description'''<br />
| '''Size and coverage'''<br />
| '''Status / Activity'''<br />
| '''Example Instances'''<br />
| '''SPARQL Endpoint'''<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/drugbank/ DrugBank]<br />
| Drugs<br />
| [http://www.drugbank.ca/ Drugbank.ca] provides drug (i.e., chemical, pharmacological and pharmaceutical) data with comprehensive drug target (i.e., sequence, structure, and pathway) information ({{doi|10.1093/nar/gkj067}})<br />
| 766,920 triples; 4,800 drugs, 2,500 protein sequences<br />
| updated regularly<br />
| Varenicline [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugs/DB01273 via Marbles], [http://demo.openlinksw.com/ode/?uri=http%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fdrugbank%2Fresource%2Fdrugs%2FDB01273 via OpenLink Data Explorer] <br />
| http://www4.wiwiss.fu-berlin.de/drugbank/sparql <br />
|-<br />
| [http://linkedct.org/ LinkedCT]<br />
| Clinical Trials<br />
| Linked data source of trials from [http://clinicaltrials.gov ClinicalTrials.gov]<br />
| ~25 million triples, 106,000 trials (as of April 2011)<br />
| Updated automatically at all times, refer to [http://linkedct.org/faq/ FAQ] for more details.<br />
| [http://data.linkedct.org/resource/condition/breast-cancer/ Breast Cancer] (Condition), a [http://data.linkedct.org/resource/trial/nct00999557/ NCT00999557] (Trial), [http://data.linkedct.org/resource/city/toronto/ Toronto] (City). <br />
| http://data.linkedct.org/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/dailymed/ DailyMed]<br />
| Drugs<br />
| [http://dailymed.nlm.nih.gov/dailymed/about.cfm dailymed.nlm.nih.gov] provides information about approved prescription drugs, includes FDA approved labels (package inserts)<br />
| 164,276 triples; 4,039 drugs<br />
| possibly subsumed by LinkedSPLs (see below)<br />
| "Sterile Water (Irrigant)" [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/dailymed/resource/drugs/492 via Marbles], [http://demo.openlinksw.com/ode/?uri=http%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fdailymed%2Fresource%2Fdrugs%2F492 via OpenLink Data Explorer]<br />
| http://www4.wiwiss.fu-berlin.de/dailymed/sparql<br />
|-<br />
| [http://dbpedia.org/About DBpedia]<br />
| Drugs/ Diseases/ Proteins<br />
| RDF data about 2.49 million things that has been extracted from Wikipedia<br />
| 218 million RDF triples; 2,300 drugs, 2,200 proteins<br />
| updated every 3 months <br />
| [http://dbpedia.org/resource/Aspirin Aspirin], [http://dbpedia.org/resource/HIV HIV]<br />
| http://dbpedia.org/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/diseasome/ Diseasome]<br />
| Diseases / Genes<br />
| [http://www.nd.edu/~alb/Publication06/145-HumanDisease_PNAS-14My07-Proc/Suppl/ Diseasome] describes characteristics of disorders and disease genes linked by known disorder–gene associations<br />
| 91,182 triples; 2,600 genes<br />
| updated 2006<br />
| Alzheimer's [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/diseasome/resource/diseases/74 via Marbles], [http://demo.openlinksw.com/ode/?uri=http%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fdiseasome%2Fresource%2Fdiseases%2F74 via OpenLink Data Explorer]<br />
| http://www4.wiwiss.fu-berlin.de/diseasome/sparql<br />
|-<br />
| [http://code.google.com/p/junsbriefcase/wiki/TGDdataset RDF-TCM]<br />
| Genes / Diseases / Medicine / Ingredients<br />
| Traditional Chinese medicine, gene, and disease association dataset, plus a linkset mapping TCM gene symbols to Entrez Gene IDs created by Neurocommons <br />
| 117,643 <br />
| updated August 2009 (stable)<br />
| [http://purl.org/net/tcm/tcm.lifescience.ntu.edu.tw/id/medicine/Ginkgo_biloba Ginkgo biloba] <br />
| http://hcls.deri.org/sparql; graph name: http://hcls.deri.org/resource/graph/tcm<br />
|-<br />
| [http://www.nlm.nih.gov/research/umls/rxnorm/ RxNorm]<br />
| Drugs<br />
| A linked version of the NLM's RxNorm database that connects prescription drugs, ingredients, and NDC through the RXCUI, a concept unique identifier. RxNorm is a product developed by NIH’s National Library of Medicine. It currently interlinks 12 different drug vocabularies around a unique concept identifier. Due to licensing, only six of the drug vocabularies are made available as part of the LODD cloud: Medical Subject Headings, Metathesaurus FDA National Drug Code Directory, Metathesaurus FDA Structured Product Labels, National Drug File, RxNorm Vocabulary, and Veterans Health Administration National Drug File.<br />
Links are provided connecting RxNorm to DrugBank and to the UMLS.
|over 7.7 million triples; 165,806 RXCUI (Concept Unique Identifiers) Unique drugs and ingredients; 332,754 RXAUI (Atomic Unique Identifiers) sourced terms<br />
| Based on 3/2010 Rxnorm Release; Last updated 5/2010<br />
| [http://link.informatics.stonybrook.edu/rxnorm/RXAUI/2994963 Singulair from the Metathesaurus FDA Structured Product Labels]<br />
| http://link.informatics.stonybrook.edu/sparql/<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/sider/ SIDER]<br />
| Diseases / Side Effects<br />
| [http://sideeffects.embl.de/ SIDER] contains information on marketed drugs and their adverse effects ({{doi|10.1038/msb.2009.98}})<br />
| 192,515 triples; 63,000 adverse effect reports, 1,737 genes<br />
| updated 2009<br />
| Confusion [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/sider/resource/side_effects/C0009676 via Marbles]<br />
| http://www4.wiwiss.fu-berlin.de/sider/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/stitch/ STITCH]<br />
| Chemicals / Proteins<br />
| [http://stitch.embl.de/ STITCH] contains information on chemicals, proteins, and their interactions ({{doi|10.1093/nar/gkm795}})<br />
| 7,500,000 chemicals; 500,000 proteins; 370 organisms <br />
| updated July 2009<br />
| Lactose [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/stitch/resource/chemicals/CID000000294 via Marbles]<br />
| http://www4.wiwiss.fu-berlin.de/stitch/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/medicare/ Medicare]<br />
| Medicare Formulary<br />
| xxx<br />
| xxx<br />
| xxx<br />
| xxx<br />
| http://www4.wiwiss.fu-berlin.de/medicare/sparql<br />
|-<br />
| [http://www.ebi.ac.uk/chembl/ ChEMBL]<br />
| Chemical / Assays (Proteins, Organisms) / Papers<br />
| [http://www.ebi.ac.uk/chembl/ ChEMBL] contains information on trial drugs, including their activity against targets such as (but not limited to) proteins. All data is backed up by and linked to literature. Includes links to Bio2RDF for ChEBI and Uniprot. License: CC-BY-SA.<br />
| ~24M triples<br />
| Updated 2010-01<br />
| An [http://rdf.farmbio.uu.se/chembl/snorql/?describe=http://rdf.farmbio.uu.se/chembl/activity/a2642163 IC50 activity].<br />
| http://rdf.farmbio.uu.se/chembl/sparql<br />
|-<br />
| [http://apps.who.int/ghodata/ WHO's Global Health Observatory (GHO)]<br />
| Infectious Diseases /Demography / Socioeconomic Conditions / Environmental Factors<br />
| Data and statistics for infectious diseases at country, regional, and global levels<br />
| ~3M triples<br />
| Updated 2012-05<br />
| xxx<br />
| http://gho.aksw.org<br />
|-<br />
| [http://www.dbmi.pitt.edu/nlpfront University of Pittsburgh NLP Repository] <br />
| Drugs / Procedures / Diagnoses<br />
| A semantic index of concepts present in 800 full-text clinical notes from the University of Pittsburgh NLP Repository<br />
| 38,664<br />
| Proof of concept -- Updated 02/25/2011<br />
| [http://dbmi-icode-01.dbmi.pitt.edu:8080/sparql?default-graph-uri=&query=SELECT+DISTINCT+%3Fproperty+%3FhasValue+%3FisValueOf%0D%0AWHERE+{%0D%0A++{+%3Chttp%3A%2F%2Fpurl.org%2Fnet%2Fnlprepository%2Ftest%23report_1460%3E+%3Fproperty+%3FhasValue+}%0D%0A++UNION%0D%0A++{+%3FisValueOf+%3Fproperty+%3Chttp%3A%2F%2Fpurl.org%2Fnet%2Fnlprepository%2Ftest%23report_1460%3E+}%0D%0A}%0D%0AORDER+BY+%28!BOUND%28%3FhasValue%29%29+%3Fproperty+%3FhasValue+%3FisValueOf%0D%0A&format=text%2Fhtml&debug=on&timeout= Concepts from a sample radiology report]<br />
| [http://dbmi-icode-01.dbmi.pitt.edu:8080/sparql http://dbmi-icode-01.dbmi.pitt.edu:8080/sparql]<br />
|-<br />
| [http://dbmi-icode-01.dbmi.pitt.edu/dikb-evidence/front-page.html The Drug Interaction Knowledge Base] <br />
| Drugs / Metabolic Inhibition Drug-drug Interactions (DDIs) / Claims and Evidence for drug mechanisms and DDIs<br />
| A D2R server of more than 60 drugs currently in the DIKB<br />
| >41K<br />
| Updated 10/01/2012<br />
| [http://dbmi-icode-01.dbmi.pitt.edu:2020/page/Drugs/paroxetine paroxetine], [http://dbmi-icode-01.dbmi.pitt.edu:2020/page/Drugs/atorvastatin atorvastatin]<br />
| http://dbmi-icode-01.dbmi.pitt.edu:2020/<br />
|-<br />
| [http://purl.org/net/nlprepository/linkedSPLs Linked Structured Product Labels (LinkedSPLs)] <br />
| All FDA-approved Structured Product Labels (SPLs) for currently marketed drugs enhanced with indexing to pharmacogenomics information and NDF-RT drug class assignments<br />
| Data available via a D2R server (sample data), as an RDF dump (full data, N-Triples), or from a Virtuoso RDF store (contact maintainer)<br />
| 1,604,893 triples, 36,000+ product labels<br />
| Updated every Thursday using information from the DailyMed RSS feed<br />
| [http://thedatahub.org/dataset/linked-structured-product-labels/resource/918071ae-f570-4728-98b7-5b447ab42ab8 SPL for Venlafaxine Hydrochloride (American Health Packaging)]<br />
| http://purl.org/net/nlprepository/linkedSPLs<br />
|}<br />
<br />
[[File:2010-12-04_lodd_cloud.png|600px]]<br />
<br />
A graph of some of the LODD datasets (dark grey), related biomedical datasets (light grey), related general-purpose datasets (white) and their interconnections. Line weights correspond to the number of links. The direction of an arrow indicates the dataset that contains the links, e.g., an arrow from A to B means that dataset A contains RDF triples that use identifiers from B. Bidirectional arrows usually indicate that the links are mirrored in both datasets.<br />
More on the interlinking methodology and statistics can be found on the [[../Interlinking|Interlinking]] page.<br />
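<br />
The arrow convention described above can be illustrated with a minimal Python sketch. It counts a link from dataset A to dataset B whenever a triple held by A has an object in B's namespace. The two namespace prefixes are taken from URLs on this page; the <code>owl:sameAs</code> triple and the function names are assumptions made up for illustration and are not part of the LODD interlinking tooling.<br />

```python
from collections import Counter

# Namespace prefixes used to decide which dataset an identifier belongs to.
# The mapping itself is an assumption for this sketch.
NAMESPACES = {
    "drugbank": "http://www4.wiwiss.fu-berlin.de/drugbank/",
    "dbpedia": "http://dbpedia.org/resource/",
}


def dataset_of(uri):
    """Return the name of the dataset whose namespace prefixes the URI, or None."""
    for name, prefix in NAMESPACES.items():
        if uri.startswith(prefix):
            return name
    return None


def count_links(triples_by_dataset):
    """Count directed links: key (A, B) means A holds triples whose object uses B's identifiers."""
    links = Counter()
    for source, triples in triples_by_dataset.items():
        for _subject, _predicate, obj in triples:
            target = dataset_of(obj)
            if target is not None and target != source:
                links[(source, target)] += 1
    return links


# Hypothetical sample data: one triple stored in the DrugBank dataset that points at DBpedia.
example = {
    "drugbank": [
        (
            "http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugs/DB01273",
            "http://www.w3.org/2002/07/owl#sameAs",
            "http://dbpedia.org/resource/Varenicline",
        ),
    ],
    "dbpedia": [],
}

links = count_links(example)
for (source, target), n in links.items():
    print(f"{source} -> {target}: {n} link(s)")
```

In this sketch the count per (A, B) pair plays the role of the line weight in the graph, and a bidirectional arrow would correspond to both (A, B) and (B, A) having non-zero counts.<br />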
<br />
The LODD datasets have been crawled by the SWSE Semantic Web search engine and can be accessed via a faceted browsing interface at [http://visinav.deri.org/hcls/] ([http://visinav.deri.org/hcls/list?keyword=varenicline Example query: Varenicline]).<br />
<br />
Most of the LODD datasets have also been integrated into the SPARQL endpoint of the HCLS Knowledge Base, see [http://esw.w3.org/topic/HCLSIG_BioRDF_Subgroup/DERI_HCLS_KB the wiki page of the HCLS KB] for further information.<br />
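<br />
The SPARQL endpoints listed in the table accept queries over HTTP. As a minimal sketch, assuming an endpoint follows the standard SPARQL protocol (a <code>query</code> parameter and an optional <code>default-graph-uri</code> parameter), a request URL can be built as follows. The helper name <code>build_sparql_url</code> and the sample query are hypothetical; the code only constructs the URL and does not contact any server, since the availability of the listed endpoints is not guaranteed.<br />

```python
from urllib.parse import urlencode


def build_sparql_url(endpoint, query, default_graph=None):
    """Build a SPARQL protocol GET URL for the given endpoint and query text."""
    params = {"query": query}
    if default_graph:
        # Optional named graph, e.g. the graph name given for RDF-TCM in the table.
        params["default-graph-uri"] = default_graph
    return endpoint + "?" + urlencode(params)


# Hypothetical query: list up to ten properties of the DBpedia Aspirin resource.
query = "SELECT ?p ?o WHERE { <http://dbpedia.org/resource/Aspirin> ?p ?o } LIMIT 10"
url = build_sparql_url("http://dbpedia.org/sparql", query)
print(url)
```

The resulting URL can be opened in a browser or fetched with any HTTP client; the response format depends on the endpoint and on the request's Accept header.<br />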
<br />
=== Bio2RDF Data Sets ===<br />
<br />
The [http://bio2rdf.org/ Bio2RDF project] has published 40 biology-, gene- and medical-related datasets (altogether 2.3 billion triples). <br />
The datasets are available via SPARQL endpoints and as Linked Data. It is recommended that you use the [http://sourceforge.net/project/platformdownload.php?group_id=142631 Bio2RDF Java Servlet], and optionally [http://quebec.bio2rdf.org/download/virtuoso/indexed/ download the databases] for efficient personal use. Running your own instance of the [http://virtuoso.openlinksw.com/wiki/main/Main/VirtEC2AMIBio2rdfInstall OpenLink Virtuoso AMI for EC2] is also an option (and for basic URI resolution doesn't require the Java Servlet, although if you want advanced queries you should still download it and configure it to query your EC2 sparql endpoint).<br />
<br />
* [http://www.freebase.com/view/user/bio2rdf/public/sparql Bio2RDF sparql endpoint list] [http://rdf.freebase.com/rdf/user/bio2rdf/public/sparql Sparql endpoint list in RDF]<br />
* [http://linkeddata.openlinksw.com:8891/pubmed:10500064 Identification of an autoimmune enteropathy-related 75-kilodalton antigen], via an [[OpenLinkSoftware|OpenLink]] hosted edition of [[Bio2Rdf]] <br />
* [http://linkeddata.openlinksw.com:8891/pubmed:9636670 Structure of the gene encoding the human cyclin-dependent kinase inhibitor p18 and mutational analysis in breast cancer], via an [[OpenLinkSoftware|OpenLink]] hosted edition of [[Bio2Rdf]] <br />
* [http://beckr.org/marbles?uri=http%3A%2F%2Fbio2rdf.org%2Fpubmed%3A9626117 PubMed article] viewed using the Marbles Linked Data browser.<br />
* [http://beckr.org/marbles?lang=en&uri=http%3A%2F%2Fbio2rdf.org%2Ffoaf%3AClemens%2C_T_L PubMed author] viewed using the Marbles Linked Data browser.<br />
* [http://beckr.org/marbles?uri=http%3A%2F%2Fbio2rdf.org%2Fomim%3A161555 OMIM Killer Cell Lectin-Like Receptor] viewed using the Marbles Linked Data browser.<br />
* [http://iws.seu.edu.cn/services/falcons/objectsearch/queryresult.jsp?query=%22KILLER+CELL%22 Falcons Search for KILLER CELL]. The Bio2RDF data has been crawled by the Falcons Semantic Web Search engine. This is an example of how people can access the data through the search engine. Falcons also offers an API that applications can use to access the data.<br />
<br />
=== Chem2bio2RDF ===<br />
<br />
* Information about the [http://chem2bio2rdf.org/ chem2bio2rdf] data sets<br />
<br />
=== Data Sets for the LODD Task ===<br />
<br />
To complement the drug-related Web of Data built by the LODD effort, the following data sets could/should also be published as Linked Data.<br />
<br />
The LODD effort is currently gathering more information about relevant datasets. See also [[/DataSetEvaluation|Evaluation of LODD Data Sets]] for current evaluation results.<br />
<br />
* [http://library.dialog.com/bluesheets/html/bl0107.html Adis R&D Insight]<br />
* [http://www.ebi.ac.uk/chebi/ ChEBI]<br />
* [http://xpdb.nist.gov/pdb/chemblast.html ChemBlast]<br />
* [http://www.chemspider.com/ ChemSpider]<br />
* [http://ClinicalTrials.gov ClinicalTrials.gov]<br />
* [http://www.citeline.com/trialtrove.html Citeline TrialTrove]<br />
* [http://dailymed.nlm.nih.gov/dailymed/about.cfm DailyMed]<br />
* [http://dbpedia.org/About DBpedia]<br />
* [http://www.nd.edu/~alb/Publication06/145-HumanDisease_PNAS-14My07-Proc/Suppl/ Diseasome]<br />
* [http://www.drugbank.ca/ Drug Bank]<br />
* [http://www.virtualref.com/abs/72.htm DrugDB]<br />
* [http://www.ncbi.nlm.nih.gov/pubmed/17921997 Drugome]<br />
* [http://lsdis.cs.uga.edu/projects/asdoc/ Drug Ontology]<br />
* [http://scientific.thomsonreuters.com/products/iddb/ Investigational Drug Database] - Proprietary<br />
* [http://www.ovid.com/site/catalog/DataBase/1244.jsp?top=2&mid=3&bottom=7&subsection=10 IMS]<br />
* [http://www.genome.jp/kegg/drug/ KEGG Drug]<br />
* [http://Lillytrials.com LillyTrials]<br />
* [http://www.fda.gov/medwatch/ MedWatch]<br />
* [http://www.fda.gov/cder/ndc/ National Drug Code]<br />
* [http://www.ncbi.nlm.nih.gov/omim/ OMIM]<br />
* [http://www.fda.gov/cder/ob/ Orange Book]<br />
* [http://www.pharmaprojects.com/ Pharmaprojects] - Proprietary<br />
* [http://pubchem.ncbi.nlm.nih.gov/ PubChem]<br />
* [http://www.nlm.nih.gov/research/umls/rxnorm/ RxNorm]<br />
* [http://www.ncbi.nlm.nih.gov/sites/entrez?Db=pubmed&Cmd=ShowDetailView&TermToSearch=15360858 VA NDF-RT]<br />
* Other data sources could include blogs, discussion boards, wikis, etc.<br />
* and.... <br />
** [http://www.who.int/globalatlas/ World Health Organization's Global Health Atlas]<br />
** [http://www.epispider.org/ EpiSPIDER]<br />
** [http://www.accessdata.fda.gov/Scripts/cder/DrugsatFDA/ Drugs@FDA - FDA Approved Drug Products]<br />
** [http://www.drugdigest.org/wps/portal/ddigest DrugDigest]<br />
** [http://humancyc.org/ HumanCyc: Encyclopedia of ''Homo sapiens'' Genes and Metabolism]<br />
** [http://www.alzforum.org/ Alzheimer Research Forum]<br />
** [http://wwwcf.nlm.nih.gov/umlslicense/rxtermApp/rxTerm.cfm RxTerms]<br />
** [http://hudine.neu.edu/ HuDiNe]<br />
** [http://wiki.medpedia.com/Cymbalta Medpedia]<br />
** [http://tcm.lifescience.ntu.edu.tw/ TCMGeneDIT] and [[Media:HCLSIG$$LODD$$Data$TCMGeneDIT_RDF_Dataset_r1.zip|RDF dump]]<br />
** [http://www.tuftsctsi.org/~/media/Files/CTSI/Library%20Files/FCC%20for%20CER%20Rpt%20to%20Pres%20and%20Congress_063009.ashx List of other possible data sources from page 66 onwards]<br />
<br />
=== Alternative Herbal Medicine use case ===<br />
* [[Data/TCMGeneDIT|TCMGeneDIT dataset]]<br />
<br />
=== Identified Based Linkage Points ===<br />
<br />
* INCHIs<br />
* [[PubChem]] Compound ID (CID)<br />
* [[PubChem]] NSC<br />
* Chemical Abstract ID (CAS)<br />
* New Drug Application (NDA)<br />
<br />
=== Data Set Attributes ===<br />
<br />
* Licensing<br />
* Data Format<br />
* Identifiers</div>Rboycehttps://www.w3.org/wiki/index.php?title=HCLSIG/LODD/Data&diff=60034HCLSIG/LODD/Data2012-07-20T11:14:36Z<p>Rboyce: /* LODD-related datasets that the LODD group already made available as Linked Data */</p>
<hr />
<div>__NOTOC__<br />
=== LODD-related datasets that the LODD group already made available as Linked Data ===<br />
<br />
{| border="1" cellpadding="2" cellspacing="0"<br />
| '''Name'''<br />
| '''Topic'''<br />
| '''Short Description'''<br />
| '''Size and coverage'''<br />
| '''Status / Activity'''<br />
| '''Example Instances'''<br />
| '''SPARQL Endpoint'''<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/drugbank/ DrugBank]<br />
| Drugs<br />
| [http://www.drugbank.ca/ Drugbank.ca] provides drug (i.e., chemical, pharmacological and pharmaceutical) data with comprehensive drug target (i.e., sequence, structure, and pathway) information ({{doi|10.1093/nar/gkj067}})<br />
| 766,920 triples; 4,800 drugs, 2,500 protein sequences<br />
| updated regularly<br />
| Varenicline [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugs/DB01273 via Marbles], [http://demo.openlinksw.com/ode/?uri=http%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fdrugbank%2Fresource%2Fdrugs%2FDB01273 via OpenLink Data Explorer] <br />
| http://www4.wiwiss.fu-berlin.de/drugbank/sparql <br />
|-<br />
| [http://linkedct.org/ LinkedCT]<br />
| Clinical Trials<br />
| Linked data source of trials from [http://clinicaltrials.gov ClinicalTrials.gov]<br />
| ~25 million triples, 106,000 trials (as of April 2011)<br />
| Updated automatically at all times, refer to [http://linkedct.org/faq/ FAQ] for more details.<br />
| [http://data.linkedct.org/resource/condition/breast-cancer/ Breast Cancer] (Condition), a [http://data.linkedct.org/resource/trial/nct00999557/ NCT00999557] (Trial), [http://data.linkedct.org/resource/city/toronto/ Toronto] (City). <br />
| http://data.linkedct.org/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/dailymed/ DailyMed]<br />
| Drugs<br />
| [http://dailymed.nlm.nih.gov/dailymed/about.cfm dailymed.nlm.nih.gov] provides information about approved prescription drugs, includes FDA approved labels (package inserts)<br />
| 164,276 triples; 4,039 drugs<br />
| updated regularly<br />
| "Sterile Water (Irrigant)" [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/dailymed/resource/drugs/492 via Marbles], [http://demo.openlinksw.com/ode/?uri=http%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fdailymed%2Fresource%2Fdrugs%2F492 via OpenLink Data Explorer]<br />
| http://www4.wiwiss.fu-berlin.de/dailymed/sparql<br />
|-<br />
| [http://dbpedia.org/About DBpedia]<br />
| Drugs/ Diseases/ Proteins<br />
| RDF data about 2.49 million things that has been extracted from Wikipedia<br />
| 218 million RDF triples; 2,300 drugs, 2,200 proteins<br />
| updated every 3 months <br />
| [http://dbpedia.org/resource/Aspirin Aspirin], [http://dbpedia.org/resource/HIV HIV]<br />
| http://dbpedia.org/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/diseasome/ Diseasome]<br />
| Diseases / Genes<br />
| [http://www.nd.edu/~alb/Publication06/145-HumanDisease_PNAS-14My07-Proc/Suppl/ Diseasome] describes characteristics of disorders and disease genes linked by known disorder–gene associations<br />
| 91,182 triples; 2,600 genes<br />
| updated 2006<br />
| Alzheimer's [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/diseasome/resource/diseases/74 via Marbles], [http://demo.openlinksw.com/ode/?uri=http%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fdiseasome%2Fresource%2Fdiseases%2F74 via OpenLink Data Explorer]<br />
| http://www4.wiwiss.fu-berlin.de/diseasome/sparql<br />
|-<br />
| [http://code.google.com/p/junsbriefcase/wiki/TGDdataset RDF-TCM]<br />
| Genes / Diseases / Medicine / Ingredients<br />
| Traditional Chinese medicine, gene, and disease association dataset, plus a linkset mapping TCM gene symbols to Entrez Gene IDs created by Neurocommons <br />
| 117,643 <br />
| updated August 2009 (stable)<br />
| [http://purl.org/net/tcm/tcm.lifescience.ntu.edu.tw/id/medicine/Ginkgo_biloba Ginkgo biloba] <br />
| http://hcls.deri.org/sparql; graph name: http://hcls.deri.org/resource/graph/tcm<br />
|-<br />
| [http://www.nlm.nih.gov/research/umls/rxnorm/ RxNorm]<br />
| Drugs<br />
| A linked version of the NLM's RxNorm database that connects prescription drugs, ingredients, and NDC through the RXCUI, a concept unique identifier. RxNorm is a product developed by NIH’s National Library of Medicine. It currently interlinks 12 different drug vocabularies around a unique concept identifier. Due to licensing, only six of the drug vocabularies are made available as part of the LODD cloud: Medical Subject Headings, Metathesaurus FDA National Drug Code Directory, Metathesaurus FDA Structured Product Labels, National Drug File, RxNorm Vocabulary, and Veterans Health Administration National Drug File.<br />
Links are provided connecting RxNorm to DrugBank and to the UMLS.
|over 7.7 million triples; 165,806 RXCUI (Concept Unique Identifiers) Unique drugs and ingredients; 332,754 RXAUI (Atomic Unique Identifiers) sourced terms<br />
| Based on 3/2010 Rxnorm Release; Last updated 5/2010<br />
| [http://link.informatics.stonybrook.edu/rxnorm/RXAUI/2994963 Singulair from the Metathesaurus FDA Structured Product Labels]<br />
| http://link.informatics.stonybrook.edu/sparql/<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/sider/ SIDER]<br />
| Diseases / Side Effects<br />
| [http://sideeffects.embl.de/ SIDER] contains information on marketed drugs and their adverse effects ({{doi|10.1038/msb.2009.98}})<br />
| 192,515 triples; 63,000 adverse effect reports, 1,737 genes<br />
| updated 2009<br />
| Confusion [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/sider/resource/side_effects/C0009676 via Marbles]<br />
| http://www4.wiwiss.fu-berlin.de/sider/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/stitch/ STITCH]<br />
| Chemicals / Proteins<br />
| [http://stitch.embl.de/ STITCH] contains information on chemicals, proteins, and their interactions ({{doi|10.1093/nar/gkm795}})<br />
| 7,500,000 chemicals; 500,000 proteins; 370 organisms <br />
| updated July 2009<br />
| Lactose [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/stitch/resource/chemicals/CID000000294 via Marbles]<br />
| http://www4.wiwiss.fu-berlin.de/stitch/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/medicare/ Medicare]<br />
| Medicare Formulary<br />
| xxx<br />
| xxx<br />
| xxx<br />
| xxx<br />
| http://www4.wiwiss.fu-berlin.de/medicare/sparql<br />
|-<br />
| [http://www.ebi.ac.uk/chembl/ ChEMBL]<br />
| Chemical / Assays (Proteins, Organisms) / Papers<br />
| [http://www.ebi.ac.uk/chembl/ ChEMBL] contains information on trial drugs, including their activity against targets such as (but not limited to) proteins. All data is backed up by and linked to literature. Includes links to Bio2RDF for ChEBI and Uniprot. License: CC-BY-SA.<br />
| ~24M triples<br />
| Updated 2010-01<br />
| An [http://rdf.farmbio.uu.se/chembl/snorql/?describe=http://rdf.farmbio.uu.se/chembl/activity/a2642163 IC50 activity].<br />
| http://rdf.farmbio.uu.se/chembl/sparql<br />
|-<br />
| [http://apps.who.int/ghodata/ WHO's Global Health Observatory (GHO)]<br />
| Infectious Diseases /Demography / Socioeconomic Conditions / Environmental Factors<br />
| Data and statistics for infectious diseases at country, regional, and global levels<br />
| ~3M triples<br />
| Updated 2012-05<br />
| xxx<br />
| http://gho.aksw.org<br />
|-<br />
| [http://nlp.dbmi.pitt.edu/nlprepository.html University of Pittsburgh NLP Repository] <br />
| Drugs / Procedures / Diagnoses<br />
| A semantic index of concepts present in 800 full-text clinical notes from the University of Pittsburgh NLP Repository<br />
| 38,664<br />
| Proof of concept -- Updated 02/25/2011<br />
| [http://dbmi-icode-01.dbmi.pitt.edu:8080/sparql?default-graph-uri=&query=SELECT+DISTINCT+%3Fproperty+%3FhasValue+%3FisValueOf%0D%0AWHERE+{%0D%0A++{+%3Chttp%3A%2F%2Fpurl.org%2Fnet%2Fnlprepository%2Ftest%23report_1460%3E+%3Fproperty+%3FhasValue+}%0D%0A++UNION%0D%0A++{+%3FisValueOf+%3Fproperty+%3Chttp%3A%2F%2Fpurl.org%2Fnet%2Fnlprepository%2Ftest%23report_1460%3E+}%0D%0A}%0D%0AORDER+BY+%28!BOUND%28%3FhasValue%29%29+%3Fproperty+%3FhasValue+%3FisValueOf%0D%0A&format=text%2Fhtml&debug=on&timeout= Concepts from a sample radiology report]<br />
| [http://dbmi-icode-01.dbmi.pitt.edu:8080/sparql http://dbmi-icode-01.dbmi.pitt.edu:8080/sparql]<br />
|-<br />
| [http://dbmi-icode-01.dbmi.pitt.edu/dikb-evidence/front-page.html The Drug Interaction Knowledge Base] <br />
| Drugs / Metabolic Inhibition Drug-drug Interactions (DDIs) / Claims and Evidence for drug mechanisms and DDIs<br />
| A D2R server of more than 60 drugs currently in the DIKB<br />
| 41,480<br />
| Updated 03/01/2012<br />
| [http://dbmi-icode-01.dbmi.pitt.edu:2020/page/Drugs/paroxetine paroxetine], [http://dbmi-icode-01.dbmi.pitt.edu:2020/page/Drugs/atorvastatin atorvastatin]<br />
| http://dbmi-icode-01.dbmi.pitt.edu:2020/<br />
|-<br />
| [http://purl.org/net/nlprepository/linkedSPLs Linked Structured Product Labels] <br />
| Selected sections from the FDA-approved Structured Product Labels (SPLs) for currently marketed drugs<br />
| A D2R server rendering properly encoded unstructured text from the SPLs<br />
| 470,000 triples, 4000+ drugs, 17,000+ product labels<br />
| Updated every Thursday using information from the DailyMed RSS feed<br />
| [http://thedatahub.org/dataset/linked-structured-product-labels/resource/918071ae-f570-4728-98b7-5b447ab42ab8 SPL for Venlafaxine Hydrochloride (American Health Packaging)]<br />
| http://purl.org/net/nlprepository/linkedSPLs<br />
|}<br />
<br />
[[File:2010-12-04_lodd_cloud.png|600px]]<br />
<br />
A graph of some of the LODD datasets (dark grey), related biomedical datasets (light grey), related general-purpose datasets (white) and their interconnections. Line weights correspond to the number of links. The direction of an arrow indicates the dataset that contains the links, e.g., an arrow from A to B means that dataset A contains RDF triples that use identifiers from B. Bidirectional arrows usually indicate that the links are mirrored in both datasets.<br />
More on the interlinking methodology and statistics can be found on the [[../Interlinking|Interlinking]] page.<br />
<br />
The LODD datasets have been crawled by the SWSE Semantic Web search engine and can be accessed via a faceted browsing interface at [http://visinav.deri.org/hcls/] ([http://visinav.deri.org/hcls/list?keyword=varenicline Example query: Varenicline]).<br />
<br />
Most of the LODD datasets have also been integrated into the SPARQL endpoint of the HCLS Knowledge Base, see [http://esw.w3.org/topic/HCLSIG_BioRDF_Subgroup/DERI_HCLS_KB the wiki page of the HCLS KB] for further information.<br />
<br />
=== Bio2RDF Data Sets ===<br />
<br />
The [http://bio2rdf.org/ Bio2RDF project] has published 40 biology-, gene- and medical-related datasets (altogether 2.3 billion triples). <br />
The datasets are available via SPARQL endpoints and as Linked Data. It is recommended that you use the [http://sourceforge.net/project/platformdownload.php?group_id=142631 Bio2RDF Java Servlet], and optionally [http://quebec.bio2rdf.org/download/virtuoso/indexed/ download the databases] for efficient personal use. Running your own instance of the [http://virtuoso.openlinksw.com/wiki/main/Main/VirtEC2AMIBio2rdfInstall OpenLink Virtuoso AMI for EC2] is also an option (and for basic URI resolution doesn't require the Java Servlet, although if you want advanced queries you should still download it and configure it to query your EC2 sparql endpoint).<br />
<br />
* [http://www.freebase.com/view/user/bio2rdf/public/sparql Bio2RDF sparql endpoint list] [http://rdf.freebase.com/rdf/user/bio2rdf/public/sparql Sparql endpoint list in RDF]<br />
* [http://linkeddata.openlinksw.com:8891/pubmed:10500064 Identification of an autoimmune enteropathy-related 75-kilodalton antigen], via an [[OpenLinkSoftware|OpenLink]] hosted edition of [[Bio2Rdf]] <br />
* [http://linkeddata.openlinksw.com:8891/pubmed:9636670 Structure of the gene encoding the human cyclin-dependent kinase inhibitor p18 and mutational analysis in breast cancer], via an [[OpenLinkSoftware|OpenLink]] hosted edition of [[Bio2Rdf]] <br />
* [http://beckr.org/marbles?uri=http%3A%2F%2Fbio2rdf.org%2Fpubmed%3A9626117 PubMed article] viewed using the Marbles Linked Data browser.<br />
* [http://beckr.org/marbles?lang=en&uri=http%3A%2F%2Fbio2rdf.org%2Ffoaf%3AClemens%2C_T_L PubMed author] viewed using the Marbles Linked Data browser.<br />
* [http://beckr.org/marbles?uri=http%3A%2F%2Fbio2rdf.org%2Fomim%3A161555 OMIM Killer Cell Lectin-Like Receptor] viewed using the Marbles Linked Data browser.<br />
* [http://iws.seu.edu.cn/services/falcons/objectsearch/queryresult.jsp?query=%22KILLER+CELL%22 Falcons Search for KILLER CELL]. The Bio2RDF data has been crawled by the Falcons Semantic Web Search engine. This is an example of how people can access the data through the search engine. Falcons also offers an API that applications can use to access the data.<br />
<br />
=== Chem2bio2RDF ===<br />
<br />
* Information about the [http://chem2bio2rdf.org/ chem2bio2rdf] data sets<br />
<br />
=== Data Sets for the LODD Task ===<br />
<br />
To complement the drug-related Web of Data built by the LODD effort, the following data sets could/should also be published as Linked Data.<br />
<br />
The LODD effort is currently gathering more information about relevant datasets. See also [[/DataSetEvaluation|Evaluation of LODD Data Sets]] for current evaluation results.<br />
<br />
* [http://library.dialog.com/bluesheets/html/bl0107.html Adis R&D Insight]<br />
* [http://www.ebi.ac.uk/chebi/ ChEBI]<br />
* [http://xpdb.nist.gov/pdb/chemblast.html ChemBlast]<br />
* [http://www.chemspider.com/ ChemSpider]<br />
* [http://ClinicalTrials.gov ClinicalTrials.gov]<br />
* [http://www.citeline.com/trialtrove.html Citeline TrialTrove]<br />
* [http://dailymed.nlm.nih.gov/dailymed/about.cfm DailyMed]<br />
* [http://dbpedia.org/About DBpedia]<br />
* [http://www.nd.edu/~alb/Publication06/145-HumanDisease_PNAS-14My07-Proc/Suppl/ Diseasome]<br />
* [http://www.drugbank.ca/ Drug Bank]<br />
* [http://www.virtualref.com/abs/72.htm DrugDB]<br />
* [http://www.ncbi.nlm.nih.gov/pubmed/17921997 Drugome]<br />
* [http://lsdis.cs.uga.edu/projects/asdoc/ Drug Ontology]<br />
* [http://scientific.thomsonreuters.com/products/iddb/ Investigational Drug Database] - Proprietary<br />
* [http://www.ovid.com/site/catalog/DataBase/1244.jsp?top=2&mid=3&bottom=7&subsection=10 IMS]<br />
* [http://www.genome.jp/kegg/drug/ KEGG Drug]<br />
* [http://Lillytrials.com LillyTrials]<br />
* [http://www.fda.gov/medwatch/ MedWatch]<br />
* [http://www.fda.gov/cder/ndc/ National Drug Code]<br />
* [http://www.ncbi.nlm.nih.gov/omim/ OMIM]<br />
* [http://www.fda.gov/cder/ob/ Orange Book]<br />
* [http://www.pharmaprojects.com/ Pharmaprojects] - Proprietary<br />
* [http://pubchem.ncbi.nlm.nih.gov/ PubChem]<br />
* [http://www.nlm.nih.gov/research/umls/rxnorm/ RxNorm]<br />
* [http://www.ncbi.nlm.nih.gov/sites/entrez?Db=pubmed&Cmd=ShowDetailView&TermToSearch=15360858 VA NDF-RT]<br />
* Other data sources could include blogs, discussion boards, wikis, etc.<br />
* and.... <br />
** [http://www.who.int/globalatlas/ World Health Organization's Global Health Atlas]<br />
** [http://www.epispider.org/ EpiSPIDER]<br />
** [http://www.accessdata.fda.gov/Scripts/cder/DrugsatFDA/ Drugs@FDA - FDA Approved Drug Products]<br />
** [http://www.drugdigest.org/wps/portal/ddigest DrugDigest]<br />
** [http://humancyc.org/ HumanCyc: Encyclopedia of ''Homo sapiens'' Genes and Metabolism]<br />
** [http://www.alzforum.org/ Alzheimer Research Forum]<br />
** [http://wwwcf.nlm.nih.gov/umlslicense/rxtermApp/rxTerm.cfm RxTerms]<br />
** [http://hudine.neu.edu/ HuDiNe]<br />
** [http://wiki.medpedia.com/Cymbalta Medpedia]<br />
** [http://tcm.lifescience.ntu.edu.tw/ TCMGeneDIT] and [[Media:HCLSIG$$LODD$$Data$TCMGeneDIT_RDF_Dataset_r1.zip|RDF dump]]<br />
** [http://www.tuftsctsi.org/~/media/Files/CTSI/Library%20Files/FCC%20for%20CER%20Rpt%20to%20Pres%20and%20Congress_063009.ashx List of other possible data sources from page 66 onwards]<br />
<br />
=== Alternative Herbal Medicine use case ===<br />
* [[Data/TCMGeneDIT|TCMGeneDIT dataset]]<br />
<br />
=== Identifier-Based Linkage Points ===<br />
<br />
* InChIs<br />
* [[PubChem]] Compound ID (CID)<br />
* [[PubChem]] NSC<br />
* Chemical Abstracts Service (CAS) Registry Number<br />
* New Drug Application (NDA)<br />
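<br />
The linkage points above can be read as join keys: two datasets are interlinked wherever their records carry the same identifier value. The following is a minimal sketch of that idea, not the LODD group's actual interlinking method. The record layout, the field name <code>pubchem_cid</code>, and the label IDs are illustrative assumptions.<br />

```python
# Hypothetical records; field names and label IDs are illustrative only.
drug_records = [
    {"name": "aspirin", "pubchem_cid": "2244"},
    {"name": "caffeine", "pubchem_cid": "2519"},
]
label_records = [
    {"label_id": "label-1", "pubchem_cid": "2244"},
    {"label_id": "label-2", "pubchem_cid": "9999"},
]


def link_on_identifier(left, right, key):
    """Return (left_record, right_record) pairs that share the same value for `key`."""
    # Index the right-hand records by identifier so each lookup is constant time.
    index = {}
    for record in right:
        value = record.get(key)
        if value is not None:
            index.setdefault(value, []).append(record)
    # Pair every left-hand record with all right-hand records using the same identifier.
    pairs = []
    for record in left:
        for match in index.get(record.get(key), []):
            pairs.append((record, match))
    return pairs


links = link_on_identifier(drug_records, label_records, "pubchem_cid")
```

The same function works for any of the identifier types listed, provided both datasets store the identifier in a normalized form (for example, the same InChI version or a CAS number with consistent hyphenation).<br />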
<br />
=== Data Set Attributes ===<br />
<br />
* Licensing<br />
* Data Format<br />
* Identifiers</div>Rboycehttps://www.w3.org/wiki/index.php?title=HCLSIG/Tools&diff=59246HCLSIG/Tools2012-05-30T13:11:24Z<p>Rboyce: /* CSV2RDF4LOD */</p>
<hr />
<div>Know of a tool? Tweet it using #hclstool ! We will add the tool when we periodically aggregate tweets with that hashtag.<br />
<br />
= General support for Semantic Web applications =<br />
<br />
== [http://lod2.eu/WikiArticle/TechnologyStack.html LOD2 Stack] ==<br />
LOD2 stack is a collection of tools contributed by [http://lod2.eu/WikiArticle/Project.html LOD2] members.<br />
<br />
== [http://code.google.com/p/callimachus/ Callimachus] ==<br />
Callimachus is a framework for data-driven applications based on [http://www.w3.org/standards/semanticweb/data Linked Data].<br />
<br />
== [http://clarkparsia.com/pellet/icv Pellet Integrity Constraint Validator] ==<br />
* Pellet ICV is a modified version of Pellet that works under the Closed World Assumption. It lets you define an ontology that acts as a schema and then use Pellet to validate RDF data against it. An example using SKOS: http://weblog.clarkparsia.com/2010/04/14/pellet-icv-04-release-using-owl-integrity-constraints-to-validate-skos <br />
<br />
== [http://jrdf.sourceforge.net/ JRDF - An RDF Library in Java] ==<br />
* Note from author: "From May the 8th 2011, I've set the status of the project to inactive. Mainly due to lack of interest and contribution - no further development is taking place." <br />
* JRDF is an attempt to create a standard set of APIs and base implementations for RDF (Resource Description Framework) using the latest version of the Java language. <br />
<br />
= Tools to make relational data accessible via SPARQL =<br />
<br />
== [http://sourceforge.net/apps/mediawiki/swobjects/index.php?title=Main_Page SWObjects] ([[/SWObjects|usage]]) ==<br />
* SWObjects uses a query rewriting approach to make SQL data accessible via a SPARQL endpoint.<br />
* SWObjects creates maps from SPARQL Construct statements that act as translation rules from SPARQL to SQL, as well as SPARQL to SPARQL.<br />
* Federation support: Maps will automatically dispatch queries to the appropriate graph/endpoint in federated applications.<br />
<br />
== [http://www4.wiwiss.fu-berlin.de/bizer/d2r-server/ D2R server] ==<br />
<br />
D2R Server is a tool for publishing relational databases on the Semantic Web.<br />
It enables RDF and HTML browsers to navigate the content of the database,<br />
and allows applications to query the database using the SPARQL query language.<br />
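<br />
To make the idea of publishing relational data as RDF concrete, here is a minimal sketch that turns table rows into subject/predicate/object triples. It shows only the general row-to-triple pattern and is not D2R Server's mapping language or implementation. The base URI, the URI layout, and the <code>drugs</code> table are assumptions made for this example.<br />

```python
import sqlite3

BASE = "http://example.org/resource/"  # hypothetical base URI, not a real service


def rows_to_triples(connection, table, key_column):
    """Yield (subject, predicate, object) tuples, one per non-key column value.

    `table` is interpolated into the SQL text, so it must come from trusted code.
    """
    cursor = connection.execute(f"SELECT * FROM {table}")
    columns = [description[0] for description in cursor.description]
    for row in cursor:
        record = dict(zip(columns, row))
        # One subject URI per row, derived from the primary key.
        subject = f"{BASE}{table}/{record[key_column]}"
        for column, value in record.items():
            if column != key_column and value is not None:
                # One predicate URI per column (assumed naming scheme).
                yield (subject, f"{BASE}vocab/{table}_{column}", str(value))


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE drugs (id TEXT PRIMARY KEY, name TEXT)")
connection.execute("INSERT INTO drugs VALUES ('DB01273', 'Varenicline')")
triples = list(rows_to_triples(connection, "drugs", "id"))
```

A real deployment would answer SPARQL queries by translating them to SQL on demand rather than materializing every triple as this sketch does.<br />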
<br />
== [http://mayor2.dia.fi.upm.es/oeg-upm/index.php/en/downloads/9-r2o-odemapster ODEMapster] ==<br />
<br />
* R2O & ODEMapster is an integrated framework for the formal specification, evaluation, verification and exploitation of the semantic mappings between ontologies and relational databases. <br />
* ODEMapster is a NeOn plugin that offers a GUI for building mappings between an RDBMS and an ontology. It can also execute such mappings and populate the ontology to create a Linked Data KB. <br />
<br />
= Tools that support converting non-relational data to RDF =<br />
<br />
== [http://www.io-informatics.com/products/index.html Sentient Knowledge Explorer] ==<br />
* interactive graphics for selection of desired output for automated SPARQL query builder<br />
* automatic import wizards for mapping from common formats to RDF<br />
<br />
== [http://topquadrant.com/products/TB_Composer.html TopBraid Composer] ==<br />
* imports XML with automated mappings to RDF, with provenance, in the Maestro Edition<br />
* is a versatile tool with many features for building and inspecting RDF and OWL, as well as publishing SPARQL access to data<br />
<br />
== [http://lab.linkeddata.deri.ie/2010/grefine-rdf-extension/ RDF plugin] to [http://code.google.com/p/google-refine/ Google Refine] ==<br />
* Good for getting a sense of what’s in the data (or a sample of the data where scale is large).<br />
* Enables reconciliation of data with Freebase, Sindice, and other SPARQL endpoints such as NCBO<br />
* Enables description of data in terms of predicates retrieved from prefix.cc<br />
* Possibility to specify which ontologies should be used to describe the data<br />
<br />
== [https://github.com/timrdf/csv2rdf4lod-automation/wiki/ CSV2RDF4LOD] ==<br />
* https://github.com/timrdf/csv2rdf4lod-automation/wiki/Examples <br />
* It is implemented to handle arbitrary row counts and was found to work with data that has 3,949,400 rows<br />
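<br />
As a rough illustration of tabular-to-RDF conversion, the sketch below streams a CSV file one row at a time and emits N-Triples lines, which is one way to stay independent of the row count. It is not how CSV2RDF4LOD itself works. The namespace and URI layout are hypothetical, and the code assumes column names and ID values are safe to embed in an IRI.<br />

```python
import csv
import io

BASE = "http://example.org/"  # hypothetical namespace


def escape_literal(text):
    """Escape backslashes, quotes, and line breaks for an N-Triples string literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def csv_to_ntriples(stream, id_column):
    """Yield one N-Triples line per non-empty cell, reading the CSV row by row."""
    reader = csv.DictReader(stream)
    for row in reader:
        subject = f"<{BASE}row/{row[id_column]}>"
        for column, value in row.items():
            if column == id_column or not value:
                continue
            yield f'{subject} <{BASE}prop/{column}> "{escape_literal(value)}" .'


sample = io.StringIO("id,name\n1,Aspirin\n2,Ibuprofen\n")
lines = list(csv_to_ntriples(sample, "id"))
```

Because <code>csv_to_ntriples</code> is a generator, a caller can write each line to disk as it is produced instead of collecting them in a list as the demo does.<br />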
<br />
== [http://www.sysmo-db.org/rightfield Rightfield] == <br />
* from the [http://genetics-ecology.univie.ac.at/sysmo.html EU SysMO project]<br />
* Create spreadsheet templates for input using ontologies as controlled vocabularies. Spreadsheet entries then contain unambiguous identifiers and are easier to convert to RDF.<br />
<br />
== [http://dblab.cs.toronto.edu/project/xcurator/ xCurator] ==<br />
<br />
* The xCurator project offers an end-to-end framework to transform a semi-structured (XML) source into high-quality Linked Data.<br />
* Used by the new LinkedCT (http://linkedct.org/). Thanks to xCurator, the data now exceeds 25 million triples (previously only 7 million), is of much higher quality, and is kept up to date at all times.<br />
* Paper describing the framework and initial results: Linking Semistructured Data on the Web (WebDB2011 at SIGMOD)<br />
* A little demo available online, but the code is still under development and not released yet http://dblab.cs.toronto.edu/project/xcurator/</div>Rboycehttps://www.w3.org/wiki/index.php?title=HCLSIG/LODD/Data&diff=59215HCLSIG/LODD/Data2012-05-26T12:55:29Z<p>Rboyce: /* LODD-related datasets that the LODD group already made available as Linked Data */</p>
<hr />
<div>__NOTOC__<br />
=== LODD-related datasets that the LODD group already made available as Linked Data ===<br />
<br />
{| border="1" cellpadding="2" cellspacing="0"<br />
| '''Name'''<br />
| '''Topic'''<br />
| '''Short Description'''<br />
| '''Size and coverage'''<br />
| '''Status / Activity'''<br />
| '''Example Instances'''<br />
| '''SPARQL Endpoint'''<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/drugbank/ DrugBank]<br />
| Drugs<br />
| [http://www.drugbank.ca/ Drugbank.ca] provides drug (i.e., chemical, pharmacological and pharmaceutical) data with comprehensive drug target (i.e., sequence, structure, and pathway) information ({{doi|10.1093/nar/gkj067}})<br />
| 766,920 triples; 4,800 drugs, 2,500 protein sequences<br />
| updated regularly<br />
| Varenicline [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugs/DB01273 via Marbles], [http://demo.openlinksw.com/ode/?uri=http%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fdrugbank%2Fresource%2Fdrugs%2FDB01273 via OpenLink Data Explorer] <br />
| http://www4.wiwiss.fu-berlin.de/drugbank/sparql <br />
|-<br />
| [http://linkedct.org/ LinkedCT]<br />
| Clinical Trials<br />
| Linked data source of trials from [http://clinicaltrials.gov ClinicalTrials.gov]<br />
| ~25 million triples, 106,000 trials (as of April 2011)<br />
| Updated automatically at all times, refer to [http://linkedct.org/faq/ FAQ] for more details.<br />
| [http://data.linkedct.org/resource/condition/breast-cancer/ Breast Cancer] (Condition), [http://data.linkedct.org/resource/trial/nct00999557/ NCT00999557] (Trial), [http://data.linkedct.org/resource/city/toronto/ Toronto] (City). <br />
| http://data.linkedct.org/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/dailymed/ DailyMed]<br />
| Drugs<br />
| [http://dailymed.nlm.nih.gov/dailymed/about.cfm dailymed.nlm.nih.gov] provides information about approved prescription drugs, including FDA-approved labels (package inserts)<br />
| 164,276 triples; 4,039 drugs<br />
| updated regularly<br />
| "Sterile Water (Irrigant)" [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/dailymed/resource/drugs/492 via Marbles], [http://demo.openlinksw.com/ode/?uri=http%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fdailymed%2Fresource%2Fdrugs%2F492 via OpenLink Data Explorer]<br />
| http://www4.wiwiss.fu-berlin.de/dailymed/sparql<br />
|-<br />
| [http://dbpedia.org/About DBpedia]<br />
| Drugs / Diseases / Proteins<br />
| RDF data about 2.49 million things that has been extracted from Wikipedia<br />
| 218 million RDF triples; 2,300 drugs, 2,200 proteins<br />
| updated every 3 months <br />
| [http://dbpedia.org/resource/Aspirin Aspirin], [http://dbpedia.org/resource/HIV HIV]<br />
| http://dbpedia.org/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/diseasome/ Diseasome]<br />
| Diseases / Genes<br />
| [http://www.nd.edu/~alb/Publication06/145-HumanDisease_PNAS-14My07-Proc/Suppl/ Diseasome] describes characteristics of disorders and disease genes linked by known disorder–gene associations<br />
| 91,182 triples; 2,600 genes<br />
| updated 2006<br />
| Alzheimer's [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/diseasome/resource/diseases/74 via Marbles], [http://demo.openlinksw.com/ode/?uri=http%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fdiseasome%2Fresource%2Fdiseases%2F74 via OpenLink Data Explorer]<br />
| http://www4.wiwiss.fu-berlin.de/diseasome/sparql<br />
|-<br />
| [http://code.google.com/p/junsbriefcase/wiki/TGDdataset RDF-TCM]<br />
| Genes / Diseases / Medicine / Ingredients<br />
| Traditional Chinese medicine, gene and disease association dataset and a linkset mapping TCM gene symbols to Entrez Gene IDs created by Neurocommons <br />
| 117,643 <br />
| updated August 2009 (stable)<br />
| [http://purl.org/net/tcm/tcm.lifescience.ntu.edu.tw/id/medicine/Ginkgo_biloba Ginkgo biloba] <br />
| http://hcls.deri.org/sparql; graph name: http://hcls.deri.org/resource/graph/tcm<br />
|-<br />
| [http://www.nlm.nih.gov/research/umls/rxnorm/ RxNorm]<br />
| Drugs<br />
| A linked version of the NLM's RxNorm database that connects prescription drugs, ingredients, and NDCs through the RXCUI, a concept unique identifier. RxNorm is a product developed by NIH’s National Library of Medicine. It currently interlinks 12 different drug vocabularies around a unique concept identifier. Due to licensing, only six of the drug vocabularies are made available as part of the LODD cloud: Medical Subject Headings, Metathesaurus FDA National Drug Code Directory, Metathesaurus FDA Structured Product Labels, National Drug File, RxNorm Vocabulary, and Veterans Health Administration National Drug File.<br />
Links are provided connecting RxNorm to DrugBank and to the UMLS.<br />
| over 7.7 million triples; 165,806 RXCUI (Concept Unique Identifiers) unique drugs and ingredients; 332,754 RXAUI (Atomic Unique Identifiers) sourced terms<br />
| Based on 3/2010 RxNorm release; last updated 5/2010<br />
| [http://link.informatics.stonybrook.edu/rxnorm/RXAUI/2994963 Singulair from the Metathesaurus FDA Structured Product Labels]<br />
| http://link.informatics.stonybrook.edu/sparql/<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/sider/ SIDER]<br />
| Diseases / Side Effects<br />
| [http://sideeffects.embl.de/ SIDER] contains information on marketed drugs and their adverse effects ({{doi|10.1038/msb.2009.98}})<br />
| 192,515 triples; 63,000 adverse effect reports, 1,737 genes<br />
| updated 2009<br />
| Confusion [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/sider/resource/side_effects/C0009676 via Marbles]<br />
| http://www4.wiwiss.fu-berlin.de/sider/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/stitch/ STITCH]<br />
| Chemicals / Proteins<br />
| [http://stitch.embl.de/ STITCH] contains information on chemicals, proteins, and their interactions ({{doi|10.1093/nar/gkm795}})<br />
| 7,500,000 chemicals; 500,000 proteins; 370 organisms <br />
| updated July 2009<br />
| Lactose [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/stitch/resource/chemicals/CID000000294 via Marbles]<br />
| http://www4.wiwiss.fu-berlin.de/stitch/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/medicare/ Medicare]<br />
| Medicare Formulary<br />
| xxx<br />
| xxx<br />
| xxx<br />
| xxx<br />
| http://www4.wiwiss.fu-berlin.de/medicare/sparql<br />
|-<br />
| [http://www.ebi.ac.uk/chembl/ ChEMBL]<br />
| Chemical / Assays (Proteins, Organisms) / Papers<br />
| [http://www.ebi.ac.uk/chembl/ ChEMBL] contains information on trial drugs, including their activity against targets such as (but not limited to) proteins. All data is backed up by and linked to the literature. Includes links to Bio2RDF for ChEBI and UniProt. License: CC-BY-SA.<br />
| ~24M triples<br />
| Updated 2010-01<br />
| An [http://rdf.farmbio.uu.se/chembl/snorql/?describe=http://rdf.farmbio.uu.se/chembl/activity/a2642163 IC50 activity].<br />
| http://rdf.farmbio.uu.se/chembl/sparql<br />
|-<br />
| [http://www.who.int/gho/en/index.html WHO Global Health Observatory]<br />
| Infectious Diseases / Demography / Socioeconomic Conditions / Environmental Factors<br />
| Data and statistics for infectious diseases at country, regional, and global levels<br />
| 354,300<br />
| Updated 2010-09<br />
| xxx<br />
| http://aksw.org/Projects/Stats2RDF<br />
|-<br />
| [http://nlp.dbmi.pitt.edu/nlprepository.html University of Pittsburgh NLP Repository] <br />
| Drugs / Procedures / Diagnoses<br />
| A semantic index of concepts present in 800 full-text clinical notes from the University of Pittsburgh NLP Repository<br />
| 38,664<br />
| Proof of concept -- Updated 02/25/2011<br />
| [http://tarski.duhs.org:8080/openrdf-workbench/repositories/u-pitt-blulab/explore?resource=bl%3Areport_404 Concepts from a sample radiology report]<br />
| [http://tarski.duhs.org:8080/openrdf-workbench/repositories/u-pitt-blulab/query http://tarski.duhs.org:8080/openrdf-workbench/repositories/u-pitt-blulab/query]<br />
|-<br />
| [http://dbmi-icode-01.dbmi.pitt.edu/dikb-evidence/front-page.html The Drug Interaction Knowledge Base] <br />
| Drugs / Metabolic Inhibition Drug-drug Interactions (DDIs) / Claims and Evidence for drug mechanisms and DDIs<br />
| A D2R server of more than 60 drugs currently in the DIKB<br />
| 41,480<br />
| Updated 03/01/2012<br />
| [http://dbmi-icode-01.dbmi.pitt.edu:2020/page/Drugs/paroxetine paroxetine], [http://dbmi-icode-01.dbmi.pitt.edu:2020/page/Drugs/atorvastatin atorvastatin]<br />
| http://dbmi-icode-01.dbmi.pitt.edu:2020/<br />
|-<br />
| [http://purl.org/net/nlprepository/linkedSPLs Linked Structured Product Labels] <br />
| Selected sections from the FDA-approved Structured Product Labels (SPLs) for currently marketed drugs<br />
| A D2R server rendering properly encoded unstructured text from the SPLs<br />
| 470,000 triples, 4000+ drugs, 17,000+ product labels<br />
| Updated 05/26/2012<br />
| [http://thedatahub.org/dataset/linked-structured-product-labels/resource/918071ae-f570-4728-98b7-5b447ab42ab8 SPL for Venlafaxine Hydrochloride (American Health Packaging)]<br />
| http://purl.org/net/nlprepository/linkedSPLs<br />
|}<br />
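<br />
Each endpoint in the table accepts queries over the SPARQL Protocol, in which a GET request carries the query text in a <code>query</code> parameter. The sketch below only builds such a request URL and sends nothing over the network. The helper name is an assumption for this example, and the endpoint URLs listed here reflect the state of the services when the table was written, so they may no longer resolve.<br />

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_sparql_get_url(endpoint, query):
    """Return a SPARQL Protocol GET URL with the query text percent-encoded."""
    # Append to an existing query string if the endpoint URL already has one.
    separator = "&" if "?" in endpoint else "?"
    return endpoint + separator + urlencode({"query": query})


QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"
url = build_sparql_get_url("http://dbpedia.org/sparql", QUERY)

# Decoding the URL again recovers the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
```

To run the query, pass the URL to any HTTP client and set an <code>Accept</code> header such as <code>application/sparql-results+json</code> to request a results format.<br />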
<br />
[[File:2010-12-04_lodd_cloud.png|600px]]<br />
<br />
A graph of some of the LODD datasets (dark grey), related biomedical datasets (light grey), related general-purpose datasets (white) and their interconnections. Line weights correspond to the number of links. The direction of an arrow indicates the dataset that contains the links, e.g., an arrow from A to B means that dataset A contains RDF triples that use identifiers from B. Bidirectional arrows usually indicate that the links are mirrored in both datasets.<br />
More on the interlinking methodology and statistics can be found on the [[../Interlinking|Interlinking]] page.<br />
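<br />
The arrow convention described above (an arrow from A to B means that A contains triples using identifiers from B) can be computed directly from the data. The following is a minimal sketch assuming each dataset is identified by the host name of its URIs. That is a simplification for illustration, not the method used to produce the figure, and the example hosts are hypothetical.<br />

```python
from collections import Counter
from urllib.parse import urlsplit


def count_links(dataset_triples):
    """Count outgoing links per (source_host, target_host) pair.

    `dataset_triples` maps the host name of a dataset to the triples it contains.
    A triple counts as a link when its object is a URI on a different host.
    """
    links = Counter()
    for source_host, triples in dataset_triples.items():
        for _subject, _predicate, obj in triples:
            target_host = urlsplit(obj).netloc  # empty for plain literals
            if target_host and target_host != source_host:
                links[(source_host, target_host)] += 1
    return links


# Hypothetical datasets A and B; only A contains a triple pointing at B.
datasets = {
    "a.example.org": [
        ("http://a.example.org/drug/1",
         "http://www.w3.org/2002/07/owl#sameAs",
         "http://b.example.org/drug/9"),
        ("http://a.example.org/drug/1",
         "http://www.w3.org/2000/01/rdf-schema#label",
         "Example drug"),
    ],
    "b.example.org": [],
}
link_counts = count_links(datasets)
```

In the figure's terms, each key of <code>link_counts</code> is one arrow and its value is the line weight. A bidirectional arrow corresponds to both <code>(A, B)</code> and <code>(B, A)</code> being present.<br />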
<br />
The LODD datasets have been crawled by the SWSE Semantic Web search engine and can be accessed via a faceted browsing interface at [http://visinav.deri.org/hcls/] ([http://visinav.deri.org/hcls/list?keyword=varenicline Example query: Varenicline]).<br />
<br />
Most of the LODD datasets have also been integrated into the SPARQL endpoint of the HCLS Knowledge Base, see [http://esw.w3.org/topic/HCLSIG_BioRDF_Subgroup/DERI_HCLS_KB the wiki page of the HCLS KB] for further information.<br />
<br />
=== Bio2RDF Data Sets ===<br />
<br />
The [http://bio2rdf.org/ Bio2RDF project] has published 40 biology-, gene- and medical-related datasets (altogether 2.3 billion triples). <br />
The datasets are available via SPARQL endpoints and as Linked Data. It is recommended that you use the [http://sourceforge.net/project/platformdownload.php?group_id=142631 Bio2RDF Java Servlet], and optionally [http://quebec.bio2rdf.org/download/virtuoso/indexed/ download the databases] for efficient personal use. Running your own instance of the [http://virtuoso.openlinksw.com/wiki/main/Main/VirtEC2AMIBio2rdfInstall OpenLink Virtuoso AMI for EC2] is also an option. Basic URI resolution on the AMI does not require the Java Servlet, but for advanced queries you should still download it and configure it to query your EC2 SPARQL endpoint.<br />
<br />
* [http://www.freebase.com/view/user/bio2rdf/public/sparql Bio2RDF SPARQL endpoint list] [http://rdf.freebase.com/rdf/user/bio2rdf/public/sparql SPARQL endpoint list in RDF]<br />
* [http://linkeddata.openlinksw.com:8891/pubmed:10500064 Identification of an autoimmune enteropathy-related 75-kilodalton antigen], via an [[OpenLinkSoftware|OpenLink]] hosted edition of [[Bio2Rdf]] <br />
* [http://linkeddata.openlinksw.com:8891/pubmed:9636670 Structure of the gene encoding the human cyclin-dependent kinase inhibitor p18 and mutational analysis in breast cancer], via an [[OpenLinkSoftware|OpenLink]] hosted edition of [[Bio2Rdf]] <br />
* [http://beckr.org/marbles?uri=http%3A%2F%2Fbio2rdf.org%2Fpubmed%3A9626117 PubMed article] viewed using the Marbles Linked Data browser.<br />
* [http://beckr.org/marbles?lang=en&uri=http%3A%2F%2Fbio2rdf.org%2Ffoaf%3AClemens%2C_T_L PubMed author] viewed using the Marbles Linked Data browser.<br />
* [http://beckr.org/marbles?uri=http%3A%2F%2Fbio2rdf.org%2Fomim%3A161555 OMIM Killer Cell Lectin-Like Receptor] viewed using the Marbles Linked Data browser.<br />
* [http://iws.seu.edu.cn/services/falcons/objectsearch/queryresult.jsp?query=%22KILLER+CELL%22 Falcons Search for KILLER CELL]. The Bio2RDF data has been crawled by the Falcons Semantic Web Search engine. This is an example of how the data is accessed by humans using the search engine. Falcons also offers an API that can be used by applications to access the data.<br />
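<br />
The example links above follow a visible pattern: a Bio2RDF resource URI has the form <code>http://bio2rdf.org/namespace:identifier</code>, and the Marbles browser receives that URI percent-encoded in its <code>uri</code> parameter. The sketch below reproduces that pattern as inferred from the links on this page. The function names are assumptions for this example, not part of any Bio2RDF or Marbles API.<br />

```python
from urllib.parse import quote


def bio2rdf_uri(namespace, identifier):
    """Build a Bio2RDF resource URI such as http://bio2rdf.org/pubmed:9626117."""
    return f"http://bio2rdf.org/{namespace}:{identifier}"


def marbles_url(resource_uri):
    """Wrap a resource URI in a Marbles browser link, percent-encoding every reserved character."""
    return "http://beckr.org/marbles?uri=" + quote(resource_uri, safe="")


article = bio2rdf_uri("pubmed", "9626117")
browser_link = marbles_url(article)
```

Passing <code>safe=""</code> to <code>quote</code> ensures that the colons and slashes inside the resource URI are encoded too, so the whole URI travels as a single parameter value.<br />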
<br />
=== Chem2bio2RDF ===<br />
<br />
* Information about the [http://chem2bio2rdf.org/ chem2bio2rdf] data sets<br />
<br />
=== Data Sets for the LODD Task ===<br />
<br />
To complement the drug-related Web of Data built by the LODD effort, the following data sets could/should also be published as Linked Data.<br />
<br />
The LODD effort is currently gathering more information about relevant datasets. See also [[/DataSetEvaluation|Evaluation of LODD Data Sets]] for current evaluation results.<br />
<br />
* [http://library.dialog.com/bluesheets/html/bl0107.html Adis R&D Insight]<br />
* [http://www.ebi.ac.uk/chebi/ ChEBI]<br />
* [http://xpdb.nist.gov/pdb/chemblast.html ChemBlast]<br />
* [http://www.chemspider.com/ ChemSpider]<br />
* [http://ClinicalTrials.gov ClinicalTrials.gov]<br />
* [http://www.citeline.com/trialtrove.html Citeline TrialTrove]<br />
* [http://dailymed.nlm.nih.gov/dailymed/about.cfm DailyMed]<br />
* [http://dbpedia.org/About DBpedia]<br />
* [http://www.nd.edu/~alb/Publication06/145-HumanDisease_PNAS-14My07-Proc/Suppl/ Diseasome]<br />
* [http://www.drugbank.ca/ DrugBank]<br />
* [http://www.virtualref.com/abs/72.htm DrugDB]<br />
* [http://www.ncbi.nlm.nih.gov/pubmed/17921997 Drugome]<br />
* [http://lsdis.cs.uga.edu/projects/asdoc/ Drug Ontology]<br />
* [http://scientific.thomsonreuters.com/products/iddb/ Investigational Drug Database] - Proprietary<br />
* [http://www.ovid.com/site/catalog/DataBase/1244.jsp?top=2&mid=3&bottom=7&subsection=10 IMS]<br />
* [http://www.genome.jp/kegg/drug/ KEGG Drug]<br />
* [http://Lillytrials.com LillyTrials]<br />
* [http://www.fda.gov/medwatch/ MedWatch]<br />
* [http://www.fda.gov/cder/ndc/ National Drug Code]<br />
* [http://www.ncbi.nlm.nih.gov/omim/ OMIM]<br />
* [http://www.fda.gov/cder/ob/ Orange Book]<br />
* [http://www.pharmaprojects.com/ Pharmaprojects] - Proprietary<br />
* [http://pubchem.ncbi.nlm.nih.gov/ PubChem]<br />
* [http://www.nlm.nih.gov/research/umls/rxnorm/ RxNorm]<br />
* [http://www.ncbi.nlm.nih.gov/sites/entrez?Db=pubmed&Cmd=ShowDetailView&TermToSearch=15360858 VA NDF-RT]<br />
* Other data sources could include blogs, discussion boards, wikis, etc.<br />
* Further candidate sources:<br />
** [http://www.who.int/globalatlas/ World Health Organization's Global Health Atlas]<br />
** [http://www.epispider.org/ EpiSPIDER]<br />
** [http://www.accessdata.fda.gov/Scripts/cder/DrugsatFDA/ Drugs@FDA - FDA Approved Drug Products]<br />
** [http://www.drugdigest.org/wps/portal/ddigest DrugDigest]<br />
** [http://humancyc.org/ HumanCyc: Encyclopedia of ''Homo sapiens'' Genes and Metabolism]<br />
** [http://www.alzforum.org/ Alzheimer Research Forum]<br />
** [http://wwwcf.nlm.nih.gov/umlslicense/rxtermApp/rxTerm.cfm RxTerms]<br />
** [http://hudine.neu.edu/ HuDiNe]<br />
** [http://wiki.medpedia.com/Cymbalta Medpedia]<br />
** [http://tcm.lifescience.ntu.edu.tw/ TCMGeneDIT] and [[Media:HCLSIG$$LODD$$Data$TCMGeneDIT_RDF_Dataset_r1.zip|RDF dump]]<br />
** [http://www.tuftsctsi.org/~/media/Files/CTSI/Library%20Files/FCC%20for%20CER%20Rpt%20to%20Pres%20and%20Congress_063009.ashx List of other possible data sources from page 66 onwards]<br />
<br />
=== Alternative Herbal Medicine use case ===<br />
* [[Data/TCMGeneDIT|TCMGeneDIT dataset]]<br />
<br />
=== Identifier-Based Linkage Points ===<br />
<br />
* InChIs<br />
* [[PubChem]] Compound ID (CID)<br />
* [[PubChem]] NSC<br />
* Chemical Abstracts Service (CAS) Registry Number<br />
* New Drug Application (NDA)<br />
<br />
=== Data Set Attributes ===<br />
<br />
* Licensing<br />
* Data Format<br />
* Identifiers</div>Rboycehttps://www.w3.org/wiki/index.php?title=HCLSIG/LODD/Data&diff=59214HCLSIG/LODD/Data2012-05-26T12:54:07Z<p>Rboyce: /* LODD-related datasets that the LODD group already made available as Linked Data */</p>
<hr />
<div>__NOTOC__<br />
=== LODD-related datasets that the LODD group already made available as Linked Data ===<br />
<br />
{| border="1" cellpadding="2" cellspacing="0"<br />
| '''Name'''<br />
| '''Topic'''<br />
| '''Short Description'''<br />
| '''Size and coverage'''<br />
| '''Status / Activity'''<br />
| '''Example Instances'''<br />
| '''SPARQL Endpoint'''<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/drugbank/ DrugBank]<br />
| Drugs<br />
| [http://www.drugbank.ca/ Drugbank.ca] provides drug (i.e., chemical, pharmacological and pharmaceutical) data with comprehensive drug target (i.e., sequence, structure, and pathway) information ({{doi|10.1093/nar/gkj067}})<br />
| 766,920 triples; 4,800 drugs, 2,500 protein sequences<br />
| updated regularly<br />
| Varenicline [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugs/DB01273 via Marbles], [http://demo.openlinksw.com/ode/?uri=http%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fdrugbank%2Fresource%2Fdrugs%2FDB01273 via OpenLink Data Explorer] <br />
| http://www4.wiwiss.fu-berlin.de/drugbank/sparql <br />
|-<br />
| [http://linkedct.org/ LinkedCT]<br />
| Clinical Trials<br />
| Linked data source of trials from [http://clinicaltrials.gov ClinicalTrials.gov]<br />
| ~25 million triples, 106,000 trials (as of April 2011)<br />
| Updated automatically at all times, refer to [http://linkedct.org/faq/ FAQ] for more details.<br />
| [http://data.linkedct.org/resource/condition/breast-cancer/ Breast Cancer] (Condition), [http://data.linkedct.org/resource/trial/nct00999557/ NCT00999557] (Trial), [http://data.linkedct.org/resource/city/toronto/ Toronto] (City). <br />
| http://data.linkedct.org/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/dailymed/ DailyMed]<br />
| Drugs<br />
| [http://dailymed.nlm.nih.gov/dailymed/about.cfm dailymed.nlm.nih.gov] provides information about approved prescription drugs, including FDA-approved labels (package inserts)<br />
| 164,276 triples; 4,039 drugs<br />
| updated regularly<br />
| "Sterile Water (Irrigant)" [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/dailymed/resource/drugs/492 via Marbles], [http://demo.openlinksw.com/ode/?uri=http%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fdailymed%2Fresource%2Fdrugs%2F492 via OpenLink Data Explorer]<br />
| http://www4.wiwiss.fu-berlin.de/dailymed/sparql<br />
|-<br />
| [http://dbpedia.org/About DBpedia]<br />
| Drugs / Diseases / Proteins<br />
| RDF data about 2.49 million things that has been extracted from Wikipedia<br />
| 218 million RDF triples; 2,300 drugs, 2,200 proteins<br />
| updated every 3 months <br />
| [http://dbpedia.org/resource/Aspirin Aspirin], [http://dbpedia.org/resource/HIV HIV]<br />
| http://dbpedia.org/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/diseasome/ Diseasome]<br />
| Diseases / Genes<br />
| [http://www.nd.edu/~alb/Publication06/145-HumanDisease_PNAS-14My07-Proc/Suppl/ Diseasome] describes characteristics of disorders and disease genes linked by known disorder–gene associations<br />
| 91,182 triples; 2,600 genes<br />
| updated 2006<br />
| Alzheimer's [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/diseasome/resource/diseases/74 via Marbles], [http://demo.openlinksw.com/ode/?uri=http%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fdiseasome%2Fresource%2Fdiseases%2F74 via OpenLink Data Explorer]<br />
| http://www4.wiwiss.fu-berlin.de/diseasome/sparql<br />
|-<br />
| [http://code.google.com/p/junsbriefcase/wiki/TGDdataset RDF-TCM]<br />
| Genes / Diseases / Medicine / Ingredients<br />
| Traditional Chinese medicine, gene and disease association dataset and a linkset mapping TCM gene symbols to Entrez Gene IDs created by Neurocommons <br />
| 117,643 <br />
| updated August 2009 (stable)<br />
| [http://purl.org/net/tcm/tcm.lifescience.ntu.edu.tw/id/medicine/Ginkgo_biloba Ginkgo biloba] <br />
| http://hcls.deri.org/sparql; graph name: http://hcls.deri.org/resource/graph/tcm<br />
|-<br />
| [http://www.nlm.nih.gov/research/umls/rxnorm/ RxNorm]<br />
| Drugs<br />
| A linked version of the NLM's RxNorm database that connects prescription drugs, ingredients, and NDCs through the RXCUI, a concept unique identifier. RxNorm is a product developed by NIH’s National Library of Medicine. It currently interlinks 12 different drug vocabularies around a unique concept identifier. Due to licensing, only six of the drug vocabularies are made available as part of the LODD cloud: Medical Subject Headings, Metathesaurus FDA National Drug Code Directory, Metathesaurus FDA Structured Product Labels, National Drug File, RxNorm Vocabulary, and Veterans Health Administration National Drug File.<br />
Links are provided connecting RxNorm to DrugBank and to the UMLS.<br />
| over 7.7 million triples; 165,806 RXCUI (Concept Unique Identifiers) unique drugs and ingredients; 332,754 RXAUI (Atomic Unique Identifiers) sourced terms<br />
| Based on 3/2010 RxNorm release; last updated 5/2010<br />
| [http://link.informatics.stonybrook.edu/rxnorm/RXAUI/2994963 Singulair from the Metathesaurus FDA Structured Product Labels]<br />
| http://link.informatics.stonybrook.edu/sparql/<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/sider/ SIDER]<br />
| Diseases / Side Effects<br />
| [http://sideeffects.embl.de/ SIDER] contains information on marketed drugs and their adverse effects ({{doi|10.1038/msb.2009.98}})<br />
| 192,515 triples; 63,000 adverse effect reports, 1,737 genes<br />
| updated 2009<br />
| Confusion [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/sider/resource/side_effects/C0009676 via Marbles]<br />
| http://www4.wiwiss.fu-berlin.de/sider/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/stitch/ STITCH]<br />
| Chemicals / Proteins<br />
| [http://stitch.embl.de/ STITCH] contains information on chemicals, proteins, and their interactions ({{doi|10.1093/nar/gkm795}})<br />
| 7,500,000 chemicals; 500,000 proteins; 370 organisms <br />
| updated July 2009<br />
| Lactose [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/stitch/resource/chemicals/CID000000294 via Marbles]<br />
| http://www4.wiwiss.fu-berlin.de/stitch/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/medicare/ Medicare]<br />
| Medicare Formulary<br />
| xxx<br />
| xxx<br />
| xxx<br />
| xxx<br />
| http://www4.wiwiss.fu-berlin.de/medicare/sparql<br />
|-<br />
| [http://www.ebi.ac.uk/chembl/ ChEMBL]<br />
| Chemical / Assays (Proteins, Organisms) / Papers<br />
| [http://www.ebi.ac.uk/chembl/ ChEMBL] contains information on trial drugs, including their activity against targets such as (but not limited to) proteins. All data is backed up by and linked to the literature. Includes links to Bio2RDF for ChEBI and UniProt. License: CC-BY-SA.<br />
| ~24M triples<br />
| Updated 2010-01<br />
| An [http://rdf.farmbio.uu.se/chembl/snorql/?describe=http://rdf.farmbio.uu.se/chembl/activity/a2642163 IC50 activity].<br />
| http://rdf.farmbio.uu.se/chembl/sparql<br />
|-<br />
| [http://www.who.int/gho/en/index.html WHO Global Health Observatory]<br />
| Infectious Diseases / Demography / Socioeconomic Conditions / Environmental Factors<br />
| Data and statistics for infectious diseases at country, regional, and global levels<br />
| 354,300<br />
| Updated 2010-09<br />
| xxx<br />
| http://aksw.org/Projects/Stats2RDF<br />
|-<br />
| [http://nlp.dbmi.pitt.edu/nlprepository.html University of Pittsburgh NLP Repository] <br />
| Drugs / Procedures / Diagnoses<br />
| A semantic index of concepts present in 800 full-text clinical notes from the University of Pittsburgh NLP Repository<br />
| 38,664<br />
| Proof of concept -- Updated 02/25/2011<br />
| [http://tarski.duhs.org:8080/openrdf-workbench/repositories/u-pitt-blulab/explore?resource=bl%3Areport_404 Concepts from a sample radiology report]<br />
| [http://tarski.duhs.org:8080/openrdf-workbench/repositories/u-pitt-blulab/query http://tarski.duhs.org:8080/openrdf-workbench/repositories/u-pitt-blulab/query]<br />
|-<br />
| [http://dbmi-icode-01.dbmi.pitt.edu/dikb-evidence/front-page.html The Drug Interaction Knowledge Base] <br />
| Drugs / Metabolic Inhibition Drug-drug Interactions (DDIs) / Claims and Evidence for drug mechanisms and DDIs<br />
| A D2R server of more than 60 drugs currently in the DIKB<br />
| 41,480<br />
| Updated 03/01/2012<br />
| [http://dbmi-icode-01.dbmi.pitt.edu:2020/page/Drugs/paroxetine paroxetine], [http://dbmi-icode-01.dbmi.pitt.edu:2020/page/Drugs/atorvastatin atorvastatin]<br />
| http://dbmi-icode-01.dbmi.pitt.edu:2020/<br />
|-<br />
| [http://purl.org/net/nlprepository/linkedSPLs Linked Structured Product Labels] <br />
| Selected sections from the FDA-approved Structured Product Labels (SPLs) for currently marketed drugs<br />
| A D2R server rendering properly encoded unstructured text from the SPLs<br />
| 470,000<br />
| Updated 05/26/2012<br />
| [http://thedatahub.org/dataset/linked-structured-product-labels/resource/918071ae-f570-4728-98b7-5b447ab42ab8 SPL for Venlafaxine Hydrochloride (American Health Packaging)]<br />
| http://purl.org/net/nlprepository/linkedSPLs<br />
|}<br />
<br />
[[File:2010-12-04_lodd_cloud.png|600px]]<br />
<br />
A graph of some of the LODD datasets (dark grey), related biomedical datasets (light grey), related general-purpose datasets (white) and their interconnections. Line weights correspond to the number of links. The direction of an arrow indicates the dataset that contains the links, e.g., an arrow from A to B means that dataset A contains RDF triples that use identifiers from B. Bidirectional arrows usually indicate that the links are mirrored in both datasets.<br />
More on the interlinking methodology and statistics can be found on the [[../Interlinking|Interlinking]] page.<br />
<br />
The LODD datasets have been crawled by the SWSE Semantic Web search engine and can be accessed via a faceted browsing interface at [http://visinav.deri.org/hcls/] ([http://visinav.deri.org/hcls/list?keyword=varenicline Example query: Varenicline]).<br />
<br />
Most of the LODD datasets have also been integrated into the SPARQL endpoint of the HCLS Knowledge Base; see [http://esw.w3.org/topic/HCLSIG_BioRDF_Subgroup/DERI_HCLS_KB the wiki page of the HCLS KB] for further information.<br />
<br />
=== Bio2RDF Data Sets ===<br />
<br />
The [http://bio2rdf.org/ Bio2RDF project] has published 40 biology-, gene- and medical-related datasets (altogether 2.3 billion triples). <br />
The datasets are available via SPARQL endpoints and as Linked Data. It is recommended that you use the [http://sourceforge.net/project/platformdownload.php?group_id=142631 Bio2RDF Java Servlet], and optionally [http://quebec.bio2rdf.org/download/virtuoso/indexed/ download the databases] for efficient personal use. Running your own instance of the [http://virtuoso.openlinksw.com/wiki/main/Main/VirtEC2AMIBio2rdfInstall OpenLink Virtuoso AMI for EC2] is also an option (and for basic URI resolution doesn't require the Java Servlet, although if you want advanced queries you should still download it and configure it to query your EC2 sparql endpoint).<br />
<br />
* [http://www.freebase.com/view/user/bio2rdf/public/sparql Bio2RDF sparql endpoint list] [http://rdf.freebase.com/rdf/user/bio2rdf/public/sparql Sparql endpoint list in RDF]<br />
* [http://linkeddata.openlinksw.com:8891/pubmed:10500064 Identification of an autoimmune enteropathy-related 75-kilodalton antigen], via an [[OpenLinkSoftware|OpenLink]] hosted edition of [[Bio2Rdf]] <br />
* [http://linkeddata.openlinksw.com:8891/pubmed:9636670 Structure of the gene encoding the human cyclin-dependent kinase inhibitor p18 and mutational analysis in breast cancer], via an [[OpenLinkSoftware|OpenLink]] hosted edition of [[Bio2Rdf]] <br />
* [http://beckr.org/marbles?uri=http%3A%2F%2Fbio2rdf.org%2Fpubmed%3A9626117 PubMed article] viewed using the Marbles Linked Data browser.<br />
* [http://beckr.org/marbles?lang=en&uri=http%3A%2F%2Fbio2rdf.org%2Ffoaf%3AClemens%2C_T_L PubMed author] viewed using the Marbles Linked Data browser.<br />
* [http://beckr.org/marbles?uri=http%3A%2F%2Fbio2rdf.org%2Fomim%3A161555 OMIM Killer Cell Lectin-Like Receptor] viewed using the Marbles Linked Data browser.<br />
* [http://iws.seu.edu.cn/services/falcons/objectsearch/queryresult.jsp?query=%22KILLER+CELL%22 Falcons Search for KILLER CELL]. The Bio2RDF data has been crawled by the Falcons Semantic Web Search engine. This is an example of how the data is accessed by humans using the search engine. Falcons also offers an API that can be used by applications to access the data.<br />
<br />
=== Chem2bio2RDF ===<br />
<br />
* Information about the [http://chem2bio2rdf.org/ chem2bio2rdf] data sets<br />
<br />
=== Data Sets for the LODD Task ===<br />
<br />
To complement the drug-related Web of Data built by the LODD effort, the following data sets could/should also be published as Linked Data.<br />
<br />
The LODD effort is currently gathering more information about relevant datasets. See also [[/DataSetEvaluation|Evaluation of LODD Data Sets]] for current evaluation results.<br />
<br />
* [http://library.dialog.com/bluesheets/html/bl0107.html Adis R&D Insight]<br />
* [http://www.ebi.ac.uk/chebi/ ChEBI]<br />
* [http://xpdb.nist.gov/pdb/chemblast.html ChemBlast]<br />
* [http://www.chemspider.com/ ChemSpider]<br />
* [http://ClinicalTrials.gov ClinicalTrials.gov]<br />
* [http://www.citeline.com/trialtrove.html Citeline TrialTrove]<br />
* [http://dailymed.nlm.nih.gov/dailymed/about.cfm DailyMed]<br />
* [http://dbpedia.org/About DBpedia]<br />
* [http://www.nd.edu/~alb/Publication06/145-HumanDisease_PNAS-14My07-Proc/Suppl/ Diseasome]<br />
* [http://www.drugbank.ca/ DrugBank]<br />
* [http://www.virtualref.com/abs/72.htm DrugDB]<br />
* [http://www.ncbi.nlm.nih.gov/pubmed/17921997 Drugome]<br />
* [http://lsdis.cs.uga.edu/projects/asdoc/ Drug Ontology]<br />
* [http://scientific.thomsonreuters.com/products/iddb/ Investigational Drug Database] - Proprietary<br />
* [http://www.ovid.com/site/catalog/DataBase/1244.jsp?top=2&mid=3&bottom=7&subsection=10 IMS]<br />
* [http://www.genome.jp/kegg/drug/ KEGG Drug]<br />
* [http://Lillytrials.com LillyTrials]<br />
* [http://www.fda.gov/medwatch/ MedWatch]<br />
* [http://www.fda.gov/cder/ndc/ National Drug Code]<br />
* [http://www.ncbi.nlm.nih.gov/omim/ OMIM]<br />
* [http://www.fda.gov/cder/ob/ Orange Book]<br />
* [http://www.pharmaprojects.com/ Pharmaprojects] - Proprietary<br />
* [http://pubchem.ncbi.nlm.nih.gov/ PubChem]<br />
* [http://www.nlm.nih.gov/research/umls/rxnorm/ RxNorm]<br />
* [http://www.ncbi.nlm.nih.gov/sites/entrez?Db=pubmed&Cmd=ShowDetailView&TermToSearch=15360858 VA NDF-RT]<br />
* Other data sources could include blogs, discussion boards, wikis, etc.<br />
* and.... <br />
** [http://www.who.int/globalatlas/ World Health Organization's Global Health Atlas]<br />
** [http://www.epispider.org/ EpiSPIDER]<br />
** [http://www.accessdata.fda.gov/Scripts/cder/DrugsatFDA/ Drugs@FDA - FDA Approved Drug Products]<br />
** [http://www.drugdigest.org/wps/portal/ddigest DrugDigest]<br />
** [http://humancyc.org/ HumanCyc: Encyclopedia of ''Homo sapiens'' Genes and Metabolism]<br />
** [http://www.alzforum.org/ Alzheimer Research Forum]<br />
** [http://wwwcf.nlm.nih.gov/umlslicense/rxtermApp/rxTerm.cfm RxTerms]<br />
** [http://hudine.neu.edu/ HuDiNe]<br />
** [http://wiki.medpedia.com/Cymbalta Medpedia]<br />
** [http://tcm.lifescience.ntu.edu.tw/ TCMGeneDIT] and [[Media:HCLSIG$$LODD$$Data$TCMGeneDIT_RDF_Dataset_r1.zip|RDF dump]]<br />
** [http://www.tuftsctsi.org/~/media/Files/CTSI/Library%20Files/FCC%20for%20CER%20Rpt%20to%20Pres%20and%20Congress_063009.ashx List of other possible data sources from page 66 onwards]<br />
<br />
=== Alternative Herbal Medicine use case ===<br />
* [[Data/TCMGeneDIT|TCMGeneDIT dataset]]<br />
<br />
=== Identifier-Based Linkage Points ===<br />
<br />
* InChIs<br />
* [[PubChem]] Compound ID (CID)<br />
* [[PubChem]] NSC<br />
* Chemical Abstract ID (CAS)<br />
* New Drug Application (NDA)<br />
<br />
=== Data Set Attributes ===<br />
<br />
* Licensing<br />
* Data Format<br />
* Identifiers</div>Rboycehttps://www.w3.org/wiki/index.php?title=HCLSIG/LODD/Data&diff=59213HCLSIG/LODD/Data2012-05-26T12:51:34Z<p>Rboyce: /* LODD-related datasets that the LODD group already made available as Linked Data */</p>
<hr />
<div>__NOTOC__<br />
=== LODD-related datasets that the LODD group already made available as Linked Data ===<br />
<br />
{| border="1" cellpadding="2" cellspacing="0"<br />
| '''Name'''<br />
| '''Topic'''<br />
| '''Short Description'''<br />
| '''Size and coverage'''<br />
| '''Status / Activity'''<br />
| '''Example Instances'''<br />
| '''SPARQL Endpoint'''<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/drugbank/ DrugBank]<br />
| Drugs<br />
| [http://www.drugbank.ca/ Drugbank.ca] provides drug (i.e., chemical, pharmacological and pharmaceutical) data with comprehensive drug target (i.e., sequence, structure, and pathway) information ({{doi|10.1093/nar/gkj067}})<br />
| 766,920 triples; 4,800 drugs, 2,500 protein sequences<br />
| updated regularly<br />
| Varenicline [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugs/DB01273 via Marbles], [http://demo.openlinksw.com/ode/?uri=http%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fdrugbank%2Fresource%2Fdrugs%2FDB01273 via OpenLink Data Explorer] <br />
| http://www4.wiwiss.fu-berlin.de/drugbank/sparql <br />
|-<br />
| [http://linkedct.org/ LinkedCT]<br />
| Clinical Trials<br />
| Linked data source of trials from [http://clinicaltrials.gov ClinicalTrials.gov]<br />
| ~25 million triples, 106,000 trials (as of April 2011)<br />
| Updated automatically and continuously; refer to the [http://linkedct.org/faq/ FAQ] for more details.<br />
| [http://data.linkedct.org/resource/condition/breast-cancer/ Breast Cancer] (Condition), a [http://data.linkedct.org/resource/trial/nct00999557/ NCT00999557] (Trial), [http://data.linkedct.org/resource/city/toronto/ Toronto] (City). <br />
| http://data.linkedct.org/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/dailymed/ DailyMed]<br />
| Drugs<br />
| [http://dailymed.nlm.nih.gov/dailymed/about.cfm dailymed.nlm.nih.gov] provides information about approved prescription drugs, including FDA-approved labels (package inserts)<br />
| 164,276 triples; 4,039 drugs<br />
| updated regularly<br />
| "Sterile Water (Irrigant)" [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/dailymed/resource/drugs/492 via Marbles], [http://demo.openlinksw.com/ode/?uri=http%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fdailymed%2Fresource%2Fdrugs%2F492 via OpenLink Data Explorer]<br />
| http://www4.wiwiss.fu-berlin.de/dailymed/sparql<br />
|-<br />
| [http://dbpedia.org/About DBpedia]<br />
| Drugs / Diseases / Proteins<br />
| RDF data about 2.49 million things that has been extracted from Wikipedia<br />
| 218 million RDF triples; 2,300 drugs, 2,200 proteins<br />
| updated every 3 months <br />
| [http://dbpedia.org/resource/Aspirin Aspirin], [http://dbpedia.org/resource/HIV HIV]<br />
| http://dbpedia.org/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/diseasome/ Diseasome]<br />
| Diseases / Genes<br />
| [http://www.nd.edu/~alb/Publication06/145-HumanDisease_PNAS-14My07-Proc/Suppl/ Diseasome] describes characteristics of disorders and disease genes linked by known disorder–gene associations<br />
| 91,182 triples; 2,600 genes<br />
| updated 2006<br />
| Alzheimer's [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/diseasome/resource/diseases/74 via Marbles], [http://demo.openlinksw.com/ode/?uri=http%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fdiseasome%2Fresource%2Fdiseases%2F74 via OpenLink Data Explorer]<br />
| http://www4.wiwiss.fu-berlin.de/diseasome/sparql<br />
|-<br />
| [http://code.google.com/p/junsbriefcase/wiki/TGDdataset RDF-TCM]<br />
| Genes / Diseases / Medicine / Ingredients<br />
| Traditional Chinese medicine, gene and disease association dataset and a linkset mapping TCM gene symbols to Entrez Gene IDs created by Neurocommons <br />
| 117,643 <br />
| updated August 2009 (stable)<br />
| [http://purl.org/net/tcm/tcm.lifescience.ntu.edu.tw/id/medicine/Ginkgo_biloba Ginkgo biloba] <br />
| http://hcls.deri.org/sparql; graph name: http://hcls.deri.org/resource/graph/tcm<br />
|-<br />
| [http://www.nlm.nih.gov/research/umls/rxnorm/ RxNorm]<br />
| Drugs<br />
| A linked version of the NLM's RxNorm database that connects prescription drugs, ingredients, and NDC through the RXCUI, a concept unique identifier. RxNorm is a product developed by NIH’s National Library of Medicine. It currently interlinks 12 different drug vocabularies around a unique concept identifier. Due to licensing, only six of the drug vocabularies are made available as part of the LODD cloud: Medical Subject Headings, Metathesaurus FDA National Drug Code Directory, Metathesaurus FDA Structured Product Labels, National Drug File, RxNorm Vocabulary, and Veterans Health Administration National Drug File.<br />
Links are provided connecting RxNorm to DrugBank and to the UMLS.<br />
|over 7.7 million triples; 165,806 RXCUI (Concept Unique Identifiers) Unique drugs and ingredients; 332,754 RXAUI (Atomic Unique Identifiers) sourced terms<br />
| Based on 3/2010 Rxnorm Release; Last updated 5/2010<br />
| [http://link.informatics.stonybrook.edu/rxnorm/RXAUI/2994963 Singulair from the Metathesaurus FDA Structured Product Labels]<br />
| http://link.informatics.stonybrook.edu/sparql/<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/sider/ SIDER]<br />
| Diseases / Side Effects<br />
| [http://sideeffects.embl.de/ SIDER] contains information on marketed drugs and their adverse effects ({{doi|10.1038/msb.2009.98}})<br />
| 192,515 triples; 63,000 adverse effect reports, 1,737 genes<br />
| updated 2009<br />
| Confusion [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/sider/resource/side_effects/C0009676 via Marbles]<br />
| http://www4.wiwiss.fu-berlin.de/sider/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/stitch/ STITCH]<br />
| Chemicals / Proteins<br />
| [http://stitch.embl.de/ STITCH] contains information on chemicals, proteins, and their interactions ({{doi|10.1093/nar/gkm795}})<br />
| 7,500,000 chemicals; 500,000 proteins; 370 organisms <br />
| updated July 2009<br />
| Lactose [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/stitch/resource/chemicals/CID000000294 via Marbles]<br />
| http://www4.wiwiss.fu-berlin.de/stitch/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/medicare/ Medicare]<br />
| Medicare Formulary<br />
| xxx<br />
| xxx<br />
| xxx<br />
| xxx<br />
| http://www4.wiwiss.fu-berlin.de/medicare/sparql<br />
|-<br />
| [http://www.ebi.ac.uk/chembl/ ChEMBL]<br />
| Chemical / Assays (Proteins, Organisms) / Papers<br />
| [http://www.ebi.ac.uk/chembl/ ChEMBL] contains information on trial drugs, including their activity against targets such as (but not limited to) proteins. All data is backed by and linked to the literature. Includes links to Bio2RDF for ChEBI and UniProt. License: CC-BY-SA.<br />
| ~24M triples<br />
| Updated 2010-01<br />
| An [http://rdf.farmbio.uu.se/chembl/snorql/?describe=http://rdf.farmbio.uu.se/chembl/activity/a2642163 IC50 activity].<br />
| http://rdf.farmbio.uu.se/chembl/sparql<br />
|-<br />
| [http://www.who.int/gho/en/index.html WHO Global Health Observatory]<br />
| Infectious Diseases / Demography / Socioeconomic Conditions / Environmental Factors<br />
| Data and statistics for infectious diseases at country, regional, and global levels<br />
| 354300<br />
| Updated 2010-09<br />
| xxx<br />
| http://aksw.org/Projects/Stats2RDF<br />
|-<br />
| [http://nlp.dbmi.pitt.edu/nlprepository.html University of Pittsburgh NLP Repository] <br />
| Drugs / Procedures / Diagnoses<br />
| A semantic index of concepts present in 800 full-text clinical notes from the University of Pittsburgh NLP Repository<br />
| 47,000<br />
| Proof of concept -- Updated 02/25/2011<br />
| [http://tarski.duhs.org:8080/openrdf-workbench/repositories/u-pitt-blulab/explore?resource=bl%3Areport_404 Concepts from a sample radiology report]<br />
| [http://tarski.duhs.org:8080/openrdf-workbench/repositories/u-pitt-blulab/query http://tarski.duhs.org:8080/openrdf-workbench/repositories/u-pitt-blulab/query]<br />
|-<br />
| [http://dbmi-icode-01.dbmi.pitt.edu/dikb-evidence/front-page.html The Drug Interaction Knowledge Base] <br />
| Drugs / Metabolic Inhibition Drug-drug Interactions (DDIs) / Claims and Evidence for drug mechanisms and DDIs<br />
| A D2R server of more than 60 drugs currently in the DIKB<br />
| 41,480<br />
| Updated 03/01/2012<br />
| [http://dbmi-icode-01.dbmi.pitt.edu:2020/page/Drugs/paroxetine paroxetine], [http://dbmi-icode-01.dbmi.pitt.edu:2020/page/Drugs/atorvastatin atorvastatin]<br />
| http://dbmi-icode-01.dbmi.pitt.edu:2020/<br />
|-<br />
| [http://purl.org/net/nlprepository/linkedSPLs Linked Structured Product Labels] <br />
| Selected sections from the FDA-approved Structured Product Labels (SPLs) for currently marketed drugs<br />
| A D2R server rendering properly encoded unstructured text from the SPLs<br />
| 470,000<br />
| Updated 05/26/2012<br />
| [http://thedatahub.org/dataset/linked-structured-product-labels/resource/918071ae-f570-4728-98b7-5b447ab42ab8 SPL for Venlafaxine Hydrochloride (American Health Packaging)]<br />
| http://purl.org/net/nlprepository/linkedSPLs<br />
|}<br />
<br />
[[File:2010-12-04_lodd_cloud.png|600px]]<br />
<br />
A graph of some of the LODD datasets (dark grey), related biomedical datasets (light grey), related general-purpose datasets (white) and their interconnections. Line weights correspond to the number of links. The direction of an arrow indicates the dataset that contains the links, e.g., an arrow from A to B means that dataset A contains RDF triples that use identifiers from B. Bidirectional arrows usually indicate that the links are mirrored in both datasets.<br />
More on the interlinking methodology and statistics can be found on the [[../Interlinking|Interlinking]] page.<br />
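
The arrow convention described above can be illustrated with a minimal sketch. It assumes that a triple belongs to the dataset whose namespace its subject URI falls under, which is a simplification; the namespace prefixes are taken from the example URIs in the table, and the sample triples are hypothetical.

```python
from collections import Counter

# Namespace prefixes inferred from the example URIs in the table above.
NAMESPACES = {
    "drugbank": "http://www4.wiwiss.fu-berlin.de/drugbank/",
    "dbpedia": "http://dbpedia.org/resource/",
    "sider": "http://www4.wiwiss.fu-berlin.de/sider/",
}

def dataset_of(term):
    """Return the dataset name whose namespace prefixes the term, else None."""
    for name, prefix in NAMESPACES.items():
        if term.startswith(prefix):
            return name
    return None

def count_links(triples):
    """Count cross-dataset links as (source, target) pairs.

    An arrow from A to B is counted when the subject is in A's namespace
    and the object is in B's namespace (assumption: subject namespace
    stands in for "the dataset that contains the triple").
    """
    links = Counter()
    for subject, _predicate, obj in triples:
        src, dst = dataset_of(subject), dataset_of(obj)
        if src and dst and src != dst:
            links[(src, dst)] += 1
    return links

SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Hypothetical sample triples, for illustration only.
SAMPLE_TRIPLES = [
    ("http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugs/DB01273",
     SAME_AS, "http://dbpedia.org/resource/Varenicline"),
    ("http://dbpedia.org/resource/Varenicline",
     SAME_AS, "http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugs/DB01273"),
    ("http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugs/DB01273",
     RDFS_LABEL, "Varenicline"),  # literal object, not a link
]

if __name__ == "__main__":
    for (src, dst), n in sorted(count_links(SAMPLE_TRIPLES).items()):
        print(f"{src} -> {dst}: {n}")
```

In this sketch the mirrored pair of `owl:sameAs` triples would appear as a bidirectional arrow, and the per-pair counts would correspond to line weights.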
<br />
The LODD datasets have been crawled by the SWSE Semantic Web search engine and can be accessed via a faceted browsing interface at [http://visinav.deri.org/hcls/] ([http://visinav.deri.org/hcls/list?keyword=varenicline Example query: Varenicline]).<br />
<br />
Most of the LODD datasets have also been integrated into the SPARQL endpoint of the HCLS Knowledge Base; see [http://esw.w3.org/topic/HCLSIG_BioRDF_Subgroup/DERI_HCLS_KB the wiki page of the HCLS KB] for further information.<br />
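
As a rough illustration of how the SPARQL endpoints listed in the table could be used, the following sketch builds a request URL following the standard SPARQL protocol (`query` and `default-graph-uri` parameters) and flattens a response in the standard SPARQL JSON results format. The endpoint and graph name are the ones given for RDF-TCM in the table; whether that server is still online, and whether it needs an extra format parameter or `Accept` header, is not known here, so the sketch does not perform any network request. The helper names and the sample response are hypothetical.

```python
import json
from urllib.parse import urlencode

def build_sparql_url(endpoint, query, graph=None):
    """Build a SPARQL protocol GET URL, optionally scoped to one graph."""
    params = {"query": query}
    if graph:
        params["default-graph-uri"] = graph
    return endpoint + "?" + urlencode(params)

def parse_bindings(result_json):
    """Flatten a SPARQL JSON result document into a list of plain dicts."""
    data = json.loads(result_json)
    return [
        {var: cell["value"] for var, cell in row.items()}
        for row in data["results"]["bindings"]
    ]

# Endpoint and graph name as listed for RDF-TCM in the table above.
ENDPOINT = "http://hcls.deri.org/sparql"
GRAPH = "http://hcls.deri.org/resource/graph/tcm"
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"

# Hypothetical response, shaped like the SPARQL JSON results format.
SAMPLE_RESPONSE = json.dumps({
    "head": {"vars": ["s"]},
    "results": {"bindings": [
        {"s": {"type": "uri",
               "value": "http://purl.org/net/tcm/tcm.lifescience.ntu.edu.tw/id/medicine/Ginkgo_biloba"}},
    ]},
})

if __name__ == "__main__":
    print(build_sparql_url(ENDPOINT, QUERY, GRAPH))
    print(parse_bindings(SAMPLE_RESPONSE))
```

The URL returned by `build_sparql_url` could then be fetched with any HTTP client, requesting `application/sparql-results+json` where the endpoint supports it.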
<br />
=== Bio2RDF Data Sets ===<br />
<br />
The [http://bio2rdf.org/ Bio2RDF project] has published 40 biology-, gene- and medical-related datasets (altogether 2.3 billion triples). <br />
The datasets are available via SPARQL endpoints and as Linked Data. It is recommended that you use the [http://sourceforge.net/project/platformdownload.php?group_id=142631 Bio2RDF Java Servlet], and optionally [http://quebec.bio2rdf.org/download/virtuoso/indexed/ download the databases] for efficient personal use. Running your own instance of the [http://virtuoso.openlinksw.com/wiki/main/Main/VirtEC2AMIBio2rdfInstall OpenLink Virtuoso AMI for EC2] is also an option (and for basic URI resolution doesn't require the Java Servlet, although if you want advanced queries you should still download it and configure it to query your EC2 sparql endpoint).<br />
<br />
* [http://www.freebase.com/view/user/bio2rdf/public/sparql Bio2RDF sparql endpoint list] [http://rdf.freebase.com/rdf/user/bio2rdf/public/sparql Sparql endpoint list in RDF]<br />
* [http://linkeddata.openlinksw.com:8891/pubmed:10500064 Identification of an autoimmune enteropathy-related 75-kilodalton antigen], via an [[OpenLinkSoftware|OpenLink]] hosted edition of [[Bio2Rdf]] <br />
* [http://linkeddata.openlinksw.com:8891/pubmed:9636670 Structure of the gene encoding the human cyclin-dependent kinase inhibitor p18 and mutational analysis in breast cancer], via an [[OpenLinkSoftware|OpenLink]] hosted edition of [[Bio2Rdf]] <br />
* [http://beckr.org/marbles?uri=http%3A%2F%2Fbio2rdf.org%2Fpubmed%3A9626117 PubMed article] viewed using the Marbles Linked Data browser.<br />
* [http://beckr.org/marbles?lang=en&uri=http%3A%2F%2Fbio2rdf.org%2Ffoaf%3AClemens%2C_T_L PubMed author] viewed using the Marbles Linked Data browser.<br />
* [http://beckr.org/marbles?uri=http%3A%2F%2Fbio2rdf.org%2Fomim%3A161555 OMIM Killer Cell Lectin-Like Receptor] viewed using the Marbles Linked Data browser.<br />
* [http://iws.seu.edu.cn/services/falcons/objectsearch/queryresult.jsp?query=%22KILLER+CELL%22 Falcons Search for KILLER CELL]. The Bio2RDF data has been crawled by the Falcons Semantic Web Search engine. This is an example of how the data is accessed by humans using the search engine. Falcons also offers an API that can be used by applications to access the data.<br />
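
The Marbles links in the list above follow a visible pattern: a Bio2RDF URI of the form `http://bio2rdf.org/namespace:identifier`, percent-encoded and passed as the `uri` parameter. A minimal sketch of that pattern follows; the function names are hypothetical, and the pattern is inferred only from the example links, not from any Bio2RDF or Marbles documentation.

```python
from urllib.parse import quote

def bio2rdf_uri(namespace, identifier):
    """Compose a Bio2RDF URI from a namespace and an identifier."""
    return f"http://bio2rdf.org/{namespace}:{identifier}"

def marbles_url(resource_uri):
    """Wrap a resource URI in a Marbles browser link (fully percent-encoded)."""
    return "http://beckr.org/marbles?uri=" + quote(resource_uri, safe="")

if __name__ == "__main__":
    # Reproduces the PubMed article link from the list above.
    print(marbles_url(bio2rdf_uri("pubmed", "9626117")))
```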
<br />
=== Chem2bio2RDF ===<br />
<br />
* Information about the [http://chem2bio2rdf.org/ chem2bio2rdf] data sets<br />
<br />
=== Data Sets for the LODD Task ===<br />
<br />
To complement the drug-related Web of Data built by the LODD effort, the following data sets could/should also be published as Linked Data.<br />
<br />
The LODD effort is currently gathering more information about relevant datasets. See also [[/DataSetEvaluation|Evaluation of LODD Data Sets]] for current evaluation results.<br />
<br />
* [http://library.dialog.com/bluesheets/html/bl0107.html Adis R&D Insight]<br />
* [http://www.ebi.ac.uk/chebi/ ChEBI]<br />
* [http://xpdb.nist.gov/pdb/chemblast.html ChemBlast]<br />
* [http://www.chemspider.com/ ChemSpider]<br />
* [http://ClinicalTrials.gov ClinicalTrials.gov]<br />
* [http://www.citeline.com/trialtrove.html Citeline TrialTrove]<br />
* [http://dailymed.nlm.nih.gov/dailymed/about.cfm DailyMed]<br />
* [http://dbpedia.org/About DBpedia]<br />
* [http://www.nd.edu/~alb/Publication06/145-HumanDisease_PNAS-14My07-Proc/Suppl/ Diseasome]<br />
* [http://www.drugbank.ca/ DrugBank]<br />
* [http://www.virtualref.com/abs/72.htm DrugDB]<br />
* [http://www.ncbi.nlm.nih.gov/pubmed/17921997 Drugome]<br />
* [http://lsdis.cs.uga.edu/projects/asdoc/ Drug Ontology]<br />
* [http://scientific.thomsonreuters.com/products/iddb/ Investigational Drug Database] - Proprietary<br />
* [http://www.ovid.com/site/catalog/DataBase/1244.jsp?top=2&mid=3&bottom=7&subsection=10 IMS]<br />
* [http://www.genome.jp/kegg/drug/ KEGG Drug]<br />
* [http://Lillytrials.com LillyTrials]<br />
* [http://www.fda.gov/medwatch/ MedWatch]<br />
* [http://www.fda.gov/cder/ndc/ National Drug Code]<br />
* [http://www.ncbi.nlm.nih.gov/omim/ OMIM]<br />
* [http://www.fda.gov/cder/ob/ Orange Book]<br />
* [http://www.pharmaprojects.com/ Pharmaprojects] - Proprietary<br />
* [http://pubchem.ncbi.nlm.nih.gov/ PubChem]<br />
* [http://www.nlm.nih.gov/research/umls/rxnorm/ RxNorm]<br />
* [http://www.ncbi.nlm.nih.gov/sites/entrez?Db=pubmed&Cmd=ShowDetailView&TermToSearch=15360858 VA NDF-RT]<br />
* Other data sources could include blogs, discussion boards, wikis, etc.<br />
* and.... <br />
** [http://www.who.int/globalatlas/ World Health Organization's Global Health Atlas]<br />
** [http://www.epispider.org/ EpiSPIDER]<br />
** [http://www.accessdata.fda.gov/Scripts/cder/DrugsatFDA/ Drugs@FDA - FDA Approved Drug Products]<br />
** [http://www.drugdigest.org/wps/portal/ddigest DrugDigest]<br />
** [http://humancyc.org/ HumanCyc: Encyclopedia of ''Homo sapiens'' Genes and Metabolism]<br />
** [http://www.alzforum.org/ Alzheimer Research Forum]<br />
** [http://wwwcf.nlm.nih.gov/umlslicense/rxtermApp/rxTerm.cfm RxTerms]<br />
** [http://hudine.neu.edu/ HuDiNe]<br />
** [http://wiki.medpedia.com/Cymbalta Medpedia]<br />
** [http://tcm.lifescience.ntu.edu.tw/ TCMGeneDIT] and [[Media:HCLSIG$$LODD$$Data$TCMGeneDIT_RDF_Dataset_r1.zip|RDF dump]]<br />
** [http://www.tuftsctsi.org/~/media/Files/CTSI/Library%20Files/FCC%20for%20CER%20Rpt%20to%20Pres%20and%20Congress_063009.ashx List of other possible data sources from page 66 onwards]<br />
<br />
=== Alternative Herbal Medicine use case ===<br />
* [[Data/TCMGeneDIT|TCMGeneDIT dataset]]<br />
<br />
=== Identifier-Based Linkage Points ===<br />
<br />
* InChIs<br />
* [[PubChem]] Compound ID (CID)<br />
* [[PubChem]] NSC<br />
* Chemical Abstract ID (CAS)<br />
* New Drug Application (NDA)<br />
<br />
=== Data Set Attributes ===<br />
<br />
* Licensing<br />
* Data Format<br />
* Identifiers</div>Rboycehttps://www.w3.org/wiki/index.php?title=HCLSIG/LODD/Data&diff=59212HCLSIG/LODD/Data2012-05-26T12:48:54Z<p>Rboyce: /* LODD-related datasets that the LODD group already made available as Linked Data */</p>
<hr />
<div>__NOTOC__<br />
=== LODD-related datasets that the LODD group already made available as Linked Data ===<br />
<br />
{| border="1" cellpadding="2" cellspacing="0"<br />
| '''Name'''<br />
| '''Topic'''<br />
| '''Short Description'''<br />
| '''Size and coverage'''<br />
| '''Status / Activity'''<br />
| '''Example Instances'''<br />
| '''SPARQL Endpoint'''<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/drugbank/ DrugBank]<br />
| Drugs<br />
| [http://www.drugbank.ca/ Drugbank.ca] provides drug (i.e., chemical, pharmacological and pharmaceutical) data with comprehensive drug target (i.e., sequence, structure, and pathway) information ({{doi|10.1093/nar/gkj067}})<br />
| 766,920 triples; 4,800 drugs, 2,500 protein sequences<br />
| updated regularly<br />
| Varenicline [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugs/DB01273 via Marbles], [http://demo.openlinksw.com/ode/?uri=http%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fdrugbank%2Fresource%2Fdrugs%2FDB01273 via OpenLink Data Explorer] <br />
| http://www4.wiwiss.fu-berlin.de/drugbank/sparql <br />
|-<br />
| [http://linkedct.org/ LinkedCT]<br />
| Clinical Trials<br />
| Linked data source of trials from [http://clinicaltrials.gov ClinicalTrials.gov]<br />
| ~25 million triples, 106,000 trials (as of April 2011)<br />
| Updated automatically and continuously; refer to the [http://linkedct.org/faq/ FAQ] for more details.<br />
| [http://data.linkedct.org/resource/condition/breast-cancer/ Breast Cancer] (Condition), a [http://data.linkedct.org/resource/trial/nct00999557/ NCT00999557] (Trial), [http://data.linkedct.org/resource/city/toronto/ Toronto] (City). <br />
| http://data.linkedct.org/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/dailymed/ DailyMed]<br />
| Drugs<br />
| [http://dailymed.nlm.nih.gov/dailymed/about.cfm dailymed.nlm.nih.gov] provides information about approved prescription drugs, including FDA-approved labels (package inserts)<br />
| 164,276 triples; 4,039 drugs<br />
| updated regularly<br />
| "Sterile Water (Irrigant)" [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/dailymed/resource/drugs/492 via Marbles], [http://demo.openlinksw.com/ode/?uri=http%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fdailymed%2Fresource%2Fdrugs%2F492 via OpenLink Data Explorer]<br />
| http://www4.wiwiss.fu-berlin.de/dailymed/sparql<br />
|-<br />
| [http://dbpedia.org/About DBpedia]<br />
| Drugs / Diseases / Proteins<br />
| RDF data about 2.49 million things that has been extracted from Wikipedia<br />
| 218 million RDF triples; 2,300 drugs, 2,200 proteins<br />
| updated every 3 months <br />
| [http://dbpedia.org/resource/Aspirin Aspirin], [http://dbpedia.org/resource/HIV HIV]<br />
| http://dbpedia.org/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/diseasome/ Diseasome]<br />
| Diseases / Genes<br />
| [http://www.nd.edu/~alb/Publication06/145-HumanDisease_PNAS-14My07-Proc/Suppl/ Diseasome] describes characteristics of disorders and disease genes linked by known disorder–gene associations<br />
| 91,182 triples; 2,600 genes<br />
| updated 2006<br />
| Alzheimer's [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/diseasome/resource/diseases/74 via Marbles], [http://demo.openlinksw.com/ode/?uri=http%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fdiseasome%2Fresource%2Fdiseases%2F74 via OpenLink Data Explorer]<br />
| http://www4.wiwiss.fu-berlin.de/diseasome/sparql<br />
|-<br />
| [http://code.google.com/p/junsbriefcase/wiki/TGDdataset RDF-TCM]<br />
| Genes / Diseases / Medicine / Ingredients<br />
| Traditional Chinese medicine, gene and disease association dataset and a linkset mapping TCM gene symbols to Entrez Gene IDs created by Neurocommons <br />
| 117,643 <br />
| updated August 2009 (stable)<br />
| [http://purl.org/net/tcm/tcm.lifescience.ntu.edu.tw/id/medicine/Ginkgo_biloba Ginkgo biloba] <br />
| http://hcls.deri.org/sparql; graph name: http://hcls.deri.org/resource/graph/tcm<br />
|-<br />
| [http://www.nlm.nih.gov/research/umls/rxnorm/ RxNorm]<br />
| Drugs<br />
| A linked version of the NLM's RxNorm database that connects prescription drugs, ingredients, and NDC through the RXCUI, a concept unique identifier. RxNorm is a product developed by NIH’s National Library of Medicine. It currently interlinks 12 different drug vocabularies around a unique concept identifier. Due to licensing, only six of the drug vocabularies are made available as part of the LODD cloud: Medical Subject Headings, Metathesaurus FDA National Drug Code Directory, Metathesaurus FDA Structured Product Labels, National Drug File, RxNorm Vocabulary, and Veterans Health Administration National Drug File.<br />
Links are provided connecting RxNorm to DrugBank and to the UMLS.<br />
|over 7.7 million triples; 165,806 RXCUI (Concept Unique Identifiers) Unique drugs and ingredients; 332,754 RXAUI (Atomic Unique Identifiers) sourced terms<br />
| Based on 3/2010 Rxnorm Release; Last updated 5/2010<br />
| [http://link.informatics.stonybrook.edu/rxnorm/RXAUI/2994963 Singulair from the Metathesaurus FDA Structured Product Labels]<br />
| http://link.informatics.stonybrook.edu/sparql/<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/sider/ SIDER]<br />
| Diseases / Side Effects<br />
| [http://sideeffects.embl.de/ SIDER] contains information on marketed drugs and their adverse effects ({{doi|10.1038/msb.2009.98}})<br />
| 192,515 triples; 63,000 adverse effect reports, 1,737 genes<br />
| updated 2009<br />
| Confusion [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/sider/resource/side_effects/C0009676 via Marbles]<br />
| http://www4.wiwiss.fu-berlin.de/sider/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/stitch/ STITCH]<br />
| Chemicals / Proteins<br />
| [http://stitch.embl.de/ STITCH] contains information on chemicals, proteins, and their interactions ({{doi|10.1093/nar/gkm795}})<br />
| 7,500,000 chemicals; 500,000 proteins; 370 organisms <br />
| updated July 2009<br />
| Lactose [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/stitch/resource/chemicals/CID000000294 via Marbles]<br />
| http://www4.wiwiss.fu-berlin.de/stitch/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/medicare/ Medicare]<br />
| Medicare Formulary<br />
| xxx<br />
| xxx<br />
| xxx<br />
| xxx<br />
| http://www4.wiwiss.fu-berlin.de/medicare/sparql<br />
|-<br />
| [http://www.ebi.ac.uk/chembl/ ChEMBL]<br />
| Chemical / Assays (Proteins, Organisms) / Papers<br />
| [http://www.ebi.ac.uk/chembl/ ChEMBL] contains information on trial drugs, including their activity against targets such as (but not limited to) proteins. All data is backed by and linked to the literature. Includes links to Bio2RDF for ChEBI and UniProt. License: CC-BY-SA.<br />
| ~24M triples<br />
| Updated 2010-01<br />
| An [http://rdf.farmbio.uu.se/chembl/snorql/?describe=http://rdf.farmbio.uu.se/chembl/activity/a2642163 IC50 activity].<br />
| http://rdf.farmbio.uu.se/chembl/sparql<br />
|-<br />
| [http://www.who.int/gho/en/index.html WHO Global Health Observatory]<br />
| Infectious Diseases / Demography / Socioeconomic Conditions / Environmental Factors<br />
| Data and statistics for infectious diseases at country, regional, and global levels<br />
| 354300<br />
| Updated 2010-09<br />
| xxx<br />
| http://aksw.org/Projects/Stats2RDF<br />
|-<br />
| [http://nlp.dbmi.pitt.edu/nlprepository.html University of Pittsburgh NLP Repository] <br />
| Drugs / Procedures / Diagnoses<br />
| A semantic index of concepts present in 800 full-text clinical notes from the University of Pittsburgh NLP Repository<br />
| 470,000<br />
| Proof of concept -- Updated 02/25/2011<br />
| [http://tarski.duhs.org:8080/openrdf-workbench/repositories/u-pitt-blulab/explore?resource=bl%3Areport_404 Concepts from a sample radiology report]<br />
| [http://tarski.duhs.org:8080/openrdf-workbench/repositories/u-pitt-blulab/query http://tarski.duhs.org:8080/openrdf-workbench/repositories/u-pitt-blulab/query]<br />
|-<br />
| [http://dbmi-icode-01.dbmi.pitt.edu/dikb-evidence/front-page.html The Drug Interaction Knowledge Base] <br />
| Drugs / Metabolic Inhibition Drug-drug Interactions (DDIs) / Claims and Evidence for drug mechanisms and DDIs<br />
| A D2R server of more than 60 drugs currently in the DIKB<br />
| 41,480<br />
| Updated 03/01/2012<br />
| [http://dbmi-icode-01.dbmi.pitt.edu:2020/page/Drugs/paroxetine paroxetine], [http://dbmi-icode-01.dbmi.pitt.edu:2020/page/Drugs/atorvastatin atorvastatin]<br />
| http://dbmi-icode-01.dbmi.pitt.edu:2020/<br />
|-<br />
| [http://purl.org/net/nlprepository/linkedSPLs Linked Structured Product Labels] <br />
| Selected sections from the FDA-approved Structured Product Labels (SPLs) for currently marketed drugs<br />
| A D2R server rendering properly encoded unstructured text from the SPLs<br />
| 47,925<br />
| Updated 03/01/2012<br />
| [http://thedatahub.org/dataset/linked-structured-product-labels/resource/918071ae-f570-4728-98b7-5b447ab42ab8 SPL for Venlafaxine Hydrochloride (American Health Packaging)]<br />
| http://purl.org/net/nlprepository/linkedSPLs<br />
|}<br />
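The addresses in the SPARQL Endpoint column are meant to be queried over HTTP. As a minimal sketch, assuming an endpoint follows the standard SPARQL protocol and accepts the query text in a <code>query</code> parameter of a GET request, a request URL could be built as shown below. The helper name <code>build_sparql_url</code> and the sample query are illustrative and not part of LODD, and the endpoints listed above may no longer be reachable, so the sketch only constructs the URL and does not send it.

```python
from urllib.parse import urlencode, urlsplit


def build_sparql_url(endpoint: str, query: str) -> str:
    """Return a GET URL that submits `query` to a SPARQL protocol endpoint."""
    # Append with "&" if the endpoint address already carries a query string.
    separator = "&" if urlsplit(endpoint).query else "?"
    return endpoint + separator + urlencode({"query": query})


# Illustrative query (an assumption, not taken from the page): ten arbitrary triples.
SAMPLE_QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"

# Endpoint address copied from the DrugBank row of the table.
url = build_sparql_url("http://www4.wiwiss.fu-berlin.de/drugbank/sparql", SAMPLE_QUERY)
print(url)
```

The resulting URL can be opened in a browser or passed to any HTTP client; the result format depends on what the individual endpoint supports.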
<br />
[[File:2010-12-04_lodd_cloud.png|600px]]<br />
<br />
A graph of some of the LODD datasets (dark grey), related biomedical datasets (light grey), related general-purpose datasets (white) and their interconnections. Line weights correspond to the number of links. The direction of an arrow indicates the dataset that contains the links, e.g., an arrow from A to B means that dataset A contains RDF triples that use identifiers from B. Bidirectional arrows usually indicate that the links are mirrored in both datasets.<br />
More on the interlinking methodology and statistics can be found on the [[../Interlinking|Interlinking]] page.<br />
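The arrow convention in the graph can be illustrated with a small sketch. It assumes that each dataset is identified by a URI prefix and, as a simplification, that a triple belongs to the dataset of its subject; a link from A to B is then a triple whose subject lies in A and whose object lies in B. The prefixes and sample triples are made up for illustration and do not come from the LODD data.

```python
from collections import Counter

# Hypothetical URI prefixes standing in for two datasets.
NAMESPACES = {
    "A": "http://example.org/a/",
    "B": "http://example.org/b/",
}


def dataset_of(term):
    """Return the name of the dataset whose prefix matches `term`, or None."""
    for name, prefix in NAMESPACES.items():
        if term.startswith(prefix):
            return name
    return None


def count_links(triples):
    """Count links per (source, target) pair of distinct datasets."""
    counts = Counter()
    for subject, _predicate, obj in triples:
        source, target = dataset_of(subject), dataset_of(obj)
        if source and target and source != target:
            counts[(source, target)] += 1
    return counts


triples = [
    ("http://example.org/a/drug1", "http://www.w3.org/2002/07/owl#sameAs", "http://example.org/b/compound9"),
    ("http://example.org/a/drug2", "http://www.w3.org/2002/07/owl#sameAs", "http://example.org/b/compound3"),
    # A literal object matches no dataset prefix, so this triple is not a link.
    ("http://example.org/a/drug1", "http://www.w3.org/2000/01/rdf-schema#label", "Drug one"),
]

links = count_links(triples)
print(links)
```

Here dataset A holds two triples that use identifiers from B, so the graph would show an arrow from A to B whose weight reflects those two links, and no arrow from B to A.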
<br />
The LODD datasets have been crawled by the SWSE Semantic Web search engine and can be accessed via a faceted browsing interface at [http://visinav.deri.org/hcls/] ([http://visinav.deri.org/hcls/list?keyword=varenicline Example query: Varenicline]).<br />
<br />
Most of the LODD datasets have also been integrated into the SPARQL endpoint of the HCLS Knowledge Base; see [http://esw.w3.org/topic/HCLSIG_BioRDF_Subgroup/DERI_HCLS_KB the wiki page of the HCLS KB] for further information.<br />
<br />
=== Bio2RDF Data Sets ===<br />
<br />
The [http://bio2rdf.org/ Bio2RDF project] has published 40 biology-, gene- and medical-related datasets (altogether 2.3 billion triples). <br />
The datasets are available via SPARQL endpoints and as Linked Data. It is recommended that you use the [http://sourceforge.net/project/platformdownload.php?group_id=142631 Bio2RDF Java Servlet], and optionally [http://quebec.bio2rdf.org/download/virtuoso/indexed/ download the databases] for efficient personal use. Running your own instance of the [http://virtuoso.openlinksw.com/wiki/main/Main/VirtEC2AMIBio2rdfInstall OpenLink Virtuoso AMI for EC2] is also an option. Basic URI resolution on such an instance does not require the Java Servlet, but for advanced queries you should still download it and configure it to query your EC2 SPARQL endpoint.<br />
<br />
* [http://www.freebase.com/view/user/bio2rdf/public/sparql Bio2RDF SPARQL endpoint list] [http://rdf.freebase.com/rdf/user/bio2rdf/public/sparql SPARQL endpoint list in RDF]<br />
* [http://linkeddata.openlinksw.com:8891/pubmed:10500064 Identification of an autoimmune enteropathy-related 75-kilodalton antigen], via an [[OpenLinkSoftware|OpenLink]] hosted edition of [[Bio2Rdf]] <br />
* [http://linkeddata.openlinksw.com:8891/pubmed:9636670 Structure of the gene encoding the human cyclin-dependent kinase inhibitor p18 and mutational analysis in breast cancer], via an [[OpenLinkSoftware|OpenLink]] hosted edition of [[Bio2Rdf]] <br />
* [http://beckr.org/marbles?uri=http%3A%2F%2Fbio2rdf.org%2Fpubmed%3A9626117 PubMed article] viewed using the Marbles Linked Data browser.<br />
* [http://beckr.org/marbles?lang=en&uri=http%3A%2F%2Fbio2rdf.org%2Ffoaf%3AClemens%2C_T_L PubMed author] viewed using the Marbles Linked Data browser.<br />
* [http://beckr.org/marbles?uri=http%3A%2F%2Fbio2rdf.org%2Fomim%3A161555 OMIM Killer Cell Lectin-Like Receptor] viewed using the Marbles Linked Data browser.<br />
* [http://iws.seu.edu.cn/services/falcons/objectsearch/queryresult.jsp?query=%22KILLER+CELL%22 Falcons Search for KILLER CELL]. The Bio2RDF data has been crawled by the Falcons Semantic Web search engine. This is an example of how people can access the data through the search engine. Falcons also offers an API that applications can use to access the data.<br />
<br />
=== Chem2bio2RDF ===<br />
<br />
* Information about the [http://chem2bio2rdf.org/ chem2bio2rdf] data sets<br />
<br />
=== Data Sets for the LODD Task ===<br />
<br />
To complement the drug-related Web of Data built by the LODD effort, the following data sets could/should also be published as Linked Data.<br />
<br />
The LODD effort is currently gathering more information about relevant datasets. See also [[/DataSetEvaluation|Evaluation of LODD Data Sets]] for current evaluation results.<br />
<br />
* [http://library.dialog.com/bluesheets/html/bl0107.html Adis R&D Insight]<br />
* [http://www.ebi.ac.uk/chebi/ ChEBI]<br />
* [http://xpdb.nist.gov/pdb/chemblast.html ChemBlast]<br />
* [http://www.chemspider.com/ ChemSpider]<br />
* [http://ClinicalTrials.gov ClinicalTrials.gov]<br />
* [http://www.citeline.com/trialtrove.html Citeline TrialTrove]<br />
* [http://dailymed.nlm.nih.gov/dailymed/about.cfm DailyMed]<br />
* [http://dbpedia.org/About DBpedia]<br />
* [http://www.nd.edu/~alb/Publication06/145-HumanDisease_PNAS-14My07-Proc/Suppl/ Diseasome]<br />
* [http://www.drugbank.ca/ DrugBank]<br />
* [http://www.virtualref.com/abs/72.htm DrugDB]<br />
* [http://www.ncbi.nlm.nih.gov/pubmed/17921997 Drugome]<br />
* [http://lsdis.cs.uga.edu/projects/asdoc/ Drug Ontology]<br />
* [http://scientific.thomsonreuters.com/products/iddb/ Investigational Drug Database] - Proprietary<br />
* [http://www.ovid.com/site/catalog/DataBase/1244.jsp?top=2&mid=3&bottom=7&subsection=10 IMS]<br />
* [http://www.genome.jp/kegg/drug/ KEGG Drug]<br />
* [http://Lillytrials.com LillyTrials]<br />
* [http://www.fda.gov/medwatch/ MedWatch]<br />
* [http://www.fda.gov/cder/ndc/ National Drug Code]<br />
* [http://www.ncbi.nlm.nih.gov/omim/ OMIM]<br />
* [http://www.fda.gov/cder/ob/ Orange Book]<br />
* [http://www.pharmaprojects.com/ Pharmaprojects] - Proprietary<br />
* [http://pubchem.ncbi.nlm.nih.gov/ PubChem]<br />
* [http://www.nlm.nih.gov/research/umls/rxnorm/ RxNorm]<br />
* [http://www.ncbi.nlm.nih.gov/sites/entrez?Db=pubmed&Cmd=ShowDetailView&TermToSearch=15360858 VA NDF-RT]<br />
* Other data sources could include blogs, discussion boards, wikis, etc.<br />
* and.... <br />
** [http://www.who.int/globalatlas/ World Health Organization's Global Health Atlas]<br />
** [http://www.epispider.org/ EpiSPIDER]<br />
** [http://www.accessdata.fda.gov/Scripts/cder/DrugsatFDA/ Drugs@FDA - FDA Approved Drug Products]<br />
** [http://www.drugdigest.org/wps/portal/ddigest DrugDigest]<br />
** [http://humancyc.org/ HumanCyc: Encyclopedia of ''Homo sapiens'' Genes and Metabolism]<br />
** [http://www.alzforum.org/ Alzheimer Research Forum]<br />
** [http://wwwcf.nlm.nih.gov/umlslicense/rxtermApp/rxTerm.cfm RxTerms]<br />
** [http://hudine.neu.edu/ HuDiNe]<br />
** [http://wiki.medpedia.com/Cymbalta Medpedia]<br />
** [http://tcm.lifescience.ntu.edu.tw/ TCMGeneDIT] and [[Media:HCLSIG$$LODD$$Data$TCMGeneDIT_RDF_Dataset_r1.zip|RDF dump]]<br />
** [http://www.tuftsctsi.org/~/media/Files/CTSI/Library%20Files/FCC%20for%20CER%20Rpt%20to%20Pres%20and%20Congress_063009.ashx List of other possible data sources from page 66 onwards]<br />
<br />
=== Alternative Herbal Medicine use case ===<br />
* [[Data/TCMGeneDIT|TCMGeneDIT dataset]]<br />
<br />
=== Identifier-Based Linkage Points ===<br />
<br />
* InChIs<br />
* [[PubChem]] Compound ID (CID)<br />
* [[PubChem]] NSC<br />
* Chemical Abstracts Service (CAS) Registry Number<br />
* New Drug Application (NDA)<br />
<br />
=== Data Set Attributes ===<br />
<br />
* Licensing<br />
* Data Format<br />
* Identifiers</div>Rboycehttps://www.w3.org/wiki/index.php?title=Talk:HCLSIG/Tools&diff=57924Talk:HCLSIG/Tools2012-04-16T20:53:38Z<p>Rboyce: </p>
<hr />
<div>04/16/2012 (Rich Boyce): Did a search over twitter for #hcsltools and got no results. <br />
<br />
<br />
04/16/2012 (Rich Boyce): Would like to add a "+1" button that targets the URL of each tool (for example, by adding the JS generated here: http://www.google.com/webmasters/+1/button/). This might be possible using MediaWiki, but there does not appear to be an extension that allows <script> tags to be embedded in Wiki pages. There is an extension for "+1" here <http://www.mediawiki.org/wiki/Extension:Google_%2B1> but it uses a fixed URL that would be specific to the W3C server rather than the tool.</div>Rboycehttps://www.w3.org/wiki/index.php?title=Talk:HCLSIG/Tools&diff=57923Talk:HCLSIG/Tools2012-04-16T20:53:21Z<p>Rboyce: Created page with "04/16/2012 (Rich Boyce): Did a search over twitter for #hcsltools and got no results. 04/16/2012 (Rich Boyce): Would like to add a "+1" button that targets the URL of each tool …"</p>
<hr />
<div>04/16/2012 (Rich Boyce): Did a search over twitter for #hcsltools and got no results. <br />
04/16/2012 (Rich Boyce): Would like to add a "+1" button that targets the URL of each tool (for example, by adding the JS generated here: http://www.google.com/webmasters/+1/button/). This might be possible using MediaWiki, but there does not appear to be an extension that allows <script> tags to be embedded in Wiki pages. There is an extension for "+1" here <http://www.mediawiki.org/wiki/Extension:Google_%2B1> but it uses a fixed URL that would be specific to the W3C server rather than the tool.</div>Rboycehttps://www.w3.org/wiki/index.php?title=HCLSIG/LODD/Data&diff=57259HCLSIG/LODD/Data2012-03-16T19:29:05Z<p>Rboyce: </p>
<hr />
<div>__NOTOC__<br />
=== LODD-related datasets that the LODD group already made available as Linked Data ===<br />
<br />
{| border="1" cellpadding="2" cellspacing="0"<br />
| '''Name'''<br />
| '''Topic'''<br />
| '''Short Description'''<br />
| '''Size and coverage'''<br />
| '''Status / Activity'''<br />
| '''Example Instances'''<br />
| '''SPARQL Endpoint'''<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/drugbank/ DrugBank]<br />
| Drugs<br />
| [http://www.drugbank.ca/ Drugbank.ca] provides drug (i.e., chemical, pharmacological and pharmaceutical) data with comprehensive drug target (i.e., sequence, structure, and pathway) information ({{doi|10.1093/nar/gkj067}})<br />
| 766,920 triples; 4,800 drugs, 2,500 protein sequences<br />
| updated regularly<br />
| Varenicline [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugs/DB01273 via Marbles], [http://demo.openlinksw.com/ode/?uri=http%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fdrugbank%2Fresource%2Fdrugs%2FDB01273 via OpenLink Data Explorer] <br />
| http://www4.wiwiss.fu-berlin.de/drugbank/sparql <br />
|-<br />
| [http://linkedct.org/ LinkedCT]<br />
| Clinical Trials<br />
| Linked data source of trials from [http://clinicaltrials.gov ClinicalTrials.gov]<br />
| ~25 million triples, 106,000 trials (as of April 2011)<br />
| Updated automatically and continuously; refer to the [http://linkedct.org/faq/ FAQ] for more details.<br />
| [http://data.linkedct.org/resource/condition/breast-cancer/ Breast Cancer] (Condition), [http://data.linkedct.org/resource/trial/nct00999557/ NCT00999557] (Trial), [http://data.linkedct.org/resource/city/toronto/ Toronto] (City). <br />
| http://data.linkedct.org/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/dailymed/ DailyMed]<br />
| Drugs<br />
| [http://dailymed.nlm.nih.gov/dailymed/about.cfm dailymed.nlm.nih.gov] provides information about approved prescription drugs, including FDA-approved labels (package inserts)<br />
| 164,276 triples; 4,039 drugs<br />
| updated regularly<br />
| "Sterile Water (Irrigant)" [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/dailymed/resource/drugs/492 via Marbles], [http://demo.openlinksw.com/ode/?uri=http%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fdailymed%2Fresource%2Fdrugs%2F492 via OpenLink Data Explorer]<br />
| http://www4.wiwiss.fu-berlin.de/dailymed/sparql<br />
|-<br />
| [http://dbpedia.org/About DBpedia]<br />
| Drugs/ Diseases/ Proteins<br />
| RDF data about 2.49 million things that has been extracted from Wikipedia<br />
| 218 million RDF triples; 2,300 drugs, 2,200 proteins<br />
| updated every 3 months <br />
| [http://dbpedia.org/resource/Aspirin Aspirin], [http://dbpedia.org/resource/HIV HIV]<br />
| http://dbpedia.org/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/diseasome/ Diseasome]<br />
| Diseases / Genes<br />
| [http://www.nd.edu/~alb/Publication06/145-HumanDisease_PNAS-14My07-Proc/Suppl/ Diseasome] describes characteristics of disorders and disease genes linked by known disorder–gene associations<br />
| 91,182 triples; 2,600 genes<br />
| updated 2006<br />
| Alzheimer's [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/diseasome/resource/diseases/74 via Marbles], [http://demo.openlinksw.com/ode/?uri=http%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fdiseasome%2Fresource%2Fdiseases%2F74 via OpenLink Data Explorer]<br />
| http://www4.wiwiss.fu-berlin.de/diseasome/sparql<br />
|-<br />
| [http://code.google.com/p/junsbriefcase/wiki/TGDdataset RDF-TCM]<br />
| Genes / Diseases / Medicine / Ingredients<br />
| Traditional Chinese medicine, gene and disease association dataset and a linkset mapping TCM gene symbols to Entrez Gene IDs created by Neurocommons <br />
| 117,643 <br />
| updated August 2009 (stable)<br />
| [http://purl.org/net/tcm/tcm.lifescience.ntu.edu.tw/id/medicine/Ginkgo_biloba Ginkgo biloba] <br />
| http://hcls.deri.org/sparql; graph name: http://hcls.deri.org/resource/graph/tcm<br />
|-<br />
| [http://www.nlm.nih.gov/research/umls/rxnorm/ RxNorm]<br />
| Drugs<br />
| A linked version of the NLM's RxNorm database that connects prescription drugs, ingredients, and NDCs through the RXCUI, a concept unique identifier. RxNorm is a product developed by NIH’s National Library of Medicine. It currently interlinks 12 different drug vocabularies around a unique concept identifier. Due to licensing restrictions, only six of the drug vocabularies are made available as part of the LODD cloud: Medical Subject Headings, Metathesaurus FDA National Drug Code Directory, Metathesaurus FDA Structured Product Labels, National Drug File, RxNorm Vocabulary, and Veterans Health Administration National Drug File.
Links are provided connecting RxNorm to DrugBank and to the UMLS.<br />
| Over 7.7 million triples; 165,806 RXCUIs (Concept Unique Identifiers) for unique drugs and ingredients; 332,754 RXAUIs (Atomic Unique Identifiers) for sourced terms<br />
| Based on the 3/2010 RxNorm release; last updated 5/2010<br />
| [http://link.informatics.stonybrook.edu/rxnorm/RXAUI/2994963 Singulair from the Metathesaurus FDA Structured Product Labels]<br />
| http://link.informatics.stonybrook.edu/sparql/<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/sider/ SIDER]<br />
| Diseases / Side Effects<br />
| [http://sideeffects.embl.de/ SIDER] contains information on marketed drugs and their adverse effects ({{doi|10.1038/msb.2009.98}})<br />
| 192,515 triples; 63,000 adverse effect reports, 1,737 genes<br />
| updated 2009<br />
| Confusion [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/sider/resource/side_effects/C0009676 via Marbles]<br />
| http://www4.wiwiss.fu-berlin.de/sider/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/stitch/ STITCH]<br />
| Chemicals / Proteins<br />
| [http://stitch.embl.de/ STITCH] contains information on chemicals, proteins, and their interactions ({{doi|10.1093/nar/gkm795}})<br />
| 7,500,000 chemicals; 500,000 proteins; 370 organisms <br />
| updated July 2009<br />
| Lactose [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/stitch/resource/chemicals/CID000000294 via Marbles]<br />
| http://www4.wiwiss.fu-berlin.de/stitch/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/medicare/ Medicare]<br />
| Medicare Formulary<br />
| xxx<br />
| xxx<br />
| xxx<br />
| xxx<br />
| http://www4.wiwiss.fu-berlin.de/medicare/sparql<br />
|-<br />
| [http://www.ebi.ac.uk/chembl/ ChEMBL]<br />
| Chemical / Assays (Proteins, Organisms) / Papers<br />
| [http://www.ebi.ac.uk/chembl/ ChEMBL] contains information on trial drugs, including their activity against targets such as (but not limited to) proteins. All data is backed by and linked to the literature. Includes links to Bio2RDF for ChEBI and UniProt. License: CC-BY-SA.<br />
| ~24M triples<br />
| Updated 2010-01<br />
| An [http://rdf.farmbio.uu.se/chembl/snorql/?describe=http://rdf.farmbio.uu.se/chembl/activity/a2642163 IC50 activity].<br />
| http://rdf.farmbio.uu.se/chembl/sparql<br />
|-<br />
| [http://www.who.int/gho/en/index.html WHO Global Health Observatory]<br />
| Infectious Diseases / Demography / Socioeconomic Conditions / Environmental Factors<br />
| Data and statistics for infectious diseases at country, regional, and global levels<br />
| 354,300<br />
| Updated 2010-09<br />
| xxx<br />
| http://aksw.org/Projects/Stats2RDF<br />
|-<br />
| [http://nlp.dbmi.pitt.edu/nlprepository.html University of Pittsburgh NLP Repository] <br />
| Drugs / Procedures / Diagnoses<br />
| A semantic index of concepts present in 800 full-text clinical notes from the University of Pittsburgh NLP Repository<br />
| 116,855<br />
| Proof of concept -- Updated 02/25/2011<br />
| [http://tarski.duhs.org:8080/openrdf-workbench/repositories/u-pitt-blulab/explore?resource=bl%3Areport_404 Concepts from a sample radiology report]<br />
| [http://tarski.duhs.org:8080/openrdf-workbench/repositories/u-pitt-blulab/query http://tarski.duhs.org:8080/openrdf-workbench/repositories/u-pitt-blulab/query]<br />
|-<br />
| [http://dbmi-icode-01.dbmi.pitt.edu/dikb-evidence/front-page.html The Drug Interaction Knowledge Base] <br />
| Drugs / Metabolic Inhibition Drug-drug Interactions (DDIs) / Claims and Evidence for drug mechanisms and DDIs<br />
| A D2R server of more than 60 drugs currently in the DIKB<br />
| 41,480<br />
| Updated 03/01/2012<br />
| [http://dbmi-icode-01.dbmi.pitt.edu:2020/page/Drugs/paroxetine paroxetine], [http://dbmi-icode-01.dbmi.pitt.edu:2020/page/Drugs/atorvastatin atorvastatin]<br />
| http://dbmi-icode-01.dbmi.pitt.edu:2020/<br />
|-<br />
| [http://purl.org/net/nlprepository/linkedSPLs Linked Structured Product Labels] <br />
| Selected sections from the FDA-approved Structured Product Labels (SPLs) for currently marketed drugs<br />
| A D2R server rendering properly encoded unstructured text from the SPLs<br />
| 47,925<br />
| Updated 03/01/2012<br />
| [http://thedatahub.org/dataset/linked-structured-product-labels/resource/918071ae-f570-4728-98b7-5b447ab42ab8 SPL for Venlafaxine Hydrochloride (American Health Packaging)]<br />
| http://purl.org/net/nlprepository/linkedSPLs<br />
|}<br />
<br />
[[File:2010-12-04_lodd_cloud.png|600px]]<br />
<br />
A graph of some of the LODD datasets (dark grey), related biomedical datasets (light grey), related general-purpose datasets (white) and their interconnections. Line weights correspond to the number of links. The direction of an arrow indicates the dataset that contains the links, e.g., an arrow from A to B means that dataset A contains RDF triples that use identifiers from B. Bidirectional arrows usually indicate that the links are mirrored in both datasets.<br />
More on the interlinking methodology and statistics can be found on the [[../Interlinking|Interlinking]] page.<br />
<br />
The LODD datasets have been crawled by the SWSE Semantic Web search engine and can be accessed via a faceted browsing interface at [http://visinav.deri.org/hcls/] ([http://visinav.deri.org/hcls/list?keyword=varenicline Example query: Varenicline]).<br />
<br />
Most of the LODD datasets have also been integrated into the SPARQL endpoint of the HCLS Knowledge Base; see [http://esw.w3.org/topic/HCLSIG_BioRDF_Subgroup/DERI_HCLS_KB the wiki page of the HCLS KB] for further information.<br />
<br />
=== Bio2RDF Data Sets ===<br />
<br />
The [http://bio2rdf.org/ Bio2RDF project] has published 40 biology-, gene- and medical-related datasets (altogether 2.3 billion triples). <br />
The datasets are available via SPARQL endpoints and as Linked Data. It is recommended that you use the [http://sourceforge.net/project/platformdownload.php?group_id=142631 Bio2RDF Java Servlet], and optionally [http://quebec.bio2rdf.org/download/virtuoso/indexed/ download the databases] for efficient personal use. Running your own instance of the [http://virtuoso.openlinksw.com/wiki/main/Main/VirtEC2AMIBio2rdfInstall OpenLink Virtuoso AMI for EC2] is also an option. Basic URI resolution on such an instance does not require the Java Servlet, but for advanced queries you should still download it and configure it to query your EC2 SPARQL endpoint.<br />
<br />
* [http://www.freebase.com/view/user/bio2rdf/public/sparql Bio2RDF SPARQL endpoint list] [http://rdf.freebase.com/rdf/user/bio2rdf/public/sparql SPARQL endpoint list in RDF]<br />
* [http://linkeddata.openlinksw.com:8891/pubmed:10500064 Identification of an autoimmune enteropathy-related 75-kilodalton antigen], via an [[OpenLinkSoftware|OpenLink]] hosted edition of [[Bio2Rdf]] <br />
* [http://linkeddata.openlinksw.com:8891/pubmed:9636670 Structure of the gene encoding the human cyclin-dependent kinase inhibitor p18 and mutational analysis in breast cancer], via an [[OpenLinkSoftware|OpenLink]] hosted edition of [[Bio2Rdf]] <br />
* [http://beckr.org/marbles?uri=http%3A%2F%2Fbio2rdf.org%2Fpubmed%3A9626117 PubMed article] viewed using the Marbles Linked Data browser.<br />
* [http://beckr.org/marbles?lang=en&uri=http%3A%2F%2Fbio2rdf.org%2Ffoaf%3AClemens%2C_T_L PubMed author] viewed using the Marbles Linked Data browser.<br />
* [http://beckr.org/marbles?uri=http%3A%2F%2Fbio2rdf.org%2Fomim%3A161555 OMIM Killer Cell Lectin-Like Receptor] viewed using the Marbles Linked Data browser.<br />
* [http://iws.seu.edu.cn/services/falcons/objectsearch/queryresult.jsp?query=%22KILLER+CELL%22 Falcons Search for KILLER CELL]. The Bio2RDF data has been crawled by the Falcons Semantic Web search engine. This is an example of how people can access the data through the search engine. Falcons also offers an API that applications can use to access the data.<br />
<br />
=== Chem2bio2RDF ===<br />
<br />
* Information about the [http://chem2bio2rdf.org/ chem2bio2rdf] data sets<br />
<br />
=== Data Sets for the LODD Task ===<br />
<br />
To complement the drug-related Web of Data built by the LODD effort, the following data sets could/should also be published as Linked Data.<br />
<br />
The LODD effort is currently gathering more information about relevant datasets. See also [[/DataSetEvaluation|Evaluation of LODD Data Sets]] for current evaluation results.<br />
<br />
* [http://library.dialog.com/bluesheets/html/bl0107.html Adis R&D Insight]<br />
* [http://www.ebi.ac.uk/chebi/ ChEBI]<br />
* [http://xpdb.nist.gov/pdb/chemblast.html ChemBlast]<br />
* [http://www.chemspider.com/ ChemSpider]<br />
* [http://ClinicalTrials.gov ClinicalTrials.gov]<br />
* [http://www.citeline.com/trialtrove.html Citeline TrialTrove]<br />
* [http://dailymed.nlm.nih.gov/dailymed/about.cfm DailyMed]<br />
* [http://dbpedia.org/About DBpedia]<br />
* [http://www.nd.edu/~alb/Publication06/145-HumanDisease_PNAS-14My07-Proc/Suppl/ Diseasome]<br />
* [http://www.drugbank.ca/ DrugBank]<br />
* [http://www.virtualref.com/abs/72.htm DrugDB]<br />
* [http://www.ncbi.nlm.nih.gov/pubmed/17921997 Drugome]<br />
* [http://lsdis.cs.uga.edu/projects/asdoc/ Drug Ontology]<br />
* [http://scientific.thomsonreuters.com/products/iddb/ Investigational Drug Database] - Proprietary<br />
* [http://www.ovid.com/site/catalog/DataBase/1244.jsp?top=2&mid=3&bottom=7&subsection=10 IMS]<br />
* [http://www.genome.jp/kegg/drug/ KEGG Drug]<br />
* [http://Lillytrials.com LillyTrials]<br />
* [http://www.fda.gov/medwatch/ MedWatch]<br />
* [http://www.fda.gov/cder/ndc/ National Drug Code]<br />
* [http://www.ncbi.nlm.nih.gov/omim/ OMIM]<br />
* [http://www.fda.gov/cder/ob/ Orange Book]<br />
* [http://www.pharmaprojects.com/ Pharmaprojects] - Proprietary<br />
* [http://pubchem.ncbi.nlm.nih.gov/ PubChem]<br />
* [http://www.nlm.nih.gov/research/umls/rxnorm/ RxNorm]<br />
* [http://www.ncbi.nlm.nih.gov/sites/entrez?Db=pubmed&Cmd=ShowDetailView&TermToSearch=15360858 VA NDF-RT]<br />
* Other data sources could include blogs, discussion boards, wikis, etc.<br />
* and.... <br />
** [http://www.who.int/globalatlas/ World Health Organization's Global Health Atlas]<br />
** [http://www.epispider.org/ EpiSPIDER]<br />
** [http://www.accessdata.fda.gov/Scripts/cder/DrugsatFDA/ Drugs@FDA - FDA Approved Drug Products]<br />
** [http://www.drugdigest.org/wps/portal/ddigest DrugDigest]<br />
** [http://humancyc.org/ HumanCyc: Encyclopedia of ''Homo sapiens'' Genes and Metabolism]<br />
** [http://www.alzforum.org/ Alzheimer Research Forum]<br />
** [http://wwwcf.nlm.nih.gov/umlslicense/rxtermApp/rxTerm.cfm RxTerms]<br />
** [http://hudine.neu.edu/ HuDiNe]<br />
** [http://wiki.medpedia.com/Cymbalta Medpedia]<br />
** [http://tcm.lifescience.ntu.edu.tw/ TCMGeneDIT] and [[Media:HCLSIG$$LODD$$Data$TCMGeneDIT_RDF_Dataset_r1.zip|RDF dump]]<br />
** [http://www.tuftsctsi.org/~/media/Files/CTSI/Library%20Files/FCC%20for%20CER%20Rpt%20to%20Pres%20and%20Congress_063009.ashx List of other possible data sources from page 66 onwards]<br />
<br />
=== Alternative Herbal Medicine use case ===<br />
* [[Data/TCMGeneDIT|TCMGeneDIT dataset]]<br />
<br />
=== Identifier-Based Linkage Points ===<br />
<br />
* InChIs<br />
* [[PubChem]] Compound ID (CID)<br />
* [[PubChem]] NSC<br />
* Chemical Abstracts Service (CAS) Registry Number<br />
* New Drug Application (NDA)<br />
<br />
=== Data Set Attributes ===<br />
<br />
* Licensing<br />
* Data Format<br />
* Identifiers</div>Rboycehttps://www.w3.org/wiki/index.php?title=HCLSIG/Tools&diff=56147HCLSIG/Tools2012-01-19T15:57:40Z<p>Rboyce: </p>
<hr />
<div>Know of a tool? Tweet it using #hclstool ! We will add the tool when we periodically aggregate tweets with that hashtag.<br />
<br />
= General support for Semantic Web applications =<br />
<br />
* [http://sourceforge.net/apps/mediawiki/swobjects/index.php?title=Main_Page SWObjects]<br />
<br />
= Tools to convert relational data to RDF =<br />
<br />
== [http://www4.wiwiss.fu-berlin.de/bizer/d2r-server/ D2R server] ==<br />
<br />
D2R Server is a tool for publishing relational databases on the Semantic Web.<br />
It enables RDF and HTML browsers to navigate the content of the database,<br />
and allows applications to query the database using the SPARQL query language.<br />
<br />
<br />
= Tools that support converting non-relational data to RDF =<br />
<br />
== [http://dblab.cs.toronto.edu/project/xcurator/ xCurator ] ==<br />
<br />
The xCurator project offers an end-to-end framework to transform a semi-structured (XML) source into high-quality Linked Data</div>Rboycehttps://www.w3.org/wiki/index.php?title=HCLSIG/Tools&diff=56146HCLSIG/Tools2012-01-19T15:54:12Z<p>Rboyce: </p>
<hr />
<div>Know of a tool? Tweet it using #hclstool ! We add the tool when we periodically aggregate tweets with that hashtag.<br />
<br />
= General support for Semantic Web applications =<br />
<br />
* [http://sourceforge.net/apps/mediawiki/swobjects/index.php?title=Main_Page SWObjects]<br />
<br />
= Tools to convert relational data to RDF =<br />
<br />
== [http://www4.wiwiss.fu-berlin.de/bizer/d2r-server/ D2R server] ==<br />
<br />
D2R Server is a tool for publishing relational databases on the Semantic Web.<br />
It enables RDF and HTML browsers to navigate the content of the database,<br />
and allows applications to query the database using the SPARQL query language.<br />
<br />
<br />
= Tools that support converting non-relational data to RDF =<br />
<br />
== [http://dblab.cs.toronto.edu/project/xcurator/ xCurator ] ==<br />
<br />
The xCurator project offers an end-to-end framework to transform a semi-structured (XML) source into high-quality Linked Data</div>Rboycehttps://www.w3.org/wiki/index.php?title=HCLSIG/Tools&diff=56145HCLSIG/Tools2012-01-19T15:53:50Z<p>Rboyce: </p>
<hr />
<div>= Know of a tool? Tweet it using #hclstool ! We add the tool when we periodically aggregate tweets with that hashtag. =<br />
<br />
= General support for Semantic Web applications =<br />
<br />
* [http://sourceforge.net/apps/mediawiki/swobjects/index.php?title=Main_Page SWObjects]<br />
<br />
= Tools to convert relational data to RDF =<br />
<br />
== [http://www4.wiwiss.fu-berlin.de/bizer/d2r-server/ D2R server] ==<br />
<br />
D2R Server is a tool for publishing relational databases on the Semantic Web.<br />
It enables RDF and HTML browsers to navigate the content of the database,<br />
and allows applications to query the database using the SPARQL query language.<br />
<br />
<br />
= Tools that support converting non-relational data to RDF =<br />
<br />
== [http://dblab.cs.toronto.edu/project/xcurator/ xCurator ] ==<br />
<br />
The xCurator project offers an end-to-end framework to transform a semi-structured (XML) source into high-quality Linked Data</div>Rboycehttps://www.w3.org/wiki/index.php?title=HCLSIG/Tools&diff=56144HCLSIG/Tools2012-01-19T15:50:03Z<p>Rboyce: /* general support for Semantic Web applications */</p>
<hr />
<div>= General support for Semantic Web applications =<br />
<br />
* [http://sourceforge.net/apps/mediawiki/swobjects/index.php?title=Main_Page SWObjects]<br />
<br />
= Tools to convert relational data to RDF =<br />
<br />
== [http://www4.wiwiss.fu-berlin.de/bizer/d2r-server/ D2R server] ==<br />
<br />
D2R Server is a tool for publishing relational databases on the Semantic Web.<br />
It enables RDF and HTML browsers to navigate the content of the database,<br />
and allows applications to query the database using the SPARQL query language.<br />
<br />
<br />
= Tools that support converting non-relational data to RDF =<br />
<br />
== [http://dblab.cs.toronto.edu/project/xcurator/ xCurator ] ==<br />
<br />
The xCurator project offers an end-to-end framework to transform a semi-structured (XML) source into high-quality Linked Data</div>Rboycehttps://www.w3.org/wiki/index.php?title=HCLSIG/Tools&diff=56143HCLSIG/Tools2012-01-19T15:49:47Z<p>Rboyce: </p>
<hr />
<div>= general support for Semantic Web applications =<br />
<br />
* [http://sourceforge.net/apps/mediawiki/swobjects/index.php?title=Main_Page SWObjects]<br />
<br />
= Tools to convert relational data to RDF =<br />
<br />
== [http://www4.wiwiss.fu-berlin.de/bizer/d2r-server/ D2R server] ==<br />
<br />
D2R Server is a tool for publishing relational databases on the Semantic Web.<br />
It enables RDF and HTML browsers to navigate the content of the database,<br />
and allows applications to query the database using the SPARQL query language.<br />
<br />
<br />
= Tools that support converting non-relational data to RDF =<br />
<br />
== [http://dblab.cs.toronto.edu/project/xcurator/ xCurator ] ==<br />
<br />
The xCurator project offers an end-to-end framework to transform a semi-structured (XML) source into high-quality Linked Data</div>Rboycehttps://www.w3.org/wiki/index.php?title=HCLSIG/LODD&diff=56142HCLSIG/LODD2012-01-19T15:47:18Z<p>Rboyce: /* Project Pages */</p>
<hr />
<div>__NOTOC__<br />
== Linking Open Drug Data (LODD) ==<br />
<br />
=== Project Description ===<br />
<br />
A great deal of interesting information about drugs is available on the Web. The data sources range from the impact of drugs on gene expression to the results of clinical trials. This project focuses on linking the various sources of drug data together to answer interesting scientific and business questions. More on the project's [[/Roadmap| Deliverables and Roadmap]].<br />
<br />
The figure below shows part of the data sets that have been published and interlinked by the project so far, within the Linked Data cloud. These data sets are represented in dark gray, while light gray represents other Linked Data from the life sciences, and white indicates interlinked datasets. Collectively, the data sets consist of over 8 million RDF triples, which are interlinked by more than 370,000 RDF links (as of August 2009). More details are available on the [[/Data| datasets]] page.<br />
<br />
<br><br />
http://www4.wiwiss.fu-berlin.de/lodd/lodd-datasets_2009-08-06.png<br />
<br />
A highlight of this project is using state-of-the-art semantic link discovery techniques for interlinking the published datasets. More on the interlinking methodology can be found on the [[/Interlinking| Interlinking]] page.<br />
<br />
One of the main goals of this project is investigating use cases that demonstrate how researchers in the life sciences, as well as physicians and patients, can take advantage of the connected data sets. Read more about some of the [[../AlternativeMedicineUseCase/| use cases]].<br />
<br />
<br><br />
=== Meetings ===<br />
<br />
==== Telcon ====<br />
* [[/Meetings/2011-07-11_Conference_Call|Next Meeting Wed July 11, 2011]]<br />
* [[/Meetings|Past Meetings]]<br />
* [[/Actions| Action Items]]<br />
<br />
<br><br />
<br />
=== Project Pages ===<br />
<br />
* [[/Roadmap| Deliverables and Roadmap]]<br />
* [[/Data|Data Sets]]<br />
* [[/Interlinking|Interlinking]]<br />
* [[../AlternativeMedicineUseCase/|Alternative Medicine Use Case]]<br />
* [[/Business|Business Case]]<br />
* [[/Mapping Experimental Data]]<br />
* [[/Literature|Literature]]<br />
<br />
<br><br />
<br />
=== Participants ===<br />
* Anja Jentzsch (Freie Universität Berlin)<br />
* Eric Prud'hommeaux (W3C)<br />
* Susie Stephens (Johnson & Johnson Pharmaceutical Research & Development)<br />
* Bosse Anderssen (AZ)<br />
* M. Scott Marshall (University of Amsterdam)<br />
* Chris Bizer (Freie Universität Berlin)<br />
* Glen Newton (National Research Council Canada)<br />
* Michel Dumontier (Carleton University)<br />
* TN Bhat (NIST)<br />
* Oktie Hassanzadeh (University of Toronto)<br />
* Matthias Samwald (DERI)<br />
* Jun Zhao (University of Oxford)<br />
* Egon Willighagen (Uppsala University)<br />
* Janos Hajagos (Stony Brook University School of Medicine)<br />
* Claus Stie Kallesøe (Lundbeck)<br />
<br />
<br><br />
<br />
If you have any questions please contact [mailto:Susie.Stephens@gmail.com Susie Stephens]<br />
<br />
Categories: [[Category:Hclsig]]</div>Rboycehttps://www.w3.org/wiki/index.php?title=HCLSIG/LODD/RelationalDataToRDFTools&diff=55774HCLSIG/LODD/RelationalDataToRDFTools2011-12-02T20:42:22Z<p>Rboyce: Created page with "= Tools to convert relational data to RDF = == [http://www4.wiwiss.fu-berlin.de/bizer/d2r-server/ D2R server] == D2R Server is a tool for publishing relational databases on the…"</p>
<hr />
<div>= Tools to convert relational data to RDF =<br />
<br />
== [http://www4.wiwiss.fu-berlin.de/bizer/d2r-server/ D2R server] ==<br />
<br />
D2R Server is a tool for publishing relational databases on the Semantic Web.<br />
It enables RDF and HTML browsers to navigate the content of the database,<br />
and allows applications to query the database using the SPARQL query language</div>Rboycehttps://www.w3.org/wiki/index.php?title=HCLSIG/LODD/NonRelationalDataToRDFTools&diff=55773HCLSIG/LODD/NonRelationalDataToRDFTools2011-12-02T20:40:15Z<p>Rboyce: </p>
<hr />
<div>= Tools that support converting non-relational data to RDF =<br />
<br />
== [http://dblab.cs.toronto.edu/project/xcurator/ xCurator ] ==<br />
<br />
The xCurator project offers an end-to-end framework to transform a semi-structured (XML) source into high-quality Linked Data</div>Rboycehttps://www.w3.org/wiki/index.php?title=HCLSIG/LODD/NonRelationalDataToRDFTools&diff=55772HCLSIG/LODD/NonRelationalDataToRDFTools2011-12-02T20:39:22Z<p>Rboyce: Created page with "= Tools that support converting non-relational data to RDF = === [http://dblab.cs.toronto.edu/project/xcurator/ xCurator ] === The xCurator project offers an end-to-end framewo…"</p>
<hr />
<div>= Tools that support converting non-relational data to RDF =<br />
<br />
=== [http://dblab.cs.toronto.edu/project/xcurator/ xCurator ] ===<br />
<br />
The xCurator project offers an end-to-end framework to transform a semi-structured (XML) source into high-quality Linked Data</div>Rboycehttps://www.w3.org/wiki/index.php?title=HCLSIG/LODD&diff=55771HCLSIG/LODD2011-12-02T20:34:34Z<p>Rboyce: /* Project Pages */</p>
<hr />
<div>__NOTOC__<br />
== Linking Open Drug Data (LODD) ==<br />
<br />
=== Project Description ===<br />
<br />
There is much interesting information about drugs that is available on the Web. The sources of data range from impacts of the drugs on gene expression, through to the results of clinical trials. This project focuses on linking the various sources of drug data together to answer interesting scientific and business questions. More on project [[/Roadmap| Deliverables and Roadmap]].<br />
<br />
The figure below shows part of the data sets that have been published and interlinked by the project so far, within the Linked Data cloud. These data sets are represented in dark gray, while light gray represents other Linked Data from the life sciences, and white indicates interlinked datasets. Collectively, the data sets consist of over 8 million RDF triples, which are interlinked by more than 370,000 RDF links (as of August 2009). More details are available on the [[/Data| datasets]] page.<br />
<br />
<br><br />
http://www4.wiwiss.fu-berlin.de/lodd/lodd-datasets_2009-08-06.png<br />
<br />
A highlight of this project is using state-of-the-art semantic link discovery techniques for interlinking the published datasets. More on the interlinking methodology can be found on the [[/Interlinking| Interlinking]] page.<br />
<br />
One of the main goals of this project is investigating use cases that demonstrate how researchers in the life sciences, as well as physicians and patients, can take advantage of the connected data sets. Read more about some of the [[../AlternativeMedicineUseCase/| use cases]].<br />
<br />
<br><br />
=== Meetings ===<br />
<br />
==== Telcon ====<br />
* [[/Meetings/2011-07-11_Conference_Call|Next Meeting Wed July 11, 2011]]<br />
* [[/Meetings|Past Meetings]]<br />
* [[/Actions| Action Items]]<br />
<br />
<br><br />
<br />
=== Project Pages ===<br />
<br />
* [[/Roadmap| Deliverables and Roadmap]]<br />
* [[/Data|Data Sets]]<br />
* [[/Interlinking|Interlinking]]<br />
* [[../AlternativeMedicineUseCase/|Alternative Medicine Use Case]]<br />
* [[/Business|Business Case]]<br />
* [[/Mapping Experimental Data]]<br />
* [[/Literature|Literature]]<br />
* [[/NonRelationalDataToRDFTools|Tools to convert NON-RELATIONAL data to RDF]]<br />
* [[/RelationalDataToRDFTools|Tools to convert RELATIONAL data to RDF]]<br />
<br />
<br><br />
<br />
=== Participants ===<br />
* Anja Jentzsch (Freie Universität Berlin)<br />
* Eric Prud'hommeaux (W3C)<br />
* Susie Stephens (Johnson & Johnson Pharmaceutical Research & Development)<br />
* Bosse Anderssen (AZ)<br />
* M. Scott Marshall (University of Amsterdam)<br />
* Chris Bizer (Freie Universität Berlin)<br />
* Glen Newton (National Research Council Canada)<br />
* Michel Dumontier (Carleton University)<br />
* TN Bhat (NIST)<br />
* Oktie Hassanzadeh (University of Toronto)<br />
* Matthias Samwald (DERI)<br />
* Jun Zhao (University of Oxford)<br />
* Egon Willighagen (Uppsala University)<br />
* Janos Hajagos (Stony Brook University School of Medicine)<br />
* Claus Stie Kallesøe (Lundbeck)<br />
<br />
<br><br />
<br />
If you have any questions please contact [mailto:Susie.Stephens@gmail.com Susie Stephens]<br />
<br />
Categories: [[Category:Hclsig]]</div>Rboycehttps://www.w3.org/wiki/index.php?title=HCLSIG/LODD/Data&diff=54039HCLSIG/LODD/Data2011-09-25T06:42:35Z<p>Rboyce: /* LODD-related datasets that the LODD group already made available as Linked Data */</p>
<hr />
<div>__NOTOC__<br />
=== LODD-related datasets that the LODD group already made available as Linked Data ===<br />
<br />
{| border="1" cellpadding="2" cellspacing="0"<br />
| '''Name'''<br />
| '''Topic'''<br />
| '''Short Description'''<br />
| '''Size and coverage'''<br />
| '''Status / Activity'''<br />
| '''Example Instances'''<br />
| '''SPARQL Endpoint'''<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/drugbank/ DrugBank]<br />
| Drugs<br />
| [http://www.drugbank.ca/ Drugbank.ca] provides drug (i.e., chemical, pharmacological and pharmaceutical) data with comprehensive drug target (i.e., sequence, structure, and pathway) information ({{doi|10.1093/nar/gkj067}})<br />
| 766,920 triples; 4,800 drugs, 2,500 protein sequences<br />
| updated regularly<br />
| Varenicline [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugs/DB01273 via Marbles], [http://demo.openlinksw.com/ode/?uri=http%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fdrugbank%2Fresource%2Fdrugs%2FDB01273 via OpenLink Data Explorer] <br />
| http://www4.wiwiss.fu-berlin.de/drugbank/sparql <br />
|-<br />
| [http://linkedct.org/ LinkedCT]<br />
| Clinical Trials<br />
| Linked data source of trials from [http://clinicaltrials.gov ClinicalTrials.gov]<br />
| ~25 million triples, 106,000 trials (as of April 2011)<br />
| Updated automatically at all times, refer to [http://linkedct.org/faq/ FAQ] for more details.<br />
| [http://data.linkedct.org/resource/condition/breast-cancer/ Breast Cancer] (Condition), a [http://data.linkedct.org/resource/trial/nct00999557/ NCT00999557] (Trial), [http://data.linkedct.org/resource/city/toronto/ Toronto] (City). <br />
| http://data.linkedct.org/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/dailymed/ DailyMed]<br />
| Drugs<br />
| [http://dailymed.nlm.nih.gov/dailymed/about.cfm dailymed.nlm.nih.gov] provides information about approved prescription drugs, includes FDA approved labels (package inserts)<br />
| 164,276 triples; 4,039 drugs<br />
| updated regularly<br />
| "Sterile Water (Irrigant)" [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/dailymed/resource/drugs/492 via Marbles], [http://demo.openlinksw.com/ode/?uri=http%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fdailymed%2Fresource%2Fdrugs%2F492 via OpenLink Data Explorer]<br />
| http://www4.wiwiss.fu-berlin.de/dailymed/sparql<br />
|-<br />
| [http://dbpedia.org/About DBpedia]<br />
| Drugs/ Diseases/ Proteins<br />
| RDF data about 2.49 million things that has been extracted from Wikipedia<br />
| 218 million RDF triples; 2,300 drugs, 2,200 proteins<br />
| updated every 3 months <br />
| [http://dbpedia.org/resource/Aspirin Aspirin], [http://dbpedia.org/resource/HIV HIV]<br />
| http://dbpedia.org/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/diseasome/ Diseasome]<br />
| Diseases / Genes<br />
| [http://www.nd.edu/~alb/Publication06/145-HumanDisease_PNAS-14My07-Proc/Suppl/ Diseasome] describes characteristics of disorders and disease genes linked by known disorder–gene associations<br />
| 91,182 triples; 2,600 genes<br />
| updated 2006<br />
| Alzheimer's [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/diseasome/resource/diseases/74 via Marbles], [http://demo.openlinksw.com/ode/?uri=http%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fdiseasome%2Fresource%2Fdiseases%2F74 via OpenLink Data Explorer]<br />
| http://www4.wiwiss.fu-berlin.de/diseasome/sparql<br />
|-<br />
| [http://code.google.com/p/junsbriefcase/wiki/TGDdataset RDF-TCM]<br />
| Genes / Diseases / Medicine / Ingredients<br />
| Traditional Chinese medicine, gene and disease association dataset and a linkset mapping TCM gene symbols to Entrez Gene IDs created by Neurocommons <br />
| 117,643 <br />
| updated August 2009 (stable)<br />
| [http://purl.org/net/tcm/tcm.lifescience.ntu.edu.tw/id/medicine/Ginkgo_biloba Ginkgo biloba] <br />
| http://hcls.deri.org/sparql; graph name: http://hcls.deri.org/resource/graph/tcm<br />
|-<br />
| [http://www.nlm.nih.gov/research/umls/rxnorm/ RxNorm]<br />
| Drugs<br />
| A linked version of the NLM's RxNorm database that connects prescription drugs, ingredients, and NDCs through the RXCUI, a concept unique identifier. RxNorm is a product developed by NIH’s National Library of Medicine. It currently interlinks 12 different drug vocabularies around a unique concept identifier. Due to licensing, only six of the drug vocabularies are made available as part of the LODD cloud: Medical Subject Headings, Metathesaurus FDA National Drug Code Directory, Metathesaurus FDA Structured Product Labels, National Drug File, RxNorm Vocabulary, and Veterans Health Administration National Drug File.<br />
Links are provided connecting RxNorm to DrugBank and to the UMLS.<br />
|over 7.7 million triples; 165,806 RXCUI (Concept Unique Identifiers) Unique drugs and ingredients; 332,754 RXAUI (Atomic Unique Identifiers) sourced terms<br />
| Based on 3/2010 RxNorm Release; Last updated 5/2010<br />
| [http://link.informatics.stonybrook.edu/rxnorm/RXAUI/2994963 Singulair from the Metathesaurus FDA Structured Product Labels]<br />
| http://link.informatics.stonybrook.edu/sparql/<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/sider/ SIDER]<br />
| Diseases / Side Effects<br />
| [http://sideeffects.embl.de/ SIDER] contains information on marketed drugs and their adverse effects ({{doi|10.1038/msb.2009.98}})<br />
| 192,515 triples; 63,000 adverse effect reports, 1,737 genes<br />
| updated 2009<br />
| Confusion [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/sider/resource/side_effects/C0009676 via Marbles]<br />
| http://www4.wiwiss.fu-berlin.de/sider/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/stitch/ STITCH]<br />
| Chemicals / Proteins<br />
| [http://stitch.embl.de/ STITCH] contains information on chemicals, proteins, and their interactions ({{doi|10.1093/nar/gkm795}})<br />
| 7,500,000 chemicals; 500,000 proteins; 370 organisms <br />
| updated July 2009<br />
| Lactose [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/stitch/resource/chemicals/CID000000294 via Marbles]<br />
| http://www4.wiwiss.fu-berlin.de/stitch/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/medicare/ Medicare]<br />
| Medicare Formulary<br />
| xxx<br />
| xxx<br />
| xxx<br />
| xxx<br />
| http://www4.wiwiss.fu-berlin.de/medicare/sparql<br />
|-<br />
| [http://www.ebi.ac.uk/chembl/ ChEMBL]<br />
| Chemical / Assays (Proteins, Organisms) / Papers<br />
| [http://www.ebi.ac.uk/chembl/ ChEMBL] contains information on trial drugs, including their activity against targets such as (but not limited to) proteins. All data is backed by and linked to the literature. Includes links to Bio2RDF for ChEBI and Uniprot. License: CC-BY-SA.<br />
| ~24M triples<br />
| Updated 2010-01<br />
| An [http://rdf.farmbio.uu.se/chembl/snorql/?describe=http://rdf.farmbio.uu.se/chembl/activity/a2642163 IC50 activity].<br />
| http://rdf.farmbio.uu.se/chembl/sparql<br />
|-<br />
| [http://www.who.int/gho/en/index.html WHO Global Health Observatory]<br />
| Infectious Diseases /Demography / Socioeconomic Conditions / Environmental Factors<br />
| Data and statistics for infectious diseases at country, regional, and global levels<br />
| 354300<br />
| Updated 2010-09<br />
| xxx<br />
| http://aksw.org/Projects/Stats2RDF<br />
|-<br />
| [http://nlp.dbmi.pitt.edu/nlprepository.html University of Pittsburgh NLP Repository] <br />
| Drugs / Procedures / Diagnoses<br />
| A semantic index of concepts present in 800 full-text clinical notes from the University of Pittsburgh NLP Repository<br />
| 116,855<br />
| Proof of concept -- Updated 02/25/2011<br />
| [http://tarski.duhs.org:8080/openrdf-workbench/repositories/u-pitt-blulab/explore?resource=bl%3Areport_404 Concepts from a sample radiology report]<br />
| [http://tarski.duhs.org:8080/openrdf-workbench/repositories/u-pitt-blulab/query http://tarski.duhs.org:8080/openrdf-workbench/repositories/u-pitt-blulab/query]<br />
|-<br />
| [http://dbmi-icode-01.dbmi.pitt.edu/dikb-evidence/front-page.html The Drug Interaction Knowledge Base] <br />
| Drugs / Metabolic Inhibition Drug-drug Interactions (DDIs) / Claims and Evidence for drug mechanisms and DDIs<br />
| A D2R server of more than 60 drugs currently in the DIKB<br />
| 41,480<br />
| Updated 09/25/2011<br />
| [http://dbmi-icode-01.dbmi.pitt.edu:2020/page/Drugs/paroxetine paroxetine], [http://dbmi-icode-01.dbmi.pitt.edu:2020/page/Drugs/atorvastatin atorvastatin]<br />
| http://dbmi-icode-01.dbmi.pitt.edu:2020/<br />
|}<br />
<br />
[[File:2010-12-04_lodd_cloud.png|600px]]<br />
<br />
A graph of some of the LODD datasets (dark grey), related biomedical datasets (light grey), related general-purpose datasets (white) and their interconnections. Line weights correspond to the number of links. The direction of an arrow indicates the dataset that contains the links, e.g., an arrow from A to B means that dataset A contains RDF triples that use identifiers from B. Bidirectional arrows usually indicate that the links are mirrored in both datasets.<br />
More on the interlinking methodology and statistics can be found on the [[../Interlinking|Interlinking]] page.<br />
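The link-direction convention described above can be sketched in a few lines of Python. The namespace prefixes are taken from URIs that appear on this page; the triples themselves are illustrative placeholders rather than data from the published dumps, and the function names are hypothetical. A link is counted from dataset A to dataset B whenever a triple stored in A has an object whose URI falls in B's namespace.

```python
from collections import Counter

# Namespace prefixes as they appear in URIs on this page.
NAMESPACES = {
    "drugbank": "http://www4.wiwiss.fu-berlin.de/drugbank/",
    "dailymed": "http://www4.wiwiss.fu-berlin.de/dailymed/",
    "dbpedia": "http://dbpedia.org/resource/",
}


def dataset_of(uri):
    """Return the name of the dataset whose namespace contains `uri`, or None."""
    for name, prefix in NAMESPACES.items():
        if uri.startswith(prefix):
            return name
    return None


def count_outgoing_links(datasets):
    """Count links (source dataset, target dataset) over all stored triples."""
    links = Counter()
    for source, triples in datasets.items():
        for _subject, _predicate, obj in triples:
            target = dataset_of(obj)
            if target is not None and target != source:
                links[(source, target)] += 1
    return links


# Illustrative triples only (not taken from the real datasets).
SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
DATASETS = {
    "drugbank": [
        (NAMESPACES["drugbank"] + "resource/drugs/DB01273", SAME_AS,
         NAMESPACES["dbpedia"] + "Varenicline"),
    ],
    "dailymed": [
        (NAMESPACES["dailymed"] + "resource/drugs/492", SAME_AS,
         NAMESPACES["drugbank"] + "resource/drugs/DB00000"),
    ],
}

for (src, dst), n in sorted(count_outgoing_links(DATASETS).items()):
    print(f"{src} -> {dst}: {n}")
```

In a diagram like the one above, each resulting count would set the weight of the arrow from the source to the target dataset.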
<br />
The LODD datasets have been crawled by the SWSE Semantic Web search engine and can be accessed via a faceted browsing interface at [http://visinav.deri.org/hcls/] ([http://visinav.deri.org/hcls/list?keyword=varenicline Example query: Varenicline]).<br />
<br />
Most of the LODD datasets have also been integrated into the SPARQL endpoint of the HCLS Knowledge Base, see [http://esw.w3.org/topic/HCLSIG_BioRDF_Subgroup/DERI_HCLS_KB the wiki page of the HCLS KB] for further information.<br />
<br />
=== Bio2RDF Data Sets ===<br />
<br />
The [http://bio2rdf.org/ Bio2RDF project] has published 40 biology-, gene- and medical-related datasets (altogether 2.3 billion triples). <br />
The datasets are available via SPARQL endpoints and as Linked Data. It is recommended that you use the [http://sourceforge.net/project/platformdownload.php?group_id=142631 Bio2RDF Java Servlet], and optionally [http://quebec.bio2rdf.org/download/virtuoso/indexed/ download the databases] for efficient personal use. Running your own instance of the [http://virtuoso.openlinksw.com/wiki/main/Main/VirtEC2AMIBio2rdfInstall OpenLink Virtuoso AMI for EC2] is also an option (and for basic URI resolution doesn't require the Java Servlet, although if you want advanced queries you should still download it and configure it to query your EC2 sparql endpoint).<br />
<br />
* [http://www.freebase.com/view/user/bio2rdf/public/sparql Bio2RDF sparql endpoint list] [http://rdf.freebase.com/rdf/user/bio2rdf/public/sparql Sparql endpoint list in RDF]<br />
* [http://linkeddata.openlinksw.com:8891/pubmed:10500064 Identification of an autoimmune enteropathy-related 75-kilodalton antigen], via an [[OpenLinkSoftware|OpenLink]] hosted edition of [[Bio2Rdf]] <br />
* [http://linkeddata.openlinksw.com:8891/pubmed:9636670 Structure of the gene encoding the human cyclin-dependent kinase inhibitor p18 and mutational analysis in breast cancer], via an [[OpenLinkSoftware|OpenLink]] hosted edition of [[Bio2Rdf]] <br />
* [http://beckr.org/marbles?uri=http%3A%2F%2Fbio2rdf.org%2Fpubmed%3A9626117 PubMed article] viewed using the Marbles Linked Data browser.<br />
* [http://beckr.org/marbles?lang=en&uri=http%3A%2F%2Fbio2rdf.org%2Ffoaf%3AClemens%2C_T_L PubMed author] viewed using the Marbles Linked Data browser.<br />
* [http://beckr.org/marbles?uri=http%3A%2F%2Fbio2rdf.org%2Fomim%3A161555 OMIM Killer Cell Lectin-Like Receptor] viewed using the Marbles Linked Data browser.<br />
* [http://iws.seu.edu.cn/services/falcons/objectsearch/queryresult.jsp?query=%22KILLER+CELL%22 Falcons Search for KILLER CELL]. The Bio2RDF data has been crawled by the Falcons Semantic Web Search engine. This is an example of how the data is accessed by humans using the search engine. Falcons also offers an API that can be used by applications to access the data.<br />
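The example links in the list above suggest a simple URI pattern, `http://bio2rdf.org/<namespace>:<identifier>`, which the Marbles browser receives as a percent-encoded `uri` parameter. The Python sketch below reproduces that pattern. The function names are hypothetical, the pattern is inferred only from the links on this page, and the services themselves may no longer be reachable.

```python
from urllib.parse import quote


def bio2rdf_uri(namespace, identifier):
    """Build a Bio2RDF-style resource URI, e.g. for the omim or pubmed namespace."""
    return f"http://bio2rdf.org/{namespace}:{identifier}"


def marbles_url(resource_uri):
    """Wrap a resource URI in a Marbles browser link, percent-encoding every
    reserved character so the URI survives as a single query parameter."""
    return "http://beckr.org/marbles?uri=" + quote(resource_uri, safe="")


uri = bio2rdf_uri("omim", "161555")
print(uri)
print(marbles_url(uri))
```

Passing `safe=""` matters here: by default `quote` leaves `/` unescaped, which would not match the encoded form used in the links above.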
<br />
=== Chem2bio2RDF ===<br />
<br />
* Information about the [http://chem2bio2rdf.org/ chem2bio2rdf] data sets<br />
<br />
=== Data Sets for the LODD Task ===<br />
<br />
To complement the drug-related Web of Data built by the LODD effort, the following data sets could/should also be published as Linked Data.<br />
<br />
The LODD effort is currently gathering more information about relevant datasets. See also [[/DataSetEvaluation|Evaluation of LODD Data Sets]] for current evaluation results.<br />
<br />
* [http://library.dialog.com/bluesheets/html/bl0107.html Adis R&D Insight]<br />
* [http://www.ebi.ac.uk/chebi/ chEBI]<br />
* [http://xpdb.nist.gov/pdb/chemblast.html ChemBlast]<br />
* [http://www.chemspider.com/ ChemSpider]<br />
* [http://ClinicalTrials.gov ClinicalTrials.gov]<br />
* [http://www.citeline.com/trialtrove.html Citeline TrialTrove]<br />
* [http://dailymed.nlm.nih.gov/dailymed/about.cfm DailyMed]<br />
* [http://dbpedia.org/About DBpedia]<br />
* [http://www.nd.edu/~alb/Publication06/145-HumanDisease_PNAS-14My07-Proc/Suppl/ Diseasome]<br />
* [http://www.drugbank.ca/ Drug Bank]<br />
* [http://www.virtualref.com/abs/72.htm DrugDB]<br />
* [http://www.ncbi.nlm.nih.gov/pubmed/17921997 Drugome]<br />
* [http://lsdis.cs.uga.edu/projects/asdoc/ Drug Ontology]<br />
* [http://scientific.thomsonreuters.com/products/iddb/ Investigational Drug Database] - Proprietary<br />
* [http://www.ovid.com/site/catalog/DataBase/1244.jsp?top=2&mid=3&bottom=7&subsection=10 IMS]<br />
* [http://www.genome.jp/kegg/drug/ KEGG Drug]<br />
* [http://Lillytrials.com LillyTrials]<br />
* [http://www.fda.gov/medwatch/ MedWatch]<br />
* [http://www.fda.gov/cder/ndc/ National Drug Code]<br />
* [http://www.ncbi.nlm.nih.gov/omim/ OMIM]<br />
* [http://www.fda.gov/cder/ob/ Orange Book]<br />
* [http://www.pharmaprojects.com/ Pharmaprojects] - Proprietary<br />
* [http://pubchem.ncbi.nlm.nih.gov/ PubChem]<br />
* [http://www.nlm.nih.gov/research/umls/rxnorm/ RxNorm]<br />
* [http://www.ncbi.nlm.nih.gov/sites/entrez?Db=pubmed&Cmd=ShowDetailView&TermToSearch=15360858 VA NDF-RT]<br />
* Other data sources could include blogs, discussion boards, wikis, etc.<br />
* and.... <br />
** [http://www.who.int/globalatlas/ World Health Organization's Global Health Atlas]<br />
** [http://www.epispider.org/ EpiSPIDER]<br />
** [http://www.accessdata.fda.gov/Scripts/cder/DrugsatFDA/ Drugs@FDA - FDA Approved Drug Products]<br />
** [http://www.drugdigest.org/wps/portal/ddigest DrugDigest]<br />
** [http://humancyc.org/ HumanCyc: Encyclopedia of ''Homo sapiens'' Genes and Metabolism]<br />
** [http://www.alzforum.org/ Alzheimer Research Forum]<br />
** [http://wwwcf.nlm.nih.gov/umlslicense/rxtermApp/rxTerm.cfm RxTerms]<br />
** [http://hudine.neu.edu/ HuDiNe]<br />
** [http://wiki.medpedia.com/Cymbalta Medpedia]<br />
** [http://tcm.lifescience.ntu.edu.tw/ TCMGeneDIT] and [[Media:HCLSIG$$LODD$$Data$TCMGeneDIT_RDF_Dataset_r1.zip|RDF dump]]<br />
** [http://www.tuftsctsi.org/~/media/Files/CTSI/Library%20Files/FCC%20for%20CER%20Rpt%20to%20Pres%20and%20Congress_063009.ashx List of other possible data sources from page 66 onwards]<br />
<br />
=== Alternative Herbal Medicine use case ===<br />
* [[Data/TCMGeneDIT|TCMGeneDIT dataset]]<br />
<br />
=== Identifier-Based Linkage Points ===<br />
<br />
* InChIs<br />
* [[PubChem]] Compound ID (CID)<br />
* [[PubChem]] NSC<br />
* Chemical Abstracts Service (CAS) Registry Number<br />
* New Drug Application (NDA)<br />
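One way identifiers like these could serve as linkage points is sketched below in Python: records from different sources are indexed by (identifier type, value), and any two records that share a key become a candidate link. The record URIs, field names and function name are made up for illustration, and the identifier values are only there to make the example run.

```python
from collections import defaultdict
from itertools import combinations

# Made-up records; URIs are placeholders on example.org.
RECORDS = [
    {"uri": "http://example.org/a/drug1",
     "ids": {"cas": "50-78-2", "pubchem_cid": "2244"}},
    {"uri": "http://example.org/b/compound9",
     "ids": {"pubchem_cid": "2244"}},
    {"uri": "http://example.org/c/entry7",
     "ids": {"cas": "58-08-2"}},
]


def link_candidates(records):
    """Return sorted pairs of record URIs that share at least one identifier."""
    index = defaultdict(list)
    for record in records:
        for id_type, value in record["ids"].items():
            # Light normalisation so trivial formatting differences do not block a match.
            index[(id_type, value.strip().upper())].append(record["uri"])
    pairs = set()
    for uris in index.values():
        for a, b in combinations(sorted(set(uris)), 2):
            pairs.add((a, b))
    return sorted(pairs)


for a, b in link_candidates(RECORDS):
    print(a, "<->", b)
```

Exact matching on a shared identifier is the simplest case; the interlinking methodology used by the project may differ from this.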
<br />
=== Data Set Attributes ===<br />
<br />
* Licensing<br />
* Data Format<br />
* Identifiers</div>Rboycehttps://www.w3.org/wiki/index.php?title=HCLSIG/CDS/Datasets_and_ontologies&diff=54038HCLSIG/CDS/Datasets and ontologies2011-09-24T14:28:38Z<p>Rboyce: /* Drug interaction knowledge base */</p>
<hr />
<div>= Datasets and ontologies relevant for the CDS task force =<br />
<br />
== Datasets ==<br />
<br />
==== Drug datasets ====<br />
<br />
===== [http://www.drugbank.ca Drugbank] =====<br />
<br />
Drugbank.ca provides drug (i.e., chemical, pharmacological and pharmaceutical) data with comprehensive drug target (i.e., sequence, structure, and pathway) information. It also includes information on drug-drug and drug-food interactions. <br />
<br />
===== [http://www.nlm.nih.gov/research/umls/rxnorm/ RxNorm] =====<br />
<br />
A linked version of the NLM's RxNorm database that connects prescription drugs, ingredients, and NDCs through the RXCUI, a concept unique identifier. RxNorm is a product developed by NIH’s National Library of Medicine. It currently interlinks 12 different drug vocabularies around a unique concept identifier. Due to licensing, only six of the drug vocabularies are made available as part of the LODD cloud: Medical Subject Headings, Metathesaurus FDA National Drug Code Directory, Metathesaurus FDA Structured Product Labels, National Drug File, RxNorm Vocabulary, and Veterans Health Administration National Drug File.<br />
<br />
===== [http://evs.nci.nih.gov/ftp1/NDF-RT/ National Drug File Reference Terminology (NDF-RT)] =====<br />
<br />
NDF-RT is the terminology used by FDA and the FedMed collaboration to code these essential pharmacologic properties of medications:<br />
* Mechanism of Action<br />
* Physiologic Effect<br />
* Structural Class<br />
<br />
===== Drug interaction knowledge base =====<br />
<br />
Known and predicted metabolic inhibition drug-drug interactions with links to and summaries of evidence. HTML rendering: http://dbmi-icode-01.dbmi.pitt.edu/dikb-evidence/front-page.html D2R and SPARQL endpoint: http://dbmi-icode-01.dbmi.pitt.edu:2020/.<br />
<br />
==== Datasets containing associations between genetic variation, associated phenotypes and genetic tests ====<br />
<br />
===== [http://www.pharmgkb.org/ Pharmacogenomics Knowledgebase / PharmGKB] ===== <br />
A large database of curated knowledge and raw data about associations between genes, genetic variants, drug response and disease. <br />
<br />
===== [http://www.gwascentral.org/ GWAS Central] (formerly called HGVbaseG2P)===== <br />
A database of genome-wide association studies that also provides summaries of study results. <br />
<br />
===== [http://snpedia.org/ SNPedia] ===== <br />
A wiki-based platform containing information on phenotypes associated with SNP variants, population prevalence of genetic variants and SNP microarrays. <br />
<br />
===== [http://evidence.personalgenomes.org/about GET-Evidence (evidence.personalgenomes.org)] =====<br />
A large database of automatically annotated and then manually curated information about the impact of genetic variations. Example: http://evidence.personalgenomes.org/MYL2-A13T<br />
<br />
===== [http://www.ncbi.nlm.nih.gov/omim Online Mendelian Inheritance in Man (OMIM)] ===== <br />
Information about diseases with Mendelian inheritance, including references to the implicated genes. <br />
<br />
===== [http://www.ncbi.nlm.nih.gov/gap dbGaP] ===== <br />
Results of studies that have investigated the interaction of genotype and phenotype.<br />
<br />
===== [http://hugenavigator.net/HuGENavigator/home.do HuGE Navigator] ===== <br />
Information on population prevalence of genetic variants, gene-disease associations, gene-gene and gene-environment interactions, and evaluation of genetic tests. <br />
<br />
===== [http://geneticassociationdb.nih.gov/ Genetic Association Database (GAD)] ===== <br />
Diseases associated with genetic variants.<br />
<br />
===== [http://genotator.hms.harvard.edu/geno/ Genotator]===== <br />
Aggregated gene-disease relationship data containing an integrated view over other datasets.<br />
<br />
===== [http://www.ncbi.nlm.nih.gov/sites/GeneTests/?db=GeneTests NCBI GeneTests] ===== <br />
<br />
===== [http://ghr.nlm.nih.gov/ Genetics Home Reference] ===== <br />
<br />
<br />
<br />
==== Genome databases with general data about genetic variation and human genomes ====<br />
<br />
===== [http://www.ncbi.nlm.nih.gov/projects/SNP/ dbSNP] ===== <br />
<br />
===== [http://www.lrg-sequence.org/page.php?page=about Locus Reference Genomic / LRG] ===== <br />
An internationally recognized reference database, providing stable genomic DNA sequences and identifiers for regions of the human genome.<br />
<br />
===== [http://www.ncbi.nlm.nih.gov/dbvar/ dbVar] ===== <br />
Large-scale genetic structural variation data (e.g., insertions, deletions).<br />
<br />
===== [http://hapmap.ncbi.nlm.nih.gov/ HapMap] ===== <br />
<br />
<br />
==== Collections of personal genetic data ====<br />
<br />
===== [http://www.1000genomes.org/data 1000 genomes project] =====<br />
Genome sequences of over 1000 volunteers<br />
<br />
===== [http://www.geenivaramu.ee/index.php Database of the Estonian Genome Center], University of Tartu =====<br />
A collection of genetic data associated with health and lifestyle data of over 50,000 persons.<br />
<br />
===== [http://www.personalgenomes.org/ Personal Genome Project] ===== <br />
Whole-genome data donated by volunteers.<br />
<br />
===== Vanderbilt Biobank =====<br />
See http://www.nature.com/clpt/journal/v84/n3/full/clpt200889a.html<br />
<br />
<br />
== Relevant ontologies and taxonomies ==<br />
<br />
===== Suggested Ontology for Pharmacogenomics (SO-PHARM) ===== <br />
A complex ontology covering the representation of genetic variation and pharmacogenomics.<br />
<br />
===== Pharmacogenomics Ontology (PO) ===== <br />
Represents PharmGKB data; ontology for measures and outcomes.<br />
<br />
===== Pharmacogenomics Relationship Ontology (PHARE) ===== <br />
Proposes concepts and roles to represent relationships of pharmacogenomics interest. Used for representing findings extracted from texts.<br />
<br />
===== [http://www.sequenceontology.org Sequence Ontology (SO)] ===== <br />
Contains terms often used for the annotation of sequences and features, including detailed description of different types of sequence variations.<br />
<br />
===== Disease Ontology ===== <br />
An ontology of human diseases.<br />
<br />
===== [http://www.human-phenotype-ontology.org/index.php/hpo_home.html Human Phenotype Ontology (HPO)] =====<br />
<br />
===== Mammalian Phenotype Ontology =====<br />
<br />
===== Phenotypic Quality Ontology (PATO) ===== <br />
An ontology of types of phenotypic properties.<br />
<br />
===== [http://loinc.org/ Logical Observation Identifiers Names and Codes (LOINC)] =====<br />
An established coding system for clinical lab results. Contains many identifiers for results of genetic tests.<br />
<br />
== Formats and schemas ==<br />
<br />
===== OMG SNP ===== <br />
A simple XML schema for the representation of SNPs [http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=43182&commid=54960]. Maintained by the Object Management Group (OMG).<br />
<br />
===== Genomic Sequence Variation Markup Language (GSVML), ISO 25720:2009 =====<br />
An XML schema geared towa [http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=43182&commid=54960]. Maintained by the International Organization for Standardization (ISO).<br />
<br />
===== HL7 Clinical Document Architecture (CDA) Genetic Testing Report (GTR) =====</div>Rboycehttps://www.w3.org/wiki/index.php?title=HCLSIG/CDS/Datasets_and_ontologies&diff=54008HCLSIG/CDS/Datasets and ontologies2011-09-21T22:08:44Z<p>Rboyce: /* Drug interaction knowledge base */</p>
<hr />
<div>= Datasets and ontologies relevant for the CDS task force =<br />
<br />
== Datasets ==<br />
<br />
==== Relevant Linked Open Drug Data datasets ====<br />
<br />
===== [http://www.drugbank.ca Drugbank] =====<br />
<br />
Drugbank.ca provides drug (i.e., chemical, pharmacological and pharmaceutical) data with comprehensive drug target (i.e., sequence, structure, and pathway) information. It also includes information on drug-drug and drug-food interactions. <br />
<br />
===== [http://www.nlm.nih.gov/research/umls/rxnorm/ RxNorm] =====<br />
<br />
A linked version of the NLM's RxNorm database that connects prescription drugs, ingredients, and National Drug Codes (NDC) through the RXCUI, a concept unique identifier. RxNorm is a product developed by NIH’s National Library of Medicine. It currently interlinks 12 different drug vocabularies around a unique concept identifier. Due to licensing restrictions, only six of the drug vocabularies are made available as part of the LODD cloud: Medical Subject Headings, Metathesaurus FDA National Drug Code Directory, Metathesaurus FDA Structured Product Labels, National Drug File, RxNorm Vocabulary, and Veterans Health Administration National Drug File. <br />
<br />
===== Drug interaction knowledge base =====<br />
<br />
Drug interactions. HTML rendering: http://dbmi-icode-01.dbmi.pitt.edu/dikb-evidence/front-page.html A SPARQL endpoint will be released soon.<br />
<br />
==== Datasets containing associations between genetic variation, associated phenotypes and genetic tests ====<br />
<br />
===== [http://www.pharmgkb.org/ Pharmacogenomics Knowledgebase / PharmGKB] ===== <br />
A large database of curated knowledge and raw data about associations between genes, genetic variants, drug response and disease. <br />
<br />
===== [http://www.gwascentral.org/ GWAS Central] (formerly called HGVbaseG2P)===== <br />
A database of genome-wide association studies that also provides summaries of study results. <br />
<br />
===== [http://snpedia.org/ SNPedia] ===== <br />
A wiki-based platform containing information on phenotypes associated with SNP variants, population prevalence of genetic variants and SNP microarrays. <br />
<br />
===== [http://evidence.personalgenomes.org/about GET-Evidence (evidence.personalgenomes.org)] =====<br />
A large database of automatically annotated and then manually curated information about the impact of genetic variations. Example: http://evidence.personalgenomes.org/MYL2-A13T<br />
<br />
===== [http://www.ncbi.nlm.nih.gov/omim Online Mendelian Inheritance in Man (OMIM)] ===== <br />
Information about diseases with Mendelian inheritance, including references to the implicated genes. <br />
<br />
===== [http://www.ncbi.nlm.nih.gov/gap dbGaP] ===== <br />
Results of studies that have investigated the interaction of genotype and phenotype.<br />
<br />
===== [http://hugenavigator.net/HuGENavigator/home.do HuGE Navigator] ===== <br />
Information on population prevalence of genetic variants, gene-disease associations, gene-gene and gene-environment interactions, and evaluation of genetic tests. <br />
<br />
===== [http://geneticassociationdb.nih.gov/ Genetic Association Database (GAD)] ===== <br />
Diseases associated with genetic variants.<br />
<br />
===== [http://genotator.hms.harvard.edu/geno/ Genotator]===== <br />
Aggregated gene-disease relationship data containing an integrated view over other datasets.<br />
<br />
===== [http://www.ncbi.nlm.nih.gov/sites/GeneTests/?db=GeneTests NCBI GeneTests] ===== <br />
<br />
===== [http://ghr.nlm.nih.gov/ Genetics Home Reference] ===== <br />
<br />
<br />
<br />
==== Genome databases with general data about genetic variation and human genomes ====<br />
<br />
===== [http://www.ncbi.nlm.nih.gov/projects/SNP/ dbSNP] ===== <br />
<br />
===== [http://www.lrg-sequence.org/page.php?page=about Locus Reference Genomic / LRG] ===== <br />
An internationally recognized reference database, providing stable genomic DNA sequences and identifiers for regions of the human genome.<br />
<br />
===== [http://www.ncbi.nlm.nih.gov/dbvar/ dbVar] ===== <br />
Large-scale genetic structural variation data (e.g., insertions, deletions).<br />
<br />
===== [http://hapmap.ncbi.nlm.nih.gov/ HapMap] ===== <br />
<br />
<br />
==== Collections of personal genetic data ====<br />
<br />
===== [http://www.1000genomes.org/data 1000 genomes project] =====<br />
Genome sequences of over 1000 volunteers<br />
<br />
===== [http://www.geenivaramu.ee/index.php Database of the Estonian Genome Center], University of Tartu =====<br />
A collection of genetic data associated with health and lifestyle data of over 50,000 persons.<br />
<br />
===== [http://www.personalgenomes.org/ Personal Genome Project] ===== <br />
Whole-genome data donated by volunteers.<br />
<br />
===== Vanderbilt Biobank =====<br />
See http://www.nature.com/clpt/journal/v84/n3/full/clpt200889a.html<br />
<br />
<br />
== Relevant ontologies and taxonomies ==<br />
<br />
===== Suggested Ontology for Pharmacogenomics (SO-PHARM) ===== <br />
A complex ontology covering the representation of genetic variation and pharmacogenomics.<br />
<br />
===== Pharmacogenomics Ontology (PO) ===== <br />
Represents PharmGKB data; ontology for measures and outcomes.<br />
<br />
===== Pharmacogenomics Relationship Ontology (PHARE) ===== <br />
Proposes concepts and roles to represent relationships of pharmacogenomics interest. Used for representing findings extracted from texts.<br />
<br />
===== [http://www.sequenceontology.org Sequence Ontology (SO)] ===== <br />
Contains terms often used for the annotation of sequences and features, including detailed description of different types of sequence variations.<br />
<br />
===== Disease Ontology ===== <br />
An ontology of human diseases.<br />
<br />
===== [http://www.human-phenotype-ontology.org/index.php/hpo_home.html Human Phenotype Ontology (HPO)] =====<br />
<br />
===== Mammalian Phenotype Ontology =====<br />
<br />
===== Phenotypic Quality Ontology (PATO) ===== <br />
An ontology of types of phenotypic properties.<br />
<br />
===== [http://loinc.org/ Logical Observation Identifiers Names and Codes (LOINC)] =====<br />
An established coding system for clinical lab results. Contains many identifiers for results of genetic tests.<br />
<br />
== Formats and schemas ==<br />
<br />
===== OMG SNP ===== <br />
A simple XML schema for the representation of SNPs [http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=43182&commid=54960]. Maintained by the Object Management Group (OMG).<br />
<br />
===== Genomic Sequence Variation Markup Language (GSVML), ISO 25720:2009 =====<br />
An XML schema geared towa [http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=43182&commid=54960]. Maintained by the International Organization for Standardization (ISO).<br />
<br />
===== HL7 Clinical Document Architecture (CDA) Genetic Testing Report (GTR) =====</div>Rboycehttps://www.w3.org/wiki/index.php?title=HCLSIG/LODD/Data&diff=48323HCLSIG/LODD/Data2011-02-25T17:34:48Z<p>Rboyce: /* LODD-related datasets that the LODD group already made available as Linked Data */</p>
<hr />
<div>__NOTOC__<br />
=== LODD-related datasets that the LODD group already made available as Linked Data ===<br />
<br />
{| border="1" cellpadding="2" cellspacing="0"<br />
| '''Name'''<br />
| '''Topic'''<br />
| '''Short Description'''<br />
| '''Size'''<br />
| '''Status/ Activity'''<br />
| '''Example Instances'''<br />
| '''SPARQL Endpoint'''<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/drugbank/ DrugBank]<br />
| Drugs<br />
| [http://www.drugbank.ca/ Drugbank.ca] provides drug (i.e., chemical, pharmacological and pharmaceutical) data with comprehensive drug target (i.e., sequence, structure, and pathway) information ({{doi|10.1093/nar/gkj067}})<br />
| 766,920 triples; 4,800 drugs, 2,500 protein sequences<br />
| updated regularly<br />
| Varenicline [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugs/DB01273 via Marbles], [http://demo.openlinksw.com/ode/?uri=http%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fdrugbank%2Fresource%2Fdrugs%2FDB01273 via OpenLink Data Explorer] <br />
| http://www4.wiwiss.fu-berlin.de/drugbank/sparql <br />
|-<br />
| [http://linkedct.org/ LinkedCT]<br />
| Clinical Trials<br />
| Linked data source of trials from [http://clinicaltrials.gov ClinicalTrials.gov]<br />
| 7 million triples, 62000 trials<br />
| preview release<br />
| [http://data.linkedct.org/resource/intervention/7322 Influenza] (Intervention), A [http://data.linkedct.org/resource/trials/NCT00001872 Trial], [http://data.linkedct.org/resource/condition/52 AIDS] (condition), A [http://data.linkedct.org/resource/reference/12201 reference], A [http://data.linkedct.org/resource/location/162398 location] <br />
| http://data.linkedct.org/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/dailymed/ DailyMed]<br />
| Drugs<br />
| [http://dailymed.nlm.nih.gov/dailymed/about.cfm dailymed.nlm.nih.gov] provides information about approved prescription drugs, includes FDA approved labels (package inserts)<br />
| 164,276 triples; 4,039 drugs<br />
| updated regularly<br />
| "Sterile Water (Irrigant)" [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/dailymed/resource/drugs/492 via Marbles], [http://demo.openlinksw.com/ode/?uri=http%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fdailymed%2Fresource%2Fdrugs%2F492 via OpenLink Data Explorer]<br />
| http://www4.wiwiss.fu-berlin.de/dailymed/sparql<br />
|-<br />
| [http://dbpedia.org/About DBpedia]<br />
| Drugs/ Diseases/ Proteins<br />
| RDF data about 2.49 million things that has been extracted from Wikipedia<br />
| 218 million RDF triples; 2,300 drugs, 2,200 proteins<br />
| updated every 3 months <br />
| [http://dbpedia.org/resource/Aspirin Aspirin], [http://dbpedia.org/resource/HIV HIV]<br />
| http://dbpedia.org/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/diseasome/ Diseasome]<br />
| Diseases / Genes<br />
| [http://www.nd.edu/~alb/Publication06/145-HumanDisease_PNAS-14My07-Proc/Suppl/ Diseasome] describes characteristics of disorders and disease genes linked by known disorder–gene associations<br />
| 91,182 triples; 2,600 genes<br />
| updated 2006<br />
| Alzheimer's [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/diseasome/resource/diseases/74 via Marbles], [http://demo.openlinksw.com/ode/?uri=http%3A%2F%2Fwww4.wiwiss.fu-berlin.de%2Fdiseasome%2Fresource%2Fdiseases%2F74 via OpenLink Data Explorer]<br />
| http://www4.wiwiss.fu-berlin.de/diseasome/sparql<br />
|-<br />
| [http://code.google.com/p/junsbriefcase/wiki/TGDdataset RDF-TCM]<br />
| Genes / Diseases / Medicine / Ingredients<br />
| Traditional Chinese medicine, gene and disease association dataset and a linkset mapping TCM gene symbols to Entrez Gene IDs created by Neurocommons <br />
| 117,643 <br />
| updated August 2009 (stable)<br />
| [http://purl.org/net/tcm/tcm.lifescience.ntu.edu.tw/id/medicine/Ginkgo_biloba Ginkgo biloba] <br />
| http://hcls.deri.org/sparql; graph name: http://hcls.deri.org/resource/graph/tcm<br />
|-<br />
| [http://www.nlm.nih.gov/research/umls/rxnorm/ RxNorm]<br />
| Drugs<br />
| A linked version of the NLM's RxNorm database that connects prescription drugs, ingredients, and National Drug Codes (NDC) through the RXCUI, a concept unique identifier. RxNorm is a product developed by NIH’s National Library of Medicine. It currently interlinks 12 different drug vocabularies around a unique concept identifier. Due to licensing restrictions, only six of the drug vocabularies are made available as part of the LODD cloud: Medical Subject Headings, Metathesaurus FDA National Drug Code Directory, Metathesaurus FDA Structured Product Labels, National Drug File, RxNorm Vocabulary, and Veterans Health Administration National Drug File.<br />
Links are provided connecting RxNorm to DrugBank and to the UMLS.<br />
|over 7.7 million triples; 165,806 RXCUI (Concept Unique Identifiers) Unique drugs and ingredients; 332,754 RXAUI (Atomic Unique Identifiers) sourced terms<br />
| Based on 3/2010 Rxnorm Release; Last updated 5/2010<br />
| [http://link.informatics.stonybrook.edu/rxnorm/RXAUI/2994963 Singulair from the Metathesaurus FDA Structured Product Labels]<br />
| http://link.informatics.stonybrook.edu/sparql/<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/sider/ SIDER]<br />
| Diseases / Side Effects<br />
| [http://sideeffects.embl.de/ SIDER] contains information on marketed drugs and their adverse effects ({{doi|10.1038/msb.2009.98}})<br />
| 192,515 triples; 1,737 genes<br />
| updated 2009<br />
| Confusion [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/sider/resource/side_effects/C0009676 via Marbles]<br />
| http://www4.wiwiss.fu-berlin.de/sider/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/stitch/ STITCH]<br />
| Chemicals / Proteins<br />
| [http://stitch.embl.de/ STITCH] contains information on chemicals, proteins, and their interactions ({{doi|10.1093/nar/gkm795}})<br />
| 7,500,000 chemicals; 500,000 proteins; 370 organisms <br />
| updated July 2009<br />
| Lactose [http://beckr.org/marbles?uri=http://www4.wiwiss.fu-berlin.de/stitch/resource/chemicals/CID000000294 via Marbles]<br />
| http://www4.wiwiss.fu-berlin.de/stitch/sparql<br />
|-<br />
| [http://www4.wiwiss.fu-berlin.de/medicare/ Medicare]<br />
| Medicare Formulary<br />
| xxx<br />
| xxx<br />
| xxx<br />
| xxx<br />
| http://www4.wiwiss.fu-berlin.de/medicare/sparql<br />
|-<br />
| [http://www.ebi.ac.uk/chembl/ ChEMBL]<br />
| Chemical / Assays (Proteins, Organisms) / Papers<br />
| [http://www.ebi.ac.uk/chembl/ ChEMBL] contains information on trial drugs, including their activity against targets such as, but not limited to, proteins. All data is backed up by and linked to the literature. Includes links to Bio2RDF for ChEBI and Uniprot. License: CC-BY-SA.<br />
| ~24M triples<br />
| Updated 2010-01<br />
| An [http://rdf.farmbio.uu.se/chembl/snorql/?describe=http://rdf.farmbio.uu.se/chembl/activity/a2642163 IC50 activity].<br />
| http://rdf.farmbio.uu.se/chembl/sparql<br />
|-<br />
| [http://www.who.int/gho/en/index.html WHO Global Health Observatory]<br />
| Infectious Diseases /Demography / Socioeconomic Conditions / Environmental Factors<br />
| Data and statistics for infectious diseases at country, regional, and global levels<br />
| 354300<br />
| Updated 2010-09<br />
| xxx<br />
| http://aksw.org/Projects/Stats2RDF<br />
|-<br />
| [http://nlp.dbmi.pitt.edu/nlprepository.html University of Pittsburgh NLP Repository] <br />
| Drugs / Procedures / Diagnoses<br />
| A semantic index of concepts present in 800 full-text clinical notes from the University of Pittsburgh NLP Repository<br />
| 116,855<br />
| Proof of concept -- Updated 02/25/2011<br />
| [http://tarski.duhs.org:8080/openrdf-workbench/repositories/u-pitt-blulab/explore?resource=bl%3Areport_404 Concepts from a sample radiology report]<br />
| [http://tarski.duhs.org:8080/openrdf-workbench/repositories/u-pitt-blulab/query http://tarski.duhs.org:8080/openrdf-workbench/repositories/u-pitt-blulab/query]<br />
|}<br />
<br />
http://www4.wiwiss.fu-berlin.de/lodd/lodd-datasets_2009-08-06.png<br />
<br />
This figure shows the incorporation of LinkedCT, [[DailyMed]], [[DrugBank]], Diseasome, RDF-TCM, and SIDER into the Linked Data cloud. These data sets are represented in dark gray, while light gray represents other Linked Data from the life sciences, and white indicates interlinked datasets covering geographic, person-related and conceptual data.<br />
More on the interlinking methodology and statistics can be found on the [[../Interlinking|Interlinking]] page.<br />
<br />
The LODD datasets have been crawled by the SWSE Semantic Web search engine and can be accessed via a faceted browsing interface at [http://visinav.deri.org/hcls/] ([http://visinav.deri.org/hcls/list?keyword=varenicline Example query: Varenicline]).<br />
<br />
Most of the LODD datasets have also been integrated into the SPARQL endpoint of the HCLS Knowledge Base, see [http://esw.w3.org/topic/HCLSIG_BioRDF_Subgroup/DERI_HCLS_KB the wiki page of the HCLS KB] for further information.<br />
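The SPARQL endpoints listed in the table above can be queried over HTTP by passing the query text in a <code>query</code> parameter, as the SPARQL protocol describes. The following is a minimal sketch in Python, using only the standard library, of how such a request URL could be built. The helper name <code>build_sparql_get_url</code> and the example query are illustrative assumptions, and the endpoints listed here may no longer be online.

```python
from urllib.parse import urlencode


def build_sparql_get_url(endpoint: str, query: str) -> str:
    """Return a GET URL that carries the SPARQL query in the 'query' parameter."""
    # Append with '&' if the endpoint URL already has a query string.
    separator = "&" if "?" in endpoint else "?"
    return endpoint + separator + urlencode({"query": query})


# Endpoint and resource URI are taken from the table above;
# the query itself is only an illustration.
DRUGBANK_ENDPOINT = "http://www4.wiwiss.fu-berlin.de/drugbank/sparql"
EXAMPLE_QUERY = (
    "SELECT ?p ?o WHERE { "
    "<http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugs/DB01273> ?p ?o "
    "} LIMIT 10"
)

url = build_sparql_get_url(DRUGBANK_ENDPOINT, EXAMPLE_QUERY)
print(url)
```

The resulting URL could then be fetched with any HTTP client. Which result formats an endpoint returns depends on the server and is not assumed here.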
<br />
=== Bio2RDF Data Sets ===<br />
<br />
The [http://bio2rdf.org/ Bio2RDF project] has published 40 biology-, gene- and medical-related datasets (altogether 2.3 billion triples). <br />
The datasets are available via SPARQL endpoints and as Linked Data. It is recommended that you use the [http://sourceforge.net/project/platformdownload.php?group_id=142631 Bio2RDF Java Servlet], and optionally [http://quebec.bio2rdf.org/download/virtuoso/indexed/ download the databases] for efficient personal use. Running your own instance of the [http://virtuoso.openlinksw.com/wiki/main/Main/VirtEC2AMIBio2rdfInstall OpenLink Virtuoso AMI for EC2] is also an option (and for basic URI resolution doesn't require the Java Servlet, although if you want advanced queries you should still download it and configure it to query your EC2 sparql endpoint).<br />
<br />
* [http://www.freebase.com/view/user/bio2rdf/public/sparql Bio2RDF sparql endpoint list] [http://rdf.freebase.com/rdf/user/bio2rdf/public/sparql Sparql endpoint list in RDF]<br />
* [http://linkeddata.openlinksw.com:8891/pubmed:10500064 Identification of an autoimmune enteropathy-related 75-kilodalton antigen], via an [[OpenLinkSoftware|OpenLink]] hosted edition of [[Bio2Rdf]] <br />
* [http://linkeddata.openlinksw.com:8891/pubmed:9636670 Structure of the gene encoding the human cyclin-dependent kinase inhibitor p18 and mutational analysis in breast cancer], via an [[OpenLinkSoftware|OpenLink]] hosted edition of [[Bio2Rdf]] <br />
* [http://beckr.org/marbles?uri=http%3A%2F%2Fbio2rdf.org%2Fpubmed%3A9626117 PubMed article] viewed using the Marbles Linked Data browser.<br />
* [http://beckr.org/marbles?lang=en&uri=http%3A%2F%2Fbio2rdf.org%2Ffoaf%3AClemens%2C_T_L PubMed author] viewed using the Marbles Linked Data browser.<br />
* [http://beckr.org/marbles?uri=http%3A%2F%2Fbio2rdf.org%2Fomim%3A161555 OMIM Killer Cell Lectin-Like Receptor] viewed using the Marbles Linked Data browser.<br />
* [http://iws.seu.edu.cn/services/falcons/objectsearch/queryresult.jsp?query=%22KILLER+CELL%22 Falcons Search for KILLER CELL]. The Bio2RDF data has been crawled by the Falcons Semantic Web Search engine. This is an example of how the data is accessed by humans using the search engine. Falcons also offers an API that can be used by applications to access the data.<br />
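The Bio2RDF links in the list above follow a visible pattern: a resource URI of the form <code>http://bio2rdf.org/namespace:identifier</code>, optionally passed percent-encoded to the Marbles browser. A small Python sketch of that pattern follows. The function names are hypothetical, and the pattern is inferred only from the example links on this page, not from Bio2RDF documentation.

```python
from urllib.parse import quote


def bio2rdf_uri(namespace: str, identifier: str) -> str:
    """Build a Bio2RDF-style URI, e.g. namespace 'omim' and identifier '161555'."""
    return f"http://bio2rdf.org/{namespace}:{identifier}"


def marbles_view_url(resource_uri: str) -> str:
    """Wrap a resource URI in a Marbles browser link, percent-encoding the URI."""
    return "http://beckr.org/marbles?uri=" + quote(resource_uri, safe="")


pubmed_uri = bio2rdf_uri("pubmed", "9626117")
print(pubmed_uri)
print(marbles_view_url(pubmed_uri))
```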
<br />
=== Chem2bio2RDF ===<br />
<br />
* Information about the [http://chem2bio2rdf.org/ chem2bio2rdf] data sets<br />
<br />
=== Data Sets for the LODD Task ===<br />
<br />
To complement the drug-related Web of Data built by the LODD effort, the following data sets could/should also be published as Linked Data.<br />
<br />
The LODD effort is currently gathering more information about relevant datasets. See also [[/DataSetEvaluation|Evaluation of LODD Data Sets]] for current evaluation results.<br />
<br />
* [http://library.dialog.com/bluesheets/html/bl0107.html Adis R&D Insight]<br />
* [http://www.ebi.ac.uk/chebi/ chEBI]<br />
* [http://xpdb.nist.gov/pdb/chemblast.html ChemBlast]<br />
* [http://www.chemspider.com/ ChemSpider]<br />
* [http://ClinicalTrials.gov ClinicalTrials.gov]<br />
* [http://www.citeline.com/trialtrove.html Citeline TrialTrove]<br />
* [http://dailymed.nlm.nih.gov/dailymed/about.cfm DailyMed]<br />
* [http://dbpedia.org/About DBpedia]<br />
* [http://www.nd.edu/~alb/Publication06/145-HumanDisease_PNAS-14My07-Proc/Suppl/ Diseasome]<br />
* [http://www.drugbank.ca/ Drug Bank]<br />
* [http://www.virtualref.com/abs/72.htm DrugDB]<br />
* [http://www.ncbi.nlm.nih.gov/pubmed/17921997 Drugome]<br />
* [http://lsdis.cs.uga.edu/projects/asdoc/ Drug Ontology]<br />
* [http://scientific.thomsonreuters.com/products/iddb/ Investigational Drug Database] - Proprietary<br />
* [http://www.ovid.com/site/catalog/DataBase/1244.jsp?top=2&mid=3&bottom=7&subsection=10 IMS]<br />
* [http://www.genome.jp/kegg/drug/ KEGG Drug]<br />
* [http://Lillytrials.com LillyTrials]<br />
* [http://www.fda.gov/medwatch/ MedWatch]<br />
* [http://www.fda.gov/cder/ndc/ National Drug Code]<br />
* [http://www.ncbi.nlm.nih.gov/omim/ OMIM]<br />
* [http://www.fda.gov/cder/ob/ Orange Book]<br />
* [http://www.pharmaprojects.com/ Pharmaprojects] - Proprietary<br />
* [http://pubchem.ncbi.nlm.nih.gov/ PubChem]<br />
* [http://www.nlm.nih.gov/research/umls/rxnorm/ RxNorm]<br />
* [http://www.ncbi.nlm.nih.gov/sites/entrez?Db=pubmed&Cmd=ShowDetailView&TermToSearch=15360858 VA NDF-RT]<br />
* Other data sources could include blogs, discussion boards, wikis, etc.<br />
* and.... <br />
** [http://www.who.int/globalatlas/ World Health Organization's Global Health Atlas]<br />
** [http://www.epispider.org/ EpiSPIDER]<br />
** [http://www.accessdata.fda.gov/Scripts/cder/DrugsatFDA/ Drugs@FDA - FDA Approved Drug Products]<br />
** [http://www.drugdigest.org/wps/portal/ddigest DrugDigest]<br />
** [http://humancyc.org/ HumanCyc: Encyclopedia of ''Homo sapiens'' Genes and Metabolism]<br />
** [http://www.alzforum.org/ Alzheimer Research Forum]<br />
** [http://wwwcf.nlm.nih.gov/umlslicense/rxtermApp/rxTerm.cfm RxTerms]<br />
** [http://hudine.neu.edu/ HuDiNe]<br />
** [http://wiki.medpedia.com/Cymbalta Medpedia]<br />
** [http://tcm.lifescience.ntu.edu.tw/ TCMGeneDIT] and [[Media:HCLSIG$$LODD$$Data$TCMGeneDIT_RDF_Dataset_r1.zip|RDF dump]]<br />
** [http://www.tuftsctsi.org/~/media/Files/CTSI/Library%20Files/FCC%20for%20CER%20Rpt%20to%20Pres%20and%20Congress_063009.ashx List of other possible data sources from page 66 onwards]<br />
<br />
=== Alternative Herbal Medicine use case ===<br />
* [[Data/TCMGeneDIT|TCMGeneDIT dataset]]<br />
<br />
=== Identifier-Based Linkage Points ===<br />
<br />
* InChIs<br />
* [[PubChem]] Compound ID (CID)<br />
* [[PubChem]] NSC<br />
* Chemical Abstract ID (CAS)<br />
* New Drug Application (NDA)<br />
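The linkage points listed above can be used to connect records from different datasets that carry the same identifier value. A minimal Python sketch of this idea follows, matching on a CAS number. The record layout, the field name <code>cas</code>, and the choice of <code>owl:sameAs</code> as the link predicate are assumptions made for illustration and do not describe the actual LODD interlinking method.

```python
from collections import defaultdict

# Illustrative records only; the field names are assumptions,
# not the datasets' actual schemas.
drugbank_records = [
    {"uri": "http://www4.wiwiss.fu-berlin.de/drugbank/resource/drugs/DB00945",
     "cas": "50-78-2"},
]
dbpedia_records = [
    {"uri": "http://dbpedia.org/resource/Aspirin", "cas": "50-78-2"},
    {"uri": "http://dbpedia.org/resource/HIV", "cas": None},
]


def link_by_identifier(left, right, key):
    """Pair records from two datasets that share the same non-empty identifier."""
    index = defaultdict(list)
    for record in right:
        if record.get(key):
            index[record[key]].append(record["uri"])
    return [
        (record["uri"], "owl:sameAs", target)
        for record in left
        if record.get(key)
        for target in index.get(record[key], [])
    ]


links = link_by_identifier(drugbank_records, dbpedia_records, "cas")
for subject, predicate, obj in links:
    print(subject, predicate, obj)
```

Records without a value for the chosen identifier are skipped rather than matched, so a missing identifier never produces a link.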
<br />
=== Data Set Attributes ===<br />
<br />
* Licensing<br />
* Data Format<br />
* Identifiers</div>Rboycehttps://www.w3.org/wiki/index.php?title=HCLSIG/LODD/Meetings/2010-07-07_Conference_Call&diff=45030HCLSIG/LODD/Meetings/2010-07-07 Conference Call2010-07-07T16:03:18Z<p>Rboyce: /* Minutes */</p>
<hr />
<div>== Conference Details ==<br />
* Date of Call: Wednesday July 7, 2010<br />
* Time of Call: 11:00am Eastern Daylight Time (EDT), 16:00 British Summer Time (BST), 17:00 Central European Summer Time (CEST)<br />
* Dial-In #: +1.617.761.6200 (Cambridge, MA)<br />
* Dial-In #: +33.4.89.06.34.99 (Nice, France)<br />
* Dial-In #: +44.117.370.6152 (Bristol, UK) <br />
* Participant Access Code: 4257 ("HCLS"). <br />
* IRC Channel: irc.w3.org port 6665 channel #HCLS (see [http://www.w3.org/Project/IRC/ W3C IRC page] for details, or see [http://cgi.w3.org/member-bin/irc/irc.cgi Web IRC])<br />
* Duration: ~1h<br />
* Convener: Susie <br />
<br />
== Agenda ==<br />
* Mapping experimental data - All<br />
- [http://esw.w3.org/HCLSIG/LODD/Mapping_Experimental_Data Feedback on Wiki site]<br />
- [http://esw.w3.org/images/d/d0/ISMB2010_Final.pdf Learning from the BioOntologies Paper] <br />
* Seed grants - Susie<br />
* Data updates - Egon, Matthias, Anja, Oktie<br />
* AOB<br />
<br />
== Minutes ==<br />
<br />
Attendees: Bosse, Rich, Matthias, Oktie, Claus, Elgar, Susie<br />
<br />
<Susie> http://esw.w3.org/HCLSIG/LODD/Mapping_Experimental_Data<br />
<br />
<rboyce> agenda item 1: feedback on the wiki site<br />
<br />
<rboyce> 10 questions -- are these good questions? Should there be other questions?<br />
<br />
<rboyce> Discussion about the definition of experimental data...does it include EHR data?<br />
<br />
<rboyce> EHR record could be an experimental dataset<br />
<br />
<rboyce> aka "instance data"<br />
<br />
<rboyce> main focus -- how do we best model what is often very complex data produced from experiments in RDF?<br />
<br />
<rboyce> are these good questions to have answered in a "best practice" document?<br />
<br />
<rboyce> bbalsa: questions: once data is published -- how can the original data be augmented with additional insights?<br />
<br />
<rboyce> bbalsa: this question would be helpful for scientists publishing their data in RDF<br />
<br />
<rboyce> http://en.wikipedia.org/wiki/Entity-attribute-value_model<br />
<br />
<rboyce> question: it seems that experience with best practices for representing health data in the entity-attribute-value (EAV) model would be applicable to representing experimental data in RDF. Has anyone looked into this?<br />
<br />
<Susie> http://esw.w3.org/HCLSIG/LODD/Mapping_Experimental_Data#Mapping_Experimental_Data_to_RDF<br />
<br />
<rboyce> example paper on best practices for EAV modeling: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2110957/<br />
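
Note on the EAV question above: an entity-attribute-value row has the same three-part shape as an RDF triple (subject, predicate, object), which is why the comparison comes up. A minimal Python sketch of that correspondence follows. The rows, the <code>http://example.org/</code> namespace and the function name are hypothetical and were not discussed on the call.

```python
# Hypothetical EAV rows: (entity id, attribute name, value).
eav_rows = [
    ("patient42", "age", 71),
    ("patient42", "diagnosis", "Alzheimer's disease"),
]

BASE = "http://example.org/"  # placeholder namespace, not a real dataset


def eav_to_triples(rows, base=BASE):
    """Map each EAV row to a (subject URI, predicate URI, literal value) triple."""
    return [
        (f"{base}resource/{entity}", f"{base}vocab/{attribute}", value)
        for entity, attribute, value in rows
    ]


triples = eav_to_triples(eav_rows)
for triple in triples:
    print(triple)
```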
<br />
<rboyce> some of the questions would be much harder to address than others<br />
<br />
<rboyce> for example, determining URI namespaces might be less involved than some others<br />
<br />
<rboyce> question: questions regarding tools -- e.g. how to get the tool to work with linked data and usable interfaces<br />
<br />
<rboyce> Susie -- tools questions might not be appropriate for a 'best practices' paper<br />
<br />
<rboyce> mapping should be independent of implementation<br />
<br />
<rboyce> oktie -- as data sources grow; d2r mapping might not scale to allow efficient query responses<br />
<br />
<rboyce> oktie -- how important is scalability?<br />
<br />
<rboyce> Susie -- scalability is good to consider when making recommendations <br />
<br />
<rboyce> Is this a general "best practice" document?<br />
<br />
<rboyce> Susie -- the document should be applicable to any disease and patient population<br />
<br />
<rboyce> The first questions would be very helpful to researchers new to using RDF; would save people time and confusion.<br />
<br />
<rboyce> how well does using ADNI as a focus area generalize ?<br />
<br />
<rboyce> Susie -- it is a realistic data set that might be a very good starting point<br />
<br />
<rboyce> There might be other useful data sources that are not in relational databases -- what about those?<br />
<br />
<rboyce> Susie -- it is possible to convert (e.g. XML to triple store) but relational to RDF would be a good place to start.<br />
<br />
<rboyce> clausstie: introduction -- experienced in medicinal chemistry, IT, knowledge management<br />
<br />
<rboyce> Are we restricting this to experimental data?<br />
<br />
<rboyce> Susie -- would like to start with an experimental data set because it is more complicated<br />
<br />
<rboyce> Susie -- other types of datasets of interest?<br />
<br />
<rboyce> How do we assign ids? Do we create our own and map other objects to the new ones?<br />
<br />
<rboyce> Mapping might be too general of a term -- changes from application to application<br />
<br />
<rboyce> we should be precise about what we mean by "mapping"<br />
<br />
<rboyce> implementation should be last decision -- are we restricting this to RDF?<br />
<br />
<rboyce> Should we focus on mapping concepts etc. first, then implementation? <br />
<br />
<rboyce> Susie -- what questions would we want to ask of the data set --- this might influence the way we model the data<br />
<br />
<rboyce> Susie -- the representation choice might be predetermined (as RDF) given the purpose of this SIG<br />
<br />
<rboyce> Which entities should be classes and which ones should be instances?<br />
<br />
<rboyce> Susie -- we should probably be thinking of some other data sources to include so that we could demonstrate federated queries<br />
<br />
<rboyce> and aggregation<br />
<br />
<rboyce> Susie -- will take the bioontologies paper (http://esw.w3.org/images/d/d0/ISMB2010_Final.pdf) and extract findings that help address some of the questions<br />
<br />
<rboyce> Susie -- please document the steps one would take to map ADNI to RDF<br />
<br />
<rboyce> Susie -- please be thinking about ADNI and potential complementary data sources (e.g. drug bank, clinicaltrials.gov)<br />
<br />
<rboyce> Susie -- will create a wiki page with new/updated questions<br />
<br />
<rboyce> ----------------<br />
<br />
<rboyce> Seed grants...<br />
<br />
<Susie> https://www.jnjcosat.com/cosat<br />
<br />
<rboyce> --------------<br />
<br />
<rboyce> Data updates<br />
<br />
<rboyce> Anja mentioned (by email to Richard) that Drug Bank RDF mapping will be edited to address issues: <https://sourceforge.net/projects/loddproject/forums/forum/910130/topic/3719723/index/page/1></div>Rboyce