HCLSIG/CDS/Datasets and ontologies
Datasets and ontologies relevant for the CDS task force
Datasets
Drug datasets
Drugbank
Drugbank.ca provides drug (i.e., chemical, pharmacological and pharmaceutical) data with comprehensive drug target (i.e., sequence, structure, and pathway) information. It also includes information on drug-drug and drug-food interactions.
RxNorm
A linked version of the NLM's RxNorm database that connects prescription drugs, ingredients, and National Drug Codes (NDC) through the RXCUI, a concept unique identifier. RxNorm is developed by the NIH's National Library of Medicine. It currently interlinks 12 drug vocabularies around a unique concept identifier. Due to licensing restrictions, only six of these vocabularies are made available as part of the LODD cloud: Medical Subject Headings, Metathesaurus FDA National Drug Code Directory, Metathesaurus FDA Structured Product Labels, National Drug File, RxNorm Vocabulary, and Veterans Health Administration National Drug File.
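The role of the RXCUI as a shared key across vocabularies can be illustrated with a minimal Python sketch. The records, source codes, and RXCUI values below are hypothetical placeholders, not real RxNorm data, and the helper names `index_by_rxcui` and `equivalents` are invented for this illustration.

```python
from collections import defaultdict

# Hypothetical records: (source vocabulary, source-specific code, RXCUI).
# All codes and RXCUI values are placeholders, not real RxNorm data.
RECORDS = [
    ("RxNorm Vocabulary", "rx-code-1", "RXCUI-A"),
    ("National Drug File", "ndf-code-7", "RXCUI-A"),
    ("Medical Subject Headings", "mesh-code-3", "RXCUI-A"),
    ("National Drug File", "ndf-code-9", "RXCUI-B"),
]


def index_by_rxcui(records):
    """Group source-specific codes under their shared concept identifier."""
    index = defaultdict(dict)
    for vocabulary, code, rxcui in records:
        index[rxcui].setdefault(vocabulary, []).append(code)
    return dict(index)


def equivalents(index, rxcui, vocabulary):
    """Return the codes a given vocabulary uses for one concept."""
    return index.get(rxcui, {}).get(vocabulary, [])


if __name__ == "__main__":
    idx = index_by_rxcui(RECORDS)
    print(equivalents(idx, "RXCUI-A", "National Drug File"))
```

Because every vocabulary entry points at the same concept identifier, moving from a code in one vocabulary to its counterpart in another is a lookup on the shared key, with no pairwise mapping tables needed.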
National Drug File Reference Terminology (NDF-RT)
NDF-RT is the terminology used by the FDA and the FedMed collaboration to code these essential pharmacologic properties of medications: Mechanism of Action, Physiologic Effect, and Structural Class.
Drug interaction knowledge base
Known and predicted metabolic inhibition drug-drug interactions with links to and summaries of evidence. HTML rendering: http://dbmi-icode-01.dbmi.pitt.edu/dikb-evidence/front-page.html D2R and SPARQL endpoint: http://dbmi-icode-01.dbmi.pitt.edu:2020/.
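A SPARQL endpoint such as the one listed above is normally queried by sending the query text in a `query` URL parameter, as defined by the SPARQL protocol. The following Python sketch only builds such a request URL and does not contact the server. The `sparql` path segment is an assumption (it is the usual D2R Server default and is not confirmed by this page), and the generic triple pattern is a placeholder, since the vocabulary used by the knowledge base is not described here.

```python
from urllib.parse import urlencode

# Host and port as given in the text; the "sparql" path is an assumption
# (the usual D2R Server default), not something this page confirms.
ENDPOINT = "http://dbmi-icode-01.dbmi.pitt.edu:2020/sparql"


def generic_query(limit=10):
    """Return a schema-agnostic SPARQL query listing a few triples."""
    return f"SELECT ?s ?p ?o WHERE {{ ?s ?p ?o }} LIMIT {int(limit)}"


def build_sparql_request(endpoint, limit=10):
    """Build a SPARQL-protocol GET URL carrying the query text."""
    return endpoint + "?" + urlencode({"query": generic_query(limit)})


if __name__ == "__main__":
    # Prints the URL only; fetching it is left to the reader, because
    # the availability of the endpoint is not guaranteed.
    print(build_sparql_request(ENDPOINT, limit=5))
```

Starting with a generic triple listing is a cautious way to explore an unfamiliar endpoint, because it reveals which predicates are actually in use before any dataset-specific query is written.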
Datasets containing associations between genetic variation, associated phenotypes and genetic tests
Pharmacogenomics Knowledgebase / PharmGKB
A large database of curated knowledge and raw data about associations between genes, genetic variants, drug response and disease.
GWAS Central (formerly called HGVbaseG2P)
A database of genome-wide association studies that also provides summaries of study results.
SNPedia
A wiki-based platform containing information on phenotypes associated with SNP variants, population prevalence of genetic variants and SNP microarrays.
GET-Evidence (evidence.personalgenomes.org)
A large database of automatically annotated and then manually curated information about the impact of genetic variations. Example: http://evidence.personalgenomes.org/MYL2-A13T
Online Mendelian Inheritance in Man (OMIM)
Information about diseases with Mendelian inheritance, including references to the implicated genes.
dbGaP
Results of studies that have investigated the interaction of genotype and phenotype.
Information on population prevalence of genetic variants, gene-disease associations, gene-gene and gene- environment interactions, and evaluation of genetic tests.
Genetic Association Database (GAD)
Diseases associated with genetic variants.
Genotator
Aggregated gene-disease relationship data containing an integrated view over other datasets.
NCBI GeneTests
Genetics Home Reference
Genome databases with general data about genetic variation and human genomes
dbSNP
Locus Reference Genomic / LRG
An internationally recognized reference database providing stable genomic DNA sequences and identifiers for regions of the human genome.
dbVar
Large-scale genetic structural variation data (e.g., insertions, deletions).
HapMap
Collections of personal genetic data
1000 genomes project
Genome sequences of over 1000 volunteers.
Database of the Estonian Genome Center, University of Tartu
A collection of genetic data associated with health and lifestyle data of over 50,000 persons.
Personal Genome Project
Whole-genome data donated by volunteers.
Vanderbilt Biobank
See http://www.nature.com/clpt/journal/v84/n3/full/clpt200889a.html
Relevant ontologies and taxonomies
Suggested Ontology for Pharmacogenomics (SO-PHARM)
A complex ontology covering the representation of genetic variation and pharmacogenomics.
Pharmacogenomics Ontology (PO)
Represents PharmGKB data; ontology for measures and outcomes.
Pharmacogenomics Relationship Ontology (PHARE)
Proposes concepts and roles to represent relationships of pharmacogenomics interest. Used for representing findings extracted from texts.
Sequence Ontology (SO)
Contains terms often used for the annotation of sequences and features, including detailed description of different types of sequence variations.
Disease Ontology
An ontology of human diseases.
Human Phenotype Ontology (HPO)
Mammalian Phenotype Ontology
Phenotypic Quality Ontology (PATO)
An ontology of types of phenotypic properties.
Logical Observation Identifiers Names and Codes (LOINC)
An established coding system for clinical lab results. Contains many identifiers for results of genetic tests.
Formats and schemas
OMG SNP
A simple XML schema for the representation of SNPs [1]. Maintained by the Object Management Group (OMG).
Genomic Sequence Variation Markup Language (GSVML), ISO 25720:2009
An XML schema geared towa [2]. Maintained by the International Organization for Standardization (ISO).