HCLS/HCLS dbpedia
Mapping Life Science and Health Care ontologies and datasets to DBpedia and YAGO
Task Objectives
- Identifying possible mappings between HCLS data (such as the OBO ontologies, MeSH in SKOS, NeuronDB) and DBpedia, and listing them on this wiki page together with queries that could be used for automated mapping.
- Choosing relations to use for the mappings (e.g., rdfs:seeAlso, owl:sameAs, owl:equivalentClass)
- Writing, publishing and running scripts to create the mappings
- Deciding how to publish the mappings on the web (downloadable .zip files, SPARQL endpoint, dedicated graph in the SPARQL endpoint of the HCLS KB, linked data...)
- Publishing the mappings
- Agreeing on a community process to update the mappings periodically
Participants
- Matthias Samwald (DERI Galway)
- Kingsley Idehen (interested in publishing, since DBpedia, UniProt, HCLS and YAGO are all available from Virtuoso instances in close proximity at OpenLink)
- (add yourself if you participate)
Queries and mappings
List all properties in the DBpedia triplestore
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT DISTINCT ?property WHERE { ?property a rdf:Property . }
List all properties that are used to describe proteins
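The query for this step is not recorded on the page. The following is only a minimal sketch. It assumes that a protein resource can be recognised by the presence of http://dbpedia.org/property/uniprot, which is an assumption rather than something confirmed here. The function name is hypothetical.

```python
# Assumption: resources that carry this property describe proteins.
UNIPROT_PROPERTY = "http://dbpedia.org/property/uniprot"


def protein_properties_query(marker_property: str = UNIPROT_PROPERTY) -> str:
    """Build a SPARQL query that lists every property used on resources
    carrying the marker property."""
    return (
        "SELECT DISTINCT ?property WHERE {\n"
        f"  ?protein <{marker_property}> ?accession .\n"
        "  ?protein ?property ?value .\n"
        "}"
    )


if __name__ == "__main__":
    print(protein_properties_query())
```

The resulting property list would still need manual review, since the marker property may also appear on resources that are not proteins.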
Simple query to get a feeling for what the domain and range of a property are
select distinct * where {?resource <http://dbpedia.org/property/SOME_PROPERTY_HERE> ?value} LIMIT 40
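To run this template for many properties, the placeholder can be filled in by a small script. This is a sketch, not the script used for this task. The helper names are hypothetical, and http://dbpedia.org/sparql is assumed to be the endpoint that receives the query.

```python
import re
from urllib.parse import urlencode

ENDPOINT = "http://dbpedia.org/sparql"  # assumed endpoint address
PROPERTY_NAMESPACE = "http://dbpedia.org/property/"


def sample_values_query(property_name: str, limit: int = 40) -> str:
    """Fill the property name into the sampling query shown above."""
    # Accept only simple local names so nothing else can leak into the query.
    if not re.fullmatch(r"[A-Za-z0-9_]+", property_name):
        raise ValueError(f"unexpected property name: {property_name!r}")
    return (
        "SELECT DISTINCT * WHERE { "
        f"?resource <{PROPERTY_NAMESPACE}{property_name}> ?value "
        f"}} LIMIT {limit}"
    )


def request_url(query: str) -> str:
    """Encode the query as a SPARQL Protocol GET request URL."""
    return ENDPOINT + "?" + urlencode({"query": query})


if __name__ == "__main__":
    print(request_url(sample_values_query("goCode")))
```

Inspecting the 40 sample rows for each property gives a quick impression of which kinds of resources use it and what the values look like.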
Properties that are of interest for mapping purposes
Prime candidates
http://dbpedia.org/property/uniprot
http://dbpedia.org/property/goCode
http://dbpedia.org/property/casNumber
http://dbpedia.org/property/casno (sometimes used for chembox identifiers of resources, not for resources themselves)
http://dbpedia.org/property/inchi
http://dbpedia.org/property/chebi
http://dbpedia.org/property/meshname
http://dbpedia.org/property/meshid
http://dbpedia.org/property/diseasesdb
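A mapping script built on these candidates could turn each identifier value into a link to the target dataset. The sketch below assumes that rdfs:seeAlso is the chosen relation (the choice is still open in the task objectives) and that UniProt entries are addressed as http://purl.uniprot.org/uniprot/ followed by the accession. The template table, the function name and the sample row are illustrative and are not taken from DBpedia.

```python
from urllib.parse import quote

RDFS_SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"

# Assumed target URI templates; each one must be verified against the
# target dataset before the mappings are published.
TARGET_TEMPLATES = {
    "http://dbpedia.org/property/uniprot": "http://purl.uniprot.org/uniprot/{id}",
}


def mapping_triples(rows, templates=TARGET_TEMPLATES, relation=RDFS_SEE_ALSO):
    """Yield one N-Triples line per (resource, property, value) row whose
    property has a known target template."""
    for resource, prop, value in rows:
        template = templates.get(prop)
        identifier = value.strip()
        if template is None or not identifier:
            continue  # no template for this property, or an empty identifier
        target = template.format(id=quote(identifier, safe=""))
        yield f"<{resource}> <{relation}> <{target}> ."


if __name__ == "__main__":
    # Illustrative row only; real rows would come from SPARQL query results.
    sample_rows = [
        ("http://dbpedia.org/resource/Insulin",
         "http://dbpedia.org/property/uniprot",
         "P01308"),
    ]
    for line in mapping_triples(sample_rows):
        print(line)
```

Keeping the relation as a parameter makes it easy to switch to owl:sameAs or owl:equivalentClass once the choice of relation has been agreed.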
Scratchpad
http://dbpedia.org/property/iupacName
http://dbpedia.org/property/molecularWeight
http://dbpedia.org/property/casNumber
http://dbpedia.org/property/casno (sometimes used for chembox identifiers of resources, not for resources themselves)
http://dbpedia.org/property/pubchem
http://dbpedia.org/property/smiles
http://dbpedia.org/property/iupacname
http://dbpedia.org/property/inchi
http://dbpedia.org/property/chebi

http://dbpedia.org/property/meshname
http://dbpedia.org/property/meshid

http://dbpedia.org/property/mgiid
http://dbpedia.org/property/omim
http://dbpedia.org/property/homologene

http://dbpedia.org/property/pmid
http://dbpedia.org/property/doi

http://dbpedia.org/property/diseasesdb

http://dbpedia.org/property/icd10 often refers to a separate resource derived from a wiki template, such as http://dbpedia.org/page/Arthritis/icd10/ICD10 (compare this to http://en.wikipedia.org/wiki/Arthritis -- the representation in DBpedia seems puzzling / not usable)

http://dbpedia.org/property/regnum
http://dbpedia.org/property/divisio
http://dbpedia.org/property/ordo
http://dbpedia.org/property/subfamilia
http://dbpedia.org/property/tribus
http://dbpedia.org/property/phylum
http://dbpedia.org/property/genus

From proteins:
http://dbpedia.org/property/interpro
http://dbpedia.org/property/scop
http://dbpedia.org/property/opmProtein
http://dbpedia.org/property/pfam
http://dbpedia.org/property/pdb
http://dbpedia.org/property/prosite
http://dbpedia.org/property/smart
http://dbpedia.org/property/opmFamily
http://dbpedia.org/property/name
http://dbpedia.org/property/hgncid
http://dbpedia.org/property/omim
http://dbpedia.org/property/chromosome
http://dbpedia.org/property/band
http://dbpedia.org/property/entrezgene
http://dbpedia.org/property/refseq
http://dbpedia.org/property/arm
http://dbpedia.org/property/uniprot
http://dbpedia.org/property/umichopmProperty
http://dbpedia.org/property/homologene
http://dbpedia.org/property/mgiid
http://dbpedia.org/property/ecNumber
http://dbpedia.org/property/iubmbEcNumber
http://dbpedia.org/property/goCode
Notes
- The domains of some properties appear to be heterogeneous, so additional restrictions (e.g., on the YAGO classification of a resource) might be needed.
- Some properties are rarely used (fewer than 200 times) and can be disregarded for the mappings.
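Both notes can be checked with a small helper. This is a sketch under stated assumptions: the endpoint is assumed to support SPARQL 1.1 aggregate syntax, the optional class URI stands in for whichever YAGO class is chosen as a restriction, and the function names are hypothetical.

```python
MIN_USES = 200  # usage threshold taken from the note above


def usage_count_query(property_uri, class_uri=None):
    """Build a query that counts how often a property is used, optionally
    restricted to resources of one class (e.g. a YAGO class)."""
    type_restriction = f"  ?resource a <{class_uri}> .\n" if class_uri else ""
    return (
        "SELECT (COUNT(*) AS ?uses) WHERE {\n"
        f"  ?resource <{property_uri}> ?value .\n"
        f"{type_restriction}"
        "}"
    )


def widely_used(counts, min_uses=MIN_USES):
    """Return the properties used at least min_uses times, sorted by URI."""
    return sorted(prop for prop, uses in counts.items() if uses >= min_uses)


if __name__ == "__main__":
    print(usage_count_query("http://dbpedia.org/property/uniprot"))
```

Here `counts` is a dictionary from property URI to usage count, filled from the results of the count queries.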