WebSchemas/BioDatabases
Overview
This page discusses a schema extension for describing biological databases, proposed by MORITA Mizuki (NIBIO) on behalf of Sagace (a biological database search engine) and NBDC (National Bioscience Database Center, Japan).
Vocabulary
- Adds a class 'BiologicalDatabaseEntry' as a kind of CreativeWork, introducing 'entryID', 'isEntryOf', 'taxon', 'seeAlso', and 'reference'. Adds 'BiologicalDatabase', also a subclass of CreativeWork, introducing only 'reference'. Both also use 'breadcrumb' from WebPage.
BiologicalDatabaseEntry
Properties for a biological database entry:
Property | Expected Type | Description |
---|---|---|
Properties from Thing | ||
additionalType | URL | An additional type for the item, typically used for adding more specific types from external vocabularies in microdata syntax. |
description | Text | A short description of the entry. |
image | URL | URL of an image of the entry. |
name | Text | The name of the entry. |
url | URL | URL of the entry. |
Properties from CreativeWork | ||
alternativeHeadline | Text | A secondary title of the entry. |
inLanguage | Language | The language of the content. Please use one of the language codes from the IETF BCP 47 standard. |
dateCreated | Date | The date on which the content was created (in ISO 8601 date format). |
dateModified | Date | The date on which the content was most recently modified (in ISO 8601 date format). |
keywords | Text | The keywords/tags used to describe this content. |
provider | Person or Organization | Specifies the person or organization that distributed the content. |
Properties from WebPage | ||
breadcrumb | Text | A set of links that can help a user understand and navigate a website hierarchy. |
Original properties in BiologicalDatabaseEntry | ||
entryID | Text | The identifier of the entry. |
isEntryOf | BiologicalDatabase | Indicates the database to which the entry belongs. |
taxon | BiologicalDatabaseEntry or Text | The taxonomy information of the entry. |
seeAlso | BiologicalDatabaseEntry or URL | Reference to another resource. |
reference | Text or URL | The identifier of the reference, such as a PMID, DOI, or PMCID (for example, pmid:22080510, as used in the KEGG example on this page). If the reference has no identifier, use its URL. |
BiologicalDatabase
Properties for a biological database:
Property | Expected Type | Description |
---|---|---|
Properties from Thing | ||
additionalType | URL | An additional type for the item, typically used for adding more specific types from external vocabularies in microdata syntax. |
description | Text | A short description of the database. |
image | URL | URL of an image of the database. |
name | Text | The name of the database. |
url | URL | URL of the database. |
Properties from CreativeWork | ||
alternativeHeadline | Text | A secondary title of the database. |
inLanguage | Language | The language of the content. Please use one of the language codes from the IETF BCP 47 standard. |
dateCreated | Date | The date on which the content was created (in ISO 8601 date format). |
dateModified | Date | The date on which the content was most recently modified (in ISO 8601 date format). |
keywords | Text | The keywords/tags used to describe this content. |
provider | Person or Organization | Specifies the person or organization that distributed the content. |
Properties from WebPage | ||
breadcrumb | Text | A set of links that can help a user understand and navigate a website hierarchy. |
Original properties in BiologicalDatabase | ||
reference | Text or URL | The identifier of the reference, such as a PMID, DOI, or PMCID (for example, pmid:22080510, as used in the KEGG example on this page). If the reference has no identifier, use its URL. |
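Because 'reference' accepts either an identifier or a URL, a consumer has to tell the two apart. The following is a minimal sketch of one way to do that, not part of the proposal. The "pmid:" prefix is taken from the KEGG example on this page; the "doi:" and "pmcid:" prefixes and the function name classify_reference are assumptions made for illustration.

```python
from typing import Tuple
from urllib.parse import urlparse

# Only "pmid:" appears in the examples on this page.
# "doi:" and "pmcid:" are assumed by analogy.
KNOWN_PREFIXES = ("pmid", "doi", "pmcid")


def classify_reference(value: str) -> Tuple[str, str]:
    """Return (kind, identifier) for one 'reference' property value."""
    value = value.strip()
    scheme, sep, rest = value.partition(":")
    if sep and scheme.lower() in KNOWN_PREFIXES:
        return scheme.lower(), rest
    parsed = urlparse(value)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        # No identifier prefix: the value is the URL of the reference itself.
        return "url", value
    # Anything else is kept as free text.
    return "text", value


print(classify_reference("pmid:22080510"))
```

Values that match neither form are returned as plain text rather than rejected, since the expected type also allows Text.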
Example Markup
- You can test the following examples with Live Microdata.
- Please see the Sagace "how to mark up" page for more examples.
BiologicalDatabaseEntry
JCRB0225 [COLO320 DM] Profile: Human colon carcinoma cell line with double minute chromosomes. Tags: tumor, colon, adenocarcinoma Date created: 08/27/2007 Animal: human Organism: Homo sapiens (human) Taxonomy ID: 9606 [UniProt Taxonomy] JCRB Cell Bank
<div itemscope itemtype="http://schema.org/BiologicalDatabaseEntry">
<h1><a itemprop="url" href="http://cellbank.nibio.go.jp/legacy/celldata/jcrb0225.htm">
<span itemprop="entryID">JCRB0225</span> [<span itemprop="name">COLO320 DM</span>]
</a></h1>
Profile: <span itemprop="description">Human colon carcinoma cell line with double minute chromosomes.</span>
Tags: <span itemprop="keywords">tumor</span>, <span itemprop="keywords">colon</span>, <span itemprop="keywords">adenocarcinoma</span>
Date created: <meta itemprop="dateCreated" content="2007-08-27">08/27/2007
Animal: human
<span itemprop="taxon" itemscope itemtype="http://schema.org/BiologicalDatabaseEntry">
Organism: <span itemprop="name">Homo sapiens</span> (human)
Taxonomy ID: <a itemprop="url" href="http://www.uniprot.org/taxonomy/9606"><span itemprop="entryID">9606</span></a>
[<span itemprop="isEntryOf" itemscope itemtype="http://schema.org/BiologicalDatabase"><a itemprop="url" href="http://purl.uniprot.org/taxonomy/"><span itemprop="name">UniProt Taxonomy</span></a></span>]
</span>
<span itemprop="isEntryOf" itemscope itemtype="http://schema.org/BiologicalDatabase">
<a itemprop="url" href="http://cellbank.nibio.go.jp/">
<span itemprop="name">JCRB Cell Bank</span>
</a>
</span>
</div>
KEGG DISEASE: H00653 Entry: H00653 Name: Marfan syndrome, including: Marfan syndrome (MFS); Neonatal MFS; Atypically severe MFS; New variant of MFS Description: Marfan syndrome (MFS) is a relatively common autosomal dominant disorder ... Other DBs: ICD-10: Q87.4 OMIM: 154700 Species: Human KEGG DISEASE (Diseases viewed as perturbed states of the molecular system)
<div itemscope itemtype="http://schema.org/BiologicalDatabaseEntry">
<a itemprop="url" href="http://www.kegg.jp/dbget-bin/www_bget?ds:H00653"><span itemprop="name">KEGG DISEASE: H00653</span></a>
Entry:
<span itemprop="entryID">H00653</span>
Name: <span itemscope itemtype="http://schema.org/MedicalCondition">
<span itemprop="code" itemscope itemtype="http://schema.org/MedicalCode">
<meta itemprop="codeValue" content="Q87.4">
<meta itemprop="codingSystem" content="ICD-10">
</span>
</span>
Marfan syndrome, including:
Marfan syndrome (MFS);
Neonatal MFS;
Atypically severe MFS;
New variant of MFS
Description:
<span itemprop="description">Marfan syndrome (MFS) is a relatively common autosomal dominant disorder ...</span>
Other DBs:
<span itemprop="seeAlso" itemscope itemtype="http://schema.org/BiologicalDatabaseEntry">
<span itemprop="isEntryOf" itemscope itemtype="http://schema.org/BiologicalDatabase">
<span itemprop="name">ICD-10</span>
</span>:
<a itemprop="url" href="http://www.kegg.jp/kegg-bin/get_htext?br08403+H00653"><span itemprop="entryID">Q87.4</span></a></span>
<span itemprop="seeAlso" itemscope itemtype="http://schema.org/BiologicalDatabaseEntry"><span itemprop="isEntryOf" itemscope itemtype="http://schema.org/BiologicalDatabase">
<span itemprop="name">OMIM</span>
</span>: <a itemprop="url" href="http://omim.org/entry/154700"><span itemprop="entryID">154700</span></a></span>
Species:
<span itemprop="taxon" itemscope itemtype="http://schema.org/BiologicalDatabaseEntry">
<span itemprop="name">Human</span>
</span>
<span itemprop="isEntryOf" itemscope itemtype="http://schema.org/BiologicalDatabase">
<a itemprop="url" href="http://www.kegg.jp/kegg/disease/"><span itemprop="name">KEGG DISEASE (Diseases viewed as perturbed states of the molecular system)</span></a>
</span>
</div>
BiologicalDatabase
JCRB Cell Bank Profile: JCRB Cell Bank is the first cell bank in Japan. We collect ... Date established: 10/1984 Last modified: 02/28/2011 Operated by: National Institute of Biomedical Innovation (NIBIO)
<div itemscope itemtype="http://schema.org/BiologicalDatabase">
<span itemprop="name"><a itemprop="url" href="http://cellbank.nibio.go.jp/">JCRB Cell Bank</a></span>
Profile: <span itemprop="description">JCRB Cell Bank is the first cell bank in Japan. We collect ...</span>
Date established: <meta itemprop="dateCreated" content="1984-10">10/1984
Last modified: <meta itemprop="dateModified" content="2011-02-28">02/28/2011
Operated by: <span itemprop="provider" itemscope itemtype="http://schema.org/Organization">
<a itemprop="url" href="http://www.nibio.go.jp/"><span itemprop="name">National Institute of Biomedical Innovation (NIBIO)</span></a>
</span>
</div>
KEGG: Kyoto Encyclopedia of Genes and Genomes Profile: KEGG (Kyoto Encyclopedia of Genes and Genomes) is a database resource that ... Date established: 1995 Current release: Release 64.0, October 1, 2012 Developed by: Kanehisa Laboratories References: Kanehisa M, et al. Nucleic Acids Res. 40, D109-D114 (2012). [pubmed] Kanehisa M and Goto S. Nucleic Acids Res. 28, 27-30 (2000). [pubmed] Kanehisa M, et al. PNE, 52(12), 1486-1491 (2007). [PNE]
<div itemscope itemtype="http://schema.org/BiologicalDatabase">
<span itemprop="name"><a itemprop="url" href="http://www.kegg.jp/">KEGG: Kyoto Encyclopedia of Genes and Genomes</a></span>
Profile: <span itemprop="description">KEGG (Kyoto Encyclopedia of Genes and Genomes) is a database resource that ...</span>
Date established: <meta itemprop="dateCreated" content="1995">1995
Current release: Release 64.0, <meta itemprop="dateModified" content="2012-10-01">October 1, 2012
Developed by: <span itemprop="provider" itemscope itemtype="http://schema.org/Organization"><a itemprop="url" href="http://www.kanehisa.jp/"><span itemprop="name">Kanehisa Laboratories</span></a></span>
References:
Kanehisa M, et al. Nucleic Acids Res. 40, D109-D114 (2012). [<meta itemprop='reference' content='pmid:22080510'/><a href="http://www.ncbi.nlm.nih.gov/pubmed/22080510">pubmed</a>]
Kanehisa M and Goto S. Nucleic Acids Res. 28, 27-30 (2000). [<meta itemprop='reference' content='pmid:10592173'/><a href="http://www.ncbi.nlm.nih.gov/pubmed/10592173">pubmed</a>]
Kanehisa M, et al. PNE, 52(12), 1486-1491 (2007). [<meta itemprop='reference' content='http://lifesciencedb.jp/dbsearch/Literature/get_pne_cgpdf.php?year=2007&number=5212&file=o6YUOqyfHjzsI1vg5UpTlQ'/><a href="http://lifesciencedb.jp/dbsearch/Literature/get_pne_cgpdf.php?year=2007&number=5212&file=o6YUOqyfHjzsI1vg5UpTlQ">PNE</a>]
</div>
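To see roughly what a consumer recovers from markup like the examples above, the following sketch reads microdata with Python's standard html.parser module. It is a simplified illustration, not a conforming microdata parser: it assumes well-nested markup, ignores itemref and itemid, and takes URL values only from a and link elements. The class name MicrodataSketch, the helper extract_items, and the shortened SAMPLE markup are introduced here for illustration.

```python
from html.parser import HTMLParser

# Elements that never have an end tag, so they are not pushed on the stack.
VOID_TAGS = {"meta", "link", "img", "br", "hr", "input"}

# Shortened version of the JCRB0225 example.
SAMPLE = """
<div itemscope itemtype="http://schema.org/BiologicalDatabaseEntry">
<a itemprop="url" href="http://cellbank.nibio.go.jp/legacy/celldata/jcrb0225.htm">
<span itemprop="entryID">JCRB0225</span> [<span itemprop="name">COLO320 DM</span>]</a>
Date created: <meta itemprop="dateCreated" content="2007-08-27">08/27/2007
<span itemprop="isEntryOf" itemscope itemtype="http://schema.org/BiologicalDatabase">
<a itemprop="url" href="http://cellbank.nibio.go.jp/"><span itemprop="name">JCRB Cell Bank</span></a>
</span>
</div>
"""


class MicrodataSketch(HTMLParser):
    """Collect microdata items; assumes well-nested markup (no error recovery)."""

    def __init__(self):
        super().__init__()
        self.items = []    # top-level items found in the document
        self._open = []    # one frame per open non-void element
        self._scopes = []  # items currently being filled, innermost last

    def _add(self, name, value):
        self._scopes[-1]["properties"].setdefault(name, []).append(value)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("itemprop")
        frame = {"prop": None, "text": [], "is_item": False}
        if "itemscope" in attrs:
            item = {"type": attrs.get("itemtype"), "properties": {}}
            if prop and self._scopes:
                self._add(prop, item)      # nested item, e.g. isEntryOf
            else:
                self.items.append(item)    # top-level item
            self._scopes.append(item)
            frame["is_item"] = True
        elif prop and self._scopes:
            if tag == "meta":
                self._add(prop, attrs.get("content", ""))
            elif tag in ("a", "link"):
                self._add(prop, attrs.get("href", ""))
            else:
                frame["prop"] = prop       # value is the element's text
        if tag not in VOID_TAGS:
            self._open.append(frame)

    def handle_data(self, data):
        # Text belongs to every open element that carries a text property.
        for frame in self._open:
            if frame["prop"]:
                frame["text"].append(data)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._open:
            return
        frame = self._open.pop()
        if frame["is_item"]:
            self._scopes.pop()
        elif frame["prop"]:
            self._add(frame["prop"], "".join(frame["text"]).strip())


def extract_items(html):
    parser = MicrodataSketch()
    parser.feed(html)
    parser.close()
    return parser.items


print(extract_items(SAMPLE))
```

Each item is returned as a dictionary with a "type" and a "properties" mapping from property name to a list of values, so repeated properties such as keywords are preserved.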
Discussion (Your review and comments are needed!)
- You can test the following examples with Live Microdata.
1. How to mark up taxonomy (4 candidates)
1-1. Original [use taxonID]
<div itemscope itemtype ="http://schema.org/BiologicalDatabaseEntry">
<h1><a itemprop="url" href="http://www.uniprot.org/uniprot/Q401N2">
<span itemprop="entryID">Q401N2</span> [<span itemprop="name">Zinc-activated ligand-gated ion channel</span>]
</a></h1>
Organism: <a href="http://www.uniprot.org/taxonomy/9606">Homo sapiens (human)</a>
Taxonomy ID: <a href="http://www.uniprot.org/taxonomy/9606"><span itemprop="taxonID">9606</span></a>
[<a href="http://purl.uniprot.org/taxonomy/">UniProt Taxonomy</a>]
<span itemprop="isEntryOf" itemscope itemtype="http://schema.org/BiologicalDatabase">
Database: <a itemprop="url" href="http://www.uniprot.org/"><span itemprop="name">UniProt</span></a>
</span>
</div>
1-2. Proposed change 1 [taxonID -> taxon] [Currently selected]
<div itemscope itemtype ="http://schema.org/BiologicalDatabaseEntry">
<h1><a itemprop="url" href="http://www.uniprot.org/uniprot/Q401N2">
<span itemprop="entryID">Q401N2</span> [<span itemprop="name">Zinc-activated ligand-gated ion channel</span>]
</a></h1>
<span itemprop="taxon" itemscope itemtype="http://schema.org/BiologicalDatabaseEntry">
Organism: <span itemprop="name">Homo sapiens</span> (human)
Taxonomy ID: <a itemprop="url" href="http://www.uniprot.org/taxonomy/9606"><span itemprop="entryID">9606</span></a>
[<span itemprop="isEntryOf" itemscope itemtype="http://schema.org/BiologicalDatabase"><a itemprop="url" href="http://purl.uniprot.org/taxonomy/"><span itemprop="name">UniProt Taxonomy</span></a></span>]
</span>
<span itemprop="isEntryOf" itemscope itemtype="http://schema.org/BiologicalDatabase">
Database: <a itemprop="url" href="http://www.uniprot.org/"><span itemprop="name">UniProt</span></a>
</span>
</div>
1-3. Proposed change 2 [taxonID -> taxon] (simpler but less useful for search engines?)
<div itemscope itemtype ="http://schema.org/BiologicalDatabaseEntry">
<h1><a itemprop="url" href="http://www.uniprot.org/uniprot/Q401N2">
<span itemprop="entryID">Q401N2</span> [<span itemprop="name">Zinc-activated ligand-gated ion channel</span>]
</a></h1>
Organism: <span itemprop="taxon">Homo sapiens</span> (human)
Taxonomy ID: <a href="http://www.uniprot.org/taxonomy/9606">9606</a>
[<a href="http://purl.uniprot.org/taxonomy/">UniProt Taxonomy</a>]
<span itemprop="isEntryOf" itemscope itemtype="http://schema.org/BiologicalDatabase">
Database: <a itemprop="url" href="http://www.uniprot.org/"><span itemprop="name">UniProt</span></a>
</span>
</div>
1-4. Proposed change 3 [taxonID -> SpeciesCode<additional type>]
<div itemscope itemtype ="http://schema.org/BiologicalDatabaseEntry">
<h1><a itemprop="url" href="http://www.uniprot.org/uniprot/Q401N2">
<span itemprop="entryID">Q401N2</span> [<span itemprop="name">Zinc-activated ligand-gated ion channel</span>]
</a></h1>
Organism: <span itemprop="code" itemscope itemtype="http://schema.org/BiologicalDatabaseCode">Homo sapiens (human)
Taxonomy ID: <span itemprop="code">9606</span> <meta itemprop="codingSystem" content="taxon">
</span>
<span itemprop="isEntryOf" itemscope itemtype="http://schema.org/BiologicalDatabase">
Database: <a itemprop="url" href="http://www.uniprot.org/"><span itemprop="name">UniProt</span></a>
</span>
</div>
1-5. Or other possibilities welcome.
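The practical difference between candidates 1-2 and 1-3 is what a consumer can recover from 'taxon'. The sketch below normalises both forms to one record; it is an illustration under assumptions, not part of the proposal. The input shape (a dictionary with "type" and "properties", each property holding a list of values) and the function name taxon_summary are hypothetical.

```python
from typing import Any, Dict, Optional, Union


def taxon_summary(value: Union[str, Dict[str, Any]]) -> Dict[str, Optional[str]]:
    """Normalise a 'taxon' value to name, identifier, and source database."""
    if isinstance(value, str):
        # Candidate 1-3: only the organism name is machine-readable.
        return {"name": value, "id": None, "database": None}

    def first(item: Dict[str, Any], key: str) -> Any:
        values = item.get("properties", {}).get(key, [])
        return values[0] if values else None

    # Candidate 1-2: the taxon is itself a BiologicalDatabaseEntry.
    database = first(value, "isEntryOf")
    return {
        "name": first(value, "name"),
        "id": first(value, "entryID"),
        "database": first(database, "name") if isinstance(database, dict) else None,
    }


nested = {
    "type": "http://schema.org/BiologicalDatabaseEntry",
    "properties": {
        "name": ["Homo sapiens"],
        "entryID": ["9606"],
        "isEntryOf": [{
            "type": "http://schema.org/BiologicalDatabase",
            "properties": {"name": ["UniProt Taxonomy"]},
        }],
    },
}

print(taxon_summary(nested))
print(taxon_summary("Homo sapiens"))
```

With the plain-text form, the taxonomy ID and its source database remain visible to human readers but are not available as structured data.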
2. How to mark up cross-references (2 candidates)
2-1. Candidate 1 (“relatedLink” is NOT applicable because it cannot take ‘BiologicalDatabaseEntry’ as its expected type) [Currently selected]
<div itemscope itemtype ="http://schema.org/BiologicalDatabaseEntry">
<h1><a itemprop="url" href="http://www.uniprot.org/uniprot/Q401N2">
<span itemprop="entryID">Q401N2</span> [<span itemprop="name">Zinc-activated ligand-gated ion channel</span>]
</a></h1>
Cross-references:
KEGG: <span itemprop="seeAlso" itemscope itemtype="http://schema.org/BiologicalDatabaseEntry"><a itemprop="url" href="http://purl.uniprot.org/kegg/hsa:353174"><span itemprop="name">hsa:353174</span></a></span>
RefSeq: <span itemprop="seeAlso" itemscope itemtype="http://schema.org/BiologicalDatabaseEntry"><a itemprop="url" href="http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=protein&id=NP_851321.2"><span itemprop="name">NP_851321.2</span></a></span>, <span itemprop="seeAlso" itemscope itemtype="http://schema.org/BiologicalDatabaseEntry"><a itemprop="url" href="http://www.ncbi.nlm.nih.gov/nuccore/NM_180990.3"><span itemprop="name">NM_180990.3</span></a></span>
H-InvDB: <span itemprop="seeAlso" itemscope itemtype="http://schema.org/BiologicalDatabaseEntry"><a itemprop="url" href="http://h-invitational.jp/hinv/spsoup/locus_view?hix_id=HIX0027141"><span itemprop="name">HIX0027141</span></a></span>
<span itemprop="isEntryOf" itemscope itemtype="http://schema.org/BiologicalDatabase">
Database: <a itemprop="url" href="http://www.uniprot.org/"><span itemprop="name">UniProt</span></a>
</span>
</div>
2-2. Candidate 2 (simpler but less useful for search engines?) (“relatedLink” is applicable)
<div itemscope itemtype ="http://schema.org/BiologicalDatabaseEntry">
<h1><a itemprop="url" href="http://www.uniprot.org/uniprot/Q401N2">
<span itemprop="entryID">Q401N2</span> [<span itemprop="name">Zinc-activated ligand-gated ion channel</span>]
</a></h1>
Cross-references:
KEGG: <a itemprop="seeAlso" href="http://purl.uniprot.org/kegg/hsa:353174">hsa:353174</a>
RefSeq: <a itemprop="seeAlso" href="http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=protein&id=NP_851321.2">NP_851321.2</a>, <a itemprop="seeAlso" href="http://www.ncbi.nlm.nih.gov/nuccore/NM_180990.3">NM_180990.3</a>
H-InvDB: <a itemprop="seeAlso" href="http://h-invitational.jp/hinv/spsoup/locus_view?hix_id=HIX0027141">HIX0027141</a>
<span itemprop="isEntryOf" itemscope itemtype="http://schema.org/BiologicalDatabase">
Database: <a itemprop="url" href="http://www.uniprot.org/"><span itemprop="name">UniProt</span></a>
</span>
</div>
2-3. Or other possibilities welcome.
3. How to mark up references in BiologicalDatabase (2 candidates)
3-1. Define a property ‘reference’, like ‘productID’ in Thing > Product (flexible, the term ‘reference’ is easily understandable) [Currently selected]
<div itemscope itemtype="http://schema.org/BiologicalDatabase">
<span itemprop="name"><a itemprop="url" href="http://www.kegg.jp/">KEGG: Kyoto Encyclopedia of Genes and Genomes</a></span>
Profile: <span itemprop="description">KEGG (Kyoto Encyclopedia of Genes and Genomes) is a database resource that integrates genomic, chemical and systemic functional information. In particular, gene catalogs from completely sequenced genomes are linked to higher-level systemic functions of the cell, the organism and the ecosystem.</span>
Date established: <meta itemprop="dateCreated" content="1995">1995
Current release: Release 64.0, <meta itemprop="dateModified" content="2012-10-01">October 1, 2012
Developed by: <span itemprop="provider" itemscope itemtype="http://schema.org/Organization"><a itemprop="url" href="http://www.kanehisa.jp/"><span itemprop="name">Kanehisa Laboratories</span></a></span>
References:
Kanehisa M, et al. Nucleic Acids Res. 40, D109-D114 (2012). [<meta itemprop='reference' content='pmid:22080510'/><a href="http://www.ncbi.nlm.nih.gov/pubmed/22080510">pubmed</a>]
Kanehisa M and Goto S. Nucleic Acids Res. 28, 27-30 (2000). [<meta itemprop='reference' content='pmid:10592173'/><a href="http://www.ncbi.nlm.nih.gov/pubmed/10592173">pubmed</a>]
Kanehisa M, et al. PNE, 52(12), 1486-1491 (2007). [<meta itemprop='reference' content='http://lifesciencedb.jp/dbsearch/Literature/get_pne_cgpdf.php?year=2007&number=5212&file=o6YUOqyfHjzsI1vg5UpTlQ'/><a href="http://lifesciencedb.jp/dbsearch/Literature/get_pne_cgpdf.php?year=2007&number=5212&file=o6YUOqyfHjzsI1vg5UpTlQ">PNE</a>]
</div>
3-2. Define properties ‘pmid’, ‘doi’, and ‘pmcid’, like ‘isbn’ in Thing > CreativeWork > Book
<div itemscope itemtype="http://schema.org/BiologicalDatabase">
<span itemprop="name"><a itemprop="url" href="http://www.kegg.jp/">KEGG: Kyoto Encyclopedia of Genes and Genomes</a></span>
Profile: <span itemprop="description">KEGG (Kyoto Encyclopedia of Genes and Genomes) is a database resource that integrates genomic, chemical and systemic functional information. In particular, gene catalogs from completely sequenced genomes are linked to higher-level systemic functions of the cell, the organism and the ecosystem.</span>
Date established: <meta itemprop="dateCreated" content="1995">1995
Current release: Release 64.0, <meta itemprop="dateModified" content="2012-10-01">October 1, 2012
Developed by: <span itemprop="provider" itemscope itemtype="http://schema.org/Organization"><a itemprop="url" href="http://www.kanehisa.jp/"><span itemprop="name">Kanehisa Laboratories</span></a></span>
References:
Kanehisa M, et al. Nucleic Acids Res. 40, D109-D114 (2012). [<meta itemprop='pmid' content='22080510'/><a href="http://www.ncbi.nlm.nih.gov/pubmed/22080510">pubmed</a>]
Kanehisa M and Goto S. Nucleic Acids Res. 28, 27-30 (2000). [<meta itemprop='pmid' content='10592173'/><a href="http://www.ncbi.nlm.nih.gov/pubmed/10592173">pubmed</a>]
</div>
3-3. Or other possibilities welcome.
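The two candidates carry the same identifiers in different shapes, so values marked up as in 3-2 can be rewritten mechanically into the 'reference' form of 3-1. The sketch below shows that mapping under assumptions: the input is a hypothetical dictionary from property name to a list of values, the function name to_reference_values is made up for illustration, and the "doi:" and "pmcid:" prefixes are assumed by analogy with the "pmid:" prefix used in 3-1.

```python
from typing import Dict, List

# Property names proposed in candidate 3-2.
ID_PROPERTIES = ("pmid", "doi", "pmcid")


def to_reference_values(props: Dict[str, List[str]]) -> List[str]:
    """Rewrite candidate 3-2 properties as candidate 3-1 'reference' values."""
    values = []
    for key in ID_PROPERTIES:
        for identifier in props.get(key, []):
            values.append(f"{key}:{identifier}")
    return values


kegg = {"pmid": ["22080510", "10592173"]}
print(to_reference_values(kegg))
```

The reverse direction is lossy: a reference that has only a URL, such as the third KEGG reference in 3-1, has no counterpart among 'pmid', 'doi', and 'pmcid', which is why it is absent from the 3-2 example.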
Comments and Discussion
1. A comment from Dan Brickley (2012-03-13).
- Others have also mentioned interest in adding some notion of species.
2. Discussion in the BioHackathon ML (from 2012-08-10).
- https://groups.google.com/forum/#!topic/biohackathon/8y73xtWSHxc%5B1-25%5D
- Taxonomy ID - Simplicity or Flexibility? (Jerven Bolleman)
3. BioHackathon 2012 (2012-09-02/2012-09-07)
- https://github.com/dbcls/bh12/wiki/Schema.org-extension/
- We discussed with the “Standard RDF representation for glycans” group.
- How to mark up trivial name (e.g., sialyl-lewis-x, lactosamine)
- sameAs, seeAlso or relatedLink
- taxonID or taxon
4. BioHackathon 12.12 (2012-12-19/2012-12-22)
5. SEMANTIC WEB HEALTH CARE AND LIFE SCIENCES (HCLS) INTEREST GROUP Seminar (2013-04-02)
- http://www.w3.org/blog/SW/2013/03/30/maori-ito-presents-enhanced-search-for-life-science-databases-with-proposed-schema-org-extension/
- PDF - http://www.w3.org/wiki/images/6/6f/2013-hcls-maori-ito-db-schema-dot-org.pdf
6. A comment from Jamie Estill (2013-08-13)
- It seems that author and contentLocation should also be included as properties from CreativeWork.
How to Join the Discussion
Please comment on the proposed schema in any of the following ways:
1. Reply to the original post on Mailing List (public-vocabs@w3.org)
2. Reply to the original post (@keyboardrobot) on Twitter
3. Reply to the original post (@keyboardrobot) on Twitter [in Japanese]
References
Search engines in the life science field
- Entrez (NCBI) -- http://www.ncbi.nlm.nih.gov/sites/gquery
- EB-eye (EBI) -- http://www.ebi.ac.uk/ebisearch/
- Life Science Database Cross Search (NBDC) -- http://biosciencedbc.jp/dbsearch/en/
- Sagace (NIBIO) -- http://sagace.nibio.go.jp/en/
Metadata for biological databases
- BioDBcore -- http://biodbcore.org/
- Catalogue of DBs with BioDBcore -- http://biosharing.org/biodbcore
Validator for microdata
- You can test the above example markups with Live Microdata.