OpenBiologicalOntologiesInOwl

From W3C Wiki


Name

Search Ontology and Retrieve (SOAR), or Search Ontology Retrieve Embedded (SORE)

What

Could we create an initial mapping of mouse to human in an extensible, public setting, so that other organisms could be mapped in a weblike manner? For example, I map zebrafish to mouse and post the mapping on my own site, without requiring input from this group. Any number of people could perform such mappings.

The result is a big graph of names and classifications. A UI allows the creation of subgraphs, which can then be validated (or not) and linked to the articles that support them.

Why

This is a semantic method to extract valid hypotheses from the graph and to quickly and accurately access the literature supporting those hypotheses. It allows for cross-species comparison that is very hard to do right now. For example: this set of genes is very interesting in the mouse muscle setting, and their molecular activity might be present in humans. Which of them are known to be expressed in human muscle, and what do I know about them?

Also, we will use a web design philosophy. Anyone will be able to download the graph and use it with some rights reserved, or publish mappings to other organisms, write software that uses it, etc...

This is an experiment. We don't know if it will work, or if it does work, how useful it would really be. But if it does work, and it is useful, it will catalyze an enormous amount of work and would serve as a backbone to which most important information could be attached.

How

OBO is an umbrella address for a set of open biological ontologies (OBO). OBO encourages each ontology to exist in a common syntax supported by DAG-EDIT or OWL. What would it take to move the set of ontologies at OBO into OWL instead of DAG-EDIT? How about mapping mouse anatomy ontologies onto human anatomy ontologies using OWL?

How about then mapping genes to those tissues? That requires using UniGene (which in turn requires a conversion to RDF).

Could that mapping leverage other ontologies? Could it be a "backbone" for cross-species reasoning?

A structured controlled vocabulary of stage-specific anatomical structures of the mouse (Mus)

A structured controlled vocabulary of stage-specific anatomical structures of the human. It has been designed to mesh with the mouse anatomy and incorporates each Carnegie stage of development (CS1-20). The timed version of the human developmental anatomy ontology gives all the tissues present at each Carnegie Stage (CS) of human development (1-20) linked by a part-of rule. Each term is mentioned only once so that the embryo at each stage can be seen as the simple sum of its parts. Users should note that tissues that are symmetric (e.g. eyes, ears, limbs) are only mentioned once.

Terms in the GO ontologies are mapped to specific entities (genes) in mouse and human.

What would the benefits be?

What would the drawbacks be?

What scenarios would be possible as a result?

A necessary step to make this more useful is to map the anatomy ontologies, which refer to structures (organs, tissues), to the genes known to be expressed in those tissues. This is important because it would allow a scientist to judge more accurately whether an observed activity in a model organism is a likely predictor of similar activity in human.

Mapping genes to tissues: Unigene

UniGene is an experimental system for automatically partitioning GenBank sequences into a non-redundant set of gene-oriented clusters. Each UniGene cluster contains sequences that represent a unique gene, as well as related information such as the tissue types in which the gene has been expressed and map location.

Field: tissue mapping for each UniGene cluster (source of cDNAs)

Field: organism mapping for some clusters

We have to reconcile the expression information in UniGene (a structured vocabulary for describing cell type, not necessarily consistent with the anatomy resources) with the anatomy ontologies. Example cDNA sources: other; Eye; Pre-implantation_Embryo; Whole_Body; Lung; Colon; mixed; Bone; Brain; Mammary_Gland
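One way to start on this reconciliation is a hand-maintained lookup table from UniGene source labels to anatomy terms. The sketch below is only an illustration: the table `SOURCE_TO_ANATOMY`, the set `UNMAPPABLE`, and the function `reconcile` are hypothetical names, and the right-hand values are plain-text placeholders rather than real ontology identifiers. It also assumes that "other", "mixed" and "Whole_Body" are too coarse to map to a single structure.

```python
# Hypothetical lookup table: UniGene cDNA source label -> anatomy term name.
# The values are placeholders, not identifiers from any real anatomy ontology.
SOURCE_TO_ANATOMY = {
    "Eye": "eye",
    "Lung": "lung",
    "Colon": "colon",
    "Bone": "bone",
    "Brain": "brain",
    "Mammary_Gland": "mammary gland",
    "Pre-implantation_Embryo": "pre-implantation embryo",
}

# Assumption: these labels are too coarse to point at one anatomical structure.
UNMAPPABLE = {"other", "mixed", "Whole_Body"}


def reconcile(sources):
    """Sort source labels into mapped, deliberately skipped, and unknown."""
    mapped, skipped, unknown = {}, [], []
    for label in sources:
        if label in SOURCE_TO_ANATOMY:
            mapped[label] = SOURCE_TO_ANATOMY[label]
        elif label in UNMAPPABLE:
            skipped.append(label)
        else:
            # Needs a human decision before it can enter the graph.
            unknown.append(label)
    return mapped, skipped, unknown


if __name__ == "__main__":
    print(reconcile(["Eye", "mixed", "Liver", "Brain"]))
```

Keeping the unknown labels in their own list makes the gaps in the table visible, which matters if other people are expected to publish their own mappings.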

Mapping More Ontologies

Then we could link GO to the names of the UniGene clusters, and we'd have a mouse-human anatomical mapping with genes and GO attached. That would let us start to draw up catalogs of which genes are active in which tissues, to help interpret mouse data as a predictor of activity in human.

details, details

It is hard to do fine-grained mapping: we may have Bile Canaliculi [A03.620.150.125] as the ontology term, but only "liver" as the mapping for the gene list.

Take the set of genes for each organ. Text mine the literature using that set of gene names as a dictionary. Then text mine within that subset, using the ontological categories underneath the organ as a dictionary, together with some sort of distance metric.

  Bile Ducts, Intrahepatic [A03.620.150]				

Bile Canaliculi [A03.620.150.125] (MESH)

   <TS16\,liver and biliary system; EMAP:1928
    <TS16\,cystic duct; EMAP:1929
    <TS16\,gall bladder primordium; EMAP:1930
    <TS16\,hepatic duct; EMAP:1931
     <TS16\,extrahepatic part\,hepatic duct; EMAP:1932
     <TS16\,intrahepatic part\,hepatic duct; EMAP:1933
    <TS16\,liver; EMAP:1934
     <TS16\,hepatic primordium; EMAP:1935
      <TS16\,parenchyma\,hepatic primordium; EMAP:1936 (EMAP)	

UniGene gives a coarse association of genes and tissues. This process associates sub-concepts and gene names through literature co-occurrence.

By doing this we can associate genes with anatomical concepts more completely and allow for the creation of "diffs" between human and mouse data from a semantic perspective.
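The co-occurrence step could be sketched roughly as follows. This is a minimal illustration, not a proposed implementation: the function names, the `max_chars` window (standing in for "some sort of distance metric"), and the gene names in the usage lines are all made up. Matching is a naive case-insensitive substring search, so a name such as GENE1 would also match inside GENE10; a real dictionary matcher would need word boundaries and synonym handling.

```python
import re
from collections import Counter


def positions(term, text):
    """Character offsets of every case-insensitive occurrence of term."""
    return [m.start() for m in re.finditer(re.escape(term.lower()), text.lower())]


def cooccurrence(abstracts, genes, subterms, max_chars=200):
    """Count abstracts in which a gene name and an anatomy sub-term
    appear within max_chars characters of each other."""
    counts = Counter()
    for text in abstracts:
        for gene in genes:
            gene_pos = positions(gene, text)
            if not gene_pos:
                continue
            for term in subterms:
                term_pos = positions(term, text)
                if any(abs(g - t) <= max_chars for g in gene_pos for t in term_pos):
                    counts[(gene, term)] += 1
    return counts


if __name__ == "__main__":
    # Invented sentences and placeholder gene names, for illustration only.
    abstracts = [
        "GENE1 was detected in the bile canaliculi.",
        "GENE2 was detected in the hepatic duct and GENE1 was absent.",
    ]
    result = cooccurrence(abstracts, ["GENE1", "GENE2"],
                          ["bile canaliculi", "hepatic duct"])
    for pair, n in sorted(result.items()):
        print(pair, n)
```

Note that the second sentence still counts GENE1 with "hepatic duct" even though it says the gene was absent; co-occurrence alone cannot tell the difference, which is one reason the resulting subgraphs would need validation.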
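A "diff" for one tissue might look like the sketch below. It assumes something these notes do not yet provide: a table relating each mouse gene to its human counterpart (called `orthologs` here). That table, the function name `tissue_diff`, and the gene names in the usage lines are hypothetical placeholders.

```python
def tissue_diff(mouse_genes, human_genes, orthologs):
    """Compare the gene sets attached to one mapped tissue in mouse and human.

    mouse_genes, human_genes: sets of gene names for the tissue.
    orthologs: dict from mouse gene name to human gene name (an assumed input).
    """
    # Translate mouse genes into human names where a counterpart is known.
    mouse_as_human = {orthologs[g] for g in mouse_genes if g in orthologs}
    return {
        "shared": mouse_as_human & human_genes,
        "mouse_only": mouse_as_human - human_genes,
        "human_only": human_genes - mouse_as_human,
        # Mouse genes with no known counterpart cannot be compared at all.
        "unmapped_mouse": {g for g in mouse_genes if g not in orthologs},
    }


if __name__ == "__main__":
    # Placeholder gene names, for illustration only.
    diff = tissue_diff(
        mouse_genes={"m_gene_a", "m_gene_b", "m_gene_c"},
        human_genes={"H_GENE_A", "H_GENE_D"},
        orthologs={"m_gene_a": "H_GENE_A", "m_gene_b": "H_GENE_B"},
    )
    for key in sorted(diff):
        print(key, sorted(diff[key]))
```

The "mouse_only" set is the interesting one for the muscle question in the Why section: genes seen in the mouse tissue whose human counterparts have no recorded expression in the mapped human tissue.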

Could also use the Cell type ontology and cell compartment ontology in GO.

Do we need to address the URL/URN debate here?

The tie to the literature (text mining of genes to articles and/or semantic concepts to articles, and to each other) is key. This is a semantic search network for literature, so each triple in the system should have PubMed references. Where else should there be PubMed references?

Then What? Simply publish on the web? Browser interface/Haystack? Reasoning engine? How would someone actually use this to accomplish the "why"? How do we correct improper assignments, whether made by people or by text mining? How to map in?
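To make the "each triple should have PubMed references" idea concrete, here is a small sketch of a triple record that carries its supporting references with it. The class `Triple`, the helper functions, the predicate names, and the reference strings are all hypothetical; in particular the reference values are placeholders, not real PubMed identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    subject: str
    predicate: str
    obj: str
    pmids: tuple = ()  # supporting article references (placeholders here)


def unsupported(triples):
    """Triples that have no literature reference yet."""
    return [t for t in triples if not t.pmids]


def by_pmid(triples):
    """Index triples by the article that supports them."""
    index = {}
    for t in triples:
        for pmid in t.pmids:
            index.setdefault(pmid, []).append(t)
    return index


if __name__ == "__main__":
    graph = [
        Triple("GENE1", "expressed_in", "liver", ("PMID:placeholder-1",)),
        Triple("liver", "part_of", "liver and biliary system"),
    ]
    print(unsupported(graph))
    print(sorted(by_pmid(graph)))
```

Listing the unsupported triples gives one answer to the question above: any assertion that shows up there is a place where a reference is still missing.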

Summing up

  • "Integrate" mouse and human anatomical ontologies which contain tissue references / classes
  • Use Unigene to get gene lists for tissue types (coarse)
  • Refine and extend using directed text mining and other ontologies (MeSH, GO)
  • Publish on web with interface that allows limited reasoning - killer scenario needed

UI: an HTML form in which triples are connected by booleans; the subject-verb-object (SVO) slots are controlled by the vocabularies in the graph.