HCLS/OntologyTaskForce/RefinedOtfUseCase

From W3C Wiki
HCLS Home OTF Home Discussions Post to HCLS listserv Minutes Links

HCLSIG Use Case Refinement


Review Current Use Cases

  1. Parkinson's Use Case
  2. NeuronDB & CocoDat bridging ontology using UMLS terms
  3. SenseLab + BrainPharm + AlzForum/SWAN
  4. Use of GeneNetwork for genomic and genetic info
  5. Ligand-Receptor Interaction, Molecular Interaction Networks, and Ontology Evolution
  6. OMIM Descriptions of Neurogenetic Disease

The idea I want to convey is a concrete sense of how the elements in these databases can be brought together to provide an explicit, semantically linked view of the nervous system from the molecular level on up through expressed disease - sort of like a Powers of Ten for the neurosciences, where the connections aren't made via images, but via formal semantic expression of related molecular, cellular, and supra-cellular entities. The other aspect I hope to convey is that, just as genetic networks and metabolic pathways provide a graph to connect the molecular entities, in the neurosciences - in any higher discipline, really - inter- and intra-cellular physiology, together with structure-function maps that provide functional linkages between mesoscopic organ regions (the brain in this case), can provide the graph to connect the larger-scale entities.
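As a minimal sketch of the "graph to connect the larger-scale entities" idea, the Python below stores a handful of subject-predicate-object statements and searches for a chain of links from a molecule up to a disease. Everything in it is an assumption made for illustration: the entity identifiers, the predicate names, and the helper functions `build_graph` and `find_path` are hypothetical and do not come from SenseLab, CoCoDat, AlzForum, or any other resource named on this page.

```python
from collections import deque

# Hypothetical, illustrative statements (subject, predicate, object).
# None of these identifiers or links are taken from a real database.
TRIPLES = [
    ("molecule:ExampleReceptor", "expressed_in", "cell:ExampleNeuron"),
    ("cell:ExampleNeuron", "located_in", "region:ExampleRegion"),
    ("region:ExampleRegion", "projects_to", "region:OtherRegion"),
    ("region:OtherRegion", "implicated_in", "disease:ExampleDisease"),
    ("molecule:OtherChannel", "expressed_in", "cell:OtherNeuron"),
]


def build_graph(triples):
    """Index the statements by subject so outgoing links are easy to follow."""
    graph = {}
    for subject, predicate, obj in triples:
        graph.setdefault(subject, []).append((predicate, obj))
        graph.setdefault(obj, [])  # make sure leaf entities are present too
    return graph


def find_path(graph, start, goal):
    """Breadth-first search; returns the list of statements linking
    start to goal, or None when no chain of links exists."""
    if start not in graph:
        return None
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for predicate, nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(node, predicate, nxt)]))
    return None


if __name__ == "__main__":
    graph = build_graph(TRIPLES)
    steps = find_path(graph, "molecule:ExampleReceptor", "disease:ExampleDisease")
    for subject, predicate, obj in steps or []:
        print(subject, predicate, obj)
```

In a real Semantic Web setting the statements would be RDF triples with shared URIs and the search would more likely be a query over a triple store; the plain dictionary here only shows the shape of the linkage.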

The following knowledge resources have been examined in great detail by several members of the HCLSIG OTF with regard to performing semantically-mediated integration:

  1. SenseLab - in particular its cell-focused component, NeuronDB - contains a large collection of semantically identified neuronal molecules, CNS cell types, and nerve cell models.
  2. Cocodat contains some cellular data on primate CNS nerve cells and much info on mesoscopic brain region functional roles and region-to-region connectivity.
  3. AlzForum contains info on many levels, but not nearly in the depth provided by these other sources in the domains they cover. It also contains links to the human clinical literature not included in the other repositories.

I'm tempted to add data from:

These neuroscience literature informatics resources contain a huge collection of human functional neuroimaging data. The latter also includes a certain amount of terminological specification for the thousands of literature articles from which the neuroimaging data derive, as well as spatial locations for the activated brain regions given in a normalized coordinate system - i.e., Talairach space.

It would also be helpful to have a repository of mouse models of neurodegenerative disease to correlate against these data sources, as this could help tie together the data sets from molecules on through anatomy and simple behavioral experiments, but no such repository exists yet. This is one of the tasks we expect to address in the BIRN project - essentially like the Mouse Models of Human Cancer Consortium, but for neurodegenerative disease. The difference is that we are looking to create a federated resource providing semantically-aware access to a myriad of related neuroscience data repositories, not a single repository of inbred and transgenic embryos.

No great insight here. I won't be demonstrating any new connections amongst these data repositories in this presentation. KeiCheung and DonaldDoherty have done much more on this than I have in their HCLSIG OTF Wiki write-ups. I do hope, however, that walking through these examples from molecules to higher brain regions can clarify how the basic research and clinical neuroscientist usage scenarios outlined in detail for Parkinsonian Syndrome map to a set of technical requirements for which semantic web technology is particularly well suited to provide a solution.


Review Neuroinformatic Studies of Neurodevelopment

  1. I propose we review the following four neuroscientific research articles, which touch on early neurodevelopment and on adult neurogenesis as it bears on plasticity in learning and memory, its relation to cognitive ability, and recovery from neurodegenerative disease. Each of these studies makes heavy use of informatic approaches to data collection and statistical analysis of the accumulated information in addressing at least a portion of the questions it poses at the outset. I believe focusing on these very specific studies can provide a more tangible sense of neuroscientific use cases that would profit from a Semantic Web-based implementation. What I'd like to do is suggest folks read these four pretty accessible studies and think about the following two questions (with no presumption they will be answered in the affirmative):
    • a) Could Semantic Web Technology have made the study easier to perform?
    • b) Could Semantic Web Technology have enabled the researchers to extend the questions they asked - or provide additional detail?
  2. The four articles are:

Though these articles are focused on neurodevelopment as opposed to specific neurodegenerative disease, they all focus on aspects of neuronal structure and function that are immediately relevant to any neurological disease where loss of neurons or aberrant neuronal architecture and/or microcircuitry are at play. In all four studies, the researchers focus on the factors governing development of specific CNS neuronal architectural features, including synaptic molecular architecture. They also try to provide insight into the functional ramifications of these structural details. Kempermann et al. refer to a myriad of genes via the GeneNetwork repository, Benavides-Piccione et al. deal primarily with large-scale statistical cataloging of neocortical nerve cell architecture, and the other two articles focus on the developmental roles played by the following genes:

Gene
Bcl-X
Nova2

The other point I hope to get across with both of these overall tasks is that in neuroinformatics, one of the hardest problems is making the connection between the large-scale informatics efforts run against very large data sets and the studies performed in the lab every day by working neuroscientists. The attraction for many informaticists is to capture the big fish - for instance, how does one create the large-scale killer app for neuroinformatics, equivalent to something like SRS for sequence-level bioinformatics, or something with the utility Gene Ontology provides for genome-wide expression analysis? The aspect of both SRS and GO that has made them such resoundingly useful tools is the relevance they also have to working scientists doing their everyday benchtop work. They are also essentially links between the published literature and the published primary data.

I think it's in showing how Semantic Web Tech can help solve these sorts of focused informatics tasks - and place these "daily" results in a semantically meaningful way into the larger world of available information, including reaching back decades into the literature - that it can truly prove its unique value.

This is still the "small fish" of semantic web/ontological engineering applications - merely being as semantically specific and accurate as is practical - and doing so in a formal way. Complex reasoning can come later and will likely come as more complex modeling approaches - as is the case in metabolomics, for instance - begin to build on the available, semantically-formalized repository containing data from the molecular on up to the neural systems level. I think there is also the need to specify the micro-variability seen in neuronal cell types at the levels of cellular anatomy, transmitter biochemistry, and ion-channel distribution. These will be key elements in providing a broad neuroinformatics foundation for the neurosciences. We have excellent models of these features, but there is a need for a much more existential characterization of this information at the cellular level. Three factors will be critical to track specifically on populations of cells: topological factors affecting the diffusion of transmitter and of electrical activity; the disposition of transmitter metabolic enzymes as well as receptor distribution & turnover; and finally - my favorite - ion channel distribution & turnover. Many of the techniques needed to collect such information are either new or still in the offing, but new experimental approaches continue to accrue on this front at an alarming rate. I believe Semantic Web Technology will be critical in providing a means to effectively collate this critical information for large-scale, statistically significant correlative studies.
