HCLSIG BioRDF Subgroup/QueryFederation

From W3C Wiki

Query Federation Task

This task explores how to federate queries across multiple sources that expose diverse types of data through different interfaces, such as SPARQL endpoints, open linked data, and relational databases. A demo will be implemented as a proof of concept of semantic query federation, involving query mapping, data integration, inferencing, and reasoning.

  • Use case(s): neuroscience data integration (e.g., at the receptor level)
    • dynamic construction of a comprehensive receptor tree by merging receptor trees from multiple sources such as SenseLab and DBpedia
    • dynamic aggregation of receptor-related data from multiple sources
  • Comparison of triplestores (e.g., Virtuoso and AllegroGraph) to identify complementary features
    • e.g., how to retrieve parent-child nodes recursively from a tree
  • Identification of relevant datasets
    • DBpedia, SenseLab ontology in HCLS KBs, tables, open linked data
      • receptor tables at IUPHAR: [1]
  • Rules - exploring use of rules to facilitate query federation
    • query mapping rules, data integration rules, ...
  • Approaches for query federation
  • User Interface for demo
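The recursive parent-child retrieval mentioned in the triplestore comparison can be sketched outside any particular store. The following is a minimal Python sketch, assuming the tree has already been fetched as (child, parent) pairs. The edge list and the function names are illustrative assumptions, not part of Virtuoso or AllegroGraph.

```python
from collections import defaultdict

# Hypothetical (child, parent) edges, loosely based on the receptor tree
# shown in the demo mockup; a real run would fetch these from a triplestore.
EDGES = [
    ("Amino Acid Receptor", "Receptor"),
    ("Glutamate Receptor", "Amino Acid Receptor"),
    ("AMPA Receptor", "Glutamate Receptor"),
    ("Kainate Receptor", "Glutamate Receptor"),
    ("NMDA Receptor", "Glutamate Receptor"),
]


def build_children_index(edges):
    """Index the edges as parent -> list of children."""
    children = defaultdict(list)
    for child, parent in edges:
        children[parent].append(child)
    return children


def descendants(children, root, seen=None):
    """Return every node below root in depth-first order.

    The seen set guards against cycles and against nodes that are
    reachable through more than one parent.
    """
    if seen is None:
        seen = set()
    result = []
    for child in children.get(root, []):
        if child in seen:
            continue
        seen.add(child)
        result.append(child)
        result.extend(descendants(children, child, seen))
    return result


index = build_children_index(EDGES)
print(descendants(index, "Glutamate Receptor"))
```

A store with built-in transitive traversal could replace the recursion with a single query; the sketch only shows what such a feature has to compute.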

Demo Mockup

This demo mockup is currently receptor-centric. However, it can be expanded to include other entities such as cell types, genes, etc.

Step 1: Selection of receptors (e.g., GABAA Receptor and NMDA Receptor)

  • Receptor (D)
    • Amino Acid Receptor (S)
      • Glutamate Receptor (D,S)
        • AMPA Receptor (D,S)
        • Kainate Receptor (S)
        • NMDA Receptor (D,S)
        • mGluR Receptor (S)
      • GABAA Receptor (D,S)
      • GABAB Receptor (D,S)
  • Ionotropic Receptor (D)
    • NMDA Receptor (D,S)
  • D=DBpedia, S=SenseLab
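One way to produce the D/S tags shown in the tree is to merge the edge lists of both sources and record which source contributed each node. The sketch below assumes that node names already agree across sources (in practice a mapping step is needed first) and uses small hypothetical edge lists rather than real query results.

```python
# Hypothetical (child, parent) edges per source; tags follow the legend
# D=DBpedia, S=SenseLab. Real data would come from the two endpoints.
SOURCES = {
    "D": [
        ("Glutamate Receptor", "Receptor"),
        ("NMDA Receptor", "Glutamate Receptor"),
        ("Ionotropic Receptor", "Receptor"),
        ("NMDA Receptor", "Ionotropic Receptor"),
    ],
    "S": [
        ("Glutamate Receptor", "Amino Acid Receptor"),
        ("Kainate Receptor", "Glutamate Receptor"),
        ("NMDA Receptor", "Glutamate Receptor"),
    ],
}


def merge_trees(sources):
    """Union the edges of all sources and track node provenance.

    Returns (node_sources, edges), where node_sources maps each node
    to the set of source tags that mention it.
    """
    node_sources = {}
    edges = set()
    for tag, source_edges in sources.items():
        for child, parent in source_edges:
            edges.add((child, parent))
            for node in (child, parent):
                node_sources.setdefault(node, set()).add(tag)
    return node_sources, edges


def label(node, node_sources):
    """Format a node the way the mockup does, e.g. 'NMDA Receptor (D,S)'."""
    return f"{node} ({','.join(sorted(node_sources[node]))})"


node_sources, edges = merge_trees(SOURCES)
for node in sorted(node_sources):
    print(label(node, node_sources))
```

Because edges are kept as a set, a node such as NMDA Receptor can keep two parents, which matches its appearance under both Glutamate Receptor and Ionotropic Receptor in the mockup.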

Step 2: Selection of receptor attributes for aggregation (all attributes are selected by default)

  • Definition (DBpedia's abstract)
  • Image (DBpedia image of receptor structure: foaf:img)
  • Genes (DBpedia's refseq)
  • Binding molecules (HCLS KB hosted at Free University Berlin -- PDSP)
  • GO Accession (HCLS KB hosted at DERI -- GO)
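Attribute selection could drive query generation directly. The sketch below builds a SPARQL query for the DBpedia-backed attributes only, with every attribute selected by default. The attribute-to-property table, the use of dbo:abstract for the definition, and the example resource URI are assumptions made for illustration; the properties actually used by the demo may differ.

```python
PREFIXES = {
    "dbo": "http://dbpedia.org/ontology/",
    "foaf": "http://xmlns.com/foaf/0.1/",
}

# Hypothetical mapping from attribute name to the property queried for it.
DBPEDIA_ATTRIBUTES = {
    "definition": "dbo:abstract",
    "image": "foaf:img",
}


def build_query(resource_uri, selected=None):
    """Build a SPARQL SELECT for the chosen attributes of one receptor.

    With selected=None all known attributes are requested. Each attribute
    is wrapped in OPTIONAL so that a missing value does not drop the row.
    """
    if selected is None:
        selected = list(DBPEDIA_ATTRIBUTES)
    if not selected:
        raise ValueError("at least one attribute must be selected")
    lines = [f"PREFIX {prefix}: <{ns}>" for prefix, ns in PREFIXES.items()]
    variables = " ".join(f"?{name}" for name in selected)
    lines.append(f"SELECT {variables} WHERE {{")
    for name in selected:
        prop = DBPEDIA_ATTRIBUTES[name]
        lines.append(f"  OPTIONAL {{ <{resource_uri}> {prop} ?{name} . }}")
    lines.append("}")
    return "\n".join(lines)


# The resource URI is an assumed example, not taken from the demo.
print(build_query("http://dbpedia.org/resource/NMDA_receptor"))
```

Attributes served by other sources (binding molecules, GO accession) would need their own tables and endpoints; they are left out here to keep the sketch small.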

Step 3: Summary table of aggregated data for the selected receptors

Receptor | Definition | Genes | Binding molecules
GABAA Receptor | The GABAA receptor is one of two ligand-gated ion channels responsible for mediating the effects of gamma-aminobutyric acid (GABA), the major inhibitory neurotransmitter in the brain. In addition to the GABA binding site, the GABAA receptor complex appears to have distinct allosteric binding sites for benzodiazepines, barbiturates, ethanol, inhaled anaesthetics, furosemide, kavalactones, neuroactive steroids, and picrotoxin. | Genes | Binding molecules
NMDA Receptor | The NMDA receptor (NMDAR) is an ionotropic receptor for glutamate. Activation of NMDA receptors results in the opening of an ion channel that is nonselective to cations. | 2092, 2093, 2094, 2095, 2096 | Spermine, L-Aspartate, D-Aspartate, L-Glutamate, Glutamate, Amantadine
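The summary table can be assembled by collecting, for each selected receptor, the values that each source returns for each attribute. The following sketch assumes every source result has already been reduced to a nested dictionary (receptor, then attribute, then list of values). That structure and the function name are illustrative; the NMDA values are copied from the mockup table.

```python
# Per-source results, shaped as receptor -> attribute -> list of values.
# The dictionary layout is an assumption; the values come from the mockup.
DBPEDIA_RESULT = {
    "NMDA Receptor": {"Genes": ["2092", "2093", "2094", "2095", "2096"]},
}
PDSP_RESULT = {
    "NMDA Receptor": {
        "Binding molecules": [
            "Spermine", "L-Aspartate", "D-Aspartate",
            "L-Glutamate", "Glutamate", "Amantadine",
        ]
    },
}


def aggregate(receptors, source_results, attributes):
    """Build one table row per receptor, one column per attribute.

    Values from all sources are concatenated in source order with
    duplicates removed; a missing value becomes an empty cell.
    """
    rows = []
    for receptor in receptors:
        row = {"Receptor": receptor}
        for attribute in attributes:
            values = []
            for result in source_results:
                for value in result.get(receptor, {}).get(attribute, []):
                    if value not in values:
                        values.append(value)
            row[attribute] = ", ".join(values)
        rows.append(row)
    return rows


rows = aggregate(
    ["GABAA Receptor", "NMDA Receptor"],
    [DBPEDIA_RESULT, PDSP_RESULT],
    ["Genes", "Binding molecules"],
)
for row in rows:
    print(row)
```

Keeping the sources separate until this step means a new source only has to deliver the same nested shape to appear in the table.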

Step 4: Select a row (receptor) from the table for deeper exploration

For a selected receptor, we can explore the detailed information provided by different sources (e.g., DBpedia, SenseLab, Faviki). Because each source provides specific types of information (e.g., gene/protein data, literature, neuronal properties), we can also explore by information type or category.

Proposed Action Items and Progress

  • Explore how to dynamically construct the SenseLab receptor tree from the HCLS KB hosted at Free University Berlin -- this can take advantage of some built-in feature(s) of AllegroGraph (Adrian -- with Matthias's help in understanding the structure)
  • Explore how to dynamically construct a receptor tree from DBpedia (Rob)
    • Rob has been interacting with Kingsley at OpenLink re. use of Virtuoso to resolve the issue of inferencing of the receptor class hierarchy (see Resources/Deliverables below)
  • Join the SenseLab receptor tree and the DBpedia receptor tree (Matthias)
    • Matthias has created mappings between NeuronDB receptors and DBpedia receptors (see Resources/Deliverables below)
  • Explore how common URIs might fit in (e.g., receptors) (Scott)
  • Assuming the user can select one or more receptors from the integrated receptor tree, determine which types of information about the selected receptor(s) should be aggregated from the HCLS KBs, DBpedia, and other sources (e.g., IUPHAR). The user could then choose from a set of receptor attributes. (Kei, Matthias, and Scott, who have experience in neuroscience)
    • Kei has created a demo mockup (see above)
  • Use tools like FeDeRate to implement the aggregation. For example, we can create a table with the first column listing the selected receptors and the subsequent columns listing the values of different receptor attributes (e.g., genes, IUPHAR receptor code, etc.) (Eric and Rob)
  • Explore social/semantic bookmarking using Faviki (Jun)
    • Jun has tagged a few web pages with receptor names
      • The natural alternative use case was presented at LODD (see [2] and Resources/Deliverables below)
  • TCM dataset (Jun)
    • TCMGeneDIT
  • Receptor Explorer demo (Rob)
    • See Resources/Deliverables link below.
  • Federated query demo using FeDeRate (Eric)
  • voiD demo (Jun)
  • Other ideas?
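For the action item on joining the SenseLab and DBpedia receptor trees, the mappings between NeuronDB and DBpedia receptors could be applied to one tree before the two are combined. The sketch below shows that step under stated assumptions: the identifiers, the mapping table, and the function names are hypothetical and do not reproduce the actual mapping file.

```python
# Hypothetical identifiers; the real mapping uses the sources' own URIs.
MAPPING = {
    "neurondb:NMDA": "dbpedia:NMDA_receptor",
    "neurondb:AMPA": "dbpedia:AMPA_receptor",
}

SENSELAB_EDGES = [
    ("neurondb:NMDA", "neurondb:Glutamate"),
    ("neurondb:AMPA", "neurondb:Glutamate"),
]
DBPEDIA_EDGES = [
    ("dbpedia:NMDA_receptor", "dbpedia:Ionotropic_receptor"),
]


def apply_mapping(edges, mapping):
    """Rewrite node identifiers through the mapping.

    Identifiers without a mapping entry are kept unchanged, so unmapped
    nodes stay in the tree under their original names.
    """
    return [(mapping.get(child, child), mapping.get(parent, parent))
            for child, parent in edges]


def join_trees(senselab_edges, dbpedia_edges, mapping):
    """Map SenseLab identifiers onto DBpedia ones, then union the edges."""
    return set(apply_mapping(senselab_edges, mapping)) | set(dbpedia_edges)


joined = join_trees(SENSELAB_EDGES, DBPEDIA_EDGES, MAPPING)
for child, parent in sorted(joined):
    print(child, "->", parent)
```

After the rewrite, the NMDA receptor appears once with parents from both sources, which is the effect the join is meant to achieve. A shared URI scheme would make the mapping table unnecessary.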

Resources / Deliverables

Possible Meetings/Ways for Disseminating the Work

  • C-SHALS 2009 demo at the HCLS tutorial
  • DILS 2009 submission (deadlines for abstracts: Feb 13, 2009, full papers: Feb 20, 2009)
  • Semantic Web Case Studies or Use Cases
  • Journal submission? (e.g., BMC Bioinformatics special issue: “Semantic Web Applications and Tools for the Life Sciences”)
  • W3C notes?