From W3C Wiki

Systems Biology is about creating models that accurately and comprehensively explain and predict how biological systems work.

Traditionally, biology has been studied by pursuing a reductionistic approach: biologists studied a single gene or a single protein and explored its role in disease or other metabolic processes. The emergence of high-throughput technologies has enabled the measurement of many genes and proteins simultaneously, effectively giving biology and bioinformatics a window into the cell.

Systems Biology is the science that studies the structure and dynamics of living systems rather than their individual components. We know today that many biological activities, such as the expression of genes as proteins, are controlled and regulated by signalling cascades. A classical example is the MAPK/ERK pathway, a signalling cascade responsible for carrying signals from the surface of the cell all the way to its nucleus, where DNA transcription is directly affected. Such signalling pathways are chains of proteins that interact with each other to carry a message from the surface to the nucleus. However, many other protein interactions can happen along this chain, so the message can be affected and regulated by other pathways.

Knowledge about such interactions can give us clues, and points of entry, for understanding how drugs can be used to correct problems in the signalling process. The current view of how biological systems work is that the interactions between proteins resemble a communication network rather than isolated chains of events.
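The contrast between a chain and a network can be made concrete with a small sketch. The protein names follow the textbook, heavily simplified RAS, RAF, MEK, ERK cascade; the single cross-talk edge from AKT is included only as an illustration of input from another pathway, and the helper name `upstream_regulators` is hypothetical.

```python
from collections import deque

# Simplified, illustrative edges: (source, target) means "source acts on target".
CASCADE = [("RAS", "RAF"), ("RAF", "MEK"), ("MEK", "ERK")]
CROSS_TALK = [("AKT", "RAF")]  # assumed example of regulation by another pathway


def upstream_regulators(edges, target):
    """Return every node from which `target` can be reached along the edges."""
    parents = {}
    for src, dst in edges:
        parents.setdefault(dst, set()).add(src)
    seen, queue = set(), deque([target])
    while queue:
        node = queue.popleft()
        for parent in parents.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


# Chain view: only the members of the cascade itself influence ERK.
chain_view = upstream_regulators(CASCADE, "ERK")
# Network view: proteins outside the cascade also appear upstream of ERK.
network_view = upstream_regulators(CASCADE + CROSS_TALK, "ERK")
```

In the network view, the set of proteins that can influence ERK grows beyond the cascade itself, which is the kind of additional entry point the paragraph above refers to.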

The focus of this task force will be on defining the method by which such biological networks can be modelled, and explored, using Semantic Web Technologies.

Admin Stuff


This task force meets on the 2nd and 4th Wednesday at 11am EST. Join our next call:

Confirmed Speakers/Next Meetings

  • Franco Du Preez/Carole Goble - the sysmoDB project and tools developed therein: talk on 28/03/2012
  • Oliver Ruebenacker - systems biology models and ontologies: talk to be scheduled

Use Cases

Systems Biology for drug discovery

By understanding the topology of a biological network, we can more easily identify which entities are suitable drug targets. However, knowing which biological entities to target is only one step of the drug discovery process. The next step is to identify which drugs have been used to target such proteins, and which of them have been successfully applied in therapy, by querying databases such as DrugBank or LinkedCT.

The Systems Biology task force will explore scenarios whereby protein network representations can be enriched with known drug-protein interactions with the goal of identifying drugs that can potentially be used in the treatment of diseases such as cancer.
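As a minimal sketch of what such an enrichment could look like, the example below attaches drug-target records to the proteins of a small network and lists the drugs that hit a protein in it. All identifiers (`P1`, `drug_A`, the `approved` flag) and both function names are hypothetical placeholders; in practice the records would come from queries against resources such as DrugBank or LinkedCT, which are not shown here.

```python
# Hypothetical protein-protein interactions (undirected pairs).
PROTEIN_EDGES = [("P1", "P2"), ("P2", "P3"), ("P3", "P4")]

# Hypothetical drug-target records standing in for query results.
DRUG_TARGETS = [
    {"drug": "drug_A", "target": "P2", "approved": True},
    {"drug": "drug_B", "target": "P2", "approved": False},
    {"drug": "drug_C", "target": "P9", "approved": True},  # target not in network
]


def enrich(protein_edges, drug_targets):
    """Map each protein in the network to the drug records that target it."""
    proteins = {p for edge in protein_edges for p in edge}
    enriched = {p: [] for p in sorted(proteins)}
    for record in drug_targets:
        if record["target"] in enriched:
            enriched[record["target"]].append(record)
    return enriched


def candidate_drugs(enriched, approved_only=True):
    """List (drug, target) pairs, optionally restricted to approved drugs."""
    return [
        (record["drug"], protein)
        for protein, records in enriched.items()
        for record in records
        if record["approved"] or not approved_only
    ]


network = enrich(PROTEIN_EDGES, DRUG_TARGETS)
candidates = candidate_drugs(network)
```

Keeping the drug records attached to the protein nodes, rather than in a separate table, means that topological questions (which proteins are central) and pharmacological ones (which of them already have a drug) can be asked of the same structure.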

Systems Biology for personalized medicine

The era of high-throughput biology and personalized genomic medicine is upon us. The FDA has already approved genetic testing to evaluate treatment options for particular types of cancer. There is every indication that more genetic testing is under way for other types of diseases.

Personalized medicine will soon include the ability to scan each patient's genome in order to immediately discover whether the patient will have an adverse reaction to a drug.

Semantic Systems Biology

Protein-protein, protein-drug and gene-gene interactions are typically represented as adjacency matrices. Although this format is useful for mathematical manipulation, it only takes into consideration one layer of information.

Semantic Web technologies enable the representation of the network with additional layers of information.

The systems biology task force will therefore identify which layers or domains of information (e.g. gene function, gene-drug interaction) are relevant for the task of enabling the discovery of adverse effects when a patient's genome is scanned.

Figure 1. Different layers of information encode different parts of the systems biology networks. One layer may be useful for detecting interactions based on cellular component (e.g. both gene products act in the membrane), another for expressing interactions based on common gene function, and others still may include information about disease co-occurrence.
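The difference between a single adjacency matrix and a layered representation can be sketched with plain Python tuples standing in for RDF triples. The gene identifiers and predicate names (`sharesCellularComponent`, `sharesFunction`, `coOccursInDisease`) are hypothetical, chosen only to mirror the layers described above; a real implementation would more likely use an RDF library and terms from established vocabularies.

```python
GENES = ["geneA", "geneB", "geneC"]  # hypothetical identifiers

# Single-layer view: 1 means "some interaction", with no indication of its kind.
ADJACENCY = [
    [0, 1, 1],
    [1, 0, 0],
    [1, 0, 0],
]

# Multi-layer view: (subject, predicate, object) triples, one predicate per layer.
TRIPLES = [
    ("geneA", "sharesCellularComponent", "geneB"),
    ("geneA", "sharesFunction", "geneC"),
    ("geneA", "coOccursInDisease", "geneC"),
]


def layer(triples, predicate):
    """Project one layer of the triple set onto an adjacency matrix."""
    index = {gene: i for i, gene in enumerate(GENES)}
    matrix = [[0] * len(GENES) for _ in GENES]
    for subj, pred, obj in triples:
        if pred == predicate:
            matrix[index[subj]][index[obj]] = 1
            matrix[index[obj]][index[subj]] = 1  # interactions treated as symmetric
    return matrix


def collapse(triples):
    """Merge all layers; the predicate is discarded and a flat matrix remains."""
    merged = [[0] * len(GENES) for _ in GENES]
    for predicate in {pred for _, pred, _ in triples}:
        for i, row in enumerate(layer(triples, predicate)):
            for j, value in enumerate(row):
                merged[i][j] = merged[i][j] or value
    return merged


function_layer = layer(TRIPLES, "sharesFunction")
flat = collapse(TRIPLES)
```

Collapsing the triples reproduces the flat matrix, but the reverse is not possible: the matrix alone cannot say whether geneA and geneC are linked by function, by disease co-occurrence, or by both.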

Initial Goals and Structure of the Task Force

1. In this task force, we will pursue a pathway-based approach: instead of looking at 30K genes simultaneously, we will follow known biological models (e.g. the MAPK signalling pathway). This starts by 1) identifying a pathway that we think is relevant and 2) going after the proteins/genes that are relevant in that pathway

2. The approach will be to collect the experimental data needed to understand what is going on at the pathway level. Using public experimental "multi-omic" datasets that we know of, such as a) TCGA, b) ICGC and c) Sage Bionetworks, we will identify the types of experimental data that we want to extract measurements from (e.g. SNPs, gene expression, protein expression)

3. Once we identify where the data is coming from, we will collect that data for the 8-9 proteins in our pathway of interest and create a linked data representation of it (we will reuse vocabularies such as BEL in our framework)

4. We will do this for several types of cancer in order to create a representation of the dynamism of the pathway in different situations (cancer types)

5. Finally, we will formally identify a model, driven by the experimental data, that can be used to identify discrepancies between the events going on in each cancer type; this will also be linked to publicly available data about protein-drug and protein-protein interactions

6. We will write a paper describing our methodology and results
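Steps 3 to 5 above can be sketched roughly as follows, under several assumptions: the namespace `http://example.org/sysbio/`, the record layout, the values and both function names are hypothetical, and a simple range threshold stands in for the data-driven model, which the plan does not specify. The sketch also does not use BEL; it only shows the general shape of turning per-cancer-type measurements into linked data statements and comparing them.

```python
BASE = "http://example.org/sysbio/"  # hypothetical namespace
XSD_DOUBLE = "http://www.w3.org/2001/XMLSchema#double"

# Hypothetical measurements: (cancer type, protein, measurement type, value).
MEASUREMENTS = [
    ("cancer_type_1", "protein_1", "geneExpression", 2.5),
    ("cancer_type_2", "protein_1", "geneExpression", 0.5),
    ("cancer_type_1", "protein_2", "geneExpression", 1.0),
    ("cancer_type_2", "protein_2", "geneExpression", 1.25),
]


def to_ntriples(measurements):
    """Serialise each measurement as one N-Triples-style statement."""
    lines = []
    for cancer, protein, kind, value in measurements:
        subject = f"<{BASE}{cancer}/{protein}>"
        predicate = f"<{BASE}{kind}>"
        literal = f'"{value}"^^<{XSD_DOUBLE}>'
        lines.append(f"{subject} {predicate} {literal} .")
    return lines


def discrepancies(measurements, kind, threshold):
    """Return proteins whose values differ across cancer types by more than `threshold`."""
    by_protein = {}
    for _cancer, protein, measured_kind, value in measurements:
        if measured_kind == kind:
            by_protein.setdefault(protein, []).append(value)
    return sorted(
        protein
        for protein, values in by_protein.items()
        if max(values) - min(values) > threshold
    )


triples = to_ntriples(MEASUREMENTS)
flagged = discrepancies(MEASUREMENTS, "geneExpression", threshold=1.0)
```

Proteins flagged this way would then be the natural starting points for linking in the publicly available protein-drug and protein-protein interaction data mentioned in step 5.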


  • Helena F. Deus (chair)
  • Jun Zhao
  • Carole Goble
  • Mark Wilkinson
  • Michael Miller
  • Eric Prud'hommeaux
  • Marco Roos
  • Oliver Ruebenacker
  • Michel Dumontier
  • David Wild
  • Anita de Waard
  • Nicolas Le Novère
  • Erich A. Gombocz

Current Efforts


Workflows and Model Creation

Queries and Dataset