Scientific Discourse Task Force

The HCLS Scientific Discourse Task Force is co-chaired by Tim Clark and Anita de Waard.

Project Description

Provide a Semantic Web platform for biomedical discourse which can be evolved over time into a more general facility for many types of scientific discourse, and which is linked to key biological categories specified by ontologies.

Discourse categories should include research questions, scientific assertions or claims, hypotheses, comments and discussion, experiments, data, publications, citations, and evidence. Biological categories should include genes, proteins, antibodies, animal models, laboratory protocols, biological processes, disease classifications, anatomical structures, user-generated taxonomies, and tools.

Our primary scientific use cases will be derived from problems in digital scientific communications and web-based research collaboratories supporting research in neurological disorders and therapies.

The scientific use cases will motivate a series of informatics use cases, which can later be generalized across wider areas of biology and medicine. As a first general scientific use case, we propose the cross-application of discoveries in stem cell, Alzheimer and Parkinson disease research via biomedical web communities in those areas.

The informatics-specific use cases will initially focus on interoperability of the SWAN Alzheimer Knowledge Base with research communities using the Science Collaboration Framework (SCF) Drupal deployment, as well as with useful tools for bibliographic annotation and online scientific discussion and collaboration.


Objectives are paired with tasks and activities.

A. Use Cases:

1. Enhance Drug-mechanism Knowledge in Drug Product Labels

  • Status: Use case being finalized (deciding on drugs/patient characteristics, partners are joining)
  • Leads: Richard Boyce, Maria Liakata, Jodi Schneider, Mike Taylor, Anita de Waard
  • Partners: DERI, Elsevier; inviting ePocrates, Elsevier's Reaxys Database

2. Defining core metadata for describing biomedical investigations

  • Status: Description of use case being edited
  • Leads: Tim Clark, Susanna-Assunta Sansone, David Shotton, Philippe Rocca-Serra

3. Mining Treatment Outcomes

  • Goal: develop a generic, anonymised, multi-EHS compliant format and an information architecture that allows access to symptom/treatment data as a Linked Data source that can be made freely available and used for drug efficacy/outcome (meta)studies
  • Status: Description of use case being edited
  • Leads: Joanne Luciano, Anita de Waard
  • Partners: RPI, Elsevier; still need a clinical/patient forum and/or pharma partner

--Anita de Waard 14:51, 31 October 2011 (UTC)

B. Ontologies:

1. Scientific Discourse formalization: Prepare SWAN IG Note. Completed

2. Scientific Discourse in Online Communities: Integrate SWAN and SIOC ontologies. Completed

3. Annotation Ontology (AO): Build a provenance-aware model of mappings between sections of non-SemWeb documents and terms in SemWeb ontologies.

4. Coarse-grained rhetorical structure: ORB (Ontology of Rhetorical Blocks)

5. Discourse, Data & Experiment: Integrate SWAN, myExperiment & OBI (Ontology of Biomedical Investigations) ontologies

6. Bibliographic Ontologies: Integrate SWAN Citations with PRISM & CiTO to enable widely interoperable bibliographic ontologies.

7. Medium-grained structure: Integration of Medium-grained discourse structure with DRO and DoCO
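
As a rough illustration of what item 3 above means by a provenance-aware mapping, the following Python sketch links a span of plain text to an ontology term and records who created the link and when. It is only a sketch under stated assumptions: the class `TextAnnotation`, the function `annotate`, the document URL, the term IRI and the creator name are all hypothetical, and the field names are not the Annotation Ontology's actual vocabulary.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class TextAnnotation:
    """Hypothetical record: a text span mapped to an ontology term, with provenance."""
    document_url: str     # the non-SemWeb document being annotated
    start: int            # character offset where the span begins
    end: int              # character offset just past the span
    exact_text: str       # the annotated text itself
    term_iri: str         # the ontology term the span is mapped to
    created_by: str       # provenance: who made the mapping
    created_at: datetime  # provenance: when the mapping was made


def annotate(document_url, text, phrase, term_iri, created_by, created_at):
    """Map the first occurrence of `phrase` in `text` to `term_iri`."""
    start = text.find(phrase)
    if start < 0:
        raise ValueError(f"phrase not found in document: {phrase!r}")
    return TextAnnotation(document_url, start, start + len(phrase),
                          phrase, term_iri, created_by, created_at)


# Example data (all URLs and names are made up for illustration).
text = "Loss of amyloid precursor protein function may impair regeneration."
annotation = annotate(
    document_url="http://example.org/articles/42",
    text=text,
    phrase="amyloid precursor protein",
    term_iri="http://example.org/ontology/APP",
    created_by="example-curator",
    created_at=datetime(2011, 10, 31, tzinfo=timezone.utc),
)
```

Keeping the character offsets alongside the exact text lets a consumer check that the annotation still points at the intended passage if the document changes.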

C. Scientific Discourse structure:

  • Subtask coordinator: Anita de Waard
  • Discussion: Rhetorical Document Group
  • Charter: hold weekly calls discussing:
    • New directions in scientific discourse structures
    • Update ontologies and alignments

Project Pages

  • Business Case

Semantic Integration of Biomedical Web Communities to Accelerate Research In Neurodegenerative Disorders

Online community sites in general (forums, weblogs, bulletin boards, collaboratories, open access journals, etc.) have replaced many of the traditional means of keeping a community informed and are supplementing, and to some extent replacing, print libraries and print publishing. They are a valuable source of information, and a search for information often ends at a community site. But there is a problem: online community sites are like islands without bridges connecting them. You may find information in a forum, but not know that missing pieces of related information can be found on other community sites.

SIOC (Semantically-Interlinked Online Communities) is an attempt to link online community sites and to use Semantic Web technologies to describe the information community sites have about their structure and contents. An aim of SIOC is to allow people to find related information in other online communities and to discover new connections between discussion posts. The SIOC project is a sub-initiative of the DERI Líon project (funded by SFI).
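
The idea of finding related posts across community sites can be sketched in a few lines of Python. This is a minimal illustration, not the SIOC project's implementation: triples are held as plain tuples rather than in an RDF store, the site, post and topic URIs are hypothetical, and the helper functions `post_triples` and `related_posts` are invented for this example. The only SIOC terms assumed are `sioc:Post`, `sioc:has_container` and `sioc:topic` from the SIOC core namespace.

```python
SIOC = "http://rdfs.org/sioc/ns#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def post_triples(post_uri, forum_uri, topic_uris):
    """Describe one discussion post as (subject, predicate, object) triples."""
    triples = [
        (post_uri, RDF_TYPE, SIOC + "Post"),
        (post_uri, SIOC + "has_container", forum_uri),
    ]
    triples += [(post_uri, SIOC + "topic", topic) for topic in topic_uris]
    return triples


def related_posts(triples, post_uri):
    """Return other posts, on any site, that share a topic with post_uri."""
    topic = SIOC + "topic"
    my_topics = {o for s, p, o in triples if s == post_uri and p == topic}
    return sorted({s for s, p, o in triples
                   if p == topic and o in my_topics and s != post_uri})


# Hypothetical topics and posts on three hypothetical community sites.
APP = "http://example.org/topic/amyloid-precursor-protein"
NGF = "http://example.org/topic/nerve-growth-factor"
DOPAMINE = "http://example.org/topic/dopamine"

graph = (
    post_triples("http://ad.example.org/post/1",
                 "http://ad.example.org/forum", [APP])
    + post_triples("http://stemcell.example.org/post/7",
                   "http://stemcell.example.org/forum", [APP, NGF])
    + post_triples("http://pd.example.org/post/3",
                   "http://pd.example.org/forum", [DOPAMINE])
)

related = related_posts(graph, "http://ad.example.org/post/1")
```

Here the post on the Alzheimer site is linked to the post on the stem cell site because both carry the same topic URI, even though they live in different forums; the Parkinson post shares no topic and is not returned.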

In parallel with the SIOC effort, researchers are now beginning to realise the potential of social web technologies for scientific, legislative and other domain-specific discourses. Both formal scientific works in publications and research discourse in community mechanisms can and should be interlinked with semantics. For example, efforts like bio-zen and SISC are aiming to represent data, information and knowledge from research in all facets of the life sciences on the Semantic Web. There is a need to provide structured representations of professional scientific discourse for the HCLS domain, and this fits well with a future direction of SIOC to augment the existing framework with terms and applications specific to various domains.

Alzheimer Disease (AD) and Parkinson Disease (PD) are devastating neurodegenerative disorders for which there is no cure, and whose mechanisms (etiology) are incompletely understood. There are currently some 5 million Alzheimer patients and more than 1 million Parkinson patients in the U.S. alone, with the cost of care-giving running into the hundreds of billions of dollars. These numbers are expected to double over the next several decades because of projected increases in the aged population. AD is now the third most expensive disease to treat in the U.S., costing society close to $100 billion annually.

AD and PD are characterized by the loss of function and eventual death of massive numbers of neurons, beginning in specific brain regions (entorhinal cortex in AD and substantia nigra in PD).

There are a number of ways in which closer integration of stem cell, AD and PD research could be beneficial:

  • Emerging hypotheses propose that certain molecules that play a central role in AD and PD are involved in neural regeneration, and that a possible cause of cell death is the loss of this regenerative capacity. These molecules include nerve growth factor, amyloid precursor protein and dopamine.
  • Maintaining stable cultures of human neurons has been difficult, and has impeded progress in developing "test tube" models to test hypotheses and screen drugs. Using embryonic stem cells to generate human neurons and other types of brain cells could lead to better test-tube models of neurodegenerative disease.
  • Therapy development is exceptionally challenging for neurodegenerative diseases because in the adult central nervous system, neurons generally are not capable of regenerating to replace diseased and dying cells. Stem cells may be manipulated to develop cell lines and engineered tissue suitable for transplantation therapy.
  • Stem cell biology may provide knowledge to harness the brain's innate regenerative capacity for therapeutic purposes.

Alzheimer Disease research is the focal area of the oldest and largest biomedical web community by and for AD researchers, Alzforum (www.alzforum.org).

The SWAN ontology and knowledgebase (Ciccarese et al. 2008, Journal of Biomedical Informatics, in press) is a joint project of the Alzforum and the Massachusetts General Hospital. Stem cell technology is likewise the subject of StemBook, an online publication of the Harvard Stem Cell Institute. StemBook is implemented using the Science Collaboration Framework (SCF), a special distribution of Drupal which, among other capabilities, can node-proxy resources on SPARQL endpoints (lazily instantiated node data) and understands certain elements of SWAN. A third web community, PD Online Research, also based on SCF, is now under development, with deployment scheduled for Spring 2009 and integration with SWAN planned.

These communities with intersecting but distinct research interests are poster children for semantic interoperability of discourse. They form a convenient and perhaps ideal driving biological project for integrating SWAN and SIOC while keeping requirements grounded in the needs of actual biomedical researchers.


Past meetings:

Dial-in & IRC Information

  • Dial-In #: +1.617.761.6200 (Cambridge, MA)
  • Dial-In #: + (Paris, France)
  • Dial-In #: +44.203.318.0479 (London, UK)
  • Participant Access Code: 42572 ("HCLS2")
  • IRC Channel: irc.w3.org, port 6665, channel #HCLS2 (see the W3C IRC page for details, or use Web IRC)
  • Mibbit quick start: Click on mibbit for instant IRC access
  • Duration: 1hr

Related Links


(To e-mail any DERI members, use firstname.lastname@deri.org)

  • Sophia Ananiadou (U of Manchester)
  • Uldis Bojars (DERI / NUI Galway)
  • John Breslin (DERI / NUI Galway)
  • Gully Burns (USC/ISI)
  • Kei Cheung (Yale School of Medicine)
  • Annamaria Carusi (University of Oxford)
  • Paolo Ciccarese (Harvard Medical School)
  • Tim Clark (Harvard Medical School)
  • Ron Daniel (Elsevier)
  • Sudeshna Das (Harvard Medical School)
  • Anita de Waard (Elsevier)
  • Alf Eaton (Nature Networks)
  • Ronan Fox (DERI / NUI Galway)
  • Matthew Gamble (University of Manchester)
  • Carole Goble (University of Manchester)
  • Tudor Groza (DERI / NUI Galway)
  • Christoph Lange (Jacobs University)
  • Joanne Luciano (Tetherless World Constellation @ Rensselaer Polytechnic Institute, Predictive Medicine, Inc.)
  • Scott Marshall (Leiden University Medical Center)
  • David R Newman (University of Southampton)
  • Marco Ocana (Balboa Systems)
  • Jack Park (Open University)
  • Alexandre Passant (DERI / NUI Galway)
  • Satya Sahoo (Wright State University)
  • Matthias Samwald (DERI / NUI Galway)
  • Susanna Sansone (University of Oxford)
  • Tony Scerri (Elsevier)
  • Jodi Schneider (DERI / NUI Galway)
  • David Shotton (University of Oxford)
  • Susie Stephens (Johnson & Johnson Pharmaceutical Research & Development)
  • Holger Stenzhorn (DERI / NUI Galway)
  • Karin Verspoor (University of Colorado)
  • Elizabeth Wu (Alzheimer Research Forum)
  • Jun Zhao (University of Oxford)

If you have any questions please contact Tim Clark (tim_clark at Harvard dot EDU)