HCLS Scientific Discourse Call Monday, May 2 10 am EST, 3 pm BST

Meeting notes

1. BioRDF Demonstrator:

  • BioRDF group: [1]
  • Demo: [2]
  • Annotated corpus with triples: [3]


  • [13] Dunckley T, Beach TG, et al. (2006). Gene expression correlates of neurofibrillary tangles in Alzheimer's disease. Neurobiol Aging 27: 1359-71. (PubMed) (PMC) GEO: GSE4757 ArrayExpress: E-GEOD-4757
  • [14] Liang WS, Dunckley T, et al. (2007). Gene expression profiles in anatomically and functionally distinct regions of the normal aged human brain. Physiol Genomics 28: 311-22. (PubMed) (PMC) GEO: GSE5281 (same as below) ArrayExpress: N/A
  • [15] Liang WS, Reiman EM, et al. (2008). Alzheimer's disease is associated with reduced expression of energy metabolism genes in posterior cingulate neurons. Proc Natl Acad Sci U S A 105: 4441-6. (PubMed) (PMC) GEO: GSE5281 (same as above) ArrayExpress: N/A

Need help to automate:

1) Institution provenance and PIs etc.

2) Experimental context: which platform was used (e.g. for microarray experiments, which company's); which disease the patients have; where in the brain the samples were collected; and how far along the disease was when the sample was collected.

3) From this: generate a list of genes. This needs details of the statistical methods (which algorithm was used, etc.), the analysis provenance, and the confidence in the statistical results.

Current use case: cancer; previous use case: Alzheimer's

2. BioRDF-Scientific Discourse Joint Demonstrator proposal

The scientific discourse group (in particular: Jodi, Anita and Paolo) will mark up the corpus that the BioRDF group has worked on. We will mark up these documents with

a) ORB

b) Annotation Ontology

within the Harvard Annotation Framework, and link the BioRDF triples to specific locations in the text.

This serves three purposes:

1) It allows the Scientific Discourse group to test whether ORB + AO is enough to mark a given location in the document. If so, that concludes the deliverables of the subtask; if not, we will need to define a 'medium-grained' ontology.

2) It provides the BioRDF group with more detailed, location-linked annotations to their test corpus

3) This can help them in their quest to automate the mining of these triples

After this markup is done, the evaluations will be:

1) Is ORB + AO enough? Is the SciDisc/Rhetorical structure group done?

2) Can this be a useful start towards automating the mining of the knowledge the BioRDF group wants to capture?

If anyone from either group is interested in participating in this exercise, please let us know - [4].


1) timeframe and names for points below

2) how close are we to fulfilling our original use cases? [5]

3) overlap with other HCLS subgroups (see [6] for a listing)

4) next steps.

Conclusions meeting April 18:

1) Joint work on annotating a corpus of documents with links to workflow components and data. This will allow a concrete instantiation of the medium-grained ontology, and offer a discussion point for describing the experiment/paper link, which we are approaching from many different sides. Alex Garcia will jumpstart this process by making available a collection of full-text Elsevier documents which he has annotated with RDF; after seeing these, we will select a subcorpus in which to mark up a) Data, b) Experimental model and c) Key discourse components, and work to make a demonstrator.

2) A paper discussing our various models and ways to integrate them, including a discussion of the overlap/difference between (explicit, personal) knowledge in discourse and (implicit, shared) knowledge that underlies experimental models. This could be a possible outcome of the demo.

3) A face-to-face meeting. Kees van Bochove has kindly agreed to organise this. Possible venues: ISMB in Vienna, ICBO in Buffalo, or a one-off workshop in the Netherlands. Topic: Experiment/discourse integration: models, examples, and next steps. Specifically, Kees wants to discuss the following with whoever might be interested in co-organising this face-to-face meeting:

-- work plan: what do we want to achieve, i.e. which outcome would we like from the symposium, and when are we happy?

-- (candidate) attendee list

-- options for date/time/place

