W3C

HCLS

27 Aug 2007

See also: IRC log

Attendees

Present
Alan DanielRuben Don_Doherty Kei_Cheung P14 P19 Scott_Marshall Susie ericP matthiassamwald
Regrets
Chair
Susie
Scribe
ericP


<scribe> ACTION: Susie to ask Johnathan what he needs for TAG access [recorded in http://www.w3.org/2007/08/27-BioRDF-minutes.html#action01]

ericP: not sure what more direct access there could be

Demo extensions - Alan

Use cases - All

Documentation of SenseLab conversion - Kei

Kei: we have created a wiki page documenting the SenseLab conversion

<matthiassamwald> http://esw.w3.org/topic/HCLS/Senselab_Conversion

Kei: we converted NeuronDB to RDF and later to OWL
... in this, we learned ontology design features
... was focused on the NeuronDB database structure
... matthiassamwald joined the team and helped convert to a more generic OWL structure
... that data was used in the Banff demo
... contents of SenseLab's native DB change from time to time
... working out how to reflect in RDF
... considering two-step approach:
... - syntactic conversion to RDF
... ... automated
... - semantic conversion to OWL
... ... needs human intervention
... had a meeting with some HCLS folks about conversion process
... want to make sure we follow best practices and that we track the demo ontology changes
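
A minimal sketch of the automated, syntactic step in the two-step approach Kei describes: each row becomes a subject, each column a predicate, each value a plain literal. The namespace, table name, and sample row are made up for illustration and are not taken from the SenseLab conversion; the semantic step to OWL is not shown because, as noted above, it needs human intervention.

```python
from urllib.parse import quote

# Hypothetical namespace for illustration; not SenseLab's real one.
BASE = "http://example.org/neurondb/"


def escape_literal(value):
    """Escape a value for use inside a double-quoted N-Triples literal."""
    text = str(value)
    text = text.replace("\\", "\\\\").replace('"', '\\"')
    return text.replace("\n", "\\n").replace("\r", "\\r")


def row_to_ntriples(table, key, row):
    """Mechanically turn one relational row into N-Triples lines.

    No modeling decisions are made here: column names are reused
    as predicates and every value is emitted as a plain literal.
    """
    subject = f"<{BASE}{quote(table)}/{quote(str(row[key]))}>"
    lines = []
    for column, value in row.items():
        if column == key or value is None:
            continue  # key is already in the subject URI; NULLs yield no triple
        predicate = f"<{BASE}{quote(table)}#{quote(column)}>"
        lines.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return lines


# Made-up sample row standing in for a database record.
sample = {"id": 42, "name": "Purkinje cell", "region": None}
for line in row_to_ntriples("neuron", "id", sample):
    print(line)
```

Because this step is purely mechanical, it can be re-run whenever the native database changes, leaving only the OWL mapping to be reviewed by hand.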

Susie: what's the goal, W3C note?

Kei: not sure. this is just an initial version
... want feedback from more people
... suggestions?

Susie: given that you are grappling with modeling/data/ontology changes, can you call it finished in say, a month?
... and then release future [coherent] versions

Kei: seems reasonable

ericP: let me know if you'd like to publish it as a SPARQL interface to your database

Kei: may be easier with a complete RDF dump
... but some folks may wish to access the DB directly [with queries]

Susie: senselab is in Oracle?
... could publish in MySQL and use mapping stuff we're working on [SPASQL]

Kei: yes, interested in working on mapping

Demo extensions - Alan

Don: not on extension yet; still working on installation
... ran into glitch on loading DBs
... we have Virtuoso installed
... running into problem with perl script
... AlanR is helping

Susie: any progress on the poster?

Don: will be ready to talk about poster in about two weeks
... want to dive into matthiassamwald's poster and target to a neuroscience audience

matthiassamwald: will upload my demo to a server and send a pointer to public-semweb-lifesci

Susie: AlanR is working with DERI to install DB

alanr: DERI has machine back up
... not sure if they've installed Virtuoso
... considering hiring someone to write install scripts
... my schedule should be more calm now
... expect progress in next couple weeks

Susie: hosted at MIT in the interim?

alanr: yes. dunno if we will always host it

Susie: EricN is working on UI
... we considered working with UI experts, but seems we don't want to do that now

alanr: I have an idea for a UI; am looking for an implementor
... idea: wiki page with queries
... ... fill in a form to tailor the queries on the wiki
... ... and an interface to add specific predicates and structures for say, MESH

ScottM: very interested, but don't want to commit to time

matthiassamwald: I will be starting in DERI in october/november

alanr: would like an auto-completer like for google search in firefox

[ discussion of related libraries, including Leeet and a Sesame-tailored completion engine ]

Susie: noting everyone is on vacation, any progress on data conversion?

alanr: nope

<matthiassamwald> [Leeet] features an autocomplete mechanism based on SPARQL queries.
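
A minimal sketch of how an autocomplete lookup could be phrased as a SPARQL query. This is an assumption about the general technique, not Leeet's actual implementation: the function name, the use of rdfs:label, and the result limit are all hypothetical. It relies on STRSTARTS and LCASE from SPARQL 1.1.

```python
def autocomplete_query(prefix, limit=10):
    """Build a SPARQL query for labels that start with the typed prefix.

    Hypothetical helper: matches case-insensitively on rdfs:label.
    """
    # Lower-case the prefix and escape it for a SPARQL string literal.
    safe = prefix.lower().replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT DISTINCT ?resource ?label WHERE {\n"
        "  ?resource rdfs:label ?label .\n"
        f'  FILTER STRSTARTS(LCASE(STR(?label)), "{safe}")\n'
        "}\n"
        "ORDER BY ?label\n"
        f"LIMIT {int(limit)}"
    )


# Query text that a completion widget might send after the user types "Purk".
print(autocomplete_query("Purk", limit=5))
```

The query string would then be sent to whichever SPARQL endpoint holds the data; that part is left out because no endpoint is named in the discussion.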

alanr: talked to a fellow from EBI who is interested in expression data
... Marco Brandizi

Susie: would be interesting to work on DrugDB. will prod people who volunteered
... may be some folks at Lilly who will want to contribute

<alanr> 1) Representing the information about the samples, experiment, protocols leading to the hybridization, technical aspects of the hybridization, etc.

<alanr> 2) Representing the computed intensity of the spots on an array, as well as how those were computed (e.g. MAS5, RMA, dChip, etc.)

<alanr> 3) Representing which genes are thought to be relatively highly expressed by interpreting the intensity of the spots as amount of expression of certain genes.

Use cases - All

DanielR: was interested in a use case involving images
... want to work with extending images with semantic annotations

susie: we are working on Use Cases in SWEO

DanielR: there was a discussion of a mammography use case

ScottM: I have been working on a mammography study in the Netherlands

DanielR: expect NCI-backed standard for annotating medical images on the web
... controlled terminologies where possible, e.g. SNOMED, ... , something for regions

[scribe distracted -- missed stuff]

alanr: some work on annotations on Allen Brain Atlas
... there are existing region taxonomies
... another connection: Bijan Parsia said he'd be working on spatial reasoning
... (above, near...)

text mining

<mscottm> http://www.biosemantics.org/index.php?page=anni-2-0

ScottM: looked at Anni
... nice handling of synonyms for, say, proteins
... once you pick a URI system, you will have biologists who use their own names for, say, a protein or gene
... you need a tool to manage the mapping

<scribe> ... done internally in text mining systems
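
A minimal sketch of the kind of mapping tool ScottM asks for: a lookup table from the names biologists actually use to the identifiers in the chosen URI system. The class name and the example.org URIs are hypothetical, and this is not how Anni or any particular text mining system does it.

```python
from collections import defaultdict


class SynonymIndex:
    """Map free-text names to the URIs they may refer to."""

    def __init__(self):
        self._by_name = defaultdict(set)

    @staticmethod
    def _normalize(name):
        # Ignore case and collapse runs of whitespace.
        return " ".join(name.lower().split())

    def add(self, uri, *names):
        """Register one or more names for a URI."""
        for name in names:
            self._by_name[self._normalize(name)].add(uri)

    def lookup(self, name):
        """Return every URI registered for a name, sorted; empty if unknown."""
        return sorted(self._by_name.get(self._normalize(name), ()))


index = SynonymIndex()
# Made-up identifiers; a real deployment would use the agreed URI scheme.
index.add("http://example.org/protein/A", "p53", "TP53", "tumor protein p53")
index.add("http://example.org/protein/B", "p53 binding protein")
print(index.lookup("Tumor  Protein P53"))
```

Returning a list rather than a single URI keeps ambiguous names visible, so a curator can decide which entity was meant.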

UNKNOWN_SPEAKER: perhaps we can re-use unique concept identifier techniques from text mining systems
... my group provides web services for text mining packages; does not try text mining packages itself
... Dietrich Rebholz-Schuhmann has a nice overview of different systems
... UIMA framework came up
... migrated from IBM to apache
... makes text mining systems more inter-operable
... noticed a corpus for Huntington's

alanr: working on extraction of named entities (diseases, phenotypes, ... whatever) and interactions
... some results from GeneWays
... still pretty noisy
... all in PDF -- coding to convert to HTML

ScottM: Lucene uses PDF format

alanr: Lucene treats the document as a bag of words -- scrambling the order won't change the results
... believe HTML is the easiest to work with
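
A minimal sketch of the bag-of-words idea alanr mentions: a document is reduced to word counts, so shuffling the word order leaves the representation unchanged. The sample sentence is made up, and the sketch illustrates the general model only; it makes no claim about Lucene's internals.

```python
import random
from collections import Counter


def bag_of_words(text):
    """Reduce a text to lower-cased word counts, discarding word order."""
    return Counter(text.lower().split())


sentence = "the protein binds the receptor"  # made-up example text
words = sentence.split()
random.shuffle(words)  # scramble the word order
scrambled = " ".join(words)

# Same words, same counts: the two bags compare equal.
print(bag_of_words(sentence) == bag_of_words(scrambled))  # True
```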

<matthiassamwald> Dietrich Rebholz-Schuhmann

Susie: can you share Rebholz's tutorial?

ScottM: sure -- it's on-line

alanr: matthiassamwald wrote some related code

matthiassamwald: you can give it a PubMed identifier or query and you get back a list of annotated abstracts

<matthiassamwald> http://whatizit.neurocommons.org

<alanr> http://svn.neurocommons.org/svn/trunk/nlp/soc_textmining/

ScottM: advantage of using web services is that you can point at a service as the provenance of a piece of extracted data

Summary of Action Items

[NEW] ACTION: Susie to ask Johnathan what he needs for TAG access [recorded in http://www.w3.org/2007/08/27-BioRDF-minutes.html#action01]
 
[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.128 (CVS log)
$Date: 2007/08/28 20:55:55 $
