HCLSIG BioRDF Subgroup/Meetings/2007-04-09 Conference Call
- Date of Call: Monday April 9, 2007
- Time of Call: 11:00am Eastern Time
- Dial-In #: +1.617.761.6200 (Cambridge, MA)
- Dial-In #: +33.4.89.06.34.99 (Nice, France)
- Dial-In #: +44.117.370.6152 (Bristol, UK)
- Participant Access Code: 246733 ("BIORDF")
- IRC Channel: irc.w3.org port 6665 channel #BioRDF (see W3C IRC page for details, or see Web IRC)
- Duration: ~1 hour
- Convener: Susie Stephens
- Scribe: June Kinoshita
- Review action items
- Progress in data set conversion
- AlzGene conversion needs a new owner (MH, ran out of time)
- Presentation update
- Demo equipment update
On the phone I see +1.203.737.aaaa, +1.781.620.aabb, June_Kinoshita, John_Barkley, Don_Doherty, EricP
On IRC I see RRSAgent, Zakim, Don, kei_and_matthias, June, ericP, Jonathan_Rees, alanr
Susie: The plan today is to cover outstanding action items; progress on data set conversion; the presentation component of the demo at the WWW conference; and the demo equipment update. Scanning through the action items, all will fit with other parts of the agenda, except for a question for Eric P: has he contacted anyone in David ? group at MIT to see if they can attend the F2F?
Eric P: Can go pester them.
Susie: Would be good to be quite concrete about what we're asking of them.
Susie: Progress on data set conversion. Have a Word document that highlights all the datasets that we're hoping to include in the demo. Deadlines: Homologene?
Eric P: Alan had comments, thinks it's usable now. Names could be improved.
Susie: I'll put those down as done.
Kei: Is spreadsheet available on line?
Susie: Yes, on demo web page, near bottom. Did send out link. Under Information Resources/Progress Report.
Susie: Next on list is Entrez Gene. Alan was going to work on that.
Alan: Can skip over. I'm behind on all my translations.
Susie: Same for phenotypes, pathways, GO?
Susie: Can you make progress on them by tomorrow?
Alan: I'll do as much as I can by tomorrow.
Susie: Neuroanatomy, John Barkley and Don have been working on BAMS.
Don: The bulk of it is done.
Alan: One question I had is whether there were paper references, PubMed, to portions of BAMS. Don't recall seeing them in the translation.
Don: Yes, there are PubMed references. Can all be found from URIs in there now.
Alan: Ideally would like to have PubMed attachments in the source, as we will be referring a lot to PubMed. Do you want to ask Mikel re: scraping?
Susie: SenseLab is next on the list. Saw some email discussion on this.
Kei: Basically have converted NeuronDB into OWL versions. We have feedback from Matthias, Alan, Bill, and revised. Newer version. Matthias is helping to establish mapping to external sources, including BIRN, BAMS, John Barkley's; mapping info by Don. We'll continue to work on it and map to more ontologies including SWAN.
Alan: Is there any mapping to MeSH going on?
Kei: Matthias is looking at that.
Alan: Mapping need not be in OWL; we're happy with MeSH ID. Looks like we'll be using the SKOS version of MeSH. Olivier said he had fewer reservations about that. Since URIs are not settled, more important to have mapping. A spreadsheet with URI from NeuronDB on the left, and a MeSH ID on the right.
Alan: I sent mail to Olivier, who said he still has some reservations about the SKOS version; so I asked what concerns were because we may end up rewriting it.
Susie: Olivier may be on vacation for a chunk of time, so may be best to forge ahead.
Alan: There are people we talked to, asked questions about MeSH. Anyway, need to know what maps to what. RDF part is easy.
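Editorial sketch of the two-column mapping Alan describes (NeuronDB URI on the left, MeSH ID on the right) and why the "RDF part" is the easy step once the mapping exists. The predicate, the MeSH base URI, the NeuronDB URIs, and the MeSH IDs below are all hypothetical placeholders, since the minutes note that URIs were not yet settled; this is an illustration under those assumptions, not the group's actual conversion.

```python
import csv
import io

# Hypothetical names: neither the predicate nor the MeSH base URI comes
# from the minutes, which say URIs were not yet settled.
MAPS_TO = "http://example.org/hcls/mapsToMeSH"
MESH_BASE = "http://example.org/mesh/"


def mapping_to_ntriples(tsv_text):
    """Turn 'NeuronDB URI <tab> MeSH ID' rows into N-Triples lines."""
    lines = []
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if len(row) != 2:
            continue  # skip blank or malformed rows
        neuron_uri, mesh_id = (cell.strip() for cell in row)
        lines.append(f"<{neuron_uri}> <{MAPS_TO}> <{MESH_BASE}{mesh_id}> .")
    return lines


# Placeholder rows; the URIs and IDs are illustrative, not real identifiers.
sample = (
    "http://example.org/neurondb/cell1\tMESH_ID_1\n"
    "http://example.org/neurondb/cell2\tMESH_ID_2\n"
)
triples = mapping_to_ntriples(sample)
for line in triples:
    print(line)
```

The hard part, as noted in the discussion, is deciding which NeuronDB entry corresponds to which MeSH ID; emitting triples from the finished table is mechanical.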
Susie: In terms of mapping NeuronDB to RDF, are there still things you're working on?
Kei: Still some final things, but minor. Main focus is on mapping to external sources.
Alan: The one that we had discussed for the demo is BrainPharm. Is that up next?
Kei: So we want to gain enough experience and apply to that conversion. That's in our plan. BrainPharm is a small database.
Alan: I'd be happy to chat offline about any issues.
Susie: So is it correct to say the basic conversion is done for NeuronDB?
Kei: Yes, still tweaking and mapping to other databases.
Susie: Will add new line that you're converting BrainPharm to RDF.
Susie: What kind of time frame? Demo is May 9.
Alan: A further question about NeuronDB? Is it all in Oracle?
Kei: Yes. Did a prototype conversion using D2RQ code.
Alan: I know Susie wanted to include Oracle and Mapping aspects to this demo, so trying to figure out what we have that's close enough.
Kei_and_matthias: ... my version of MeSH in SKOS can be downloaded from neuroscientific.net/ont/bio-zen-MESH-new.zip (I hope this is the latest version)
Kei: One possibility is DART-GRID system to facilitate mapping to RDF; might be enough for simple mapping. What is general feeling in this group to use something like DART-GRID for the demo?
Alan: We need to find someone who is going to own the Oracle piece.
Susie: Just to wind back, important that demo shows standard technology that people are using today. So would make a lot of sense to take advantage of the fact that SenseLab is in Oracle. No preference whether it's DART-GRID or D2RQ.
Susie: Even being able to get a single query spanning MySQL and Oracle would be exciting.
Susie: Is there any data that's made available in MySQL that we could use as another component, potentially? Maybe people want to think about that.
Kei: Gene Ontology is available in MySQL version.
Alan: There's an existing conversion we can use off the shelf. To be plausible, would want to show something that's not currently available.
Susie: Maybe we can think about overnight and discuss at F2F.
Susie: Moving down list, conversion of AlzGene to RDF. Michael Hoyer?
Alan: He was doing this in between consulting gigs. Didn't make as much progress as he wanted to. Try to get what he has done on to the Wiki by Weds, Thursday or Friday. Need someone else to own that piece. Not trivial because relatively complicated data, but quite valuable. Maybe Eric P wants to work on next? Do need someone to take it over.
Eric P: Alan, you'd be a good judge of that because I don't know the data involved.
Alan: Is there someone else, maybe John Barkley?
John: Could take a look.
Alan: We could talk on phone and look at it together.
Susie: Maybe we can make a decision on it tomorrow.
Susie: Next two items are related to literature: Jonathan Rees, MeSH to PubMed, and PubMed to Gene. Alan, how's he doing?
Alan: He's doing nicely. Trying to decode what the MeSH terms are saying. First term is MeSH topic, then subtopics; if there's a third term, that's another subtopic. Each of these terms is a narrower version of the unqualified term. Each PubMed ID gets a set of MeSH term instances attached to it. Jon says he has prototype, not sure if will be loaded by tomorrow. Issue of bringing in SKOS version, harmonization of URI; proposes using PURLs over URIs; so that's where we are with MeSH. Will be done this week for sure.
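Editorial sketch of the decoding Alan describes: the first term is the topic, any further terms are subtopics, and each (topic, subtopic) pair is treated as a narrower version of the unqualified topic. The slash-separated input format, the function names, and the sample record are assumptions made for illustration; the minutes do not describe Jonathan's actual prototype.

```python
from collections import defaultdict


def qualified_terms(heading):
    """Split 'Topic/subtopic/...' into the topic and (topic, subtopic) pairs."""
    parts = [p.strip() for p in heading.split("/") if p.strip()]
    if not parts:
        raise ValueError("empty MeSH heading")
    topic, subtopics = parts[0], parts[1:]
    # Each pair stands for a term narrower than the unqualified topic.
    return topic, [(topic, sub) for sub in subtopics]


def annotate(records):
    """Attach a set of term instances to each PubMed ID."""
    annotations = defaultdict(set)
    for pmid, headings in records.items():
        for heading in headings:
            topic, qualified = qualified_terms(heading)
            if qualified:
                annotations[pmid].update(qualified)
            else:
                annotations[pmid].add((topic, None))  # unqualified topic
    return annotations


# Placeholder record; the ID and headings are illustrative only.
records = {"PMID_1": ["Topic A/subtopic x/subtopic y", "Topic B"]}
result = annotate(records)
print(sorted(result["PMID_1"], key=str))
```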
Susie: How come you're thinking of making a decision around PURLs?
Kei_and_matthias: (persistent URLs, see http://purl.org)
Alan: Question is what to do about things like EntrezGene records, where we don't have URIs. Jonathan has looked over some things in context of URI task and thinks reasonable choice is to use PURLs. Same choice was used for OBO. Can name a gene record unambiguously using PURLs, can set up partial re-directs, so that's the rough idea. Other part of it is that it fits in reasonably well as a compromise between rewrite method that I presented at HCLS. Hoping we provide the way to translate PURL into NCBI in way we proposed, but PURL itself is resolvable.
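Editorial sketch of the partial-redirect idea: a PURL prefix names a family of records, and whatever follows the prefix is carried over onto the target URL. Both the PURL prefix and the target below are made-up placeholders, not the addresses the group or NCBI actually use, and the lookup table stands in for configuration that would live on the PURL server.

```python
# Hypothetical prefix-to-target table; both URLs are placeholders.
PARTIAL_REDIRECTS = {
    "http://purl.example.org/hcls/entrezgene/": "http://ncbi.example.org/gene/",
}


def resolve(purl):
    """Return the redirect target for a PURL, or None if no prefix matches."""
    for prefix, target in PARTIAL_REDIRECTS.items():
        if purl.startswith(prefix):
            # Partial redirect: keep the suffix after the registered prefix.
            return target + purl[len(prefix):]
    return None


print(resolve("http://purl.example.org/hcls/entrezgene/12345"))
```

The point of the design is that the PURL stays stable as the record's name even if the redirect target changes later.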
Susie: Next dataset, Allen Brain Atlas already converted. Next is NCCDB with Maryann Martone. Haven't sent anything yet.
Susie: What about SWAN?
June: It’s basically done.
Susie: Next item is actual presentation. Agreed to put a first draft set of slides together. It's very preliminary but is up on the wiki. Under Presentation Topics/Demo Slides, you can see what I have so far. We have a 30-minute slot, so even if giving a slide presentation, would at most want 12 slides, or maybe 7 slides to give time for demo. Even so, managed to get up to 8 slides: 1) Title (has been submitted to conference); 2) Agenda; 3) Mission of HCLS, task forces; 4) Frame use case, high-level info around Alzheimer disease; 5) Introduce different databases, highlight how heterogeneous they are; 6) Highlight technology approach, how we've converted data to RDF, triple store, how we're querying the data; don't know what interface is so can add that in; 7) Benefits of semantic web; 8) Conclusions. That's basically the framework; Lilly needs three weeks to review, so need to fill in next week. So I suggest we create generic slides, fill in details in the talk. Can convert to PPT by next Monday.
Alan: Main concern is I tend to work on slides right up to the demo. I imagine there'd be slides peppered in between while demo is showing. Concern about having slides ready by next week adds a constraint I'm not sure how to work with. Can we partition slides so we have room to add slides? Can we have "Susie's slides" vs others?
Alan: Demo is a pre-existing condition, so could we put in a disclaimer that this work predates your being with Lilly?
Susie: Given that I do work for Lilly, it's going to be hard to say that it has nothing to do with Lilly. They are paying me to spend time working with HCLS, so it is related to Lilly. Given we have only a 30-minute slot, the slides I've already listed really should be included, and we won't have time to show much else, so not sure what we'd be adding.
Susie: Regarding scientific questions, can't we just say them, rather than write them down?
Susie: If we want to incorporate scientific questions, at least can send it off now so Lilly can sign off on the scope of it.
Eric: Is there a way for Lilly to agree about minor tweaks at last minute?
Susie: If you have the bulk of slides reviewed, late tweaks are probably OK.
Alan: Is there an expectation that we need to use Lilly presentation backgrounds?
Susie: Don't think so.
Alan: Very awkward because we hadn't all planned for this.
Alan: Need to work on submitting what we can by Monday.
Susie: It's good to be thinking about the presentation now, because there isn't a lot of time.
Alan: Only danger is that there will be things to say and ways to say them that won't get done until later. I try to have slides be compact and clear, and I hadn't planned for it, so it's a big change.
Don: Can we consult with other parties? Eric N. Get guidance on that?
Susie: Reality is that my name is on the program as a co-presenter, and this is a constraint my company has. Not sure how to work around it. Don't see how having generic slides is such a drawback. For me, not such a challenge to talk around the slides.
Eric P taking over scribing here: topic: Demo Equipment update
Susie: IBM think they can get a box for us ... [something about HP]
... ENeumann and BBug contacting Apple
SW_HCLS(BioRDF)11:00AM has ended