HCLSIG BioRDF Subgroup/Meetings/2006-06-05 Conference Call
- Date of Call: Monday June 5, 2006
- Time of Call: 11:00am Eastern Time
- Dial-In #: +1.617.761.6200 (Cambridge, MA)
- Participant Access Code: 246733 ("BIORDF")
- IRC Channel: irc.w3.org port 6665 channel #BioRDF (see W3C IRC page for details, or see Web IRC)
- Duration: ~1 hour
- Convener: Susie Stephens
- Scribe: M. Scott Marshall
Attendees: Marja Koivunen, Daniel Rubin, Davide Zaccagnini, Olivier Bodenreider, John Barkley, Adam West, M. Scott Marshall, Alan Ruttenberg, Bill Bug, Amit Sheth, Susie Stephens, Brian Gilman, Jonathan Rees, Kei Cheung, SWAN team - Huajun
by Olivier Bodenreider (see PDF presentation)
- Started in 1986
- National Library of Medicine
Unified Medical Language System Components
- 200,000 lexical items
- Part of speech and variant information
- 5M names from over 100 terminologies
- 1M concepts
- 16M relations
- 135 high-level categories
- 7000 relations among them
Can be used to integrate "subdomains" such as: SNOMED, OMIM, MeSH, NCBI Taxonomy, GO, UWDA (old version of FMA)
Addison's Disease Example (could as easily have been Alzheimer's, Parkinson's, etc.)
Information Integration example: NF2 - Gene, protein, and disease
Marja (irc): how could UMLS support: "Find genes related to heart" ?
Huajun (irc): is there any RDF/OWL version of umls? [NO]
Huajun (irc): what's the underlying representation model of umls?
Marja (irc): how are CUIs found? not resolvable on the web but could be done quickly
- UMLS: http://umlsinfo.nlm.nih.gov
- UMLS browsers: (free, but UMLS license required)
- Knowledge Source Server: http://umlsks.nlm.nih.gov
- Semantic Navigator: http://mor.nlm.nih.gov/perl/semnav.pl
- RRF browser: (standalone application distributed with the UMLS)
(some during presentation)
Bill Bug: Use Case: relating human genetics literature curated in OMIM and semantically tagged with both SNOMED and MeSH terms to molecular data such as dbEST, whose semantic annotations consist of a mix of MeSH and some SNOMED terms (and some free text, though this has greatly improved over the past 5 years). Using UMLS to disambiguate between the two terminologies can help to eliminate false negatives (and some false positives) when trying to associate specific dbEST entries with specific human diseases. This, combined with NLP on the free text performed in association with relevant terminological and ontological resources, can be quite effective in identifying these associations between molecular data and disease.
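A minimal sketch of the linking idea in this use case, assuming a lookup table that maps each source-terminology code to a shared concept identifier (the role CUIs play in the Metathesaurus). The function names, record ids, and every code below are hypothetical placeholders, not real UMLS, MeSH, or SNOMED identifiers, and the in-memory table stands in for whatever licensed UMLS data would actually be used.

```python
from collections import defaultdict

# Hypothetical rows: (concept_id, source_vocabulary, source_code).
# Every identifier here is a placeholder, not a real UMLS/MeSH/SNOMED code.
ROWS = [
    ("C0000001", "MSH", "D000001"),
    ("C0000001", "SNOMED", "100001"),
    ("C0000002", "MSH", "D000002"),
    ("C0000003", "SNOMED", "100003"),
]


def build_index(rows):
    """Map each (vocabulary, code) pair to its concept identifier."""
    return {(vocab, code): cui for cui, vocab, code in rows}


def same_concept(index, term_a, term_b):
    """True when both terms are known and resolve to the same concept."""
    cui_a = index.get(term_a)
    cui_b = index.get(term_b)
    return cui_a is not None and cui_a == cui_b


def link_records(index, curated, molecular):
    """Pair curated and molecular record ids whose annotations share a concept."""
    by_concept = defaultdict(set)
    for record_id, annotations in curated.items():
        for annotation in annotations:
            cui = index.get(annotation)
            if cui is not None:
                by_concept[cui].add(record_id)

    links = set()
    for record_id, annotations in molecular.items():
        for annotation in annotations:
            cui = index.get(annotation)
            # Unknown annotations (cui is None) simply produce no links.
            for curated_id in by_concept.get(cui, ()):
                links.add((curated_id, record_id))
    return sorted(links)


if __name__ == "__main__":
    index = build_index(ROWS)
    curated = {"omim-1": [("SNOMED", "100001")]}
    molecular = {
        "est-1": [("MSH", "D000001")],
        "est-2": [("MSH", "D000002")],
    }
    print(link_records(index, curated, molecular))
```

Matching on the shared concept rather than on the raw code is what lets a SNOMED-tagged record and a MeSH-tagged record be associated at all; a string comparison of the two codes would miss the pair.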
???: number of mutations, tumor suppressor genes, quantified terms, how are these represented?
General concern about licensing, access, use - identifiers owned by NLM
Olivier: some terminologies are proprietary but 60% of concepts have non-proprietary strings associated.
Alan, Bill, Scott: Would like URI access to UMLS concepts for at least unique identifiers (see HCLSIG forum thread proposal for standard NCBI database URI)
Marja (irc): just getting identifiers would be an important first step
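As a rough illustration of what URI access to concept identifiers could look like, the sketch below builds a URI from a CUI and recovers the CUI from it. The base URI `http://example.org/umls/cui/` and the function names are hypothetical; no such resolver was agreed on in the call. The pattern assumes the usual CUI form of the letter C followed by seven digits, and the identifier used in the example is a placeholder.

```python
import re

# Assumed CUI shape: "C" followed by seven digits.
CUI_PATTERN = re.compile(r"C\d{7}")

# Hypothetical namespace; a real one would have to be agreed with NLM.
BASE_URI = "http://example.org/umls/cui/"


def cui_to_uri(cui, base=BASE_URI):
    """Return a URI for a well-formed CUI, or raise ValueError."""
    if not CUI_PATTERN.fullmatch(cui):
        raise ValueError(f"not a well-formed CUI: {cui!r}")
    return base + cui


def uri_to_cui(uri, base=BASE_URI):
    """Recover the CUI from a URI in the hypothetical namespace."""
    if not uri.startswith(base):
        raise ValueError(f"URI is outside the namespace: {uri!r}")
    cui = uri[len(base):]
    if not CUI_PATTERN.fullmatch(cui):
        raise ValueError(f"URI does not end in a well-formed CUI: {uri!r}")
    return cui


if __name__ == "__main__":
    uri = cui_to_uri("C0000001")  # placeholder identifier
    print(uri)
    print(uri_to_cui(uri))
```

Keeping the identifier as the final path segment means the mapping is reversible without a lookup table, which fits the point that identifiers alone would already be a useful first step.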
Olivier?: new SNOMED is open for U.S. access, U.K. has own license schemes
Olivier?: Suggests talking to John Madden about a special license
Davide: NLP will use either SNOMED or UMLS
Bill: important to understand what user requirements UMLS was designed to serve - primarily terminology disambiguation - it is not an ontology. As Olivier mentioned, the Semantic Network, which forms the very small, foundational graph to which all other UMLS concepts (CUIs) link, is the closest thing to an ontology in UMLS, though even this requires work to improve its formal specificity, especially as it relates to the relationships (see the OBO Relations Ontology) and definitions.
Amit?: Relation extraction from UMLS, will send pubs to Bill
Scott: is co-occurrence and other quantified data available from UMLS? Is there a way to pass known confidence/uncertainty information?
Bill: BIRN is focussed on neuroscience data. Needs: 1) clearly defined IDs 2) Latest versions of all source terminologies. UMLS necessarily lags behind the source terminologies because of the resource investment required to update the Metathesaurus. These tasks must be prioritized (clinical will often get precedence over basic research terminologies) given the limited resources available to NLM for this work.
Olivier: 1) UMLS is not an ontology 2) lean on NLM and terminology producers to update
Bill: recommend going through NCBO when lobbying (both NLM and terminology producers)
Bill: PATO, FUGO, NeuroNames + FMA = NeuroFMA, neuroradiology branch of RadLex