See also: IRC log
<egombocz> 510-705 is me, egombocz
<michel> janos: started rxnorm project as part of lodd in 2010
<michel> ... rxnorm is an NLM project, need to sign UMLS license agreement to download and work with it
<michel> ... contains proprietary sources, but ways to filter out
<michel> ... and get only the public domain content
<michel> ... Susie Stephens had contacted Olivier Bodenreider - anything at level 0 is public
<michel> ... never released the entire dump of files - part in license that you cannot make the whole thing available at once
<michel> ... created a front end to it
<michel> ... last update in 2011
<michel> ... NLM has released a subset of RxNorm - called current prescribable content
<michel> janos: without license agreement required to be signed - easy to generate RDF
<michel> ... wants to update the toolchain
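The filtering janos describes (keeping only level 0, public content) could be sketched roughly as follows. This is a minimal sketch, not janos's actual toolchain: it assumes the RXNCONSO.RRF column order listed in the code and that the source restriction level sits in the SRL column. The helper names and the sample rows are made up for illustration.

```python
# Column order assumed from the RXNCONSO.RRF layout; verify it against the
# documentation of the release you actually download.
RXNCONSO_COLUMNS = [
    "RXCUI", "LAT", "TS", "LUI", "STT", "SUI", "ISPREF", "RXAUI", "SAUI",
    "SCUI", "SDUI", "SAB", "TTY", "CODE", "STR", "SRL", "SUPPRESS", "CVF",
]


def parse_rrf_line(line):
    """Split one pipe-delimited RRF line into a dict keyed by column name."""
    fields = line.rstrip("\n").split("|")
    # RRF lines end with a trailing pipe, which yields one extra empty field.
    if fields and fields[-1] == "":
        fields = fields[:-1]
    return dict(zip(RXNCONSO_COLUMNS, fields))


def public_rows(lines):
    """Yield only the rows whose source restriction level (SRL) is 0."""
    for line in lines:
        row = parse_rrf_line(line)
        if row.get("SRL") == "0":
            yield row


def make_line(**values):
    """Build a made-up RRF line for the demo below (hypothetical helper)."""
    return "|".join(values.get(c, "") for c in RXNCONSO_COLUMNS) + "|"


# Made-up sample rows: one level 0 source, one restricted source.
SAMPLE = [
    make_line(RXCUI="1", SAB="RXNORM", TTY="IN", STR="exampledrug", SRL="0"),
    make_line(RXCUI="2", SAB="EXAMPLE_SRC", TTY="BN", STR="examplebrand", SRL="3"),
]

if __name__ == "__main__":
    for row in public_rows(SAMPLE):
        print(row["RXCUI"], row["SAB"], row["STR"])
```

Filtering on the row level keeps the restricted sources out before any RDF is generated, so nothing proprietary ends up in the published graph.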
<michel> michel: can we explore the data a bit
michel: URIs could be replaced with labels where available
... i also notice that some of the links do not have labels at all
janos: right now it is quite close to the original data
michel: RxNorm has formulation, ingredients -- are there other relationships?
janos: ingredients, product labels, trade names, a hierarchy of types
michel: any information regarding the function?
janos: they link to the VA drug file, and this has information on indications and
... that data might not be in the public domain
michel: i think the therapeutic application areas are what we would be most interested in.
janos: for some of the drugs with links to DrugBank, that could give such information
michel: in the latest Bio2RDF release we included NDCs, this could be used for linking RxNorm.
(NDC = National Drug Code)
janos: first link is the SPARQL endpoint, second link is the name of the graph in the endpoint containing RxNorm data.
<michel> types and labels : http://sparqlbin.com/#ed8e6610c703076a944e3bf3ace1c712
<michel> y label
<michel> ... http://link.informatics.stonybrook.edu/rxnorm/RXCUI RXCUI: a unique concept identifier
<michel> ... http://link.informatics.stonybrook.edu/rxnorm/SAB SAB: the source vocabulary
<michel> ... http://link.informatics.stonybrook.edu/rxnorm/TTY TTY: term type
<michel> ... http://link.informatics.stonybrook.edu/rxnorm/RXAUI RXAUI: identifies a string down to its source
michel: i would have expected types (classes, property types)
janos: they are typed, in the UMLS way -- typing is done in a specific vocabulary
michel: i would like to have some feedback on this approach to RDFizing
... we are also facing this decision in Bio2RDF a lot -- either reflecting source data as unaltered as possible, or re-interpreting the data to make it more useful to the RDF world (e.g., restructuring, adding classes)
janos: i would be afraid of misinterpreting something.
<michel> matthias: the problem of reinterpreting the data is that you could get things wrong, and it could be more difficult to keep those data updated - mappings might need to be revisited
<michel> ... maybe some middle ground; janos could maintain this generic representation, and perhaps add some more content with some simple rules
<michel> ... rdfs:label, dc:title, simple class typing
<michel> ... opt for both options to some degree
<michel> janos: maybe maintain each in different graphs?
michel: i think it is an opportunity for us -- not multiple values -- it is possible to translate one graph to another graph
(back on the call)
<michel> michel: sparql construct of rxnorm into "linked data + rdfs reasoning" friendly
<michel> ... dealing with semantic relationships in UMLS
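The middle ground discussed above (keep the generic graph, derive a second graph with rdfs:label and simple class typing) could look roughly like this. It is a plain-Python sketch of what a SPARQL CONSTRUCT would do, not the group's agreed mapping. The RXCUI, SAB, TTY and RXAUI predicates come from the listing in the log; the STR predicate, the "atom/" subject URIs and the "class/" URI pattern are assumptions made for illustration.

```python
RXN = "http://link.informatics.stonybrook.edu/rxnorm/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def enrich(source_triples):
    """Return only the derived triples, leaving the source graph untouched.

    Rules (both are assumptions for this sketch):
      * a string value (hypothetical rxnorm/STR predicate) becomes rdfs:label
      * a term type (rxnorm/TTY) becomes rdf:type on a class URI built from it
    """
    derived = []
    for subject, predicate, obj in source_triples:
        if predicate == RXN + "STR":
            derived.append((subject, RDFS_LABEL, obj))
        elif predicate == RXN + "TTY":
            derived.append((subject, RDF_TYPE, RXN + "class/" + obj))
    return derived


# Made-up source triples in the "close to the original data" style.
SOURCE = [
    (RXN + "atom/A1", RXN + "STR", "exampledrug"),
    (RXN + "atom/A1", RXN + "TTY", "IN"),
    (RXN + "atom/A1", RXN + "SAB", "RXNORM"),
]

if __name__ == "__main__":
    for triple in enrich(SOURCE):
        print(triple)
```

Because the function returns only the derived triples, they can be loaded into a separate named graph, in line with janos's suggestion, and regenerated whenever the rules or the source data change.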
michel: had conversation with BioPortal team, who have RxNorm -- they refuse to add any semantics, i think you cannot even navigate the tree well
... RxNorm is part of a family of datasets, representation-wise
... seeing how to better reflect that in RDF is interesting
<egombocz> Agree with this; BioPortal's version is also from 3/7/2011 so it's old too. We really should get this into a more useful representation
erich: BioPortal version is a pain indeed. It is also out of date.
... that is not just issue of RxNorm, we have interconnectivity issues in lots of UMLS-based environments.
alistair: janos, do you plan on using VoID for dataset metadata? i suggest using it.
janos: yes, i will look into it.
alistair: i also suggest the PAV vocabulary for provenance
<agray> PAV: provenance versioning and authoring ontology http://purl.org/pav/
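A dataset description combining VoID and PAV, as alistair suggests, might be generated along these lines. This is a hedged sketch only: the function name and all example.org URIs, the title, the version and the date are placeholders, and the choice of properties (void:sparqlEndpoint, pav:version, pav:derivedFrom, pav:createdOn) is one plausible selection rather than an agreed profile.

```python
def describe_dataset(dataset_uri, title, version, endpoint, source_uri, created_on):
    """Return a small Turtle document describing a dataset with VoID and PAV.

    Values are inserted verbatim, so this sketch assumes they contain no
    quotes, backslashes or angle brackets that would need escaping.
    """
    lines = [
        "@prefix void: <http://rdfs.org/ns/void#> .",
        "@prefix pav: <http://purl.org/pav/> .",
        "@prefix dcterms: <http://purl.org/dc/terms/> .",
        "@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .",
        "",
        f"<{dataset_uri}> a void:Dataset ;",
        f'    dcterms:title "{title}" ;',
        f"    void:sparqlEndpoint <{endpoint}> ;",
        f'    pav:version "{version}" ;',
        f"    pav:derivedFrom <{source_uri}> ;",
        f'    pav:createdOn "{created_on}"^^xsd:dateTime .',
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # All values below are placeholders, not real dataset details.
    print(describe_dataset(
        dataset_uri="http://example.org/dataset/rxnorm-rdf",
        title="RxNorm RDF conversion (placeholder title)",
        version="0.0-placeholder",
        endpoint="http://example.org/sparql",
        source_uri="http://example.org/source/rxnorm-release",
        created_on="2000-01-01T00:00:00Z",
    ))
```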
michel: okay, we need a workplan
janos: i can make N3 files
... about representation and publishing, we need to discuss as a group. in which namespace should result be published?
... Bio2RDF namespace?
michel: we use Bio2RDF namespace for the datasets we convert.
... generally, URIs are generated based on specific rules we defined
michel: details are shown in these slides
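For orientation, a rule-based URI builder might look like the sketch below. It assumes the commonly seen Bio2RDF pattern http://bio2rdf.org/namespace:identifier; the actual rules are the ones in michel's slides, and the "rxnorm" namespace and the identifier used here are hypothetical.

```python
from urllib.parse import quote

# Base assumed from the commonly seen Bio2RDF URI pattern.
BIO2RDF_BASE = "http://bio2rdf.org/"


def bio2rdf_uri(namespace, identifier):
    """Build a namespace:identifier style URI (pattern is an assumption).

    The namespace is lowercased and the identifier is percent-encoded so
    that spaces or slashes in source identifiers cannot break the URI.
    """
    return f"{BIO2RDF_BASE}{namespace.lower()}:{quote(str(identifier), safe='')}"


if __name__ == "__main__":
    # "rxnorm" and "12345" are placeholders, not confirmed Bio2RDF values.
    print(bio2rdf_uri("RxNorm", "12345"))
```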
janos: original files are in RRF format. then the data is imported into SQL database, then scripts are doing transformations based on the SQL
... i plan to rewrite scripts so they run without SQL server
... scripts are in Python
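A conversion that skips the SQL server and streams RRF rows straight to N-Triples could be sketched as below. This is not janos's script: the field positions are assumed from the RXNCONSO.RRF layout, the "atom/" subject URI pattern is invented for the example, and only the RXCUI, SAB and TTY predicates listed earlier in the log are emitted.

```python
import csv
import io

RXN = "http://link.informatics.stonybrook.edu/rxnorm/"

# 0-based field positions, assumed from the RXNCONSO.RRF layout.
FIELDS = {"RXCUI": 0, "RXAUI": 7, "SAB": 11, "TTY": 12}


def escape_literal(text):
    """Escape the characters that N-Triples requires escaping in literals."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def rrf_to_ntriples(stream):
    """Yield N-Triples lines for each row of a pipe-delimited RRF stream."""
    # QUOTE_NONE: RRF has no quoting convention, so quotes are plain data.
    reader = csv.reader(stream, delimiter="|", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) <= FIELDS["TTY"]:
            continue  # skip blank or truncated lines
        aui = row[FIELDS["RXAUI"]]
        # The "atom/" subject pattern is hypothetical.
        subject = f"<{RXN}atom/{aui}>"
        for name in ("RXCUI", "SAB", "TTY"):
            value = row[FIELDS[name]]
            if value:
                yield f'{subject} <{RXN}{name}> "{escape_literal(value)}" .'


if __name__ == "__main__":
    # Made-up row with 18 fields plus the trailing pipe.
    fields = [""] * 18
    fields[0], fields[7], fields[11], fields[12] = "1", "A1", "RXNORM", "IN"
    sample = io.StringIO("|".join(fields) + "|\n")
    for line in rrf_to_ntriples(sample):
        print(line)
```

Streaming row by row keeps memory use flat and removes the database dependency, at the cost of losing the joins that SQL made convenient; relationships that span files would need an in-memory index or a second pass.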
michel: ideally for Bio2RDF would be PHP, but other languages are also okay if they are well-commented
janos: i plan to have this updated by April 1.
michel: we will consolidate our provenance models (see bio2rdf wiki)
... in April we will know better about which provenance representation would be best.
<agray> It is the Open PHACTS and Bio2RDF provenance models which are being consolidated
Minutes of 06 Mar 2013; scribe: matthias_samwald (inferred by scribe.perl).