W3C

- DRAFT -

HCLS

09 Sep 2014

See also: IRC log

Attendees

Present
+1.978.794.aaaa, tim_w, Mehmet, DBooth, Tony, claude, ericP, +1.469.226.aabb, Neda, mscottm2, Ingeborg, [IPcaller]
Regrets
Chair
DBooth (by default)
Scribe
dbooth

Contents



Value Sets in RDF

Claude: Reviewed ballot submission on standardizing value sets in DSTU. By "value set" they mean an expression that can generate a collection of terms.
... You can define a value set intensionally, or extensionally, which means you enumerate the values.
... Think of a valueset as a knowledge document: metadata (author, date published, version, etc.); an expression that allows you to define this extension.
... I noticed in the HL7 proposal that the expression itself was defined in UML. But I would think that if you were to define the expression of a valueset you'd do it in an expression language, because valuesets are a way to define members of a set.
... How would we address this in the semantic world?
... Given that in OWL you use DL to define sets of concepts, could that be used in the definition of valuesets?

<ericP> http://www.w3.org//2013/02/ODM/

Claude: One way to define a VS is to enumerate all the terms -- a list of terms.

<mscottm2> Here's something that Eric wrote up about value sets: https://www.w3.org/wiki/HCLS/ClinicalObservationsInteroperability/FDATherapeuticAreaOntologies/Validation#Value_set

<ericP> example of context-neutral value set in OWL

Claude: The second way is a regex that allows you to find all the terms you want.

<ericP> example of "indirect hierarchy"

Claude: Another way is "this term and all its children, but only the leaves"
... Also have a concept in a vocabulary, and you filter on that property ( :diabetes ) and you get all diabetes-related terms.
... Also by unioning, intersecting or differencing sets.
... Also by a query: the result set is the valueset.
... What the evaluation returns is the valueset
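A minimal sketch of three of the definition styles Claude lists (enumeration, regex, and "only the leaves" under a term). The toy hierarchy and all term names are hypothetical and stand in for a real vocabulary; this is an illustration, not anything proposed on the call.

```python
import re

# Hypothetical toy hierarchy, child -> parent (not real vocabulary content).
PARENT = {
    "diabetes": "disease",
    "type1_diabetes": "diabetes",
    "type2_diabetes": "diabetes",
    "asthma": "disease",
}
TERMS = set(PARENT) | set(PARENT.values())


def by_enumeration(terms):
    """Extensional definition: the members are simply listed."""
    return set(terms)


def by_regex(pattern):
    """Intensional definition: every term whose name matches the pattern."""
    return {t for t in TERMS if re.search(pattern, t)}


def descendants(root):
    """All terms strictly below root in the hierarchy."""
    found = set()
    frontier = [root]
    while frontier:
        current = frontier.pop()
        for child, parent in PARENT.items():
            if parent == current and child not in found:
                found.add(child)
                frontier.append(child)
    return found


def leaves_under(root):
    """The term and its descendants, keeping only terms with no children."""
    has_children = set(PARENT.values())
    return {t for t in descendants(root) | {root} if t not in has_children}
```

In each case the value set is whatever the evaluation returns, so the same downstream code can consume a listed set, a regex match, or a hierarchy walk.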

Eric: In ShEx we can't talk about "only the leaves"
... But can express valuesets by globbing or by an HTTP GET.
... for creating a valueset by query.
... If you're trying to validate some data to ensure that things are in the right range, then you can plug the thing that you're testing against into the query.
... But that's beyond what you'll get out of ShEx or Resource Shapes normally. You'd need an extension.
... If you're expecting a SNOMED term that is a morphology, there are 50k morphology terms, so it's more work if you do a GET to get them all.

Claude: There are queries with filters -- like SPARQL -- and the other way is set operations.
... If ShEx supports this then it would seem to cover most of the query/filter side of things, and if there's a way to represent set operations then you could cover most of these cases.

Eric: What are the use cases and their expressivity requirements? ShEx itself can't do intersection or union, but you could pass that to an extension function via a URI that captures that functionality
... For instance validation, you can say X must be in some VS: The object of this predicate is in this VS, and it checks to see if it is in it.
... How can you make it get the right VS? (Intersections, diffs, etc)
... But the downside is that it looks like a big ugly URL in the ShEx, when it's in fact a URL-encoded SPARQL query.
... But I think you also want in the language, a URL-composer, so that you can have the URL in a human-readable form.

Claude: But to have the URL, someone has to have already defined the query.
... Need to say: here's my intent. I want this set, and this expression, and this set, and the subsumption, and union them. Maybe DL could do this. On the LHS you could say the set of all terms that match this regex, unioned with the set of all terms including or descended from some other term.
... I can take this expression and convert it to SPARQL. On one side I have the expression, and on the other I can execute it. But how would I define it if the expression has not yet been built?

Eric: Closest available: a small amount of code to get to this. Take ShEx or Resource Shapes, and say the valueset is this URL that ends in "?=", plus the URL-encoding of this triple-quoted string (which is SPARQL), and by that you're able to talk to any SPARQL endpoint.
... so you could do all of the set operations in SPARQL.
... Slightly further away is to do them in OWL DL-Query that's built into Protege, but you'd need deployed DL-query engines. And both of those approaches require embedded URLs and human-readable descriptions.
... Harold, does that make sense?
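A rough sketch of the "URL composer" idea: keep the SPARQL readable in a triple-quoted string and URL-encode it only when the value-set URL is built. The endpoint and vocabulary IRIs are hypothetical, and the `query` parameter name is an assumption based on the "query=" form mentioned later in the discussion.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; any SPARQL endpoint URL could be substituted.
ENDPOINT = "http://example.org/sparql"

# The human-readable form of the value-set definition (hypothetical IRIs).
MORPHOLOGY_QUERY = """
SELECT ?term WHERE {
  ?term <http://example.org/vocab#isA> <http://example.org/vocab#Morphology> .
}
"""


def value_set_url(endpoint, sparql):
    """Compose a value-set URL: endpoint + '?query=' + URL-encoded SPARQL."""
    return endpoint + "?" + urlencode({"query": sparql.strip()})


url = value_set_url(ENDPOINT, MORPHOLOGY_QUERY)
```

The encoded URL is what a validator would dereference; the triple-quoted string is what a person would read and edit.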

Harold: I'm grasping at context here.

Eric: You have some Resource Shapes or ShEx that describes a valid instance, and in that it says the VS that must match is enumerated here, where "here" is the result of a query.

Harold: What would be useful in that situation would be if that URL resolves to a set of URLs. Come up with a simple representation of what that URL returns. E.g., Turtle, or an RDF list.

David: Or a SPARQL result set.

Eric: You could add to your favorite validator the ability to add a URL to a "query=" URL.

David: Like a URL template

Harold: I'd like to separate the mechanism for generating the VS from the VS reference.
... We want to enable it to reference a simple flat list -- no sparql involved.
... But some valuesets are complex with more structure, and it's good to allow people to be clever too.
... One challenge is that some valuesets are large, and it's in our interest if we can ask "is this a valid value" rather than returning all 300k values.

Claude: Suppose you have a huge VS, and you want to know if a given term is in it.
... If you define the VS a certain way, can you do that check without ever looking at a term in it?
... E.g., if the VS is defined by a rule, maybe just plug the given term into the rule to see if it is satisfied.

David: Two use cases: validation (whether a term is allowed in a VS) versus generation of all terms in a VS (e.g. for displaying in a drop-down list).
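One way to picture the two use cases, assuming the value set is defined by a rule: validation plugs a single term into the rule, while generation has to walk a vocabulary. The class name and the "M-" code prefix are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator


@dataclass(frozen=True)
class RuleValueSet:
    """A value set defined intensionally by a membership rule."""

    rule: Callable[[str], bool]

    def contains(self, term: str) -> bool:
        """Validation: test one term against the rule; nothing is enumerated."""
        return self.rule(term)

    def expand(self, vocabulary: Iterable[str]) -> Iterator[str]:
        """Generation: walk a vocabulary and yield the terms that satisfy the rule."""
        return (term for term in vocabulary if self.rule(term))


# Hypothetical rule: treat any code starting with "M-" as a morphology term.
morphology = RuleValueSet(rule=lambda code: code.startswith("M-"))
```

With this shape, a check such as `morphology.contains("M-80003")` never touches the full term list, whereas `morphology.expand(...)` is the call a drop-down list would need.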

<ericP> analyte: C-reactive peptide; source: CSF

David: So you specify the term by giving a set of properties of it.

Harold: Need to include in expressions the URI and a block of text that indicates what we expect from it: "This yields any LOINC code with the following characteristics ... "
... In the process of offering any data models, you get to the VS and we need to record it in a human-readable way to document it.

David: So the expressions themselves do not make it obvious to the human reader what will come back?

Harold: Yes, someone must do the work to turn it into a URI and resolve it to possible values. e.g., countries that possess a particular characteristic.
... I can describe that in a shape, then as a secondary test someone can figure out the best way to determine that, and then they give me a URI that will give me that list.

Eric: You don't want to put making a semantic representation of MedDRA in the critical path?

Harold: Right.
... We had a proposal from Bodenreider for putting MedDRA into shapes.

<mscottm2> https://www.w3.org/wiki/HCLS/ClinicalObservationsInteroperability/FDATherapeuticAreaOntologies/Validation#Value_set

Scott: Re a URI for a VS that is filled in by a procedural attachment: Eric wrote about the CDISC code for marital status; he itemizes the VS in this fragment. In ShEx you could do it that way, but I'd like to say that a shape can take on a value from the set of marital codes without having to enumerate them.
... This shape takes a value from Observations.

Eric: One way is to enumerate directly. Another is to do a GET that returns line-feed delimited list of the terms. (Where that URL is not too ugly.) Another is ShEx with a hideous URL that is triple-quoted and goes at the end of the query URL. A fourth way is to give it a URL and the system knows the URL but it isn't mechanically connected to the web, it's an internal hook to a database, for example.
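A small sketch of the second option, a GET that returns a line-feed delimited list of terms. The function names and the sample codes are hypothetical; the fetch helper is shown only for shape and is not called here, so a canned response body stands in for the network.

```python
from urllib.request import urlopen


def parse_term_list(body: str) -> frozenset:
    """Turn a line-feed delimited response body into a set of terms."""
    return frozenset(line.strip() for line in body.splitlines() if line.strip())


def fetch_value_set(url: str) -> frozenset:
    """GET the list from a URL (defined for illustration, not called below)."""
    with urlopen(url) as response:
        return parse_term_list(response.read().decode("utf-8"))


# Stand-in for a response body; the codes are invented.
SAMPLE_BODY = "MARRIED\nSINGLE\n\nDIVORCED\n"
marital_status = parse_term_list(SAMPLE_BODY)
```

Once parsed, checking a value is a plain set lookup, for example `"SINGLE" in marital_status`.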

Claude: Another use case: a terminologist wants to define a VS. How do we support them in doing this? SPIN is SPARQL-like. Need a way for that terminologist to define and share that VS.

<hsolbrig> MeSH in RDF. Temporary endpoint. http://mor2.nlm.nih.gov/conductor account: meshdemo password: demoofmeshinrdf - select iSQL button -- gives a SPARQL window

Claude: Clinical person needs to be able to capture this, saying what the VS should be.

David: Need a well-known notion of ValueSet, so that someone can say "this is a ValueSet".

Eric: You could have something like purl.org that dereferences to a valueset, perhaps with a description like Harold wants, and a way to compose new ones from old ones.

<hsolbrig> Agree. ValueSet should consist of a set of URLs + metadata about source / dates / etc.

David: This need to compose sets from other sets is not unique to healthcare. How is it normally done?

Eric: If you represent the VS list as a set of triples, rather than a list of values, then you could use SPARQL to do the set operations.

<hsolbrig> Apologies, I need to go...

Eric: Where have you seen these set operations in VS?

Claude: Reaction might be caused by food or medication or latex gloves, etc. So you may want to say that you can union all of these substances as the terms in the VS.
... Another case is that there's an existing VS, but many are not relevant to the domain, so you want to take away the irrelevant ones.
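The two cases Claude describes map onto ordinary set union and set difference. The sketch below uses plain Python sets as a stand-in for whatever engine would actually evaluate the expression; all term names are invented for illustration.

```python
# Hypothetical component value sets (invented terms, not real codes).
food = {"peanut", "shellfish"}
medication = {"penicillin", "aspirin"}
other_substances = {"latex"}

# Case 1: the allergen value set is the union of the component sets.
allergens = food | medication | other_substances

# Case 2: start from an existing value set and remove terms that are
# not relevant to the domain at hand.
not_relevant_here = {"aspirin", "shellfish"}
narrowed = allergens - not_relevant_here
```

Keeping the component sets separate and composing them at evaluation time means a change to one source set flows through to every value set built from it.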

Eric: Or you might want to prefer a VS.

<Kerstin> the constrained use case just described now (narrower scope) is very common for us in clinical trial data!

Validation Working Group

Eric: Call for participation has gone out. Now's the time to join! First f2f will be in Santa Clara, end of October.

<ericP> https://www.w3.org/mid/op.xkzxenxqsvvqwp@sith.local

<ericP> W3C Call for Participation in RDF Shapes WG (member only)

ADJOURNED

<ericP> https://help.github.com/articles/working-with-large-files

lost eric!

Summary of Action Items

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.138 (CVS log)
$Date: 2014-09-09 17:02:09 $
