Use Case Metrics Progress

From Decision XG
Revision as of 11:58, 8 July 2010 by Jwaters2


Approach

The "Metrics for Assessment" use case requires representing the basic components needed to describe a measurable property and use it to filter and order decision options. These components are the name of the metric, its datatype, its lower and upper thresholds, and the ordering (low-to-high or high-to-low) of the measured value. Once a user chooses a set of options for a given decision question, any DatatypeProperty of those options is a candidate metric. For example, if the question is "What computer should I buy?", the options are the available computers and the properties include measurable values such as cost, disk space, CPU speed, and warranty period. If the question is "What city is best for establishing my new business?", the options are cities in the country or region and the properties include cost of living, taxes, population size, available facilities, and growth potential. If the question is "What earthquakes are significant in my region for my weekly report?", the options are the earthquakes of the last week and the properties include magnitude, depth, and region.
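
The components listed above can be sketched in a few lines of Python to show how a single metric might filter and order a set of options. This is only an illustration: the `Metric` class, its field names, the sample computers, and the reading of "low-to-high" as an ascending sort are all assumptions made for the example, not part of the ontology.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Metric:
    """Hypothetical in-memory mirror of the metric components described above."""
    name: str
    prop: str                             # key of the measured property on each option
    lower_thresh: Optional[float] = None  # options below this value are filtered out
    upper_thresh: Optional[float] = None  # options above this value are filtered out
    order_low_to_high: bool = True        # assumed: True sorts ascending, False descending

    def accepts(self, option: dict) -> bool:
        """Return True if the option has a value inside the thresholds."""
        value = option.get(self.prop)
        if value is None:
            return False
        if self.lower_thresh is not None and value < self.lower_thresh:
            return False
        if self.upper_thresh is not None and value > self.upper_thresh:
            return False
        return True

    def rank(self, options: list) -> list:
        """Filter the options by threshold, then order them by the measured value."""
        kept = [o for o in options if self.accepts(o)]
        return sorted(kept, key=lambda o: o[self.prop],
                      reverse=not self.order_low_to_high)


# Made-up sample options for the "What computer should I buy?" question.
computers = [
    {"model": "A", "cost": 1200.0},
    {"model": "B", "cost": 650.0},
    {"model": "C", "cost": 2400.0},
    {"model": "D", "cost": 900.0},
]

cost = Metric(name="cost", prop="cost", upper_thresh=1500.0, order_low_to_high=True)
ranked_models = [c["model"] for c in cost.rank(computers)]
print(ranked_models)
```

With these sample values, computer C is dropped by the upper threshold and the remaining options are listed from cheapest to most expensive.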

The goal, then, is to follow the eXtreme Design methodology using the NeOn Toolkit: import and specialize a pattern as needed, create instances useful for testing, and finally write a SPARQL query as a unit test to confirm that the ontology includes what this use case needs.

Progress

No suitable metric pattern was identified in the repository of patterns at ontologydesignpatterns.org. Some initial thought was given to candidate patterns, but this approach to metrics is still being fleshed out and potentially useful patterns are still being sought. In the meantime, the basic metric components were modeled directly.

The following ontology in Turtle format was used for the initial modeling:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix dc: <http://purl.org/dc/elements/1.1/>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
@prefix dm: <http://www.emAdopters.info/2010/Decisions#>.
@prefix ex: <http://www.emAdopters.info/2010/Decisions/purchase/computers/metrics#>.
@prefix d34: <http://data-gov.tw.rpi.edu/raw/34/data-34.rdf#>.
@prefix dgp32: <http://data-gov.tw.rpi.edu/vocab/p/32/>.
@prefix dgtwc: <http://data-gov.tw.rpi.edu/2009/data-gov-twc.rdf#>.
@prefix owl: <http://www.w3.org/2002/07/owl#>.

#------------------
# CLASSES: Metric
#------------------

dm:Metric
 rdf:type rdfs:Class.

#------------------
# PROPERTIES: weight, dataProperty, subMetric, dataSet, options,
#  lowerThresh, upperThresh, stringMatch, orderLowToHigh
#------------------

dm:weight
 a  owl:DatatypeProperty;
 rdfs:domain dm:Metric;
 rdfs:range xsd:float.

dm:dataProperty
 a  owl:ObjectProperty;
 rdfs:domain dm:Metric;
 rdfs:range  owl:Thing.

dm:subMetric
 a  owl:ObjectProperty;
 rdfs:domain  dm:Metric;
 rdfs:range   dm:Metric.

dm:dataSet
 a owl:ObjectProperty;
 rdfs:domain dm:Metric.

dm:options
 a  owl:ObjectProperty;
 rdfs:domain  dm:Metric;
 rdfs:range   owl:Thing.

dm:lowerThresh
 a  owl:DatatypeProperty;
 rdfs:domain dm:Metric;
 rdfs:range  xsd:float.

dm:upperThresh
 a  owl:DatatypeProperty;
 rdfs:domain  dm:Metric;
 rdfs:range   xsd:float.

dm:stringMatch
 a  owl:DatatypeProperty;
 rdfs:domain dm:Metric;
 rdfs:range  xsd:string.

dm:orderLowToHigh
 a  owl:DatatypeProperty;
 rdfs:domain dm:Metric;
 rdfs:range  xsd:boolean.
 

#---------------------
# INSTANCES: Metrics for a decision assessing earthquake significance
#  Magnitude, Depth, Region
#---------------------


# Numeric values are typed explicitly: a bare 1.5 in Turtle is an xsd:decimal,
# which does not match the xsd:float range declared for these properties.
dm:Magnitude
 a  dm:Metric;
 dc:title "magnitude";
 dm:weight "1.5"^^xsd:float;
 dm:dataProperty dgp32:magnitude;
 dm:dataSet <http://data-gov.tw.rpi.edu/raw/34/data-34.rdf>;
 dm:options dgtwc:DataEntry;
 dm:lowerThresh  "3.0"^^xsd:float;
 dm:orderLowToHigh true.

dm:Depth
 a dm:Metric;
 dc:title "depth";
 dm:weight "1.0"^^xsd:float;
 dm:dataProperty dgp32:depth;
 dm:dataSet <http://data-gov.tw.rpi.edu/raw/34/data-34.rdf>;
 dm:options dgtwc:DataEntry;
 dm:upperThresh  "50.0"^^xsd:float;
 dm:orderLowToHigh  false.

dm:Region
 a dm:Metric;
 dc:title "region";
 dm:weight  "1.0"^^xsd:float;
 dm:dataProperty  dgp32:region;
 dm:dataSet <http://data-gov.tw.rpi.edu/raw/34/data-34.rdf>;
 dm:options  dgtwc:DataEntry;
 dm:stringMatch  "Southern California".


A sample SPARQL unit test was then created showing that we can recover the metric components from the sample instances. The query is shown below:

PREFIX : <http://www.w3.org/2002/07/owl#> 
PREFIX dm: <http://www.emAdopters.info/2010/Decisions#> 
PREFIX data-gov-twc: <http://data-gov.tw.rpi.edu/2009/data-gov-twc.rdf#> 
PREFIX dc: <http://purl.org/dc/elements/1.1/> 
PREFIX owl: <http://www.w3.org/2002/07/owl#> 
PREFIX p: <http://data-gov.tw.rpi.edu/vocab/p/32/> 
PREFIX raw: <http://data-gov.tw.rpi.edu/raw/34/> 
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> 
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> 
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> 

SELECT ?s ?weight ?orderLowToHigh ?dp ?ds ?options ?lt ?ut
WHERE {
  ?s rdf:type dm:Metric;
     dm:weight ?weight.
  OPTIONAL { ?s dm:orderLowToHigh ?orderLowToHigh. }
  ?s dm:dataSet ?ds;
     dm:dataProperty ?dp;
     dm:options ?options.
  OPTIONAL { ?s dm:lowerThresh ?lt. }
  OPTIONAL { ?s dm:upperThresh ?ut. }
}
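
The effect of the OPTIONAL clauses in the query above can be illustrated with a small Python sketch. It does not run SPARQL: the three sample metrics are copied by hand into plain dictionaries (one value per property, which is an assumption), and the helper `select_metrics` is a hypothetical stand-in that mimics only the required/optional split. An unmatched optional pattern leaves its variable unbound, shown here as `None`, while a missing required pattern would drop the whole row.

```python
# Plain-dict stand-in for the three sample metrics (assumes one value per property).
DATASET = "http://data-gov.tw.rpi.edu/raw/34/data-34.rdf"
metrics = {
    "dm:Magnitude": {
        "dm:weight": 1.5, "dm:dataProperty": "dgp32:magnitude",
        "dm:dataSet": DATASET, "dm:options": "dgtwc:DataEntry",
        "dm:lowerThresh": 3.0, "dm:orderLowToHigh": True,
    },
    "dm:Depth": {
        "dm:weight": 1.0, "dm:dataProperty": "dgp32:depth",
        "dm:dataSet": DATASET, "dm:options": "dgtwc:DataEntry",
        "dm:upperThresh": 50.0, "dm:orderLowToHigh": False,
    },
    "dm:Region": {
        "dm:weight": 1.0, "dm:dataProperty": "dgp32:region",
        "dm:dataSet": DATASET, "dm:options": "dgtwc:DataEntry",
        "dm:stringMatch": "Southern California",
    },
}

# Query variable -> property, split the same way the query splits them.
REQUIRED = {"weight": "dm:weight", "ds": "dm:dataSet",
            "dp": "dm:dataProperty", "options": "dm:options"}
OPTIONAL = {"orderLowToHigh": "dm:orderLowToHigh",
            "lt": "dm:lowerThresh", "ut": "dm:upperThresh"}
COLUMNS = ["s", "weight", "orderLowToHigh", "dp", "ds", "options", "lt", "ut"]


def select_metrics(data):
    """Build one result row per metric, mimicking the required/optional patterns."""
    rows = []
    for subject, props in data.items():
        if not all(p in props for p in REQUIRED.values()):
            continue  # a missing required pattern removes the whole row
        row = {"s": subject}
        row.update({var: props[p] for var, p in REQUIRED.items()})
        # An unmatched OPTIONAL leaves its variable unbound (None here).
        row.update({var: props.get(p) for var, p in OPTIONAL.items()})
        rows.append(row)
    return rows


rows = select_metrics(metrics)
for row in rows:
    # One cell per column; unbound cells stay empty so the columns remain aligned.
    print(" | ".join("" if row[c] is None else str(row[c]) for c in COLUMNS))
```

All three metrics are returned even though none of them carries every optional property. Printing an empty cell for each unbound variable keeps every value under its own column.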

The query was run in both Protégé and the NeOn Toolkit. The screenshot of the NeOn Toolkit below shows the SPARQL query being executed and the returned results. (All of the results are present, but they do not quite line up with the columns: when an optional element is missing from a row, the remaining values are not aligned correctly.)

MetricsInNeon.jpg

The screenshot of Protégé below shows the SPARQL query being executed and the returned results.

MetricsInProtege.jpg

Impact

An appropriate representation of a metric is key to understanding how a decision is reached. Quantifying a metric with filtering thresholds and a weight allows it to be combined with other metrics in an assessment, which can be automated to produce a first-cut ordering of the options. Being able to specify which property of which resource in which dataset corresponds to the metric is important for integrating with and using open linked datasets. Represented effectively, metrics can be documented, accessed, reused, and generally managed, which improves understanding of how decisions were reached and supports training on effective and consistent decision processing.
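
One way the weights might combine several metrics into a first-cut ordering is a weighted sum of normalized values, sketched below. The page does not specify a combination rule, so the min-max normalization, the weighted sum, the `higher_is_better` flag (which is not the ontology's `orderLowToHigh`), and the sample earthquakes are all assumptions made for illustration. Only the weights 1.5 and 1.0 are taken from the sample instances.

```python
def normalize(values, higher_is_better):
    """Scale values to the range 0..1, flipping them when lower values should score higher."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0 for _ in values]  # all options tie on this metric
    scaled = [(v - lo) / (hi - lo) for v in values]
    return scaled if higher_is_better else [1.0 - s for s in scaled]


def weighted_scores(options, metrics):
    """Sum weight * normalized value over all metrics, one total per option."""
    totals = [0.0] * len(options)
    for prop, weight, higher_is_better in metrics:
        column = [option[prop] for option in options]
        for i, score in enumerate(normalize(column, higher_is_better)):
            totals[i] += weight * score
    return totals


# Made-up earthquakes; the values are illustrative only.
quakes = [
    {"id": "q1", "magnitude": 4.0, "depth": 10.0},
    {"id": "q2", "magnitude": 5.0, "depth": 40.0},
    {"id": "q3", "magnitude": 3.0, "depth": 20.0},
]

# (property, weight, higher_is_better); treating shallow quakes as more
# significant is an assumption of this sketch.
metrics = [
    ("magnitude", 1.5, True),
    ("depth", 1.0, False),
]

scores = weighted_scores(quakes, metrics)
ranking = [q["id"] for _, q in sorted(zip(scores, quakes),
                                      key=lambda pair: pair[0], reverse=True)]
print(ranking)
```

Because every metric is scaled to the same range before weighting, a weight of 1.5 makes magnitude count half again as much as depth, whatever units the two properties use.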