XML dataset content validation

From Semantic Sensor Network Incubator Group

The use case is detailed on a restricted-access wiki:

https://wiki.csiro.au/confluence/display/WaterInformatics/Validation+Service

It is summarized here for readers who do not have access to the original.

For exchange of XML-encoded data, checking the validity of an instance is a two-stage procedure:

1. XML Schema validation: checks the syntax and structure, plus limited content checking (built-in XML Schema data types).
2. Content validation: checks the values of elements and attributes, which may be either literals or URIs.

Content validation is done using Schematron.

a. The basic validity check is "does the value of node A (XPath expression) appear in list X?"
b. For some nodes, a more sophisticated check is added, along the lines of "the value of node A (XPath expression) must have a 'foo' relation with the value of node B (XPath expression)".
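The two kinds of check can be illustrated outside Schematron. The following is a minimal Python sketch, not the service's implementation: the sample document, the element names (`property`, `unit`), the vocabulary contents and the helper names are all hypothetical, and the lists are held in memory rather than fetched from a vocabulary service.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance document; the element names are illustrative only.
SAMPLE = """<observation>
  <property>waterLevel</property>
  <unit>metre</unit>
</observation>"""

# Hypothetical vocabulary content, standing in for the lists of valid values.
VALID_PROPERTIES = {"waterLevel", "flowRate"}
# Pairs (a, b) for which the 'foo' relation is assumed to hold.
FOO_RELATION = {("waterLevel", "metre"), ("flowRate", "cubicMetrePerSecond")}


def node_value(root, path):
    """Return the stripped text of the first node matching path, or None."""
    node = root.find(path)
    return node.text.strip() if node is not None and node.text else None


def in_list(root, path, valid_values):
    """Check (a): the value of the selected node appears in the list."""
    return node_value(root, path) in valid_values


def related(root, path_a, path_b, relation):
    """Check (b): the values of nodes A and B are a pair in the relation."""
    return (node_value(root, path_a), node_value(root, path_b)) in relation


root = ET.fromstring(SAMPLE)
print(in_list(root, "property", VALID_PROPERTIES))
print(related(root, "property", "unit", FOO_RELATION))
```

Both checks return a plain true/false answer, and a missing node simply fails the check. Note that `xml.etree.ElementTree` supports only a subset of XPath, so the paths here are deliberately simple.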

In all cases, the list of valid values is maintained in RDF (SKOS), with the 'foo' relations recorded in one or other of the lists. The lists are exposed through SPARQL services with a simple HTTP concept-retrieval service interface on top. Schematron invokes these services through the document() function.
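To make the vocabulary lookup concrete, the sketch below builds the kind of SPARQL ASK query that could sit behind each check, and the GET URL a caller might dereference. It is an assumption-laden illustration: the endpoint and concept URIs are hypothetical, list membership is assumed to be modelled with `skos:inScheme`, and the request goes straight to a SPARQL endpoint using the standard `query` parameter rather than through the concept-retrieval interface described above, whose URL pattern is not given here.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

SKOS = "http://www.w3.org/2004/02/skos/core#"

# Hypothetical URIs, used for illustration only.
ENDPOINT = "http://example.org/sparql"
SCHEME = "http://example.org/vocab/properties"
CONCEPT = "http://example.org/vocab/properties/waterLevel"


def membership_query(concept_uri, scheme_uri):
    """ASK whether a concept belongs to a SKOS concept scheme (the 'list')."""
    return (
        f"PREFIX skos: <{SKOS}>\n"
        f"ASK {{ <{concept_uri}> skos:inScheme <{scheme_uri}> }}"
    )


def relation_query(subject_uri, predicate_uri, object_uri):
    """ASK whether the given relation holds between two concepts."""
    return f"ASK {{ <{subject_uri}> <{predicate_uri}> <{object_uri}> }}"


def request_url(endpoint, query):
    """Build a GET URL carrying the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


query = membership_query(CONCEPT, SCHEME)
url = request_url(ENDPOINT, query)
print(url)
```

A URL of this shape is what a Schematron rule could pass to document(). The sketch does not validate or escape the URIs it interpolates, which a real service would need to do.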

The XML data is defined by an XML Schema, so validity testing in general may use XPath expressions. However, the calls to the vocabulary services use SPARQL, which does not generate a canonical response and so is not XPath-safe. Therefore the Schematron tests are cast as boolean (true/false) tests.
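One way to see why a boolean test sidesteps the problem: an ASK-style answer carries a single true/false value, so nothing depends on the order or layout of result rows. The sketch below assumes the service answers in the W3C SPARQL Query Results XML Format; the function name and the sample response are illustrative and not taken from the original service.

```python
import xml.etree.ElementTree as ET

# Namespace of the W3C SPARQL Query Results XML Format.
NS = {"sr": "http://www.w3.org/2005/sparql-results#"}

# Illustrative ASK response; a real one would come from the vocabulary service.
SAMPLE_RESPONSE = """<?xml version="1.0"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head/>
  <boolean>true</boolean>
</sparql>"""


def ask_result(xml_text):
    """Reduce a SPARQL ASK response to a Python bool."""
    root = ET.fromstring(xml_text)
    node = root.find("sr:boolean", NS)
    if node is None or node.text is None:
        raise ValueError("response is not a SPARQL ASK result")
    return node.text.strip() == "true"


print(ask_result(SAMPLE_RESPONSE))
```

The caller only ever inspects one element, so the same test works regardless of how the service serializes the rest of the response.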

[Figure: Validation service components.jpg]
