While the Semantic Web has demonstrated considerable value for collaborative contributions to data, adoption in many mission-critical environments requires data to conform to specified patterns. This need for interface definitions spans domains: for instance, validation in a banking context shares many requirements with quality assurance of linked clinical data. Systems like Linked Open Data, which have no formal interface specifications, share these validation needs. Developing standards and tools to meet these requirements can greatly increase the utility and ubiquity of Semantic Web data.
Most data representation languages used in conventional settings offer some sort of input validation, ranging from parsing grammars for domain-specific languages to XML Schema, RelaxNG or Schematron for XML structures. While the distributed nature of RDF affects the notions of "validity", tool chains need to be established to publish interface definitions and ensure data integrity.
This workshop combines discussion of use cases for data validation and interface definition on the Semantic Web with development of technologies to enable those use cases.
The open world requirements for Semantic Web vocabularies, particularly RDF Schema and OWL, lead to powerful data assertions which work with varying levels of inference and integration.
These same requirements have so far limited the expressivity of "schema" languages, making them useful mostly for inference rather than validation.
For instance, there is no standard way to assert that, for the purposes of populating a corporate directory, foaf:Persons must include a foaf:mbox, or that for a health record, a person's height must be less than 193 centimeters.
Necessity has driven individuals, organizations and standards bodies to develop a variety of approaches, ranging from SPARQL ASK queries to detailed description logic modeling in OWL.
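To illustrate the SPARQL ASK approach, the corporate-directory constraint above might be phrased as follows; this is one plausible formulation rather than any standard, and the query reports whether a violation exists rather than whether the data is valid:

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

# Returns true if the data violates the constraint, i.e. if some
# foaf:Person has no foaf:mbox; false means the data passes.
ASK {
  ?person a foaf:Person .
  FILTER NOT EXISTS { ?person foaf:mbox ?mbox }
}
```

A numeric constraint such as the height check could be phrased the same way, e.g. `ASK { ?p ex:height ?h . FILTER (?h >= 193) }`, where `ex:height` is a hypothetical property chosen for illustration. Note the inversion typical of ASK-based validation: the query asks "is there a violation?" rather than "is the data valid?".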
While experts in these tools may easily develop such tool chains, the community as a whole, including potential Semantic Web users, will benefit greatly from validation standards and from commodity tools which implement them.
While the Linked Open Data cloud keeps expanding, the uneven quality of its data limits its use. Organizations cannot use the LOD cloud directly. Instead, they must shield themselves from the inconsistencies it contains by maintaining a local replica, curated using custom-made processes. This curation is expensive, must be repeated every time the local replica is refreshed, and benefits no one else.
A standard way of validating RDF data would allow for a more streamlined, and therefore cheaper, curation process and make it possible to share validation rules for others to use. This could in turn yield a corpus of validation schemas that could collectively be used to improve the LOD cloud.
The goal of this workshop is threefold:
An accompanying Examples of RDF Validation document is intended to provide ideas and inspiration for the Workshop, but is not intended to constrain its scope.
The outcome of this workshop will be reported to the related working groups (Linked Data Platform and RDF-WG) and may be used as input for chartering other work.
Validation standards must address conventional requirements, as well as those brought by users who have so far not been able to adopt Semantic Web tools:
Topics for position papers may include, but are not limited to:
SPARQL ASK compared to a grammar language like XML Schema or RelaxNG.
To ensure productive discussions, the Workshop will include sessions which are primarily technical, but grounded in business needs.
We invite representatives to submit papers that help us bring together knowledge in the following topic areas: