Warning:
This wiki has been archived and is now read-only.

User:Rcygania2/Coverage Requirement

From RDF Data Shapes Working Group

These are working notes for a potential requirement for SHACL.

Make the notion of coverage more explicit.

User Story

  • Jose says that his story about linked data portals lends support to this requirement.

Use Case

I think this would be interesting because it provides:

  • A nice way of defining and extracting valid subgraphs, e.g., for passing on to some other system that has a less flexible internal model, such as one used for visualisation or for transformation to a different format.
  • A nice way of checking which parts of my data a server may ignore, that is, the “closed shape” scenario. Constraints are useful for detecting that data is missing, but they are equally useful for detecting that there is too much data or, in other words, that some processing capacity is missing on the server side. Coverage may allow a client to “validate” the server by checking that it can handle all the important data.
  • A nice way of doing quality control. In a “scruffy” dataset such as DBpedia, what percentage of the data is “covered” by some stricter rules? Are there extra triples floating around that nobody understands?
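The three uses above can be illustrated with a rough sketch. These notes do not define coverage precisely, so the definition used here is an assumption: a triple counts as covered when its subject is an instance of the shape's class and its predicate is either rdf:type or one the shape mentions. The names covered_triples, coverage_ratio and the ex: terms are hypothetical, and triples are plain Python tuples rather than objects from any RDF library.

```python
# Triples are plain (subject, predicate, object) tuples; no RDF library is assumed.
RDF_TYPE = "rdf:type"


def covered_triples(graph, shape_class, shape_predicates):
    """Return the triples 'covered' by a hypothetical shape.

    Assumption (not from the notes): a triple is covered when its subject
    is an instance of shape_class and its predicate is rdf:type or one of
    the predicates the shape mentions.
    """
    focus_nodes = {s for (s, p, o) in graph if p == RDF_TYPE and o == shape_class}
    allowed = set(shape_predicates) | {RDF_TYPE}
    return {(s, p, o) for (s, p, o) in graph if s in focus_nodes and p in allowed}


def coverage_ratio(graph, covered):
    """Fraction of the graph's triples that are covered (1.0 for an empty graph)."""
    return len(covered) / len(graph) if graph else 1.0


graph = {
    ("ex:alice", RDF_TYPE, "ex:Person"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:alice", "ex:shoeSize", "38"),   # not mentioned by the shape
    ("ex:acme", "ex:label", "ACME"),     # subject is not a focus node
}

covered = covered_triples(graph, "ex:Person", ["ex:name"])  # the extractable subgraph
uncovered = graph - covered                                 # what a server may ignore
ratio = coverage_ratio(graph, covered)                      # quality-control figure
```

Under this assumed definition, covered is the subgraph that could be passed on to a less flexible system, uncovered is the data a closed-shape server might drop, and ratio is the percentage-style measure mentioned for scruffy datasets.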

Requirement

But…

  • This can't easily be done with off-the-shelf SPARQL processors, so it is perhaps best treated as an optional notion.
  • What if there are multiple ways of satisfying a constraint? Which triples are then included in the covering graph?
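The second open question can be made concrete with a small sketch. It assumes a hypothetical constraint of the form "the node has at least one of these predicates" and shows two possible answers, neither of which is endorsed by these notes: include every triple that could witness the constraint, or include only one. The function names and the ex: terms are invented for illustration.

```python
def witnesses(graph, node, alternatives):
    """Triples of `node` whose predicate is one of the alternative predicates."""
    return sorted((s, p, o) for (s, p, o) in graph if s == node and p in alternatives)


def covering_union(graph, node, alternatives):
    """Option A (assumed): every witnessing triple goes into the covering graph."""
    return set(witnesses(graph, node, alternatives))


def covering_minimal(graph, node, alternatives):
    """Option B (assumed): only one witness is kept; here, the first in sorted order."""
    return set(witnesses(graph, node, alternatives)[:1])


graph = {
    ("ex:bob", "ex:email", "bob@example.org"),
    ("ex:bob", "ex:phone", "555-0100"),
}
alternatives = {"ex:email", "ex:phone"}  # hypothetical "email OR phone" constraint

union_cover = covering_union(graph, "ex:bob", alternatives)
minimal_cover = covering_minimal(graph, "ex:bob", alternatives)
```

Both results satisfy the constraint, yet they differ in size, and the minimal choice depends on an arbitrary ordering. That is the ambiguity the question points at.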