SHACL-ShEx-Comparison

From RDF Data Shapes Working Group

SHACL - main differences with ShEx

  • On W3C recommendation track
  • Two parts: SHACL Core and SHACL SPARQL. SHACL SPARQL is a superset of SHACL Core.
  • SHACL SPARQL makes SHACL extensible: users can extend the language by creating their own constraint components. The semantics of such extensions are defined.
  • SHACL Core does not require a SPARQL implementation, but it is strongly aligned with SPARQL. Some of the implementations are based on SPARQL.
  • SHACL includes a validation result vocabulary that defines what validation reports and validation results should look like.
  • SHACL has a well-developed mechanism for scoping of validation. A user can say that a given shape applies to all instances of a class (and its subclasses), to specified IRIs, to all objects of triples with a given predicate and/or to all subjects of triples with a given predicate. The advanced SPARQL features (not currently on the recommendation track, only a WG Note) allow additional, more advanced targeting.
  • SHACL has a well-defined mechanism for combining shapes graphs using owl:imports. This makes it easy to re-use and extend shapes graphs.
  • SHACL leaves validation of recursive shapes to implementations. It is not clear what ShEx does.
  • Currently, SHACL has a single syntax: RDF in any serialization. Examples are given in Turtle and JSON-LD. A compact syntax for SHACL is yet to be defined. It was originally on the recommendation track but has been de-scoped. Now that the language is stable, it will be easy to create a compact syntax for SHACL Core. Given the WG timeline, such a syntax is likely to be produced as a WG Note.
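
The four SHACL Core target types described in the scoping bullet above can be sketched in a few lines of Python. This is a minimal illustration, not an implementation of the specification: the triples are plain tuples, the `focus_nodes` and `subclasses_of` helpers and the dictionary keys are hypothetical names chosen to mirror sh:targetClass, sh:targetNode, sh:targetObjectsOf and sh:targetSubjectsOf, and all ex: IRIs are invented.

```python
# Toy data graph: a set of (subject, predicate, object) tuples. Illustrative only.
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

def subclasses_of(data, cls):
    """Return cls plus every class that reaches it via rdfs:subClassOf."""
    found = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in data:
            if p == SUBCLASS_OF and o == current and s not in found:
                found.add(s)
                frontier.append(s)
    return found

def focus_nodes(data, targets):
    """Collect the focus nodes selected by one shape's target declarations.

    `targets` is a hypothetical dict whose optional keys mirror the four
    SHACL Core target types; each value is a list.
    """
    nodes = set(targets.get("targetNode", []))
    for cls in targets.get("targetClass", []):
        classes = subclasses_of(data, cls)
        nodes |= {s for s, p, o in data if p == RDF_TYPE and o in classes}
    for pred in targets.get("targetObjectsOf", []):
        nodes |= {o for s, p, o in data if p == pred}
    for pred in targets.get("targetSubjectsOf", []):
        nodes |= {s for s, p, o in data if p == pred}
    return nodes

data = {
    ("ex:Developer", SUBCLASS_OF, "ex:Person"),
    ("ex:alice", RDF_TYPE, "ex:Developer"),
    ("ex:bob", RDF_TYPE, "ex:Person"),
    ("ex:carol", "ex:knows", "ex:dave"),
}

# Class targets pick up instances of subclasses as well.
print(sorted(focus_nodes(data, {"targetClass": ["ex:Person"]})))
print(sorted(focus_nodes(data, {"targetObjectsOf": ["ex:knows"]})))
```

In this toy graph the class target selects both ex:alice and ex:bob, because ex:Developer is a subclass of ex:Person.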

ShEx - main differences with SHACL

  • Not on W3C recommendation track. ShEx is a Community Group effort.
  • ShEx has a compact syntax. There are 3 syntaxes: ShExC, a compact syntax that is RELAX NG-like; ShExR, an RDF syntax; and ShExJ, a JSON-LD syntax that uses a prescribed structure which holds an object at the top level.
  • ShEx is not an extensible language in the same way SHACL is. Its constructs are limited to what is defined by the CG. Because of that, the most direct comparison of ShEx is with SHACL Core. ShEx supports "Semantic Actions" as callouts to arbitrary functions for extensibility, but such extensions are not interoperable. Their semantics are not defined and they are implementation specific.
  • ShEx doesn't have a vocabulary for defining validation reports and validation results, other than pass/fail. The API section talks about results (http://shex.io/shex-semantics/#the-shaperesults-type), but what is described there is much more limited than what is provided by SHACL.
  • ShEx itself has a limited ability to scope validation. There is a concept of a ShapesMap that specifies which IRIs should be validated against which shapes. It is called a map because it maps IRIs of nodes in an RDF graph to labels of shapes in the ShEx schema. A shapes map must be present for validation to happen. The ShEx primer says that SHACL target properties could also be used for scoping, but the ShEx spec doesn't mention this feature and there are no examples in either the spec or the primer.
  • ShEx has a limited ability to combine "shapes graphs". In fact, ShEx schemas are not necessarily RDF graphs. ShEx contains the shapeExternal mechanism to reference a shape by an IRI and let the ShEx processor look up that shape in some external place, but this feature is under-defined and implementation specific.
  • There is a non-normative API section
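
The ShapesMap idea from the list above, a mapping from node IRIs to shape labels that drives validation and yields pass/fail results, can be sketched as follows. This is a simplified illustration under stated assumptions: the `validate` and `exactly_one` helpers, the representation of a shape as a Python callable, and all ex: and foaf: names are hypothetical and are not part of any ShEx API.

```python
def exactly_one(predicate):
    """Build a toy shape: the node must have exactly one value for `predicate`."""
    def check(data, node):
        return sum(1 for s, p, _ in data if s == node and p == predicate) == 1
    return check

def validate(data, schema, shape_map):
    """Validate only the (node, shape label) pairs listed in the shape map.

    Returns a dict mapping each pair to True (pass) or False (fail).
    """
    return {(node, label): schema[label](data, node) for node, label in shape_map}

# Hypothetical schema with a single labelled shape.
schema = {"ex:PersonShape": exactly_one("foaf:name")}

data = {
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:carol", "foaf:name", "Carol"),
    ("ex:carol", "foaf:name", "Caroline"),
}

# The shape map decides what gets validated; ex:carol is not listed.
shape_map = [("ex:alice", "ex:PersonShape"), ("ex:bob", "ex:PersonShape")]

results = validate(data, schema, shape_map)
for pair, passed in sorted(results.items()):
    print(pair, "pass" if passed else "fail")
```

Nodes that are absent from the map, such as ex:carol here, are never checked, which is the sense in which a shapes map must be present for validation to happen.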

What can be expressed in ShEx, but not in SHACL Core

  • SHACL Core doesn't define TotalDigits and FractionDigits constraint components. These can be defined using SHACL SPARQL or implemented using sh:pattern, sh:minLength etc.
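
To make concrete what a TotalDigits or FractionDigits check has to compute, here is a rough Python sketch that works on the lexical form of a decimal. It is an approximation and not a SHACL SPARQL constraint component: the `digit_counts` and `conforms` helpers are hypothetical, the input is assumed to be a valid xsd:decimal lexical form, and leading zeros and trailing fractional zeros are ignored as an assumed reading of the XSD facets.

```python
def digit_counts(lexical):
    """Return (total digits, fraction digits) for a decimal lexical form.

    Assumes a valid xsd:decimal lexical form such as "-12.340".
    Leading zeros and trailing fractional zeros are not counted.
    """
    text = lexical.strip().lstrip("+-")
    integer_part, _, fraction_part = text.partition(".")
    integer_part = integer_part.lstrip("0")
    fraction_part = fraction_part.rstrip("0")
    return len(integer_part) + len(fraction_part), len(fraction_part)

def conforms(lexical, total_digits=None, fraction_digits=None):
    """Check the optional upper bounds on total and fraction digits."""
    total, fraction = digit_counts(lexical)
    if total_digits is not None and total > total_digits:
        return False
    if fraction_digits is not None and fraction > fraction_digits:
        return False
    return True

print(digit_counts("00123.4500"))                # (5, 2)
print(conforms("1.50", fraction_digits=1))       # True
print(conforms("123456", total_digits=5))        # False
```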

What can be expressed in SHACL Core, but not in ShEx

In addition to the above differences:

  • ShEx doesn't support declaring that the combination of predicate and language tag must be unique. For example, it can't declare that the values of rdfs:label may be any language-tagged strings but that there must be only one per language. SHACL Core offers the sh:uniqueLang constraint component.
  • ShEx doesn't support comparing values of properties. For example, you can't say that for a given resource, the value of ex:startDate must be less than the value of ex:endDate. SHACL Core has Property Pair Constraint Components. Additional constraint components can be defined with SHACL SPARQL
  • ShEx doesn't define any annotations. Both SHACL and ShEx allow the use of user-defined "non validating" annotations. However, to facilitate interoperability of such annotations for use cases like form building or predictable printing of RDF files, SHACL has specifically defined a number of annotation properties, e.g., sh:defaultValue and sh:order.
  • ShEx has only partial support for property paths. Unlike SHACL, it supports them through nested shapes. This doesn't allow specifying the cardinality of the property path values. The nested shapes approach also means that the path must have a definite length, e.g., one can't express the path ex:parent*/ex:lastName, a constraint on the last name of any ancestor.
  • ShEx can't be used to say that a resource that is a value of some property must be a member of a certain class transitively. For example, if there is a class ex:Developer with subclasses ex:JavaDeveloper and ex:JavaScriptDeveloper, both SHACL and ShEx could be used to say that the value of ex:resolvedBy must be a resource with an rdf:type that is a subclass of ex:Developer. SHACL does this by using the sh:class constraint and ShEx can do it by using nested shapes. However, if the class hierarchy changes to insert some subclasses between ex:Developer and ex:JavaDeveloper, the SHACL shape will still work as expected. The ShEx shape will not.
  • ShEx doesn't support qualified cardinality constraints. For example, it can't say that a large project team must have more than 10 team members and that at least one of these team members must have a PMP certification. However, it may be that repeated properties in ShEx can handle this; this is not clear, since their coverage in the spec and the primer is terse.
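
The class-hierarchy example from the list above can be illustrated with a small Python sketch. It contrasts a transitive membership check, in the spirit of sh:class, with a fixed-depth check that only looks one rdfs:subClassOf step up. Treating the fixed-depth check as a stand-in for the nested-shapes approach is an assumption made for illustration, not something the text spells out; the helper names and the inserted class ex:BackendDeveloper are hypothetical.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

def superclasses_of(data, cls):
    """Return cls plus every class reachable from it via rdfs:subClassOf."""
    seen = {cls}
    stack = [cls]
    while stack:
        current = stack.pop()
        for s, p, o in data:
            if s == current and p == SUBCLASS_OF and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen

def types_of(data, node):
    return {o for s, p, o in data if s == node and p == RDF_TYPE}

def is_member_transitive(data, node, cls):
    """True if any rdf:type of node is cls or a direct or indirect subclass of it."""
    return any(cls in superclasses_of(data, t) for t in types_of(data, node))

def is_member_one_step(data, node, cls):
    """Fixed-depth check: the type must be a direct subclass of cls."""
    return any((t, SUBCLASS_OF, cls) in data for t in types_of(data, node))

# Hierarchy after a class has been inserted between Developer and JavaDeveloper.
data = {
    ("ex:BackendDeveloper", SUBCLASS_OF, "ex:Developer"),
    ("ex:JavaDeveloper", SUBCLASS_OF, "ex:BackendDeveloper"),
    ("ex:alice", RDF_TYPE, "ex:JavaDeveloper"),
}

print(is_member_transitive(data, "ex:alice", "ex:Developer"))  # True
print(is_member_one_step(data, "ex:alice", "ex:Developer"))    # False
```

The transitive check keeps working after the hierarchy changes, while the fixed-depth check would have to be rewritten for the new depth.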

ShEx Community View on the differences - outdated

http://weso.github.io/RDFValidation_ESWC16/slides/ShEx_vs_Shacl.pdf

This document was created in the spring of 2016. It identifies the following main differences:

  • ShEx has rules that define the grammar that must be satisfied by a focus node. SHACL has scopes to select nodes for validation and then constraints (rules that focus nodes must satisfy). The presentation also talks about filters, but these are no longer supported.
  • SHACL constraints are conjunctive by default. And ShEx?
  • It is hard to check whether a SHACL shapes graph is well formed. This doesn't seem to be true anymore, at least not according to this testimonial: "Implementing a complete check for SHACL Core is quite easy".
  • Default cardinality in ShEx is a minimum of 1 and a maximum of 1. Default cardinality in SHACL is a minimum of 0 and an unlimited maximum.
  • Differences in the extension mechanism
  • Specifying constraints on repeated properties: repeated properties in ShEx, qualified value shapes in SHACL
  • Handling of recursion
  • SHACL supports defining class members as a scope and has a constraint component in SHACL Core that checks that values are members of specified classes. These features take rdfs:subClassOf statements into account
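
The difference in default cardinality noted in the list above can be made concrete with a short Python sketch. The `cardinality_ok` helper and the two constants are hypothetical names; the defaults simply encode the values stated in the text (1 and 1 for ShEx, 0 and unlimited for SHACL).

```python
import math

# Defaults as stated above: (minimum, maximum).
SHEX_DEFAULT = (1, 1)
SHACL_DEFAULT = (0, math.inf)

def cardinality_ok(count, defaults, min_count=None, max_count=None):
    """Check a value count against explicit bounds, falling back to defaults."""
    lo = defaults[0] if min_count is None else min_count
    hi = defaults[1] if max_count is None else max_count
    return lo <= count <= hi

# A property with no explicit cardinality and no values in the data.
print(cardinality_ok(0, SHACL_DEFAULT))  # True
print(cardinality_ok(0, SHEX_DEFAULT))   # False

# The same property with three values.
print(cardinality_ok(3, SHACL_DEFAULT))  # True
print(cardinality_ok(3, SHEX_DEFAULT))   # False
```

A constraint written without explicit cardinality therefore means "exactly one" under the ShEx default and "any number" under the SHACL default, which matters when translating shapes between the two languages.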