W3C W3C Member Submission

Shape Expressions 1.0 Definition

W3C Member Submission 2 June 2014

This version:
http://www.w3.org/submissions/2014/SUBM-shex-defn-20140602/
Latest published version:
http://www.w3.org/submissions/shex-defn/
Author:
Harold Solbrig,
Eric Prud'hommeaux,

Abstract

Shape Expressions associate RDF graphs with labeled patterns called "shapes". Shapes can be used for validation, documentation and transformation of RDF data.

Shape Expressions express formal constraints on the content of RDF graphs and is intended to be used to validate RDF documents, communicate expected graph patterns for interfaces and to generate forms and validation code. This document describes the formal semantics of the Shape Expressions language through the use of the Z Specification Language. The accompanying Shape Expressions Primer provides an informal introduction to the language.


Status of This Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications can be found in the W3C technical reports index at http://www.w3.org/TR/.

This document describes the Shape Expressions langauge developed as a community effort. It is being submitted to W3C so that it can inform the development of a future RDF Data Shape specification.

By publishing this document, W3C acknowledges that the Submitting Members have made a formal Submission request to W3C for discussion. Publication of this document by W3C indicates no endorsement of its content by W3C, nor that W3C has, is, or will be allocating any resources to the issues addressed by it. This document is not the product of a chartered W3C group, but is published as potential input to the W3C Process. A W3C Team Comment has been published in conjunction with this Member Submission. Publication of acknowledged Member Submissions at the W3C site is one of the benefits of W3C Membership. Please consult the requirements associated with Member Submissions of section 3.3 of the W3C Patent Policy. Please consult the complete list of acknowledged W3C Member Submissions.

1 Introduction
2 The RDF Data Model in Z
3 Shape Expression Evaluation
4 Rule Evaluation
5 Action evaluation
6 Parsing Rules
7 Appendix
References

1. Introduction

The Shape Expressions Language (ShEx) is used to specify formal constraints on the content of RDF graphs and are intended to be used to validate RDF documents, communicate expected graph patterns for interfaces and to generate forms and validation code. ShEx can be used to:

Information about the use, grammar and syntax of ShEx can be found at http://www.w3.org/2013/ShEx/. This document describes the formal semantics of the ShEx language using the Z specification language, beginning with a Z specification of the characteristics of an RDF Graph that are referenced by ShEx.

The output from ExtractZed.xsl produces text parsable with fuzz -d .

2 The RDF Data Model in Z

The formal definition for an RDF graph in RDF 1.1 Concepts and Abstract Syntax[RDF11-CONCEPTS]:

“An RDF graph is a set of RDF Triples

“An RDF triple consists of three components:

  • the subject, which is an IRI or a blank node
  • the predicate, which is an IRI
  • the object, which is an IRI, a literal or a blank node

“... IRIs, literals and blank nodes are distinct and distinguishable.”

provides a definition in Z:

[Graph, ℙ Triple]
\begin{zed}
Graph == \power Triple
\end{zed}
        

The ShEx language treats IRIs and blank nodes as primitive types, which are defined as Z free types:

[IRI, BlankNode]
\begin{zed}
[IRI, BlankNode]
\end{zed}

        

The ShEx language can express constraints on both the type and content of literals, which are modeled separately:

“A literal in an RDF graph consists of two or three elements:

  • a lexical form, being a Unicode string...
  • a datatype IRI, being an IRI
  • if and only if the datatype IRI is
    http://www.w3.org/1999/02/22-rdf-syntax-ns#langString, a non-empty language tag as defined in [BCP47]. The language tag MUST be well-formed according to section 2.2.9 of [BCP47].”

This is modelled by String and LanguageTag as free types:

[String, LanguageTag]
\begin{zed}
[String, LanguageTag]
\end{zed}

        
and using them in the definition the two flavors of RDFLiteral, plain literal and typed literal::
TypedLiteral [lexicalForm:String ; dataType : IRI]
PlainLiteral [lexicalForm:String ; langTag : String]
RDFLiteral ::= pl ⟨⟨PlainLiteral⟩⟩ | tl ⟨⟨TypedLiteral⟩⟩
\begin{zed}
TypedLiteral \defs [lexicalForm:String ; dataType : IRI | dataType \neq XSD\_String] \\
PlainLiteral \defs [lexicalForm:String ; dataType : IRI; langTag : LanguageTag | \\
\t3 dataType = XSD\_String] \\
RDFLiteral ::= pl \ldata PlainLiteral \rdata | tl \ldata TypedLiteral \rdata \\ 
\end{zed}

        
RDFTerm is defined as:

IRIs, literals and blank nodes are collectively known as RDF terms

RDFTerm ::= iri ⟨⟨IRI⟩⟩ | literal ⟨⟨RDFLiteral⟩⟩ | bnode ⟨⟨BlankNode⟩⟩
\begin{zed}
RDFTerm ::= iri \ldata IRI \rdata | literal \ldata RDFLiteral \rdata | bnode \ldata BlankNode \rdata
\end{zed}

        
The definition of RDF Triple is modelled as a tuple consisting of three constrained RDFTerms:
Triple
s, p, o : RDFTerm
iri s IRI bnode s BlankNode
iri p IRI
iri o IRI bnode o BlankNode literal o RDFLiteral
\begin{schema}{Triple}
   s,p,o : RDFTerm
\where
   iri \entryFor s \in IRI \lor bnode \entryFor s \in BlankNode \\
   iri \entryFor p \in IRI \\
   iri \entryFor o \in IRI \lor bnode \entryFor o \in BlankNode \lor literal \entryFor o \in RDFLiteral
\end{schema}

        

2.1 RDF Access Functions

The ShEx language uses the following functions:

2.1.1 triplesForSubject

– return set of triples in a graph triples whose subject is a given RDFTerm

Triple
triplesForSubject : RDFTerm
subj : RDFTerm; g : Graph triplesForSubject subj g = {t : g | t.s = subj}
\begin{gendef}
   triplesForSubject: RDFTerm \fun Graph \fun Graph
\where
   \forall subj: RDFTerm; g: Graph @ triplesForSubject~subj~g = \{t: g | t.s = subj \}
\end{gendef}

2.1.2 triplesForObject

– return set of triples in a graph triples whose object is a given RDFTerm

Triple
triplesForObject : RDFTerm
obj : RDFTerm; g : Graph triplesForObject obj g = {t : g | t.o = obj}
\begin{gendef}
   triplesForObject: RDFTerm \fun Graph \fun Graph
\where
   \forall obj: RDFTerm; g: Graph @ triplesForObject~obj~g = \{t: g | t.o = obj \}
\end{gendef}

2.2 Well Known URIs

The following URIs are referenced explicitly in the ShEx implementation:

XSD_STRING, SHEX_IRI, SHEX_BNODE, SHEX_LITERAL, SHEX_NONLITERAL : IRI
 
\begin{axdef}
XSD\_String, SHEX\_IRI, SHEX\_BNODE, SHEX\_LITERAL, SHEX\_NONLITERAL : IRI
\end{axdef}

        

The above definitions of Graph, Triple, RDFTerm and their components will now be used to describe the relationship between a ShEx Schema and an RDF graph.

3 Shape Expression Evaluation

A Shape Expression Schema is a collection of labeled rules where exactly one rule in the collection is identified as the outermost or “starting” rule. In addition, any rule that is referenced within the Schema is also itself a member of the Schema Formally:

Schema
rules : Label Rule
start : Label
start dom rules
r : ran rules
      (rran group ⇒ (groupr).rule ∈ dom rules) ∧
      (rran andran (andr) ⊆ dom rules) ∧
      (rran xorran (xorr) ⊆ dom rules) ∧
      (rran arc ∧ (arcr).valueSpecran valueRef
            (valueRef ⋅(arcr).valueSpec) ∈ dom rules)
\begin{schema}{Schema}
   rules : Label \pfun Rule \\
   start : Label
\where
   start \in \dom rules \\
\forall r: \ran rules @ \\
\t1 (r \in \ran group \implies (group \entryFor r).rule \in \dom rules) \land \\
\t1 (r \in \ran and \implies \ran (and \entryFor r) \subseteq \dom rules) \land \\
\t1 (r \in \ran xor \implies \ran (xor \entryFor r) \subseteq \dom rules) \land \\ 
\t1 (r \in \ran arc \land (arc \entryFor r).valueSpec \in \ran valueRef \implies \\
\t2 (valueRef \entryFor (arc \entryFor r).valueSpec) \in \dom rules)
\end{schema}

While existing ShEx implementations define a rule Label as being either an IRI or a BlankNode, the type of Label does not impact the evaluation semantics. For our purposes, we can simply define it as a separate free type:

[Label]
\begin{zed}
[Label]
\end{zed}

The validity of a given RDF Graph is determined by taking the start Rule of a ShEx Schema and a reference IRI and evaluating the validity of the Rule against the supplied graph.

Formally, the evaluate function takes a Schema, a Graph and a reference IRI and, if the start Rule in the Schema, in the context of the starting Schema and graph, returns either nomatch (z) or pass (p) then the function returns pass. In all other cases, the function returns fail (f ).

evaluate : SchemaGraphIRIOptValidity
s : Schema ; g : Graph ; i : IRI ; v : OptValidity ; ec : EvalContext |
     ec.graph = gec.schema = s
evaluatesgi =
    if evalRule ec(iri i)(s.ruless.start) ∈ {z, p}
    then p else f
\begin{gendef}
   evaluate: Schema \fun Graph \fun IRI \fun OptValidity
\where
   \forall s: Schema; g: Graph; i: IRI; v: OptValidity; ec: EvalContext | \\
\t1 ec.graph = g \land ec.schema = s @ \\
   evaluate ~s ~g ~i = \\
\t1 \IF evalRule ~ec ~ (iri~ i)~ (s.rules~ s.start) \in \{\nomatch, \pass\} \\
\t1 \THEN \pass \ELSE \fail
\end{gendef}
        

4 Rule Evaluation

A ShEx Rule is a set of constraints that can be evaluated against a reference RDFTerm in the context of a given Schema and RDF Graph:

EvalContext [schema : Schema ; graph : Graph]
\begin{zed}
EvalContext \defs [schema: Schema; graph: Graph] \\
\end{zed}

Formally, the evalRule function takes an EvalContext, a reference RDFTerm and a Rule and returns one of the following:

OptValidity ::= p | f | z | | ε
\begin{zed}
OptValidity ::= \pass | \fail | \nomatch | \none | \error \\
\end{zed}
        
A Rule can take one of five forms. Each will be formally described later in this document, but informally they are:
Rule ::= arc⟨⟨ArcRule⟩⟩ |
    rarc⟨⟨RevArcRule⟩⟩ |
    group⟨⟨GroupRule⟩⟩ |
    and⟨⟨AndRule⟩⟩ |
    xor⟨⟨XorRule⟩⟩
\begin{zed}
Rule ::= arc \ldata ArcRule \rdata | \\
\t1 rarc \ldata RevArcRule \rdata | \\
\t1 group \ldata GroupRule \rdata | \\
\t1 and \ldata AndRule \rdata | \\
\t1 xor \ldata XorRule \rdata \\
\end{zed}
        
evalRule : EvalContextRDFTermRuleOptValidity
ec : EvalContext ; i : RDFTerm ; r : RuleevalRule ec i r =
    if r ∈ ran arc
        then evalArcRule ec i(arcr)
    else if r ∈ ran rarc
        then evalRefArcRule ec i(arcr)
    else if r ∈ ran group
        then evalGroupRule ec i(arcr)
    else if r ∈ ran and
        then evalAndRule ec i(arcr)
    else
        evalXorRule ec i(arcr)
\begin{gendef}
   evalRule: EvalContext \fun RDFTerm \fun Rule \fun OptValidity
\where
   \forall ec: EvalContext; i:RDFTerm; r: Rule @ evalRule~ec~i~r = \\
\t1 \IF r \in \ran arc \\
\t2 \THEN evalArcRule~ec~i~(arc \entryFor r) \\
\t1 \ELSE \IF r \in \ran rarc \\
\t2 \THEN evalRevArcRule~ec~i~(rarc \entryFor r) \\
\t1 \ELSE \IF r \in \ran group \\
\t2 \THEN evalGroupRule~ec~i~(group \entryFor r) \\
\t1 \ELSE \IF r \in \ran and \\
\t2 \THEN evalAndRule~ec~i~(and \entryFor r) \\
\t1 \ELSE \\
\t2 evalXorRule~ec~i~(xor \entryFor r)
\end{gendef}
        
The evalRulefunction de-references the supplied Label and invokes evalRule with the result. This is not explicitly represented because the Z specification language does not allow cyclic dependencies. This function is undefined if Label is not in EvalContext

evalRule′ : EvalContextRDFTermLabelOptValidity
ec : EvalContext ; l : Labell ∈ dom ec.schema.shapes
\begin{gendef}
   evalRule': EvalContext \fun RDFTerm \fun Label \fun OptValidity
\where
   \forall ec: EvalContext; l:Label @ l \in \dom ec.schema.rules
\end{gendef}

4.1 ArcRule Evaluation

The ArcRule is used to select the subset of the graph having a given predicate or predicates and to determine whether the cardinality and/or “type” of this subset matches a supplied criteria. The rule itself consists of a PredicateFilter to select the triples, an ObjectSpecification to evaluate the result, an optional min and max cardinality and a (possibly empty) set of Actions:

ArcRule
filter : PredicateFilter
valueSpec : ObjectSpecification
min, max : ℕ[0 ..1]
actions :Action
(#min = 1 ∧ #max = 1) ⇒ value minvalue max
\begin{schema}{ArcRule}
   filter: PredicateFilter \\
   valueSpec: ObjectSpecification \\
   min, max: \optional[\nat] \\
   actions: \power Action
\where
   (\# min = 1 \land \# max = 1) \implies value~ min \leq value~ max
\end{schema}
ArcRule evaluation consists of:
  1. Select the subset of the EvalContext Graph with the supplied subject and predicates matching PredicateFilter
  2. Evaluate the cardinality and return the result if it doesn’t pass
  3. Evaluate the object of each of the triples in the set against ObjectSpecification. If any of the evaluations fail, return fail (f ).
  4. Return the result of evaluating actions against the matching triples.
evalArcRule : EvalContextRDFTermArcRuleOptValidity
ec : EvalContext ; s : RDFTerm ; ar : ArcRule ; sg : Graph |
    sg = evalPredicateFilter ar.filter (triplesForSubject s ec.graph) ∙
evalArcRule ec s ar =
    if evalCardinality sg ar.min ar.maxp
        then evalCardinality sg ar.min ar.max
    else if evalObjectSpecification ec ar.valueSpec sgp
        then evalObjectSpecification ec ar.valueSpec sg
    else
        dispatchar.actions sg ec
\begin{gendef}
   evalArcRule: EvalContext \fun RDFTerm \fun ArcRule \fun OptValidity
\where
   \forall ec: EvalContext; s: RDFTerm; ar: ArcRule; sg: Graph  | \\ 
\t1 sg = evalPredicateFilter~ar.filter~(triplesForSubject~ s~ ec.graph) @ \\
evalArcRule~ec~s~ar = \\
\t1 \IF evalCardinality~sg~ar.min~ar.max \neq \pass \\
\t2 \THEN evalCardinality~sg~ar.min~ar.max \\
\t1 \ELSE \IF evalObjectSpecification~ec~ar.valueSpec~sg \neq \pass \\
\t2 \THEN evalObjectSpecification~ec~ar.valueSpec~sg \\
\t1 \ELSE \\
\t2  dispatch~ ar.actions~ sg~ ec 
\end{gendef}

4.1.1 PredicateFilter Validation

A PredicateFilter can be one of:

  • pfIRI - the IRI of a specific predicate or the IRIstem that defines a set of predicates
  • pfWild - an expression that matches any predicate except those matching the (possibly empty) set of IRIorStems
IRIorStem ::= iosi⟨⟨IRI⟩⟩ | ioss⟨⟨IRIstem⟩⟩
PredicateFilter ::= pfIRI ⟨⟨IRIorStem⟩⟩ | pfWild ⟨⟨ℙ IRIorStem⟩⟩
\begin{zed}
IRIorStem ::= iosi \ldata IRI \rdata | ioss \ldata IRIstem \rdata \\
PredicateFilter ::= pfIRI \ldata IRIorStem \rdata | pfWild \ldata \power IRIorStem \rdata
\end{zed}
An IRIstem matches any IRI whose stringified representation begins with the stringified representation of IRIstem according to standard IRI matching rules [RFC3987]. This is represented by the function:
[IRIstem]
\begin{zed}
[IRIstem]
\end{zed}
IRIstemRange : IRIstem ⇸ ℙ IRI
\begin{gendef}
	IRIstemRange: IRIstem \pfun \power IRI
\end{gendef}
evalPredicateFilter returns all of the triples in a Graph whose predicate matches the supplied PredicateFilter:
evalPredicateFilter : PredicateFilterGraphGraph
f : PredicateFilter ; g : GraphevalPredicateFilter f g =
    if f ∈ ran pfIRI then evalIRIorStem(pfIRIf) g
    else evalWild(pfWildf) g
\begin{gendef}
   evalPredicateFilter : PredicateFilter \pfun Graph \pfun Graph
\where
   \forall f: PredicateFilter; g: Graph @ evalPredicateFilter~ f ~g = \\
\t1 \IF f \in \ran pfIRI \THEN evalIRIorStem~ (pfIRI \entryFor f)~ g \\
\t1 \ELSE evalWild~ (pfWild \entryFor f)~ g
\end{gendef}
evalIRIorStem returns all of the triples in a Graph matching the supplied IRIorStem
evalIRIorStem : IRIorStemGraphGraph
e : IRIorStem ; g : GraphevalIRIorStem e g =
    if e ∈ ran iosi then {t : g | irit.p = iosie}
    else {t : g | irit.pIRIstemRange(iosie)}
\begin{gendef}
   evalIRIorStem: IRIorStem \pfun Graph \pfun Graph
\where
   \forall e: IRIorStem; g: Graph @ evalIRIorStem~ e ~g = \\
\t1 \IF e \in \ran iosi \THEN \{t : g | iri \entryFor t.p = iosi \entryFor e \} \\
\t1 \ELSE \{t: g | iri \entryFor t.p \in IRIstemRange~ (ioss \entryFor e) \}
\end{gendef}
evalWild returns all of the triples in a Graph that do not match an entry in the set of IRIorStems.
evalWild :IRIorStemGraphGraph
es : ℙ IRIorStem ; g : GraphevalWild es g =
    {t : g | t ∉ ⋃ {e : esevalIRIorStemeg}}
\begin{gendef}
   evalWild: \power IRIorStem \pfun Graph \pfun Graph
\where
   \forall es: \power IRIorStem; g: Graph @ evalWild ~es ~g = \\
\t1 \{t: g | t \notin \bigcup \{e: es @ evalIRIorStem~e~g\} \}
\end{gendef}

4.1.2 ObjectSpecification Evaluation

ObjectSpecification specifies a set of possible values for an RDFTerm and takes one of the following forms:

  • ValueType - matches Literals having a specified data type
  • ValueSet - matches IRIs or Literals that match one or more of the expressions in the specified set
  • ValueWild - matches any target except those matching the (possibly empty) set of IRIstems
  • ValueReference - matches any target that is considered valid according the Rule identified by Label.
MatchValue ::= mviri ⟨⟨IRI⟩⟩ | mviris ⟨⟨IRIstem⟩⟩ |
     mvlit ⟨⟨RDFLiteral⟩⟩
ObjectSpecification ::= valueType ⟨⟨IRI⟩⟩ |
     valueSet ⟨⟨ℙ MatchValue⟩⟩ |
     osWild ⟨⟨ℙ MatchValue⟩⟩ |
     valueRef ⟨⟨Label⟩⟩
\begin{zed}
MatchValue ::= mviri \ldata IRI \rdata | mviris \ldata IRIstem \rdata | \\
\t2 mvlit \ldata RDFLiteral \rdata \\
ObjectSpecification ::= valueType \ldata IRI \rdata | \\
\t1 valueSet \ldata \power MatchValue \rdata | \\
\t1 osWild \ldata \power MatchValue \rdata | \\
\t1 valueRef \ldata Label \rdata
\end{zed}

4.1.3 evalCardinality

– evaluates the cardinality the supplied graph.

  • If the graph has no elements and:
    • min value is 0 – nomatch (z)
    • min value isn’t specified or is > 0 – none ()
  • Otherwise:
    • If number of elements in graph < min or > max fail (f )
    • Otherwise – pass (p)

evalCardinality : Graph → ℕ[0..1] → ℕ[0..1] → OptValidity
g : Graph ; min , max : ℕ[0 ..1] ∙ evalCardinality g min max =
    if #min = 1 ∧ #g = 0 ∧ value min = 0
        then z
    else if #g = 0
        then
    else if (#min = 1 ∧ #g < value min) ∨
            (#max = 1 ∧ #g > value max)
        then f
    else p
\begin{gendef}
   evalCardinality: Graph \fun \optional[\nat] \fun \optional[\nat] \fun OptValidity
\where
   \forall g: Graph; min, max: \optional[\nat] @ evalCardinality~g~min~max = \\
   \t1 \IF \# min = 1 \land \# g = 0 \land  value~ min = 0 \\
\t2 \THEN \nomatch \\
\t1 \ELSE \IF \# g = 0 \\
\t2 \THEN \none \\
\t1 \ELSE \IF (\# min = 1 \land \# g < value~ min) \lor \\
\t3 (\# max = 1 \land \# g > value~ max) \\
\t2 \THEN \fail \\
\t1 \ELSE \pass
\end{gendef}

evalObjectSpecification – returns pass (p) if all of the triples in a Graph match the supplied ObjectSpecification, otherwise fail (f )

evalObjectSpecification : EvalContextObjectSpecificationGraph
    OptValidity
ec : EvalContext ; os : ObjectSpecification ; g : Graph
evalObjectSpecification ec os g =
    ift : gevalObjectSpecificationTriple ec os t.o = p then p
    else f
\begin{gendef}
   evalObjectSpecification: EvalContext \fun ObjectSpecification \fun Graph \fun \\
\t1 OptValidity
\where
   \forall ec: EvalContext; os: ObjectSpecification; g: Graph @ \\ evalObjectSpecification~ec~os~g = \\
\t1 \IF \forall t: g @ evalObjectSpecificationTriple~ec~os~t.o = \pass \THEN \pass \\
\t1 \ELSE \fail
\end{gendef}
evalObjectSpecificationTriple : EvalContextObjectSpecification
    RDFTermOptValidity
ec : EvalContext ; os : ObjectSpecification ; n : RDFTerm
evalObjectSpecificationTriple ec os n =
    if os ∈ ran valueType
        then evalValueType (valueTypeos) n
    else if os ∈ ran valueSet
        then evalTermSet (valueSetos) n
    else if os ∈ ran osWild
        then evalTermWild (osWildos) n
    else
        then evalReference ec (valueRefos) n
\begin{gendef}
   evalObjectSpecificationTriple: EvalContext \fun ObjectSpecification \fun \\
\t1 RDFTerm \fun OptValidity
\where
   \forall ec:EvalContext; os: ObjectSpecification; n: RDFTerm @ \\
  evalObjectSpecificationTriple~ec~os~n = \\
\t1 \IF os \in \ran valueType \THEN \\
\t2 evalValueType~(valueType \entryFor os)~n \\
\t1 \ELSE \IF os \in \ran valueSet \THEN \\
\t2 evalTermSet~(valueSet \entryFor os)~n \\
\t1 \ELSE \IF os \in \ran osWild \THEN \\
\t2 evalTermWild~(osWild \entryFor os)~n \\
\t1 \ELSE \\
\t2 evalTermReference~ec~(valueRef \entryFor os)~n
\end{gendef}

evalValueType – returns pass if the supplied RDFTerm is:

  • type literal and whose dataType matches ValueType
  • type IRI and ValueType is type RDF_Literal
evalValueType : IRIRDFTermOptValidity
vt : IRI ; n : RDFTerm ; l : RDFLiteralevalValueType vt n =
    if vt = SHEX_IRIn ∈ ran iri then p
    else if vt = SHEX_BNODEn ∈ ran bnode then p
    else if vt = SHEX_NONLITERAL ∧ (n ∈ ran irin ∈ ran bnode) then p
    else if vt = SHEX_LITERALn ∈ ran literal then p
    else if n ∈ ran literall = (literaln) ∧
         ((l ∈ ran pl ∧ (pll). dataType = vt) ∨
         (l ∈ ran tl ∧ (tll). dataType = vt))
        then p
    else f
\begin{gendef}
   evalValueType: IRI \pfun RDFTerm \pfun OptValidity
\where
   \forall vt: IRI; n: RDFTerm; l: RDFLiteral @ evalValueType~vt~ n = \\
\t1 \IF vt = SHEX\_IRI \land  n \in \ran iri \THEN \pass \\
\t1 \ELSE \IF vt = SHEX\_BNODE \land  n \in \ran bnode \THEN \pass \\
\t1 \ELSE \IF vt = SHEX\_NONLITERAL \land  (n \in \ran iri \lor n \in \ran bnode) \THEN \pass \\
\t1 \ELSE \IF vt = SHEX\_LITERAL \land  n \in \ran literal \THEN \pass \\
\t1 \ELSE \IF n \in \ran literal \land l = (literal \entryFor n) \land \\
\t2 ((l \in \ran pl \land (pl \entryFor l).dataType = vt) \lor \\
\t2  (l \in \ran tl \land (tl \entryFor l).dataType = vt)) \THEN \pass \\
\t1 \ELSE \fail
\end{gendef}

evalTermSet – return p if the supplied RDFTerm is a member of MatchValue

evalTermSet :MatchValueRDFTermOptValidity
mvs : ℙ MatchValue ; n : RDFTermevalTermSet mvs n =
    ifmv : mvs
        ((mv ∈ ran mvirin ∈ ran iri ∧ (irin) = mvirimv) ∨
        (mv ∈ ran mvirisn ∈ ran iri
            (irin) ∈ IRIstemRange (mvirismv)) ∨
        (n ∈ ran literalmvlitmv = literaln))
        then p
    else f
\begin{gendef}
   evalTermSet: \power MatchValue \pfun RDFTerm \pfun OptValidity
\where
   \forall mvs: \power MatchValue; n: RDFTerm @ evalTermSet~mvs~n = \\
\t1	\IF \exists mv: mvs @ \\
\t2 ((mv \in \ran mviri \land n \in \ran iri \land (iri \entryFor n) = mviri \entryFor mv) \lor \\
\t2  (mv \in \ran mviris \land n \in \ran iri \land  \\
\t3 (iri \entryFor n) \in IRIstemRange ~ (mviris \entryFor mv) ) \lor \\
\t2  (n \in \ran literal \land mvlit \entryFor mv = literal \entryFor n)) \\
\t1 \THEN \pass \\
\t1 \ELSE \fail
\end{gendef}

evalTermWild – return pass (p) if the supplied RDFTerm is not a member of MatchValue.

evalTermWild :MatchValueRDFTermOptValidity
mvs : ℙ MatchValue ; n : RDFTermevalTermWild mvs n =
    if evalTermSet mvs n = p then f else p
\begin{gendef}
   evalTermWild: \power MatchValue \fun RDFTerm \fun OptValidity
\where
   \forall mvs: \power MatchValue; n: RDFTerm @ evalTermWild~mvs~n = \\
\t1 \IF evalTermSet~ mvs~ n = \pass \THEN \fail \ELSE \pass
\end{gendef}

evalTermReference – return p if the subgraph of the EvalContext graph whose subjects match the supplied RDFTerm satisfies the ValueReference rule.

evalTermReference : EvalContextLabelRDFTermOptValidity
ec : EvalContext ; vr : Label ; n : RDFTerm
    evalTermReference ec vr n =
    if n ∉ ran literal then evalRule′ ec n vr
    else f
\begin{gendef}
   evalTermReference: EvalContext \pfun Label \pfun RDFTerm \pfun OptValidity
\where
	\forall ec: EvalContext; vr: Label; n: RDFTerm @ \\ evalTermReference~ec~vr~n = \\
\t1 \IF n \notin \ran literal \THEN evalRule'~ec~n~vr \\
\t1 \ELSE \fail
\end{gendef}

4.2 RevArcRule Evaluation

The RevArcRule is used to select the subset of the graph having a given predicate or predicates and to determine whether the cardinality and/or “type” of this subset matches a supplied criteria. The rule itself consists of a PredicateFilter to select the triples, an SubjectSpecification to evaluate the result, a optional min and max cardinality and a (possibly empty) set of Actions:

RevArcRule
filter : PredicateFilter
valueSpec : ObjectSpecification
min, max : ℕ[0 ..1]
actions :Action
(#min = 1 ∧ #max = 1) ⇒ value minvalue max
\begin{schema}{RevArcRule}
   filter: PredicateFilter \\
   valueSpec: SubjectSpecification \\
   min, max: \optional[\nat] \\
   actions: \power Action
\where
   (\# min = 1 \land \# max = 1) \implies value~ min \leq value~ max
\end{schema}
RevArcRule evaluation consists of:
  1. Select the subset of the EvalContext Graph with the supplied object and predicates matching PredicateFilter
  2. Evaluate the cardinality and return the result if it doesn’t pass
  3. Evaluate the object of each of the triples in the set against SubjectSpecification. If any of the evaluations fail, return fail (f ).
  4. Return the result of evaluating actions against the matching triples.
evalRevArcRule : EvalContextRDFTermRevArcRuleOptValidity
ec : EvalContext ; o : RDFTerm ; rar : RevArcRule ; og : Graph |
    og = evalPredicateFilter rar.filter (triplesForObject o ec.graph) ∙
evalArcRule ec o rar =
    if evalCardinality og rar.min rar.maxp
        then evalCardinality og rar.min rar.max
    else if evalObjectSpecification ec rar.valueSpec ogp
        then evalObjectSpecification ec rar.valueSpec og
    else
        dispatchar.actions og ec
\begin{gendef}
   evalRevArcRule: EvalContext \fun RDFTerm \fun RevArcRule \fun OptValidity
\where
   \forall ec: EvalContext; o: RDFTerm; rar: RevArcRule; og: Graph  | \\ 
\t1 og = evalPredicateFilter~rar.filter~(triplesForObject~ o~ ec.graph) @ \\
evalRevArcRule~ec~o~rar = \\
\t1 \IF evalCardinality~og~rar.min~rar.max \neq \pass \\
\t2 \THEN evalCardinality~og~rar.min~rar.max \\
\t1 \ELSE \IF evalSubjectSpecification~ec~rar.valueSpec~og \neq \pass \\
\t2 \THEN evalSubjectSpecification~ec~rar.valueSpec~og \\
\t1 \ELSE \\
\t2  dispatch~ rar.actions~ og~ ec 
\end{gendef}

4.2.1 SubjectSpecification Evaluation

SubjectSpecification specifies a set of possible values for an RDFTerm and takes one of the following forms:

  • SubjectSet - matches IRIs or IRIstems that match one or more of the expressions in the specified set
  • SubjectWild - matches any target except those matching the (possibly empty) set of IRIstems
  • subjectRef - matches any target that is considered valid according the Rule identified by Label.
SubjectSpecification ::= subjectSet⟨⟨ℙ MatchValue⟩⟩ |
    ssWild ⟨⟨ℙ MatchValue⟩⟩ |
    subjectRef ⟨⟨Label⟩⟩
\begin{zed}
SubjectSpecification ::=  subjectSet \ldata \power MatchValue \rdata | \\
\t1 ssWild \ldata \power MatchValue \rdata | \\
\t1 subjectRef \ldata Label \rdata
\end{zed}

paragraphevalSubjectSpecification – returns pass (p) if all of the triples in a Graph match the supplied SubjectSpecification, otherwise fail (f )

evalSubjectSpecification : EvalContextSubjectSpecificationGraph
    OptValidity
ec : EvalContext ; o : Graph ; ss : SubjectSpecification ; g : Graph
evalSubjectSpecification ec ss g =
    ift evalSubjectSpecificationTriple ec ss t.o = p then p
    else f
\begin{gendef}
   evalSubjectSpecification: EvalContext \fun SubjectSpecification \fun Graph \fun \\
\t1 OptValidity
\where
   \forall ec: EvalContext; ss: SubjectSpecification; g: Graph @ \\ evalSubjectSpecification~ec~ss~g = \\
\t1 \IF \forall t: g @ evalSubjectSpecificationTriple~ec~ss~t.o = \pass \THEN \pass \\
\t1 \ELSE \fail
\end{gendef}
evalSubjectSpecificationTriple : EvalContextSubjectSpecification
    RDFTermOptValidity
ec : EvalContext ; ss : SubjectSpecificationTriple ; n : RDFTerm
evalSubjectSpecificationTriple ec ss n =
    if ss ∈ ran subjectSet
        then evalTermSet(subjectSetss) n
    else if ss ∈ ran ssWild
        then evalTermWild(ssWildss) n
    else
         evalTermReferenceec(subjectRefss) n
\begin{gendef}
   evalSubjectSpecificationTriple: EvalContext \fun SubjectSpecification \fun \\
\t1 RDFTerm \fun OptValidity
\where
   \forall ec:EvalContext; ss: SubjectSpecification; n: RDFTerm @ \\
  evalSubjectSpecificationTriple~ec~ss~n = \\
\t1 \IF ss \in \ran subjectSet \THEN \\
\t2 evalTermSet~(subjectSet \entryFor ss)~n \\
\t1 \ELSE \IF ss \in \ran ssWild \THEN \\
\t2 evalTermWild~(ssWild \entryFor ss)~n \\
\t1 \ELSE \\
\t2 evalTermReference~ec~(subjectRef \entryFor ss)~n
\end{gendef}

4.3 GroupRule Evaluation

A GroupRule serves two purposes. The first is to declare that a referenced rule is to be treated as “optional”, which, in this case means that if (a) the referenced rule returned none (), (meaning an ArcRule was encountered that had no matching predicates and a non-zero minimum cardinality) the group rule returns nomatch (z). An optional GroupRule also treats an error situation as a fail (f ).

The second purpose of a group rule is to allow a set of external actions to be evaluated whenever the referenced rule returns pass (p).

OPT ::= OPTIONAL | REQUIRE
GroupRule ≙ [ rule : Label ; opt : OPT ; actions : ℙ Action ]
\begin{zed}
OPT ::= OPTIONAL | REQUIRED \\
GroupRule \defs [rule: Label; opt: OPT; actions: \power Action ] \\
\end{zed}
evalGroupRule evaluates Rule, applies opt and, if the result is pass (p), evaluates the actions with respect to the passing Graph
evalGroupRule : EvalContextRDFTermGroupRuleOptValidity
ec : EvalContext ; i : RDFTerm ; gr : GroupRuleevalGroupRule ec i gr =
    if evalRule′ ec i gr.rule = gr.opt = OPTIONAL
        then z
    else if evalRule′ ec i gr.rule = εgr.opt = OPTIONAL
        then z
    else if evalRule′ ec i gr.rule = p
        then dispatch gr.actions ec
    else
        evalRule′ ec i gr.rule
\begin{gendef}
   evalGroupRule: EvalContext \fun RDFTerm \fun GroupRule \fun OptValidity
\where
   \forall ec: EvalContext; i: RDFTerm; gr: GroupRule @ evalGroupRule~ ec~ i ~ gr = \\
\t1 \IF evalRule'~ec~i~gr.rule = \none \land gr.opt = OPTIONAL \\
\t2 \THEN \nomatch \\
\t1 \ELSE \IF evalRule'~ec~i~gr.rule = \error \land gr.opt = OPTIONAL \\
\t2 \THEN \fail \\
\t1 \ELSE \IF evalRule'~ec~i~gr.rule = \pass \\
\t2 \THEN dispatch~ gr.actions~ \emptyset ~ ec \\
\t1 \ELSE evalRule'~ec~i~gr.rule
\end{gendef}

4.4 AndRule evaluation

An AndRule consists of a set of one or more Rules, whose evaluation is determined by the following table:

And z f p ε
1 1 f1 f1 ε1
z 1 z1 f1 p1 ε1
f f1 f1 f1 f1 ε1
p f1 p1 f1 p1 ε1
ε ε1 ε1 ε1 ε1 ε1

The formal implementation of which will be realized in the corresponding function:

And : OptValidityOptValidityOptValidity
\begin{gendef}
   And: OptValidity \fun OptValidity \fun OptValidity
\end{gendef}

Observing that the above table is a monoid with nomatch (z) as an identity element, evalAndRule can be applied using the standard functional pattern:

AndRule == seq₁ Label
\begin{zed}
AndRule == \seq_1 Label
\end{zed}
evalAndRule : EvalContextRDFTermAndRuleOptValidity
ec : EvalContext ; i : RDFTerm ; r : AndRule
evalAndRule ec i r =
     foldr And z (map (evalRule′ ec i) r)
\begin{gendef}
   evalAndRule: EvalContext \fun RDFTerm \fun AndRule \fun OptValidity
\where
   \forall ec: EvalContext; i: RDFTerm; r: AndRule  @ \\
	evalAndRule~ec~i~r = \\
\t1	foldr~ And~ \nomatch ~ (map~(evalRule'~ec~i)~r)
\end{gendef}

4.5 XorRule Evaluation

An XorRule consists of a set of one or more Rules, whose evaluation is determined by the following table:

Or z f p ε
1 z1 1 p1 ε1
z z1 z1 z1 p1 ε1
f 1 z1 f1 p1 ε1
p p1 p1 p1 ε23 ε1
ε ε1 ε1 ε1 ε1 ε1
The formal implementation of which will be realized in the corresponding function:
Xor : OptValidityOptValidityOptValidity
\begin{gendef}
   Xor: OptValidity \fun OptValidity \fun OptValidity
\end{gendef}
As with the And function above, Xor is a monoid whose identity is fail (f ) resulting in the following definition for evalXorRule
XorRule == seq₁ Label
\begin{zed}
XorRule == \seq_1 Label
\end{zed}
evalXorRule : EvalContextRDFTermXorRuleOptValidity
ec : EvalContext ; i : RDFTerm ; r : XorRule
evalXorRule ec i r =
     foldr Xor f (map (evalRule′ ec i) r)
\begin{gendef}
   evalXorRule: EvalContext \fun RDFTerm \fun XorRule \fun OptValidity
\where
   \forall ec: EvalContext; i: RDFTerm; r: XorRule  @ \\
	evalXorRule~ec~i~r = \\
\t1	foldr~ Xor~ \fail ~ (map~(evalRule'~ ec ~i)~ r)
\end{gendef}

5 Action evaluation

The dispatch function allows the evaluation / execution of arbitrary external “Actions”. While the evaluation of an Action can (obviously) have side effects outside the context of the ShEx environment, it must be side effect free within the execution context. In particular, an Action may not change anything in the EvalContext The action dispatcher exists to allow external events to happen. Parameters:

The dispatch function usually returns pass (p) or fail (f ), although there may also be cases for other OptValidity values in certain circumstances. The dispatch function always returns pass (p) if the set of actions is empty.

[Action]
\begin{zed}
[Action]
\end{zed}
dispatch : ℙ ActionGraphEvalContextOptValidity
as : ℙ Action : g Graph : ec EvalContext
     as = dispatch as g ec = p
\begin{gendef}
   dispatch: \power Action \fun Graph \fun EvalContext \fun OptValidity
\where
   \forall as: \power Action; g: Graph; ec: EvalContext @ \\
\t1 as = \emptyset \implies dispatch~ as~ g ~ec = \pass
\end{gendef}

6 Parsing Rules

The semantics of the productions and terminals from the Turtle specification are defined in Turtle Terse RDF Triple Language section 7 Parsing.

The following grammar rules are mapped to the abstract syntax:

Productions:

[1]    ShExDoc    ::=    statement*
[2]    statement    ::=    directive | shape
[3]    directive    ::=    sparqlPrefix | sparqlBase | "start" "=" ( label | typeSpec CODE* )
let b = a fresh blank node
typeSpec.setLabel(b)
new ArcRule()
[4]    sparqlPrefix    ::=    "PREFIX" PNAME_NS IRIREF
[5]    sparqlBase    ::=    "BASE" IRIREF
[6]    shape    ::=    label typeSpec CODE*
if (CODE.SIZE > 0)
  let r = new GroupRule(false, typeSpec)
  r.label = label
else
  typeSpec.label = label
[7]    typeSpec    ::=    "{" OrExpression "}"
[8]    OrExpression    ::=    l:AndExpression ( "|" r:AndExpression )*
if (r)
  new XorRule(l, r)
else
  l
[9]    AndExpression    ::=    l:UnaryExpression ( "," r:UnaryExpression )*
if (r)
  new AndRule(l, r)
else
  l
[10]    UnaryExpression    ::=    arc
| "(" OrExpression ")" opt:"?"? CODE*
new GroupRule(OrExpression, opt, CODE)
[11]    label    ::=    iri | BlankNode
[12]    arc    ::=    nameClass valueSpec default? repeatCount? properties? CODE*
new ArcRule(nameClass, valueSpec, default, repeatCount, properties, CODE)
[13]    nameClass    ::=    iriStem
pfIRI(iriStem)
| "." exclusions
pfWild(exclusions)
| 'a'
pfIRI(rdf:type)
[14]    valueSpec    ::=    "@" label
ValueReference(label)
| typeSpec
let b = a fresh blank node
typeSpec.setLabel(b)
| valueSet
new ValueSet(valueSet)
| object
new ValueType(object)
| exclusions
new ValueWild(exclusions)
[15]    iriStem    ::=    iri ("~")?
[16]    default    ::=    "=" object
[17]    properties    ::=    "[" iri object ( ";" ( iri object )? )* "]"
[18]    exclusions    ::=    ( "-" iriStem )*
[19]    repeatCount    ::=    "*" | "+" | "?" | "{" INTEGER ( "," (INTEGER)? )? "}"
[20]    valueSet    ::=    "(" (object)+ ")"
[21]    object    ::=    iriStem | BlankNode | literal
[22]    literal    ::=    RDFLiteral | NumericLiteral | BooleanLiteral
[23]    NumericLiteral    ::=    INTEGER | DECIMAL | DOUBLE
[24]    RDFLiteral    ::=    String ( LANGTAG | "^^" iri )?
[25]    BooleanLiteral    ::=    "true" | "false"
[26]    String    ::=    STRING_LITERAL1 | STRING_LITERAL2 | STRING_LITERAL_LONG1 | STRING_LITERAL_LONG2
[27]    iri    ::=    IRIREF | PrefixedName
[28]    PrefixedName    ::=    PNAME_LN | PNAME_NS
[29]    BlankNode    ::=    BLANK_NODE_LABEL | ANON

Terminals Not in SPARQL:

[30]    <CODE>    ::=    "%" ( [a-zA-Z+#_] ([a-zA-Z0-9+#_])* )? "{" ( [^%\\] | "\\" "%" )* "%" "}"

Ordering (informative)

A Shape Expressions schema with no semantic actions can be treated as unordered. In practice, error messages can be much clearer, and extension functions much simpler to write if the engine follows the lexical order when validating the members of a conjunction. Likewise, form and interface code generation will be more predictable and controllable if the Shape Expressions are processed in a specific order.

7 Appendix

7.1 Foldr

The foldr function is the standard functional pattern, which takes a binary function of type T, an identity function for type T, a sequence of type T and returns the result of applying the function to the right to left pairs of the sequence.

\begin{gendef}[T]
   foldr: (T \fun T \fun T) \fun T \fun \seq T \fun T
\where
   \forall f:T \fun T \fun T; id: T; xs: \seq T @ foldr~f~id~xs = \\
\t1 \IF xs = \langle \rangle \THEN id \\
\t1 \ELSE f~(head~xs)~(foldr~f~id~(tail~xs))
\end{gendef}

7.2 Map

The map function takes a function from type A to type B and applies it to all members in the supplied sequence

[A, B]
map : ( AB ) → seq A → seq B
f : AB ; xs : seq A ∙ map f xs =
    if xs = ⟨⟩ then ⟨⟩⌢
    elsef(head xs)⟩--map f(tail xs)
\begin{gendef}[A,B]
   map: (A \fun B) \fun \seq A \fun \seq B
\where
   \forall f: A \fun B; xs: \seq A @ map~f~xs = \\
\t1 \IF xs = \langle \rangle \THEN \langle \rangle \\
\t1 \ELSE \langle f~(head~xs) \rangle \cat map~f~(tail~xs)
\end{gendef}

6.3 Helper Functions

Z uses the notion of free type definitions in the form:

FreeType ::= constructor⟨⟨source⟩⟩
\begin{syntax}
FreeType ::= constructor \ldata source \rdata
\end{syntax}
which introduces a collection of constants of type FreeType, one for each element of the set source. constructor is an injective function from source to FreeType:
constructor ::= sourceFreeType
\begin{syntax}
constructor ::= source \inj FreeType
\end{syntax}
In the models that follow, there is a need to reverse this – to find the source for a given FreeType instance. The function exists for this purpose. As an example, if one were to define:
TravelDirections ::= bus⟨⟨BusDirections⟩⟩ | walking⟨⟨WalkingDirections⟩⟩
\begin{syntax}
TravelDirections ::= bus \ldata BusDirections \rdata | \ walking \ldata WalkingDirections \rdata
\end{syntax}
If one is supplied with an instance of Travel , one can convert it to the appropriate type by:
x : TravelDirections
if x ran bus then busx else walkingx
\begin{syntax}
x : TravelDirections
\where
\IF x \in \ran bus \THEN bus \entryFor x \ELSE walking \entryFor x
\end{syntax}

[X, Y]
_⋅_: (X Y) × Y X
y : Y; f : X Y f⋅y = (μx : dom f | f x = y)
\begin{gendef}[X, Y]
    \_~  \entryFor \_~   : (X \inj Y) \cross Y \pfun X
   %
   \where %
   \forall y : Y; f : X \inj Y @  f \entryFor y =
   (\mu x : \dom f | f~x = y)
\end{gendef}

One way to represent optional values is a set with one member. We take that route here and introduce a bit of syntactic sugar to show our intent:

T[0..1] == {s : ℙ T | #s ≤ 1}
\begin{zed} 
  \optional[T] == \{ s : \power T \mid \# s \leq 1 \} \\
\end{zed}

And a shorthand for addressing the content:

[T]
value:T T
s :T value s = (μ e : T | e s)
\begin{gendef}[T]
   value: \power T \pfun T
\where
   \forall s: \power T @ value~ s = (\mu e : T | e \in s) 
\end{gendef}

A. References

A.1 Normative References

[RDF11-CONCEPTS]
Richard Cyganiak, David Wood, Markus Lanthaler. RDF 1.1 Concepts and Abstract Syntax. W3C Recommendation, 25 February 2014. URL: http://www.w3.org/TR/2014/REC-rdf11-concepts-20140225/. The latest edition is available at http://www.w3.org/TR/rdf11-concepts/
[RFC3987]
M. Dürst and M. Suignard. Internationalized resource identifiers (IRIs). IETF Request For Comment January 2005. URL: http://tools.ietf.org/html/rfc3987.
[BCP47]
RA Phillips and M. Davis. Tags for identifying language. IETF Best CurrentPractices, September 2009. URL: http://tools.ietf.org/html/bcp47.

A.2 Informative References

[Z Notation Reference Manual]
The Z Notation: A Reference Manual, Second Edition, J. M. Spivey, Prentice Hall, 1992.
[Fuzz 2000]
Release Notes For Fuzz 2000, J. M. Spivey.
Grp
z1
z z1
f f1
p p1
ε ε1
All
f6
z p7
f f11
p p10
ε f21