Introduction
ShEx enables description and validation of RDF data through declarations of expected properties, their cardinalities, and the type and structure of their objects (and, less frequently, subjects).
ShEx is analogous to W3C XML Schema and RelaxNG for XML, JSON-schema for JSON, and DDL for SQL.
As an example, imagine an issue tracking interface where an Issue is submitted by some person and potentially assigned to the same person or someone else.
These issues can have a status of unassigned or assigned.
[Figure: example RDF graph with the nodes inst:Issue1, inst:User2 and inst:User4. Arc labels: ex:state (value ex:assigned), ex:reportedBy, ex:assignedTo, foaf:name (values "Bob Smith" and "Joe Smith") and foaf:mbox (values mailto:bob@... and mailto:joe@...).]
To capture this in ShEx, we create "shapes" to describe the different nodes in the graph:
An issue must have:
exactly one ex:state with a value of ex:unassigned or ex:assigned,
exactly one ex:reportedBy which references a user,
an optional (indicated by a ?) ex:assignedTo which references a user.
A user referenced by the ex:reportedBy or ex:assignedTo property must have:
exactly one foaf:name with a value of an RDF Literal with a datatype of xsd:string,
one or more (indicated by a +) foaf:mbox values, each an RDF IRI (as opposed to a blank node or literal).
schema
PREFIX iface: <http://myco.example/interface#>
PREFIX ex: <http://ex.example/#>
PREFIX foaf: <http://xmlns.com/foaf/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
start = iface:IssueShape

iface:IssueShape {
  ex:state (ex:unassigned ex:assigned),
  ex:reportedBy @iface:PersonShape,
  ex:assignedTo @iface:PersonShape?
}

iface:PersonShape {
  foaf:name xsd:string,
  foaf:mbox IRI+
}
passing data
PREFIX inst: <http://example.org/instances/#>
PREFIX ex: <http://ex.example/#>
PREFIX foaf: <http://xmlns.com/foaf/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
inst:Issue1 a ex:Issue ;
  ex:state ex:assigned ;
  ex:reportedBy inst:User2 ;
  ex:assignedTo inst:User2 .

inst:User2 a foaf:Person ;
  foaf:name "Bob Smith" ;
  foaf:mbox <mailto:bob@example.org> ;
  foaf:mbox <mailto:rs@example.org> .
failing data
PREFIX inst: <http://example.org/instances/#>
PREFIX ex: <http://ex.example/#>
PREFIX foaf: <http://xmlns.com/foaf/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
inst:Issue3 a ex:Issue ;
  ex:state ex:unsinged ;
  ex:assignedTo inst:User4 .

inst:User4 a foaf:Person ;
  foaf:name "Joe Smith" , "Joseph Smith" ;
  foaf:mbox <mailto:joe@example.org> .
Note that ShEx has the same tokens (syntactic representations of IRIs, PNames, BNodes, Literals) as Turtle and SPARQL.
The process of validating an RDF graph against a schema involves matching a focus node, a node in an RDF graph, against a shape in the schema.
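To make the idea of matching a focus node against a shape concrete, here is a minimal Python sketch. It is not a ShEx implementation: the tuple-based graph encoding, the SHAPES table and the conforms function are all assumptions made for illustration, and the sketch only handles the non-recursive shapes from the example above. Predicates that the shape does not mention (such as rdf:type) are simply ignored.

```python
# Illustrative encoding (an assumption, not part of ShEx): triples are
# (subject, predicate, object) tuples, IRIs are plain strings, and literals
# are (lexical form, datatype) pairs.
XSD_STRING = "xsd:string"

def is_iri(value):
    return isinstance(value, str)

def any_node(value):
    return True

def literal_of(datatype):
    return lambda value: isinstance(value, tuple) and value[1] == datatype

def one_of(*allowed):
    return lambda value: value in allowed

# predicate -> (value test, min count, max count or None, referenced shape or None)
SHAPES = {
    "IssueShape": {
        "ex:state": (one_of("ex:unassigned", "ex:assigned"), 1, 1, None),
        "ex:reportedBy": (any_node, 1, 1, "PersonShape"),
        "ex:assignedTo": (any_node, 0, 1, "PersonShape"),
    },
    "PersonShape": {
        "foaf:name": (literal_of(XSD_STRING), 1, 1, None),
        "foaf:mbox": (is_iri, 1, None, None),
    },
}

def conforms(graph, focus, shape_name):
    """Return True if the focus node matches the named shape."""
    for predicate, (test, lo, hi, ref) in SHAPES[shape_name].items():
        values = [o for s, p, o in graph if s == focus and p == predicate]
        if len(values) < lo or (hi is not None and len(values) > hi):
            return False  # wrong number of arcs for this predicate
        for value in values:
            if not test(value):
                return False  # value is not of the expected kind
            if ref is not None and not conforms(graph, value, ref):
                return False  # referenced node fails its own shape
    return True

GRAPH = [
    ("inst:Issue1", "rdf:type", "ex:Issue"),
    ("inst:Issue1", "ex:state", "ex:assigned"),
    ("inst:Issue1", "ex:reportedBy", "inst:User2"),
    ("inst:Issue1", "ex:assignedTo", "inst:User2"),
    ("inst:User2", "rdf:type", "foaf:Person"),
    ("inst:User2", "foaf:name", ("Bob Smith", XSD_STRING)),
    ("inst:User2", "foaf:mbox", "mailto:bob@example.org"),
    ("inst:User2", "foaf:mbox", "mailto:rs@example.org"),
    ("inst:Issue3", "rdf:type", "ex:Issue"),
    ("inst:Issue3", "ex:state", "ex:unsinged"),
    ("inst:Issue3", "ex:assignedTo", "inst:User4"),
    ("inst:User4", "rdf:type", "foaf:Person"),
    ("inst:User4", "foaf:name", ("Joe Smith", XSD_STRING)),
    ("inst:User4", "foaf:name", ("Joseph Smith", XSD_STRING)),
    ("inst:User4", "foaf:mbox", "mailto:joe@example.org"),
]

print(conforms(GRAPH, "inst:Issue1", "IssueShape"))  # True
print(conforms(GRAPH, "inst:Issue3", "IssueShape"))  # False
```

In this sketch inst:Issue3 is rejected as soon as its ex:state value is found to be outside the permitted set. It would also be rejected for lacking ex:reportedBy, and inst:User4 fails the person shape on its own because it has two foaf:name values.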
So far, we've introduced:
cardinality (how many arcs with a given predicate are expected): optional (?), one or more (+)
shape references (e.g. @iface:PersonShape)
enumerations of permissible values (e.g. ex:unassigned or ex:assigned)
literal datatype (note that "Bob Smith" is a literal with a datatype of xsd:string)
RDF node type (e.g. IRI, previously known as URI reference)
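One way to read the cardinality markers is as (minimum, maximum) ranges on the number of matching arcs. The short Python sketch below shows that reading. The CARDINALITY table and the count_allowed helper are hypothetical names introduced only for this illustration, and the * marker (zero or more) is included for completeness even though the example schema does not use it.

```python
# Marker -> (minimum, maximum); None means "no upper bound".
CARDINALITY = {
    "": (1, 1),      # no marker: exactly one
    "?": (0, 1),     # optional
    "+": (1, None),  # one or more
    "*": (0, None),  # zero or more (not used in the example schema)
}

def count_allowed(marker, count):
    """Return True if `count` arcs satisfy the given cardinality marker."""
    lo, hi = CARDINALITY[marker]
    return count >= lo and (hi is None or count <= hi)

print(count_allowed("?", 0))  # True: ex:assignedTo may be absent
print(count_allowed("", 2))   # False: two foaf:name values where one is expected
print(count_allowed("+", 2))  # True: two foaf:mbox values
```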
Let's see how much more language you want or need: