ShEx/RDF serialization

From Semantic Web Standards

Obsolete - please see the JSON-LD interpretation of ShExJ as defined in ShEx Semantics


RDF serialization

(work in progress)

This schema is defined such that the shape and rule definitions are separated from the semantics (class and property semantics). For more background, see the full discussion about semantics and shapes.

This split makes it possible to divide the SHEX schema definitions into multiple topics without bloating the shape definition standard, so that SHEX itself is only about shapes. Later extensions could add the extra semantics.

Besides SHEX itself, we could consider the following topics:

  1. SHEX + semantics definitions (see level 2 for more detailed ideas)
  2. JSON-LD @context support
  3. Data statistics, which can be used both for understanding the structure better and for improving query performance
  4. (maybe later) Issues related to the conversion from XML to RDF, i.e. from RELAX NG to SHEX

Overview and discussions

See discussion SHEX format for general discussion points related to the SHEX standard.

An overview of the schema can be seen in the image (to be uploaded; see the schema below).

Rule Groups

Each group of rules may include other groups of rules. A group of rules can be either an OrGroup (one of the rules in the group should apply) or an AndGroup (all rules within the group should apply).

A ResourceShape is always an AndGroup, meaning that all rules must apply for the instances it matches. When a ResourceShape should behave as an OrGroup, one can include a single OrGroup within the ResourceShape.
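
The nesting of AndGroups and OrGroups described above can be sketched in a few lines of Python. This is only an illustration, not part of the schema: the `RuleGroup` class, the representation of a rule as a predicate over a dictionary, and the example rules are all hypothetical. It also assumes that "one of the rules should apply" means "at least one"; an "exactly one" reading would need a different check.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Union

# Hypothetical representation: a rule is a predicate over a node's data.
Rule = Callable[[dict], bool]


@dataclass
class RuleGroup:
    """Illustrative stand-in for se:AndRuleGroup / se:OrRuleGroup."""
    kind: str  # "and" or "or"
    rules: List[Union["RuleGroup", Rule]] = field(default_factory=list)

    def matches(self, node: dict) -> bool:
        # Evaluate nested groups recursively and plain rules directly.
        results = [
            r.matches(node) if isinstance(r, RuleGroup) else r(node)
            for r in self.rules
        ]
        if self.kind == "and":
            return all(results)
        if self.kind == "or":
            # Assumption: "one of the rules" is read as "at least one".
            return any(results)
        raise ValueError(f"unknown group kind: {self.kind!r}")


# A ResourceShape is an AndGroup; the choice is expressed by one nested OrGroup.
has_name = lambda n: "name" in n
has_mbox = lambda n: "mbox" in n
has_phone = lambda n: "phone" in n

contact = RuleGroup("or", [has_mbox, has_phone])
person_shape = RuleGroup("and", [has_name, contact])
```

Here `person_shape` accepts a node that has a name and at least one of the two contact properties, which mirrors a ResourceShape that includes a single OrGroup.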

In the RDF format each group definition has a referable identifier (unless it is defined anonymously), whereas in the SHEX language it does not.

Discussion Point
In the SHEX language it is impossible to define referable Rule Groups, meaning the RDF format has more expressive power. 
Is this a problem and, if so, should we force each instance of AndRule and OrRule to be an anonymous node, or should we add support for naming rule groups in SHEX?

A RuleGroup cannot be instantiated directly and is therefore defined as 'VIRTUAL'; every instance has to be either an AndGroup or an OrGroup.

AndGroup and OrGroup can only be included into other RuleGroups and cannot be referenced by a ShapeProperty. Only ResourceShapes can be referenced by a ShapeProperty.

The multiplicity is defined by the occurs property and is defined once, together with the RuleGroup. It is not defined upon inclusion within another group. See multiplicity for more details.

Discussion Point
We could also define it such that, for each inclusion, we define an extra instance that references the group to be included and states with which 
multiplicity it should be included. The drawback of this solution is that you need to define an 'extra layer' of objects.

Definition would be like
VIRTUAL se:RuleGroup{ 
  se:subGroup @se:IncludeRuleGroup*,
  ....
}

se:IncludeRuleGroup & se:occursRule{
  se:group @se:RuleGroup   
}

Properties

For each property a multiplicity is defined, see multiplicity.

There are 5 different types of properties:

  1. ShapeProperty, which defines a reference to a ResourceShape
  2. ValueProperty, which defines a simple value that is one of the simple types defined by XML Schema (XSD)
  3. EnumerationSetProperty, which defines a set of allowed values. (TODO full class structure)
  4. StemProperty, which defines a pattern which an IRI must match
  5. AnyTypeProperty, which accepts any value; a filter can be given to accept only IRIs (se:IRI), literals (se:Literal) or blank nodes (se:BNode).
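
To make the difference between the last three property types concrete, here is a small Python sketch of the checks they imply. The function names are hypothetical, and the sketch assumes that a stem is matched as a plain IRI prefix and that an empty filter on the "any" property accepts every node kind; neither assumption is fixed by the text above.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"


def matches_enumeration(value: str, allowed: set) -> bool:
    """EnumerationSetProperty: the value must be one of the allowed values."""
    return value in allowed


def matches_stem(iri: str, stem: str) -> bool:
    """StemProperty: assumed here to mean that the IRI starts with the stem."""
    return iri.startswith(stem)


def matches_any(node_kind: str, super_types: set) -> bool:
    """'Any' property: an empty filter accepts everything (assumption);
    otherwise the node kind must be listed in the filter."""
    return not super_types or node_kind in super_types
```

For example, `matches_stem(XSD + "integer", XSD)` holds because the datatype IRI begins with the XSD namespace, which is how the schema below constrains rs:valueType.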

Multiplicity

The multiplicity for both group inclusion and properties is defined by the occurs property. This multiplicity is defined with one of the following options:

multiplicity  SHEX definition  RDF serialization
1..1          (no marker)      X rs:occurs rs:Exactly-one .
0..1          ?                X rs:occurs rs:Zero-or-one .
1..N          +                X rs:occurs rs:One-or-many .
0..N          *                X rs:occurs rs:Zero-or-many .
y..z          {y,z}            X rs:minoccurs "y"^^xsd:integer; rs:maxoccurs "z"^^xsd:integer .

Note that for the inclusion of an AndGroup only the first two options are meaningful, since it is not possible to keep the elements of a group together by order when the structure is unordered.
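
The multiplicity options above boil down to a lower and an upper bound on the number of occurrences. The following Python sketch shows one way to read them. The names `OCCURS_BOUNDS`, `bounds` and `count_ok` are hypothetical, and the rs: values are handled as plain strings rather than as RDF terms.

```python
from typing import Optional, Tuple

# (min, max) bounds for each rs:occurs value; None means unbounded.
OCCURS_BOUNDS = {
    "rs:Exactly-one": (1, 1),
    "rs:Zero-or-one": (0, 1),
    "rs:One-or-many": (1, None),
    "rs:Zero-or-many": (0, None),
}


def bounds(occurs: Optional[str] = None,
           min_occurs: Optional[int] = None,
           max_occurs: Optional[int] = None) -> Tuple[int, Optional[int]]:
    """Return (min, max) from either rs:occurs or an explicit y..z pair."""
    if occurs is not None:
        return OCCURS_BOUNDS[occurs]
    if min_occurs is None or max_occurs is None:
        raise ValueError("either occurs or both min_occurs and max_occurs are required")
    return (min_occurs, max_occurs)


def count_ok(count: int, lo: int, hi: Optional[int]) -> bool:
    """Check a number of occurrences against the bounds."""
    return count >= lo and (hi is None or count <= hi)
```

A property that occurs three times passes `count_ok(3, *bounds("rs:One-or-many"))`, while the explicit form `bounds(min_occurs=2, max_occurs=4)` covers the y..z row of the table.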

Discussion Point
xsd:integer has been chosen because it has an unlimited range; or would it be better to use xsd:int, which is limited to 32 bits?
Discussion Point
I would rather name rs:occurs as se:multiplicity so we can name the reverse multiplicity with se:reverseMultiplicity.
Discussion Point
Currently there is no option to define a y..N occurrence.
Discussion Point
ResourceShapes should have a multiplicity of one, because when referenced by a shape property then the multiplicity is already defined in the property.
However, in the current SHEX language it is not possible to define rule groups distinctly or separately. 

rs:ResourceShape & se:AndRuleGroup{
  se:name xsd:string?,
  rdf:type (rs:ResourceShape),
  rs:occurs (rs:Exactly-one)
}

Use of resource shape (http://open-services.net/ns/core#) schema

Discussion Point
In my opinion it would be better to initially define all properties within the SHEX schema. 
(Note that I have a fundamentally different opinion about reusing other schemas.) 
The predicates defined in the resource shape schema have a similar but different semantic meaning. 
I think it would be better to use SKOS to link the SHEX schema predicates to the ones in the resource shape schema 
and maybe only reuse those which have exactly the same meaning.

Encoding of the instances of classes and rdfs:subClassOf

See encoding of instanceof.

Encoding of the greedy matching

See encoding of greedy matching for the discussion about encoding the greedy matching.

Constraint rule

Constraint rules can be assigned to either:

  1. Shapes, which can use any combination of rules based on any of the properties within the shape
  2. Properties, which can use any combination of rules only based on the associated property

Each rule is defined by a separate constraint rule object, which defines the language of the rule and the rule itself. A constraint rule can be defined in either JavaScript or SPARQL.

For now I have kept the rule definition within an xsd:string, but I would like the serialization of a SPARQL rule to be based on SPIN.

Details about the mapping/calling of constraint rules will be added later.
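
As a rough illustration of the constraint rule object described in this section, the Python sketch below models a rule as a language tag plus the rule text held in a plain string. The `ConstraintRule` class and the example rule text are hypothetical; only the two languages come from the description above, and how rules are mapped to shapes or properties is deliberately left out.

```python
from dataclasses import dataclass

# Mirrors the two languages named above (se:js and se:sparql in the schema).
ALLOWED_LANGUAGES = {"js", "sparql"}


@dataclass(frozen=True)
class ConstraintRule:
    """Illustrative stand-in for se:ConstraintRule."""
    language: str
    rule: str  # the rule body, kept as a plain string for now

    def __post_init__(self) -> None:
        # Reject languages outside the enumeration.
        if self.language not in ALLOWED_LANGUAGES:
            raise ValueError(f"unsupported rule language: {self.language!r}")


# Hypothetical rule text, for illustration only.
example = ConstraintRule("sparql", "ASK { ?s ?p ?o }")
```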

SHEX schema in SHEX

(work in progress)

Below is the SHEX RDF definition expressed in the SHEX language itself (its RDF serialization follows further below and validates against itself).

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rs: <http://open-services.net/ns/core#>
PREFIX se: <http://www.w3.org/2013/ShEx/Definition#>

start = se:StartDef
se:StartDef {
  se:items @se:StartDefItem+,
  rdf:type (se:StartDef)
}

se:StartDefItem & se:occursRule {
  se:shape @rs:ResourceShape,
  rdf:type (se:StartDefItem)
}

# Or rule group: alternatives are separated by '|', not ','
se:occursRule {
  rs:occurs (rs:Exactly-one rs:One-or-many rs:Zero-or-many rs:Zero-or-one) |
  (rs:minoccurs xsd:integer, rs:maxoccurs xsd:integer)
}

VIRTUAL se:RuleGroup { 
  rs:occurs (rs:Exactly-one rs:Zero-or-one),
  se:subGroup @se:RuleGroup*,
  rs:property @se:Property*,
  rdf:type (se:RuleGroup se:OrRuleGroup se:AndRuleGroup rs:ResourceShape)
}

se:OrRuleGroup & se:RuleGroup {
  rdf:type (se:OrRuleGroup)
}

se:AndRuleGroup & se:RuleGroup {
  rdf:type (se:AndRuleGroup rs:ResourceShape)
}

rs:ResourceShape & se:AndRuleGroup{
  se:name xsd:string?,
  rdf:type (rs:ResourceShape)
  #rs:occurs (rs:Exactly-one) see discussion point above
}

VIRTUAL se:Property & se:occursRule {
  rs:name xsd:string,
  rs:propDefinition @rdfs:Property,
  se:constraint @se:ConstraintRule*,
  rdf:type (se:Property se:ShapeProperty se:ValueProperty se:EnumerationSetProperty se:StemProperty se:AnyTypeProperty)
}

rdfs:Property {
 # rdf:type (rdfs:Property)
}

se:ShapeProperty & se:Property {
  rs:valueShape @rs:ResourceShape,
  rdf:type (se:ShapeProperty)
}

se:ValueProperty & se:Property {
  rs:valueType xsd:~+,
  rdf:type (se:ValueProperty)
}

se:EnumerationSetProperty & se:Property {
  rs:allowedValue .(IRI)+,
  rdf:type (se:EnumerationSetProperty)
}

se:StemProperty & se:Property {
  se:stem .(IRI)+,
  rdf:type (se:StemProperty)
}

se:AnyTypeProperty & se:Property {
  se:superType (se:IRI se:Literal se:BNode)*,
  rdf:type (se:AnyTypeProperty)
}

se:ConstraintRule { #support for SPIN rules to be added later
  se:language (se:js se:sparql),
  se:rule xsd:string,
  rdf:type (se:ConstraintRule)
}

SHEX schema in RDF

Below is the SHEX RDF definition expressed in SHEX RDF, which has now been tested to be self-validating.

@prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:   <http://www.w3.org/2002/07/owl#> .
@prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rs:    <http://open-services.net/ns/core#> .
@prefix se:    <http://www.w3.org/2013/ShEx/Definition#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

 se:Start a se:StartDef;
     se:items [ a se:StartDefItem;
       rs:occurs rs:Exactly-one;
       se:shape se:StartDef;
     ].

 se:StartDef a rs:ResourceShape;
   rs:occurs rs:Exactly-one;
   rs:property [ a se:ShapeProperty;
     rs:name "items";
     rs:occurs rs:One-or-many;
     rs:propDefinition se:items;
     rs:valueShape se:StartDefItem;
   ];
   rs:property [ a se:EnumerationSetProperty ;
     rs:occurs rs:Exactly-one;
     rs:name "type";
     rs:propDefinition rdf:type;
     rs:allowedValue se:StartDef;
   ].  
   
 se:StartDefItem a rs:ResourceShape;
   rs:occurs rs:Exactly-one;
   se:subGroup se:occursRule;
   rs:property [ a se:ShapeProperty;
     rs:name "shape";
     rs:occurs rs:Exactly-one;
     rs:propDefinition se:shape;
     rs:valueShape rs:ResourceShape;
   ];
   rs:property [ a se:EnumerationSetProperty ;
     rs:occurs rs:Exactly-one;
     rs:name "type";
     rs:propDefinition rdf:type;
     rs:allowedValue se:StartDefItem;
   ]. 

 se:occursRule a se:OrRuleGroup;
   rs:occurs rs:Exactly-one;
   rs:property [ a se:EnumerationSetProperty;
     rs:occurs rs:Exactly-one;
     rs:name "occurs";
     rs:propDefinition rs:occurs;
     rs:allowedValue rs:Exactly-one, rs:One-or-many,rs:Zero-or-many,rs:Zero-or-one;
   ];
   se:subGroup [ a se:AndRuleGroup;
     rs:occurs rs:Exactly-one;
     rs:property [ a se:ValueProperty;
       rs:occurs rs:Exactly-one;
       rs:name "minoccurs";
       rs:propDefinition rs:minoccurs;
       rs:valueType xsd:integer;
     ];
     rs:property [ a se:ValueProperty;
       rs:occurs rs:Exactly-one;
       rs:name "maxoccurs";
       rs:propDefinition rs:maxoccurs;
       rs:valueType xsd:integer;
     ];
   ].  

 se:RuleGroup a rs:ResourceShape;
   rs:occurs rs:Exactly-one;
   #A group may not occur multiple times
   rs:property [ a se:EnumerationSetProperty;
     rs:occurs rs:Exactly-one;
     rs:name "occurs";
     rs:propDefinition rs:occurs;
     rs:allowedValue rs:Exactly-one, rs:Zero-or-one;
   ];
   rs:property [ a se:ShapeProperty;
     rs:name "subGroup";
     rs:occurs rs:Zero-or-many;
     rs:propDefinition se:subGroup;
     rs:valueShape se:RuleGroup;
   ];
   rs:property [ a se:ShapeProperty;
     rs:name "property";
     rs:occurs rs:Zero-or-many;
     rs:propDefinition rs:property;
     rs:valueShape se:Property;
   ];
   rs:property [ a se:EnumerationSetProperty;
       rs:occurs rs:Exactly-one;
       rs:name "type";
       rs:propDefinition rdf:type;
       rs:allowedValue se:RuleGroup, se:OrRuleGroup, se:AndRuleGroup, rs:ResourceShape ;
   ];
   se:subGroup [ a se:OrRuleGroup ; #Define it as VIRTUAL, by forcing it to include one of its 'childs'
     rs:occurs        rs:Exactly-one; 
     se:subGroup        se:OrRuleGroup;
     se:subGroup        se:AndRuleGroup;
   ].

 se:OrRuleGroup a rs:ResourceShape ;
   rs:occurs rs:Exactly-one;
   se:subGroup se:RuleGroup;
   rs:property [ a se:EnumerationSetProperty ;
     rs:occurs rs:Exactly-one;
     rs:name "type";
     rs:propDefinition rdf:type;
     rs:allowedValue se:OrRuleGroup;
  ].  

 se:AndRuleGroup a rs:ResourceShape ;
   rs:occurs rs:Exactly-one;
   se:subGroup se:RuleGroup;
   rs:property [ a se:EnumerationSetProperty ;
     rs:occurs rs:Exactly-one;
     rs:name "type";
     rs:propDefinition rdf:type;
     rs:allowedValue se:AndRuleGroup, rs:ResourceShape;
  ].  
    
 rs:ResourceShape a rs:ResourceShape ;
   rs:occurs rs:Exactly-one;
   se:subGroup se:AndRuleGroup;
   rs:property [ a se:ValueProperty ;
     rs:occurs rs:Zero-or-one;
     rs:name "name";
     rs:propDefinition rs:name;
     rs:valueType xsd:string ;
   ];
   rs:property [ a se:EnumerationSetProperty ;
     rs:occurs rs:Exactly-one;
     rs:name "type";
     rs:propDefinition rdf:type;
     rs:allowedValue rs:ResourceShape;
   ].   

 se:Property a rs:ResourceShape ;
   rs:occurs rs:Exactly-one;
   se:subGroup se:occursRule;
   rs:property [ a se:ValueProperty ;
     rs:occurs rs:Exactly-one;
     rs:name "name";
     rs:propDefinition rs:name;
     rs:valueType xsd:string ;
  ];  
  rs:property [ a se:ShapeProperty ;
     rs:occurs rs:Exactly-one;
     rs:name "propDefinition";
     rs:propDefinition rs:propDefinition;
     rs:valueShape rdfs:Property ;
  ];  
  rs:property [ a se:ShapeProperty ;
     rs:occurs rs:Zero-or-many;
     rs:name "constraint";
     rs:propDefinition se:constraint;
     rs:valueShape se:ConstraintRule ;
  ];  
  rs:property [ a se:EnumerationSetProperty ;
     rs:occurs rs:Exactly-one;
     rs:name "type";
     rs:propDefinition rdf:type;
     rs:allowedValue se:Property, se:ShapeProperty, se:ValueProperty, se:EnumerationSetProperty, se:StemProperty,se:AnyTypeProperty;
  ];
  se:subGroup [ a se:OrRuleGroup ; #Define it as VIRTUAL, by forcing it to include one of its 'childs'
    rs:occurs        rs:Exactly-one; 
    se:subGroup        se:ShapeProperty;
    se:subGroup        se:ValueProperty;
    se:subGroup        se:EnumerationSetProperty;
    se:subGroup        se:StemProperty;
    se:subGroup        se:AnyTypeProperty;
  ].

 rdfs:Property a rs:ResourceShape ;
   rs:occurs rs:Exactly-one.
 #To be defined further
 #   rs:property [ a se:EnumerationSetProperty ;
 #     rs:occurs rs:Exactly-one;
 #     rs:name "type";
 #     rs:propDefinition rdf:type;
 #     rs:allowedValue rdfs:Property;
 #].   
    
 se:ShapeProperty a rs:ResourceShape ;
   rs:occurs rs:Exactly-one;
   se:subGroup se:Property;
   rs:property [ a se:ShapeProperty ;
     rs:occurs rs:One-or-many;
     rs:name "valueShape";
     rs:propDefinition rs:valueShape;
     rs:valueShape rs:ResourceShape ;
   ];  
   rs:property [ a se:EnumerationSetProperty ;
     rs:occurs rs:Exactly-one;
     rs:name "type";
     rs:propDefinition rdf:type;
     rs:allowedValue se:ShapeProperty;
   ]. 

 se:ValueProperty a rs:ResourceShape ;
   rs:occurs rs:Exactly-one;
   se:subGroup se:Property;
   rs:property [ a se:StemProperty ;
     rs:occurs rs:One-or-many;
     rs:name "valueType";
     rs:propDefinition rs:valueType;
     se:stem xsd:;
   ];  
   rs:property [ a se:EnumerationSetProperty ;
     rs:occurs rs:Exactly-one;
     rs:name "type";
     rs:propDefinition rdf:type;
     rs:allowedValue se:ValueProperty;
   ]. 
    
 se:EnumerationSetProperty a rs:ResourceShape ;
   rs:occurs rs:Exactly-one;
   se:subGroup se:Property;
   rs:property [ a se:AnyTypeProperty ;
     rs:occurs rs:One-or-many;
     rs:name "allowedValue";
     rs:propDefinition rs:allowedValue;
     se:superType se:IRI;
   ];  
   rs:property [ a se:EnumerationSetProperty ;
     rs:occurs rs:Exactly-one;
     rs:name "type";
     rs:propDefinition rdf:type;
     rs:allowedValue se:EnumerationSetProperty;
   ].  

 se:StemProperty a rs:ResourceShape ;
   rs:occurs rs:Exactly-one;
   se:subGroup se:Property;
   rs:property [ a se:AnyTypeProperty ;
     rs:occurs rs:One-or-many;
     rs:name "stem";
     rs:propDefinition se:stem;
     se:superType se:IRI ;
   ];    
   rs:property [ a se:EnumerationSetProperty ;
     rs:occurs rs:Exactly-one;
     rs:name "type";
     rs:propDefinition rdf:type;
     rs:allowedValue se:StemProperty;
  ].
   
 se:AnyTypeProperty a rs:ResourceShape ;
   rs:occurs rs:Exactly-one;
   se:subGroup se:Property;
   rs:property [ a se:EnumerationSetProperty ;
     rs:occurs rs:Zero-or-many;
     rs:name "superType";
     rs:propDefinition se:superType;
     rs:allowedValue se:IRI,se:Literal,se:BNode;
   ];    
   rs:property [ a se:EnumerationSetProperty ;
     rs:occurs rs:Exactly-one;
     rs:name "type";
     rs:propDefinition rdf:type;
     rs:allowedValue se:AnyTypeProperty;
  ]. 

 se:ConstraintRule a rs:ResourceShape ;
   rs:occurs rs:Exactly-one;
   rs:property [ a se:EnumerationSetProperty ;
     rs:occurs rs:Exactly-one;
     rs:name "language";
     rs:propDefinition se:language;
     rs:allowedValue se:js,se:sparql;
   ];
   rs:property [ a se:ValueProperty ;
     rs:occurs rs:Exactly-one;
     rs:name "rule";
     rs:propDefinition se:rule;
     rs:valueType xsd:string ;
   ];
   rs:property [ a se:EnumerationSetProperty ;
     rs:occurs rs:Exactly-one;
     rs:name "type";
     rs:propDefinition rdf:type;
     rs:allowedValue se:ConstraintRule;
  ].

Example data

The schema is an example of itself, as it validates against itself.