CheckClauses

From W3C Wiki

Check clauses

Check clauses are one way to implement co-occurrence constraints in a schema language. The name is used by analogy with the `CHECK` clauses of SQL, which enforce, on all rows in a table, constraints that go beyond those expressible in the type system.

The specific proposal outlined here was originally made for XML Schema 1.1 by Paul Biron of Health Level 7; it has been modified in minor details by Michael Sperberg-McQueen in the course of transcribing it here.

This proposal assumes the reader is familiar with ISO SchemaTron ([1]).

Proposal 1

XML Schema 1.1 blesses a specific subset of SchemaTron to be used in appinfo which all schema processors are required to understand and correctly process. That subset consists of the following elements:

  • pattern
  • rule
  • assert
  • report

There are also restrictions on the attributes and children of these elements. To make things short, here are the restrictions in DTD notation:


<!ELEMENT sch:pattern (sch:rule)+>
<!ATTLIST sch:pattern
   id CDATA #IMPLIED
   xmlns:sch CDATA "http://purl.oclc.org/dsdl/schematron" >
<!ELEMENT sch:rule (sch:assert | sch:report)+>
<!ATTLIST sch:rule
   context (.) #FIXED '.'
   id CDATA #IMPLIED>
<!ELEMENT sch:assert (#PCDATA)>
<!ATTLIST sch:assert
   test CDATA #REQUIRED>
<!ELEMENT sch:report (#PCDATA)>
<!ATTLIST sch:report
   test CDATA #REQUIRED>


sch:pattern can appear in complex type definitions as well as in element declarations that have anonymous types (global and local) ... and nowhere else. So, not in model groups, attributes, simple type definitions, etc.

The sch:rule/@context must be '.' (the current node). When used in a complex type, the context implicitly becomes any element that is declared to be of that complex type.

The value of report/@test and assert/@test is limited to boolean combinations of the XPath subset used for fields in identity constraints.

If any report succeeds (@test is true) or if any assert fails (@test is false), then the rule is violated and the element should be considered problematic.
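
The difference between the two elements can be illustrated with a hypothetical rule that states one constraint twice: a report flags the element when its test is true, while an assert flags the element when its test is false. This is only a sketch; it assumes that not() is admitted among the "boolean combinations" allowed in test attributes, which the proposal leaves open.

```xml
<!-- Hypothetical rule: both children express the same constraint -->
<sch:rule context='.' xmlns:sch='http://purl.oclc.org/dsdl/schematron'>
 <!-- report: the element is flagged when the test is TRUE -->
 <sch:report test='./@attr1 and ./@attr2'>
  attr1 and attr2 are mutually exclusive
 </sch:report>
 <!-- assert: the element is flagged when the test is FALSE -->
 <sch:assert test='not(./@attr1 and ./@attr2)'>
  attr1 and attr2 are mutually exclusive
 </sch:assert>
</sch:rule>
```

In practice a schema author would pick one form; report reads naturally for forbidden combinations, assert for required ones.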

To be determined:

  • what happens in the PSVI, in general
  • in particular, whether violations of the report/assert expectations lead to the value of [validity] being `invalid`, or to provision of some [co-occurrence-constraint] property with an appropriate value.

Possible variations / adjustments:

  • precise locations at which the new elements are allowed
  • precise sublanguage of XPath allowed in test attributes

Example 1

Given:

<xs:complexType name='ExclusiveAttrs'>
 <xs:annotation>
  <xs:appinfo xmlns:sch='http://purl.oclc.org/dsdl/schematron'>
   <sch:pattern id='ExclusiveAttrs'>
    <sch:rule context='.'>
     <sch:report test='./@attr1 and ./@attr2'>
      attr1 and attr2 are mutually exclusive
     </sch:report>
    </sch:rule>
   </sch:pattern>
  </xs:appinfo>
 </xs:annotation>
 <xs:attribute name='attr1' use='optional'/>
 <xs:attribute name='attr2' use='optional'/>
</xs:complexType>
<xs:element name='root' type='ExclusiveAttrs'/>


The following documents are valid:

 
<root attr1='foo'/>
<root attr2='foo'/>

while this one is not valid:

 
<root attr1='foo' attr2='bar'/>


Example 2

Given the declarations:

 
<xs:complexType name='ChildConditionalOnAttr'>
 <xs:annotation>
  <xs:appinfo xmlns:sch='http://purl.oclc.org/dsdl/schematron'>
   <sch:pattern id='ChildConditionalOnAttr'>
    <sch:rule context='.'>
     <sch:assert test='(child1 and ./@attr1) or (child2 and ./@attr2)'>
      childN should only be present when @attrN is present
     </sch:assert>
    </sch:rule>
   </sch:pattern>
  </xs:appinfo>
 </xs:annotation>
 <xs:sequence>
  <xs:element name='child1' minOccurs='0'/>
  <xs:element name='child2' minOccurs='0'/>
 </xs:sequence>
 <xs:attribute name='attr1' use='optional'/>
 <xs:attribute name='attr2' use='optional'/>
</xs:complexType>
<xs:element name='root' type='ChildConditionalOnAttr'/>

the following documents are valid:

<root attr1='foo'>
 <child1/>
</root>
<root attr2='foo'>
 <child2/>
</root>

while these are not valid:

 
<root>
 <child1/>
</root>
<root attr2='foo'>
 <child1/>
</root>


Proposal 2

The same as Proposal 1, but instead of blessing the SchemaTron subset for use in appinfo, make it another "clause" in type definitions and element/attribute declarations. That is, the content models of element, simpleType and complexType are extended as follows (in DTD notation):

<!ELEMENT xs:element
 (annotation?, ((simpleType | complexType)?,
  (unique | key | keyref | sch:pattern)*))>
<!ELEMENT xs:complexType
 (annotation?, (simpleContent | complexContent | ((group | all | choice | sequence)?,
    ((attribute | attributeGroup)*, anyAttribute?, sch:pattern*))))>
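
As an illustration, the mutually exclusive attributes of Example 1 might be written as follows under Proposal 2. This is a sketch, not part of the original proposal text: the enclosing xs:schema element and its namespace declarations are added only to make the fragment self-contained, and the position of sch:pattern follows the complexType content model given above.

```xml
<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'
           xmlns:sch='http://purl.oclc.org/dsdl/schematron'>
 <xs:complexType name='ExclusiveAttrs'>
  <xs:attribute name='attr1' use='optional'/>
  <xs:attribute name='attr2' use='optional'/>
  <!-- Under Proposal 2 the pattern is a direct child of the type,
       placed after the attribute declarations; no appinfo wrapper -->
  <sch:pattern id='ExclusiveAttrs'>
   <sch:rule context='.'>
    <sch:report test='./@attr1 and ./@attr2'>
     attr1 and attr2 are mutually exclusive
    </sch:report>
   </sch:rule>
  </sch:pattern>
 </xs:complexType>
 <xs:element name='root' type='ExclusiveAttrs'/>
</xs:schema>
```

Because the constraint is no longer hidden inside an annotation, a processor that does not know the sch:pattern child has no way to skip over it silently.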


Proposal 3

The same as proposals 1 and 2, but instead of allowing the new elements only in appinfo or only in declarations and definitions, allow them to appear in either place.

A 1.1 processor encountering a co-occurrence constraint in either location will enforce it.

A 1.0 processor encountering a co-occurrence constraint in a type definition will consider it an error (and recover however it may choose to recover -- most existing processors will presumably issue an error message and die, or otherwise decline to continue; few may be expected to ignore the co-occurrence constraint; fewer still to process it correctly).

A 1.0 processor encountering a co-occurrence constraint in appinfo, on the other hand, will ignore it and proceed to validate the document against the part of the schema that it does understand.

The rationale for Proposal 3 is that the choice between having a 1.0 processor die and having it soldier on and perform at least a partial validation is probably better left to the schema author than to the Working Group.