UniqueParticleAttribution

From W3C Wiki

Overview

Unique Particle Attribution (UPA) is a constraint on content models. Roughly speaking, it is the XML Schema equivalent of the DTD requirement for Deterministic Element Content, which is itself inherited from SGML. Again roughly speaking, it ensures that content can be validated without looking ahead more than one tag, which allows simple, streaming implementations. See Gotchas below for where these rough statements break down.

Implementation Support

Though UPA is a required constraint of XML Schema, it has been common, at least in the past, for implementations not to report UPA violations. So it is necessary to make sure your processor checks UPA before relying on it to validate a schema.

Example

A simple UPA violation:

  <xs:complexType name="bad1">
    <xs:sequence>
      <xs:choice>
        <xs:element name="A" type="xs:string" minOccurs="0" />
        <xs:element name="B" type="xs:string" minOccurs="0" />
      </xs:choice>
      <xs:choice>
        <xs:element name="A" type="xs:string" minOccurs="0" />
        <xs:element name="C" type="xs:string" minOccurs="0" />
      </xs:choice>
    </xs:sequence>
  </xs:complexType>


This example shows a common pattern where the schema author tries to be flexible by making much of the content optional. The problem is that when the first content element is <A>, a processor doesn't know which particle to assign to the element. There are two particles that element <A> can match, so there is no unique particle.
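
One way to remove the ambiguity is to restructure the content model so that each element can match only one particle at any point. The following is a sketch, not the only possible fix; the type name good1 is hypothetical and does not come from any specification. It is intended to accept the same element sequences as bad1 (nothing, A, B, C, AA, AC, BA or BC):

  <xs:complexType name="good1">
    <xs:sequence>
      <!-- The whole content is optional, so empty content is still allowed -->
      <xs:choice minOccurs="0">
        <xs:sequence>
          <!-- A leading A or B always belongs to this first choice -->
          <xs:choice>
            <xs:element name="A" type="xs:string"/>
            <xs:element name="B" type="xs:string"/>
          </xs:choice>
          <!-- A second element, if present, always belongs to this choice -->
          <xs:choice minOccurs="0">
            <xs:element name="A" type="xs:string"/>
            <xs:element name="C" type="xs:string"/>
          </xs:choice>
        </xs:sequence>
        <!-- A C with nothing before it is the only remaining case -->
        <xs:element name="C" type="xs:string"/>
      </xs:choice>
    </xs:sequence>
  </xs:complexType>

Here a first <A> can only match the A in the first inner choice, and a second <A> can only match the A in the second, so each element is attributed to exactly one particle.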

Interaction with Wildcards

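Wildcards are also particles and can cause UPA violations when mixed with optional elements (or other wildcards). In the following type, an <A> element could match either the element particle for A or the wildcard: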


  <xs:complexType name="bad2">
    <xs:sequence>
      <xs:element name="A" type="xs:string" minOccurs="0"/>
      <xs:element name="B" type="xs:string" minOccurs="0"/>
      <xs:any/>
    </xs:sequence>
  </xs:complexType>


Sometimes a wildcard conflict can be avoided by using a namespace="##other" attribute on the wildcard particle, but problems can arise later if the content model is extended to handle elements in more than one namespace. XML Schema 1.0 does not support exclusion of multiple namespaces in a wildcard.
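
As a minimal sketch of the ##other approach, the schema below restricts the wildcard to namespaces other than the target namespace. The target namespace http://example.com/ns and the type name ok2 are assumptions made for illustration only:

  <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
             targetNamespace="http://example.com/ns"
             elementFormDefault="qualified">
    <xs:complexType name="ok2">
      <xs:sequence>
        <!-- A and B are declared in the target namespace -->
        <xs:element name="A" type="xs:string" minOccurs="0"/>
        <xs:element name="B" type="xs:string" minOccurs="0"/>
        <!-- The wildcard accepts only elements from other namespaces,
             so it can no longer compete with A or B -->
        <xs:any namespace="##other"/>
      </xs:sequence>
    </xs:complexType>
  </xs:schema>

With this restriction, an <A> or <B> in the target namespace can only match its element particle, while an element from any other namespace can only match the wildcard.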

Gotchas

The fact that the XML Schema specification states the meaning of UPA three times underscores the complexity of the issue (only one statement is normative -- the others are in Appendix H).

UPA necessarily differs from the letter of the DTD constraints because of the added complexity in the language for particle quantifiers and wildcards. See [1] for an example of where UPA fails to ensure that one-element look-ahead is sufficient for deterministic processing.