
XML Schema Versioning Use Cases

[SUBTITLE]Draft - 31 January 2006

[SUBTITLE]W3C XML Schema Working Group

[VERSION]This version: http://www.w3.org/XML/2005/xsd-versioning-use-cases/2006-01-31.html

[VERSION]Latest version: http://www.w3.org/XML/2005/xsd-versioning-use-cases/

[VERSION]Previous version: http://www.w3.org/XML/2005/xsd-versioning-use-cases/2005-06-06.html

[VERSION]Editor:

[EDITORS]Hoylen Sue <hoylen@hoylen.com>

Abstract

This document describes use cases where XML Schemas are being versioned. These are situations where there is more than one XML Schema, and those schemas are related to each other in some way. The use cases describe the desired behaviour of XML Schema processors when they encounter the different versions of schemas and the instances defined by them.

This document has been produced by the W3C XML Schema Working Group to serve as input to the Working Group's work on the versioning of XML Schemas. It is used to define the types of versioning situations that XML Schema should address.

The use cases illustrate the types of versioning problems that should be solved by versioning mechanisms that might be added to XML Schema. XML Schema might not be able to solve all the use cases, but it is hoped that it can solve a majority of them.

Status of this Document

This is a draft discussion document. Some of the use cases have been extensively discussed in the Working Group. However, the current set of use cases and text describing them have not been endorsed by the Working Group. The current document is in draft form, and is subject to change.

These use cases are based on real examples submitted by users of XML Schema.

The XML Schema Working Group would welcome additional use cases that illustrate aspects of versioning not captured by the existing use cases.

Table of contents

1. Introduction
1.1. Classification
1.1.1. Schema Availability
1.1.2. Instance Processed
1.1.3. Operation Performed
1.2. Common requirements
1.2.1. Multiple generational versioning
1.2.2. Multiple branch versioning
1.2.3. Author friendly
1.2.4. Uses only schema and instance
1.2.5. Fallback
1.3. Terminology
2. Use cases
2.1. Major-minor
2.1.1. Overview
2.1.2. "Major-minor" scenario A: Minor without new schemas
2.1.3. "Major-minor" scenario B: Major without new schemas
2.1.4. "Major-minor" scenario C: Minor with new schemas
2.1.5. "Major-minor" scenario D: Major with new schemas
2.1.6. Discussion
2.2. Object-oriented
2.2.1. Overview
2.2.2. "Object-oriented" scenario A: Employee as a person
2.2.3. "Object-oriented" scenario B: Manager as a person
2.2.4. "Object-oriented" scenario C: Manager in payroll
2.2.5. "Object-oriented" scenario D: Senior Manager in payroll
2.2.6. "Object-oriented" scenario E: Person in payroll
2.2.7. Discussion
2.3. Ignore-unknowns
2.3.1. Overview
2.3.2. "Ignore-unknowns" scenario A: Price management to cash register
2.3.3. "Ignore-unknowns" scenario B: Fuel pump to cash register
2.3.4. "Ignore-unknowns" scenario C: Price management to volume tracker
2.3.5. "Ignore-unknowns" scenario D: Fuel pump to volume tracker
2.3.6. Discussion
2.4. Comparison
2.4.1. Overview
2.4.2. "Comparison" scenario A: Cash register and price management system
2.4.3. "Comparison" scenario B: Cash register and fuel pump
2.4.4. "Comparison" scenario C: Volume tracker and price management
2.4.5. "Comparison" scenario D: Volume tracker and fuel pump
2.4.6. Discussion
2.5. Specialization
2.5.1. Overview
2.5.2. "Specialization" scenario A: Hospital to GP
2.5.3. "Specialization" scenario B: GP to GP
2.5.4. "Specialization" scenario C: Hospital to hospital
2.5.5. "Specialization" scenario D: GP to hospital
2.5.6. "Specialization" scenario E: Physiotherapy to hospital
2.5.7. "Specialization" scenario F: Physiotherapy to GP
2.5.8. Discussion
2.6. Renaming
2.6.1. Overview
2.6.2. "Renaming" scenario A: Hospital to GP
2.6.3. "Renaming" scenario B: GP to GP
2.6.4. "Renaming" scenario C: Hospital to hospital
2.6.5. "Renaming" scenario D: GP to hospital
2.6.6. "Renaming" scenario E: Physiotherapy to hospital
2.6.7. "Renaming" scenario F: Physiotherapy to GP
2.6.8. Discussion
2.7. Customization
2.7.1. Overview
2.7.2. "Customization" scenario A: Video head office to Big-store head office
2.7.3. "Customization" scenario B: Video warehouse to Video head office
2.7.4. "Customization" scenario C: Big-store head office to Big-store branch
2.7.5. "Customization" scenario D: Big-store to Video head office
2.8. MathML
2.9. XSD versioning
2.9.1. Overview
2.9.2. "xsd" scenario A: Old instance with old schema on an old processor
2.9.3. "xsd" scenario B: Old instance with old schema on a new processor
2.9.4. "xsd" scenario C: New instance with new schema on an old processor
2.9.5. "xsd" scenario D: New instance with new schema on a new processor
2.9.6. Discussion
2.9.6.1. Example 1: New check constraints
2.9.6.2. Example 2: New embedded element (I)
2.9.6.3. Example 3: New embedded element (II)
2.9.6.4. Example 4: Extension namespace labeling
2.9.6.5. Example 5: new attribute on wildcards
2.9.6.6. Example 6: xsd:all allowed at any level of a content model.
2.9.6.7. Example 7: grammaticalization of attributes
2.10. Web services content
2.10.1. Overview
2.10.1.1. Notation
2.10.1.2. Approach 1: Inserted
2.10.2. "Web services content" scenario A: original instance
2.10.3. "Web services content" scenario B: versioned instance
2.10.4. "Web services content" scenario C: original instance with new schema
2.10.5. "Web services content" scenario D: new instance with original schema
2.10.6. Discussion
2.10.6.1. Approach 1: Inserted
2.10.6.2. Approach 2: Insert from different namespace
2.10.6.3. Approach 3: Appended
2.10.6.4. Approach 4: Appended from different namespace
2.10.6.5. Approach 5: Combination
2.10.6.6. Approach 6: Multi-phase versioning
2.10.6.7. Approach 7: Multi-namespace languages
2.10.6.8. Approach 8: Extension
2.10.6.9. Approach 9: Extensibility with ignore unknowns
2.10.6.10. Approach 10: Extensibility with ignore unexpected knowns
2.10.6.11. Approach 11: Extensibility with ignore all unexpected
2.11. Web services mustUnderstand
2.11.1. Overview
2.11.2. "Web services mustUnderstand" scenario A: Extension with old schema
2.11.3. Discussion
2.12. Web services container types
2.12.1. Overview
2.12.2. "Web services container types" scenario A: Specific extensions
2.12.3. "Web services container types" scenario B: Optional extensions
2.12.4. "Web services container types" scenario C: Must understand extensions
2.12.5. "Web services container types" scenario D: Recursive extensions
2.13. Web services and encryption
2.13.1. Overview
2.13.2. "Encryption" scenario A: partial validation
2.13.3. Discussion
2.14. Web services and WSDL
2.14.1. Ordering
2.14.2. Binding
2.14.3. Interface extension
2.14.4. Constraining location of extensions
2.14.5. Targeting extensions
3. Acknowledgements

1. Introduction

It is important to be able to create different versions of XML Schemas. In some applications, schemas need to change over time to meet new requirements that may emerge. It is often not practical to simultaneously replace all the deployments of the old schemas with the new ones, so applications will need to cope with different versions coexisting in the system. Hence, versioning mechanisms in XML Schema should allow new versions to be created, and schema processors need to handle instances defined by the different versions.

This document describes the desirable behaviours for use cases that involve XML Schema versioning. The use case approach aims to describe external interactions with the system (in this case, the system is the XML Schema processor). The use cases deliberately do not describe any implementation-specific mechanisms. Possible versioning mechanisms are discussed in the "Framework for discussion of versioning" <http://www.w3.org/XML/2004/02/xsdv.html>.

This document focuses on the versioning of XML Schemas, and in particular on the behaviour of XML Schema processors, which perform schema validation and expose information via the Post-Schema Validation Infoset (PSVI). Schema versioning is also important to other types of systems, such as applications and code generators that use XML Schemas. However, those other aspects of versioning are outside the scope of this document.

It is hoped that versioning mechanisms can be developed to solve as many of these use cases as possible. However, the Working Group does not guarantee that all of them can be solved by XML Schema.

For more general information on versioning, the W3C Technical Architecture Group (TAG) is producing a TAG Finding about Versioning on the Web.

Discussion on versioning is being conducted on the <public-xml-versioning@w3.org> mailing list. Please send your comments on this document to that mailing list. Instructions on how to join the mailing list are available at <http://www.w3.org/Mail/Request>.

1.1. Classification

Several axes have been identified to help classify the different types of use cases. These axes are:

1.1.1. Schema Availability

This axis indicates which schemas are available to the processor. Schemas will be identified by a letter, for example "B". The notation "V(B)" will be used to indicate that a schema called V is a version of schema B.

A version of a schema may be versioned again. The notation "W(V(B))" indicates that a schema called W is a version of V, and V itself is a version of B.
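The notation can be unpacked mechanically. As a minimal sketch (the function name `lineage` is hypothetical and is not part of any XML Schema tooling), the following Python splits a label such as "W(V(B))" into the chain of schemas it names, newest first:

```python
def lineage(notation):
    """Return the schema names in a label, newest first.

    Illustrative helper only: 'W(V(B))' -> ['W', 'V', 'B'].
    """
    text = notation.replace(" ", "")
    depth = text.count("(")
    # A well-formed label ends with exactly one ")" per "(".
    if text.count(")") != depth or not text.endswith(")" * depth):
        raise ValueError("unbalanced version notation: " + notation)
    names = text[: len(text) - depth].split("(")
    if any(not name.isalnum() for name in names):
        raise ValueError("bad schema name in: " + notation)
    return names


print(lineage("W(V(B))"))  # ['W', 'V', 'B']
```

Each name in the returned list is a version of the name that follows it, so the last entry is always the base schema.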

The availability of schemas will depend on the schema processor and the application. Sometimes they can be updated with the new schemas. Schemas could be manually installed into the application, or the processor could automatically fetch them when they are needed. At other times, processors might not be configurable with the new schemas. They may be embedded devices that cannot be easily updated, or they might be disconnected from a network and can't automatically fetch any new schemas. Fetching external schemas might also be disallowed due to performance or security reasons.

1.1.2. Instance Processed

This axis indicates what kind of instance document is being processed, that is, which version of the schema the instance corresponds to. The same notation as for schemas will be used to denote the different versions of instances.

A schema processor needs to handle instances of different versions. It may accept, partially accept, or reject the instance. The desired behaviour of schema processors is the subject of these use cases.

1.1.3. Operation Performed

This axis describes what action is being performed by the XML Schema processor.

The most common action is schema validation, where an instance document is validated against an XML Schema. This produces a result indicating whether the document is valid, and a Post-Schema Validation Infoset (PSVI). This is what current schema processors do.

A new operation in versioning is the comparison of two schemas to see if they are versions of each other. Existing schema processors do not perform this function.

1.2. Common requirements

1.2.1. Multiple generational versioning

The versioning mechanism must allow for multiple generations of versioning. It must be scalable, so that it can be used many times to create new versions of already versioned schemas.

1.2.2. Multiple branch versioning

The versioning mechanism must allow for more than one version to be independently created, and those versions to also have multiple versions.

For example, consider a schema v1. Independent versions v1.1, v1.2, and v1.3 are created from it. Versions of those are also created, such as v1.2.1, v1.3.1, and v1.3.1.1. The versioning mechanism must be clear about validity between these different versions.
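To make the branching concrete, here is a small Python sketch. It assumes, purely for illustration, that the dotted labels in the example encode the lineage (so that v1.3.1 was derived from v1.3); this document does not prescribe such a labelling scheme, and the function names are hypothetical.

```python
def ancestors(label):
    """List the versions a label was derived from, nearest first.

    Assumption for this sketch: dropping the last dotted component of a
    label gives the version it was created from.
    """
    parts = label.split(".")
    return [".".join(parts[:i]) for i in range(len(parts) - 1, 0, -1)]


def is_version_of(label, other):
    """True if `label` is a direct or indirect version of `other`."""
    return other in ancestors(label)


print(ancestors("v1.3.1.1"))             # ['v1.3.1', 'v1.3', 'v1']
print(is_version_of("v1.2.1", "v1.2"))   # True
print(is_version_of("v1.2.1", "v1.3"))   # False: a different branch
```

The last call shows the case the requirement is concerned with: v1.2.1 and v1.3 share the ancestor v1, but neither is a version of the other.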

1.2.3. Author friendly

The versioning mechanism must produce instance documents which are suitable for manual editing and viewing. This must hold when multiple generations of versioning are in effect. Editing an N-th generation document should be as simple as editing a first generation document.

1.2.4. Uses only schema and instance

The versioning mechanism must work within the framework of XML instance documents and XML Schemas.

It must not rely on any external mechanisms. For example, some of the versioning use cases can be solved by using a negotiation process to agree upon the version being used between the sender and the receiver. However, negotiation is not a part of schema processing and is not possible in some situations (e.g. for static XML files, or when there is only a one way communication channel). Hence, negotiation cannot be relied upon as a suitable XML Schema versioning mechanism.

1.2.5. Fallback

The use cases specify the desired behaviour for situations where a schema processor encounters something that it did not expect.

There could be mechanisms that allow alternative fallback behaviours to be specified. These would be used instead of the desired behaviour in the schema processor.

The alternative fallback behaviour could be specified in a number of different places. The behaviour may come from:

The precedence of the fallbacks, when more than one is used, needs to be defined.

1.3. Terminology

The term "backward compatible" will mean that an instance document defined by an old schema can be processed by an application that handles the new schema.

The term "forward compatible" will mean that an instance document defined by a new schema can be processed by an application that handles the old schema.

[TODO]Consider defining four terms, adding terms that relate to schema compatibility and not just instance document compatibility.

2. Use cases

2.1. Major-minor

2.1.1. Overview

In this use case a schema is versioned many times. Some of those versions have insignificant changes made to them, but some others will undergo significant changes. The versioning mechanism needs a way to distinguish between the two types of changes.

In the application world, the concept of major and minor versions is commonly used for software releases. This convention is often used to convey information about interoperability: minor versions are guaranteed to be compatible, while major versions carry no such guarantee.

What constitutes an incompatible change depends on the particular application using the schemas. Hence, the distinction cannot be automatically determined by examining the schemas.

For example, consider a schema for a book store's inventory system. This will be known as schema B. It contains:

<xsd:complexType name="booktype">
  <xsd:sequence>
    <xsd:element name="title" type="xsd:string"/>
    <xsd:element name="author" type="xsd:string"/>
    <xsd:element name="price" type="xsd:decimal"/>
    <xsd:any minOccurs="0" maxOccurs="unbounded" processContents="skip"/>
  </xsd:sequence>
</xsd:complexType>

Note: the schema author has attempted to plan for versioning by adding an "any" at the end of the content model. However, it is not a necessary part of the use case. The desired behaviours of this use case should still apply even if the "any" were not present in this example.

An example instance of B is:

<book>
  <title>Fun with XML Schemas</title>
  <author>A. Writer</author>
  <price>39.95</price>
</book>

A minor version of the schema is created, schema V(B). The only change is the addition of an optional editor element. Schema V could contain:

<xsd:complexType name="booktype">
  <xsd:sequence>
    <xsd:element name="title" type="xsd:string"/>
    <xsd:element name="author" type="xsd:string"/>
    <xsd:element name="price" type="xsd:decimal"/>
    <xsd:element name="editor" type="xsd:string" minOccurs="0"/>
  </xsd:sequence>
</xsd:complexType>

An example instance of V(B) is:

<book>
  <title>More fun with XML Schemas</title>
  <author>A. Writer</author>
  <price>39.95</price>
  <editor>B. Jones</editor>
</book>

A major version of the schema is also created, schema W(B). The change in this major version is the addition of an optional discount element. For this particular application, the "price" must be interpreted along with the "discount", so this is an incompatible change. The old applications (that used schema B) should not treat instances of W as instances of B, otherwise they will be interpreting the data incorrectly.

The schema author has decreed that this is a significant version change. Structurally, the change is the same as the minor change, but the impact on processing applications makes it a major change. In some cases, non-technical reasons may influence whether a version is minor or major (e.g. a business decision to force system incompatibility). So the change by itself is not sufficient to indicate whether it is a minor or major version change.

The schema W(B) could contain:

<xsd:complexType name="booktype">
  <xsd:sequence>
    <xsd:element name="title" type="xsd:string"/>
    <xsd:element name="author" type="xsd:string"/>
    <xsd:element name="price" type="xsd:decimal"/>
    <xsd:element name="discount" type="xsd:decimal" minOccurs="0"/>
  </xsd:sequence>
</xsd:complexType>

An example instance of W(B) is:

<book>
  <title>New adventures with XML Schemas</title>
  <author>A. Writer</author>
  <price>39.95</price>
  <discount>0.90</discount>
</book>

In the following scenarios, the term "old" will be used to refer to applications which only process the base schema B, and the term "updated" will be used to refer to applications that can also process the new versions V or W.

Figure: major-minor

2.1.2. "Major-minor" scenario A: Minor without new schemas

[AXIS]Instance processed: V(B) [Minor version]

[AXIS]Schema Availability: B [Base]

[AXIS]Operation performed: Schema validation

Brief summary: The old inventory system processes an instance that has a minor version change.

Basic course of events:

1. An instance of V is processed.
2. The system identifies that it is a minor version of B.
3. It processes it using schema B.

Desired outcome:

2.1.3. "Major-minor" scenario B: Major without new schemas

[AXIS]Instance processed: W(B) [Major version]

[AXIS]Schema Availability: B [Base]

[AXIS]Operation performed: Schema validation

Brief summary: The old inventory system processes an instance that has a major version change.

Basic course of events:

1. An instance of W is processed.
2. The system identifies that it is a major version change from B.
3. The system cannot process the instance.

Desired outcome:

2.1.4. "Major-minor" scenario C: Minor with new schemas

[AXIS]Instance processed: V(B) [Minor version]

[AXIS]Schema Availability: B and V(B) [Base and Minor]

[AXIS]Operation performed: Schema validation

Brief summary: An updated inventory system processes an instance that has a minor version change.

Basic course of events:

1. An instance of V is processed.
2. The system identifies that it is an instance of V.
3. It processes it using schema V.

Desired outcome:

2.1.5. "Major-minor" scenario D: Major with new schemas

[AXIS]Instance processed: W(B) [Major version]

[AXIS]Schema Availability: B and W(B) [Base and Major]

[AXIS]Operation performed: Schema validation

Brief summary: An updated inventory system processes an instance that has a major version change.

Basic course of events:

1. An instance of W is processed.
2. The system identifies that it is an instance of W.
3. The system processes it using schema W.

Desired outcome:

2.1.6. Discussion

This use case extends to multiple levels of versioning (both major and minor) even though just one level is shown here.
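The courses of events in scenarios A to D can be summarised in a short Python sketch. It assumes each schema carries an explicit (major, minor) label, with B as 1.0, V as 1.1 and W as 2.0. These labels and the name `schema_to_use` are hypothetical: the use case states what should happen, not how the version relationship is recorded.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int


# Hypothetical labels for the schemas in this use case.
B = Version(1, 0)  # base
V = Version(1, 1)  # minor version of B
W = Version(2, 0)  # major version of B


def schema_to_use(instance, available):
    """Pick the schema to process `instance` with, or None to reject it."""
    if instance in available:
        return instance
    # Fall back to the newest available schema with the same major number.
    candidates = [s for s in available
                  if s.major == instance.major and s.minor <= instance.minor]
    return max(candidates, default=None)


print(schema_to_use(V, [B]) == B)     # scenario A: True, processed using B
print(schema_to_use(W, [B]) is None)  # scenario B: True, cannot be processed
print(schema_to_use(V, [B, V]) == V)  # scenario C: True, processed using V
print(schema_to_use(W, [B, W]) == W)  # scenario D: True, processed using W
```

Because the major or minor status of a change is declared by the schema author rather than derived from the schemas, the labels here are inputs to the decision, not something the sketch computes.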

2.2. Object-oriented

2.2.1. Overview

This use case mimics the inheritance and polymorphic behaviour of object oriented programming languages.

In object oriented programming languages, the concept of inheritance can be used to create a new class based on an existing class. This establishes an "is-a" relationship between the versions. Polymorphism is one of the key features of object oriented programming: this is where an instance of a subclass can be used in the place of an instance of a superclass. The subclass is also known as a "derived class," and the superclass as a "base class."

Programmers may want to develop systems where the programming classes are serialized into XML. Hence, methods (implemented as services) need to behave the same way with regards to the serialized instances.

For example, consider a class that defines a Person. The XML Schema for the serialized class will be known as schema P.

class Person {
  string name;
  date dob; /* date of birth */
};

An example of a serialized Person object is:

<person>
  <name>Amy Smith</name>
  <dob>1938-11-01</dob>
</person>

An employee class is defined as inheriting from a Person, and adding an extra "employee_number" member variable. The XML Schema for the serialized employee will be known as E(P).

class Employee: public Person {
  int employee_number;
};

An example of a serialized employee object is:

<person>
  <name>Michael Smith</name>
  <dob>1938-11-01</dob>
  <employee_number>69105</employee_number>
</person>

A manager is defined as inheriting from an employee, and adding an extra "project" member variable. The XML Schema for the serialized manager will be known as M(E(P)) or simply M.

class Manager: public Employee {
  string project;
};

An example of a serialized manager object is:

<person>
  <name>Steve Smith</name>
  <dob>1938-11-01</dob>
  <employee_number>69105</employee_number>
  <project>XY001</project>
</person>

A senior manager is defined as inheriting from manager, and adding an extra "department" member variable. The XML Schema for the serialized senior manager will be known as S(M(E(P))) or simply S.

class SeniorManager: public Manager {
  string department;
};

An example of a serialized senior manager object is:

<person>
  <name>Tim Smith</name>
  <dob>1938-11-01</dob>
  <employee_number>69105</employee_number>
  <project>RD001</project>
  <department>Research and Development</department>
</person>

Now consider a service that expects person objects. In this case it is a user database, which can accept any type of person (which includes employees, managers and senior managers).

class User_Database {
public:
  void register(Person p);
  ...
};

There is also a payroll service which accepts employees (which includes managers and senior managers), but not ordinary Persons (who are not employees).

class Payroll {
public:
  void pay_wages(Employee e);
  ...
};

Note: the element names have not changed in this example. However, a versioning mechanism might require them to change to reflect the different types, as part of how the mechanism operates.

Figure: object-oriented

2.2.2. "Object-oriented" scenario A: Employee as a person

[AXIS]Instance processed: E(P) [Employee]

[AXIS]Schema Availability: P [Person]

[AXIS]Operation performed: Schema validation

Brief summary: The user database receives an employee as a user. Although it does not know about employees, it accepts it because an employee is a version of a person. The schema processor is forward compatible.

Basic course of events:

1. An employee instance is submitted into the user database service.
2. The service identifies the instance as a version of a person.
3. The instance is validated against the person schema.

Desired outcome:

2.2.3. "Object-oriented" scenario B: Manager as a person

[AXIS]Instance processed: M(E(P)) [Manager]

[AXIS]Schema Availability: P [Person]

[AXIS]Operation performed: Schema validation

Brief summary: The user database receives a manager as a user. Although it does not know about managers, it accepts it because a manager is an indirect version of a person. The schema processor is forward compatible.

Basic course of events:

1. A manager instance is submitted into the user database service.
2. The service identifies the instance as a version of a person.
3. The instance is validated against the person schema.

Desired outcome:

2.2.4. "Object-oriented" scenario C: Manager in payroll

[AXIS]Instance processed: M(E(P)) [Manager]

[AXIS]Schema Availability: P and E(P) [Person and Employee]

[AXIS]Operation performed: Schema validation

Brief summary: The payroll service receives a manager object. Although it does not know about managers, it accepts it because a manager is a version of an employee. The schema processor is forward compatible.

Basic course of events:

1. A manager instance is submitted into the payroll service.
2. The service identifies the instance as a version of an employee.
3. The instance is validated against the employee schema.

Desired outcome:

2.2.5. "Object-oriented" scenario D: Senior Manager in payroll

[AXIS]Instance processed: S(M(E(P))) [Senior Manager]

[AXIS]Schema Availability: P and E(P) [Person and Employee]

[AXIS]Operation performed: Schema validation

Brief summary: The payroll service receives a senior manager object. Although it does not know about senior managers, it accepts it because a senior manager is an indirect version of an employee. The schema processor is forward compatible.

Basic course of events:

1. A senior manager instance is submitted into the payroll service.
2. The service identifies the instance as a version of an employee.
3. The instance is validated against the employee schema.

Desired outcome:

The processor might be able to easily determine that the instance is a person (from looking at the element name). However, the processor cannot blindly assume that any instance of a person is an employee. For example, there could be a customer subclass derived from the person class, but a customer is not an employee.

2.2.6. "Object-oriented" scenario E: Person in payroll

[AXIS]Instance processed: P [Person]

[AXIS]Schema Availability: P and E(P) [Person and Employee]

[AXIS]Operation performed: Schema validation

Brief summary: The payroll service incorrectly receives a person object. A person is not an employee, so it is not valid.

Basic course of events:

1. A person instance is submitted into the payroll service.
2. The service tries to identify the instance.

Desired outcome:

2.2.7. Discussion

Currently, XML Schema 1.0 can solve this use case if all the schemas are available to the processor. However, it cannot solve the situation described here where the newer versions of the schema are not available.

It is not clear if it is necessary to support multiple inheritance.
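The decisions in scenarios A to E can be sketched in Python as follows. The sketch assumes that each instance somehow declares its full single-inheritance lineage, so that a service holding only the older schemas can still recognise it. How an instance would carry that information is exactly what is left open here, and the names below (including the Customer lineage taken from the example in scenario D) are illustrative only.

```python
# Declared lineages, most derived type first (hypothetical representation).
PERSON = ["Person"]
EMPLOYEE = ["Employee", "Person"]
MANAGER = ["Manager", "Employee", "Person"]
SENIOR_MANAGER = ["SeniorManager", "Manager", "Employee", "Person"]
CUSTOMER = ["Customer", "Person"]


def accepts(required_type, declared_lineage):
    """True if the instance is, directly or indirectly, a `required_type`."""
    return required_type in declared_lineage


# User database: needs a Person and knows only the Person schema.
print(accepts("Person", EMPLOYEE))          # scenario A: True
print(accepts("Person", MANAGER))           # scenario B: True

# Payroll: needs an Employee and knows the Person and Employee schemas.
print(accepts("Employee", MANAGER))         # scenario C: True
print(accepts("Employee", SENIOR_MANAGER))  # scenario D: True
print(accepts("Employee", PERSON))          # scenario E: False
print(accepts("Employee", CUSTOMER))        # False: a person, not an employee
```

An accepted instance would then be validated against the schema for the required type, as in step 3 of each scenario. A flat list cannot represent multiple inheritance, so the sketch would need a different structure if that were supported.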

2.3. Ignore-unknowns

2.3.1. Overview

In this use case, schemas are written to match the expectations of the programs which process the instances. These programs operate in an "ignore what they don't expect" mode.

Consider a convenience store, with a point-of-sale (POS) cash register that receives product data from various devices in the store. There are many different types of devices and they are produced by different vendors.

Due to market forces and competition between the vendors, there is no single standard data schema that they all use. Each vendor enhances their own schemas to be more competitive within their own line of products. However, this produces problems for the customers who use products from multiple vendors. Their schemas are similar to each other, because they borrow features from each other, but there is no well defined versioning lineage between them.

One vendor produces a price management system. It sends product pricing information to the cash registers. It generates XML instances which contain an item_code number, a description and a price. The schema for this will be called schema P.

An example price management instance P is:

<product>
  <item_code>0130655678</item_code>
  <description>Orange juice</description>
  <price>4.45</price>
</product>

Another vendor produces fuel pumps. It sends messages which contain additional information that is only relevant to the management of fuel pumps. This vendor may have based their fuel pump schema on the price management schema, or they might not have -- due to the nature of the unregulated marketplace, the relationships between the schemas are unclear.

The fuel pumps transmit instances containing a description, a volume in litres, and a price. These are similar to the price management instances, because they have a description and a price. However, they are different because the item_code is replaced by a fuel_code, and price_per_litre and litres elements are added. The schema for this will be called schema F.

An example fuel pump instance F is:

<product>
  <fuel_code>14</fuel_code>
  <description>Diesel</description>
  <price_per_litre>0.899</price_per_litre>
  <litres>42.1</litres>
  <price>37.85</price>
</product>

Note that elements designed to hold extension data do not work in this environment. Firstly, because it is not a scalable solution when there are many versions of versions. Secondly, because vendors value their own data as being important to them -- they do not want to make them appear inferior to the other data by placing them inside an extension element.

A third vendor produces cash registers. The cash registers simply display the products to the user.

Since the cash register should interoperate with as many different types of devices as possible, it has been written to be flexible in what it expects. Specifically, it has been written with the assumption that it will ignore any extra elements that it didn't expect. The different devices can add extra elements without breaking the application.

XML Schema should help promote interoperability between the devices. Some of these devices may be embedded devices that are difficult to update and contain limited processing power. The more powerful devices might validate messages when they are received. The more primitive devices might not perform any schema validation at all. However, they will reject bad messages. If there is a problem (e.g. accusations that the fuel pump created a bad message, or the cash register rejected a good message) the XML Schema could be brought out to settle the dispute. In either case, the XML Schema needs to be able to represent the input that the application expects.

The application could be written in many different ways. For example, the following is a fragment of an XSLT script that represents the behaviour of the application. The important aspect of this example is that it works on any document as long as there is a description and price in the transaction -- it deliberately ignores any extra elements that it does not use.

<xsl:template match="product">
  <tr>
    <td><xsl:value-of select="description"/></td>
    <td><xsl:value-of select="price"/></td>
  </tr>
</xsl:template>

The challenge is to represent what the cash register program accepts using XML Schema. The schema must reflect the "ignore what they don't expect" behaviour that is inherent in the programs. This schema will be called schema C. It should be usable as an effective filter: for discriminating between input the program accepts, and input that it will fail on.
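The acceptance rule that schema C has to capture can also be written as a small predicate. The following Python sketch is an illustration only: the function name is hypothetical, and it assumes that description and price are direct, un-namespaced children of product, as in the example instances.

```python
import xml.etree.ElementTree as ET


def cash_register_row(xml_text):
    """Return (description, price) for a usable product message, else None.

    Any other child elements are ignored, mirroring the
    "ignore what they don't expect" behaviour of the cash register.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return None
    if root.tag != "product":
        return None
    description = root.findtext("description")
    price = root.findtext("price")
    if description is None or price is None:
        return None
    return description, price


price_management = """<product>
  <item_code>0130655678</item_code>
  <description>Orange juice</description>
  <price>4.45</price>
</product>"""

print(cash_register_row(price_management))  # ('Orange juice', '4.45')
```

The same call accepts the fuel pump instance shown earlier, because the extra fuel_code, price_per_litre and litres elements are never inspected.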

A fourth vendor produces fuel volume trackers. These keep track of the amount of fuel sold. They are designed to accept the product messages from the fuel pumps. An example of how the volume trackers might process the instances is shown by the following program. This program uses an event-driven XML parser (only the callback functions are shown here). The important point is that this program only looks at the "description" and "litres" elements, ignoring anything else it is not expecting. Also, the two elements are expected in that particular order.

/* Numbers in square brackets indicate the expected sequence of events.
   initial state = STATE_START */

void
startElement (void* data, const XML_Char* name, const XML_Char** attr)
{
  switch (state) {
  case STATE_START:
    if (strcmp(name, "product") == 0) {
      state = STATE_IN_PRODUCT; /* [1] */
    }
    break;
 
  case STATE_IN_PRODUCT:
    if (strcmp(name, "description") == 0) {
      state = STATE_IN_DESCRIPTION; /* [2] */
    }
    break;
 
  case STATE_SEEN_DESCRIPTION:
    if (strcmp(name, "litres") == 0) {
      state = STATE_IN_LITRES; /* [5] */
    }
    break;
 
  case STATE_SEEN_LITRES:
    /* ignore all other elements */
    break;
  }
}

void
characters (void* data, const XML_Char* s, int length)
{
  if (state == STATE_IN_DESCRIPTION) {
    current_fuel = database_lookup_fuel(s, length); /* [3] */

  } else if (state == STATE_IN_LITRES) {
    database_add_to_fuel(current_fuel, string_to_number(s, length));
    current_fuel = 0; /* [6] */
  }
}

void
endElement (void* data, const XML_Char* name)
{
  if (strcmp(name, "description") == 0) {
    state = STATE_SEEN_DESCRIPTION; /* [4] */
  } else if (strcmp(name, "litres") == 0) {
    state = STATE_SEEN_LITRES; /* [7] */
  } else if (strcmp(name, "product") == 0) {
    state = STATE_START; /* [8] */
  }
}

The XML Schema for the volume tracker application will be called schema V.
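To make the accept/reject behaviour of the volume tracker concrete, the state machine above can be restated as a runnable sketch. The version below is an assumption-laden port to Python using the standard xml.parsers.expat module: the function name volume_tracker_accepts and the sample messages are hypothetical, the database calls are left out, and the only result reported is whether a description followed by a litres element was seen.

```python
import xml.parsers.expat


def volume_tracker_accepts(message: str) -> bool:
    """Return True if a description and then a litres element occur in a product."""
    state = "START"
    accepted = False

    def start_element(name, attrs):
        nonlocal state
        if state == "START" and name == "product":
            state = "IN_PRODUCT"
        elif state == "IN_PRODUCT" and name == "description":
            state = "IN_DESCRIPTION"
        elif state == "SEEN_DESCRIPTION" and name == "litres":
            state = "IN_LITRES"
        # Any other element is ignored.

    def end_element(name):
        nonlocal state, accepted
        if state == "IN_DESCRIPTION" and name == "description":
            state = "SEEN_DESCRIPTION"
        elif state == "IN_LITRES" and name == "litres":
            state = "SEEN_LITRES"
            accepted = True
        elif name == "product":
            state = "START"

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start_element
    parser.EndElementHandler = end_element
    try:
        parser.Parse(message, True)
    except xml.parsers.expat.ExpatError:
        return False  # not well-formed XML
    return accepted


# Illustrative messages; the values are made up.
F_MESSAGE = ("<product><fuel_code>U91</fuel_code><description>Unleaded</description>"
             "<price_per_litre>1.25</price_per_litre><litres>40.0</litres>"
             "<price>50.00</price></product>")
P_MESSAGE = ("<product><item_code>A17</item_code><description>Milk</description>"
             "<price>2.10</price></product>")

print(volume_tracker_accepts(F_MESSAGE))  # True
print(volume_tracker_accepts(P_MESSAGE))  # False
```

Schema V would need to draw the same line: extra elements before, between and after the two expected elements are tolerated, but a missing litres element or the wrong order is not.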

Figure: ignore-unknowns

2.3.2. "Ignore-unknowns" scenario A: Price management to cash register

[AXIS]Instance processed: P [Price management]

[AXIS]Schema Availability: C [Cash register]

[AXIS]Operation performed: Schema validation

Brief summary: The output from the price management system is sent to the cash register, and the extra data in it is ignored.

Basic course of events:

1. The price management system generates an instance of the price management schema.
2. The instance is sent to the cash register.
3. The cash register processes the instance.

Desired outcome:

2.3.3. "Ignore-unknowns" scenario B: Fuel pump to cash register

[AXIS]Instance processed: F [Fuel pump]

[AXIS]Schema Availability: C [Cash register]

[AXIS]Operation performed: Schema validation

Brief summary: The output from the fuel pump is sent to the cash register, and the extra data in it is ignored.

Basic course of events:

1. The fuel pump generates an instance of the fuel pump schema.
2. The instance is sent to the cash register.
3. The cash register processes the instance.

Desired outcome:

This scenario is the same as Scenario A, except that the price management system has been replaced by the fuel pump.

2.3.4. "Ignore-unknowns" scenario C: Price management to volume tracker

[AXIS]Instance processed: P [Price management]

[AXIS]Schema Availability: V [Volume tracker]

[AXIS]Operation performed: Schema validation

Brief summary: The output from the price management system is sent (in error) to the fuel volume tracker.

Basic course of events:

1. The price management system generates an instance of the price management schema.
2. The instance is sent to the volume tracker.
3. The volume tracker attempts to process the instance.

Desired outcome:

2.3.5. "Ignore-unknowns" scenario D: Fuel pump to volume tracker

[AXIS]Instance processed: F [Fuel pump]

[AXIS]Schema Availability: V [Volume tracker]

[AXIS]Operation performed: Schema validation

Brief summary: The output from the fuel pump is sent to the fuel volume tracker.

Basic course of events:

1. The fuel pump generates an instance of the fuel pump schema.
2. The instance is sent to the volume tracker.
3. The volume tracker processes the instance.

Desired outcome:

2.3.6. Discussion

This use case is about providing mechanisms in XML Schema so that it can more closely represent the class of documents accepted by XML processing programs. A certain class of programs is designed to be forward-compatible by deliberately ignoring extra data that they do not expect. This approach to versioning allows newer versions to add nodes without breaking older applications.

For example, schema "P" could contain:

<xsd:complexType name="product_inventory">
  <xsd:sequence>
    <xsd:element name="item_code" type="xsd:string"/>
    <xsd:element name="description" type="xsd:string"/>
    <xsd:element name="price" type="xsd:decimal"/>
  </xsd:sequence>
</xsd:complexType>

Schema "F" could contain:

<xsd:complexType name="fuel_pump_product">
  <xsd:sequence>
    <xsd:element name="fuel_code" type="xsd:string"/>
    <xsd:element name="description" type="xsd:string"/>
    <xsd:element name="price_per_litre" type="xsd:decimal"/>
    <xsd:element name="litres" type="xsd:decimal"/>
    <xsd:element name="price" type="xsd:decimal"/>
  </xsd:sequence>
</xsd:complexType>

With the current mechanisms in XML Schema, a suitable schema for representing what the cash register accepts cannot be written. For example, the following schema is not suitable, because it would reject instances from the price management system and the fuel pump (even though the cash register can process them).

<!-- Note: this schema is not suitable, it is too restrictive -->
<xsd:complexType name="cash_register_product">
  <xsd:sequence>
    <xsd:element name="description" type="xsd:string"/>
    <xsd:element name="price" type="xsd:decimal"/>
  </xsd:sequence>
</xsd:complexType>

With this schema, instances of P and F are both invalid. However, the desired behaviour is that instances of P and F are both valid.

Similarly, a suitable schema for the volume tracker cannot be written using current XML Schema mechanisms.

<!-- Note: this schema is not suitable, it is too restrictive -->
<xsd:complexType name="volume_tracker_product">
  <xsd:sequence>
    <xsd:element name="description" type="xsd:string"/>
    <xsd:element name="litres" type="xsd:decimal"/>
    <xsd:any minOccurs="0" maxOccurs="unbounded" processContents="skip"/>
  </xsd:sequence>
</xsd:complexType>

If the above schema were used as the volume tracker schema, instances of P and F would both be invalid. However, the desired behaviour is that instances of P are invalid, but instances of F are valid.
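The gap between the two unsuitable schemas and the desired behaviour can be checked mechanically. The sketch below is only an illustration under simplifying assumptions: each instance is reduced to the list of its child element names, and the four helper functions are hypothetical stand-ins for "what the schema allows" and "what the program accepts", not XML Schema validation.

```python
# Child element names, in document order, of an instance of each producer schema.
P_INSTANCE = ["item_code", "description", "price"]
F_INSTANCE = ["fuel_code", "description", "price_per_litre", "litres", "price"]

CASH = ["description", "price"]     # elements the cash register reads
VOLUME = ["description", "litres"]  # elements the volume tracker reads, in order


def exact_sequence(required, names):
    """Models the restrictive cash register schema: exactly these elements."""
    return names == required


def prefix_then_any(required, names):
    """Models the restrictive volume tracker schema: required first, then anything."""
    return names[:len(required)] == required


def contains_all(required, names):
    """Models the cash register program: required elements present, rest ignored."""
    return set(required) <= set(names)


def contains_in_order(required, names):
    """Models the volume tracker program: required elements in order, rest ignored."""
    remaining = iter(names)
    return all(name in remaining for name in required)


for label, names in (("P", P_INSTANCE), ("F", F_INSTANCE)):
    print(label,
          exact_sequence(CASH, names), contains_all(CASH, names),
          prefix_then_any(VOLUME, names), contains_in_order(VOLUME, names))
# P False True False False
# F False True False True
```

In each row, the first and third columns are what the restrictive schemas report and the second and fourth are the desired outcomes for scenarios A to D.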

2.4. Comparison

2.4.1. Overview

In this use case, schemas are compared to determine if they can be treated as versions of each other.

This use case uses the situation of a convenience store with a cash register, price management system and fuel pumps. This situation was described in the previous use case.

2.4.2. "Comparison" scenario A: Cash register and price management system

[AXIS]Instance processed: none

[AXIS]Schema Availability: C and P [Cash register and Price management]

[AXIS]Operation performed: comparison

Brief summary: A systems integrator wants to check if the data produced by the price management system is suitable for the cash register to use.

Basic course of events:

1. The cash register schema is obtained.
2. The price management schema is obtained.
3. The two schemas are compared.

Desired outcome:

Although both schemas might have been developed independently, it is possible to consider the price management schema as a compatible version of the cash register schema. This operation allows that to be determined.

2.4.3. "Comparison" scenario B: Cash register and fuel pump

[AXIS]Instance processed: none

[AXIS]Schema Availability: C and F [Cash register and Fuel pump]

[AXIS]Operation performed: comparison

Brief summary: A systems integrator wants to check if the data produced by the fuel pump is suitable for the cash register to use.

Basic course of events:

1. The cash register schema is obtained.
2. The fuel pump schema is obtained.
3. The two schemas are compared.

Desired outcome:

2.4.4. "Comparison" scenario C: Volume tracker and price management

[AXIS]Instance processed: none

[AXIS]Schema Availability: V and P [Volume tracker and Price management]

[AXIS]Operation performed: comparison

Brief summary: A systems integrator wants to check if the data produced by the price management system is suitable for the volume tracker to use.

Basic course of events:

1. The volume tracker schema is obtained.
2. The price management schema is obtained.
3. The two schemas are compared.

Desired outcome:

The systems integrators now know that the output from the price management system is not compatible with the volume tracker.

2.4.5. "Comparison" scenario D: Volume tracker and fuel pump

[AXIS]Instance processed: none

[AXIS]Schema Availability: V and F [Volume tracker and Fuel pump]

[AXIS]Operation performed: comparison

Brief summary: A systems integrator wants to check if the data produced by the fuel pump is suitable for the volume tracker to use.

Basic course of events:

1. The volume tracker schema is obtained.
2. The fuel pump schema is obtained.
3. The two schemas are compared.

Desired outcome:

The systems integrators now know that the output from the fuel pumps is compatible with the volume tracker.

2.4.6. Discussion

The comparison of schemas is a new type of operation. This use case has illustrated the operation of testing if the set of documents described by one schema is a subset of the set described by a different schema. Other possible operations could be to determine if that relationship is a proper subset, or equality.
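The subset, proper subset and equality tests can be illustrated on a small scale. The sketch below is not a schema comparison algorithm: it assumes each schema can be modelled by a hypothetical acceptor function over sequences of child element names, enumerates every sequence up to a fixed length, and compares the resulting finite sets. A real comparison would have to reason about the schemas themselves rather than enumerate instances.

```python
from itertools import product

NAMES = ["item_code", "fuel_code", "description", "price_per_litre", "litres", "price"]


def in_order(required, names):
    """True if the required names occur in this order, with anything around them."""
    remaining = iter(names)
    return all(name in remaining for name in required)


# Hypothetical acceptors standing in for schemas C, V, P and F.
ACCEPTORS = {
    "C": lambda names: {"description", "price"} <= set(names),
    "V": lambda names: in_order(("description", "litres"), names),
    "P": lambda names: names == ("item_code", "description", "price"),
    "F": lambda names: names == ("fuel_code", "description", "price_per_litre",
                                 "litres", "price"),
}


def language(accepts, max_len=5):
    """All name sequences up to max_len that the acceptor allows (bounded approximation)."""
    return {
        seq
        for length in range(max_len + 1)
        for seq in product(NAMES, repeat=length)
        if accepts(seq)
    }


LANG = {name: language(accepts) for name, accepts in ACCEPTORS.items()}

print(LANG["P"] <= LANG["C"])  # True:  scenario A
print(LANG["F"] <= LANG["C"])  # True:  scenario B
print(LANG["P"] <= LANG["V"])  # False: scenario C
print(LANG["F"] <= LANG["V"])  # True:  scenario D
print(LANG["F"] < LANG["C"])   # True:  proper subset
print(LANG["C"] == LANG["V"])  # False: not equal
```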

Another example where this can be useful is with XHTML. In XHTML 1.0, the strict, transitional, and frameset document types all use the same XML namespace. However, the schemas for them are slightly different. Variants of XHTML 1.0 (such as XHTML Basic) also share the same namespace, even though the schemas for them are different. Thus, an application that processes one variant of XHTML needs to know whether it can accept instances of another variant.

2.5. Specialization

2.5.1. Overview

In this use case, schemas are specialized with more specific versions, and processing software needs to operate in the presence or absence of those specialized schemas. A distinguishing characteristic of this use case is that instances of the specialized schemas are always valid instances of the base schema.

There is a generic base schema which has been approved by a country's health department. To ensure interoperability, the country's government has mandated that all compliant software must be able to store and process data that corresponds to the generic base schema. This generic base schema has been designed to store medical data, but in a very non-specific way, so that it is flexible enough to handle a wide variety of data. This is necessary because medical data is very diverse and must change to cope with new medical technologies and practices.

For example, the generic base schema could contain a generic datatype for storing two measurement values. We shall call this generic base schema: schema B.

<xsd:complexType name="measurement2">
  <xsd:sequence>
    <xsd:element name="value1">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="magnitude" type="xsd:decimal"/>
          <xsd:element name="unit" type="xsd:string"/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>
    <xsd:element name="value2">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="magnitude" type="xsd:decimal"/>
          <xsd:element name="unit" type="xsd:string"/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>
  </xsd:sequence>
</xsd:complexType>

An example instance of this element from schema B is:

<data>
  <value1>
    <magnitude>1660</magnitude>
    <unit>mm</unit>
  </value1>
  <value2>
    <magnitude>68</magnitude>
    <unit>kg</unit>
  </value2>
</data>

There are two products that use the generic schema: a General Practice (GP) management system, and a hospital clinical information management system. Clinical records are stored and exchanged between the two programs using XML. When the programs receive data, they validate it before storing or processing it. The stored records may not be modified by the sender to suit the receiver for legal or technical reasons (e.g. the records are digitally signed).

The country's General Practice doctor organization decides that it wants to standardize how blood pressures are recorded. They define it as a specialization (or constraint) of the generic base schema's measurement datatype. The versioning mechanism (whatever that may be) is used to indicate that the blood pressure is a version of the generic measurement datatype.

Here is an example from the specialized blood pressure schema, which we will call schema V(B). It constrains the units of both measurements to be mmHg.

<xsd:complexType name="blood_pressure">
  <xsd:sequence>

    <xsd:element name="value1">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="magnitude" type="xsd:decimal"/>
          <xsd:element name="unit">
            <xsd:simpleType>
              <xsd:restriction base="xsd:string">
                <xsd:enumeration value="mmHg"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:element>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>

    <xsd:element name="value2">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="magnitude" type="xsd:decimal"/>
          <xsd:element name="unit">
            <xsd:simpleType>
              <xsd:restriction base="xsd:string">
                <xsd:enumeration value="mmHg"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:element>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>

  </xsd:sequence>
</xsd:complexType>

An example instance of a blood pressure specialization V(B) is:

<data>
  <value1>
    <magnitude>142</magnitude>
    <unit>mmHg</unit>
  </value1>
  <value2>
    <magnitude>80</magnitude>
    <unit>mmHg</unit>
  </value2>
</data>

The General Practice software is modified or updated to have the blood pressure schema, but the hospital software is not. There can be many reasons why the hospital software does not have the blood pressure schema, such as: the timing cycle of software upgrades, it may be running in an off-line mode where schemas cannot be fetched, cost, performance, security or policy.

Later on, a physiotherapy clinic decides that it wants to further refine the definition of a blood pressure to only contain sensible values for the systolic and diastolic readings. The versioning mechanism would indicate that this refined blood pressure datatype is a version of the ordinary blood pressure datatype.

This is an excerpt from the refined physiotherapy schema W(V(B)). It shows that the numeric values in the measurement are restricted to specific numerical ranges.

<xsd:complexType name="blood_pressure_refined">
  <xsd:sequence>

    <xsd:element name="value1">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="magnitude">
            <xsd:simpleType>
              <xsd:restriction base="xsd:decimal">
                <xsd:minInclusive value="90"/>
                <xsd:maxInclusive value="140"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:element>
          <xsd:element name="unit">
            <xsd:simpleType>
              <xsd:restriction base="xsd:string">
                <xsd:enumeration value="mmHg"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:element>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>

    <xsd:element name="value2">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="magnitude">
            <xsd:simpleType>
              <xsd:restriction base="xsd:decimal">
                <xsd:minInclusive value="60"/>
                <xsd:maxInclusive value="90"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:element>
          <xsd:element name="unit">
            <xsd:simpleType>
              <xsd:restriction base="xsd:string">
                <xsd:enumeration value="mmHg"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:element>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>

  </xsd:sequence>
</xsd:complexType>

Figure: specialization
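The defining property of this use case, that every instance of a specialized schema is also a valid instance of the schemas it was derived from, can be checked with hand-written validators. The sketch below is an approximation and not XML Schema validation: the function names are hypothetical, Python's Decimal is slightly more permissive than xsd:decimal, and the third sample instance is invented for illustration.

```python
from decimal import Decimal, InvalidOperation
import xml.etree.ElementTree as ET


def _values(instance):
    """Return [(magnitude, unit), (magnitude, unit)], or None if the shape is wrong."""
    try:
        root = ET.fromstring(instance)
    except ET.ParseError:
        return None
    children = list(root)
    if [child.tag for child in children] != ["value1", "value2"]:
        return None
    pairs = []
    for child in children:
        if [part.tag for part in child] != ["magnitude", "unit"]:
            return None
        try:
            magnitude = Decimal((child[0].text or "").strip())
        except InvalidOperation:
            return None
        pairs.append((magnitude, child[1].text or ""))
    return pairs


def valid_b(instance):
    """Schema B: two magnitude/unit pairs."""
    return _values(instance) is not None


def valid_vb(instance):
    """Schema V(B): schema B, with both units fixed to mmHg."""
    pairs = _values(instance)
    return pairs is not None and all(unit == "mmHg" for _, unit in pairs)


def valid_wvb(instance):
    """Schema W(V(B)): schema V(B), with the magnitudes limited to the given ranges."""
    if not valid_vb(instance):
        return False
    (first, _), (second, _) = _values(instance)
    return 90 <= first <= 140 and 60 <= second <= 90


def make(m1, u1, m2, u2):
    return (f"<data><value1><magnitude>{m1}</magnitude><unit>{u1}</unit></value1>"
            f"<value2><magnitude>{m2}</magnitude><unit>{u2}</unit></value2></data>")


generic = make(1660, "mm", 68, "kg")           # the schema B example
pressure = make(142, "mmHg", 80, "mmHg")       # the schema V(B) example
refined = make(120, "mmHg", 80, "mmHg")        # invented, within the W(V(B)) ranges

for instance in (generic, pressure, refined):
    print(valid_b(instance), valid_vb(instance), valid_wvb(instance))
# True False False
# True True False
# True True True
```

Note that the V(B) example instance is not valid against W(V(B)), because its first magnitude of 142 is above the upper limit of 140.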

2.5.2. "Specialization" scenario A: Hospital to GP

[AXIS]Instance processed: B [Hospital]

[AXIS]Schema Availability: B and V(B) [Hospital and GP]

[AXIS]Operation performed: Schema validation

Brief summary: The hospital system sends a record to the GP system. An instance of the base schema will be processed according to the base schema. Even though the processor has access to the specialized schema, it is not used. The schema processor is backward compatible.

Basic course of events:

1. The hospital generates an instance of the base schema.
2. The base instance is sent to the GP system (which has access to both the base and the specialized schema).
3. The GP system processes the base instance.

Desired outcome:

Since the GP system is receiving a base instance, it cannot (and should not) use the specialization schema.

2.5.3. "Specialization" scenario B: GP to GP

[AXIS]Instance processed: V(B) [GP]

[AXIS]Schema Availability: B and V(B) [Hospital and GP]

[AXIS]Operation performed: Schema validation

Brief summary: One GP system sends a message to another GP system. An instance of the specialized schema will be processed according to the specialized schema. Even though the processor could have validated it using the base schema, it must use the specialized schema.

Basic course of events:

1. Another GP system generates an instance of the specialized schema.
2. The specialized instance is sent to the GP system (which has access to both the base and the specialized schema).
3. The GP system processes the specialized instance.

Desired outcome:

The receiving GP system needs to treat the instance as a blood pressure, because it wants to process it in a special way (e.g. graph it or apply decision support with it). Although it could also validate it using the generic base schema, it does not do so because the extra constraints in the blood pressure schema are important for the application to correctly interpret the data as a blood pressure.

2.5.4. "Specialization" scenario C: Hospital to hospital

[AXIS]Instance processed: B [Hospital]

[AXIS]Schema Availability: B [Hospital]

[AXIS]Operation performed: Schema validation

Brief summary: One hospital system sends a message to another hospital. An instance of the base schema is processed according to the base schema.

Basic course of events:

1. Another hospital system generates an instance of the base schema.
2. The base instance is sent to the hospital system (which only has access to the base schema).
3. The hospital system processes the base instance.

Desired outcome:

This use case does not invoke any special versioning feature, but it is included here for completeness.

2.5.5. "Specialization" scenario D: GP to hospital

[AXIS]Instance processed: V(B) [GP]

[AXIS]Schema Availability: B [Hospital]

[AXIS]Operation performed: Schema validation

Brief summary: A GP system sends a message to a hospital system. An instance of the specialized schema is processed using the base schema when the processor does not have access to the specialized schema. The schema processor is forward compatible.

Basic course of events:

1. The GP system generates an instance of the specialized schema.
2. The specialized instance is sent to the hospital system (which only has access to the base schema).
3. The hospital system processes the specialized instance.

Desired outcome:

The hospital system cannot recognise the data as a blood pressure, but can process it generically.

2.5.6. "Specialization" scenario E: Physiotherapy to hospital

[AXIS]Instance processed: W(V(B)) [Physiotherapy]

[AXIS]Schema Availability: B [Hospital]

[AXIS]Operation performed: Schema validation

Brief summary: A physiotherapy system sends a message to a hospital system. An instance of the refined physiotherapy schema is processed using the base schema (when the processor has access to neither the refined physiotherapy schema nor the specialized GP schema). The schema processor is forward compatible across multiple versions.

Basic course of events:

1. The physiotherapy system generates an instance of the refined physiotherapy schema.
2. The refined physiotherapy instance is sent to the hospital system (which only has access to the base schema).
3. The hospital system processes the refined instance.

Desired outcome:

This scenario shows that the versioning must work across multiple generations of versions, not just between two successive versions.

2.5.7. "Specialization" scenario F: Physiotherapy to GP

[AXIS]Instance processed: W(V(B)) [Physiotherapy]

[AXIS]Schema Availability: B and V(B) [Hospital and GP]

[AXIS]Operation performed: Schema validation

Brief summary: A physiotherapy system sends a message to a GP system. An instance of the refined physiotherapy schema is processed using the blood pressure schema.

Basic course of events:

1. The physiotherapy system generates an instance of the refined physiotherapy schema.
2. The refined physiotherapy instance is sent to the GP system (which has access to both the base schema and the blood pressure schema, but not the refined physiotherapy schema).
3. The GP system processes the refined instance.

Desired outcome:

Although the GP system has access to the basic schema (and could have validated the instance against it), it needs to validate it using the blood pressure schema. There are extra constraints that will be checked by it, which the basic schema would not check.

2.5.8. Discussion

The physiotherapy clinic is included in this use case to highlight the point that the versioning mechanism must work when there are multiple versions involved. A solution that provides for only a single level of versioning is not practical in the real world.
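The six scenarios can be summarized as a rule for choosing which schema to validate against. The sketch below rests on a loud assumption: that the versioning mechanism (which this use case deliberately leaves open) lets a receiver learn the lineage of an instance, listed from most specific to most generic. The names LINEAGE and select_schema are hypothetical.

```python
# Hypothetical lineage of each schema, most specific first.
LINEAGE = {
    "B": ["B"],
    "V(B)": ["V(B)", "B"],
    "W(V(B))": ["W(V(B))", "V(B)", "B"],
}

# Schemas available to each receiving system in the scenarios.
HOSPITAL = {"B"}
GP = {"B", "V(B)"}


def select_schema(instance_schema, available):
    """Pick the most specific schema in the instance's lineage that the receiver has."""
    for schema in LINEAGE[instance_schema]:
        if schema in available:
            return schema
    return None  # nothing usable: the instance cannot be validated


print(select_schema("B", GP))              # B     (scenario A)
print(select_schema("V(B)", GP))           # V(B)  (scenario B)
print(select_schema("B", HOSPITAL))        # B     (scenario C)
print(select_schema("V(B)", HOSPITAL))     # B     (scenario D)
print(select_schema("W(V(B))", HOSPITAL))  # B     (scenario E)
print(select_schema("W(V(B))", GP))        # V(B)  (scenario F)
```

Scenarios E and F are the ones that need the lineage to span more than one level; with a single level of versioning, the W(V(B)) entries could not be resolved.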

2.6. Renaming

2.6.1. Overview

In this use case, schemas are specialized with more specific versions and information items are renamed in those versions. The processing software needs to operate in the presence or absence of those specialized schemas. Note: this use case is similar to the "Specialization" use case, with the only difference being that the elements are renamed in the new versions.

There is a generic base schema which has been approved by a country's health department. To ensure interoperability, the country's government has mandated that all compliant software must be able to store and process data that corresponds to the generic base schema. This generic base schema has been designed to store medical data, but in a very non-specific way, so that it is flexible enough to handle a wide variety of data. This is necessary because medical data is very diverse and must change to cope with new medical technologies and practices.

For example, the generic base schema could contain a generic datatype for storing two measurement values. We shall call this generic base schema: schema B.

<xsd:complexType name="measurement2">
  <xsd:sequence>
    <xsd:element name="value1">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="magnitude" type="xsd:decimal"/>
          <xsd:element name="unit" type="xsd:string"/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>
    <xsd:element name="value2">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="magnitude" type="xsd:decimal"/>
          <xsd:element name="unit" type="xsd:string"/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>
  </xsd:sequence>
</xsd:complexType>

An example instance of this element from schema B is:

<data>
  <value1>
    <magnitude>1660</magnitude>
    <unit>mm</unit>
  </value1>
  <value2>
    <magnitude>68</magnitude>
    <unit>kg</unit>
  </value2>
</data>

There are two products that use the generic schema: a General Practice (GP) management system, and a hospital clinical information management system. Clinical records are stored and exchanged between the two programs using XML. When the programs receive data, they validate it before storing or processing it.

The country's General Practice doctor organization decides that it wants to standardize how blood pressures are recorded. They define it as a specialization (or constraint) of the generic base schema's measurement datatype. The versioning mechanism (whatever that may be) is used to indicate that the blood pressure is a version of the generic measurement datatype.

Here is an example from the specialized blood pressure schema, which we will call schema V(B). It renames the two measurements to systolic and diastolic, and constrains the units of both to be mmHg.

<xsd:complexType name="blood_pressure">
  <xsd:sequence>

    <xsd:element name="systolic">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="magnitude" type="xsd:decimal"/>
          <xsd:element name="unit">
            <xsd:simpleType>
              <xsd:restriction base="xsd:string">
                <xsd:enumeration value="mmHg"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:element>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>

    <xsd:element name="diastolic">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="magnitude" type="xsd:decimal"/>
          <xsd:element name="unit">
            <xsd:simpleType>
              <xsd:restriction base="xsd:string">
                <xsd:enumeration value="mmHg"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:element>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>

  </xsd:sequence>
</xsd:complexType>

An example instance of a blood pressure specialization V(B) is:

<blood_pressure>
  <systolic>
    <magnitude>142</magnitude>
    <unit>mmHg</unit>
  </systolic>
  <diastolic>
    <magnitude>80</magnitude>
    <unit>mmHg</unit>
  </diastolic>
</blood_pressure>

The General Practice software is modified or updated to have the blood pressure schema, but the hospital software is not. There can be many reasons why the hospital software does not have the blood pressure schema, such as: the timing cycle of software upgrades, it may be running in an off-line mode where schemas cannot be fetched, cost, performance, security or policy.

Later on, a physiotherapy clinic decides that it wants to further refine the definition of a blood pressure to only contain sensible values for the systolic and diastolic readings. The versioning mechanism would indicate that this refined blood pressure datatype is a version of the ordinary blood pressure datatype.

This is an excerpt from the refined physiotherapy schema W(V(B)). It shows that the numeric values in the measurement are restricted to specific numerical ranges.

<xsd:complexType name="blood_pressure_refined">
  <xsd:sequence>

    <xsd:element name="systolic">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="magnitude">
            <xsd:simpleType>
              <xsd:restriction base="xsd:decimal">
                <xsd:minInclusive value="90"/>
                <xsd:maxInclusive value="140"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:element>
          <xsd:element name="unit">
            <xsd:simpleType>
              <xsd:restriction base="xsd:string">
                <xsd:enumeration value="mmHg"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:element>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>

    <xsd:element name="diastolic">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="magnitude">
            <xsd:simpleType>
              <xsd:restriction base="xsd:decimal">
                <xsd:minInclusive value="60"/>
                <xsd:maxInclusive value="90"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:element>
          <xsd:element name="unit">
            <xsd:simpleType>
              <xsd:restriction base="xsd:string">
                <xsd:enumeration value="mmHg"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:element>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>

  </xsd:sequence>
</xsd:complexType>

2.6.2. "Renaming" scenario A: Hospital to GP

[AXIS]Instance processed: B [Hospital]

[AXIS]Schema Availability: B and V(B) [Hospital and GP]

[AXIS]Operation performed: Schema validation

Brief summary: The hospital system sends a record to the GP system. An instance of the base schema will be processed according to the base schema. Even though the processor has access to the specialized schema, it is not used. The schema processor is backward compatible.

Basic course of events:

1. The hospital generates an instance of the base schema.
2. The base instance is sent to the GP system (which has access to both the base and the specialized schema).
3. The GP system processes the base instance.

Desired outcome:

Since the GP system is receiving a base instance, it cannot (and should not) use the specialization schema.

2.6.3. "Renaming" scenario B: GP to GP

[AXIS]Instance processed: V(B) [GP]

[AXIS]Schema Availability: B and V(B) [Hospital and GP]

[AXIS]Operation performed: Schema validation

Brief summary: One GP system sends a message to another GP system. An instance of the specialized schema will be processed according to the specialized schema. Even though the processor could have validated it using the base schema, it must use the specialized schema.

Basic course of events:

1. Another GP system generates an instance of the specialized schema.
2. The specialized instance is sent to the GP system (which has access to both the base and the specialized schema).
3. The GP system processes the specialized instance.

Desired outcome:

The receiving GP system needs to treat the instance as a blood pressure, because it wants to process it in a special way (e.g. graph it or apply decision support with it). Although it could also validate it using the generic base schema, it does not do so because the extra constraints in the blood pressure schema are important for the application to correctly interpret the data as a blood pressure.

2.6.4. "Renaming" scenario C: Hospital to hospital

[AXIS]Instance processed: B [Hospital]

[AXIS]Schema Availability: B [Hospital]

[AXIS]Operation performed: Schema validation

Brief summary: One hospital system sends a message to another hospital. An instance of the base schema is processed according to the base schema.

Basic course of events:

1. Another hospital system generates an instance of the base schema.
2. The base instance is sent to the hospital system (which only has access to the base schema).
3. The hospital system processes the base instance.

Desired outcome:

This use case does not invoke any special versioning feature, but it is included here for completeness.

2.6.5. "Renaming" scenario D: GP to hospital

[AXIS]Instance processed: V(B) [GP]

[AXIS]Schema Availability: B [Hospital]

[AXIS]Operation performed: Schema validation

Brief summary: A GP system sends a message to a hospital system. An instance of the specialized schema is processed using the base schema when the processor does not have access to the specialized schema. The schema processor is forward compatible.

Basic course of events:

1. The GP system generates an instance of the specialized schema.
2. The specialized instance is sent to the hospital system (which only has access to the base schema).
3. The hospital system processes the specialized instance.

Desired outcome:

The hospital system cannot recognise the data as a blood pressure, but can process it generically.
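What makes this scenario harder than its "Specialization" counterpart is that the hospital system does not know the element names systolic and diastolic. One conceivable way to process such an instance generically is sketched below. It assumes, purely for illustration, that the versioning mechanism supplies a mapping from the renamed elements back to the base names; the RENAMES table and the function to_base_names are hypothetical and are not defined by this use case.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from names in schema V(B) to the names used in schema B.
RENAMES = {"blood_pressure": "data", "systolic": "value1", "diastolic": "value2"}


def to_base_names(instance, renames=RENAMES):
    """Return a copy of the instance with renamed elements mapped to base names.

    The received record itself is left untouched; only the copy is rewritten.
    """
    root = ET.fromstring(instance)
    for element in root.iter():
        element.tag = renames.get(element.tag, element.tag)
    return ET.tostring(root, encoding="unicode")


blood_pressure = (
    "<blood_pressure>"
    "<systolic><magnitude>142</magnitude><unit>mmHg</unit></systolic>"
    "<diastolic><magnitude>80</magnitude><unit>mmHg</unit></diastolic>"
    "</blood_pressure>"
)

generic_view = ET.fromstring(to_base_names(blood_pressure))
print(generic_view.tag)                       # data
print([child.tag for child in generic_view])  # ['value1', 'value2']
```

The rewritten copy has the shape of a schema B instance, so the hospital system could validate and store it as a generic pair of measurements.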

2.6.6. "Renaming" scenario E: Physiotherapy to hospital

[AXIS]Instance processed: W(V(B)) [Physiotherapy]

[AXIS]Schema Availability: B [Hospital]

[AXIS]Operation performed: Schema validation

Brief summary: A physiotherapy system sends a message to a hospital system. An instance of the refined physiotherapy schema is processed using the base schema (when the processor has access to neither the refined physiotherapy schema nor the specialized GP schema). The schema processor is forward compatible across multiple versions.

Basic course of events:

1. The physiotherapy system generates an instance of the refined physiotherapy schema.
2. The refined physiotherapy instance is sent to the hospital system (which only has access to the base schema).
3. The hospital system processes the refined instance.

Desired outcome:

This scenario shows that the versioning mechanism must work across multiple generations of versions, not just between two successive versions.

2.6.7. "Renaming" scenario F: Physiotherapy to GP

[AXIS]Instance processed: W(V(B)) [Physiotherapy]

[AXIS]Schema Availability: B and V(B) [Hospital and GP]

[AXIS]Operation performed: Schema validation

Brief summary: A physiotherapy system sends a message to a GP system. An instance of the refined physiotherapy schema is processed using the blood pressure schema.

Basic course of events:

1. The physiotherapy system generates an instance of the refined physiotherapy schema.
2. The refined physiotherapy instance is sent to the GP system (which has access to both the base schema and the blood pressure schema, but not the refined physiotherapy schema).
3. The GP system processes the refined instance.

Desired outcome:

Although the GP system has access to the base schema (and could have validated the instance against it), it needs to validate the instance using the blood pressure schema, which checks extra constraints that the base schema would not check.

2.6.8. Discussion

This use case originated from discussions about the "specialization" use case. Although the "specialization" use case has its basis in a real world example, this one does not. However, it is an interesting use case worth considering.

2.7. Customization

2.7.1. Overview

In this use case, a schema is customized for local needs, but the local version must not break validation with the original schema.

This use case is different from the "specialization" use case because the customizations are extensions of the base schema. Without any versioning mechanisms, instances of the customized schemas would not be valid instances of the base schema.

This use case involves a Big-store which has a head office and branches. One of the suppliers to Big-store is Video-supplier, which has a head office and warehouses.

A base schema is defined by a Big-store for invoices. We shall call this schema B. Big-store requires all of its suppliers to provide invoices using it.

An example instance of an item in the Big-store invoice is:

<item>
  <part>1138</part>
  <description>Generic widget</description>
  <quantity>12</quantity>
</item>
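
The base schema itself is not reproduced in this document. As a rough sketch only, a definition consistent with the instance above might look something like the following. The type name itemType and the choice of simple types are assumptions made for illustration.

<!-- Hypothetical definition of the Big-store item (schema B). -->
<xsd:complexType name="itemType">
  <xsd:sequence>
    <xsd:element name="part"        type="xsd:string"/>
    <xsd:element name="description" type="xsd:string"/>
    <xsd:element name="quantity"    type="xsd:positiveInteger"/>
  </xsd:sequence>
</xsd:complexType>
<xsd:element name="item" type="itemType"/>

A closed content model of this kind contains no explicit extension point.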

When Big-store created this schema, they did not know how others might want to extend it, and they had no plans or expectations for the reuse of the schema. They did not deliberately spend time putting in hooks for versioning, but neither did they deliberately preclude it. Versioning can therefore still happen, as long as it is possible by default.

The Video-supplier decides to reimplement their stock tracking system to be natively based on the Big-store invoice schema. They want to avoid the need to translate between the schemas they use internally and those they send to external customers.

However, Video-supplier needs to store additional information in their internal invoices. To do this, they create a customized schema as a version of the Big-store invoice schema.

In this example, they want to indicate which warehouse the items were shipped from. They have added their customizations to the same namespace as the Big-store invoice. We shall call this schema V(B).

This is an example instance of the Video-supplier item:

<item>
  <part>1138</part>
  <description>Generic widget</description>
  <quantity>12</quantity>
  <source>
    <warehouse>W1</warehouse>
  </source>
</item>

An item is used in a number of different places in an invoice (e.g. items delivered, items backordered).

Invoices are sent from Video-supplier warehouses to the Video-supplier head office (which expects to see the customized information items). They are also sent from the Video-supplier head office to the Big-store head office. Invoices are also sent from the Big-store head office to their Big-store branches.

The versioning mechanism must accomplish two things: make it easy to write the extensions, and avoid breaking processors that expect instances of the original schema.

Firstly, the versioning mechanism needs a simple way of indicating that wherever the Big-store invoice schema referred to a Big-store item, the new Video-supplier schema must always use a Video-supplier item instead. Creating a new definition of everything that refers to the Big-store item is not a good solution. The element could be used in many different places, which leads to a maintenance problem. Also, creating a new definition of everything would effectively create a parallel copy of the schema, in which the clear relationship with the original schema (namely, that only the item has been versioned, and nothing else) would be lost.
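
Among the existing XML Schema 1.0 mechanisms, xsd:redefine comes closest to this first requirement, since it replaces a type definition everywhere it is referenced without copying the rest of the schema. The following is a minimal sketch, not a proposed solution. The file name bigstore-invoice.xsd, the type name itemType and the use of xsd:string are assumptions made for illustration, and both schema documents are assumed to have no target namespace (otherwise they would need the same targetNamespace and a prefixed base).

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Pull in the (assumed) Big-store schema and redefine a single type. -->
  <xsd:redefine schemaLocation="bigstore-invoice.xsd">
    <xsd:complexType name="itemType">
      <xsd:complexContent>
        <!-- Inside xsd:redefine, the base is the type being redefined. -->
        <xsd:extension base="itemType">
          <xsd:sequence>
            <xsd:element name="source">
              <xsd:complexType>
                <xsd:sequence>
                  <xsd:element name="warehouse" type="xsd:string"/>
                </xsd:sequence>
              </xsd:complexType>
            </xsd:element>
          </xsd:sequence>
        </xsd:extension>
      </xsd:complexContent>
    </xsd:complexType>
  </xsd:redefine>
</xsd:schema>

Every reference to itemType in the redefined schema then picks up the source element. This addresses only the first requirement: instances carrying the source element are still not valid against the unmodified base schema.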

Secondly, Video-supplier wants to send their customized invoices to the Big-store without translation (i.e. without needing to strip out the customized elements). Those new Video-supplier instances must work with old processors that expect Big-store instances, as shown in the scenarios below.

Figure: customization

2.7.2. "Customization" scenario A: Video head office to Big-store head office

[AXIS]Instance processed: V(B) [Video-supplier]

[AXIS]Schema Availability: B [Big-store]

[AXIS]Operation performed: Schema validation

Brief summary: The Video-supplier head office sends an invoice to the Big-store head office. An instance of the customized schema is processed using the base schema (without access to the customized schema). The schema processor is forward compatible.

Basic course of events:

1. The Video-supplier system generates an instance of the customized schema.
2. The customized instance is sent to the Big-store head office system (which only has access to the base schema).
3. The Big-store head office system processes the customized instance.

Desired outcome:

Although the customized instance contains extra elements which are not a part of the base schema, they must not cause the processor or the application to fail. The instance must be valid according to the base schema, because the contract between the two companies was based on exchanging invoices valid according to the Big-store schema. The information in the post-schema-validation infoset (PSVI) must not cause the Big-store application to break.

2.7.3. "Customization" scenario B: Video warehouse to Video head office

[AXIS]Instance processed: V(B) [Video-supplier]

[AXIS]Schema Availability: B and V(B) [Big-store and Video-supplier]

[AXIS]Operation performed: Schema validation

Brief summary: A Video-supplier warehouse sends an invoice to the Video-supplier head office. An instance of the customized schema is processed by the customized schema.

Basic course of events:

1. The Video-supplier warehouse system generates an instance of the customized schema.
2. The customized instance is sent to the Video-supplier head office.
3. The Video-supplier head office system processes the customized instance.

Desired outcome:

2.7.4. "Customization" scenario C: Big-store head office to Big-store branch

[AXIS]Instance processed: B [Big-store]

[AXIS]Schema Availability: B [Big-store]

[AXIS]Operation performed: Schema validation

Brief summary: The Big-store head office sends an invoice to a Big-store branch. An instance of the base schema is processed by a processor with the base schema.

Basic course of events:

1. The Big-store head office system generates an instance of the base schema.
2. The base instance is sent to a Big-store branch.
3. The Big-store branch system processes the base instance.

Desired outcome:

2.7.5. "Customization" scenario D: Big-store to Video head office

[AXIS]Instance processed: B [Big-store]

[AXIS]Schema Availability: B and V(B) [Big-store and Video-supplier]

[AXIS]Operation performed: Schema validation

Brief summary: The Big-store (incorrectly) sends an invoice to the Video-supplier head office. An instance of the base schema is processed by a system that expects an instance of the customized schema.

Basic course of events:

1. The Big-store system generates an instance of the base schema.
2. The base instance is sent to the Video-supplier head office.
3. The Video-supplier head office system processes the base instance.

Desired outcome:

The Video-supplier head office application expects that all invoices it receives are Video-supplier invoices, which contain the extra warehouse information. It must reject the invoice because it does not have the warehouse information.

2.8. MathML

[TODO]This use case needs to be completed.

What is the most convenient way to integrate new constructs (e.g. constructs which represent newly discovered mathematical constructs) into a specialized language like MathML? Is it possible to introduce new elements and attributes in such a way as to allow software which does not have hard-coded knowledge of them to do the right thing with them?

2.9. XSD versioning

2.9.1. Overview

The XML Schema definition language itself can be subject to versioning. Newer versions of the language may introduce changes that could affect how existing processors behave. It is desirable that the versioning mechanism allow for forward and backward compatibility. This applies both to the schema for schemas and to the schema processors.

Mechanisms in the XML Schema definition language must allow for future versions to be created that will introduce new constructs in the language. Those future versions must behave well with existing schema processors -- a default behaviour for them must be defined. One possible default behaviour would be to ignore the new constructs, although there could be other possible behaviours.

There may be situations where the default behaviour could be changed, where the alternative behaviour would be specified in the new schema. However, an old processor that has access to the new schema would then produce different results from an old processor that does not.

The versioning mechanism developed to solve this use case (to handle new schema mechanisms) could also be used to handle changes relating to bug fixes, errata and the deprecation of mechanisms.

Consider a version of the XML Schema definition language, version V. Additional constructs are added to it to create a new version, version W. Informally, schema W will be referred to as the "new" version, and schema V as the "old" version.

Figure: xsd

2.9.2. "xsd" scenario A: Old instance with old schema on an old processor

[AXIS]Instance processed: V [Old]

[AXIS]Schema Availability: V [Old]

[AXIS]Schema processor designed for: V [Old]

[AXIS]Operation performed: Schema validation

Brief summary: An old version instance is processed by an old version schema processor with the old schema.

Basic course of events:

1. The instance is read.
2. It is identified as an instance of the old schema.
3. The instance is validated against the old schema.

Desired outcome:

This scenario does not involve any new constructs. It is the current situation, with any schema processor, whether or not it supports versioning.

2.9.3. "xsd" scenario B: Old instance with old schema on a new processor

[AXIS]Instance processed: V [Old]

[AXIS]Schema Availability: V [Old]

[AXIS]Schema processor designed for: W [New]

[AXIS]Operation performed: Schema validation

Brief summary: An old version instance is processed by a new version schema processor, and the old schema is available.

Basic course of events:

1. The instance is read.
2. It is identified as an instance of the old schema.
3. The instance is validated against the old schema.

Desired outcome:

Essentially, this scenario demonstrates that new schema processors need to be backward compatible (supporting old schemas and their instances).

An alternative behaviour might be to declare that this scenario fails; that is, that old processors do not need to support schemas that use new constructs in any way.

2.9.4. "xsd" scenario C: New instance with new schema on an old processor

[AXIS]Instance processed: W [New]

[AXIS]Schema Availability: W [New]

[AXIS]Schema processor designed for: V [Old]

[AXIS]Operation performed: Schema validation

Brief summary: A new version instance is processed by an old version schema processor, which does have access to the new schema.

Basic course of events:

1. The instance is read.
2. It is identified as an instance of the new schema, and the new schema is used.
3. However, the processor is not designed to accept the new schema, so it uses some default interpretation of the new schema constructs.
4. The instance is validated against the default interpretation of the new schema.

Desired outcome:

An alternative case would be if the new schema specified an alternative fallback behaviour. In that case, the old schema processor would interpret the new schema using the fallback behaviour, instead of the default behaviour. The instance would be validated using those alternative rules.

2.9.5. "xsd" scenario D: New instance with new schema on a new processor

[AXIS]Instance processed: W [New]

[AXIS]Schema Availability: W [New]

[AXIS]Schema processor designed for: W [New]

[AXIS]Operation performed: Schema validation

Brief summary: A new version instance is processed by a new version schema processor which has access to the new schema.

Basic course of events:

1. The instance is read.
2. It is identified as an instance of the new schema, and the new schema is used.
3. The instance is validated against the new schema.

Desired outcome:

This is the simple case where everything (the instance, schema, and processor) is conforming to the new version, so it should all work properly.

2.9.6. Discussion

The examples described below are possible types of changes that the versioning mechanism may have to handle. These are only hypothetical examples. They should not be taken as an indication of what features are being considered, or not being considered, for future versions of XML Schema.

It is desirable that the versioning mechanism can handle all of them. However, a mechanism which handles only some of them may also be acceptable, since these are only hypothetical examples.

2.9.6.1. Example 1: New check constraints

A new top-level element called "xsd:check" is defined. It is analogous to a table-level check clause in SQL, and contains a series of predicates (expressed by xsd:test elements), each of which must be true of the document as a whole.

A processor designed to handle the new version of schemas will know how to check the constraints it contains. However, an older version processor will not know how.

The default behavior for an old version processor is to ignore the constraints and proceed without an error (although it might issue a warning).

The schema author should be able to indicate that the constraints cannot be ignored, in which case an old version processor must signal an error.

Also, the creator of the XML instance may need to indicate how the old version processor should behave: whether it should ignore the checks, or fail if it can't process them.
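
Purely as an illustration, such a construct might be written as follows. Only the element names xsd:check and xsd:test come from the description above. The test attribute, the use of XPath for the predicates and the element names inside the expressions are assumptions, and none of this is valid in the current XML Schema language.

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical: document-wide predicates, each of which must be true. -->
  <xsd:check>
    <xsd:test test="count(//invoice) = 1"/>
    <xsd:test test="sum(//item/quantity) > 0"/>
  </xsd:check>
  <!-- Ordinary element and type declarations would follow here. -->
</xsd:schema>

An old version processor following the default behaviour would skip the xsd:check element and validate using only the remaining declarations.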

2.9.6.2. Example 2: New embedded element (I)

Change of the syntax of content models. In addition to the 1.0 transfer syntax for content models, the WG wishes to allow a different XML transfer syntax. So where a version n processor expects an 'xsd:element', 'xsd:sequence', 'xsd:choice', or 'xsd:all' element, a version n+1 schema document may instead have a 'xsd:content' element.

Since a version n processor will not understand how to interpret the xsd:content element, a schema author interested in co-existence with version n processors may wish to specify a version-n-style content model as a fallback. (If the main appeal of xsd:content is that it provides currently unavailable functionality, the v.n fallback will be only an approximation. If the main appeal is that xsd:content is easier to read or more elegant, it seems unlikely that authors will want to provide the fallback for schemas being edited. Once the schema is frozen, however, the old syntax could be added by hand or by machine for portability's sake.)

2.9.6.3. Example 3: New embedded element (II)

Like the xsd:schema element itself, in v.n+1 the other top-level source declarations are also to be allowed to have xsd:check elements. The schema author may wish to provide fallbacks, where appropriate, using (say) key and keyref, which approximate the tests in the check clause and which can be performed by v.n processors.

The proper behavior of a v.n processor is to perform the fallback validation if any is specified, and to ignore the check clause otherwise (possibly with a warning).

2.9.6.4. Example 4: Extension namespace labeling

Addition of an "extension-namespace-prefixes" or "extension-namespaces" attribute to the xsd:schema element. This attribute allows a schema author to declare that certain namespaces should be recognized as containing elements or attributes which are extensions to the XSD specification; these extensions may be recognized and processed by some but not by all conforming processors and should not cause an error. A v.n processor should ignore the attribute (although it might raise an error if it encounters actual extension elements in the schema document in places where it's not prepared to find unknown material).

2.9.6.5. Example 5: new attribute on wildcards

A new local (unqualified) attribute, excluded-namespaces, is added to the xsd:any element, with the intended semantics being that the wildcard will match no elements in the namespaces named in the attribute value.
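
For illustration only, the attribute might be used as shown below. The attribute name comes from the description above. The value syntax (a whitespace-separated list of namespace names), the example.org URIs, and the type and element names are assumptions.

<xsd:complexType name="extensibleType">
  <xsd:sequence>
    <xsd:element name="header" type="xsd:string"/>
    <!-- Hypothetical: matches any element except those in the listed namespaces. -->
    <xsd:any excluded-namespaces="http://example.org/internal http://example.org/legacy"
             processContents="lax"
             minOccurs="0" maxOccurs="unbounded"/>
  </xsd:sequence>
</xsd:complexType>

A processor that ignores the attribute sees an ordinary wildcard whose namespace attribute takes its default value of "##any", so it accepts a superset of the documents intended by the schema author.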

There are various possible desired outcomes; ideally, the WG should be able to achieve any of these.

(a) Version-N processors of the schema language should ignore the excluded-namespaces attribute; no fatal error should be raised (although a warning is in order). The result will be that the version-N processors will correctly accept anything valid according to the version-N+1 schema, but will incorrectly accept some documents which a version-N+1 processor would reject.

(b) Version-N processors of the schema language should ignore the excluded-namespaces attribute only if passed, at run time, a schema for schema documents which shows it as valid. Result as above.

(c) Schema authors should be able to cause version-N processors of the schema language either to ignore the excluded-namespaces attribute or to perform some alternative fallback behavior.

2.9.6.6. Example 6: xsd:all allowed at any level of a content model.

The xsd:all element is allowed not only at the top level of content models, but anywhere (roughly as in SGML). A schema author should have the ability to specify how a version-N processor should behave, either by providing a version-N formulation of a content model particle that can be used instead of the 'all'-group, or by specifying that a version-N processor must fail.

Optionally, the described behavior of the version-N processor may be made conditional on the processor being supplied with a schema for schema documents that certifies the un-understood use of the xsd:all element as valid.
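
As a sketch of what such a schema might contain, the first type below uses xsd:all inside a sequence, which the current language does not permit. The second type is a version-N formulation that accepts the same instances by listing both orders explicitly. The type and element names are assumptions made for illustration, and no syntax for linking the fallback to the new construct is shown, because none is defined here.

<!-- Hypothetical in version N+1: an all-group nested inside a sequence. -->
<xsd:complexType name="addressType">
  <xsd:sequence>
    <xsd:element name="recipient" type="xsd:string"/>
    <xsd:all>
      <xsd:element name="street" type="xsd:string"/>
      <xsd:element name="city"   type="xsd:string"/>
    </xsd:all>
  </xsd:sequence>
</xsd:complexType>
<!-- Version-N formulation: a choice between the two possible orders. -->
<xsd:complexType name="addressTypeFallback">
  <xsd:sequence>
    <xsd:element name="recipient" type="xsd:string"/>
    <xsd:choice>
      <xsd:sequence>
        <xsd:element name="street" type="xsd:string"/>
        <xsd:element name="city"   type="xsd:string"/>
      </xsd:sequence>
      <xsd:sequence>
        <xsd:element name="city"   type="xsd:string"/>
        <xsd:element name="street" type="xsd:string"/>
      </xsd:sequence>
    </xsd:choice>
  </xsd:sequence>
</xsd:complexType>

An exact formulation of this kind grows quickly with the number of elements in the group, so for larger groups a fallback is more likely to be an approximation.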

2.9.6.7. Example 7: grammaticalization of attributes

Instead of requiring a content model followed by specification of attributes, the new schema language allows attribute declarations to occur in the content model. For example, attribute declarations may appear inside a choice to indicate that one or the other, but not both, of the attributes specified may appear. (The effect will be similar to the content models of Relax NG.)

Two possible desired outcomes:

(a) a version-N processor should accept all the attribute declarations, ignoring their context, and interpret the content model as if the attribute declarations were not there. The result will be that the version-N processor will accept all documents valid according to the version N+1 schema, but will not enforce any co-occurrence constraints expressed by the content model.

(b) a version-N processor should fail unless the schema author provides an alternative formulation of the type in terms the version-N processor understands.
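
The kind of declaration described in this example might look like the following sketch. The syntax is hypothetical and is not valid in the current XML Schema language. The type and attribute names are assumptions made for illustration.

<xsd:complexType name="lengthType">
  <!-- Hypothetical in version N+1: one or the other of the attributes, but not both. -->
  <xsd:choice>
    <xsd:attribute name="metres" type="xsd:decimal"/>
    <xsd:attribute name="feet"   type="xsd:decimal"/>
  </xsd:choice>
</xsd:complexType>

Under outcome (a), a version-N processor would treat both as ordinary attribute declarations, so it would also accept an element carrying both attributes.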

2.10. Web services content

2.10.1. Overview

This use case describes the different ways extra content can be added to create new versions of a schema.

These different ways reflect different conditions faced by schema authors. These conditions may manifest themselves as technical constraints (e.g. location of extension points where content can be added) or non-technical constraints (e.g. ownership over a schema).

This use case arose from the area of Web services, where a range of different approaches have already been used to version documents. These documents are generated and consumed by services, which are often operated by different parties under different conditions. Hence, different schema authors may have to work under different constraints.

2.10.1.1. Notation

To clearly distinguish the approaches, a shorthand notation is used that reflects the structure of the instances and the schemas.

An instance is denoted by a sequence of lowercase letters representing each component, in the order that they appear (e.g. "abc"). A type is denoted by a sequence of uppercase letters representing each component, in the order that they appear (e.g. "ABC").

All components are in the same namespace, unless a component is followed by an apostrophe, which indicates that it is in a different namespace from the other components (e.g. ab'c indicates that component b is in a different namespace from components a and c). When referring to types, the letters without apostrophes are in the target namespace of the schema, and those with apostrophes are not in the target namespace (e.g. in AB'C, A and C are in the target namespace).

Schema extensions are indicated by parentheses around the base type, and order is preserved (e.g. "(GF)M").

This use case consists of a number of approaches, each having its own set of scenarios. Since the scenarios for each approach are very similar, we will first describe one approach and all its scenarios in detail, and then the other approaches will be described more briefly in the discussion section.

2.10.1.2. Approach 1: Inserted

Consider a scenario where a travel reservation service accepts XML documents containing names. In the first version of the service, names are defined as containing a given name followed by a family name.

An example name instance is "gf":

<name>
  <given>James</given>
  <family>Maxwell</family>
</name>

An example name schema is "GF":

<xsd:complexType name="name">
  <xsd:sequence>
    <xsd:element name="given"  type="xsd:string"/>
    <xsd:element name="family" type="xsd:string"/>
  </xsd:sequence>
</xsd:complexType>

A new version of the schema is then created to include middle names.

In this first approach, the middle name is inserted between the given name and the family name (which is where it normally appears in common usage). The middle name component has the same namespace as the other components.

This approach illustrates a versioning mechanism that is very flexible in where it allows extensions (i.e. within existing content, rather than only appended to it), and also a situation where the extension is allowed to be in the same namespace as the original type. Both of these are highly desirable, because they allow the new version to appear as cleanly designed as the original.

An example of the instance "gmf" is:

<name>
  <given>James</given>
  <middle>Clerk</middle>
  <family>Maxwell</family>
</name>

The versioned schema is "GMF":

<xsd:complexType name="name">
  <xsd:sequence>
    <xsd:element name="given"  type="nameString"/>
    <xsd:element name="middle" type="nameString" minOccurs="0"/>
    <xsd:element name="family" type="nameString"/>
  </xsd:sequence>
</xsd:complexType>

2.10.2. "Web services content" scenario A: original instance

[AXIS]Instance processed: gf [Original]

[AXIS]Schema Availability: GF [Original]

[AXIS]Operation performed: Schema validation

Brief summary: An instance of the original schema will be accepted by the original schema.

Basic course of events:

1. An instance of the original schema is processed.
2. The processor identifies it as an instance of the original schema.
3. The system processes the instance.

Desired outcome:

Scenario A is straightforward, because it does not involve any versioning.

2.10.3. "Web services content" scenario B: versioned instance

[AXIS]Instance processed: gmf [Versioned]

[AXIS]Schema Availability: GMF [Versioned]

[AXIS]Operation performed: Schema validation

Brief summary: An instance of the new schema will be accepted by the new schema.

Basic course of events:

1. An instance of the new schema is processed.
2. The processor identifies it as an instance of the new schema.
3. The system processes the instance.

Desired outcome:

Scenario B is also straightforward, because it only involves the new version.

2.10.4. "Web services content" scenario C: original instance with new schema

[AXIS]Instance processed: gf [Original]

[AXIS]Schema Availability: GMF [Versioned]

[AXIS]Operation performed: Schema validation

Brief summary: An instance of the original schema will be accepted by the new schema.

Basic course of events:

1. An instance of the original schema is processed.
2. The processor identifies it as a version of the new schema.
3. The system processes the instance.

Desired outcome:

This scenario may occur when an old client invokes a new release of the service.

2.10.5. "Web services content" scenario D: new instance with original schema

[AXIS]Instance processed: gmf [Versioned]

[AXIS]Schema Availability: GF [Original]

[AXIS]Operation performed: Schema validation

Brief summary: An instance of the new schema will be accepted by the original schema.

Basic course of events:

1. An instance of the new schema is processed.
2. The processor identifies it as a version of the original schema.
3. The system processes the instance.

Desired outcome:

This scenario may occur when new Web service clients are used to invoke an original version of the service. Of the four, this is the most important scenario to address. Many Web services deployments fall into this scenario, because often a new schema cannot simply be deployed into existing services.

2.10.6. Discussion

2.10.6.1. Approach 1: Inserted

This approach consists of:

The details of this approach were described in the text above.

The original schema and the versioning mechanisms should allow new elements to be inserted between existing elements in the content model, and should allow extensions to be created in the same namespace as the original type.

From the author's point of view this is highly desirable, because the logical position for the middle name is between the given and family names.

2.10.6.2. Approach 2: Insert from different namespace

This approach consists of:

In this approach the new version adds the middle name in a different namespace.

An example of an instance gm'f is:

<name>
  <given>James</given>
  <NEWNS:middle>Clerk</NEWNS:middle>
  <family>Maxwell</family>
</name>

An example of the schema GM'F is:

<xsd:complexType name="name">
 <xsd:sequence>
   <xsd:element name="given" type="xsd:string"/>
   <xsd:element ref="NEWNS:middle" minOccurs="0"/>
   <xsd:element name="family" type="xsd:string"/>
 </xsd:sequence>
</xsd:complexType> 

Namespaces are a crucial way of determining the version of elements and attributes. The middle name element is created in a new namespace. This may be because the schema author is following a policy that states that all new constructs must belong in new namespaces. Alternatively, the original schema's namespace could be controlled by a different organisation, and the author of the new version does not have the right to make additions to it.

The original schema and the versioning mechanism allow extra elements to occur between existing elements, but only if they are from a different namespace.
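
An original schema that anticipates this kind of insertion might look something like the following sketch. The position of the wildcard and the use of processContents="lax" are assumptions. The latter is chosen so that a processor with no declaration available for NEWNS:middle does not reject the element.

<xsd:complexType name="name">
  <xsd:sequence>
    <xsd:element name="given"  type="xsd:string"/>
    <!-- Extension point: only elements from a different namespace are matched. -->
    <xsd:any namespace="##other" processContents="lax"
             minOccurs="0" maxOccurs="unbounded"/>
    <xsd:element name="family" type="xsd:string"/>
  </xsd:sequence>
</xsd:complexType>

Because the wildcard never matches the family element, the content model stays unambiguous. A wildcard that also admitted elements from the schema's own namespace would overlap with family at this position.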

2.10.6.3. Approach 3: Appended

This approach consists of:

In this approach the new version appends the middle name to the end of the content model. The middle name is placed in the same namespace as the other components.

An example of the instance "gfm" is:

<name>
  <given>James</given>
  <family>Maxwell</family>
  <middle>Clerk</middle>
</name>

The versioned schema is "GFM":

<xsd:complexType name="name">
  <xsd:sequence>
    <xsd:element name="given"  type="nameString"/>
    <xsd:element name="family" type="nameString"/>
    <xsd:element name="middle" type="nameString" minOccurs="0"/>
  </xsd:sequence>
</xsd:complexType>

The original schema and the versioning mechanism allow elements to be appended to the end of content models, and also allow those elements to be in the same namespace as the original type.

This approach reflects the practice of including a wildcard at the end of content models. For example, the original schema GF might have been something like:

<xsd:complexType name="name">
  <xsd:sequence>
    <xsd:element name="given"  type="nameString"/>
    <xsd:element name="family" type="nameString"/>
    <xsd:any namespace="##any" minOccurs="0" maxOccurs="unbounded"/>
  </xsd:sequence>
</xsd:complexType>

2.10.6.4. Approach 4: Appended from different namespace

This approach consists of:

In this approach the new version appends the middle name to the end of the content model. The middle name is placed in a different namespace to the other components.

An example of the instance "gfm'" is:

<name>
  <given>James</given>
  <family>Maxwell</family>
  <NEWNS:middle>Clerk</NEWNS:middle>
</name>

The versioned schema is "GFM'":

<xsd:complexType name="name">
 <xsd:sequence>
   <xsd:element name="given" type="xsd:string"/>
   <xsd:element name="family" type="xsd:string"/>
   <xsd:element ref="NEWNS:middle" minOccurs="0"/>
 </xsd:sequence>
</xsd:complexType> 

This approach reflects the practice of including a wildcard at the end of content models. For example, the original schema might have been something like:

<xsd:complexType name="name">
  <xsd:sequence>
    <xsd:element name="given"  type="nameString"/>
    <xsd:element name="family" type="nameString"/>
    <xsd:any namespace="##other" minOccurs="0" maxOccurs="unbounded"/>
  </xsd:sequence>
</xsd:complexType>

The original schema and the versioning mechanism allow elements to be appended to the end of content models, but those elements have to be from a different namespace.

2.10.6.5. Approach 5: Combination

This approach consists of:

In this approach, the new version adds more than one element to separate places in the content model.

An example of the instance "pgmf" is:

<name>
  <prefix>Professor</prefix>
  <given>James</given>
  <middle>Clerk</middle>
  <family>Maxwell</family>
</name>

The versioned schema is "PGMF":

<xsd:complexType name="name">
  <xsd:sequence>
    <xsd:element name="prefix" type="nameString" minOccurs="0"/>
    <xsd:element name="given"  type="nameString"/>
    <xsd:element name="middle" type="nameString" minOccurs="0"/>
    <xsd:element name="family" type="nameString"/>
  </xsd:sequence>
</xsd:complexType>

This approach illustrates that extension points should be allowed in multiple places, and that they can be used in combination.

2.10.6.6. Approach 6: Multi-phase versioning

This approach consists of:

In this approach, two new versions are created.

Firstly, a version is created by adding a middle name element. An example of the instance "gmf" is:

<name>
  <given>James</given>
  <middle>Clerk</middle>
  <family>Maxwell</family>
</name>

Secondly, another version is created by adding a preferredName element. An example of the instance "gpmf" is:

<name>
  <given>James</given>
  <preferredName>Jim</preferredName>
  <middle>Clerk</middle>
  <family>Maxwell</family>
</name>

This approach emphasises that extension points have to be maintained in subsequent versions.

For example, if the original schema was defined as "given wildcard family," then the wildcard should be replaced with "wildcard middle wildcard" so that future versions will have the same amount of flexibility. The versioned schema containing "given wildcard middle wildcard family" will allow the preferredName element to be added. If the wildcard was simply replaced with "middle" to give "given middle family," then it would not be possible to insert the preferredName element at that location.
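
The effect of keeping the wildcards can be illustrated with a minimal sketch. It assumes that a content model can be approximated by a regular expression over a space-separated list of child element names, which ignores namespaces and the determinism rules of XML Schema. The names WITH_WILDCARDS, WITHOUT_WILDCARDS and accepts are hypothetical and are not part of any proposed mechanism.

```python
import re

# Hypothetical approximation: a content model as a regular expression over
# space-separated child element names. "( \w+)*" stands in for a wildcard.
WITH_WILDCARDS = re.compile(r"given( \w+)* middle( \w+)* family")
WITHOUT_WILDCARDS = re.compile(r"given middle family")


def accepts(model, children):
    """Return True if the sequence of child names matches the model."""
    return model.fullmatch(" ".join(children)) is not None


gmf = ["given", "middle", "family"]
gpmf = ["given", "preferredName", "middle", "family"]

print(accepts(WITH_WILDCARDS, gmf))      # both models accept gmf
print(accepts(WITHOUT_WILDCARDS, gmf))
print(accepts(WITH_WILDCARDS, gpmf))     # only the wildcard model accepts gpmf
print(accepts(WITHOUT_WILDCARDS, gpmf))
```

Only the model that keeps a wildcard on each side of "middle" leaves room for the later preferredName element.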

2.10.6.7. Approach 7: Multi-namespace languages

Approaches 2 and 4 only considered a simple situation where a single namespace was used in the original schema.

Another important situation is when the original schema uses multiple namespaces. For example, consider the instance:

<ns1:name>
  <ns2:first>James</ns2:first>
  <ns3:last>Maxwell</ns3:last>
</ns1:name>

And then a version of it is created, adding an element from yet another namespace (i.e. the namespace with the prefix "ns4"):

<ns1:name>
  <ns2:first>James</ns2:first>
  <ns4:middle>Clerk</ns4:middle>
  <ns3:last>Maxwell</ns3:last>
</ns1:name>

The versioning mechanism needs to support languages that use elements across multiple namespaces. This makes it much more difficult to distinguish what is part of the original schema from what is not.

Schemas using multiple namespaces are common. Also, once a schema has been versioned several times by different parties, the resulting schema may have accumulated content from multiple namespaces. Hence, it is important that schemas containing multiple namespaces can be versioned.

2.10.6.8. Approach 8: Extension

This approach is a variant where the versioned name type is defined as a new type that references the old type.

This approach reflects situations where the author is unable to change the original type. Hence, they can only reference it. This approach may also be needed when the author wishes to preserve the original type for reuse by other types. Hence, it must keep the original type unchanged.

An instance of gfm is:

<name>
  <given>James</given>
  <family>Maxwell</family>
  <middle>Clerk</middle>
</name>

The schema, using extensions, is (GF)M where M is added at the end of the GF content model:

<xsd:complexType name="newNameType">
  <xsd:complexContent>
    <xsd:extension base="nameType">
      <xsd:sequence>
        <xsd:element name="middle" type="xsd:string"/>
      </xsd:sequence>
    </xsd:extension>
  </xsd:complexContent>
</xsd:complexType>

2.10.6.9. Approach 9: Extensibility with ignore unknowns

Three different approaches to extensibility and versioning are described here and in the following two sections. These three approaches illustrate the desired behaviour from schema processors that support versioning/extension by ignoring unexpected content, each in a different way. Ignoring unexpected content can help the validation of new instances against old schemas.

Note: the three approaches are not compatible with each other, because they reflect the expectations of different sets of people.

This approach consists of:

It allows all unknown items, but faults on any known items which are invalid. The unknown item in this example is the prefix element, which comes from a different namespace. The known items are the given and family elements.

<name>
  <ext:prefix>Dr</ext:prefix>
  <given>Emmett</given>
  <family>Brown</family>
</name>

The above instance would be treated by the schema validator as if it was:

<name>
  <given>Emmett</given>
  <family>Brown</family>
</name>

However, the following instance "gfgf" is NOT allowed. It tries to extend the content model by adding additional elements that have already been defined in the content model. The processor will not ignore them, because they are known elements. This instance will be invalid according to the original schema.

<name>  <!-- invalid example -->
  <given>Emmett</given>
  <family>Brown</family>
  <given>Emmett</given>
  <family>Von Braun</family>
</name>

This approach argues that extensions are new items that have not already been defined, and existing constraints cannot be changed. For example, an extension cannot be created that has multiple repetitions of the same element, if the original schema has already defined that element to have maxOccurs="1".
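
The behaviour described above can be sketched as follows. This is only an illustration under stated assumptions: the original schema is reduced to an ordered list of required child names in no namespace, anything in a namespace is treated as "unknown", and the namespace URI bound to the ext prefix is hypothetical (the examples above do not declare one).

```python
import xml.etree.ElementTree as ET

# Assumption: the original schema "GF" is reduced to an ordered list of
# required child names; the ext namespace URI is hypothetical.
KNOWN_MODEL = ["given", "family"]
EXT_NS = "http://example.org/ext"


def validate_ignoring_unknowns(xml_text):
    """Drop children from other namespaces, then require an exact match."""
    root = ET.fromstring(xml_text)
    # Children in a namespace appear as "{uri}local"; the known elements
    # in this sketch are in no namespace, so their tags have no "{".
    known = [child.tag for child in root if not child.tag.startswith("{")]
    return known == KNOWN_MODEL


extended = f"""<name xmlns:ext="{EXT_NS}">
  <ext:prefix>Dr</ext:prefix>
  <given>Emmett</given>
  <family>Brown</family>
</name>"""

gfgf = """<name>
  <given>Emmett</given>
  <family>Brown</family>
  <given>Emmett</given>
  <family>Von Braun</family>
</name>"""

print(validate_ignoring_unknowns(extended))  # unknown ext:prefix is ignored
print(validate_ignoring_unknowns(gfgf))      # repeated known elements are a fault
```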

2.10.6.10. Approach 10: Extensibility with ignore unexpected knowns

This approach consists of:

This second extensibility approach allows extensions to come from the same namespace as the original schema. More specifically, it also allows existing elements in the content model to be used in extensions.

For example, the following instance p'gfgf would be allowed:

<name>
  <ext:prefix>Dr</ext:prefix>
  <given>Emmett</given>
  <family>Brown</family>
  <given>Emmett</given>
  <family>Von Braun</family>
</name>

The processor would treat the instance as if the extra occurrences of elements that have already been defined were not present. It will treat the instance as:

<name>
  <ext:prefix>Dr</ext:prefix>
  <given>Emmett</given>
  <family>Brown</family>
</name>

The processor does not simply "ignore unknowns," but effectively "ignores unknowns in that particular location in the content model." The processor prunes those additional elements.

However, the elements from other namespaces are not simply ignored (which is what the next approach does). Hence, the original schema must allow for them (e.g. using wildcards in the content model) otherwise the extended instance will fail to validate against the original schema because of the extra ext:prefix element.

2.10.6.11. Approach 11: Extensibility with ignore all unexpected

This approach consists of:

This third extensibility approach allows extensions to come from both the same namespace as the original schema as well as other namespaces.

In this approach, the schema processor or consuming application handles versioning by simply ignoring all unexpected content. It ignores elements which were not a part of the original content model. It also ignores any extra occurrences of elements that were in the original content model.

The instance (p'gfgf):

<name>
  <ext:prefix>Dr</ext:prefix>
  <given>Emmett</given>
  <family>Brown</family>
  <given>Emmett</given>
  <family>Von Braun</family>
</name>

will be treated as:

<name>
  <given>Emmett</given>
  <family>Brown</family>
</name>

Unlike the previous approach, the elements from external namespaces are also ignored by the schema processor. The original schema author does not have to allow for them.
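
A minimal sketch of this pruning is shown below. It assumes that the original content model is a simple ordered list of required child names in no namespace, and that the processor keeps the first in-order occurrence of each expected child. The function name prune_unexpected and the ext namespace URI are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: the original content model is an ordered list of required
# child names in no namespace; the ext namespace URI is hypothetical.
KNOWN_MODEL = ["given", "family"]


def prune_unexpected(xml_text):
    """Keep the first in-order occurrence of each expected child, drop the rest."""
    root = ET.fromstring(xml_text)
    expected = iter(KNOWN_MODEL)
    wanted = next(expected, None)
    for child in list(root):
        if child.tag == wanted:
            wanted = next(expected, None)  # matched; move on to the next particle
        else:
            root.remove(child)  # unknown element or extra occurrence
    complete = wanted is None  # True only if every required child was found
    return root, complete


instance = """<name xmlns:ext="http://example.org/ext">
  <ext:prefix>Dr</ext:prefix>
  <given>Emmett</given>
  <family>Brown</family>
  <given>Emmett</given>
  <family>Von Braun</family>
</name>"""

pruned, complete = prune_unexpected(instance)
print(complete)
print([(child.tag, child.text) for child in pruned])
```

Here the ext:prefix element and the second given and family elements are all removed, leaving the "gf" instance.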

2.11. Web services mustUnderstand

2.11.1. Overview

In this use case, a schema is extended with information that must be processed -- and the versioning mechanism must ensure that that condition is obeyed.

Consider the previous example of a name, consisting of a given name and a family name.

An example name instance is "gf":

<name>
  <given>James</given>
  <family>Maxwell</family>
</name>

An example name schema is "GF":

<xsd:complexType name="name">
  <xsd:sequence>
    <xsd:element name="given"  type="xsd:string"/>
    <xsd:element name="family" type="xsd:string"/>
  </xsd:sequence>
</xsd:complexType>

A middle name is added in a new version.

An example name instance is "gmf":

<name>
  <given>James</given>
  <middle v:mustUnderstand="true">Clerk</middle>
  <family>Maxwell</family>
</name>

The instance must somehow indicate that the middle name is an element which must be processed, for example through the use of a special attribute or of the xsi:type attribute.

An example new name type is "GMF":

<xsd:complexType name="name">
  <xsd:sequence>
    <xsd:element name="given" type="xsd:string"/>
    <xsd:element name="middle" type="xsd:string" minOccurs="0"/>
    <xsd:element name="family" type="xsd:string"/>
  </xsd:sequence>
</xsd:complexType>

It is required that a consumer must be able to process the middle name. It is an error for them to not process it.

The ability to process it, and how it is processed, is dependent on the processing application. However, from a schema validation point of view, an instance should be considered invalid if the item that must be processed is not known to the schema.
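
The validation rule in the previous paragraph can be sketched as follows. This is an illustration only: each schema is reduced to the set of child names it declares, and the v prefix is bound to a hypothetical namespace URI, because this document does not say which namespace the mustUnderstand attribute belongs to.

```python
import xml.etree.ElementTree as ET

# Assumption: the "v" prefix is bound to this hypothetical namespace; the
# source does not say which namespace the attribute belongs to.
V_NS = "http://example.org/versioning"
MUST_UNDERSTAND = f"{{{V_NS}}}mustUnderstand"

GF = {"given", "family"}             # names known to the old schema
GMF = {"given", "middle", "family"}  # names known to the new schema


def unknown_must_understand(xml_text, known_names):
    """Return flagged children that the given schema does not know."""
    root = ET.fromstring(xml_text)
    return [
        child.tag
        for child in root
        if child.get(MUST_UNDERSTAND) == "true" and child.tag not in known_names
    ]


gmf = f"""<name xmlns:v="{V_NS}">
  <given>James</given>
  <middle v:mustUnderstand="true">Clerk</middle>
  <family>Maxwell</family>
</name>"""

print(unknown_must_understand(gmf, GF))   # non-empty: instance should be invalid
print(unknown_must_understand(gmf, GMF))  # empty: the new schema knows "middle"
```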

2.11.2. "Web services mustUnderstand" scenario A: Extension with old schema

[AXIS]Instance processed: gmf [New instance]

[AXIS]Schema Availability: GF [Old schema]

[AXIS]Operation performed: Schema validation

Brief summary: A processor using the old schema processes a new instance that includes content that must be processed.

Basic course of events:

1. An instance of GMF is processed.
2. The system identifies that it is a version of GF.
3. It processes it using schema GF.

Desired outcome:

2.11.3. Discussion

This is similar to the "major version change incompatible" scenario.

2.12. Web services container types

2.12.1. Overview

One pattern for structuring data in a flexible and extensible way is to define container types. A framework document format is defined with placeholder container elements that can contain arbitrary content. New versions of the document can contain additional information in the container elements.

The Web services SOAP message is a good example of this approach. The SOAP envelope consists of an optional header element and a mandatory body element. The header element and body element can each contain arbitrary elements.

SOAP has the ability to add content that does not affect the application data. This is realized in the SOAP header and body extensibility model. Pushing the "application" data into a body element enables extensibility in headers that is independent of, and loosely coupled to, the application message.

A simple schema that illustrates this concept is the message format defined by:

<xsd:element name="Envelope">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element name="Header" type="xsd:anyType" minOccurs="0"/>
      <xsd:element name="Body" type="xsd:anyType"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>

An instance would look something like:

<env:Envelope>
  <env:Header>
    ...
  </env:Header>
  <env:Body>
    ...
  </env:Body>
</env:Envelope>

The container approach allows extensibility and versioning in the instance, description, and software. Having separate headers allows the message to be extended/versioned without having to change the body of the message. Thus, software which depends on the message body will not be affected.

Interestingly, pushing the application message into the body, which is a constraint on the structure, opens up more extension/versioning possibilities. A common theme is that the published description is the sum of all the message aspects (e.g. application message + reliable messaging headers + security headers + addressing headers). In software, this can be realized by writing handlers for each extension, without needing to change the application message processing part.

XML Schema does not enable a container type (like SOAP envelope with body and header children) to be created which also describes the permissible content of the header or body. Thus WSDL 1.1 created the part and header constructs to associate constraints on the header or body of a SOAP envelope. WSDL 2.0 created an operation with in or out messages that define an abstract body, which combined with a SOAP binding defines the allowable content of a SOAP body. WSDL 2.0 also created a header binding extension to allow constraints on the SOAP header element.

Figure: container

2.12.2. "Web services container types" scenario A: Specific extensions

There is a need to be able to constrain the contents of the Body element for a particular usage of the container format.

In this example, a travel agent service wishes to define a flight message as a version of the message envelope. The flight message must be a message with a flight in the body of the message.

This example flight message may be sent to a service that indicates whether there are seats available on the flight.

<env:Envelope>
  <env:Body>
    <f:flight code="XS011">
      <f:departure airport="LHR">London Heathrow</f:departure>
      <f:arrival airport="BOS">Boston</f:arrival>
    </f:flight>
  </env:Body>
</env:Envelope>

The versioning mechanism must allow such a constraint to be specified, because the flight query service does not accept any other types of messages.
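
The kind of constraint required here can be sketched as an application-level check, since the envelope schema above cannot express it. The sketch assumes that "a message with a flight in the body" means a Body with exactly one flight child. Both namespace URIs are hypothetical, as the example shows only the env and f prefixes.

```python
import xml.etree.ElementTree as ET

# Assumption: both namespace URIs are hypothetical; the source shows only
# the "env" and "f" prefixes.
ENV_NS = "http://example.org/envelope"
FLIGHT_NS = "http://example.org/flight"


def is_flight_message(xml_text):
    """Require an Envelope whose Body holds exactly one flight element."""
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{ENV_NS}}}Envelope":
        return False
    body = root.find(f"{{{ENV_NS}}}Body")
    if body is None:
        return False
    return [child.tag for child in body] == [f"{{{FLIGHT_NS}}}flight"]


flight_message = f"""<env:Envelope xmlns:env="{ENV_NS}" xmlns:f="{FLIGHT_NS}">
  <env:Body>
    <f:flight code="XS011">
      <f:departure airport="LHR">London Heathrow</f:departure>
      <f:arrival airport="BOS">Boston</f:arrival>
    </f:flight>
  </env:Body>
</env:Envelope>"""

empty_message = f"""<env:Envelope xmlns:env="{ENV_NS}">
  <env:Body/>
</env:Envelope>"""

print(is_flight_message(flight_message))
print(is_flight_message(empty_message))
```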

2.12.3. "Web services container types" scenario B: Optional extensions

A different service, such as one for choosing seats, allows for optional seat preference information to be added to the header.

<env:Envelope>
  <env:Header>
    <e:seatPreference location="rear" position="window"/>
  </env:Header>
  <env:Body>
    <f:flight code="XS011">
      <f:departure airport="LHR">London Heathrow</f:departure>
      <f:arrival airport="JFK">New York JFK</f:arrival>
    </f:flight>
  </env:Body>
</env:Envelope>

2.12.4. "Web services container types" scenario C: Must understand extensions

A particular travel agent cares for travellers with special medical needs. Their reservation client adds to the header dietary requests, but it must ensure that the processing service handles the dietary request (and does not simply ignore it). In SOAP, this is signaled with a mustUnderstand attribute.

<env:Envelope>
  <env:Header>
    <d:diet env:mustUnderstand="true">
      <d:meal type="vegetarian"/>
    </d:diet>
  </env:Header>
  <env:Body>
    <f:flight code="XS011">
      <f:departure airport="LHR">London Heathrow</f:departure>
      <f:arrival airport="JFK">New York JFK</f:arrival>
    </f:flight>
  </env:Body>
</env:Envelope>

The versioning mechanism needs a way to signal that specific additions are not to be silently ignored. This has already been described in the "Web services mustUnderstand" use case.

2.12.5. "Web services container types" scenario D: Recursive extensions

There is also the requirement to recursively support extensions. A particular header block extension may itself have extensions. Those extensions will need to be specified, designated mandatory or optional, and possibly flagged as something that the receiver must understand.

<env:Envelope>
  <env:Header>
    <d:diet env:mustUnderstand="true">
      <d:meal type="vegetarian"/>
      <s:healthy-choice selection="true"/>
      <s:allergies foodtype="peanuts" d:mustUnderstand="true"/>
    </d:diet>
  </env:Header>
  <env:Body>
    <f:flight code="XS011">
      <f:departure airport="LHR">London Heathrow</f:departure>
      <f:arrival airport="JFK">New York JFK</f:arrival>
    </f:flight>
  </env:Body>
</env:Envelope>

Currently, SOAP does not typically support recursion.

2.13. Web services and encryption

2.13.1. Overview

This use case considers an encrypted instance to be a "version" of the original schema.

Security is an important part of designing Web services, as well as in other applications. One of the most commonly used security mechanisms is making data confidential by encrypting it.

The W3C XML Encryption specification defines a mechanism that allows a portion of the document to be encrypted. That portion could be as small as a single element, or as large as the entire document. The encrypted portion is replaced with an <enc:EncryptedData> element that contains the encrypted data.

There does not appear to be a universal order for validation and decryption. The simple case is performing decryption first, and then validating the unencrypted document normally.

The more complicated case is performing validation without decryption. This case could occur when an intermediary needs to validate the instance before forwarding it on. As an intermediary, it does not have access to the keys to decrypt the data. The document can only be partially validated. The best it can do is to validate the unencrypted portions of the instance, and make assumptions about the encrypted portions. However, this is still useful, because it shows that the unencrypted portions are valid, and the intermediary can use the unencrypted data.

In this use case, we are making no assumptions about which portions of the instance are being encrypted. In some cases, the portions that will be encrypted will be known when the original schema is being designed. However, in other cases, the portions which will be encrypted will not be known at schema design time.

For example, consider an unencrypted instance "gf":

<name>
 <given>James</given>
 <family>Maxwell</family>
</name>

Then consider an instance, ge', that has the family name element encrypted:

<name>
 <given>James</given>
 <enc:EncryptedData>5a6e6b6a727979</enc:EncryptedData>
</name>
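
Partial validation of such an instance can be sketched as follows. This is a simplified illustration: schema GF is reduced to an ordered list of required child names, and each enc:EncryptedData element is assumed to replace exactly one whole element, as in the example above. The function name partially_validate is hypothetical.

```python
import xml.etree.ElementTree as ET

ENC_NS = "http://www.w3.org/2001/04/xmlenc#"  # XML Encryption namespace
ENCRYPTED = f"{{{ENC_NS}}}EncryptedData"
GF_MODEL = ["given", "family"]  # simplified stand-in for schema GF


def partially_validate(xml_text):
    """Return (ok, unverified): ok is False on a definite mismatch;
    unverified lists expected names hidden behind EncryptedData."""
    children = list(ET.fromstring(xml_text))
    if len(children) != len(GF_MODEL):
        return False, []
    unverified = []
    for child, expected in zip(children, GF_MODEL):
        if child.tag == ENCRYPTED:
            unverified.append(expected)  # assumed valid, cannot be checked
        elif child.tag != expected:
            return False, []
    return True, unverified


ge = f"""<name xmlns:enc="{ENC_NS}">
  <given>James</given>
  <enc:EncryptedData>5a6e6b6a727979</enc:EncryptedData>
</name>"""

swapped = "<name><family>Maxwell</family><given>James</given></name>"

print(partially_validate(ge))
print(partially_validate(swapped))
```

The intermediary learns that the unencrypted given element is in the right place, and that the family element could not be checked.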

2.13.2. "Encryption" scenario A: partial validation

[AXIS]Instance processed: ge'

[AXIS]Schema Availability: GF and E'

[AXIS]Operation performed: Schema validation

Brief summary: The travel reservation software requires that clients transform the last name part of the name using an encryption algorithm. It may or may not create a subtype specifying exactly where the encryption occurs. The WS-Security specification states that a list of what has been encrypted in the message is stored in the WS-Security header. Assuming there is no subtype, the travel agent system may want to validate the message without failing on the encrypted parts.

Basic course of events:

1. Instance ge' is processed by a system that has GF and E' schemas.
2. The system identifies that ge' is a valid instance of GF with an E transformation of F.
3. The system can process the instance.

Desired outcome:

2.13.3. Discussion

This use case raises interesting questions about the treatment of partial validity when it comes to versioning.

2.14. Web services and WSDL

The Web Services Description Working Group encountered the need for XML Schema versioning in the development of WSDL 2.0; see the comment in <http://www.w3.org/2005/05/25-schema/WSDL.html>.

Additional requirements on versioning from experiences with WSDL are briefly described below.

2.14.1. Ordering

WSDL specifies the ordering of the WSDL constructs. These constructs occur in multiple namespaces, which makes it impossible to specify the ordering using the features of XML Schema 1.0. It is desirable to specify the ordering of the content model when multiple namespaces are involved and when those namespaces change with different versions of the schema.

For further details on the WSDL ordering, see <http://www.w3.org/TR/2005/WD-wsdl20-primer-20050803/#element-order>.

2.14.2. Binding

Another type of extension and versioning is the binding and interface construct in WSDL. A binding takes all the operations in a particular interface and adorns them and the interface itself with binding specific information. The WSDL 2.0 primer binding basics <http://www.w3.org/TR/2005/WD-wsdl20-primer-20050803/#basics-binding> shows a binding that extends an interface and "mixes in" wsoap:protocol, wsoap:mep, and wsoap:code values. It seems possible for a description language to allow definition of a type based upon another complex type and adding multiple and different definitions throughout the base type.

2.14.3. Interface extension

Interfaces can be constructed by extending (i.e. versioning) another interface, as explained in <http://www.w3.org/TR/2005/WD-wsdl20-primer-20050803/#more-interfaces-inheritance>. All of the constructs in the "base" interface become part of the "derived" interface. There are numerous complications to this, such as behaviour if duplicate target QNames are found. There is a constraint that the derived interface may only have the same constructs as an interface (such as containing operations).

2.14.4. Constraining location of extensions

Many Web services specifications, such as WS-Addressing, define WSDL extensions. The schema does not allow any constraints upon the location of the extensions, such as requiring that wsa:UsingAddressing only occur as a child of the soap:binding element.

2.14.5. Targeting extensions

Sometimes certain extensions need to be targeted to specific processors.

This can be found in the role attribute in SOAP 1.2 (actor attribute in SOAP 1.1), which identifies specific nodes that must process a particular header item.

3. Acknowledgements

Many people have contributed ideas, material and feedback that have improved this document. In particular, the editor acknowledges contributions from:

Special thanks to David Orchard for providing the Web services related use cases.

The editor also wishes to acknowledge the contributions from the members of the XML Schema Working Group.