XML Schema Versioning Use Cases

[SUBTITLE]Draft - 15 April 2005

[SUBTITLE]W3C XML Schema Working Group

[VERSION]This version: http://www.w3.org/XML/2005/xsd-versioning-use-cases/2005-04-15.html

[VERSION]Latest version: http://www.w3.org/XML/2005/xsd-versioning-use-cases/

[VERSION]Previous version: http://www.w3.org/XML/2005/xsd-versioning-use-cases/2005-04-14.html

[VERSION]Editor:

[EDITORS]Hoylen Sue, DSTC Pty Ltd <h.sue@dstc.edu.au>

Abstract

This document describes use cases in which XML Schemas are versioned: situations that involve more than one XML Schema, where those schemas are related to each other in some way. The use cases describe the behaviour desired from XML Schema processors when they encounter different versions of schemas and the instances defined by them.

This document has been produced by the W3C XML Schema Working Group to serve as input to the Working Group's work on the versioning of XML Schemas. It is used to define the types of versioning situations that XML Schema should address.

The use cases illustrate the types of versioning problems that should be solved by versioning mechanisms that might be added to XML Schema. XML Schema might not be able to solve all the use cases, but it is hoped that it can solve a majority of them.

Status of this Document

This is a draft discussion document. Some of the use cases have been extensively discussed in the Working Group. However, the current set of use cases and text describing them have not been endorsed by the Working Group. The current document is in draft form, and is subject to change.

These use cases are based on real examples submitted by users of XML Schema.

The XML Schema Working Group would welcome additional use cases that illustrate aspects of versioning not captured by the existing use cases.

Table of contents

1. Introduction
1.1. Classification
1.1.1. Schema Availability
1.1.2. Instance Processed
1.1.3. Operation Performed
1.2. Common requirements
1.2.1. Multiple generational versioning
1.2.2. Author friendly
1.2.3. Uses only schema and instance
1.2.4. Fallback
1.3. Terminology
2. Use cases
2.1. Major-minor
2.1.1. Overview
2.1.2. "Major-minor" scenario A: Minor without new schemas
2.1.3. "Major-minor" scenario B: Major without new schemas
2.1.4. "Major-minor" scenario C: Minor with new schemas
2.1.5. "Major-minor" scenario D: Major with new schemas
2.1.6. Discussion
2.2. Object-oriented
2.2.1. Overview
2.2.2. "Object-oriented" scenario A: Employee as a person
2.2.3. "Object-oriented" scenario B: Manager as a person
2.2.4. "Object-oriented" scenario C: Manager in payroll
2.2.5. "Object-oriented" scenario D: Senior Manager in payroll
2.2.6. Discussion
2.3. Ignore-unknowns
2.3.1. Overview
2.3.2. "Ignore-unknowns" scenario A: Price management to cash register
2.3.3. "Ignore-unknowns" scenario B: Fuel pump to cash register
2.3.4. "Ignore-unknowns" scenario C: Price management to volume tracker
2.3.5. "Ignore-unknowns" scenario D: Fuel pump to volume tracker
2.3.6. Discussion
2.4. Comparison
2.4.1. Overview
2.4.2. "Comparison" scenario A: Cash register and price management system
2.4.3. "Comparison" scenario B: Cash register and fuel pump
2.4.4. "Comparison" scenario C: Volume tracker and price management
2.4.5. "Comparison" scenario D: Volume tracker and fuel pump
2.4.6. Discussion
2.5. Specialization
2.5.1. Overview
2.5.2. "Specialization" scenario A: Hospital to GP
2.5.3. "Specialization" scenario B: GP to GP
2.5.4. "Specialization" scenario C: Hospital to hospital
2.5.5. "Specialization" scenario D: GP to hospital
2.5.6. "Specialization" scenario E: Physiotherapy to hospital
2.5.7. Discussion
2.6. Renaming
2.6.1. Overview
2.6.2. "Renaming" scenario A: Hospital to GP
2.6.3. "Renaming" scenario B: GP to GP
2.6.4. "Renaming" scenario C: Hospital to hospital
2.6.5. "Renaming" scenario D: GP to hospital
2.6.6. "Renaming" scenario E: Physiotherapy to hospital
2.6.7. Discussion
2.7. Customization
2.7.1. Overview
2.7.2. "Customization" scenario A: Video head office to Big-store head office
2.7.3. "Customization" scenario B: Video warehouse to Video head office
2.7.4. "Customization" scenario C: Big-store head office to Big-store branch
2.7.5. "Customization" scenario D: Big-store to Video head office
2.8. MathML
2.9. XSD versioning
2.9.1. Overview
2.9.2. "xsd" scenario A: New instance with new processor with new schema
2.9.3. "xsd" scenario B: New instance with new processor without new schema
2.9.4. "xsd" scenario C: New instance with old processor with new schema
2.9.5. "xsd" scenario D: New instance with old processor without new schema
2.9.6. "xsd" scenario E: Old instance
2.9.7. Discussion
2.9.7.1. Example 1: New check constraints
2.9.7.2. Example 2: New embedded element (I)
2.9.7.3. Example 3: New embedded element (II)
2.9.7.4. Example 4: Extension namespace labeling
2.9.7.5. Example 5: new attribute on wildcards
2.9.7.6. Example 6: xsd:all allowed at any level of a content model.
2.9.7.7. Example 7: grammaticalization of attributes

1. Introduction

It is important to be able to create different versions of XML Schemas. In some applications, schemas need to change over time to meet new requirements that may emerge. It is often not practical to simultaneously replace all the deployments of the old schemas with the new ones, so applications will need to cope with different versions coexisting in the system. Hence, versioning mechanisms in XML Schema should allow new versions to be created, and schema processors need to handle instances defined by the different versions.

This document describes the desirable behaviours for use cases that involve XML Schema versioning. The use case approach aims to describe external interactions with the system (in this case, the system is the XML Schema processor). The use cases deliberately do not describe any implementation-specific mechanisms. Possible versioning mechanisms are discussed in the "Framework for discussion of versioning" <http://www.w3.org/XML/2004/02/xsdv.html>.

This document focuses on the versioning of XML Schemas, and in particular on the behaviour of XML Schema processors, which perform schema validation and expose information via the Post-Schema Validation Infoset (PSVI). Schema versioning is also important to other types of systems, such as applications and code generators that use XML Schemas. However, those other aspects of versioning are outside the scope of this document.

For more general information on versioning, the W3C Technical Architecture Group (TAG) is producing a TAG Finding about Versioning on the Web.

Discussion of versioning is being conducted on the <public-xml-versioning@w3.org> mailing list. Please send your comments on this document to that mailing list.

1.1. Classification

Several axes have been identified to help classify the different types of use cases. These axes are:

1.1.1. Schema Availability

This axis indicates which schemas are available to the processor. Schemas will be identified by a letter, for example "B". The notation "V(B)" will be used to indicate that a schema called V is a version of schema B.

A version of a schema may be versioned again. The notation "W(V(B))" indicates that a schema called W is a version of V, and V itself is a version of B.
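The notation can be read mechanically as a lineage, newest version first. The following is a minimal sketch in Python, offered only as an illustration; the function names `parse_lineage` and `format_lineage` are hypothetical and are not part of any XML Schema mechanism.

```python
from typing import List


def parse_lineage(notation: str) -> List[str]:
    """Split a notation such as 'W(V(B))' into ['W', 'V', 'B'] (newest first)."""
    depth = notation.count("(")
    # Every opening parenthesis must be closed at the very end of the string.
    if notation.count(")") != depth or not notation.endswith(")" * depth):
        raise ValueError("unbalanced notation: %r" % notation)
    core = notation[: len(notation) - depth]
    names = core.split("(")
    if any(not name for name in names):
        raise ValueError("empty schema name in: %r" % notation)
    return names


def format_lineage(names: List[str]) -> str:
    """Inverse of parse_lineage: ['W', 'V', 'B'] becomes 'W(V(B))'."""
    return "(".join(names) + ")" * (len(names) - 1)
```

Under this reading, the last name in the list is always the original base schema, and every earlier name is a version of the one that follows it.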

The availability of schemas will depend on the schema processor and the application. Sometimes they can be updated with the new schemas. Schemas could be manually installed into the application, or the processor could automatically fetch them when they are needed. At other times, processors might not be configurable with the new schemas. They may be embedded devices that cannot be easily updated, or they might be disconnected from a network and can't automatically fetch any new schemas. Fetching external schemas might also be disallowed due to performance or security reasons.

1.1.2. Instance Processed

This axis indicates what kind of instance document is being processed. That is, which version of the schema the instance corresponds to. The same notation as for schemas will be used to denote the different versions of instances.

A schema processor needs to handle instances of different versions. It may accept, partially accept, or reject the instance. The desired behaviour of schema processors is the subject of these use cases.

1.1.3. Operation Performed

This axis describes what action is being performed by the XML Schema processor.

The most common action is schema validation, where an instance document is validated against an XML Schema. This produces a result indicating whether the document is valid, and a Post-Schema Validation Infoset (PSVI). This is what current schema processors do.

A new operation in versioning is the comparison of two schemas to see if they are versions of each other. Existing schema processors do not perform this function.

1.2. Common requirements

1.2.1. Multiple generational versioning

The versioning mechanism must allow for multiple generations of versioning. It must be scalable, so that it can be used many times to create new versions of already versioned schemas.

1.2.2. Author friendly

The versioning mechanism must produce instance documents which are suitable for manual editing and viewing. This must hold when multiple generations of versioning are in effect. Editing an N-th generation document should be as simple as editing a first generation document.

1.2.3. Uses only schema and instance

The versioning mechanism must work within the framework of XML instance documents and XML Schemas.

It must not rely on any external mechanisms. For example, some of the versioning use cases can be solved by using a negotiation process to agree upon the version being used between the sender and the receiver. However, negotiation is not a part of schema processing and is not possible in some situations (e.g. for static XML files, or when there is only a one way communication channel). Hence, negotiation cannot be relied upon as a suitable XML Schema versioning mechanism.

1.2.4. Fallback

The use cases specify the desired behaviour for situations where a schema processor encounters something that it did not expect.

There could be mechanisms that allow alternative fallback behaviours to be specified. These would be used instead of the desired behaviour in the schema processor.

The alternative fallback behaviour could be specified in a number of different places. The behaviour may come from:

The precedence of the fallbacks, when more than one is used, needs to be defined.

1.3. Terminology

The term "backward compatible" will mean that an instance document defined by an old schema can be processed by an application that handles the new schema.

The term "forward compatible" will mean that an instance document defined by a new schema can be processed by an application that handles the old schema.
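The two definitions differ only in which side is newer. As a small illustrative sketch (not a proposed mechanism), assume that schema versions are numbered by generation, with 0 as the oldest; that numbering and the function name `compatibility_required` are assumptions made for this example only.

```python
def compatibility_required(instance_generation: int,
                           application_generation: int) -> str:
    """Name the kind of compatibility needed to process an instance.

    ASSUMPTION: versions are numbered by generation (0 = oldest). The
    document itself does not prescribe any numbering scheme.
    """
    if instance_generation < application_generation:
        # Old instance, application handles a newer schema.
        return "backward"
    if instance_generation > application_generation:
        # New instance, application handles an older schema.
        return "forward"
    # Instance and application agree on the schema version.
    return "none"
```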

[TODO]Consider defining four terms, adding terms that relate to schema compatibility and not just instance document compatibility.

2. Use cases

2.1. Major-minor

2.1.1. Overview

In this use case a schema is versioned many times. Some of those versions have insignificant changes made to them, but some others will undergo significant changes. The versioning mechanism needs a way to distinguish between the two types of changes.

In the application world, the concept of major and minor versions is commonly used for software releases. This convention is often used to convey information about interoperability: minor versions are guaranteed to be compatible, whereas major versions carry no such guarantee.

What constitutes an incompatible change depends on the particular application using the schemas. Hence, the distinction cannot be automatically determined by examining the schemas.

For example, consider a schema for a book store's inventory system. This will be known as schema B. It contains:

<xsd:complexType name="booktype">
  <xsd:sequence>
    <xsd:element name="title" type="xsd:string"/>
    <xsd:element name="author" type="xsd:string"/>
    <xsd:element name="price" type="xsd:decimal"/>
    <xsd:any minOccurs="0"/>
  </xsd:sequence>
</xsd:complexType>

An example instance of B is:

<book>
  <title>Fun with XML Schemas</title>
  <author>A. Writer</author>
  <price>39.95</price>
</book>

A minor version of the schema is created, schema V(B). The only change is the addition of an optional editor element. Schema V could contain:

<xsd:complexType name="booktype">
  <xsd:sequence>
    <xsd:element name="title" type="xsd:string"/>
    <xsd:element name="author" type="xsd:string"/>
    <xsd:element name="price" type="xsd:decimal"/>
    <xsd:element name="editor" type="xsd:string" minOccurs="0"/>
  </xsd:sequence>
</xsd:complexType>

An example instance of V(B) is:

<book>
  <title>More fun with XML Schemas</title>
  <author>A. Writer</author>
  <price>39.95</price>
  <editor>B. Jones</editor>
</book>

A major version of the schema is also created, schema W(B). The change in this major version is the addition of an optional discount element. For this particular application, the "price" must be interpreted along with the "discount", so this is an incompatible change. The old applications (that used schema B) should not treat instances of W as instances of B, otherwise they will be interpreting the data incorrectly.

The schema author has decreed this is a significant version change. Structurally, the change is the same as the minor change, but the impact on processing applications makes it a major change. In some cases, non-technical reasons may influence whether a version is minor or major (e.g. a business decision to force system incompatibility). So the change by itself is not sufficient to indicate whether it is a minor or major version change.

The schema W(B) could contain:

<xsd:complexType name="booktype">
  <xsd:sequence>
    <xsd:element name="title" type="xsd:string"/>
    <xsd:element name="author" type="xsd:string"/>
    <xsd:element name="price" type="xsd:decimal"/>
    <xsd:element name="discount" type="xsd:decimal" minOccurs="0"/>
  </xsd:sequence>
</xsd:complexType>

An example instance of W(B) is:

<book>
  <title>New adventures with XML Schemas</title>
  <author>A. Writer</author>
  <price>39.95</price>
  <discount>0.90</discount>
</book>

In the following scenarios, the term "old" will be used to refer to applications which only process the base schema B, and the term "updated" will be used to refer to applications that can also process the new versions V or W.
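To make the risk of the major change concrete, the following sketch contrasts how an old and an updated application might read the example instance of W(B). It is only an illustration: the source does not say how "price" and "discount" combine, so the code ASSUMES that the discount is a multiplier applied to the price. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

W_INSTANCE = """
<book>
  <title>New adventures with XML Schemas</title>
  <author>A. Writer</author>
  <price>39.95</price>
  <discount>0.90</discount>
</book>
""".strip()


def old_price(xml_text: str) -> Decimal:
    """An "old" application: it knows only schema B and reads price alone."""
    book = ET.fromstring(xml_text)
    return Decimal(book.findtext("price"))


def updated_price(xml_text: str) -> Decimal:
    """An "updated" application: interprets price together with discount.

    ASSUMPTION: discount is a multiplier; a missing discount means 1.
    """
    book = ET.fromstring(xml_text)
    price = Decimal(book.findtext("price"))
    discount = Decimal(book.findtext("discount", default="1"))
    return price * discount
```

Under that assumption the two applications arrive at different amounts for the same instance, which is why the old application should not silently treat an instance of W as an instance of B.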

2.1.2. "Major-minor" scenario A: Minor without new schemas

[AXIS]Instance processed: V(B) [Minor version]

[AXIS]Schema Availability: B [Base]

[AXIS]Operation performed: Schema validation

Brief summary: The old inventory system processes an instance that has a minor version change.

Basic course of events:

1. An instance of V is processed.
2. The system identifies that it is a minor version of B.
3. It processes it using schema B.

Desired outcome:

2.1.3. "Major-minor" scenario B: Major without new schemas

[AXIS]Instance processed: W(B) [Major version]

[AXIS]Schema Availability: B [Base]

[AXIS]Operation performed: Schema validation

Brief summary: The old inventory system processes an instance that has a major version change.

Basic course of events:

1. An instance of W is processed.
2. The system identifies that it is a major version change from B.
3. The system cannot process the instance.

Desired outcome:

2.1.4. "Major-minor" scenario C: Minor with new schemas

[AXIS]Instance processed: V(B) [Minor version]

[AXIS]Schema Availability: B and V(B) [Base and Minor]

[AXIS]Operation performed: Schema validation

Brief summary: An updated inventory system processes an instance that has a minor version change.

Basic course of events:

1. An instance of V is processed.
2. The system identifies that it is an instance of V.
3. It processes it using schema V.

Desired outcome:

2.1.5. "Major-minor" scenario D: Major with new schemas

[AXIS]Instance processed: W(B) [Major version]

[AXIS]Schema Availability: B and W(B) [Base and Major]

[AXIS]Operation performed: Schema validation

Brief summary: An updated inventory system processes an instance that has a major version change.

Basic course of events:

1. An instance of W is processed.
2. The system identifies that it is an instance of W.
3. The system processes it using schema W.

Desired outcome:

2.1.6. Discussion

This use case extends to multiple levels of versioning (both major and minor) even though just one level is shown here.
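The courses of events in scenarios A to D can be summarised as one decision procedure. The sketch below is a hedged illustration rather than a proposed mechanism: how a processor learns the lineage and the kind of each change is left open by this document, so the `LINEAGE` table and the function `choose_schema` are hypothetical.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical lineage for the book store example:
# schema name -> (schema it is a version of, kind of change).
LINEAGE: Dict[str, Tuple[str, str]] = {
    "V": ("B", "minor"),
    "W": ("B", "major"),
}


def choose_schema(instance_schema: str, available: Set[str]) -> Optional[str]:
    """Return the schema to validate with, or None if the instance
    cannot be processed with the schemas that are available."""
    current = instance_schema
    while True:
        if current in available:
            return current            # scenarios C and D, or a reached base
        if current not in LINEAGE:
            return None               # unknown schema with no known base
        base, kind = LINEAGE[current]
        if kind == "major":
            return None               # scenario B: no fallback across a major change
        current = base                # scenario A: fall back across a minor change
```

Because the loop follows the lineage one step at a time, the same procedure covers multiple generations: a chain of minor versions falls back until an available schema is found, and any major change along the way stops the fallback.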

2.2. Object-oriented

2.2.1. Overview

This use case mimics the inheritance and polymorphic behaviour of object-oriented programming languages.

In object-oriented programming languages, the concept of inheritance can be used to create a new class based on an existing class. This establishes an "is-a" relationship between the versions. Polymorphism is one of the key features of object-oriented programming: an instance of a subclass can be used in place of an instance of a superclass. The subclass is also known as a "derived class," and the superclass as a "base class."

Programmers may want to develop systems where the programming classes are serialized into XML. Hence, methods (implemented as services) need to behave the same way with regard to the serialized instances.

For example, consider a class that defines a Person. The XML Schema for the serialized class will be known as schema P.

class Person {
  string name;
  date dob; /* date of birth */
};

An example of a serialized Person object is:

<person>
  <name>Amy Smith</name>
  <dob>1938-11-01</dob>
</person>

An employee class is defined as inheriting from a Person, and adding an extra "employee_number" member variable. The XML Schema for the serialized employee will be known as E(P).

class Employee: public Person {
  int employee_number;
};

An example of a serialized employee object is:

<person>
  <name>Michael Smith</name>
  <dob>1938-11-01</dob>
  <employee_number>69105</employee_number>
</person>

A manager is defined as inheriting from an employee, and adding an extra "project" member variable. The XML Schema for the serialized manager will be known as M(E(P)) or simply M.

class Manager: public Employee {
  string project;
};

An example of a serialized manager object is:

<person>
  <name>Steve Smith</name>
  <dob>1938-11-01</dob>
  <employee_number>69105</employee_number>
  <project>XY001</project>
</person>

A senior manager is defined as inheriting from manager, and adding an extra "department" member variable. The XML Schema for the serialized senior manager will be known as S(M(E(P))) or simply S.

class SeniorManager: public Manager {
  string department;
};

An example of a serialized senior manager object is:

<person>
  <name>Tim Smith</name>
  <dob>1938-11-01</dob>
  <employee_number>69105</employee_number>
  <project>RD001</project>
  <department>Research and Development</department>
</person>

Now consider a service that expects person objects. In this case it is a user database, which can accept any type of person (which includes employees, managers and senior managers).

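class User_Database {
public:
  /* Takes a reference so that derived objects (e.g. Employee) are not sliced.
     Named register_user because "register" is a reserved word in C++. */
  void register_user(const Person& p);
  ...
};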

There is also a payroll service which accepts employees (which includes managers and senior managers), but not ordinary Persons (who are not employees).

class Payroll {
public:
  /* Takes a reference so that derived objects (e.g. Manager) are not sliced. */
  void pay_wages(const Employee& e);
  ...
};

2.2.2. "Object-oriented" scenario A: Employee as a person

[AXIS]Instance processed: E(P) [Employee]

[AXIS]Schema Availability: P [Person]

[AXIS]Operation performed: Schema validation

Brief summary: The user database receives an employee as a user. Although it does not know about employees, it accepts the instance because an employee is a version of a person. The schema processor is forward compatible.

Basic course of events:

1. An employee instance is submitted into the user database service.
2. The service identifies the instance as a version of a person.
3. The instance is validated against the person schema.

Desired outcome:

2.2.3. "Object-oriented" scenario B: Manager as a person

[AXIS]Instance processed: M(E(P)) [Manager]

[AXIS]Schema Availability: P [Person]

[AXIS]Operation performed: Schema validation

Brief summary: The user database receives a manager as a user. Although it does not know about managers, it accepts the instance because a manager is an indirect version of a person. The schema processor is forward compatible.

Basic course of events:

1. A manager instance is submitted into the user database service.
2. The service identifies the instance as a version of a person.
3. The instance is validated against the person schema.

Desired outcome:

2.2.4. "Object-oriented" scenario C: Manager in payroll

[AXIS]Instance processed: M(E(P)) [Manager]

[AXIS]Schema Availability: P and E(P) [Person and Employee]

[AXIS]Operation performed: Schema validation

Brief summary: The payroll service receives a manager object. Although it does not know about managers, it accepts the object because a manager is a version of an employee. The schema processor is forward compatible.

Basic course of events:

1. A manager instance is submitted into the payroll service.
2. The service identifies the instance as a version of an employee.
3. The instance is validated against the employee schema.

Desired outcome:

2.2.5. "Object-oriented" scenario D: Senior Manager in payroll

[AXIS]Instance processed: S(M(E(P))) [Senior Manager]

[AXIS]Schema Availability: P and E(P) [Person and Employee]

[AXIS]Operation performed: Schema validation

Brief summary: The payroll service receives a senior manager object. Although it does not know about senior managers, it accepts the object because a senior manager is an indirect version of an employee. The schema processor is forward compatible.

Basic course of events:

1. A senior manager instance is submitted into the payroll service.
2. The service identifies the instance as a version of an employee.
3. The instance is validated against the employee schema.

Desired outcome:

The processor might be able to easily determine that the instance is a person (from looking at the element name). However, the processor cannot blindly assume that any instance of a person is an employee. For example, there could be a customer subclass derived from the person class, but a customer is not an employee.
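The distinction between "is a person" and "is an employee" amounts to walking the version lineage. The following sketch illustrates the check under the assumption that the lineage of each instance is somehow known to the processor, which is exactly what this use case asks a versioning mechanism to provide. The `BASE_OF` table and the function `is_a` are hypothetical, and the "Customer" entry reflects the example in the paragraph above.

```python
from typing import Dict, Optional

# Hypothetical lineage: schema -> schema it is a version of (None for a root).
BASE_OF: Dict[str, Optional[str]] = {
    "Person": None,
    "Employee": "Person",
    "Manager": "Employee",
    "SeniorManager": "Manager",
    "Customer": "Person",
}


def is_a(schema: str, expected: str) -> bool:
    """True if `schema` is `expected` or a direct or indirect version of it."""
    current: Optional[str] = schema
    while current is not None:
        if current == expected:
            return True
        current = BASE_OF.get(current)  # step to the base; None ends the walk
    return False
```

With this check, the payroll service (which expects an employee) would accept a senior manager but not a customer, while the user database (which expects a person) would accept both.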

2.2.6. Discussion

Currently, XML Schema 1.0 can solve this use case if all the schemas are available to the processor. However, it cannot solve the situation described here where the newer versions of the schema are not available.

2.3. Ignore-unknowns

2.3.1. Overview

In this use case, schemas are written to match the expectations of the programs which process the instances. These programs operate in an "ignore what they don't expect" mode.

Consider a convenience store, with a point-of-sale (POS) cash register that receives product data from various devices in the store. There are many different types of devices and they are produced by different vendors.

Due to market forces and competition between the vendors, there is no single standard data schema that they all use. Each vendor enhances their own schemas to be more competitive within their own line of products. However, this produces problems for the customers who use products from multiple vendors. The vendors' schemas are similar to each other, because they borrow features from each other, but there is no well-defined versioning lineage between them.

One vendor produces a price management system. It sends product pricing information to the cash registers. It generates XML instances which contain an item_code number, a description and a price. The schema for this will be called schema P.

An example price management instance P is:

<product>
  <item_code>0130655678</item_code>
  <description>Orange juice</description>
  <price>4.45</price>
</product>

Another vendor produces fuel pumps. It sends messages which contain additional information that is only relevant to the management of fuel pumps. This vendor may or may not have based their fuel pump schema on the price management schema; due to the nature of the unregulated marketplace, the relationships between the schemas are unclear.

The fuel pumps transmit instances containing a description, a volume in litres, and a price. These instances are similar to the price management instances, because they have a description and a price. However, they differ in that the item_code is replaced by a fuel_code, and price_per_litre and litres elements are added. The schema for this will be called schema F.

An example fuel pump instance F is:

<product>
  <fuel_code>14</fuel_code>
  <description>Diesel</description>
  <price_per_litre>0.899</price_per_litre>
  <litres>42.1</litres>
  <price>37.85</price>
</product>

Note that elements designed to hold extension data do not work in this environment. Firstly, they are not a scalable solution when there are many versions of versions. Secondly, vendors regard their own data as important, and do not want it to appear inferior to the other data by placing it inside an extension element.

A third vendor produces cash registers. The cash registers simply display the products to the user.

Since the cash register should interoperate with as many different types of devices as possible, it has been written to be flexible in what it expects. Specifically, it has been written with the assumption that it will ignore any extra elements that it didn't expect. The different devices can add extra elements without breaking the application.

XML Schema should help promote interoperability between the devices. Some of these devices may be embedded devices that are difficult to update and contain limited processing power. The more powerful devices might validate messages when they are received. The more primitive devices might not perform any schema validation at all. However, they will reject bad messages. If there is a problem (e.g. accusations that the fuel pump created a bad message, or the cash register rejected a good message) the XML Schema could be brought out to settle the dispute. In either case, the XML Schema needs to be able to represent the input that the application expects.

The application could be written in many different ways. For example, the following is a fragment of an XSLT script that represents the behaviour of the application. The important aspect of this example is that it works on any document as long as there is a description and price in the transaction -- it deliberately ignores any extra elements that it does not use.

<xsl:template match="product">
  <tr>
    <td><xsl:value-of select="description"/></td>
    <td><xsl:value-of select="price"/></td>
  </tr>
</xsl:template>

The challenge is to represent what the cash register program accepts using XML Schema. The schema must reflect the "ignore what they don't expect" behaviour that is inherent in the programs. This schema will be called schema C. It should be usable as an effective filter: for discriminating between input the program accepts, and input that it will fail on.
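The behaviour that schema C would have to describe can also be written as a small program. The following Python sketch is an illustration only, using the standard `xml.etree.ElementTree` module; the function name `cash_register_accepts` is hypothetical. Like the XSLT fragment, it looks only at "description" and "price" and ignores everything else, and it treats an instance as unacceptable when either element is missing.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Tuple

PRICE_INSTANCE = (
    "<product><item_code>0130655678</item_code>"
    "<description>Orange juice</description><price>4.45</price></product>"
)
FUEL_INSTANCE = (
    "<product><fuel_code>14</fuel_code><description>Diesel</description>"
    "<price_per_litre>0.899</price_per_litre><litres>42.1</litres>"
    "<price>37.85</price></product>"
)


def cash_register_accepts(xml_text: str) -> Optional[Tuple[str, str]]:
    """Return (description, price) if the program can use the instance,
    or None if it would fail on it. Extra elements are ignored."""
    root = ET.fromstring(xml_text)
    if root.tag != "product":
        return None
    description = root.findtext("description")
    price = root.findtext("price")
    if description is None or price is None:
        return None
    return description, price
```

Both example instances are accepted by this function, even though they contain different extra elements. A suitable schema C would need to draw the same line.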

A fourth vendor produces fuel volume trackers. These keep track of the amount of fuel sold. They are designed to accept the product messages from the fuel pumps. An example of how the volume trackers might process the instances is shown by the following program. This program uses an event-driven XML parser (only the callback functions are shown here). The important point is that this program only looks at the "description" and "litres" elements, ignoring anything else it is not expecting. Also, the two elements are expected in that particular order.

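/* Numbers in square brackets indicate the expected sequence of events.
   initial state = STATE_START */

#include <string.h>  /* strcmp */
#include <expat.h>   /* XML_Char, assuming the Expat parser is used */

/* Parser states used by the callbacks below. */
enum {
  STATE_START, STATE_IN_PRODUCT,
  STATE_IN_DESCRIPTION, STATE_SEEN_DESCRIPTION,
  STATE_IN_LITRES, STATE_SEEN_LITRES
};
static int state = STATE_START;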

void
startElement (void* data, const XML_Char* name, const XML_Char** attr)
{
  switch (state) {
  case STATE_START:
    if (strcmp(name, "product") == 0) {
      state = STATE_IN_PRODUCT; /* [1] */
    }
    break;
 
  case STATE_IN_PRODUCT:
    if (strcmp(name, "description") == 0) {
      state = STATE_IN_DESCRIPTION; /* [2] */
    }
    break;
 
  case STATE_SEEN_DESCRIPTION:
    if (strcmp(name, "litres") == 0) {
      state = STATE_IN_LITRES; /* [5] */
    }
    break;
 
  case STATE_SEEN_LITRES:
    /* ignore all other elements */
    break;
  }
}

void
characters (void* data, const XML_Char* s, int length)
{
  if (state == STATE_IN_DESCRIPTION) {
    current_fuel = database_lookup_fuel(s, length); /* [3] */

  } else if (state == STATE_IN_LITRES) {
    database_add_to_fuel(current_fuel, string_to_number(s, length));
    current_fuel = 0; /* [6] */
  }
}

void
endElement (void* data, const XML_Char* name)
{
  if (strcmp(name, "description") == 0) {
    state = STATE_SEEN_DESCRIPTION; /* [4] */
  } else if (strcmp(name, "litres") == 0) {
    state = STATE_SEEN_LITRES; /* [7] */
  } else if (strcmp(name, "product") == 0) {
    state = STATE_START; /* [8] */
  }
}

The XML Schema for the volume tracker application will be called schema V.
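For comparison with the cash register, the order-sensitive behaviour that schema V would have to describe can be sketched as follows. This is a simplified Python illustration that walks a parsed tree instead of handling parser events, and the function name `volume_tracker_reads` is hypothetical. It accepts an instance only when a "description" element is followed, at some later point, by a "litres" element.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal
from typing import Optional, Tuple

PRICE_INSTANCE = (
    "<product><item_code>0130655678</item_code>"
    "<description>Orange juice</description><price>4.45</price></product>"
)
FUEL_INSTANCE = (
    "<product><fuel_code>14</fuel_code><description>Diesel</description>"
    "<price_per_litre>0.899</price_per_litre><litres>42.1</litres>"
    "<price>37.85</price></product>"
)


def volume_tracker_reads(xml_text: str) -> Optional[Tuple[str, Decimal]]:
    """Return (description, litres), or None if the tracker finds nothing
    to record. Malformed numbers raise decimal.InvalidOperation."""
    root = ET.fromstring(xml_text)
    if root.tag != "product":
        return None
    description: Optional[str] = None
    for child in root:
        if description is None:
            # Still waiting for the description; skip anything else.
            if child.tag == "description":
                description = child.text or ""
        elif child.tag == "litres":
            # Description already seen, so this litres value is recorded.
            return description, Decimal((child.text or "").strip())
    return None
```

The fuel pump instance yields a reading, whereas the price management instance does not, because it has no "litres" element after its description.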

2.3.2. "Ignore-unknowns" scenario A: Price management to cash register

[AXIS]Instance processed: P [Price management]

[AXIS]Schema Availability: C [Cash register]

[AXIS]Operation performed: Schema validation

Brief summary: The output from the price management system is sent to the cash register, and the extra data in it is ignored.

Basic course of events:

1. The price management system generates an instance of the price management schema.
2. The instance is sent to the cash register.
3. The cash register processes the instance.

Desired outcome:

2.3.3. "Ignore-unknowns" scenario B: Fuel pump to cash register

[AXIS]Instance processed: F [Fuel pump]

[AXIS]Schema Availability: C [Cash register]

[AXIS]Operation performed: Schema validation

Brief summary: The output from the fuel pump is sent to the cash register, and the extra data in it is ignored.

Basic course of events:

1. The fuel pump generates an instance of the fuel pump schema.
2. The instance is sent to the cash register.
3. The cash register processes the instance.

Desired outcome:

This scenario is the same as Scenario A, except that the price management system has been replaced by the fuel pump.

2.3.4. "Ignore-unknowns" scenario C: Price management to volume tracker

[AXIS]Instance processed: P [Price management]

[AXIS]Schema Availability: V [Volume tracker]

[AXIS]Operation performed: Schema validation

Brief summary: The output from the price management system is sent (in error) to the fuel volume tracker.

Basic course of events:

1. The price management system generates an instance of the price management schema.
2. The instance is sent to the volume tracker.
3. The volume tracker attempts to process the instance.

Desired outcome:

2.3.5. "Ignore-unknowns" scenario D: Fuel pump to volume tracker

[AXIS]Instance processed: F [Fuel pump]

[AXIS]Schema Availability: V [Volume tracker]

[AXIS]Operation performed: Schema validation

Brief summary: The output from the fuel pump is sent to the fuel volume tracker.

Basic course of events:

1. The fuel pump generates an instance of the fuel pump schema.
2. The instance is sent to the volume tracker.
3. The volume tracker processes the instance.

Desired outcome:

2.3.6. Discussion

This use case is about providing mechanisms in XML Schema so that it can more closely represent the class of documents accepted by XML processing programs. A certain class of programs is designed to be forward compatible by deliberately ignoring extra data that they did not expect. This approach to versioning allows newer versions to add nodes without breaking older applications.

For example, schema "P" could contain:

<xsd:complexType name="product_inventory">
  <xsd:sequence>
    <xsd:element name="item_code" type="xsd:string"/>
    <xsd:element name="description" type="xsd:string"/>
    <xsd:element name="price" type="xsd:decimal"/>
  </xsd:sequence>
</xsd:complexType>

Schema "F" could contain:

<xs:complexType name="fuel_pump_product">
  <xs:sequence>
    <xs:element name="fuel_code" type="xs:string"/>
    <xs:element name="description" type="xs:string"/>
    <xs:element name="price_per_litre" type="xs:decimal"/>
    <xs:element name="litres" type="xs:decimal"/>
    <xs:element name="price" type="xs:decimal"/>
  </xs:sequence>
</xs:complexType>

With the current mechanisms in XML Schema, a suitable schema for representing what the cash register accepts cannot be written. For example, the following schema is not suitable, because it would reject instances from the price management system and the fuel pump (even though the cash register can process them).

<!-- Note: this schema is not suitable, it is too restrictive -->
<xs:complexType name="cash_register_product">
  <xs:sequence>
    <xs:element name="description" type="xs:string"/>
    <xs:element name="price" type="xs:decimal"/>
  </xs:sequence>
</xs:complexType>

With this schema, instances of P and F are both invalid. However, the desired behaviour is that instances of P and F are both valid.
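The behaviour the cash register actually exhibits can be sketched in a few lines of Python. This is only an illustration of ignore-unknowns processing at the application level, not a validation mechanism. The wrapper element name product, the sample values, and the function read_for_cash_register are assumptions made for the example; only the child element names come from the excerpts above.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical instances. The child element names follow the P and F excerpts;
# the wrapper element "product" and all values are assumptions.
PRICE_MANAGEMENT_INSTANCE = """
<product>
  <item_code>A100</item_code>
  <description>Chocolate bar</description>
  <price>1.50</price>
</product>
"""

FUEL_PUMP_INSTANCE = """
<product>
  <fuel_code>ULP</fuel_code>
  <description>Unleaded petrol</description>
  <price_per_litre>1.20</price_per_litre>
  <litres>10</litres>
  <price>12.00</price>
</product>
"""


def read_for_cash_register(xml_text):
    """Return (description, price), ignoring every other child element."""
    root = ET.fromstring(xml_text)
    description = root.findtext("description")
    price = root.findtext("price")
    if description is None or price is None:
        raise ValueError("instance lacks a description or price element")
    return description, Decimal(price)


print(read_for_cash_register(PRICE_MANAGEMENT_INSTANCE))
print(read_for_cash_register(FUEL_PUMP_INSTANCE))
```

Both calls succeed, which is the outcome the too-restrictive schema above fails to describe.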

Similarly, a suitable schema for the volume tracker cannot be written using current XML Schema mechanisms.

<!-- Note: this schema is not suitable, it is too restrictive -->
<xs:complexType name="volume_tracker_product">
  <xs:sequence>
    <xs:element name="description" type="xs:string"/>
    <xs:element name="litres" type="xs:decimal"/>
    <xs:any minOccurs="0" maxOccurs="unbounded" processContents="skip"/>
  </xs:sequence>
</xs:complexType>

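If the above schema were used as the volume tracker schema, instances of P and F would both be invalid: the wildcard only admits extra elements after litres, whereas P and F each begin with an element (item_code or fuel_code) that precedes description. However, the desired behaviour is that instances of P are invalid, but instances of F are valid.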
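The desired outcome for the volume tracker can be expressed as a simple acceptance test: an instance is acceptable if it contains the elements the tracker reads, wherever they occur, and everything else is ignored. The following Python sketch assumes a wrapper element named product and invented sample values; the function accepts is hypothetical and is not part of any XML Schema processor.

```python
import xml.etree.ElementTree as ET

# Element names the volume tracker reads (taken from the excerpt above).
VOLUME_TRACKER_REQUIRED = ("description", "litres")


def accepts(required_names, xml_text):
    """True if every required child occurs exactly once; other children are ignored."""
    root = ET.fromstring(xml_text)
    return all(len(root.findall(name)) == 1 for name in required_names)


# Hypothetical instances; the wrapper element "product" is an assumption.
p_instance = ("<product><item_code>A100</item_code>"
              "<description>Chocolate bar</description>"
              "<price>1.50</price></product>")
f_instance = ("<product><fuel_code>ULP</fuel_code>"
              "<description>Unleaded petrol</description>"
              "<price_per_litre>1.20</price_per_litre>"
              "<litres>10</litres><price>12.00</price></product>")

print(accepts(VOLUME_TRACKER_REQUIRED, p_instance))  # False: no litres element
print(accepts(VOLUME_TRACKER_REQUIRED, f_instance))  # True: extra elements are ignored
```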

2.4. Comparison

2.4.1. Overview

In this use case, schemas are compared to see if they are versions of each other.

This use case uses the situation of a convenience store with a cash register, price management system and fuel pumps. This situation was described in the previous use case.

2.4.2. "Comparison" scenario A: Cash register and price management system

[AXIS]Instance processed: none

[AXIS]Schema Availability: C and P [Cash register and Price management]

[AXIS]Operation performed: comparison

Brief summary: A systems integrator wants to check if the data produced by the price management system is suitable for the cash register to use.

Basic course of events:

1. The cash register schema is obtained.
2. The price management schema is obtained.
3. The two schemas are compared.

Desired outcome:

Although both schemas might have been developed independently, it is possible to consider the price management schema as a compatible version of the cash register schema. This operation allows that to be determined.

2.4.3. "Comparison" scenario B: Cash register and fuel pump

[AXIS]Instance processed: none

[AXIS]Schema Availability: C and F [Cash register and fuel pump]

[AXIS]Operation performed: comparison

Brief summary: A systems integrator wants to check if the data produced by the fuel pump is suitable for the cash register to use.

Basic course of events:

1. The cash register schema is obtained.
2. The fuel pump schema is obtained.
3. The two schemas are compared.

Desired outcome:

2.4.4. "Comparison" scenario C: Volume tracker and price management

[AXIS]Instance processed: none

[AXIS]Schema Availability: V and P [Volume tracker and price management]

[AXIS]Operation performed: comparison

Brief summary: A systems integrator wants to check if the data produced by the price management system is suitable for the volume tracker to use.

Basic course of events:

1. The volume tracker schema is obtained.
2. The price management schema is obtained.
3. The two schemas are compared.

Desired outcome:

2.4.5. "Comparison" scenario D: Volume tracker and fuel pump

[AXIS]Instance processed: none

[AXIS]Schema Availability: V and F [Volume tracker and fuel pump]

[AXIS]Operation performed: comparison

Brief summary: A systems integrator wants to check if the data produced by the fuel pump is suitable for the volume tracker to use.

Basic course of events:

1. The volume tracker schema is obtained.
2. The fuel pump schema is obtained.
3. The two schemas are compared.

Desired outcome:

2.4.6. Discussion

Comparison of schemas is a new type of operation. This use case has illustrated testing whether the set of documents described by one schema is a subset of the set described by a different schema. Other possible operations could determine whether one set is a proper subset of the other, or whether the two sets are equal.
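As a rough illustration of the subset test, the four schemas can be reduced to tables of element names and datatypes. Under ignore-unknowns semantics, a producer's documents are all acceptable to a consumer when every element the consumer reads is produced with the same datatype. The Python sketch below rests on that simplifying assumption; it ignores element order and occurrence constraints, and the function is_compatible is hypothetical.

```python
# Simplified stand-ins for the four schemas: element name -> datatype.
# Content-model order and occurrence constraints are deliberately left out.
CASH_REGISTER = {"description": "string", "price": "decimal"}
VOLUME_TRACKER = {"description": "string", "litres": "decimal"}
PRICE_MANAGEMENT = {"item_code": "string", "description": "string",
                    "price": "decimal"}
FUEL_PUMP = {"fuel_code": "string", "description": "string",
             "price_per_litre": "decimal", "litres": "decimal",
             "price": "decimal"}


def is_compatible(consumer, producer):
    """True if every element the consumer reads is produced with the same type."""
    return all(producer.get(name) == datatype
               for name, datatype in consumer.items())


for consumer_name, consumer in (("C", CASH_REGISTER), ("V", VOLUME_TRACKER)):
    for producer_name, producer in (("P", PRICE_MANAGEMENT), ("F", FUEL_PUMP)):
        print(consumer_name, producer_name, is_compatible(consumer, producer))
# Prints:
# C P True
# C F True
# V P False
# V F True
```

The four results correspond to scenarios A to D above.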

Another example is XHTML. In XHTML 1.0, the strict, transitional, and frameset document types all use the same XML namespace. However, the schemas for them are slightly different. Variants of XHTML 1.0 such as XHTML Basic also share the same namespace, even though the schemas for them are different. Thus, an application that processes one variant of XHTML needs to know whether it can accept instances of another variant.

2.5. Specialization

2.5.1. Overview

In this use case, schemas are specialized with more specific versions, and processing software needs to operate in the presence or absence of those specialized schemas. A distinguishing characteristic of this use case is that instances of the specialized schemas are always valid instances of the base schema.

There is a generic base schema which has been approved by a country's health department. To ensure interoperability, the country's government has mandated that all compliant software must be able to store and process data that corresponds to this generic base schema. This generic base schema has been designed to store medical data, but in a very non-specific way so that it is flexible enough to handle a wide variety of data. This is necessary because medical data changes often due to new technology and practices, as well as being very diverse.

For example, the generic base schema could contain a generic datatype for storing two measurement values. We shall call this generic base schema: schema B.

<xsd:complexType name="measurement2">
  <xsd:sequence>
    <xsd:element name="value1">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="magnitude" type="xsd:decimal"/>
          <xsd:element name="unit" type="xsd:string"/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>
    <xsd:element name="value2">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="magnitude" type="xsd:decimal"/>
          <xsd:element name="unit" type="xsd:string"/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>
  </xsd:sequence>
</xsd:complexType>

An example instance of an element from schema B is:

<data>
  <value1>
    <magnitude>1660</magnitude>
    <unit>mm</unit>
  </value1>
  <value2>
    <magnitude>72</magnitude>
    <unit>kg</unit>
  </value2>
</data>

There are two products that use the generic schema: a General Practice (GP) management system, and a hospital clinical information management system. Clinical record information is exchanged between the two programs using messages containing XML. When the programs receive data, they validate it before storing or processing it.

The country's General Practice doctor organization decides that it wants to standardize how blood pressures are recorded. It defines the blood pressure datatype as a specialization (or constraint) of the generic base schema's measurement datatype. The versioning mechanism (whatever that may be) is used to indicate that this is a version of the generic measurement datatype.

Here is an example from the specialized blood pressure schema, which we will call schema V(B). It constrains the units of both measurements to be mmHg.

<xsd:complexType name="blood_pressure">
  <xsd:sequence>

    <xsd:element name="value1">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="magnitude" type="xsd:decimal"/>
          <xsd:element name="unit">
            <xsd:simpleType>
              <xsd:restriction base="xsd:string">
                <xsd:enumeration value="mmHg"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:element>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>

    <xsd:element name="value2">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="magnitude" type="xsd:decimal"/>
          <xsd:element name="unit">
            <xsd:simpleType>
              <xsd:restriction base="xsd:string">
                <xsd:enumeration value="mmHg"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:element>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>

  </xsd:sequence>
</xsd:complexType>

An example instance of a blood pressure specialization V(B) is:

<data>
  <value1>
    <magnitude>142</magnitude>
    <unit>mmHg</unit>
  </value1>
  <value2>
    <magnitude>80</magnitude>
    <unit>mmHg</unit>
  </value2>
</data>

The General Practice software is modified or updated to have the blood pressure schema, but the hospital software is not. There can be many reasons why the hospital software does not have the blood pressure schema, such as the timing cycle of software upgrades, running in an off-line mode where schemas cannot be fetched, cost, performance, security or policy.

Later on, a physiotherapy clinic decides that it wants to further refine the definition of a blood pressure to only contain sensible values for the systolic and diastolic readings. The versioning mechanism would indicate that this refined blood pressure datatype is a version of the ordinary blood pressure datatype.

This is an excerpt from the refined physiotherapy schema W(V(B)). It shows that the numeric values in the measurement are restricted to certain ranges.

<xsd:complexType name="blood_pressure_refined">
  <xsd:sequence>

    <xsd:element name="value1">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="magnitude">
            <xsd:simpleType>
              <xsd:restriction base="xsd:decimal">
                <xsd:minInclusive value="90"/>
                <xsd:maxInclusive value="140"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:element>
          <xsd:element name="unit">
            <xsd:simpleType>
              <xsd:restriction base="xsd:string">
                <xsd:enumeration value="mmHg"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:element>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>

    <xsd:element name="value2">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="magnitude">
            <xsd:simpleType>
              <xsd:restriction base="xsd:decimal">
                <xsd:minInclusive value="60"/>
                <xsd:maxInclusive value="90"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:element>
          <xsd:element name="unit">
            <xsd:simpleType>
              <xsd:restriction base="xsd:string">
                <xsd:enumeration value="mmHg"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:element>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>

  </xsd:sequence>
</xsd:complexType>
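The relationship between the three schemas can be illustrated with hand-written checks that mirror the constraints in the excerpts above. The Python sketch below is not a schema validator: the functions read_values, valid_b, valid_v and valid_w are hypothetical, and Python's Decimal accepts some lexical forms that xsd:decimal does not. It does show the distinguishing property of this use case, namely that each specialization accepts a subset of what its base accepts.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal, InvalidOperation


def read_values(xml_text):
    """Return [(magnitude, unit), (magnitude, unit)], or None if the shape is wrong."""
    root = ET.fromstring(xml_text)
    values = []
    for name in ("value1", "value2"):
        node = root.find(name)
        if node is None:
            return None
        magnitude, unit = node.findtext("magnitude"), node.findtext("unit")
        if magnitude is None or unit is None:
            return None
        try:
            values.append((Decimal(magnitude), unit))
        except InvalidOperation:
            return None
    return values


def valid_b(xml_text):
    """Schema B: two values, each with a decimal magnitude and any unit."""
    return read_values(xml_text) is not None


def valid_v(xml_text):
    """Schema V(B): as B, but both units must be mmHg."""
    values = read_values(xml_text)
    return values is not None and all(unit == "mmHg" for _, unit in values)


def valid_w(xml_text):
    """Schema W(V(B)): as V(B), with the magnitude ranges from the excerpt."""
    if not valid_v(xml_text):
        return False
    (first, _), (second, _) = read_values(xml_text)
    return 90 <= first <= 140 and 60 <= second <= 90


# The two example instances given in the text above.
B_INSTANCE = ("<data><value1><magnitude>1660</magnitude><unit>mm</unit></value1>"
              "<value2><magnitude>72</magnitude><unit>kg</unit></value2></data>")
V_INSTANCE = ("<data><value1><magnitude>142</magnitude><unit>mmHg</unit></value1>"
              "<value2><magnitude>80</magnitude><unit>mmHg</unit></value2></data>")

print(valid_b(B_INSTANCE), valid_v(B_INSTANCE), valid_w(B_INSTANCE))  # True False False
print(valid_b(V_INSTANCE), valid_v(V_INSTANCE), valid_w(V_INSTANCE))  # True True False
```

Note that the V(B) example instance is not an instance of W(V(B)), because its first magnitude (142) lies outside the 90 to 140 range.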

2.5.2. "Specialization" scenario A: Hospital to GP

[AXIS]Instance processed: B [Hospital]

[AXIS]Schema Availability: B and V(B) [Hospital and GP]

[AXIS]Operation performed: Schema validation

Brief summary: The hospital system sends a message to the GP system. An instance of the base schema will be processed according to the base schema. Even though the processor has access to the specialized schema, it is not used. The schema processor is backward compatible.

Basic course of events:

1. The hospital generates an instance of the base schema.
2. The base instance is sent to the GP system (which has access to both the base and the specialized schema).
3. The GP system processes the base instance.

Desired outcome:

Since the GP system is receiving a base instance, it cannot (and should not) use the specialization schema.

2.5.3. "Specialization" scenario B: GP to GP

[AXIS]Instance processed: V(B) [GP]

[AXIS]Schema Availability: B and V(B) [Hospital and GP]

[AXIS]Operation performed: Schema validation

Brief summary: One GP system sends a message to another GP system. An instance of the specialized schema will be processed according to the specialized schema. Even though the processor could have validated it using the base schema, it must use the specialized schema.

Basic course of events:

1. Another GP system generates an instance of the specialized schema.
2. The specialized instance is sent to the GP system (which has access to both the base and the specialized schema).
3. The GP system processes the specialized instance.

Desired outcome:

The receiving GP system needs to treat the instance as a blood pressure, because it wants to process it in a special way (e.g. graph it or apply decision support on it). Although it could also validate it using the generic base schema, it does not do so because the extra constraints in the blood pressure schema are important for the application to correctly interpret the data as a blood pressure.

2.5.4. "Specialization" scenario C: Hospital to hospital

[AXIS]Instance processed: B [Hospital]

[AXIS]Schema Availability: B [Hospital]

[AXIS]Operation performed: Schema validation

Brief summary: One hospital system sends a message to another hospital. An instance of the base schema is processed according to the base schema.

Basic course of events:

1. Another hospital system generates an instance of the base schema.
2. The base instance is sent to the hospital system (which only has access to the base schema).
3. The hospital system processes the base instance.

Desired outcome:

This use case does not invoke any special versioning feature, but it is included here for completeness.

2.5.5. "Specialization" scenario D: GP to hospital

[AXIS]Instance processed: V(B) [GP]

[AXIS]Schema Availability: B [Hospital]

[AXIS]Operation performed: Schema validation

Brief summary: A GP system sends a message to a hospital system. An instance of the specialized schema is processed using the base schema when the processor does not have access to the specialized schema. The schema processor is forward compatible.

Basic course of events:

1. The GP system generates an instance of the specialized schema.
2. The specialized instance is sent to the hospital system (which only has access to the base schema).
3. The hospital system processes the specialized instance.

Desired outcome:

The hospital system cannot recognise the data as a blood pressure, but can process it generically. There might be a generic base schema database for storing them, or a generic viewer that can display the values.

2.5.6. "Specialization" scenario E: Physiotherapy to hospital

[AXIS]Instance processed: W(...(B)...) [Physiotherapy]

[AXIS]Schema Availability: B [Hospital]

[AXIS]Operation performed: Schema validation

Brief summary: A physiotherapy system sends a message to a hospital system. An instance of the refined physiotherapy schema is processed using the base schema (when the processor has access to neither the refined physiotherapy schema nor the specialized GP schema). The schema processor is forward compatible across multiple versions.

Basic course of events:

1. The physiotherapy system generates an instance of the refined physiotherapy schema.
2. The refined physiotherapy instance is sent to the hospital system (which only has access to the base schema).
3. The hospital system processes the refined instance.

Desired outcome:

This scenario shows that the versioning must work across multiple generations of versions, not just between two successive versions.

2.5.7. Discussion

The physiotherapy clinic is included in this use case to highlight the point that the versioning mechanism must work when there are multiple versions involved. A solution that provides for only a single level of versioning is useless in the real world.
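One way to picture the multi-level requirement is as a walk along the chain of versions until a schema known to the receiver is found. The Python sketch below is purely hypothetical: it assumes that the "is a version of" relationships are somehow available to the receiver (for example, carried with the instance), which is exactly the kind of mechanism this document leaves open. The table VERSION_OF and the function select_schema are invented for the illustration.

```python
# Hypothetical ancestry table: each schema names the schema it is a version of.
VERSION_OF = {"W": "V", "V": "B", "B": None}


def select_schema(instance_schema, available):
    """Walk from the instance's schema towards the base; return the first known one."""
    current = instance_schema
    while current is not None:
        if current in available:
            return current
        current = VERSION_OF.get(current)
    return None


print(select_schema("B", {"B", "V"}))  # B  (scenario A)
print(select_schema("V", {"B", "V"}))  # V  (scenario B)
print(select_schema("B", {"B"}))       # B  (scenario C)
print(select_schema("V", {"B"}))       # B  (scenario D)
print(select_schema("W", {"B"}))       # B  (scenario E)
```

A mechanism limited to a single level would correspond to stopping after one step of the loop, in which case scenario E would find no usable schema.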

2.6. Renaming

2.6.1. Overview

In this use case, schemas are specialized with more specific versions and information items are renamed in those versions. The processing software needs to operate in the presence or absence of those specialized schemas. Note: this use case is similar to the "Specialization" use case, except that items are renamed.

There is a generic base schema which has been approved by a country's health department. To ensure interoperability, the country's government has mandated that all compliant software must be able to store and process data that corresponds to this generic base schema. This generic base schema has been designed to store medical data, but in a very non-specific way so that it is flexible enough to handle a wide variety of data. This is necessary because medical data changes often due to new technology and practices, as well as being very diverse.

For example, the generic base schema could contain a generic datatype for storing two measurement values. We shall call this generic base schema: schema B.

<xsd:complexType name="measurement2">
  <xsd:sequence>
    <xsd:element name="value1">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="magnitude" type="xsd:decimal"/>
          <xsd:element name="unit" type="xsd:string"/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>
    <xsd:element name="value2">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="magnitude" type="xsd:decimal"/>
          <xsd:element name="unit" type="xsd:string"/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>
  </xsd:sequence>
</xsd:complexType>

An example instance of an element from schema B is:

<data>
  <value1>
    <magnitude>1660</magnitude>
    <unit>mm</unit>
  </value1>
  <value2>
    <magnitude>72</magnitude>
    <unit>kg</unit>
  </value2>
</data>

There are two products that use the generic schema: a General Practice (GP) management system, and a hospital clinical information management system. Clinical record information is exchanged between the two programs using messages containing XML. When the programs receive data, they validate it before storing or processing it.

The country's General Practice doctor organization decides that it wants to standardize how blood pressures are recorded. It defines the blood pressure datatype as a specialization (or constraint) of the generic base schema's measurement datatype. The versioning mechanism (whatever that may be) is used to indicate that this is a version of the generic measurement datatype.

Here is an example from the specialized blood pressure schema, which we will call schema V(B). It constrains the units of both measurements to be mmHg, and renames value1 and value2 to systolic and diastolic.

<xsd:complexType name="blood_pressure">
  <xsd:sequence>

    <xsd:element name="systolic">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="magnitude" type="xsd:decimal"/>
          <xsd:element name="unit">
            <xsd:simpleType>
              <xsd:restriction base="xsd:string">
                <xsd:enumeration value="mmHg"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:element>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>

    <xsd:element name="diastolic">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="magnitude" type="xsd:decimal"/>
          <xsd:element name="unit">
            <xsd:simpleType>
              <xsd:restriction base="xsd:string">
                <xsd:enumeration value="mmHg"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:element>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>

  </xsd:sequence>
</xsd:complexType>

An example instance of a blood pressure specialization V(B) is:

<blood_pressure>
  <systolic>
    <magnitude>142</magnitude>
    <unit>mmHg</unit>
  </systolic>
  <diastolic>
    <magnitude>80</magnitude>
    <unit>mmHg</unit>
  </diastolic>
</blood_pressure>

The General Practice software is modified or updated to have the blood pressure schema, but the hospital software is not. There can be many reasons why the hospital software does not have the blood pressure schema, such as the timing cycle of software upgrades, running in an off-line mode where schemas cannot be fetched, cost, performance, security or policy.

Later on, a physiotherapy clinic decides that it wants to further refine the definition of a blood pressure to only contain sensible values for the systolic and diastolic readings. The versioning mechanism would indicate that this refined blood pressure datatype is a version of the ordinary blood pressure datatype.

This is an excerpt from the refined physiotherapy schema W(V(B)). It shows that the numeric values in the measurement are restricted to certain ranges.

<xsd:complexType name="blood_pressure_refined">
  <xsd:sequence>

    <xsd:element name="systolic">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="magnitude">
            <xsd:simpleType>
              <xsd:restriction base="xsd:decimal">
                <xsd:minInclusive value="90"/>
                <xsd:maxInclusive value="140"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:element>
          <xsd:element name="unit">
            <xsd:simpleType>
              <xsd:restriction base="xsd:string">
                <xsd:enumeration value="mmHg"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:element>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>

    <xsd:element name="diastolic">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="magnitude">
            <xsd:simpleType>
              <xsd:restriction base="xsd:decimal">
                <xsd:minInclusive value="60"/>
                <xsd:maxInclusive value="90"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:element>
          <xsd:element name="unit">
            <xsd:simpleType>
              <xsd:restriction base="xsd:string">
                <xsd:enumeration value="mmHg"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:element>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>

  </xsd:sequence>
</xsd:complexType>

2.6.2. "Renaming" scenario A: Hospital to GP

[AXIS]Instance processed: B [Hospital]

[AXIS]Schema Availability: B and V(B) [Hospital and GP]

[AXIS]Operation performed: Schema validation

Brief summary: The hospital system sends a message to the GP system. An instance of the base schema will be processed according to the base schema. Even though the processor has access to the specialized schema, it is not used. The schema processor is backward compatible.

Basic course of events:

1. The hospital generates an instance of the base schema.
2. The base instance is sent to the GP system (which has access to both the base and the specialized schema).
3. The GP system processes the base instance.

Desired outcome:

Since the GP system is receiving a base instance, it cannot (and should not) use the specialization schema.

2.6.3. "Renaming" scenario B: GP to GP

[AXIS]Instance processed: V(B) [GP]

[AXIS]Schema Availability: B and V(B) [Hospital and GP]

[AXIS]Operation performed: Schema validation

Brief summary: One GP system sends a message to another GP system. An instance of the specialized schema will be processed according to the specialized schema. Even though the processor could have validated it using the base schema, it must use the specialized schema.

Basic course of events:

1. Another GP system generates an instance of the specialized schema.
2. The specialized instance is sent to the GP system (which has access to both the base and the specialized schema).
3. The GP system processes the specialized instance.

Desired outcome:

The receiving GP system needs to treat the instance as a blood pressure, because it wants to process it in a special way (e.g. graph it or apply decision support on it). Although it could also validate it using the generic base schema, it does not do so because the extra constraints in the blood pressure schema are important for the application to correctly interpret the data as a blood pressure.

2.6.4. "Renaming" scenario C: Hospital to hospital

[AXIS]Instance processed: B [Hospital]

[AXIS]Schema Availability: B [Hospital]

[AXIS]Operation performed: Schema validation

Brief summary: One hospital system sends a message to another hospital. An instance of the base schema is processed according to the base schema.

Basic course of events:

1. Another hospital system generates an instance of the base schema.
2. The base instance is sent to the hospital system (which only has access to the base schema).
3. The hospital system processes the base instance.

Desired outcome:

This use case does not invoke any special versioning feature, but it is included here for completeness.

2.6.5. "Renaming" scenario D: GP to hospital

[AXIS]Instance processed: V(B) [GP]

[AXIS]Schema Availability: B [Hospital]

[AXIS]Operation performed: Schema validation

Brief summary: A GP system sends a message to a hospital system. An instance of the specialized schema is processed using the base schema when the processor does not have access to the specialized schema. The schema processor is forward compatible.

Basic course of events:

1. The GP system generates an instance of the specialized schema.
2. The specialized instance is sent to the hospital system (which only has access to the base schema).
3. The hospital system processes the specialized instance.

Desired outcome:

The hospital system cannot recognise the data as a blood pressure, but can process it generically. There might be a generic base schema database for storing them, or a generic viewer that can display the values.
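For the hospital system to process a renamed instance generically, it needs some way of relating the new names to the base names. The Python sketch below assumes a hypothetical rename table that maps the V(B) names back to those of schema B; how such a table would be expressed or delivered to the hospital system is not addressed by this document, and the function to_base_names is invented for the illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical rename table from V(B) names back to the base names in B.
# How such a table would reach the hospital system is left open here.
RENAMES = {"blood_pressure": "data", "systolic": "value1", "diastolic": "value2"}


def to_base_names(xml_text):
    """Parse the instance and map renamed elements back to their names in B."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        element.tag = RENAMES.get(element.tag, element.tag)
    return root


instance = ("<blood_pressure>"
            "<systolic><magnitude>142</magnitude><unit>mmHg</unit></systolic>"
            "<diastolic><magnitude>80</magnitude><unit>mmHg</unit></diastolic>"
            "</blood_pressure>")

base = to_base_names(instance)
print(base.tag)                           # data
print(base.findtext("value1/magnitude"))  # 142
print(base.findtext("value2/unit"))       # mmHg
```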

2.6.6. "Renaming" scenario E: Physiotherapy to hospital

[AXIS]Instance processed: W(...(B)...) [Physiotherapy]

[AXIS]Schema Availability: B [Hospital]

[AXIS]Operation performed: Schema validation

Brief summary: A physiotherapy system sends a message to a hospital system. An instance of the refined physiotherapy schema is processed using the base schema (when the processor has access to neither the refined physiotherapy schema nor the specialized GP schema). The schema processor is forward compatible across multiple versions.

Basic course of events:

1. The physiotherapy system generates an instance of the refined physiotherapy schema.
2. The refined physiotherapy instance is sent to the hospital system (which only has access to the base schema).
3. The hospital system processes the refined instance.

Desired outcome:

This scenario shows that the versioning must work across multiple generations of versions, not just between two successive versions.

2.6.7. Discussion

This use case originated from discussions about the "specialization" use case. Although the "specialization" use case has its basis in a real world example, this one does not. However, it is an interesting use case worth considering.

2.7. Customization

2.7.1. Overview

In this use case, a schema is customized for local needs, but the local version must not break validation with the original schema.

This use case is different from the "health care / specialization" use case because the customizations are extensions of the base schema. Without any versioning mechanisms, instances of the customized schemas would not normally be valid instances of the base schema.

This use case involves a Big-store which has a head office and branches. One of the suppliers to Big-store is Video-supplier, which has a head office and warehouses.

A base schema for invoices is defined by Big-store. We shall call this schema B. Big-store requires all of its suppliers to provide invoices using it.

An example instance of an item in the Big-store invoice is:

<item>
  <part>1138</part>
  <description>Generic widget</description>
  <quantity>12</quantity>
</item>

When Big-store created this schema, they did not know how others might want to extend it, and they had no plans or expectations for the reuse of the schema. This means they did not deliberately spend time putting in hooks for versioning. However, that does not preclude others from using any versioning mechanisms if they were available by default.

Video-supplier decides to reimplement their stock tracking system to be natively based on the Big-store invoice schema. They want to avoid the need to translate between the schemas they use internally and those they send to external customers.

However, Video-supplier needs to store additional information in their internal invoices. To do this, they create a customized schema that is a version of the Big-store invoice schema.

In this example, they want to indicate which warehouse the items were shipped from. Here, they have added their customizations to the same namespace as the Big-store invoice. We shall call this schema V(B).

This is an example instance of the Video-supplier item:

<item>
  <part>1138</part>
  <description>Generic widget</description>
  <quantity>12</quantity>
  <source>
    <warehouse>W1</warehouse>
  </source>
</item>

An item is used in a number of different places in an invoice (e.g. items delivered, items backordered).

Invoices are sent from Video-supplier warehouses to the Video-supplier head office (which expects to see the customized information items). They are also sent from the Video-supplier head office to the Big-store head office. Invoices are also sent from the Big-store head office to their Big-store branches.

The versioning mechanism must accomplish two things: make it easy to write the extensions, and not break processors expecting instances of the original schema.

Firstly, the versioning mechanism needs a simple way of indicating that wherever the Big-store invoice schema referred to a Big-store item, in the new Video-supplier schema a Video-supplier item must always be used instead. Redefining everything that refers to the Big-store item is not a good solution. The customized element could be used in many different places, leading to a maintenance problem. Redefining everything would effectively create a parallel copy of the schema, in which the clear relationship with the original schema would be lost: namely, that only the item has been versioned, and nothing else.

Secondly, Video-supplier wants to send their customized invoices to the Big-store without translation (i.e. without needing to strip out the customized elements). Those new Video-supplier instances must work with old processors that expect Big-store instances, as shown in the scenarios below.

2.7.2. "Customization" scenario A: Video head office to Big-store head office

[AXIS]Instance processed: V(B) [Video-supplier]

[AXIS]Schema Availability: B [Big-store]

[AXIS]Operation performed: Schema validation

Brief summary: The Video-supplier head office sends an invoice to the Big-store head office. An instance of the customized schema is processed by the base schema (without access to the customized schema). The schema processor is forward compatible.

Basic course of events:

1. The Video-supplier system generates an instance of the customized schema.
2. The customized instance is sent to the Big-store head office system (which only has access to the base schema).
3. The Big-store head office system processes the customized instance.

Desired outcome:

Although the customized instance contains extra elements which are not a part of the base schema, they must not cause the processor or the application to fail. The instance must be valid according to the base schema, because the contract between the two companies was based on exchanging invoices valid according to the Big-store schema. The information returned by the PSVI must not cause the Big-store application to break.

2.7.3. "Customization" scenario B: Video warehouse to Video head office

[AXIS]Instance processed: V(B) [Video-supplier]

[AXIS]Schema Availability: B and V(B) [Big-store and Video-supplier]

[AXIS]Operation performed: Schema validation

Brief summary: A Video-supplier warehouse sends an invoice to the Video-supplier head office. An instance of the customized schema is processed by the customized schema.

Basic course of events:

1. The Video-supplier warehouse system generates an instance of the customized schema.
2. The customized instance is sent to the Video-supplier head office.
3. The Video-supplier head office system processes the customized instance.

Desired outcome:

2.7.4. "Customization" scenario C: Big-store head office to Big-store branch

[AXIS]Instance processed: B [Big-store]

[AXIS]Schema Availability: B [Big-store]

[AXIS]Operation performed: Schema validation

Brief summary: The Big-store head office sends an invoice to a Big-store branch. An instance of the base schema is processed by a processor with the base schema.

Basic course of events:

1. The Big-store head office system generates an instance of the base schema.
2. The base instance is sent to a Big-store branch.
3. The Big-store branch system processes the base instance.

Desired outcome:

2.7.5. "Customization" scenario D: Big-store to Video head office

[AXIS]Instance processed: B [Big-store]

[AXIS]Schema Availability: B and V(B) [Big-store and Video-supplier]

[AXIS]Operation performed: Schema validation

Brief summary: The Big-store (incorrectly) sends an invoice to the Video-supplier head office. An instance of the base schema is processed by a system that expects an instance of the customized schema.

Basic course of events:

1. The Big-store system generates an instance of the base schema.
2. The base instance is sent to the Video-supplier head office.
3. The Video-supplier head office system processes the base instance.

Desired outcome:

The Video-supplier head office application expects that all invoices it receives are Video-supplier invoices, which contain the extra warehouse information. It must reject the invoice because it does not have the warehouse information.
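The rejection described above can be sketched as a simple acceptance check. This is an illustration under assumptions, not a schema-based solution: the namespace URIs and the `warehouse` element name are hypothetical, and the rule that every item must carry warehouse information is one reading of "the extra warehouse information".

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, invented for this sketch.
BIG_NS = "http://example.com/big-store"
VIDEO_NS = "http://example.com/video-supplier"


def accept_at_video_head_office(xml_text):
    """Accept an invoice only if every item carries warehouse information."""
    root = ET.fromstring(xml_text)
    items = root.findall("{%s}item" % BIG_NS)
    # An invoice without items, or with an item lacking the (assumed)
    # warehouse element, is treated as a plain Big-store invoice and rejected.
    return bool(items) and all(
        item.find("{%s}warehouse" % VIDEO_NS) is not None for item in items
    )


BASE_INVOICE = (
    '<b:invoice xmlns:b="http://example.com/big-store">'
    "<b:item><b:sku>DVD-1</b:sku></b:item>"
    "</b:invoice>"
)

CUSTOMIZED_INVOICE = (
    '<b:invoice xmlns:b="http://example.com/big-store"'
    ' xmlns:v="http://example.com/video-supplier">'
    "<b:item><b:sku>DVD-1</b:sku><v:warehouse>North</v:warehouse></b:item>"
    "</b:invoice>"
)
```

With these inputs, `accept_at_video_head_office(BASE_INVOICE)` is false and `accept_at_video_head_office(CUSTOMIZED_INVOICE)` is true.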

2.8. MathML

[TODO]This use case is currently incomplete.

What is the most convenient way to integrate new constructs (e.g. constructs which represent newly discovered mathematical constructs) into a specialized language like MathML? Is it possible to introduce new elements and attributes in such a way as to allow software which does not have hard-coded knowledge of them to do the right thing with them?

2.9. XSD versioning

2.9.1. Overview

The XML Schema definition language itself can be subject to versioning. Newer versions of the language may introduce changes that could affect how existing processors behave. It is desirable that the mechanism allow for forward and backward compatibility. This applies both to the schema for schemas and to the schema processors.

Mechanisms in the XML Schema definition language must allow for future versions to be created that will introduce new constructs in the language. Those future versions must behave well with existing schema processors -- a default behaviour for them must be defined. One possible default behaviour would be to ignore the new constructs, although there could be other possible behaviours.

There may be situations where the default behaviour could be redefined. The alternative behaviour would be specified in the new schema. However, this would produce different results between old processors that have access to the new schema and old processors that do not.

Consider a version of the XML Schema definition language, version V. Additional constructs are added to it to create a new version, version W. Schema W will be referred to as the "new" version, and schema V as the "old" version.

The versioning mechanisms must be able to indicate how version n processors behave when they handle version n+m schemas (where m >= 1).

2.9.2. "xsd" scenario A: New instance with new processor with new schema

[AXIS]Instance processed: W [New]

[AXIS]Schema Availability: V, W [Old and new]

[AXIS]Schema processor designed for: V, W [Old and new]

[AXIS]Operation performed: Schema validation

Brief summary: A new version instance is processed by a new version schema processor which has access to the new schema.

Basic course of events:

1. The instance is read.
2. It is identified as an instance of the new schema, and the new schema is used.
3. The instance is validated against the new schema.

Desired outcome:

This is the simple case where everything (the instance, schema, and processor) is conforming to the new version, so it will all work properly.

2.9.3. "xsd" scenario B: New instance with new processor without new schema

[AXIS]Instance processed: W [New]

[AXIS]Schema Availability: V [Old]

[AXIS]Schema processor designed for: V, W [Old and new]

[AXIS]Operation performed: Schema validation

Brief summary: A new version instance is processed by a new version schema processor, but the new schema is not available.

Basic course of events:

1. The instance is read.
2. It is identified as an instance of the old schema, because the processor does not have access to the new schema.
3. The instance is validated against the old schema.

Desired outcome:

2.9.4. "xsd" scenario C: New instance with old processor with new schema

[AXIS]Instance processed: W [New]

[AXIS]Schema Availability: V, W [Old and new]

[AXIS]Schema processor designed for: V [Old]

[AXIS]Operation performed: Schema validation

Brief summary: A new version instance is processed by an old version schema processor, but it has access to the new schema.

Basic course of events:

1. The instance is read.
2. It is identified as an instance of the new schema, and the new schema is used.
3. However, the processor is not designed to accept the new schema, so it uses a default interpretation of the new schema constructs.
4. The instance is validated against the default interpretation of the new schema.

Desired outcome:

An alternative case would be if the new schema specified an alternative fallback behaviour. In that case, the old schema processor would interpret the new schema using the fallback behaviour, instead of the default behaviour. The instance would be validated using those alternative rules.
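The difference between the default interpretation and a schema-specified fallback can be sketched as follows. This is a minimal model under assumptions: the `Construct` class, the set of constructs known to the old processor, and the choice of "ignore" as the default behaviour are all hypothetical and are used only to make the two cases concrete.

```python
from dataclasses import dataclass
from typing import Optional

# Constructs the (hypothetical) old processor was designed for.
KNOWN_TO_OLD_PROCESSOR = {"element", "sequence", "choice", "all"}


@dataclass
class Construct:
    """A schema construct, optionally carrying an author-supplied fallback."""
    name: str
    fallback: Optional["Construct"] = None


def interpret(constructs, known=KNOWN_TO_OLD_PROCESSOR):
    """Return the construct names an old processor would actually apply."""
    effective = []
    for construct in constructs:
        if construct.name in known:
            # Understood natively.
            effective.append(construct.name)
        elif construct.fallback is not None and construct.fallback.name in known:
            # Fallback behaviour specified in the new schema.
            effective.append(construct.fallback.name)
        # Otherwise: default behaviour (assumed here to be "ignore").
    return effective


new_schema = [
    Construct("element"),
    Construct("check"),                                    # no fallback
    Construct("content", fallback=Construct("sequence")),  # with fallback
]
```

Here `interpret(new_schema)` yields `["element", "sequence"]`: the construct without a fallback is ignored, and the one with a fallback is replaced by it.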

2.9.5. "xsd" scenario D: New instance with old processor without new schema

[AXIS]Instance processed: W [New]

[AXIS]Schema Availability: V [Old]

[AXIS]Schema processor designed for: V [Old]

[AXIS]Operation performed: Schema validation

Brief summary: A new version instance is processed by an old version schema processor, without access to the new schema.

Basic course of events:

1. The instance is read.
2. It is identified as an instance of the old schema, and so the old schema is used.
3. The instance is validated against the old schema.

Desired outcome:

2.9.6. "xsd" scenario E: Old instance

[AXIS]Instance processed: V [Old]

[AXIS]Schema Availability: V, W [Old and new] or just V [Old]

[AXIS]Schema processor designed for: V, W [Old and new] or just V [Old]

[AXIS]Operation performed: Schema validation

Brief summary: An old version instance is processed.

Basic course of events:

1. The instance is read.
2. It is identified as an instance of the old schema, and so the old schema is used.
3. The instance is validated against the old schema.

Desired outcome:

This use case assumes that schema processors are always backward compatible.
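Scenarios A to E can be summarised as a small decision function over the three axes (instance processed, schema availability, processor design). This is only a sketch of the courses of events listed above; the function name and the mode labels `"normal"` and `"default-or-fallback"` are invented for illustration.

```python
def plan_validation(instance_version, schemas_available, processor_versions):
    """Return (schema_used, mode) following scenarios A to E.

    instance_version   -- "V" (old) or "W" (new)
    schemas_available  -- set of schema versions the processor can access
    processor_versions -- set of versions the processor was designed for
    """
    if instance_version == "V":
        return ("V", "normal")               # scenario E
    if "W" not in schemas_available:
        return ("V", "normal")               # scenarios B and D
    if "W" in processor_versions:
        return ("W", "normal")               # scenario A
    return ("W", "default-or-fallback")      # scenario C
```

For example, `plan_validation("W", {"V", "W"}, {"V"})` corresponds to scenario C and returns `("W", "default-or-fallback")`.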

2.9.7. Discussion

The constructs that will be added to future versions of the XML Schema definition language are not yet known. These examples illustrate some possible types of changes that the versioning mechanism has to handle. It is desirable that the versioning mechanism can handle all of them, but a mechanism which handles only some of them may be acceptable.

These are only hypothetical examples. They should not be interpreted as indicating what features are being considered for future versions of XML Schema.

2.9.7.1. Example 1: New check constraints

A new top-level element called "xsd:check" is defined. It is analogous to a table-level check clause in SQL, and contains a series of predicates (expressed by xsd:test elements), each of which must be true of the document as a whole.

A processor designed to handle the new version of schemas will know how to check the constraints it contains. However, an older version processor will not know how.

The default behavior for an old version processor is to ignore the constraints and proceed without an error (although it might issue a warning).

The schema author should be able to indicate if the constraints cannot be ignored. In that case, an old version processor must signal an error.

Also, the creator of the XML instance may need to indicate how the old version processor should behave: whether it should ignore the checks, or fail if it can't process them.
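The behaviour of an old version processor in this example can be sketched as below. The sketch rests on assumptions: the flag names are hypothetical, and the text above does not say how a schema-level indication and an instance-level indication should be combined, so the sketch simply fails if either one asks for failure.

```python
class UnsupportedConstructError(Exception):
    """Raised when a construct that must not be ignored is not understood."""


def handle_unknown_check(schema_requires_check=False, instance_requires_check=False):
    """Model an old version processor meeting an xsd:check it cannot evaluate.

    Returns a list of warnings when the check is ignored, and raises
    UnsupportedConstructError when the check may not be ignored.
    """
    # Assumption: failure is requested if EITHER party demands it.
    if schema_requires_check or instance_requires_check:
        raise UnsupportedConstructError(
            "xsd:check cannot be evaluated by this processor"
        )
    # Default behaviour: ignore the constraint, optionally with a warning.
    return ["warning: xsd:check ignored by old version processor"]
```

A different precedence rule (for example, letting the instance relax a schema-level requirement) would be an equally plausible reading and would change only the condition in the `if` statement.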

2.9.7.2. Example 2: New embedded element (I)

The syntax of content models is changed. In addition to the 1.0 transfer syntax for content models, the WG wishes to allow a different XML transfer syntax. So where a version n processor expects an 'xsd:element', 'xsd:sequence', 'xsd:choice', or 'xsd:all' element, a version n+1 schema document may instead have an 'xsd:content' element.

Since a version n processor will not understand how to interpret the xsd:content element, a schema author interested in co-existence with version n processors may wish to specify a version-n-style content model as a fallback. (If the main appeal of xsd:content is that it provides currently unavailable functionality, the v.n fallback will be only an approximation. If the main appeal is that xsd:content is easier to read or more elegant, it seems unlikely that authors will want to provide the fallback for schemas being edited. Once the schema is frozen, however, the old syntax could be added by hand or by machine for portability's sake.)

2.9.7.3. Example 3: New embedded element (II)

Like the xsd:schema element itself, in v.n+1 the other top-level source declarations are also to be allowed to have xsd:check elements. The schema author may wish to provide fallbacks, where appropriate, using (say) key and keyref, which approximate the tests in the check clause and which can be performed by v.n processors.

The proper behavior of a v.n processor is to perform the fallback validation if any is specified, and to ignore the check clause otherwise (possibly with a warning).

2.9.7.4. Example 4: Extension namespace labeling

Addition of an "extension-namespace-prefixes" or "extension-namespaces" attribute to the xsd:schema element. This attribute allows a schema author to declare that certain namespaces should be recognized as containing elements or attributes which are extensions to the XSD specification; these extensions may be recognized and processed by some but not by all conforming processors and should not cause an error. A v.n processor should ignore the attribute (although it might raise an error if it encounters actual extension elements in the schema document in places where it's not prepared to find unknown material).

2.9.7.5. Example 5: new attribute on wildcards

A new local (unqualified) attribute, excluded-namespaces, is added to the xsd:any element, with the intended semantics being that the wildcard will match no elements in the namespaces named in the attribute value.

There are various possible desired outcomes; ideally, the WG should be able to achieve any of these.

(a) Version-N processors of the schema language should ignore the excluded-namespaces attribute; no fatal error should be raised (although a warning is in order). The result will be that the version-N processors will correctly accept anything valid according to the version-N+1 schema, but will incorrectly accept some documents which a version-N+1 processor would reject.

(b) Version-N processors of the schema language should ignore the excluded-namespaces attribute only if passed, at run time, a schema for schema documents which shows it as valid. Result as above.

(c) Schema authors should be able to cause version-N processors of the schema language either to ignore the excluded-namespaces attribute or to perform some alternative fallback behavior.
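Outcome (a) can be made concrete with a toy model of wildcard matching. This is a sketch under assumptions: real wildcard matching involves more than a namespace comparison, and the function name, the boolean flag and the example namespace URIs are invented for illustration.

```python
def wildcard_accepts(element_namespace, excluded_namespaces, understands_exclusions):
    """Model an xsd:any wildcard with a hypothetical excluded-namespaces list.

    A version-N processor (understands_exclusions=False) ignores the list;
    a version-N+1 processor (understands_exclusions=True) honours it.
    """
    if understands_exclusions and element_namespace in excluded_namespaces:
        return False
    return True


# Hypothetical namespaces used only for this example.
EXCLUDED = {"http://example.com/forbidden"}
CANDIDATES = ["http://example.com/forbidden", "http://example.com/other"]

accepted_by_n = [ns for ns in CANDIDATES if wildcard_accepts(ns, EXCLUDED, False)]
accepted_by_n_plus_1 = [ns for ns in CANDIDATES if wildcard_accepts(ns, EXCLUDED, True)]
```

Everything in `accepted_by_n_plus_1` also appears in `accepted_by_n`, but not the reverse, which is the asymmetry described in outcome (a).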

2.9.7.6. Example 6: xsd:all allowed at any level of a content model

The xsd:all element is allowed not only at the top level of content models, but anywhere (roughly as in SGML). A schema author should have the ability to specify how a version-N processor should behave, either by providing a version-N formulation of a content model particle that can be used instead of the 'all'-group, or by specifying that a version-N processor must fail.

Optionally, the described behavior of the version-N processor may be made conditional on the processor being supplied with a schema for schema documents that certifies the un-understood use of the xsd:all element as valid.

2.9.7.7. Example 7: grammaticalization of attributes

Instead of requiring a content model followed by specification of attributes, the new schema language allows attribute declarations to occur in the content model. For example, attribute declarations may appear inside a choice to indicate that one or the other, but not both, of the specified attributes may appear. (The effect will be similar to the content models of RELAX NG.)

Two possible desired outcomes:

(a) a version-N processor should accept all the attribute declarations, ignoring their context, and interpret the content model as if the attribute declarations were not there. The result will be that the version-N processor will accept all documents valid according to the version N+1 schema, but will not enforce any co-occurrence constraints expressed by the content model.

(b) a version-N processor should fail unless the schema author provides an alternative formulation of the type in terms the version-N processor understands.
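Outcome (a) can be illustrated with a toy model of the two processors. The attribute names and the `DECLARED` and `CHOICE` tables are hypothetical, and the sketch assumes that a version-N processor treats every attribute declaration it finds as optional and that the choice requires exactly one of its attributes.

```python
# Hypothetical attribute declarations found in the version N+1 type.
DECLARED = {"href", "idref", "title"}
# Attributes declared inside a choice: exactly one of them may appear.
CHOICE = ("href", "idref")


def valid_for_old_processor(attrs):
    """Version N: accept any subset of the declared attributes."""
    return set(attrs) <= DECLARED


def valid_for_new_processor(attrs):
    """Version N+1: also enforce the co-occurrence constraint of the choice."""
    present = [name for name in CHOICE if name in attrs]
    return set(attrs) <= DECLARED and len(present) == 1
```

An instance carrying both `href` and `idref` is accepted by the version-N model and rejected by the version-N+1 model, while anything the version-N+1 model accepts is also accepted by the version-N model.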

[TODO]Add discussion about which use cases are currently solvable by XML Schema 1.0.

[TODO]Add appropriate references.