XML Schema Versioning Use Cases

[SUBTITLE]Draft - 17 March 2005

[SUBTITLE]W3C XML Schema Working Group

[VERSION]This version: http://www.w3.org/XML/2005/xsd-versioning-use-cases/xsd-versioning-use-cases-2005-03-17.html

[VERSION]Latest version: http://www.w3.org/XML/2005/xsd-versioning-use-cases/

[VERSION]Previous version: http://www.w3.org/XML/2005/xsd-versioning-use-cases/xsd-versioning-use-cases-2005-02-17.html

[VERSION]Editor:

[EDITORS]Hoylen Sue, DSTC Pty Ltd <h.sue@dstc.edu.au>

Abstract

This document describes use cases where XML Schemas are being versioned. These are situations where there is more than one XML Schema, with instances corresponding to each, and those schemas are based on each other. The use cases describe the different types of versioning behaviour that users want from XML Schema processors.

This document has been produced by the W3C XML Schema Working Group, to serve as input to the Working Group's work on the versioning of XML Schemas. It illustrates the types of versioning problems that could be solved by versioning mechanisms in XML Schema. However, note that the presence of a use case does not necessarily imply that XML Schema will be able to solve that particular versioning problem.

Status of this Document

This is a draft discussion document. Some of the use cases have been extensively discussed in the Working Group. However, the current set of use cases and text describing them have not been endorsed by the Working Group. The current document is in draft form, and is subject to change.

These use cases are based on real examples submitted by users of XML Schema.

The XML Schema Working Group would welcome additional use cases that illustrate aspects of versioning not captured by the existing use cases.

1. Introduction

1.1. Classification

1.1.1. Schema Availability
1.1.2. Instance Processed
1.1.3. Operation Performed
1.1.4. PSVI reporting

1.2. Terminology

2. Use cases

2.1. Convenience store

2.1.1. Overview

2.1.2. Use Case "Convenience" Scenario 1 - Item Configuration

2.1.2.1. Instance-F1 validated by Schema-F1
2.1.2.2. Instance-F1 partially validated by Schema-B
2.1.2.3. Instance-B partially validated by Schema-F1
2.1.2.4. Instance-CW validated against Schema-F1
2.1.2.5. Instance-F1 validated against Schema-CW
2.1.2.6. Instance-F2 validated against Schema-B
2.1.2.7. Instance-F2 validated against Schema-F1
2.1.2.8. Instance-F2 validated against Schema-CW
2.1.2.9. Instance-F1 validated against Schema-F2
2.1.2.10. Instance-B validated against Schema-F2
2.1.2.11. Instance-CW validated against Schema-F2

2.2. Comparison (Convenience store XSLT)

2.2.1. Overview
2.2.2. Use case "Comparison" scenario 1
2.2.3. Use case "Comparison" scenario 2

2.3. Specialization (Health care 1)

2.3.1. Overview
2.3.2. Use case "Specialization" scenario 1: Hospital to GP
2.3.3. Use case "Specialization" scenario 2: GP to GP
2.3.4. Use case "Specialization" scenario 3: Hospital to hospital
2.3.5. Use case "Specialization" scenario 4: GP to hospital
2.3.6. Use case "Specialization" scenario 5: Physiotherapy to hospital

2.4. Renaming (Health care 2)

2.4.1. Overview
2.4.2. Use case "Renaming" scenario 1: Hospital to GP
2.4.3. Use case "Renaming" scenario 2: GP to GP
2.4.4. Use case "Renaming" scenario 3: Hospital to hospital
2.4.5. Use case "Renaming" scenario 4: GP to hospital
2.4.6. Use case "Renaming" scenario 5: Physiotherapy to hospital

2.5. Customization (Invoice)

2.5.1. Overview
2.5.2. Use case "Customization" scenario 1: Video-supplier head office to Mega-store head office
2.5.3. Use case "Customization" scenario 2: Video-supplier warehouse to Video-supplier head office
2.5.4. Use case "Customization" scenario 3: Big-store head office to Big-store branch
2.5.5. Use case "Customization" scenario 4: Big-store to Video-supplier head office

2.6. UBL

2.7. Namespace overloading (XHTML)

2.7.1. Overview

2.8. XSLT

2.9. MathML

2.10. XSD versioning

1. Introduction

The ability to create different versions of XML Schemas is important. In some applications, schemas need to change over time and be adapted to meet new requirements. However, it is often not practical to instantaneously replace all the deployments of the old schemas with the new ones. Applications will need to cope with instances corresponding to the different versions of schemas. Versioning mechanisms will allow those new versions to be created, and the processors to handle instances from the different versions.

This document describes desirable behaviour in use cases that involve XML Schema versioning. The use case approach aims to describe external interactions with the system (in this case, the system is the XML Schema processor). The use cases deliberately do not describe any implementation-specific mechanisms. Possible versioning mechanisms are discussed in "Framework for discussion of versioning" <http://www.w3.org/XML/2004/02/xsdv.html>.

This document focuses on the versioning of XML Schemas, and in particular on the behaviour of XML Schema processors, which perform schema validation and expose information via the Post-Schema-Validation Infoset (PSVI). Schema versioning is also important to other types of systems, such as applications and code generators that use XML Schemas. However, those other aspects of versioning are outside the scope of this document.

For more general information on versioning, the W3C Technical Architecture Group (TAG) is producing a TAG Finding on versioning on the Web.

Discussion of versioning is conducted on the <public-xml-versioning@w3.org> mailing list. Please send your comments on this document to that mailing list.

1.1. Classification

Several different axes have been identified to classify the different types of use cases. These axes are:

Schema Availability
Instance Processed
Operation Performed
PSVI reporting

1.1.1. Schema Availability

This axis indicates which schemas are available to the processor. Schemas will be identified by a letter, for example "B". The notation "V(B)" will be used to describe a schema called V that is a version of schema B.

If a schema is based on more than one schema, the base schemas are all listed. For example, "V(B,C,D)" denotes a schema called V that is a version of schemas B, C, and D.

A version of a schema may be versioned again. The notation "W(V(B))" indicates a schema called W which is a version of V, and V is a version of B.

It is important to remember that versioning can occur multiple times. The notation "D(...(B)...)" is used to denote a schema D that is ultimately a version of B. However, there may be several versions between them.
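
The notation can be made concrete with a small sketch. The Python below is only an illustration and is not part of any XML Schema processor: the names SchemaRef, parse_notation and ancestors are hypothetical, and the open-ended "D(...(B)...)" form is deliberately rejected because it elides the intermediate versions.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SchemaRef:
    """A schema name plus the schemas it is directly a version of."""
    name: str
    bases: list[SchemaRef] = field(default_factory=list)


def _parse(s: str, i: int) -> tuple[SchemaRef, int]:
    """Parse one schema reference starting at index i; return it and the next index."""
    start = i
    while i < len(s) and s[i] not in "(),":
        i += 1
    name = s[start:i]
    if not name:
        raise ValueError(f"expected a schema name at position {start}")
    ref = SchemaRef(name)
    if i < len(s) and s[i] == "(":
        i += 1
        while True:
            base, i = _parse(s, i)
            ref.bases.append(base)
            if i < len(s) and s[i] == ",":
                i += 1          # another base schema follows
                continue
            if i < len(s) and s[i] == ")":
                i += 1          # end of the list of base schemas
                break
            raise ValueError(f"expected ',' or ')' at position {i}")
    return ref, i


def parse_notation(text: str) -> SchemaRef:
    """Parse notation such as "B", "V(B)", "V(B,C,D)" or "W(V(B))"."""
    compact = text.replace(" ", "")
    ref, pos = _parse(compact, 0)
    if pos != len(compact):
        raise ValueError(f"unexpected text at position {pos}")
    return ref


def ancestors(ref: SchemaRef) -> list[str]:
    """Names of every schema that ref is ultimately a version of, nearest first."""
    found: list[str] = []
    for base in ref.bases:
        for name in [base.name, *ancestors(base)]:
            if name not in found:
                found.append(name)
    return found
```

For example, ancestors(parse_notation("W(V(B))")) lists V and then B, which is the sense in which W is "ultimately a version of" B.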

The availability of schemas will depend on the schema processor and application. Sometimes processors can be easily updated with the new schemas. Schemas could be manually installed into the application, or the processor could automatically fetch them when they are needed. At other times, processors might not be configurable with the new schemas. They may be embedded devices that cannot easily be changed, or they might be disconnected from a network and can't automatically fetch any new schemas. Fetching external schemas might also be disallowed due to performance or security reasons.

1.1.2. Instance Processed

This axis indicates what kind of instance is being processed; that is, which version of the schema the instance corresponds to. The same notation as for schemas will be used to denote the different versions of instances.

A schema processor needs to handle instances of different versions. It may accept, partially accept, or reject the instance. The behaviour of schema processors is the subject of these use cases.

1.1.3. Operation Performed

This axis describes what action is being performed by the XML Schema processor.

The most common action is schema validation, where an instance document is validated against an XML Schema. This produces a result indicating whether the document is valid, and a Post-Schema-Validation Infoset (PSVI). This is what current schema processors do.

A new operation in versioning is the comparison of two schemas to see if they are versions of each other.

1.1.4. PSVI reporting

[TODO]To be completed. Is this really an axis?

[TODO]Investigate if behaviour or fallback is another valid axis. Possibilities are: Must fail if you encounter unexpected things; Must ignore any unexpected things you come across; Trigger special action; Fallback based on schema; Fallback based on data in the instance. Fallback behaviour might be another use case.

1.2. Terminology

The term "backward compatible" will mean an instance of an old schema can be processed by a processor that handles the new schema. The instance is not necessarily processed in the same way as an old schema processor would process it.

The term "forward compatible" will mean an instance of the new schema can be processed by a processor that handles the old schema. The instance is not necessarily processed in the same way as a new schema processor would process it.

2. Use cases

2.1. Convenience store

[TODO]This use case has not been finished yet.

2.1.1. Overview

Consider a convenience store, with an electronic cash register, fuel pumps, car wash, and back-office accounting and management system.

The computer systems in the cash register, fuel pumps, car wash, and back-office system are built and sold by different vendors, but they must cooperate:

The peripherals (scanners, card readers, check readers, etc.) send fragmentary transaction data to the cash register.
The back-office system programs the cash register, car wash, and fuel pumps with products, prices, and rules to apply (e.g. discounts, restrictions, limitations, etc.).
The cash register creates a stream of transaction data to be read by the back-office system.
The fuel pumps and car-wash report totalizer information at various times of the day to the back-office system.

In order to provide for some interoperation among devices, vendor-neutral organizations create schemas which provide definitions for common document exchanges. Because the vendors compete for sales, it is to each vendor's advantage to add value by providing more useful information than the other vendor's product; this useful information most naturally takes the form of new child elements or attributes on elements defined in the common schema.

Because the devices must cooperate, dropped or garbled messages are a potential problem; if one device rejects a message sent by another device, the result is likely to be finger-pointing between the vendors about whose device is at fault: did the fuel pump emit a bad message, or did the cash register reject a perfectly legal data stream? In order to make such disputes relatively easy to resolve, it is extremely helpful to have a written description of the set of messages the devices are required to accept. An obvious choice is to have this description take the form of a schema document: that is, the accept set is defined as the set of documents valid against a particular schema. (Either full validation (VF) or partial validation (VP) will do, although VP may involve some further stipulations about how full or partial the validation has to be.)

Any approach to versioning which involves accepting partially valid documents abandons the idea of using the schema as defining the contract between sender and recipient, and that leaves the vendors without a convenient way to adjudicate disputes over dropped data, greatly reducing the value of the schema or DTD.

For security or processing-footprint reasons, the convenience store devices do not load new schemas dynamically; schemas may be changed as part of an upgrade or as part of system maintenance, but the upgrades to each device are scheduled by their respective vendors, not always under the control of the convenience store operator. Vendors may be understandably loath to spend time and effort changing the schemas in their deployed devices just to support an upgrade made by a competitor to a different device. Therefore, in deploying a new version of a common schema (or a new set of value-added additions to it), a vendor cannot count on corresponding changes to the other devices. This requirement for coordination severely complicates the logistics of upgrading systems.

2.1.2. Use Case "Convenience" Scenario 1 - Item Configuration

Basic Cash register schema:

For example, consider a basic cash register that processes sales transaction information. It might expect data according to this datatype (Schema-B):

<xsd:complexType name="basic-item">
  <xsd:sequence>
    <xsd:element name="price" type="xsd:decimal"/>
    <xsd:element name="description" type="xsd:string"/>
  </xsd:sequence>
</xsd:complexType>

An example instance is (Instance-B):

<item>
  <price>12.90</price>
  <description>Unleaded fuel</description>
</item>

Fuel pump schema:

A vendor that supplies the gas/petrol station market produces fuel pumps and special fuel cash registers. These exchange data using a version of the basic cash register schema. This version adds information that is relevant to managing a fuel sale (Schema-F1).

<xsd:complexType name="fuel-item">
  <xsd:sequence>
    <xsd:element name="price" type="xsd:decimal"/>
    <xsd:element name="description" type="xsd:string"/>
    <xsd:element name="pump-id" type="xsd:integer"/>
    <xsd:element name="litres" type="xsd:decimal"/>
  </xsd:sequence>
</xsd:complexType>

An example instance (Instance-F1):

<item>
  <price>12.90</price>
  <description>Unleaded fuel</description>
  <pump-id>1</pump-id>
  <litres>42.1</litres>
</item>

2.1.2.1. Instance-F1 validated by Schema-F1

The fuel cash register is, obviously, designed to use the extra information provided by the gas pump. It must validate a fuel sale message against the fuel sale schema. However, some installations may have a mixture of different types of devices and cash registers.

2.1.2.2. Instance-F1 partially validated by Schema-B

If a fuel pump sends a fuel sale message to a basic cash register, the basic cash register may want to validate the fuel sale message against the basic cash register schema, since the basic cash register is unaware of the fuel pump version of the schema.

2.1.2.3. Instance-B partially validated by Schema-F1

If a device sends a basic cash register message to a fuel cash register, then the fuel cash register needs to validate the document, even if that document conforms to the basic (not fuel) cash register schema.
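
What "partially validated" should mean is left open by these use cases. Purely as an illustration, the Python sketch below assumes one possible reading: an instance is partially valid when it agrees with the schema on a common leading run of child elements, and differs only by extra trailing elements or by missing trailing elements. The function check, the three outcome labels, and the list-based stand-ins for Schema-B and Schema-F1 are all hypothetical; no real schema processor is involved.

```python
import re
import xml.etree.ElementTree as ET


def is_decimal(text):
    """Rough stand-in for the xsd:decimal lexical form."""
    return re.fullmatch(r"[+-]?(\d+(\.\d*)?|\.\d+)", text.strip()) is not None


def is_integer(text):
    """Rough stand-in for the xsd:integer lexical form."""
    return re.fullmatch(r"[+-]?\d+", text.strip()) is not None


def is_string(text):
    """Any text is accepted as xsd:string."""
    return True


# Hypothetical stand-ins for the two schemas: an ordered list of
# (element name, value check) pairs, mirroring each xsd:sequence.
SCHEMA_B = [("price", is_decimal), ("description", is_string)]
SCHEMA_F1 = SCHEMA_B + [("pump-id", is_integer), ("litres", is_decimal)]

INSTANCE_B = (
    "<item><price>12.90</price>"
    "<description>Unleaded fuel</description></item>"
)
INSTANCE_F1 = (
    "<item><price>12.90</price>"
    "<description>Unleaded fuel</description>"
    "<pump-id>1</pump-id><litres>42.1</litres></item>"
)


def check(item_xml, schema):
    """Return (outcome, unrecognised element names, missing element names).

    outcome is "full", "partial" or "invalid" under the assumption
    stated above; it is not an XML Schema validity assessment.
    """
    children = list(ET.fromstring(item_xml))
    matched = 0
    for child, (name, accepts) in zip(children, schema):
        if child.tag != name or not accepts(child.text or ""):
            break
        matched += 1
    unrecognised = [c.tag for c in children[matched:]]
    missing = [name for name, _ in schema[matched:]]
    if not unrecognised and not missing:
        return "full", [], []
    if unrecognised and missing:
        return "invalid", unrecognised, missing
    return "partial", unrecognised, missing
```

Under this reading, check(INSTANCE_F1, SCHEMA_B) reports a partial result with pump-id and litres unrecognised, and check(INSTANCE_B, SCHEMA_F1) reports a partial result with the same two elements missing. Whether either outcome is acceptable is exactly the question the use case raises.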

Car wash schema:

A different vendor manufactures car wash machines which exchange a car wash item message, which is another version of the basic cash register item message. This car wash schema (Schema-CW) contains:

<xsd:complexType name="car-wash-item">
  <xsd:sequence>
    <xsd:element name="price" type="xsd:decimal"/>
    <xsd:element name="description" type="xsd:string"/>
    <xsd:element name="clean-level" type="xsd:integer"/>
  </xsd:sequence>
</xsd:complexType>

An instance document looks like this (Instance-CW):

<item>
  <price>10.95</price>
  <description>Standard car wash</description>
  <clean-level>1</clean-level>
</item>

Similar to the situation with the fuel pump manufacturer, the vendor also produces car wash cash registers that can accept these enhanced car wash versions of the sales message.

However, the versioning mechanism now also needs to handle additional versioning situations.

2.1.2.4. Instance-CW validated against Schema-F1

If a car wash machine sends a car wash item message to a fuel cash register, the latter needs to (partially) validate it against the basic cash register schema.

2.1.2.5. Instance-F1 validated against Schema-CW

???

Enhanced fuel cash register

The fuel pump vendor produces a new type of fuel pump, which sends enhanced fuel sale messages. The schema (Schema-F2) is an enhanced version of the fuel item schema.

<xsd:complexType name="enhanced-fuel-item">
  <xsd:sequence>
    <xsd:element name="price" type="xsd:decimal"/>
    <xsd:element name="description" type="xsd:string"/>
    <xsd:element name="pump-id" type="xsd:integer"/>
    <xsd:element name="litres" type="xsd:decimal"/>
    <xsd:element name="pay-at-pump" type="xsd:boolean"/>
    <xsd:element name="credit-card-number" type="xsd:string"/>
  </xsd:sequence>
</xsd:complexType>

An instance is (Instance-F2):

<item>
  <price>12.90</price>
  <description>Unleaded fuel</description>
  <pump-id>1</pump-id>
  <litres>42.1</litres>
  <pay-at-pump>true</pay-at-pump>
  <credit-card-number>5313 1234 1234 1234</credit-card-number>
</item>

2.1.2.6. Instance-F2 validated against Schema-B

If a basic cash register receives an enhanced fuel sale message... ???

2.1.2.7. Instance-F2 validated against Schema-F1

If the original fuel cash register receives an enhanced fuel sale message... ???

2.1.2.8. Instance-F2 validated against Schema-CW

If a car wash cash register receives an enhanced fuel sale message... ???

2.1.2.9. Instance-F1 validated against Schema-F2

???

2.1.2.10. Instance-B validated against Schema-F2

???

2.1.2.11. Instance-CW validated against Schema-F2

???

2.2. Comparison (Convenience store XSLT)

2.2.1. Overview

In this use case, schemas are compared to see if they are versions of each other.

A back-office system receives sales transaction data from a number of point-of-sale (POS) devices. There are many different types of POS devices and they are produced by different vendors.

Due to market forces and competition between the vendors, there is no single standard data schema that they all use. Each vendor enhances its own schemas to be more competitive within its own line of products. However, this produces problems for customers who use products from multiple vendors. The vendors' schemas are similar to each other, because they borrow features from each other. However, there is no well-defined versioning lineage.

The back-office system has been designed to receive sales transaction data. For example, this data could contain entries corresponding to this XML Schema datatype:

<xs:complexType name="transaction">
  <xs:sequence>
    <xs:element name="plu" type="xs:nonNegativeInteger"/>
    <xs:element name="description" type="xs:string"/>
    <xs:element name="unitprice" type="xs:decimal"/>
    <xs:element name="quantity" type="xs:integer"/>
    <xs:element name="lineprice" type="xs:decimal"/>
  </xs:sequence>
</xs:complexType>

An example instance is:

<transaction>
  <plu>1861003129</plu>
  <description>Widgets</description>
  <unitprice>10.95</unitprice>
  <quantity>2</quantity>
  <lineprice>21.90</lineprice>
</transaction>

Due to an industry initiative, products encoded with a UPC-A label must get special handling. This handling includes provisions for computing checksums and checking lengths. The POS vendor introduces an encoding attribute on the "plu" element to designate the encoding type. At the same time, the vendor introduces a new "discount" element to designate price reductions.

<xs:complexType name="transaction">
  <xs:sequence>
    <xs:element name="plu">
      <xs:complexType>
        <xs:simpleContent>
          <xs:extension base="xs:nonNegativeInteger">
            <xs:attribute name="encoding" type="xs:string"/>
          </xs:extension>
        </xs:simpleContent>
      </xs:complexType>
    </xs:element>
    <xs:element name="description" type="xs:string"/>
    <xs:element name="unitprice" type="xs:decimal"/>
    <xs:element name="discount" type="xs:decimal" minOccurs="0"/>
    <xs:element name="quantity" type="xs:integer"/>
    <xs:element name="lineprice" type="xs:decimal"/>
  </xs:sequence>
</xs:complexType>

An example instance is:

<transaction>
  <plu encoding="upcA">1861003129</plu>
  <description>Widgets</description>
  <unitprice>10.95</unitprice>
  <discount>0.10</discount>
  <quantity>2</quantity>
  <lineprice>21.90</lineprice>
</transaction>

2.2.2. Use case "Comparison" scenario 1

[AXIS]Instance processed: N/A

[AXIS]Schema Availability: B and V(B) [Back-office and Vendor]

[AXIS]Operation performed: comparison

Brief summary: A systems integrator wants to check if the data produced by the new POS device is suitable for the back-office system.

Basic course of events:

1. The back-office schema is obtained.

2. The POS device schema is obtained.

3. The two schemas are compared.

The outcome:

Processor indicates that instances that are schema valid against V(B) are also valid against B.
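
How the comparison in step 3 would be carried out is not specified by this document. As a rough illustration only, the Python sketch below lists the differences between the child element declarations of the two "transaction" types. It is a naive structural diff under several assumptions: the fragments are wrapped in a full xs:schema element so that they parse, type names are compared as raw strings, and the helper names declared_elements and diff are hypothetical. It does not decide whether instances valid against one schema are valid against the other.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"
NS = 'xmlns:xs="http://www.w3.org/2001/XMLSchema"'

BACK_OFFICE = f"""
<xs:schema {NS}>
  <xs:complexType name="transaction">
    <xs:sequence>
      <xs:element name="plu" type="xs:nonNegativeInteger"/>
      <xs:element name="description" type="xs:string"/>
      <xs:element name="unitprice" type="xs:decimal"/>
      <xs:element name="quantity" type="xs:integer"/>
      <xs:element name="lineprice" type="xs:decimal"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
"""

VENDOR = f"""
<xs:schema {NS}>
  <xs:complexType name="transaction">
    <xs:sequence>
      <xs:element name="plu">
        <xs:complexType>
          <xs:simpleContent>
            <xs:extension base="xs:nonNegativeInteger">
              <xs:attribute name="encoding" type="xs:string"/>
            </xs:extension>
          </xs:simpleContent>
        </xs:complexType>
      </xs:element>
      <xs:element name="description" type="xs:string"/>
      <xs:element name="unitprice" type="xs:decimal"/>
      <xs:element name="discount" type="xs:decimal" minOccurs="0"/>
      <xs:element name="quantity" type="xs:integer"/>
      <xs:element name="lineprice" type="xs:decimal"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
"""


def declared_elements(xsd_text, type_name):
    """Map each child element name to (type attribute or None, minOccurs)."""
    root = ET.fromstring(xsd_text.strip())
    for ctype in root.iter(XS + "complexType"):
        if ctype.get("name") == type_name:
            seq = ctype.find(XS + "sequence")
            if seq is None:
                return {}
            return {
                e.get("name"): (e.get("type"), e.get("minOccurs", "1"))
                for e in seq.findall(XS + "element")
            }
    raise KeyError(type_name)


def diff(base, version):
    """Return (added, removed, changed) element names, in declaration order."""
    added = [n for n in version if n not in base]
    removed = [n for n in base if n not in version]
    changed = [n for n in base if n in version and base[n] != version[n]]
    return added, removed, changed
```

Calling diff(declared_elements(BACK_OFFICE, "transaction"), declared_elements(VENDOR, "transaction")) reports "discount" as added and "plu" as changed. A real comparison operation would need to reason about what those differences mean for validity, which this sketch does not attempt.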

2.2.3. Use case "Comparison" scenario 2

[AXIS]Instance processed: V(B)

[AXIS]Schema Availability: B [Back-office]

[AXIS]Operation performed: Schema validation

Brief summary: The back-office system receives data from the new POS device.

Basic course of events:

1. The POS device generates an XML document according to the new schema.

2. The document is sent to the back-office system.

3. The back-office system validates the document using the original back-office schema.

The outcome:

The schema validator indicates that the document is valid against the back-office schema.
The back-office system is able to accept the new message.

[TODO]More scenarios? Maybe show a counter-example.

2.3. Specialization (Health care 1)

2.3.1. Overview

In this use case, schemas are specialized with more specific versions, and processing software needs to operate in the presence or absence of those specialized schemas. A distinguishing characteristic of this use case is that instances of the specialized schemas are always valid instances of the base schema.

There is a generic base schema which has been approved by a country's health department. To ensure interoperability, the country's government has mandated that all compliant software must be able to store and process data that corresponds to this generic base schema. This generic base schema has been designed to store medical data, but in a very non-specific way so that it is flexible enough to handle a wide variety of data. This is necessary because medical data changes often due to new technology and practices, as well as being very diverse.

For example, the generic base schema could contain a generic datatype for storing two measurement values. We shall call this generic base schema: schema B.

<xsd:complexType name="measurement2">
  <xsd:sequence>
    <xsd:element name="value1">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="magnitude" type="xsd:decimal"/>
          <xsd:element name="unit" type="xsd:string"/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>
    <xsd:element name="value2">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="magnitude" type="xsd:decimal"/>
          <xsd:element name="unit" type="xsd:string"/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>
  </xsd:sequence>
</xsd:complexType>

An example instance of an element from schema B is:

<data>
  <value1>
    <magnitude>1660</magnitude>
    <unit>mm</unit>
  </value1>
  <value2>
    <magnitude>72</magnitude>
    <unit>kg</unit>
  </value2>
</data>

There are two products that use the generic schema: a General Practice (GP) management system, and a hospital clinical information management system. Clinical record information is exchanged between the two programs using messages containing XML. When the programs receive data, they validate it before storing or processing it.

The country's General Practice doctor organization decides that it wants to standardize how blood pressures are recorded. They define it as a specialization (or constraint) of the generic base schema's measurement datatype. The versioning mechanism (whatever that may be) is used to indicate that this is a version of the generic measurement datatype.

Here is an example from the specialized blood pressure schema, which we will call schema V(B). It constrains the units of both measurements to be mmHg.

<xsd:complexType name="blood_pressure">
  <xsd:sequence>

    <xsd:element name="value1">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="magnitude" type="xsd:decimal"/>
          <xsd:element name="unit">
            <xsd:simpleType>
              <xsd:restriction base="xsd:string">
                <xsd:enumeration value="mmHg"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:element>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>

    <xsd:element name="value2">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="magnitude" type="xsd:decimal"/>
          <xsd:element name="unit">
            <xsd:simpleType>
              <xsd:restriction base="xsd:string">
                <xsd:enumeration value="mmHg"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:element>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>

  </xsd:sequence>
</xsd:complexType>

An example instance of a blood pressure specialization V(B) is:

<data>
  <value1>
    <magnitude>142</magnitude>
    <unit>mmHg</unit>
  </value1>
  <value2>
    <magnitude>80</magnitude>
    <unit>mmHg</unit>
  </value2>
</data>

The General Practice software is modified or updated to have the blood pressure schema, but the hospital software is not. There can be many reasons why the hospital software does not have the blood pressure schema, such as: the timing cycle of software upgrades, it may be running in an off-line mode where schemas cannot be fetched, cost, performance, security or policy.

Later on, a physiotherapy clinic decides that it wants to further refine the definition of a blood pressure to only contain sensible values for the systolic and diastolic readings. The versioning mechanism would indicate that this refined blood pressure datatype is a version of the ordinary blood pressure datatype.

This is an excerpt from the refined physiotherapy schema W(V(B)). It shows that the numeric values in the measurement are restricted to certain ranges.

<xsd:complexType name="blood_pressure_refined">
  <xsd:sequence>

    <xsd:element name="value1">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="magnitude">
            <xsd:simpleType>
              <xsd:restriction base="xsd:decimal">
                <xsd:minInclusive value="90"/>
                <xsd:maxInclusive value="140"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:element>
          <xsd:element name="unit">
            <xsd:simpleType>
              <xsd:restriction base="xsd:string">
                <xsd:enumeration value="mmHg"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:element>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>

    <xsd:element name="value2">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="magnitude">
            <xsd:simpleType>
              <xsd:restriction base="xsd:decimal">
                <xsd:minInclusive value="60"/>
                <xsd:maxInclusive value="90"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:element>
          <xsd:element name="unit">
            <xsd:simpleType>
              <xsd:restriction base="xsd:string">
                <xsd:enumeration value="mmHg"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:element>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>

  </xsd:sequence>
</xsd:complexType>
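
The defining property of this use case is that each specialization only adds constraints, so an instance accepted by W(V(B)) is also accepted by V(B) and by B. The Python sketch below illustrates that nesting with hand-written checks that mirror the three type definitions above. It is not a schema validator: the function names readings and satisfied are hypothetical, and the third instance (120/80 mmHg) is made up for the illustration.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Ranges taken from the refined physiotherapy schema W(V(B)):
# value1 must lie in 90..140 and value2 in 60..90.
RANGES = [(Decimal(90), Decimal(140)), (Decimal(60), Decimal(90))]


def readings(data_xml):
    """Return [(magnitude, unit), (magnitude, unit)] for value1 and value2."""
    root = ET.fromstring(data_xml)
    return [
        (Decimal(root.findtext(f"{v}/magnitude")), root.findtext(f"{v}/unit"))
        for v in ("value1", "value2")
    ]


def satisfied(data_xml):
    """List which of B, V(B) and W(V(B)) the instance satisfies."""
    pairs = readings(data_xml)          # schema B: two magnitude/unit pairs
    result = ["B"]
    if all(unit == "mmHg" for _, unit in pairs):
        result.append("V(B)")           # units restricted to mmHg
        in_range = all(
            low <= magnitude <= high
            for (magnitude, _), (low, high) in zip(pairs, RANGES)
        )
        if in_range:
            result.append("W(V(B))")    # magnitudes restricted to ranges
    return result


def data(m1, u1, m2, u2):
    """Build a <data> instance in the shape used by the examples above."""
    return (
        f"<data><value1><magnitude>{m1}</magnitude><unit>{u1}</unit></value1>"
        f"<value2><magnitude>{m2}</magnitude><unit>{u2}</unit></value2></data>"
    )


HEIGHT_WEIGHT = data("1660", "mm", "72", "kg")        # schema B example
BLOOD_PRESSURE = data("142", "mmHg", "80", "mmHg")    # schema V(B) example
REFINED = data("120", "mmHg", "80", "mmHg")           # hypothetical instance
```

Note that the V(B) example instance above, with a value1 magnitude of 142, satisfies B and V(B) but falls outside the 90 to 140 range required by W(V(B)).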

2.3.2. Use case "Specialization" scenario 1: Hospital to GP

[AXIS]Instance processed: B [Hospital]

[AXIS]Schema Availability: B and V(B) [Hospital and GP]

[AXIS]Operation performed: Schema validation

Brief summary: The hospital system sends a message to the GP system. An instance of the base schema will be processed according to the base schema. Even though the processor has access to the specialized schema, it is not used. The schema processor is backward compatible.

Basic course of events:

1. The hospital generates an instance of the base schema.

2. The base instance is sent to the GP system (which has access to both the base and the specialized schema).

3. The GP system processes the base instance.

The outcome:

The GP processor determines that it is an instance of the base schema.
Valid against the base schema.
PSVI returns all information items expected from the base schema.

Since the GP system is receiving a base instance, it cannot (and should not) use the specialization schema.

2.3.3. Use case "Specialization" scenario 2: GP to GP

[AXIS]Instance processed: V(B) [GP]

[AXIS]Schema Availability: B and V(B) [Hospital and GP]

[AXIS]Operation performed: Schema validation

Brief summary: One GP system sends a message to another GP system. An instance of the specialized schema will be processed according to the specialized schema. Even though the processor could have validated it using the base schema, it must use the specialized schema.

Basic course of events:

1. Another GP system generates an instance of the specialized schema.

2. The specialized instance is sent to the GP system (which has access to both the base and the specialized schema).

3. The GP system processes the specialized instance.

The outcome:

The GP processor determines that it is an instance of the GP schema.
Valid against the specialized schema.
PSVI returns all information items expected from the specialized schema.

The receiving GP system needs to treat the instance as a blood pressure, because it wants to process it in a special way (e.g. graph it or apply decision support on it). Although it could also validate it using the generic base schema, it does not do so because the extra constraints in the blood pressure schema are important for the application to correctly interpret the data as a blood pressure.

2.3.4. Use case "Specialization" scenario 3: Hospital to hospital

[AXIS]Instance processed: B [Hospital]

[AXIS]Schema Availability: B [Hospital]

[AXIS]Operation performed: Schema validation

Brief summary: One hospital system sends a message to another hospital. An instance of the base schema is processed according to the base schema.

Basic course of events:

1. Another hospital system generates an instance of the base schema.

2. The base instance is sent to the hospital system (which only has access to the base schema).

3. The hospital system processes the base instance.

The outcome:

The hospital processor determines that it is an instance of the hospital schema.
Valid against the base schema.
PSVI returns all information items expected from the base schema.

This use case does not invoke any special versioning feature, but it is included here for completeness.

2.3.5. Use case "Specialization" scenario 4: GP to hospital

[AXIS]Instance processed: V(B) [GP]

[AXIS]Schema Availability: B [Hospital]

[AXIS]Operation performed: Schema validation

Brief summary: A GP system sends a message to a hospital system. An instance of the specialized schema is processed using the base schema when the processor does not have access to the specialized schema. The schema processor is forward compatible.

Basic course of events:

1. The GP system generates an instance of the specialized schema.

2. The specialized instance is sent to the hospital system (which only has access to the base schema).

3. The hospital system processes the specialized instance.

The outcome:

The hospital processor determines that it is an instance of the hospital schema.
Valid against the base schema.
PSVI returns all information items expected from the base schema.

The hospital system cannot recognise the data as a blood pressure, but can process it generically. There might be a generic base schema database for storing them, or a generic viewer that can display the values.

2.3.6. Use case "Specialization" scenario 5: Physiotherapy to hospital

[AXIS]Instance processed: W(...(B)...) [Physiotherapy]

[AXIS]Schema Availability: B [Hospital]

[AXIS]Operation performed: Schema validation

Brief summary: A physiotherapy system sends a message to a hospital system. An instance of the refined physiotherapy schema is processed using the base schema (when the processor has access to neither the refined physiotherapy schema nor the specialized GP schema). The schema processor is forward compatible across multiple versions.

Basic course of events:

1. The physiotherapy system generates an instance of the refined physiotherapy schema.

2. The refined physiotherapy instance is sent to the hospital system (which only has access to the base schema).

3. The hospital system processes the refined instance.

The outcome:

The hospital processor determines that it is an instance of the hospital schema.
Valid against the base schema.
PSVI returns all information items expected from the base schema.

This scenario shows that the versioning must work across multiple generations of versions, not just between two successive versions.

2.4. Renaming (Health care 2)

2.4.1. Overview

In this use case, schemas are specialized with more specific versions and information items are renamed in those versions. The processing software needs to operate in the presence or absence of those specialized schemas. This use case is similar to the "Specialization" use case (section 2.3), except that items are renamed.

For example, the generic base schema could contain a generic datatype for storing two measurement values. We shall call this generic base schema: schema B.

<xsd:complexType name="measurement2">
  <xsd:sequence>
    <xsd:element name="value1">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="magnitude" type="xsd:decimal"/>
          <xsd:element name="unit" type="xsd:string"/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>
    <xsd:element name="value2">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="magnitude" type="xsd:decimal"/>
          <xsd:element name="unit" type="xsd:string"/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>
  </xsd:sequence>
</xsd:complexType>

An example instance of an element from schema B is:

<data>
  <value1>
    <magnitude>1660</magnitude>
    <unit>mm</unit>
  </value1>
  <value2>
    <magnitude>72</magnitude>
    <unit>kg</unit>
  </value2>
</data>

Here is an example from the specialized blood pressure schema, which we will call schema V(B). It constrains the units of both measurements to be mmHg.

<xsd:complexType name="blood_pressure">
  <xsd:sequence>

    <xsd:element name="systolic">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="magnitude" type="xsd:decimal"/>
          <xsd:element name="unit">
            <xsd:simpleType>
              <xsd:restriction base="xsd:string">
                <xsd:enumeration value="mmHg"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:element>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>

    <xsd:element name="diastolic">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="magnitude" type="xsd:decimal"/>
          <xsd:element name="unit">
            <xsd:simpleType>
              <xsd:restriction base="xsd:string">
                <xsd:enumeration value="mmHg"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:element>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>

  </xsd:sequence>
</xsd:complexType>

An example instance of a blood pressure specialization V(B) is:

<blood_pressure>
  <systolic>
    <magnitude>142</magnitude>
    <unit>mmHg</unit>
  </systolic>
  <diastolic>
    <magnitude>80</magnitude>
    <unit>mmHg</unit>
  </diastolic>
</blood_pressure>
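
The effect of the renaming can be seen by reading both example instances generically. The following Python sketch is illustrative only. It assumes a hypothetical base-only application that ignores the element names and pairs the children by position, so that systolic lines up with value1 and diastolic with value2. The positional pairing and the helper name read_measurements are assumptions made for this example; they are not defined by any of the schemas.

```python
import xml.etree.ElementTree as ET

# The two example instances shown above, reproduced so the sketch is self-contained.
BASE_INSTANCE = """
<data>
  <value1><magnitude>1660</magnitude><unit>mm</unit></value1>
  <value2><magnitude>72</magnitude><unit>kg</unit></value2>
</data>
"""

SPECIALIZED_INSTANCE = """
<blood_pressure>
  <systolic><magnitude>142</magnitude><unit>mmHg</unit></systolic>
  <diastolic><magnitude>80</magnitude><unit>mmHg</unit></diastolic>
</blood_pressure>
"""


def read_measurements(xml_text):
    """Return (magnitude, unit) pairs by position, ignoring element names.

    Hypothetical helper: it relies only on the shape shared by both
    instances (two children, each holding a magnitude and a unit).
    """
    root = ET.fromstring(xml_text)
    pairs = []
    for child in root:
        pairs.append((child.findtext("magnitude"), child.findtext("unit")))
    return pairs


base_pairs = read_measurements(BASE_INSTANCE)
specialized_pairs = read_measurements(SPECIALIZED_INSTANCE)
```

Both instances yield two (magnitude, unit) pairs, which is the information a base-only system would need. A schema processor has no such positional rule available to it: the names systolic and diastolic do not appear in schema B.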

This is an excerpt from the refined physiotherapy schema W(V(B)). It shows that the numeric values in the measurement are restricted to certain ranges.

<xsd:complexType name="blood_pressure_refined">
  <xsd:sequence>

    <xsd:element name="systolic">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="magnitude">
            <xsd:simpleType>
              <xsd:restriction base="xsd:decimal">
                <xsd:minInclusive value="90"/>
                <xsd:maxInclusive value="140"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:element>
          <xsd:element name="unit">
            <xsd:simpleType>
              <xsd:restriction base="xsd:string">
                <xsd:enumeration value="mmHg"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:element>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>

    <xsd:element name="diastolic">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="magnitude">
            <xsd:simpleType>
              <xsd:restriction base="xsd:decimal">
                <xsd:minInclusive value="60"/>
                <xsd:maxInclusive value="90"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:element>
          <xsd:element name="unit">
            <xsd:simpleType>
              <xsd:restriction base="xsd:string">
                <xsd:enumeration value="mmHg"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:element>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>

  </xsd:sequence>
</xsd:complexType>
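
The relationship between V(B) and W(V(B)) can be illustrated with a small Python sketch. It is not a schema validator: it only mirrors the minInclusive, maxInclusive and enumeration facets shown above in application code. The function name refined_violations and the message format are assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Inclusive ranges copied from the minInclusive/maxInclusive facets above.
RANGES = {
    "systolic": (Decimal("90"), Decimal("140")),
    "diastolic": (Decimal("60"), Decimal("90")),
}

# The example V(B) instance from earlier in this section.
V_INSTANCE = """
<blood_pressure>
  <systolic><magnitude>142</magnitude><unit>mmHg</unit></systolic>
  <diastolic><magnitude>80</magnitude><unit>mmHg</unit></diastolic>
</blood_pressure>
"""


def refined_violations(xml_text):
    """List the ways an instance breaks the refined (W) constraints."""
    root = ET.fromstring(xml_text)
    problems = []
    for name, (low, high) in RANGES.items():
        element = root.find(name)
        if element is None:
            problems.append(f"{name}: missing")
            continue
        text = element.findtext("magnitude")
        if text is None:
            problems.append(f"{name}: magnitude missing")
        elif not low <= Decimal(text) <= high:
            problems.append(f"{name}: magnitude {text} outside {low}..{high}")
        if element.findtext("unit") != "mmHg":
            problems.append(f"{name}: unit is not mmHg")
    return problems


violations = refined_violations(V_INSTANCE)
```

The example V(B) instance has a systolic magnitude of 142, which satisfies V(B) but lies outside the 90 to 140 range of the refined schema. Every instance that satisfies the refined constraints also satisfies V(B), but not the reverse.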

2.4.2. Use case "Renaming" scenario 1: Hospital to GP

[AXIS]Instance processed: B [Hospital]

[AXIS]Schema Availability: B and V(B) [Hospital and GP]

[AXIS]Operation performed: Schema validation

Basic course of events:

1. The hospital generates an instance of the base schema.

2. The base instance is sent to the GP system (which has access to both the base and the specialized schema).

3. The GP system processes the base instance.

The outcome:

The GP processor determines that it is an instance of the base schema.
Valid against the base schema.
PSVI returns all information items expected from the base schema.

Since the GP system is receiving a base instance, it cannot (and should not) use the specialized schema.

2.4.3. Use case "Renaming" scenario 2: GP to GP

[AXIS]Instance processed: V(B) [GP]

[AXIS]Schema Availability: B and V(B) [Hospital and GP]

[AXIS]Operation performed: Schema validation

Basic course of events:

1. Another GP system generates an instance of the specialized schema.

2. The specialized instance is sent to the GP system (which has access to both the base and the specialized schema).

3. The GP system processes the specialized instance.

The outcome:

The GP processor determines that it is an instance of the GP schema.
Valid against the specialized schema.
PSVI returns all information items expected from the specialized schema.

2.4.4. Use case "Renaming" scenario 3: Hospital to hospital

[AXIS]Instance processed: B [Hospital]

[AXIS]Schema Availability: B [Hospital]

[AXIS]Operation performed: Schema validation

Brief summary: One hospital system sends a message to another hospital. An instance of the base schema is processed according to the base schema.

Basic course of events:

1. Another hospital system generates an instance of the base schema.

2. The base instance is sent to the hospital system (which only has access to the base schema).

3. The hospital system processes the base instance.

The outcome:

The hospital processor determines that it is an instance of the hospital schema.
Valid against the base schema.
PSVI returns all information items expected from the base schema.

This use case does not invoke any special versioning feature, but it is included here for completeness.

2.4.5. Use case "Renaming" scenario 4: GP to hospital

[AXIS]Instance processed: V(B) [GP]

[AXIS]Schema Availability: B [Hospital]

[AXIS]Operation performed: Schema validation

Basic course of events:

1. The GP system generates an instance of the specialized schema.

2. The specialized instance is sent to the hospital system (which only has access to the base schema).

3. The hospital system processes the specialized instance.

The outcome:

The hospital processor determines that it is an instance of the hospital schema.
Valid against the base schema.
PSVI returns all information items expected from the base schema.

2.4.6. Use case "Renaming" scenario 5: Physiotherapy to hospital

[AXIS]Instance processed: W(...(B)...) [Physiotherapy]

[AXIS]Schema Availability: B [Hospital]

[AXIS]Operation performed: Schema validation

Basic course of events:

1. The physiotherapy system generates an instance of the refined physiotherapy schema.

2. The refined physiotherapy instance is sent to the hospital system (which only has access to the base schema).

3. The hospital system processes the refined instance.

The outcome:

The hospital processor determines that it is an instance of the hospital schema.
Valid against the base schema.
PSVI returns all information items expected from the base schema.

This scenario shows that the versioning must work across multiple generations of versions, not just between two successive versions.

2.5. Customization (Invoice)

[TODO]This section is incomplete: it only has example instances, and we need to show the example schemas corresponding to them.

2.5.1. Overview

In this use case, a schema is customized for local needs, but the local version must not break validation with the original schema.

This use case is different from the "health care / specialization" use case because the customizations are extensions of the base schema. Without any versioning mechanisms, instances of the customized schemas would not normally be valid instances of the base schema.

This use case involves a Mega-store which has a head office and branches. One of the suppliers to Mega-store is Video-supplier, which has a head office and warehouses.

Mega-store defines a base schema for invoices. We shall call this schema B. Mega-store requires all of its suppliers to provide invoices using it.

An example instance of an item in the Mega-store invoice is:

<item>
  <part>1138</part>
  <description>Generic widget</description>
  <quantity>12</quantity>
</item>

When Mega-store created this schema, it did not know how others might want to extend it, and it had no plans or expectations for the reuse of the schema.

Video-supplier decides to reimplement their stock tracking system to be natively based on the Mega-store invoice schema. They want to avoid the need to translate between the schemas they use internally and those that they send to external customers.

However, Video-supplier needs to store additional information in their internal invoices. To do this, they create a customized schema that is a version of the Mega-store invoice schema.

In this example, they want to indicate which warehouse the items were shipped from. Here, they have added their customizations to the same namespace as the Mega-store invoice. We shall call this schema V(B).

This is an example instance of the Video-supplier item:

<item>
  <part>1138</part>
  <description>Generic widget</description>
  <quantity>12</quantity>
  <source>
    <warehouse>W1</warehouse>
  </source>
</item>

An item is used in a number of different places in an invoice (e.g. items delivered, items backordered).

Invoices are sent from Video-supplier warehouses to the Video-supplier head office (which expects to see the customized information items). They are also sent from the Video-supplier head office to the Mega-store head office. Invoices are also sent from the Mega-store head office to their Mega-store branches.

The versioning mechanism must accomplish two things: make it easy to write the extensions, and avoid breaking processors that expect instances of the original schema.

Firstly, the versioning mechanism needs a simple way of indicating that wherever the Mega-store invoice schema referred to a Mega-store item, the new Video-supplier schema must always use a Video-supplier item instead. Redefining everything that refers to the Mega-store item is not a good solution. The customized element could be used in many different places, leading to a maintenance problem. Redefining everything would also effectively create a parallel copy of the schema, losing the clear relationship with the original schema: that only the item has been versioned, and nothing else.

Secondly, Video-supplier wants to send their customized invoices to the Mega-store without translation (i.e. without needing to strip out the customized elements). Those new Video-supplier instances must work with old processors that expect Mega-store instances, as shown in the scenarios below.

2.5.2. Use case "Customization" scenario 1: Video-supplier head office to Mega-store head office

[AXIS]Instance processed: V(B) [Video-supplier]

[AXIS]Schema Availability: B [Mega-store]

[AXIS]Operation performed: Schema validation

Brief summary: The Video-supplier head office sends an invoice to the Mega-store head office. An instance of the customized schema is processed using the base schema (without access to the customized schema). The schema processor is forward compatible.

Basic course of events:

1. The Video-supplier system generates an instance of the customized schema.

2. The customized instance is sent to the Mega-store head office system (which only has access to the base schema).

3. The Mega-store head office system processes the customized instance.

The outcome:

Valid against the base schema.
PSVI returns all information items expected from the base schema.

Although the customized instance contains extra elements which are not part of the base schema, they must not cause the processor or the application to fail. The instance must be valid according to the base schema, because the contract between the two companies was based on exchanging invoices that are valid according to the Mega-store schema. The information returned by the PSVI must not cause the Mega-store application to break.
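
The application-level behaviour wanted here can be sketched in Python. This is an illustration under stated assumptions, not a description of how a schema processor would achieve it. The receiving application is assumed to know only the three base item children shown in the overview, and the helper name read_base_item is hypothetical.

```python
import xml.etree.ElementTree as ET

# Children of an item that appear in the base example instance.
BASE_ITEM_FIELDS = ("part", "description", "quantity")

# A customized item, with the extra source/warehouse information.
CUSTOMIZED_ITEM = """
<item>
  <part>1138</part>
  <description>Generic widget</description>
  <quantity>12</quantity>
  <source>
    <warehouse>W1</warehouse>
  </source>
</item>
"""


def read_base_item(xml_text):
    """Return the base fields of an item and the names of any extra children.

    Hypothetical helper: extra children are reported, not treated as errors.
    """
    item = ET.fromstring(xml_text)
    known = {name: item.findtext(name) for name in BASE_ITEM_FIELDS}
    extras = [child.tag for child in item if child.tag not in BASE_ITEM_FIELDS]
    return known, extras


known, extras = read_base_item(CUSTOMIZED_ITEM)
```

The application obtains the part, description and quantity it expects, and the source element is set aside without causing a failure. The use case asks for a versioning mechanism that gives the same outcome at the validation and PSVI level.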

2.5.3. Use case "Customization" scenario 2: Video-supplier warehouse to Video-supplier head office

[AXIS]Instance processed: V(B) [Video-supplier]

[AXIS]Schema Availability: B and V(B) [Mega-store and Video-supplier]

[AXIS]Operation performed: Schema validation

Brief summary: A Video-supplier warehouse sends an invoice to the Video-supplier head office. An instance of the customized schema is processed by the customized schema.

Basic course of events:

1. The Video-supplier warehouse system generates an instance of the customized schema.

2. The customized instance is sent to the Video-supplier head office.

3. The Video-supplier head office system processes the customized instance.

The outcome:

Valid against the customized Video-supplier schema.
PSVI returns all information items expected from the customized Video-supplier schema.

2.5.4. Use case "Customization" scenario 3: Mega-store head office to Mega-store branch

[AXIS]Instance processed: B [Mega-store]

[AXIS]Schema Availability: B [Mega-store]

[AXIS]Operation performed: Schema validation

Brief summary: The Mega-store head office sends an invoice to a Mega-store branch. An instance of the base schema is processed by a processor with the base schema.

Basic course of events:

1. The Mega-store head office system generates an instance of the base schema.

2. The base instance is sent to a Mega-store branch.

3. The Mega-store branch system processes the base instance.

The outcome:

Valid according to the base schema.
PSVI returns all information items expected from the base Mega-store schema.

2.5.5. Use case "Customization" scenario 4: Mega-store to Video-supplier head office

[AXIS]Instance processed: B [Mega-store]

[AXIS]Schema Availability: B and V(B) [Mega-store and Video-supplier]

[AXIS]Operation performed: Schema validation

Brief summary: Mega-store (incorrectly) sends an invoice to the Video-supplier head office. An instance of the base schema is processed by a system that expects an instance of the customized schema.

Basic course of events:

1. The Mega-store system generates an instance of the base schema.

2. The base instance is sent to the Video-supplier head office.

3. The Video-supplier head office system processes the base instance.

The outcome:

Invalid according to the Video-supplier schema.
PSVI indicates that parts of the instance are invalid.

The Video-supplier head office application expects that all invoices it receives are Video-supplier invoices, which contain the extra warehouse information. It must reject the invoice because it does not have the warehouse information.
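
The rejection rule can be sketched as an application-level check in Python. The document does not show the overall structure of an invoice, so the invoice wrapper element used here is an assumption, as is the helper name items_missing_warehouse. The sketch only tests for the presence of source/warehouse in each item; it does not perform schema validation.

```python
import xml.etree.ElementTree as ET

# Hypothetical invoice wrappers around the example items from the overview.
BASE_INVOICE = """
<invoice>
  <item>
    <part>1138</part>
    <description>Generic widget</description>
    <quantity>12</quantity>
  </item>
</invoice>
"""

CUSTOMIZED_INVOICE = """
<invoice>
  <item>
    <part>1138</part>
    <description>Generic widget</description>
    <quantity>12</quantity>
    <source>
      <warehouse>W1</warehouse>
    </source>
  </item>
</invoice>
"""


def items_missing_warehouse(invoice_xml):
    """Return the part numbers of items that lack source/warehouse."""
    invoice = ET.fromstring(invoice_xml)
    missing = []
    for item in invoice.iter("item"):
        if item.find("source/warehouse") is None:
            missing.append(item.findtext("part"))
    return missing


base_missing = items_missing_warehouse(BASE_INVOICE)
customized_missing = items_missing_warehouse(CUSTOMIZED_INVOICE)
```

A non-empty result corresponds to the invalid outcome of this scenario: the base invoice reports part 1138 as lacking warehouse information, while the customized invoice reports nothing.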

[TODO]Do we also need a V(...(B)...) scenario?

2.6. UBL

[TODO]This needs to be written up.

http://www.idealliance.org/papers/dx_xmle03/html/abstract/03-04-03.html

Two largish goals

1. A version N system (an application hard-coded for N) receives version K data, where K is later than N.

a. The system has access to information about both schemas, so it can look at the K schema and figure out what to make of it. The system was written with the knowledge that later schemas might come into play: it has no first-class knowledge of them, but it did expect them. It also knows that the K schema will have been constructed in accordance with the UBL design evolution process (a new namespace for every major version, with redefinition used to define the relationship).

b. The system does not have access to the new schema.

2. The ability to document the relationship between versions, with the caveat that explicit signalling should not be required in all cases.

2.7. Namespace overloading (XHTML)

[TODO]Is this more about mechanisms (and should be described in a separate document)?

2.7.1. Overview

In XHTML 1.0, the strict, transitional, and frameset document types all use the same XML namespace. However, the schemas for them are very different. Additionally, variants of XHTML 1.0 such as XHTML Basic also share the same namespace, even though the schema for them is different.

Developers are often tempted into designing extensibility into their schemas at the infoset level rather than at the schema level. For example, they may add a "version" attribute to the root element, hoping that future applications will be able to change their behaviour based on the value of that attribute. However, schema processors are not aware of application semantics, and cannot validate each version according to its own rules.

Consider a real estate schema to record details about a property. In the original version, only the number of bathrooms is captured.

<property xmlns="http://realestate.example.com/property" version="1.0">
  <bedrooms>3</bedrooms>
  <bathrooms>1</bathrooms>
</property>

In the next version of the application, the schema has been changed. Here, details of how many baths, showers, and toilets are individually recorded.

<property xmlns="http://realestate.example.com/property" version="2.0">
  <bedrooms>3</bedrooms>
  <bathrooms>
    <bath>1</bath>
    <shower>1</shower>
    <toilet>2</toilet>
  </bathrooms>
</property>
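
The following Python sketch shows where the version-dependent interpretation ends up when this approach is taken: in application code. It is an illustration only and performs no schema validation. The function name bathroom_summary and the shape of the returned dictionary are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Both versions share this namespace, as in the examples above.
NS = {"p": "http://realestate.example.com/property"}

V1 = """<property xmlns="http://realestate.example.com/property" version="1.0">
  <bedrooms>3</bedrooms>
  <bathrooms>1</bathrooms>
</property>"""

V2 = """<property xmlns="http://realestate.example.com/property" version="2.0">
  <bedrooms>3</bedrooms>
  <bathrooms>
    <bath>1</bath>
    <shower>1</shower>
    <toilet>2</toilet>
  </bathrooms>
</property>"""


def bathroom_summary(xml_text):
    """Interpret <bathrooms> according to the root element's version attribute."""
    root = ET.fromstring(xml_text)
    version = root.get("version")
    bathrooms = root.find("p:bathrooms", NS)
    if version == "1.0":
        # Version 1.0: simple content holding a single count.
        return {"bathrooms": int(bathrooms.text)}
    if version == "2.0":
        # Version 2.0: element-only content, one child per fixture type.
        # Tags look like "{namespace}bath"; keep only the local name.
        return {child.tag.split("}")[1]: int(child.text) for child in bathrooms}
    raise ValueError(f"unsupported version: {version!r}")


summary_v1 = bathroom_summary(V1)
summary_v2 = bathroom_summary(V2)
```

The same element name in the same namespace has simple content in one version and element-only content in the other. The application can branch on the attribute value, but a single element declaration for bathrooms cannot switch its type on that value.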

[TODO]Can schemas (without co-constraints) ever hope to cope with this? If people choose to do things this way, is it schema's role to address the issues that arise?

2.8. XSLT

[TODO]"Example is XSLT -- must a transform continue to work if the input uses a new version of its vocabulary."

2.9. MathML

[TODO]Need more information on this one.

2.10. XSD versioning

[TODO]This needs to be completed.

[TODO]Add text to use cases to discuss which ones are currently solvable by XML Schema 1.0.
