XML Schema and Web Services

An Experience Report on behalf of BT submitted to the
W3C Workshop on XML Schema 1.0 User Experiences.
June 21-22, Oracle Conference Centre, Redwood Shores, CA

Jon Calladine jon.calladine@bt.com
Paul Downey paul.downey@bt.com

Abstract

This report presents a series of BT user stories which chart our experience in using XML Schema to describe messages exchanged by Web services.

We explain how the value of a standard, even one as widely used as XML Schema, is greatly diminished when it is implemented inconsistently. We offer best practices, in conjunction with test cases targeted at description, as the best way of communicating which aspects of schema may be relied upon to interoperate.

User Story - Enabling the Infrastructure

CSS Customer Service System
BT’s primary OSS system for 23 million customers holds over 13 Terabytes of data. Running on 28 Mainframe images, CSS supports an online population of over 40,000 users generating over 230 million transactions per day.

CAMSS Customer Assisted Maintenance for Special Services
BT’s primary trouble/ticketing system handles a peak of 2,000 faults per minute. This large mainframe system has a hot standby as an outage of over two hours could cost BT more than £1 Million per hour in penalties.

COSMOSS Customer Oriented Service Management of Special Services
The largest single CICS DB2 system within BT provides order handling for all of BT’s private circuit and broadband products and supports some 6,000 users along with a vast array of satellite applications.

An early success for the use of XML within BT was the exposure of existing mainframe services using SOAP/XML messages described in WSDL. These existing services typically consist of structured text messages which could be exchanged over a variety of transports including raw TCP/IP sockets, JMS and IBM MQSeries. However, most systems accessed these services via a mid-tier using a client/server technology such as DCE and later CORBA.

It was a relatively simple process to provide HTTP access to an XML encoding of these messages, encapsulated in SOAP exchanges and then to describe the messages exchanged in XML Schema wrapped inside a WSDL document.

Selling the use of the burgeoning Web services technologies was successful for the following reasons:

  1. There was broad support across a wide variety of platforms. Every vendor either provided support for code generation from WSDL or had such technology on its product roadmap.
  2. Improved interoperability saved BT having to develop and support a multitude of custom APIs and protocol conversions.
  3. The convenience of the generate, build, run cycle matched a pattern already embraced by developers using DCE, COM and CORBA.
  4. Developers were able to work with these mainframe services on a wide range of platforms and in their own terms.

During this time we endeavoured to ensure the services worked with as many Web service toolkits as possible by practical testing with vendor toolkits. This included the development of an interoperability TestBench. The TestBench took a WSDL description of a service combined with input and expected output data in a canonical format, invoking the service using a variety of toolkits. At the height of this activity we tested each new service published against 15 varieties of SOAP toolkit.
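The core loop of such a TestBench can be sketched as follows. This is a minimal illustration, not the BT implementation: the toolkit adapters (`toolkit_a`, `toolkit_b`, `toolkit_c`) are hypothetical stand-ins for real SOAP toolkits invoking a service, and the use of XML canonicalization (`xml.etree.ElementTree.canonicalize`, Python 3.8 or later) is only an assumption about what a 'canonical format' might involve.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict


def canonical(xml_text: str) -> str:
    """Return a canonical form so that equivalent documents compare equal."""
    return ET.canonicalize(xml_text, strip_text=True)


def run_testbench(toolkits: Dict[str, Callable[[str], str]],
                  request_xml: str, expected_xml: str) -> Dict[str, bool]:
    """Invoke the service through each toolkit adapter and compare responses."""
    expected = canonical(expected_xml)
    results = {}
    for name, invoke in toolkits.items():
        try:
            results[name] = canonical(invoke(request_xml)) == expected
        except Exception:
            # A toolkit that rejects the message or returns broken XML fails.
            results[name] = False
    return results


# Hypothetical adapters: each stands in for one SOAP toolkit calling the service.
def toolkit_a(request: str) -> str:
    return "<reply><status>OK</status></reply>"


def toolkit_b(request: str) -> str:
    # Same content with different whitespace: still equivalent.
    return "<reply>\n  <status>OK</status>\n</reply>"


def toolkit_c(request: str) -> str:
    raise ValueError("unsupported schema construct")


results = run_testbench(
    {"A": toolkit_a, "B": toolkit_b, "C": toolkit_c},
    request_xml="<query><id>42</id></query>",
    expected_xml="<reply><status>OK</status></reply>",
)
```

Comparing canonical forms rather than raw text keeps differences in whitespace or attribute quoting between toolkits from being reported as interoperability failures.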

We also tracked the practical testing work undertaken by the informal soapbuilders forum.

Notable features of these services:

  1. The toolkits used a very restricted vocabulary of XML Schema, but this restricted vocabulary was sufficient for the simple services being represented and well understood from our practical testing and tracking of soapbuilders.
  2. The messages generated by tools contained annotations to assist mapping to data structures such as 'xsi:type' and attributes in the 'soapenc' namespace. For the most part these could be ignored, except some tools refused to decode a list of repeated items unless it was accompanied by a soapenc:arrayType. In some cases the arrayType had to declare the number of items appearing in the XML list. Generating and adding these annotations was often difficult when processing documents without prior knowledge of the types being passed, which complicated matters for those working at the XML document level.
  3. SOAP encoding provided support for multi-referenced data. Here, graph data structures could be represented using IDREF-style pointers to data items marked with an ID attribute. Many tools sent SOAP/XML messages in this complex form regardless of the data being exchanged. However, it was easy to write a generic process to flatten these structures out when processing the messages at the XML document level.
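A generic flattening process for multi-referenced data might look like the sketch below. It assumes the SOAP 1.1 style in which a reference is an unqualified `href="#..."` attribute pointing at an independent element carrying a matching `id` attribute, and it assumes the data is acyclic. The element names (`Body`, `order`, `multiRef`) are illustrative only.

```python
import copy
import xml.etree.ElementTree as ET


def flatten_multiref(body: ET.Element) -> ET.Element:
    """Inline every href="#id" reference and drop the standalone targets.

    Assumes unqualified 'id' and 'href' attributes and acyclic data;
    a cyclic graph cannot be flattened into a tree.
    """
    targets = {el.get("id"): el for el in body.iter() if el.get("id")}

    def resolve(el: ET.Element) -> None:
        href = el.attrib.pop("href", None)
        if href and href.startswith("#"):
            target = targets[href[1:]]
            el.text = target.text
            # Copy children so one target can be inlined in several places.
            el.extend([copy.deepcopy(child) for child in target])
        for child in el:
            resolve(child)

    resolve(body)
    # Remove the independent target elements that sit directly under the body.
    for child in list(body):
        if child.get("id") in targets:
            body.remove(child)
    return body


message = ET.fromstring(
    "<Body>"
    '<order><customer href="#id1"/><billTo href="#id1"/></order>'
    '<multiRef id="id1"><name>Alice</name></multiRef>'
    "</Body>"
)
flat = ET.tostring(flatten_multiref(message), encoding="unicode")
```

After flattening, both `customer` and `billTo` contain their own copy of the `name` element, and the document can be processed without following references.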

Versioning of Web services became an issue during this time. Having published a service, there was a natural expectation that with a self describing format such as XML, we should be able to add values to the message which an existing receiver would simply ignore. So long as we didn't remove any expected values or change the meaning of data items in a message, we should be able to evolve services without impacting existing consumers. Unfortunately some code generating tools rejected messages containing additional unexpected values. Work-rounds offered by articles in technical journals all involved using schema constructs above and beyond those well supported by our toolkits.
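The difference between the two behaviours can be illustrated with a small sketch. The message and its field names (`fault`, `accountId`, `status`, `priority`) are hypothetical; `read_strict` mimics a generated binding which rejects unknown content, while `read_tolerant` shows the 'ignore what you do not understand' behaviour we had expected.

```python
import xml.etree.ElementTree as ET

KNOWN_FIELDS = ("accountId", "status")  # hypothetical version 1 message fields


def read_strict(xml_text: str) -> dict:
    """Mimic a generated binding that rejects any element it was not built for."""
    root = ET.fromstring(xml_text)
    for child in root:
        if child.tag not in KNOWN_FIELDS:
            raise ValueError(f"unexpected element: {child.tag}")
    return {child.tag: child.text for child in root}


def read_tolerant(xml_text: str) -> dict:
    """Pick out the expected elements and silently ignore everything else."""
    root = ET.fromstring(xml_text)
    return {name: root.findtext(name) for name in KNOWN_FIELDS}


v1 = "<fault><accountId>A1</accountId><status>open</status></fault>"
# Version 2 adds a value which an existing receiver has never seen.
v2 = ("<fault><accountId>A1</accountId><status>open</status>"
      "<priority>high</priority></fault>")
```

Here `read_tolerant` returns the same result for `v1` and `v2`, whereas `read_strict` raises an error for `v2` even though nothing the receiver relies upon has changed.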

User Story - Document Centric Services

The BMS programme included a large number of services aimed at a very diverse set of consumers. The majority of the services developed exposed XML data generated by a third-party packaged application. Following the success of the mainframe Web services, exchanging SOAP/XML messages described by XML Schema and wrapped in WSDL quickly became an obvious technology choice for integration. However, the publication of the WS-I Basic Profile caused us to reassess our use of SOAP encoded messages and move to the literal exchange of XML documents, once again described using XML Schema wrapped in WSDL.

BMS Bearer Management System
A body of work designed to migrate BT’s copper records from CSS to a packaged application implemented in Oracle and running on a cluster of systems running Unix. The completed system will be accessed by 20,000 concurrent users and feed 35 different applications running on a wide variety of platforms.

It quickly became apparent that XML Schema structures which best modelled the complex and highly variable data being exchanged didn't interoperate well with our existing Web service toolkits. An XML Schema construct not accepted by a tool would be apparent in one of the following ways:

  1. The WSDL and XML Schema would be rejected by the toolkit.
  2. An individual construct would be represented as a generic XML node, rather than in the convenient terms the developer might expect for other constructs.
  3. The entire document would be represented as a generic XML node, rather than in the convenient terms the developer might expect for other constructs.

Working directly with vendors to highlight interoperability issues became highly time consuming and was not a rewarding experience. Advice given often involved changing the XML Schema to use simpler constructs, or to use the same construct in a manner which caused an issue with another tool. It was difficult to persuade vendors to accept the importance of an interoperability issue once they had given us such a resolution; harder still to receive fixes in the existing older version of their products being used.

It became evident that in many cases little or no practical testing of many XML Schema constructs could have taken place at source; testing was once again in our hands.

At this point we looked to the XML Schema test cases as a source of simple tests for demonstrating interoperability problems with Web service tools to vendors. This was not a simple task: a lack of consistent metadata within the test pack made it difficult to find tests which exhibit an individual aspect of schema. In many cases it was unclear whether a test was intended to have a positive or negative outcome.

Given the importance of code binding to developers, and the control we had over the schemas being published, we once again resorted to publishing a 'Web services toolkit friendly vocabulary'. This profile, based upon practical testing of tools, was a very reduced subset of the XML Schema 1.0 specification and included the following rules:

  1. Avoid the date and time types, in particular duration

    We found simple types not to be well supported, especially those using patterns to restrict the value range.

  2. Avoid user defined simple types

    We found simple types not to be well supported, especially those using patterns and ranges.

  3. Namespace qualify schema elements

    This has been an area of contention with several people reporting improved interoperability with some tools by using no-namespace and chameleon schemas. This is also one of the least well understood aspects of schema.

  4. Always qualify schema references

    i.e. use type='xs:string' rather than type='string'

  5. Use 'venetian blind' style schemas

    We encountered difficulties with anonymous complexTypes and placing of minOccurs/maxOccurs on a 'sequence'

  6. Nest repeated elements in their own container

    Giving each repeated element a wrapper element worked best with tools which mapped XML to data structures and helped avoid violating the UPA (Unique Particle Attribution) rule.

  7. Avoid 'all' and 'choice'

    We found these constructs to not be well supported. In particular 'all' had a number of restrictions. Many people wanted to use 'all' to represent a Java class (whose members appear in a random order under reflection) and then extend the types which invalidates UPA.

  8. Use nillable='true' and minOccurs='0' for optional elements

    We discovered many tools sent xsi:nil='true' regardless of the published schema. Some vendors even disputed that the default value for nillable is 'false', or could not agree upon the semantics of the nillable attribute.
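Several of these rules are mechanical enough to check automatically. The sketch below is our own illustrative reading of rules 3, 4, 5 and 7 as a small lint over a schema document; it is not a complete or authoritative check, and the example schemas and the `urn:example:orders` namespace are hypothetical.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"


def lint_schema(xsd_text: str) -> list:
    """Return warnings for constructs the profile advises against."""
    root = ET.fromstring(xsd_text)
    warnings = []
    if not root.get("targetNamespace") or root.get("elementFormDefault") != "qualified":
        warnings.append("rule 3: elements are not namespace qualified")
    for el in root.iter():
        if el.tag in (XS + "all", XS + "choice"):
            warnings.append("rule 7: uses " + el.tag[len(XS):])
        if el.tag == XS + "complexType" and el.get("name") is None:
            warnings.append("rule 5: anonymous complexType")
        if el.tag == XS + "sequence" and (el.get("minOccurs") or el.get("maxOccurs")):
            warnings.append("rule 5: occurrence constraint on sequence")
        type_ref = el.get("type")
        if type_ref is not None and ":" not in type_ref:
            warnings.append("rule 4: unqualified type reference " + type_ref)
    return warnings


# Venetian blind style: named types, qualified references, wrapped repeats.
FRIENDLY = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:tns="urn:example:orders"
           targetNamespace="urn:example:orders"
           elementFormDefault="qualified">
  <xs:element name="order" type="tns:OrderType"/>
  <xs:complexType name="OrderType">
    <xs:sequence>
      <xs:element name="reference" type="xs:string"/>
      <xs:element name="notes" type="xs:string" minOccurs="0" nillable="true"/>
      <xs:element name="lines" type="tns:LinesType"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="LinesType">
    <xs:sequence>
      <xs:element name="line" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
"""

UNFRIENDLY = """
<schema xmlns="http://www.w3.org/2001/XMLSchema">
  <element name="order">
    <complexType>
      <choice>
        <element name="reference" type="string"/>
      </choice>
    </complexType>
  </element>
</schema>
"""

friendly_warnings = lint_schema(FRIENDLY)
unfriendly_warnings = lint_schema(UNFRIENDLY)
```

The first schema passes the lint with no warnings, while the second triggers one warning for each of the four rules checked. A lint of this kind cannot replace practical testing against toolkits; it only catches constructs already known to cause trouble.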

We also encountered a large number of issues using xs:include and xs:import, in particular when trying to follow the advice from the WS-I Basic Profile in combination with schemas embedded in WSDL, constraints from XML Schema and variances in implementation.

As a result of this exercise, the resulting schemas no longer expressed many of the constraints on the data. Many of these constraints were not expressible even with the full language of XML Schema, mainly due to its lack of co-constraints. The description of the messages became very open, highly optional and unwieldy for developers. It was no longer possible to gauge the shape or size of expected messages from the schemas themselves.

Developers who preferred to work with the XML content of SOAP messages directly using techniques such as DOM, SAX, XPath, etc reported a lack of documentation and other forms of support from vendors who continued to promote the data binding aspects of using WSDL.

The value of providing an XML Schema description at all has been questioned by those forced to resort to using DOM. In particular the inability to evolve schemas easily led a number of projects to publish WSDLs with an XML Schema containing a single xs:any node, or to encode XML inside a single xs:string.
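The cost of the xs:string approach is visible in a short sketch: the payload travels as escaped character data and must be parsed a second time by the receiver, with no schema describing its content. The `message` and `contact` element names are hypothetical.

```python
import xml.etree.ElementTree as ET

payload = "<contact><name>Alice</name></contact>"

# Sender: the payload becomes character data, so its markup is escaped.
wrapper = ET.Element("message")
wrapper.text = payload
wire = ET.tostring(wrapper, encoding="unicode")

# Receiver: the first parse yields only a string; a second parse is
# needed to recover the document structure.
inner_text = ET.fromstring(wire).text
name = ET.fromstring(inner_text).findtext("name")
```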

User Story - Third Party Schemas

21CN
BT is currently investing £10 Billion over 5 years replacing the UK’s public switched telephone network (PSTN) with an IP-based network to provide both voice and data services. The use of XML technologies, in particular Web services, is at the heart of this programme.

Wrapped in WSDL and exposed as a SOAP/XML document exchange, the OASIS defined Service Provisioning Markup Language (SPML) was reused by BT to provide a Web service interface to exchange contact details, billing addresses and other information held against partner organisations. Unfortunately several users of this service encountered difficulties using Version 1.0 of the SPML Schema:

  1. The SPML schema was rejected by several tools, with reports of ‘non deterministic’ constructs and violation of the UPA rule.
  2. The schema made good use of substitution groups, which are poorly implemented by code generators.
  3. The use of mixed content elements caused invalid representations or the entire Schema to be rejected by many code generators.

In this instance, we had a stark choice to either:

In either case, much of the value of reusing the standard specification is lost.

User Story - Going to Market

Parlay is an industry consortium that specifies standards for accessing and controlling telephony systems. These include the Parlay X Web service API, described in WSDL, which enables the calling of services such as call control, conferencing, audio, text and picture messaging and billing.
http://www.parlay.org.

BT is currently in the process of marketing products and services to enable developers from other organisations to converse in real time with BT’s exchange equipment. Many of these services will be presented using Web services and XML Schemas defined outside of BT’s direct control by consortia such as Parlay.

Some services, such as ‘manage my broadband account’ and ‘initiate a third party call’, will be targeted at a mass market and will need to support a wide range of disparate developers and environments. Implementations which only interoperate with a small set of consumers reduce the market place for such services. It is difficult for BT as a service provider to apply pressure to a vendor to interoperate with standards compliant services, especially if BT has no relationship with the vendor beyond it happening to be the technology chosen by a potential, possibly as yet unknown, customer.

Regulation in the UK Telecoms industry may make it unacceptable for BT to provide different qualities of service to different organisations, even competitors. Regulation may require BT to open up services which would otherwise remain for its own internal use. Furthermore, regulation may stipulate that the same level of service has to be provided to external organisations as is provided internally across divisions of BT. In a regulated marketplace, it is often not possible to provide services which only work for certain ‘best of breed’ technologies. Open standards based solutions are often BT’s only option when publishing services.

Summary

  1. XML Schema is currently the most widely agreed upon standard for describing Web service messages.

  2. There remains a high expectation amongst developers that tools which generate and consume an XML Schema wrapped in WSDL should provide a seamless and full abstraction of the XML messages exchanged.

  3. In our experience, XML Schema is implemented inconsistently in vendor tools, especially those which use schemas to generate mappings into code and other forms of data.

  4. There already is a lowest common denominator ‘profile’ of XML Schema implicit in the features implemented consistently across current implementations.

  5. Practical interoperability testing is essential to divine this profile, but is expensive for a single organisation to undertake by itself.

  6. An agreed upon set of XML Schemas and instance documents purposed for testing Web service tools is essential to expand this implicit profile.

  7. Working around interoperability issues with vendor supplied tools is difficult and sometimes impossible when using XML Schemas published by third-parties, standards organisations and consortia.

  8. Best Practices are required for a number of different aspects of schema, such as versioning of messages, providing partial understanding, combining schemas in a single processor and composing schemas with other languages such as RDF and Schematron.

We intend bringing more concrete schema and instance documents from these user stories to the Workshop.