W3C Workshop on XML Schema 1.0 User Experiences: IBM's position

27 May, 2005

It has now been four years since XML Schema 1.0 was granted Recommendation status by the W3C. During this period, the XML community has developed a substantial body of experience with Schema's application. IBM believes that now is a good time to solicit community feedback, and we thank the W3C for convening this valuable workshop. We look forward to engaging with the other workshop participants.

IBM believes that critical mass, stability and compatibility are important characteristics of any successful technology. At this point, we believe that the industry has reached a tipping point in terms of the broad adoption of XML Schema. Indeed, the principal strength of XML Schema is its ubiquity and widespread deployment. While it is entirely appropriate to consider improving this important language, the top priority must be preserving the value that it is already delivering. Just as XML 1.1 has struggled to gain traction, any incompatible change to Schema is going to face an uphill fight, except insofar as it delivers compelling value that the community demands. Thus, the bar should be set very high on making changes.

That said, there is little dispute that there are complexity issues with XML Schema. It has become something of a popular sport to knock XML Schema in the blogosphere, in trade publications, and on the various XML mailing lists. We believe that some but not all of these complexities are causing real hardship to users. IBM agrees that, for example, the syntax of XML Schema is unnecessarily cumbersome; it is verbose, difficult to remember and painful to author by hand. Although the Unique Particle Attribution (UPA) constraint greatly facilitates data typing and programming language bindings, it can be frustrating to users. In particular, the interaction of UPA with wildcard content complicates the design of evolvable schemas. Having acknowledged such problems, it is also important to emphasize that a majority of users find XML Schema to be a highly effective tool for much of what they're doing. We routinely work with users who report that it is meeting their needs well.

If the W3C is to tackle the shortcomings of XML Schema 1.0, it must do so with a willingness to abandon the work (i.e., prior to going to Recommendation) if the results prove unlikely to be widely deployed. We suggest that the W3C investigate changes primarily in a few key areas that users have reported as being critical needs. For example, simple new features to support schema versioning might be of such value that users and vendors would deploy them in a new version of XML Schema. However, any proposed changes should be carefully weighed against the inevitable resistance to their adoption, given the near-ubiquitous deployment and deeply entrenched use of XML Schema 1.0.

In other respects, we believe that the community would be best served through a focus on helping users of XML Schema be more successful through improvements in tooling support and compatible tweaks to the specification. We note that for us as middleware vendors, compatibility depends not only on the surface syntax and validation rules of schema, but also on its component structure [1], as the latter is directly reflected in our APIs and thus in our users' applications.

XML Schema has also been criticized for being too powerful. In particular, data-binding of XML to Java and other programming language environments, using XML Schema to provide type metadata, is an increasingly important application of XML Schema, especially in the context of Web services. Anecdotal feedback from the Web services community has suggested that inconsistent support for certain valid XML Schema constructs across the various development tool environments has contributed significantly to the interoperability challenges faced by Web services developers. IBM believes that there is a definite need to improve the interoperability of available tooling in that space, and that doing so should be a significant goal of this workshop.

That said, many of the XML Schema constructs that Java and .NET data-binding tools struggle with are not arcane complexities of XML Schema. Rather, they are fundamental to the value of XML. One example is Schema's xsd:choice, which maps most naturally to what some programming languages support as "tagged unions". The fact that some programming languages lack the ability to express choice and/or tagged unions is a characteristic of the languages, not XML. xsd:choice is a simple and yet fundamental feature of XML Schema and of XML itself (i.e., DTDs).
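To illustrate the mapping from a choice to a tagged union, here is a minimal sketch in Python. It is an illustration only and is not drawn from any data-binding tool: the element names (payment, card, transfer), the class names and the bind_payment function are all hypothetical, and the content model is assumed to be an xsd:choice between exactly one card element and one transfer element.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Union

# Hypothetical content model (an assumption for this sketch):
# <payment> contains exactly one of <card> or <transfer>,
# as an xsd:choice between the two elements would declare.


@dataclass
class Card:
    number: str


@dataclass
class Transfer:
    iban: str


# The tagged union: a Payment is one alternative or the other,
# and the class of the value acts as the tag.
Payment = Union[Card, Transfer]


def bind_payment(xml_text: str) -> Payment:
    """Bind a <payment> element to the alternative that is present."""
    root = ET.fromstring(xml_text)
    children = list(root)
    if len(children) != 1:
        raise ValueError("expected exactly one child element")
    child = children[0]
    if child.tag == "card":
        return Card(number=child.text or "")
    if child.tag == "transfer":
        return Transfer(iban=child.text or "")
    raise ValueError("unexpected element: " + child.tag)


bound = bind_payment("<payment><transfer>DE00</transfer></payment>")
```

Code that consumes the bound value can branch on its class (for example with isinstance), which plays the role that the element name plays in the instance document.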

Similarly, one of the great values of XML is its ability to handle document-oriented constructions such as mixed content. Without mixed content, there could be no XHTML, no Atom or RSS feeds, no XSLT stylesheets, and indeed no resumes, technical reports, or other documents for which XML is so widely used. The ability to support documents and data using the same infrastructure is one of the key innovations of XML. The fact that Java or .NET strings cannot represent mixed content is an understandable concern. However, we believe that the right answer is not to "dumb down" XML or Schema, eliminating the core value of XML, but rather to embrace XML and XML Schema and focus on improving the programming models for dealing with XML content.
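As a small illustration of why a plain string is not enough for mixed content, the following Python sketch uses the standard library's xml.etree.ElementTree to recover a paragraph's text runs and child elements in document order. The helper name mixed_content and the sample paragraph are assumptions made for this example; it shows one possible programming model, not a recommended one.

```python
import xml.etree.ElementTree as ET


def mixed_content(element):
    """Return text runs and child elements in document order."""
    parts = []
    # Text that appears before the first child element.
    if element.text:
        parts.append(element.text)
    for child in element:
        parts.append(child)
        # ElementTree stores the text that follows a child
        # on that child's "tail" attribute.
        if child.tail:
            parts.append(child.tail)
    return parts


para = ET.fromstring("<p>XML handles <em>documents</em> and data.</p>")
parts = mixed_content(para)

for part in parts:
    if isinstance(part, str):
        print("text:", repr(part))
    else:
        print("element:", part.tag, repr(part.text))
```

The result interleaves strings and elements, so the emphasis markup survives alongside the surrounding text instead of being flattened away.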

IBM is deeply committed to improving the tooling and programming models for handling XML and XML Schema for the benefit of the community. This commitment is demonstrated through initiatives such as the Service Data Objects (SDO) API [2]; research projects such as XJ [3], which integrates XML and XML Schema as first-class constructs within the Java programming language; and continued improvements in the performance and capability of the Xerces parser and the Eclipse Web Tools open source projects. Overall, we look forward to working with the community to address its needs and concerns in a manner that yields the most value for the least disruption.

[1] http://www.w3.org/TR/2004/PER-xmlschema-1-20040318/#key-component
[2] http://www-106.ibm.com/developerworks/java/library/j-sdo/
[3] http://www.research.ibm.com/xj/

Thank you

Christopher Ferris
Sandy Gao
Steve Holbrook
Noah Mendelsohn