XML Schema: Looking Forward

Mark Nottingham, David Orchard, Radu Preotiuc-Pietro and James Taylor, BEA Systems

BEA has extensive experience implementing XML Schema, both for validation and for data binding to Java. Most of our efforts are concentrated in XMLBeans, now part of the Apache XML project. XMLBeans allows full-fidelity access to XML instance data and full Schema support, whilst maintaining the most natural possible binding to Java interfaces and classes.

Arriving at what we believe is a complete and conformant implementation took considerable effort, and led us to several insights that we’d like to share.

We believe that one of the critical hurdles for XML Schema to pass in the near future is support for versioning and extensibility; as a result, David Orchard is spending a considerable amount of time pursuing this goal in the TAG and elsewhere. The XML Schema Working Group’s publication of Versioning Scenarios is an important step in this direction.

Beyond extensibility and versioning, we’re concerned about the divergence in data models for XML; in particular, the XPath and XQuery Data Model introduces primitive types that cannot be expressed in XML Schema, and does not align with the Post Schema Validation Infoset. We feel that these data models and type systems must converge; otherwise, they will place an increasing burden on both implementers and end users.

We’re also concerned about the lack of take-up of XML Schema 1.1 by key vendors; without their participation, it makes little sense to continue this effort.

Finally, we believe that the complexity of Schema is, in many scenarios, unnecessary, and often actively harmful. More than anything, this is the feedback we hear from our customers and end users. We are also concerned that Schema’s complexity may be the root cause of so many incomplete and/or incorrect implementations.

Despite its complexity, XML Schema only partially addresses some requirements; while it is primarily focused on validation, the reality of its use as a data modeling language cannot be ignored. In the long term, we believe that XML Schema will either need to address these problem domains more directly, or risk losing its value to domain-specific schema languages.

Therefore, we would be very interested if the W3C chose to produce a small number of domain-specific profiles of XML Schema; e.g., a “data” profile that disallowed mixed content and the constraint of some facets, whilst making ordering non-significant by default, possibly extending Schema with new data types and structures as well. Such profiles would make Schema easier to implement, easier to use, and easier to verify for correct behaviour.
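
As a rough illustration of how simple conformance checking against such a profile could be, the Python sketch below scans a schema document for declarations of mixed content. It is a sketch under stated assumptions only: no such profile exists, the single rule shown (no mixed="true") is just one of the restrictions suggested above, and the function name and sample schema are hypothetical.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"


def find_mixed_content(schema_text):
    """Return the names of schema components that declare mixed content.

    Hypothetical rule for an illustrative "data" profile: the mixed
    attribute must not be true on complexType or complexContent.
    """
    root = ET.fromstring(schema_text)
    offenders = []
    for tag in ("complexType", "complexContent"):
        for node in root.iter(f"{{{XSD_NS}}}{tag}"):
            # xs:boolean accepts both "true" and "1" as true values.
            if node.get("mixed") in ("true", "1"):
                offenders.append(node.get("name", "(anonymous)"))
    return offenders


# Hypothetical sample schema: one mixed-content type, one data-only type.
SAMPLE = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="Paragraph" mixed="true">
    <xs:sequence>
      <xs:element name="em" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="Address">
    <xs:sequence>
      <xs:element name="street" type="xs:string"/>
      <xs:element name="city" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>"""

print(find_mixed_content(SAMPLE))
```

A real profile would need many more rules (restricted facets, ordering semantics, any new data types); the point is only that each restriction of this kind can be checked mechanically against the schema document itself.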

We look forward to hearing the input of other vendors and end users at the workshop, and participating in the resulting discussions.