© 2001 Microsoft Corporation. All rights reserved.
Web services rely, and will become ever more reliant, on recommendations promulgated by the W3C. A W3C recommendation is, in effect, a design of which there are likely to be multiple implementations. With so much dependent on W3C recommendations, it is imperative that there be considerable assurance of the quality of those designs in anticipation of their widespread deployment in various implementations. Moreover, a significant component of such assurance is the successful interoperation of those very implementations.
We believe that it is useful to view a W3C recommendation as a kind of software product—with all the life cycle considerations attendant thereto. In that light, much of what we know about quality assurance in the software domain can be applied directly to the task at hand. In the spirit of such an application we should be able to answer the following questions.
In this note we will examine these questions informed by both our historical experience in developing enterprise software products, and our more recent experience in the design and implementation of the XML Schema Proposed Recommendation.
That there should be test cases is uncontroversial. Debate is more likely to ensue regarding what test cases should test and how best to create such test cases. To some degree XML itself helps in framing the questions. In fact, we believe that the following theses are entirely defensible:
· a test case is an XML instance
· a test case can be effectively characterized by an XML schema
That said, there still remain the questions concerning what and how.
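To make the first thesis concrete, the following sketch renders a single test case as an XML instance and reads it back with Python's standard library. The element and attribute names (testCase, input, schema, expected, and so on) are illustrative assumptions, not drawn from any published W3C vocabulary.

```python
import xml.etree.ElementTree as ET

# A hypothetical test case expressed as an XML instance. In practice the
# permitted structure would be characterized by an XML schema.
TEST_CASE = """\
<testCase id="schema-attr-001" kind="unit">
  <description>Validate a required attribute declaration</description>
  <input href="inputs/attr-001.xml"/>
  <schema href="schemas/attr-001.xsd"/>
  <expected outcome="valid" href="expected/attr-001-psvi.xml"/>
</testCase>
"""

case = ET.fromstring(TEST_CASE)
print(case.get("id"))                       # schema-attr-001
print(case.find("expected").get("outcome")) # valid
```

Because the test case is itself XML, the same validation machinery under test can be used to check the well-formedness and schema-validity of the test suite.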
In an open standards forum such as the W3C it would be difficult to do other than to license every participant to submit test cases. On the other hand, the form of those test cases can be highly regulated. Indeed, it should be incumbent upon W3C working groups to design such regulation into their recommendations.
For most working groups the first order of business is to develop requirements. These requirements, in turn, are based on use cases. It should be standard practice among working groups to evolve use cases into test cases while developing a descriptive language for such test cases. This descriptive language should prescribe not only the form of a test case but also its function. Of course, the construction of test cases must be governed by more than just use cases. Most W3C recommendations are specifications of languages with particular syntax and semantics. Many test cases will be motivated entirely by the desire to assure that an implementation actually adheres to both the syntactic and semantic particulars of the specified language. For the most part, it will be infeasible to test all possible combinations. Suites of test cases will thus inevitably reflect some strategy of unit testing (of individual syntactic and semantic components) and system testing (of characteristic combinations of units). The test description language ought to be capable of characterizing a test case from the perspective of the aforementioned strategy.
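A test description language that records strategy makes it trivial to partition a suite by testing level. The sketch below assumes a hypothetical `kind` attribute distinguishing unit tests from system tests; the vocabulary is an illustrative assumption, not a published one.

```python
from collections import defaultdict
import xml.etree.ElementTree as ET

# A hypothetical suite fragment; testSuite/testCase/kind are assumed names.
SUITE = """\
<testSuite>
  <testCase id="dt-decimal-01" kind="unit"/>
  <testCase id="dt-string-01" kind="unit"/>
  <testCase id="ct-derivation-01" kind="system"/>
</testSuite>
"""

# Partition the suite by strategy: unit tests of individual syntactic and
# semantic components versus system tests of combinations of units.
by_strategy = defaultdict(list)
for case in ET.fromstring(SUITE):
    by_strategy[case.get("kind")].append(case.get("id"))

print(dict(by_strategy))
# {'unit': ['dt-decimal-01', 'dt-string-01'], 'system': ['ct-derivation-01']}
```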
We have already noted that W3C recommendations are commonly couched in specifications of languages. Increasingly commonly, the syntax of such languages is given by an XML schema. Often the semantics of the language is presented in terms of a data model and transformations upon instances of that data model. XML Schema defines a post-schema-validation information set, which encapsulates the consequences of a schema-validation episode. Specifically, the input to schema validation is a collection of information sets and the output is the post-schema-validation information set. As it happens, there are at least three implementations of XML Schema under way that are capable of rendering both the inputs and outputs as schematized XML instances. This suggests a generally applicable instrumentation methodology: make everything XML!
To recapitulate: Assume that a W3C recommendation specifies a language whose semantics is given by transforming an input data set into an output data set. Assume further that all the data sets have schematized XML representations. Assume lastly that an implementation is capable of rendering the relevant data sets as XML. This puts us in a position to characterize (in XML) a test case in terms of its inputs and expected outputs. The test is XML, the test data is XML, and test instrumentation is the comparison of an expected XML output with an actual XML output.
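The instrumentation described above reduces to comparing two XML documents for structural equality, ignoring incidental differences such as attribute order and whitespace. A minimal sketch of such a comparator, under the assumption that both outputs are well-formed XML:

```python
import xml.etree.ElementTree as ET

def canonical(elem):
    """Normalize an element into a comparable tuple: tag, sorted
    attributes, stripped text, and recursively canonicalized children."""
    return (
        elem.tag,
        tuple(sorted(elem.attrib.items())),
        (elem.text or "").strip(),
        tuple(canonical(child) for child in elem),
    )

def outputs_match(expected_xml, actual_xml):
    """True iff the expected and actual outputs are structurally equal."""
    return canonical(ET.fromstring(expected_xml)) == canonical(ET.fromstring(actual_xml))

# Attribute quoting and empty-element syntax differ, but the infosets agree:
expected = "<result valid='true'><errors/></result>"
actual = '<result valid="true"><errors></errors></result>'
print(outputs_match(expected, actual))  # True
```

A production comparator would instead rely on a standardized canonical form of XML, but the principle is the same: instrumentation is document comparison.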
Thus far, we have seen how we might reduce both test cases and test data to collections of XML instances. Having done that reduction, we could put everything in an XML repository. We can think of testing as a workflow enactment wherein we marshal test inputs and testing platforms to create test outputs. While they have yet to be standardized, there do exist XML-based descriptions of such workflows. The point is that test execution is amenable to XML description. On the assumptions we have given, all the constituents of the testing and implementation of an XML-based standard could be placed in an XML repository. In so doing, we reduce the problem of managing XML testing to the problem of managing an XML repository. One consequence of the work of the XML Query Working Group is that there will be many solutions to the repository management problem.
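Once everything lives in an XML repository, test management becomes query evaluation. The toy repository below mixes test cases and run results in one document; all element names are illustrative assumptions, and the limited XPath subset of Python's ElementTree stands in for a full XML query language.

```python
import xml.etree.ElementTree as ET

# A toy XML "repository" holding test cases and run results side by side.
# A real repository would be an XML database queried with an XML query
# language; the structure here is a hypothetical sketch.
REPOSITORY = """\
<repository>
  <testCase id="tc-001" kind="unit"/>
  <testCase id="tc-002" kind="system"/>
  <run case="tc-001" platform="implA" outcome="pass"/>
  <run case="tc-001" platform="implB" outcome="fail"/>
</repository>
"""

repo = ET.fromstring(REPOSITORY)

# "Which unit tests exist?" and "which platforms failed?" become queries:
unit_cases = [c.get("id") for c in repo.findall("testCase[@kind='unit']")]
failures = [r.get("platform") for r in repo.findall("run[@outcome='fail']")]

print(unit_cases)  # ['tc-001']
print(failures)    # ['implB']
```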
There will inevitably arise disputes as to whether an implementation is compliant with a W3C recommendation. If testing is formulated in the fashion that we described above, we posit the following principles of compliance adjudication:
In the foregoing we have introduced and motivated the testing of implementations of a W3C recommendation against a suite of test cases. We have argued for a formal XML-based representation and description of such test cases. We have observed that, combined with workflow, the aforementioned representations and descriptions would enable the automation of the testing activity. Finally, we have identified and circumscribed the element of human judgment still required to determine an implementation to be compliant with a W3C recommendation.
Needless to say, we believe that the interoperability of implementations is only measurable modulo test cases. That is, two implementations are deemed interoperable if they both have the input/output behavior specified by a test suite.
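This notion of interoperability can be stated operationally: run both implementations over the suite's inputs and require identical outputs. The two "implementations" below are trivial placeholders standing in for real processors, and the suite is a hypothetical two-case example.

```python
# Stub implementations standing in for two independent processors of the
# same specified language; both map an input document to a result document.
def impl_a(doc: str) -> str:
    return "<result valid='%s'/>" % ("true" if "<ok/>" in doc else "false")

def impl_b(doc: str) -> str:
    return "<result valid='%s'/>" % ("true" if "<ok/>" in doc else "false")

# A hypothetical two-input test suite.
SUITE_INPUTS = ["<doc><ok/></doc>", "<doc><bad/></doc>"]

def interoperable(f, g, inputs):
    """True iff f and g exhibit the same input/output behavior on the suite."""
    return all(f(i) == g(i) for i in inputs)

print(interoperable(impl_a, impl_b, SUITE_INPUTS))  # True
```

Note that the verdict is always relative to the suite: two implementations judged interoperable here might still diverge on an input the suite does not exercise.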