Test Case Description Language 1.0

Submission to QA Working Group 13 October 2003

This version:
http://www.w3.org/QA/WG/2003/10/tcdl-20031013
Latest version:
http://www.w3.org/TR/qaframe-intro/
Previous version:
http://www.w3.org/QA/WG/2003/10/tcdl-20031012
Editors:
David Marston (David_Marston@us.ibm.com)
Contributors:
See Acknowledgments.

Abstract

The Test Case Description Language (TCDL) is an XML vocabulary for describing test materials, intended to be delivered as part of such materials. It allows each test lab using the materials to get set up to run tests repeatedly. The design presented here is a flexible one, intended to be adjusted by each Working Group to fit the classes of product that would be the subject of their Recommendations, and the Dimensions of Variability present in their Recommendations.

Status of this document

This document is a proposal from the author to the QA Working Group, for its consideration and discussion of future development, if any. It has not yet been discussed, approved, or endorsed by the Working Group, much less by W3C.

This is the second version of this document. (tcdl-20031012.html is the first version).

Posting of this discussion draft by QAWG does not imply endorsement by the QAWG or the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

A refinement of this document is already in preparation, and is expected to be available before consideration and discussion by QAWG.

Table of contents


1. Goals of the TCDL

1.1. Scope and goals

This document specifies an XML vocabulary used to catalog most of the Test Materials that a WG would provide. The catalog data will be used by a test lab to set up and run the test materials (TM) against one or more test subjects. In more detail, the lab uses the data when:

While TCDL is designed to provide enough data, in fine enough particles, to support automation of the above activities, it should also prove useful in situations where automation is not feasible. The primary design goal is that the TCDL document can be transformed by XSLT into documents of many kinds, including executable scripts or program code.

The WG may also use TCDL as part of their test materials development and maintenance activities. For this reason, an optional module known as the Status-Tracking Feature has been defined. For more perspective on how this feature provides a partial solution, see section 3.7 Status Tracking below. The part of TCDL that is always used is referred to as the TCDL Core in this document.

Each document conforming to TCDL catalogs a set of test cases. This provides a checklist of all the cases in XML form, which can be transformed into a variety of convenient documents for human and machine consumption. Within and across the test cases, it catalogs all the files in the Test Materials that are directly involved in testing an implementation. If the files are delivered in a multi-level directory structure, the TCDL instance document also documents the structure, and does so in such a way that parallel empty structures can be generated at run time.

TCDL includes provisions for descriptive strings for each test case and each instance of other objects in its domain. A minimal descriptive string is required. These strings can be used in log files and result reports.
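As an illustration, a minimal catalog entry might look like the following sketch. The test-case, purpose, and scenario names appear elsewhere in this document; the test-suite, input-file, and output-file names are hypothetical placeholders, since the exact schema is given in Chapter 4.

```xml
<!-- hypothetical sketch: only test-case, purpose, and scenario are
     names taken from this document; the rest are placeholders -->
<test-suite spec="XFoo">
  <test-case id="case-001" scenario="standard">
    <purpose>Checks that attribute defaults are applied</purpose>
    <input-file>tests/case-001.xml</input-file>
    <output-file>expected/case-001.out</output-file>
  </test-case>
</test-suite>
```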

There is enough detail about test cases that a test lab should be able to update their copy of the test materials to a newer version and perform some simple code management functions without needing to have a direct interface to the code management system of the W3C or other development site for the test materials.

Editorial note  
QAWG may not want to suggest that W3C would put up a CMS.

By defining each test case separately from the files and filenames it references, TCDL supports reuse of test data, possibly even across test regimes.

Each test-case element in a TCDL document can be transformed by XSLT into one line of a script (or "batch file" or equivalent) or a line of source code for a programming language. The automation system that executes the code need not have any facility for excluding certain test cases; the filtering can be done as part of the XSLT transformation.

Transformations of the TCDL data can produce scripts and checklists for human consumption as well. For example, a test regime for a text-to-speech system may require that a human see a word, hear the word pronounced by the system, and press one of a set of buttons to evaluate the sound. The TCDL catalog provides the words that serve as test cases, and the test harness can take the words either as a plain-text list or as a program with a sequence of calls to the routine that tests one word and accepts a button-press action.

Transformations of the TCDL data can produce text files and checklists in other formats, such as the venerable comma-separated value (CSV) format, if required.

1.2. Role and Non-Goals

The TCDL supports automated setup of the system(s) used for testing, automated running of test cases, automated comparison of results, and automated cleanup of the system(s). When results should be archived, the TCDL can be transformed into a script for copying and/or renaming log files or direct result files. If the test materials come with a suggested tree structure of directories for files, the bits of information about the tree are isolated so that they can be used to generate empty directories according to the same structure.

In our terminology, it is the test harness that actually causes tests to be run and results to be obtained or derived. TCDL provides the "line items" over which the harness can iterate. TCDL also provides data that can be used in reporting case-by-case results.

The TCDL serves as a catalog or manifest to be delivered with a collection of test materials. Materials are delivered as files, whether or not they are used as files when tests are executed. Other resources should be used to perform the archiving and versioning of files, as would be done in a "configuration management system" (CMS) tool. With the Status-Tracking Feature of TCDL in effect, a user can know the approval status of a test, which is not the same as tracking its history. If test files and the associated TCDL instance document are stored in a CMS, the CMS can serve up historical snapshots. All snapshots that are suitable for delivery to test labs must have a TCDL document that is consistent with the rest of the snapshotted files, and the coordinated set of files should be labeled for extraction by name from the CMS. The labeling allows another test lab to reproduce claimed results independently. The TCDL document lists all the (substantive) files in the snapshot, but does not otherwise fill the CMS role.

The TCDL document of a test suite can be transformed into a script that will check that all necessary files are present on the system where testing will be run. Where more than one machine is needed for testing (e.g., client and server), the lab can also produce scripts to move files from system to system. Test data files, or indeed any needed file, need not originate with the test regime in question. For example, the WG could specify that in addition to the test materials they supply, a lab should also download test data from another source, possibly even a non-W3C source. The TCDL should list every file needed to run the tests, regardless of source or how many downloads it takes to obtain all the materials.

The test regime may require live use of on-line services, such as fetching a document from an HTTP server or access to a Web service. In such a case, the TCDL specifies the URI. The TCDL document can be transformed into a list of all servers that need to be pinged before a test run begins.

Certain elements and attributes of each test case are descriptive texts that can be used in catalogs for human reading or to annotate log files. Many of the functional data items may also be useful as log file annotations. A TCDL document doesn't contain bug tracking data, other than the status of a test case if the Status-Tracking Feature is used, but the descriptive data can be imported into a separate bug tracking facility.

Each test case MUST be flagged with data about its applicability if it could be run selectively. Reasons for excluding the case from a particular test run fall into two broad classes: the test requires resources that are not currently available, or the test has some characteristics that make it inapplicable against some test subjects or under very formal testing conditions. The resource issues can be identified by the "scenario" or the identifier of an input. Reasons for filtering out a test due to inapplicability generally match up with the "Dimensions of Variability" (DoV) defined in QA Framework: Specification Guidelines [QAF-SPEC]. In addition, the version of the specification and applicable errata can be considered. TCDL supports all the following filtering criteria:

The filtering can be carried out by XSLT transformation on the TCDL document.

Notice that upward compatibility is assumed. Most test cases span all versions, so TCDL is designed to support a test suite that spans all versions. Individual cases can be marked to start and/or stop being applicable at a designated version number.

If citations of applicable specifications are done well, TCDL can be used to identify all tests that exercise a particular provision or test assertion. In that way, TCDL can support ongoing improvements to coverage.

1.3. Using TCDL

Each class of product described in QA Framework: Specification Guidelines [QAF-SPEC] will adapt TCDL to its testing needs. For example, a consumer of an XML vocabulary must accept all the required syntax variations, may accept the optional variations within the specification's restrictions, must reject all incorrect input, and must exhibit the specified behavior for the valid input. A test case consists of the input to be consumed, a scenario keyword to set the general behavior pattern, and a description of the behavior in sufficient detail that correct and incorrect behavior can be normatively distinguished.

Scenarios allow mixing test cases that need differing inputs and produce differing results. In particular, error scenarios can be provided in the same test suite as the "positive" test cases. Each scenario can have its own requirements for number and kinds of input, and for environmental conditioning to occur before the test is run.

1.4. Associated Work and Unmet Needs

TCDL is not a bug tracking system for the test materials, nor for the specifications. !!!

TCDL is not a CMS for the test materials. !!!

The QAWG may recommend certain practices for either bug tracking or a CMS. For additional discussion of the limits to usefulness of TCDL, see [Chapter 6 ???] below.

1.5. Terminology and Notations for This Document

In the normative parts of this document, the keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" will be used as defined in RFC 2119 [RFC2119]. When used with their RFC2119 meanings, they will be all uppercase. Occurrences of these words in lowercase comprise normal prose usage, with no normative implications.

Unusual terms in this document !!!

Please consult the QA Glossary, [QA-GLOSSARY], for the definitions of other QA terms used in this document.

2. Major Sections and Concepts

The bulk of the data consists of test-case elements, but there are side tables of data specific to each test regime. Tests can be grouped. !!!

2.1. Tables of resources

Rationale !!!

2.2. Identification of each test case

Rationale !!!

2.3. Descriptive data and citations for each test case

For each test case, there is required data to express what the case is about. Optional notes can be used to explain the test case to any desired degree of detail. In addition, citation elements provide derivation data and can be explanatory if done well.

The purpose element contains a one-line statement of the purpose of the test. Every test case should have a distinct string value for the purpose statement, though the WG is responsible for enforcing distinctiveness. The one-line statement can appear in rendered human-readable catalogs and in log files of test results. When writing a purpose statement, test case creators should envision an analyst comparing the purpose strings of two similar cases.

Each test case must be traceable back to the specifications, and citation elements are used to record the applicable provisions. At least one citation element is required for every case. There is no upper limit on the number of citations, and catalogers are encouraged to make a citation-spec entry for every document that may have normative bearing on a test case.

To allow for varying degrees of precision in the markup of different Recommendations, several citation types are recognized, and a WG may define others. The most coarse-grained would be a numbered chapter or section. Fine-grained citations may point to an individual sentence, test assertion, grammar production, or other single line.
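For example, a coarse-grained citation and a fine-grained citation for one case might be recorded as follows. This is a sketch only: citation-spec is named in this document, but the attribute names and values shown are hypothetical, pending the exact syntax in Chapter 4.

```xml
<!-- illustrative only: attribute names are assumptions -->
<test-case id="attr-norm-03">
  <purpose>Attribute-value whitespace is normalized before delivery</purpose>
  <citation-spec spec="XML-1.0" type="section">3.3.3</citation-spec>
  <citation-spec spec="XML-1.0" type="assertion">attr-norm-cdata-1</citation-spec>
</test-case>
```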

2.4. Scenarios and external conditions

Rationale !!!

2.5. Data for filtering test cases

Rationale !!!

2.6. Inputs and outputs

Rationale !!!

2.7. Grouping of test cases

Rationale !!!

3. Primary Features

This chapter presents design rationales and requirements for individual data items. !!!

3.1. Citations

Rationale !!!

Editorial note  
Expect this document to provide an initial set of citation types.

3.2. Creator credits

Rationale !!!

3.3. Descriptive strings and notes

Rationale !!!

3.4. Supported versions

There is no need to maintain separate test materials for each version of the Recommendation. Each individual test case can be marked to indicate the range of versions to which it applies. The WG should specify whether versions are designated by number (1.0, 1.1, etc.) or by date. A test case with no version information shown is assumed to be applicable to all versions published so far, including drafts of the next-higher version. Thus, version data can be used for filtering, but the filtering mechanism must be written to exclude only those cases that carry version data with a value outside the applicable range.

Use version-add to indicate that a case is not applicable prior to the indicated version. For example, a test case with a version-add of "2.0" begins to be applicable for version 2.0 of the Recommendation. A test subject that claims to implement version 2.0 of the Recommendation should have the case applied, while a test subject that implements any lower-numbered version (but not 2.0) should have the case excluded.

Use version-drop to indicate that a case becomes non-applicable effective with the indicated version. For example, a test case with a version-drop of "3.0" is applicable for versions up until 3.0 of the Recommendation. A test subject that claims to implement version 2.0 of the Recommendation should have the case applied, while a test subject that implements 3.0 should have the case excluded.

The version-add element can be used in combination with version-drop to delimit a range of applicable versions. No more than one of each can be specified.
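For example, a case that became applicable at version 2.0 and ceased to be applicable effective with version 3.0 could be marked as in the following sketch (illustrative only; Chapter 4 gives the exact syntax for version values):

```xml
<!-- applicable to 2.x versions only: added at 2.0, dropped effective 3.0 -->
<test-case id="schema-import-12">
  <version-add>2.0</version-add>
  <version-drop>3.0</version-drop>
  ...
</test-case>
```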

Editorial note  
The versioning section of Chapter 4 will give exact syntax for indicating versions by number or by date.

3.5. Data for filtering test cases

Rationale !!!

3.6. Naming schemes for test cases and files

Rationale !!!

3.7. Status tracking

Rationale !!!

3.8. Tracking needed cases

Rationale !!!

3.9. Indication of successful outcome

Rationale !!!

3.10. Modification dates

Rationale !!!

3.11. Directory tree structure

Rationale !!!

3.12. Namespaces

Rationale !!!

3.13. No sequencing implied

Rationale !!!

3.14. Grouping of test cases

Rationale !!!

4. Data Items in Detail

This chapter details each element in TCDL. !!!

5. Tooling Built on TCDL

This chapter describes particular ways to use the TCDL markup. The anticipated uses inspired many of the design details.

5.1. Filtering Cases by XSLT Transformation

By having information about a test case contained within a single element, the test lab can filter the set of cases by omitting one element for each case to be excluded. Using XSLT, a transformation can loop over all test-case elements, testing each one for the filtering criteria.
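A minimal sketch of such a filter is an XSLT 1.0 identity transform with one extra template that drops the excluded cases. The test-case and version-add names follow this document; the particular filtering criterion is an arbitrary example.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- identity template: copy everything through unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- filter: omit cases that only become applicable at version 2.0 -->
  <xsl:template match="test-case[version-add = '2.0']"/>
</xsl:stylesheet>
```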

5.2. Creating Scripts by XSLT Transformation

TCDL is intended to provide separate information items that may be assembled into strings for a scripted or manifest-driven automated testing environment.

Editorial note  
An example of a stylesheet could be provided.
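As one sketch of such a stylesheet, the following XSLT 1.0 fragment emits one shell command per test case. The run-one-test command and the input-file and output-file element names are hypothetical; a real stylesheet would use the element names of the WG's TCDL adaptation and the invocation syntax of the lab's own harness.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- one script line per test case -->
  <xsl:template match="test-case">
    <xsl:text>run-one-test </xsl:text>
    <xsl:value-of select="input-file"/>
    <xsl:text> </xsl:text>
    <xsl:value-of select="output-file"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- suppress default text output from all other elements -->
  <xsl:template match="text()"/>
</xsl:stylesheet>
```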

5.3. Selective Query of Data About Tests

A TCDL catalog may be queried by XQuery. Past experience indicates that the citations are used as query criteria, since they represent the substance of the test case. The more precise the citations, the better for querying.
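For example, a query for the identifiers of every case that cites a given section might look like the following XQuery sketch; the test-case and citation-spec names follow this document, while the @id attribute and the file name are assumptions.

```xquery
(: find the ids of all cases that cite section 3.3.3 :)
for $case in doc("testlist.xml")//test-case
where $case/citation-spec = "3.3.3"
return string($case/@id)
```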

5.4. Relationship to EARL

TCDL is used to plan and operate the application of tests to an implementation. Test outcomes ("Pass", "Fail", "CannotDetermine") are not directly defined in TCDL, unless they are very simple. TCDL provides enough data to govern automated resolution of raw results into outcomes, while EARL addresses reporting of outcomes.

Due to filtering based on TCDL data, some tests may not be run against a particular implementation. If the WG or test lab desires to report outcomes for tests not run (keywords: "NotApplicable", "UnableToRun", etc.), the filtering process on the TCDL catalog can be re-run to produce appropriate EARL data for those cases.

6. Issues and Implications

TCDL is designed to provide content-management data that is optimized for running tests. This chapter discusses other concerns that may affect test materials. The concerns will be addressed with content-management tools and techniques appropriate to other phases of test case development.

6.1. Associating Cases with Errata

Successive waves of approved errata can be treated like versions of a Recommendation. In particular, TCDL assumes that each wave of errata constitutes an ever-better refinement of the specification. An erratum may cause an existing test case to be included in or excluded from the normative set, but it is expected that once the erratum is resolved, the action is not undone by subsequent errata.

6.2. Unreviewed and Disputed Tests

The TCDL is not designed as a vehicle for tracking the review status of each test case. However, there are reasons to provide a simplified representation of the status in TCDL. Use the Status-Tracking Feature, an optional module in W3C QA terminology, to store a status code for each test case. Specifically, the statuses of "Approved" and "Formerly-Approved" bear on reporting. All statuses other than "Approved" and "Rejected" denote a test case whose status is unresolved. A test lab may wish to run unresolved-status tests to see what results are obtained with available implementations. This is especially likely as part of a WG or TTF evaluation of test cases that may become Approved.

If disputes or incomplete work are tracked in a mechanized issues-tracking system, the TCDL could be extended by using an @issues attribute to hold an issue ID.
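Under that extension, a disputed case might carry both attributes, as in this hypothetical fragment (the status code and the issue ID shown are placeholders, not values defined by this document):

```xml
<!-- a case under dispute, linked to its issue-tracker record -->
<test-case id="ns-default-07" status="Disputed" issues="issue-214">
  ...
</test-case>
```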

See section 3.7 for details about attributes that record the approval status of a test case.

6.3. Tracking Needed Cases

Typically, test cases needed but not yet written would be tracked through the issues list or a content management tool optimized for the collection of test assertions. TCDL could be applied to this problem if each identified need is accompanied by the empty shell of a test case. Naturally, the TCDL must use the @status feature and must have a status code such as "Needed" to represent the entries that are unfilled.

6.4. Vague Provisions

In the course of recognizing and formalizing errata, test cases may be written to explore the implications of potential resolutions. An implementer or user may accompany their erratum report with one or more test cases to illustrate the issue. Upon resolution of the issue, each case should have a specified result (possibly an error). The cases can be moved to "Approved" status and marked with an errata-add that indicates the erratum number or publication date of the erratum.

Prior to resolution of the erratum, the test cases can be included in the published test materials under either of these restrictions:

6.5. "Files" on other media

A good test suite must be usable by numerous test labs against numerous implementations and repeatable over time. [cite???] Thus, it must be stored in a persistent format that we describe as "files" in this document. Assume that the test lab will determine how to parse or transform the files for dynamic usage as required in their specific environment.

6.6. Testing Extensibility But Not Extensions

Where a Recommendation provides visible mechanisms for extension, the conformance of those mechanisms can be tested. No special TCDL provisions are required when every implementation must behave the same way.

7. Conformance

The standard Conformance Section. !!!

7.1. Normative Parts

All parts not labeled as "informative" or "examples" are normative.

7.2. Extensibility

This specification defines a number of required elements and attributes. !!! The Status-Tracking Feature is to be treated as a module that must be implemented as a complete package, if at all.

7.3. Conformance Definition

An instance of TCDL must contain all the required elements, attributes, and non-null text strings. All values restricted to an enumerated set must conform to the set defined and published by the WG. A particular test regime may mandate that each TCDL instance use the status-tracking feature in addition to the TCDL Core features.

7.4. Conformance Claims

A TCDL instance conforms to TCDL 1.0 if it satisfies the constraints enumerated in the previous section. An assertion of conformance to this specification (a conformance claim) SHOULD come from an identifiable source and MUST specify:

Example:

On 12 September 2004, W3C's Test Suite for XFoo (http://www.w3.org/TR/2004/XFoo), dated 12 September 2004, is accompanied by a TCDL document, testlist.xml, that conforms to W3C's TCDL of [date], available at http://www.w3.org/TR/2004/whatever.

The schema for this specification ([Schema]) is the Implementation Conformance Statement (ICS) pro forma for this specification. !!!

7.5. Conformance Disclaimer

As with any verifiable conformance requirements, users should be aware that:

Passing all of the requirements does not guarantee that the subject TCDL instance is well-suited to or will achieve its intended purposes, nor does it guarantee the quality or suitability of test materials associated with it.


A. References

[PROCESS]
World Wide Web Consortium Process Document, I. Jacobs, Ed., 18 June 2003, available at http://www.w3.org/2003/06/Process-20030618/ .
[QA-GLOSSARY]
Comprehensive glossary of QA terminology. (Under construction at http://www.w3.org/QA/glossary.)
[QA-LIBRARY]
Comprehensive bibliography of documents, talks, Notes, etc., that are published by the QA Activity. Available at http://www.w3.org/QA/Library/.
[QAF-OPS]
QA Framework: Operational Guidelines, L. Henderson, D. Hazaël-Massieux, L. Rosenthal, K. Gavrylyuk, Eds., W3C Candidate Recommendation, 12 September 2003, companion version to this document, available at http://www.w3.org/TR/2003/CR-qaframe-ops-20030912/.
[QAF-SPEC]
QA Framework: Specification Guidelines, D. Hazaël-Massieux, L. Henderson, L. Rosenthal, Eds., W3C Working Draft, 12 September 2003, companion version to this document, available at http://www.w3.org/TR/2003/WD-qaframe-spec-20030912/.
[QAF-TEST]
QA Framework: Test Guidelines, P. Curran, K. Gavrylyuk, D. Dimitriadis, L. Henderson, M. Skall, P. Fawcett, Eds., W3C Working Draft, 16 May 2003, current latest companion version to this document, available at http://www.w3.org/TR/2003/WD-qaframe-test-20030516/.
[QAIG]
Quality Assurance Interest Group of the W3C QA Activity, which may be found at http://www.w3.org/QA/IG/.
[QAWG]
Quality Assurance Working Group of the W3C QA Activity, which may be found at http://www.w3.org/QA/WG/.
[RFC2119]
Key words for use in RFCs to Indicate Requirement Levels, March 1997, available at http://www.ietf.org/rfc/rfc2119.txt.
[STYLE-MAN]
W3C Manual of Style, summarizing the style and publication rules for W3C technical reports, available at http://www.w3.org/2001/06/manual/.
[TAXONOMY]
QA Activity test taxonomy, a classification scheme for conformance test materials, available at http://www.w3.org/QA/Taxonomy.
[W3C-TR]
Location of all published W3C technical reports, see http://www.w3.org/TR/.

B. Glossary (Non-Normative)

approval status

Each test case may progress through various statuses on its way to becoming Approved or Rejected. Actual status codes will be specified elsewhere.

outcome

The resolution of the test result against the provided reference result, yielding values such as "Pass" or "Fail".

test regime

A single program of testing for one class of product, typically specified by a primary Recommendation that may normatively include other Recommendations and Standards. A set of test materials covers one test regime. One Rec may cover more than one test regime, which usually occurs when it defines more than one class of product.

C. Acknowledgments

The following QA Working Group and Interest Group participants have contributed significantly to the content of this document:

The following participants in an OASIS Technical Committee and other outside efforts have contributed to earlier versions of XML vocabularies for describing test cases:

D. Change history

2003-10-13, WD for QAWG to discuss