How do we build a test suite for validators and other testing tools? What are the requirements and use cases? How do we manage the additional levels of complexity? For now, these questions are applied to the MarkupValidator.

Requirements for a Test suite for MarkupValidator

Use cases

We want a system that allows, in order of priority:

a large (enough) collection of tests
(very easy) contribution of a test case with, ideally, each bug report
testing of several instances of the validator + comparison of different instances + comparison with expected result (so a fixed list of validator-uri is a no-no)
classification of tests, with the possibility to run only a subset of the suite (a possible application of a test case management system, see QaTools)

Architecture / Systems involved

We are at a stage where the architecture of the Markup Validator is changing from monolithic to modularized, and the test suite will have to follow this evolution. Regardless, we need a working test suite as soon as possible with the current architecture.

Usage scenario

What would the system do (an application of the TestSuiteArchitecture):

read list of cases
filter (keep only subset we want to test)
run the test tool on each test case
get outcome (EARL format?)
scrape results (in whatever output we want to test, usually XHTML I suppose)
compare test result with expected outcome
compare test result with known results from other instances
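The steps above can be sketched as a small driver loop. This is only an illustration under stated assumptions: `Case`, `run_suite`, the tag-based filter, and the outcome strings are hypothetical names, and each validator instance is modelled as a plain callable so that the list of instances is never hardcoded. A real runner would call the instance over HTTP and scrape or parse its output.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Iterable, List


@dataclass(frozen=True)
class Case:
    """Hypothetical minimal test case: a document plus what we expect of it."""
    uri: str               # document under test
    tags: FrozenSet[str]   # classification labels, used for filtering
    expected: str          # expected outcome, e.g. "valid" or "invalid"


def run_suite(
    cases: Iterable[Case],
    wanted_tags: FrozenSet[str],
    instances: Dict[str, Callable[[str], str]],
) -> List[dict]:
    """Run every selected case against every validator instance."""
    report = []
    for case in cases:
        # Filter: keep only the subset we want to test (empty set = keep all).
        if wanted_tags and not (case.tags & wanted_tags):
            continue
        # Run the test tool on the case, once per instance.
        outcomes = {name: check(case.uri) for name, check in instances.items()}
        report.append({
            "uri": case.uri,
            "outcomes": outcomes,
            # Compare each result with the expected outcome.
            "matches_expected": {
                name: outcome == case.expected
                for name, outcome in outcomes.items()
            },
            # Compare the instances with each other.
            "instances_agree": len(set(outcomes.values())) <= 1,
        })
    return report


# Stub instances standing in for real validators (assumption: no network).
cases = [
    Case("http://example.org/valid.html", frozenset({"html4"}), "valid"),
    Case("http://example.org/broken.xhtml", frozenset({"xhtml"}), "invalid"),
]
instances = {
    "prod": lambda uri: "invalid" if "broken" in uri else "valid",
    "dev": lambda uri: "valid",
}
report = run_suite(cases, frozenset({"xhtml"}), instances)
```

Passing the instances in as a mapping, rather than reading them from a fixed list, is what lets the same run compare a production validator with any number of development copies.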

Automated Testing

Testing the Monolithic Validator

See Bjoern's proposal of a Test::Builder based system.

How this could be derived (or generated) from our collection of tests, automatically or not, is yet to be sorted out.

Testing the modularized Validator

Each module will have its own test suite, also made with Test::Builder. The validator's own test suite will then merely test how we put the blocks together, plus the UI.

UI Testing

The general agreement seems to be that UI tests should be separated from the operational/feature testing.

Possible UI tests:

escaping
URIs
dialogs
proper output according to options (source tree etc.)
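As an illustration of the "escaping" item, a UI test could feed a document URI containing markup to the tool and check that the payload only appears in escaped form in the result page. The sketch below is an assumption-laden example: `payload_is_escaped` and the two `render_*` functions are hypothetical stand-ins, not the validator's actual output code.

```python
import html


def payload_is_escaped(page: str, payload: str) -> bool:
    """True if the payload appears in the page only in HTML-escaped form.

    The payload must contain at least one of & < > " ' for the check to
    be meaningful; otherwise escaped and raw forms are identical.
    """
    return payload not in page and html.escape(payload) in page


def render_safe(uri: str) -> str:
    # Hypothetical result page that escapes user-supplied text.
    return "<p>Result for " + html.escape(uri) + "</p>"


def render_unsafe(uri: str) -> str:
    # Hypothetical buggy result page that echoes user input verbatim.
    return "<p>Result for " + uri + "</p>"


payload = "http://example.org/?q=<script>alert(1)</script>"
safe_ok = payload_is_escaped(render_safe(payload), payload)
unsafe_ok = payload_is_escaped(render_unsafe(payload), payload)
```

A check like this only looks at the generated page, which fits the idea of keeping UI tests separate from the validation-engine tests.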

Test cases

Let us just pretend for a moment that a test case is a document (I have ideas to refine that, which I will explain later). For now, consider only test cases linked to the validation engine (as opposed to UI, formatting, etc., which the system should also address).

What's the expected result?

Is it the result that the production validator gives, and which, for consistency, the tested version is supposed to give? Or is it the known (or guessed) validity of the test case with respect to a given DTD? This is not such a complicated question, I guess. Once we make a clear distinction between the expected result (validity) and the usual results, it is possible to envision having the expected result "hardcoded" with the test case, and keeping former results (including the reference results from the current production tool) in EARL or whatever other format suits the purpose (EARL looks quite good).
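To make the distinction concrete, here is a minimal sketch of how one observed result could be judged against both the expected result stored with the test case and the reference result recorded from the production tool. The function name, the outcome strings, and the five labels are assumptions chosen for illustration.

```python
def classify(expected: str, reference: str, observed: str) -> str:
    """Label an observed outcome.

    expected  -- outcome hardcoded with the test case (known validity)
    reference -- outcome recorded earlier from the production validator
    observed  -- outcome given by the version under test
    """
    if observed == expected:
        # Correct now; "fixed" if production used to get it wrong.
        return "pass" if reference == expected else "fixed"
    if reference == expected:
        # Production was right and the tested version is not.
        return "regression"
    # Both are wrong: either the same old bug or a new wrong answer.
    return "known failure" if observed == reference else "changed failure"


labels = [
    classify("invalid", "invalid", "invalid"),
    classify("invalid", "valid", "invalid"),
    classify("invalid", "invalid", "valid"),
    classify("invalid", "valid", "valid"),
    classify("invalid", "valid", "error"),
]
```

Keeping the two comparisons separate means that a version which merely agrees with production is not automatically counted as correct.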

This raises the general question of what the markup validator is assessing. As of today, it does DTD-based validation, checks encoding declaration consistency, and checks some HTTP protocol points (but not all; e.g. it doesn't check that XHTML 1.1 is served with an appropriate MIME type). I guess we first need to define clearly what it does today that we want to make sure it continues doing (regression testing), and then envision a more "aggressive" approach of defining what we want it to do tomorrow (test-driven development).

Note that a test case is not necessarily just one file, but one file and how it is served (charset, media type). -- KarlDubost

Note that some discussion has started on an EARL output format for the CSS Validator. It is quite difficult to define anything other than "it passes the test 'markup is valid'". For automation, a SOAP Version 1.2 output format would be a good addition: a release could then be tested against all the unit tests very easily. -- YvesLafon

Test case description

From all the thoughts above I can start outlining how a test would be described:

URI of the tested document
tools' options (to be appended/sent in a GET/POST)
DTD used
validity wrt DTD / well formed (if XML) / properly served / ???
what is tested (?)
long description of the test case (not fond of having the document describe itself, inconvenient for empty documents for example)

(note this relates to the more general question of TestCaseMetadata)
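The outline above could map onto a record like the following. This is a sketch, not a settled format: the class and field names are hypothetical, the `media_type` and `charset` fields reflect the note that a test case includes how the document is served, and the `uri` query parameter and the `ss` option are assumptions about how a given validator instance accepts its input.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional
from urllib.parse import urlencode


@dataclass
class CaseDescription:
    """Hypothetical metadata record for one test case."""
    uri: str                                  # URI of the tested document
    dtd: str                                  # DTD used
    expected_valid: bool                      # validity wrt the DTD
    options: Dict[str, str] = field(default_factory=dict)  # tool options
    well_formed: Optional[bool] = None        # only meaningful for XML
    media_type: Optional[str] = None          # how the document is served
    charset: Optional[str] = None             # charset it is served with
    purpose: str = ""                         # what is tested
    description: str = ""                     # long description, kept outside the document

    def request_url(self, validator_uri: str) -> str:
        """Build a GET request for one validator instance.

        The instance URI is an argument, so nothing is hardcoded.
        Assumes the tool takes the document address in a 'uri' parameter.
        """
        params = {"uri": self.uri, **self.options}
        return validator_uri + "?" + urlencode(params)


case = CaseDescription(
    uri="http://example.org/t/empty.html",
    dtd="-//W3C//DTD HTML 4.01//EN",
    expected_valid=False,
    options={"ss": "1"},  # hypothetical option name
    media_type="text/html",
    charset="utf-8",
    purpose="empty document",
    description="A zero-length document served as text/html.",
)
url = case.request_url("http://validator.example.org/check")
```

Storing the long description in the record rather than in the document itself sidesteps the empty-document problem mentioned in the list.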
