SVG and WebCGM Test Suites
Lofton Henderson
Prepared for: W3C QA Workshop
NIST (Gaithersburg), 3-4 April 2001
Latest Revision: April 1, 2001
Introduction
Over the past year, we have finished work on a preliminary SVG conformance
test suite, and work is in progress on a test suite for WebCGM 1.0. Some of
the issues and methodologies of these projects are unique to graphics, but
much is common to conformance work for a wider range of specifications. We
will present the principles and methodologies, summarize the test suite
contents, identify issues and shortcomings, and extract lessons which might be
useful for other conformance work.
In interpreting the material in this paper, it is useful to know:
- The SVG test suite was constructed from scratch within the SVG Working
Group over a period of 12 months, from a late and "stable" working draft
through two CR draft public releases.
- The WebCGM test suite is being constructed by adapting extensive
pre-existing CGM materials in the NIST-ATA test suite for the ATA
GRexchange 2.4 profile of CGM, and new materials are being designed and
added as needed for the significant dynamic functionalities of WebCGM
1.0.
Characteristics of the Suites
Focus - Conformance of What?
In the domain of graphics standards, conformance work has focused on three
areas:
1. conformance of graphics format file instances;
2. conformance of graphics format generators;
3. conformance of graphics format interpreters and viewers.
Both the SVG and WebCGM projects focused on #3, viewer conformance.
Note. There is an existing WebCGM instance validator, MetaCheck, that is very
complete. There is no complete SVG instance validator, other than various XML
tools that validate against the DTD. There has been a lot of activity, but not
much useful result, on the notoriously difficult topic of generator
conformance.
Principal Purposes
For both the SVG and WebCGM projects, these principal purposes were
agreed:
- provide a self-assessment tool for implementation builders;
- help implementation builders achieve interoperability of implementations;
- help users assess the fidelity and completeness of implementations to the
respective (SVG or WebCGM) specification.
Although there have been other benefits, such as improvements to the
standards (Recommendations) themselves, these were not principal goals at the
start of the work.
Non-Goals
It is not a goal of either project to build:
- a certification suite, or to establish a certification service. In
general, this requires much more rigor, formality, and "defensibility" in the
test suite materials.
- "goodness" tests, i.e., tests to measure such parameters as viewer
performance, optional non-normative features, etc.
- a demo suite, although one effect of a thorough conformance test suite
is to demonstrate all of the functionality of the standard.
(Note. Demo suites tend to be more for the marketing of the standard or
products, prioritizing the attractive and entertaining, and generally
sacrificing some testing principles such as "atomicity". We make this point
because of an audience suggestion at the March 2001 W3C Technical Plenary,
"...should contain lots of realistic, typical, legal files.")
Methodology
Overview
There are two meanings to "methodology" in the context of building these
graphics test suites:
- the contents of the suite, and how it approaches the testing of
viewers;
- the process of building the suites.
(Note. Methodology could also refer to the methodology of applying the
tests, a question that becomes particularly interesting in the context of a
certification service. But for the most part, that is outside the scope of
this paper.)
Kinds of Tests in the Suites
Extending prior work, in the SVG project we further developed and defined a
notion of progressive testing. Progressive is used both in the sense of a
logical order in which to expose a viewer to tests, and a sensible order in
which to develop test materials.
- Basic Effectivity (BE) - verify rudimentary capability across all
functional areas;
- Detailed (DT) - comprehensively probe all testable assertions.
The SVG suite was initially scoped as a BE-plus-DT (full BE, substantial
DT) project, but the scope was narrowed to BE in the face of higher than
expected labor to build the tests and infrastructure. BE level is the defined
scope of the WebCGM suite, although some pre-existing content adapted for the
WebCGM suite more closely resembles DT tests.
- Error Tests (ER) - test viewer adherence to normative specifications in
the standard for handling of erroneous content.
While ER tests are applicable and potentially in the scope of the SVG work,
no ER tests are yet built. ER tests are inapplicable to WebCGM, as there are
no normative specifications of error response - WebCGM defines conforming
viewer behavior on conforming content.
Notwithstanding the disclaimer about demo test suites, in both projects we
identified that a few demo file instances would be desirable. These could be
considered a "BE test of combinations" (going beyond the testing
principle of "atomicity", which is to identify one atomic functionality and
test it in isolation):
- Demo (DM) - "real world" file instances, from SVG or WebCGM generator
products, ideally complex and not hand-crafted.
Test Suite Contents
Both the SVG and WebCGM suites contain:
- a collection of Test Case (TC) instances;
- for each TC instance, a Reference Image which illustrates the expected
result (a correct rendering of the content);
- an Operator Script, which describes how to run the test, what constitutes
pass/fail ("Verdict Criteria"), and a verbal description of the graphical
content;
- an XML database describing the Test Cases;
- one or more harnesses, for organizing the presentation of the materials
and navigation through the suite.
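As a rough illustration of the per-test XML description idea, here is a minimal sketch of one entry and a reader for it. All element and attribute names below are invented for the sketch; neither suite's actual schema is reproduced here.

```python
# Hypothetical sketch of one test-case description entry; the real
# suites' schemas differ -- all names below are invented.
import xml.etree.ElementTree as ET

SAMPLE = """
<testcase id="paint-fill-BE-01" kind="BE" revision="1.2">
  <purpose>Verify rudimentary solid-fill capability.</purpose>
  <operator-script>
    Load the test. PASS if the square is rendered solid red.
  </operator-script>
  <reference-image href="paint-fill-BE-01.png"/>
</testcase>
"""

def load_testcase(xml_text):
    """Parse one test-case description into a plain dict."""
    tc = ET.fromstring(xml_text)
    return {
        "id": tc.get("id"),
        "kind": tc.get("kind"),          # BE / DT / ER / DM
        "revision": tc.get("revision"),  # serialization interlock
        "purpose": tc.findtext("purpose").strip(),
        "script": tc.findtext("operator-script").strip(),
        "reference": tc.find("reference-image").get("href"),
    }

tc = load_testcase(SAMPLE)
```

A reader like this is all a harness generator needs in order to emit one linked page per test case.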
The following table compares some details of the two suites.
Comparison of SVG & WebCGM Test Suite Contents

Test Cases
  SVG:    127 BE test cases (complete BE suite).
  WebCGM: ~230 existing BE/DT static-graphics tests; ~25 additional
          (estimated) for dynamic BE tests.

Reference Images
  SVG:    PNG raster images, 450x450.
  WebCGM: GIF in the existing static tests, 1000x1000 (to be converted
          to PNG); PNG for new tests.

Operator Scripts
  SVG:    Prose descriptions of the test purpose, the expected visual
          result, and what deviations are permissible for "pass".
  WebCGM: Existing static: terse operator instructions and checkpoints
          for certification testing. New dynamic: more descriptive
          (like SVG).

Test Harnesses
  SVG:    4 different harnesses for different viewer types. HTML or SVG
          linked pages, one per test case, which present the reference
          image, Operator Script, rendered content, and links through
          the test suite. Generated from the XML database via XSLT.
  WebCGM: Existing: a single HTML frameset with pull-down forms for
          test case navigation, a button to view the reference image,
          and the Operator Script presented in the right frame.
          Modifications: add a button to access the test itself.
          Generated from the XML base via a JavaScript-DOM program.

XML database
  SVG:    One XML description file per test case, including the
          Operator Script and identification of link neighbors.
  WebCGM: A single XML file with TestCase elements, each of which
          contains test purpose, version information, operator script,
          etc.
You can see sample content of each test suite in the respective references.
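The harness-generation step described above can be sketched very simply. The real suites use XSLT (SVG) or a JavaScript-DOM program (WebCGM); the Python string-template version below only illustrates the idea, and the file names, fields, and page layout are invented.

```python
# Sketch of harness-page generation from a test-case database.
# The actual suites use XSLT (SVG) or JavaScript-DOM (WebCGM);
# this Python stand-in only illustrates the one-page-per-test idea.
TESTS = [  # stand-in for the XML database, with link neighbors
    {"id": "paint-fill-BE-01", "prev": None, "next": "paint-stroke-BE-02"},
    {"id": "paint-stroke-BE-02", "prev": "paint-fill-BE-01", "next": None},
]

PAGE = """<html><body>
<h1>{id}</h1>
<img src="{id}.png" alt="reference image">
<object data="{id}.svg" type="image/svg+xml"></object>
<p>{nav}</p>
</body></html>"""

def harness_page(tc):
    """Render one linked harness page, with prev/next navigation."""
    nav = " | ".join(
        '<a href="{0}.html">{0}</a>'.format(t)
        for t in (tc["prev"], tc["next"]) if t
    )
    return PAGE.format(id=tc["id"], nav=nav or "(no neighbors)")

pages = {t["id"] + ".html": harness_page(t) for t in TESTS}
```

Because the pages are generated rather than hand-written, renaming a test or inserting a new one only requires regenerating from the database.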
How they Were Built
Idealized Process
The following idealized process for test suite construction has been widely
applied in conformance suite work, graphics and otherwise.
1. Analyze the standard (Recommendation) and extract all testable
assertions - the Test Requirements (TRs).
2. Synthesize, and associate with the TRs, a set of Test Purposes (TPs).
3. Write and implement a set of Test Cases (TCs) which realize the Test
Purposes.
Content Guidelines
The reference documents for the two projects
discuss in some detail the guidelines and principles we followed in
generating the actual test cases: atomicity, consolidation for conciseness,
self-documentation, etc. We emphasize one principle in particular,
because it is among the most important and, at the same time, proved to be
one of the most problematic:
- Traceability. A test must be traceable back to a statement or statements
in the standard's text (Recommendation text).
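The TR-to-TP-to-TC chain, with the traceability principle applied, can be pictured as linked records. The sketch below is a minimal illustration; the record fields, identifiers, and the sample specification citation are all invented.

```python
# Minimal sketch of traceable TR/TP/TC records; all names invented.
from dataclasses import dataclass

@dataclass
class TestRequirement:       # a testable assertion from the spec
    tr_id: str
    spec_section: str        # trace-back to the Recommendation text
    assertion: str

@dataclass
class TestPurpose:           # synthesized from one or more TRs
    tp_id: str
    tr_ids: list

@dataclass
class TestCase:              # realizes one or more TPs
    tc_id: str
    tp_ids: list

def trace_back(tc, purposes, requirements):
    """Return the spec sections a test case traces to."""
    tp_index = {tp.tp_id: tp for tp in purposes}
    tr_index = {tr.tr_id: tr for tr in requirements}
    sections = []
    for tp_id in tc.tp_ids:
        for tr_id in tp_index[tp_id].tr_ids:
            sections.append(tr_index[tr_id].spec_section)
    return sections

trs = [TestRequirement("TR-7", "sect 11.3", "fill paints interior")]
tps = [TestPurpose("TP-4", ["TR-7"])]
tc = TestCase("paint-fill-BE-01", ["TP-4"])
print(trace_back(tc, tps, trs))   # -> ['sect 11.3']
```

With explicit links in both directions, a test case can always be justified by a statement in the standard, and coverage gaps become queryable.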
Actual Processes & Results
Here is a summary comparison of the two suites:
SVG & WebCGM Process Details

TR extraction
  SVG:    For BE tests: implicit and informal - read each chapter and
          identify the major functional components which should be
          touched by a BE test.
  WebCGM: For the existing BE/DT static-graphics tests: nothing done
          originally, and it won't be retrofitted because of cost.
          New dynamic tests: TR extraction done (into an HTML table
          and XML database) - see the WebCGM reference for the full
          TR set.

TP synthesis
  SVG:    For BE tests: implicit and informal (note that the SVG
          reference does contain a bibliographic reference to a
          sample formal TR/TP process for DT-level tests of the
          'path' operator).
  WebCGM: Existing BE/DT static-graphics tests: nothing done.
          New dynamic tests: TP synthesis done (into an HTML table
          and XML database) - see the WebCGM reference for the full
          TP set.

TC instance
  SVG:    Hand edit in a standard text editor.
  WebCGM: Existing static-graphics tests: apply global changes to
          adapt to WebCGM, by hand-editing ClearText and converting,
          or by MetaWiz script. New dynamic: hand edit new ClearText
          and convert, or construct in MetaWiz.

Reference image
  SVG:    Direct SVG rasterization to PNG from implementations; or
          screen capture and SaveAs PNG; or ... See the SVG reference
          for details.
  WebCGM: Existing static-graphics tests: convert the existing GIF
          files to PNG. New dynamic: as SVG (note also that the
          reference image will sometimes be HTML). See the WebCGM
          reference for details.

Operator Script
  SVG:    One XML description file per test case, including the
          Operator Script and identification of link neighbors.
  WebCGM: A single XML file with TestCase elements, each of which
          contains test purpose, version information, operator
          script, etc.

Repository
  SVG:    CVS on a centralized server, with R/W access for authorized
          test suite contributors.
  WebCGM: Simple disk cache/repository.

Public access
  SVG:    ZIP archive release of the CVS repository to the public at
          reasonable intervals. Also browsable/executable online.
  WebCGM: TBD (but likely similar to SVG).

Serialization
  SVG:    Automated via the CVS $Revision$ keyword.
  WebCGM: Manual, or via an automated serialization feature built
          into MetaWiz.
Notes on table:
- In some of the "dynamic" WebCGM tests (esp. CGM-to-HTML navigation
tests), the reference image (expected result) will not be a picture, but
rather a browser snapshot of some presented HTML.
- "MetaWiz" refers to a Windows tool that provides a drag-and-drop
interface for CGM test case generation, featuring a "meta-CGM" language
with looping, includes, and other useful control structures.
- See the serialization description in the "Lessons"
section.
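The CVS-based serialization in the table can be sketched as follows: CVS expands the $Revision$ keyword inside a committed test file, and a checker compares that revision against the one recorded when the reference image was captured. The metadata layout below is invented for illustration; only the keyword-expansion behavior is CVS's.

```python
# Sketch of a test-case / reference-image version interlock.
# CVS expands the $Revision$ keyword inside committed files; the
# reference-image metadata mapping is invented for illustration.
import re

TEST_FILE_TEXT = '<!-- $Revision: 1.4 $ -->\n<svg>...</svg>'

# Hypothetical record of which test-file revision each reference
# image was captured from.
REFERENCE_IMAGE_REVISION = {"paint-fill-BE-01.png": "1.4"}

def file_revision(text):
    """Extract the CVS-expanded revision number, or None."""
    m = re.search(r"\$Revision:\s*([\d.]+)\s*\$", text)
    return m.group(1) if m else None

def image_is_current(image_name, test_text):
    """True iff the reference image matches the test file version."""
    return REFERENCE_IMAGE_REVISION.get(image_name) == file_revision(test_text)
```

A check like this can run mechanically over the whole suite, catching the easily-overlooked case where a test file was edited but its reference image was never regenerated.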
Shortcomings
There are some things we would do differently or better, and things we
should have done but didn't.
- traceability - this should be a key requirement in any test suite. It
wasn't done in SVG (though the negative impact may be mitigated by the BE
nature of the suite). It wasn't done in the existing static-graphics BE/DT
tests for WebCGM. It certainly should be done for any DT tests in either
project. It will be done in the new dynamic tests for WebCGM.
- too much manual effort (more automation needed), especially in the TR
extraction and traceability implementation.
- imprecise visual methods - in graphics test suites, the pass/fail
criterion is largely visual, and is easily subject to operator error.
Lessons Learned & Issues Identified
1. The process of building test suites confers tremendous benefit on the
standard (the Recommendation itself). SVG was done during standardization
and led to many changes; WebCGM was done after the fact, and is leading to
numerous defect corrections.
2. Get started early in the standardization cycle - ideally at the first
"stable" Working Draft.
3. Companion to #2: beware the pitfalls of working against early, unstable
specifications.
4. Not only does early conformance work detect ambiguities and defects in
the standards, it also forces consideration of "fuzzy" conformance
statements, optional features, "recommended" behaviors, and the like, all
of which are inimical to the goal of building a cadre of strongly
interoperable applications and implementations of the standards.
5. For the graphics test suites, a full DT suite is probably something like
10 times the number of test cases of the BE suite.
6. Opinion. The greatest value for effort invested arguably comes from the
BE suite. There is probably something like a 90-10 rule here: 90% of the
benefit (to implementations, the standard, etc.) from 10% of the test
materials (a comprehensive BE-level suite).
7. #6 notwithstanding, a full DT suite is essential for guiding and
enforcing completely interoperable implementations (aside: in such
mission-critical application areas as aircraft maintenance manuals, "98%
interoperable" is not good enough).
8. The processes we described are labor-intensive. More automation is
needed.
9. Especially labor-intensive are the TR/TP phases and the construction of
trace-back. The standards documents typically don't facilitate this.
(Note. The OASIS XSLT/XPath Conformance TC has considered this issue and
has designed some interesting labor-saving and error-reducing
methodologies.)
10. Labor requirements can be reduced by leveraging existing test suites
and QA materials (e.g., from members), but this introduces a new set of
problems: retroactive quality screening of large numbers of tests for
correctness, retrofitting traceability features, etc. (Again, the
XSLT/XPath Conformance TC has experience here.)
11. TR extraction often involves interpretation, paraphrasing, or synthesis
of the text of the specification; i.e., the TR is sometimes not stated
explicitly. We are not sure whether this is always avoidable.
12. Social comment. While the WG definitely should play a role in the
building of test suites, the WG members who are willing to invest much
effort in test suite construction comprise a significant minority - the
prevailing view (I believe) is that the WG is an arena for technical
invention.
13. Resources. Be prepared to spend at least one person-year of labor, even
for a BE suite. For DT, be ready for 1-1/2 to 4 years, or more. See the
SVG reference document for a cursory survey of several efforts, graphical
and non-graphical.
14. Serialization. It is imperative to provide a versioning interlock
between the test case instance and the "expected result" (the reference
image, for graphical suites), so that it is always clear whether or not a
reference image came from a particular test case file version. Automation
is important here - manual serialization is tedious and easily overlooked.
15. Issue. An interoperability conformance suite might (should?) test
features in the standard which are optional or recommended, whereas a
strict certification suite would not.
16. Visual comparison of rendered content with the Reference Image
(expected result) might be adequate at the BE level. However, it is both
imprecise and labor-intensive. In graphics, this is a difficult technical
issue, and some attractive ideas such as "XOR the images" have significant
problems. (Reliably deterministic automated methods to declare pass/fail
may be unattainable; however, we think that some automated techniques that
provide indicative aids to manual inspection might be attainable.)
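As one example of such an indicative aid (not a pass/fail oracle), a tool could report the fraction of pixels that differ beyond a small tolerance, which avoids the main weakness of a plain XOR. The pixel representation below is a toy stand-in for illustration; neither suite's actual method is shown, and a real tool would decode PNG images.

```python
# Toy sketch of an indicative image-difference aid.  Images are
# represented as equal-sized lists of gray levels; a real tool would
# decode PNGs.  A plain XOR flags every anti-aliasing difference, so
# a small per-pixel tolerance is applied instead.
def differing_fraction(img_a, img_b, tolerance=8):
    """Fraction of pixels whose values differ by more than tolerance."""
    assert len(img_a) == len(img_b), "images must be the same size"
    differing = sum(1 for a, b in zip(img_a, img_b) if abs(a - b) > tolerance)
    return differing / len(img_a)

rendered  = [0, 0, 250, 255, 128, 10]
reference = [0, 4, 255, 255, 120, 200]
score = differing_fraction(rendered, reference)
# Only the last pixel differs beyond the tolerance (|10 - 200| > 8),
# so score is 1/6; a human operator then inspects high-scoring tests.
```

Such a score only ranks tests for human attention; the verdict itself remains a manual judgment against the Operator Script.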
References
You can find much more detail about the SVG and WebCGM projects, including
extensive bibliographies, in:
The SVG test suite itself is available online from the Web page: