From W3C Wiki
This page collects and categorizes information about goals and requirements for the common W3C test-suite framework for browser testing. [SAZ: afaik, the scope should not be limited to browser alone][JS: ATAG 2.0 and UAAG 2.0 will need to test authoring tools and media players in addition to browsers]
Test-case serving (Web server)
The following are requirements for any mechanisms for serving test cases over the Web.
Server separate from w3.org that can run server-side scripts (e.g. PHP / Python), is backed by Mercurial or Git, and allows a decent amount of freedom in configuring details via .htaccess, e.g. full control over media types and charsets.
XMLHttpRequest, CORS, EventSource, HTML5, Widgets WARP will all need a setup like this.
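As a rough illustration of what "full control over media types and charsets" could mean in practice, here is a minimal Python sketch of a handler that picks the Content-Type header per test path. The paths, header values, and the names `HEADER_OVERRIDES`, `content_type_for` and `TestCaseHandler` are hypothetical and not part of any existing W3C setup; a real deployment might do the same thing with .htaccess rules instead.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical per-path overrides; a test author would declare these.
HEADER_OVERRIDES = {
    "/tests/utf16.html": "text/html; charset=utf-16",
    "/tests/legacy.txt": "text/plain; charset=windows-1252",
}
DEFAULT_TYPE = "application/octet-stream"


def content_type_for(path: str) -> str:
    """Return the Content-Type value requested for this test path."""
    return HEADER_OVERRIDES.get(path, DEFAULT_TYPE)


class TestCaseHandler(BaseHTTPRequestHandler):
    """Serves a placeholder body with the configured media type and charset."""

    def do_GET(self):
        body = b"placeholder test body"
        self.send_response(200)
        self.send_header("Content-Type", content_type_for(self.path))
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

The handler could be passed to `http.server.HTTPServer` to try it locally; the sketch deliberately does not start a server itself.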
On 14-Feb-2011, PLH announced the following are "live" e.g. for testing:
See also Dom's 17-Feb-2011 clarifications re the constraints of these hosts e.g. PHP only.
To test the WebSocket protocol and its client API we would need to install e.g. http://code.google.com/p/pywebsocket/ on top of the basic server and run it.
Might need something else: http://code.google.com/p/pywebsocket/issues/detail?id=65 Ideas?
To test browser security more completely when it comes to HTTP you need the following:
- Different domains, e.g. http://foo.example.org vs http://bar.example.org, but also http://example.org vs http://example.invalid (different as far as http://publicsuffix.org/ is concerned)
- Different ports, e.g. http://example.org:80 vs http://example.org:81
- HTTPS with Extended Validation
- HTTPS with an invalid certificate
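The domain and port cases above come down to whether two URLs share an origin (scheme, host, port). A minimal Python sketch of that comparison follows; the function names are illustrative, and the sketch does not consult the public suffix list, so it cannot tell whether two hosts share a registrable domain.

```python
from urllib.parse import urlsplit

# Default ports used when a URL does not state one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str):
    """Return the (scheme, host, port) triple for a URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(a: str, b: str) -> bool:
    """True when both URLs have the same scheme, host and port."""
    return origin(a) == origin(b)
```

For example, `same_origin("http://example.org:80/", "http://example.org:81/")` is false because only the port differs, which is the kind of pair a security test would need the server to provide.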
Test-case execution (in-browser/client)
The following requirements are not categorized yet:
- allow tests to be created based on smaller tests (this would allow one action to be repeated several times within the same test) (detect failure under certain conditions)
- allow testing of error handling
- allow testing of time based information (SVG animation, HTML video)
- allow more than one way to test functionality:
- allow direct contributions to the test-runner and framework code from external individuals or entities
And here are some possible considerations:
- Tests that require a top level browsing context
A test must NOT depend on the test runner used to run a set of tests. A test may be able to generate its result automatically (such as Script test) or not (such as Self describing test). If it is automatic, it is the responsibility of the test to report its result to the test runner above it. Otherwise, it is the responsibility of the test runner to gather the result from an alternate source (such as a human).
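The division of responsibility described above can be sketched as follows. This is only an illustration in Python; the class `Test`, the function `collect_result` and the `ask_human` callback are hypothetical names, not part of any existing runner.

```python
from typing import Callable, Optional

PASS, FAIL = "pass", "fail"


class Test:
    """A test that may or may not be able to determine its own result."""

    def __init__(self, name: str,
                 automatic_check: Optional[Callable[[], bool]] = None):
        self.name = name
        self.automatic_check = automatic_check

    def run(self) -> Optional[str]:
        # An automatic test reports its own result; otherwise return None.
        if self.automatic_check is None:
            return None
        return PASS if self.automatic_check() else FAIL


def collect_result(test: Test, ask_human: Callable[[str], str]) -> str:
    """Runner side: use the test's own report, else fall back to a human."""
    result = test.run()
    if result is None:
        result = ask_human(test.name)
    return result
```

Note that `Test` knows nothing about the runner: any runner that calls `run()` and supplies its own fallback would work.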
aka human or manual tests.
This is the most basic level. A file (or more) is displayed and a human indicates whether the test passed or failed. Ideally, we should avoid these types of tests as much as possible since they require a human to operate. Some folks want to have a comment field as well.
Plain text output
This is equivalent to doing saveAsText on two files and comparing the output.
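A minimal Python sketch of such a comparison, assuming the two text dumps are already available as strings (how they are obtained from the browser is outside this sketch, and `text_outputs_match` is a hypothetical name):

```python
import difflib


def text_outputs_match(actual: str, expected: str):
    """Compare two plain-text dumps; return (passed, diff_lines)."""
    diff = list(difflib.unified_diff(
        expected.splitlines(),
        actual.splitlines(),
        fromfile="expected",
        tofile="actual",
        lineterm="",
    ))
    # unified_diff yields nothing when the inputs are identical.
    return (not diff, diff)
```

Returning the diff lines alongside the verdict gives a human something to inspect when the test fails.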
Two pages are displayed and the rendered pages are compared for differences.
For comparison, we might be able to use HTML5 Canvas, or an extension to get screenshots. The worst-case scenario is to use a human to compare the rendered pages.
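Once two screenshots exist as pixel data, the comparison itself is simple. The Python sketch below assumes each image is a list of rows of (R, G, B) tuples; capturing the screenshots is the hard, browser-specific part and is not shown. The function name is hypothetical.

```python
def count_differing_pixels(image_a, image_b) -> int:
    """Count pixels that differ between two equally sized images.

    Each image is a list of rows; each row is a list of (R, G, B) tuples.
    """
    same_height = len(image_a) == len(image_b)
    same_widths = all(len(ra) == len(rb) for ra, rb in zip(image_a, image_b))
    if not (same_height and same_widths):
        raise ValueError("images have different dimensions")
    return sum(
        pixel_a != pixel_b
        for row_a, row_b in zip(image_a, image_b)
        for pixel_a, pixel_b in zip(row_a, row_b)
    )
```

A result of zero would count as a pass; a real harness might instead allow a small tolerance for anti-aliasing differences.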
compare equivalent pages
(@@ through screen shots?) Not sure how this one differs from the one above...
Some engines could dump their in-memory view/layout model, i.e. the one directly affecting the rendering.
The test result is established through scripting:
We're looking at using testharness.js for those. Note that it doesn't preclude human intervention sometimes, such as authorizing geo information, pressing a key or a button, etc.
The test runner (see diagram) is responsible for running a series of tests and gathering the results for all of them.
- Loads tests automatically based on test manifest files containing metadata about test uri, type, etc.
- Allows all or a subset of tests to be run [SAZ: selection of subset is based on the metadata describing the test; for instance, to select all tests that apply to a certain feature, element, or other aspect of the test]
- allow tests to be run in random order and repetitively (detect failure under certain conditions)
- allow the test suite to be run on multiple platforms (mobiles, Windows, Mac OS, Ubuntu)
- Output the results in some way (XML, json, database?)
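The runner behaviour listed above can be sketched in Python as follows. The manifest layout (a JSON array with `uri`, `type` and `features` keys), the sample entries, and all function names are assumptions made for illustration; no manifest format has been agreed on this page.

```python
import json
import random

# Hypothetical manifest; the keys are illustrative, not an agreed format.
MANIFEST_JSON = """
[
  {"uri": "dom/append.html", "type": "script", "features": ["dom"]},
  {"uri": "css/float.html", "type": "reftest", "features": ["css"]},
  {"uri": "dom/events.html", "type": "script", "features": ["dom", "events"]}
]
"""


def load_manifest(text: str):
    """Parse the manifest into a list of test descriptions."""
    return json.loads(text)


def select(tests, feature=None):
    """Return all tests, or only those tagged with the given feature."""
    if feature is None:
        return list(tests)
    return [t for t in tests if feature in t["features"]]


def run_all(tests, execute, seed=None, repeat=1):
    """Run tests in random order, `repeat` times; return result records."""
    rng = random.Random(seed)
    results = []
    for round_number in range(repeat):
        order = list(tests)
        rng.shuffle(order)
        for test in order:
            results.append({
                "uri": test["uri"],
                "round": round_number,
                "result": execute(test),
            })
    return results


def to_json(results) -> str:
    """Serialize results; JSON is one of the output options listed above."""
    return json.dumps(results, indent=2)
```

Passing a fixed `seed` makes a random order reproducible, which helps when a failure only shows up under a particular ordering.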
List of known test runners:
- Allows manual tests to be run by humans, i.e. has pass/fail/unknown buttons.
- Allows reftests to be run by humans, e.g., by automatically switching between test view and ref view several times per second and asking the user if they see flickering (automatic running of reftests will require browser-specific code and is explicitly out of scope) [SAZ: does not have to be automatic switching -- could also have the user manual switch between the views to compare the outputs]
- [SAZ: allow automatic and manual gathering of context information, such as the browser version, OS platform, and relevant configuration settings and assistive technology if applicable]
- [JS: Allows a text comment field for human evaluator notes (e.g test conditions, failure notes) on the individual test result that can be included in the reporting. E.g. they might write: "the authoring tool implements this SC with a button that automatically sends the content being edited to the XXX Checker accessibility checking service".]
Most test review in many working groups is currently done informally via a mailing list. This doesn't work so well, especially for large test suites. Maybe there is an existing tool that can help us here.
- provide a mechanism to review tests without putting a Working Group on the critical path for every single test [SAZ: the work of the WCAG 2.0 Test Samples Development Task Force (TSD TF) included the development of a review process that allowed the Task Force to pre-review tests yet allow the Working Group to make the final decision]
- allow an easy way to submit a test. A Web author should be able to submit a test to the W3C. See also Policies for Contribution of Test Cases to W3C
- allow anyone to easily give feedback on tests, not just named reviewers or people with W3C accounts
- reasonable Mercurial integration
- allow management of 100,000 or more tests per spec
- track the state of a test (under review, (approved, rejected))+
- associate issues, action items, mailing list threads to tests (integration with W3C tracker?)
- allow stable dated release of test suites. @@version control per test?
- track the state of a test suite (use case: browser vendors want to track changes to test suite in order to stay in sync)
- Produce a machine-readable report in some format (could be current XML or some other possibly non-XML format). [SAZ: the Evaluation and Report Language (EARL) provides a machine-readable format for expressing test results (in RDF but with an XML serialization)]
- output should be reusable by other applications (such as validators? [SAZ: yes, and accessibility checkers etc.]) or in answering questions such as "is feature X supported on Browser 4.3? What does Browser 4.3 support?"
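The "track the state of a test" requirement in the list above could be modelled as a small state machine. In this Python sketch, the assumption that an approved or rejected test can go back under review (one reading of the "+" in that requirement) and the names `ALLOWED` and `ReviewRecord` are illustrative only.

```python
# Assumed transitions: a decision can be reopened, e.g. after a spec change.
ALLOWED = {
    "under review": {"approved", "rejected"},
    "approved": {"under review"},
    "rejected": {"under review"},
}


class ReviewRecord:
    """Tracks the review state of one test, with its full history."""

    def __init__(self, test_id: str):
        self.test_id = test_id
        self.state = "under review"
        self.history = ["under review"]

    def move_to(self, new_state: str) -> None:
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"cannot go from {self.state!r} to {new_state!r}")
        self.state = new_state
        self.history.append(new_state)
```

Keeping the history rather than only the current state would let a vendor see when and how often a test changed status.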
Test-case spec annotations
In order to know which areas of a spec are well-tested and hence have a sense for (an upper bound on) the completeness of the testsuite as well as the areas where it would be most profitable to direct new testing effort, it would be beneficial to produce an annotated version of the spec that associates each testable assertion in the spec with a link to one or more test cases for that assertion. Requirements:
- Map each test onto a piece of spec
- Fine grained definition of "piece"; some sections are long and contain many normative requirements so paragraph-level is probably the minimum useful level
- Good behaviour in the face of spec modifications, deletions, insertions and rearrangements.
[SAZ: the work of the WCAG 2.0 Test Samples Development Task Force (TSD TF) included a metadata format that is based on the more elaborate Test Case Description Language (TCDL)]
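One way to picture these requirements is a mapping from tests to identifiers of spec pieces. The Python sketch below assumes each paragraph-level piece of the spec has a stable anchor id, which is itself an assumption; the function name and sample ids are hypothetical.

```python
def coverage(spec_anchors, test_map):
    """Relate tests to spec pieces.

    spec_anchors: ordered list of anchor ids currently in the spec.
    test_map: dict mapping test name to the anchor ids it claims to cover.
    Returns (covered, untested, orphaned):
      covered  - anchor id -> list of tests covering it
      untested - anchor ids with no test
      orphaned - test name -> anchors that no longer exist in the spec
    """
    covered = {anchor: [] for anchor in spec_anchors}
    orphaned = {}
    for test, anchors in test_map.items():
        for anchor in anchors:
            if anchor in covered:
                covered[anchor].append(test)
            else:
                orphaned.setdefault(test, []).append(anchor)
    untested = [a for a in spec_anchors if not covered[a]]
    return covered, untested, orphaned
```

The `untested` list points at where new testing effort would pay off, while `orphaned` flags tests whose target was deleted or renamed by a spec edit. Because lookups use ids rather than positions, rearranging the spec does not break the mapping.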
Requirements not yet categorized
- be intended for CR and post CR phases. The test suite should be suitable to evaluate if the spec is implementable, but it should also be used to promote interoperability
- allow the test suite to be run by external entities or individuals (it may be that the test suite can only be run under specific conditions) (should it be available as a W3C widget to facilitate deployment on mobiles?)
- allow simple (eg testing the value of an attribute) or complex tests (eg acid or stress tests) to be part of the test suite
- allow a test to cover multiple specifications and sections of specifications
- be suitable for HTTP 1.1, HTML5, CSS 2.1, CSS 3, ES5, Web APIs (HTML DOM, DOM L2, Selectors, Geolocation, XHR, etc.), MathML 1.0, SVG 1.1, Web sockets Protocols, etc.
- allow testing of different layers: network (HTTP, low bandwidth/latency, server throttling), syntax, DOM, layout model, rendering
- ideally, the browser vendors should help us get what we need to run the tests on their products.
- allow for multiple test licenses
- How can the framework help ensure the completeness of a test suite with regards to a particular specification?
- regroup a set of existing tests from different sources (DOM, CSS, SVG, HTML, etc.). Can we create a test runner to run them all? Is it possible to convert them?
- regroup the set of metadata needed/provided in the existing testing framework/tests.
Existing work to consider:
- CSS test suite (MWI testing framework (self describing and DOM testing), metadata associated with tests, test format, submission process, review process)
- SVG test suite
- jQuery test suite (DOM testing)
- Selectors API test suite
- webkit test suite (testing methods, metadata associated with tests, test format)
- mozilla test suite (testing methods, metadata associated with tests, test format)
- QA Framework: Test Guidelines
- QA taxonomy of tests
- Conformance Test Suites for mobile web technologies
- Mobile Web Test Suites Working Group
- WHATWG Test suite
- Windows Internet Explorer Testing Center
- Browser Tests
that one is highly interesting. Looks like the guy is trying to do what we need for pixel comparison
- <canvas> tests
- HTML 4.01 Test Suite - Assertions
- Brad Pettit presentation during a DOM face-to-face meeting
- Design Notes for a Test Review System
- Test Swarm
- SVG Conformance Test Suite
- Watir and FireWatir. Watir allows one to automate tests by driving browsers the same way people do: it clicks links, fills in forms, and presses buttons. Watir also checks results, such as whether expected text appears on the page. Unfortunately, it does not seem to provide screenshot facilities. No support for Safari on Windows?
- Browserscope is a community-driven project for profiling web browsers. The goals are to foster innovation by tracking browser functionality and to be a resource for web developers.
- WCAG 2.0 Test Samples include a metadata format for describing the tests and a review process to review the tests without bottlenecking a Working Group, yet comply with the W3C Process
- W3C Evaluation and Report Language (EARL) 1.0 is a machine-readable format for expressing test results from quality assurance reviews (including accessibility) -- this can be used to output test results