The idea started with the fact that several Working Groups are trying to review the way they do testing and also to increase the number of tests they run.
The CSS Working Group was foremost in mind when it comes to testing. The Group has several documents at the Candidate Recommendation stage that are waiting for tests and testing. The HTML Working Group is starting to look into testing as well, and testing is a key component of ensuring the success of HTML 5. The specification is quite big, to say the least, and testing it is going to require a lot of work. We also have more and more APIs: within the Web Apps group, Device APIs, Geolocation, etc. The SVG Working Group has a test suite for 1.2, but they’re looking at different ways of testing as well. The MWI Test Suites framework allows two methods. One requires a human to look at the test and select pass/fail. The other is more suitable for script tests, i.e. API testing.
A bunch of us, namely Mike Smith, Fantasai, Jonathan Watt, Doug Schepers, and myself, decided to get together to discuss this and figure out how to improve the situation. We focused on three axes: test submissions, test reviews and how to run a test.
First, ideally we’d like every single Web author to be able to submit tests: when they run into a browser bug based on a specification, it should be easy for them to submit a test to W3C. The system should also allow browser vendors to submit thousands of tests at once. There is the question of how much metadata to require when submitting a test. For example, we do need to know at some point which feature or part of a spec is being tested. We should also accept as many formats as possible for tests: reftests, mochitests, DOM-only tests, human tests, etc. The important aspect here is to be able to run those tests on as many platforms and browsers as possible. A test format that can only be run on one browser is of no use to us.
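To make the metadata question concrete, here is a minimal sketch in Python of what a submission record could look like. Everything in it is an assumption for illustration: the `Submission` class, its field names and the `KNOWN_FORMATS` set are hypothetical and do not describe an existing W3C schema. The only idea taken from the text is that a test can be accepted with very little information, while the spec section it covers must be known "at some point".

```python
from dataclasses import dataclass

# Hypothetical format names, based on the kinds of tests listed above.
KNOWN_FORMATS = {"reftest", "mochitest", "dom-only", "human"}


@dataclass
class Submission:
    """Hypothetical record for one submitted test (not a real W3C schema)."""
    url: str                      # where the test file lives
    test_format: str              # one of KNOWN_FORMATS
    spec: str = ""                # e.g. "CSS 2.1"; may be filled in later
    spec_section: str = ""        # feature or section under test
    submitter: str = "anonymous"  # a Web author or a browser vendor


def problems(sub: Submission) -> list[str]:
    """Return reasons the submission cannot be accepted at all."""
    found = []
    if not sub.url:
        found.append("missing test URL")
    if sub.test_format not in KNOWN_FORMATS:
        found.append(f"unknown format: {sub.test_format}")
    return found


def ready_for_review(sub: Submission) -> bool:
    """A test can be stored without spec metadata, but not reviewed."""
    return not problems(sub) and bool(sub.spec and sub.spec_section)


# A quick bug report from a Web author: accepted, but not yet reviewable.
quick = Submission(url="http://example.org/t1.html", test_format="reftest")
print(problems(quick))          # []
print(ready_for_review(quick))  # False
```

Splitting "can be stored" from "can be reviewed" is one way to keep submission cheap for authors while still collecting the spec reference before a test counts.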
Once a test has been submitted, it needs to be reviewed. The basic idea behind improving test reviews is to allow more individuals to contribute. The resources inside W3C aren’t enough to review tens of thousands of tests. We need to involve the community at large by doing crowd reviews. It will allow the working groups to focus only on the controversial tests.
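One possible way to picture crowd reviews is sketched below in Python. It is only an illustration under stated assumptions: the `triage` function, the vote labels, and the `min_votes` and `agreement` thresholds are all hypothetical, and nothing here was decided by the group. The point is simply that tests with clear agreement can be settled by the community, leaving the rest for the working group.

```python
from collections import Counter


def triage(votes, min_votes=3, agreement=0.75):
    """Classify a test from community votes ("accept" or "reject").

    min_votes and agreement are assumed thresholds, chosen for illustration.
    """
    if len(votes) < min_votes:
        return "needs-more-reviews"
    verdict, count = Counter(votes).most_common(1)[0]
    if count / len(votes) >= agreement:
        return "accepted" if verdict == "accept" else "rejected"
    # Reviewers disagree: escalate to the working group.
    return "controversial"


print(triage(["accept", "accept", "accept", "accept"]))  # accepted
print(triage(["accept", "reject", "accept", "reject"]))  # controversial
print(triage(["reject"]))                                # needs-more-reviews
```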
Once the tests have been reviewed, we need to run them on as many browsers as possible. Human tests, for example, are easy to run on all of them, but they do require a lot of humans. Automatic layout tests are a lot trickier, especially on mobile devices. We focused on one method during our gathering: the screenshot-based approach. The basic idea is that a screenshot of the page is compared to a reference. Mozilla developed a technology called ref-tests that compares Web pages themselves. You write two pages, in different ways, that are supposed to have the exact same rendering, and compare their screenshots. It avoids a lot of cross-platform issues. The way Mozilla is doing that is via the mozPaint API in debug mode. That works well, but only in Mozilla. You can guess that other browser vendors have a similar way to automatically take screenshots as well. We wanted to find a way to do this with all browsers without forcing them or us to write significant amounts of code. We found a Web site called browsertests.org, got in touch with Sylvain Pasche and, with his help, started to make some improvements to his application. It works well on desktops at least. Once again, we don’t think W3C is big enough to replicate all types of browser environments, so we should make it easy for people to run the tests in their browser and report the results back to us. Plenty of testing frameworks have been developed already and we should try to leverage them as much as possible.
We started to set up a database for receiving the tests and their results. We’d like to continue the effort on the server/database side, as well as continue to improve Sylvain’s application, allowing more test methods and formats. Testing the CSS or HTML5 parser should be possible, for example.
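The comparison step of the screenshot-based approach can be sketched in a few lines of Python. This is not Mozilla's implementation nor the browsertests.org code. It assumes the two screenshots have already been captured somehow and decoded into rows of RGB tuples, and the function names are hypothetical. A ref-test passes when the test page and the reference page render to identical pixels.

```python
def count_differing_pixels(test_shot, ref_shot):
    """Count pixels that differ between two screenshots.

    Each screenshot is assumed to be a list of rows, each row a list of
    (r, g, b) tuples. How the screenshot is taken is browser-specific
    and out of scope for this sketch.
    """
    same_height = len(test_shot) == len(ref_shot)
    if not same_height or any(len(a) != len(b) for a, b in zip(test_shot, ref_shot)):
        raise ValueError("screenshots must have the same dimensions")
    return sum(
        p != q
        for row_t, row_r in zip(test_shot, ref_shot)
        for p, q in zip(row_t, row_r)
    )


def reftest_passes(test_shot, ref_shot):
    """Pass only if both pages produced exactly the same pixels."""
    return count_differing_pixels(test_shot, ref_shot) == 0


GREEN, RED = (0, 128, 0), (255, 0, 0)
reference = [[GREEN, GREEN], [GREEN, GREEN]]
rendering = [[GREEN, GREEN], [GREEN, RED]]  # one pixel rendered wrongly

print(reftest_passes(reference, reference))          # True
print(count_differing_pixels(rendering, reference))  # 1
```

Because both screenshots come from the same browser on the same platform, an exact match is a reasonable criterion here, which is what lets the approach sidestep differences in fonts or anti-aliasing between platforms.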
You’ll find more information at our unstable server but keep in mind that:
- we’re in the very early stages
- this server is a temporary one that I managed to steal for a few days from our system folks. They’ll want it back one of these days and I need to find a more stable home before then. I’ll update the link once this happens, but expect it to break if you bookmark it.
- unless I can secure more resources for the project, we won’t go far by ourselves.
The server also contains links to more resources on the Web related to various testing efforts, as well as a more complete description of what we wish the testing framework to accomplish.
To conclude, I’d like to thank Mike Smith and Doug Schepers, and especially Jonathan Watt and Fantasai from the Mozilla Foundation. They all agreed to argue and code for 8 days around the simple idea of improving the state of testing at W3C. I hope we’re going to be able to get this project off the ground in the near future. If you’re interested in contributing and have ideas and time, don’t hesitate to contact me.