Creating (semi-)automated tests for WCAG is key to affordable, large-scale research. The tests are designed to be usable by people with a variety of skills. The results, too, should be informative not just to developers, but also to website managers, policy makers, disability advocates, and others.
The objective of this community is to create and maintain tests that can be implemented in large scale monitoring tools for web accessibility. These tests will be either automated, or semi-automated, in which tools assist non-expert users to evaluate web accessibility. By comparing the test results with results from expert accessibility evaluators, we aim to track the accuracy of the tests we've developed. This allows for an iterative improvement and adjustment of the tests as web development practices change and evolve. It also provides the statistical basis on which large scale accessibility monitoring and benchmarking can be built.
This group will not publish specifications.
Note: Community Groups are proposed and run by the community. Although W3C hosts these conversations, the groups do not necessarily represent the views of the W3C Membership or staff.
In the past two months the Auto-WCAG community has been hard at work developing a proposal for the WCAG working group to set up a taskforce to take on accessibility conformance testing. The goal of the taskforce is to harmonise and stimulate innovation in accessibility test tools such as those on the Web Accessibility Evaluation Tools List.
The proposal is currently being reviewed and will be shared with the WCAG working group in the coming month. The latest version of this proposal can be found on the Auto-WCAG wiki.
With close to 20 participants, yesterday’s Auto-WCAG meeting was the biggest one so far. The topic of the meeting: Should there be standardized accessibility rules? The answer according to most participants: Yes, there should be. Together with W3C’s Judy Brewer and Shadi Abou-Zahra we have started to explore the possibilities to further harmonize and standardize rules for accessibility conformance testing. Many participants of yesterday’s meeting reiterated that this would be very helpful to them.
As it stands today, conformance testing for accessibility is a challenge for many organizations. Figuring out if your web content meets the accessibility standard is trickier than it may sound. The WCAG 2.0 standard is extremely useful in outlining the accessibility problems people with disabilities face. However, what it does not do is provide instructions on how to consistently test for such problems. This is where accessibility rule sets come in. Unlike WCAG success criteria, which are high level and technology independent, accessibility rules would be designed to reliably identify accessibility issues in specific technologies.
A major advantage of such rules is that many can be entirely automated. This makes them ideal for continuous integration testing. Automated testing is simply faster, and resolving issues found this way is more cost effective. Then there is the problem of accuracy. If a rule incorrectly identifies an error (for example, because there is an alternative it could not detect), this can significantly frustrate the development process.
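To make the idea of a fully automatable rule concrete, here is a minimal sketch of one: flag `img` elements that have no `alt` attribute at all. The function name and the regex-based scan are purely illustrative (a real rule implementation would inspect the parsed DOM, not raw markup); the regex only keeps the example self-contained.

```javascript
// Illustrative sketch of an automatable accessibility rule: report img
// elements that lack an alt attribute entirely. An empty alt (decorative
// image) is valid, so only a missing attribute is flagged.
function checkImgAlt(html) {
  const issues = [];
  const imgTags = html.match(/<img\b[^>]*>/gi) || [];
  for (const tag of imgTags) {
    if (!/\balt\s*=/i.test(tag)) {
      issues.push({ rule: 'img-alt', message: 'img element has no alt attribute', tag });
    }
  }
  return issues;
}

// One failing and one passing image:
const report = checkImgAlt('<img src="logo.png"><img src="line.png" alt="">');
console.log(report.length); // 1: only the first img has no alt attribute
```

A rule this mechanical can run on every commit in a continuous integration pipeline, which is exactly where the speed and cost advantages described above pay off.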
Many such rule sets currently exist, but those rules often disagree with each other for a variety of reasons, such as differing views on accessibility and overlooking creative solutions to accessibility problems. Our hope is that the accessibility community can pull together on this question, so we can better integrate our work into common practices of web development and feel confident about having done so.
After almost two years of working on our W3C wiki, we have decided to start moving our work over to Github! Over the past few years many of us have worked with Github, and we have found it to be an amazing tool for community collaboration. Github is going to make it easier for people to review and contribute to our work, which we hope will improve the harmonization process.
Another great Auto-WCAG meeting has completed. We were happy to have David Berman contribute to our meetings for the first but hopefully not last time.
Several things were worked on during today’s telco. Auto-WCAG is looking for better collaboration with the outside world. Because of that we’ve decided that we are going to explore new options for collaboration with each other and with all of you.
Fresh off the press is our newest test case: SC2-4-7-focus-in-viewport. This test case is designed to check that when elements receive focus, they are displayed on the page. It ensures that things like skip links, which are often hidden by default, are positioned on screen once they receive focus.
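The core of such a check can be sketched as a pure function over an element's bounding box. This is not the actual test case implementation — the function and property names are illustrative — and in a browser the rectangle would come from `element.getBoundingClientRect()` with the viewport size from `window.innerWidth`/`window.innerHeight`; plain objects are used here so the sketch stands alone.

```javascript
// Illustrative sketch: is a focused element's bounding box fully inside
// the viewport? In a browser, rect would come from getBoundingClientRect().
function isInViewport(rect, viewport) {
  return (
    rect.top >= 0 &&
    rect.left >= 0 &&
    rect.top + rect.height <= viewport.height &&
    rect.left + rect.width <= viewport.width
  );
}

// A skip link parked off-screen with left: -9999px fails the check...
console.log(isInViewport({ top: 0, left: -9999, width: 100, height: 20 },
                         { width: 1280, height: 800 })); // false
// ...but passes once focus styles move it back on-screen.
console.log(isInViewport({ top: 0, left: 0, width: 100, height: 20 },
                         { width: 1280, height: 800 })); // true
```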
IBM has released new functionality for its Digital Content Checker. The DCC is IBM’s cloud-based accessibility service. The new update has also made local testing possible in order to improve the security of the content being tested.
You can learn more about DCC’s features at https://console.ng.bluemix.net/catalog/services/digital-content-checker
Yesterday was the first of our monthly auto-wcag telcos for 2016. We have happily concluded that the work of 2015 has been well received, including by the European Commission, which helped us fund the launch of this lovely initiative.
But it’s 2016 now. Stop living in the past. So what will we be doing in the coming year? We’ve set our sights as follows:
Keep working on the test cases that we’ve created up until this point. We also aim to communicate about our work more frequently to increase the visibility of the group.
We’re going to start reaching out to tool developers more actively, to look for feedback on the work we’ve done and to look for greater harmonization with regards to automated accessibility testing
And last but not least, we feel that auto-wcag should be the place people look to for the latest information about tools available on the market.
We would like to thank everyone who has participated in our community effort over the past year. As for now, time to get to work. Happy automating!
Web accessibility evaluations can serve a number of different purposes ranging from quality assurance and error repair, individual reports and awareness raising, to benchmarking and monitoring. Many people who would like to know the accessibility status of a web page aren’t experts in the field. In such situations they rely on tools that produce reports about (potential) errors.
Only some aspects of the Web Content Accessibility Guidelines (WCAG) 2.0 can be checked automatically. The majority of Success Criteria require human judgment.
The W3C Automated WCAG Monitoring Community Group is developing a new approach to involve non-experts in the data collection process for an accessibility study. By combining the benefits of automated and manual testing we aim to improve both the quality and the coverage of evaluation results.
Automated checker tools and human judgment
Human intervention is needed in web accessibility testing because automatic testing alone cannot cover all aspects of WCAG. Many of the tools mentioned on the W3C Web Accessibility Evaluation Tools List acknowledge that fact and report issues that can’t be tested automatically as “warnings” or “potential problems”.
The main target audience of these tools is web developers. The tools are intended for use during the creation of a web site and for subsequent quality assurance. This leads to some limitations: The output of the tools contains a lot of technical terms such as HTML element names and references to technical documentation such as the Techniques for WCAG 2.0. Therefore the tools can only be used by persons with web development expertise.
Moreover, repair instructions like “Ensure that the img element’s alt text serves the same purpose as the image.” are targeted towards improvements of the web content, and are not appropriate in the context of monitoring and status reports.
The WCAG Evaluation Methodology recommends involving users with disabilities in the evaluation of a web site. However, if this is done in an informal way without a controlled setting, the results are often biased because personal opinion, individual expertise, or other factors influence the result. The level of expertise of the user in particular has a strong influence on the accuracy of the results.
This leads to the conclusion that evaluators should be grouped by their level of expertise rather than by type of disability. Clearly worded questions could elicit better answers even from users with little knowledge of web accessibility.
Structured semi-automatic evaluation approach
The objective of auto-wcag is to create a process with a clear structure and instructions that are easy to understand, so that even non-experts can follow them. Standardized questions reduce the influence of individual opinion. Clear wording and predefined answer options, instead of general statements or repair instructions, lead to higher quality answers and thus to more reliable results.
Each auto-wcag test case consists of a selector and one or more test steps. There are automatic steps, which can be done by a tool, and manual steps, which require human input.
The manual steps describe tool support and instructions for non-expert users. Tool support can include highlighting the test subject, presenting alternative content that is not directly visible without special settings in the user agent, or providing other specific presentations of the content. These features allow the users to focus on the test subject. The users don’t have to identify the relevant item on the page, and the distraction caused by irrelevant items is reduced.
Clear instructions and additional help text enable non-experts to answer the questions as well. The template also captures two additional properties of the test steps: the requirement of interaction and the consideration of context.
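The structure described above — a selector, automatic steps, and manual steps carrying the two extra properties — can be sketched as a data structure. All names here (the id, field names, and the tiny runner) are hypothetical illustrations, not the actual auto-wcag template.

```javascript
// Hypothetical shape of an auto-wcag test case: a selector picks the test
// subjects, automatic steps run in the tool, and manual steps phrase a
// question with predefined answer options for non-expert users.
const testCase = {
  id: 'SC1-1-1-image-alt',       // illustrative id
  selector: 'img',
  steps: [
    {
      mode: 'automatic',
      check: (img) => img.alt !== undefined, // the tool verifies alt exists
    },
    {
      mode: 'manual',
      interaction: false, // no interaction with the page is required
      context: true,      // surrounding content must be shown to the user
      question: 'Does the text alternative describe the image?',
      answers: ['yes', 'no', 'cannot tell'],
    },
  ],
};

// Minimal runner: automatic steps filter the subjects; whatever passes
// is queued for the manual (human) step.
const subjects = [{ alt: 'Company logo' }, { src: 'decor.png' }];
const passedAutomatic = subjects.filter(testCase.steps[0].check);
console.log(passedAutomatic.length); // 1 subject proceeds to the manual step
```

Keeping the manual step as data (question plus fixed answer options) is what lets a tool present it to non-experts in a uniform way, rather than relying on free-form judgment.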
The original content and the (programmatically determined) alternative content are presented alongside each other. The question asks if the alternative describes the original content. This type applies to all kinds of non-text content such as images, audio and video as covered by Success Criterion 1.1.1 Non-text Content and Guideline 1.2 Time-based Media. For example, a paragraph of text is presented together with its programmatically determined language, and the user is asked if the language is specified correctly.
The web content (or parts of it) is presented to the user in a specific way. For instance with resized text or in linearized form. The questions address features and problems of this presentation. This type applies for example to 1.4.4 Resize text: The text of the web page is resized to 200% and the user is asked if all content is still present.
The complete web page is presented to the users. In this type of test the user is instructed to interact with the web content and to make a statement about its operability. This type is used to check Success Criteria addressing operability. Moreover, the behavior of focus, input, and error handling can be covered. For example, the user is asked to move the focus around the web page with the keyboard and to report whether the focus got trapped in any component of the page.
So far we have covered semi-automatic tests where the tool can determine applicability and present the preprocessed subject of the test to the non-expert user. However, there are also cases where applicability cannot be determined automatically and the user acts as a manual selector. In this type of user input the user is asked to identify content items that might cause accessibility barriers, such as use of color or other sensory characteristics to reference elements of the web page. It can also be applied to instances of flashing and auto-updating content that can’t be controlled by the user. For example, users could be asked to identify moving, blinking, or scrolling content that plays automatically and cannot be paused.
Some participants of the auto-wcag community group are currently implementing the prototype of a User Testing Tool based on the questions developed in the structured approach described in this post. The tool runs in the user’s web browser and connects to a database storing the user input. The data can then be combined with the results from other (automatic) tools to create a report about the evaluated web content.
About the author
Annika Nietzio is a web accessibility expert working at the Research Institute Technology and Disability in Germany. In the EIII project she is exploring new ways to combine the results from automated and manual web accessibility evaluations.
The focus of the workshop was to examine how accessibility testing can be automated and to write automated tests. Eleven experts shared their insights on the field of automated accessibility testing. The workshop was held in Utrecht, Netherlands, and hosted by the Accessibility Foundation.
The eleven accessibility experts worked hard to examine how accessibility testing can be automated. The latest developments in the field were discussed and further explored. There were also exploratory talks with the W3C about closer cooperation.
In small groups, several test cases were examined and elaborated. The written test cases were then reviewed and discussed by the whole group. Elements such as audio and video, longdesc attributes, and the use of color were examined.
Besides writing test cases there were also user feedback sessions. Eric Eggert (W3C) led these sessions to test a new W3C WAI tool. Participants of the workshop gave their feedback in 20-minute sessions.
Every day there was an inspiring presentation. On the first day Shadi Abou-Zahra (W3C) told us everything about EARL. On Tuesday Eric Velleman (Accessibility Foundation) gave us insight into the Website Accessibility Conformance Evaluation Methodology (WCAG-EM) document. From California, Jesse Beach (Facebook) gave a presentation on the last day about QUAIL 3 and open source automated testing.
Many components in the field of automated testing have been discussed and various test cases have been written. We worked on 15 test cases, addressing 10 success criteria. Two test cases have been completed, and six will have a final review in the coming weeks. After three productive days we can conclude that this workshop was a success.
We are proud to announce Shadi Abou-Zahra, Jesse Beach and Eric Velleman as speakers for the first Auto-WCAG workshop in June.
Shadi Abou-Zahra (W3C) will give a presentation about EARL, Jesse Beach (Facebook) shares her knowledge about Quail 3 and Eric Velleman (Accessibility Foundation) will talk about WCAG-EM.
The presentations will be held in the afternoon and will be broadcast live. The exact times and days will soon be announced.
About the event
The workshop is a three day event from the 15th to the 17th of June in Utrecht, The Netherlands. The primary focus of the workshop is to write and review additional test cases for automating WCAG conformance testing. The afternoon sessions will be broadcast for those unable to attend in person to participate online.
Jesse Beach is a software developer who builds tools for other developers. Visit her on Github: https://github.com/jessebeach
Accessibility evaluation is a method to determine deviation from best practices and standards. Humans do this, for the most part, very well and slowly. Machines do this, for the most part, unevenly and yet at great speed.
During the several years I’ve spent exploring techniques for automated accessibility testing in Quail, I recognized a few types of persistent challenges. Let’s go through them.
Identifying generic DOM elements
By identify, I don’t mean finding elements. That’s pretty easy to do with selectors. The difficulty is uniquely identifying a single element so that it can be referenced later. Let’s say we have a list of links with images in them used for navigation.
Providing a unique selector for, as an example, the second link in the list, is difficult.
Certainly we could be more specific about the parent-child relationship using ‘>’, the CSS child combinator.
And perhaps even include the href attribute.
It’s likely that this selector will be unique on the page, but it’s not guaranteed. With Quail, we take into account several attributes, like href, that help make a DOM element unique. Obviously, we also look for IDs, which presumably are only used once on a page, but even that isn’t guaranteed! In writing this article, I realized we should also be including relative DOM ordering. That’s why we write articles about our work, in order to learn.
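That relative-ordering idea can be sketched by walking up the tree and recording each node's position among its siblings as an `:nth-child()` index. This is not Quail's actual implementation — the function name and node shape are illustrative, and a plain object tree stands in for the DOM so the example runs outside a browser (in a browser the same walk would use `parentNode` and `children`).

```javascript
// Sketch: build a structurally unique selector by recording each node's
// 1-based position among its siblings on the way up to the root.
function uniqueSelector(node) {
  const parts = [];
  while (node.parent) {
    const index = node.parent.children.indexOf(node) + 1; // 1-based
    parts.unshift(`${node.tag}:nth-child(${index})`);
    node = node.parent;
  }
  parts.unshift(node.tag); // the root needs no index
  return parts.join(' > ');
}

// A navigation list with two links, built as a plain object tree.
const nav = { tag: 'nav', children: [] };
const ul = { tag: 'ul', parent: nav, children: [] };
nav.children.push(ul);
for (const href of ['/home', '/about']) {
  const li = { tag: 'li', parent: ul, children: [] };
  li.children.push({ tag: 'a', parent: li, children: [], href });
  ul.children.push(li);
}

console.log(uniqueSelector(ul.children[1].children[0]));
// → "nav > ul:nth-child(1) > li:nth-child(2) > a:nth-child(1)"
```

Unlike an href- or ID-based selector, the nth-child path is guaranteed to identify exactly one element in the tree it was computed from; its weakness is the opposite one — it breaks as soon as the document's structure changes.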
In a perfect system, any specification violation would be associated with a unique selector identifying the element in a document associated with the violation. In the wild, unwashed web, we can never rely on this to be true.
Testing the DOM in the DOM
For some standards, such as color contrast, we need to generate a rendered web page. This is necessary because CSS is a complex beast. Browser vendors have a decade and a half of experience interpreting and rendering DOM representations with CSS applied. End users use browsers as well. So they are the best tools we have to understand how HTML and CSS will combine to produce visual output. What I’m saying is, the environment we consume content in is also the environment we’re running our tests in. In other words, the inmates are definitely running this asylum.
My favorite and most frustrating set of evaluations concerns invalid HTML. Here is an example.
<strong><em>pots of gold</strong></em>
Notice that the closing tags are incorrectly nested. Here’s another.
<div><p>baskets of fruit</div>
Notice the p tag isn’t closed. This one is tricky because it’s actually valid HTML; the closing p tag is optional. A browser will handle this just fine. Now, what about this example below.
<div><span><div>strings of pearls</span></div></div>
That will cause a ruckus on your page. A browser will attempt to fix this nesting error, but the results are unpredictable. And remember, when we query an HTML document, we’re not querying the text that was sent to the browser, we’re querying the DOM. The DOM is a model of the HTML text document it received from the server. Our automated assessments are interacting with a mediated HTML representation, which is often fine. But in the case of malformed HTML, it hides the problems from us.
In the current version of Quail, we make an AJAX call for the host page, get the text payload and run that string through tests for malformed HTML. It’s by no means ideal. The better solution would be to do assessments of malformed HTML outside a browser altogether and this is something we will implement in the future.
One of the shortcomings of Quail early on was our singular testing platform — PhantomJS. PhantomJS is what is known as a headless browser. Unlike a web browser you use to view the internet, a headless browser has no visual component. It is lightweight and meant to render pages as if they were to be displayed — it just doesn’t display them. PhantomJS has also been on the cusp of a major version release for years now. It’s a wonderful tool, but not without frustrating shortcomings.
To really test a web page, you need to run it in browsers: various versions on various platforms. To do this, the assessments need to be run through a web driver that knows how to talk to various browser executables. This infrastructure is much more complex than a test runner that spins up PhantomJS. Quail (a wonderful tool with frustrating shortcomings), is itself on the cusp of a major version release. We are introducing support for test runs using WebdriverIO and Selenium.
Selenium will allow us to run assessments in different browsers. Much thanks to OpenSauce for making this sort of testing available to open source projects!
Writing automated tests for accessibility specifications will challenge you as a developer and in a good way. You’ll need to understand the fundamentals of the technologies that you use every day to get your work done. It’s like doing sit-ups or jogging; you’re in better shape for any sort of physical activity if you practice these basics often. Anyone is welcome to join the Quail team in translating the work of the Auto-WCAG Monitoring Group into coded examples of accessibility guideline assessments. Visit us on our Github project page, check out the tasks and propose a Pull Request!
About the author
Jesse Beach is a software developer at Facebook who builds tools for other developers. Visit her on Github: https://github.com/jessebeach