Creating (semi-)automated tests for WCAG is key to affordable, large-scale research. The tests are designed to be usable by people with a variety of skills. The results, too, should be informative not just to developers, but to website managers, policy makers, disability advocates and others.
The objective of this community is to create and maintain tests that can be implemented in large scale monitoring tools for web accessibility. These tests will be either automated, or semi-automated, in which tools assist non-expert users to evaluate web accessibility. By comparing the test results with results from expert accessibility evaluators, we aim to track the accuracy of the tests we've developed. This allows for iterative improvement and adjustment of the tests as web development practices change and evolve. It also provides the statistical basis on which large scale accessibility monitoring and benchmarking can be built.
This group will not publish specifications.
Note: Community Groups are proposed and run by the community. Although W3C hosts these conversations, the groups do not necessarily represent the views of the W3C Membership or staff.
Another great Auto-WCAG meeting is behind us. We were happy to have David Berman contribute to our meetings for the first, but hopefully not the last, time.
Several things were worked on during today’s telco. Auto-WCAG is looking for better collaboration with the outside world, so we’ve decided to explore new options for collaborating with each other and with all of you.
Fresh off the press is our newest test case: SC2-4-7-focus-in-viewport. This test case checks that elements are displayed on the page when they receive focus. It ensures that things like skip links, which are often hidden by default, are positioned on screen once they receive focus.
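As a rough illustration of the idea, a check along these lines could focus an element and verify that its bounding box ends up inside the viewport. This is a minimal sketch, not the actual test case definition:

// Minimal sketch: focus an element and check that it is rendered inside the viewport.
function isDisplayedOnFocus(element) {
  element.focus();
  var rect = element.getBoundingClientRect();
  // The element should have some size and lie at least partly in the viewport.
  return rect.width > 0 && rect.height > 0 &&
    rect.bottom > 0 && rect.right > 0 &&
    rect.top < window.innerHeight && rect.left < window.innerWidth;
}

// Example: flag links that stay off screen when focused.
Array.prototype.forEach.call(document.querySelectorAll('a'), function (link) {
  if (!isDisplayedOnFocus(link)) {
    console.log('Possibly not displayed on focus:', link.href);
  }
});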
IBM has released new functionality for its Digital Content Checker. The DCC is IBM’s cloud-based accessibility service. The new update also makes local testing possible, to improve the security of the content being tested.
You can learn more about DCC’s features at https://console.ng.bluemix.net/catalog/services/digital-content-checker
Yesterday was the first of our monthly Auto-WCAG telcos for 2016. We have happily concluded that the work of 2015 has been well received, including by the European Commission, which helped fund the launch of this lovely initiative.
But it’s 2016 now. Stop living in the past. So what will we be doing in the coming year? We’ve set our sights as follows:
Keep working on the test cases that we’ve created up to this point. We also aim to communicate about our work more frequently, to increase the visibility of the group.
We’re going to start reaching out to tool developers more actively, to get feedback on the work we’ve done and to pursue greater harmonization in automated accessibility testing.
And last but not least, we feel that Auto-WCAG should be the place people turn to for the latest information about the tools available on the market.
We would like to thank everyone who has participated in our community effort over the past year. For now, it’s time to get to work. Happy automating!
Web accessibility evaluations can serve a number of different purposes ranging from quality assurance and error repair, individual reports and awareness raising, to benchmarking and monitoring. Many people who would like to know the accessibility status of a web page aren’t experts in the field. In such situations they rely on tools that produce reports about (potential) errors.
Only some aspects of the Web Content Accessibility Guidelines (WCAG) 2.0 can be checked automatically. The majority of Success Criteria require human judgment.
The W3C Automated WCAG Monitoring Community Group is developing a new approach to involve non-experts in the data collection process for an accessibility study. By combining the benefits of automated and manual testing we aim to improve both the quality and the coverage of evaluation results.
Automated checker tools and human judgment
Human intervention is needed in web accessibility testing because automatic testing alone cannot cover all aspects of WCAG. Many of the tools on the W3C Web Accessibility Evaluation Tools List acknowledge that fact and report issues that can’t be tested automatically as “warnings” or “potential problems”.
The main target audience of these tools is web developers. The tools are intended for use during the creation of a web site and for subsequent quality assurance. This leads to some limitations: the output of the tools contains many technical terms, such as HTML element names, and references to technical documentation, such as the Techniques for WCAG 2.0. As a result, the tools can only be used by people with web development expertise.
Moreover, repair instructions like “Ensure that the img element’s alt text serves the same purpose as the image.” are targeted at improving the web content, and are not appropriate in the context of monitoring and status reports.
The WCAG Evaluation Methodology recommends involving users with disabilities in the evaluation of a web site. However, if this is done informally, without a controlled setting, the results are often biased because personal opinion, individual expertise, or other factors influence the result. The level of expertise of the user in particular has a strong influence on the accuracy of the results.
This leads to the conclusion that evaluators should be grouped by their level of expertise rather than by type of disability. Clearly worded questions could elicit better answers even from users with little knowledge of web accessibility.
Structured semi-automatic evaluation approach
The objective of auto-wcag is to create a process with a clear structure and instructions that are easy enough for non-experts to follow. Standardized questions reduce the influence of individual opinion. Clear wording and predefined answer options, instead of general statements or repair instructions, lead to higher-quality answers and thus to more reliable results.
Each auto-wcag test case consists of a selector and one or more test steps. There are automatic steps, which can be done by a tool, and manual steps, which require human input.
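To make that structure concrete, here is a hypothetical sketch of what such a test case could look like in code. The property names and format are illustrative only, not the group’s actual test case format:

// Hypothetical sketch of an auto-wcag test case: a selector that picks the
// applicable elements, plus a mix of automatic and manual steps.
var testCase = {
  id: 'SC1-1-1-text-alternative',
  selector: 'img',
  steps: [
    {
      type: 'automatic', // a tool can run this step without human input
      check: function (element) {
        return element.hasAttribute('alt');
      }
    },
    {
      type: 'manual', // requires human judgment
      question: 'Does the text alternative describe the image?',
      answers: ['yes', 'no']
    }
  ]
};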
The manual steps describe tool support and instructions for non-expert users. Tool support can include highlighting the test subject, presenting alternative content that is not directly visible without special settings in the user agent, or providing other specific presentations of the content. These features allow users to focus on the test subject: they don’t have to identify the relevant item on the page, and the distraction caused by irrelevant items is reduced.
Clear instructions and additional help text enable non-experts to answer the questions as well. The template also captures two additional properties of the test steps: the requirement of interaction and the consideration of context.
The original content and the (programmatically determined) alternative content are presented alongside each other. The question asks whether the alternative describes the original content. This type applies to all kinds of non-text content, such as images, audio and video, as covered by Success Criterion 1.1.1 Non-text Content and Guideline 1.2 Time-based Media. For example, a paragraph of text is presented together with its programmatically determined language, and the user is asked if the language is specified correctly.
The web content (or parts of it) is presented to the user in a specific way, for instance with resized text or in linearized form. The questions address features and problems of this presentation. This type applies, for example, to 1.4.4 Resize text: the text of the web page is resized to 200% and the user is asked if all content is still present.
The complete web page is presented to the users. In this type of test the user is instructed to interact with the web content and to make a statement about its operability. This type is used to check Success Criteria addressing operability. Moreover, the behavior of focus, input, and error handling can be covered. For example, the user is asked to move the focus around the web page with the keyboard and to answer whether the focus got trapped in any component of the page.
So far we have covered semi-automatic tests where the tool can determine applicability and present the preprocessed subject of the test to the non-expert user. However, there are also cases where applicability cannot be determined automatically and the user acts as a manual selector. In this type of test the user is asked to identify content items that might cause accessibility barriers, such as the use of color or other sensory characteristics to reference elements of the web page. It can also be applied to instances of flashing and auto-updating content that can’t be controlled by the user. For example, users could be asked to identify moving, blinking, or scrolling content that plays automatically and cannot be paused.
Some participants of the auto-wcag community group are currently implementing the prototype of a User Testing Tool based on the questions developed in the structured approach described in this post. The tool runs in the user’s web browser and connects to a database storing the user input. The data can then be combined with the results from other (automatic) tools to create a report about the evaluated web content.
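A minimal sketch of that flow, with a made-up endpoint and payload (the prototype’s actual API is not described in this post):

// Hypothetical sketch: send a user's answer from the browser to a results
// database. The endpoint and payload structure are made up for illustration.
var answer = {
  testCase: 'SC1-1-1-text-alternative',
  page: 'https://example.org/page.html',
  question: 'Does the text alternative describe the image?',
  result: 'yes'
};
var xhr = new XMLHttpRequest();
xhr.open('POST', '/api/results'); // hypothetical endpoint
xhr.setRequestHeader('Content-Type', 'application/json');
xhr.send(JSON.stringify(answer));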
About the author
Annika Nietzio is a web accessibility expert working at the Research Institute Technology and Disability in Germany. In the EIII project she is exploring new ways to combine the results from automated and manual web accessibility evaluations.
The focus of the workshop was to examine how accessibility testing can be automated and to write automated tests. Eleven experts shared their expertise in the field of automated accessibility testing. The workshop was held in Utrecht, Netherlands, with the Accessibility Foundation as host.
The eleven accessibility experts worked hard to examine how accessibility testing can be automated. The latest developments in the field were discussed and explored further. There were also exploratory talks with the W3C about closer cooperation.
In small groups, several test cases were examined and elaborated. The written test cases were then reviewed and discussed by the whole group. Elements such as audio and video, longdesc attributes, and the use of color were examined.
Besides the writing of test cases, there were also user feedback sessions. Eric Eggert (W3C) led these sessions to test a new W3C WAI tool. Workshop participants gave their feedback in 20-minute sessions.
Every day there was an inspiring presentation. On the first day, Shadi Abou-Zahra (W3C) told us everything about EARL. On Tuesday, Eric Velleman (Accessibility Foundation) gave us insight into the Website Accessibility Conformance Evaluation Methodology (WCAG-EM) document. On the last day, Jesse Beach (Facebook) joined from California to give a presentation about QUAIL 3 and open source automated testing.
Many aspects of automated testing were discussed and various test cases were written. We worked on 15 test cases, addressing 10 success criteria. Two test cases have been completed, and six more will get a final review in the coming weeks. After three productive days we can conclude that this workshop was a success.
We are proud to announce Shadi Abou-Zahra, Jesse Beach and Eric Velleman as speakers for the first Auto-WCAG workshop in June.
Shadi Abou-Zahra (W3C) will give a presentation about EARL, Jesse Beach (Facebook) will share her knowledge of Quail 3, and Eric Velleman (Accessibility Foundation) will talk about WCAG-EM.
The presentations will be held in the afternoon and will be broadcast live. The exact times and days will soon be announced.
About the event
The workshop is a three-day event from the 15th to the 17th of June in Utrecht, The Netherlands. The primary focus of the workshop is to write and review additional test cases for automating WCAG conformance testing. The afternoon sessions will be broadcast so that those unable to attend in person can participate online.
Jesse Beach is a software developer who builds tools for other developers. Visit her on Github: https://github.com/jessebeach
Accessibility evaluation is a method to determine deviation from best practices and standards. Humans do this, for the most part, very well and slowly. Machines do this, for the most part, unevenly and yet at great speed.
During the several years I’ve spent exploring techniques for automated accessibility testing in Quail, I recognized a few types of persistent challenges. Let’s go through them.
Identifying generic DOM elements
By identify, I don’t mean finding elements. That’s pretty easy to do with selectors. The difficulty is uniquely identifying a single element so that it can be referenced later. Let’s say we have a list of links with images in them used for navigation.
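The post’s original markup isn’t reproduced here, but something along these lines fits the description:

<ul class="nav">
  <li><a href="/home"><img src="home.png" alt="Home"></a></li>
  <li><a href="/products"><img src="products.png" alt="Products"></a></li>
  <li><a href="/contact"><img src="contact.png" alt="Contact"></a></li>
</ul>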
Providing a unique selector for, as an example, the second link in the list is difficult.
Certainly we could be more specific about the parent-child relationship, using the ‘>’ character, the CSS child combinator.
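Against the hypothetical markup above, such a selector might look like this:

ul.nav > li:nth-child(2) > a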
And perhaps even include the href attribute.
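For example:

ul.nav > li:nth-child(2) > a[href="/products"]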
It’s likely that this selector will be unique on the page, but it’s not guaranteed. With Quail, we take into account several attributes, like href, that help make a DOM element unique. Obviously, we also look for IDs, which presumably are only used once on a page, but even that isn’t guaranteed! In writing this article, I realized we should also be including relative DOM ordering. That’s why we write articles about our work: in order to learn.
In a perfect system, any specification violation would be associated with a unique selector identifying the element in a document associated with the violation. In the wild, unwashed web, we can never rely on this to be true.
Testing the DOM in the DOM
For some standards, such as color contrast, we need to generate a rendered web page. This is necessary because CSS is a complex beast. Browser vendors have a decade and a half of experience interpreting and rendering DOM representations with CSS applied. End users use browsers as well. So browsers are the best tools we have to understand how HTML and CSS will combine to produce visual output. What I’m saying is, the environment we consume content in is also the environment we’re running our tests in. In other words, the inmates are definitely running this asylum.
My favorite and most frustrating set of evaluations concerns invalid HTML. Here is an example.
<strong><em>pots of gold</strong></em>
Notice that the closing tags are incorrectly nested. Here’s another.
<div><p>baskets of fruit</div>
Notice the p tag isn’t closed. This one is tricky because it’s actually valid HTML; the closing </p> tag is optional. A browser will handle this just fine. Now, what about this example below.
<div><span><div>strings of pearls</span></div></div>
That will cause a ruckus on your page. A browser will attempt to fix this nesting error, but the results are unpredictable. And remember, when we query an HTML document, we’re not querying the text that was sent to the browser, we’re querying the DOM. The DOM is a model of the HTML text document it received from the server. Our automated assessments are interacting with a mediated HTML representation, which is often fine. But in the case of malformed HTML, it hides the problems from us.
In the current version of Quail, we make an AJAX call for the host page, get the text payload, and run that string through tests for malformed HTML. It’s by no means ideal. A better solution would be to do assessments of malformed HTML outside a browser altogether, and this is something we will implement in the future.
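In outline, that approach looks something like the following sketch; checkMalformedHTML is a hypothetical stand-in for the actual tests:

// Simplified sketch of the approach described above: fetch the raw HTML text
// (rather than querying the browser-repaired DOM) and scan that string.
var xhr = new XMLHttpRequest();
xhr.open('GET', window.location.href);
xhr.onload = function () {
  var rawSource = xhr.responseText; // the text as the server sent it
  checkMalformedHTML(rawSource);    // e.g. look for mis-nested closing tags
};
xhr.send();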
One of the shortcomings of Quail early on was our singular testing platform — PhantomJS. PhantomJS is what is known as a headless browser. Unlike a web browser you use to view the internet, a headless browser has no visual component. It is lightweight and meant to render pages as if they were to be displayed — it just doesn’t display them. PhantomJS has also been on the cusp of a major version release for years now. It’s a wonderful tool, but not without frustrating shortcomings.
To really test a web page, you need to run it in browsers: various versions on various platforms. To do this, the assessments need to be run through a web driver that knows how to talk to various browser executables. This infrastructure is much more complex than a test runner that spins up PhantomJS. Quail (a wonderful tool with frustrating shortcomings) is itself on the cusp of a major version release. We are introducing support for test runs using WebdriverIO and Selenium.
Selenium will allow us to run assessments in different browsers. Many thanks to OpenSauce for making this sort of testing available to open source projects!
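To give a flavor of what such a run looks like, here is a minimal sketch using the WebdriverIO API of that era; this is not Quail’s actual runner:

// Minimal sketch of driving a real browser through WebdriverIO and Selenium.
var webdriverio = require('webdriverio');
var options = { desiredCapabilities: { browserName: 'firefox' } };

webdriverio
  .remote(options)
  .init()
  .url('https://example.org/')
  .getTitle()
  .then(function (title) {
    // Placeholder for running real assessments inside the page.
    console.log('Page title from a real browser: ' + title);
  })
  .end();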
Writing automated tests for accessibility specifications will challenge you as a developer and in a good way. You’ll need to understand the fundamentals of the technologies that you use every day to get your work done. It’s like doing sit-ups or jogging; you’re in better shape for any sort of physical activity if you practice these basics often. Anyone is welcome to join the Quail team in translating the work of the Auto-WCAG Monitoring Group into coded examples of accessibility guideline assessments. Visit us on our Github project page, check out the tasks and propose a Pull Request!
About the author
Jesse Beach is a software developer at Facebook who builds tools for other developers. Visit her on Github: https://github.com/jessebeach
The group has finished the test case for provision of short text alternatives as required by WCAG Success Criterion 1.1.1.
Its core component is the text alternative computation algorithm as defined in the current UAAG as well as the WAI-ARIA recommendation. This algorithm specifies how user agents should handle the different attributes that may be used to provide a textual alternative.
It covers not only images, but also input elements of type image, areas of an image map, and embed and object elements.
Assuming that the attributes are accessibility supported, all these elements are tested semi-automatically for the correct use of sufficient techniques for providing a textual alternative. All circumstances in which WCAG allows a text alternative to be omitted, such as for grouped images or images that are part of a link, are considered. The test also checks that purely decorative content is correctly hidden from assistive technologies, and looks for common failures such as the use of placeholders or filenames as an alternative.
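As a rough illustration of the attribute precedence involved, here is a greatly simplified sketch; the real algorithm defined in WAI-ARIA covers many more cases:

// Greatly simplified sketch of computing the text alternative for an <img>.
// The real algorithm handles many more cases (e.g. multiple IDs in
// aria-labelledby); this only illustrates the attribute precedence.
function textAlternative(img) {
  var labelledby = img.getAttribute('aria-labelledby');
  if (labelledby) {
    var label = document.getElementById(labelledby);
    if (label) { return label.textContent; }
  }
  if (img.getAttribute('aria-label')) { return img.getAttribute('aria-label'); }
  if (img.hasAttribute('alt')) { return img.getAttribute('alt'); } // may be empty: decorative
  if (img.getAttribute('title')) { return img.getAttribute('title'); }
  return null; // no text alternative provided
}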
For most elements, human evaluation is needed to complete the test.
The tests are semi-automated, so automated tools can implement many of the 18 steps. The Auto-WCAG community is looking forward to seeing what developers can do with this great new test case.
There are many great tools on the market that can check the accessibility of web pages. The Web Accessibility Evaluation Tools List is a great resource to find checkers for different types of content. Many of them focus on testing specific aspects of accessibility, such as color contrast or parsing. But some have a broader scope and will check many different aspects and report the conformance to WCAG success criteria.
I encourage all web professionals to use an accessibility checker in their daily work. But as an accessibility auditor with 8 years of experience, I must confess that I don’t use any of these checkers myself. To test HTML pages, the only tools I use are a DOM inspector, a color analyzer and a validator. So why the difference?
Automated accessibility testing is tricky. WCAG was never designed to be automated. There is a good argument to be made that by definition, automated testing of accessibility is impossible. Think about it. If you want to test if some piece of content is accessible, you should compare the existing implementation to what the component should be like when it is accessible. To automate this, you need two things. You need to automatically determine what an accessible version would be like and you need some way to compare it to the current situation.
The first part of this is important. Imagine a tool that could reliably determine what the text alternative of an image should be. We could compare that to the actual alternative and we would have our test, right? However, if there was such a tool, assistive technologies could also implement it. And if they did, we wouldn’t have an accessibility problem with text alternatives anymore.
This idea seems to be true for most accessibility problems: if we can automatically determine the solution, the problem goes away. Because of this, accessibility checkers are mostly unable to determine if a success criterion was met, except where no assistive technologies are involved. But what our tools certainly can do is look for symptoms of accessibility barriers and fail a success criterion based on those.
Symptoms Of Inaccessibility
If you’ve done anything with HTML in the past 10 years, you probably know that you shouldn’t use the <font> element. It is an outdated solution to styling text. There is nothing inherently wrong with the <font> element, yet many accessibility checkers flag it as an error. Why do that for an element that is not inherently inaccessible?
One way you could use the <font> element to create an accessibility problem is the following:
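The post’s original code example is not reproduced here; markup along these lines fits the description (a hypothetical reconstruction):

<p>Required fields are shown in red.</p>
<label for="email"><font color="red">Email address</font></label>
<input type="text" id="email">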
Here the <font> element is used to provide information that is not available in text. This is a failure of Success Criterion 1.4.1. A checker that fails the criterion for use of the <font> element would be correct in doing so in this situation. It assumes the <font> element is often used to provide information that is not otherwise available, and fails the criterion based on that assumption.
Assumptions are the basis of automated accessibility tests. Checkers look for symptoms of accessibility barriers, such as the use of a <font> element, and assume they found a barrier. Every automated test I know of works on assumptions in one way or another. Even a test such as color contrast assumes there is no conforming alternative version. The important question then becomes: how accurate are these assumptions?
Dealing With Assumptions
Most tests in tools are based on the test designer’s experience with front end development practices. This experience greatly influences the accuracy of an accessibility checker. The required accuracy of the tests depends quite a lot on who is using the tool. As an external accessibility auditor, I need a very high degree of accuracy. Double checking the results of a checker takes a lot of time, often more than it would take to do the test manually. Therefore I tend not to use these tools.
For web developers and QA teams, accuracy is less of an issue. It may be okay to flag <font> elements as errors, as using them is not a good idea anyway. Similarly, you could fail <table> elements without <th> elements, or <select> elements with an onchange="" attribute. There are many tests that checkers can use that can be very meaningful to your organisation, without them always accurately identifying accessibility errors.
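Such symptom checks are straightforward to express in code. Here is a hypothetical sketch, not taken from any particular checker:

// Hypothetical sketch of symptom-based checks like those described above.
// Each flags a pattern that often, but not always, indicates a barrier.
function findSymptoms(doc) {
  var issues = [];
  if (doc.querySelector('font')) {
    issues.push('font element used');
  }
  Array.prototype.forEach.call(doc.querySelectorAll('table'), function (table) {
    if (!table.querySelector('th')) {
      issues.push('table without th elements');
    }
  });
  if (doc.querySelector('select[onchange]')) {
    issues.push('select element with onchange attribute');
  }
  return issues;
}

// Example: findSymptoms(document) returns a list of suspect patterns.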
Accessibility checker tools are great! They provide a quick and relatively inexpensive way to find accessibility barriers on your website. They are useful during development to encourage a style of coding that avoids accessibility barriers. They also provide a good starting point for anyone who wants to build accessibility into their quality assurance process, though they don’t give you the whole picture.
Accessibility checkers have limitations. Being aware of those means you can make better decisions about the tools you use. The field of web accessibility has long been focused on manual audits, but there is a clear precedent for the use of tools. As long as we understand their limitations, we can manage them and become better and more efficient because of them.
About The Author
Wilco Fiers is a web accessibility consultant and auditor at Accessibility Foundation NL. He is founder and chair of the Auto-WCAG community group. Wilco has participated in a variety of accessibility projects such as WAI-AGE, WAI-ACT and EIII as well as being a developer in open source projects such as QuailJS and WCAG-EM Report Tool.
Last week the first version of the WCAG-EM Report Tool: Website Accessibility Evaluation Report Generator was published. The tool was developed by the Education and Outreach Working Group (EOWG). It helps generate website accessibility evaluation reports according to the Website Accessibility Conformance Evaluation Methodology (WCAG-EM), guiding you through the steps of WCAG-EM to create a structured evaluation report.
The WCAG-EM Report Tool project is closely related to the work of the Automated WCAG Monitoring community group. Both groups are developing supporting tools for testing the accessibility of websites. Where the Auto-WCAG group focuses on the development of (semi-)automated tests for WCAG, the Report Tool focuses on the creation of evaluation reports.