Conformance Issues

This document is outdated.

See the Silver Conformance Design (May 2019) for the updated version. It was moved to Google docs to allow public comment and discussion.


Goals for Conformance

We did 16 months of research into what users need from accessibility guidelines. The Silver Design Sprint held in March 2018 suggested the following:

  1. Design a conformance structure and style guides that shift emphasis from “testability” to “measurability”, so that guidance that is not conducive to a pass/fail test can still be included. Pass/fail tests can be included, but they are not the only way to measure conformance.
  2. Develop scorecard or rubric measures for testing task accomplishment, instead of technical page conformance.
  3. Develop a point and ranking system that will allow more nuanced measurement of the content or product: e.g. a bronze, silver, gold, platinum rating where the bronze rating represents the minimal conformance (roughly equivalent to meeting WCAG 2 AA), and increasing ranks include inclusive design principles, task-based assessment, and usability testing.
  4. Include a definition and concept for “substantially meets” so people are not excessively penalized for bugs that may not have a large impact on the experience of people with disabilities.
  5. Remove “accessibility supported” as an author responsibility and provide guidance to authoring tools, browsers and assistive technology developers of the expected behaviors of their products.
  6. Develop a more flexible method of claiming conformance that is better suited to accommodate dynamic or more regularly updated content.

Current Status of Conformance Prototype

We have worked out a general framework for the conformance system and how it works in the overall information architecture. See the Conformance Prototype draft for details. Here is a high-level overview.

Silver Conformance Prototype Proposal

Information Architecture

We aren’t losing content; we are restructuring it

  • Flattening the overall structure of WCAG 2.x
    • From: Principle, Guidelines, Success Criteria and Techniques
    • To: Guidelines and Methods
  • Guidelines are general information and intent written in plain language
  • Methods are specific examples, instructions, and tests with more technical information
  • We are adding a tagging engine to make it easier for people to find information.
  • There will be an API so people can extract the data for their own purposes (a hypothetical sketch of such a record follows this list).
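
To make this concrete, here is a hypothetical sketch, in Python, of the kind of record the tagging engine and API might expose for a single Guideline. Every field name, tag, and value below is an assumption for illustration; Silver has not defined a data format.

  import json

  # Hypothetical Guideline record; all field names and values are
  # illustrative assumptions, not a defined Silver format.
  guideline = {
      "handle": "use-of-color",                # unique handle instead of an SC number
      "tags": ["perceivable", "wcag2:1.4.1"],  # Principles and WCAG 2.1 numbers become tags
      "summary": "Don't rely on color alone to convey information.",
      "methods": [
          {"handle": "css-non-color-indicator", "technology": "CSS",
           "tests": ["automated", "manual"]},
      ],
  }

  print(json.dumps(guideline, indent=2))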

How WCAG moves to Silver

  • Principles -> tags. This allows us to add Principles, if appropriate, and to assign multiple Principles.
  • Guidelines and technology neutral Success Criteria -> Guidelines
  • Technology specific Success Criteria and Techniques -> Methods
  • Understanding documents become part of the long description of Guidelines or Methods, and will be broken up. Example: the intent of Use of Color would go toward a Guideline, but its examples are technology specific and would go to a Method.
  • Levels (A, AA, AAA) are deleted. Silver conformance levels will be overall for the product/project, not by Success Criteria.
  • Success Criteria numbers are deleted. Guidance will be known by a unique handle, so that Silver scales for new guidance. We will tag Guidelines and Methods with the WCAG 2.1 number so existing guidance can easily be found (see the lookup sketch after this list).
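
As a sketch of how the WCAG 2.1 number tags could keep old guidance findable, here is a small lookup over records shaped like the hypothetical one above. The function name and data shape are our own assumptions.

  # Minimal sketch: find Silver guidance by its old WCAG 2.1 SC number,
  # assuming each record carries tags like "wcag2:1.4.1" as described above.
  def find_by_wcag2_number(guidelines, sc_number):
      tag = f"wcag2:{sc_number}"
      return [g for g in guidelines if tag in g["tags"]]

  # Usage: find_by_wcag2_number([guideline], "1.4.1") returns the
  # "use-of-color" record from the earlier sketch.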

Beyond Web Content

  • Add new guidance that could not be included in WCAG 2.1
    • In response to new technology (e.g., home automation)
  • Silver is not restricted to web content (so it needs a new name)
  • Silver will include advice for:
    • Browsers and user agents
    • Authoring tools and development frameworks
    • Assistive technology
  • That advice will probably be in the Methods. We want to flag Methods where new features or bugs are in parts of the accessibility stack that authors and content creators can't do much about.

Proposed Silver architecture overview

Guidelines are evaluated by different types of measures, such as pass/fail tests, automated tests, usability tests, user testing, and more. The results of the measures are scored by a point scoring system. The score is then used to determine the level: Bronze, Silver, or Gold. Sites that currently meet WCAG 2.x AA could be grandfathered in at the Bronze level.
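
A minimal sketch of that flow, assuming invented thresholds (Silver has not set any of these numbers):

  # Illustrative thresholds only; Silver has not defined any point values.
  LEVEL_THRESHOLDS = [("Gold", 90), ("Silver", 75), ("Bronze", 60)]

  def level_for_score(points_earned, points_possible):
      """Map the total score from all measures to a conformance level."""
      percent = 100 * points_earned / points_possible
      for level, minimum in LEVEL_THRESHOLDS:
          if percent >= minimum:
              return level
      return "Not conformant"

  # Example: level_for_score(78, 100) returns "Silver".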

Testing

Silver will include the types of tests currently used in WCAG.

  • Automated tests (see the work of the AutoWCAG Community Group and the ACT Task Force)
  • Manual tests (from WCAG Techniques)

Silver will also allow types of tests that are new to WCAG testing; somewhere there is a list of the new test types we brainstormed.

The answers could be expressed as:

  • Scale (for example, rate on a scale of 1-5, where levels 1-2 do not pass and levels 4-5 might earn extra points; see the scoring sketch below)
  • User Research results
  • Documentation of the process used

Test methods need to be repeatable.
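
As one illustration of a repeatable mapping from a scale answer to points, following the bracketed example in the list above; the point values themselves are assumptions:

  def score_scale_answer(rating):
      """Score a 1-5 scale rating: 1-2 fail, 3 passes, 4-5 earn extra.
      Point values are illustrative assumptions, not Silver policy."""
      if not 1 <= rating <= 5:
          raise ValueError("rating must be between 1 and 5")
      if rating <= 2:
          return 0   # does not pass
      if rating == 3:
          return 1   # passes
      return 2       # extra points for levels 4-5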

Examples for how Silver Tests could be scored

Scoring

Point scoring system - we have been working on a point system and have a number of prototypes. This is what we most need help on. The system must be transparent and have rules that can be applied across different guidance. We are not going to decide individually what each Method is worth, because that approach does not meet regulators' need for transparency, does not scale, and is too vulnerable to influence.

Overview of current prototype:

Other proposals

Levels

  • Levels are not assigned to individual success criteria; they are overall for the site, product, or project, as defined by the organization.
  • Levels should be named so that the order from worst to best is internationally obvious. We are grateful to the Olympics for making Bronze, Silver, and Gold internationally known.

Regulatory Conformance

How do we best support the regulatory environment?

  • Use case from Japan (thank you, Makoto): "We set a rule that requires web content owners (e.g., webmasters) to select 40 web pages from the website and test those 40 pages. They must select more than 25 of the pages by random selection. If there aren't any issues within the 40 pages, they can make a conformance claim for the entire website. We took the idea of '40 web pages' from the Unified Web Evaluation Methodology (UWEM) 1.2, developed in the EU around 2007-2008, and settled on 40 in terms of man-hours and costs. We also encourage small websites with fewer than 100 pages to test every single page."
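
For illustration, here is a minimal sketch of that sampling rule: test 40 pages, more than 25 of them chosen at random, and test every page on sites with fewer than 100 pages. The function and parameter names are our own.

  import random

  def select_sample(all_pages, sample_size=40, random_count=26):
      """Pick the pages to test under the Japanese rule quoted above."""
      pages = list(all_pages)
      if len(pages) < 100:
          return pages  # small site: test every single page
      randomly_chosen = random.sample(pages, random_count)
      remaining = [p for p in pages if p not in randomly_chosen]
      # The rest of the sample may be hand-picked; take the first ones here.
      return randomly_chosen + remaining[: sample_size - random_count]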

Exceptions

These are use cases where organizations make a good-faith effort to make their site accessible and it still has problems. If we have a list of use cases, we can address them.

  • "Substantially conforms" came out of the Silver research where companies had a generally accessible site, but it was so large or updated so quickly that it wasn't possible to guarantee that it was 100% conformant. Facebook was an example of a site that was literally impossible to test because it was updated tens of thousands of times per second.
  • "Tolerance" is a different concept of a less-than-ideal implementation but no serious barriers. I think we could collect those "less than ideal" examples when we write the tests for the user need. How we would we flag them as "less than ideal" and refer people to better methods seems like a solvable problem.
  • "Accessibility Supported" is another slice of this problem, where organizations code to the standard, but it doesn't work because of some bug or lack of implementation in the assistive technology. We have discussed noting the problem in the Method, and then tagging the Method for the assistive technology vendors to know they have a problem, or make it easy for SME's to file bugs against the AT (or user agents, or platforms, etc.)
  • Where something conforms, but users are still not able to complete the task or get the information they need.
  • Being dependent on an external vendor and you can't fix it until the vendor fixes it.
  • A Map application where the complexity of the visual experience is too overwhelming to express in text equivalent.

Evergreen Recommendation?

W3C is considering the possibility of changing its Process to allow specifications that can be continually updated, with periodic numbered Recommendations. This is an interesting possibility that the Silver project leadership is discussing. It is still in the formative stage, but if we decide to pursue this path, it will have an impact on the W3C Conformance section. Here are links to some preliminary public information that we wanted to save for future reference:

Additional information and Resources

  • Silver Conformance draft - This is a working document. The appendices are large and contain the material from WCAG 2.1 that we were using for test purposes; you don't need to review them. When we wanted to save work that we were no longer using, we put it in the appendices.
  • Google Drive folder on Conformance - There are more supporting documents with ideas; for example, we worked out some ideas for how to assign points in the point system. All of our work is in this public Google Drive folder on Conformance. You should be able to comment on any document in the folder. Please let us know if you have any difficulty accessing the material.

See the CSUN 2019 presentation "Future of Accessibility Guidelines for Web and ICT" for an overview. There are speaker notes explaining each slide and providing links to further detail. The Silver wiki main page has links to all reports on the project. There is probably too much information there, and it is included only in case you want background information.

Issues

  1. How do we make conformance better aligned with the experience of people with disabilities? People with different disabilities have different experiences.
  2. Will the model we are proposing address the needs identified?
  3. What measurements should we encourage?
  4. How do we set up a point scoring system that is transparent and fair, and that motivates or rewards organizations to do more? There is an experiment with a point scoring spreadsheet; it is not intended for regular users, only accessibility policy experts, regulators, and lawyers. (Bruce recommends a proof of concept that is exaggerated by an order of magnitude to develop the concept, then refined later.)
  5. How do we maintain a point system so it stays current, but is protected from "gaming"?
  6. How do we set up methodologies for task-based assessment that can be used across a breadth of websites and products? Defining a task involves nuance: granularity, paths, and whether multiple different paths are more accessible to people with certain disabilities.
  7. How do we migrate people from WCAG 2.x to Silver from a compliance viewpoint? (For example, should Bronze level equal WCAG 2.0 or WCAG 2.1?)
  8. How do we decide what are minimums that organizations must meet? Should that just be the non-interference success criteria of WCAG 2.x or are there more?
  9. Should we require points be spread over categories of user needs? What list of user needs should we use?
  10. How do we draw a line between "equivalent experience" and not identical experience? The example is a Map application where the complexity of the visual experience is too overwhelming to express in text equivalent.