W3C

– DRAFT –
ARIA/AG TPAC joint meeting

14 November 2025

Attendees

Present
Adam_Page, alice, alisonmaher, ari, Ben_Tillyer, bkardell, ChrisCuellar, chrishtr, chrisp, cyns, Daniel, daniel-mac, elguerrero, flackr, front-endian-jane, giacomo-petri, Jacques, jamesn, Jamie, janina, jcraig, JeroenH, jroque, jugglinmike, julierawe, kenneth, kevin, LenB, Lisa, lola, masonf, Matthew_Atkinson, mbgower, mgifford, mgifford2, mike_beganyi, Neha, noamr, Rahim, Remi, sarah, shiestyle, siri, spectranaut_, tamsin, tamsin5, vmpstr, ZoeBijl
Regrets
-
Chair
-
Scribe
Jamie, jugglinmike

Meeting minutes

*Introductions*


Daniel: https://w3.org/2025/talks/accessibility-testing/

Slideset: https://www.w3.org/2025/Talks/TPAC/accessibility-testing/Overview.html

Daniel: Here today to discuss some of the challenges we're facing with our various testing initiatives. The work is currently split across several working groups, which poses challenges. How we solve this is worth discussing.

Identified challenges: duplication of effort; how to leverage the expertise of the different groups doing similar things.

How to test conformance

Fragmentation: different test cases spread around different places; different testing groups have difficulty reviewing work and making sure their voice is heard when the requirements are crafted.

Some degree of difficulty for people not participating in the working groups to figure out how to get involved; e.g. which group should they participate in?

Overall goal: test conformance. We're here to make sure it's possible to test conformance with WCAG, ARIA, accessibility API mappings. Ultimate goal should be that we can ensure conformance with various specs.

Can be done multiple ways

First: what's computed by the browser

Second: how this computation is communicated to accessibility platforms which are running on the different operating systems and the communication after that to the assistive tech (AT). Making sure there is at least some similarity in this communication on different operating systems, scenarios, etc.

Third and final: making sure AT is actually rendering something that makes sense from both an author's perspective and a computation perspective. If there is a button that says hello, there should be a way for the AT to render that correctly.

Testing browser accessibility: this is done using WPT. There's an accessibility investigation area within Interop. Testing computation, but also accessibility API mappings. AAM testing is currently focused on Linux, but the aim is to extend it to other platforms.
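As a concrete illustration (not from the slides): WPT's accessibility tests can check the browser's computed role and accessible name via testdriver.js. A minimal sketch of such a test, assuming the get_computed_role and get_computed_label helpers and the WPT harness:

```typescript
// Sketch of a WPT-style computed role/name test. It assumes the WPT harness
// (testharness.js + testdriver.js) and markup like:
//   <div id="test" role="button" aria-label="Hello">x</div>
// Harness globals, declared here only so the sketch type-checks standalone:
declare function promise_test(fn: () => Promise<void>, name: string): void;
declare function assert_equals(actual: unknown, expected: unknown): void;
declare const test_driver: {
  get_computed_role(el: Element): Promise<string>;
  get_computed_label(el: Element): Promise<string>;
};

promise_test(async () => {
  const el = document.getElementById("test")!;
  // Ask the browser's accessibility engine what it actually computed.
  assert_equals(await test_driver.get_computed_role(el), "button");
  assert_equals(await test_driver.get_computed_label(el), "Hello");
}, "role=button with aria-label computes role 'button' and name 'Hello'");
```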

Accessibility support definition: talking about ??; a solution shouldn't cost more for a disabled person than for a non-disabled person.

??2

<spectranaut_> From the slides: Definition of accessibility supported starts with: supported by users' assistive technologies as well as the accessibility features in browsers and other user agents

To test accessibility support, there are several ACT rules.

PowerMapper's accessibility project.

mark: Been doing AT tests since around 2012-2013, about the time ARIA 1.0 became a Recommendation.

Started a conformance checker for WCAG, among other things. The reason we started testing is that we were basing results on WCAG failures and sufficient techniques. Customer feedback was that some reported failures weren't actually failing, and some things that apparently worked didn't work. Needed to find out what's really accessibility supported.

Wasn't really any information in 2012. Some sufficient techniques had some data, but not many. Needed to start testing stuff.

Tested screen readers at the time with various browsers.

Been doing that every year for about 14 years.

Kind of useful, because the worst thing you can do in a conformance checker is say something doesn't conform, then tell the user to do something that's conforming but doesn't work.

Most of the tests, about 50%, are about labelling things.

Just under 20,000 test results. All done manually.

Daniel: Some of the data that PowerMapper contributed is being used in the AGWG conformance testing project. A task force is in charge of writing the ACT Rules Format, which defines what a test rule looks like, what the components of a rule are, and how it needs to be written.

Applicability: defines the scope; sets out where, and under which circumstances, something should be tested.

Expectations: complements the applicability; defines what should hold in the testing scenario, so that it's clear what we're testing and what the common conventions around the testing scenario are supposed to be.

Test cases section: most relevant; pieces of code to evaluate against the rule's applicability. Passed, failed, and inapplicable test cases, so you can evaluate them yourself and then decide whether your outcome is the same as that specified by the rule.
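To make that structure concrete: a hypothetical sketch (names and shape invented for illustration; see the ACT Rules Format spec for the real structure) of a single rule with its applicability, expectations, and passed/failed/inapplicable test cases:

```typescript
// Hypothetical model of an ACT-style rule. The fields mirror the parts of
// the ACT Rules Format described above, but the shape itself is invented.
interface ActRuleSketch {
  id: string;
  applicability: string;   // scope: what gets tested, under which circumstances
  expectations: string[];  // what must hold for each applicable target
  testCases: {
    passed: string[];        // snippets that satisfy the rule
    failed: string[];        // snippets that violate it
    inapplicable: string[];  // snippets the rule does not apply to
  };
}

// e.g. a sketch of a "button has an accessible name" rule:
const buttonHasName: ActRuleSketch = {
  id: "button-has-accessible-name (hypothetical)",
  applicability:
    "Elements with a semantic role of button included in the accessibility tree",
  expectations: ["Each target element has a non-empty accessible name"],
  testCases: {
    passed: ["<button>Save</button>", '<button aria-label="Save"></button>'],
    failed: ["<button></button>"],
    inapplicable: ['<button style="display:none">Save</button>'],
  },
};
```

An implementer can then run their tool against each snippet and check that its outcome matches the rule's classification.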

Lately we've been having more and more test implementers.

Consistent rules: if you fail something that shouldn't be failing, that's a problem.

We should be able to identify whether something passes or fails, and if you don't get this right, it's a consistency issue.

Coming back to further efforts around testing accessibility support:

Getting beyond labels and names, a potential way to automate screen reader and accessibility testing; a couple of projects, mostly related to the ARIA WG.

One of these is the ARIA-AT project, which Matt_King and others are involved in: it provides a number of test plans and test cases, runs them in three screen readers, and provides results based on the test plans.

The AT runs the tests, the framework grabs the outcome from the AT.

Second: the Accessibility Compat Data project that Lola is running. Basically trying to address AT testing with a more holistic approach: getting data from several places, including WPT test results as well as ARIA-AT and potentially others, and providing that data in a machine-readable format that can then be used in developer resources such as MDN.
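As a sketch of what "machine-readable support data" might look like, here is a hypothetical record shape, loosely modelled on browser-compat-data records; ACD's actual schema may differ:

```typescript
// Hypothetical shape for an accessibility-support record. Field names are
// invented for illustration and are not ACD's actual schema.
interface SupportRecord {
  feature: string; // e.g. "aria.roles.switch"
  environment: {
    at: string;      // e.g. "NVDA 2024.4"
    browser: string; // e.g. "Firefox 133"
    os: string;      // e.g. "Windows 11"
  };
  support: "yes" | "no" | "partial" | "unknown";
  sources: string[]; // where the data came from, e.g. WPT runs, ARIA-AT reports
}

const example: SupportRecord = {
  feature: "aria.roles.switch",
  environment: { at: "NVDA 2024.4", browser: "Firefox 133", os: "Windows 11" },
  support: "partial",
  sources: ["aria-at", "wpt"],
};
```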

Open questions:

How can we make these projects work together? How can we connect them more and ensure everyone is aware of what is going on, without having to understand all the details?

How can we leverage the projects ?? future requirements; how can we ensure specifications going forward are possible to test?

How can we improve the working process?

<Zakim> jcraig, you wanted to ask if, when Daniel finishes his slides, I could share a diagram I use to explain the overlap and scope of different accessibility testing initiatives. It may be helpful given the AGWG question: how can we use one or more of these for WCAG testing...

<shawn> +1 to see (and hear description of) diagram

jcraig: gave a brief overview of the various projects, including a diagram which might be useful here; overview of what part of the stack is tested by each project

<Zakim> Matt_King, you wanted to share context about ARIA-AT

Matt_King: Background on ARIA-AT. Started working on ARIA Practices Guide back in 2014. Very similar feedback: we would provide a pattern, tell people this is how you make toggles, etc., then feedback saying "you can do it that way, but it doesn't necessarily work; works in one screen reader, not another". Patterns following the spec, but not necessarily working.

Formed ARIA-AT group in 2018.

Two distinct goals:

1. Explicit goal of building a platform that enables screen reader developers to develop consensus around what is an interoperable tabs interface, what is an interoperable menu interface, etc., so you can build on one platform but have confidence that it'll work on others.

2. It wouldn't succeed without automation, so we have an AT Driver spec, which is a WebDriver BiDi analogue for driving screen readers in the context of a browser. Part of the Browser Testing and Tools Working Group charter. In early draft form. We have implementations for JAWS, NVDA and VoiceOver. We use those to run the ARIA-AT tests.

<jugglinmike> w3c/at-driver
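For a rough sense of its shape: AT Driver is a WebSocket-based protocol in the BiDi style, with commands such as interaction.pressKeys and events such as interaction.capturedOutput. A minimal client sketch; only those two message names come from the draft, while the URL, port, and session handling are simplified assumptions:

```typescript
// Minimal sketch of an AT Driver client in Node.js using the "ws" package.
// interaction.pressKeys and interaction.capturedOutput are from the draft
// spec; everything else (URL, port, plumbing) is a simplification.
import WebSocket from "ws";

const ws = new WebSocket("ws://localhost:4382/session");

ws.on("open", () => {
  // Ask the screen reader to press a key, just as a user would.
  ws.send(
    JSON.stringify({
      id: 1,
      method: "interaction.pressKeys",
      params: { keys: ["DOWN"] },
    })
  );
});

ws.on("message", (raw) => {
  const msg = JSON.parse(raw.toString());
  // Speech captured from the AT arrives as capturedOutput events; a harness
  // like ARIA-AT's compares this text against the test plan's expectations.
  if (msg.method === "interaction.capturedOutput") {
    console.log("AT said:", msg.params.data);
  }
});
```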

If you look at ARIA-AT, you can see results based on test plans; e.g. tabs in the APG; or from the point of view of ARIA roles and states and properties; or from the point of view of HTML elements used in those things. All on ARIA-AT website

aria-at.w3.org

About 65,000 test results at this point. A lot driven by automation.

daniel: What these projects all have in common is that they can deliver a resource that could contribute to giving more clarity around the test results.

What are the bits missing from the current setup that could be improved to deliver more?

jcraig: *shows diagram*

Web author on the left, end user on the right. Talking about mainstream web stack.

Regardless of what framework you use, the HTML, CSS, JS, etc. is in turn interpreted by the browser's rendering engine: WebKit, Blink, Gecko.

All of input and output is handled by the rendering engine

Each of the engines has its own accessibility representation of that source. Some of the source accessibility info comes from HTML; augmentation by the web author (ARIA, etc.) helps the browser rendering engine build a model of what it thinks the author wants: what the accessibility tree should be.

Different mappings for different platforms; e.g. Mac and Windows

That is what most assistive tech use to convey info back to the user: speech output, braille output, interpreting input from the user back through this stack to the web page

The first of the automated testing stacks for all of this is client-side automation: PowerMapper, Axe Core.

Slightly more functionality can be achieved by in-browser audits; e.g. client-side stuff that also reaches into the engine a bit.

Web Platform Tests, the work we did in Interop Accessibility, are intended to test the accessibility engine. Look at the working state or broken state of a web page. Do all the engines do exactly the same thing? That's what we're testing.

spectranaut_: has been working on Acacia web platform tests. Testing APIs. Testing the engine itself, not the content.

Platform specific automation packages that we're not going into here

Dashed line for AT testing, end-to-end testing

Manual testing, like PowerMapper's, is used to validate these tests. ARIA-AT has automated tests, as Matt_King mentioned.

Broader project. Goal of that is to get end user expectations.

Lola, with Accessibility Compat Data: not necessarily automated, but trying to gather all the data from all these different projects and funnel it into a single set of results we can use.

We do have some overlap. Not necessarily conflicting. They're testing different parts of the stack.

daniel: Do you all feel you have the support you need from W3C or the different working groups involved? Or is there anything that needs to be improved, or is missing; e.g. would you benefit from having a common venue?

Do we need a monthly meeting or something like that to share progress so we can stay in sync? or is it all working fine?

lola: Great to have some kind of monthly session to coordinate and keep updated on what we're all doing.

Regarding what's missing: for ACD, this week has been the first time we've reconsidered ARIA-AT as a data source. Originally, we felt it might not be ready. We were mainly focusing on the browser side and then supplementing the screen reader side with resources that TetraLogical provided. However, it now feels like ARIA-AT is ready, so we can see how it might integrate with ACD.

Would love to connect with the folks involved in ARIA-AT on how we could make that work. That's the key thing that's missing.

WPT might also help us get closer to that goal.

Cyns: Another thing that would be useful is to try to use the same test files across all the different layers that we're testing. If we're testing an anchor, test it the same way across the different projects so we can be sure where the problem lies.

I know there are some gaps, but I think we'd also find a lot of duplication

<Zakim> jcraig, you wanted to ask what ideas AGWG folks had about automated testing that led to this joint meeting

Might also reduce the amount of work we have to do

jcraig: was under the impression that it was AGWG folks who raised the question that led to this joint meeting, about how the automated testing projects could benefit WCAG conformance testing. Would love to hear more ideas from AGWG folks about what they had in mind, so we could figure out which portion of this we could squeeze that work into, and where it would be appropriate to start researching.

Do you know enough about the different stacks to know which part the conformance testing would fit into? Are you looking for more info?

All of the above?

Daniel: My understanding is that AGWG wants to give us a heads up that this is coming, that they're defining requirements for WCAG3, and that they'll be requesting feedback from the ARIA WG, etc., but the specifics aren't fleshed out yet.

Need to flesh it out all together; these are just the very early stages.

kevin: The WCAG3 stuff is different from looking at the testing landscape.

Matt_King: Felt that this week has been very good in terms of information sharing. Highlights the importance of more info sharing across the projects, but also across working groups. Cyns's observation that maybe there are some opportunities for us to have at least a set of common tests that go through the entire journey from author to user could be an extremely powerful thing. Mark and I have connected about some of the ways we may be able to collaborate.

Ultimately, one of the things I've been thinking about is how ARIA-AT relates to WCAG sufficient techniques.

In WCAG2, accessibility supported didn't seem to really mean much. Is there an opportunity for that to change in WCAG3?

<mbgower> +1 to matt's comment on accessibility support!

In ARIA-AT, one of the ways we've tied it back to patterns is that we're surfacing the data about the interop state

If we can do the same thing with sufficient techniques, that data could always be up to date for various screen readers. People would be able to cite that in the context of WCAG.

<Zakim> mbgower, you wanted to say shortcomings on visual importance and keyboard predictability

mbgower: Came up with our own SC inside IBM to capture accessibility support, because we had to do quite a bit of digging to figure out whether it's the author, the AT, etc.

If you could just capture that data, you could deal with it later.

Accessibility is predicated on people doing rational things.

Keyboard support definition: it doesn't say anywhere what keyboard support has to be.

Nothing says that keyboard support has to follow convention; you can implement whatever keyboard support and it will "pass".

Would be good to come up with some standard that defines keyboard conventions, etc.

AT is biased towards screen readers. Visual info isn't getting caught because it doesn't align with screen reader expectations. Need to muscle up there, because we don't have programmatic expectations.

<Rachael> +1 to expected conventions including visual

We need to define those visual expectations. Can hopefully get support from user agents to help with this. These things are slipping through the cracks.

kevin: Talking with Daniel about ARIA-AT. Becoming aware of different aspects of testing happening in a lot of different places. Wonder whether we should be starting to draw those things together, and what that might look like.

Sounds like lots of bits going on

<Zakim> jugglinmike, you wanted to make a distinction between "shared tests" and "shared test cases"

jugglinmike: Sharing whole tests might undermine the subtle differences between the projects.

If we talk about shared test cases, the thing that differs is what we're actually asserting. This distinction could help in how we talk about this.

<Zakim> Rachael, you wanted to speak about accessibility supported in WCAG 3

Rachael: Regarding accessibility supported in WCAG3, agree that we need a whole new section on it.

Should be in draft coming out in December

Potential that, if a conformance claim is made with a scoped accessibility support set, sometimes authors don't need to resolve things because ???

<Zakim> jcraig, you wanted to mention ARIA-AT is doing a reasonable job to cover the standard expectations Mike Gower mentioned and to mention the multiple types of ~monthly meetings and their scope

jcraig: Standard expectations are different on different platforms for keyboard conventions. ARIA-AT is doing a really good job of building a framework to allow for this type of thing. Not only do the ARIA-AT tests say "here are the platform-specific screen reader keys to do this" (e.g. navigate to this button);

e.g. the standard activation key on Mac is the space bar, while the standard activation key on Windows is Enter.

ARIA-AT is doing a good job of allowing an abstraction of what the expectations are, but let that abstraction be specifically defined in each of the different contexts

Not all right yet, but there's a path to get there
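A toy illustration of that abstraction (names and shape invented; not ARIA-AT's actual test format): one abstract command, with the concrete keystrokes defined per screen reader and platform:

```typescript
// Hypothetical mapping of an abstract "activate" command to concrete,
// platform-specific keystrokes; not ARIA-AT's actual data format.
type ScreenReader = "jaws" | "nvda" | "voiceover";

const activate: Record<ScreenReader, string[]> = {
  jaws: ["Enter"],                     // standard activation on Windows
  nvda: ["Enter"],
  voiceover: ["Control+Option+Space"], // VoiceOver activation on macOS
};

// A test plan can assert the same abstract expectation ("the button is
// activated and the change is announced") while each AT uses its own keys.
function keysFor(at: ScreenReader): string[] {
  return activate[at];
}

console.log(keysFor("voiceover")); // ["Control+Option+Space"]
```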

What about "the monthly meeting"? Actually multiple.

There is one for the WPT Interop project. It doesn't make sense for WCAG to be part of that, because it's more tightly scoped.

This covers WebDriver and Acacia testing.


There is another meeting that Lola manages monthly-ish about Accessibility Compat Data.

<mgifford2> lolaslab/accessibility-compat-data

Open to a third monthly meeting about testing.

Would be interesting to define: what is it that WCAG conformance tests are trying to test?

In the Interop meeting, both those projects are intended to test the rendering engine, not the content.


As I understand WCAG3 conformance tests, content is the input, and the output is whether or not it matches some algorithm or response based on what is defined in WCAG3.

They're not incompatible; they're testing different things, solving different problems. Wouldn't want to cross the streams.

Jamie: It absolutely makes sense to information-share and have a shared pool of test cases.

Jamie: However, there is a slightly different set of expertise to write these tests

Jamie: Also, coordination will be an interesting challenge in itself, and I'd hate to see the test-writing effort stalled on coordination issues

Jamie: So I'm cautioning against letting the perfect be the enemy of the good. Just avoiding the risk of trying to make the test cases common before we have the tests at all.

Matt_King: Earlier I was advocating information sharing. Not sure it needs to take the form of a monthly meeting. Feel like maybe there's not a lot of overlap in the actual work. More concerned about whether we have a coherent strategy across all of these things. To the extent possible, where it is good to align or de-duplicate, our strategy should address that.

From the perspective of conformance, we have a tricky problem in the area of "accessibility supported". Most interop is based on testing. In the case of ARIA-AT or PowerMapper etc., there's no standard. So the focus has been on building consensus on what the expectations should be.

???

If they're leveraging a sufficient technique for a non-normative but widely accepted convention, that ???

If we're going to have AT interop, we'd have to have normative specs, which has obviously been not just a hot potato but a flaming nuclear potato.

Think we can get to a place where we don't have to go into that space, but still have the benefits of something like it if we can get consensus around the tests

Cyns: Accessibility supported is a significant improvement on what we had in WCAG1.

Excited about having a much more rigorous, tested way to say a thing is or is not accessibility supported.

And that user agents have achieved that

For the test case alignment, interested in working on that. A spreadsheet, not a meeting.

Not sure who I should talk to about this.

Talk to Matt_King

Lola: +1 to Matt_King's comment

Align on strategy, direction and funding

Opportunity: Multiple projects could go for funding together

Make sure we're not individually going for the same funding, which could conflict.

<shawn> [ Shawn interested in helping communicate the alignment, clarify the scope and differences, etc. ]

<Zakim> Rachael, you wanted to agree we can document without requiring

Rachael: There is a way to walk the line between documenting what's going on and encouraging more harmonisation, while not requiring it.

Should figure out the right way to have those conversations going forward.

Draft of WCAG3 coming late December or early January.

Pulled together all the requirements

Big differences between WCAG3 and WCAG2: first, it's much more granular.

A lot more requirements than there were success criteria in WCAG2.

In WCAG2, a lot of requirements were joined together.

In WCAG3, breaking them out into smaller pieces makes it a bit more testable.

One of our requests for those working in testing: look at the draft that's coming out and dive into the details. Is what we're defining testable?

WCAG2 was much more technology agnostic

A step further in WCAG3: make it solution agnostic. This is the requirement: images need to have a text alternative, but we're not saying that authors necessarily have to define it.

Over time, it doesn't necessarily have to be the author. Want to write standards in a way that they can live for a long time and reflect changes in technology.

New technology could solve the problems differently.

Appreciate feedback on that from testing standpoint.

Not ready yet, coming soon

Huge ask :)

jcraig: Question for Rachael, kevin, etc.: if video content doesn't have captions, but the platform it's on provides captions, what would be sufficient? Perfection is never expected; "good enough" is kinda hard to write.

Rachael: Looking at a couple of different approaches

Two or three different kinds of requirements are being discussed in detail at present.

Things that are really basic, like "text can be detected" and "images can be detected": saying AT can get to them in some form.

Next layer: something is provided; e.g. captions are provided. No quality check at all, just something there

Next level up: quality checks. Captions accurately reflect what the content is.
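One hypothetical way to picture those layers (purely illustrative, not WCAG3's actual model):

```typescript
// Hypothetical encoding of the layered checks described above; invented
// for illustration, not WCAG3's actual model.
type CheckLevel =
  | "detectable" // basics: text/images are exposed so AT can reach them at all
  | "provided"   // something exists, e.g. captions, with no quality judgment
  | "quality";   // the thing is good, e.g. captions accurately reflect content
```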

Separated the way it's fixed from the end result

*examples*


Daniel: This is what we have for today. Thanks! Appreciation to those joining remotely in particular :)

Not worried so much about the specific process, just about making it happen and supporting that wherever possible.

Zakim: end meeting

Minutes manually created (not a transcript), formatted by scribe.perl version 248 (Mon Oct 27 20:04:16 2025 UTC).
