(MEETING TITLE) – 08 October 2025

Meeting minutes

Review agenda and next meeting dates

Matt_King: Requests for changes to agenda?

Matt_King: hearing none, we'll stick with the agenda as scheduled

Matt_King: Next CG meeting: Thursday October 16

Matt_King: Next AT Driver Subgroup meeting: Monday October 13

jugglinmike: October 13 is a holiday in the US, so we will have to reschedule the AT Driver Subgroup meeting

IsaDC: PAC is also closed on that day

jugglinmike: Let's push it to the same time the following week--Monday, October 20

Current status

Matt_King: We changed "switch made from HTML checkbox" to "draft review"

Matt_King: Everything else is the same

Matt_King: We'll probably move on to the two-state checkbox example today

Matt_King: And soon after that: the quantity spin button

Issue 1521: Rename test plan directories

github: w3c/aria-at-app#1521

Matt_King: This is about the repository where we write all of our tests

Matt_King: I am asking what it would take to possibly adopt a naming convention for the directories where the test plans live

Matt_King: That's because I am preparing for the time when we have a lot more test plans that are not coming from the APG--test plans where people write the test case itself directly in the repository. The code, example, etc. will all be sourced from this repository. And the tests are more atomic: a specific ARIA feature in a much simpler context

Matt_King: We're planning to teach people how to do that at TPAC

Matt_King: I'm going to come up with an example of a simple test plan that they can essentially copy and then modify

Matt_King: Essentially, we have a few kinds of tests: you "navigate to", "get information about", and "operate"

Matt_King: Imagine we have a test plan for "aria-details" or "aria-errormessage". The directory could just be named according to the name of the property

Matt_King: I'm not sure what to do for ARIA roles because it could become confusing when an element and property have the same name

Matt_King: Someone might see a directory like "aria-button" and get confused, thinking that there is a property named "aria-button"

Matt_King: One approach is that we could put "apg-" as a prefix to the directories that originated from the APG

Matt_King: The plan is to have a lot more test plans in the repository which come from different places

Matt_King: In this issue, I list some of the downstream impact of changing directory names

ChrisCuellar: I think that captures it accurately!

Matt_King: Would prepending all the APG tests with "apg-" be a good approach?

mfairchild: It sounds fine to me, but is there a reason we couldn't use sub-folders?

Matt_King: I would love to do that. It adds other kinds of complexity, but I don't know if it's more or less complex

Matt_King: The biggest problem would be that we would have to decide ahead of time what all of the "sources" are, and the code in the system would have to account for that... Or I guess, maybe under the "test" directory, the system just sweeps all sub-folders.

ChrisCuellar: I'm thinking about the precedent in the Web Platform Tests project, which makes heavy use of a hierarchical directory structure

ChrisCuellar: There's no way to anticipate future entries, just due to the nature of the web platform. And I think building to support that open-endedness makes sense

Matt_King: Did anyone have pushback against the idea of a "screen-reader" folder?

jugglinmike: I think having a directory structure by AT doesn't give us much flexibility, since we might be replicating tests across different AT folders. We might also run into version parity issues. Namespacing by AT seems not ideal.

Matt_King: Maybe that's not necessarily a knock on subdirectories but more about how we work with different ATs, which is a future-looking question. But the general question of subdirectories is still open, I think, if we segment by things like ARIA, HTML-AAM, APG, etc.

ChrisCuellar: I think subdirectory structure according to the type of feature under test makes sense. Unlike segmenting by AT, segmenting by feature will probably not involve test plans replicated across subdirectories

Matt_King: Oh yes, for sure

jugglinmike: But there's not a one-to-one mapping of test to feature, tho.

Matt_King: But the subdirectory structure could at least reflect the scope of the test. It tells us the focus of the test even if it's not exactly testing one feature.
… I think we have the same issues in WPT. Look at how accessibility is tested broadly across WPT.

jugglinmike: Bocoup is currently engaged in this problem with WPT specifically. We're trying to classify tests by web feature. It's a pretty complex problem.

Matt_King: Accessibility bugs are rarely traceable back to isolated features. Bugs are often contextual.

Matt_King: I would prefer subdirectories--it seems more in-line with what other repositories do, and it is generally easier for humans to parse. It's also harder to get wrong (especially since naming involves typing strings, but if a subdirectory already exists, then there is less of a hazard)

howard-e: It's a similar level-of-effort whether we use sub-directories or directory name prefixes

Matt_King: Based on what we've learned from the reference work so far, we would have three sub-directories: html, aria, and apg

Matt_King: Okay, this is really helpful. In terms of making a plan for this issue, do you think we're better-positioned, now?

ChrisCuellar: I think so! We're going to move everything except for support.json and commands.json into a new sub-directory named "apg"

Matt_King: Correct

Matt_King: And we'll have a very crisp plan for when we actually enact this change

IsaDC: As the person who writes the test plans, the way I organize them is by sub-directories. So for me, that would be the best approach, if possible

Matt_King: Good!
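
To illustrate the "sweep all sub-folders" idea raised above, here is a minimal sketch, not the project's actual build code. It assumes a layout of tests/<source>/<test plan>/, with shared files such as support.json and commands.json left at the top level. The function name and the example directory names are hypothetical.

```python
from pathlib import Path


def discover_test_plans(tests_root: Path) -> dict:
    """Map each source sub-directory (e.g. "apg") to the test plan
    directories found directly inside it.

    Assumption: every directory one level below a source directory is a
    test plan. Top-level files (support.json, commands.json) are ignored
    because only directories are visited.
    """
    plans = {}
    for source_dir in sorted(p for p in tests_root.iterdir() if p.is_dir()):
        plan_names = sorted(p.name for p in source_dir.iterdir() if p.is_dir())
        if plan_names:  # skip source directories that hold no plans yet
            plans[source_dir.name] = plan_names
    return plans
```

Because the function lists whatever sub-directories exist rather than checking against a fixed set of sources, adding a new source later (beyond html, aria, and apg) would not require a code change under these assumptions.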

Question about Disclosure Navigation Menu Example

Matt_King: This showed up in the test queue again

Matt_King: I was scratching my head because I was quite certain it wasn't present last week

Matt_King: It has a run by Joe_Humbert and a run by JAWS Bot

ChrisCuellar: Before we rolled out a bunch of fixes last week (for the last bits of the updates to the bot-running), we fixed an issue where some bot runs were getting misclassified as automated updates. That's where we're trying to collect everything when we're doing an automated re-run. There were just some issues in how we were filtering bot runs which we corrected in the previous release. I think that test plan was hidden previously, and now it's back where it belongs

Matt_King: We had previously published this, I thought. I may be mistaken about that, though

IsaDC: We had "rating radio", but not disclosure

Matt_King: We were re-running disclosure for the latest version of JAWS.

Matt_King: Maybe I'm not remembering correctly, and it got hidden by mistake, and I just didn't notice that we never published it

ChrisCuellar: We can look into this. I want to verify my hypothesis. It sounds like you folks don't recognize this run, though

Matt_King: It says here that Joe_Humbert's run has zero verdicts

ChrisCuellar: It looks like it might be a bot run that got re-assigned but never finished

Matt_King: Yeah, and the other one looks like an automated bot run where almost all the verdicts got assigned...

Matt_King: If you could do a little investigation to determine why this happened

Matt_King: It's odd that all the responses matched but the verdicts were not assigned

ChrisCuellar: It appears as though all the tests were run by Hadi and IsaDC

Carmen: I can raise an issue for this

Matt_King: Alright, thanks

Running Switch test plan

Matt_King: I commented on your issue, Elizabeth

Matt_King: I think it's possible that the cause of your problem is that the function keys aren't doing what they're supposed to do

Matt_King: I found an article

<Matt_King> https://support.apple.com/guide/voiceover/change-function-key-behavior-mchlp2685/10/mac/26

Matt_King: This is an Apple article about how you change the "function" behavior so that you don't have to press the "Fn" key

Elizabeth: I'm getting the correct response, now, actually

IsaDC: The conflicts are gone, now, so this Test Plan is done

IsaDC: I've just now marked it as such

Matt_King: Awesome. We can advance that test plan to "candidate" because we have all three ATs done

Running test plan for Tabs with Automatic Activation

Matt_King: We had nine conflicting results with Hadi, but Hadi isn't present today

IsaDC: I can e-mail him

Running test plan for Tabs with Manual Activation

Matt_King: IsaDC is still working on this one in JAWS

IsaDC: I'm going to run that test plan for JAWS

Matt_King: For NVDA, we need one more tester

Matt_King: Do we have somebody who is available to take on the NVDA run?

Elizabeth: I'm done with "switch", but I don't have access to a PC at the moment

dean: I can test "Tabs with Manual Activation" with NVDA. That was my intention--to do that once I finished it with VoiceOver, but I'll just do that, now.

Matt_King: Cool!

Updating reports to latest screen reader versions

Matt_King: We have this feature in the system for when a new version of a screen reader comes out. We can have the bot run the test plans for all the published reports. The bot goes through and runs the test plan with a new version of the screen reader. If it gets the same response, then it assigns the same verdicts

Matt_King: Bocoup folks: does the bot also carry forward the designation of "untestable" in this case?

Matt_King: We should verify what it does with untestables and side-effects

Matt_King: Anyway, back to the context of this topic

Matt_King: If the bot can re-assign all the verdicts based on matching with historic AT responses, then it immediately publishes a new report

Matt_King: Otherwise, we need to have a human review

Matt_King: Because of the differences in the way the bots record output and the way that humans report output, the distinction between the AT responses can be trivial

Matt_King: But other times, it is not

Matt_King: for instance, the presence or absence of hint text can interfere with the matching heuristic
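
The carry-forward behavior described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the bot's actual implementation: the normalization rule (ignore case and extra whitespace), the function names, and the data shapes are all hypothetical.

```python
def normalize(response: str) -> str:
    """Assumed normalization: lower-case and collapse runs of whitespace."""
    return " ".join(response.lower().split())


def carry_forward(previous: dict, new_responses: dict):
    """Re-assign verdicts where the new AT response matches the old one.

    previous:      command -> (historic response, historic verdict)
    new_responses: command -> response recorded by the bot
    Returns (verdicts, needs_review). A new report could be published
    automatically only when needs_review is empty.
    """
    verdicts = {}
    needs_review = []
    for command, response in new_responses.items():
        old = previous.get(command)
        if old is not None and normalize(old[0]) == normalize(response):
            verdicts[command] = old[1]  # same response, so same verdict
        else:
            needs_review.append(command)  # a human must judge this one
    return verdicts, needs_review
```

With this kind of matching, a response that differs only in spacing or capitalization still matches, while one that gains hint text (for example, "on" becoming "on. To toggle, press space") does not and lands in the review list.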

Matt_King: We can find the tests requiring oversight in the test queue via a button at the top. There are filters there: one for manual test runs (that's the default), and one for automated updates. If you press the filter for "automated updates", then right now, you will see six entries.

Matt_King: So the job is to just go through and look at these responses and determine whether or not the new response passes or fails the verdict.

Matt_King: So we have six of these ready to go. Is there anybody who wants to take on some of this work?

Elizabeth: Yes, I have some time for this, actually

Matt_King: Okay, I can assign the first one to you right now (the "action menu button")

ChrisCuellar: It should stay in the "automated" section even after you assign it

Matt_King: I assigned the first one to you, Elizabeth

ChrisCuellar: Yup, I see it assigned to Elizabeth, and it is indeed still in the "automated updates" section

Matt_King: This is a pretty good case for why we might want a new filter for "My Test Runs"

Matt_King: Elizabeth, when you open that, you'll see it looks a little bit different

Matt_King: For each command, you'll see what the response was for the previous report, and in the "edit" box, you'll see what the response was in the latest automated run. We don't want you to actually edit that response; we just want you to report whether the new response satisfies the assertions

– DRAFT –

Attendees