W3C

– DRAFT –
ARIA and Assistive Technologies Community Group Weekly Teleconference

09 April 2025

Attendees

Present
Carmen, ChrisCuellar, Hadi, IsaDC, James, Joe_Humbert, jugglinmike, Matt_King, mmoss
Regrets
-
Chair
-
Scribe
jugglinmike

Meeting minutes

Review agenda and next meeting dates

Matt_King: Next AT Driver Subgroup meeting: Monday April 14

Matt_King: Next Community Group Meeting: Thursday April 17

https://github.com/w3c/aria-at/wiki/April-9%2C-2025-Agenda

Matt_King: Requests for changes to agenda?

Current status

Matt_King: We have 14 plans in candidate review

Matt_King: We have two in the test queue, and then the "disclosure" plan is on hold while we wait for the JAWS update

Matt_King: We'll talk about the other two plans which are currently in the queue

Matt_King: The agenda has a link to a spreadsheet we are building to make a schedule for the upcoming work: https://docs.google.com/spreadsheets/d/14QIhQB9ufdUdZNhb3TNZzoZkGBq0oHawft-S5pm0ioY/edit?gid=0#gid=0

Testing Radio group with roving tab index

Matt_King: In this test plan, we only have one conflict with JAWS. Everything else is done

Matt_King: I think we may have narrowed this down to Windows 10 and Windows 11 giving different results

James: I think we have two testers on Windows 10 who have produced one set of results consistently, and two testers on Windows 11 who have produced a different set of results consistently

IsaDC: Are you sure we have two testers on Windows 11?

Matt_King: Well, we know IsaDC is on Windows 11

Matt_King: And we know James and Joe_Humbert have Windows 10

Matt_King: I am on Windows 11. It seems like I should check this one out

Matt_King: If I get results like James and Joe_Humbert, then IsaDC's machine would be the outlier, and we'd be wondering what's going on

Joe_Humbert: Is this something we should ask Vispero about?

Matt_King: Absolutely! If I observe what IsaDC has reported, then I will reach out to Brett at Vispero

Matt_King: In that case, we might have to collaborate with Vispero about what JAWS ought to say. That would be a new circumstance for us

Matt_King: Anyway, I'll take this as an action item for the next meeting

Testing vertical temperature slider

Matt_King: We have multiple situations with this test plan

Matt_King: Let's start with the issue James raised yesterday. It's an APG issue

Matt_King: The way the test plan is currently written, it's not possible for testers to get the same output for the down arrow command when navigating to the slider with NVDA

Joe_Humbert: I noticed that with JAWS, you get to the slider because you press down arrow three times. With NVDA, you don't get to the slider because you only press down arrow twice

Matt_King: Yes

Matt_King: I think we should update this test plan

Matt_King: James raised an issue against APG

Matt_King: I think the extra arrow press is due to a bug in the test case. The "25 degrees C" text that appears should not be there because it is duplicative

Matt_King: I don't know why it's there; we'll discuss that in the next APG Task Force meeting

Matt_King: To work around the bug, I think we should just update the test so that the command includes another arrow key press

IsaDC: I have the pull request for that ready to go. I wanted to discuss it here, first

Matt_King: I'm suggesting that the APG remove the text label, but that will come later.

Joe_Humbert: They would have to remove it completely because it's text. You don't want to hide text from a screen reader user.

James: It would be acceptable to hide the text because it is duplicated in the aria-valuetext.

Joe_Humbert: I still think it's generally a bad practice

Matt_King: Yeah, the APG might come back and say that it's fine the way it is

Joe_Humbert: I can see the text rendered on the screen twice. One is above, and one is next to the slider "thumb"

James: That seems visually duplicative

ChrisCuellar: I agree

James: Hiding text can indeed set a dangerous precedent, but so does enunciating text twice. I think it should just be in one place for everyone

Matt_King: For context, we've had many requests to add "aria-hidden" to the text "temperature"

Matt_King: The Task Force has pushed back on that

Joe_Humbert: I think the number on the top is better because it is larger and easier to read and because it doesn't move with the "thumb"

Matt_King: Maybe you don't need the one on the rail

Matt_King: Well, we'll see what the Task Force says. For now, I'm glad IsaDC has a patch ready

Matt_King: Applying that will mean running that one test again

James: Do we need to alter the "navigate backwards" tests?

Matt_King: Nope

Matt_King: Were there any conflicts which were not related to that "moving forward" test?

IsaDC: About the negative assertions: what should we do, for now, when we find negative assertions that would technically pass?

Matt_King: This is an issue which appears later on the agenda

Matt_King: James raised an issue for this, and then I have worked on a solution.

Matt_King: I think we'll need to revisit the testing for this--and maybe even the prior slider testing we've done for NVDA. I think this is blocked until we can generate an accurate report. What I mean by that will become clearer in a minute

Re-run of JAWS for color viewer slider

Matt_King: JAWS has fixed some bugs, so it would be advantageous if we could re-run these tests

Hadi: I can help

Joe_Humbert: I can, as well

Matt_King: Great! We want to use the latest JAWS. The version which was released in March of this year

IsaDC: I will assign you both

Matt_King: Great!

App issue 1352 - Changes for negative side effects and untestable assertions

github: w3c/aria-at-app#1352

Matt_King: I put together a detailed plan that talks about the test runner and the reports

Matt_King: When you encounter a test like this (where it technically passed an assertion but for an inappropriate reason)

Matt_King: ...there would be a checkbox which indicates whether the assertions were testable

Joe_Humbert: So if it doesn't work as we think it should (it skips over or it says nothing at all, for example), there will be a checkbox for us to use which says, "we can't apply these assertions"

Matt_King: [summarizes the full UI and workflow as proposed in the issue]

Matt_King: I would like this to be expedited as quickly as possible so that we can get accurate reports on all of the sliders. I think we may even need to re-run a few VoiceOver tests because we encountered this problem and the way we reported them to Apple was confusing

Carmen: Sounds good. We have a planning meeting tomorrow, so we can prioritize this work accordingly

Hadi: How often does this condition occur?

Matt_King: It occurred on any test plan with Shift+J for one screen reader. We also just found it in a class of tests for NVDA

Matt_King: So far, it's happened in probably seven or eight test plans and with two screen readers

Matt_King: We discussed what to do without this new feature. We could just mark assertions as failing, but that gives a misleading picture of what is wrong. It produces confusing data, and I don't think we want that

App issue 1162 - Automated updating of reports for new AT releases

github: w3c/aria-at-app#1162

Matt_King: This feature affects everyone, but it's really only used by admins

Matt_King: When a new version of JAWS comes out (one is due in May)--or NVDA for that matter

Matt_King: We would like to be able to re-run all the test plans in "candidate review" using the bots

Matt_King: It won't automatically add the bot to the system and start running the tests. It will require some administrative input. This interface is for performing that task. It is deployed to the "sandbox" server right now, and it is ready for testing

Matt_King: In the issue I linked to, Carmen's latest comment has instructions for testing

Matt_King: As soon as we have approval from IsaDC and me, we'll have this pushed out

Matt_King: This will be a big deal for us, especially when JAWS releases the next version

Joe_Humbert: With this kind of automation, will it be possible to get results for previous versions, or will that require manual testing?

Matt_King: I think we could add older versions of screen readers to the bots and use those

jugglinmike: For NVDA and VoiceOver, yes. For JAWS, we may need to do some extra work (depending on whether Vispero hosts previous releases)

jugglinmike: So to support older versions of JAWS, we may need to keep those versions on hand ourselves

Issue 1213 - Minimum version requirements

github: w3c/aria-at#1213

Matt_King: I don't know if we want a policy on this issue or if we just want it to be an admin decision every time you add something to the test queue

Matt_King: Depending on how the run is added, the app may or may not present a user interface for setting the version

Matt_King: I think that may be an omission in the design. I think addressing that omission may resolve the issue

IsaDC: That would definitely resolve it

Matt_King: We want to be able to control the minimum version of the AT both when adding a test plan to the test queue and in the report status dialog

Matt_King: Carmen, can you create an issue for that in the aria-at-app repository?

Carmen: Sure

Issue 1211 - Reliability of app support for advancing new test plan versions

github: w3c/aria-at#1211

Matt_King: We expect results to be copied into the draft report for the new test run. We've seen some inconsistent behavior on this, though

Matt_King: James filed an issue, and howard-e shared a very detailed response. Have you had a chance to review howard-e's response, James?

James: I did read this when it was first posted; I will have to refresh my memory

Matt_King: I don't think that we have a current behavior failure

Matt_King: We did have an example, but we destroyed it when we fixed the problem

Matt_King: We're going to have an opportunity coming up. IsaDC is working on a change to the slider. We'll see if that one works correctly. It might have something to do with which specific things get changed in the test plan. We can just leave this issue open until we see a problematic behavior again

James: We're missing the ability to update the reference without changing anything in the test plan itself.

James: Some changes would warrant changing the reference date. But sometimes we have to make a small change just to make settings work. What we don't have in the app is a way to take notice of that

James: from howard-e's response, it seems as though the app is only aware of a command being changed, an assertion being changed, or a change to an assertion ID

James: ...but we also want the app to take notice if we change the reference or the setup script

James: So right now, we've pushed a new test plan, and it doesn't get re-imported

Matt_King: That's a different problem, then. This is about copying results

Matt_King: If, for example, the assertions change, then you don't copy the results from the prior into the new

Matt_King: If the setup script changed, is that another one that should void prior results? What about the reference?

James: It's tricky to say because that's on a "test" level

Matt_King: Right

Matt_King: One side effect of tracking who the tester is, is that we currently don't have a function for changing the tester from one person to another

Matt_King: It would be really nice if, when something was assigned to me and I did half the work, if I could re-assign it to Joe_Humbert. Then Joe_Humbert would assume responsibility for everything I've done, and he could finish the rest of the work

IsaDC: With the bot, it would be really useful to have that, because sometimes we have the bot collect responses, then we assign a tester, and then that tester can't help, and we aren't able to re-assign the run to another tester

Matt_King: That sounds like another feature request

Matt_King: A button for "change assignee"

Matt_King: We could even make the person's name into that button. Right now, it's a link to their GitHub profile

Matt_King: You can propose something

Matt_King: Right now, I would prioritize this as "P2"

Carmen: Got it!

Matt_King: If prior results that aren't valid get copied, it's up to someone to re-run those tests or make sure the previously-copied results are valid

Matt_King: Do we want to err on the side of over-copying (copying things that may have been voided), or under-copying?

James: I would like to test things like these before they go into the main app

James: I think that, regardless of the route we take, it needs to be possible for us--when we make a change to a test plan--to run it through a separate environment which is a copy of production, in order to review the actual change

James: Then we can immediately halt and not deploy to production if something unexpected happens

Matt_King: Essentially testing the test plan itself

IsaDC: Yes!

Matt_King: Okay, that is a separate issue. It's on the agenda, though we won't get there today

Matt_King: I think it might not be a massive piece of work to make it happen. We'll save the discussion for when we get to that issue

Matt_King: But in the mean time, if you can reflect on how safe we want to play it, I think that would be helpful

James: I would also love the ability to "roll back" anything that happened. Whether due to a bug or an expected-but-hard-to-predict behavior, I would love to be able to revert

IsaDC: I'm pushing some changes, and I would like to know if the results we have now--will we have a way to get them back?

James: We're making a change to a test plan, and it's possible that the same issue in the app will occur. Do we have a strategy to address it if we lose the results?

Carmen: I can ask howard-e tomorrow

Matt_King: Let's do it today and pay attention to what happens. If something goes wrong, we can send howard-e a detailed e-mail describing what happened

Carmen: directly after this call, I will see if we can do a database dump. I'll reach out to you soon, IsaDC

App issue 1365 - Bot reliability reporting

github: w3c/aria-at-app#1365

Matt_King: Bocoup has come up with a testing methodology to test the reliability of the bots

Matt_King: I included a link to a recent report in the agenda

Carmen: We are testing consistency by running each test five times and checking whether the responses differ across the trials

Carmen: You will see that NVDA is over 99% consistent. Our focus this year is on VoiceOver--it is currently at 91%, and we would like to raise it to at least 95%
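
For illustration only (a minimal sketch; the data and names below are hypothetical, not the actual Bocoup tooling), a consistency percentage like the ones above could be computed by treating a test as consistent when all five of its trials return identical responses:

    # Hypothetical data: each test maps to the responses collected across five trials
    trials = {
        "test-01": ["slider, 25 degrees Celsius"] * 5,
        "test-02": ["radio button, checked"] * 4 + ["radio, checked"],
    }

    # A test counts as consistent when every trial produced the same response
    consistent = sum(1 for responses in trials.values() if len(set(responses)) == 1)
    print(f"Consistency: {100 * consistent / len(trials):.1f}%")  # 50.0% for this sample data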

ChrisCuellar: Right now, it is triggered completely manually

ChrisCuellar: We would like to run it as a service on a regular schedule

Matt_King: I wonder if this will change with different versions of screen readers and bots

ChrisCuellar: Now that we're letting the CG know about the reports, they are almost like a feature that we can iterate on and that we can improve in response to external input

Matt_King: Okay, this is great. I'm really happy to have these metrics!

Minutes manually created (not a transcript), formatted by scribe.perl version 244 (Thu Feb 27 01:23:09 2025 UTC).
