W3C

– DRAFT –
ARIA and Assistive Technologies Community Group Weekly Teleconference

17 April 2025

Attendees

Present
Carmen, ChrisCuellar, dean, howard-e, Isa, IsaDC, james, Joe_Humbert, jugglinmike, Matt_King, mmoss
Regrets
-
Chair
-
Scribe
jugglinmike

Meeting minutes

Review agenda and next meeting dates

Matt_King: Requests for changes to agenda?


ChrisCuellar: We filed an issue related to consistency reports for disclosure

ChrisCuellar: It's not higher priority than anything on the agenda today, though

Matt_King: Then let's plan to discuss it next week

Matt_King: Next Community Group Meeting: Wednesday April 23

Matt_King: Next AT Driver Subgroup meeting: Monday May 12

Current status

<ChrisCuellar> The disclosure nav related issue raised by Bocoup to be discussed next week: w3c/aria-at-app#1298

Matt_King: We're shooting for a new test plan roughly every week. We'll see how that goes

Matt_King: The stuff that we have in draft review right now--we will get a new one rolling today (the "rating radio group")

Matt_King: And we have one that's currently blocked by conflicts in JAWS results--"radio group with roving tab index"

Matt_King: I still haven't done anything on that to work on breaking the tie

Matt_King: Though we did raise it with Vispero yesterday, so we may get additional information on unblocking it when we talk to them next

Matt_King: Vertical temperature slider is on hold for the moment due to a couple issues

Matt_King: As is "disclosure navigation"

Matt_King: I think that for JAWS, we have to re-run the test plan because the output will be significantly different

Matt_King: For the other screen readers, we can probably leave the reports as-is because we don't anticipate changes (that is, for NVDA and Voiceover)

Matt_King: Coming up soon, will be the other disclosure navigation plan

Matt_King: Or maybe we should do the other disclosure first, and move that one down two weeks

Isa: Sounds good

Matt_King: So this will come up in about three weeks, instead. I will update the spreadsheet after this meeting

The spreadsheet is available here: https://docs.google.com/spreadsheets/d/14QIhQB9ufdUdZNhb3TNZzoZkGBq0oHawft-S5pm0ioY/edit?gid=0#gid=0


Isa: The versioning stuff with VoiceOver is set from 11-point-something to begin with

Matt_King: Oh, that's because I added it from the "report status" dialog. That's the only way to add the bot, and we have an open issue for being able to set the version number there

Matt_King: I think we need some other ways to run the bot, but we can discuss that elsewhere

Isa: It would be good to have a way of editing the AT version without deleting the report and re-adding the test plan for that specific AT from the Test Queue

Matt_King: If we were able to add it with the correct minimum version, then we wouldn't need to edit it

Isa: I agree

Matt_King: We need some volunteers for all three screen readers

Joe_Humbert: I can run it for JAWS

Matt_King: Yay! Can you assign yourself in the Test Queue?

Joe_Humbert: Sure

Matt_King: Next on the list is NVDA. The bot has successfully recorded responses

Joe_Humbert: I can do all three if necessary

Isa: Thank you! I will assign you the bots' results for both NVDA and VoiceOver

Matt_King: Wow, Joe_Humbert's killing it

Joe_Humbert: I had a question about the vertical temperature slider

Matt_King: I left the vertical temperature slider off the agenda

Isa: We changed one test

Joe_Humbert: Do I need to go back and re-do the test for that one result?

Isa: It's not urgent because that test plan is blocked for now.

Matt_King: Before we finish the NVDA one, we're waiting on issue 1352

Matt_King: And for JAWS, we have a conflict that's very similar to the one we have for the "roving tab index radio group"

Matt_King: You can do the JAWS one, but hold off on the NVDA one

Joe_Humbert: Got it

mmoss: I can also run it for VoiceOver

Isa: I will run the bot, so you should have the responses available to you after the end of this meeting

Re-run of JAWS for color viewer slider

Matt_King: How did we get two rows in the table for the same minimum version of JAWS and the same browser? Is that a bug in the app?

Joe_Humbert: I also see those two rows

Matt_King: Is it possible to add the same one twice and create two rows?

howard-e: It may have stemmed from the functionality that was added to allow "exact" versus "minimum." It shouldn't be possible

Matt_King: I'm hesitant to hit "delete" because I don't want it to accidentally delete the row above

howard-e: I can't immediately tell when it was introduced.

howard-e: But I reproduced it locally right now, and I can confirm that it is safe to delete

Matt_King: Okay, I'm hitting "delete report" right now

Joe_Humbert: It's gone and the prior work is still there

Matt_King: Yay! We're good

Carmen: I will raise an issue related to that.

Joe_Humbert: It reported the same version in both of them

Joe_Humbert: Since I reloaded the page, I see myself assigned to the "rating radio group" ones, but NVDA and VoiceOver say "0 of 15 tests evaluated"

Isa: I assigned the bot's results to Joe_Humbert, so it should show them

Matt_King: "Evaluated" means you just have to assign the verdicts

Joe_Humbert: Ah, yes. It says "evaluated" while the row for JAWS says "completed"

Isa: We need a second person for NVDA

dean: I can do the "rating radio group" for NVDA

Isa: I will run the bot and assign the result to dean

Matt_King: Awesome. Thank you

App issue 1367 - Prototyping decisions for manual testing on mobile

github: w3c/aria-at-app#1367

howard-e: Initially, when thinking about the manual tester's collection, we've always wanted to do it on-device or as close to the device as possible

howard-e: Perhaps using system utterances in the same way we're doing with NVDA and VoiceOver

howard-e: iOS's security precautions make that difficult

howard-e: For Android, we're considering an approach of taking video recordings and extracting text data from that video stream. But that's very much a fallback solution

howard-e: What we have built today, however, is a set of scripts that allow one to collect the utterances from TalkBack

howard-e: One can press the "run test" button and begin recording utterances

howard-e: We've confirmed that we can get system-level utterances from Android

howard-e: We've started work on this for iOS, but the feasibility is less clear

howard-e: That's all presented in this issue, and at the end, I've included four questions

howard-e: First: should we move forward with building an interface to make it easier for others to evaluate the Android-specific solution that I just described?

howard-e: Second: should we halt the approach for VoiceOver (because there is so much uncertainty) and move forward with the fallback video-recording approach?

howard-e: So it's really just the two questions

Matt_King: I'm personally very reluctant to do anything that relies on OCR, even as a fallback, because it seems very unreliable, potentially super-flaky, compute-heavy, and not very scalable

Matt_King: We can think of it as a fallback, but in my mind, it's a very distant fallback

Matt_King: And given where we're at right now, I think even a prototype on Android (one that just experiments with different ways to make the experience viable) could be really useful because we don't know what ANY method of doing this could feel like for human testers

Matt_King: We have humans who will manually run the tests on Android devices. Right now, doing that on a mobile device would be ridiculously time-intensive because it would involve human transcription (running the test runner on both the mobile device and a laptop)

Matt_King: The question is, what do we actually want the tester to perform on their device? Which steps?

Matt_King: Let's imagine how we might eliminate the horror of this work

james: If I could open a web page on a mobile device and sign in, it's a painful experience, sure. But if you're running the test plan on your laptop, it would be good if you could press a button on the page on your laptop and have that open the page on your phone

Matt_King: A dream user-experience would be if, somehow, your Android device is connected to what you're doing on your laptop in such a way that you would be able to run through the test plan on your laptop, but when you press "open test plan" on the laptop, it actually opens in your Android device. When you execute the prescribed gestures on the Android device, the computer is automatically collecting the AT response into the appropriate field on the laptop

howard-e: I left out two details here. The setup scripts allow you to directly open the test page. At the end of the utterance collection (The end of the script or a closed process), the AT responses are available on the clipboard

james: Just to be clear, though, it's a non-starter to expect anyone to input assertion verdicts on the mobile device

james: Any mobile-oriented testing needs to assume that the primary driver is a computer connected to a mobile device (because form entry is unavoidably painful on mobile)

howard-e: I am not proposing a mobile-first experience. I want to limit the interaction with the Android device as much as possible--just to direct interaction with the test page

dean: Is there something like the VoiceOver recorder that will simply record everything and transcribe it--that could then be cut-and-pasted into the results?

Joe_Humbert: Possibly on Android. ADB is very powerful, but it's also very fickle

howard-e: That's the Android Debug Bridge. It's powering some of the scripts I'm presenting today

dean: If something could record what I'm doing with VoiceOver on my iPhone and just put it in Notes, I could open Notes on my laptop and just paste it into my test results

dean: It's not any more work than what I'm doing now on my desktop with the VoiceOver recorder

james: The VoiceOver testing experience, even with the recorder, is extremely inefficient. So using it as a baseline for our mobile testing solution doesn't seem ideal

dean: Agreed

howard-e: To your point, dean, I took a look at that for iOS. Even in generating the transcript which is provided by Apple for its own recording--that transcript is significantly incorrect

Matt_King: I don't want to worry too much about those kind of problems at this point

<Joe_Humbert> I have to drop. I will try to get all the work done by next wednesday

Matt_King: If it turns out that we can do this well on Android but that problems with Apple products make it so that we can't do it well on Apple products, let's bring that as feedback to Apple and point to our work with Android as a baseline for what is reasonable or good

Matt_King: But it's really hard to have any conversation without having some kind of concrete experience

Matt_King: james described an ideal workflow, but a prototype isn't about implementing an idea. It's about taking a first step in that direction

Matt_King: So if howard-e could bring us a functional "step one", we can review and discuss next steps

howard-e: Got it

howard-e: This issue is now slightly outdated. The team at Bocoup has begun investigating a similar approach on iOS. It's certainly less documented, so it's requiring additional research, but we're hopeful

james: What does the approach look like?

howard-e: It's using system-level intercepts of the voice. We wouldn't expect testers to use it on their personal devices. It probably gets us to a position that's better for automation

james: With a custom voice like we're using on macOS?

howard-e: No, that's not necessarily possible on iOS

ChrisCuellar: We're taking inspiration from security researchers

james: That may raise issues with Apple's acceptance of results collected in that manner

james: It's worth noting that eSpeak is available on iOS and it is open source

james: It's available in the iOS app store

Matt_King: That must mean that custom synthesizers are already supported

james: Yes. It's a universal macOS and iOS implementation

Matt_King: Anyway, let's start there with some "step 1" prototype for Android and see where it takes us

howard-e: Sounds good! Thanks!

Capturing lessons learned from JAWS AT driver experience

Matt_King: I don't necessarily want to talk about the lessons learned right now, but I do want to discuss how we capture the lessons and respond to them

Matt_King: We could do a one-off meeting with some of the members here

Matt_King: We have such a small attendance on the automation subgroup meeting, it's realistic to move it

Matt_King: Maybe we want to do a poll to reschedule

Matt_King: It seemed like james wanted to do this sooner rather than later

james: It's not urgent. It just depends on whether Vispero would benefit from anything happening sooner. And also whether they can join that call

james: I'm happy for that to happen in the next automation meeting

Matt_King: So even if we had it in the first half of that meeting, you would be able to attend that first half?

james: Actually, I can join for the full hour on May 12

james: Okay, let's see if we can get Brett and Glenn to join that one

jugglinmike: I've been working with them directly, so I'll reach out

Issue 1355 - Version of Bot used for a run

github: w3c/aria-at-app#1355

Matt_King: As I understand the issue: if we say that a run requires the latest version of NVDA, then when we run it with the bot, this pull request ensures that it only runs if the bot has that version available... Is that right?

howard-e: For the longest time, it has been prioritizing the release date when choosing the "latest". There was some oversight when it comes to patch releases of older versions

howard-e: For instance, first version 13 is released, then 14 is released, then 13.1 is released

howard-e: One would expect version 14 to be the latest version, but the system was selecting 13.1

howard-e: We saw this mainly with NVDA versions and with VoiceOver versions. Mainly VoiceOver versions
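The ordering pitfall howard-e describes can be sketched as follows. This is an illustrative example with hypothetical version data, not the app's actual code:

```python
from datetime import date

# Hypothetical AT versions: 13.1 is a patch of 13, released after 14.
versions = [
    {"version": "13", "released": date(2024, 1, 1)},
    {"version": "14", "released": date(2024, 6, 1)},
    {"version": "13.1", "released": date(2024, 9, 1)},
]

# Buggy behavior: choosing "latest" by release date picks the 13.1 patch.
by_date = max(versions, key=lambda v: v["released"])

# Fix: compare version numbers numerically, component by component.
def version_key(v):
    return tuple(int(part) for part in v["version"].split("."))

by_version = max(versions, key=version_key)

print(by_date["version"])     # 13.1 (the oversight)
print(by_version["version"])  # 14 (the expected "latest")
```

When the date-based ordering also feeds bot selection implicitly, the bot ends up running against the patch release instead of the newest major version, which matches the concern raised above.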

Matt_King: We always have the option of not adding a patched version into the system at all

Matt_King: Is this primarily affecting the order that versions are presented in the drop-down?

howard-e: Yes, and that order was also being implicitly used in bot selection. And that is where the true concern popped up

Matt_King: We have a challenge right now where, when we run with the bot, we aren't able to choose a specific version. We only have one way of kicking off the bot right now.

Matt_King: I'm going to raise an issue about reconsidering the use-cases for starting bot runs

Minutes manually created (not a transcript), formatted by scribe.perl version 244 (Thu Feb 27 01:23:09 2025 UTC).

Diagnostics

Succeeded: s/Matt_Link/Matt_King/


Succeeded: s/has/has successfully/

Succeeded: s/completed"/evaluated"/


All speakers: Carmen, ChrisCuellar, dean, howard-e, Isa, james, Joe_Humbert, jugglinmike, Matt_King, mmoss

Active on IRC: ChrisCuellar, Joe_Humbert, jugglinmike, mmoss