W3C

– DRAFT –
ARIA and Assistive Technologies Community Group Weekly Teleconference

30 March 2023

Attendees

Present
James_Scholes, jugglinmike, Matt_King, michael_fairchild, Sam_Shaw
Regrets
-
Chair
Matt King
Scribe
jugglinmike

Meeting minutes

Review Agenda and Next Meeting Date

<Matt_King> Next meeting is scheduled for April 6.

Matt_King: Our next meeting will be April 6 as per the usual schedule

AT Support Table Launch Update

Matt_King: We have made a lot of progress since last week!

Matt_King: The support tables for Button and Toggle Button have been merged to the main branch. That does not mean that they are live yet, but they will be

Matt_King: Alert, Link, and Radio are in progress and will be merged soon

Matt_King: James Scholes and I have tentative plans to meet with Vispero next week

Matt_King: We will also attempt to meet with Apple

Matt_King: These two stakeholders will see support tables for all 5 patterns we're targeting

Matt_King: They'll also see a draft of the announcement. And I'll share that draft in this meeting next week

Matt_King: The live reports on the ARIA-AT site for alert, button, and toggle button have all been updated

Matt_King: I still have to do that for Radio and Link

Matt_King: Bocoup fixed the issues with those that I'd previously reported

Matt_King: So things are all lining up for the launch!

Current testing check-in

Matt_King: We're going to get more data for the five plans. Originally, we only had data for JAWS + NVDA in Chrome and VoiceOver in Safari. We're hoping to get even more combinations in time to be live for April 13

James_Scholes: All testing is complete for Command Button and Toggle Button, for two testers each. There is one conflict

James_Scholes: Command Button for NVDA and Firefox is complete for two testers. Toggle Button for NVDA and Firefox is completed on the PAC side, but it looks like Alyssa has not yet started

James_Scholes: VoiceOver with Chrome is complete from PAC and from John, but it looks like there are two conflicts

James_Scholes: Toggle Button in VoiceOver and Chrome is complete from two testers, but there are seven conflicts

James_Scholes: Good progress--only one test run still to be completed (from two plans, three combinations with two testers each)

Matt_King: Once you look into that conflict, let me know if we need to put any conflict resolution issues on the agenda for next week

Process (Working Mode) Questions Issue 914

w3c/aria-at#914

Matt_King: Some background: we're working on some analysis of the current app functionality and comparing it to the working mode

Matt_King: I'm building a [GitHub] Project to map out exactly what requirements of the working mode are not supported (either correctly or at all) that are necessary to delivering "recommended" reports

Matt_King: I didn't reference that [GitHub] Project here, yet. We'll talk more about that later

Matt_King: As I'm doing that, I'm going through the working mode and looking at various scenarios for how we use it

Matt_King: The first scenario -- the "happy path" or "scenario 0"

Matt_King: A perfect draft goes into the working mode and goes straight to community feedback. Everyone runs it with no feedback and there's no conflict. It goes to the "candidate" phase, so the implementers look at it and approve it without comment. So it reaches the "recommended" phase

Matt_King: When reviewing with "scenario 0" in mind, I came up with three questions

Matt_King: Those are listed in the GitHub Issue we're discussing, now

Matt_King: First question: "Should we scope test plans to a group of AT with identical testing requirements?"

Matt_King: Right now, the scope of all of our test plans is JAWS, NVDA, and VoiceOver for macOS

Matt_King: There are two reasons why scope is super-important

Matt_King: One is that we seek consensus from major stakeholders which include developers of the ATs

Matt_King: Two is that it determines which ATs we consider when we're trying to prove whether the test is good

Matt_King: At some point in the future, we will be testing VoiceOver for iOS and TalkBack for Android. We'll also be testing Narrator and maybe ChromeVox. Beyond that, we'll hopefully be testing voice recognition and eye gaze (way down the road)

Matt_King: What should we do when we add additional ATs? Should they be new test plans? Or should they get added to an existing test plan?

James_Scholes: Second question: do all future ATs have the same testing requirements?

James_Scholes: I ask because when you create a test plan, it's possible to have only a subset of tests that apply to a given AT. For instance, the "mode switching" tests apply to NVDA and JAWS, but they do not apply to VoiceOver

James_Scholes: If we were to update an existing test plan to add voice recognition commands (for example), we could extend all of the existing tests to support speech recognition commands, and if we decided that a particular test did not apply, we could simply omit it.

James_Scholes: So I'm inclined to do that rather than create a whole new test plan

michael_fairchild: My process is similar to what James_Scholes has outlined

Matt_King: Let's talk about different possible approaches before discussing pros and cons of particular approaches

Matt_King: We could look at ATs that have essentially the same functionality--desktop screen readers as a category. They largely perform the same functions in very similar ways. But they're quite different from mobile screen readers in fundamental ways. And very different from eye gaze, voice control, and magnification

Matt_King: We could have a test plan scoped to just a specific type of AT where they essentially mirror one another, where we have the need to support similar tests. Maybe not identical tests, but with only occasional need for minor differences

Matt_King: Or we could group them in broad categories: "all screen readers" or "all eye gaze ATs"

michael_fairchild: What if we limited each test plan to a single AT?

Matt_King: If we did that, we'd have to determine which test plans require agreement with one another in order to establish interoperability

Matt_King: If I compare ARIA-AT to wpt.fyi... In wpt.fyi, we have a spec like the one for CSS flexbox. It contains normative requirements, and those requirements are translated to tests

Matt_King: I kind of look at the set of tests in a test plan as equivalent to the tests in wpt.fyi

Matt_King: For everyone who makes "widget X", the test plan is a way of saying, "here is the set of tests to verify that you have created an interoperable implementation of 'widget X'"

michael_fairchild: So a test plan is a way to verify that several ATs are interoperable. Is that the only way to verify interoperability?

Matt_King: For sure not--keep thinking outside the box!

James_Scholes: If we just limit ourselves to the screen reader and browser combinations that we have now, we are basically right now saying that it's acceptable to compare across all of those

James_Scholes: Is it reasonable to make the same assertion after adding additional screen readers? Do we expect to hold iOS VoiceOver to the same standards as the macOS version?

James_Scholes: Would it be reasonable to compare the set of results between a screen reader and a voice recognition tool (given that the tests could be significantly different)?

Matt_King: Right now, we list the test plans along the left-hand column. But actually, right now, those test plans are synonymous with a test case.

Matt_King: Let's say that we're adding support for Combobox with Eye Gaze tools... The tests are completely different, but we can still give a report about how well a particular eye gaze tool satisfies the expectations

James_Scholes: It doesn't make sense to compare the support of JAWS and Dragon NaturallySpeaking for a given pattern

James_Scholes: It makes sense mathematically, but users may be using both of those ATs

James_Scholes: It also makes me think that the table would grow much too large

Matt_King: The presentation doesn't concern me so much. We could aggregate the data in many ways

James_Scholes: I still think that they would be better-served by separate categories

James_Scholes: e.g. one for screen readers and one for magnifiers

James_Scholes: as opposed to having them all mixed: "here are the results for five screen readers and four magnifiers" etc.

Matt_King: I can imagine for some patterns, the expectations for all desktop screen readers are the same

Matt_King: But when it comes to desktop screen readers versus mobile screen readers, we may end up with dedicated tests that are quite different

Matt_King: We have to consider when/why we are asking AT developers to revisit test plans. If we change an existing test plan by adding VoiceOver for iOS, does it make sense to be asking Vispero to review the new version of the test plan?

Matt_King: Do we have to "re-do" the transition from Candidate whenever we add new ATs to a Recommended test plan?

Matt_King: We might say that two products are different enough that they need separate test plans for the same pattern

Matt_King: But if we add Narrator to the test plan that JAWS, NVDA, and VoiceOver already went through, I would expect that those three already agree.

jugglinmike: Doesn't that give undue preference to the ATs which happen to participate earlier?

Matt_King: Yes

James_Scholes: It seems undesirable to have to revisit consensus that we've already obtained whenever adding a new AT

James_Scholes: I'd like to explore a concrete scenario in which adding a new AT would require the tests in a recommended test plan to be changed

Matt_King: We're out of time. We will continue this discussion. We'll get answers to these questions and make whatever changes to the working mode they imply. Thanks, all!

Minutes manually created (not a transcript), formatted by scribe.perl version 210 (Wed Jan 11 19:21:32 2023 UTC).

Diagnostics

All speakers: James_Scholes, jugglinmike, Matt_King, michael_fairchild

Active on IRC: jugglinmike, Matt_King