Meeting minutes
Review agenda and next meeting dates
Matt_King: Requests for changes to agenda?
ChrisCuellar: We filed an issue related to consistency reports for disclosure
ChrisCuellar: It's not higher priority than anything on the agenda today, though
Matt_King: Then let's plan to discuss it next week
Matt_King: Next Community Group Meeting: Wednesday April 23
Matt_King: Next AT Driver Subgroup meeting: Monday May 12
Current status
<ChrisCuellar> The disclosure nav related issue raised by Bocoup to be discussed next week: w3c/
Matt_King: We're shooting for a new test plan roughly every week. We'll see how that goes
Matt_King: The stuff that we have in draft review right now--we will get a new one rolling today (the "rating radio group")
Matt_King: And we have one that's currently blocked by conflicts in JAWS results--"radio group with roving tab index"
Matt_King: I still haven't done anything on that to work on breaking the tie
Matt_King: Though we did raise it with Vispero yesterday, so we may get additional information on unblocking it when we talk to them next
Matt_King: Vertical temperature slider is on hold for the moment due to a couple issues
Matt_King: As is "disclosure navigation"
Matt_King: I think that for JAWS, we have to re-run the test plan because the output will be significantly different
Matt_King: For the other screen readers, we can probably leave the reports as-is because we don't anticipate changes (that is, for NVDA and VoiceOver)
Matt_King: Coming up soon, will be the other disclosure navigation plan
Matt_King: Or maybe we should do the other disclosure first, and move that one down two weeks
Isa: Sounds good
Matt_King: So this will come up in about three weeks, instead. I will update the spreadsheet after this meeting
The spreadsheet is available here: https://
Isa: The versioning stuff with VoiceOver is set from 11-point-something to begin with
Matt_King: Oh, that's because I added it from the "report status" dialog. That's the only way to add the bot, and we have an open issue for being able to set the version number there
Matt_King: I think we need some other ways to run the bot, but we can discuss that elsewhere
Isa: It would be good to have a way of editing the AT version without deleting the report and re-adding the test plan for that specific AT from the Test Queue
Matt_King: If we were able to add it with the correct minimum version, then we wouldn't need to edit it
Isa: I agree
Matt_King: We need some volunteers for all three screen readers
Joe_Humbert: I can run it for JAWS
Matt_King: Yay! Can you assign yourself in the Test Queue?
Joe_Humbert: Sure
Matt_King: Next on the list is NVDA. The bot has successfully recorded responses
Joe_Humbert: I can do all three if necessary
Isa: Thank you! I will assign you the bots' results for both NVDA and VoiceOver
Matt_King: Wow, Joe_Humbert's killing it
Joe_Humbert: I had a question about the vertical temperature slider
Matt_King: I left the vertical temperature slider off the agenda
Isa: We changed one test
Joe_Humbert: Do I need to go back and re-do the test for that one result?
Isa: It's not urgent because that test plan is blocked for now.
Matt_King: Before we finish the NVDA one, we're waiting on issue 1352
Matt_King: And for JAWS, we have a conflict that's very similar to the one we have for the "roving tab index radio group"
Matt_King: You can do the JAWS one, but hold off on the NVDA one
Joe_Humbert: Got it
mmoss: I can also run it for VoiceOver
Isa: I will run the bot, so you should have the responses available to you after the end of this meeting
Re-run of JAWS for color viewer slider
Matt_King: How did we get two rows in the table for the same minimum version of JAWS and the same browser? Is that a bug in the app?
Joe_Humbert: I also see those two rows
Matt_King: Is it possible to add the same one twice and create two rows?
howard-e: It may have stemmed from the functionality that was added to allow "exact" versus "minimum." It shouldn't be possible
Matt_King: I'm hesitant to hit "delete" because I don't want it to accidentally delete the row above
howard-e: I can't immediately tell when it was introduced.
howard-e: But I reproduced it locally right now, and I can confirm that it is safe to delete
Matt_King: Okay, I'm hitting "delete report" right now
Joe_Humbert: It's gone and the prior work is still there
Matt_King: Yay! We're good
Carmen: I will raise an issue related to that.
Joe_Humbert: It reported the same version in both of them
Joe_Humbert: Since I reloaded the page, I see myself assigned to the "rating radio group" ones, but NVDA and VoiceOver say "0 of 15 tests evaluated"
Isa: I assigned the bot's results to Joe_Humbert, so it should show them
Matt_King: "Evaluated" means you just have to assign the verdicts
Joe_Humbert: Ah, yes. It says "evaluated" while the row for JAWS says "completed"
Isa: We need a second person for NVDA
dean: I can do the "rating radio group" for NVDA
Isa: I will run the bot and assign the result to dean
Matt_King: Awesome. Thank you
App issue 1367 - Prototyping decisions for manual testing on mobile
github: w3c/
howard-e: Initially, when thinking about response collection by manual testers, we've always wanted to do it on-device or as close to the device as possible
howard-e: Perhaps using system utterances in the same way we're doing with NVDA and VoiceOver
howard-e: iOS's security precautions make that difficult
howard-e: For Android, we're considering an approach of taking video recordings and extracting text data from that video stream. But that's very much a fallback solution
howard-e: What we have built today, however, is a set of scripts that allow one to collect the utterances from TalkBack
howard-e: One can press the "run test" button and begin recording utterances
howard-e: We've confirmed that we can get system-level utterances from Android
howard-e: We've started work on this for iOS, but the feasibility is less clear
howard-e: That's all presented in this issue, and at the end, I've included four questions
howard-e: First: should we move forward with building an interface to make it easier for others to evaluate the Android-specific solution that I just described?
howard-e: Second: should we halt the approach for VoiceOver (because there is so much uncertainty) and move forward with the fallback video-recording approach?
howard-e: So it's really just the two questions
Matt_King: I'm personally very reluctant to do anything that relies on OCR, even as a fallback, because it seems very unreliable, potentially super-flaky, compute-heavy, and not very scalable
Matt_King: We can think of it as a fallback, but in my mind, it's a very distant fallback
Matt_King: And given where we're at right now, I think even a prototype on Android (one that just experiments with different ways to make the experience viable) could be really useful because we don't know what ANY method of doing this could feel like for human testers
Matt_King: We have humans who will manually run the tests on Android devices. Right now, doing that on a mobile device would be ridiculously time-intensive because it would involve human transcription (running the test runner on both the mobile device and a laptop)
Matt_King: The question is, what do we actually want the tester to perform on their device? Which steps?
Matt_King: Let's imagine how we could eliminate the horror of this work
james: If I could open a web page on a mobile device and sign in, it's a painful experience, sure. But if you're running the test plan on your laptop, it would be good if you could press a button on the page on your laptop and have that open the page on your phone
Matt_King: A dream user-experience would be if, somehow, your Android device is connected to what you're doing on your laptop in such a way that you would be able to run through the test plan on your laptop, but when you press "open test plan" on the laptop, it actually opens on your Android device. When you execute the prescribed gestures on the Android device, the computer is automatically collecting the AT response into the appropriate field on the laptop
howard-e: I left out two details here. The setup scripts allow you to directly open the test page. And at the end of the utterance collection (when the script ends or the process is closed), the AT responses are available on the clipboard
james: Just to be clear, though, it's a non-starter to expect anyone to input assertion verdicts on the mobile device
james: Any mobile-oriented testing needs to assume that the primary driver is a computer connected to a mobile device (because form entry is unavoidably painful on mobile)
howard-e: I am not proposing a mobile-first experience. I want to limit the interaction with the Android device as much as possible--just to direct interaction with the test page
dean: Is there something like the VoiceOver recorder that will simply record everything and transcribe it--that could then be cut-and-pasted into the results?
Joe_Humbert: Possibly on Android. ADB is very powerful, but it's also very fickle
howard-e: That's the Android Debug Bridge. It's powering some of the scripts I'm presenting today
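For readers unfamiliar with the workflow being described, here is a minimal, purely illustrative sketch of an ADB-based utterance collector. It is not the actual Bocoup script: the log tag, the placeholder test-page URL, and the assumption that TalkBack writes its utterances to logcat (with verbose speech logging enabled) are all hypothetical.

```python
# Hypothetical sketch only; the real setup scripts discussed in the meeting
# are not reproduced in these minutes.
# Assumes: adb is on PATH, a device is connected, TalkBack is running with
# verbose speech logging enabled, and TalkBack writes utterances to logcat
# under a tag such as "TalkBackService" (the tag name is an assumption).
import subprocess

TEST_PAGE = "https://example.com/test-page.html"  # placeholder URL
LOG_TAG = "TalkBackService"                       # assumed log tag

def adb(*args: str) -> str:
    """Run an adb command and return its stdout."""
    result = subprocess.run(
        ["adb", *args], capture_output=True, text=True, check=True
    )
    return result.stdout

def main() -> None:
    adb("logcat", "-c")                       # clear the device log buffer
    adb("shell", "am", "start",               # open the test page on the device
        "-a", "android.intent.action.VIEW", "-d", TEST_PAGE)
    input("Perform the test gestures on the device, then press Enter...")
    dump = adb("logcat", "-d", "-s", LOG_TAG) # dump only lines with the assumed tag
    for line in dump.splitlines():
        if line.strip():
            print(line)

if __name__ == "__main__":
    main()
```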
dean: If something could record what I'm doing with VoiceOver on my iPhone and just put it in Notes, I could open Notes on my laptop and just paste it into my test results
dean: It's not any more work than what I'm doing now on my desktop with the VoiceOver recorder
james: The VoiceOver testing experience, even with the recorder, is extremely inefficient. So using it as a baseline for our mobile testing solution doesn't seem ideal
dean: Agreed
howard-e: To your point, dean, I took a look at that for iOS. Even the transcript that Apple provides for its own recording is significantly incorrect
Matt_King: I don't want to worry too much about those kind of problems at this point
<Joe_Humbert> I have to drop. I will try to get all the work done by next Wednesday
Matt_King: If it turns out that we can do this well on Android but that problems with Apple products make it so that we can't do it well on Apple products, let's bring that as feedback to Apple and point to our work with Android as a baseline for what is reasonable or good
Matt_King: But it's really hard to have any conversation without having some kind of concrete experience
Matt_King: james described an ideal workflow, but a prototype isn't about implementing an idea. It's about taking a first step in that direction
Matt_King: So if howard-e could bring us a functional "step one", we can review and discuss next steps
howard-e: Got it
howard-e: This issue is now slightly outdated. The team at Bocoup has begun investigating a similar approach on iOS. It's certainly less documented, so it's requiring additional research, but we're hopeful
james: What does the approach look like?
howard-e: It's using system-level intercepts of the voice. We wouldn't expect testers to use it on their personal devices. It probably gets us to a position that's better for automation
james: With a custom voice like we're using on macOS?
howard-e: No, that's not necessarily possible on iOS
ChrisCuellar: We're taking inspiration from security researchers
james: That may raise issues with Apple's acceptance of results collected in that manner
james: It's worth noting that eSpeak is available on iOS and it is open source
james: It's available in the iOS app store
Matt_King: That must mean that custom synthesizers are already supported
james: Yes. It's a universal macOS and iOS implementation
Matt_King: Anyway, let's start there with some "step 1" prototype for Android and see where it takes us
howard-e: Sounds good! Thanks!
Capturing lessons learned from JAWS AT driver experience
Matt_King: I don't necessarily want to talk about the lessons learned right now, but I do want to discuss how we capture the lessons and respond to them
Matt_King: We could do a one-off meeting with some of the members here
Matt_King: We have such small attendance at the automation subgroup meeting that it's realistic to move it
Matt_King: Maybe we want to do a poll to reschedule
Matt_King: It seemed like james wanted to do this sooner rather than later
james: It's not urgent. It just depends on whether Vispero would benefit from anything happening sooner. And also whether they can join that call
james: I'm happy for that to happen in the next automation meeting
Matt_King: So even if we had it in the first half of that meeting, you would be able to attend that first half?
james: Actually, I can join for the full hour on May 12
james: Okay, let's see if we can get Brett and Glenn to join that one
jugglinmike: I've been working with them directly, so I'll reach out
Issue 1355 - Version of Bot used for a run
github: w3c/
Matt_King: As I understand the issue, if we say that the minimum required version is the latest version of NVDA, then when we run it with the bot, this pull request ensures that it only runs if the bot has that version available... Is that right?
howard-e: For the longest time, the system has prioritized release date when choosing the "latest" version. There was some oversight when it comes to patch releases of older versions
howard-e: For instance, first version 13 is released, then 14 is released, then 13.1 is released
howard-e: One would expect version 14 to be the latest version, but the system was selecting 13.1
howard-e: We saw this mainly with NVDA versions and with VoiceOver versions. Mainly VoiceOver versions
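A small illustration of the ordering problem howard-e describes, using hypothetical release data: selecting "latest" by release date picks the 13.1 patch, while comparing version numbers picks 14. The version strings and dates below are made up for the example and do not come from the app.

```python
from datetime import date

# Hypothetical release data matching the example: 13, then 14, then a 13.1 patch.
releases = [
    {"version": "13",   "released": date(2024, 1, 10)},
    {"version": "14",   "released": date(2024, 6, 5)},
    {"version": "13.1", "released": date(2024, 9, 20)},  # patch of an older version
]

def version_key(version: str) -> tuple[int, ...]:
    """Compare versions numerically, so '14' > '13.1' > '13'."""
    return tuple(int(part) for part in version.split("."))

latest_by_date = max(releases, key=lambda r: r["released"])
latest_by_version = max(releases, key=lambda r: version_key(r["version"]))

print(latest_by_date["version"])     # '13.1' -- the behavior described in the issue
print(latest_by_version["version"])  # '14'   -- the expected "latest"
```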
Matt_King: We always have the option of not adding a patched version into the system at all
Matt_King: Is this primarily affecting the order that versions are presented in the drop down?
howard-e: Yes, and that order was also being implicitly used in bot selection. And that is where the true concern popped up
Matt_King: We have a challenge right now where, when we run with the bot, we aren't able to choose a specific version. We only have one way of kicking off the bot right now.
Matt_King: I'm going to raise an issue about reconsidering the use-cases for starting bot runs