Meeting minutes
Review agenda and next meeting dates
Matt_King: Next AT Driver Subgroup meeting: Monday April 14
Matt_King: Next Community Group Meeting: Thursday April 17
https://
Matt_King: Requests for changes to agenda?
Current status
Matt_King: We have 14 plans in candidate review
Matt_King: We have two in the test queue, and then the "disclosure" plan is on hold while we wait for the JAWS update
Matt_King: We'll talk about the other two plans which are currently in the queue
Matt_King: The agenda has a link to a spreadsheet we are building to make a schedule for the upcoming work https://
Testing Radio group with roving tab index
Matt_King: In this test plan, we only have one conflict with JAWS. Everything else is done
Matt_King: I think we may have narrowed this down to Windows 10 and Windows 11 giving different results
James: I think we have two testers on Windows 10 who have produced one set of results consistently, and two testers on Windows 11 who have produced a different set of results consistently
IsaDC: Are you sure we have two testers on Windows 11?
Matt_King: Well, we know IsaDC is on Windows 11
Matt_King: And we know James and Joe_Humbert have Windows 10
Matt_King: I am on Windows 11. It seems like I should check this one out
Matt_King: If I get results like James and Joe_Humbert, then IsaDC's machine would be the outlier, and we'd be wondering what's going on
Joe_Humbert: Is this something we should ask Vispero about?
Matt_King: Absolutely! If I observe what IsaDC has reported, then I will reach out to Brett at Vispero
Matt_King: In that case, we might have to collaborate with Vispero about what JAWS ought to say. That would be a new circumstance for us
Matt_King: Anyway, I'll take this as an action item for the next meeting
Testing vertical temperature slider
Matt_King: We have multiple situations with this test plan
Matt_King: Let's start with the issue James raised yesterday. It's an APG issue
Matt_King: The way the test plan is currently written, it's not possible for people to get the same output for the down arrow when navigating to the slider with NVDA
Joe_Humbert: I noticed that with JAWS, you get to the slider because you down arrow three times. With NVDA, you don't get to the slider because you only down arrow twice
Matt_King: Yes
Matt_King: I think we should update this test plan
Matt_King: James raised an issue against APG
Matt_King: I think the discrepancy is due to a bug in the test case. The "25 degrees C" that appears should not be there because it is duplicative
Matt_King: I don't know why it's there; we'll discuss that in the next APG Task Force meeting
Matt_King: To work around the bug, I think we should just update the test so the command includes another arrow key press
IsaDC: I have the pull request for that ready to go. I wanted to discuss it here, first
Matt_King: I'm suggesting that the APG remove the text label, but that will come later.
Joe_Humbert: They would have to remove it completely because it's text. You don't want to hide text from a screen reader user.
James: It would be acceptable to hide the text because it is duplicated in the aria-valuetext.
Joe_Humbert: I still think it's generally a bad practice
Matt_King: Yeah, the APG might come back and say that it's fine the way it is
Joe_Humbert: I can see the text rendered on the screen twice. One is above, and one is next to the slider "thumb"
James: That seems visually duplicative
ChrisCuellar: I agree
James: Hiding text can indeed set a dangerous precedent, but so does enunciating text twice. I think it should just be in one place for everyone
Matt_King: For context, we've had many requests to add "aria-hidden" to the text "temperature"
Matt_King: The Task Force has pushed back on that
Joe_Humbert: I think the number on the top is better because it is larger and easier to read and because it doesn't move with the "thumb"
Matt_King: Maybe you don't need the one on the rail
Matt_King: Well, we'll see what the Task Force says. For now, I'm glad IsaDC has a patch ready
Matt_King: Applying that will mean running that one test again
James: Do we need to alter the "navigate backwards" tests?
Matt_King: Nope
Matt_King: Were there any conflicts which were not related to that "moving forward" test?
IsaDC: What should we do, for now, when we find negative assertions that would technically pass?
Matt_King: This is an issue which appears later on the agenda
Matt_King: James raised an issue for this, and I have worked on a solution.
Matt_King: The testing for this--and I think even the prior slider work we've done for NVDA--will need to be revisited. I think this is blocked until we can generate an accurate report. What I mean by that will become clearer in a minute
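A minimal sketch of the duplication being discussed, assuming markup along the lines of the APG vertical temperature slider (the exact example source is not quoted in these minutes); the element names and values here are illustrative:

```typescript
// Hypothetical reconstruction of the pattern under discussion, not the actual APG source.
function renderTemperatureSlider(container: HTMLElement, value: number): void {
  // Visible text that shows the current value on screen.
  const valueLabel = document.createElement("span");
  valueLabel.textContent = `${value} degrees Celsius`; // duplicates aria-valuetext below

  const thumb = document.createElement("div");
  thumb.setAttribute("role", "slider");
  thumb.setAttribute("tabindex", "0");
  thumb.setAttribute("aria-orientation", "vertical");
  thumb.setAttribute("aria-valuemin", "10");
  thumb.setAttribute("aria-valuemax", "38");
  thumb.setAttribute("aria-valuenow", String(value));
  // Screen readers announce this value text, so the visible label above is read a second time.
  thumb.setAttribute("aria-valuetext", `${value} degrees Celsius`);

  container.append(valueLabel, thumb);
}
```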
Re-run of JAWS for color viewer slider
Matt_King: JAWS has fixed some bugs, so it would be advantageous if we could re-run these tests
Hadi: I can help
Joe_Humbert: I can, as well
Matt_King: Great! We want to use the latest JAWS. The version which was released in March of this year
IsaDC: I will assign you both
Matt_King: Great!
App issue 1352 - Changes for negative side effects and untestable assertions
github: w3c/
Matt_King: I put together a detailed plan that talks about the test runner and the reports
Matt_King: When you encounter a test like this (where it technically passed an assertion but for an inappropriate reason)
Matt_King: ...there would be a checkbox which indicates whether the assertions were testable
Joe_Humbert: So if it doesn't work as we think it should (it skips over or it says nothing at all, for example), there will be a checkbox for us to use which says, "we can't apply these assertions"
Matt_King: [summarizes the full UI and workflow as proposed in the issue]
Matt_King: I would like this to be expedited as quickly as possible so that we can get accurate reports on all of the sliders. I think we may even need to re-run a few VoiceOver tests because we encountered this problem and the way we reported them to Apple was confusing
Carmen: Sounds good. We have a planning meeting tomorrow, so we can prioritize this work accordingly
Hadi: How often does this condition occur?
Matt_King: It occurred on any test plan with Shift+J for one screen reader. We also just found it in a class of tests for NVDA
Matt_King: So far, it's happened in probably seven or eight test plans and with two screen readers
Matt_King: We discussed what to do without this new feature. We could just mark assertions as failing, but that gives a misleading picture of what is wrong. It produces confusing data, and I don't think we want that
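A rough sketch of how the proposed "untestable" checkbox might be captured in a submitted result, purely to illustrate the distinction between a failed assertion and an untestable one; the field names are assumptions, not the aria-at-app schema:

```typescript
// Illustrative shapes only; not the actual aria-at-app data model.
interface AssertionResult {
  assertionId: string;
  passed: boolean;
}

interface ScenarioResult {
  commandId: string;
  atResponse: string;
  // When the command never reaches the widget (e.g. it is skipped entirely),
  // the tester marks the scenario untestable instead of recording failures.
  untestable: boolean;
  assertionResults: AssertionResult[];
}
```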
App issue 1162 - Automated updating of reports for new AT releases
github: w3c/
Matt_King: This feature affects everyone, but it's really only used by admins
Matt_King: When a new version of JAWS comes out (one is due in May)--or NVDA for that matter
Matt_King: We would like to be able to re-run all the test plans in "candidate review" using the bots
Matt_King: It won't automatically add the bot to the system and start running the tests. It will require some administrative input. This interface is for performing that task. It is deployed to the "sandbox" server right now, and it is ready for testing
Matt_King: In the issue I linked to in Carmen's latest comment, there are instructions for testing
Matt_King: As soon as we have approval from IsaDC and me, we'll have this pushed out
Matt_King: This will be a big deal for us, especially when JAWS releases the next version
Joe_Humbert: With this kind of automation, will it be possible to get results for previous versions, or will that require manual testing?
Matt_King: I think we could add older versions of screen readers to the bots and use those
jugglinmike: For NVDA and VoiceOver, yes. For JAWS, we may need to do some extra work (depending on whether Vispero hosts previous releases)
jugglinmike: So to support older versions of JAWS, we may need to keep those versions on hand ourselves
Issue 1213 - Minimum version requirements
github: w3c/
Matt_King: I don't know if we want a policy on this issue or if we just want it to be an admin decision every time you add something to the test queue
Matt_King: Depending on how the run is added, the app may or may not present a user interface for setting the version
Matt_King: I think that may be an omission in the design. I think addressing that omission may resolve the issue
IsaDC: That would definitely resolve it
Matt_King: We want to be able to control the minimum version of the AT both when adding a test plan to the test queue and in the report status dialog
Matt_King: Carmen, can you create an issue for that in the aria-at-app repository?
Carmen: Sure
Issue 1211 - Reliability of app support for advancing new test plan versions
github: w3c/
Matt_King: We expect results to be copied into the draft report for the new test run. We've seen some inconsistent behavior on this, though
Matt_King: James filed an issue, and howard-e shared a very detailed response. Have you had a chance to review howard-e's response, James?
James: I did read this when it was first posted; I will have to refresh my memory
Matt_King: I don't think that we have a current behavior failure
Matt_King: We did have an example, but we destroyed it when we fixed the problem
Matt_King: We're going to have an opportunity coming up. IsaDC is working on a change to the slider. We'll see if that one works correctly. It might have something to do with which specific things get changed in the test plan. We can just leave this issue open until we see a problematic behavior again
James: We're missing the ability to update the reference without changing anything in the test plan itself.
James: Some changes would warrant changing the reference date. But sometimes we have to make a small change just to make settings work. What we don't have in the app is a way for it to take notice of that
James: From howard-e's response, it seems as though the app is only aware of a command being changed, an assertion being changed, or a change to an assertion ID
James: ...but we also want the app to take notice if we change the reference or the setup script
James: So right now, we've pushed a new test plan, and it doesn't get re-imported
Matt_King: That's a different problem, then. This is about copying results
Matt_King: If, for example, the assertions change, then you don't copy the results from the prior into the new
Matt_King: If the setup script changed, is that another one that should void prior results? What about the reference?
James: It's tricky to say because that's on a "test" level
Matt_King: Right
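A small sketch of the copy rule as James summarized it from howard-e's response: results carry forward only when commands, assertions, and assertion IDs are unchanged, and reference or setup-script changes are not yet considered. This is only an illustration of the rule under discussion, not the app's implementation:

```typescript
// Illustrative only; not the aria-at-app implementation.
interface TestFingerprint {
  commandIds: string[];
  assertionIds: string[];
  assertionTexts: string[];
}

function canCopyPriorResults(prev: TestFingerprint, next: TestFingerprint): boolean {
  const same = (a: string[], b: string[]) =>
    a.length === b.length && a.every((value, i) => value === b[i]);
  // Reference and setup-script changes are not part of this comparison,
  // which is the gap James is pointing out.
  return (
    same(prev.commandIds, next.commandIds) &&
    same(prev.assertionIds, next.assertionIds) &&
    same(prev.assertionTexts, next.assertionTexts)
  );
}
```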
Matt_King: One of the side effects of maintaining a record of who the tester is, is that we currently don't have a way to change the tester from one person to another
Matt_King: It would be really nice if, when something was assigned to me and I did half the work, if I could re-assign it to Joe_Humbert. Then Joe_Humbert would assume responsibility for everything I've done, and he could finish the rest of the work
IsaDC: With the bot, it would be really useful to have that because sometimes we have the bot collect responses, then we assign to a tester, and then that tester can't help, but we aren't able to re-assign the run to another tester
Matt_King: That sounds like another feature request
Matt_King: A button for "change assignee"
Matt_King: We could even make the person's name into that button. Right now, it's a link to their GitHub profile
Matt_King: You can propose something
Matt_King: Right now, I would prioritize this as "P2"
Carmen: Got it!
Matt_King: If copied prior results aren't valid, it's up to someone to re-run those tests or make sure the previously-copied results are valid
Matt_King: Do we want to err on the side of over-copying (copying things that may have been voided), or under-copying?
James: I would like to test things like these before they go into the main app
James: I think that, regardless of the route we take, it needs to be possible for us to--when we make a change to the test plan, run it through a separate environment which is a copy of production, in order to review the actual change
James: Then we can immediately halt and not deploy to production because something unexpected happened
Matt_King: Essentially testing the test plan itself
IsaDC: Yes!
Matt_King: Okay, that is a separate issue. It's on the agenda, though we won't get there today
Matt_King: I think it might not be a massive piece of work to make it happen. We'll save the discussion for when we get to that issue
Matt_King: But in the meantime, if you can reflect on how safe we want to play it, I think that would be helpful
James: I would also love the ability to "roll back" anything that happened. Whether due to a bug or an expected-but-hard-to-predict behavior, I would love to be able to revert
IsaDC: I'm pushing some changes, and I would like to know if the results we have now--will we have a way to get them back?
James: We're making a change to a test plan, and it's possible that the same issue in the app will occur. Do we have a strategy to address it if we lose the results?
Carmen: I can ask howard-e tomorrow
Matt_King: Let's do it today and pay attention to what happened. If something goes wrong, we can send howard-e a detailed e-mail with what happened
Carmen: directly after this call, I will see if we can do a database dump. I'll reach out to you soon, IsaDC
App issue 1365 - Bot reliability reporting
github: w3c/
Matt_King: Bocoup has come up with a testing methodology to test the reliability of the bots
Matt_King: I included a link to a recent report in the agenda
Carmen: We are testing consistency by running each test five times and determining whether there were different responses in each "Trial"
Carmen: You will see that NVDA is over 99% consistent. Our focus this year is on VoiceOver--it is currently at 91%, and we would like to raise it to at least 95%
ChrisCuellar: Right now, it is triggered completely manually
ChrisCuellar: We would like to run it as a service on a regular schedule
Matt_King: I wonder if this will change with different versions of screen readers and bots
ChrisCuellar: Now that we're letting the CG know about the reports, they are almost like a feature that we can iterate on and that we can improve in response to external input
Matt_King: Okay, this is great. I'm really happy to have these metrics!
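A minimal sketch of the consistency metric Carmen described, assuming "consistent" means that all five trials of a test produced identical screen reader responses; the real reporting tooling may compute this differently:

```typescript
// Illustrative only; "consistent" here means all trials of a test captured
// the same AT response string.
function consistencyRate(trialsPerTest: string[][]): number {
  const consistentTests = trialsPerTest.filter((trials) =>
    trials.every((response) => response === trials[0])
  ).length;
  return (consistentTests / trialsPerTest.length) * 100;
}

// Example: two tests, each run five times; the second has one divergent trial.
const rate = consistencyRate([
  ["slider, 25", "slider, 25", "slider, 25", "slider, 25", "slider, 25"],
  ["slider, 25", "slider, 25", "blank", "slider, 25", "slider, 25"],
]);
console.log(`${rate}% of tests were consistent`); // 50% of tests were consistent
```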