Meeting minutes
Review agenda and next meeting dates
Matt_King: Requests for changes to agenda?
Matt_King: hearing none, we'll stick with the agenda as scheduled
Matt_King: Next CG meeting: Thursday October 16
Matt_King: Next AT Driver Subgroup meeting: Monday October 13
jugglinmike: October 13 is a holiday in the US, so we will have to reschedule the AT Driver Subgroup meeting
IsaDC: PAC is also closed on that day
jugglinmike: Let's push it to the same time the following week--Monday, October 20
Current status
Matt_King: We changed the status of "switch made from HTML checkbox" to "draft review"
Matt_King: Everything else is the same
Matt_King: We'll probably move on to the two-state checkbox example today
Matt_King: And soon after that: the quantity spin button
Issue 1521: Rename test plan directories
github: w3c/
Matt_King: This is about the repository where we write all of our tests
Matt_King: I am asking what it would take to possibly adopt a naming convention for the directories where the test plans live
Matt_King: That's because I am preparing for the time when we have a lot more test plans that are not coming from the APG--test plans where people write the test case itself directly in the repository. The code, example, etc. will all be sourced from this repository. And the tests are more atomic: a specific ARIA feature in a much simpler context
Matt_King: We're planning to teach people how to do that at TPAC
Matt_King: I'm going to come up with an example of a simple test plan that they can essentially copy and then modify
Matt_King: Essentially, we have a few kinds of tests: you "navigate to", "get information about", and "operate"
Matt_King: Imagine we have a test plan for "aria-details" or "aria-errormessage". The directory could just be named according to the name of the property
Matt_King: I'm not sure what to do for ARIA roles because it could become confusing when an element and property have the same name
Matt_King: Someone might see a directory like "aria-button" and get confused, thinking that there is a property named "aria-button"
Matt_King: One approach is that we could put "apg-" as a prefix to the directories that originated from the APG
Matt_King: The plan is to have a lot more test plans in the repository which come from different places
Matt_King: In this issue, I list some of the downstream impact of changing directory names
ChrisCuellar: I think that captures it accurately!
Matt_King: Would prepending all the APG tests with "apg-" be a good approach?
mfairchild: It sounds fine to me, but is there a reason we couldn't use sub-folders?
Matt_King: I would love to do that. It adds other kinds of complexity, but I don't know if it's more or less complex
Matt_King: The biggest problem would be that we would have to decide ahead of time what all of the "sources" are, and the code in the system would have to account for that... Or I guess, maybe under the "test" directory, the system just sweeps all sub-folders.
ChrisCuellar: I'm thinking about precedent in the Web Platform Tests project, which makes heavy use of hierarchical directory structure
ChrisCuellar: There's no way to anticipate future entries, just due to the nature of the web platform. And I think building to support that open-endedness makes sense
Matt_King: Did anyone have pushback against the idea of a "screen-reader" folder?
jugglinmike: I think having a directory structure by AT doesn't give us much flexibility, since we might be replicating tests across different AT folders. We might also run into version parity issues. Namespacing by AT seems not ideal.
Matt_King: Maybe that's not necessarily a knock on subdirectories but more about how we work with different ATs, which is a future-looking question. But the general question of subdirectories is still open I think if we segment by things like ARIA, HTML-AAM, APG, etc.
ChrisCuellar: I think subdirectory structure according to the type of feature under test makes sense. Unlike segmenting by AT, segmenting by feature will probably not involve test plans replicated across subdirectories
Matt_King: Oh yes, for sure
jugglinmike: But there's not a one-to-one mapping of test to feature, though.
Matt_King: But the subdirectory structure could reflect the scope of the test at least. It tells us the focus of the test even if it's not exactly testing one feature.
… I think we have the same issues in WPT. Look at how accessibility is tested broadly across WPT.
jugglinmike: Bocoup is currently engaged in this problem with WPT specifically. We're trying to classify tests by web feature. It's a pretty complex problem.
Matt_King: Accessibility bugs are rarely traceable back to isolated features. Bugs are often contextual.
Matt_King: I would prefer subdirectories--it seems more in line with what other repositories do, and it is generally easier for humans to parse. It's also harder to get wrong: a prefix has to be typed as a string each time, but if a subdirectory already exists, there is less of a hazard
howard-e: It's a similar level of effort whether we use sub-directories or directory name prefixes
Matt_King: Based on what we've learned from the reference work so far, we would have three sub-directories: html, aria, and apg
Matt_King: Okay, this is really helpful. In terms of making a plan for this issue, do you think we're better-positioned, now?
ChrisCuellar: I think so! We're going to move everything except for support.json and commands.json into a new sub-directory named "apg"
Matt_King: Correct
Matt_King: And we'll have a very crisp plan for when we actually enact this change
IsaDC: As the person who writes the test plans, the way I organize them is by sub-directories. So for me, that would be the best approach, if possible
Matt_King: Good!
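For reference, a rough sketch of the layout implied by this discussion, assuming the existing top-level tests directory; the grouping is illustrative and the names are not final:

  tests/
    support.json    (stays at the top level)
    commands.json   (stays at the top level)
    apg/            (existing APG-sourced test plans)
    aria/           (atomic test plans for ARIA features, e.g. aria-details)
    html/           (test plans sourced from HTML-AAM)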
Question about Disclosure Navigation Menu Example
Matt_King: This showed up in the test queue again
Matt_King: I was scratching my head because I was quite certain it wasn't present last week
Matt_King: It has a run by Joe_Humbert and a run by JAWS Bot
ChrisCuellar: Before we rolled out a bunch of fixes last week (for the last bits of the updates to the bot running), we fixed an issue where some bot runs were getting misclassified as automated updates--that's where we try to collect everything when we're doing an automated re-run. There were some issues in how we were filtering bot runs, which we corrected in the previous release. I think that test plan was hidden previously, and now it's back where it belongs
Matt_King: We had previously published this, I thought. I may be mistaken about that, though
IsaDC: We had "rating radio", but not disclosure
Matt_King: We were re-running disclosure for the latest version of JAWS.
Matt_King: Maybe I'm not remembering correctly, and it got hidden by mistake, and I just didn't notice that we never published it
ChrisCuellar: We can look into this. I want to verify my hypothesis. It sounds like you folks don't recognize this run, though
Matt_King: It says here that Joe_Humbert's run has zero verdicts
ChrisCuellar: It looks like it might be a bot run that got re-assigned but never finished
Matt_King: Yeah, and the other one looks like an automated bot run where almost all the verdicts got assigned...
Matt_King: If you could do a little investigation to determine why this happened, that would be great
Matt_King: It's odd that all the responses matched but the verdicts were not assigned
ChrisCuellar: It appears as though all the tests were run by Hadi and IsaDC
Carmen: I can raise an issue for this
Matt_King: Alright, thanks
Running Switch test plan
Matt_King: I commented on your issue, Elizabeth
Matt_King: I think it's possible that the cause of your problem is that the function keys aren't doing what they're supposed to do
Matt_King: I found an article
<Matt_King> https://
Matt_King: This is an Apple article about how you change the "function" behavior so that you don't have to press the "Fn" key
Elizabeth: I'm getting the correct response, now, actually
IsaDC: The conflicts are gone, now, so this Test Plan is done
IsaDC: I've just now marked it as such
Matt_King: Awesome. We can advance that test plan to "candidate" because we have all three ATs done
Running test plan for Tabs with Automatic Activation
Matt_King: We had nine conflicting results with Hadi, but Hadi isn't present today
IsaDC: I can e-mail him
Running test plan for Tabs with Manual Activation
Matt_King: IsaDC is still working on this one in JAWS
IsaDC: I'm going to run that test plan for JAWS
Matt_King: For NVDA, we need one more tester
Matt_King: Do we have somebody who is available to take on the NVDA run?
Elizabeth: I'm done with "switch", but I don't have access to a PC at the moment
dean: I can test "Tabs with Manual Activation" with NVDA. That was my intention--to do that once I finished it with VoiceOver--but I'll just do it now.
Matt_King: Cool!
Updating reports to latest screen reader versions
Matt_King: We have a feature in the system for when a new version of a screen reader comes out: the bot re-runs the test plans behind all the published reports with the new screen reader version. If it gets the same response, then it assigns the same verdicts
Matt_King: Bocoup folks: does the bot also carry forward the designation of "untestable" in this case?
Matt_King: We should verify what it does with untestables and side-effects
Matt_King: Anyway, back to the context of this topic
Matt_King: If the bot can re-assign all the verdicts based on matching with historic AT responses, then it immediately publishes a new report
Matt_King: Otherwise, we need to have a human review
Matt_King: Because of the differences in the way the bots record output and the way that humans report output, the difference between the AT responses can be trivial
Matt_King: But other times, it is not
Matt_King: For instance, the presence or absence of hint text can interfere with the matching heuristic
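A minimal sketch of the carry-forward logic described above, in plain JavaScript with hypothetical names (not the actual app code):

  // Sketch only: a verdict is carried forward when the new screen reader's
  // response matches the historic response exactly; anything else is flagged
  // for human review, which is where trivial wording or hint-text differences land.
  function carryForwardVerdicts(historicResults, newResponses) {
    const carried = [];
    const needsReview = [];
    for (const result of historicResults) {
      const newResponse = newResponses.get(result.commandId);
      if (newResponse === result.response) {
        carried.push({ ...result, response: newResponse }); // same output: keep the old verdicts
      } else {
        needsReview.push({ ...result, newResponse }); // a human re-checks the assertions
      }
    }
    // Only publish a new report automatically when nothing needs a human verdict.
    return { carried, needsReview, canAutoPublish: needsReview.length === 0 };
  }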
Matt_King: We can find the tests requiring oversight in the test queue via a button at the top. There are filters there: one for manual test runs (that's the default), and one for automated updates. If you press the filter for "automated updates", then right now, you will see six entries.
Matt_King: So the job is just to go through these responses and determine whether the new response still satisfies the assertions.
Matt_King: So we have six of these ready to go. Is there anybody who wants to take on some of this work?
Elizabeth: Yes, I have some time for this, actually
Matt_King: Okay, I can assign the first one to you right now (the "action menu button")
ChrisCuellar: It should stay in the "automated" section even after you assign it
Matt_King: I assigned the first one to you, Elizabeth
ChrisCuellar: Yup, I see it assigned to Elizabeth, and it is indeed still in the "automated updates" section
Matt_King: This is a pretty good case for why we might want a new filter for "My Test Runs"
Matt_King: Elizabeth, when you open that, you'll see it looks a little bit different
Matt_King: For each command, you'll see what the response was for the previous report, and in the "edit" box, you'll see what the response was in the latest automated run. We don't want you to actually edit that response; we just want you to report whether the new response satisfies the assertions