18:59:41 RRSAgent has joined #aria-at
18:59:46 logging to https://www.w3.org/2023/06/05-aria-at-irc
18:59:46 RRSAgent, make logs Public
18:59:47 please title this meeting ("meeting: ..."), jugglinmike
19:00:05 meeting: Bi-Weekly Meeting of Assistive Technology Automation Subgroup of ARIA-AT Community Group
19:00:16 present+ jugglinmike
19:03:06 mmoss has joined #aria-at
19:05:47 Sam_Shaw has joined #aria-at
19:05:47 present+
19:06:01 scribe+
19:06:36 present+ James Scholes
19:06:49 Matt_King has joined #aria-at
19:06:58 present+
19:07:29 Agenda: https://www.w3.org/events/meetings/cff6ac42-bdc2-4a38-8993-7dd46e507741/20230605T120000
19:08:51 TOPIC: w3c/aria-at - #945 - Rethinking the wording for assertion verdicts
19:09:02 https://github.com/w3c/aria-at/issues/945
19:10:18 jugglinmike: The working mode doesn't have the term "verdict" yet, but it's one we intend to add.
19:10:45 jugglinmike: The working mode refers to verdicts as supported, not supported, etc.
19:11:20 jugglinmike: Automation refers to the verdicts as acceptable, omitted, contradictory.
19:11:32 jugglinmike: I have a proposal for a new set of terms.
19:12:05 correction: automation refers to the verdicts as good output, no output, incorrect output
19:12:45 the proposed new terms are acceptable, omitted, contradictory
19:15:21 JS: I like the new terms you proposed. In terms of bubbling up the results, I wonder if no support, partial support, supported is clearer.
19:15:30 MK: That's why I wanted to use numbers.
19:15:58 MK: Partial support could mean anything between a little support and almost fully supported.
19:16:27 JS: I agree, but if something is 90% supported, the remaining 10% could still make it unusable.
19:18:27 MK: I agree; unless we have multiple layers of assertions, we don't need numbers. We also want to be diplomatic.
19:19:45 present+
19:20:26 MK: I think your solution is pretty solid.
19:21:01 MK: We just need to decide if we extend the use of these terms, or bubble them up.
19:22:18 jugglinmike: Yes, bubbling up is something we need to consider. In the case where a feature is all supported except for one assertion, it's not supported. For verdicts that can be in three states, understanding why something is partially supported is tough. I'm not sure bubbling up can work if we are looking for a percent score.
19:22:25 MK: Yeah, supported needs to be binary.
19:24:32 JS: I think we need all three states.
19:25:21 MK: What do the responses tell us? Either there is some support there or there isn't. Then the reason is that someone either tried, or didn't try, to support it.
19:25:51 MK: If you're measuring something using a percentage, then what it measures needs to be binary.
19:26:53 JS: For the reports, are there three levels of support or two?
19:27:04 MK: Any level of support beyond the assertion is a percentage.
19:27:23 MK: At the test level and at the AT level, everything will be a percentage.
19:28:12 MK: So we would say, using Mike's terminology: at the assertion level, if the response is omitted or contradictory, that counts as a 0. If it's acceptable, it counts as a 1.
19:28:41 MK: There are other reports we could run that say what percent is contradictory and what percent is omitted.
19:29:04 MK: I don't know that we need to bubble up these terms in the reports we have now.
19:29:35 MK: We don't need terms for the working mode; it's just level of support.
19:29:54 jugglinmike: I do think the working mode uses supported / not supported.
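[Editor's note: a minimal sketch of the scoring scheme Matt describes at 19:27:04-19:28:12, assuming a hypothetical data model; all names below are illustrative, not the ARIA-AT app's actual API.]

```typescript
// Hypothetical verdict type per the proposed terms at 19:12:45.
type Verdict = "acceptable" | "omitted" | "contradictory";

// Matt's rule at 19:28:12: omitted or contradictory counts as 0, acceptable as 1.
function score(verdict: Verdict): 0 | 1 {
  return verdict === "acceptable" ? 1 : 0;
}

// Any level above the assertion is a percentage (19:27:04).
function percentSupported(verdicts: Verdict[]): number {
  if (verdicts.length === 0) return 0;
  const total = verdicts.reduce((sum, v) => sum + score(v), 0);
  return (100 * total) / verdicts.length;
}

// Example: two of three assertions acceptable -> ~66.7% supported.
percentSupported(["acceptable", "omitted", "acceptable"]);
```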
19:29:59 MK: I can get rid of that.
19:31:01 MK: I have some other issues for the working mode, particularly 950. I think we need to work on another iteration of the working mode and share it with the community.
19:35:24 MK: We could have a binary state for assertions, and get rid of contradictory.
19:35:33 JS: I agree, but we should rewrite the terms.
19:35:48 JS: Let's add this to the agenda for the CG meeting Thursday.
19:36:59 jugglinmike: What I'm hearing is: we like the terms I proposed, but we may not need three terms.
19:37:18 JS: It will make the testing easier if we just have two states/terms.
19:38:36 MK: Okay, but if this task isn't on the critical path, I want to be conscious of that.
19:38:43 JS: This could speed up the process.
19:39:00 MK: But it's not a blocker; we can talk about enhancements in the near future.
19:39:34 Michael Fairchild: Is there a third state where we publish a report with some of the data missing?
19:40:09 JS: Not really, but we need to consider this.
19:40:42 JS: If there is a situation where only 50% of tests have been completed, what does that look like for a percent supported?
19:41:19 MK: We made a decision to change the working mode, and to get rid of the three output terms.
19:42:02 MK: The question, before we change the UI, is: do we go from three states (acceptable, omitted, contradictory) to two?
19:43:08 TOPIC: w3c/aria-at - #946 - Disambiguating 'Test Plan Run' in the Working Mode
19:43:27 MK: I'll comment on this issue and we can move it forward outside this meeting.
19:43:36 TOPIC: Review of rationale for omitting explicit references to automation from the Working Mode - we touched on this during the 2023-05-22 meeting and agreed that Matt's perspective was critical
19:44:24 jugglinmike: This came up two weeks ago. We were talking about my task of describing how automation layers onto the working mode, even though the working mode doesn't describe automation.
19:44:39 jugglinmike: James was not convinced of the utility of organizing our work that way.
19:44:59 jugglinmike: As this has been a theme of my work, I want to make sure we are aligned on our direction.
19:45:10 JS: Yes, I wasn't sure what we were trying to achieve.
19:46:22 JS: For our tests, it doesn't matter if the responses are entered by a human or a machine. But the results may need to be checked by a human. We are a long way away from automation that checks responses, interprets them, and provides its own verdicts.
19:46:43 JS: Even if we get to that point, many years from now, I still think it's valuable to have a human check the responses.
19:47:19 JS: The automation may be able to say a response is unexpected, but it won't be able to categorize how it's unexpected.
19:48:28 MK: I asked Boaz about abstracting the working mode. I want to make sure the working mode states how the business of the work happens. It's the process for generating the spec, but it's not an operations manual.
19:49:01 MK: I think that there are some things about how the group currently uses the app that can be written into documentation.
19:49:52 MK: I think later on we can decide what a human does or a machine does, but that is outside the set of principles of the work.
19:50:47 JS: That makes sense, but let's make that very clear. For someone new to the project, we want them to understand both angles, not just one dry article that describes who does what.
19:50:59 MK: I still think we should get the roles out there.
19:52:17 MK: The working mode does need to specify who does what, e.g. "Directors need to approve this." The scope of the working mode needs to include scope of authority.
19:53:14 JS: Okay, I agree. The work that happens day to day is more practical: how the app works, what it does well, etc. I do think there is a disconnect between the working mode and how we actually do things. This is partially what Mike is bringing up.
19:53:33 JS: We need a document that outlines governance, and another document that defines how we work.
19:54:02 JS: The governance document is more abstract; you can't go from it directly to an implementation. There needs to be a step in between.
19:54:54 MK: I agree; we are slowly building towards this, e.g. the wiki work I recently did to describe how we write tests and how we onboard people. We don't have much in the way of app documentation.
19:57:05 jugglinmike: There is one thing that comes to mind that is fundamental to the work I'm doing. When we talk about roles, who is responsible for initiating automation? I've been assuming that's a test admin's job; if that's the case, then we have to talk about what the test admin is doing. There's another framing, however, that changes what we build, which makes it the tester's responsibility: can Matt assign Louis some tests, and then Louis runs that automation?
19:57:27 MK: It's feature design; we can say it both ways.
19:58:18 MK: We could make a feature where a tester uses AT to generate a response, then adjusts it to be correct, and submits it as part of a manual test.
19:58:55 MK: So right now, I believe we said our MVP for automation is: somebody (we didn't say who) can collect responses for a test plan run.
19:59:20 MK: The automation will know what AT to spin up and what the tests are, and will run them.
19:59:56 MK: We can add to that for MVP Prime: if there is a previous run of that same plan and any differences exist in the responses, flag those.
20:00:44 MK: That's so we can identify regressions. If a new version of Chrome comes out, automation can recheck everything and say, yes, it's still supported.
20:01:18 jugglinmike: So, for the short term: can I propose a change to the working mode that would capture a test admin collecting AT responses?
20:01:23 MK: I don't think we need that.
20:02:00 jugglinmike: Right now the working mode just describes running the tests. We need to split up running tests and assigning verdicts. And we need to define the actors who will do these things.
20:02:23 JS: The more we abstract these details, the vaguer they become.
20:02:42 JS: If we're saying this level of detail needs to go into another document, that's something else.
20:03:40 MK: So a test admin can run tests, but there's nothing in the working mode that says what a tester needs to do to run a test, even if the first thing they do is press a button and the AT runs the test.
20:04:25 MK: The working mode doesn't care what buttons to press or what the scope of the test is. Running a test can be: I ran a test and got the same results as the AT. We can write a manual to describe that process, which is what we do now.
20:04:47 jugglinmike: So what we are saying is there is no change to the working mode.
20:04:56 MK: Yes, I don't see a need to change it.
20:06:08 MK: The working mode says the goal of the work is to make judgements about a test: how are the screen readers behaving, acceptable or not? That is the role of the tester. The test admin's role is to make sure they agree with what the testers are doing, and to resolve conflicts when there are any. The working mode doesn't say what buttons to press or how many characters to enter.
20:06:32 jugglinmike: So should we give running automation to just test admins, or to everyone?
20:06:42 MK: Whatever you think is better and faster.
20:06:51 JS: I don't think human testers need to be involved in that.
20:07:43 JS: In the pattern we follow now, granted, testers assign themselves to tests, but for the most part we gather info on who is willing to do which tests, then we assign the tests and work to resolve conflicts. The test admin is the gatekeeper who makes sure everything stays on track.
20:08:12 JS: We don't want people assigning themselves to things we're not ready to review.
20:09:03 JS: I see automation in a similar light; once in place, it may make this easier. The more we use the system, the more it may know what we want, but there is still a manual element of having humans run tests and review conflicts.
20:10:31 MK: I'm good with a conservative approach; we should roll out the smallest, simplest, least risky / most useful approach. Let's not give too much power to everyone on day 1.
20:11:42 jugglinmike: I'm envisioning that the test admin can see who has been assigned to a test plan, but now they have a new ability to say "collect new responses for this tester."
20:12:00 jugglinmike: As responses came in, they would be entered in the correct places.
20:12:19 jugglinmike: We should make space for the system to have errors, in that we can retry certain commands.
20:13:03 jugglinmike: In the case where there is an issue, we can have another tester run a particular assertion and compare results.
20:14:07 MK: I think so. We may want to do it like this: the test plan is in the queue, and instead of assigning the test, they just press a "run" button that creates an unassigned data set; when it's done, we can assign someone to it who will complete and validate the report.
20:15:42 MK: Please put together a design proposal and let's go through it. I think you are on the right track.
20:19:59 jongund has joined #aria-at
21:16:09 Zakim, end the meeting
21:16:09 As of this point the attendees have been jugglinmike, Sam_Shaw, James Scholes, Matt_King, mmoss
21:16:11 RRSAgent, please draft minutes
21:16:13 I have made the request to generate https://www.w3.org/2023/06/05-aria-at-minutes.html Zakim
21:16:20 I am happy to have been of service, jugglinmike; please remember to excuse RRSAgent. Goodbye
21:16:20 Zakim has left #aria-at
21:20:15 RRSAgent, make log public
21:41:41 jongund has joined #aria-at
22:48:12 jongund has joined #aria-at
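[Editor's note: a minimal sketch of the "MVP Prime" regression flagging Matt describes at 19:59:56-20:00:44: diff newly collected AT responses against a previous run of the same test plan and flag any differences for human review. All type and function names below are hypothetical assumptions, not the ARIA-AT app's actual data model.]

```typescript
// Hypothetical shape of one collected AT response.
interface CollectedResponse {
  commandId: string; // identifies the command/test the response belongs to
  output: string;    // the AT's captured output for that command
}

// Hypothetical shape of a flagged regression candidate.
interface FlaggedDifference {
  commandId: string;
  previous: string | undefined; // undefined when the command is new to this plan
  current: string;
}

// Flag every response that differs from, or is missing in, the previous run.
function flagDifferences(
  previousRun: CollectedResponse[],
  currentRun: CollectedResponse[]
): FlaggedDifference[] {
  const previousById = new Map(
    previousRun.map((r): [string, string] => [r.commandId, r.output])
  );
  return currentRun
    .filter((r) => previousById.get(r.commandId) !== r.output)
    .map((r) => ({
      commandId: r.commandId,
      previous: previousById.get(r.commandId),
      current: r.output,
    }));
}
```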