W3C

- DRAFT -

October 23, 2019 ARIA and Assistive Technologies Community Group telecon

23 Oct 2019

Attendees

Present
Matt_King, Jean-Francois_Hector, spectranaut, michael_fairchild, shimizuyohta
Regrets
Chair
Matt King
Scribe
Jean-Francois_Hector

Contents


<scribe> scribe: Jean-Francois_Hector

Prototype plan

VY: This prototype is a way to test our test design.

The CSS working group test harness would require refactoring. I don't think that we should reuse it, but we can draw inspiration from it.

The aim by the end of this week is to propose building something. It will be a pretty simple test runner, so that we can discuss it next Wednesday.

It'd be nice if our test harness were compatible with WPT tests. There's a lot of infrastructure we could borrow from WPT tests.

Yesterday I took everything that we've concluded so far and put it in WPT format, for us to discuss today.

MCK: We did a pretty thorough job of looking at ways of reusing open source software. The bottom line is that the options were either not well suited or bloated.

One exception was the CSS test harness, but the primary maintainer is planning on a major refactor.

I want to make sure that we all agree that building our own prototype of a harness is the right way to go.

JG: What technology would be used to build the harness?

VY: I haven't yet made conclusions on what technology the test harness should use.

MCK: That also depends on what hosting options are available to us. Will we have to use a W3C login? I believe that having a GitHub-based login would be better. But we don't yet know what we can run from GitHub.

JG: Authentication is also a key consideration.

VY: The goal for this next month is to see what it's like to present the test that we write, and come to conclusions about how we write the tests.

The test suite should be treated separately from the test harness.

MCK: The idea of using WPT format is really good, because having the test defined is a big part of the work.

MF: I'm good with this approach of continuing with the prototype.

I'm curious about whether and how the project I've been doing can be explored.

It's hosted on a virtual machine in digital ocean (similar to AWS). It's a NodeJS application.

The hosting is very cheap.

MCK: Having it hosted in something that is recognised as W3C sanctioned would be good.

Right now we are a community group, which is different from being a working group. I don't know if community groups get allocated a space within W3C hosting environments.

JG: I have several servers that I maintain, with University of Illinois domain. If you need a place, especially for early prototype, it's an option.

WPT format for ARIA-AT tests

<spectranaut> https://github.com/w3c/aria-at/pulls

<spectranaut> https://github.com/w3c/aria-at/pull/12

<spectranaut> example of an automated test: http://web-platform-tests.live/css/css-grid/alignment/grid-alignment-implies-size-change-001.html

VY: With WPT the tests run as JavaScript files, and the output is put in an HTML document.

<spectranaut> manual test in wpt repository: https://github.com/web-platform-tests/wpt/blob/master/css/css-color/lab-004.html

Manual tests are possible with WPT. In those instances, the test file is a description of what you should do.
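As a rough illustration of the automated flavour (a made-up file, not one from the WPT repository), a self-contained WPT test is an HTML page that pulls in testharness.js, which supplies `test()` and the `assert_*` functions, and testharnessreport.js, which renders the results into the page:

```html
<!DOCTYPE html>
<meta charset="utf-8">
<title>Illustrative automated WPT test</title>
<script src="/resources/testharness.js"></script>
<script src="/resources/testharnessreport.js"></script>
<input type="checkbox" id="box">
<script>
  // A made-up example: check that toggling a checkbox updates its state.
  test(() => {
    const box = document.getElementById("box");
    box.checked = true;
    assert_true(box.checked, "checkbox should report the checked state");
  }, "checkbox state can be set from script");
</script>
```

A manual test keeps the same self-contained structure, but replaces the automated assertions with human-readable steps and expected results.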


MCK: These WPT HTML files, where tests are written, do you normally write this HTML manually, or are there some tools to help you generate them in WPT format?

VY: I haven't seen tools, but I expect some working groups have tools. WPT is a huge test suite.

JG: The ARIA working group uses WPT to test ARIA implementations.

For ARIA, they had a wiki page, and a converter that converted the wiki page into WPT format.

My guess is that this is still the case.

<spectranaut> wpt html file: https://github.com/web-platform-tests/wpt/blob/master/css/css-grid/alignment/grid-alignment-implies-size-change-001.html


VY: These tests are self-contained HTML files.
... Looking at the pull request now

<spectranaut> walking through: https://github.com/w3c/aria-at/pull/12/files#diff-d2f730e4e9d1fcae4a80e40d270e9f2f

Let's go through 'read-checkbox.html' within the pull request

What I'm going to say here is recorded in the description of the Pull Request

The test harness is the JavaScript library that allows each test to be self-contained.

See the example of an abstract operating instruction on line 15

MCK: On line 15, is the word 'action' in this file coming from a WPT format?

VY: No, this is the first draft of an API that we can completely change

MCK: So the language of our API, and the abstract language that we use, are pretty important here. It's something we do need to settle on.

Michael, I'm curious to hear how this compares to how you've architected 'http://a11ysupport.io'

MF: It's hard for me to fully compare it. In my project, we don't test a group of assertions at the same time: we only test one assertion at a time.

e.g. you would test whether the role is announced, then whether the name is announced, etc and all of those tests are recorded individually.

If you're using the command to navigate to the next form field, that implies that you are in reading mode. In http://a11ysupport.io I use that to infer the screen reader mode

MCK: You can tab in reading mode, and you can tab in interaction mode, and get pretty different results

E.g. the Bluejeans website, when I'm booking meetings: I force it into interaction mode, and keep it there, because with the way that they coded it this helps me get the information that I need

You get different information in different modes

I sometimes disable the automatic mode switching

NVDA doesn't stick in interaction mode all the time

We want to make sure that there are the right test pre-conditions

Going beyond screen reader testing, we probably need to make sure that we can describe pre-conditions that are not screen-reader specific.

I'd be comfortable with the ARIA-AT project being scoped by the type of assistive technologies to be supported, so we don't need to get hung up on that, but it's good to think about extensibility

MF: When the user says 'this is a success', does that imply that it's been a success with all the possible commands? or only some of them?

VY: We should make it clear that testers should test with every command. Initially they have two result options: pass / fail. If it fails, they could indicate what failed

I want it to be the test author who decides what they record about the results of each test. Maybe that flexibility is bad and we should make it more programmatic.

I'm imagining that the ARIA-AT harness will know, based on URL parameters, what AT you're testing, and will show the concrete commands for it

MCK: I'm still a bit fuzzy about how the specific pieces of information needed to complete a test are organised across the different places, and pulled together

VY: e.g. The test harness will know everything that it needs to know to tell testers what commands they should use to put a screen reader to Interaction mode. It has a map of abstract commands to specific instructions
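That mapping could be as simple as a nested lookup table. A minimal sketch, assuming hypothetical screen reader names, modes, and abstract commands (none of this is the actual ARIA-AT API):

```javascript
// Hypothetical sketch of an abstract-command-to-keystroke map.
// Screen reader names, modes, and keystrokes are illustrative only.
const commandMap = {
  nvda: {
    reading: { "navigate to next checkbox": "X", "activate checkbox": "Space" },
    interaction: { "navigate to next checkbox": "Tab", "activate checkbox": "Space" }
  },
  jaws: {
    reading: { "navigate to next checkbox": "X", "activate checkbox": "Space" },
    interaction: { "navigate to next checkbox": "Tab", "activate checkbox": "Space" }
  }
};

// Resolve an abstract command for the AT and mode named in the URL,
// e.g. ?at=nvda&mode=reading
function concreteCommand(at, mode, abstractCommand) {
  const perAt = commandMap[at];
  if (!perAt || !perAt[mode] || !(abstractCommand in perAt[mode])) {
    throw new Error(`No mapping for "${abstractCommand}" (${at}, ${mode})`);
  }
  return perAt[mode][abstractCommand];
}

console.log(concreteCommand("nvda", "reading", "navigate to next checkbox")); // "X"
```

The harness would then interpolate the resolved keystrokes into the test instructions it shows the tester.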

MCK: In our architecture, we need a high level description of all the files – where all the information would be stored – and what we would need across those files

JG: VoiceOver doesn't really have a reading vs interaction mode.

MCK: The reading mode would be when you have the QuickNav enabled.

VY: To summarise one thing that I've been thinking about:

I'm going forward with the assumption that every screen reader we're going to be testing has these abstract modes, or that there's a way to describe these tests in a generalised way that applies to all screen readers

And then we need to be able to map that to the different screen readers

MCK: Yes, and we want to make sure that the test expectations are well documented.

I don't know for sure how to do the same thing for mobile screen readers. I think that it's a good idea to test our concept around desktop screen readers first.

We could treat mobile screen readers as a different class of assistive technologies, and extend the API accordingly

It might add some other API calls or parameters, rather than need completely different files

VY: This test page has two 'presentATTest' calls.

Between the two, the checkbox state changes.

... when testing just reading the state,

so that if setting the state of the checkbox is broken, that doesn't contaminate the reading test

MCK: That's cool. That ensures that the test page's preconditions are met

VY: So the first test is reading an unchecked checkbox, and the second test is reading a checked checkbox

So we might end up with 4 tests here, as there are two reading tests and two screen reader modes

Should the runner show you the first test in reading mode, and then show you another test which is the same but in interactive mode? Not sure yet

MCK: This is more a matter for how we write tests. But users might not care that these two tests are defined in the same file

VY: A test file can have multiple tests in it. The tests are presented one at a time. You mark the results, then you see the next test. The tests could be presented in an iframe

I wonder whether this test should cover the checkbox's mixed state, even though in this situation it doesn't have much meaning
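The one-at-a-time presentation described above amounts to a small amount of runner state. A sketch with hypothetical names (the real runner would load each test page into an iframe; here the page is just a string so the control flow stands alone):

```javascript
// Hypothetical sketch of a one-at-a-time test runner, as discussed above.
class TestRunner {
  constructor(tests) {
    this.tests = tests;   // e.g. [{ name, page }]
    this.index = 0;
    this.results = [];
  }
  current() {
    return this.tests[this.index] ?? null; // null once the run is finished
  }
  record(passed, note = "") {
    const test = this.current();
    if (!test) throw new Error("run already finished");
    this.results.push({ name: test.name, passed, note });
    this.index += 1;      // advance to the next test
  }
}

const runner = new TestRunner([
  { name: "read unchecked checkbox", page: "read-checkbox.html" },
  { name: "read checked checkbox", page: "read-checkbox.html" }
]);
runner.record(true);
runner.record(false, "state not announced");
console.log(runner.results.filter(r => r.passed).length); // 1
```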

VY: I've created several HTML files, which share the same JavaScript file and CSS file

It's much easier to write a test page that has one checkbox than one that has four

MCK: We could define a correlation between this simple two state example and the one in the APG. As long as we reuse exactly the same code except for removing some checkboxes

We could change the APG two state checkbox example to make it simpler

What we do depends on whether we would want to test the grouping label or not

JF: We could keep the examples as they are in the APG, but focus the testing instructions on just 1 checkbox

MCK: That could be an approach

For things like Grid, where there are three grids on the same APG example page, we should simplify the example page so each only contains one grid

JG: Yes maybe we should simplify what's in the APG examples. For example, the sandwich condiments checkbox examples could be a test for grouping labels.

VY: I'm afraid of the tests and examples being maintained in two different places

I like the philosophy that these tests could be standalone and be reusable.

If we change the example, we'd need to check that the tests don't break.

What we need the examples to be for the APG, and what we might need them to be for testing, might come into conflict

The APG examples can be too complicated for what we need for testing

MCK: Where ATs often fall apart is when things are like the real world, rather than isolated

I hope that they don't come into too much conflict, i.e. that we can design these tests so that the same examples serve both for testing and for the APG. I'd like to test that premise

VY: Let's go with that premise for now, and see if we can make it work

MCK: The test could become invalid if the example changes. The test totally depends on the example. So how do we ensure that the dependency is managed?

VY: These tests could be part of the APG.

MCK: We already have a test directory in the APG for regression tests.

VY: There's one more test file that we haven't talked about: it's the one about operating a checkbox
... It'd be useful to know what the list of commands is corresponding to 'operating checkbox', for example

MCK: Operating might work in reading mode, and not in interaction mode, and vice versa

There might be another assertion about whether the name of the checkbox is announced after it's been operated.

Not all screen readers have done that

Maybe we should differentiate between the must haves and the nice to haves

MCK: There's a part of me that thinks that I won't be able to fully comprehend the consequences of all these decisions before we have a partially functional test runner

VY: Yes, it's a bit of a chicken-and-egg problem

<jongund> got to go

<jongund> thanks

MCK: One next step is actually writing the content that will be in these files. And I'm also going to do some testing myself to test the interface. I imagine multiple iterations on both. The sooner we can get to the point where we can test our ideas, the better.

VY: Maybe it makes sense to similarly write the tests that you've outlined for Menubar. And it'd be useful to have a table mapping actions and modes to specific operating instructions

E.g. 'use these commands to navigate to the checkbox'

MCK: Different objects have different ways to navigate to them (e.g. x for checkbox, t for table). Some of these instructions are common to several objects (e.g. tabbing to focusable elements)

JF: I'll put together a first version of this mapping between abstract operating instruction, concrete command, and screen reader mode

<scribe> MEETING: ARIA and Assistive Technologies Community Group for Oct 23, 2019

Summary of Action Items

Summary of Resolutions

[End of minutes]

Minutes manually created (not a transcript), formatted by David Booth's scribe.perl version 1.154 (CVS log)
$Date: 2019/10/23 17:51:11 $
