W3C

- DRAFT -

November 27, 2019 ARIA and Assistive Technology Community Group Teleconference

27 Nov 2019

Attendees

Present
Matt_King, Jean-Francois_Hector, spectranaut, michael_fairchild
Regrets
Chair
Matt King
Scribe
Jean-Francois_Hector

Contents


Prototype Status Update
The runner
Reports
Use cases

<scribe> scribe: Jean-Francois_Hector

Matt: I've figured out a way for Valérie to have 3 more days to work on the prototype this year.

Three Fridays: November 29th, December 6th and December 13th

They're not meeting dates, but I (Matt) can provide updates and so forth and be the conduit for that

I was looking at the prototype. We're in a pretty good place insofar as we've figured out a lot of stuff. But for the prototype to do what we need it to do before we develop a full app, we need a little bit more

Valérie and I talked about these priorities

Having a pretty usable report is a super high priority

And also focusing on improving the usability and accessibility of the harness

We feel like the runner is already in a pretty decent state, and we want to minimise the amount of time we put into it, other than the reporting part

The runner

<spectranaut> runner: https://w3c.github.io/aria-at/runner

VY: Here's a link to the runner

The purpose is to select a bunch of tests and perform them

I have a bit of usability feedback from Matt and Michael

This is not for reuse, just a prototype

If you select one of the checkbox tests, you can see the same tests, but now in an iframe

<spectranaut> https://w3c.github.io/aria-at/tests/checkbox/read-checkbox.html

Go to that URL to see the tests

There's a concept of 'Testing multiple behaviours in a single test'

That's proved to be confusing

Each test is now going to be a unique combination of screen reader mode and thing to test [I didn't get the details]

Matt: [...] will explicitly be 4 tests (rather than being rolled together into a single test)

That makes it clearer how many more tests you have to run, and how to navigate through them
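A rough sketch of what that split could look like as data; the field names and the exact four checkbox combinations below are illustrative assumptions, not the prototype's actual test format:

```typescript
// Hypothetical shape for one test: a unique combination of screen reader
// mode and user task, rather than several behaviours rolled into one test.
interface TestDefinition {
  title: string;                    // brief, human-readable title
  mode: 'reading' | 'interaction';  // screen reader mode under test
  task: string;                     // the user task being exercised
  commands: string[];               // keystrokes the tester should try
}

// One combined "read checkbox" test becomes four explicit tests:
const checkboxTests: TestDefinition[] = [
  { title: 'Read checkbox in reading mode',
    mode: 'reading', task: 'read checkbox', commands: ['X', 'Shift+X'] },
  { title: 'Read checkbox in interaction mode',
    mode: 'interaction', task: 'read checkbox', commands: ['Tab', 'Shift+Tab'] },
  { title: 'Operate checkbox in reading mode',
    mode: 'reading', task: 'operate checkbox', commands: ['Space'] },
  { title: 'Operate checkbox in interaction mode',
    mode: 'interaction', task: 'operate checkbox', commands: ['Space'] },
];
```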

VY: We're thinking of splitting the results by user task. Is that what we're thinking?

MCK: It's where the expected output is different that matters. It depends.

VY: Re iframes: if you click on 'run tests', the tests currently run inside an iframe in the runner

MCK: The URL is staying the same for each test file

VY: Yes, that's not ideal

MCK: There's a risk someone might lose their work

MF: You could also just save the current result to session storage until you start a new test

VY: It's on my shortlist to do something like that

MCK: Maybe for the prototype it doesn't matter too much

The iframe should have a title attribute. It's hard when I tab to it and hear a URL
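As a sketch of both suggestions, assuming the runner renders each test file into an iframe; the element id and storage key below ('test-frame', 'aria-at-results') are made up for illustration, not the prototype's actual names:

```typescript
// Sketch: give the runner's test iframe an accessible name, and persist
// in-progress results to sessionStorage so a reload within the same tab
// doesn't lose the tester's work.
const frame = document.getElementById('test-frame') as HTMLIFrameElement;
frame.title = 'Test: read checkbox in reading mode'; // announced instead of a URL

const STORAGE_KEY = 'aria-at-results';

function saveResults(results: unknown): void {
  sessionStorage.setItem(STORAGE_KEY, JSON.stringify(results));
}

function restoreResults(): unknown {
  const saved = sessionStorage.getItem(STORAGE_KEY);
  return saved === null ? null : JSON.parse(saved);
}

function clearResults(): void {
  // Called only when the tester deliberately starts a new test.
  sessionStorage.removeItem(STORAGE_KEY);
}
```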

MCK: To minimise the amount of time that Valérie spends on documentation, I'm going to try to document some of the things on the wiki.

Another priority that I see for the prototype, if we can get to it, is the ability to review the tests.

Right now the only way to review the tests is to look at the HTML file. It's hard, especially as they are in multiple files, to get a sense of what we're testing, the flows, etc.

And you can't get it from the test names.

For reviewing the tests that someone has put in, I'm going to work on developing a format.

And to facilitate writing tests, which is hard, we could potentially change that report into a form (with edit fields), with the form then generating an HTML test file

So we'd have a way to generate and review the tests while being able to focus on the tests themselves (rather than on the coding of the tests)

It's a lot of work to write tests even just in a spreadsheet. But we could come up with a standard way to do it and a way to make it quicker

That form to generate tests is likely beyond the scope of this current prototype. But it's something we'd need to prototype before we go into production
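A minimal sketch of that form idea, reusing the hypothetical TestDefinition shape from the earlier sketch: form fields populate a structured definition, and the harness serialises it to an HTML test file. None of this is the committed design:

```typescript
// Sketch: turn a structured test definition plus its assertions into an
// HTML test file, so authors edit form fields instead of hand-writing HTML.
function generateTestFile(test: TestDefinition, assertions: string[]): string {
  const items = assertions.map((a) => `    <li>${a}</li>`).join('\n');
  return [
    '<!DOCTYPE html>',
    '<html lang="en">',
    `<head><title>${test.title}</title></head>`,
    '<body>',
    `  <h1>${test.title}</h1>`,
    `  <p>Mode: ${test.mode}. Commands: ${test.commands.join(', ')}</p>`,
    '  <ul>',
    items,
    '  </ul>',
    '</body>',
    '</html>',
  ].join('\n');
}
```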

MCK: I have one set of tests related to interactions (with editing keys that you have to press when you have a pop-up open), which I'm not even sure how to write. So it might be a challenge for the harness. I have an idea that I'd like to share later this week so we can assess.

MF: I'm happy with the work and progress made so far. Good work Valérie

I haven't yet authored a test, but I think I have a good idea of how that works. I'm curious what you think: when you created the combobox test, Matt, how long did creating that test take?

MCK: A lot. Although translating it to NVDA will be quite straightforward, as everything will be identical except the keystrokes

For VoiceOver we would leave out the automatic mode switch test

For desktop screen readers, I think that adapting it will be straightforward.

I don't yet know how much work it will be to translate to VoiceOver. But it takes time to verify that a specific assertion is appropriate for another screen reader

I don't see a way around doing that

I see three patterns that are hardest from the point of view of authoring the tests: combobox, menubar and grid (grid being the easiest of the three)

I want to do all of those before the end of the year

The next trickiest will probably be multiselect listbox and spinbutton

But once we have menubar, the menubutton task will be easy to get

After you get a certain core set of tests down, it's going to be a lot easier. And then a lot of the work needed is reviewing to avoid mistakes

VY: It sounds like Matt is developing a testing philosophy about how to make these tests easier

MCK: A standardised wording

VY: I think that writing the tests is always going to take time, as we want them to be easy to perform

MCK: With the prototype, I'd like to give it to, say, people at NVDA, and ask whether they could have one of their expert tester users use it. And ask them: 1) do they agree with the tests, and 2) do the tests capture everything that needs capturing re NVDA support?

And get feedback from them. For example: are these tests vendor neutral?

I want to do that with the prototype.

That's one of the goals of the prototype: are we on the right path from the point of view of the screen reader developers?

I want to go back to the use cases work we did at the beginning (see the wiki), and see which use cases we're covering, and what we would need to change for production

Those are the things that I think we can accomplish with the prototype

Reports

<Matt_King> https://github.com/w3c/aria-at/issues/19

See what Matt wrote in Issue 19

MF: I agree with everything you said there Matt

I agree with your approach with 'pass', 'fail' and a single column

MCK: The most important thing here is how to summarise support.

I think that we count a 'command - assertion' pair as a single entity

Eg 'insert+up - speak name' is one

Eg 'insert+tab - speak role' is another one

And we're counting every single input that we've gathered

And the unexpected things are not necessarily associated with a specific command (they're associated with a task). They could be associated with a command

'What's in the left hand column' and 'What is it that we're counting' are the two most important questions, I think
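A minimal sketch of counting on that basis, with made-up field names; the real report format is what Issue 19 is working out:

```typescript
// Sketch: each command-assertion pair is the unit that gets counted.
// E.g. ('insert+up', 'speak name') and ('insert+tab', 'speak role') are
// two separate entities, each passing or failing independently.
interface CommandAssertionResult {
  command: string;    // e.g. 'insert+up'
  assertion: string;  // e.g. 'speak name'
  pass: boolean;
}

function summariseSupport(results: CommandAssertionResult[]): string {
  const passed = results.filter((r) => r.pass).length;
  return `${passed} of ${results.length} command-assertion pairs passing`;
}
```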

Putting all of the assertions in the test title makes things hard

So we've talked about having more brief test titles

Eg 'Navigating to empty editable combobox switches mode' could be a title.

Details of the roles and states etc are not in the title, but in the detail page on the test harness

Use cases

<spectranaut> https://github.com/w3c/aria-at/wiki/High-level-use-cases

VY: Here's the link
... Use cases for web developers

Situation of struggle 1.1: I’m developing a web-based user interface. I want to make it accessible. But I'm not sure which ARIA attributes are well supported across different browsers and screen readers.

Situation of struggle 1.2: The code I’ve written isn’t producing the results I expected. Am I doing something wrong? Is there a bug in my codebase? Or is it an issue with the browser or screen reader? Or maybe there's nothing wrong. I’m actually not sure how this screen reader is expected to behave.

MCK: If you look at a report page, you would be able to see that there are failures for a specific behaviour

You would see "oh, there are fails related to that? What are those fails?" and you'd be able to see that JAWS is doing something weird related to the bug you're seeing

MF: How would website developers find the results?

MCK: We're not going to solve that until we have the results into the APG

JF: I believe that results should be available outside the APG, as many developers don't know about it
... There's a use case where a web developer isn't replicating an APG pattern, but is just using some ARIA and wants to know support for the ARIA they're using

MCK: There could be multiple paths. Eg through the APG, or MDN. On the APG there might be an index of attributes

There isn't a way to use most ARIA attributes on their own without them being part of some kind of pattern.

I imagine that we'll first build the infrastructure for all of this, and then figure out how to make sure that search results are good

VY: Let's look at situations of struggle for AT developers

Situation of struggle 2.1: I want the screen reader I’m working on to offer good support for an ARIA attribute. But it's not clear what supporting this attribute means. It's difficult to know how the attribute should be rendered in different situations. There are diverging opinions about what good looks like.

MCK: The ability to review the results will help here. So it's an important priority

I can imagine a screen reader developer saying "What did you expect the screen reader to do here?" or "How did you actually test it?" if there are 10 fails

VY: Situation of struggle 2.3: My team and I are working to improve a screen reader’s support for ARIA. We need to prioritise our efforts. But we don’t have a clear view of exactly where our current implementation is incomplete, broken, or fails to meet customers’ expectations.

MCK: Having the 'must have' and 'should have' columns would be critical

VY: Situation of struggle 2.4: I’m concerned that the screen reader I’m working on is not going to be evaluated in the right way, and that results made public might not be fair. I don’t know/believe that the tests assertions are based on a correct understanding of how my screen reader should work.

Right now it's hard to review tests, but if we can make it easier other people can raise issues and discuss what they think is wrong

MCK: We should prioritise coming up with a format for that
... We haven't figured out how to associate assertions with one or more ARIA attributes. I think that we need to do that.

This is only for reporting, so I wonder whether we need to do it in the prototype. Maybe not. It's debt if we don't do it, but maybe that's ok and we just need to raise an issue for it

VY: We should go through the backlog of issues and clean it up

MCK: That's something that the group can do after we're done with the prototype contract
... I'm super pumped about demoing this prototype. I feel that we've learnt so much about what we want to put into a system

And now I'm more convinced than ever that building it from scratch is the right decision. 100% convinced that this is the right way to go

VY: It took a while because we thought a lot about the design of the test, but it's not very much code. So we don't have to worry too much about debt

<scribe> MEETING: ARIA and Assistive Technologies Community Group for Nov 27, 2019

Summary of Action Items

Summary of Resolutions

[End of minutes]

Minutes manually created (not a transcript), formatted by David Booth's scribe.perl version 1.154 (CVS log)
$Date: 2019/11/27 18:03:50 $
