Accessibility Conformance Testing Teleconference

05 Oct 2016

See also: IRC log


Wilco, Alistair, Romain, Shadi, Kathy


Availability survey

<Wilco> https://www.w3.org/2002/09/wbs/93339/availability/

wilco: shadi set up a survey for our availability
... please fill it out!

shadi: you can keep updating the survey, it will keep the latest entry

shadi: before or after the call

ISO take-aways

alistair: it was an overview of the good practices people follow for acceptance testing
... it's an area we're interested in
... we're looking for features
... positive or negative features, how to write those up
... then best practices on how to write those tests
... not rocket science, but good advice
... basics that software testers learn

wilco: do we have both positive and negative feature tests?

alistair: failure techniques are generating negatives
... success techniques generate positives
... we can look at negatives first

wilco: looks good, we should take that into consideration
... a11y testing does have both positive and negative testing
... looking at axe and other tools, there's a clear distinction
... it seems it makes a lot of sense
... what I still don't think we have is the ability to say "pass"
... so it's non-conformance testing instead of conformance testing
... is there anything about that in ISO?

alistair: the ISO stuff doesn't go in these details
... it's more about writing standards

wilco: so we'd have to decide on whether we want to do that

alistair: in my experience you settle on one way
... what we want to decide is whether the end product is a claim or rather "you have not done this"

wilco: it seems to me we're aiming at the latter

alistair: right

wilco: did you work with Charu?

alistair: no, just did this 20 mins in the morning ;-)

shadi: things we can start putting in requirements:
... atomicity: how atomic is a test
... another question is what we just discussed about positives or negatives
... I feel a lot of this is good stuff to put in some kind of requirements
... atomicity, relation to SC
... translate what we mean by atomic in the context of WCAG

alistair: the 1st thing is not to build massive tests
... for SC, look at the techniques you can use to meet that
... then break it down to smaller tests
... that means we need to know what techniques we're following
... another thing is non technique tests
... e.g. an image: a technique test is test if there's an aria label value
... a non-technique test is "does the image have a computed accessible name value"
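A minimal sketch of the distinction Alistair describes, with hypothetical function names and elements modelled as plain objects so no DOM is needed; the real accessible name computation (defined in the Accessible Name and Description spec) is far more involved than the simplified precedence shown here.

```javascript
// Technique-level test: checks one specific technique (a non-empty
// aria-label, in the style of WCAG technique ARIA14).
function passesAriaLabelTechnique(el) {
  const label = el.attrs["aria-label"];
  return typeof label === "string" && label.trim().length > 0;
}

// Outcome-level (non-technique) test: does the element end up with *any*
// accessible name, regardless of which technique produced it?
// Greatly simplified precedence: aria-label > alt > title.
function computedAccessibleName(el) {
  for (const attr of ["aria-label", "alt", "title"]) {
    const value = el.attrs[attr];
    if (typeof value === "string" && value.trim()) return value.trim();
  }
  return "";
}

const imgWithAlt = { attrs: { alt: "Company logo" } };
console.log(passesAriaLabelTechnique(imgWithAlt)); // false: that technique wasn't used
console.log(computedAccessibleName(imgWithAlt));   // "Company logo": outcome is still fine
```

The example shows why the two kinds of test report different things: the technique test fails on an image that uses `alt`, while the outcome test passes.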

shadi: is this now the right time to discuss this and put that in reqs?
... so that we have an understanding, and detailed later in spec

<Wilco> https://www.w3.org/WAI/GL/task-forces/conformance-testing/wiki/ACT_Framework_Requirements

shadi: right now our reqs are very high level

alistair: ultimately we want to achieve massive amount of test coverage
... we'll want both approaches
... top-down: you don't know the underlying techniques
... bottom-up: you know the techniques, test if they are implemented well

wilco: I've looked at other areas, e.g. SVG
... if you look to make assessments about a11y support requirements
... you need to consider which exact attribute led to an accessible name
... it goes back to techniques
... you may have used a known technique, or another one. we need to report on these differences

alistair: it's almost the difference between warnings and fails
... you won't be able to say "you've done it wrong" if you look only at aria-label for instance
... if you look at the outcome, it's about the SC
... if you look at the techniques, you can say "yes you pass", but maybe another technique would have been a better way (???)

shadi: what I don't know about our framework is whether we're expecting some kind of a11y support as input to our tests
... assumptions as input
... for that baseline you define which does or doesn't work
... I don't want to get into the specifics right now
... is this the stuff we want in our req document?

kathy: on the a11y support side we'll always have difficult times
... to say this technique is supported or not
... so many different components are part of it
... we can't really have that in here. Would be great if we could
... but I'd be concerned if we tried to put a11y support there, if only for maintenance difficulty

wilco: we've discussed a11y support before, when we started ACT
... we don't want to have that baked into rules, for the reasons you mentioned
... you need to decide on granularity
... ideally, we'd get to a point where the user has some sort of baseline
... (probably up to the tools developers)
... so you can say "given this support", maybe a matrix, as input in a rule
... so we're not stuck in a11y support issues

alistair: I agree 100% with you both
... definitely not put that into the tests themselves, but make sure we have the broader coverage
... the tools developer can apply some weightings, but that's not for us to hardcode in the tests

wilco: what I want is to put a11y support data as input, and build results based on that

alistair: it looks like something tools developers should put in
... we're talking about how to write tests
... how to utilize them is different

shadi: isn't it part of how to write tests?
... I agree that we shouldn't hard code a11y support in tests
... the question is whether we want a mechanism for tool developers or users to put in a database of assumptions

alistair: it's still metadata related to the test, so you need to maintain it

shadi: let's take a simple example, aria-label
... right now, the test is to check for aria-label
... if we're not considering a11y support, it's just a test to see if aria-label exists
... a 2nd approach, is to have somewhere else in the test rule a statement to say "this rule relies on aria-label to be implemented"
... so when you implement, you can look at these statements
... in the reqs, we want to come to an agreement and write that down
... how our framework deals with a11y support statements

alistair: I agree
... you need to push the information in a kind of "relies upon" section

kathy: I think that would work
... I agree with shadi
... the more information the better
... we just need to be careful about not including things that would be impossible to maintain

wilco: should we add a resolution?

shadi: my suggestion is to put that in the requirements document, or record a resolution to do so
... and for people to think about other headings to put in the requirements document
... you can send an email and request comments

<shadi> romain: to what extent do we want to say in the reqs what the rules will look like

<shadi> ...like reqs for reqs

wilco: the way I see it is: what features would I want the rules to have, what are the quality aspects
... the way to deal with a11y support is one of them

shadi: we can take a middle way
... I don't want to wordsmith, but we can find a way to turn that into a requirement
... we don't have to take a decision right now
... we might later have other areas that influence this decision
... for now, we need to gather these topics, to have a kind of skeleton for our framework


ARIA test harness take-aways

<Wilco> https://www.w3.org/WAI/GL/task-forces/conformance-testing/wiki/Testing_Resources#Take-aways_from_WPT_.26_ARIA_Test_Harness

wilco: little documentation, but interesting ideas
... WPT have 5 test types
... some of them pretty similar to what we'll be doing
... ** goes over the 5 test types **
... about the requirements, we're aligned on atomicity, being short and minimal, and cross-platform
... e.g. "no proprietary techniques"
... they also talk about self-contained tests
... all their tests are written in HTML
... we don't use a format in auto-wcag, but WPT is explicit about it
... do we want to say what format we want the rules to be written in?

shadi: can you elaborate?

wilco: all their tests are in HTML
... their tests are really atomic, which is very powerful

alistair: self-contained tests is very important
... in auto-wcag we have some dependencies between tests

wilco: the reason for that is atomicity
... it results in very big rules
... point 6 is about "accessible name", "role". we had that disucssion
... all tests are "pass" or "fail" tests
... do we need an output format? or do we only want a pass/fail

alistair: I think you'll need more than that
... non-applicable, pass/fail, review

wilco: do we really need something like EARL or can we make it work with something simpler
... (boolean result)

shadi: EARL is trying to describe results, which can come from any model (binary or not)
... we concluded we have 4/5 types
... pass, fail, untested, etc
... my impression is that e.g. "cannot tell" must be in there
... there may be triggers for a test, but the test won't be able to tell pass or fail
... "non-applicable" might not be needed
... in terms of warnings, they are usually essentially a "pass", just an ugly one
... we may not want a value for "warning", it's just an attribute
... some tools like to say "nearly pass"
... if it's a very minor issue. it kinda works.
... it's actually a fail. you have a problem, it just doesn't have much impact
... these are nuances on pass
... that's why you have extra info in EARL
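The EARL 1.0 Schema defines five outcome values (passed, failed, cantTell, inapplicable, untested). A sketch of how a tool's internal result might map onto them, treating a "warning" as extra info on a pass rather than a separate outcome, as suggested above; the function and field names are illustrative, not from EARL itself.

```javascript
// The five outcome values from the EARL 1.0 Schema.
const EARL_OUTCOMES = ["passed", "failed", "cantTell", "inapplicable", "untested"];

// Hypothetical mapping from a tool's internal result object
// ({ applicable, tested, result, warning }) to an EARL-style outcome.
function toEarlResult(internal) {
  if (!internal.applicable) return { outcome: "inapplicable" };
  if (!internal.tested) return { outcome: "untested" };
  if (internal.result === null) return { outcome: "cantTell" };
  const outcome = internal.result ? "passed" : "failed";
  // A "warning" is not its own outcome: it's a pass carrying extra info.
  return internal.warning ? { outcome, info: internal.warning } : { outcome };
}

console.log(toEarlResult({ applicable: true, tested: true, result: true }).outcome);
// "passed"
console.log(toEarlResult({ applicable: false }).outcome);
// "inapplicable"
```

This keeps the result vocabulary non-subjective while still carrying the nuance ("nearly pass") as attached information rather than a new value.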

wilco: we'll definitely need to have this discussion again.

shadi: we can untangle this from EARL
... for now, we can say we'll need to define the results (with potential candidates)
... when we get into that, we can discuss it more

alistair: it needs to be totally non-subjective
... "nearly pass" is very subjective
... "non-applicable" is not subjective, and very useful to know
... not having it may lead to wrong conclusions
... all of these have to be non-subjective

wilco: +1
... shadi, have you looked at github?

shadi: you all should have access and be able to edit
... I set up a team
... I've not set up the automated publishing process

<Wilco> https://www.w3.org/WAI/GL/task-forces/conformance-testing/track/

wilco: quick look at action items
... last thing: we have our planning document

<Wilco> https://www.w3.org/WAI/GL/task-forces/conformance-testing/wiki/Month_by_month_plan

wilco: by december-ish we'll need agreement on the requirement, and a first draft

Summary of Action Items

Summary of Resolutions

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.144 (CVS log)
$Date: 2016/10/06 06:39:58 $