See also: IRC log
<Wilco> https://www.w3.org/2002/09/wbs/93339/availability/
wilco: shadi set up a survey for our availability
... please fill it out!
shadi: you can keep updating the survey, it will keep the latest entry
... before or after the call
alistair: it was an overview of the good practice people provide on acceptance testing
... it's an area we're interested in
... we're looking for features
... positive or negative features, how to write those up
... then best practices on how to write those tests
... not rocket science, but good advice
... basics that software testers learn
wilco: do we have both positive and negative feature tests?
alistair: failure techniques generate negatives
... success techniques generate positives
... we can look at negatives first
wilco: looks good, we should take that into consideration
... a11y testing does have both positive and negative testing
... looking at axe and other tools, there's a clear distinction
... it seems it makes a lot of sense
... what I still don't think we have is the ability to say "pass"
... so it's non-conformance testing instead of conformance testing
... is there anything about that in ISO?
alistair: the ISO stuff doesn't go into these details
... it's more about writing standards
wilco: so we'd have to decide on whether we want to do that
alistair: in my experience you settle on one way
... what we want to decide is whether the end product is a claim or rather "you have not done this"
wilco: it seems to me we're aiming at the latter
alistair: right
wilco: did you work with Charu?
alistair: no, just did this 20 mins in the morning ;-)
shadi: things we can start putting in requirements:
... atomicity: how atomic is a test
... another question is what we just discussed about positives or negatives
... I feel a lot of this is good stuff to put in some kind of requirements
... atomicity, relation to SC
... translate what we mean by atomic in the context of WCAG
alistair: the first thing is not to build massive tests
... for an SC, look at the techniques you can use to meet it
... then break it down to smaller tests
... that means we need to know what techniques we're following
... another thing is non-technique tests
... e.g. an image: a technique test checks if there's an aria-label value
... a non-technique test is "does the image have a computed accessible name"
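A rough sketch of the two styles of check Alistair describes, written here in TypeScript with DOM types; the function names and the simplified name computation are illustrative only, not the actual auto-wcag or ACT test procedures:

  // Technique-level check: does the image use this particular technique?
  function hasAriaLabelTechnique(img: HTMLImageElement): boolean {
    const label = img.getAttribute("aria-label");
    return label !== null && label.trim().length > 0;
  }

  // Outcome-level ("non-technique") check: does the image end up with a
  // non-empty accessible name, regardless of which technique produced it?
  // Real accessible-name computation follows the AccName spec; this only
  // approximates a few name sources for illustration.
  function hasComputedAccessibleName(img: HTMLImageElement): boolean {
    const sources = [
      img.getAttribute("aria-label"),
      img.getAttribute("alt"),
      img.getAttribute("title"),
    ];
    return sources.some((value) => value !== null && value.trim().length > 0);
  }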
shadi: is this now the right time to discuss this and put it in the reqs?
... so that we have an understanding, with the details to come later in the spec
<Wilco> https://www.w3.org/WAI/GL/task-forces/conformance-testing/wiki/ACT_Framework_Requirements
shadi: right now our reqs are very high level
alistair: ultimately we want to achieve a massive amount of test coverage
... we'll want both approaches
... top-down: you don't know the underlying techniques
... bottom-up: you know the techniques, and test whether they are implemented well
wilco: I've looked at other areas, e.g. SVG
... if you look to make assessments about a11y support requirements
... you need to consider which exact attribute led to an accessible name
... it goes back to techniques
... you may have used a known technique, or another one. we need to report on these differences
alistair: it's almost the difference between warnings and fails
... you won't be able to say "you've done it wrong" if you look only at aria-label, for instance
... if you look at the outcome, it's about the SC
... if you look at the techniques, you can say "yes, you pass", but maybe a better way is another technique (???)
shadi: what I don't know in our framework is whether we're expecting some kind of a11y support as input to our tests
... assumptions as input
... for that, you define a baseline of what does or doesn't work
... I don't want to get into the specifics right now
... is this the stuff we want in our req document?
kathy: on the a11y support side we'll always have a difficult time
... to say whether this technique is supported or not
... so many different components are part of it
... we can't really have that in here. would be great if we could
... but I'd be concerned if we tried to put a11y support there, if only for the maintenance difficulty
wilco: we've discussed a11y support before, when we started ACT
... we don't want to have that baked into rules, for the reasons you mentioned
... you need to decide on granularity
... what we can ideally achieve is to get to a point where the user has some sort of baseline
... (probably up to the tools developers)
... so you can say "given this support", maybe a matrix, as input to a rule
... so we're not stuck in a11y support issues
alistair: I agree 100% with you both
... definitely not put that into the tests themselves, but make sure we have the broader coverage
... the tool developers can apply some weightings, but that's not for us to hardcode in the tests
wilco: what I want is to put a11y support data as input, and build results based on that
alistair: it looks like something tool developers should put in
... we're talking about how to write tests
... how to utilize them is different
shadi: isn't it part of how to write tests?
... I agree that we shouldn't hard code a11y support in tests
... the question is whether we want a mechanism for tool developers or a user to put in a database of assumptions
alistair: it's still metadata related to the test, so you need to maintain it
shadi: let's take a simple example, aria-label
... right now, the test is to check for aria-label
... if we're not considering a11y support, it's just a test to see if aria-label exists
... a second approach is to have, somewhere else in the test rule, a statement to say "this rule relies on aria-label to be implemented"
... so when you implement, you can look at these statements
... in the reqs, we want to come to an agreement and write that down
... how our framework deals with a11y support statements
alistair: I agree
... you need to push the information into a kind of "relies upon" section
kathy: I think that would work
... I agree with shadi
... the more information the better
... we just need to be careful about not including things that would be impossible to maintain
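One possible shape for keeping accessibility support data out of the test logic itself and in a "relies upon" style metadata section, matched against a baseline supplied by the tool developer or user. This is a sketch in TypeScript with invented names and example data, not an agreed format:

  interface RuleMetadata {
    id: string;
    // Features the rule relies on being accessibility supported.
    reliesUpon: string[];
  }

  // The baseline is supplied by a tool developer or user, not by the rule.
  type SupportBaseline = Record<string, boolean>;

  const ariaLabelRule: RuleMetadata = {
    id: "image-has-accessible-name",   // hypothetical rule id
    reliesUpon: ["aria-label"],
  };

  // A rule only applies when everything it relies upon is in the baseline.
  function isRuleApplicable(rule: RuleMetadata, baseline: SupportBaseline): boolean {
    return rule.reliesUpon.every((feature) => baseline[feature] === true);
  }

  // Example: a baseline where aria-label is considered supported.
  const exampleBaseline: SupportBaseline = { "aria-label": true };
  console.log(isRuleApplicable(ariaLabelRule, exampleBaseline)); // true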
wilco: should we add a resolution?
shadi: my suggestion is to put that in the requirements document, or record a resolution to do so
... and for people to think about other headings to put in the requirements document
... you can send an email and request comments
<shadi> romain: to what extent do we want to say in the reqs what the rules will look like
<shadi> ...like reqs for reqs
wilco: the way I see it is: what features would I want the rules to have, what are the quality aspects
... the way to deal with a11y support is one of them
shadi: we can take a middle way
... I don't want to wordsmith, but we can find a way to turn that into a req
... we don't have to take a decision right now
... we might later have other areas that influence this decision
... for now, we need to gather these topics, to have a kind of skeleton for our framework
+1
wilco: WPT has little documentation, but interesting ideas
... they have 5 test types
... some of them pretty similar to what we'll be doing
... ** goes over the 5 test types **
... about the requirements, we're aligned on atomicity, and on tests being short, minimal, and cross-platform
... e.g. "no proprietary techniques"
... they also talk about self-contained tests
... all their tests are written in HTML
... we don't have a set format in auto-wcag, but WPT is explicit about it
... do we want to say what format we want the rules to be written in?
shadi: can you elaborate?
wilco: all their tests are in HTML
... their tests are really atomic, which is very powerful
alistair: self-contained tests are very important
... in auto-wcag we have some dependencies between tests
wilco: the reason for that is atomicity
... it results in very big rules
... point 6 is about "accessible name", "role". we had that discussion
... all tests are "pass" or "fail" tests
... do we need an output format? or do we only want a pass/fail?
alistair: I think you'll need more than that
... non-applicable, pass/fail, review
wilco: do we really need something like EARL or can we make it work with something simpler
... (boolean result)
shadi: EARL is trying to describe results, which can come from any model (binary or not)
... we concluded we have 4 or 5 types
... pass, fail, untested, etc
... my impression is that e.g. "cannot tell" must be in there
... there may be triggers for a test, but the test won't be able to tell pass or fail
... "non-applicable" might not be needed
... in terms of warnings, they are usually essentially a "pass", just an ugly one
... we may not want a value for "warning", it's just an attribute
... some tools like to say "nearly pass"
... if it's a very minor issue. it kinda works.
... it's actually a fail. you have a problem, it just doesn't have much impact
... these are nuances on pass
... that's why you have extra info in EARL
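A sketch of what a result model with more than a boolean outcome could look like, using the outcome values EARL defines (passed, failed, cantTell, inapplicable, untested); the TypeScript shape and field names are illustrative only, not a decision of the group:

  type Outcome = "passed" | "failed" | "cantTell" | "inapplicable" | "untested";

  interface TestResult {
    ruleId: string;
    outcome: Outcome;
    // Nuances like "minor issue, low impact" travel alongside the outcome
    // rather than becoming their own outcome value.
    description?: string;
  }

  const example: TestResult = {
    ruleId: "image-has-accessible-name",   // hypothetical rule id
    outcome: "failed",
    description: "The image has no accessible name from any source.",
  };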
wilco: we'll definitely have to have this discussion again.
shadi: we can untangle this from EARL
... for now, we can say we'll need to define the results (with potential candidates)
... when we get into that, we can discuss it more
alistair: it needs to be totally non-subjective
... "nearly pass" is very subjective
... "non-applicable" is not subjective, and very useful to know
... not having it may lead to wrong conclusions
... all of these have to be non-subjective
wilco: +1
... shadi, have you looked at github?
shadi: you all should have access and be able to edit
... I set up a team
... I've not set up the automated publishing process
<Wilco> https://www.w3.org/WAI/GL/task-forces/conformance-testing/track/
wilco: quick look at action items
... last thing: we have our planning document
<Wilco> https://www.w3.org/WAI/GL/task-forces/conformance-testing/wiki/Month_by_month_plan
wilco: by December-ish we'll need agreement on the requirements, and a first draft