W3C

– DRAFT –
Accessibility Conformance Testing Teleconference

21 May 2021

Attendees

Present
anne_thyme, aron, CarlosD, ChrisLoiselle, Chuck, Daniel, Francis_Storr, JakeAbma, jeanne, Jemma, JF, johnkirkwood, jstrickland, JustineP, kathyeng, KenP, Lauriat, MichaelC, sajkaj, shadi, sheri_b-h, SuzanneTaylor, ToddLibby_, trevor, Wilco
Regrets
-
Chair
Wilco and Jeanne
Scribe
Daniel, jstrickland, sajkaj

Meeting minutes

<jeanne> OR relationship (Outcomes have AND, Methods have OR)

<ChrisLoiselle> is there a link to the meeting in zoom ?

<ChrisLoiselle> nevermind, found it : )

<Dimitri> Unable to join the meeting

<Dimitri> I figured the issue. Thanks Wilco

Introduce Method and ACT Rules

Wilco: Last time we had a close look at outcomes. We came to the conclusion that outcomes need to be defined, and those definitions hopefully could be written in collaboration with ACT
… This meeting we want to have a closer look at some ACT rules, specifically those related to headings and how they can be applied to the current methods for WCAG3

<Wilco> https://act-rules.github.io/rules/b49b2e

Wilco: First is "Heading is descriptive". Checks that anything marked up as semantic heading describes the next piece of content in the document
… We link to definitions developed in ACT

JF: When you say semantic heading, that is h1-h6 as well as aria role="heading"?

Wilco: Yes.
… We currently don't have a rule testing that things looking like headings are coded as headings
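
To illustrate the applicability Wilco describes, a rough sketch of collecting elements whose semantic role is heading (native h1-h6, or an explicit role="heading") could look like the following. This is an illustration only, not the actual ACT rule implementation, which also considers visibility and the accessibility tree; the class and function names are hypothetical.

```python
from html.parser import HTMLParser

class HeadingCollector(HTMLParser):
    """Collect elements with a semantic role of heading:
    native h1-h6, or any element carrying role="heading"."""

    def __init__(self):
        super().__init__()
        self.headings = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in {"h1", "h2", "h3", "h4", "h5", "h6"}:
            self.headings.append(tag)
        elif attrs.get("role") == "heading":
            self.headings.append(f"{tag}[role=heading]")

def find_headings(html: str) -> list:
    """Return a rough list of semantic headings found in the markup."""
    collector = HeadingCollector()
    collector.feed(html)
    return collector.headings
```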

Jake: Are captions and legends that look like headings part of the rule, or are there alternatives to those?

Wilco: No, this rule is about what ARIA calls a heading
… A column header is not a heading as per ARIA definition
… Headers or fieldsets apply to tables or forms, so different scope

Carlos: That was our initial goal, but could not find an objective way to define a heading from what it looks like

Wilco: We could eventually look into it now that we have developed somewhat more complicated rules. We now have a vocabulary we can rely on as we have been working on definitions that are not even in WCAG

Wilco: After the applicability there are two expectations
… 1) that the visual heading describes the content; 2) that the heading in the accessibility tree describes the content. Those two need to be true for the rule to pass.

Wilco: Assumptions. When to use and not to use this particular rule. Mainly covering edge cases.
… For example, there are languages that do not have a code, so they cannot be programmatically determined
… Accessibility support: describes if there are differences in how browsers and AT behave. This one is about presentational role conflicts that exist.
… Background: links to related technologies and WCAG techniques
… Then test cases: passing, failing, and inapplicable
… Test cases give you examples of right and wrong practices and describe them
… And also for somebody using the rule to have a chance to compare expected to actual outcomes
… Glossary lists definitions

JF: These are mechanical testable rules, but some require human intervention.

Wilco: We don't specify how and when human intervention is required

JF: The rules format can be used to write machine and human testable rules?

Wilco: Yes

Sheri: There is a project that can test headings based on the surrounding content
… Also about contrast, keyboard focus indicators

<sheri_b-h> https://github.com/vmware/crest

Ken: Pass example 7, hidden heading, but visible. I thought both expectations have to be passed.

<sheri_b-h> We have 16 other tests that are currently not automated that we think we can automate with machine learning in the next year

Ken: Why is this passed?

Carlos: The rule checks that the heading is descriptive, not that the content is described by the heading
… If the heading is not included in the accessibility tree and there is no other heading that is included, then this would pass

Ken: Was there debate about that causing issues? It seems confusing
… There are counter-examples as well, where something is a heading but is not visible

<Zakim> jeanne, you wanted to say Example 7 has an impact in Outcomes

Jeanne: I think we may be handling this in WCAG3. We wrote outcomes to have an AND relationship, but the methods were talking about OR, as they are technology oriented
… I think we would handle this by saying "this is a pass example for the outcomes that headings need to describe, but it would fail the outcome that they need to be semantically available"
… As you have to pass every single outcome, we would pick it up

Ken: I think I am now clearer

Anne: This rule is testing an SC which does not distinguish between visible and semantic headings. As we need to have a fail-to-fail relationship in ACT, we needed to test it that way

Wilco: True, WCAG does not define headings

<Wilco> https://act-rules.github.io/rules/047fe0

Wilco: Rule "Document has heading for non-repeated content"
… Part of a different rule, called "Bypass blocks" testing the SC about bypass blocks
… Composite: if any one of the included rules (atomics) passes, the composite passes
… The applicability here is the HTML web page as bypass blocks applies to web pages
… Expectation is similar, it has a bunch of definitions, and it looks for any element that is a heading, that is visible, that is included in the accessibility tree, and that is after a block of repeated content
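
As a rough illustration of the composite logic described above (the composite passes if any one of its atomic rules passes), the aggregation might be sketched like this. The handling of inapplicable results and the function name are assumptions for illustration, not taken from the ACT rules format.

```python
def composite_outcome(atomic_outcomes):
    """Aggregate atomic rule outcomes ("passed" / "failed" /
    "inapplicable") into a single composite outcome: the composite
    passes as soon as any one atomic rule passes."""
    if any(outcome == "passed" for outcome in atomic_outcomes):
        return "passed"
    # Assumption for illustration: if no atomic rule applied, the
    # composite is inapplicable; otherwise it fails.
    if all(outcome == "inapplicable" for outcome in atomic_outcomes):
        return "inapplicable"
    return "failed"
```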

Quantitative and Qualitative tests

<jeanne> https://www.w3.org/WAI/GL/WCAG3/2020/methods/relevant-headings/

Jeanne: Current methods for relevant headings. It has five tabs.
… It has an introduction about platform and technology. It then has a description section that connects it to the relevant outcome
… It describe the methods, and then it has automated and manual tests
… And last tab is resources.
… We are open to changing this
… We are open to including ACT format, or to linking to specific ACT rules
… How do we change the methods format to better integrate it with ACT rules?

Wilco: Let's talk more about the difference between atomic and holistic testing. How do you have both in the same method?

<ChrisLoiselle> Reference point for current discussion https://www.w3.org/TR/wcag-3.0/#types-of-tests , https://www.w3.org/TR/wcag-3.0/#atomic-tests and https://www.w3.org/TR/wcag-3.0/#holistic-tests

Jeanne: The holistic tests would be a way to include more usability oriented testing, how to test with AT. Probably they need to be split into different methods
… We received a lot of feedback about this design that was mostly critical. People did not like atomic tests as they could be confused with ACT testing.
… Most of the comments were focused on the testing tab, so we are open to changing that
… One of the proposals we have is, instead of looking at manual versus automated testing, we could look at qualitative versus quantitative testing

<JF> https://www.w3.org/WAI/GL/WCAG3/2020/methods/relevant-headings/

JF: We have been testing for methods as opposed to outcomes. Now we have methods for relevant headings in our draft. When I look at the test procedure, it seems that we are missing that the heading needs to be exposed in the accessibility tree. Also I miss hierarchy in our test procedure. I am not sure how to address that question overall

<Zakim> jeanne, you wanted to say there are 3 Methods about headings

Jeanne: We have a number of outcomes under each guideline. In headings we have three methods. They are associated with different outcomes. I think John is bringing up an edge case, but probably not one to address now in this conversation
… If you write it down John we could either save it for later or for another meeting

Wilco: Subjective and unambiguous are important concepts we apply. We have a requirement that applicability needs to be unambiguous
… The number of headings is quantitative; contrast is qualitative
… Opposed to that, we have the requirement that an expectation can be subjective, but it cannot be ambiguous. When an expectation is ambiguous it will result in different people testing it in different ways, thus getting different results

<ChrisLoiselle> For context of JF's and Jeanne's comments on the prior topic and how they relate to WCAG 3 - the guideline is https://w3c.github.io/silver/guidelines/#structured-content and talks to outcomes and methods, for example - https://www.w3.org/WAI/GL/WCAG3/2020/outcomes/uses-visually-distinct-headings and https://www.w3.org/WAI/GL/WCAG3/2020/outcomes/conveys-hierarchy-with-semantic-structure

Wilco: This allows us to be much more precise in what needs to be tested

Jeanne: It also gives us a way to pick a qualitative assessment in headings for example (how well does the description apply), as the rule does a good job defining the two extremes.
… We could then set up the boundaries of how good the description is, and then describe different numeric categories or ratings as well
… I would like to start with quantitative testing first, though
… I am interested on members of ACT commenting about this approach and about if it would fit into what you are doing

<JF> +1

<CarlosD> +1

<jeanne> +1 for quantitative as much as possible

Trevor: I am leaning towards quantitative as much as possible, qualitative tends to be more difficult

<shadi> +1 to Trevor

Trevor: "Setting up those boundaries" might sound a bit hard

Chuck: Same issue exists on WCAG2, this is not a unique issue that we are introducing

Jeanne: I agree Trevor that we should use quantitative as much as possible. But we have examples in WCAG2 where we would need a better qualitative testing, that's what we need to explore

Shadi: I think the difference between WCAG2 and the first draft of WCAG3 is the multiple possibilities. I agree that makes it very complicated.

<jeanne> +1 that text alternatives covers too much

Shadi: Text alternatives currently covers too much, and it could be broken down into smaller pieces so that the decision space would be more limited

<trevor> +1 for breaking down further to reduce decision space, wonder if that will increase or decrease barrier to entry

<jeanne> +1 that I think text alternatives should be broken into much finer outcomes as Shadi suggests.

<JF> related: https://www.w3.org/WAI/GL/task-forces/silver/wiki/Problem_Statements#Human_Testable

<Lauriat> +1

Shadi: The heading requirements are more specific.

<JF> +1

<CarlosD> +1 to breaking down the decision space

Shadi: I would argue to break down the requirements themselves rather than to offer more qualitative choices

<JenniferC> +1

<johnkirkwood> +1

<Dimitri> +1

Wilco: Qualitative from ACT perspective is that it is unambiguous, when I say something is good I must mean the same thing as somebody else that says good as well
… When we introduce "better", that allows a lot of options in between, a lot of granularity

<JF> +1 to granularity in testing

<Zakim> jeanne, you wanted to note that breaking down the Outcomes to a finer state

Carlos: I agree with what Shadi was saying. From the ACT perspective, tests need to be as quantitative as possible. They need to be repeatable. I wonder, if we need to have qualitative tests in WCAG3, can't it be achieved by passing certain tests but having those tests be really small in scope?

Jeanne: Qualitative could be giving guidance to a tester or a developer as to what makes something good and this is what makes it even better
… Certain techniques are more preferred than others, so we could guide people to use these but still not fail people who use others that are also good
… We want to design a system to give people a better score for doing things better
… Maybe we should not do it in testing, though.

Jake: I hear two different approaches. Breaking up outcomes is how testers are doing it already. How can we provide those extra rating, better, or worse, and make a difference with respect to pass/fail but still do it objectively?

<Zakim> jeanne, you wanted to say what I queued for

Jeanne: I support breaking down outcomes to a detailed level.

<ToddLibby_> Does someone have the Zoom link, please? I can't get on the telecom info page.

Jeanne: We would like to work with ACT on this, we would appreciate your guidance in how to break outcomes appropriately so that they are easier to test.
… Back to Jake's question, many of these micro quantitative testings could be written for assessing the grade in a qualitative level

<ToddLibby_> that's giving me a 403 page, shadi.

Jeanne: If a heading describes the topic but then the next piece of content has multiple topics, the heading would describe the content but won't do it very well as there are multiple topics in the piece of content
… We would need very strict rules to define which goes into which category

<Zakim> Chuck, you wanted to ask if we need a scribe change

<ToddLibby_> shadi: thank you

<Zakim> JF, you wanted to note that guidance and normative requirements are separate ideas.

jf: there's guidance, then normative requirements. The problem now seems to be merging those two together.
… When it's easy to measure, that's one thing. Trying to use that type of measurement on subjective observations seems to be why we get into problems.
… Such as what's good alt text. Perhaps we try to stop scoring everything using the same metrics, as Bruce commented in the past.

<jeanne> +1 to Anne.

<shadi> +1 to Anne

<JF> +1 to Anne

Anne: Perhaps take ambiguity out of the normative text. If you read the normative text and don't learn anything, then you have to research the understanding docs to try to understand. Keep the important parts in normative text, and make sure it is unambiguous.
… And that we take the testing parts and give them the same strictness as the ACT rules.

<CarlosD> +1 to Anne

… Easier to write the strict rule to begin with than to go in the reverse.
… Requirements have been written in too fluffy a way.

<JenniferC> +1

<Zakim> kathyeng, you wanted to say qualitative is up to author

Correction: The statement above was Anne's, not kathyeng's.

kathyeng: sometimes we look at… we have to defer to authors: where do you want breaks, how descriptive… in the relevant headings check, it would be really difficult for a tester to say it should be divided and separated here…
… I'd appreciate that there are guidelines to evaluate this, but I don't know that those could cover all the scenarios a tester could encounter.
… ACT examples are very clear, but we rarely encounter those in real world testing.

Jeanne: outcomes should be more granular (seems lots of agreement). The way we've set up outcomes, you need to pass all the outcomes for a particular guideline (unless not applicable)…
… I worry there will be a lot more non-applicable.
… Do people think that's okay?

<JF> If the N/A's are mostly machine-testing items, then it shouldn't matter

<SuzanneTaylor> -1 putting that detail at the outcome level can also leave gaps

Jake: if you take images and alternatives, and break them up, then you still have the question… breaking up a criterion doesn't solve that one.

<sajkaj> +1 to no problem with NA. It's only a problem when humans look at the output, but if it facilitates more automatable testing, it's a worthy tradeoff, imo

… What I hear in this conversation is that they want to have the wording stricter, clear pass/fail, and I understand. I think the complete opposite is what Silver wants: to break up, and make more loose, yet still have a way to measure.

wilco: point of order: I would like this conversation to focus on whether it can be done, and how ACT can help to do it, and less about Silver scope.
… How can we apply the lessons learned from ACT to WCAG3?

Jake: yes, it's all about pass/fail

<Zakim> alastairc, you wanted to comment on NAs

Alastair: I am constantly staggered by the ways devs can implement what would be simple things.
… I worry about the eventual pass/fail being restricted to the sub-set of requirements that are strictly quantitative…
… I think we're going for easier to understand and easier to test, whether that means more outcomes or longer outcomes… is what I'm picking up.

<jeanne> +1 to longer

shadi: I don't think the n/a is an issue… we do have principles/guidelines under success criteria.
… I think it's a design problem; there are many ways to design this… structured in a nice way, so that you find what applies to your particular situation, without these n/a in a report. Conceptually, whether we agree or not about making things more specific/clear.
… next point, what Jake was talking about: how can we write more reqs that are not easy to test, that are ...
… I really see no reason to water down the current reqs, but we still have these differences, as Anne pointed out. I was astonished to see testers have different interpretations of requirements. I worry it will cause more inefficiencies. I'm not convinced that things are that different; rather, we haven't found a way of writing them.
… I think it would be unfortunate if we can't find a way to improve the content of WCAG, since we do have ACT. It doesn't mean we have to leave out COGA or other.

<JF> +1 shadi

<Zakim> jeanne, you wanted to respond to very strict hard pass fail vs open COGA testing

Jeanne: to answer Jake's point: we do want to ambitiously improve the content we have and include more complex testing. Jake is right, we do want to do that. What we want to do in this call is improve the content we have and the testing we have; then we can test against the new content we have. But that is beyond the scope of today's call.
… today, focus on quantitative testing and a need for qualitative… As much as I like the idea, with alt text as an example, I don't want someone to be able to pass by putting "image", "image"… and pass.
… just as a heading needs to be descriptive, I think that's the key rule we're looking for in the example.
… and in response to Alastair, on how devs can do different things: it would be helpful to easily create more methods. I hope that's not at an outcome level. Pardon me if my thoughts are a bit blurry. I think having precise outcomes that are narrow and tech neutral…
… then more --- I lost my train of thought. If Alastair talks more, it may come back.

Wilco checks back with Jake whether that answered the question.

Jake: I was thinking of my exact question; I guess I find in Jeanne's words how ACT can help with Silver. I thought it was also to see if we can go a step further. I'd like to know if there's room for anything beyond pass/fail.

Jake: I thought another way of approaching pass/fail would be on the agenda.

<Zakim> JF, you wanted to note that we can all agree that "image", "image", "image" = BAD, but defining GOOD is a lot harder

Wilco, asking Jeanne: can we get to some conclusions on that?

JF: example Jeanne gave of three images on a page, can agree; yet defining what is good is what we can't come to an agreement on.
… it seems we may not get to an answer on that, because it is so subjective. Testing tools can help isolate instances of bad… at the end of the day, trying to score/reward good alt text, I don't see how we can do that in a normative way.

Wilco passes the ball back to Jeanne.

<Zakim> jeanne, you wanted to propose that we all agree that the Outcomes should be more precise

Jeanne: I'd like to do a straw poll on whether people agree that the outcomes at a normative level need to be more precise, and that there should be more of them.

<Chuck> +1 more outcomes, more precision

<jeanne> Proposed: Outcomes should be more precise and there should be more of them

Jake: I think the document mentions it is not complete, that they are just to give an impression. How can we decide if we don't have a set already?
… if we know they already lack outcomes.

Suzanne: Risk of moving … methods such as the methods for different types of images up to the outcome level
… if we put it at the outcome level, where it's normative, we may not be able to catch some of these new things as they come up.

<jeanne> +1 very good point that we don't want to create a list that excludes

wilco: when you say 'precise', can we say they need to be 'unambiguous' and need to be more granular, is that fair to say?

<JF> +1 to granularity

question for jeanne

Jeanne: I think 'unambiguous' is a better word than 'precise'. I'm mulling over Suzanne's point.
… I know Charles Hall brings this up: when we make a list, we already exclude anything not in the list (is this right?). Maybe table that for another meeting, I don't know. Wilco, do you see a way forward that includes Suzanne's point?
… part of the reason we lumped things together in one outcome (in alt text) is because we wanted to run an alt text test that didn't need to know how the image was being used, then a qualitative test that evaluates how well the image was done.
… I think we can agree the outcomes should be unambiguous…
… I need to think more about what Suzanne raised.

Suzanne: I just wanted to say we do have tests down at the method level, as well. Maybe we could have something more general, but unambiguous at the outcome level and then more precise at the method level. that way at the outcome level, you're still covered for what may come up, then at method you're covered by specific test scenarios.

Wilco: pointing out Anne described the opposite, and got a lot of +1s.

Jeanne: I think the resolution should be

<jeanne> Proposed resolution that we agree that Outcomes need to be unambiguous

<Wilco> +1

<trevor> +1

<jeanne> +1

<JF> +1 to Outcomes need to be unambiguous

<Lauriat> +1

<kathyeng> +1

<ToddLibby_> +1

<JakeAbma> +1 / -1

<SuzanneTaylor> +1

<Shri> +1

<CarlosD> +1

<KenP> +1

+1

<alastairc> Nothing communicated in human language is completely unambiguous, I think it has to be relative, e.g. less ambiguous (longer where necessary) than the 2.x criteria

Jake: if we break up criteria… 800, 900, 1000 more clear outcomes… if that's the result, will we all still say +1? Is it more clear for people if we split them up into hundreds?

Alastair: nothing we phrase in human language is completely unambiguous; the key is where we draw the line between the normative aspect and the more detailed testing info beneath that. Taken to an extreme in either direction isn't helpful. I think we're saying outcomes will be less ambiguous, and potentially longer, than WCAG 2.x.

<jeanne> Proposed: Outcomes will be less ambiguous and longer in explanation

<jeanne> +1 to JF and more automated tests and less manual

JF: Jake brings up a good point; whether the 900 gets us machine-testable, I don't know. If it's the effort to complete the tests -- like binary/boolean -- then it's good. Less concerned about the number of tests, more about what can be handed off to machine testing.

Jake: it's less about testing than about outcomes in Silver. Do we want a lot more outcomes, or is one outcome divided into more tests? If that's true, then do we need them at the method level? If they are testable statements, they already are.

<JF> A huge +1 to "testable statements"

Jeanne: we've gotten a fair way away from the qualitative vs quantitative question… I'd like to remind everyone we are trying to put together a proposal to be tested with data. If it doesn't work, it will be revised. We're trying to find something that can be tested at the method level today.
… if we find something that can't, we'll come back and revisit. I'd like to focus on testing today, merging ACT rules and Silver methods. I don't want to ignore the broader issues, but want to focus.
… where we were is: do we agree that we should propose making the outcomes less ambiguous?


<Zakim> jeanne, you wanted to wrap up

<sajkaj> +1 to JS

<Wilco> +1

<trevor> +1

<anne_thyme> +1

<JenniferC> +1

+1

<kathyeng> +1

<KenP> +1

<alastairc> +1

<JF> +1 I agree with the statement (perhaps change longer to "more detailed", but...)

<johnkirkwood> +1

<ToddLibby_> +1

<Shri> +1

<JustineP> +1

<JakeAbma> +1

Do Methods need applicability?

Wilco: the applicability of an act rule is that it is objective. what part of this can we define in an objective way? what's the scope that can go with that applicability?

<JF> +1 to Trevor

<Zakim> jeanne, you wanted to answer trevor

Trevor: I'm trying to get an idea of how this is going. I'm trying to imagine how this applicability applies for a single method, how does that mix with different tests? Do we match some tests with part of the applicability?

Jeanne: I think the way is each method is tech-specific, different for HTML, EPUB, or PDF.
… as we become more precise with ACT's assistance, we may change that. I think we can see this applied to semantics in HTML, which would be HTML specific.

Trevor: I was looking at the headings, and wondered how applicability got mixed in there, or was a special case.

Jeanne: I think it's a special case, and whether it needs different methods for different tech.

<Zakim> jeanne, you wanted to answer

Jake: I think that may not be the complete answer. We decided half a year ago that we would have a fallback method that would be tech agnostic, due to new tech that may come along… that would cover the outcome. There would be tech specific methods, but always a tech agnostic method, too, for every outcome.

Jeanne: we talked about this in a meeting, and agreed it was a great idea, but no one did it.

Jake: but we know it doesn't work if it's not there.

Jeanne: but we need people to write them.
… we need someone to write it, so we test it.

Jake: for the people who were not around at the time, it came from using headings and the approach; then we saw different heading elements -- like Android and ARIA, and WCAG now, where tech and methods are not complete, people can have their own methods, or tech proceeds…
… and to open up for that we decided to come up with a fallback method; that they are not there does not mean they are not needed.

Scribe would like to have someone else pick up at the top of the hour, please. Need a bio break.

<Zakim> Wilco, you wanted to react to jeanne

Wilco: they are not unambiguous if we don't cover how they are applicable to technology -- was that right?

<jeanne> +1 to following the ACT model of Applicability

<trevor> +1 for applicability

<shadi> +1 to applicability for technology-specific methods

<joesaiyang> +1

<SuzanneTaylor> +1

<KenP> +1

<Lauriat> +1

<Wilco> straw poll: Add applicability to the WCAG 3 methods, similar to how ACT rules have an applicability

+1

<trevor> +1

<CarlosD> +1

<sajkaj> +1

<KenP> +1

<JenniferC> +1

<Lauriat> +1

<kathyeng> +1

<Francis_Storr> +1

<SuzanneTaylor> +1

<ToddLibby_> +1

<JustineP> +1

<JakeAbma> +1

<JF> +1

<jeanne> +1 with Shadi's addition of Technology-specific methods

<Shri> +1

Resolution: Add applicability to the WCAG 3 methods, similar to how ACT rules have an applicability

Viability of AND/OR relationship (Outcomes have AND, Methods have OR)

anne: Is it that there's one test per method?

wilco: Next agendum!

jeanne: Definitely an area where we need ACT guidance

jeanne: Outcomes are an AND relationship; one must pass all outcomes

jeanne: OR in methods

jeanne: we have written multiple tests in methods, but it's not necessary. Open to changing that
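
The AND/OR relationship under discussion could be sketched as follows, assuming each method simply yields a boolean result; the function name and data shape are hypothetical.

```python
def guideline_passes(outcomes):
    """outcomes maps each outcome name to the results of the methods
    used for it. A guideline passes when ALL outcomes are satisfied
    (AND), and an outcome is satisfied when ANY one of its methods
    passes (OR)."""
    return all(any(method_results) for method_results in outcomes.values())
```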

anne: No opinion yet, wondering how it will work

jeanne: Perhaps an example of where it wouldn't work?

anne: Not at the moment -- perhaps the example shown earlier and the applicability may have not been parallel

anne: Don't believe the atomic tests would have the same applicability

jeanne: Think we could ignore for now ...

jeanne: But, can we reference two rules -- in a single method for relevant headings?

anne: Not sure

anne: Guessing html would be an applicability; and another method for an Android app view

jeanne: If we assume a particular method just for html, could we have as applicability one rule that applies to any page, and a second rule with a different heuristic?

wilco: We've tried to keep our rules atomic, as small as we reasonably can

wilco: So combining different requirements into the same method can be done, but doesn't fit well into ACT

wilco: Also typical for testing

<JF> +1 to Wilco

wilco: So, when one fails the failure is specific

wilco: As a result ACT rules are smaller than SC today

wilco: Notes 4 rules relating to page lang

wilco: provides a precise way of testing

wilco: allows for breaking things up as much as possible

wilco: as a result, no one ACT rule informs whether a particular SC passes

jf: Assuming the lang testing is recursive?

wilco: NO

wilco: can be tested in any order

jf: But there are dependencies?

wilco: There's a concept of relation, but each can be tested on its own
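
As an illustration of how the page-language SC is broken into small rules that can each be tested on their own, two of the atomic checks might be sketched like this. These are hypothetical simplifications; the real ACT rules validate language tags against the full BCP 47 grammar and subtag registry.

```python
import re

def has_lang_attribute(html_attrs):
    """Atomic check: the html element has a non-empty lang attribute."""
    return bool(html_attrs.get("lang", "").strip())

def lang_has_valid_syntax(lang):
    """Atomic check: the value looks like a language tag (a 2-3 letter
    primary subtag, optionally followed by further subtags).
    Simplified relative to full BCP 47."""
    return re.fullmatch(r"[A-Za-z]{2,3}(-[A-Za-z0-9]{1,8})*", lang) is not None
```

Each check stands alone, so they can run in any order, matching Wilco's point that the rules relate to each other but have no testing-order dependency.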

<Zakim> jeanne, you wanted to say there is a third option and that would be to have multiple tests within an method and clearly state which ones are union and which are alternatives

jeanne: Believes there could also be multiple rules within a method, specifying which are additive and which are alternatives

jeanne: Putting these at the method level makes them more updatable

SuzanneTaylor: Agree with that; also that not all developers need to understand that granularity

SuzanneTaylor: I like having the detail, but also like having it on the test tab; better supports smaller orgs with fewer experts

<Jemma> I am pondering about the statement that all developers do not need to understand details.

<jeanne> +1 to keeping the gritty details in the Method

Wilco: Agrees with Jeanne; not a bad idea -- as long as they're designed to fit well together

Wilco: Will there be one to one between methods and outcomes? Or could html method have two outcomes?

jeanne: The latter

jeanne: Believe we had a method with multiple ways to approach

<Jemma> good question, Wilco.

Wilco: How does that work with scoring, where you could use either method to decide the score? Wouldn't that risk multiple scoring results?

jeanne: The testing results are interpreted at the normative level into a common point system, because we designed it to accommodate different kinds of testing approaches

<Jemma> wondering about "normalization" method Jeanne is referring to...

Wilco: It would surprise me if you can get consistent results from two testing methods

jeanne: Do have a group working on that

jeanne: Notes we have test sites to test for several *ities

jeanne: If you have an example, we will test and more participants in that group are welcome

jeanne: The reason for the group is to find out whether what we're doing works

Wilco: Is there an example?

jeanne: Don't believe so

jeanne: So, we should probably try that for the August draft so we have a way to determine whether we have a problem here

Wilco: If you can make that work, I'm certainly OK with it!

jake: Perhaps embed part of HTML in an iOS app --

wilco: Still two methods for two different tech

jake: But with hybrid versions you have to decide how to test

jeanne: Suggest this is another tangent -- we need to stay on agenda

SuzanneTaylor: Have an example -- but we can table

jeanne: Please send it to me

KenP: Looking at headings with levels outcome and from there to the method ...

KenP: in the two examples, neither referenced tech has the ability to reference a level

<Zakim> jeanne, you wanted to answer

KenP: Perhaps the outcome needs revision

jeanne: Yes, it's a typo that slipped through QA

jeanne: long story

KenP: OK

Wilco: Have a proposal ...

<Wilco> Proposed resolution: Methods can include one or more applicability and expectation pairs

wilco: Hmm, maybe we haven't settled on expectations ...

Wilco: will mean more than one applicability per method

KenP: seems it will be needed -- are methods always tech specific?

Wilco: Not necessarily

<Wilco> Proposed resolution: Methods can include one or more applicability and test pairs

jeanne: asks meaning of "pair"

<SuzanneTaylor> maybe "Methods can include one or more tests, similar to ACT tests, so can also include one or more applicability and expectation pairs"

<Wilco> Proposed resolution: Methods can include one or more tests

Wilco: let's say multiple tests

+0

<jeanne> +1

jf: suggests example of multiple tests ...

<JF> +1

<KenP> +1

<jeanne> +1

<trevor> +1

<SuzanneTaylor> +1

<Dimitri> +1

<anne_thyme> +1

<jstrickland> +1

JakeAbma: don't get it

<Shri> +1

JakeAbma: always thought methods were like techniques

JakeAbma: unclear for me [lays out some details]

jeanne: We're looking at something different

jeanne: we're looking at a more granular level

Wilco: actually prefer rule to test

JakeAbma: so what's the mapping, e.g. for lang of page?

Wilco: believe it gives us the flexibility to decide that an outcome can have a larger scope than the rule that underlies it

Wilco: running just one method will be sufficient

jake: outcome as generic method

wilco: yes

JakeAbma: mentions multiple quantitative ambiguities

<Zakim> Lauriat, you wanted to give an example of outcome, methods to realize it, and tests to check things, in hopes that it helps (from an old doc): https://docs.google.com/document/d/18JyGF-AK8Qgq7DPyVlDYmxoj6814rORxuCf0l0oSb7U/edit

Lauriat: From a working session in 2018 for lang of page

Lauriat: real outcome is that AT can verbalize text on screen in correct lang, etc

Lauriat: methods are all centered around lang of env

Lauriat: within each of these there are additional tests to see whether correctly accomplished

Lauriat: includes http header test

Lauriat: even though no AT looked at that in 2018

Lauriat: This one is cross env

Wilco: great example

Wilco: assuming AT supports http headers, then there would be two ways of meeting the outcome

wilco: http header or lang attribute

Wilco: would need to check both methods to see if correctly achieved

Lauriat: almost

Lauriat: a tester could test otherwise, but we built tests to see whether the method is done correctly, not necessarily whether the outcome is realized

Wilco: maybe bypass blocks is another example?

wilco: multiple ways to pass

wilco: like that?

Lauriat: not sure

Wilco: bypass blocks SC has several ways to check whether it's correct; each would be a method and outcome would show whether it works

Lauriat: yes, but some may also be functionally the same, so may not be exactly one to one and can have many ways

Lauriat: but yes, it would work

Lauriat: some blocks might have different internals and different bypass

Passing rules vs. failing rules

Wilco: think we've largely covered this ...

Wilco: we generally check for failures

Wilco: a rule doesn't tell you something is met, but it does tell you when it isn't

Wilco: methods don't tell you whether something has failed, but they do tell you when something works

wilco: Correct?

jeanne: also have ability to identify failures

jeanne: so unsure impact

Wilco: in scoping

Wilco: ACT rules can have more scope than SC

Wilco: e.g. can't just test for img element because there are other types of images

<Zakim> Lauriat, you wanted to say that methods help people understand how to meet guidance, but they do not need to comprehensively cover all possible ways to meet the guidance.

Lauriat: while we want to document all the common ways to meet guidance, we don't need to be fully comprehensive

Lauriat: there are other ways not yet invented and it's OK we don't have test yet

Lauriat: tests are there to illustrate how to apply the methods

Lauriat: if outcome is met some other way, that's acceptable and important for our "future proofing"

shadi: not so strong an opinion here ...

Wilco: one challenge for ACT that may be overcome if we allow multiple rules within a method, so the method can be broken down to atomic rules

Wilco: could create gaps

Wilco: e.g. the contrast rule checks that at least one pixel has sufficient contrast

Wilco: we had no answer on how many pixels need to have that contrast
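[Scribe note: the "sufficient contrast" check discussed above builds on the contrast-ratio formula from the WCAG 2.x definitions of "contrast ratio" and "relative luminance". A minimal sketch of that formula for two solid sRGB colors (function names are illustrative; this is not the ACT rule implementation, which is where the per-pixel question above remains open):]

```python
def relative_luminance(rgb):
    """Relative luminance of an sRGB color given as 8-bit (R, G, B),
    per the WCAG 2.x definition."""
    def channel(c):
        c = c / 255
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (channel(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg, bg):
    """Contrast ratio between two sRGB colors, from 1.0 (identical)
    to 21.0 (black on white)."""
    lighter, darker = sorted(
        (relative_luminance(fg), relative_luminance(bg)), reverse=True)
    return (lighter + 0.05) / (darker + 0.05)

# Black on white yields the maximum ratio of 21:1.
print(round(contrast_ratio((0, 0, 0), (255, 255, 255)), 2))
```

The open question in the minutes is orthogonal to this formula: given an antialiased glyph, which (and how many) pixel pairs must satisfy the ratio.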

SuzanneTaylor: think that's a good thing about having ACT tests written while the guidelines are also being written

SuzanneTaylor: seems the two will help each other

Wilco: indeed

Wilco: will help identify gaps

Wilco: seems we're in general agreement that things work

jeanne: excellent, so let's build some and see

<Lauriat> +1 to building & testing!

KenP: looking at the tab structure for relevant headings; feels like they all fit (or could be put) under Rules, with ACT rules inside

KenP: notes we also would have resources within

KenP: perhaps could borrow our structure?

<jeanne> +1 to pulling data from ACT - I like what you have done

Wilco: believe we've picked from similar sources -- converging evolution

Expectations vs. Test Procedures

<Wilco> https://act-rules.github.io/rules/047fe0#expectations

<jeanne> +1 for Expectations

Wilco: ACT have expectations

Wilco: methods have procedures

Wilco: almost the same, but take from different perspectives

Wilco: want to suggest using expectations rather than procedure

Wilco: expectations tell precisely what outcome needs to be

Wilco: we used to have test procedures and that turned into a brick wall!

Wilco: procedures describe what you need to do

wilco: expectations don't tell you how to do it

Wilco: manually, machine model, etc.; no prescription

Wilco: opinions?

<jeanne> Propose moving the Test Procedure to the HowTo level and make it technology neutral.

<Zakim> Lauriat, you wanted to say I think we can and probably need to use both, at different levels.

Lauriat: believe we need both but probably at different levels

Lauriat: testing against outcomes probably more at expectations level

jeanne: agree with Lauriat at a basic level, but considering how to reorganize usefully

jeanne: we could put expectations at the method level and keep a generic one for newbies at the howto level, which keeps it out of the way of normative text

wilco: like that

<Wilco> +1

shadi: let's try it out

<ToddLibby_> +1

<JF> +1 Shadi

shadi: believe there's a bit of emphasis on being not only a standard but also an educational resource, and maybe that's more than we should take on

<Lauriat> +1 to looking at not documenting the world as a part of creating a standard.

shadi: believe we should create the clear and unambiguous standard

<johnkirkwood> +1 Shadi

Wilco: looks for volunteers for a subgroup with some ACT and Silver folks to try things out

jeanne: believe current silver testing would be interested

wilco: looks for volunteers

<ToddLibby_> I will volunteer

<Lauriat> Not sure of time availability, but please keep me at least in the loop for now?

<shadi> also want to stay in the loop

<jstrickland> I need a bit more clarity on the expectation.

[crickets]

jeanne: suggest using the existing group on the Silver side and adding ACT folks

jeanne: can check on meeting time

<shadi> +1 to not spawning yet another group :-)

<jstrickland> +1

Summary of resolutions

  1. Add applicability to the WCAG 3 methods, similar to how ACT rules have an applicability
Minutes manually created (not a transcript), formatted by scribe.perl version 131 (Sat Apr 24 15:23:43 2021 UTC).
