W3C

Accessibility Conformance Testing Teleconference

08 Feb 2017

See also: IRC log

Attendees

Present
Wilco, Charu, Moe, MaryJo, Shadi, Alistair
Regrets
Chair
Wilco, MaryJo
Scribe
MoeKraft

Contents

  Topics
    1. Previous week's survey results (from Feb. 1 meeting)
    2. Draft Section 4.3 Test Input Types
    3. Rework Rule Description
    4. Draft section 7.3 Accuracy Benchmarking
  Summary of Action Items
  Summary of Resolutions

Previous week's survey results (from Feb. 1 meeting) https://www.w3.org/2002/09/wbs/93339/act-2017-01-30/results


Minutes approved.

Draft Section 4.3 Test Input Types https://github.com/w3c/wcag-act/pull/44/files?diff=split

Wilco: Romain talked about the different input types available. The list is HTTP Response, DOM Tree, Web Page, Web Driver.
... Comment: Web Driver is more of an API. I suggest Web Driver Page. Any alternate suggestions?
... Does anyone have any specific terminology for this?
... Proposal, Web driver page instead of Web driver. Anyone against this change?

Alistair: What do you mean by this?

Wilco: a page controlled by a driver or web driver

Alistair: If you have complicated path through widget, abstract to page object
... Basic use of web driver is driving the browser.

Wilco: Does everyone know what we are talking about?
... API to control a web browser

Alistair: More sophisticated UA testing will use some framework like Cucumber or Selenium to drive WebDriver. If you have a complex UI, e.g. a form with multiple paths, then when you use a framework you will need step definitions
... Take form to X and do Y. May need 15 commands.
... Dull to put 15 steps. Extract into page object. Sits as bridge between web driver and page.
... No such thing as a web driver page
... We're talking about HTML documents, etc. Talking about type of content that goes into tests. Have we gone too far? Should we just talk about content?
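
For illustration, a minimal sketch of the page object pattern Alistair describes, written in TypeScript against the selenium-webdriver bindings; the page URL, element IDs, and method names are hypothetical, not from the draft:

  import { Builder, By, WebDriver } from "selenium-webdriver";

  // Hypothetical page object: it sits as a bridge between the driver and the
  // page, hiding the many low-level WebDriver commands behind one task-level method.
  class MultiStepFormPage {
    constructor(private driver: WebDriver) {}

    // "Take the form to X and do Y" as a single call instead of ~15 raw steps.
    async completeAndSubmit(name: string, email: string): Promise<void> {
      await this.driver.findElement(By.id("name")).sendKeys(name);
      await this.driver.findElement(By.id("email")).sendKeys(email);
      await this.driver.findElement(By.css("button[type='submit']")).click();
    }
  }

  async function run(): Promise<void> {
    const driver = await new Builder().forBrowser("chrome").build();
    try {
      await driver.get("https://example.org/form"); // hypothetical URL
      await new MultiStepFormPage(driver).completeAndSubmit("Ada", "ada@example.org");
      // The page is now in the DOM state an accessibility rule would be run against.
    } finally {
      await driver.quit();
    }
  }

  run();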

Wilco: Reason: the test requires interaction with the page.

Alistair: Sets of instructions or interactions, but what we are doing is defining a task, e.g. load page, load page and take page to DOM state. But this is the task. Not what we need to do the task. Web driver is too detailed.

Wilco: Romain recommends actionable page

Alistair: User Acceptance Testing, give instructions on how to get to state but doesn't take into account "snippet" testing

Wilco: Question, how much context around snippet? Styled, event handling, etc.

Alistair: Look at the DOM state as the de facto input. There may be hundreds. Look at them individually. Take a state and run the test

Wilco: Where do we draw the lines? Pre-parsed and DOM Tree are clear; a DOM state with properties attached to it is blurrier.

Charu: Trying to understand. What we have today, we have an engine that is independent of the rules. We have the option to run on a rendered page or select event-based scanning. Based on timing, it will continuously check the DOM
... But it's all independent of the rules
... Aren't we defining a framework for the rules and not the engine?
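
As an aside, the event-based scanning Charu mentions could look roughly like this TypeScript sketch, which uses the browser's MutationObserver to re-run a rule whenever the DOM changes; the checkDocument entry point and the debounce interval are assumptions:

  // Hypothetical rule entry point: scans the current DOM and reports results.
  declare function checkDocument(root: ParentNode): void;

  let pending: number | undefined;

  // Re-run the rule whenever the DOM mutates, debounced so a burst of
  // mutations triggers only one scan.
  const observer = new MutationObserver(() => {
    if (pending !== undefined) {
      window.clearTimeout(pending);
    }
    pending = window.setTimeout(() => checkDocument(document), 250);
  });

  observer.observe(document.documentElement, {
    childList: true,
    subtree: true,
    attributes: true,
    characterData: true,
  });

  // Initial scan of the rendered page.
  checkDocument(document);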

Wilco: Kind of
... Not that we put requirements on the engine, but what if we have a rule that requires the engine to run multiple times because content changed.
... May not be able to run on certain engines.
... Should we categorize these rules? Does engine do its own parsing or use browser with JS enabled?

Alistair: All things to do with testing tool.
... HTTP response, DOM Tree, Rendered page. Let's end there. Do not be overly prescriptive.

+1

Charu: I agree with Alistair.
... Our rules run only on a rendered DOM, period.
... I think when we add all these other things we are too prescriptive

Alistair: Everything else just gets us to the right DOM Tree. Not input, just a method to get us there.

Wilco: What about testing focus styles? If you don't have access to a web browser, you cannot force the focus indicator. You can ask an element to get focus, but the browser focus indicator does not show.

Alistair: This may require a more guided manual test
... maybe change rendered page to rendered page [what you see]
... HTTP Response - refresh, Documents - Document Tree
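
On the focus-indicator point, a minimal TypeScript sketch using selenium-webdriver: sending a real TAB key makes the browser itself move focus, so :focus styles actually render, which scripted element.focus() outside a browser cannot guarantee. The URL and the CSS property inspected are assumptions:

  import { Builder, By, Key } from "selenium-webdriver";

  async function inspectFocusIndicator(url: string): Promise<void> {
    const driver = await new Builder().forBrowser("chrome").build();
    try {
      await driver.get(url);
      // Send a real TAB key so the browser moves focus natively.
      await driver.findElement(By.css("body")).sendKeys(Key.TAB);
      // Inspect the element that actually received focus.
      const focused = await driver.switchTo().activeElement();
      const outline = await focused.getCssValue("outline-style");
      console.log("outline-style on focused element:", outline);
    } finally {
      await driver.quit();
    }
  }

  inspectFocusIndicator("https://example.org/"); // hypothetical URL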

Wilco: proposal to remove Web driver section as well.

<Wilco> By controlling the browser, events can be triggered in the page, and user interaction can be simulated. This can be done using drivers, the most common of which is currently WebDriver. With driver testing, interactions can be tested.

Wilco: Do we want this smushed into Rendered page?

Alistair: Just leave it out. How you get there is up to you.
... Most people will use Web Driver but don't need to

<Charu_> +1

<maryjom> +1

<agarrison> +1 to removing web driver section

Wilco: For those in favor of removing Web Driver please +1

+1

<shadi> +1

<Wilco> +1

RESOLUTION: Remove Web Driver section from input type

+1

Wilco: I will update pull request as proposed and merge it in

Rework Rule Description https://github.com/w3c/wcag-act/pull/42/files?diff=split

MoeKraft: My concern is that we are repeating accessibility requirement and related techniques from Rule Outline

Wilco: The Description will provide more details on the accessibility requirements and the related techniques
... I think we can leave this as is.

<Charu_> +1

Proposal: Accept pull request

+1

<agarrison> +1

RESOLUTION: Accept Pull Request #42

Draft section 7.3 Accuracy Benchmarking https://github.com/w3c/wcag-act/pull/47/files?diff=split

Wilco: Shadi's feedback: he sees these as separate validation steps.
... Ok, check for accuracy before 1st version and then reevaluate to see if rule needs adjusting

Shadi: What do we need in order to publish a test rule? What are the validation requirements?
... I don't think we want to require expert testing. Will we require that?
... This is like a secondary step, after some time or after there has been feedback
... Need a scalable way of validating tests. These tests conform to our test. Accuracy and interpretation. Needs to be scalable.
... How do we determine if test goes into library or not

Wilco: What is the requirement for rule and what is the requirement for a rule to go into ACT repository?

Shadi: Interesting
... The repository will be rules that correctly interpret WCAG, which is not necessarily a requirement for all rules.

Wilco: Doesn't say Benchmark is required. Just states how to do it. Reason: I can imagine that not everyone would want to invest time since this is not easy to do. So shouldn't be required.
... But if organizations do do it, it is useful. And if organizations do this, then this information should mean the same thing. That's why I think it should be in the standard.

Shadi: Starting to click. Benchmarking is not as clear to me.
... Seems to have a strong emphasis on expert testing. False positives and negatives are relying on expert testing to classify.
... What do others think?

Charu: My understanding is that we will have a test suite that the rule will be benchmarked against.

Wilco: Two parts. Implementation validation. Takes snippets for which we know results. Whether or not rule is implemented correctly.
... The benchmark looks at whether the rule produces correct results as compared to an accessibility expert.
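
A minimal sketch of the implementation-validation half Wilco describes: snippets with known expected outcomes are run through a rule implementation and mismatches are reported. The TestCase shape and the evaluateRule function are hypothetical, not part of the draft:

  type Outcome = "passed" | "failed" | "inapplicable";

  // Hypothetical test case: an HTML snippet plus the outcome we already know.
  interface TestCase {
    description: string;
    html: string;
    expected: Outcome;
  }

  // Hypothetical rule implementation under validation.
  declare function evaluateRule(html: string): Outcome;

  const cases: TestCase[] = [
    { description: "img with alt", html: '<img src="cat.jpg" alt="Cat">', expected: "passed" },
    { description: "img without alt", html: '<img src="cat.jpg">', expected: "failed" },
  ];

  for (const c of cases) {
    const actual = evaluateRule(c.html);
    if (actual !== c.expected) {
      console.error(`Mismatch for "${c.description}": expected ${c.expected}, got ${actual}`);
    }
  }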

<shadi> 1+ to Alistair

Alistair: Some of this should be handled in test suite. Expected outcome + HTML snippet. Gauge if whole bunch of people disagree. Bring it to attention. We should be able to cull out those that are suspect.

MoeKraft: Our rules frameworks and engines are not perfect. We need a standard interpretation of how to determine false positives and negatives. Is this correct?

Wilco: Yes. We need a way to consistently define that and not just rely on feedback of people that catch these issues.
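
A rough sketch of how benchmarking against expert judgment could be tallied, assuming each checked element ends up with both a rule verdict and an expert verdict (the data shape is an assumption): a false positive is a failure the rule reports that the expert disagrees with, and a false negative is a real failure the rule missed.

  // Hypothetical paired verdicts for the same elements: one from the rule, one from an expert.
  interface Verdicts {
    rule: "pass" | "fail";
    expert: "pass" | "fail";
  }

  function accuracyCounts(results: Verdicts[]) {
    let falsePositives = 0; // rule says fail, expert says pass
    let falseNegatives = 0; // rule says pass, expert says fail
    for (const r of results) {
      if (r.rule === "fail" && r.expert === "pass") falsePositives++;
      if (r.rule === "pass" && r.expert === "fail") falseNegatives++;
    }
    return { falsePositives, falseNegatives, total: results.length };
  }

  console.log(accuracyCounts([
    { rule: "fail", expert: "fail" },
    { rule: "fail", expert: "pass" }, // one false positive
  ]));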

Charu: Doesn't this go back to interpreting the requirement correctly?
... If the requirement is interpreted correctly then you will know.

Wilco: I don't think I do.
... When rule gets complicated, I will have to adjust the rule.
... I know there are scenarios that could arise but not the way certain technologies are used. So I don't put that case in rule.

Charu: Agree. Hard to come up with all scenarios. If false positive comes up, look at requirement and see if it meets it.

Alistair: If we all agree, yes. It's hard to see all fringe cases. As we move forward, things change. Two things: 1. a feedback loop: investigate it, add to the unit tests and make a change if justified. 2. If we all pool our test suites we can find fringe cases for free
... Our suites probably look different but we can pool and compare and we can increase our coverage.

Wilco: Want experts to use tool and report issues. That's what Benchmark is supposed to do.

Alistair: one problem. This may work in an organization but if we come up with a whole load of tests we agree upon, what that requires is that the tests be updated in the wild. This requires a new feedback loop.
... We found fringe case. Then I need to feed this back to ACT team and collectively need to make that change.

Wilco: Absolutely. That's what we are talking about.

Alistair: Need to pool our test suites.

Wilco: Excellent discussion.

MoeKraft: What do we do with the PR?

Wilco: I will update it and send it out for review again.
... Have a great week!

Summary of Action Items

Summary of Resolutions

  1. Remove Web Driver section from input type
  2. Accept Pull Request #42
[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.148 (CVS log)
$Date: 2017/02/08 16:35:23 $