IRC log of wcag-act on 2017-05-08
Timestamps are in UTC.
- 14:00:06 [RRSAgent]
- RRSAgent has joined #wcag-act
- 14:00:06 [RRSAgent]
- logging to http://www.w3.org/2017/05/08-wcag-act-irc
- 14:00:08 [trackbot]
- RRSAgent, make logs public
- 14:00:08 [Zakim]
- Zakim has joined #wcag-act
- 14:00:10 [trackbot]
- Zakim, this will be
- 14:00:10 [Zakim]
- I don't understand 'this will be', trackbot
- 14:00:11 [trackbot]
- Meeting: Accessibility Conformance Testing Teleconference
- 14:00:11 [trackbot]
- Date: 08 May 2017
- 14:00:44 [Wilco]
- agenda?
- 14:00:46 [Wilco]
- agenda+ Benchmark definition - Issue #81 https://github.com/w3c/wcag-act/issues
- 14:00:53 [Wilco]
- agenda+ Topics to address in the Rules https://www.w3.org/TR/act-rules-format/
- 14:01:01 [maryjom]
- maryjom has joined #wcag-act
- 14:01:03 [Wilco]
- agenda+ Test case format https://www.w3.org/WAI/GL/task-forces/conformance-testing/wiki/Testing_Resources
- 14:01:11 [Wilco]
- agenda+ Rules repository https://www.w3.org/WAI/GL/task-forces/conformance-testing/wiki/Rules_repository
- 14:01:17 [Wilco]
- agenda?
- 14:01:45 [maryjom]
- present+ MaryJoMueller
- 14:02:40 [Kathy]
- Kathy has joined #wcag-act
- 14:04:20 [skotkjerra]
- skotkjerra has joined #wcag-act
- 14:04:53 [shadi]
- scribe: shadi
- 14:05:09 [shadi]
- zakim, who is on the phone?
- 14:05:09 [Zakim]
- Present: MaryJoMueller
- 14:05:39 [shadi]
- present+ Wilco, Kathy, SteinErik, Debra
- 14:05:49 [shadi]
- present+ Moe
- 14:06:34 [shadi]
- present+ Sujasree
- 14:07:12 [shadi]
- Topic: Introductions
- 14:07:21 [shadi]
- Sujasree: leading Deque team in India
- 14:07:42 [shadi]
- Debra: engagement manager for Deque federal account
- 14:07:55 [shadi]
- ...want to understand the direction and stay on top of it
- 14:08:11 [shadi]
- ...not a coder but happy to help
- 14:08:25 [shadi]
- zakim, take up next
- 14:08:25 [Zakim]
- agendum 1. "Benchmark definition - Issue #81 https://github.com/w3c/wcag-act/issues" taken up [from Wilco]
- 14:08:53 [Wilco]
- https://github.com/w3c/wcag-act/issues/81
- 14:09:29 [MoeKraft]
- MoeKraft has joined #wcag-act
- 14:09:36 [shadi]
- WF: previous discussion - two weeks ago - brought this up
- 14:09:58 [shadi]
- ...topic keeps coming up
- 14:10:06 [shadi]
- ...confusion about what was meant
- 14:10:38 [shadi]
- ...initially was meant as a mechanism to figure out if rules will generate false positives in practice
- 14:10:44 [shadi]
- ...measure their accuracy
- 14:10:55 [shadi]
- ...no real solution proposed
- 14:11:20 [shadi]
- ...initially thought about comparing tools to manually tested results
- 14:11:39 [shadi]
- ...but that idea is changing
- 14:11:54 [shadi]
- ...more about collecting feedback by using our rules
- 14:12:12 [shadi]
- ...let users try out the rules, and react to feedback
- 14:12:44 [shadi]
- ...IBM and Deque have a kind of "beta" or "experimental" approach
- 14:12:52 [shadi]
- ...until tests are confirmed
- 14:13:25 [shadi]
- SES: so validation is, it is accepted until someone complains?
- 14:13:52 [shadi]
- WF: up for discussion
- 14:14:05 [shadi]
- SES: so what would the use be of the benchmarking?
- 14:14:25 [shadi]
- ...not sure we will receive comments
- 14:14:40 [shadi]
- ...unless you have a mechanism to ensure testing
- 14:14:47 [shadi]
- ...but not guarantee
- 14:15:00 [shadi]
- ...so what is the purpose then?
- 14:15:04 [shadi]
- q+
- 14:15:55 [MoeKraft]
- Shadi: What I understood Alistair was proposing. When you develop a rule you develop test cases along with the rule. The two approaches are not mutually exclusive.
- 14:16:41 [shadi]
- q-
- 14:16:43 [MoeKraft]
- Shadi: w3c maturity model, things in draft or testing phase however we need at least a minimal amount of testing and ask for feedback for further validation.
- 14:16:57 [shadi]
- WF: already part of the spec to write test cases
- 14:17:19 [shadi]
- ...that filters out the known potential problems
- 14:17:29 [shadi]
- ...but that is not benchmarking
- 14:18:29 [shadi]
- WF: to respond to SteinErik, before rules are put on the repository, have test cases
- 14:18:51 [shadi]
- ...additionally a feedback mechanism
- 14:19:24 [shadi]
- DM: already an existing repository of failure conditions?
- 14:19:59 [shadi]
- WF: yes. currently different tools have their test case repositories, but want to merge these
- 14:20:11 [shadi]
- DM: so common place to validate the rules
- 14:21:10 [shadi]
- WF: thoughts?
- 14:21:23 [shadi]
- ...maybe need to include positive feedback too
- 14:21:46 [shadi]
- SES: concern about the usefulness of the information we receive
- 14:21:52 [shadi]
- ...no clear view yet
- 14:21:57 [shadi]
- q+
- 14:22:25 [Wilco]
- ack s
- 14:23:13 [MoeKraft]
- Shadi: We definitely are changing from what we originally had in mind for benchmarking. At least what we have in our work statement. If we do have a test suite, we could have tools run by themselves to provide information on how well developers support test suite.
- 14:23:40 [MoeKraft]
- Shadi: Not sure how many tool vendors would want to expose false positives.
- 14:24:13 [skotkjerra]
- q+
- 14:24:33 [MoeKraft]
- Shadi: There would have to be self declaration. But this could be criteria. Force some useful information to come back here. If tool implementations do not report that the rule comes back cleanly, then it is experimental.
- 14:24:43 [MoeKraft]
- Shadi: Test cases would need to be versioned too.
- 14:24:54 [MoeKraft]
- Shadi: Would be complex because of regression.
- 14:25:39 [MoeKraft]
- Shadi: Encourage someone who proposes a rule to get it implemented first. Plus, we would get more competition because tool vendors would try to implement and get a green light.
- 14:25:51 [shadi]
- WF: so what would be the evidence that rules work in practice?
- 14:26:04 [MoeKraft]
- Wilco: What would be the evidence that a rule is accurate?
- 14:26:04 [shadi]
- ...if enough tools implement it?
- 14:26:53 [MoeKraft]
- Shadi: First criterion, hope there is an active community that constantly reviews rules proposed. This is the first level of checking. Rules proposed by vendors
- 14:27:33 [MoeKraft]
- Shadi: If rule is accepted by competitor this is a good sign. Consensus building. Raise the bar to 3 independent implementations. Disclose how many vendors implement rule.
- 14:28:33 [shadi]
- WF: hesitant because implementation is only proof of accuracy if tools implement no false positives
- 14:29:07 [shadi]
- ...some rules indicate a level of accuracy
- 14:29:46 [shadi]
- ...for example if someone proposes a test rule with only 70% accuracy
- 14:30:13 [shadi]
- DM: can run against failure conditions, but can also check for false positives
- 14:30:23 [shadi]
- ...then there is semi-automated testing, where human intervention is needed
- 14:30:38 [shadi]
- ...may or may not test to the full standard
- 14:30:44 [shadi]
- ...like maybe only 50%
- 14:30:50 [shadi]
- ...so it is a scale to test
- 14:30:53 [shadi]
- q?
- 14:31:18 [shadi]
- WF: agree that some rules are more reliable than others
- 14:31:32 [shadi]
- DM: reliability and completeness
- 14:31:46 [shadi]
- SES: yes, but what does accurate actually mean?
- 14:32:42 [shadi]
- ...consistent and repeatable versus correct
- 14:33:13 [shadi]
- WF: "average between false positives and false negatives" (reads out from spec)
- 14:33:37 [Wilco]
- https://w3c.github.io/wcag-act/act-rules-format.html#quality-accuracy
- 14:34:50 [shadi]
- q+
- 14:34:55 [shadi]
- ack sko
- 14:34:57 [Wilco]
- ack sk
- 14:35:00 [Wilco]
- ack sh
- 14:37:46 [shadi]
- SAZ: think really trying to avoid a central group gate-keeping to scale up
- 14:38:00 [shadi]
- ...but need minimum bar of acceptance defined by the test cases
- 14:38:26 [shadi]
- ...this can be increased over time, as new situations and new technologies emerge
- 14:38:39 [shadi]
- ...may even need to pull rules at some point
- 14:39:01 [shadi]
- WF: can publish rules at any time, with different maturity flags
- 14:39:13 [shadi]
- ...fits with the W3C process
- 14:39:16 [shadi]
- SES: agree
- 14:39:22 [shadi]
- MJM: so do i
- 14:39:27 [shadi]
- SAZ: me too
- 14:40:11 [shadi]
- WF: maybe not more than one flag
- 14:40:23 [shadi]
- ...just something like "beta" or "experimental"
- 14:41:16 [shadi]
- SES: rely on implementers providing information that the tools function
- 14:42:13 [shadi]
- WF: when implementers find an issue, and make a change, want to encourage them to provide that feedback
- 14:42:37 [shadi]
- SAZ: ideally by adding a new test case
- 14:43:00 [shadi]
- WF: want to make sure that the rules stay in sync
- 14:43:59 [shadi]
- WF: implementers should give feedback by way of test cases
- 14:44:14 [shadi]
- ...does not work unless implementers share their test cases
- 14:44:49 [shadi]
- ...need iterative cycles, but need a way to do that
- 14:44:57 [shadi]
- ...have to encourage tool vendors
- 14:45:49 [shadi]
- MK: guess some vendors will not want to share all their rules
- 14:46:31 [shadi]
- WF: so can't take rules unless test cases are shared back
- 14:46:51 [shadi]
- MK: how to phrase that
- 14:47:43 [shadi]
- SES: fair expectation to set that rules will be shared, because it is a quality check
- 14:48:40 [shadi]
- q+
- 14:49:20 [shadi]
- WF: does somebody want to take over writing up this part?
- 14:49:26 [shadi]
- ...also need a name change
- 14:49:33 [shadi]
- SES: how about validation?
- 14:49:48 [shadi]
- WF: think publication requirements
- 14:50:03 [shadi]
- ...talking about how test rules get posted on the W3C website
- 14:50:05 [Wilco]
- ack s
- 14:52:09 [skotkjerra]
- q+
- 14:52:31 [shadi]
- SAZ: can take this over
- 14:53:01 [shadi]
- ...like the idea of incentives
- 14:53:21 [shadi]
- ...and the cycle that the incentives drive
- 14:53:30 [shadi]
- SES: happy to work on this too
- 14:54:13 [shadi]
- RESOLUTION: SES will head up drafting the publication/validation/benchmarking piece, with SAZ supporting
- 14:54:25 [shadi]
- WF: MaryJo maybe you can help too
- 14:54:45 [shadi]
- DM: what are the failure conditions of a rule?
- 14:55:04 [shadi]
- WF: think that is what we call test procedure
- 14:55:45 [shadi]
- DM: clients want to know what in the rules triggers the "fail"
- 14:55:56 [Wilco]
- https://auto-wcag.github.io/auto-wcag/rules/SC4-1-1-idref.html
- 14:55:57 [shadi]
- WF: that is the test procedure
- 14:57:15 [Wilco]
- ack sk
- 14:58:42 [shadi]
- WF: SteinErik, draft by next week?
- 14:58:48 [shadi]
- SES: yes, will try
- 14:59:14 [shadi]
- SK: still behind but will catch up
- 14:59:31 [MoeKraft]
- +1
- 14:59:40 [Kathy]
- present+ Kathy
- 14:59:42 [shadi]
- DM: amazing effort! complicated, but excellent to address
- 15:00:43 [shadi]
- ...shared effort
- 15:01:04 [shadi]
- SK: do we create our own test files?
- 15:01:20 [shadi]
- WF: we have some test cases, will send you the link
- 15:02:00 [shadi]
- trackbot, end meeting
- 15:02:00 [trackbot]
- Zakim, list attendees
- 15:02:00 [Zakim]
- As of this point the attendees have been MaryJoMueller, Wilco, Kathy, SteinErik, Debra, Moe, Sujasree
- 15:02:08 [trackbot]
- RRSAgent, please draft minutes
- 15:02:08 [RRSAgent]
- I have made the request to generate http://www.w3.org/2017/05/08-wcag-act-minutes.html trackbot
- 15:02:09 [trackbot]
- RRSAgent, bye
- 15:02:09 [RRSAgent]
- I see no action items