Silver Community Group Teleconference

03 Dec 2019


Present: jeanne, KimD, bruce_bailey, janina, Fazio
Regrets: Angela, Makoto


<Fazio> I'm down

CSUN F2F meeting

<jeanne> Form: https://docs.google.com/forms/d/1Gwkux5VMHhJUavPSCCvD-hxb79XYjNGFyhN8DqDBS_o/

Finding sponsors for F2F meeting

<scribe> scribe: KimD

Sampling proposal

Jeanne: the Sunday in the survey is the Sunday prior to CSUN.
... anyone who can help with sponsoring us, please contact Jeanne directly

Conformance and point system

Jeanne: continue from where we left off last week regarding the sampling proposal.

<jeanne> https://docs.google.com/document/d/1y_HOyuMKltOQoZr0Gk7hMQXi3Jd8Mc5fyr-XkH7kZQY/edit#

Jeanne: 100% or nothing is not going to solve the problem of conformance for large corporations.

Regarding sampling, Jeanne is proposing the following (see section "Different sizes, Different sample sizes"):

Sites under 100 pages or products under 100 screens (including counting the different dynamic states) need to test every page or screen.

scribe: sample sizes of 10% would still be too large - would not scale for large companies
... in-depth testing doesn't scale for large companies

Does anyone have any ideas?

Joe: we could measure at the component level, not the page level

Janina: this is discussed in the challenges paper (?)

Joe: Is this for sampling for live content (already in production) or going-forward?
... think about it at the component, page, then product level - different testing for each.

<janina> https://github.com/w3c/wcag/issues/943

Joe: a lot of variability between content and companies.

Jeanne: conformance is about measuring whether the a11y job got done correctly
... can be really small or really big, so often guidance is general.
... it might be good to identify the scope.

Janina: We're moving toward "one size doesn't fit all", which is OK as long as we have a path for each
... for larger websites, maybe a CMS is needed.
... need to scope conformance
... methodology, date, what was tested

Jeanne: do we want to do that? This sounds more like a VPAT approach. Is that OK?

Janina: What does it mean to conform to W3C?

Jeanne: keep it simple, conformance is about knowing whether you did it right.
... stay away from claims, legality, etc.

Janina: WCAG 2.x is very specific about conformance.
... from results, it can be used to assert more than that
... how descriptive do we want to be? Or should we be prescriptive?

Jeanne: What kind of claim do we want people to make?

Bruce: We really do need to break for something that's beyond page and 100% compliance
... but it's really hard and not clear how to hit that middle ground. 508 is more 100% and doesn't address "good faith effort"

Joe: I like the idea of having large site owners pick, then identify % based on workflow
... important things can be addressed first
... can keep people focused on primary workflows
... could add in traffic too

Janina: I think about banks... front-line on the a11y front, but haven't gotten themselves all the way there.
... for example, 'preferences' not always accessible. Main functionality is often pretty good
... "accessible options"
... aural interface might have a big market - could multi-task
... screen reading functionality can be used in other settings (unintended benefits)

Jeanne: Regarding sampling, how would we address component v. overall screen?

Joe: might not work for all orgs, but component level might be an alternate track.
... would enable conformance claims for certain flows/tasks

Jeanne: take out section

Joe: automated testing takes place as part of a release

Jeanne: what about testing older (legacy) content?

Joe: we focus at the time of deployment. So older stuff could get missed with that approach.
... maybe that's an additional metric that goes along with workflows.
... reference material could fall into that category

<jeanne> Kim: We kind of do that now.

<jeanne> ... we do an ad hoc approach where we test the workflow on a page and the components on a page.

<jeanne> ... we don't do automated testing, because our pages are complex and have a lot of legacy code that wouldn't pass an automated test, even if it was accessible.

Jeanne: let's leave this for now, and move on.
... when we figure out the very large, we'll have to move on to the medium-sized then.
... we should let people choose their method of measuring
... next, let's talk about how this connects to point-scoring.

Sampling and the point scoring system

Under section "Connection to Conformance Point Scoring" - very drafty

scribe: came from NIST in the US
... got some of this from a paper from NIST.
... how do we weight a point-scoring system? Very tough.
... WCAG divides into A, AA, AAA. One issue is that there's no written explanation about how the SCs got their levels.
... probably in an understanding doc, after publication.
... For Silver, we need to be clear and document the weighting
... when looking at the point system 1.5 years ago, several of us did some testing with different weighting systems
... tested against current WCAG to see how they came out.
... e.g., whether it was a more difficult barrier, difficult to implement, difficult to test, etc. About 10 factors.
... we didn't like the result
... this summer, different proposals for point-scoring systems - they didn't have the same background we did
... they didn't address the problems we were trying to solve, and didn't seem fair across disabilities.
... needs to be fair.
... a lot at A that are for visual impairment, but a lot at AAA for cognitive. Result of weighting.
... NIST paper did weighting by how good is the evidence. How strong is it?
... this seems good. It gets back to "how do you know you did it right"
... weighting based on evidence seems to solve a lot of problems.
... there are a lot of formulas in this section

Each guideline in Silver has methods, and each method could have a test.

scribe: this gives the possibility to have % of success.

Intrinsic value: Yes = 1, No = 0

"Another question that may be asked is: To what degree does the alt text value serve the equivalent purpose to what the image conveys? 0 = not at all, 0.5 = somewhat, and 1 = entirely."

"The answer to this second question would be 1 (entirely). So in this example, the intrinsic evidence value would be 1, computed as follows: (1 (first question answer) + 1 (second question answer)) / 2 (number of questions) = 1."

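The arithmetic in the quoted example can be sketched in a few lines. This is only an illustration, assuming the intrinsic evidence value is the plain average of the per-question answers, each between 0 and 1; the function name `intrinsic_evidence_value` is hypothetical and does not come from the proposal.

```python
def intrinsic_evidence_value(answers):
    """Average the per-question answers (assumed to each lie between 0 and 1)."""
    if not answers:
        raise ValueError("at least one answer is required")
    for answer in answers:
        if not 0 <= answer <= 1:
            raise ValueError("each answer must be between 0 and 1")
    return sum(answers) / len(answers)


# Alt text is present (1) and entirely equivalent to the image (1).
print(intrinsic_evidence_value([1, 1]))    # 1.0
# Alt text is present (1) but only somewhat equivalent (0.5).
print(intrinsic_evidence_value([1, 0.5]))  # 0.75
```

With this reading, a "somewhat" answer on the second question would lower the value to 0.75 rather than failing the image outright.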
scribe: could be based on tests we put in Silver
... next part is based on quality. This is complicated
... based on 5 separate scores:

reproducibility, objective measurability, relevance to the associated success criterion, not subject to compromise or tampering, degree of uncertainty or error

Jeanne might drop "relevance to the associated success criterion" and "not subject to compromise or tampering"

scribe: add scores for each of the 5, then divide by 5, and that feeds back to the score for that SC
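The "add the five scores, then divide by 5" step might look like the sketch below. The factor keys, the 0 to 1 scale for each score, and the function name `quality_score` are assumptions made for illustration. The minutes do not say how the result feeds back into the SC score, so that step is left out.

```python
# Hypothetical keys for the five quality factors named in the discussion.
QUALITY_FACTORS = (
    "reproducibility",
    "objective_measurability",
    "relevance_to_success_criterion",
    "resistance_to_tampering",
    "degree_of_uncertainty",
)


def quality_score(scores):
    """Sum the five factor scores and divide by the number of factors."""
    missing = [factor for factor in QUALITY_FACTORS if factor not in scores]
    if missing:
        raise ValueError(f"missing factor scores: {missing}")
    return sum(scores[factor] for factor in QUALITY_FACTORS) / len(QUALITY_FACTORS)


example = {
    "reproducibility": 1.0,
    "objective_measurability": 0.5,
    "relevance_to_success_criterion": 1.0,
    "resistance_to_tampering": 0.5,
    "degree_of_uncertainty": 0.5,
}
print(quality_score(example))  # 0.7
```

If two of the factors were dropped, as Jeanne suggested might happen, only the tuple of factor keys would need to change.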

<Fazio> the problem is some SCs are critical failures

David: what about critical failures?
... "big red dog" example: you have to have both to make it clear what the image is
... if not good, it could stop someone in their tracks. We need to identify critical failures - things that must be right

Janina: you might be right, in ALT example, ALT is perhaps more critical if the image is also a link than if it's a flat image

Jeanne: we could put in tolerances. If it's a navigational control, it has to be 100%, but maybe informational could be 80%
... we're kind of moving away from critical failures because of pushback from COGA.
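The tolerance idea could be sketched as a per-type threshold check. The table name `TOLERANCE_PERCENT`, the type labels, and the function `meets_tolerance` are hypothetical; the 100% and 80% figures are the ones floated in the discussion and were not agreed values.

```python
# Hypothetical tolerance table: minimum share of passing instances, in percent.
TOLERANCE_PERCENT = {
    "navigational": 100,
    "informational": 80,
}


def meets_tolerance(content_type, passed, total):
    """Return True if the pass rate reaches the tolerance for this content type."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= passed <= total:
        raise ValueError("passed must be between 0 and total")
    required = TOLERANCE_PERCENT[content_type]
    # Integer comparison avoids floating-point rounding at the threshold.
    return passed * 100 >= required * total


print(meets_tolerance("navigational", 9, 10))   # False
print(meets_tolerance("informational", 8, 10))  # True
```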

COGA-related SCs got pushed to AAA because it seemed like there were work-arounds

scribe: COGA points out that added together, they do become critical failures
... thus severity weighting made it so we introduced a structural bias against COGA

<bruce_bailey> See:

<bruce_bailey> https://www.w3.org/WAI/GL/wiki/WCAG_2.x_Priority_levels_discussion#Tabular_View_of_Common_Factors

scribe: not intentional, but that's what happened.

Janina: challenges
... would like a survey after next week's AGWG meeting to see if we can go to FPWD
... 4 issues left
... Janina proposes to send an email and explain this

Jeanne: apologies for not saving time
... any questions?

Summary of Action Items

Summary of Resolutions

[End of minutes]

Minutes manually created (not a transcript), formatted by David Booth's scribe.perl version 1.154 (CVS log)
$Date: 2019/12/04 01:04:06 $
