Silver Community Group Teleconference -- 26 Nov 2019

update on Challenges

<jeanne> scribe: jeanne

Janina: The editor's draft is the most up to date right now. It has everything we have gotten from everyone, including success criteria from Detlev Fischer. After Challenge 4, it includes the part of the Silver research that deals with conformance.
... we hope that we will be able to get it published as a FPWD in December.
... we have issues in GitHub with interesting discussion.

<janina> Here are the issues:

<janina> https://github.com/w3c/wcag/issues?q=is%3Aissue+is%3Aopen+label%3A%22Challenges+with+Conformance%22

Sampling

<janina> scribe: janina

jeanne: Spent my weekend thinking about Silver! Is that fun?

<jeanne> https://docs.google.com/document/d/1y_HOyuMKltOQoZr0Gk7hMQXi3Jd8Mc5fyr-XkH7kZQY/edit#

jeanne: Notes many academics have thought and written about a11y conformance over the years
... Until Silver we haven't paid a lot of attention to that research
... I also spent a lot of time with WCAG-EM, as in "Evaluation Methodology"

<KimD> https://www.w3.org/TR/WCAG-EM/

jeanne: It seemed very applicable for a methodology
... Some useful terms and definitions, pointers to content resources, etc ...
... A nice list of website types
... e.g. international translations of sites
... I really liked the procedure. It's flexible
... Methodology has steps for creating a structured sample plus a random sample
... Discusses how a single page might actually form one of a related set, and the set would need to be considered in this methodology
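
The structured-plus-random sampling described above could be sketched roughly as follows. This is only an illustration, not the WCAG-EM procedure itself: the function name `build_sample`, the example URLs, and the `random_fraction` default are assumptions made for the sketch.

```python
import math
import random


def build_sample(all_pages, structured_pages, random_fraction=0.1, seed=None):
    """Combine a hand-picked structured sample with a random sample.

    random_fraction is a hypothetical parameter: the random part is sized
    as a fraction of the structured part, with a minimum of one page.
    """
    structured = list(dict.fromkeys(structured_pages))  # de-duplicate, keep order
    chosen = set(structured)
    # Only pages not already in the structured sample are eligible.
    remaining = [p for p in dict.fromkeys(all_pages) if p not in chosen]
    wanted = max(1, math.ceil(len(structured) * random_fraction))
    rng = random.Random(seed)
    random_part = rng.sample(remaining, min(wanted, len(remaining)))
    return structured + random_part


pages = [f"https://example.org/page{i}" for i in range(1, 101)]
sample = build_sample(pages, structured_pages=pages[:20], seed=1)
print(len(sample))
```

A related set of pages (for example, the steps of one process) would be added to `structured_pages` as a group, so that the whole set is evaluated together.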

<Zakim> bruce_bailey, you wanted to say i need to take another look at EM

bruce: Really liked the EM doc; the notion of exploring without the owner involved was a problem, though
... how else can you be sure you're covering all the essential design patterns?

<bruce_bailey> I would never want to do testing w/o cooperation with site owner

jeanne: Agrees it's more accurate when the owner is cooperating
... Notes that EM covers some specificity that we didn't have at Paciello

janina: Suggests this methodology belongs somewhere with Silver, not sure where, but somewhere

jeanne: Can be used so many ways
... Notes she also reviewed the Japanese methodology
... Believes they're actually testing a high number, which is expensive

<bruce_bailey> i agree that 40 is a lot

jeanne: They say 40, I always tried for 12

<bruce_bailey> agree that a dozen is more reasonable

<KimD> me too; 40 seems high

jeanne: Notes Japanese methodology less flexible than EM, but more easily implemented
... Next looked at an EU decision on how they would measure public sector sites
... EU set a sample set for auto testing; then a subset for manual eval
... Also testing covers all M376 user categories
... They need only one SC for conformance
... Detlev believes that's too small

<jeanne> https://docs.google.com/document/d/1y_HOyuMKltOQoZr0Gk7hMQXi3Jd8Mc5fyr-XkH7kZQY/edit#heading=h.qb7ma0p9m0bv

<jeanne> WCAG-EM is a complex procedure for obtaining the samples, but it is the most variable and adaptable to different conditions.

<jeanne> JIS X is simple to understand and perform, but may not provide sufficient coverage on very large sites.

<jeanne> Detlev’s article proposes (for a different context, but a good idea nonetheless) setting a sample size for automated testing and 5% of that sample for “in-depth”, a full evaluation.

<bruce_bailey> 5% for a large site seems like quite a lot

jeanne: Proposes an automated set and a subset for manual eval per EM
... For somewhat larger sites, over 1K pages, more ...

<KimD> 5% seems high to me too, for huge sites.

<jeanne> I took out the 5% for huge sites.

<jeanne> I took the WCAG-EM process for huge sites

<bruce_bailey> 5% is huge, and worse if it is just a random sample rather than a smart sample
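
To make the numbers in this exchange concrete, here is a minimal sketch of drawing an automated-testing sample and then a share of it for in-depth manual evaluation. The function name `split_sample` and its parameters are hypothetical; the 0.05 default simply mirrors the 5% figure quoted above, and a purely random draw is used only for brevity (the discussion favors a smarter selection).

```python
import random


def split_sample(pages, automated_size, in_depth_share=0.05, seed=None):
    """Pick pages for automated testing, then a subset for in-depth review.

    automated_size and in_depth_share are illustrative inputs, not values
    taken from any specification.
    """
    rng = random.Random(seed)
    automated = rng.sample(pages, min(automated_size, len(pages)))
    if not automated:
        return [], []
    # At least one page gets a full manual evaluation.
    in_depth_count = max(1, round(len(automated) * in_depth_share))
    in_depth = rng.sample(automated, in_depth_count)
    return automated, in_depth


site = [f"/doc/{i}" for i in range(5000)]
automated, in_depth = split_sample(site, automated_size=1000, seed=7)
print(len(automated), len(in_depth))
```

With these assumptions, a 1,000-page automated sample yields 50 pages for manual review, while a 100,000-page sample would yield 5,000, which illustrates why a fixed percentage becomes costly on huge sites.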

joe: Suggests prioritizing primary use cases, the necessary processes for using the site

kim: Here's what many customers do on our site
... Here's some that few do, but it's critical

jeanne: Quotes: "Identify the essential functionality"

kim: wonders whether we need an upper limit?
... What if you have millions of docs?
... We have law and regs from over 100 years, Federal, state, Canada, U.S., etc., etc.

<jeanne> Janina: It has to be doable, and it has to be prioritized: like the most recent documents are good, but the very old might be OCR-scan. You could sample across time.

<jeanne> WCAG-EM sampling method https://www.w3.org/TR/WCAG-EM/

jeanne: Think you would not have a huge sample set following EM
... Recommends considering what EM implies for our different cases

joe: tests for individual components
... then for individual pages
... then tests across groups, e.g. sites
... but then also primary use cases for users that have to be run through

angela: We also test that way, but also consider what gets more traffic

joe: think that's one of the ways we might prioritize fixes
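
One way to picture the prioritization discussed here is to rank user flows so that critical ones come first and traffic breaks ties. This is a sketch under assumptions: the `UserFlow` fields, the flow names, and the visit counts are invented for illustration, and placing critical flows ahead of high-traffic ones is just one possible ordering.

```python
from dataclasses import dataclass


@dataclass
class UserFlow:
    name: str
    monthly_visits: int  # hypothetical traffic figure
    critical: bool       # a low-traffic flow can still be essential


def prioritize(flows):
    """Order flows: critical ones first, then by traffic from high to low."""
    return sorted(flows, key=lambda f: (not f.critical, -f.monthly_visits))


flows = [
    UserFlow("browse catalog", 90000, False),
    UserFlow("reset password", 300, True),
    UserFlow("checkout", 40000, True),
]
for flow in prioritize(flows):
    print(flow.name)
```

The same ordering could be used either to choose which flows enter the sample or to decide which fixes to schedule first.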

janina: Suggests we can make example flows for various size sites

jeanne: notes how different the methodology for very large vs small
... but for really huge sites, auto testing is probably not practical or all that useful

joe: not so sure about that
... can't scale nonautomated testing
... in order to cover what we need to cover we must have good auto tests to help the little manual we can do

jeanne: Don't Amazon or Facebook update too often for auto testing?

joe: we run on anything
... anyone with AWS can do it
... a team could do their particular content
... this is part of the release process
... recalls he did a CSUN session on the Amazon process

jeanne: an example?
... for rapidly updating material

joe: so for a product page there will be hundreds of components contributing to it
... the only way to control that is to scan the content as it's being pushed; then it's released or not released
... that's not just accessibility
... all small component stuff
... it's just part of the release process
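
The "scan as it's being pushed, then release or don't" idea can be sketched as a simple gate over per-component scan results. This is not a description of any company's actual pipeline: `ScanResult`, `release_decision`, the component names, and the violation strings are all hypothetical, and the scanner that would produce the results is left out.

```python
from dataclasses import dataclass, field


@dataclass
class ScanResult:
    component: str
    violations: list = field(default_factory=list)  # findings from an automated scan


def release_decision(results, max_violations=0):
    """Return (ok, blocking): ok is True only if no component exceeds the limit."""
    blocking = [r.component for r in results if len(r.violations) > max_violations]
    return (not blocking, blocking)


results = [
    ScanResult("price-widget"),
    ScanResult("image-carousel", ["image missing text alternative"]),
]
ok, blocking = release_decision(results)
print("release" if ok else f"hold: {blocking}")
```

Because the gate works per component, each team could run it on its own content, and the same structure could carry checks other than accessibility.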

kim: notes that today's VPAT allows for constraints on conformance claims
... maybe we want people to be transparent about what and how they test?
