WCAG 2.0 Evaluation Methodology Task Force Teleconference

11 Apr 2013

See also: IRC log


Martijn, Richard, Ericv, Detlev, Peter, Vivienne, Kathy, Mike, Sarah, Tim
Shadi, Alistair


welcome and documents

Eric: public working draft: the wording of the link is wrong

state of comments

<ericvelleman> http://lists.w3.org/Archives/Public/public-wcag-em-comments/

<MartijnHoutepen> who just joined?

Eric: a few organisations are still preparing comments that should be in by 15 April

test run

<ericvelleman> <http://lists.w3.org/Archives/Public/public-wai-evaltf/2013Apr/0014.html>

Eric: Discussing how to organise the test run
... discussing purpose of testing

Peter: some thoughts (will provide in email): do target audiences find the guidance clear?
... 'multiple' instead of two evaluators
... will suggest changes in a mail

Vivienne: Approached web site owners to ask them to allow testing of their sites as objects for evaluating the methodology
... site owners were very receptive

Eric: purpose of testing in this phase is to check suitability/applicability of our sampling process
... discussed with Shadi the usefulness of the WBS system (the system used for questionnaires / surveys)
... to be used for recording results

Vivienne: Other purposes: an organisation that was already evaluated would be interested in whether our test run would produce similar results

Eric: Braillenet did such a comparison, looking at differences in results between their common approach and WCAG EM

<ericvelleman> test

Peter: the purpose questions should be direct questions to beta testers
... it was not clear whether we would explicitly ask testers to answer these questions

Eric: introducing next part, the cautions
... discussing purpose (NOT testing WCAG but the methodology, anonymising site names, etc.)
... Any comments on cautions?

Vivienne: Caution for ourselves - declaration of interest / involvement with any particular website being evaluated

Eric: may not be necessary here since we are just focusing on WCAG EM

Vivienne: The bias may still influence our approach

Peter: if we test random sampling, intimate knowledge of site may be in the way

Eric: discussing third topic of the email, the plan
... we could use two or three web sites
... three or four nominated web sites
... will share names soon (but only EVAL TF internal)
... First test run could be first three steps (incl. sampling)
... then discuss outcome on list or the WBS system
... differences found would be interesting input for open discussion on list

Peter: concerned about idea of splitting this up

Peter: important to do a full end-to-end run; the second time should go the whole way
... ending with sampling could lead to a situation where the different samples that are used might not impact the result

Eric: also interesting if different result would occur even with the same sample

Vivienne: most of the time client will have had recommendations / constraints regarding the sample size
... among us we will have different notions of what an adequate sample is

<Vivienne> so much depends upon their budget

Eric: if we come up in the first part with samples of very different sizes, this needs to be addressed

Peter: another purpose question: is size of sample in line with typical sample size the reviewer would have otherwise used?

Detlev: there may not be a natural size of sample - depends on quality expectations
... is it worth having a 50 page sample if it just uncovers a few minor extra problems?

Eric: a few people should do a full run at least

<Vivienne> for part 1, the website I've provided for an example is one I've done following the methodology

<Vivienne> so in effect I've done the whole website's evaluation

Detlev: maybe we should do the full run, but inter-evaluator comparisons are difficult if the sample is not identical

Peter: The most important aim is to establish how close we are to 'done' with WCAG EM

Peter: Holistic testing will be needed to tell us how close we are to 'done'

Peter: Fine-grained info less important than a measure of how close we are to completion

Martijn: Split in parts will provide more information than a full run right now - we can use the input then do a holistic test

Peter: if we focus on unit testing we should focus on the same sample (step 4)

<MartijnHoutepen> i agree

<richard> I go for the full thing

Eric: Unit testing or full test run?

Vivienne: First test to see how close our samples are
... then compare step 4 based on same sample

Eric: asking Peter for software testing experience

Peter: It's not so much software testing, more how testers read and apply the text of WCAG EM
... better to vary just one variable (human being) not the sample
... in the end we need to have all the variables varying

Eric: Let's start with part 1 (first three steps), asking Shadi what sites we will pick, put things in the WBS system
... then based on selected websites, carry out first three steps, Eric posing as web site owner communicating requirements
... should be out soon (maybe as early as tomorrow)

Vivienne: Sent info on organisation, wonders whether we need a memorandum of understanding so they know how their sites are going to be used

Eric: Let's leave that to Shadi
... MoU mentioned in mail to Shadi?

Vivienne: probably

Eric: will remind Shadi
... will send mail with site info and ask us to conduct steps 1-3


Mike: One way to avoid issues with companies would be to use W3C sites...

Eric: W3C site is gigantic and heterogeneous, not so suitable

Peter: A gigantic site might be good for step one
... if W3C site contains samples of bad sites and what not to do - good for sampling, problem if these would not be found
... Advantage that discussion of site could be open

<Mike_Elledge> :^)

Detlev: prefers a more typical site

Peter: a large site is still useful if the result is: the site is too big to apply WCAG EM to

Summary of Action Items

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.137 (CVS log)
$Date: 2013/04/16 17:37:07 $