See also: IRC log
Eric: Public Working Draft: the wording of the link is wrong
<ericvelleman> http://lists.w3.org/Archives/Public/public-wcag-em-comments/
<MartijnHoutepen> who just joined?
Eric: a few organisations are still preparing comments, which should be in by 15 April
<ericvelleman> <http://lists.w3.org/Archives/Public/public-wai-evaltf/2013Apr/0014.html>
Eric: Discussing how to organise the test run
 ... discussing purpose of testing
Peter: some thoughts (will provide in email): do target audiences find the guidance clear?
 ... 'multiple' instead of two evaluators
 ... will suggest changes in a mail
Vivienne: Approached web site owners to ask them to allow testing of their sites as objects for evaluating the methodology
 ... site owners were very receptive
Eric: purpose of testing in this phase is to check suitability / applicability of our sampling process
 ... discussed with Shadi the usefulness of the WBS system (the system used for questionnaires / surveys)
 ... to be used for recording results
Vivienne: Other purposes: an organisation that was already evaluated would be interested in whether our test run would produce similar results
Eric: Braillenet did such a comparison, looking at differences in results between their common approach and WCAG EM
Peter: the purpose questions should be direct
questions to beta testers
 ... it was not clear whether we would explicitly ask testers to answer these questions
Eric: introducing next part, the cautions
 ... discussing purpose (NOT testing WCAG but the methodology, anonymising site
names, etc.)
 ... Any comments on cautions?
Vivienne: Caution for ourselves - declaration of interest / involvement with any particular website being evaluated
Eric: may not be necessary here since we are just focusing on WCAG EM
Vivienne: The bias may still influence our approach
Peter: if we test random sampling, intimate knowledge of site may be in the way
Eric: discussing third topic of the email, the plan
 ... we could use two or three web sites
 ... three or four nominated web sites
 ... will share names soon (but only EVAL TF internal)
 ... First test run could be first three steps (incl. sampling)
 ... then discuss outcome on list or the WBS system
 ... differences found would be interesting input for open discussion on list
Peter: concerned about idea of splitting this up
Peter: important to do a full end-to-end run; the
second time should go the whole way
 ... ending with sampling could lead to a situation where the different samples that are used might not impact the result
Eric: also interesting if different results would occur even with the same sample
Vivienne: most of the time client will have had
recommendations / constraints regarding the sample size
 ... among us we will have different notions of what an adequate sample is
<Vivienne> so much depends upon their budget
Eric: if we come up in the first part with samples of very different sizes, this needs to be addressed
Peter: another purpose question: is size of sample in line with typical sample size the reviewer would have otherwise used?
Detlev: there may not be a natural size of sample - depends on quality expectations
 ... is it worth having a 50 page sample if it just uncovers a few minor extra
problems?
Eric: a few people should do a full run at least
<Vivienne> for part 1, the website I've provided for an example is one I've done following the methodology
<Vivienne> so in effect I've done the whole website's evaluation
Detlev: maybe we should do the full run, but inter-evaluator comparisons are difficult if the sample is not identical
Peter: The most important aim is to establish how close we are to 'done' with WCAG EM
Peter: Holistic testing will be needed to tell us how close we are to 'done'
Peter: Fine-grained info less important than a measure of how close we are to completion
Martijn: Splitting into parts will provide more information than a full run right now - we can use the input and then do a holistic test
Peter: if we focus on unit testing we should focus on the same sample (step 4)
<MartijnHoutepen> i agree
<richard> I go for the full thing
Eric: Unit testing or full test run?
Vivienne: First test to see how close our samples are
 ... then compare step 4 based on same sample
Eric: asking Peter for software testing experience
Peter: It's not so much software testing, more how testers read and apply the text of WCAG EM
 ... better to vary just one variable (human being) not the sample
 ... in the end we need to have all the variables varying
Eric: Let's start with part 1 (first three steps), asking Shadi what sites we will pick, put things in the WBS system
 ... then based on selected websites, carry out first three steps, Eric posing
as web site owner communicating requirements
 ... should be out soon (maybe as early as tomorrow)
Vivienne: Sent info on organisation, wonders whether we need a memorandum of understanding so they know how their sites are going to be used
Eric: Let's leave that to Shadi
 ... MoU mentioned in mail to Shadi?
Vivienne: probably
Eric: will remind Shadi
 ... will send mail with site info and ask us to conduct steps 1-3
Mike: One way to avoid issues with companies would be to use W3C sites...
Eric: W3C site is gigantic and heterogeneous, not so suitable
Peter: A gigantic site might be good for step
one
 ... if W3C site contains samples of bad sites and what not to do - good for
sampling, problem if these would not be found
 ... Advantage that discussion of site could be open
<Mike_Elledge> :^)
Detlev: prefers a more typical site
Peter: a large site is still useful if the result is: the site is too big to apply WCAG EM to