IRC log of eval on 2012-10-29

Timestamps are in UTC.

08:07:55 [RRSAgent]
RRSAgent has joined #eval
08:07:55 [RRSAgent]
logging to http://www.w3.org/2012/10/29-eval-irc
08:07:57 [trackbot]
RRSAgent, make logs world
08:07:57 [Zakim]
Zakim has joined #eval
08:07:59 [trackbot]
Zakim, this will be 3825
08:07:59 [Zakim]
ok, trackbot; I see WAI_WCAG_()4:00AM scheduled to start 8 minutes ago
08:08:00 [trackbot]
Meeting: WCAG 2.0 Evaluation Methodology Task Force Teleconference
08:08:00 [trackbot]
Date: 29 October 2012
08:08:12 [shadi]
zakim, call St_Clair_1
08:08:12 [Zakim]
ok, shadi; the call is being made
08:08:13 [Zakim]
WAI_WCAG_()4:00AM has now started
08:08:14 [Zakim]
+St_Clair_1
08:10:39 [shadi]
chair: ericvelleman
08:11:45 [JohnS]
JohnS has joined #eval
08:13:13 [Ryladog]
hello all
08:13:21 [shadi]
scribe: shadi
08:14:08 [shadi]
agenda: http://www.w3.org/WAI/ER/2011/eval/f2f_TPAC
08:22:37 [shadi]
[Present in room: Eric Velleman, Katie Haritos-Shea, Ramon Corominas, Vivienne Conway, Shadi Abou-Zahra, Jason Kiss (observer), John S Lee (observer), David McDonald (observer)]
08:24:44 [David_MacD_Lenovo]
David_MacD_Lenovo has joined #eval
08:25:58 [shadi]
Topic: Improving coverage for web applications
08:26:46 [shadi]
Eric: would like to address (1) scope of "web application" in the context of WCAG-EM and (2) how we address web applications in WCAG-EM
08:27:13 [shadi]
...what information is missing and where do we need to add information
08:32:19 [vivienne]
www.w3.org/TR/WCAG-EM/
08:32:33 [shadi]
agenda: http://www.w3.org/WAI/ER/2011/eval/f2f_TPAC
08:33:05 [shadi]
Eric: web application is an application rendered via a browser
08:33:25 [shadi]
Ramon: what about Air that is installed locally
08:33:42 [shadi]
Katie: downloaded using HTTP
08:34:27 [shadi]
Shadi: WCAG 2.0 defines web content as that delivered via HTTP and rendered via a browser
08:38:16 [shadi]
Katie: develop a new version of the methodology after WCAG2ICT work is completed?
08:39:54 [shadi]
Shadi: need to make sure we address what WCAG defines as web content, and make sure we do not break usage for other contexts
08:40:25 [shadi]
Ramon: should add example of web applications
08:40:39 [shadi]
...currently no differentiation for web applications
08:41:21 [shadi]
Vivienne: need to spell out what a web application is, because some people consider that a web application is not a website
08:42:58 [David_MacD_Lenovo]
http://www.tbs-sct.gc.ca/pol/doc-eng.aspx?id=23601&section=text Government of Canada Web pages refer to static and dynamic Web pages and Web applications.
08:45:03 [shadi]
Ramon: impact of some success criteria on web applications is different than for more static websites
08:45:40 [shadi]
...for example when the screen is magnified it may impact an application with many widgets more than a document with more text
08:46:27 [shadi]
Katie: would not want to single out and separate specific requirements
08:46:41 [shadi]
...screen magnification also impacts other situations
08:50:15 [shadi]
Shadi: maybe makes sense to point out to evaluator particular success criteria that occur more frequently in the context of web applications
08:51:15 [shadi]
Vivienne: use a set of such criteria when evaluating web applications
08:52:14 [shadi]
...more important than sampling because there could be only a few web pages
08:53:54 [shadi]
David: rather than defining web application just use the WCAG approach of calling it content
08:56:34 [shadi]
## Examples of Web Applications ##
08:56:39 [shadi]
calendar widget
08:56:47 [shadi]
online forms
08:57:22 [shadi]
word processor
08:57:26 [shadi]
...
08:57:38 [shadi]
Ramon: something that performs an action
08:57:57 [shadi]
John: typically over several iterations of interaction
08:59:24 [shadi]
Jason: interaction where the user provides some input and gets a response based on that input
08:59:51 [shadi]
John: the nature of that response is what differentiates them
09:00:09 [shadi]
Jason: sometimes thinks applications don't exist
09:00:16 [shadi]
...it is about content and interaction
09:00:41 [shadi]
...more dynamic than traditional static pages
09:01:11 [shadi]
...model not fundamentally different but usability may be different
09:01:40 [shadi]
John: in web applications some of the logic happens on the client side
09:02:44 [shadi]
Jason: discrete set of functionality that serves a specific purpose?
09:04:09 [shadi]
...what about client side scripts that generate a series of pages
09:04:26 [shadi]
David: use "horse power" for cars even though no horses pull the cars anymore
09:04:58 [shadi]
...maybe want to keep "web page" despite the new paradigms
09:05:46 [shadi]
Ramon: want to avoid that people think WCAG-EM is not applicable to what people perceive as a web application
09:08:43 [shadi]
Vivienne: current descriptions include web applications as part of website
09:08:51 [shadi]
...maybe only need some more examples
09:13:52 [shadi]
Katie: test all the functionality on a page
09:14:10 [shadi]
Ramon: cannot test all the functionality on an application like Google docs
09:14:27 [shadi]
...because possibly many thousands of individual ones
09:14:37 [shadi]
...usually group the types of functionality
09:18:40 [shadi]
## in a web application lots of functionality and content may be compressed into a single web page
09:19:17 [shadi]
## there may also be lots of repetition of components (blocks)
09:20:18 [shadi]
## requirement for "complete transaction" is also frequently an issue
09:21:14 [shadi]
Vivienne: deciding the parameters of a web application, where it starts and where it ends
09:22:58 [shadi]
### Example of Web Applications
09:23:17 [shadi]
iTunes is a browser
09:23:41 [shadi]
internet banking part of the bank website
09:24:09 [shadi]
webmail client
09:24:20 [shadi]
hotel or airline booking
09:26:25 [shadi]
(bank may not consider internet banking application as a website in itself)
09:27:39 [shadi]
(booking websites may have distinct search versus booking functionality)
09:27:58 [shadi]
s/iTunes is/iTunes homepage is
09:29:15 [shadi]
real-time tickers like for scores or stock market
09:29:25 [shadi]
social networking applications
09:34:39 [shadi]
## discussion about dependency: on a traditional website there may be more easily separable areas (like "the math department") whereas in an application there may be dependencies, like the path to get to the particular part of an application (like an "app" on facebook)
09:35:25 [shadi]
tax calculator
09:59:40 [Zakim]
-St_Clair_1
09:59:40 [Zakim]
WAI_WCAG_()4:00AM has ended
09:59:40 [Zakim]
Attendees were St_Clair_1
10:06:48 [Ryladog]
Zakim, scribe Ryladog
10:06:48 [Zakim]
I don't understand 'scribe Ryladog', Ryladog
10:07:25 [Ryladog]
scribe:Ryladog
10:07:38 [Ryladog]
Topic: Revision of the current sampling approach
10:08:12 [David_MacD_Lenovo]
David_MacD_Lenovo has joined #eval
10:08:12 [shadi]
zakim, call St_Clair_1
10:08:12 [Zakim]
ok, shadi; the call is being made
10:08:13 [Zakim]
WAI_WCAG_()4:00AM has now started
10:08:15 [Zakim]
+St_Clair_1
10:08:20 [David_MacD_Lenovo]
test
10:08:44 [ericvelleman]
ericvelleman has joined #eval
10:16:17 [David_MacD_Lenovo]
David_MacD_Lenovo has joined #eval
10:16:41 [Ryladog]
Agenda: Revision of the current sampling approach
10:17:13 [Ryladog]
EV: We do not have a random sample at the moment in our methodology
10:17:51 [Ryladog]
EV: We define it in Scope section 3
10:18:07 [shadi]
Present: Eric Velleman, Katie Haritos-Shea, Ramon Corominas, Vivienne Conway, Shadi Abou-Zahra, Jason Kiss (observer), John S Lee (observer), David McDonald (observer)
10:18:27 [Ryladog]
EV: 3.3 Step 3: Select a Representative Sample - 3.3.1 Step 3.a: Include Common Web Pages of the Website; 3.3.2 Step 3.b: Include Exemplar Instances of Web Pages; 3.3.3 Step 3.c: Include Other Relevant Web Pages; 3.3.4 Step 3.d: Include Complete Processes in the Sample
10:19:05 [Ryladog]
3.3 is missing random sampling; we want to add it to 3.3
10:20:23 [Ryladog]
VC: Performed a test where 25% of the total sample size could be random. 90% of the pages were already represented in the structured pages
10:21:00 [Ryladog]
VC: Had a problem finding random pages that were not already in her structured pages
10:21:02 [jkiss]
jkiss has joined #eval
10:21:37 [Ryladog]
VC: That could be very expensive; the random sample would need to change for the next test
10:22:10 [Ryladog]
DM: Why would you exclude pages because they use a template?
10:22:22 [Ryladog]
VC: Because they were all so similar
10:22:56 [Ryladog]
DM: I would not worry about overlap
10:23:20 [Ryladog]
DM: 4 things that a random sample will help with
10:24:22 [Ryladog]
number 1, like the tax filing example, they could miss something (to ensure the whole website is covered)
10:25:03 [Ryladog]
number 2, inadvertent parts that would otherwise be missed
10:25:35 [Ryladog]
number 3, coming....
10:25:53 [Ryladog]
EV: Why do random pages?
10:26:01 [shadi]
http://lists.w3.org/Archives/Public/public-wai-evaltf/2012Sep/0066.html
10:27:26 [Ryladog]
RC: Newspaper, one news item; it is not really random
10:27:48 [Ryladog]
EV: The idea is that it should be random
10:28:42 [Ryladog]
RC: Every three months automatic testing, then every other three months is manual.
10:29:20 [Ryladog]
DM: Self evaluation, vs. IV&V
10:31:15 [Ryladog]
SAZ: With an honest evaluator performing a random sampling, have a detective's nose........
10:32:32 [Ryladog]
SAZ: We pick out of a completely structured example, and a random sample should produce the same results........if the choice is truly random
10:33:19 [Ryladog]
VC: It would be good to compare structured and random results
10:33:34 [Ryladog]
RC: That could be difficult
10:34:40 [Ryladog]
RC: 1 home page, 3 landing pages and 1000 content pages
10:35:30 [Ryladog]
EV: The random sample should in no way be worse than the structured sample
10:36:36 [Ryladog]
SAZ: Any random selected pages should never perform worse than structured
10:37:50 [Ryladog]
VC: I have one example where random won't work.
10:38:24 [Ryladog]
JK: Australian Gov says that 10% of the pages need to be tested.
10:40:45 [Ryladog]
KHS: Should we recommend identifying that random sampling was performed with any methodology testing
10:41:19 [Ryladog]
VC: For research, uses a % of failures
10:42:37 [Ryladog]
SAZ: I want to push the sampling responsibility to the evaluator
10:42:56 [Ryladog]
SAZ: The result you provide should prevail
10:44:21 [Ryladog]
RC: For conformance we need to include sampling
10:46:28 [Ryladog]
DM: Gov of Canada, has an example/sample size requirements, is 10 or minutes 90%
10:48:27 [Ryladog]
JK: Department of statistics in Canada put this together. They have gone up to 69 pages for a site of 3,000 pages or +. If the TF is going to go this way, by + or - 5% or 10% - that might be reasonable
10:48:36 [Ryladog]
EV: I am not sure
10:48:52 [Ryladog]
JK: Standard deviation
10:49:20 [Ryladog]
VC: Of your 25% should be random
10:49:48 [Ryladog]
VC: Gregg V, we thought, suggested that number, 25%
10:50:12 [Ryladog]
VC: Why random sample, it keeps you honest
10:50:41 [Ryladog]
EV: Purpose: Confirmation of the outcome of your structured sampling approach
10:51:06 [Ryladog]
EV: Comparison of 2 or more sites or instances of the same site
10:51:55 [Ryladog]
VC: Random sampling is better for re-assessments of the same site
10:53:11 [Ryladog]
SAZ: There are different gradients of sampling
10:53:43 [Ryladog]
semi-structured selection, under specific criteria - without identifying the pages
10:54:35 [Ryladog]
Ramon: instructions of how to select the pages including randomness
10:54:44 [Ryladog]
Shadi: perhaps use the term 'variation'
10:55:11 [ericvelleman]
ericvelleman has joined #eval
10:56:02 [Ryladog]
David: tax department doesn't give you the opportunity to choose the tax receipts they review - the whole point of not being able to choose - takes that out of the equation. A lot of people are not as passionate about accessibility - they just want the checkmark. There aren't a lot of people who know
10:56:28 [Ryladog]
David: there should be component of random sample - some automated or third party selected
10:57:18 [Ryladog]
VC: What is the validity of a third party if you do it yourself?
10:58:07 [shadi]
http://www.macorr.com/sample-size-calculator.htm
10:58:27 [Ryladog]
KL: Self assessment can work well
10:58:56 [Ryladog]
KHS: The result need to be the same - as long as the outcome is the same.
11:00:05 [Ryladog]
KL: x pages should always be evaluated and x number of pages will be randomly chosen
11:00:38 [Ryladog]
VC: Australia could use this methodology
11:02:02 [Ryladog]
RC: We do multi-site evaluations just to compare
11:02:09 [shadi]
Number of Pages in the Website / Sample Size
11:02:09 [shadi]
5 / 5
11:02:09 [shadi]
10 / 10
11:02:09 [shadi]
25 / 23
11:02:09 [shadi]
50 / 42
11:02:10 [shadi]
100 / 73
11:02:12 [shadi]
125 / 86
11:02:14 [shadi]
150 / 97
11:02:18 [shadi]
200 / 116
11:02:20 [shadi]
250 / 131
11:02:22 [shadi]
350 / 153
11:02:24 [shadi]
500 / 176
11:02:26 [shadi]
750 / 200
11:02:28 [shadi]
1000 / 214
11:02:30 [Ryladog]
Not to check conformance - but to see what the people do with the website
11:04:41 [shadi]
http://www.macorr.com/sample-size-calculator.htm
11:05:11 [shadi]
(85% - 95% confidence)
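The page counts quoted above are consistent with a standard finite-population sample-size calculation. As a rough sketch (an assumption - the exact formula and rounding used by the calculator linked in the minutes are not stated in the log), Cochran's formula with a z-score of about 1.645 (roughly 90% confidence) and a 5% margin of error reproduces most rows of the table to within one page:

```python
import math

def sample_size(population, z=1.645, margin=0.05, p=0.5):
    """Cochran's sample-size formula with finite-population correction.

    z=1.645 is roughly 90% confidence; the online calculator cited in
    the minutes may round differently, so some population sizes can be
    off by one page from the quoted table.
    """
    n0 = (z ** 2) * p * (1 - p) / (margin ** 2)   # infinite-population size
    n = n0 / (1 + (n0 - 1) / population)          # finite-population correction
    return min(population, math.ceil(n))

for pages in (5, 25, 100, 500, 1000):
    print(pages, sample_size(pages))
```

For very small sites the formula saturates at the site size itself (5 / 5, 10 / 10), matching the first rows of the table.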
11:05:50 [Ryladog]
RC: We do random sampling with automated testing
11:06:24 [Ryladog]
EV: The difference when random with automated or structured
11:07:39 [Ryladog]
EV: Purposes: Confirmation,
11:08:39 [Ryladog]
DM: Random sampling would be used as a validation of structured testing
11:10:05 [Ryladog]
SAZ: Current guidance is: Use an automated tool and then pick your pages
11:10:31 [Ryladog]
SAZ: How are we going to change that to include 'random'
11:10:35 [jkiss]
s/69 pages/68 pages/
11:11:26 [jkiss]
s/JK: Australian Gov/VC: Australian Gov/
11:11:54 [Ryladog]
SAZ: Two things we need to tackle with random sampling: oversight and intentional unearthing of errors
11:12:25 [Ryladog]
EV: Should we require random sampling
11:12:47 [Ryladog]
VC: We should require it
11:12:52 [Ryladog]
JK: We should require it
11:12:57 [Ryladog]
KHS: We should require it
11:13:26 [Ryladog]
VC: Over time this matters
11:15:09 [Ryladog]
RC: Unsure to require it
11:15:47 [Ryladog]
Sample should be representative
11:16:21 [Ryladog]
JK: For a small site the entire site is your representative sample
11:17:02 [Ryladog]
JK: The whole idea is to avoid the view - in Canada, every site needs to test the home pages, pages with media,
11:18:21 [Ryladog]
DM: We should require it
11:22:48 [Judy]
Judy has joined #eval
11:24:28 [Ryladog]
ACTION: Will check with Canadian Treasury Board Secretariat's' web-site statistical audit guidance folks - to help us make a determination on sample size
11:24:28 [trackbot]
Sorry, couldn't find Will. You can review and register nicknames at <http://www.w3.org/WAI/ER/2011/eval/track/users>.
11:25:18 [Ryladog]
ACTION: Jason Kiss will check with Canadian Treasury Board Secretariat's web-site statistical audit guidance folks - to help us make a determination on sample size
11:26:28 [Ryladog]
RC: Small sample size just for PASS FAIL - not for conformance - then a full evaluation
11:34:30 [Ryladog]
Group reviews Sampling survey - 29 respondents
11:44:26 [Ryladog]
JK: Is Random Sampling used for economical and staffing reasons? Yes, all agree
11:50:12 [shadi]
zakim, drop st
11:50:12 [Zakim]
St_Clair_1 is being disconnected
11:50:13 [Zakim]
WAI_WCAG_()4:00AM has ended
11:50:13 [Zakim]
Attendees were St_Clair_1
12:52:58 [David_MacD_Lenovo]
David_MacD_Lenovo has joined #eval
13:01:13 [shadi]
shadi has joined #eval
13:10:03 [jkiss]
jkiss has joined #eval
13:13:12 [ericvelleman]
ericvelleman has joined #eval
13:13:18 [jkiss]
jkiss has joined #eval
13:15:30 [vivienne]
scribe: vivienne
13:16:09 [shadi_]
shadi_ has joined #eval
13:16:23 [vivienne]
topic: revising the current performance score
13:16:45 [shadi]
scribe: vivienne
13:17:59 [vivienne]
Eric: Step 5C is the performance score
13:18:06 [Judy]
Judy has joined #eval
13:18:34 [vivienne]
eric: the only compulsory part is providing documentation
13:20:09 [vivienne]
eric: how do you score? There are a few possibilities - total website, web page or instance. What would it look like? Do we want to make it mandatory, or if there is a score it must be between 1-10 etc
13:20:34 [vivienne]
Katie: somewhere between green & yellow
13:21:28 [Detlev]
Detlev has joined #eval
13:21:33 [vivienne]
Ramon: don't like global scores because they convey whether you have done it well or badly. If you give 90% and you have a disability, even that 90% is bad for those people. It tends to make people complacent - they feel they are pretty good.
13:21:42 [vivienne]
Eric: the easiest - fail or not fail
13:22:02 [vivienne]
Ramon: we use severity and frequency according to the SC
13:22:15 [vivienne]
Ramon: the global score tends to be for the visually impaired
13:23:13 [vivienne]
Detlev: if the score is based on WCAG, then you measure the score based on the criteria. There could be an argument for a universal score. It may not serve user groups well, because of something like captioning in which case it would fail for them completely.
13:23:57 [vivienne]
Ramon: the problem with the approaches in the EM - e.g. keyboard accessibility: a site may score 99% accessible, and to most people it seems completely accessible, yet for keyboard users it is not.
13:24:13 [vivienne]
Ramon: say for an epileptic person some criteria would be a major fail
13:24:35 [vivienne]
Ramon: very difficult to find out which criteria affect which group
13:24:57 [vivienne]
David: that's why we don't use the word priority - it is very political
13:25:48 [vivienne]
Ramon: we are part of a foundation for people with disabilities, we cannot discriminate between the different groups. We don't like scoring because of that. Any scoring has to be very clear that it covers all of the possible types of disability.
13:27:07 [vivienne]
Detlev: one example - the BITV test - you score each success criterion for each page with a scale. If you have 95% it doesn't mean one SC failed completely; it means that for 1 criterion you have less than ideal results. E.g. colours could be technically a failure (4.2:1), but it is a near pass. These near passes then add up
13:28:12 [vivienne]
Detlev: there are 'accessibility killers' - even when you have a good score, if you find a keyboard trap it would be downgraded to inaccessible.
13:28:25 [vivienne]
Eric: how did you get to the scale? You are scoring according to severity?
13:29:22 [vivienne]
Detlev: we had a number of criteria that are critical. For every SC and every test that you can downgrade. We always have 2 testers for a final test - is this vital? Sites which have vital failures will seldom reach 95% anyway.
13:30:28 [vivienne]
Ramon: we try to avoid any numbering of the score. We try to give a subjective opinion - you are doing good, bad, almost accessible, terrible. We don't want to put a number because when the company comes to us and says they have 10%, this means that our website is so terrible that we can't do anything without spending a lot of money
13:30:41 [vivienne]
Katie: you don't have to use numbers
13:31:13 [vivienne]
Ramon: we use 2 columns - severity and frequency - each of them 1-3.
13:31:20 [vivienne]
Shadi: what is the severity of a SC?
13:31:27 [vivienne]
Ramon: it is a subjective analysis
13:31:51 [vivienne]
Katie: these are the 4 critical failure points
13:32:27 [vivienne]
Ramon: even if the alternative text is bad, it may not be a problem. But it is subjective.
13:33:02 [shadi]
zakim, call St_Clair_1
13:33:02 [Zakim]
ok, shadi; the call is being made
13:33:03 [Zakim]
WAI_WCAG_()4:00AM has now started
13:33:05 [Zakim]
+St_Clair_1
13:33:38 [Zakim]
-St_Clair_1
13:33:40 [Zakim]
WAI_WCAG_()4:00AM has ended
13:33:40 [Zakim]
Attendees were St_Clair_1
13:34:14 [vivienne]
Katie: we correlate with FQT & QA; we customize - critical/serious/moderate/not as moderate. This has to be fixed now, next build etc. Because you are tracking, you can use that level. We don't use numbers. Preferably I use critical/serious and moderate. If it is 1 alt text on 1 page, that would be minor.
13:34:21 [vivienne]
Eric: do you look at Frequency?
13:34:41 [vivienne]
Katie: yes, that comes up in tracking - which level, which SC,
13:34:59 [vivienne]
Eric: you don't say green/orange/red/black
13:35:27 [vivienne]
Katie: in my world we have laws attached - section 508 - deals with what you must fix - fix critical first
13:36:22 [shadi]
zakim, call St_Clair_1
13:36:22 [Zakim]
ok, shadi; the call is being made
13:36:23 [Zakim]
WAI_WCAG_()4:00AM has now started
13:36:25 [Zakim]
+St_Clair_1
13:36:36 [vivienne]
Ramon: a project asked this specifically for priority table - we combined severity/frequency with impact that the fix would have.
13:36:58 [Zakim]
-St_Clair_1
13:36:59 [Zakim]
WAI_WCAG_()4:00AM has ended
13:36:59 [Zakim]
Attendees were St_Clair_1
13:37:26 [vivienne]
Ramon: we include whether it is easy or hard to fix. The priority may be that the impact of fixing it would be great and how easy it would be to fix.
13:37:47 [vivienne]
David: priority, frequency impact and effort
13:37:54 [shadi]
zakim, call St_Clair_1
13:37:54 [Zakim]
ok, shadi; the call is being made
13:37:55 [Zakim]
WAI_WCAG_()4:00AM has now started
13:37:56 [Zakim]
+St_Clair_1
13:40:50 [Detlev]
Vivienne: pass / fail / N.A. / not tested
13:41:16 [Detlev]
Vivienne: score of percentages across pages of pass/fail
13:41:36 [Detlev]
Vivienne: that was for professional work
13:42:10 [Detlev]
Vivienne: for research, create an average score across pages to be able to compare sites
13:42:41 [Detlev]
Vivienne: needs quantitative scores of libraries, retest to check if repairs have been done over time
13:43:19 [Detlev]
Vivienne: so two different worlds
13:43:30 [Detlev]
Vivienne: clients love charts
13:44:03 [Ryladog]
Ryladog has joined #eval
13:44:46 [Detlev]
Vivienne: for research, adding up violations (any of 4 critical points in conformance criteria) gets an extra 5 points to add significance
13:46:33 [Detlev]
Vivienne: in reporting, a hint of whether it is a global or individual problem (shared page problem)
13:48:26 [Detlev]
Vivienne: for commercial work, per page there will be pass/fail for every SC, which then gets aggregated across pages
13:49:49 [vivienne]
David: accessibility differs for each client based on their goals. For web applications it changes again. We had a template with all of the SC, and you had an example the first time you ran into the error; we report that we've found an error - tells the client to go through and fix those issues
13:52:13 [vivienne]
David: report has an executive summary which summarizes - whether they have the skills to fix things, top/priority issues - if you fix these 5 issues a whole bunch of stuff gets better. Then has a table with 1,2,3 level priorities - right away, next - based on effort and impact. Fix these and the site gets better quicker. Don't provide a score - except Government of Canada. They have to report to the courts.
13:53:30 [vivienne]
David: e.g. 1.3.1 has a huge impact - if it has the same weight it messes up the severity of the impact
13:54:24 [vivienne]
Ramon: if they pass most of A, they get a score of 100, but when they try to go to the higher level, it lowers the score
13:55:17 [vivienne]
Katie: priority for this methodology should be on compliance level.
13:56:36 [vivienne]
Ramon: now the levels are more reflective on the difficulty, the unusualness, ability to comply
13:57:56 [vivienne]
David: if you're going to give a score and a percentage, some people are adamant they get a score so they can compare to other organisations. Some points are more important to some people than others. Some people on the WCAG group will question the scoring method.
13:59:14 [vivienne]
Shadi: 1.3.1 occurs so frequently. It can come up 100 times on a page - tables/headings etc. Out of those 100 occurrences, how many failed?
13:59:27 [vivienne]
David: usually if they just get 1 wrong, they may be doing it all over the site
13:59:48 [vivienne]
Shadi: then it will occur systematically so the numbers will go up
13:59:56 [vivienne]
Katie: you would still get a fail overall
14:00:16 [vivienne]
Shadi: is it real world that pages are really good quality and just the 1.3.1 is badly marked up
14:00:34 [vivienne]
David: 1.3.1 happens disproportionately in terms of the websites.
14:02:05 [vivienne]
Shadi: it occurs so often on a page. The more complexity you put in, the more subjectivity comes in, which actually lowers the value of the outcome. The more complex the scoring system and the more parameters it has, the more inclined it is towards subjectivity - the easier it is to bias it.
14:02:12 [vivienne]
Katie: maintain simplicity
14:03:51 [vivienne]
David: he gave a client a report based on the TBS and they got 95%, and they think it's great. But all of the errors are in 1.3.1.
14:07:04 [vivienne]
described adding the errors per page and dividing by the number of pages for an average score per page
14:07:37 [vivienne]
David: similar to Government of Canada
14:08:23 [vivienne]
Shadi: described the 3 different approaches
14:09:26 [vivienne]
Shadi: 3rd one - instance: is more like Vivienne's example. For 1.3.1 you could have 100 occurrences where 70 passed and 30 failed, and you can work out the total number of errors over the total possible, which gives you a ratio
14:10:04 [vivienne]
Detlev: would the last one be a way of differentiating between critical and minor failures?
14:10:41 [vivienne]
Shadi: you could have a website that is unusable which could score the same as a website with a lot of decorative images tagged with alt text etc.
14:11:24 [vivienne]
Shadi: what is the purpose of scoring? How to take to court - depending on score.
14:12:12 [vivienne]
Shadi: we need to state clearly that the scores are indicative and used to motivate the developer. EG you're on orange, you invested $5 and now you're on level x and you can see the value of the money you've invested
14:12:37 [vivienne]
Katie: the world I'm in couldn't care about what you do right - they want to know how much trouble they are in
14:13:18 [vivienne]
Katie: only clear violation should be counted
14:14:07 [vivienne]
Ramon: these approaches have the same problem - if you pass from A to AA, the percentages change and the results look worse
14:14:27 [vivienne]
you can state both A and AA scores
14:15:10 [vivienne]
Katie: identify which level you are trying for compliance with. You can say 100% for A and 70% for AA
14:15:39 [vivienne]
Ramon: what is considered critical - contrast may be critical for me
14:16:14 [vivienne]
Ramon: you can't say anything is not critical
14:16:36 [vivienne]
Detlev: we need to work with WCAG - has to be set according to that
14:16:45 [vivienne]
David: there are new tools to enhance the contrast
14:17:08 [vivienne]
Shadi: what are you trying to address
14:17:42 [vivienne]
Eric: at the moment it is just pass/fail - WCAG already describes it. Do we want to add things like severity and impact.
14:19:00 [vivienne]
shadi: those questions are part of the reporting. First of all you have the conformance - pass/fail. Then you come out with the report - which depends upon the evaluation commission as to how much detail you want - what needs fixing, the frequency of the issues - and as eye-candy, optionally, you can get a score, and it doesn't mean that the website is 80% accessible.
14:20:03 [vivienne]
Shadi: this score is just an indicator - this circumstance, on this date, on these pages. So you can compare your own progress. Helps you track your own progress.
14:20:13 [vivienne]
Katie: let's add those 4 critical requirements.
14:20:34 [vivienne]
Ramon: they interfere with the other content
14:20:42 [vivienne]
Katie: doesn't that make them critical then?
14:23:12 [vivienne]
Detlev: if we have this kind of separation and keep it simple on the pass/fail basis, the score is additional and should not be taken as a value of the accessibility of the page. Is there a way of reflecting that imbalance, say with 1.3.1? We have 7 different checkpoints for 1.3.1 - split into several bits so the score has more weight, same with 1.1.1
14:24:51 [vivienne]
Detlev: they wanted to give a higher weight for some of the checkpoints say 1.3.1 - tries to give a relative weight. It then aggregates the results.
14:26:01 [vivienne]
Katie: we have to make the same assumptions WCAG has made - things change. such as less priority on tables, more on interactive controls. We need to be careful of specific ways things are done.
14:27:12 [vivienne]
Shadi: the seriousness can show itself in the number of occurrences - but there are killers no matter how good the page is, e.g. getting stuck
14:28:25 [vivienne]
Shadi: the first 2 types - per website and per page may be too coarse. If you want a score you will have to have a form of evaluation that counts every occurrence.
14:28:38 [vivienne]
Detlev: where do you put your time - counting and looking at every image
14:29:02 [vivienne]
Eric: we just indicate if there are errors, and give them a few examples.
14:29:36 [vivienne]
Katie: you need to decide based on business cost as well
14:30:40 [vivienne]
Ramon: we say - global issue - lists without the proper markup. You can go and fix them. If we find 3 wrong in 30 pages, we don't test all of the pages to look for more, we assume that the developers don't know how to do them and they need to go and check them themselves.
14:31:55 [vivienne]
David: regarding Shadi's point - you count up the instances and see how many pass or fail. Can you come up with a % number without doing that? Is that where you want to spend your accessibility budget - counting up what is right?
14:33:01 [vivienne]
Detlev: it is much more important to be able to home in on those things that are vital - e.g. a search button image with no alt text.
14:34:13 [vivienne]
Shadi: is the conclusion that the first 2 - per website or per page - are too coarse - more misleading than beneficial? The other is per occurrence, but the administrative overhead is too high - you are using budget to count those things that work. Perhaps provide a hybrid - certain checkpoints have more points and have a point system.
14:35:19 [vivienne]
David: the 4 points were picked because if you fail them, you can't access other content
14:35:29 [vivienne]
Shadi: to drop the scoring completely is the 4th option
14:36:05 [vivienne]
Katie: what about per page?
14:36:58 [vivienne]
Shadi: at AA there are 38 possible success criteria and you sample 10 pages. On page 1 you fulfilled 14, on page 2 you fulfilled X. You get an average per page.
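The per-page averaging Shadi describes, and the per-instance ratio from earlier in the discussion, can be sketched as follows (a hypothetical illustration - the function names and all sample figures other than the 70/30 occurrences and the 14-of-38 page result mentioned in the minutes are invented for the example):

```python
# Two scoring granularities discussed in the minutes:
# - per page: fraction of applicable success criteria satisfied,
#   averaged over the sampled pages
# - per instance: passed occurrences / total occurrences of one SC

def per_page_score(pages):
    """pages: list of (satisfied, applicable) success-criterion counts."""
    return sum(s / a for s, a in pages) / len(pages)

def per_instance_ratio(passed, failed):
    """Pass ratio for one SC, e.g. 70 passed vs 30 failed occurrences."""
    return passed / (passed + failed)

sample = [(14, 38), (30, 38), (38, 38)]       # hypothetical per-page results
print(round(per_page_score(sample), 2))
print(per_instance_ratio(70, 30))             # the 1.3.1 example
```

As Detlev notes next, the per-page average is sensitive to how many pages are sampled: adding clean pages to the sample dilutes the weight of one severe problem.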
14:39:09 [vivienne]
Detlev: it can severely dilute problems - picking more pages. The more pages you check, the less major the impact of the 1 huge problem on 1 page.
14:40:04 [vivienne]
Detlev: if it is an issue just on 1 page, then this is an indication of the overall impact
14:41:06 [vivienne]
Shadi: we would need to see how sensitive the score is towards changes in the sample.
14:42:08 [vivienne]
Detlev: it is an issue if you use a score because people want to get a seal (90%) or (95%) for really good. It gives them an impetus for increasing the number of pages tested to water down the results
14:42:25 [vivienne]
Shadi: we have to make sure the score is not a measure of conformance and not a measure of accessibility.
14:42:32 [vivienne]
Katie: then we have to say very clearly what it is
14:42:54 [vivienne]
shadi: only used for looking at your own performance - we need to be really clear.
15:13:10 [Detlev]
scribe: Detlev
15:14:09 [ericvelleman]
ericvelleman has joined #eval
15:15:07 [jkiss]
jkiss has joined #eval
15:19:44 [Detlev]
Session No. 4 : Requirements for uniform accessibility support
15:20:48 [Detlev]
Looking at 3.1.4 Step 1.d
15:21:04 [Judy]
Judy has joined #eval
15:21:20 [Detlev]
Eric: example of range of AT: 5 browsers, 3 types of assistive technology
15:22:44 [Detlev]
People may look at different scenarios (UA/AT) to make things work
15:23:56 [Detlev]
Ramon: example of using one screen reader instead of another
15:24:35 [Detlev]
Eric: for some websites, you don't have a choice (tax website)
15:25:06 [Detlev]
Ramon: W3C's definition of accessibility support is loose
15:25:55 [Detlev]
Katie: WCAG says you choose the technologies and AT to ensure it works across the site
15:25:59 [David_MacD_Lenovo]
http://lists.w3.org/Archives/Public/public-wai-evaltf/2012Sep/0020.html my comments are here... do not think consistent support should be required...
15:27:30 [Detlev]
Shadi: you could have a site accessible with one set of tools and another with a different set of tools, but it doesn't happen often
15:28:53 [Detlev]
Ramon: explains problem with PDF not being accessible on the Mac - does that constitute sufficient accessibility support (as you may install a virtual machine)?
15:29:24 [Detlev]
WCAG was deliberate in not nailing down the required level of support due to fast-changing technologies
15:30:38 [Detlev]
David: As long as something works it is sufficient (scribe: not sure if that is David's position rendered correctly)
15:31:24 [Detlev]
Katie: not offering alternatives for PDF that work on Mac would be a failure
15:31:32 [Detlev]
Ramon / David seem to disagree
15:32:50 [Detlev]
Vivienne: Australian Government would require an alternative version for PDFs (but this is not the WCAG position)
15:34:25 [Detlev]
David: Leaving technologies out was conscious decision because it could otherwise have created disincentives for technology developers
15:35:49 [Detlev]
Shadi: Partial lack of support cannot be the benchmark for establishing accessibility support
15:36:47 [Detlev]
Vivienne: Australian government has applications within websites that only work in Windows - was a government decision
15:37:20 [Detlev]
Shadi: any other examples with conflicting sets of technologies for different parts of a site?
15:37:57 [Detlev]
Vivienne: There were instances where Firefox did things that did not work in other browsers / AT
15:40:38 [Detlev]
Detlev: Case of WAI-ARIA not being available to many people at the workplace
15:41:06 [Detlev]
David: Makes case for not requiring any technologies
15:42:13 [Detlev]
David: example of many different web teams at a large department or company; it is difficult to get all teams to do the same things and apply the same set of UA/AT in their tests
15:44:19 [Detlev]
Katie: the testing methodology should not only test a site if the level of technology support is even - it is not the role of the methodology to mandate it; results are nevertheless helpful - uniformity not required
15:44:41 [Detlev]
Shadi: take this point back to the WCAG Working Group
15:44:56 [Detlev]
Katie: still important to list what has been used for testing
15:45:40 [Detlev]
Shadi: Differences in accessibility support that have been discovered should enter reporting (where are the weaknesses)
15:48:50 [Detlev]
Create new issue: Develop a concept for reporting accessibility support across the website
15:50:38 [Detlev]
Katie: we should mandate a minimum set (number of tools) used
15:51:33 [Detlev]
Shadi: Step 1d Define the context of website use - define tools used in testing - that may need to be changed
15:52:03 [Detlev]
Shadi: you may come to a piece of content that only works in another context - what is the impact of that?
15:52:54 [Detlev]
Katie: Multiple operating systems should be considered - may be based on data on the most common OSes, browsers, AT used
15:54:05 [Detlev]
Shadi: Step 1d came up to define a baseline for the developer
16:04:21 [Detlev]
Shadi: Defining technologies is fine, but they may be extended by technologies used in the site as they are discovered
16:05:23 [Detlev]
Shadi: similar approach for tools - start with the most common, then extend to other tools/AT to see if it works there
16:06:39 [Detlev]
Detlev: Does that mean as long as any tool/AT out there supports it, it meets WCAG conformance?
16:07:04 [Detlev]
Shadi: yes, technically, though it is not best practice; should be noted in the report
16:07:34 [Detlev]
Vivienne: make suggestion for better practice
16:07:50 [Detlev]
Detlev: is that the WCAG WG position?
16:08:31 [Detlev]
David: at least for one page the same tools should work throughout, seems to be WCAG WG position
16:08:57 [Detlev]
David: Discernible sections of a website (subsites, etc.) should work together on the same AT
16:09:08 [Detlev]
David: not swap within tasks
16:09:48 [Detlev]
David: Uniform AT support across subsites / chunks / tasks of a website might be WG position
16:10:55 [Detlev]
Shadi: Maybe WCAG WG can define it more clearly; don't do it in EVAL TF - ask for the WCAG WG opinion here
16:11:45 [Detlev]
Shadi: agreement that a uniform level of AT support is at least per page, and per function/widget/transaction
16:12:17 [Detlev]
David: Shadi, write down as bullet points and put it into a WCAG WG survey to clarify this
16:13:03 [Detlev]
Ramon: a website owner can bypass the uniformity requirement by commissioning two different evaluations
16:13:55 [Detlev]
Vivienne: Library and website may have different levels of AT support; the library could be singled out for testing
16:14:13 [Detlev]
Ramon: problem is AT support / tools clash
16:15:10 [Detlev]
David: We could make statement in Understanding document to explain AT support
16:15:46 [Detlev]
David: Problem of large organisations where that uniformity is not possible
16:16:36 [Detlev]
Shadi: Uniformity may hold back technology development if new parts of a site have to keep in line with the old status
16:17:35 [Detlev]
Vivienne: Government agencies purchase parts from others which look and work completely different from the rest
16:18:38 [Detlev]
Shadi: We should make sure that a site does not need completely different sets of tools to access the site
16:18:54 [David_MacD_Lenovo]
set of Web pages:.. collection of Web pages that share a common purpose and that are created by the same author, group or organization
16:18:55 [David_MacD_Lenovo]
Note: Different language versions would be considered different sets of Web pages.
16:19:00 [Detlev]
Katie: the market in the end determines what will be used
16:19:46 [Detlev]
David: as a proposal to present to WCAG for uniform AT support
16:23:41 [Detlev]
Shadi: Question to WCAG WG: What is the WG position on the intent of accessibility support 1) within individual web pages, 2) for complete processes, 3) for sets of pages, 4) across entire collections of pages (websites)?
16:27:30 [Detlev]
Katie: Examples: a) form on a web site that only works on the Mac, b) a calendar widget that only works in Firefox c) WAI-ARIA roles that are only supported in specific AT
16:33:28 [Detlev]
Detlev: for many SC it does not matter which tool/platform has been used
16:34:02 [Detlev]
Katie: mandating the use of more than one tool is still useful
16:34:15 [shadi]
issue: Develop a concept for reporting accessibility support across the website
16:34:57 [trackbot]
trackbot has joined #eval
16:35:01 [shadi]
issue: Develop a concept for reporting accessibility support across the website
16:35:01 [trackbot]
Created ISSUE-10 - Develop a concept for reporting accessibility support across the website ; please complete additional details at http://www.w3.org/WAI/ER/2011/eval/track/issues/10/edit .
16:35:52 [shadi]
action: eric to draft question to WCAG WG about the intent of accessibility support
16:35:52 [trackbot]
Created ACTION-7 - Draft question to WCAG WG about the intent of accessibility support [on Eric Velleman - due 2012-11-05].
16:42:12 [Detlev]
David: operators are mutually talking about APIs so things might improve
16:42:49 [Detlev]
Different positions whether WCAG-EM should be independent of technology change or not
16:43:15 [Detlev]
Katie: Developers find new ways of AT support
16:43:31 [Detlev]
Eric: Wrap up of today
16:44:32 [Detlev]
David: Good process, will feed back into larger group for decisions
16:44:53 [Detlev]
Vivienne: Good discussions, more details on how everyone does things and the different views
16:45:47 [Detlev]
Katie: Was great, good
16:47:14 [Detlev]
Katie: no different expectations for tomorrow; good that we came to formulate clear questions for the WCAG WG (about AT support) - a survey of what people use would be interesting
16:49:26 [Detlev]
Ramon: Good learning experience, reflects many things in Technosite, similar to discussions there - only negative point is concern about excluding minorities that have specific requirements - drawing a line may include them
16:49:59 [Detlev]
Katie: agrees that all disabilities (not just blindness) should be included
16:51:13 [Detlev]
Detlev: Lively discussion, can stay that way today
16:52:43 [Detlev]
Shadi: was a good discussion in a rather small group, good to have a high-level discussion - teleconferencing wouldn't have cut it for that type of discussion - tomorrow we need to focus more on the presentational side of the methodology
16:53:21 [Detlev]
Shadi: Facilities for CSUN in March
16:54:37 [Detlev]
Discussion about CSUN
16:56:47 [Detlev]
Eric: Is happy with the input and discussions, good food for thought for the editor draft - could have more hands-on work tomorrow
16:58:33 [ericvelleman]
ericvelleman has left #eval
16:59:01 [Detlev]
Trackbot, end meeting
16:59:01 [trackbot]
Sorry, Detlev, I don't understand 'Trackbot, end meeting'. Please refer to http://www.w3.org/2005/06/tracker/irc for help
16:59:14 [Detlev]
trackbot, end meeting
16:59:14 [trackbot]
Zakim, list attendees
16:59:14 [Zakim]
As of this point the attendees have been St_Clair_1
16:59:22 [trackbot]
RRSAgent, please draft minutes
16:59:22 [RRSAgent]
I have made the request to generate http://www.w3.org/2012/10/29-eval-minutes.html trackbot
16:59:23 [trackbot]
RRSAgent, bye
16:59:23 [RRSAgent]
I see 3 open action items saved in http://www.w3.org/2012/10/29-eval-actions.rdf :
16:59:23 [RRSAgent]
ACTION: Will check with Canadian Treasury Board Secretariat's web-site statistical audit guidance folks - to help us make a determination on sample size [1]
16:59:23 [RRSAgent]
recorded in http://www.w3.org/2012/10/29-eval-irc#T11-24-28
16:59:23 [RRSAgent]
ACTION: Jason Kiss will check with Canadian Treasury Board Secretariat's web-site statistical audit guidance folks - to help us make a determination on sample size [2]
16:59:23 [RRSAgent]
recorded in http://www.w3.org/2012/10/29-eval-irc#T11-25-18
16:59:23 [RRSAgent]
ACTION: eric to draft question to WCAG WG about the intent of accessibility support [3]
16:59:23 [RRSAgent]
recorded in http://www.w3.org/2012/10/29-eval-irc#T16-35-52