14:54:57 RRSAgent has joined #eval
14:54:57 logging to http://www.w3.org/2014/02/13-eval-irc
14:54:59 RRSAgent, make logs world
14:55:01 Zakim, this will be 3825
14:55:01 ok, trackbot; I see WAI_ERTWG(Eval TF)10:00AM scheduled to start in 5 minutes
14:55:02 Meeting: WCAG 2.0 Evaluation Methodology Task Force Teleconference
14:55:02 Date: 13 February 2014
14:55:13 zakim, this will be eval
14:55:13 ok, shadi; I see WAI_ERTWG(Eval TF)10:00AM scheduled to start in 5 minutes
14:55:23 zakim, who is on the phone?
14:55:23 apparently WAI_ERTWG(Eval TF)10:00AM has ended, shadi
14:55:24 On IRC I see RRSAgent, Kathy, shadi, Bim, Zakim, trackbot
14:57:35 WAI_ERTWG(Eval TF)10:00AM has now started
14:57:42 +Kathy_Wahlbin
14:57:54 +Mike_Elledge
14:58:49 Mike_Elledge has joined #eval
14:59:37 EricVelleman has joined #eval
14:59:52 + +1.301.990.aaaa
15:00:22 +Shadi
15:00:30 Detlev has joined #eval
15:00:43 zakim, aaaa Liz
15:00:43 I don't understand 'aaaa Liz', shadi
15:00:46 zakim, aaaa is Liz
15:00:46 +Liz; got it
15:01:07 +EricVelleman
15:02:10 regrets: Moe, Alistair
15:02:30 +Detlev
15:02:38 regrets: Sarah, Moe, Alistair
15:02:46 Zakim, mute me
15:02:46 Detlev should now be muted
15:03:32 scribe: Mike_Elledge
15:03:34 chair: Eric
15:03:38 Vivienne has joined #eval
15:04:01 agenda: http://lists.w3.org/Archives/Public/public-wai-evaltf/2014Feb/0004.html
15:04:25 Topic: Test-run survey proposal
15:04:28 Eric sent around a draft questionnaire
15:04:33 +[IPcaller]
15:04:38 http://lists.w3.org/Archives/Public/public-wai-evaltf/2014Feb/0006.html
15:04:42 zakim, IPcaller is me
15:04:42 +Vivienne; got it
15:04:49 Received a few comments and included them
15:04:58 Uploaded new version
15:05:36 E: Looked at methodology, tried to replicate most important steps.
15:06:14 E: If could put answers into spreadsheet could go through it easily.
15:06:34 zakim, mute me
15:06:34 Shadi should now be muted
15:06:55 E: Intro: where to find working draft. Note how long it may take.
That it is confidential, published anonymously.
15:07:07 K: What are we using?
15:07:13 E: Qualtrics.
15:07:26 K: Has some a11y bugs that they're working on.
15:07:56 E: In Q can indicate what is/not accessible. So only used items called accessible.
15:07:57 gavinevans has joined #eval
15:08:35 K: Items that are accessible may take users to external pages that give directions for screen reader users. Some bugs still exist.
15:08:44 K: Will test drive survey.
15:09:25 E: Has numbered questions. Not logical numbers, but clearly indicated in tool.
15:09:48 E: Enter name, practical experience, days evaluating in 2013 (indicates experience) and email address.
15:09:52 + +44.179.281.aabb
15:10:03 q+
15:10:14 E: Some comments not to ask for email address. Will suggest making information optional.
15:10:28 K: Please enter email address or phone number so can contact you.
15:10:39 q-
15:10:48 q+
15:11:03 K: Ask for email or phone number so we can contact you.
15:11:48 E: Survey follows WCAG Steps. Pls remember not testing you. Follow methodology as closely as possible. Then directions for how to save contents.
15:12:24 E: Only saves, however, if you press Save before going back. Otherwise lose everything.
15:13:04 M: Is it possible to show a warning message?
15:13:14 E: Will look into it.
15:13:54 Are you using the queue today or are we supposed to speak up queueless?
15:14:03 E: Scope. Q105 Eval scope. Part we have to fill in. Should do on mailing list. Scope of evaluation. Put @ signs to indicate we need input. Need more work.
15:14:06 ack me
15:14:29 zakim, who is on the phone?
15:14:29 On the phone I see Kathy_Wahlbin, Mike_Elledge, Liz, Shadi (muted), EricVelleman, Detlev, Vivienne, +44.179.281.aabb
15:14:39 D: Comment to Q101. mentions w
15:14:49 Q101 How much practical experience do you have evaluating websites using the WCAG2.0 guidelines (or other schemes / regulations based on WCAG 2.0)?
15:15:12 zakim, aabb is probably Gavin
15:15:12 +Gavin?; got it
15:15:32 yes is me
15:15:37 q+
15:15:43 D: WCAG change to one pasted into survey. Test against German translation BITV.
15:15:49 ack me
15:16:41 S: Maybe we can reformulate question. How much experience eval websites.
15:17:02 D: Just to make it clear that it counts. Other countries have their own regs based on WCAG.
15:17:30 S: Methodology does rely on WCAG 2.0. If using something else would be apples/oranges.
15:17:43 D: Why said or other schemes based on WCAG 2.0.
15:17:49 S: What's purpose of question?
15:18:09 Zakim, mute me
15:18:09 Detlev should now be muted
15:18:24 E: If people get different results is it caused by different pages or their experience. Not something that's definitive, but gives context.
15:18:39 E: If using differently may get different results.
15:18:53 S: # Days gets at it too.
15:19:17 E: Could use one question, but it's kind of a control. Great experience may be 5 days to some people.
15:19:36 S: Not disagreeing with Detlev, just wonder if it will add confusion.
15:19:52 q+
15:20:01 q+
15:20:08 E: Could put some info that if persons use a national version with same checkpoints and success criteria would be okay.
15:20:15 ack kathy
15:20:47 I agree with Kathy. Maybe a scale? 1-25, 26-50 etc?
15:21:04 K: How many days evaluating websites WCAG 2.0. Do all the time, but have no idea. Suggest # of websites and perhaps ranges.
15:21:10 M: +1
15:21:15 +1
15:21:16 list or range
15:21:23 E: Could turn into bulleted list.
15:21:37 q+
15:21:51 queue
15:21:53 q- last
15:21:58 E: Q104. The intro to eval scope.
15:22:05 S: Gavin first.
15:22:14 ack gavin
15:22:38 G: How many websites. Could do one, but could be very large with sections. Should take into consideration.
15:22:49 E: Will come up with proposal.
15:23:02 approx. number of days/year may be more expressive
15:23:02 S: Back on requirements for scope?
15:23:27 S: Base line seems too big. What we would usually expect?
15:23:29 ack me
15:23:42 s/big/vague
15:23:43 E: Just typed something there.
15:23:58 E: Could put very specific things there.
15:23:59 q+
15:24:13 zakim, mute me
15:24:13 Shadi should now be muted
15:24:32 q+
15:24:39 ack gavin
15:24:50 G: Massive? Gets down to browsers, versions, etc. Important that mentioned HTML, ARIA, earlier versions of AT important as well.
15:24:58 ack kathy
15:25:26 K: Also add asking what kind of applications have been done, i.e., mobile. Gives indication of level of evaluations they've done.
15:25:48 q+
15:26:00 ack me
15:26:03 E: Will have to spend more time on this on list. Have to decide accessibility benchmark.
15:26:49 S: Have to be precise. Enumerate browsers and assistive technologies and combinations, to really have list that we assume that website should work for.
15:27:10 q?
15:27:20 S: Isn't that how you would do testing? Define what needs to work and that baseline is addressed during testing.
15:27:38 K: Have to prioritize though. There are so many combinations and limited budgets.
15:28:05 q+
15:28:13 zakim, mute me
15:28:13 Shadi should now be muted
15:28:16 zakim, ack me
15:28:16 I see no one on the speaker queue
15:28:17 S: Agree, can make it short list. But need something so we have a target we're shooting for. Need to define minimum bar.
15:28:24 +Tim_Boland
15:28:46 ack me
15:28:53 V: When we test give client a questionnaire if there is specific requirement they have. Ex Aus gov't site may require only IE 9.
15:28:56 zakim, mute me
15:28:56 Vivienne should now be muted
15:29:09 zakim, mute me
15:29:09 Shadi should now be muted
15:30:04 q+
15:30:12 zakim, mute me
15:30:12 Shadi was already muted, shadi
15:30:25 E: If we test a website that is only required to be IE6, then will be only
15:30:32 E: that.
15:30:34 ack me
15:31:35 S: Do have a definition of that in methodology. Can't use an intranet website so can test. Assume it will be English as well. Might be dependent on type of website.
15:31:44 E: Propose not to test Dutch website!
15:31:53 S: Put that in definition.
15:31:57 zakim, mute me
15:31:57 Shadi should now be muted
15:32:42 E: Let's come back to scope on mailing list. Decide on what site we want to use, part, baseline, target, etc. Will be easier once we have website in front of us.
15:33:34 [["how many minutes" -> "how much time"?]]
15:33:34 E: Next section. Technologies relied upon (can be selected in Qualtrics). How much time to do, easy to do, comments, etc. For Step Two.
15:33:39 E: Missing anything?
15:33:41 ack me
15:33:53 q+
15:34:01 S: Just minor. May want to change from minutes to how much time.
15:34:08 zakim, mute me
15:34:08 Shadi should now be muted
15:34:08 zakim, ack me
15:34:09 unmuting Vivienne
15:34:09 I see no one on the speaker queue
15:34:24 V: Step Two, don't we have to identify different templates?
15:34:58 E: Yes, but would have to report on all the different parts that come back in Section 3. In the reporting we put others as optional. This is only non-optional one.
15:35:03 zakim, mute me
15:35:03 Vivienne should now be muted
15:35:32 E: Step 3. Representative sample. Read it then paste exemplar instances, maybe paste URLs or descriptions instead.
15:35:38 +1 to "provide the urls or descriptions ..."
15:35:58 Q: 115. Change to how much time. Then how chose random sample.
15:36:19 T: Have any indication of how big the site is, # pages?
15:36:28 E: We determine scope.
15:37:10 E: We will be telling them which site to review for consistency.
15:37:49 E: Were you able to get representative sample. Not scientific, but indicates comfort level.
15:37:55 T: There is rationale, right.
15:38:24 E: Yes. In line with what usually use. Q119 How easy to follow instructions.
15:38:53 E: Select the URLs, paste them, then go back to them.
15:39:04 martijnhoutepen has joined #eval
15:39:17 q+
15:39:25 E: Step 4. Audit the sample. Introduction, then a page for each SC.
15:39:37 Q 121 not a question so much as a remark.
15:39:38 ack me
15:39:58 S: Thought earlier version was AA. Need to go with that.
15:40:05 agree about using AA
15:40:18 E: Okay.
15:40:34 S: A bit more work, but no way around it. Trying to make it easier.
15:40:57 T: If one goes through it would check A and AA accessibility.
15:41:13 E: If we don't SC1, then all applicable for A and AA.
15:41:27 S: Maybe 36, for total, not each.
15:41:30 +1
15:41:34 zakim, mute me
15:41:34 Shadi should now be muted
15:41:45 E: Agree to expand to WCAG 2.0 AA.
15:42:13 E: Please read for audited sample. Then real tests. Have to decide, easy or hard work for E.
15:42:39 +??P10
15:42:56 zakim, ??P10 is martijnhoutepen
15:42:56 +martijnhoutepen; got it
15:43:13 q+
15:43:17 q+
15:43:30 q- last
15:43:33 q+
15:43:33 q- later
15:43:39 q-
15:43:45 zakim, ack me
15:43:45 unmuting Vivienne
15:43:45 E: Radio buttons or check boxes. This is relevant because on a page doesn't pass. But next page fails. Do we want possibility for marking page pass...
15:43:45 q+
15:43:46 I see Detlev on the speaker queue
15:44:26 V: We do both. First page is p, f, na and summarizes them according to SC
15:44:46 E: Can't use radio buttons.
15:44:57 ack me
15:44:57 V: Dropdown for p, f, na?
15:45:02 zakim, mute me
15:45:02 Vivienne should now be muted
15:46:07 q+
15:46:40 D: What would be the purpose for recording each p, f for each page. May not be same pages. Not comparable. Someone may have made different decisions. Can't really compare, unless go back to every page. What would work is for people to report common problems. Difficult to rate 1.3.1 for some reason. Detailed results for each page won't be usable.
15:47:06 E: Thought if we do it that way would be pushing people into one direction. Wanted to give them the possibility.
15:47:34 D: But will we be able to collect meaningful information? Comments, not necessarily specific results.
15:47:51 E: So you would prefer radio button that opens comment field?
15:48:13 D: Need a comments field, not so much radio button or checkbox.
15:48:55 D: Checking P/F no way to calculate results. Maybe one P/F for entire site. Can process since not dependent on particular pages.
15:48:58 ack me
15:49:33 Zakim, mute me
15:49:33 Detlev should now be muted
15:49:46 S: I agree with Detlev. Particularly since later in process want to see how the procedure worked. Which page would require more analysis. Would skew questions about overall performance.
15:49:47 zakim, mute me
15:49:47 martijnhoutepen should now be muted
15:50:22 S: Step 5 A, looking at minimum requirements,
15:50:34 http://www.w3.org/TR/WCAG-EM/#step5a
15:51:10 q+
15:51:10 S: Reading from Step 5, right after list there is a note, need to decide what the desired granularity is. Can't compare two different levels. Need to set in scope.
15:51:18 q+
15:52:05 S: Would like to simplify so have greater chance of people completing the test run. Table might work. Show SC with PF comments.
15:52:42 S: Rather than by SC have it done by entire website. Otherwise do by page.
15:53:17 E: But if don't put that level of specificity may wind up with differing, but interesting results.
15:53:45 E: Need people to paste where people found failures. If all pass, one fail, can see what difference is.
15:54:45 S: If you want to see a long vs. short version, maybe we should have two tests. Was not clear to me level of granularity wanted.
15:55:02 S: Should not leave it open, otherwise won't know what we get.
15:55:36 E: But we have a method. If people have methodology A, B, C then we won't know how they're interpreting our methodology.
15:55:43 q?
15:56:04 zakim, mute me
15:56:05 Shadi should now be muted
15:56:06 S: Trying to test too many aspects? Clarity, efficiency, confidence. Lots of parameters.
15:56:07 zakim, ack me
15:56:07 unmuting Vivienne
15:56:08 I see Detlev, Mike_Elledge on the speaker queue
15:57:01 V: Understand what Detlev was saying, part of it is seeing how people will use methodology. Would expect them to use all SC on each page. If they don't, won't be using it as we envision.
15:57:11 ack me
15:57:19 V: If they don't use each of guidelines then how do we know what they'll do in practice?
15:57:19 zakim, mute me
15:57:19 Vivienne should now be muted
15:59:44 D: Practical testing refers to WCAG techniques as one approach. In practice people break items into individual checkpoints in 1.3. If we don't have that we won't know what causes 1.3 to be rated as fail. If we stipulate that any failure leads to failure of page then we will be consistent and comparable. Whether it will be meaningful is another question. If we use checkboxes then we can see there is variability, which would be useful.
16:00:01 q?
16:00:30 M: I'm confused.
16:00:32 ack mike
16:00:42 ack me
16:01:12 E: We decided not to include in document that sometimes an error is okay. Every error is an error.
16:01:44 E: Rest will cover next week. You can send ideas for websites.
16:02:59 S: CSUN. Contract is still not yet signed. Scheduled to have room for both Monday and Tuesday. About 90% chance we are meeting Monday or Tuesday.
16:03:15 See you next week.
16:03:15 bye
16:03:15 good night all
16:03:16 many thanks bye
16:03:17 bye
16:03:18 -Tim_Boland
16:03:19 -Kathy_Wahlbin
16:03:20 -Liz
16:03:23 -Vivienne
16:03:24 -martijnhoutepen
16:03:24 -Mike_Elledge
16:03:26 -Detlev
16:03:26 -Gavin?
16:03:27 -EricVelleman
16:03:27 -Shadi
16:03:27 WAI_ERTWG(Eval TF)10:00AM has ended
16:03:27 Attendees were Kathy_Wahlbin, Mike_Elledge, +1.301.990.aaaa, Shadi, Liz, EricVelleman, Detlev, Vivienne, +44.179.281.aabb, Gavin?, Tim_Boland, martijnhoutepen
16:03:28 EricVelleman has left #eval
16:03:33 gavinevans has left #eval
16:07:41 trackbot, end meeting
16:07:41 Zakim, list attendees
16:07:41 sorry, trackbot, I don't know what conference this is
16:07:49 RRSAgent, please draft minutes
16:07:49 I have made the request to generate http://www.w3.org/2014/02/13-eval-minutes.html trackbot
16:07:50 RRSAgent, bye
16:07:50 I see no action items