WCAG 2.0 Evaluation Methodology Task Force Teleconference -- 23 Feb 2012

Changes to the Methodology draft

<Detlev> Look at the last mail from Tim Borland: the latest W3C WAI ATAG2.0 draft success criteria satisfaction options for conformance are: yes, no, not applicable

<shadi> http://www.w3.org/WAI/ER/methodology/

Eric: new version with change log

<Detlev> yes

Eric: Changes from Kerstin, added some things Richard reconnect

Is there more than pass and fail?

Eric: Pass and fail discussion - do we want N/A as in WCAG 1

<Detlev> According to Tim, the latest W3C WAI ATAG2.0 draft success criteria satisfaction options for conformance are: yes, no, not applicable

Shadi: W3C thought N/A is ambiguous. Content that does not need features conforms by virtue of not needing features
... If people feel strongly we could define N/A. We would need to make N/A almost, but not exactly, a Yes

Detlev: Please see comment on irc earlier. I think N/A should be included because it is already acceptable to W3C and it makes sense to our end users

Shadi: Reminds that W3C WCAG2 decided against N/A. We need to check

Vivienne: Giving too many passes suggests site owners are nearer to conformance than they are, but N/A allows them to better understand the real state of their website

Eric: Does Shadi know of any discussions?

<Elle> I agree with Vivienne regarding N/A and false positive results - my apologies, all, but I've been called into another meeting - I will follow up and participate via email

Shadi: I will look more. I don't think we should stall. I would say attempt it as a draft and ask for input

Mike: I think N/A very useful - but needs careful definition

Eric: I propose we discuss this over the coming week and see what we come up with
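
As a rough illustration of the option Shadi describes (N/A treated as almost, but not exactly, a Yes), the sketch below records three outcomes and counts N/A separately from passes, so that a report does not overstate how many criteria were really met. The `Outcome` enum and `summarize` function are hypothetical names, not part of the Methodology draft, and the sample results are made up.

```python
from collections import Counter
from enum import Enum


class Outcome(Enum):
    PASS = "pass"
    FAIL = "fail"
    NOT_APPLICABLE = "n/a"  # hypothetical third outcome under discussion


def summarize(results):
    """Count each outcome; N/A does not block conformance but is not a pass."""
    counts = Counter(results.values())
    return {
        "passed": counts[Outcome.PASS],
        "failed": counts[Outcome.FAIL],
        "not_applicable": counts[Outcome.NOT_APPLICABLE],
        # Assumption: only a FAIL prevents conformance.
        "conforms": counts[Outcome.FAIL] == 0,
    }


# Illustrative data only; keys are WCAG 2.0 success criterion numbers.
example = {
    "1.1.1": Outcome.PASS,
    "1.2.1": Outcome.NOT_APPLICABLE,  # e.g. no audio or video on the page
    "2.1.1": Outcome.PASS,
}
print(summarize(example))
```

Keeping the N/A count apart from the pass count is one way to address the concern that too many passes give a misleading picture.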

Three different samples: is that possible?

Eric: Item 3 - three different samples. Some others think the three-sample approach is a good idea. Does anyone else think it good or bad?

<Kerstin> sorry for delay

Detlev: Possible misunderstanding of the difference between sampling web pages and statistical sampling of more general things such as populations. Web pages have a common structure and are limited in basic technology.

Vivienne: I agree with most of what Detlev says. We are auditing the pages - but we are using three ways of choosing the pages we test. It is all about the choosing. Random alone would be useless
... Possible misunderstanding

Eric: We seem to agree that the three sample types are good

Shadi: I think it is a matter of explaining it better. I think Mike believes we suggest using random sampling to fill up the number - this is not the case, so our text needs clarifying

<vivienne> I think I prefer pages

Eric: Resources- Confusion between resources and pages. Sometimes we call a page a resource, other times we say a page contains resources. Any thoughts?

<Detlev> I'd use pages, states of pages, elements

<vivienne> I think a page can include some resources

<shadi> [[A structured sequence of resources (of any type), for example an RDF sequence]]

Eric: I can clarify when I write page or resource.

<MartijnHoutepen> +1 for pages

Shadi: Either we use the two words as interchangeable or we need to rethink

<vivienne> Can we also clarify whether we refer to "URL" or "URI"?

<Mike_Elledge> +1 for page elements

Vivienne: Pages are more understandable to our users. Eric has suggested "elements" as bits within a page. I prefer pages and elements. If we use "resource" it needs careful definition

Eric: I will make some changes to see how it works. We can put resources back if we need to

<Detlev> That was Kathy who said that, not "Kathy"

Shadi: We can put a small clarification of web-page

Evaluation clause 5

Eric: Clause 5 - Evaluation - three levels

<Detlev> +q

Eric: Cut into 1) Order, 2) Criteria - I added stuff here with a separate section about stop criteria - this is for discussion.
... we have to allow for when a series of pages presents nothing new - i.e. we can go somewhere else
... What do you think of stop criteria? 4.3?

Detlev: Not sure I understand what they are. From a conformance point of view just 1 fail might be OK - but it is important to check if the error is truly across the whole site or just in one section
... This would make corrections easier if the user knows which areas need attention and which do not

Eric: Compare global and regional errors - perhaps part of this discussion

Kathy: We mention the point of severity. If an error prevents complete use then everything fails: after evaluating just one Silverlight page it was clear that the whole thing was not accessible - so it was pointless doing four more pages

<Detlev> I'd be more comfortable having fewer pages checked fully...

Eric: We have to decide if it is global or regional.

Kathy: If it becomes absolutely clear that it can't be evaluated then I stop

Kerstin: I don't like "scores". We can check individual items. When I have checked 3 or 4 tables I make a list, give a note and tell the client he needs to check all other instances.

<Kerstin> I don't believe in "scores" ;-)

Eric: It must be possible to say "this is always wrong" - but the impact of some regional errors can be serious

<Detlev> Re-checking is much quicker than the first check!

Vivienne: I just copy the comment each time so it is not a lot of hard work. But I agree that we can tell the user that many pages have the same error and expect the user to check all other occurrences. But if you use a score and want accurate statistics then you have to check and comment on every page

Kerstin: For me Pass/Fail is enough. But for objectivity perhaps we need more

<Detlev> yes.

<Kathy> yes

Eric: Many of you do evaluations for real. Some of you take all pages individually

<Kerstin> don't know why I'm on the speaker queue, don't want to say something

<Detlev> same

<Detlev> yes

Vivienne: Yes, I score every page. I give score points to each error and can total them at the end
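
A minimal sketch of the per-page scoring Vivienne describes, assuming each error type carries a fixed number of points that are summed per page and then overall. The point values, error labels and the `total_score` function are hypothetical and only serve to illustrate the idea; the minutes do not specify a scoring scheme.

```python
def total_score(page_errors, points):
    """Sum the points of every recorded error, per page and overall."""
    per_page = {
        page: sum(points[error] for error in errors)
        for page, errors in page_errors.items()
    }
    return per_page, sum(per_page.values())


# Hypothetical point values and findings, for illustration only.
points = {"missing alt text": 3, "no keyboard access": 5, "low contrast": 2}
page_errors = {
    "home": ["missing alt text", "low contrast"],
    "contact": ["no keyboard access"],
    "about": [],
}
per_page, total = total_score(page_errors, points)
print(per_page, total)
```

A score like this is only meaningful if every page in the sample is checked and recorded, which is the trade-off raised earlier in the discussion.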

Kathy: I also look at all the checkpoints

<vivienne> I do the same as Kathy when I am working for clients

Richard: I check in sequence. I run special checks over a range of pages; for example, check that the keyboard works on a sequence of pages and enables the completion of tasks etc. Then do the same without CSS to see structure etc., and then without images to see if it works without images (there are suitable alt tags)

<shadi> +1

<Kerstin> we must describe the procedure to be sure that the test is reliable

Eric: Detlev's approach is more of a statistical thing; what we and Richard do is more generic.

Detlev: Either way we still fill up the list of SCs

<Kerstin> I'm on the generic side of the evaluators ;-)

Detlev: Good idea to check each element, SC

Kathy: How I do it depends upon the type of site. For a task-based site I do as Richard does, following tasks using different technologies. I also agree with Detlev that we must have a checklist

Vivienne: Different types of page need different methods. You need to actually use the technology - such as a screen reader or keyboard etc. You still follow a structure to make sure you don't miss anything

Eric: We need to cover - stop criteria,

<vivienne> Eric, are we able to share the draft with others to get their viewpoint?

Eric: Please walk through the changes I made and discuss them, also on scoring. It is crucial to get this clear - thanks

Shadi: Some of us are at CSUN but you should do well without us :)
