Checking for similarity or difference: Is the list of applicable success criteria complete? If not, a phrase like "for example" should be added.

Frank Berker (talk) 10:11, 3 March 2016 (UTC)

This page is a general introduction to different sampling methods. It does not address the specific requirements related to multi-page testing that we have identified in auto-wcag.

The challenge is that some success criteria (SCs) can only be judged by inspecting multiple pages. There seem to be two types of scenarios:

  • The SC requires all web pages to be similar (e.g. SC 3.2.3 Consistent Navigation).
  • The SC requires all web pages to be different (e.g. SC 2.4.2 Page Titled).

The questions are:

  1. How many pages have to be compared?
  2. How are the results reported? (Is there only one outcome for the whole set of web pages? Or is there an outcome for each page?)
  3. If there is an outcome for each page: Do the pages get different outcomes? How do we define thresholds for the two scenarios mentioned above?
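To make questions 2 and 3 more concrete, here is a minimal sketch of how the two scenarios could produce either one outcome per page or a single outcome for the whole set. Everything in it is an assumption for illustration only: the function names, the "passed"/"failed" outcome values, the idea of comparing titles by exact string match, and the choice of the most common navigation order as the reference. None of this is an agreed auto-wcag definition.

```python
from collections import Counter


def check_titles_differ(titles):
    """'Different' scenario: a page fails if another page has the same title.

    titles: dict mapping URL -> page title (hypothetical input format).
    """
    counts = Counter(titles.values())
    return {
        url: "passed" if counts[title] == 1 else "failed"
        for url, title in titles.items()
    }


def check_navigation_similar(navs):
    """'Similar' scenario: a page fails if its navigation order deviates
    from the most common order in the sample (assumed reference).

    navs: dict mapping URL -> tuple of navigation link labels.
    """
    reference, _ = Counter(navs.values()).most_common(1)[0]
    return {
        url: "passed" if nav == reference else "failed"
        for url, nav in navs.items()
    }


def aggregate(per_page):
    """Collapse per-page outcomes into one outcome for the whole set."""
    if all(outcome == "passed" for outcome in per_page.values()):
        return "passed"
    return "failed"
```

With per-page reporting, two pages sharing a title both fail, while a page with a deviating navigation fails alone. With aggregated reporting, a single failing page fails the whole set. Whether that is the desired behaviour, and how many pages need to be in the sample, is exactly what the questions above are asking.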

Comments on the sampling ideas:

  • Why is the approach from "A sampling method based on URL clustering for fast web accessibility evaluation" recommended? (Is this a W3C endorsement? In the study of related work carried out for EIII, I compared several approaches, and this one did not look particularly promising, mainly because it relies on strong assumptions that will only hold for a subset of web sites.)
  • In statistics, "weak" is a technical term related to the statistical power of a statement. Is this what you had in mind when introducing the term "weak sample"? If not, please use another word instead.

Annika Nietzio (talk) 14:51, 10 November 2015 (UTC)