EARL Introduction & FAQ

What is EARL?

EARL is a notation for recording and sharing evaluations. In particular, it allows one to make claims/criticisms/judgements concerning characteristics of resources, e.g. whether a document or tool conforms to certain criteria.

EARL does not constrain the range of things that can be evaluated. It is a lot like the proverbial "soap box" from which one is free to speak one's mind without prior approval from a central authority. It does, however, provide a vocabulary to facilitate scoped reports.

In general, an EARL evaluation consists of a context and an assertion. The assertion in turn consists of the thing being evaluated, the conformance criteria, and the validity status.

Evaluation ::= quad (

under what other conditions -- computing environment, human exercising judgement, ad lib.

what was evaluated -- any referenceable scope of resource

against what criteria -- published or re-usable assessment instrument as applicable

with what conclusion -- outcome, consistent with the conventions of instrument

)

For example, the context may include creator details, platform, and so on. The thing being evaluated could be a Web page or a tool. The conformance criteria could be something like a WCAG checkpoint or a syntax rule in a schema. The validity status could be as simple as "pass" or "fail", or something more granular that carries, for example, a certain level of confidence.
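
The four-part structure above can be sketched as a small record type. The following is a minimal Python sketch, not EARL syntax: the class name, the field names, the `summarize` helper, and all example values are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Evaluation:
    """One evaluation as a four-part record.

    All names here are illustrative; they are not EARL vocabulary terms.
    """
    context: dict    # under what conditions: evaluator, platform, and so on
    subject: str     # what was evaluated, e.g. the address of a Web page
    criterion: str   # against what criteria
    outcome: str     # with what conclusion, e.g. "pass" or "fail"
    confidence: Optional[float] = None  # optional finer-grained status


def summarize(ev: Evaluation) -> str:
    """Render the assertion part of an evaluation as one line of text."""
    return f"{ev.subject} | {ev.criterion} | {ev.outcome}"


# Placeholder values only; none of them come from a real evaluation.
example = Evaluation(
    context={"evaluator": "Jane Doe", "platform": "example-os"},
    subject="http://example.org/page.html",
    criterion="WCAG 1.0 checkpoint 1.1",
    outcome="fail",
    confidence=0.8,
)

print(summarize(example))
```

Keeping the context separate from the assertion mirrors the description above: two evaluators can make the same assertion about the same page under different conditions.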

Why EARL?

The people who can tell you what works and what doesn't are often not the same people, and are not operating on the same computers, as those who can discern why it did or didn't work in those cases, how to repair defects, and how to extend positive results into wider applicability.

Product developers need to accumulate evaluations from many independent evaluators. Policy monitors need to collect comparable evaluations for many evaluands. Free and machine-comprehensible exchange of this information will allow information collections to reach critical mass and clearly point the way for action.

One practical reason is for authors to claim that their documents, programs, or Web sites satisfy the requirements of various guidelines. Another might be to claim that a markup language satisfies the rules published for creating such languages. Claims may be published within one's own documents, much like a warranty. EARL may also be used for authoring tool and user agent ratings and bug reports, device independence testing and rating, and so on.

Who?

EARL is an experiment of the W3C/WAI Evaluation and Repair Tools Working Group.

Why WAI-ERT?

The need for something that can do what EARL does is especially pressing if one wishes to improve information access for people with disabilities. Existing practices of user testing become prohibitively difficult for this user group. It takes Internet-based methods to reduce user evaluation to an affordable cost in this community.

People with disabilities commonly experience mobility problems, which make it harder to get them to a controlled, dedicated evaluation laboratory. Likewise, there is a vast diversity of equipment configurations that people use to accommodate their needs. This makes it difficult to build a large enough sample of respondents. The laboratory would have to be perpetually reconfigured to provide a realistic emulation of their environment of use.

All of this argues in favor of tooling up for in vivo testing -- going straight to the field, or at least into laboratories where the primary activity is fitting and training assistive technology. Here the evaluations are performed in the user's operational environment, where the assistive features have been installed and configured and are working. For these reasons, some means is needed to a) isolate the information that is usefully comparable across user experiences, and b) automate its exchange so that small laboratories (down to one user at home) can effectively contribute usable information to collections.

How does EARL work?

This notation is being built as an RDF application. It is composed of a small starter vocabulary of RDF terms which, by RDF semantics, create pattern definitions. These patterns provide just enough structure for you to put together records of evaluations suitable for machine exchange and processing, while leaving you free to say whatever it is that you have to say. Employing the RDF framework as an implementation medium means that the groundwork has already been laid, so one can freely extend the vocabulary while preserving the processability of the shared core concepts. It also brings access to a growing base of tools that process RDF, including products from the W3C Semantic Web Initiative.
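
To illustrate the extensibility point, the sketch below models RDF statements as plain Python tuples of (subject, predicate, object). The namespace URIs, the property names, and the `core_view` helper are hypothetical and are not the actual EARL vocabulary. The sketch only aims to show how a consumer that understands the shared core terms could keep processing a record that also carries extension terms it does not recognize.

```python
# Hypothetical namespaces; these are NOT the published EARL terms.
CORE = "http://example.org/earl-sketch#"
EXT = "http://example.org/my-extension#"

# One evaluation record expressed as (subject, predicate, object) statements.
# "_:a1" stands in for an unnamed node that groups the statements together.
triples = [
    ("_:a1", CORE + "subject", "http://example.org/page.html"),
    ("_:a1", CORE + "criterion", "http://example.org/criteria#c1"),
    ("_:a1", CORE + "result", "pass"),
    # An extension statement added by one evaluator for their own purposes.
    ("_:a1", EXT + "screenReader", "ExampleReader 1.0"),
]


def core_view(statements, node):
    """Collect only the core-vocabulary properties of one node.

    Statements whose predicate lies outside the core namespace are
    skipped, so extensions do not disturb a core-only consumer.
    """
    return {
        pred[len(CORE):]: obj
        for subj, pred, obj in statements
        if subj == node and pred.startswith(CORE)
    }


print(core_view(triples, "_:a1"))
```

A real implementation would more likely use an existing RDF toolkit than hand-rolled tuples; tuples are used here only to keep the example free of dependencies.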

This document has been edited from previous works by Al Gilman, William Loughborough, and Sean B. Palmer.