Requirements for the Evaluation and Report Language (EARL) 1.0

W3C Internal Working Draft 15 April 2005

This version:
Latest published version:
Latest internal version:
Previous published version:
Previous internal version:
Shadi Abou-Zahra, W3C/WAI


This is a first W3C Internal Working Draft produced by the Evaluation and Repair Tools Working Group (ERT WG). The purpose of this document is to outline the requirements for the Evaluation and Report Language (EARL) 1.0. The Working Group encourages feedback about these requirements as well as participation in the development of the revision by developers and researchers with an interest in software-supported evaluation and validation of Web sites.

Status of this Document

This document is for review by the Evaluation and Repair Tools Working Group (ERT WG) and is subject to change without notice. This document has no formal standing within W3C. The latest status of this document series is maintained at the W3C.

This is a draft document and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use W3C Working Drafts as reference material or to cite them as other than "work in progress". A list of current W3C Recommendations and other technical documents can be found at http://www.w3.org/TR/.

Send comments about this document to the Evaluation and Repair Tools Working Group mailing list. The archives for this list are publicly available.

This document has been produced as part of the W3C Web Accessibility Initiative (WAI). The goals of the ERT WG are discussed in the Working Group charter. The ERT WG is part of the WAI Technical Activity.


The need to express conformance in metadata was recognized in 1999, and a first schema of the Evaluation and Report Language (EARL) has existed since 2001. However, due to a lack of resources within the Evaluation and Repair Tools (ERT) Working Group, this language was not developed further beyond the Working Draft of 6 December 2002.

On 15 February 2005, a newly chartered ERT Working Group reconvened to continue and finalize the development of EARL 1.0. The new ERT Working Group will start with the EARL Working Draft of December 2002 as its basis. Experience gathered from existing implementations as well as solutions provided by recent semantic Web technologies (such as OWL) will be the main focus of this effort.

1. Scope and Audience

Much of the effort and many of the considerations behind EARL can be coupled to assumptions about the description of test cases and overall work flows for evaluation methodologies. However, these aspects will remain out of scope for any design decisions in order to ensure that EARL 1.0 is generic enough to serve many purposes. Future extensions in EARL 1.0 may provide features that can be used to support specific work flows or test case assumptions, but these "hooks" must be optional or otherwise non-restrictive.

The main driver for the development of EARL 1.0 is to support Web accessibility evaluations, which typically involve a combination of automated, semi-automated, and manual evaluations. However, there are several other audiences that could potentially benefit from using EARL. Examples include readapting Web content according to user preferences or generic quality assurance beyond the realm of Web accessibility.

2. Technical Work

Besides several smaller work items, such as support for internationalization (which is mostly inherited from RDF anyway) or reusing established vocabularies in several areas of the schema (for example for date values), the following four core pieces of work will constitute the main part of the technical development. A special difficulty will be dealing with the strong relationships, and sometimes even dependencies, between these areas of work.

2.1. Location of Results

The EARL Working Draft of December 2002 proposes a model in which the subject of the assertion sufficiently describes the location of the result. For example, if the subject being tested is a specific markup element in a Web page, it could be described using an XPath expression in the subject of the assertion so that it could later be found again.

However, in several situations it is desirable to keep a more generic subject as the overall context of the assertion and to express more specific location pointers (such as XPath or XPointer expressions, line and position numbers, as well as other methods) separately. For example, the subject of the assertion could be a Web resource, and additional location pointers could identify each location within that subject where the test case of the assertion yields the same result.

For this reason, the ERT Working Group will pursue the refinement of the EARL model in order to allow more granularity for expressing the location of assertion results. This will also facilitate the comparison of test results that share the same subject but this aspect will be handled more specifically in the context of the work described in section 2.3. Relationship between Assertions.
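As a rough illustration of the refinement described above, the following Python sketch separates a generic subject from a list of location pointers. All class and field names (`Pointer`, `Assertion`, `kind`, `expression`), the test case identifier, and the example URL are assumptions made for this sketch; they are not terms of the EARL schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Pointer:
    """One location inside the subject (hypothetical structure)."""
    kind: str        # e.g. "xpath" or "line-column"; labels are illustrative
    expression: str  # e.g. "/html/body/img[2]" or "120:14"


@dataclass
class Assertion:
    """A single result for one test case against one generic subject."""
    subject: str     # the overall context, e.g. the URI of a Web resource
    test_case: str   # identifier of the test case (hypothetical)
    result: str      # e.g. "pass" or "fail"
    pointers: List[Pointer] = field(default_factory=list)


def same_subject(a: Assertion, b: Assertion) -> bool:
    """Assertions become directly comparable once they share the generic subject."""
    return a.subject == b.subject


# One assertion, one subject, several locations that yield the same result.
missing_alt = Assertion(
    subject="http://example.org/page.html",
    test_case="img-has-alt",
    result="fail",
    pointers=[
        Pointer("xpath", "/html/body/img[1]"),
        Pointer("xpath", "/html/body/img[2]"),
        Pointer("line-column", "120:14"),
    ],
)
```

Keeping the pointers apart from the subject means a tool can group results per resource without parsing location expressions.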

2.2. Persistency of Results

In the December 2002 Working Draft, EARL assertions are tied to date-time stamps, which makes the results valid only for a snapshot in time. For example, on the Web the life spans of the assertions are tied to the creation date of the resource and are therefore on average quite brief. At the same time, some of these results may be expensive to produce (for example when manual reviews are required to make a judgement).

While it is probably not possible to achieve absolute persistency of the assertions with respect to changes in the subject, there is some room for enhancement in this respect. The ERT Working Group will pursue several possibilities to enhance the persistency of EARL assertions, for example by studying which types of changes to a subject have which consequences for the assertions with respect to test cases. For example, for tests that are not context-sensitive (typically validation-type tests), changes outside the direct scope of the test usually do not affect the result. However, most accessibility tests are somewhat context-sensitive.

A substantial part of this effort will be to research relevant work and gather partial solutions to build upon. For example, the previous ERT Working Group had done substantial work on hashing algorithms and other techniques in the attempt to solve this problem. There is also much related work in the Annotea server and other W3C projects which could potentially contain important pieces and ideas to build upon.
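One way to picture the hashing approach mentioned above is to store a digest of only the fragment that a non-context-sensitive test inspected, and to treat the assertion as still applicable while that digest is unchanged. The following is a minimal sketch under that assumption; it is not the technique developed by the previous Working Group, and the function names, the whitespace normalization, and the choice of SHA-256 are all illustrative.

```python
import hashlib


def fragment_digest(fragment: str) -> str:
    """Digest of the tested fragment, ignoring whitespace differences."""
    normalized = " ".join(fragment.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def still_applies(stored_digest: str, current_fragment: str) -> bool:
    """True if the fragment that was tested has not changed since the assertion."""
    return stored_digest == fragment_digest(current_fragment)


# Digest recorded when the assertion was made (hypothetical fragment).
tested = '<img src="logo.png" alt="Company logo">'
stored = fragment_digest(tested)

# The tested element only gained extra whitespace, so the result can be kept.
unchanged = still_applies(stored, '<img src="logo.png"   alt="Company logo">')

# The tested element itself was edited, so the result should be re-evaluated.
edited = still_applies(stored, '<img src="logo.png" alt="">')
```

A check of this kind only suits tests whose result depends on the fragment alone; for context-sensitive tests, a change elsewhere in the resource could still invalidate the result.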

2.3. Relationship between Assertions

According to the EARL model proposed by the Working Draft of December 2002, assertions are largely independent of each other and can only sometimes be related by processing the subject. However, directly describing the relationships between both the subjects and the test cases allows EARL assertions to address subjects or test cases which are composed of several related parts. Examples include:

  • paths of pages within Web sites that are required in order to commit a transaction
  • sets of source code files that are compiled in order to build a software application
  • expressing statements that are based on the results gathered from other sub-tests

The rechartered ERT Working Group will pursue the description of relationships between EARL assertions as appropriate, for example through the subject and/or the test case elements of the schema.
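The third example in the list above, a statement based on results gathered from sub-tests, could be sketched in Python as follows. The result labels, the function name `combine`, the URLs, and the aggregation rule (any failure fails the whole; anything short of all parts passing is left undetermined) are assumptions for illustration only and are not defined by this document.

```python
from typing import Dict

# Illustrative result labels for this sketch.
PASS, FAIL, UNDETERMINED = "pass", "fail", "undetermined"


def combine(sub_results: Dict[str, str]) -> str:
    """Derive one result for a composite subject from the results of its parts."""
    values = list(sub_results.values())
    if not values:
        return UNDETERMINED
    if FAIL in values:
        return FAIL
    if all(v == PASS for v in values):
        return PASS
    return UNDETERMINED


# A path of pages required to commit a transaction (hypothetical URLs).
checkout_path = {
    "http://example.org/cart": PASS,
    "http://example.org/payment": FAIL,
    "http://example.org/confirm": PASS,
}
overall = combine(checkout_path)
```

The point of the sketch is only that the composite result needs an explicit link to its parts; how that link is expressed in the schema is the open question of this section.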

2.4. Confidence Claims

While the EARL confidence element proposed by the December 2002 Working Draft potentially provides a mechanism to prioritize assertions, it has not shown the desired effect in practice. The main reason is that EARL currently does not provide sufficient guidance on how to make use of that element. This led to different interpretations between implementations and therefore to a lack of compatibility and of reliability in the value of this element.

The rechartered ERT Working Group will attempt to refine the model for expressing confidence claims in EARL assertions and to adjust the processing model accordingly. It may be necessary to extend or tweak other related EARL elements, such as the test case, in order to support a more robust mechanism for confidence claims.

3. Deliverables

During the development of EARL 1.0, the ERT Working Group will supply the following deliverables. These documents will be released either as Working Group Notes or as W3C Recommendations, depending on decisions that will be made by the Working Group at a later stage.

3.1. Core Schema

The technical work described in Section 2 will directly flow into a core schema for EARL 1.0. This schema must include an effective model as well as a robust processing model that underlines the usage of the language features. The schema must also be published in a formal grammar to allow developers to validate the EARL 1.0 code produced or processed by their tools.

3.2. Primer Document

In order to facilitate an easy entry to EARL 1.0, a primer document highlighting the business case and features of the language will be developed. This primer will include examples of producing EARL 1.0 as well as examples of processing it. It should serve as a guide and tutorial for developers who are new to the language and to semantic Web technologies (such as RDF).

3.3. Test Suite

In accordance with the W3C QA activity, and to promote the adoption of EARL 1.0, the ERT Working Group will develop test suites to assist developers in building accurate implementations. Where possible, these test suites will be built around other test suites developed at W3C to provide practical examples, including for developers outside the ERT Working Group.

4. Issue and Implementation Tracking

It is expected that as the number of features and deliverables grows, the number of issues and bugs will also grow. Therefore, especially if EARL 1.0 is released as a W3C Recommendation, a comprehensive issue and feedback tracking mechanism needs to be established.

At the same time, in order to promote the adoption of EARL 1.0, implementations need to be tracked and documented for other developers to study. The documentation should highlight the specific EARL 1.0 features that are used, or other special characteristics of the implementations.

Appendix A: Consensus Items

Scope and Audience
  • S1: Describing tests directly or making assumptions on a model for test case descriptions is out of the scope for EARL 1.0.
  • S2: Assuming specific work flow models or evaluation methodologies as base models is out of the scope for EARL 1.0.
  • S3: EARL 1.0 will be designed to suit the widest possible audience of developers in the context of generic quality assurance.
Technical Work
  • W1: ERT WG will pursue supporting extended capabilities for expressing the location of results in EARL 1.0 assertions.
  • W2: ERT WG will pursue possibilities to enhance the persistency of EARL 1.0 assertions with respect to changes in the subject.
  • W3: ERT WG will pursue mechanisms to (optionally) describe relationships between subjects and test cases in EARL 1.0.
  • W4: ERT WG will provide an enhanced mechanism for expressing confidence claims uniformly in EARL 1.0 assertions.
Deliverables
  • D1: ERT WG will develop a core schema document for EARL 1.0 as a Working Group Note or as a W3C Recommendation.
  • D2: ERT WG will develop a primer document for EARL 1.0 as a Working Group Note or as a W3C Recommendation.
  • D3: ERT WG will develop test suites for EARL 1.0 as a Working Group Note (where possible around other W3C test suites).
Issue and Implementation Tracking
  • T1: ERT WG will establish a comprehensive issue and bug tracking mechanism for internal and public feedback on EARL 1.0.
  • T2: ERT WG will maintain an annotated, comprehensive listing of EARL 1.0 implementations as well as relevant resources.