Evaluation and Report Language (EARL) 1.0 Guide

Status of this document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

[Editor's note: describe intent of this working draft and propose feedback questions. Synchronize with EARL 1.0 Schema.]

Please send comments to the mailing list of the ERT WG. The archives for this list are publicly available.

This is a W3C Working Draft of the Evaluation and Report Language (EARL) 1.0 Guide. This document will be published and maintained as a W3C Recommendation after review and refinement. Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced under the 5 February 2004 W3C Patent Policy. The Working Group maintains a public list of patent disclosures relevant to this document; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) with respect to this specification should disclose the information in accordance with section 6 of the W3C Patent Policy.

This document has been produced as part of the W3C Web Accessibility Initiative (WAI). The goals of the Evaluation and Repair Tools Working Group (ERT WG) are discussed in the Working Group charter. The ERT WG is part of the WAI Technical Activity.

2. What is EARL?

The Evaluation and Report Language (EARL) is a framework for expressing and comparing test results. EARL builds on the Resource Description Framework [RDF], which is the basis for the Semantic Web. This document does not aim to introduce the reader to the intricacies of RDF; some basic knowledge is assumed as a prerequisite (see, e.g., [RDF-PRIMER] for more information). Like any RDF vocabulary, EARL is no more than a collection of statements about resources, each with a subject, a predicate (or verb) and an object. These statements can be serialized in several ways, for example as RDF/XML or Notation 3 (N3). A typical EARL report could contain the following statements (oversimplifying the notation and omitting namespaces):

<#someone> <#checks> <#resource> .
<#resource> <#fails> <#test> .

From these two simple statements, the main components of an EARL report can already be inferred:

Who (or which tool) runs a test.
The resource tested.
The result(s) of the test.
The tested criterion(-a).

This structure shows the universal applicability of EARL and its ability to refer to any type of test: bug reports, software unit tests, test suite evaluations, conformance claims or even tests outside the world of software and the World Wide Web (although in such cases there might be too many open issues for it to be fully applicable). The semantic nature of EARL must be stressed again: its purpose is to facilitate the extraction and comparison of test results by humans and especially by tools (the Semantic Web paradigm); it is not simply a storage format for information, for which some other XML application might be more suitable.
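As a minimal sketch, the two example statements can be modelled as subject-predicate-object triples, and the four components listed above can be read back from them. The Python below is illustrative only: the tuple representation, the function name `components` and the dictionary keys are assumptions made for this example, and `#checks` and `#fails` are the simplified predicates of the example, not terms of the EARL vocabulary.

```python
# Illustrative sketch: each statement is a (subject, predicate, object) tuple.
# "#checks" and "#fails" are the simplified predicates from the example,
# not actual EARL vocabulary terms.
triples = [
    ("#someone", "#checks", "#resource"),
    ("#resource", "#fails", "#test"),
]


def components(statements):
    """Read the four report components back from the example statements."""
    found = {}
    for subject, predicate, obj in statements:
        if predicate == "#checks":
            found["tester"] = subject    # who (or which tool) runs the test
            found["resource"] = obj      # the resource tested
        elif predicate == "#fails":
            found["result"] = "fails"    # the result of the test
            found["criterion"] = obj     # the tested criterion
    return found


report = components(triples)
print(report)
```

Even in this reduced form, a tool can answer "who tested what, against which criterion, with which result" without knowing how the statements were serialized.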

Summarising, the objectives of EARL are to:

Create a standardised way to produce test reports;
Support the exchange of reports between testers (humans or testing tools);
Facilitate the comparison of test results; and
Ease the aggregation of test results (e.g., results from different sets of tests on the same subject).

It is also worth noting that the extensibility of RDF (and hence of EARL) allows tool vendors and developers to add new functionality to the vocabulary without losing any of the aforementioned characteristics, as other testers can ignore any extensions that they do not understand when processing third-party results.
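The idea that a consumer can skip extensions it does not understand can be sketched as a simple filter over statements. This is a hedged illustration, not a prescribed processing model: the predicate `#vendorScore` is a hypothetical vendor extension, and `KNOWN_PREDICATES` and `understood` are names invented for this example.

```python
# Illustrative sketch: predicates this hypothetical consumer understands.
KNOWN_PREDICATES = {"#checks", "#fails"}

report = [
    ("#someone", "#checks", "#resource"),
    ("#resource", "#fails", "#test"),
    # Hypothetical vendor extension that this consumer does not know.
    ("#resource", "#vendorScore", "87"),
]


def understood(statements, known=KNOWN_PREDICATES):
    """Keep only the statements whose predicate the consumer understands."""
    return [s for s in statements if s[1] in known]


core = understood(report)
print(core)
```

The statements that remain are exactly those of the unextended report, so the extension costs other testers nothing.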

2.1. EARL use cases

The applicability of EARL to different scenarios can be seen in the following use cases:

Evaluating a Web site using tools in different languages

A group of people speaking different languages is evaluating a Web site for conformance to the requirements of different legal environments, such as Section 508 in the USA and BITV in Germany. The use of EARL:

allows localized messages explaining where problems occur. The report can contain messages in the languages spoken by the evaluators, so that each of them understands the messages.
allows language-independent "keywords" to express the conformance level reached by the Web site. Thus a software tool can translate the validity levels into different languages.
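A minimal sketch of how a tool might render a language-independent keyword for each evaluator follows. The keywords `pass` and `fail`, the translations and the function `localize` are assumptions chosen for illustration; they are not taken from the EARL vocabulary.

```python
# Illustrative sketch: hypothetical keywords and hypothetical translations.
LABELS = {
    "fail": {"en": "failed", "de": "nicht bestanden"},
    "pass": {"en": "passed", "de": "bestanden"},
}


def localize(keyword, language, fallback="en"):
    """Return the label for a keyword in the requested language.

    Falls back to the fallback language when no translation exists.
    """
    labels = LABELS[keyword]
    return labels.get(language, labels[fallback])


print(localize("fail", "de"))
print(localize("pass", "fr"))
```

Because the report stores only the keyword, adding a language means extending the label table, not changing any report.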

Combining results from different evaluation tools

A Web site evaluator uses different tools for the task. Each tool can perform specific tests that the other tools cannot. The evaluator's client wants a complete evaluation report. All the evaluation tools used produce a report in EARL format. Therefore, the evaluator can combine the separate reports into one larger report, query the results, and offer her client statistical reports and a detailed conformance claim that specifies where the Web site does not meet the required level.
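Combining and querying reports could look roughly like the following sketch, which assumes each report has already been reduced to simple records. The record layout `(tool, resource, test, outcome)`, the tool and test names and the function `combine` are all hypothetical.

```python
from collections import Counter

# Illustrative sketch: hypothetical records of (tool, resource, test, outcome).
report_a = [("tool-a", "page.html", "test-1", "fail")]
report_b = [
    ("tool-b", "page.html", "test-2", "pass"),
    ("tool-b", "page.html", "test-3", "fail"),
]


def combine(*reports):
    """Merge several reports into one, dropping exact duplicates."""
    merged = []
    for report in reports:
        for record in report:
            if record not in merged:
                merged.append(record)
    return merged


combined = combine(report_a, report_b)

# Query 1: a statistical summary of outcomes across all tools.
summary = Counter(outcome for _, _, _, outcome in combined)

# Query 2: the tests where the Web site does not meet the required level.
failures = [test for _, _, test, outcome in combined if outcome == "fail"]

print(summary, failures)
```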

Comparing results from different tools

A Web site evaluator uses different tools for evaluation. The tools perform the same tests. All the evaluation tools used produce a report in EARL format. Therefore, the evaluator can compare the results from different tools to increase the confidence level of the test results. This also helps when making assertions about a given resource if one tool can only give a warning about a problem but another performs a thorough test that removes the uncertainty.
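One possible way to reconcile two tools' outcomes for the same test is sketched below. The outcome names `pass`, `fail` and `warning`, the extra value `conflict` and the rule itself are assumptions for illustration; the source does not prescribe how results should be compared.

```python
def reconcile(outcome_a, outcome_b):
    """Combine two outcomes for the same test on the same resource.

    Assumed rule: a definite outcome ("pass" or "fail") overrides a
    "warning"; two different definite outcomes are flagged as a conflict.
    """
    definite = {o for o in (outcome_a, outcome_b) if o != "warning"}
    if not definite:
        return "warning"      # neither tool could decide
    if len(definite) == 1:
        return definite.pop() # the tools agree, or one refines a warning
    return "conflict"         # the tools disagree and need a human review


print(reconcile("warning", "fail"))
print(reconcile("pass", "fail"))
```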

Benchmarking an evaluation tool against a test suite

For a benchmarking test of a given provider, different tools perform their tests on sample documents from a test suite. Some evaluation tools may produce false positives or false negatives. All of them create an EARL report with the results. By comparing the results of the tools with a theoretical output file from the test suite, the evaluation tools can be rated according to their accuracy against the test suite.
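The rating step can be sketched as a comparison between the expected outcomes of the test suite and the outcomes a tool reported. The document names, the outcomes and the function `rate` are hypothetical, and a "false positive" is taken here to mean a reported failure where the suite expects a pass.

```python
# Illustrative sketch: hypothetical expected and reported outcomes per document.
expected = {"doc-1": "fail", "doc-2": "pass", "doc-3": "fail", "doc-4": "pass"}
reported = {"doc-1": "fail", "doc-2": "fail", "doc-3": "pass", "doc-4": "pass"}


def rate(expected, reported):
    """Rate one tool's report against the theoretical output of the suite."""
    false_positives = [
        doc for doc, outcome in expected.items()
        if outcome == "pass" and reported.get(doc) == "fail"
    ]
    false_negatives = [
        doc for doc, outcome in expected.items()
        if outcome == "fail" and reported.get(doc) == "pass"
    ]
    correct = sum(1 for doc in expected if reported.get(doc) == expected[doc])
    return {
        "false_positives": false_positives,
        "false_negatives": false_negatives,
        "accuracy": correct / len(expected),
    }


rating = rate(expected, reported)
print(rating)
```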

Monitoring a Web site over time (quality assurance)

A Web project manager wants to track the accessibility of a Web site over time by comparing current test results with previous ones. The reports contain the date/time of the tests and a way to locate the parts of the document the messages refer to. By comparing messages referring to the same locations the project manager can monitor possible improvements, and allocate resources to solve problems in the critical areas of the Web site.
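A sketch of this comparison, assuming each report has been reduced to a date and a set of failures keyed by location and test, might look as follows. The dates, locations, test names and the function `progress` are invented for illustration.

```python
from datetime import date

# Illustrative sketch: hypothetical reports with (location, test) failures.
previous = {
    "date": date(2005, 11, 1),
    "failures": {("/index.html#img1", "test-1"), ("/index.html#form", "test-2")},
}
current = {
    "date": date(2005, 12, 1),
    "failures": {("/index.html#form", "test-2"), ("/news.html#table", "test-3")},
}


def progress(report_a, report_b):
    """Compare two reports by location, ordering them by test date first."""
    older, newer = sorted((report_a, report_b), key=lambda r: r["date"])
    return {
        "fixed": older["failures"] - newer["failures"],
        "new": newer["failures"] - older["failures"],
        "remaining": older["failures"] & newer["failures"],
    }


changes = progress(previous, current)
print(changes)
```

The "remaining" set points at the areas where problems persist and where resources may need to be allocated.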

Exchanging data with repair tools

A repair tool (or a similar module in an authoring tool) uses the results of an evaluation tool to identify the parts of the document that need to be fixed. For each instance of an error it provides a way for the user to notice the error and fix the document. The same scenario applies to Content Management Systems that wish to integrate an evaluation tool into their workflow, helping Web editors to locate accessibility and validation problems.

Exchanging data with search engines

A search engine uses a third-party service which publishes EARL reports of Web sites. The user interface lets the user choose between different levels of accessibility. The list of search results contains only documents with a chosen accessibility level. The search engine uses the test results in the calculation of the ranking/relevance, so that it affects the search results order.
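The filtering and ranking described here can be sketched as follows, under strong simplifying assumptions: accessibility levels are modelled as integers where higher is better, each search result carries a relevance score, and results at the same level are ordered by relevance. None of these choices, nor the names used, come from EARL itself.

```python
# Illustrative sketch: hypothetical (site, relevance, accessibility_level) results.
results = [
    ("a.example", 0.9, 1),
    ("b.example", 0.8, 3),
    ("c.example", 0.7, 2),
]


def filter_and_rank(results, minimum_level):
    """Keep results at or above the chosen level, best level first."""
    kept = [r for r in results if r[2] >= minimum_level]
    # Sort by accessibility level, then by relevance, both descending.
    return sorted(kept, key=lambda r: (r[2], r[1]), reverse=True)


ranked = filter_and_rank(results, minimum_level=2)
print([site for site, _, _ in ranked])
```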

[Editor's note: Maybe add some more exotic scenario outside the Web and software development.]

2.2. EARL audience

EARL is flexible enough to respond to the needs of a variety of audiences involved in a testing or quality assurance process. Typical profiles are:

Product manager: responsible for delivering a given product;
Product designer: designs the product and documents this in a design specification;
Quality engineer or tester: takes the product through a series of tests to find bugs; and
Developer: creates a product to satisfy the design specification; fixes bugs found by the quality engineer or tester.

2.3. Fitting EARL to the test process

A generic test process typically consists of the following phases (see Figure 1):

Requirements capture
Test specification and design
Test
Reporting of test results
Collation of results
Query of results
Presentation of results

EARL targets the fourth phase, the reporting of test results, and supports the phases that follow it by providing the information necessary for the semantic interpretation of the test results.

[Editor's note: Add figure.]

3. Structure of an EARL report: concepts and core classes

EARL is not a standalone technology: it builds on several existing vocabularies that cover some of its needs for metadata definition. This approach avoids re-creating applications that are already established and tested, such as the Dublin Core elements. The referenced specifications are:

Dublin Core Metadata Initiative: The Dublin Core is a metadata standard for describing digital resources, often expressed in XML. The first standard published is the Dublin Core Metadata Element Set. It consists of 15 optional metadata elements, any of which may be repeated or omitted. Typical elements are: Title, Creator, Subject, Description, Publisher, Contributor, Date, Type, Format, Identifier, Source, Language, etc. The Dublin Core Metadata Element Set was accepted as a NISO standard in 2001 (ANSI/NISO Z39.85-2001) and as an ISO standard in 2003 (ISO 15836:2003(E)).
Friend of a Friend (FOAF) project: The FOAF project is about creating a Web of machine-readable resources describing people, the links between them and the things they create and do. Of particular interest for EARL are the classes foaf:Person and foaf:Project [FOAF].

[Editor's note: ...]

[Editor's note: RDF/XML serialization.]

3.1. Namespaces

Table 1 [XXX define anchor XXX] presents the core namespaces used by EARL. The prefix refers to the convention used in this document to denote a given namespace, and can be freely modified.

Namespace prefix	Namespace URI	Comment
earl	`http://www.w3.org/WAI/ER/EARL/nmg-strawman#`	The default EARL namespace. Where RDF terms are used in their abbreviated form (e.g. Assertion or foaf:Person), if no namespace is provided the term is in the EARL namespace.
rdf	`http://www.w3.org/1999/02/22-rdf-syntax-ns#`
rdfs	`http://www.w3.org/2000/01/rdf-schema#`
owl	`http://www.w3.org/2002/07/owl#`
dc	`http://purl.org/dc/elements/1.1/`
dct	`http://purl.org/dc/terms/`
foaf	`http://xmlns.com/foaf/0.1/`
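To illustrate how the prefixes in the table relate to full URIs, the following sketch expands an abbreviated term, treating a term without a prefix as belonging to the EARL namespace as described in the comment column. The namespace URIs are those of the table; the function name `expand` and the dictionary layout are assumptions for this example.

```python
# Namespace URIs as listed in the table; the prefixes are conventions only.
NAMESPACES = {
    "earl": "http://www.w3.org/WAI/ER/EARL/nmg-strawman#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dct": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
}
DEFAULT_PREFIX = "earl"  # terms without a prefix are in the EARL namespace


def expand(term):
    """Expand an abbreviated term such as "foaf:Person" to a full URI."""
    prefix, separator, local_name = term.partition(":")
    if not separator:
        # No prefix given: fall back to the default EARL namespace.
        prefix, local_name = DEFAULT_PREFIX, term
    return NAMESPACES[prefix] + local_name


print(expand("foaf:Person"))
print(expand("Assertion"))
```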

[Editor's note: Versioning terms during the process of developing the vocabulary is an issue the group is working on. It is possible that a new namespace will be used for a final version of the vocabulary.]

[DC]: The Dublin Core Metadata Element Set - DC Recommendation, 20 December 2004.
http://www.dublincore.org/documents/dces/
[DCT]: The Dublin Core Metadata Terms - DC Recommendation, 13 June 2005.
http://www.dublincore.org/documents/dcmi-terms/
[FOAF]: FOAF Vocabulary Specification - Working Draft, 3 June 2005.
http://xmlns.com/foaf/0.1/
[RDF]: Resource Description Framework (RDF) Model and Syntax Specification - W3C Recommendation, 22 February 1999.
http://www.w3.org/TR/1999/REC-rdf-syntax-19990222/
[RDF-PRIMER]: RDF Primer - W3C Recommendation, 10 February 2004.
http://www.w3.org/TR/rdf-primer/
[RDFS]: RDF Vocabulary Description Language 1.0: RDF Schema - W3C Recommendation, 10 February 2004.
http://www.w3.org/TR/rdf-schema/
[RDF-XML-DIFFS]: Why RDF model is different from the XML model - Paper by Tim Berners-Lee, September 1998.
http://www.w3.org/DesignIssues/RDF-XML
[RFC2119]: Key words for use in RFCs to Indicate Requirement Levels - IETF RFC, March 1997.
http://www.ietf.org/rfc/rfc2119.txt
[OWL]: OWL Web Ontology Language - W3C Recommendation, 10 February 2004.
http://www.w3.org/TR/owl-features/
[WCAG10]: Web Content Accessibility Guidelines 1.0 - W3C Recommendation, 5 May 1999.
http://www.w3.org/TR/WCAG10/
[XML]: To be completed.
http://www.ietf.org/rfc/rfc2119.txt

Editors' Draft 14 December 2005

Abstract

Status of this document

Table of Contents

Appendices

1. Introduction

1.1 Pre-requisites

2. What is EARL?

2.1. EARL use cases

2.2. EARL audience

2.3. Fitting EARL to the test process

3. Structure of an EARL report: concepts and core classes

3.1. Namespaces

3.2. The EARL root element

3.3. Our first EARL report

3.4. Core EARL Classes

4. Complex examples

5. Processing and aggregating reports

6. Extending EARL

7. Conclusions and open issues

Appendix A: References

Appendix B: Contributors