Evaluation and Report Language (EARL) 1.0 Guide

Editors' Draft 22 April 2009

This version:: http://www.w3.org/WAI/ER/EARL10/WD-EARL10-Guide-20090422
Latest published version:: http://www.w3.org/TR/EARL10-Guide/
Latest internal version:: http://www.w3.org/WAI/ER/EARL10-Guide
Previous published version:: http://www.w3.org/TR/2002/WD-EARL10-20021206/
Previous internal version:: http://www.w3.org/WAI/ER/EARL10/WD-EARL10-Guide-20070802
Editors:: Carlos A Velasco, Fraunhofer Institute for Applied Information Technology FIT; Johannes Koch, Fraunhofer Institute for Applied Information Technology FIT

This document is an introductory guide to the Evaluation and Report Language (EARL) 1.0 and is intended to accompany the normative document Evaluation and Report Language (EARL) 1.0 Schema [EARL-Schema]. The Evaluation and Report Language is a framework for expressing test results. Although the term test can be taken in its most widely accepted definition, EARL is primarily intended for reporting and exchanging results of tests of Web applications and resources. EARL is a vendor-neutral and platform-independent format.

EARL is expressed in the form of an RDF vocabulary. The Resource Description Framework (RDF) is a language for representing semantically information about resources in the World Wide Web. However, EARL is not conceptually restricted to these resources and could be applied in other scenarios.

Status of this document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

Please send comments about this document to the mailing list of the ERT WG. The archives for this list are publicly available.

This is a W3C Working Draft of the Evaluation and Report Language (EARL) 1.0 Guide. This document will be published and maintained as a W3C Recommendation after review and refinement. Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced under the 5 February 2004 W3C Patent Policy. The Working Group maintains a public list of patent disclosures relevant to this document; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) with respect to this specification should disclose the information in accordance with section 6 of the W3C Patent Policy.

This document has been produced as part of the W3C Web Accessibility Initiative (WAI). The goals of the Evaluation and Repair Tools Working Group (ERT WG) are discussed in the Working Group charter. The ERT WG is part of the WAI Technical Activity.

Introduction
What is EARL?
An EARL report: basics
- 3.1 Our first EARL report
Advanced EARL
Conclusions

1 Introduction

This document is an introductory guide to the Evaluation and Report Language (EARL) 1.0 and is intended to accompany the normative document Evaluation and Report Language (EARL) 1.0 Schema [EARL-Schema] and its associated vocabularies: HTTP Vocabulary in RDF [HTTP-RDF], Representing Content in RDF [Content-RDF] and Pointer Methods in RDF [Pointers-RDF]. The objectives of this document are:

To provide an introduction to the use of EARL and the associated vocabularies in different scenarios.
To clarify the key concepts of EARL and their translation into the different classes.
To explain the usage of the different EARL components.
To show how to aggregate and process EARL reports.
To demonstrate how to extend and customise EARL.

The primary audience of this document are quality assurance and testing tool developers such as accessibility checkers, markup validators, etc. Additionally, we expect that EARL can support accessibility and usability advocates, metadata experts and Semantic Web practitioners, among others. We do not assume any previous knowledge of EARL, but it is not the target of this document to introduce the reader to the intricacies of RDF and, therefore, the following background knowledge is required:

Basic knowledge of XML [XML] and its associated technologies.
Basic knowledge about the Semantic Web and RDF. For references, consult [RDF], [RDF-PRIMER] and [RDFS]. The reader must also be able to read and interpret the XML serialization of RDF.

Although the concepts of the Semantic Web are simple, their abstraction with RDF is known to bring difficulties to beginners. It is recommended to read carefully the aforementioned references and other tutorials found on the Web. It must be also borne in mind that RDF is primarily targeted to be machine processable and, therefore, some of its expressions are not very intuitive for developers used to work with XML only.

Editor's Note: insert here a paragraph on the structure of the Guide.

2 What is EARL?

The Evaluation and Report Language (EARL) is a framework targeted to express and compare test results. EARL builds on top of the Resource Description Framework [RDF], which is the basis for the Semantic Web. Like any RDF vocabulary, EARL is a collection of statements about resources, each with a subject, a predicate (or a verb) and an object. These statements can be serialized in many ways (e.g., RDF/XML or Notation 3, also known as N3). A typical EARL report could contain the following statements (simplifying the notation and not including namespaces):

<#someone> <#checks> <#resource> .
<#resource> <#fails> <#test> .

From these simple two statements, it can be inferred already the main components of an EARL Report (wrapped up in an assertion):

Who (or which tool) runs a test: this is known in the EARL terminology as the Assertor.
The resource tested: known as the Test Subject.
The tested criterion(-a): known as the Test Criterion.
The result(s) of the test: known as the Test Result.

This structure shows the universal applicability of EARL and its ability to refer to any type of test: bug reports, software unit tests, test suite evaluations, conformance claims or even tests outside the world of software and the World Wide Web (although in such cases, there might be open issues for its full applicability). It must be stressed again the semantic nature of EARL: its purpose is to facilitate the extraction and comparison of test results by humans and especially by tools (the Semantic Web paradigm); it is not an application for information storage, for which some other XML applications might be more suitable.

Initially, EARL was created as a way to create, merge and compare Web accessibility reports from different sources (tools, experts, etc.). However, this original aim has been expanded to cover wider testing scenarios. Summarising, EARL enables the:

Creation of a standardized way to produce test reports;
Exchange of reports between testers (humans or tools);
Comparison of test results;
Verification of how different test subjects fared on the same test; and
Aggregation of test results (e.g., like a different set of tests on the same subject or coming from different sources).

We want to highlight that the extensibility of RDF allows to tool vendors or developers the addition of new functionalities to the vocabulary, without losing any of the aforementioned characteristics, as other testers might ignore those extensions that they do not understand when processing third-party results.

It is also important to consider potential security and privacy issues when using EARL. For instance, test results expressed in EARL could contain sensitive information such as the internal directory structure of a Web server, username and password information, parts of restricted Web pages, or testing modalities, for example. The scope of this document is limited to the use of the EARL vocabulary: security and privacy considerations need to be made at the application level. For example, certain parts of the data may be restricted to appropriate user permissions, encrypted or obfuscated.

The keywords must, required, recommended, should, may, and optional in this document are used in accordance with RFC 2119 [RFC2119].

2.1 EARL use cases

The applicability of EARL to different scenarios can be seen in the following use cases:

Evaluating a Web site using tools in different languages

A group of people speaking different languages are evaluating a Web site for conformance to different legal environments, such as Section 508 in the USA and BITV in Germany. The use of EARL:

allows localized messages explaining where problems are met. The report can contain messages in the languages spoken by the evaluators so that each of them understands the messages.
allows "keywords" to express the conformance level reached by the Web site that are language-independent. Thus a software tool can translate the validity levels in different languages.

Combining results from different evaluation tools

A Web site evaluator uses different tools for the task. Each tool can perform specific tests that the other tools cannot do. The evaluator's client wants a complete evaluation report. All the evaluation tools used produce a report in EARL format. Therefore, the evaluator can combine the separate reports into one bigger report, query the results, and offer to her customer statistical reports and a detailed conformance claim that specifies where the Web site does not meet the required level.

Comparing results from different tools

A Web site evaluator uses different tools for evaluation. The tools perform the same tests. All the evaluation tools used produce a report in EARL format. Therefore, the evaluator can compare the results from different tools to increase the confidence level of the test results. It will also help to make assertions about a given resource, when one of the tools is only able to give a warning on a problem, but the other performs a thorough test that removes the aforementioned uncertainty.

Benchmarking an evaluation tool against a test suite

For a benchmarking test of a given provider, different tools perform their tests on sample documents from a test suite. Some evaluation tools may produce false positives or false negatives. All of them create an EARL report with the result. Comparing the results of the tools with a theoretical output file from the test suite, evaluation tools could be accurately rated against the test suite.

Monitoring a Web site over time (Quality Assurance)

A Web project manager wants to track the accessibility of a Web site over time by comparing current test results with previous ones. The reports contain the date and time of the tests and pointers to locate the parts of the document the messages refer to. By comparing messages referring to the same locations, the project manager can monitor possible improvements, and allocate resources to solve problems in the critical areas of the Web site.

Exchanging data with repair tools

A repair tool (or a similar module in an authoring tool) uses the results of an evaluation tool to identify the parts of the document that need to be fixed. For each instance of an error it provides a way for the user to notice the error and fix the document. The same scenario can be used for Content Management Systems that wish to integrate an evaluation tool into their workflow, helping to locate accessibility and validation problems to Web editors.

Exchanging data with search engines

A search engine uses a third-party service, which publishes EARL reports of Web sites. The user interface lets the user choose between different levels of accessibility. The list of search results contains only documents with a chosen accessibility level. The search engine uses the test results in the calculation of the ranking/relevance, so that it affects the search results order.

Monitoring generic Quality Assurance processes

EARL could be applied to any generic Quality Assurance process, not necessarily related to the Web or to software development. Like in any Semantic Web application, it is only required to map real objects, actors and processes to URIs.

2.2 EARL audience

EARL is flexible enough to respond to the needs of a variety of audiences involved in a testing or quality assurance process. Typical profiles are:

Web commissioner: Some private or public Web site owner commissions her site to an external agency, and wishes to monitor the site's compliance with a given set of legal or internal quality requirements;
Product manager: Responsible for delivering a given product;
Product designer: Designs the product and documents this in a design specification;
Quality engineer or tester: Takes the product through a series of tests to find bugs;
Developer: Creates a product to satisfy the design specification or fixes bugs found by the quality engineer or tester; and
Accessibility/usability advocate: Tests and monitors compliance of Web resources with different requirements.

2.3 Fitting EARL to the testing process

In software testing and quality assurance environments there are several typical steps that are followed. These are:

Test Plan, which prescribes the scope, approach, resources, and schedule of the testing activities. This part is mainly a management planning, and lies outside the capabilities of EARL, although the resources to be tested can be expressed with EARL.
Test Specification, which describes the test cases and the test procedures. EARL could be used to identify Test Cases (or test criteria), although the language is not targeted to support the different aspects of the specifications in the standard. For this target, a Test Case Description Language will be more appropriate.
Test Execution, which covers the physical execution of the tests. The output of this phase is the starting point for the reporting phase.
Test Reporting, which deals not only with the creation of test results' reports, but may include their post-processing like, for instance, filtering, aggregation, summarization, etc. This phase is where EARL fits more appropriately because its semantic nature enables these tasks.

Figure 1 displays graphically the aforementioned elements:

Figure 1. Steps in software testing processes.

The previous steps can be matched to existing standards like IEEE 829 [IEEE-829], which defines a set of basic software tests documents.

3 An EARL report: basics

EARL is not an standalone vocabulary, and builds on top of many existing vocabularies that cover some of its needs for metadata definition. This approach avoids the re-creation of applications already established and tested like the Dublin Core elements. The referenced specifications are:

Dublin Core Metadata Initiative. The Dublin Core is a metadata standard for describing digital resources, often expressed in XML. The first standard published is the Dublin Core Metadata Element Set. It consists of 16 optional metadata elements, any of which may be repeated or omitted. Typical elements are: Title, Creator, Subject, Description, Publisher, Contributor, Date, Type, Format, Identifier, Source, Language, etc. The Dublin Core Metadata Element Set was accepted as a NISO standard in 2001 (ANSI/NISO Z39.85-2001) and as an ISO standard in 2003 (ISO 15836:2003(E)).
Friend of a Friend (FOAF) project. The FOAF project is about creating a Web of machine-readable resources describing people, the links between them and the things they create and do. Of particular interest for EARL are the Classes foaf:Person and foaf:Project [FOAF].
Content-in-RDF [Content-RDF]. This is an RDF vocabulary to represent semantically any type of content, either on the Web or in any local storage media.
HTTP vocabulary in RDF [HTTP-RDF]. This is an RDF vocabulary used to represent HTTP requests and responses, and it is useful to identify online resources accessed via HTTP, which cannot be uniquely resolved via a URI. Typical examples would be servers accessed via content negotiation, Web applications using POST requests, etc.
Pointer Methods in RDF [Pointers-RDF]. This is an RDF vocabulary to enable pointing in an accurate way certain parts within a document, particularly HTML and XML documents.

RDF can be serialized in different ways, but the XML representation [RDF/XML] is the preferred method and will be used throughout this document. However, even when selecting this approach, there are many equivalent ways to express an RDF model.

These vocabularies are referenced via namespaces in the corresponding RDF serialization. The list of the normative namespaces can be found in the EARL 1.0 Schema.

3.1 Our first EARL report

In the following sections, we will make an step-by-step introduction to EARL with several examples. The root element of any EARL report is an RDF node, as with any RDF vocabulary. There, we declare the corresponding namespaces and possibly any custom namespace used to define additional classes and or properties.

Example 3.1. The root element of an EARL report [download file].

<rdf:RDF xmlns:earl="http://www.w3.org/ns/earl#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">

    <!-- ... -->

</rd:RDF>

Next, let us assume that we want to express the results of an XHTML validation in a given document with the W3C HTML Validator in EARL. The tested document has the following HTML code:

Example 3.2. An XHTML document to be validated [download file].

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<html lang="en" xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
  <head>
  <title>Example of project pages</title>
  </head>
  <body>
  <h1>Project description</h1>
  <h2>My project name</h2>

    <!-- ... -->
  </body>
</html>

This document has three errors that will constitute the basis of our EARL report:

Error: Line 14, column 7: document type does not allow element "li" here; missing one of "ul", "ol" start-tag.
Error: Line 15, column 6: end tag for "li" omitted, but OMITTAG NO was specified.
Error: Line 16, column 9: there is no attribute "alt".

[Editor's note: To be extended. Include here Content, HTTP y Pointers]

4 Advanced EARL

[Editor's note: To be added: EARL extension; aggregation of reports; ...]

5 Conclusions

This guide presented a thorough overview of the Evaluation and Report Language (EARL). As mentioned in the introduction, EARL must be seen as a generic framework that can facilitate the creation and exchange of test reports. In this generality lies its strength, as it can be applied to multiple scenarios and use cases, which may even lay outside the world of software development and compliance testing.

The EARL framework allow as well merging and aggregation of results in a semantic manner, thus enabling different testing actors to share and improve results.

Of course, there could be scenarios where EARL might not be able to cope with their underlying complexity. However, its semantic nature allows its extensibility via proprietary vocabularies based upon RDF, without endangering the interoperability of the reports.

The Working Group is looking forward to receiving feedback on the current version of the schema, and expects from implementers of compliance tools issues and suggestions for improvement.

Appendix A: References

[Content-RDF]: Representing Content in RDF.
[DC]: The Dublin Core Metadata Element Set - DC Recommendation, 20 December 2004.
http://www.dublincore.org/documents/dces/
[DCT]: The Dublin Core Metadata Terms - DC Recommendation, 13 June 2005.
http://www.dublincore.org/documents/dcmi-terms/
[EARL-Schema]: Evaluation and Report Language 1.0 Schema.
[FOAF]: FOAF Vocabulary Specification - Working Draft, 3 June 2005.
http://xmlns.com/foaf/0.1/
[HTTP-RDF]: HTTP Vocabulary in RDF.
[IEEE-829]: IEEE Standard for Software Test Documentation (IEEE Std 829-1998). ISBN 0-7381-1444-8 SS94687. Available at: http://ieeexplore.ieee.org/servlet/opac?punumber=5976
[Pointers-RDF]: Pointer Methods in RDF.
[RDF]: Resource Description Framework (RDF) Model and Syntax Specification - W3C Recommendation, 22 February 1999.
http://www.w3.org/TR/1999/REC-rdf-syntax-19990222/
[RDF-PRIMER]: RDF Primer - W3C Recommendation, 10 February 2004.
http://www.w3.org/TR/rdf-primer/
[RDFS]: RDF Vocabulary Description Language 1.0: RDF Schema - W3C Recommendation, 10 February 2004.
http://www.w3.org/TR/rdf-schema/
[RDF-XML]: RDF/XML Syntax Specification (Revised) - W3C Recommendation 10 February 2004.
http://www.w3.org/TR/rdf-syntax-grammar/
[RDF-XML-DIFFS]: Why RDF model is different from the XML model - Paper by Tim Berners-Lee, September 1998.
http://www.w3.org/DesignIssues/RDF-XML
[RFC2119]: Key words for use in RFCs to Indicate Requirement Levels - IETF RFC, March 1997.
http://www.ietf.org/rfc/rfc2119.txt
[OWL]: OWL Web Ontology Language - W3C Recommendation, 10 February 2004.
http://www.w3.org/TR/owl-features/
[WCAG10]: Web Content Accessibility Guidelines 1.0 - W3C Recommendation, 5 May 1999.
http://www.w3.org/TR/WCAG10/
[XML]
[URI]
[XPath]: XML Path Language (XPath) Version 1.0 – James Clark, Steve DeRose, W3C Recommendation, 16 November 1999; To be completed.

Appendix B: Contributors

Shadi Abou-Zahra, Carlos Iglesias, Michael A Squillace, Johannes Koch and Carlos A Velasco.