Unicorn Requirements Document / Book of Specifications

Built from April to September, 2006 by W3C interns Damien Leroy and Jean-Guilhem Rouel.

Introduction

The W3C provides many validators and checkers that help developers create Web documents that conform to specifications produced by W3C Working Groups, or make those documents accessible.

The most popular ones are probably the CSS validator and the Markup validator. But there are also some useful tools such as the link checker and many others.

So, if you want to check multiple things on a document, you will have to use multiple validators and checkers.

The aim of our internship is to create a "universal validator" that can validate and check multiple aspects of a document through a single Web interface. This is more than a merge of all the validators' interfaces: there should be a strong link between these tools to avoid inconsistencies and redundant work.

The idea [...] of this framework would be to create a system where several modules would give observations on a given source, using soap messaging to communicate. There would certainly be some orchestration of the observation requests/replies by a central module, but one could imagine that the modules could pass their observations to one another.

So, the framework will be composed of two different types of modules: a central module and micro-observers.

Let's see in detail how these modules will work.

Central module

In the following I will often use the term UniCORN to name this central module. In fact, UniCORN is the whole framework (central + observer modules), but it will be easier to call the central module UniCORN.

The central module is the part of the framework that has the difficult task of coordinating the different micro-observers and dispatching information between them, while also providing a user interface.

Coordinator

To be able to coordinate the different observers, the central module needs some information about each of them. This information includes the module location, its role, the types of documents it can or should handle, and so on.

To provide all this information, we will use an XML file placed in a specific directory of the central module. A cool feature would be the ability to add files to this directory and have the corresponding module added dynamically, without having to restart UniCORN. Some existing servers already provide such a feature (e.g. Tomcat, JBoss, JOnAS, ... (yes, they are all about Java ;))).
The complete description of this contract will come later.
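
As a rough illustration of the dynamic registration idea, the Python sketch below rescans a contract directory and registers only the files it has not seen before. The `ObserverRegistry` class and the `<observer location="..." role="..."/>` layout are hypothetical, since the real contract format is described later; this is not the UniCORN implementation.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


class ObserverRegistry:
    """Tracks observers described by XML files in one directory (hypothetical)."""

    def __init__(self, contract_dir):
        self.contract_dir = Path(contract_dir)
        self.observers = {}  # file name -> observer description

    def refresh(self):
        """Register files added since the last call and return their names."""
        added = []
        for path in sorted(self.contract_dir.glob("*.xml")):
            if path.name in self.observers:
                continue  # already registered
            root = ET.parse(str(path)).getroot()
            # Assumed layout: <observer location="..." role="..."/>
            self.observers[path.name] = {
                "location": root.get("location"),
                "role": root.get("role"),
            }
            added.append(path.name)
        return added


def demo():
    """Show that a file dropped in the directory is picked up without a restart."""
    with tempfile.TemporaryDirectory() as tmp:
        registry = ObserverRegistry(tmp)
        first = registry.refresh()   # directory is still empty
        Path(tmp, "css.xml").write_text(
            '<observer location="http://jigsaw.w3.org/css-validator/validator"'
            ' role="CSS validation"/>'
        )
        second = registry.refresh()  # the new file is registered
        third = registry.refresh()   # nothing new this time
        return first, second, third, dict(registry.observers)
```

Calling `refresh()` periodically, or from a file-system watcher, would be enough to get the "no restart" behaviour described above.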

Web interface

UniCORN will provide a Unified Web Interface that merges the interfaces of all modules and avoids redundancies such as two fields to enter the resource location.

The default interface offers three ways to call UniCORN:

These three alternatives are the standard ways to call the current CSS or Markup Validators and are suited to checking various Web documents.

In addition to this standard interface, there will be module-specific pieces of interface, one for each module. This is covered in the Interface description section.

The Web interface will be the only "official" one, but it will be easy to add new UIs by extending the appropriate classes and interfaces.

Validation result

One mission of UniCORN is to provide several output formats to suit any type of client (browsers, command-line software, automatic tools, ...). For example, most users will need (X)HTML output, but some tools may prefer SOAP or EARL.

So, the framework will have an output engine that makes it easy to add new output formats: one only needs to write a template document that will be used to build the results. The CSS Validator has an embryo of such a feature, but it still needs Java classes to handle each output correctly.

The template principle is illustrated by the figure below:
[Figure: template + values = output]

The template format will look like this (for a plain text output for example):

	#if( $valid )
	  The document is valid
	#else
	  The document contains $errorsCount errors
	  #foreach( $error in $errorList  )
	    * Line: $error.line
	      $error.message
	  #end
	#end
      

Words with a leading $ are references to values provided by the framework, while words with a leading # are directives that allow things such as iterating over a collection of elements or conditional branching.
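
To make the "template + values = output" idea concrete, the following Python sketch writes out by hand the logic that the plain text template expresses. It is not a template engine and the exact whitespace of a real engine's output may differ; the `values` dictionary mirrors the names used in the template ($valid, $errorsCount, $errorList), and the error messages are invented for illustration.

```python
def render_plain_text(values):
    """Hand-written equivalent of the plain text template (layout simplified)."""
    if values["valid"]:
        return "The document is valid\n"
    lines = [f"The document contains {values['errorsCount']} errors"]
    for error in values["errorList"]:
        lines.append(f"* Line: {error['line']}")
        lines.append(f"  {error['message']}")
    return "\n".join(lines) + "\n"


# Illustrative values only; a real observer would supply these.
values = {
    "valid": False,
    "errorsCount": 2,
    "errorList": [
        {"line": 12, "message": "Unknown property"},
        {"line": 40, "message": "Missing semicolon"},
    ],
}
output = render_plain_text(values)
print(output)
```

With a template engine, supporting a new output format means writing a new template only; the values handed to it stay the same.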

Micro-observers

Micro-observers are modules used by UniCORN to get different observations on the same document. These observations can include CSS validity, (X)HTML validity, mobile Web accessibility, link checking, and so on. In fact, these observers are existing validators or checkers that are slightly modified so that they can be "plugged" into UniCORN.

It might be necessary to adapt their entry point and their output format so that the framework can communicate with them.

The contract

The contract consists of two files that register an observer with UniCORN. It is mandatory if we want a flexible architecture that can accept various validators and checkers without having to adapt the code of UniCORN.

As a consequence, UniCORN will have as many contract files as observers plugged in it.

The first and most important document of a contract is a simplified WADL document describing the methods supported by the observer (a POST one and a GET one) and the parameters of each call, with their possible values. The second document is an RDF document (associated with an RDFS schema) that contains metadata in different languages. These documents will be located on the observer side and describe what the observer can do.

Generalities (RDF document)

Some fields of the contract are not necessary for the work of the framework, but can be useful to provide information to the user. We can mention for example:

These properties won't influence the way the framework works, so most of them are optional and will be written in the RDF file.

The next sections will describe what information the WADL document will give.

Location

UniCORN needs to know where an observer resides to be able to call it, so a module location entry could be quite useful :-).

As the framework will work on the Web, this field will be a URL, such as http://validator.w3.org/check or http://jigsaw.w3.org/css-validator/validator, so that we can directly append possible arguments (the URI of the document for example).
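
A minimal sketch of appending arguments to an observer location, using only the Python standard library. The `build_request_url` helper is hypothetical, and the argument values are illustrative.

```python
from urllib.parse import urlencode, urlsplit


def build_request_url(location, params):
    """Append URL-encoded arguments to an observer location."""
    # Use "&" if the location already carries a query string, "?" otherwise.
    separator = "&" if urlsplit(location).query else "?"
    return location + separator + urlencode(params)


url = build_request_url(
    "http://jigsaw.w3.org/css-validator/validator",
    {"uri": "http://www.w3.org/", "lang": "en"},  # illustrative values
)
print(url)
```

Encoding the document URI matters here, because it contains characters such as ":" and "/" that would otherwise be ambiguous inside a query string.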

Parameters allowed

Each module probably has specific parameters to tune a validation or check. For example, with the CSS validator, there are uri, text, profile, medium, warning, output and lang.

Each parameter has a list of possible values, and possibly a default one.
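
One way to picture a parameter with its possible values and optional default is the small Python sketch below. The `Parameter` class is hypothetical, and the values listed for medium are examples rather than the CSS validator's actual list.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Parameter:
    """A parameter of an observer, as it might be read from a contract."""
    name: str
    allowed: Tuple[str, ...] = ()   # empty tuple means any value is accepted
    default: Optional[str] = None   # None means there is no default

    def resolve(self, value=None):
        """Return the value to send, falling back to the default if needed."""
        if value is None:
            if self.default is None:
                raise ValueError(f"parameter {self.name!r} requires a value")
            return self.default
        if self.allowed and value not in self.allowed:
            raise ValueError(f"{value!r} is not allowed for {self.name!r}")
        return value


# Example values only, not taken from a real contract.
medium = Parameter("medium", allowed=("all", "screen", "print"), default="all")
uri = Parameter("uri")  # free-form, no default
```

Here `medium.resolve()` falls back to the default, while `medium.resolve("tv")` is rejected because "tv" is not in the list of allowed values.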

Tasks

UniCORN will not offer users a list of possible observations, since there can be many. Instead, users will choose a task from a list.

A task is a set of observations, some of which may depend on others. For example, we can imagine that in a specific task, CSS validation depends on markup validation for HTML documents.

A list of parameters will be associated with each task. These parameters will be mapped to the parameters of the observers involved in the task.
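
The sketch below illustrates both ideas under stated assumptions: observations are ordered so that each one runs after the observations it depends on, and task parameters are translated into each observer's own parameter names. The `task` structure, the observer names and the parameter names are hypothetical; the ordering uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical task: CSS validation depends on markup validation.
task = {
    # observation -> set of observations it depends on
    "observations": {"markup": set(), "css": {"markup"}},
    # task parameter -> {observer: name of the observer's own parameter}
    "parameter_map": {
        "uri": {"markup": "uri", "css": "uri"},
        "language": {"css": "lang"},
    },
}


def execution_order(task):
    """Return the observations so that dependencies always come first."""
    return list(TopologicalSorter(task["observations"]).static_order())


def observer_params(task, observer, task_values):
    """Translate task parameter values into one observer's parameters."""
    result = {}
    for task_param, mapping in task["parameter_map"].items():
        if observer in mapping and task_param in task_values:
            result[mapping[observer]] = task_values[task_param]
    return result


order = execution_order(task)
css_params = observer_params(
    task, "css", {"uri": "http://www.w3.org/", "language": "fr"}
)
```

A dependency cycle between observations would make `static_order()` raise `graphlib.CycleError`, which is a convenient place to reject an invalid task description.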

Interface description

The description of the user interface is closely related to the management of task parameters.

This part of the file will describe how the parameters should be represented.

If a parameter consists of a list of values, we should be able to choose between a radio-button list, a drop-down menu, a checkbox list... It should also be possible to add a text element if the user can enter their own value.

If a parameter does not have a list of values, we should have a choice between a textarea and a text element.

Parameters like url or text are more special because they determine the method to use.

Finally, something must indicate whether a parameter is visible in each interface, standard and advanced. Useful values could be none to hide the parameter, simple to show it only in the standard interface, advanced to show it only in the advanced interface, and both to show it in both interfaces.
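
The four visibility values can be read as a simple lookup from value to the set of interfaces that show the parameter. The Python sketch below assumes exactly the values proposed above; the `visible_parameters` helper and the sample parameter settings are hypothetical.

```python
# Visibility value -> interfaces in which the parameter is shown.
VISIBILITY = {
    "none": set(),
    "simple": {"standard"},
    "advanced": {"advanced"},
    "both": {"standard", "advanced"},
}


def visible_parameters(parameters, interface):
    """Return the names of the parameters shown in the given interface."""
    return [
        name
        for name, visibility in parameters.items()
        if interface in VISIBILITY[visibility]
    ]


# Hypothetical settings for a few CSS validator parameters.
parameters = {
    "uri": "both",
    "profile": "advanced",
    "warning": "advanced",
    "output": "none",
}
standard_view = visible_parameters(parameters, "standard")
advanced_view = visible_parameters(parameters, "advanced")
```

An unknown visibility value raises `KeyError`, which surfaces a typo in a contract file instead of silently hiding the parameter.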

Internationalization and localization

There are two sides to i18n and l10n: