Unicorn Requirements Document / Book of Specifications
Built from April to September, 2006 by W3C interns Damien Leroy and Jean-Guilhem Rouel.
Introduction
The W3C currently provides many validators and checkers that help developers create Web documents that conform to specifications produced by W3C Working Groups, or make those documents accessible.
The most popular ones are probably the CSS validator and the Markup validator. But there are also some useful tools such as the link checker and many others.
So, if you want to check multiple things on a document, you will have to use multiple validators and checkers.
The aim of our internship is to create a "universal validator" that can validate and check multiple aspects of a document through a single Web interface. This is not merely a merge of all the validators' interfaces: there should be a strong link between all these tools to avoid inconsistencies and redundant work.
The idea [...] of this framework would be to create a system where several modules would give observations on a given source, using soap messaging to communicate. There would certainly be some orchestration of the observation requests/replies by a central module, but one could imagine that the modules could pass their observations to one another.
So, the framework will be composed of two different types of modules:
- several micro-observers that will check specific parts of a document
- a central module that will glue and coordinate all these micro-observers
Let's see in detail how these modules will work.
Central module
In the following I will often use the term UniCORN to name this central module. In fact, UniCORN is the whole framework (central + observer modules), but it will be easier to call the central module UniCORN.
The central module is the part of the framework that has the difficult task of coordinating the different micro-observers and dispatching information between them, while also providing a user interface.
Coordinator
To be able to coordinate the different observers, the central module needs some information about each of them. This information includes the module's location, its role, what types of documents it can or should handle, and so on.
To provide all this information, we will use an XML file placed in a specific directory of the central module. A cool feature would be the ability to add files to this directory and have the corresponding modules added dynamically, without having to restart UniCORN.
Some existing servers already provide such a feature (e.g. Tomcat, JBoss, Jonas, ... (yes, they are all about Java ;))).
The complete description of this contract will come later.
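As a rough illustration of the dynamic registration idea, the sketch below polls a directory and reports contract files that have appeared since the last scan. It is only a sketch under assumptions: the ObserverRegistry class, the "*.xml" file pattern and the polling approach are hypothetical and are not defined by this document.

```python
from pathlib import Path


class ObserverRegistry:
    """Remembers which contract files have already been registered.

    Hypothetical helper: the class name, the "*.xml" pattern and the
    polling approach are assumptions, not part of the specification.
    """

    def __init__(self, contract_dir):
        self.contract_dir = Path(contract_dir)
        self.registered = set()  # names of contract files already seen

    def refresh(self):
        """Rescan the directory and return the names of newly added files."""
        found = {path.name for path in self.contract_dir.glob("*.xml")}
        added = sorted(found - self.registered)
        self.registered |= found
        return added
```

Calling refresh() periodically (or on each request) would let the central module pick up a new observer without a restart; each returned file name would then be parsed and the corresponding module registered.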
Web interface
UniCORN will provide a unified Web interface that merges the interfaces of all modules and avoids redundancies such as two fields for entering the resource location.
The default interface offers three ways to call UniCORN:
- by URL
- by file upload
- by direct input
These three alternatives are the standard ways to call the current CSS or Markup Validators and are suited to checking various Web documents.
In addition to this standard interface, there will be module-specific pieces of interface, one for each module. This is covered in the Interface description section.
The Web interface will be the only "official" one, but it will be easy to add new UIs by extending the appropriate classes and interfaces.
Validation result
One mission of UniCORN is to provide output in several formats to suit any type of client (browsers, command-line software, automatic tools, ...). For example, most users will need (X)HTML output, but some tools may prefer SOAP or EARL.
So, the framework will have an output engine that makes it easy to add new output formats by writing only a template document that will be used to build the results. The CSS Validator has an embryo of such a feature, but it still needs Java classes to handle each output correctly.
The template principle is illustrated by the figure below:
The template format will look like this (for a plain text output for example):
#if( $valid )
The document is valid
#else
The document contains $errorsCount errors
#foreach( $error in $errorList )
* Line: $error.line
$error.message
#end
#end
Words with a leading $ are variables provided by the framework, while words with a leading # are directives that allow things such as iteration over a collection of elements or conditional branching.
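To make the logic of the template above concrete, here is a minimal Python sketch that produces the same plain-text output by hand. It is not the output engine itself: the render_plain_text function is hypothetical, errors are assumed to be dictionaries with "line" and "message" keys, and the error count is assumed to equal the length of the error list.

```python
def render_plain_text(valid, error_list):
    """Build the text that the plain-text template describes.

    Assumptions: each error is a dict with "line" and "message" keys,
    and $errorsCount corresponds to len(error_list).
    """
    if valid:  # the #if( $valid ) branch
        return "The document is valid\n"
    # the #else branch: a summary line, then one entry per error
    lines = [f"The document contains {len(error_list)} errors"]
    for error in error_list:  # the #foreach( $error in $errorList ) loop
        lines.append(f"* Line: {error['line']}")
        lines.append(error["message"])
    return "\n".join(lines) + "\n"
```

The point of the template approach is that this kind of formatting logic lives in a document rather than in code, so a new output format would not require a new class.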
Micro-observers
Micro-observers are modules used by UniCORN to get different observations on the same document. These observations can include CSS validity, (X)HTML validity, mobile Web accessibility, link checking, and so on. In fact, these observers are existing validators or checkers, slightly modified so that they can be "plugged" into UniCORN.
It might be necessary to adapt their entry point and their output format so that the framework can communicate with them.
The contract
The contract consists of two files that register an observer with UniCORN. It is mandatory if we want a flexible architecture that can accept various validators and checkers without having to adapt the code of UniCORN.
As a consequence, UniCORN will have as many contracts as there are observers plugged into it.
The first and most important document of a contract is a simplified WADL document describing the different methods supported by the observer (a POST one and a GET one) and the different parameters (and their possible values) of each call. The second document is an RDF document (associated with an RDF Schema, RDFS) that contains metadata in different languages. These documents will be located on the observer side and describe what the observer can do.
Generalities (RDF document)
Some fields of the contract are not necessary for the framework to work, but can be useful for providing information to the user. For example:
- name
- version
- description
- provider
- help location
- list of MIME types the observer can handle
These properties won't influence the way the framework works, so most of them are optional and will be written in the RDF file.
The next sections will describe what information the WADL document will give.
Location
UniCORN needs to know where an observer resides to be able to call it, so a module location entry could be quite useful :-).
As the framework will work on the Web, this field will be a URL, such as http://validator.w3.org/check or http://jigsaw.w3.org/css-validator/validator, so that we can directly append possible arguments (the URI of the document, for example).
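Appending arguments to a module location could look like the following sketch. The build_observer_call helper is hypothetical, and using uri as the name of the document parameter is an assumption made for the example.

```python
from urllib.parse import urlencode


def build_observer_call(location, params):
    """Append URL-encoded arguments to an observer location (hypothetical helper)."""
    # Use "&" if the location already carries a query string, "?" otherwise.
    separator = "&" if "?" in location else "?"
    return location + separator + urlencode(params)


# The parameter name "uri" is an assumption made for illustration.
call = build_observer_call("http://validator.w3.org/check",
                           {"uri": "http://www.w3.org/"})
```

Encoding the document URI matters here: without it, characters such as "&" or "?" in the checked document's address would be read as part of the observer call.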
Parameters allowed
Each module probably has specific parameters to tune a validation or check. For example, the CSS validator has uri, text, profile, medium, warning, output and lang.
Each parameter has a list of possible values, and possibly a default one.
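A parameter with a list of possible values and an optional default could be modelled as in the sketch below. The Parameter class and its resolve method are hypothetical, and the values listed for medium are illustrative only; they are not taken from the CSS validator.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Parameter:
    """One observer parameter as a contract might describe it (assumed shape)."""
    name: str
    values: Tuple[str, ...] = ()    # empty tuple: any value is accepted
    default: Optional[str] = None   # None: no default value

    def resolve(self, supplied=None):
        """Return the value to send to the observer, or None to omit it."""
        if supplied is None:
            return self.default
        if self.values and supplied not in self.values:
            raise ValueError(f"{supplied!r} is not allowed for {self.name}")
        return supplied


# Illustrative values only; they are not taken from the CSS validator.
medium = Parameter("medium", values=("all", "screen", "print"), default="all")
uri = Parameter("uri")
```

With this shape, the central module could reject an invalid value before calling the observer, and fall back to the default when the user supplies nothing.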
Tasks
UniCORN will not offer users a list of possible observations, since there can be many. Instead, users will choose a task from a list.
A task is a set of observations, some of which may depend on others. For example, we can imagine that in a specific task, CSS validation depends on markup validation for HTML documents.
A list of parameters will be associated with each task. These parameters will be mapped to the parameters of the observers involved in the task.
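The sketch below illustrates one possible reading of a task: a dependency graph of observations plus a mapping from task parameters to observer parameters. The dictionary layout, the function names and the parameter names are all assumptions made for the example; the ordering uses the standard graphlib module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical task description: each observation lists the observations
# it depends on, and each task parameter maps to observer parameters.
task = {
    "observations": {"markup": [], "css": ["markup"]},  # css depends on markup
    "mapping": {
        "url": {"markup": "uri", "css": "uri"},
        "language": {"css": "lang"},
    },
}


def execution_order(task):
    """Order the observations so that dependencies run first."""
    return list(TopologicalSorter(task["observations"]).static_order())


def observer_parameters(task, task_params):
    """Translate task-level parameters into per-observer parameters."""
    calls = {name: {} for name in task["observations"]}
    for task_param, value in task_params.items():
        targets = task["mapping"].get(task_param, {})
        for observer, observer_param in targets.items():
            calls[observer][observer_param] = value
    return calls
```

Here execution_order(task) places markup before css, and a single task parameter such as url is handed to each observer under that observer's own parameter name.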
Interface description
The description of the user interface is closely related to the management of task parameters.
This part of the file will describe how the parameters should be represented.
If a parameter consists of a list of values, we should be able to choose between a radio-button list, a drop-down menu, a checkbox list, and so on. It should also be possible to add a text element if users can enter their own value.
If a parameter does not have a list of values, we should have a choice between a textarea and a text element.
Parameters like url or text are special because they determine the method to use.
Finally, something must indicate whether a parameter is visible in each of the two interfaces, standard and advanced. The possible values could be none to hide the parameter, simple to show it only in the standard interface, advanced to show it only in the advanced interface, and both to show it in both interfaces.
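The four visibility values can be read as a small lookup table, as in this sketch. The VISIBILITY table and the visible_parameters function are hypothetical, and the visibility assigned to each parameter below is invented for the example.

```python
# Which interfaces each visibility value exposes a parameter in.
VISIBILITY = {
    "none": set(),
    "simple": {"standard"},
    "advanced": {"advanced"},
    "both": {"standard", "advanced"},
}


def visible_parameters(parameters, interface):
    """Return the names of the parameters shown in the given interface."""
    return [name for name, visibility in parameters.items()
            if interface in VISIBILITY[visibility]]


# Invented assignments, for illustration only.
parameters = {"uri": "both", "profile": "advanced",
              "lang": "simple", "output": "none"}
```

With these assignments, the standard interface would show uri and lang, and the advanced interface would show uri and profile.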
Internationalization and localization
There are two sides to i18n and l10n:
- Framework side: the Web interface should be localized in several languages. Generic messages (such as "Not found" when a resource is unreachable) should also be localized, unless we let each observer report them. Output templates will also be localizable, with one template per language.
- Micro-observers side: each micro-observer has its own messages. The framework doesn't know them, so the observers need to provide localized messages. If they do not, some messages will be displayed in the default language, English. Observers need to support a lang parameter to enable localization.
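The fallback to English described above could work as in the following sketch. The catalog layout (language code, then message key) and the localized_message function are assumptions; the document does not define how observers store their messages.

```python
def localized_message(catalog, key, lang, default_lang="en"):
    """Look up a message in the requested language, falling back to English.

    Assumed catalog layout: {language code: {message key: text}}.
    """
    for candidate in (lang, default_lang):
        message = catalog.get(candidate, {}).get(key)
        if message is not None:
            return message
    return key  # last resort: show the key itself


# Invented catalog, for illustration only.
catalog = {
    "en": {"not_found": "Not found", "valid": "The document is valid"},
    "fr": {"not_found": "Introuvable"},
}
```

With this catalog, a request for the valid message in French falls back to the English text because no French translation is provided.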
