Unicorn Implementation Questions

Who will parse the document? (Solved)

There are several possibilities:

Personally (Jean-Guilhem), I think the best solution would be the third one, because it avoids large modifications to the observers and the bugs that would result. I also don't think that having multiple parses would lead to many inconsistencies (existing validators are nearly perfect... ahem :-)), and if that did happen, a simple message could explain that the validators/checkers have some issues...

Currently, we use the third solution.

How should observers interact with each other and with UniCORN? (Solved)

In the previous question, we saw that if an observer parses the document and emits fragments to other observers, there are two possibilities:

This problem exists only if we choose the second solution in the previous question, but one problem remains either way: how do the observers and the framework communicate with each other? We think the solution would be to use SOAP messages or EARL documents (and maybe only GET requests in the framework→observer direction if we choose the third answer in the previous question).

Currently, the observers do not communicate with each other, so this question no longer applies.

Which implementation for the framework? (Solved)

Since the validators/checkers are already developed using various technologies, UniCORN should be fairly technology-neutral; we only have to use a standard way of communicating between the different entities (as explained before).

We think we will develop it in Java because it is quite easy to use and we already know it (unlike Perl). We still don't know which technology to use (servlets, JSP, ...) and need to read up on that.

Identify document type

Because the framework makes it possible to validate/check many types of documents, such as XHTML documents, CSS stylesheets, XML documents and others, we need a way to identify the type of a document. In most cases, we can rely on the Content-Type provided by other servers, or on the MIME type for file uploads.

The main problem is what to do in the case of direct input.

Currently, we use a drop-down list in the "Direct input" interface.
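
The identification step described above could look roughly like the following. This is only a minimal sketch: the class and method names (DocumentTypeResolver, mimeTypeFromHeader, resolve) are hypothetical and not taken from the UniCORN code. It strips parameters such as charset from a Content-Type value and, when no header is available (direct input), falls back to the type chosen in the drop-down list.

```java
import java.util.Locale;

// Hypothetical helper for illustration; not part of the actual UniCORN code base.
class DocumentTypeResolver {

    /**
     * Returns the bare MIME type (lower-cased, without parameters such as
     * "charset") from a Content-Type header value, or null if none is usable.
     */
    static String mimeTypeFromHeader(String contentType) {
        if (contentType == null) {
            return null;
        }
        int semicolon = contentType.indexOf(';');
        String bare = (semicolon >= 0) ? contentType.substring(0, semicolon) : contentType;
        bare = bare.trim().toLowerCase(Locale.ROOT);
        return bare.isEmpty() ? null : bare;
    }

    /**
     * URI and file-upload input carry a Content-Type; direct input does not,
     * so the value selected by the user in the drop-down list is used instead.
     */
    static String resolve(String contentTypeHeader, String userSelectedType) {
        String fromHeader = mimeTypeFromHeader(contentTypeHeader);
        return (fromHeader != null) ? fromHeader : userSelectedType;
    }

    public static void main(String[] args) {
        System.out.println(resolve("text/css; charset=utf-8", null));
        System.out.println(resolve(null, "application/xhtml+xml"));
    }
}
```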

Do we use the word "validation" correctly in these documents?

We hope we are not mistaken when we use this word. It is possible that we sometimes mix up validation and checking.

Observers output

Observers' results can be written in SOAP or EARL. The question is which of them would be better for communicating with the framework. SOAP output already exists for some validators, but it is not normalized (the CSS validator has one format, the Markup validator has a different one). So, will it be possible to handle these SOAP messages generically if they are all different? Yves seems to say yes, but we don't really understand how: how will the framework know the way to handle the results? For example, if one observer decides to write its messages in a <msg> tag and another one in an <error> tag, the framework probably won't know how to manage them. So we think that having a "universal" WSDL would be a lot easier, but... When Yves returns to work we will talk about this problem with him.
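
To make the concern concrete, here is a small sketch in which the framework has to be told, per observer, which element carries the messages. Everything in it is an assumption made for illustration: the class name ObserverMessageExtractor, the sample XML, and the idea of passing the element name as a parameter. It is not how the existing validators format their output.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch: element names and XML samples are invented for illustration.
class ObserverMessageExtractor {

    /** Collects the text of every element named messageTag in an observer's response. */
    static List<String> extract(String responseXml, String messageTag) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(responseXml)));
            NodeList nodes = doc.getElementsByTagName(messageTag);
            List<String> messages = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                messages.add(nodes.item(i).getTextContent().trim());
            }
            return messages;
        } catch (Exception e) {
            // Malformed XML or parser configuration problems end up here.
            throw new IllegalArgumentException("Cannot read observer response", e);
        }
    }

    public static void main(String[] args) {
        // Two observers reporting the same kind of result under different element names.
        String observerA = "<result><msg>Unknown property</msg></result>";
        String observerB = "<result><error>Missing end tag</error></result>";
        System.out.println(extract(observerA, "msg"));
        System.out.println(extract(observerB, "error"));
    }
}
```

Without the second argument the framework cannot find anything in either response, which is why a shared output format (or a per-observer description of the format) seems necessary.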

File upload (Solved)

How can we handle file upload? The framework could read the document and send its content to the observers using the text parameter (this parameter exists in the CSS validator and the Markup validator) with the correct MIME type. Another way may be to upload the file to each observer.

We use a specific class (ClientHttpRequest, written by Vlad Patryshev) to upload the file received from the client to every appropriate observer.
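
For readers unfamiliar with what such an upload involves, the sketch below builds the body of a multipart/form-data request with a single file part, using only the standard library. It does not reproduce the API of the ClientHttpRequest class mentioned above; the class name MultipartBodyBuilder and all parameter names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch using only the standard library; it does not reproduce
// the API of the ClientHttpRequest class used by the framework.
class MultipartBodyBuilder {

    /** Builds a multipart/form-data body containing a single file part. */
    static byte[] build(String boundary, String fieldName, String fileName,
                        String mimeType, byte[] content) {
        String header = "--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"" + fieldName
                + "\"; filename=\"" + fileName + "\"\r\n"
                + "Content-Type: " + mimeType + "\r\n\r\n";
        byte[] head = header.getBytes(StandardCharsets.UTF_8);
        // The closing delimiter ends with two extra dashes.
        byte[] tail = ("\r\n--" + boundary + "--\r\n").getBytes(StandardCharsets.UTF_8);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(head, 0, head.length);
        out.write(content, 0, content.length);
        out.write(tail, 0, tail.length);
        return out.toByteArray();
    }
}
```

The same body can then be posted to each appropriate observer with the request header Content-Type: multipart/form-data; boundary=<the boundary used above>, so the file keeps the MIME type received from the client.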

Use of WSDL/WADL

Yves told us that it would be great to use WSDL or WADL to describe the contract. But we are not sure it will be enough to describe everything, because the contract is not only used to know how to call an observer. It also contains information on the possible values of the parameters, a description of the UI, ... We will discuss this problem with Yves as soon as he is back from holidays.

Parameters URI, DirectInput and FileUpload

These three parameters are special because they define the input method. So we need a way to handle observers that do not allow all of these input methods.

One approach could be: describe in the observer's RDF whether it can handle each parameter; in the client interface, enable only the input methods allowed by the selected task; and show, for tasks that need it, information about the input methods that are not handled.
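
A minimal sketch of this approach follows. It assumes the declarations have already been read from each observer's RDF into a map, and it assumes one possible rule that the text above does not settle: an input method is enabled for a task if at least one of its observers handles it. All names (InputMethodSupport, enabledFor, notHandling) are hypothetical.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; the enabling rule (at least one observer) is an assumption.
class InputMethodSupport {

    enum InputMethod { URI, DIRECT_INPUT, FILE_UPLOAD }

    /** Input methods to enable in the UI: those handled by at least one observer of the task. */
    static Set<InputMethod> enabledFor(Map<String, Set<InputMethod>> observersOfTask) {
        Set<InputMethod> enabled = EnumSet.noneOf(InputMethod.class);
        for (Set<InputMethod> declared : observersOfTask.values()) {
            enabled.addAll(declared);
        }
        return enabled;
    }

    /** Names of the observers that do not handle the given method, to show as a notice. */
    static List<String> notHandling(Map<String, Set<InputMethod>> observersOfTask,
                                    InputMethod method) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, Set<InputMethod>> entry : observersOfTask.entrySet()) {
            if (!entry.getValue().contains(method)) {
                names.add(entry.getKey());
            }
        }
        return names;
    }
}
```

With a stricter rule (a method is enabled only if every observer of the task handles it), enabledFor would use retainAll starting from the full set instead of addAll, and the notice would no longer be needed.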

Selecting one observer among two or more

In some cases it will be useful to select only the best observer to handle a document and skip the other observers, even if they can also handle the MIME type of the document.

Currently, the UniCORN framework checks each observer and calls those that can handle the MIME type of the document.
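
The current behaviour can be sketched as a simple filter over the observers' declared MIME types. The class name ObserverSelector and the map-based representation are hypothetical; the real framework may store this information differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the current rule: every observer that declares the
// document's MIME type is called; no "best observer" ranking is applied.
class ObserverSelector {

    /** Returns the names of all observers whose declared MIME types include mimeType. */
    static List<String> select(Map<String, Set<String>> handledTypes, String mimeType) {
        List<String> selected = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : handledTypes.entrySet()) {
            if (entry.getValue().contains(mimeType)) {
                selected.add(entry.getKey());
            }
        }
        return selected;
    }
}
```

Choosing a single best observer would require extra information that the MIME type alone does not provide, for example a priority declared per observer and per type.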