Re: Pipeline proposal

Richard Tobin wrote:
> See http://www.cogsci.ed.ac.uk/~richard/pipeline.html

Thanks for posting this, Richard. It's very helpful to have something 
concrete like this to discuss.

Some questions/comments:

Why do you say "Each port must be connected to exactly one other port."? 
I see that you've suggested 'sink' and 'duplicator' components to 
support occasions when you want to ignore some output or connect some 
output to more than one input, but you wouldn't need these components if 
you could simply connect each port to zero or more other ports. 
What's the rationale behind this constraint?
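
To make the alternative concrete, here is a rough Python sketch of wiring in which an output port may feed zero, one, or several input ports. The class, method, and port names are all hypothetical and do not come from the proposal; it only illustrates that an unconnected output is simply dropped and a multiply-connected output is fanned out, without 'sink' or 'duplicator' components.

```python
from collections import defaultdict


class Wiring:
    """Hypothetical wiring table: each output port maps to zero or more input ports."""

    def __init__(self):
        self._targets = defaultdict(list)

    def connect(self, output_port, input_port):
        # Connecting the same output more than once fans it out.
        self._targets[output_port].append(input_port)

    def deliver(self, output_port, document):
        # An unconnected output yields an empty list, i.e. the document is ignored.
        return [(port, document) for port in self._targets.get(output_port, [])]


wiring = Wiring()
wiring.connect("validate.result", "transform.source")
wiring.connect("validate.result", "report.source")
deliveries = wiring.deliver("validate.result", "<doc/>")
ignored = wiring.deliver("validate.errors", "<errors/>")
```

Whether every consumer may safely share the same document, or each needs its own copy, is a separate question that this sketch does not address.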

There seems to be a growing consensus within the group, reflected here, 
that inputs and outputs accept/produce *documents* and can't 
accept/produce document fragments (e.g. elements). I assume that they 
also can't accept/produce external parsed entities (e.g. documents 
without a document element). I accept that this might be a necessary 
simplification for this version, but I think that we should have in the 
back of our minds that to fully support XSLT and XQuery we really need 
to pass document fragments and external parsed entities between components.

You talk about declaring the cardinality of the inputs/outputs a 
component accepts/produces. Another thought, perhaps for a future 
version, would be to provide more detail about what's expected/generated 
by a particular component: perhaps giving the name of the document 
element or even an entire schema that the documents adhere to. I'd like 
to see us allow for that kind of extension if we don't support it in 
this version.
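
As an illustration of the simplest form of this, the Python sketch below declares a port that optionally names the document element it expects. The `PortDeclaration` class and its fields are hypothetical, not part of the proposal, and a real design would probably need to handle full schemas as well.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Optional


@dataclass
class PortDeclaration:
    """Hypothetical port declaration with an optional expected document element."""

    name: str
    # Element name in ElementTree notation ('{uri}local' when namespaced);
    # None means the port places no constraint on the document.
    document_element: Optional[str] = None

    def accepts(self, xml_text: str) -> bool:
        if self.document_element is None:
            return True
        return ET.fromstring(xml_text).tag == self.document_element


source = PortDeclaration("source", document_element="html")
ok = source.accepts("<html><body/></html>")
rejected = source.accepts("<doc/>")
```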

I'm concerned about how we define parameters. In particular, I'm worried 
about the XSLT case where one of the parameters for the component is a 
set of QName/value pairs (the XSLT parameters). I guess that in this 
proposal, you'd either use a parameter called 'parameters' whose value 
was a formatted string such as '{uri1}local1=value1; 
{uri2}local2=value2', or encode the XSLT parameters in an XML document 
and pass that document as an *input*.
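
To show what the formatted-string option would demand of a component, here is a minimal Python sketch of a parser for that string. The syntax is only my guess from the example above ('{uri}local=value' pairs separated by ';'), and the function name is hypothetical.

```python
import re

# Assumed syntax: '{namespace-uri}local-name=value', pairs separated by ';'.
_PAIR = re.compile(r"^\{([^}]*)\}([^=\s]+)=(.*)$")


def parse_xslt_parameters(text):
    """Parse a 'parameters' string into a {(uri, local): value} dictionary."""
    params = {}
    for chunk in text.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue  # tolerate a trailing ';'
        match = _PAIR.match(chunk)
        if match is None:
            raise ValueError(f"malformed parameter: {chunk!r}")
        uri, local, value = match.groups()
        params[(uri, local)] = value
    return params


parsed = parse_xslt_parameters("{uri1}local1=value1; {uri2}local2=value2")
```

Note that a value containing ';' cannot be expressed in this sketch without adding an escaping rule, and every value arrives as a plain string.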

You say: "Except as described in conditionals, all components in a 
pipeline are run (in particular, they do not get run only if input 
arrives or output is requested)." I'm not sure of your intention here. 
I'm worried that this constraint prevents implementations from caching 
and reusing intermediate documents (if they can detect that the 
information that led to the generation of those documents hasn't 
changed). Perhaps we need to look at the question of whether components 
can have side-effects to work out whether this is important or not.
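
The kind of caching I have in mind could look roughly like the Python sketch below. It assumes components have no side-effects and are deterministic, so that a component's outputs depend only on its name, its input documents, and its parameters. The `CachingRunner` class is hypothetical, and documents are represented as plain strings for simplicity.

```python
import hashlib


class CachingRunner:
    """Hypothetical runner that reuses results when inputs and parameters are unchanged."""

    def __init__(self):
        self._cache = {}
        self.runs = 0  # number of times a component was actually executed

    def run(self, name, component, inputs, params):
        key = self._key(name, inputs, params)
        if key not in self._cache:
            self.runs += 1
            self._cache[key] = component(inputs, params)
        return self._cache[key]

    @staticmethod
    def _key(name, inputs, params):
        # Hash everything the result is assumed to depend on.
        digest = hashlib.sha256(name.encode("utf-8"))
        for document in inputs:
            digest.update(b"\0doc\0" + document.encode("utf-8"))
        for param_name in sorted(params):
            digest.update(b"\0param\0" + param_name.encode("utf-8"))
            digest.update(b"=" + params[param_name].encode("utf-8"))
        return digest.hexdigest()


def uppercase(inputs, params):
    # Stand-in for a real component such as an XSLT transformation.
    return inputs[0].upper()


runner = CachingRunner()
first = runner.run("uppercase", uppercase, ["<a/>"], {})
second = runner.run("uppercase", uppercase, ["<a/>"], {})  # served from the cache
```

If all components must always run, the second call could not be skipped, which is why the question of side-effects seems to matter here.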

That's enough for now.

Cheers,

Jeni
-- 
Jeni Tennison
http://www.jenitennison.com

Received on Friday, 24 March 2006 11:09:47 UTC