See also: IRC log
Alessandro suggests taking item 2.5 before 2.3
<scribe> ACTION: Henry to provide registration page for August f2f [recorded in http://www.w3.org/2006/04/20-xproc-minutes.html#action01]
<ht> http://www.w3.org/2002/09/wbs/38398/XProcFTF2/ is now listed as an open questionnaire for our group [this completes HT's action --scribe]
<scribe> ACTION: Murray to provide local arrangements info for August (ETA: two weeks) [recorded in http://www.w3.org/2006/04/20-xproc-minutes.html#action02]
MSM: One prominent way to get to the meeting will be to drive. Can we add some questions about car pooling to the registration form?
Murray: I'm thinking about that, I'll see what makes the most sense.
Henry: Let's use a wiki for that instead
Alessandro: This was raised in a call a few weeks ago.
... I don't know if we need to spend a whole lot of time on it. We probably don't want to add constructs to the language to control this if we can avoid it.
Richard: I hope most of this falls out naturally. If we don't specify the order of execution where it isn't inevitable, that implicitly allows parallel execution. We don't initially have to say much about it.
... If you have two things that could be executed in parallel, maybe they will be. If you want to synchronize them, you have to provide some mechanism, such as reading a document that one is writing.
Alex: I think we shouldn't disallow parallel execution.
Richard: We shouldn't put anything in the language to accidentally prevent it.
Norm: It sounds like we view parallel exec. just as an optimization.
Richard: I take the normal Unix pipeline as a model. If you have two processes running, nothing expresses the order except that if one is reading and one is writing, you can be sure the reader will block waiting for the
... Another aspect is that any kind of streaming implies a certain kind of parallelism.
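The Unix pipeline model Richard describes can be sketched in a few lines. This is only an illustration, assuming two threads connected by a bounded queue; the names `writer`, `reader`, and `run_pipeline` are hypothetical and nothing here is proposed syntax or a proposed processing model. The only ordering between the two steps comes from the reader blocking on data the writer has not yet produced.

```python
import queue
import threading


def writer(pipe, items):
    # Produce items one at a time; put() blocks while the bounded queue is full.
    for item in items:
        pipe.put(item)
    pipe.put(None)  # sentinel marking end of stream


def reader(pipe, out):
    while True:
        item = pipe.get()  # blocks until the writer has produced something
        if item is None:
            break
        out.append(item.upper())


def run_pipeline(items):
    pipe = queue.Queue(maxsize=1)  # small buffer, so the steps must interleave
    out = []
    r = threading.Thread(target=reader, args=(pipe, out))
    w = threading.Thread(target=writer, args=(pipe, items))
    r.start()  # the reader is started first and simply waits
    w.start()
    w.join()
    r.join()
    return out
```

Starting the reader before the writer does not change the result, which is the point: no explicit execution order is stated, and the data dependency alone synchronizes the two steps.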
Norm repeats summary.
Richard: Not just features of the language, but also the way we describe the language. A processing model might have unintended consequences that prevent parallelism; we want to avoid that.
... An example: we might describe the processing model as if it executed the components in top-to-bottom, left-to-right order, which would be bad because it would imply that side-effects (if there are any) occur in a particular order.
Alessandro: This is a specific question about a particular example.
... The stylesheet executed by the second step is produced by the first step.
... Should the pipeline engine be allowed to cache the stylesheet produced by the first step across invocations?
... Can the engine be smart enough to determine that the output will be the same and reuse a cached value?
Norm: I think that what an engine does is not our problem.
Richard: The answer, in some sense, is obviously yes. If the engine can determine that the same results will be produced, then it can use the cached copy.
Richard: What does it mean for it to be exactly the same? Vanilla XSLT 1.0 stylesheets can't produce any side effects.
... but care must still be taken to ensure that side effects don't happen
... We may need a way to allow authors to express that some components are side-effect free
Alex: It would be interesting to consider annotating the steps
... You may be able to say "never cache" but maybe a smart impl could cache or not as it saw fit otherwise.
Richard: There are some even simpler cases of caching. In the MT pipeline, we compile schemas and cache them. That means the same schema used in two places can reuse the cached copy.
Alex: The more interesting case is where it's produced by the pipeline.
<MoZ> alexmilowski, cache hints like expires in Cocoon ?
Alex: The concept of a dynamically generated schema isn't far-fetched, but URIs that change every time you read them could be problematic.
HT: The HTTP expires case isn't good enough. The MT engine checks using the HTTP refresh if stale every time anyone touches a cached resource because there's no way to count on pipeline time and internet time
... The actual time between two uses of a cached object may be wildly different from what you think it is. The only safe thing to do is ask the server each time.
... That works on a filesystem too
... I'm not sure how that works in the context of documents generated by the pipeline
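The "ask the server each time" policy HT describes can be sketched as follows. This is a minimal illustration and not the MT engine's actual code: `FakeOrigin` is a hypothetical in-memory stand-in for an HTTP server or filesystem, and a simple version counter stands in for whatever freshness check the real origin supports.

```python
class FakeOrigin:
    """Hypothetical stand-in for a server or filesystem holding resources."""

    def __init__(self):
        self.resources = {}  # uri -> (version, body)

    def put(self, uri, body):
        version = self.resources.get(uri, (0, None))[0] + 1
        self.resources[uri] = (version, body)

    def version(self, uri):
        return self.resources[uri][0]

    def fetch(self, uri):
        return self.resources[uri]


class RevalidatingCache:
    """Caches compiled resources but checks freshness on every access."""

    def __init__(self, origin, compile_fn):
        self.origin = origin
        self.compile_fn = compile_fn
        self.entries = {}  # uri -> (version, compiled result)
        self.compilations = 0

    def get(self, uri):
        current = self.origin.version(uri)  # always ask the origin first
        cached = self.entries.get(uri)
        if cached is not None and cached[0] == current:
            return cached[1]  # still fresh: reuse the compiled copy
        version, body = self.origin.fetch(uri)
        compiled = self.compile_fn(body)
        self.compilations += 1
        self.entries[uri] = (version, compiled)
        return compiled
```

The cache never trusts elapsed time: an unchanged resource is compiled once however often it is used, and a changed one is recompiled on its next use.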
<Zakim> ht, you wanted to endorse the idea of annotation
HT: I think that for practical reasons, I'd be very unhappy to see any requirement of no side-effects imposed on components.
... I think "escape to program execution" and "synchronous SOAP exchange" are examples of components that cannot have intrinsic guarantees of no side-effects.
... There are also cases of components that do database updates. Those components have a side-effect.
Alex: Those aren't (necessarily) examples of pipeline steps communicating through
... If you're going to have that synchronization problem, you'd set up a dependency for that.
HT: I'm in favor of an approach which has a default and allows the component to assert the
... 1. Not side-effect free; even though my inputs are the same, you can't be sure I'll produce the same output and
<MSM> [this seems to be an "i am not a function" declaration?]
HT: 2. An expression of out-of-band dependencies.
Norm observes that we've wandered into the issue of side-effects
Norm: I think everyone will agree that if the pipeline knows the output will be the same, it can cache the result
Alessandro: I'm not sure if this is too strong a statement. Consider the case of reading a stylesheet from a URI.
MSM: Side-effects and caching are not separable questions.
Alex: Caching is a feature of the implementation not the language
Richard: Just a big switch will probably be too coarse-grained.
... I imagine descriptions for each component type and the XSLT component might, for example, say that it has no side-effects by default. But then on a particular case, you could override it
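One way to picture the per-component-type default with a per-use override is the sketch below. It assumes a hypothetical table `COMPONENT_DEFAULTS` and a hypothetical `side_effect_free` flag on each step; the group has not agreed on any such annotation, so every name here is illustrative only.

```python
# Hypothetical defaults per component type: True means "side-effect free".
COMPONENT_DEFAULTS = {"xslt": True, "soap-call": False}


class Step:
    def __init__(self, component_type, func, side_effect_free=None):
        self.component_type = component_type
        self.func = func
        default = COMPONENT_DEFAULTS.get(component_type, False)
        # A particular use of a component may override its type's default.
        self.side_effect_free = default if side_effect_free is None else side_effect_free


class Engine:
    def __init__(self):
        self.cache = {}  # (step, input document) -> result
        self.executions = 0

    def run(self, step, doc):
        key = (step, doc)
        if step.side_effect_free and key in self.cache:
            return self.cache[key]  # safe to reuse: same step, same input
        result = step.func(doc)
        self.executions += 1
        if step.side_effect_free:
            self.cache[key] = result
        return result
```

Unknown component types default to "not side-effect free" in this sketch, so the engine only caches when something has positively asserted that it is safe.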
Alex: The thing that concerns me about being able to say a component has side-effects is that it isn't clear what that means.
... Does it really affect the pipeline running?
... Unless there's some dependency in the flow-graph, what can the processor do?
... Unless we do something like what XSLT does with the document() function, I'm not sure there's a great answer here.
Norm: I wouldn't stream past a component that had side-effects
Norm describes the case of a SOAP service
Alex: You have a pipeline with five steps, each has an auxinput that calls this SOAP
... If you cache, you'll get one answer. If you don't cache, you'll get five answers.
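Alex's five-step example can be made concrete with a small sketch. It assumes a hypothetical service that returns a different answer on every call (here a simple counter); `make_service` and `run_five_steps` are illustrative names, not anything discussed on the call.

```python
import itertools


def make_service():
    # Hypothetical stand-in for a SOAP service whose answer changes per call.
    counter = itertools.count(1)
    return lambda: next(counter)


def run_five_steps(service, use_cache):
    cached = None
    answers = []
    for _ in range(5):  # five steps, each reading the same auxiliary input
        if use_cache:
            if cached is None:
                cached = service()  # only the first step really calls out
            answers.append(cached)
        else:
            answers.append(service())  # every step makes its own call
    return answers
```

With caching the five steps all see the same answer; without it each step sees a different one, so whether the engine caches is observable to the pipeline author.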
Norm will take the question to email