XML Processing Model WG -- 16 Feb 2006

Scribe: Norm

ScribeNick: Norm

Date: 16 Feb 2006

Administrivia

Accept this agenda?

-> http://www.w3.org/XML/XProc/2006/02/16-agenda.html

Accepted.

Accept minutes from the previous teleconference?

-> http://www.w3.org/XML/XProc/2006/02/09-minutes.html

Accepted.

Next meeting: 23 Feb 2006.

Any regrets?

None

Dial-in for the face-to-face at the plenary?

(08:00+01:00-18:00+01:00 Monday and Tuesday, 27-28 Feb)

Andrew would like to call in.

Agenda planning for the face-to-face

Alex: Infosets/representation of inputs as a topic for the f2f

Norm: Processing model

Richard: I was speaking about the non-XML stuff being the same thing

Alex: Does it make sense to spend some time talking about the various tools that are out there?

Richard: I was going to suggest the attendees that have a pipeline implementation give a brief presentation on it.

Requirements and Use Cases

<PGrosso> http://lists.w3.org/Archives/Public/public-xml-processing-model-wg/2006Feb/0013.html

<PGrosso> The above URL is from Alessandro.

<PGrosso> Alex: Point 1 is to have a definition of parameters, which we now have.

<PGrosso> Alex: Point 2 should be taken care of now too. Alessandro agrees.

<PGrosso> Alex: Point 3 (standard names for steps). We discussed that a component is like "XSLT" but a step is a thing in the pipeline that may make use of a given component like XSLT.

Richard: Is a step a component plus some parameters plus its position in the pipeline?

Alex: Yes. In fact a step might even use multiple components.

Richard: We probably don't have to come to complete closure on this now

Alessandro: My comment was narrower, just that in the particular place in 4.6 where the word "step" is used, the word "component" would have been better.
... I see a Step like a function call and a Component more like a function.
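
A minimal sketch of the step/component distinction described above, written in Python. All names here (`Component`, `Step`, `uppercase_component`, `run_pipeline`) are hypothetical and come from neither the meeting nor the requirements document; the sketch only illustrates a component as a function and a step as one parameterized use of it at a given position in a pipeline.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical: a component is a reusable transformation, like a function.
Component = Callable[[str, Dict[str, str]], str]


def uppercase_component(document: str, params: Dict[str, str]) -> str:
    """Stand-in for a real component such as XSLT; it only upper-cases text."""
    suffix = params.get("suffix", "")
    return document.upper() + suffix


@dataclass
class Step:
    """A step is one use of a component: the component plus its parameters."""
    component: Component
    params: Dict[str, str] = field(default_factory=dict)

    def run(self, document: str) -> str:
        return self.component(document, self.params)


def run_pipeline(steps: List[Step], document: str) -> str:
    # A step's index in the list is its position in the pipeline.
    for step in steps:
        document = step.run(document)
    return document


# The same component is used by two different steps.
pipeline = [
    Step(uppercase_component, {"suffix": "!"}),
    Step(uppercase_component, {"suffix": "?"}),
]
result = run_pipeline(pipeline, "<doc/>")
```

In this sketch a single component appears in two steps with different parameters, which is the sense in which a step resembles a function call.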

Alex: I could change 4.6 to say Component and be happy with that

General agreement

Alex: Point 4 is intended to say that we won't create a pipeline vocabulary that can't be validated

Richard: Can you give an example of something that couldn't be validated?

Alex: Atom, for example, violates the XML Schema UPA rule by allowing interleaving at several levels
... I would like to avoid that, I'd like to create a vocabulary that can be validated with either language

Richard: I agree as long as it's not taken to extremes. Don't use things that many validation tools can't validate. But if we wind up with co-constraints (in attribute values, for example), it may nevertheless be the best way to do that.
... We can't rule out all constraints that can't be checked by an XML Schema validator.
... This sounds more like a design principle

Norm: I agree with Richard.

Alex: Ok.

Alessandro: I was thinking of the XSLT case, where there are good things that can't be validated easily with XML Schema. I wouldn't like us to constrain ourselves not to do that.

Murray: On the other hand, we'd like processing languages to be as easily validated as possible. We should think long and hard before we let this one go.
... If we're going to allow something that isn't validatable, we're going to think long and hard about it.

Alex: Point 5 is about naming of pipelines
... There's no use case for many of the things in the document so that's a more general problem.

Norm: Can you give an example?

Alex discusses giving pipeline documents URIs

Murray: The mechanism that's missing is do I have a way to reference a pipeline and have it invoked

Richard: Do you mean in general or in a pipeline?
... Do we want pipelines to be able to refer to one another?

Alex: Consider 4.9 on composition, you could say use XInclude
... I think naming goes along with composition.

Richard: It's been the case in several specifications that the new language has defined its own inclusion mechanism. It has always been a hope that if XInclude was available it wouldn't be necessary. Often, alas, it turns out to be necessary.

Norm: I think the design principle "reuse existing technologies" covers that case.
... I propose that we leave 4.9 and let naming fall out of our composition mechanism if it does

Richard: We also have the case of supplying the pipeline in the URI so that you can write a URI that means run this pipeline on this document with these parameters.

Norm: I can't tell from 4.10 if that is what it was for.

Consensus: delete 4.10

<PGrosso> http://lists.w3.org/Archives/Public/public-xml-processing-model-wg/2006Feb/0032.html

<PGrosso> [Norm's email]

Thank you, Paul

Norm describes his ideas

Alex asks about the syntax

Some discussion of flow and parallelism

Richard: I have some problems that are simpler than Alex's case.
... The use of a "current" infoset has two implications: straight through processing, everything is one input or output unless it's named; the other is that it implies sequential processing.
... I don't think the sequential processing is an issue. But the first one is more important.
... If we want to have some components like "XML diff" then I don't think we want to have the two inputs be described in entirely different ways.
... Maybe one has to be input1 and the other input2, but we shouldn't have to go deeper than that.
... But using names for the non-XML data, I think that's an approach to consider.

A collection?

Richard: that isn't what I had in mind
... Suppose you have a pipeline that wants to cleanup some insignificant diffs and then run the XML diff component.
... I imagine that you might start this pipeline with two inputs and at some point they get merged.
... At the point of the execution of the step that does the diff, I want that to be just like the case where there's only one
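
A rough sketch of the scenario Richard describes, in Python. It assumes line-based text and the standard `difflib` module as a stand-in for an "XML diff" component; the function names are hypothetical, and nothing here reflects an agreed pipeline syntax. The point it illustrates is that two inputs are cleaned independently and reach the diff step as two plain inputs (`input1`, `input2`), neither described differently from the other.

```python
import difflib
from typing import List


def strip_insignificant_whitespace(document: str) -> List[str]:
    """Hypothetical cleanup step: drop blank lines and surrounding whitespace."""
    return [line.strip() for line in document.splitlines() if line.strip()]


def diff_step(input1: List[str], input2: List[str]) -> List[str]:
    """Hypothetical diff step: both inputs are ordinary positional arguments."""
    return [
        line
        for line in difflib.ndiff(input1, input2)
        # Keep only removed ("- ") and added ("+ ") lines; skip hints and context.
        if line.startswith(("+ ", "- "))
    ]


def run_diff_pipeline(document1: str, document2: str) -> List[str]:
    # Two inputs enter the pipeline and are cleaned independently...
    cleaned1 = strip_insignificant_whitespace(document1)
    cleaned2 = strip_insignificant_whitespace(document2)
    # ...and are merged only at the step that needs both of them.
    return diff_step(cleaned1, cleaned2)


changes = run_diff_pipeline(
    "<a>\n  <b>1</b>\n</a>\n",
    "<a>\n\n<b>2</b>\n</a>",
)
```

Here the whitespace-only differences are removed before the diff runs, so only the changed `<b>` line is reported.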

Murray: I'm confused.

Alex: Conceptually, this is two pipes inside a pipe I think.

Some discussion of a shell script case

Richard: I'm assuming that we have a way to have two things in the pipeline, I want to get them merged later on
... The way we get two things into the pipeline is by having some upstream thing refer to URIs

Alessandro: I think it's an oversimplification to use the shell script analogy for everything.
... There are existing pipeline languages that can already handle this case.

<Alessandro> (That was Erik)

Oh, sorry.

Murray: Where I'm having difficulty is the case where there's more than one stdin

Richard: That's only if we only allow stdin on a process.

Murray: If we allow each step to have stdin/stdout, that step can also have other inputs.

Richard: Unix actually has a whole bunch of file descriptors, 0, 1, 2, and with sufficient hackery, you can actually read from 5 without ever giving it a name.

Alex: We need a white board for this.

Norm asks for concrete examples

Nearly out of time

Any other business?

None

Adjourned
