W3C

- DRAFT -

XML Processing Model WG

Meeting 274, 01 Jul 2015

See also: IRC log

Attendees

Present
Norm, Henry, Alex, Jim, Murray
Regrets
Chair
Norm
Scribe
Norm

Contents


Accept this agenda?

-> http://www.w3.org/XML/XProc/2015/07/01-agenda

Accepted.

Accept minutes from the previous meeting?

-> http://www.w3.org/XML/XProc/2015/06/24-minutes

Accepted.

Next meeting

Scheduled for 8 July.

No regrets heard.

Review of open action items

Norm proposes creating issues for the longer running ones and we can take them off the agenda.

The default error port

-> https://github.com/xproc/specification/issues/136

Norm: Jim, would you like to review what you've proposed?

Jim: Part of the requirements, issue 48.

<jfuller> https://github.com/xproc/specification/issues/48

Jim: Which states that we should do better at helping people understand and debug pipelines.
... This includes the p:message step and there are various conversations about the idea of providing more default error reporting mechanisms.

Jim: I distilled this down into issue 136. What I've tried to do is balance several concerns.
... The higher level question is: what would help end-users out in terms of what we could formalize for more transparent operations.
... One idea is an implicit error port. Another conversation led to the idea of an implicit report port.
... The idea is an implicit error output port on every step. It's the canonical way of reporting errors.
... It would be an error to attempt to define an error port, except in p:declare-step where you could override it.

Norm: Can you say why you think it should be implicitly declared, rather than explicitly?

Jim: My feeling is that it's just a shorthand. If every step has these ports, I'd rather not declare them. But I'm ok with explicitly declaring them. It's not an important part of the proposal.
... What comes out of the port is arbitrary. The idea I propose is a c:result that emits a c:error or c:errors.
... I don't think we're beholden to an implementor to show every error.
... You can imagine a step like p:for-each that might have an error on one iteration. We don't say much about how we deal with partial failure or multiple errors.
... Do we allow steps to not fail if they report errors?
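[Scribe's illustration] A hypothetical sketch of the implicit error port idea as discussed; the port name "error" and the c:errors vocabulary shown here are illustrative only, not agreed syntax:

```xml
<!-- Hypothetical: the "error" port is implicit on every step and need
     not be declared; only a p:declare-step could override it. -->
<p:declare-step type="ex:my-step">
  <p:input port="source"/>
  <p:output port="result"/>
  <!-- no explicit "error" port: declaring one would be an error -->
</p:declare-step>

<!-- A document that might flow out of the implicit error port
     (element names are invented for illustration): -->
<c:errors>
  <c:error step-type="ex:my-step" step-name="step1">
    Description of what went wrong
  </c:error>
</c:errors>
```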

Norm: I think we can think of it like stderr in unix. Sometimes it coincides with a failure and sometimes it doesn't.

Jim: I've left it implementation-defined if there are foreign namespaced elements or attributes.
... We may want to add information to the errors that identify where the error came from (step type, name, etc.)

Norm: If we imagine something like p:for-each accumulating errors then making them self-identifying makes sense.

Jim: The report port is similar. When I was reading the requirements, what we don't have is a nice audit of what a step does on a per-document point of view.
... The idea is a report port that emits information about what the step has done to each thing flowing through it.
... It's pretty minimal at this point, just a notification of what happened to each document.
... If an error occurred on one of these documents, you could inject that into the reports on this port.
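[Scribe's illustration] A hypothetical sketch of what a report-port document might look like: one entry per document flowing through the step, with errors injected inline. The vocabulary (c:report, c:document, the attributes) is invented here for illustration:

```xml
<!-- Hypothetical report-port output for a single step invocation;
     none of these element or attribute names are settled. -->
<c:report step-type="p:xslt" step-name="transform">
  <c:document href="chap1.xml" status="ok"/>
  <c:document href="chap2.xml" status="error">
    <c:error>Description of the per-document failure</c:error>
  </c:document>
</c:report>
```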

Jim summarizes his example in the issue.

Jim: This is pretty brief. I'm trying to cross a bunch of streams and I first wanted to understand if we should do this and does it address the problem we're trying to solve.
... So that's where I'm at.

Norm: I'm not sure I like the report port for what it does in this proposal.
... I think p:log was a mistake. We seem to be adding more stuff like that.

Jim: That's because it goes the other way. This is more global. This is formalizing information that we have already.
... This is just formalizing the document trace in a human readable way.

Murray: I'm coming from a space of ignorance, so I have some basic questions.
... Each step would have an error port and other steps could read from it.
... And is it also true that all of the error ports could be funneled into a single port for output at the end of the entire pipeline?

Jim: I haven't said that, but maybe.

Murray: I'm thinking of stderr; basically everything came out there: fatal errors, warnings, trace output. You had to have a script to figure out all the pieces.
... What's being suggested here is that there are errors being reported and a different port for reporting non-errors.
... Given that we can differentiate between the items on the port, why separate them?

Jim: Good question. From my perspective, there are two different points of view. For us, XProc is a pipeline that has documents flowing through it.
... I have no problem with saying that we have specific conventions. And I think having a report port emphasizes that point of view.
... I think having everything in one basket could confuse people who are working with documents.
... We've made things very generic in the past; while that makes it very flexible, it puts a cognitive load on the user.
... If that's the case, I'd rather just call it the report port. If we're going to have one.

Norm: I'm just reluctant to standardize a particular kind of reporting. I don't think we know what the right solutions are. Producing at least twice as many documents seems to have possible performance implications.

Alex: I agree. I think we want to let implementors innovate in this space.

Jim: We're already creating this kind of trace information.

Norm: I'm not really sure I agree; we have logs and things but they aren't documents.

Alex: Just adding a new port for, for example, the XSLT step, that produces messages.

Norm: I'm mildly in favor of an error port everywhere (as opposed to figuring out exactly where), but the alternative is what Alex suggests, only do it on the steps that need it. Like XSLT.

Jim: People are using logging output to debug pipelines.

Norm: Are they? Does this help?

Jim: I think several people have reported this kind of problem. They would like to make this kind of information explicit.
... If we think that the problem of debugging pipelines is solved with lesser measures, I'm happy to abandon this proposal.
... I'm thinking of what we can do to make it easy for users to understand what's flowing through pipelines.
... There are certainly implementation challenges.

Norm: Ok, stepping back, I'm not going to read the report port in the usual case. So my pipeline isn't working, what do I do?

Jim: The two scenarios of debugging are author vs production usage.
... The ability to report what each step did with errors would be useful.

Murray: I thought that what was going to happen was that, on the error port, the mechanism that the step was using, XSLT for example, might have its own error mechanism and we might want to pipe that out.
... The other thing was that an author might want to say "if this happens, put this on the error port"
... But the other thing I heard was that we have a pipeline, it's been published and people are using it.
... So some guy is running a pipeline and he's waiting and waiting. Eventually it ends and it didn't work.
... He can re-run it with trace information, which might or might not be useful.
... The scenario I'm interested in is that I hit "go" and as the process moves along, it's telling me different things that it's doing.
... It's starting a step, it's doing a build, it's doing a clean, it's profiling, generating statistics, copying assets, etc.
... At the various stages, it's reporting things to me.
... As it gets to each chapter, it tells me, when it gets to the glossary it slows down, but I know that.
... Is that what this is for?

Jim: Yes. That's the report-port.

Norm: How!?

Jim: I was imagining that these messages would be reported.

Norm: Was it your intent that these steps should appear on the "console"?

Jim: Yes. (Maybe, says the scribe)
... I do say that all the error port outputs bubble up.

Henry: Ok, that makes sense. The top-level pipeline then has on its "report" port all the reports from the contained steps.

<alexmilowski> Gotta run ...

Henry: Then you only have to tell the pipeline what to do with the report port. The normal answer is nothing, but you could put it into a file or on stderr.
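[Scribe's illustration] A hypothetical sketch of Henry's point: the top-level pipeline exposes an aggregated "report" port, and what happens to it is decided once, at the top. The aggregation semantics and names shown are speculative:

```xml
<!-- Hypothetical: contained steps' reports bubble up onto the
     pipeline's own "report" port. By default the invoking
     environment would discard it; it could instead be bound to a
     file or to stderr. Nothing here is agreed syntax. -->
<p:declare-step name="main">
  <p:output port="result"/>
  <p:output port="report"/>  <!-- implicit aggregation of step reports -->
  <!-- pipeline body ... -->
</p:declare-step>
```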

Jim: That's what I was attempting to do in the writeup.

Murray: So each step has a stderr port; the stderr port has a manifold on it. On one side you can read it to get errors and the other pushes them up to its "parent".
... You can say on a per-pipeline basis that stderr should go to /dev/null, and you can say the same on a per-step basis.
... What I said is that stderr output should immediately go to the console as well as percolating up.

Norm: Yeah, maybe, but I'm not sure that stderr output will be the most useful way to present information on the console.

Jim: I agree that we can have a single error port.

Murray: What if we could have a step that decided what to do with stderr. The input is the input off the "big pipe", the output is what you want to go out to the big port.

<jfuller> hehe, a uri for the console #console

The "console" has to be implementation-defined if not -dependent.

<jfuller> yes agreed!

Norm waffles a bit about implementing better error output from "outside" the pipeline.

Norm: I like the error port, there are lots of things that can go there, but I'm not sure I understand how much non-error output to put on there.

Jim: I think it should be per-document not per-step.

Norm: What do you mean by per-document?

Jim: Users want to know errors about a document, not about a step.

Murray: I disagree. I think what you're wanting to have on that port are messages about events.
... After doing a variety of things, I want to report that to the user. Exactly what shows up on that port depends on the person who's writing the steps.
... Maybe we want to be able to say, for certain steps, that there are certain kinds of events that typically want to be reported and we can just throw switches for those.
... Run this step with messages set at this level.
... We did this years ago with SoftQuad's troff. There were different levels of messaging.
... You could set a variable and that's the level you got.
... Authors and users and testers might all want to have information. QA might want to do an audit.
... By having different elements that you can filter on, you can do that.

Norm: Jim will you refactor the proposal to reflect today's discussion?

Jim: Yes.

Any other business?

None heard. We are adjourned.

Summary of Action Items

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.140 (CVS log)
$Date: 2015/07/01 15:07:51 $