See also: IRC log
-> http://www.w3.org/XML/XProc/2012/04/26-agenda
Accepted.
-> http://www.w3.org/XML/XProc/2012/04/19-minutes
Accepted.
Accepted.
A-213-10: Completed
A-213-11 to A-213-14: Completed
A-213-15: Completed.
A-213-15 - A-213-18: Completed.
Vojtech: For XSLT 1.0, the only option is to write to a file; we're explicit about not having documents appear on the secondary output port.
A-213-09: Completed.
Norm: I move we postpone this until Jim can be present.
Accepted.
Murray: I'm trying to figure out
two things: what sorts of mechanisms would be useful in the
language to assist with debugging, and what sort of things are
already there?
... There's p:log, two implementations have a "message" step,
but I'm wondering about the possibility of other kinds of
steps.
... I've had this discussion in the past with C programmers and
now I'm talking with XProc programmers.
... I put some steps in the requirements document: one to turn
on debugging, one to turn on tracing. I highlighted that there
are some functions that can give you information about your
environment.
... I wonder about strategies ... logs, etc.
... It seems like those are the sorts of things you might
expose.
Norm: There are two things you can do: get a dump of the graph and get more verbose logging.
Norm waxes poetic about -D and Java logging.
Vojtech: We have something
similar. We have profiling output. And also we have a detailed
trace of the pipeline: what documents were passed, what were
the options and variables, etc.
... I wonder if we should try to standardize this.
... In other specifications, such as XQuery, there's a trace function
but the rest is implementation defined or dependent.
... In my view, we have p:log which is rather inflexible and
you can have a message step. But the problem with this is that
it requires you to modify the pipeline and potentially break
the sequence of steps. Sometimes you have to do ugly plumbing
to keep the original sequence.
... Maybe what we could consider is some sort of construct like
group or a wrapper that would log some information without
having to add pipe bindings to keep the pipeline in the
original sequence. A construct that doesn't influence the
connections between the steps would be nice.
... Instead of a message step. Or we could have both. We could
have a trace element that wraps a bunch of steps and does
logging, but it wouldn't be a step.
Norm: Yes, we could invent a new kind of thing, but I wonder if this is so implementation dependent that it's of limited value.
Vojtech: Like p:log, I think we could leave the details implementation dependent. The trace wrapper might be something like the resource manager that we discussed in the past.
Norm: Yes, if we invent a new kind of wrapper for the resource manager, maybe we could leverage that for the trace wrapper.
Henry: Oh, I'd rather not. You
really shouldn't have to edit your pipeline to do this.
... Maybe a wrapper is the best we're going to come up with.
For something like the resource manager, a wrapper is more
appealing because it's a feature of the design of the pipeline.
Whereas, tracing and profiling are not part of the
pipeline.
... So I'd rather not have something in the pipeline.
... We can't just leave this to implementors, the way the
python or lisp debuggers do, because you can't implement XProc
in XProc.
... A different way to talk about this in the same spirit would
be to say that we already have ways to name things. Maybe we
want to think about this in a sort of meta way: we want to
think about ways of annotating pipelines, externally even, in
order to describe tracing or profiling behavior.
... We could have a trace descriptor and a pipeline.
Alex: This could be done if you
had a description of the binding for the pipeline.
... That would require the ability to point at a chunk of
pipeline not individual steps.
Vojtech: We could do something similar to XQuery 3 with annotations.
Norm: It seems like some sort of "trace only these named steps" feature might be useful.
Henry: I know that there was at
least some work in actually doing just what I dismissed: as far
as the engine is concerned all you can say is instrument
yourself. Where you put the energy is in the tool that presents
the output to you. So instead of trying to say only give me
trace information for the last four steps, you just turn on
tracing.
... Then the tool shows you the output for only the last
four steps.
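To illustrate Henry's point, here is a minimal sketch in which the engine traces everything and a separate tool narrows the view afterwards. The record layout (a dict with a "step" key) and the function name filter_trace are assumptions made for this example; no XProc implementation is known to emit this format.

```python
def filter_trace(trace, count):
    """Keep only the records that belong to the last `count` distinct steps.

    `trace` is a hypothetical list of dicts, each with a "step" key, in the
    order the processor emitted them.
    """
    order = []  # distinct step names, in order of first appearance
    for record in trace:
        if record["step"] not in order:
            order.append(record["step"])
    wanted = set(order[-count:]) if count > 0 else set()
    return [record for record in trace if record["step"] in wanted]


# Hypothetical full trace produced by simply "turning on tracing".
full_trace = [
    {"step": "load", "event": "start"},
    {"step": "validate", "event": "start"},
    {"step": "transform", "event": "start"},
    {"step": "transform", "event": "end"},
    {"step": "store", "event": "start"},
]
visible = filter_trace(full_trace, 2)
```

Here visible holds only the "transform" and "store" records, and the pipeline itself was never edited to get that view.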
Norm: Yeah. Fair point. My tracing is all ad hoc.
Murray: So we could imagine an
XProc pipeline that read the trace output and presented it in a
nice way.
... I've heard the argument before for putting all the tracing
outside the program. I've heard the same argument about
documentation too.
... One of the things I've noticed as I'm gathering these
requirements is a section called "Integration".
... A lot of these requirements in the areas of debugging and
testing and error handling are related to integration. All of
these things can be aided by leaving sign posts in your
program. If you know that you're having a problem in a certain
area of the program, then leaving the indicators in there and
being able to flag the pipeline could be very helpful.
<ht> Hmm -- I absolutely agree that documentation is an integral part of a program or pipeline
Murray: You can run your pipeline 24 hours a day and diff the traces, look for differences, etc. This just seems useful from a Q/A audit perspective.
Alex: My question is, can I write
a pipeline that's normal and reasonably minimal and still debug
the thing?
... Could I profile, debug, etc. without having to touch the
pipeline?
Norm: I think with an appropriate debugging environment you could.
Vojtech: Yes, but some steps are in libraries that can have the same names, etc.
Murray: I don't care what anybody does with respect to designing a debugger that can look into an XProc program and debug it. More power to them. But that's not what I want to discuss. We're talking about requirements for the language.
Henry: I hear you; the way I hear this conversation going so far is that nobody has come up with any.
Murray: No. Several people have made suggestions, but we keep coming back to "I want to do this from outside my pipeline."
Henry: Putting things in the
language requires that implementors support them. I think the
argument that I would make isn't that my program is sacred, but
rather are we sure enough of the value of in-language support
that we want to require everyone to do the work that's
necessary.
... It's the cost-benefit analysis that comes first.
Murray: Here's a simple question:
if a processor has the ability to turn on trace, then providing
some markup that advises that processor that this is a good
time to turn on trace, would be useful. And if the processor
can't turn on trace, then it's harmless.
... I don't want to specify what comes out in the trace, though
we might want to give some advice, but that's up to the
processor.
Alex: I guess the conundrum as I
see it is that we don't have any debuggers yet. And we have
very minimal tracing and debugging support.
... I suspect there are things we should do but I don't think I
know what they are.
Murray: Well, Norm said he output trace information...
Alex: Yes, but that's very primitive compared to other languages. Do we have the right naming conventions, for example?
Murray: We decided, early on,
that there would be a "stderr" port. Could we not designate a
port for trace output?
... I just want to look for some things that would make the
language easier to debug.
Vojtech: We already have p:log,
but it's very primitive. Maybe we should just make p:log more
flexible and useful; allowing it in options, variables, input
ports, etc. Then with a processor switch, you could enable the
log statements you wanted to trace.
... It could wind up in one location. Maybe we don't have to
add anything new, just improve existing features? Maybe we
could imagine a switch to magically insert p:log statements
everywhere. The advantage of the log is that it doesn't change
the sequencing of steps.
Norm: We could do that. The only thing that occurs most obviously to me would be a standard message step.
Vojtech: It's definitely useful, but it's tedious to add 10 of them.
Norm: Yes, it's tricky, but is still perhaps useful enough to standardize.
Vojtech: Maybe with a switch to disable the output.
Henry: Yes, I think that might be worth looking at standardizing that. Maybe we could add classes so that you can enable them or disable them by name. It would be nice to be able to turn them off without having to edit them out.
Alex: I'm looking at p:log. First a question: If I don't have an href or if I use the same href, what happens?
Implementors mumble a bit
Alex: It would be nice if there
was some metadata on the output so that I could reconstruct
what happened later. A notion of what port this was produced
from, when it arrived, etc.
... Similarly, it might be nice to log inputs.
Vojtech: Absolutely.
Alex: It would be nice to be able
to put assertions inside the p:log step.
... Is this XPath expression true?
Vojtech: The ability to construct a message with an XPath expression would be useful.
Alex: Those are the sorts of
things that would be useful.
... You could have one big log file with all the data in it;
then you could examine that output.
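As a rough sketch of what Alex and Vojtech describe, the following builds one log record that carries metadata (step, port, arrival time) plus the result of an optional assertion. Everything here is hypothetical: log_record and its field names are invented for illustration and are not a proposed p:log syntax, and the assertion uses the limited XPath subset of Python's ElementTree rather than full XPath.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def log_record(step, port, document, assertion=None):
    """Build one log entry for a document seen on a port (hypothetical format)."""
    root = ET.fromstring(document)
    record = {
        "step": step,
        "port": port,
        "logged_at": datetime.now(timezone.utc).isoformat(),
        "document": document,
    }
    if assertion is not None:
        # ElementTree supports only a subset of XPath. The assertion is
        # treated as true when the path selects at least one element.
        record["assertion"] = assertion
        record["assertion_passed"] = root.find(assertion) is not None
    return record


doc = "<doc><title>Minutes</title></doc>"
ok = log_record("transform", "result", doc, assertion=".//title")
bad = log_record("transform", "result", doc, assertion=".//author")
```

Appending such records to a single file would give the "one big log file" that can be examined after the run.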
Murray: So one of the things we
could consider is whether every step would have a verbosity
level and basically if you had high verbosity turned on, then
that step would report some things when it started.
... We could rationally talk about what those conditions might
be.
... Speaking of which, I've listed a lot of functions in the
use cases and requirements document. It might be nice to have
an exhaustive list.
Norm: Where's the list?
Murray: F.5.12
Norm: That's a mixture so I'm confused.
Murray: Yes, it's a mixture, but
they return information about the current context or
environment.
... All of this is useful information that you can use in
debugging. Years ago, working in troff, I got some debugging
built in. We had levels of verbosity and I could set the
warning/error etc. messages. I could print messages at the
beginnings of loops, I could turn trace on in the middle of a
loop, etc.
... I found this useful at the time.
Norm: Yes. I can see that.
... Of the things we've discussed today, I think the proposal
to extend p:log so that it can contain messages or assertions
and the ability to log inputs seems like the best combination
of utility and low-hanging fruit.
<scribe> ACTION: Norm to sketch out an extension to p:log with messages and assertions. [recorded in http://www.w3.org/2012/04/26-xproc-minutes.html#action01]
Murray: Whose baby is clustering?
Norm: What do you mean by clustering?
Murray: Good question. I found an input along the lines of "does XProc need clustering?"
Norm: In the doc?
Murray: Yes, F.3.3
Henry: Is this group-by?
Some discussion of where the requirement came from and what it means
<Vojtech> http://www.w3.org/wiki/index.php?title=Integration&diff=55046&oldid=55034
Murray: Alex and I have noted
some language along these lines in the first requirements and
use cases document that didn't make it into the spec.
... But it's never clear what streaming and parallel processing
mean in concrete terms.
... How have we impeded or assisted parallel processing?
Henry: Parallel processing is a little easier. What I think we meant is to never constrain parallelization.
Henry: Make no assumptions about evaluation order that aren't required by explicit connectivity.
Henry: The way I used to say it
was: it ought to be possible to implement an XProc processor by
starting each step in a thread and waiting to see what happens.
Someone has input, everyone else is blocked, and each step
works as input arrives.
... For example, there's nothing today that says that the steps
at the bottom of a pipeline have to run after the ones at the
top.
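The thread-per-step model Henry describes can be sketched as follows. This is only an illustration under simplifying assumptions: steps are plain Python functions arranged in a linear chain, each with one input and one output, and the names run_step and run_pipeline are hypothetical. It is not how any particular XProc processor is implemented.

```python
import queue
import threading

_END = object()  # sentinel marking the end of a document sequence


def run_step(func, inbox, outbox):
    """Apply func to each document as it arrives, then forward the end marker."""
    while True:
        doc = inbox.get()  # blocks until upstream delivers something
        if doc is _END:
            outbox.put(_END)
            return
        outbox.put(func(doc))


def run_pipeline(step_funcs, documents):
    """Start one thread per step, connected by queues, and collect the output."""
    queues = [queue.Queue() for _ in range(len(step_funcs) + 1)]
    threads = [
        threading.Thread(target=run_step, args=(func, queues[i], queues[i + 1]))
        for i, func in enumerate(step_funcs)
    ]
    # Started in reverse order on purpose: only the connections between
    # steps, not the start order, decide when each step does its work.
    for thread in reversed(threads):
        thread.start()
    for doc in documents:
        queues[0].put(doc)
    queues[0].put(_END)
    results = []
    while True:
        doc = queues[-1].get()
        if doc is _END:
            break
        results.append(doc)
    for thread in threads:
        thread.join()
    return results


output = run_pipeline([str.upper, lambda d: d + "!"], ["a", "b"])
```

Every step starts blocked on its input queue and works only when a document arrives, so evaluation order follows the connections rather than the order in which the steps were written or started.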
Murray: for-each says the step must produce output in the right order. Does that have an impact on parallelism?
Norm: On streaming more than parallel processing.
Alex: It might be nice to add annotations to a pipeline to say what the streaming/parallelism expectations are.
Murray: I was puzzled by a request to allow for-each in an unordered way.
Henry: Yes, this connects up to
unordered collections. Right now we have sequences, but if we
had collections, then you could have a switch on p:for-each
that said it was allowed to be unordered.
... Then the question is, what does a step that takes an
unordered collection as input look like?
<scribe> ACTION: Norm to put streaming/parallel processing on the agenda for two weeks [recorded in http://www.w3.org/2012/04/26-xproc-minutes.html#action02]
Norm: Adjourned