See also: IRC log
Date: 26 April 2012
<scribe> Meeting: 214
<scribe> Scribe: Norm
<scribe> ScribeNick: Norm
A-213-11 to A-213-14: Completed
A-213-15 to A-213-18: Completed
Vojtech: For XSLT 1.0, the only option is to write to a file, we're explicit about not having documents appear on the secondary output port.
Norm: I move we postpone this until Jim can be present.
Murray: I'm trying to figure out two things: what sorts of mechanisms would be useful in the language to assist with debugging, and what sort of things are
... There's p:log, two implementations have a "message" step, but I'm wondering about the possibility of other kinds of steps.
... I've had this discussion in the past with C programmers and now I'm talking with XProc programmers.
... I put some steps in the requirements document: one to turn on debugging, one to turn on tracing. I highlighted that there are some functions that can give you information about your environment.
... I wonder about strategies ... logs, etc.
... It seems like those are the sorts of things you might expose.
Norm: There are two things you can do: get a dump of the graph and get more verbose logging.
Norm waxes poetic about -D and Java logging.
Vojtech: We have something similar. We have profiling output. And also we have a detailed trace of the pipeline: what documents were passed, what were the options and variables, etc.
... I wonder if we should try to standardize this.
... In other specifications, XQuery for example, there's a trace function, but the rest is implementation-defined or implementation-dependent.
... In my view, we have p:log which is rather inflexible and you can have a message step. But the problem with this is that it requires you to modify the pipeline and potentially break the sequence of steps. Sometimes you have to do ugly plumbing to keep the original sequence.
... Maybe what we could consider is some sort of construct like group or a wrapper that would log some information without having to add pipe bindings to keep the pipeline in the original sequence. A construct that doesn't influence the connections between the steps would be nice.
... Instead of a message step. Or we could have both. We could have a trace element that wraps a bunch of steps and does logging, but it wouldn't be a step.
Norm: Yes, we could invent a new kind of thing, but I wonder if this is so implementation dependent that it's of limited value.
Vojtech: Like p:log, I think we could leave the details implementation dependent. The trace wrapper might be something like the resource manager that we discussed in the past.
Norm: Yes, if we invent a new kind of wrapper for the resource manager, maybe we could leverage that for the trace wrapper.
Henry: Oh, I'd rather not. You really shouldn't have to edit your pipeline to do this.
... Maybe a wrapper is the best we're going to come up with. For something like the resource manager, a wrapper is more appealing because it's a feature of the design of the pipeline. Whereas, tracing and profiling are not part of the pipeline.
... So I'd rather not have something in the pipeline.
... We can't just leave this to implementors, the way the Python or Lisp debuggers do, because you can't implement XProc in XProc.
... A different way to talk about this in the same spirit would be to say that we already have ways to name things. Maybe we want to think about this in a sort of meta way: we want to think about ways of annotating pipelines, externally even, in order to describe tracing or profiling behavior.
... We could have a trace descriptor and a pipeline.
Alex: This could be done if you had a description of the binding for the pipeline.
... That would require the ability to point at a chunk of pipeline not individual steps.
Vojtech: We could do something similar to XQuery 3 with annotations.
Norm: It seems like some sort of "trace only these named steps" feature might be useful.
Henry: I know that there was at least some work in actually doing just what I dismissed: as far as the engine is concerned all you can say is instrument yourself. Where you put the energy is in the tool that presents the output to you. So instead of trying to say only give me trace information for the last four steps, you just turn on
... Then the tool only shows you the output for only the last four steps.
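As an illustration of the approach Henry describes, here is a minimal Python sketch, not something presented at the meeting. It assumes the engine records a trace entry for every step and leaves filtering to a separate presentation tool. The record layout (`step`, `message`), the sample step names, and the function `last_steps` are all hypothetical.

```python
def last_steps(trace, n):
    """Return the trace entries that belong to the last n distinct steps."""
    seen = []
    for entry in trace:
        if entry["step"] not in seen:
            seen.append(entry["step"])
    # seen[-0:] would select everything, so treat n <= 0 as "show nothing".
    keep = set(seen[-n:]) if n > 0 else set()
    return [entry for entry in trace if entry["step"] in keep]


# Hypothetical full trace from a processor that always instruments itself.
trace = [
    {"step": "load", "message": "read 1 document"},
    {"step": "validate", "message": "schema ok"},
    {"step": "transform", "message": "applied stylesheet"},
    {"step": "store", "message": "wrote 1 document"},
]

for entry in last_steps(trace, 2):
    print(entry["step"], "-", entry["message"])
```

The engine stays simple under this design: narrowing the view is a property of the viewer, so the pipeline and the processor need no "trace only these steps" switch.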
Norm: Yeah. Fair point. My tracing is all ad hoc.
Murray: So we could imagine an XProc pipeline that read the trace output and presented it in a
... I've heard the argument before for putting all the tracing outside the program. I've heard the same argument about documentation too.
... One of the things I've noticed as I'm gathering these requirements is a section called "Integration".
... A lot of these requirements in the areas of debugging and testing and error handling are related to integration. All of these things can be aided by leaving sign posts in your program. If you know that you're having a problem in a certain area of the program, then leaving the indicators in there and being able to flag the pipeline could be very helpful.
<ht> Hmm -- I absolutely agree that documentation is an integral part of a program or pipeline
Murray: You can run your pipeline 24 hours a day and diff the traces, look for differences, etc. This just seems useful from a Q/A audit perspective.
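The trace-diffing idea can be sketched in a few lines of Python using the standard `difflib` module. This is only an illustration: the trace line format and the names `diff_traces`, `run_1`, and `run_2` are assumptions, since the minutes do not define what a trace contains.

```python
import difflib


def diff_traces(old_lines, new_lines):
    """Return unified-diff lines describing how two trace runs differ."""
    return list(
        difflib.unified_diff(
            old_lines, new_lines, fromfile="run-1", tofile="run-2", lineterm=""
        )
    )


# Hypothetical traces from two runs of the same pipeline.
run_1 = ["load: 1 document", "transform: 1 document", "store: 1 document"]
run_2 = ["load: 1 document", "transform: 0 documents", "store: 0 documents"]

for line in diff_traces(run_1, run_2):
    print(line)
```

An empty result means the two runs produced identical traces, which is the property an audit would check for.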
Alex: My question is, can I write a pipeline that's normal and reasonably minimal and still debug
... Could I profile, debug, etc. without having to touch the pipeline?
Norm: I think with an appropriate debugging environment you could.
Vojtech: Yes, but some steps are in libraries that can have the same names, etc.
Murray: I don't care what anybody does with respect to designing a debugger that can look into an XProc program and debug it. More power to them. But that's not what I want to discuss. We're talking about requirements for the language.
Henry: I hear you. The way I hear this conversation going so far is that nobody has come up with any.
Murray: No. Several people have made suggestions, but we keep coming back to "I want to do this from outside my pipeline"
Henry: Putting things in the language requires that implementors support them. I think the argument that I would make isn't that my program is sacred, but rather are we sure enough of the value of in-language support that we want to require everyone to do the work that's
... It's the cost-benefit analysis that comes first.
Murray: Here's a simple question: if a processor has the ability to turn on trace, then providing some markup that advises that processor that this is a good time to turn on trace, would be useful. And if the processor can't turn on trace, then it's harmless.
... I don't want to specify what comes out in the trace, though we might want to give some advice, but that's up to the processor.
Alex: I guess the conundrum as I see it is that we don't have any debuggers yet. And we have very minimal tracing and debugging support.
... I suspect there are things we should do but I don't think I know what they are.
Murray: Well, Norm said he output trace information...
Alex: Yes, but that's very primitive compared to other languages. Do we have the right naming conventions, for example?
Murray: We decided, early on, that there would be a "stderr" port. Could we not designate a port for trace output?
... I just want to look for some things that would make the language easier to debug.
Vojtech: We already have p:log, but it's very primitive. Maybe we should just make p:log more flexible and useful; allowing it in options, variables, input ports, etc. Then with a processor switch, you could enable the log statements you wanted to trace.
... It could wind up in one location. Maybe we don't have to add anything new, just improve existing features? Maybe we could imagine a switch to magically insert p:log statements everywhere. The advantage of the log is that it doesn't change the sequencing of steps.
Norm: We could do that. The only thing that occurs most obviously to me would be a standard message step.
Vojtech: It's definitely useful, but it's tedious to add 10 of them.
Norm: Yes, it's tricky, but is still perhaps useful enough to standardize.
Vojtech: Maybe with a switch to disable the output.
Henry: Yes, I think that might be worth looking at standardizing that. Maybe we could add classes so that you can enable them or disable them by name. It would be nice to be able to turn them off without having to edit them out.
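Henry's suggestion of message classes that can be switched on or off by name resembles named loggers in Python's standard `logging` module. The sketch below shows that analogy only; it is not a proposed XProc design. The `pipeline.` prefix, the class names, and the helpers `message` and `set_class_enabled` are hypothetical.

```python
import logging


def message(cls, text):
    """Emit a message under a named class; disabled classes stay silent."""
    logging.getLogger("pipeline." + cls).info(text)


def set_class_enabled(cls, enabled):
    """Switch one class of messages on or off without editing call sites."""
    level = logging.INFO if enabled else logging.CRITICAL + 1
    logging.getLogger("pipeline." + cls).setLevel(level)


logging.basicConfig(format="%(name)s: %(message)s")

set_class_enabled("progress", True)
set_class_enabled("validation", False)

message("validation", "schema check starting")  # suppressed
message("progress", "3 documents processed")    # emitted
```

The calls to `message` stay in place whether or not their class is enabled, which matches the wish to turn messages off without editing them out.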
Alex: I'm looking at p:log. First a question: If I don't have an href or if I use the same href, what happens?
Implementors mumble a bit
Alex: It would be nice if there was some metadata on the output so that I could reconstruct what happened later. A notion of what port this was produced from, when it arrived, etc.
... Similarly, it might be nice to log inputs.
Alex: It would be nice to be able to put assertions inside the p:log step.
... Is this XPath expression true?
Vojtech: The ability to construct a message with an XPath expression would be useful.
Alex: Those are the sorts of things that would be useful.
... You could have one big log file with all the data in it; then you could examine that output.
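The ideas in this exchange (metadata on each logged document, plus an assertion phrased as an XPath expression) can be sketched in Python. This is a rough illustration under stated assumptions: the record fields, the sample document, and the function `log_entry` are hypothetical, and the check uses the limited XPath subset of the standard `xml.etree.ElementTree` module rather than full XPath.

```python
import time
import xml.etree.ElementTree as ET


def log_entry(step, port, document, assertion=None):
    """Build one log record with provenance metadata and an optional check.

    `assertion` is a path in ElementTree's limited XPath subset; the check
    passes when the path selects at least one element.
    """
    entry = {"step": step, "port": port, "time": time.time()}
    if assertion is not None:
        entry["assertion"] = assertion
        entry["passed"] = document.find(assertion) is not None
    return entry


# Hypothetical document arriving on a step's "result" port.
doc = ET.fromstring("<order><item sku='a1'/><item sku='b2'/></order>")

ok = log_entry("validate", "result", doc, assertion="./item[@sku='a1']")
bad = log_entry("validate", "result", doc, assertion="./item[@sku='zz']")
print(ok["passed"], bad["passed"])
```

Because every record carries the step name, port, and arrival time, the records could all be appended to one log and the sequence of events reconstructed afterwards.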
Murray: So one of the things we could consider is whether every step would have a verbosity level and basically if you had high verbosity turned on, then that step would report some things when it started.
... We could rationally talk about what those conditions might be.
... Speaking of which, I've listed a lot of functions in the use cases and requirements document. It might be nice to have an exhaustive list.
Norm: Where's the list?
Norm: That's a mixture so I'm confused.
Murray: Yes, it's a mixture, but they return information about the current context or
... All of this is useful information that you can use in debugging. Years ago, working in troff, I got some debugging built in. We had levels of verbosity and I could set the warning/error etc. messages. I could print messages at the beginnings of loops, I could turn trace on in the middle of a loop, etc.
... I found this useful at the time.
Norm: Yes. I can see that.
... Of the things we've discussed today, I think the proposal to extend p:log so that it can contain messages or assertions and the ability to log inputs seems like the best combination of utility and low-hanging fruit.
<scribe> ACTION: Norm to sketch out an extension to p:log with messages and assertions. [recorded in http://www.w3.org/2012/04/26-xproc-minutes.html#action01]
Murray: Whose baby is clustering?
Norm: What do you mean by clustering?
Murray: Good question. I found an input along the lines of "does XProc need clustering?"
Norm: In the doc?
Murray: Yes, F.3.3
Henry: Is this group-by?
Some discussion of where the requirement came from and what it means
Murray: Alex and I have noted some language along these lines in the first requirements and use cases document that didn't make it into the spec.
... But it's never clear what streaming and parallel processing mean in concrete terms.
... How have we impeded or assisted parallel processing?
<Vojtech> Btw, it was Paul who put the remark about clustering in the wiki
Henry: Parallel processing is a little easier. What I think we meant is to never constrain parallelization
Henry: Make no assumptions about evaluation order that aren't required by explicit connectivity.
<Vojtech> Oh, it was Henry!
Henry: The way I used to say it was: it ought to be possible to implement an XProc processor by starting each step in a thread and waiting to see what happens. Someone has input, everyone else is blocked, and each step works as input arrives.
... For example, there's nothing today that says that the steps at the bottom of a pipeline have to run after the ones at the top.
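The thread-per-step model Henry describes can be sketched in Python with the standard `threading` and `queue` modules. This is a toy under strong assumptions, not how any XProc processor is known to work: it handles only a linear chain of steps, documents are plain strings, and the names `run_step`, `run_pipeline`, and `DONE` are hypothetical.

```python
import queue
import threading

DONE = object()  # sentinel marking the end of a document sequence


def run_step(func, inbox, outbox):
    """Block until input arrives, apply func to each document, pass it on."""
    while True:
        doc = inbox.get()
        if doc is DONE:
            outbox.put(DONE)
            return
        outbox.put(func(doc))


def run_pipeline(funcs, documents):
    """Start every step in its own thread, wired only by explicit queues."""
    queues = [queue.Queue() for _ in range(len(funcs) + 1)]
    threads = [
        threading.Thread(target=run_step, args=(f, queues[i], queues[i + 1]))
        for i, f in enumerate(funcs)
    ]
    # Start the last step first: only the connections constrain the order
    # in which work actually happens, not the order of thread start-up.
    for t in reversed(threads):
        t.start()
    for doc in documents:
        queues[0].put(doc)
    queues[0].put(DONE)
    results = []
    while True:
        doc = queues[-1].get()
        if doc is DONE:
            break
        results.append(doc)
    for t in threads:
        t.join()
    return results


print(run_pipeline([str.strip, str.upper], ["  a ", " b"]))
```

Each step blocks on its input queue until a document arrives, so evaluation order follows from the connections alone. Document order is preserved here only because each step is a single thread reading a FIFO queue.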
Murray: for-each says the step must produce output in the right order. Does that have an impact on parallelism?
Norm: On streaming more than parallel processing.
Alex: It might be nice to add annotations to a pipeline to say what the streaming/parallelism expectations are.
Murray: I was puzzled by a request to allow for-each in an unordered way
Henry: Yes, this connects up to unordered collections. Right now we have sequences, but if we had collections, then you could have a switch on p:for-each that said it was allowed to be unordered.
... Then the question is, what does a step that takes an unordered collection as input look like?
<scribe> ACTION: Norm to put streaming/parallel processing on the agenda for two weeks [recorded in http://www.w3.org/2012/04/26-xproc-minutes.html#action02]
Present: Norm, Henry, Vojtech, Murray, Alex
Regrets: Jim
Agenda: http://www.w3.org/XML/XProc/2012/04/26-agenda