XML Processing Model WG -- 27 Feb 2006

Administrivia

Scribes: Norm, Mon am; Alex, Mon pm; Henry, Tue am; Michael, Tue pm

Accept minutes from the previous teleconference?

-> http://www.w3.org/XML/XProc/2006/02/23-minutes.html

<MSM> good morning, all

<scribe> Postponed until tomorrow AM

Next meeting: 9 Mar telcon (not including tomorrow, of course)

Any regrets?

None given

Another f2f?

The next Tech Plenary is 18 months away; maybe we should meet in December in Boston colocated with the XML 2006 conference?

Murray suggests that December isn't a good time

Murray suggests that perhaps August, colocated with Extreme 2006 might work better?

Technical discussion; Requirements and use cases

Norm suggests that we might look at some use cases

Alex observes that many of the use cases overlap

Michael says that's ok

Murray thinks we could get this down to three use cases

There's no clear consensus that that's a good idea

Murray suggests that most of the uses can be described from a single example

The group struggles with technology issues; USB media, wireless connections, CVS repositories, you name it.

Use case 5.1: Extract MathML

The output is images, rendered either with a MathML image renderer or an SVG renderer.

Henry: Is this really a pipeline use case?

Michael: Yes

Agreement settles on the interpretation that SVG images must be used by the MathML renderer to build an image of the equation

Some concern that this use case is at the edge of what we consider our charter

Move this use case near the end.

Use case 5.2: Style an XML Document in a Browser

Norm: This doesn't seem like a pipeline use case; the browser UI ought to let you pick a stylesheet

Richard: This may have been mine; I don't think it was user choice.
... I have several stylesheets that render very different documents. Right now I need different documents with different PIs
... The idea was that some parameter in the URI might choose the stylesheet

Henry: That's the old Adam Bosworth use case, I want to pack up the pipeline in the URI

Some discussion of parameters and servers and clients

Richard: Being able to invoke a pipeline on the client with a parameter

Michael: Split this into two or possibly three: 1. allow a pipeline to take a parameter and choose processing on that basis; 2. On the server; 3. On the client (in a browser)

Richard: There's an intermediate case: the pipeline and parameters are specified by a URI

Alex: I think it's a non-use-case

Murray: The use case is, I'm going to click on a link and get back the right document

Erik: This use case is different because of the URI

Richard: We might decide not to do it, but I don't think the URI part is out of scope (Norm had suggested it was)

Michael: I think it's relevant because a desire to implement the pipeline in the browser may have an impact on the language design

Richard: This looks more like a fragment identifier syntax issue because parameters are usually resolved on the server

Erik: Do we have anything else that has this pattern?

So this boils down to a fragid syntax that evaluates a pipeline with parameters

The semantics is that anyway

Michael: That's got too much solution, not enough problem statement in it

Richard: The problem is having a URI that refers to a pipeline with parameters in such a way that the pipeline can be evaluated on the client with those parameters
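
[Editorial illustration, not part of the discussion] Richard's problem statement can be pictured with a minimal sketch, assuming the pipeline name and its parameters travel in the fragment identifier as key=value pairs. The fragment syntax, the "style" parameter, and the stylesheet names are all hypothetical; the group did not agree on any of them.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical mapping from a "style" parameter to a stylesheet name.
STYLESHEETS = {"print": "print.xsl", "screen": "screen.xsl"}


def parse_pipeline_uri(uri):
    """Split a URI into the document it names and the parameters in its fragment."""
    parts = urlsplit(uri)
    # A fragment is not sent to the server, so the client has to interpret it.
    params = {key: values[0] for key, values in parse_qs(parts.fragment).items()}
    document = parts._replace(fragment="").geturl()
    return document, params


def choose_stylesheet(params, default="screen"):
    """Pick the stylesheet named by the (assumed) 'style' parameter."""
    return STYLESHEETS[params.get("style", default)]


document, params = parse_pipeline_uri(
    "http://example.org/doc.xml#pipeline=render&style=print"
)
print(document, params, choose_stylesheet(params))
```

The sketch only shows that the parameters can be recovered on the client without a server round trip; what the pipeline then does with them is left open.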

Michael: This still seems more like a solution, but leaving out the URI would probably be leaving out what some feel is the critical part of the use case

<Zakim> MSM, you wanted to propose a clarification / alternative to 'understand the use cases'

Michael: We don't need to reconstruct what the original submitter meant

Norm: Absolutely

Use case 5.3: Apply a sequence of operations

"Apply a sequence of operations such XInclude, validation, and transformation to a document"

Henry: I wonder if we should make this more concrete: xinclude, remove xml:base attributes (leaving the base-uri property!), validate with a schema, absolutize all "anyURI"s
... not all the steps are W3C-spec specified, and it requires some persistence of infoset properties. You can't implement this by serializing between each step.

<scribe> ACTION: Henry to write this example for the use cases document [recorded in 27-morning-minutes.log]

Richard: There are two infoset things that are interesting in the example, the base-uri property and the type anyURI being assigned to things
... Proper serialization would put the xml:base attributes back in

Separate out the case where we abandon a pipeline that fails to satisfy a condition

Michael: Abort if a step fails

Richard: Steps have to be able to report if they failed
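
[Editorial illustration, not part of the discussion] A minimal sketch of a step sequence that aborts when a step reports failure, assuming steps are plain functions over an in-memory tree. The names StepFailed, strip_xml_base, require_title, and run_pipeline are hypothetical, and require_title is only a stand-in for real schema validation. ElementTree has no base-uri property, so this does not cover the infoset persistence Henry describes; it only shows that nothing is serialized between steps.

```python
import xml.etree.ElementTree as ET

XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


class StepFailed(Exception):
    """Raised by a step to report that it failed."""


def strip_xml_base(root):
    # Remove xml:base attributes from every element in the tree.
    for element in root.iter():
        element.attrib.pop(XML_BASE, None)
    return root


def require_title(root):
    # Stand-in for a validation step: fail unless a <title> child exists.
    if root.find("title") is None:
        raise StepFailed("no <title> element")
    return root


def run_pipeline(root, steps):
    # Pass the same in-memory tree from step to step. A StepFailed
    # exception propagates, so later steps never run.
    for step in steps:
        root = step(root)
    return root


doc = ET.fromstring('<doc xml:base="http://example.org/"><title>t</title></doc>')
result = run_pipeline(doc, [strip_xml_base, require_title])
print(ET.tostring(result))
```

Reporting failure with an exception is one design choice among several; a step could equally return a status alongside its output.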

Use case 5.4: Run a custom program

General agreement that "display the result in a browser" is a red herring here

Use case 5.5: Service Request/Response

Michael: The underlying substantive requirement is that the pipeline must be less complex than doing the work

Alex: it's the mirror image of 5.2

The use case here is that the pipeline language makes it easy to send the pipeline somewhere else and get the result back

Richard: it shouldn't be the case that you specify the steps by some name that's only interpretable on one machine

Murray: This is platform, language, and environment neutrality

Richard: This implies that you are sending the data (e.g. in a POST request) because the data may not be available by URI

Murray: It may put a constraint on the language to allow the pipeline to build infosets on the side and not require them to be written to disk

Richard: You can also use multi-part MIME, for example

Use case 5.6: XQuery and XSLT 2.0 Collections

Alex: I think this was supposed to capture the idea of processing a collection of documents with XSLT 2.0

Michael: I would like my pipeline processor to come with an XQuery component built in. If it produces a collection of documents, I don't want the pipeline to reject it.

Process the output of an XSLT 2.0 step that is more than one document. Have a pipeline step that starts with a collection of documents.

Use case 5.7: A Simple Transformation Service

<ht> http://www.rfc-editor.org/rfc/rfc2392.txt, for the 'cid' URI scheme, which can be used to address multipart-mime parts

Henry: Components map from infosets to infosets. Pipelines may have non-XML inputs and produce non-XML outputs.

[Scribe fails to capture all of the discussion]

<MSM> several possible decisions: (1) pipeline implementations are allowed to provide input and output wormholes

Norm: The use case is that I have a pipeline, the first step of which constructs XML by magic, and the last step of which consumes XML and does magic with it.

<MSM> (2) implementations are required / expected to provide wormholes, but the choice of which ones to provide is implementation defined

<MSM> (3) specific wormholes are required to be supported

Alex proposes a use case that the pipeline be able to extract information from an HTTP request

Alex: For example, should the authenticated user be available as a parameter?

Richard proposes that this could be accomplished with a pipeline component that can provide that

Alex: In my implementation, all of the HTTP content headers are available as standard parameters

Erik: What seems to be important is that it seems to be out-of-scope to deal with the request in the core language

Alex: There's a flip side, I'm a user developing a pipeline and I'm putting it in a servlet environment.

Murray: And why not put it in a component?

Alex: Because there's a non-uniformity about HTTP

<Zakim> MSM, you wanted to ask for clarification and to ask for clarification

Michael: it seems that there are three things we could say about wormholes.
... 1. They may exist; 2. Implementations are expected to provide them; 3. Specific wormholes are required.

<MSM> then question for AM: which are you arguing for?

Murray: It seems to me that somewhere in the middle of a pipeline I might want to be able to interact by HTTP

That's different because in Alex's case, the pipeline is fired by the request

Richard proposes pipeline environment variables

Erik: Any of these things that we introduce into the language will make it more complicated

Alex proposes the use case that a pipeline be activated by an HTTP request, do some processing, and produce HTTP headers as output

Richard: And he wants the pipeline to have an environment that allows information about that HTTP request to be available at any point in the pipeline

Use case 5.8: An AJAX Server

Agreement that it's ok

Use case 5.9: Dynamic XQuery

Agreement that it's ok

Use case 5.10: Read/Write Non-XML File

Agreement that it's ok

Use case 5.11: Single-file Command-line Document Processing

Agreement that it's ok

Henry: Suggest that it be near the top

Use case 5.12: XInclude processing

Norm: Collapse steps 2 and 3

Michael: Spawn two near twins:
... Add a validation step (validate before, after, and both before and after)

Perhaps 5.3 and 5.12 need to be combined

Use case 5.13: Document Aggregation

Michael: There's a question here of the level of dynamism.

Documents listed in URIs? A collection of documents available directly?

Use case 5.14: Update/Insert Document in Database

The important part of this use case is that it introduces conditionals

The conditionality is at the level of the pipeline, not inside a component

Use case 5.15: Content-Dependent Transformations

This is a switch or choose statement.

The XPath condition is built into the language.

Henry: In the background here is the tradeoff between conditionals in the language and on the other hand having just try/catch and failure
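
[Editorial illustration, not part of the discussion] A minimal sketch of a pipeline-level choose, assuming each branch is guarded by a path expression and the first match wins. The function name choose and the branch labels are hypothetical. ElementTree supports only a limited XPath subset, so this approximates the XPath condition rather than implementing it.

```python
import xml.etree.ElementTree as ET


def choose(root, branches, otherwise):
    """Run the first branch whose path matches something in the document."""
    for path, step in branches:
        if root.find(path) is not None:
            return step(root)
    return otherwise(root)


# Hypothetical branches: pick a transformation based on document content.
branches = [
    (".//table", lambda root: "tables"),
    (".//equation", lambda root: "equations"),
]

doc = ET.fromstring("<doc><sec><table/></sec></doc>")
print(choose(doc, branches, lambda root: "default"))
```

The alternative Henry mentions, try/catch and failure only, would replace the guards with steps that raise when the content does not fit.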

Use case 5.16: Configuration-Dependent Transformations

Murray: Let's say "Device example" instead of "Mobile example". Add a step 4 that translates to Braille for example

Murray: Ask the device-dependent group for example.

Use case 5.17: Response to an XML-RPC Request

Suggestion: Move this near the top two

Use case 5.18: Multiple-file command-line document generation

This is aggregation of the for-each

Alex proposes to break this into two use cases.

Use case 5.19: Database import/ingestion

Break into two use cases

Use case 5.20: Metadata Retrieval

Richard: This is a looping example

Alex: Replace 5 with "Apply this sequence of transformations"

Use case 5.21: Non-XML Document Production

Pagination in this use case really means "chunking"

Murray: The use case also works for paper markup

Alex: Is there a use case here that isn't XSL-FO?

Henry: I want a "loop until fix-point" construct
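
[Editorial illustration, not part of the discussion] A minimal sketch of a "loop until fix-point" construct, assuming a fix-point means the serialized tree no longer changes between rounds. The names loop_until_fixpoint and drop_empty_children and the max_rounds guard are hypothetical. Serialization is used here only to detect change, not to pass data between steps.

```python
import xml.etree.ElementTree as ET


def loop_until_fixpoint(root, step, max_rounds=100):
    """Apply step repeatedly until its output stops changing."""
    previous = ET.tostring(root)
    for _ in range(max_rounds):
        root = step(root)
        current = ET.tostring(root)
        if current == previous:
            return root
        previous = current
    raise RuntimeError("no fix-point reached")


def drop_empty_children(root):
    # One pass: remove children with no content and no attributes.
    # A removal can leave the parent empty, so several passes may be needed.
    for parent in list(root.iter()):
        for child in list(parent):
            if len(child) == 0 and not (child.text or "").strip() and not child.attrib:
                parent.remove(child)
    return root


doc = ET.fromstring("<a><b><c/></b><d>keep</d></a>")
print(ET.tostring(loop_until_fixpoint(doc, drop_empty_children)))
```

The round limit is there so that a step which never settles fails loudly instead of looping forever.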

Alex: there's a use case, for example for atom, where you get the feed in chunks and I want to build the whole feed
