XML Processing Model WG -- 07 May 2009

Date: 7 May 2009

<scribe> Meeting: 143

<scribe> Scribe: Norm

<scribe> ScribeNick: Norm

Accept this agenda?

-> http://www.w3.org/XML/XProc/2009/05/07-agenda

Accepted

Accept minutes from the previous meeting?

-> http://www.w3.org/XML/XProc/2009/04/30-minutes

Accepted.

Next meeting: telcon 14 May 2009

Norm must give regrets.

Henry to chair

The default XML processing model

Henry reminds us of the two deliverables from our charter.

<ht> http://www.w3.org/DesignIssues/XML

Henry: Those are Tim's thoughts on this issue.
... Tim's ideas are driven largely by the combination of XML elements from different namespaces in vocabularies like HTML
... Tim frames the question in terms of "what is the meaning of this document"? He likes to think about it in a way that I'll characterize informally as "compositional semantics"
... If you have a tree structured thing, you can talk about its semantics as being compositional in so far as the meaning of a bit of the tree is a function of the label on that node and the meaning of its children.
... So a simple recursive descent process will allow you to compute the meaning of the whole tree.
... It's not surprising that we switch between a processing view and more abstract statements about the meaning of nodes
... Tim's perspective is that if the fully qualified names of elements can be thought of as functions, then the recursive descent story is straightforward.
... That is the "XML functions" story about the document's meaning.
... Another bit of the background is the kind of issue that crops up repeatedly "now that we have specs like XML encryption and XInclude, when another spec that's supposed to deal with generic XML
... is created, just what infoset is it that those documents work with if you hand them a URI. Is it the result you get from parsing with a minimal parser, or a maximal parser, or one of those followed by ... name your favorite set of specs ...
... until nothing changes". People keep bumping up against this and deciding that there should be some generic spec to answer this question.
... That's where the requirement in our charter comes from
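
The "XML functions" reading Henry describes can be sketched in a few lines of Python. This is only an illustration of the recursive descent idea, not anything the group specified: the `http://example.org/calc` namespace, the `FUNCTIONS` registry, and the `xml_function` and `meaning` helpers are all made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical registry: fully qualified element name -> function.
# Each function receives the element and the already-computed meanings
# of its children, and returns the meaning of that element.
FUNCTIONS = {}

def xml_function(qname):
    """Decorator that registers a function under a qualified element name."""
    def register(fn):
        FUNCTIONS[qname] = fn
        return fn
    return register

NS = "{http://example.org/calc}"  # invented namespace, for illustration only

@xml_function(NS + "num")
def _num(elem, children):
    return int(elem.text)

@xml_function(NS + "sum")
def _sum(elem, children):
    return sum(children)

@xml_function(NS + "product")
def _product(elem, children):
    result = 1
    for value in children:
        result *= value
    return result

def meaning(elem):
    """Recursive descent: the meaning of a node is a function of its
    label and the meanings of its children."""
    children = [meaning(child) for child in elem]
    return FUNCTIONS[elem.tag](elem, children)

doc = ET.fromstring(
    '<c:sum xmlns:c="http://example.org/calc">'
    '<c:num>2</c:num>'
    '<c:product><c:num>3</c:num><c:num>4</c:num></c:product>'
    '</c:sum>'
)
print(meaning(doc))  # prints 14
```

An element whose qualified name has no registered function raises a `KeyError` here; what should happen in that case is exactly the kind of question a real specification would have to answer.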

So GRDDL could have said "You start with a blurt infoset" where a spec such as ours could define what blurt is.

Henry: I ran with this ball in the TAG for a while.

<ht> http://www.w3.org/2001/tag/doc/elabInfoset/

Henry: This document is as far as I got.
... It uses the phrase "elaborated infoset" to describe a richer infoset that you get from performing some number of operations.
... That document doesn't read like a TAG finding, it reads like a spec.
... So one way of looking at this document is as input into this discussion in XProc.
... I think the reasons this stalled in the TAG are at least to some extent outside the scope of our deliverable.
... They have to do with the distinction between some kind of elaborated infoset and the more general question of application meaning of XML documents.
... Fortunately, we don't have to deal with the latter.
... Hopefully, folks can review these documents before next week.

Norm: They make sense to me

http://www.w3.org/2001/tag/doc/elabInfoset/elabInfoset

Norm: My concern with my chair's hat on is that there are lots of ways to approach this. Perhaps the right thing to do is start by trying to create a requirements and use cases document?

Murray: Can we do that by pointing to all the existing documents?

Henry: I think we could benefit from a pointer to Murray's discussion on the GRDDL working group mailing list.
... I'm thinking that what this is primarily is a way of defining a vocabulary that other specs can use.

Murray: Would I be way off base thinking that a net result of this process will be a pipeline written in XProc?

Henry: The process I described above requires you to repeat the process until nothing changes.

Norm: You could do it, you'd end up doing it n+1 times, but you could do it with p:compare.

Henry: It'd have to be recursive, it'd be a little tricky, but I guess you could do it.
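
The "repeat until nothing changes" process, and Norm's point that it costs one extra round, can be sketched as a fixed-point loop. This is a toy in Python rather than an XProc pipeline, under loud assumptions: the "infoset" is just a string, `[name]` stands in for an inclusion reference, and `DOCS`, `expand_one_level`, and `elaborate` are invented names. Plain equality plays the role that p:compare would play in a pipeline.

```python
import re

# Hypothetical document store: name -> content. "[name]" is a toy stand-in
# for an inclusion reference; none of this is real XInclude syntax.
DOCS = {"a": "x [b] y", "b": "z [c]", "c": "w"}

def expand_one_level(text):
    """Replace each [name] reference once; inserted text is not rescanned."""
    return re.sub(r"\[(\w+)\]", lambda m: DOCS[m.group(1)], text)

def elaborate(infoset, steps, max_rounds=100):
    """Apply every step in order, repeating until a whole round changes
    nothing. Returns (result, number_of_rounds_run)."""
    for rounds in range(1, max_rounds + 1):
        before = infoset  # strings are immutable; a mutable tree would need a copy
        for step in steps:
            infoset = step(infoset)
        if before == infoset:  # the comparison that detects the fixed point
            return infoset, rounds
    raise RuntimeError("no fixed point reached within max_rounds")

result, rounds = elaborate("[a]", [expand_one_level])
print(result)  # prints: x z w y
print(rounds)  # prints: 4 (three rounds that change something, plus one that confirms)
```

The `max_rounds` guard is an assumption of the sketch: nothing in the discussion says how a circular reference should be handled.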

Murray: It seems to me that if we can't answer part 2 by relying on part 1, then we didn't complete part 1 properly.
... If you can't produce an elaborated infoset by running it through a pipeline, then you did something wrong in the pipeline.
... Otherwise, what hope does anyone else have of accomplishing this?

Henry: It seems entirely possible to me that there are operations that on the one hand need to be specified as primitives in our view of things and on the other hand are not necessary for any of the use cases that get us over the 80/20 point.
... I'm not yet convinced that there aren't things in that category that you need.

Murray: Maybe that'll provide the impetus for XProc v.next
... If I have a document that purports some truth, but I have to go through lots of machinations to get there, but there's a formulaic way then we should be able to use the tools to do it.

Henry: I believe, though I'm not sure, that you could implement XInclude by writing an XProc pipeline that didn't use the XInclude step. Doing so might reveal something about the complexity of XInclude.
... If someone said you don't need to do that, you can always write a pipeline, I'd say "No, wrong, you could but you wouldn't want to."

Norm: If we get to the point where we think we could do it, but we needed a few extra atomic steps, I think we could call that victory.

Henry: Let me introduce one other aspect, in attempting to do this in a way that doesn't require a putative spec to be rewritten every time some new bit of XML processing gets defined,
... the elaborated infoset proposal has this notion of "elaboration cues" and attempts to define the process independent of a concrete list of these cues.
... I'm not sure how valuable that attempt to be generic is.

Norm: I think one possibility is to define a concrete pipeline that does just have a limited set of steps.

Henry: Doesn't that mean that if we add a new obfuscation step, that de-obfuscation requires us to revisit the elaborated infoset spec?

Norm: Yes.

Murray: Right, we're talking about the default processing model. Henry's talking about the obfuscated processing model which would be different.
... You can petition later to become part of the default.

Henry: Another way of putting it is, should the elaboration spec be extensible?

Norm: Right. And one answer is "no".

<MoZ> Scribe : MoZ

Murray: It looks like it has been done in other specs. What we need to do is define the processing model for the most common cases.

Henry: My experience is that it will be easier to reach agreement on elaboration if we allow people to control what is elaboration and what isn't.
... The default is to have XInclude but not external stylesheets.
... Some want XInclude, some want external stylesheets, and others want both.
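
One way to picture "letting people control what is elaboration" is a registry of named steps with a default selection. The sketch below is an assumption-laden illustration in Python, not the elaborated infoset proposal: the cue names, the fixed `ORDER`, and the placeholder step functions are all hypothetical, and the document is modelled as a plain list of strings.

```python
# Placeholder steps: each just records that it ran. Real steps would
# transform an infoset; these are hypothetical stand-ins.
def expand_xinclude(doc):
    return doc + ["xinclude expanded"]

def apply_stylesheet(doc):
    return doc + ["external stylesheet applied"]

# Hypothetical registry of elaboration steps, keyed by an invented cue name.
REGISTRY = {"xinclude": expand_xinclude, "stylesheet": apply_stylesheet}

# Assumed fixed order in which enabled steps run.
ORDER = ("xinclude", "stylesheet")

# The default mentioned in the discussion: XInclude on, external stylesheet off.
DEFAULT_CUES = frozenset({"xinclude"})

def select_steps(enabled=DEFAULT_CUES):
    """Return the enabled steps in ORDER, rejecting unknown cue names."""
    unknown = set(enabled) - set(REGISTRY)
    if unknown:
        raise ValueError(f"unknown elaboration cues: {sorted(unknown)}")
    return [REGISTRY[name] for name in ORDER if name in enabled]

def run(doc, enabled=DEFAULT_CUES):
    """Apply the selected steps once, in order."""
    for step in select_steps(enabled):
        doc = step(doc)
    return doc

print(run(["source"]))
# prints ['source', 'xinclude expanded']
print(run(["source"], {"xinclude", "stylesheet"}))
# prints ['source', 'xinclude expanded', 'external stylesheet applied']
```

Whether a new step can be added to such a registry without revising the specification is the extensibility question raised earlier in the discussion; the sketch takes no position on it.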

Norm: I agree, but will it make any progress on the problem?

Henry: It will depend on the conformance story.
... What I had in mind was: if GRDDL is coming out, and has the ability to say that the input of the processing is a GRDDL elaborated infoset

Murray: What happens to an XML document that is no longer an XML document (encryption, zipping, etc.)?

Henry: I agree, that's all we're talking about.

<Norm> scribe: Norm

Some discussion about which technologies preserve the "XMLness" of a document.

Encryption and Henry's obfuscation example both produce XML documents

Mohamed: It's an interesting discussion. There is, I think, a common base on which we can at least agree.
... These are more technical than logical, for example, XInclude, encryption, where the behavior is clearly defined.
... On top of that, there's a layer of user behavior. I think we'll have a hard time at that layer.
... Defining the use cases and requirements is probably the only place we can start.

Norm: Yeah, I don't want to make us do work that's already been done. But I think there would be value in collecting the use cases together
... to see if we have agreement that some, all, or none of them are things we think we could reasonably be expected to achieve.

Murray: Hopefully more than none.

Norm: Indeed.
... Ok, for next week, let's plan to have reviewed the documents that Henry pointed to and spend some time on use cases and requirements.

Henry: Agreed.

Any other business?

None heard.

Adjourned.

- DRAFT -
