See also: IRC log
Date: 23 Oct 2008
<scribe> Meeting: 128
<scribe> Scribe: Norm
<scribe> ScribeNick: Norm
Mohamed: Is everyone going to be here this afternoon?
Norm: The AC meeting is this afternoon, we'll see what happens.
... We can rearrange the agenda if necessary.
Agenda accepted, for the time being.
Vojtech: What is the default XML processing model?
Norm: It's the other work item on our charter; in the absence of any explicit instructions, what processing should an XML processor perform.
... We need to start thinking about that item.
Norm: Following discussions with the XML Security WG, we're not likely to have any definitions in time for V1.
Mohamed: What about having simple steps with parameters?
Norm: I don't see how that provides any more interoperability than just letting implementors do it in their own namespace.
Proposal: close with no action.
Norm: I think these are all ok, but I haven't implemented them yet.
Alex: Where did we leave off?
Norm: We just need to be careful that introducing "implementation defined namespaces" doesn't leak outside the XQuery step. But I don't think that's going to be a problem because we have an XML syntax.
Alex: Do we need to say something about who wins when they come from both places?
Norm: So if my p:xquery call has a foo: namespace declaration and my XQuery implementation predefines the foo: namespace (differently), who wins?
Alex: It seems like the right answer would be, we stuff our things into the static context and that overrides what was in there by default.
Norm: My guess is that the query processor starts and will overwrite anything that we put in the static context.
<scribe> ACTION: Alex/Norm to investigate how this actually works. [recorded in http://www.w3.org/2008/10/23-xproc-minutes.html#action01]
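The precedence question above can be sketched as a merge of two prefix-to-URI maps. This is only an illustration of the two candidate policies, not a description of how any XQuery processor behaves (that is what the action item is meant to find out). The function name, the `pipeline_wins` flag, and the example.com URIs are all hypothetical.

```python
def merge_namespaces(pipeline_ns, processor_ns, pipeline_wins):
    """Combine prefix-to-URI bindings from two sources under one precedence policy.

    pipeline_ns:  bindings in scope on the p:xquery step (hypothetical input)
    processor_ns: bindings the query processor predefines (hypothetical input)
    """
    if pipeline_wins:
        # Alex's suggestion: what the pipeline puts in the static context overrides the defaults.
        return {**processor_ns, **pipeline_ns}
    # Norm's guess: the processor overwrites whatever was put in the static context.
    return {**pipeline_ns, **processor_ns}


pipeline_ns = {"foo": "http://example.com/pipeline-foo"}
processor_ns = {"foo": "http://example.com/builtin-foo"}

print(merge_namespaces(pipeline_ns, processor_ns, pipeline_wins=True)["foo"])
print(merge_namespaces(pipeline_ns, processor_ns, pipeline_wins=False)["foo"])
```

The two calls return different URIs for the same prefix, which is why the group wanted the behavior pinned down.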
Mohamed: My thought was about all the validation steps. Is there a static context for them too?
Alex: For schema there isn't.
Mohamed: All the steps make it clear what is declared in XProc but XQuery is starting to make us think differently about it.
Norm: I don't think any of the other steps have this sort of defaulted namespace behavior.
Norm: I'm perfectly happy with Henry's proposal for lax/strict.
... Then Henry goes on to propose some new options: use-schema-location and try-namespace.
Alex: I think use-schema-location is a really good idea.
Some discussion of whether or not parameters should be passed to the schema-validate step.
Norm: Let's set this one aside until Henry gets here.
... The only thing you can't do with extension attributes is compute their values dynamically. I don't know how serious that is.
Vojtech: Sometimes it's difficult to detect exactly why an XPath expression failed.
Proposal: Accept the changes.
Vojtech: If you define a default binding for p:input and you then refer to a variable not-in-scope, what happens?
Norm: The expression fails.
... I think the upshot is that we need to say somewhere general that it's an error to refer to variable bindings that are not in scope.
<scribe> ACTION: Norm to add a general statement about out-of-scope variables. [recorded in http://www.w3.org/2008/10/23-xproc-minutes.html#action02]
Mohamed: With respect to the binding of p:option, we should say that it's as if the binding was to p:empty; then in 5.15 we should say what that means (empty in 1.0 and undefined in 2.0).
Norm: Makes sense to me.
Mohamed: Then maybe we wouldn't have to cut-and-paste that prose everywhere
Norm: Anyone disagree?
<scribe> ACTION: Norm to fix p:empty and p:option as Mohamed suggests. [recorded in http://www.w3.org/2008/10/23-xproc-minutes.html#action03]
Henry: I chose these two options explicitly because these are the ones that you need to get Saxon to do the right thing. The default behavior changed between 8.0 and 9.0.
... What exactly it means to "try namespaces" is implementation defined (RDDL, GRDDL, etc.)
Alex: For Xerces, if you turn off the use-schema-location hints and add a catalog, that'll just work.
Henry: Catalogs should be transparent. They enter the game at the time you have a URI that you're trying to dereference.
Alex: We need to be very clear about what try-namespaces means.
Henry: We can point directly into the schema spec for the right paragraph and clause.
Alex: I have a catalog for my
schema processor and I need to tell it where the catalog
... I could do it externally, but that would be global in some way.
Henry: We haven't decided if parameters are a mechanism which people can use to extend the option set in implementation specific ways.
... I don't think that's what they were intended for.
... They were intended to operate in the case where it is in the nature of a particular step that it has an open-ended set of options.
Alex: We have steps that violate that: p:hash and p:xsl-formatter
Henry: Are we sure we're capable of predicting in advance which steps are likely to want parameters? Shouldn't every step have a parameter port?
Alex: Going back through last call?
Henry: Right, I've said it, but I agree we don't want to go through last call for it.
Vojtech: We have an explicit error for p:hash
Alex: Maybe we should make that a general "I didn't like your parameter" error.
... The only thing I can see parameters for are weird implementation features.
Henry: Don't we really need a way to allow implementations to extend the list of options available on the step?
Alex: Can we do this in V.next
Henry: Yes, but it will be very disruptive. The p:hash and p:xsl-formatter steps will have these parameters when they don't need them anymore.
... This would actually have the benefit of packaging things a little better.
Inspection of 3.8
Henry: It seems to me that extension attributes can be used to pass implementation-specific strings, but they are static.
Vojtech: Why don't we have a way to compute extension attribute values?
Norm: We decided not to do attribute value templates, and we don't have an element syntax for them.
Does anyone want to add a parameter input port to p:validate-with-schema?
Do we want to add the use-location-hints and try-namespaces options ?
Are we happy with the proposed error?
Alex: Should we have the general error about bad parameters or bad parameter values?
Reconvene at 14:00
Norm: I think we should allow it, but may require adding some prose about the base URI of the pipeline or library document.
Vojtech: We can import pipelines, not just libraries, but the prose talks about libraries.
Norm: Yes, that's probably just sloppy wording. I'll fix it.
<scribe> ACTION: Norm to fix the wording about imports so that it applies equally to p:pipelines and p:libraries [recorded in http://www.w3.org/2008/10/23-xproc-minutes.html#action04]
Vojtech: Does this include little self-contained compound steps?
Henry: Yes, this is fine.
Norm: There's no issue, we can just close this without action.
Norm: I asked if unknown steps were an error, and the consensus was that they are not.
... I'm satisfied.
... I propose we close this with no action.
<scribe> ACTION: Norm to change 6.1 so that it's not a static error. [recorded in http://www.w3.org/2008/10/23-xproc-minutes.html#action05]
<MoZ> Norm, what does it mean for the implementation?
It means that it's a dynamic error if you attempt to evaluate it.
Which we already say
Does that make sense, MoZ ?
Norm: I think it should not be primary; Henry agreed. Any objections?
Vojtech: As long as you can bind something, I'm fine.
Norm: I was confused because of
our changes to tracking position and length in for-each and
... I think Mohamed is right and there's no problem.
... Proposal: close without action.
Norm attempts to explain the situation.
Norm: I think p:store w/o an href should write the document to the location of the base URI of the document being stored.
... Though we appear not to actually say that yet.
Henry: If base URIs are propagated, doesn't that run the risk of blowing away the pipeline document.
Norm: If I have an XSLT step that produces a result document, and I p:store that result document, I want it to be written to the right URI.
Henry: See what we say at the top
of section 7. If I feed file://important/document into a
complex pipeline that has a p:store somewhere and I've
forgotten to put href on it, we'll overwrite the
... Is that really what we want?
Norm: We have our own base URI function (because XPath 1.0 didn't)
... So you could say:
<p:with-option name="href" select="p:base-uri(/)"/>
Henry: There are three options: (1) make it required, (2) give the empty string special status, perhaps an error, or (3) give it a default that we think does something useful, like /dev/null
Norm: If I have a p:xslt step that produces a bunch of secondary result documents and I want to write them to disk, I'll have to write the complex form of p:store in order to save the documents.
Henry: We could specify that the base URI for absolutization in p:store is the base URI of the primary input.
Alex/Norm: We could add a separate option for store to base-URI?
Henry: On balance, I think the facts are that you can get what you want and anything else puts carelessness at high risk.
... But do we call the empty string an error?
Norm: No, because #foo would do the same thing.
Henry: So I think the consensus is that the href attribute is required.
Proposal: Make the href attribute required.
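Norm's point that an empty href and a fragment-only href such as #foo both land on the document's own base URI follows from ordinary relative-reference resolution. A minimal sketch using Python's standard `urllib.parse`; the `store_target` helper and the file URI are hypothetical, and this does not model p:store itself:

```python
from urllib.parse import urldefrag, urljoin


def store_target(href, base_uri):
    """Resolve href against base_uri and drop any fragment, giving the resource that would be written."""
    absolute = urljoin(base_uri, href)
    return urldefrag(absolute).url


base = "file:///data/important.xml"  # hypothetical base URI of the document being stored

print(store_target("", base))                # empty href resolves to the base URI itself
print(store_target("#foo", base))            # fragment-only href names the same resource
print(store_target("out/result.xml", base))  # an explicit href resolves relative to the base
```

Under these assumptions the first two calls both name the input document, so treating only the empty string as an error would not remove the risk of overwriting it.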
Norm: I think Mohamed is
... Proposal: Make it explicit that position() and last() are available in wrap sequence.
Norm: This is a spec exposition bug. We just need to say somewhere that p:log can be used on all the atomic steps.
<scribe> ACTION: Norm to change 3.3 so that it refers to with-option, variable, etc. [recorded in http://www.w3.org/2008/10/23-xproc-minutes.html#action06]
Mohamed: I made a proposal and we talked about it and decided not to do it.
Alex: It should be put in the serialization spec, we shouldn't have to do it. It's something everyone wants.
Proposed: Close with no action.
Mohamed: I did make a request for an example.
Norm: I'm fine with that.
<scribe> ACTION: Norm to add an example of C14N [recorded in http://www.w3.org/2008/10/23-xproc-minutes.html#action07]
Review of use cases and requirements
<scribe> ACTION: Norm to add our use cases and requirements document to the References [recorded in http://www.w3.org/2008/10/23-xproc-minutes.html#action08]
Use case 5.10 requires dsig, so we can't do that one.
Use case 5.11 requires a validator that preserves base URI properties.
Use case 5.14 requires tagsoup or tidy, so we can't do that one.
<alexmilowski> What's up with the git down?
Discussion of content-type on p:load and p:document to satisfy 5.14
Mohamed observes that p:data can load non-XML resource, but we have no facility for doing that with a computed URI.
Mohamed: So we need to create another step or somehow extend p:load
<alexmilowski> well... yes
Vojtech: You can't use text/plain on p:load because it doesn't provide a wrapper.
<alexmilowski> not even with with-option ?
Norm: Maybe this is how we decided to use p:http-request for this case...
Mohamed: I want to fetch an xhtml document which is distributed as text/html so that I am able to work with it.
<alexmilowski> you can pass a computed URI with a 'file' scheme
<alexmilowski> (or whatever)
<alexmilowski> unescape-markup ...
<alexmilowski> HTML5 will need a "special" parser for sure...
<alexmilowski> "All strings are valid HTML5" ...
<alexmilowski> guess who said that...
<alexmilowski> Besides... ISO-8859-1 is really Windows-1252 according to HTML5 ...
<alexmilowski> ...so, you really want p:data ...
<alexmilowski> (seriously... you really do...)
<alexmilowski> In fact... you want a byte sequence base64 encoding so you can run their crazy redefinition of character encodings
You want p:data, but you can't use p:data if you need to construct the URI
<scribe> ACTION: Norm to fix the note in http-request that says unescape-markup will undo base64 encoding [recorded in http://www.w3.org/2008/10/23-xproc-minutes.html#action09]
<alexmilowski> If you get a text/html media type...
<alexmilowski> ...and it has a non-unicode encoding...
If you get back base64 encoded text, you're screwed.
<alexmilowski> do you get base64 ?
Given the above description, any content identified as text/html will be base64-encoded in the c:body element, as HTML isn't always well-formed XML. A user can attempt to convert such content into XML using the p:unescape-markup step.
<alexmilowski> (checking spec)
<alexmilowski> Here's our note:
<alexmilowski> "Given the above description, any content identified as text/html will be base64-encoded in the c:body element, as HTML isn't always well-formed XML. A user can attempt to convert such content into XML using the p:unescape-markup step."
<alexmilowski> "is recognized as a non-XML media type whose contents are encoded as a sequence of Unicode characters (e.g. it has a character parameter or the definition of the media type is such that it requires Unicode),"
<alexmilowski> That says that text/html; charset=UTF-8 should end up as characters and not base64
<alexmilowski> But text/html; charset=ISO-8859-1 should be base64
<alexmilowski> Thus... you might have to look at the 'encoding' attribute of 'c:body' to understand whether you have characters or not.
<alexmilowski> What we need is a media type parameter of 'version'
This is all very unsatisfying
<alexmilowski> so p:unescape-markup can use
<alexmilowski> text/html; charset=ISO-8859-1; version=5.0
What are the problems?
<alexmilowski> text/html isn't what you expect anymore...
1. p:data can load a non-XML resource, but can't do so with a computed URI
<alexmilowski> especially if they are going to codify the sins of the past...
<alexmilowski> that ISO-8859-1 will be treated as Windows-1252 (along with others)
alexmilowski, keep the html5 chatter in /me 's or something ok?
2. p:load takes a computed URI, but can't load non-XML data
<alexmilowski> I was talking about mohamed's want to use this for html
<alexmilowski> ..which led us to p:http-request...
<alexmilowski> so I thought we were talking about the same thing.
3. p:http-request can take a dynamic URI and can load non-XML data, but it's likely to base64 encode the result
4. And we don't have a way to unescape base64 encoded text
<alexmilowski> In (3) you get two different outputs for text/html
<alexmilowski> I think our note is wrong... am I correct or wrong?
<alexmilowski> ISO-8859-1 is not a unicode encoding...
<alexmilowski> so you get base64 for that...
<alexmilowski> UTF-8 is... so text/html; charset=UTF-8 gives you escaped html
<alexmilowski> Rather unfortunate
The note is wrong.
<alexmilowski> What we need is the "HTML munge" step...
But you're right about ISO-8859-1 text, which is still pretty common.
<alexmilowski> Or... we could have a "treat as text" option...
<alexmilowski> e.g. if you get text/* media type and, in theory, can map to unicode via the encoding...
<alexmilowski> then treat as text...
<alexmilowski> I hate to say this sounds like an issue we might want to take up while we are all here.
<alexmilowski> I'll come back... They aren't covering what I was interested in...
<alexmilowski> I'll be there soon as I can walk there...
Alex: We separated out the encoding on the result from http-request, but we don't seem to be doing this here.
... c:data and c:body are slightly out of step in this regard.
Alex: You might want to choose what to do with data based on its encoding: even if it's a mappable encoding, you might want to treat it as data.
We need to clarify how/what encoding means on c:body when it appears in a response.
<scribe> ACTION: Norm to clarify encoding on c:body in a response--probably by saying that it isn't used [recorded in http://www.w3.org/2008/10/23-xproc-minutes.html#action10]
Norm: I think there's consensus that we could make forward progress by saying that implementations SHOULD attempt to convert the content of any text/* media type into Unicode characters. Implementations MUST present text/* media types that use a Unicode encoding as characters.
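One way to read the SHOULD/MUST wording above is as a choice between presenting a response body as characters or as base64. The sketch below is an assumption-laden illustration, not spec text: `represent_body` is a hypothetical helper, it only attempts conversion when a charset parameter is present, and it falls back to base64 whenever decoding is not possible.

```python
import base64
from email.message import Message


def represent_body(content_type, raw):
    """Return ("characters", text) when the body can be decoded, else ("base64", encoded)."""
    header = Message()
    header["Content-Type"] = content_type
    charset = header.get_content_charset()  # lower-cased charset parameter, or None
    if header.get_content_maintype() == "text" and charset:
        try:
            # Attempt the conversion for any text/* type with a charset we know how to decode.
            return ("characters", raw.decode(charset))
        except (LookupError, UnicodeDecodeError):
            pass  # unknown charset or undecodable bytes: treat as opaque data
    return ("base64", base64.b64encode(raw).decode("ascii"))


print(represent_body("text/html; charset=UTF-8", "<p>caf\u00e9</p>".encode("utf-8")))
print(represent_body("text/html; charset=ISO-8859-1", "<p>caf\u00e9</p>".encode("iso-8859-1")))
print(represent_body("image/png", b"\x89PNG"))
```

With this reading, text/html in ISO-8859-1 comes back as characters rather than base64, which is the case Mohamed was asking about.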
Light breaks over Marblehead...the p:unescape-markup step *can* decode base64 encoded text.
Mohamed: We need encoding on c:data
Alex: That's right because it might or might not be base64 encoded.
... In unescape-markup we need to say that there can be a charset parameter on the content-type.
Vojtech: We should remove the charset parameter's default value and say that it's only used if it's specified and it overrides the charset on the content-type.
Norm: What have we decided?
1. Remove the default value from the charset parameter on p:unescape-markup
2. Steps that take a content-type should respect the charset parameter
3. If you specify a charset on unescape-markup, it overrides the charset parameter on the content-type
4. If you don't specify the charset in either place, and the encoding is base64, that's a dynamic error
5. Change p:unescape-markup so that it ignores the charset if the encoding isn't specified.
6. If you want to load a non-XML resource, you're stuck with p:http-request
7. Specifically, it's not a dynamic error if encoding isn't specified and the charset is
8. Add encoding attribute to c:data
9. Document that http-request can be used to load non-XML resources
Add an example that shows that there are a bunch of options that don't make sense
<scribe> ACTION: Alex to go through the spec again and look at the encoding/charset things [recorded in http://www.w3.org/2008/10/23-xproc-minutes.html#action11]
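The charset and encoding decisions listed above (items 3, 4, 5 and 7) can be sketched as a small resolution function. This is one reading of the minutes, not the step's definition: `unescape_input`, `DynamicError`, and the parameter names are hypothetical, and rejecting encodings other than base64 is an added assumption.

```python
import base64


class DynamicError(Exception):
    """Stand-in for an XProc dynamic error (hypothetical name)."""


def unescape_input(text, encoding=None, option_charset=None, content_type_charset=None):
    """Return the character data that would be handed to the markup parser."""
    if encoding is None:
        # Items 5 and 7: without an encoding the charset is ignored, and that is not an error.
        return text
    if encoding != "base64":
        # Assumption for this sketch: base64 is the only encoding handled.
        raise DynamicError(f"unsupported encoding: {encoding}")
    # Item 3: the charset given on the step overrides the one on the content-type.
    charset = option_charset or content_type_charset
    if charset is None:
        # Item 4: base64 content with no charset in either place is a dynamic error.
        raise DynamicError("base64 content with no charset")
    return base64.b64decode(text).decode(charset)


encoded = base64.b64encode("<p>caf\u00e9</p>".encode("iso-8859-1")).decode("ascii")
print(unescape_input(encoded, "base64", content_type_charset="iso-8859-1"))
```

Passing `option_charset="iso-8859-1"` together with a conflicting `content_type_charset="utf-8"` decodes the same way, since the step-level value wins under this reading.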
Present: Zarella_(PTC) Mohamed Vojtech Norm Alex
Agenda: http://www.w3.org/XML/XProc/2008/10/tpac-agenda