XML Processing Model WG -- 16 Apr 2009

<ht> RESOLVED: Accept minutes of 9 Apr 2009 as a correct record http://www.w3.org/XML/XProc/2009/04/09-minutes.html

<ht> No audible regrets for 23 April

<ht> HST: Do we have errors for undeclared inputs or outputs?

<ht> TV: No

<ht> TV: NW said in his reply he liked the extra specificity of 31. . .

<ht> TV: I'm OK with keeping two, but it does mean there may be differences between implementations as to which they raise

<ht> RESOLUTION: No change required

Issue 88

<ht> TV: In the case of multipart requests, I thought there was a potential conflict

<ht> ... but we've decided that these must be consistent

<ht> AM: Yes, we decided this, and there are a number of sources of possible conflict, but they just have to be detected and flagged

<ht> RESOLUTION: Overtaken by resolution of more general issue

Issue 97

<ht> TV: This is now addressed in the spec: 7.1.10.3.2

<ht> ACTION: Norm to follow up with OP to signal that issue 97 has been addressed, ask for agreement [recorded in http://www.w3.org/2009/04/16-xproc-minutes.html#action01]

<ht> RESOLUTION: Issue 97 has been addressed in the current draft: should be preserved

Issue 101

<ht> NDW: related to 97

<ht> ... Seems to me we should follow 302

<ht> ... no control over it for the time being

<ht> AM: Agreed

<ht> ... Maybe we come back to this if we're asked

<ht> RESOLUTION: We will always follow redirect, no option to allow skipping

Issue 102

<ht> NW: If assert-valid is false, maybe you get some PSVI stuff, maybe you don't, but in either case nothing else happens

<ht> TV: But the prose doesn't describe what _happens_ if it's false

<ht> ACTION: NDW to clarify 7.2.6 validate-with-xml-schema to make clear what happens if assert-valid is false [recorded in http://www.w3.org/2009/04/16-xproc-minutes.html#action02]

Issue 103

<scribe> scribe: Norm

NDW: I think the question here is how the xs:includes interact with the schemas passed into the step

AM: From an XML Schema perspective, you can imagine an implementation using a catalog system.

NDW: I don't think that applies here. The schemas are inline.

HST: What's crucial is that all the schemas involved are for the same namespace.

TV: Maybe that's an error.

HST: Yes, but that's going to uncover another error in the simplest way to resolve this.
... We could say "that is, no schema document for any namespace provided for by any of the supplied documents may be processed."
... That needs to be clarified: for each namespace that is supplied by one of the specific instance inputs, no processing for that namespace should be done elsewhere.
... that should be the only document.

AM: How much of this do we have to specify?

HST: The schema spec has one MUST and leaves everything else up to the impl; the MUST is that if you try to redefine something, it's an error if you can't get at the something

AM: To some extent, this should generate an error. You've got two top level schemas for the same namespace.
... I think in Xerces, the second schema would just replace the first.

NDW: Let's imagine that there were schemas for two namespaces here. I think the collision is accidental in Vojtech's example.

HST: Schema locations are hints except in the case of xs:include and xs:redefine.
... Schema locations provided in the case of xs:include are not hints: unless you get a 404, you must process what you find.
... What's also true is that you are allowed to detect duplicates and not throw an error on second and subsequent times.
... As it happens Vojtech's example is a real cracker because it can't arise outside of XProc because if you ask for a URI, you'll get the same data.
... Unless it changes under you, which some processors care about
... You couldn't have this problem in anything except XProc. Because anywhere else you'd be allowed to notice the same URI used a second time and ignore it.

AM: How is our case any different from Xerces with a catalog?

HST: Because they're indexed by URI. This isn't.

AM: I think this is the same.
... It goes to the catalog.

<ht> I really don't think that you can get Xerces to reproduce this case

Some clarification: Alex expects an implementation to pre-scan the schemas coming in and note their base URIs so that if there's subsequent reference to one of the ones "later" in the queue, it gets used anyway.

<ht> ... but in any case I think that's a red herring

AM explains how this case is very similar to the catalog case.

HST: Implementations are always free to ignore the second one because its base URI is already loaded.

AM: If you process them serially, you're free not to process them serially.

HST: Stipulate that AM is right, I still don't think we should go there.

VJ: If we do this, it may have implications on other steps as well. It sets a precedent that we might want to apply globally.

AM: But you can always do that.

NDW: That's true, we have a section that allows that.

NDW: I agree you could do it this way, but I think it's a novel application of catalogs.

<ht> I suggest the following replacement: "... must be used in preference to any schema locations provided by schema location hints encountered during schema validation, that is, schema locations supplied for xs:import, xsi:schema-location or determined by schema-processor-defined namespace-based strategies, for the namespaces covered by the documents available on the schemas port."

VJ: If you want to do this, you could in theory use XProc to fetch the schema first then change the schemaLocation attribute to something else and then pass it to the validate step.
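A minimal sketch of the workaround VJ describes, written in Python rather than as an XProc pipeline. The function name, the mapping argument, and the example.com URIs are hypothetical and only illustrate the idea of rewriting schemaLocation before the schema reaches the validate step.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
# Schema elements that carry a schemaLocation attribute.
REFERENCING = {f"{{{XS}}}include", f"{{{XS}}}import", f"{{{XS}}}redefine"}

def rewrite_schema_locations(schema_text, mapping):
    """Return the schema with top-level schemaLocation values replaced per mapping.

    mapping is a dict from old location to new location (hypothetical helper,
    not part of any XProc or XML Schema API).
    """
    root = ET.fromstring(schema_text)
    for child in root:
        if child.tag in REFERENCING:
            old = child.get("schemaLocation")
            if old in mapping:
                child.set("schemaLocation", mapping[old])
    return ET.tostring(root, encoding="unicode")

# Hypothetical fetched schema that includes a second schema document.
FETCHED = (
    '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" '
    'targetNamespace="http://example.com/a">'
    '<xs:include schemaLocation="http://example.com/part.xsd"/>'
    '</xs:schema>'
)

rewritten = rewrite_schema_locations(
    FETCHED, {"http://example.com/part.xsd": "http://example.com/other.xsd"}
)
```

The rewritten text would then be what is passed on to validation; locations not listed in the mapping are left untouched.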

AM: Yes, you could process the schemas to try to make them consistent.
... Do we have this consistency problem in other steps?

VJ: In XQuery if you passed in documents that referred to each other...
... You'd pass something which refers by URI and you pass a second document with that base URI, then the second document should override the external document.
... If we say it should behave that way, then it's the same case as this.

AM: In the XQuery case, it's the default collection so it would work.

HST outlines the suggestion he put into IRC above

HST: There are three different bits of the spec where the notion of a hint arises. One is for xs:import, one is xsi:[noNamespace]schemaLocation, and one is for namespace-based discovery.
... So in all those cases, what we say is that if one of the schema documents you have been given on the schemas port is for that namespace, then you're done.
... What it keys off of is the targetNamespace attribute of the schemas that you find on that port.
... That's what matters, not the base URI.
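A minimal sketch of the rule HST outlines, assuming the schemas on the port are available as XML text. The function names and example.com namespaces are hypothetical; the point is only that the decision keys off targetNamespace and never looks at a base URI.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

def covered_namespaces(schema_docs):
    """Collect the target namespaces of the supplied schema documents.

    A schema without a targetNamespace attribute is recorded as None.
    """
    covered = set()
    for text in schema_docs:
        root = ET.fromstring(text)
        if root.tag != f"{{{XS}}}schema":
            raise ValueError("not an xs:schema document")
        covered.add(root.get("targetNamespace"))
    return covered

def should_follow_hint(namespace, covered):
    """Follow a location hint only if no supplied schema covers the namespace."""
    return namespace not in covered

# Hypothetical content of the schemas port.
SCHEMAS_PORT = [
    '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" '
    'targetNamespace="http://example.com/a"/>',
]

covered = covered_namespaces(SCHEMAS_PORT)
print(should_follow_hint("http://example.com/a", covered))  # False
print(should_follow_hint("http://example.com/b", covered))  # True
```

Under this reading the same check would apply to all three kinds of hint (xs:import, the xsi location attributes, and namespace-based discovery).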

NDW: What about this example?

HST: Whether it works or not depends entirely on whether there are any conflicting definitions in these places, because you're going to process both of them.
... There's nothing in my proposal that says just because you get two documents with the same base URI, you don't process them both.

AM: So if you have two schema documents and one defines type A and one defines type B, same target namespace, then that's no problem, right?

HST: Yes, that's right. No problem.

VJ: So in that case, how do you know which schema to use as a starting point?
... Here we have two schemas for the same namespace.

HST: If they define any items of the same kind with the same name at the top level, then you're broken.

AM: There are lots of ways to start a validation episode, but we don't specify any of them specifically.

NDW: So we're going to add Henry's prose about hints and say that for the case of this example, you'd load both documents and if there are no conflicts you're golden, if there are, you lose.
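A rough sketch of the "load both, fail on conflict" outcome, assuming schemas arrive as XML text. The function name and sample documents are hypothetical. As a simplification, any repeated top-level name of the same kind is reported, even if the two definitions happen to be identical; simpleType and complexType are treated as one kind because they share the type-definition symbol space in XML Schema.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

XS = "http://www.w3.org/2001/XMLSchema"
# Map top-level schema elements to the symbol space they define into.
KIND = {
    "simpleType": "type", "complexType": "type", "element": "element",
    "attribute": "attribute", "group": "group",
    "attributeGroup": "attributeGroup", "notation": "notation",
}

def top_level_conflicts(schema_docs):
    """Return {(namespace, kind, name): [doc indexes]} for names defined more than once."""
    seen = defaultdict(list)
    for index, text in enumerate(schema_docs):
        root = ET.fromstring(text)
        ns = root.get("targetNamespace")
        for child in root:
            if not isinstance(child.tag, str) or not child.tag.startswith(f"{{{XS}}}"):
                continue
            kind = KIND.get(child.tag.split("}", 1)[1])
            name = child.get("name")
            if kind is not None and name is not None:
                seen[(ns, kind, name)].append(index)
    return {key: docs for key, docs in seen.items() if len(docs) > 1}

def schema(body):
    """Wrap a body in a hypothetical schema for http://example.com/a."""
    return ('<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" '
            'targetNamespace="http://example.com/a">' + body + '</xs:schema>')

# Type A in one document, type B in the other: no conflict.
print(top_level_conflicts([schema('<xs:complexType name="A"/>'),
                           schema('<xs:complexType name="B"/>')]))  # {}
# Element x declared in both documents: conflict.
print(top_level_conflicts([schema('<xs:element name="x"/>'),
                           schema('<xs:element name="x"/>')]))
# {('http://example.com/a', 'element', 'x'): [0, 1]}
```

The first call corresponds to AM's "type A and type B" case; the second to HST's "items of the same kind with the same name at the top level".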

AM: It seems to me that we're still asking the step to look ahead at all the schemas on the port to find the target namespaces.

HST: That's my understanding of how the existing processors deal with what we'll call for the sake of argument "command-line arguments".

AM: You can intercept every step along the way in Xerces. You can do all kinds of crazy stuff. I'm not saying that's consistent behavior, but you can do anything you want.

Norm: We're running out of time. I'll incorporate some of Henry's proposal, because I don't think that was controversial.
... Let's try to take the harder cases to email and see if we can work them out.

Any other business?

Norm thanks Henry for scribing/chairing the beginning of the meeting.

Adjourned.

- DRAFT -

XML Processing Model WG

Meeting 140, 16 Apr 2009

Attendees