W3C

W3C Workshop on XML Schema 1.0 User Experiences and Interoperability

22 June 2005

Nearby: Call for participation · Workshop Program · Chairs' summary report · Minutes Day 1

See also: IRC log

Attendees

Present
Alex Milowski (University of California, Berkeley), Allen Brookes (Rogue Wave), Chris Ferris (IBM), Dan Vint (Acord), Dana Florescu (Oracle), David Ezell (NACS, Chair of the XML Working Group), Derek Denny-Brown (Microsoft), Douglas Purdy (Microsoft), Erik Johnson (Epicore), Heidi Buelow (Rogue Wave), Jon Calladine (BT), Jonathan Marsh (Microsoft, Chair of the WSDL Working Group), Kohsuke Kawaguchi (Sun), Kongyi Zhon (Oracle), Leonid Arbouzov (Sun), Mark Nottingham (BEA), Mary Holstege (Mark Logic), Michael Rowell (OAGi), Michael Sperberg-McQueen (W3C), Noah Mendelsohn (IBM), Paul Biron (HL7), Paul Downey (BT), Philippe Le Hégaret (W3C), Radu Preotiuc-Pietro (BEA), Ravi Murthy (Oracle), Soumitra Sengupta (Microsoft), Sridhar Guthula (QuickTree), Steve Ericsson-Zenith (Semeiosis), Tony Cincotta (NIST), Ümit Yalçınalp (SAP), Waleed Abdulla
Chairs
Paul Downey, Michael Sperberg-McQueen
Scribes
Jon, Erik, Alex, Ümit

Contents

  1. Matrix of user experience reports
  2. Panel discussion
  3. Industry Way Forward
  4. Validation/Code Generation
  5. Profile

Minutes of the first day are also available.

Matrix of user experience reports

Paul Biron: created summary of all reports. Stand-out observation: Too many people have wanted too many things from schema from day 1. One man's 80 is another man's 20, e.g. redefine is either bread and butter or completely useless. We all tend to be selfish in defining what goes into the spec. Equally, a number of people indicated no change was needed. Haven't heard anything new here… these issues have all been expressed before.

Steven Ericsson-Zenith: you are being unfair in the reprimand

Paul Biron: deliberately! Standards writing is hard. Compromise is essential. People have to realize that you've got to bend. Cannot satisfy everyone with this language. As a group we are at cross purposes with each other. As an editor and a user I know we have to listen to each other and achieve solutions together. So is my characterization correct?

Mary Holstege: you cannot have tight idiomatic code generation by tossing out features that support versioning/extensibility plus versioning/extensibility plus universal feature support in all tools, all at the same time.

Steven Ericsson-Zenith: 2 issues. 1) vendors implementation in their tools. vendors say impossible to have consistent implementation. Onus is on committee to define what is correct

Noah Mendelsohn: Can only go so far. Can say what is or isn't an XML document, can't say what software can be used. W3C normally specifies what it takes for a document or format to be conformant; sometimes provides test cases. Usually stays out of the business of certifying that a particular piece of software is conformant.

Dan Vint: That is a reasonable position, but (e.g.) many tools claim to be an XML tool while supporting very little

Noah Mendelsohn: WS-I doesn't really do this either

Dan Vint: not looking for W3C to certify everything but need an indication of what level of support is being offered.

<Chris Ferris> I think that a comprehensive test suite that was an accurate reflection of the spec (even if it didn't provide complete coverage) could go a long way to providing a means by which the market can police itself with regards to conformance. Of course, that would also require that the w3c "market" this concept… something along the lines of what we have done in WS-I. I would be glad to discuss this with staff or the WG at their pleasure

David Ezell: profiles have already been proposed, but I'm not sure they're adequate. What we need is an XML object serialization language that covers the problem space between different OO languages.

Paul Downey: but schema has already profiled all the languages in the world…

Panel discussion

Panel = Ashok Malhotra, Paul Biron, Mary Holstege, Noah Mendelsohn, Michael Sperberg-McQueen, David Ezell, Leonid Arbouzov, Tony Cincotta

David Ezell: quick intro to WG. and NACS (National Association of Convenience Stores) they care about standard = saving money. Sometimes NACS member vendors hate it that competitive advantage is 'given away' in a standard. As such a vendor, I can identify with the discomfort of the major vendors in this meeting, but at a different level. Two new schema features about which NACS cares a great deal are versioning and co-constraints; we already have a huge investment in XML Schemas, even though it is not really good enough to support our requirements.

<Steven Ericsson-Zenith> note: I did not mean to suggest the W3 should tell software vendors what software to write or how to write it - my point was rather that the committee is responsible for enabling conformable tools through the standard

<David Orchard> I think "1.1" will be an inappropriate name. Hard to see how backwards compatibility will be maintained and meet the versioning goals, particularly if wildcards are changed.

David Ezell: We have produced a document on component designators. That document potentially fulfills some of the requirements of the semantic web with regard to type (i.e. content) identification. We have countenanced limited changes to the component model but so far only one "breaking" change with XML Schema 1.0, that change being addition of precision decimal.

Jonathan Marsh: is the 'formal description of XSD' still active?

Panel: Answer no lack of resources has meant this is on the back burner

<Paul Downey> thinks formal description would be of most value if you could change the spec as a result. it's too late now it's out there in so much code. though a formal description which could generate test cases could be of value

Michael Sperberg-McQueen: basically members got enough from what was produced. It would still be a good thing…

Steve Ericsson-Zenith: small but important audience for formal description

Michael Sperberg-McQueen takes the floor to talk about comments

Michael Sperberg-McQueen: WG has error reporting procedure. Fixes for 1.0, changes for 1.1, clarification etc. Written procedure to follow linked off home page. Problem with collection of errata is co-ordinating changes with other groups that may not have reviewed errata. W3C approach is that errata are informative not normative. Last Nov: 2nd edition of XML Schema 1.0 with circa 150 changes. WG seeing problems that indicate vendors are still working off the old version.

<Steven Ericsson-Zenith> so I heard the following: the formal specification work that was completed - and is apparently used in the formal work of XPath and XQuery is not supported by the committee because there is a mismatch between the specification and the published standard

Chris Ferris: this is a common problem, bugs found? WG fixes bug in spec and publishes as errata. Some vendors pick up on the published errata and make the necessary changes to their product. Other vendors note the errata but feel the issue(s) not important enough to change delivery dates of software etc. This results in interoperability issues between implementations. So what can we do to better provide for the community?

Michael Sperberg-McQueen: 2 things, in general WG position is to promote 1.0 answer unless 1.0 spec has a contradiction. Each erratum should have 1 or more test cases. Would also be useful to have a tool that examines schemas. Maybe not every erratum can be detected that way but most could

<Paul Downey> +1 to erratum specific test cases, before and after would be useful

David Ezell: WG did minute decision to have test cases for every errata…

Paul Biron: There is an alarming lack of documentation from vendors about what level of errata have been included

Jonathan Marsh: wouldn't that just be noise from our point of view?

David Ezell: I like the suggestion that submitters of errata should accompany their comments with test case assertions. Tony will talk about the test suite in a moment. I strongly encourage everyone to read the WG Note Processing XML 1.1 docs with XML schema 1.0 processors, since it shows a possible way forward for supporting XML 1.1. Community feedback would help the WG plan how best to support (or not support) XML 1.1 in the future.

Tony Cincotta takes floor to talk about test suites

Tony Cincotta: There is a framework document for the test suites. Metadata includes location, what is being tested etc.; submissions are public. Contributions to the test suite are open to everyone and encouraged. A process document describes the process for dealing with tests. First pass assesses that structure is correct. Then the WG gives it 'acceptable' status and it is displayed consistently with other tests in the suite. Once it is reviewed and found to be error free it is given the status of 'stable' and published. Errors mean it is given a 'disputed' status and goes back to the WG. Schema 1.0 tests compiled by Henry with major submissions from Sun, Microsoft, NACS.

Leonid Arbouzov: test suite is an important tool for proving interop of implementations but unfortunately not required conformance for vendors. Vendors may choose to pass some tests and fail others. So what we did in Java conformance tests is that we have included all W3C Schema tests into Java conformance tests. Now all Java implementations must pass every single valid test of W3C Schema test suite. Hopefully this should improve compatibility and interoperability of XML Schema implementations at least in Java world. Main two issues for us however are
- first, we don't know which tests in W3C Schema test suite are valid and which are not. Some of them are outdated.
- second, if someone wants to add new tests and improve test coverage, one doesn't know which test should be developed. There is no accurate information on which parts of specs are tested and which are not and which require extra testing. Sun will continue to provide W3C XML Schema tests in Java conformance tests and we also have some extra XML Schema tests that we plan to contribute back to W3C.

Paul Downey: wrt Java, are there specific tests for individual areas, e.g. JAXB?

Leonid Arbouzov: Yes but vendors must pass all tests

David Ezell: How many tests are there in the second edition of the test suite?

Tony Cincotta: several thousand.

Mary takes floor

Mary Holstege: SCDs (schema component designators) identify components in the schema component model

Waleed Abdulla: so it's like XPath for Schema

Mary Holstege: you can use it to compare two schemas

Jonathan Marsh: can be used for layered spec, that's the use-case for WSDL component designators

Noah Mendelsohn: URIs for identification is a part of the webarch

Noah Mendelsohn's presentation: IBM pointed out shortcomings in the xsd:decimal type at the time schema 1.0 was going to Recommendation status. The Schema WG promised to reconsider for Schema 1.1, and as a result the Working Drafts for Schema 1.1 propose a new type. The latest draft calls it "pDecimal" but the group has since agreed to the name "precisionDecimal." Of key importance is that the new type is aligned with the emerging IEEE754r decimal type, and thus also with java.math.BigDecimal, .NET System.Decimal, and many others. See [slides] for more details.

Noah Mendelsohn: new decimal type to coexist with existing one. original type not compatible with emerging IEEE. Differences. IEEE unifying Decimal and floating point in a single standard with similar semantics. Operations defined. Significant digits count… IEEE suggests storage formats
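The "significant digits count" point can be illustrated with Python's standard `decimal` module, which is based on IBM's General Decimal Arithmetic specification. This is only a rough sketch of the idea: whether its behavior matches the proposed precisionDecimal type in every detail is an assumption, not something stated in the presentation.

```python
from decimal import Decimal

# Two literals that denote the same number but carry different precision.
a = Decimal("1.0")
b = Decimal("1.00")

# Numerically they compare equal ...
same_value = (a == b)

# ... but each value remembers how many fractional digits it was written with.
exponents = (a.as_tuple().exponent, b.as_tuple().exponent)

# Arithmetic preserves that precision information in the result.
total = a + b
product = Decimal("1.20") * 2

print(same_value, exponents, total, product)
```

A type whose value space treats 1.0 and 1.00 as one value cannot round-trip this distinction, which is one reading of why a second type was proposed to coexist with the original.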

Ashok takes floor

Ashok Malhotra: XPath strongly typed taking types system from schema 1.0

<Steven Ericsson-Zenith> precision decimal reference: http://www2.hursley.ibm.com/mfcsumm.html

Waleed Abdulla: are new types supported in XPath?

Panel: Yes, only introduced a type that they had in their hierarchy anyway

Waleed Abdulla: does schema 1.1 have new namespace?

Noah Mendelsohn takes floor to talk about versioning [slides]

<Steven Ericsson-Zenith> I asked if it was possible for xsd 1.0 to be a profile of xsd 1.1 - Noah Mendelsohn pointed out some issues with base types that might make that a challenge

Paul Downey: most WG conduct themselves in public but schema does not. Is that a problem for getting people involved?

David Ezell: most of what we do feels public anyway, but it's possible that monitoring public lists could slow us down even further. However with regard to the versioning topic, we need much more public scrutiny. We need to examine all the ways that people "version" their XML languages and try to come up with a way to support those as best we can.

<David Orchard> An additional resource Compatibility articles

Noah Mendelsohn: (from slides) XML should be key to loose coupling. Idioms for evolving language vocab and schema not agreed, e.g. roles of namespaces; also disagreement over extension vs restriction. Use cases driving a lot of the WG analysis. WG has description of terminology and proposed mechanisms. Noah has written white paper on evolving XML schemas

<David Orchard> there is the W3C versioning mailing list that is public

Noah Mendelsohn: Document edited by David Orchard and Norm Walsh referenced. Some basic principles: clean support for repeated revisions (>20). Versioning 'Sometimes Not Always' (SNA) tied to namespaces. Don't presume constructs in instance docs (e.g. <extension>). More controversial? .. forward/backward SNA required. Breaking changes happen

Ümit Yalçınalp: what is the issue here with breaking the rules?

Steven Ericsson-Zenith: are you arguing against arbitrary changes in schemas?

Michael Sperberg-McQueen: Noah's point is that creating a new version MUST make a schema processor 'die'

Noah Mendelsohn: difficult thing is schema is a tool for the application, can not say what users will do. We are building a tool but trying not to bake the model into the tool

Steven Ericsson-Zenith: are you proposing to make changes to schema language to enable schema to capture redefinition /deprecation of elements?

Noah Mendelsohn: Principles continue… Check/enforce compatibility in tools. Versions may or may not form sequence or tree. Revisions sometimes not always expressed as deltas. Restriction vs Extension. Base schema allows all future content. wildcards… validate future content, weak wildcards beat UPA problems, New wildcard matches: any elements not known about in this schema. Process decision 2 minutes to finish off… Extension… base schema validates only version 1 content but new processing nodes indicate which subset of content would have

Steven Ericsson-Zenith: uncomfortable with use of the term 'wildcard' here

David Ezell: Major/minor use case, pulled from UBH, covers the 'breaking change' scenario. Describes schemas as they exist today. 2nd use case: OO, how do you serialize/deserialize an object and how do you cope when things change. 'Specialization' is the need to create a template schema

David Ezell quickly describes remaining use cases and invites review from participants.

<David Orchard> FWIW, I plan on offering an evaluation compared to some Web services use cases.

[break]

Paul Downey: Topics to be discussed based on yesterday's vote: Schema 1.1, Test Suite, Versioning. Now is the appropriate time to bring up ideas about how to change the schema language. Let's start with versioning

Paul Biron: There have been a lot of discussions going on…

Erik Johnson: others say it's different [bit buckets]. Agrees with Noah: should not bake any of these ideas into the XSD spec.

<David Orchard> I disagree that changing versions always means changing namespaces.

Erik Johnson: we should put hooks in so that processors can use their own rules

Paul was using the namespace change concept as an example only, BTW.

Erik Johnson: Would anyone in this room be happy if the spec says the only legal schemas are versioned in one or two ways mandated by the spec?

Waleed Abdulla: Could the W3C do a best practices doc?

Ashok Malhotra: Someone *else* should do that work.

Dan Vint: Isn't there precedent that a new version = new namespace?

Paul Biron: Maybe, but there is probably no one right answer.

Dan Vint: It's the only solution I can get to reliably work at this time.

<David Orchard> Versioning and XML namespace policy

Ümit Yalçınalp: No one has an identification scheme to declare "this is the major version", "this is the minor version", etc.

<David Orchard> The meaning of "Major" and "Minor" is the problem.. does "minor" mean "backwards compatible" change, "small software" change, "small document" change?

<David Orchard> re: xml 1.1 and schema "1.1".

<Chris Ferris> a namespace is just that, a space of names

Mary Holstege: There is a hole in the architecture in that namespaces are unrelated [values]

<Paul Downey> thinks namespaces are about ownership not just versioning

Paul Downey has a presentation about Web Services Description issue LC124 [slides]

Paul Downey: "Compatible Evolution". XML is "yet another self-describing format". add optional stuff, don't delete stuff, don't change the meaning of stuff, communicate breaks in compatibility. LC124 — Evolution is a *big* issue for web services, WSDL WG failed to engage the XSD WG, LC124 - versioning last-stand, Concrete Proposal. Since we use schema to describe message exchanges, this is really a schema problem

<David Orchard> Options being discussed for LC124 at LC124 Options

Paul Downey: XML Schema 1.0 for Description: 1. description of content constructed by a sender, 2. description of content available for a receiver, 3. validating the format of a message. Versioning and XML 1.0: got to get it right in version 0; UPA and greedy xs:any make writing extensible schemas *tricky*; have to resort to dumb schema tricks. Validate Twice (Henry Thompson's technique): 1. Validate document, 2. Prune PSVI elements marked "*[pe:validity()='notKnown']", 3. Validate pruned document. [going back to "XML Schema 1.0 for Description"] you can do point 3. LC124 Questions: Does ignoreUnknowns impact data mapping? Is this a tractable problem? Should WSDL define this?
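The three Validate Twice steps can be sketched as a small pipeline. The following is a toy illustration only: it stands in for a real schema processor with a hypothetical set of known element names (`KNOWN`), and the function names are invented for this sketch. No real validator API or PSVI is involved.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy "schema": the element names this receiver knows about.
KNOWN = {"order", "id", "amount"}


def mark_validity(root):
    """Step 1: record a PSVI-like validity value for every element."""
    return {el: ("valid" if el.tag in KNOWN else "notKnown") for el in root.iter()}


def prune_unknown(root, validity):
    """Step 2: remove every child subtree whose root is marked 'notKnown'."""
    for parent in list(root.iter()):
        for child in list(parent):
            if validity[child] == "notKnown":
                parent.remove(child)
    return root


def strictly_valid(root):
    """Step 3: strict check; every remaining element must be known."""
    return all(el.tag in KNOWN for el in root.iter())


def validate_twice(xml_text):
    """Run the three steps and return (accepted, pruned document text)."""
    root = ET.fromstring(xml_text)
    pruned = prune_unknown(root, mark_validity(root))
    return strictly_valid(pruned), ET.tostring(pruned, encoding="unicode")


message = "<order><id>7</id><discount>5</discount><amount>10</amount></order>"
print(validate_twice(message))
```

In this sketch the unknown `discount` element is dropped before the second pass, so a receiver built against the older vocabulary still accepts the message.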

<David Orchard> btw, an "isCompatibleWith" attribute in WSDL 2.0 also went down to flaming defeat, see LC54 Proposal

<Chris Ferris> repeating URI for Henry's Validate Twice: Versioning made easy with W3C XML Schema and Pipelines

[The group is walking through a scenario introduced by Waleed, but missed by the scribe]

Waleed Abdulla: what about a content model (a, b, c, xs:any) where c is optional? c will be unknown because of UPA?

Michael Sperberg-McQueen: There is a terminology problem: this is *ambiguous* / non-deterministic, but not unknown
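The ambiguity in Waleed's content model can be shown with a toy matcher that lists which particles could accept the next element. This is a simplified sketch under stated assumptions: particles occur at most once, namespaces are ignored, and `Particle` and `candidates` are hypothetical names invented here, not part of any schema API.

```python
from typing import List, NamedTuple


class Particle(NamedTuple):
    name: str        # element name, or "*" for a wildcard such as xs:any
    optional: bool   # True when the particle may be skipped


def candidates(model: List[Particle], start: int, element: str) -> List[int]:
    """Indexes of particles that could accept `element` when matching resumes at `start`."""
    found = []
    for i in range(start, len(model)):
        particle = model[i]
        if particle.name == "*" or particle.name == element:
            found.append(i)
        if not particle.optional:
            break  # a required particle cannot be skipped
    return found


# The content model (a, b, c?, xs:any) from the discussion.
MODEL = [
    Particle("a", False),
    Particle("b", False),
    Particle("c", True),
    Particle("*", False),
]

# After a and b have been consumed, a following <c> has two possible attributions.
print(candidates(MODEL, 2, "c"))
```

Two candidate particles for the same element is the situation the Unique Particle Attribution rule forbids: the element is known, but the processor cannot tell without lookahead whether it belongs to the optional `c` or to the wildcard.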

David Ezell: Speaking not as chair, Henry's solution to this problem represents the best of the 80/20 rule.. Hopes we'll give serious consideration of this, although WSDL can do whatever you want

Mary Holstege: Need clarification [on the "Annotation, Extension, Mandatory" slide in Paul Downey's slides], which Paul had skipped for time.

Jonathan Marsh: So, WSDL would say that "a WSDL processor may, must, or should ignore unknown content in msgs" where "unknown" is determined by "Henry's algorithm"?

<David Orchard> Mary Holstege, see my note on the options for lc124. It's Annotation for Schema, Extension in WSDL, and possible Mandatory for WSDL.

Paul Downey: I talked to Henry, there were some concerns [but missed by the scribe]

Michael Sperberg-McQueen: A processor that offers an option for this can be a conformant processor

<David Orchard> I would say "The "ignoreUnknown" property set to "true" indicates that a processor should not fault when processing messages that contain _unexpected items_….." and "… The unknown content may be identified by a W3C XML Schema processor. The [validity] property in the Post Schema-Validation Infoset will contain a "notKnown" value if unknown content is found."

Radu Preotiuc-Pietro: Isn't the behavior of validators described by the spec? If so, how can you change the validator without changing the spec?

<David Orchard> Radu answer: behaviour is described, you don't change the validator but you layer on top of it to make sure the 2nd validation gets what you want.

Noah Mendelsohn: What if I wanted a tool that compiles "C", sees a problem, changes the code, and recompiles the code again — it's the same process

Noah is explaining the layered concept in this situation

Noah Mendelsohn: XML Schema validation is applied to one of the validation passes

Radu Preotiuc-Pietro: but having it all in XSD would provide interoperability

Derek Denny-Brown: Not having a good way of separating concerns between what to ignore under what context. Partial understanding can turn into "silently ignored"

Chris Ferris: Most web services don't do validation; in many cases it's not a B&W thing whether a message is [XSD] valid. You don't throw a million dollar P.O. away just because of a schema error. That's not how business is done.

<David Orchard> I think web services do validation, but they have compiled code based on the schema which does the validation…

Chris Ferris: Validation is not necessarily a boolean result, the 2-pass validation may be problematic with security (digital sigs, etc.).

<Ümit Yalçınalp> +1 to David Orchard. In essence the validation is in the data binding

Jonathan Marsh: Regarding Noah's presentation and restriction / extension — Noah: which do you like better?

Noah Mendelsohn: I was speaking for the WG, there is no definitive answer yet — we are still working the use cases, and we've had users express strong preferences both ways.

Jonathan Marsh: The pruning step: requiring validation to process the data in the message (e.g. expose through programming constructs) may be too painful. Typically each message isn't validated at runtime.

<David Orchard> If the wording is done right, it doesn't mandate a validation step or 2. It mandates acceptance of unknowns..

Paul Downey: I found myself using UPA because it helps narrow the permutations of possible schemas

Dan Vint: This needs to be down in XSD as well as WSDL — we need it in schema as well

Paul Downey: [basically agrees]

<Paul Downey> XML is about communication, so useful to be able to annotate the schema, not just WSDL

Paul Biron: Validate twice operation comes about because of the notion that validation in schema is NOT a binary op: every node in the tree has 6 possible outcomes (tried to validate, did not try, pass, fail, etc.). Validate 2x means "if you send extra stuff, it technically does not conform to the schema, but the application may choose to accept it". I would not want to have a mode in schema that declares the document as valid. Other processors can decide if the unknown content is valid.
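The "six outcomes" presumably refer to the two PSVI properties that XML Schema 1.0 attaches to each assessed item, [validity] (valid, invalid, notKnown) and [validation attempted] (full, partial, none). A minimal sketch of an application-level policy layered on top of those values follows; the `application_accepts` function and its `tolerate_unknown` flag are hypothetical and only illustrate that the accept/reject decision sits outside the schema processor.

```python
from enum import Enum


class Validity(Enum):
    VALID = "valid"
    INVALID = "invalid"
    NOT_KNOWN = "notKnown"


class ValidationAttempted(Enum):
    FULL = "full"
    PARTIAL = "partial"
    NONE = "none"


def application_accepts(validity, attempted, tolerate_unknown=True):
    """Hypothetical application policy applied to a schema assessment outcome."""
    if validity is Validity.INVALID:
        return False  # content that was checked and failed is always rejected
    fully_checked = (
        validity is Validity.VALID and attempted is ValidationAttempted.FULL
    )
    # Anything not fully checked is accepted only if the application opts in.
    return fully_checked or tolerate_unknown


print(application_accepts(Validity.NOT_KNOWN, ValidationAttempted.NONE))
```

A strict application would call the same function with `tolerate_unknown=False` and reject anything that was not fully assessed.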

Michael Sperberg-McQueen: I think this is different — the behavior of the validate 2x process is different than RDF. I am making the assumption these are all top-level elements. [On the white board]. 2x validation will not accept a document (abcccc) for schema (a, b, c[1-3]). What does WSDL say about the relationship between schemas and the messages a service SHOULD, MUST, MAY accept?
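One reading of the whiteboard example, sketched with a regular expression standing in for the content model (a, b, c[1-3]) over single-letter, top-level element names. This is a toy under those assumptions, not a schema processor: pruning only removes elements that have no declaration, so a fourth `c` survives pruning and the document is still rejected.

```python
import re

# Toy content model (a, b, c{1,3}); each character is one top-level element.
CONTENT_MODEL = re.compile(r"abc{1,3}")
DECLARED = set("abc")


def validate_twice(children: str) -> bool:
    """Prune undeclared elements, then validate strictly against the model."""
    pruned = "".join(name for name in children if name in DECLARED)
    return CONTENT_MODEL.fullmatch(pruned) is not None


print(validate_twice("abcccc"))  # too many known c elements
print(validate_twice("abxcc"))   # unknown x is pruned before the second pass
```

Under this reading, validate twice tolerates content the schema does not know about, but it does not relax the constraints on content the schema does know about.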

Jonathan Marsh: It just says "this is the message format"

Douglas Purdy: Extension/restriction: we have thought a lot about that. Both are useful for different things. Restriction: how type authors revise existing types. Extension: say I author a P.O. type, but implementers want to add new stuff. So, one is for versioning and one is for extension. On this proposal for 2x validation: it scares me. We never know where the message is going to be [physically]. It may be that I have a reliable msg system that processes messages not via WSDL (might be MSMQ)

Paul Downey: I was careful to avoid the term intermediary — I used "observer".

Douglas Purdy: If I strip the soap:header off the top and process the body later, how do I validate the body?

Paul Downey: The output of validation is not a binary step — it's a PSVI

Douglas Purdy: Does my schema processor need to be aware of these rules?

Paul Downey: The app that invokes the schema validator has to be aware of these rules.

Douglas Purdy: We always want to be able to "beware the evil intermediary"

Radu Preotiuc-Pietro: If you add things that are not in the spec, they are chameleon and risk interoperability issues

David Ezell: I would like to be able to invoke a "contract test"

Jonathan Marsh: Could we put this in schema?

David Ezell: I don't think we could get there

<David Orchard> WRT Douglas Purdy's point, it seems somewhat strange to not use WSDL but to use Schema and validate Web Service messages. How does the msmq know what schema to use? Needs a "web service description"..

<Chris Ferris> hmmm… again, I'm not sure I agree

<Chris Ferris> much of the discussion here has been that validation is really an application-level thing…

<Mary Holstege> more like: what to _make_ of incomplete validity or invalidity is an application-level thing

Alex: The validate 2x algorithm is a pipeline, so in that sense it is very much at the application level

<David Orchard> no more at the app level than SOAP Handler chains are at the app level..

<Chris Ferris> David E's contract test was: can I process the message? He specifically said, does the message have a 'b' (using Michael Sperberg-McQueen's (a, b, c{1-3}) example where the instance was a, b, c, c, c, c)

Dan Vint: We just need to have something in schema to help solidify how these processors are expected to act.

Noah Mendelsohn: The reason (good or bad) you choose to skip content is a question you can't answer in the schema spec.

Ümit Yalçınalp: From a data binding perspective, the WSDL processing is separate from the schema processing issues

<Paul Downey> we wouldn't have a web if HTML was strictly validated

<David Orchard> There's nothing stopping a Java implementation from doing this in one step when it gets fed the schema + "ignoreUnknowns" property

Ümit Yalçınalp: I don't like to look at this as a two-step process because it won't happen. It's not an application problem— schema validation is used by data binding independently of the application.. I don't see this as an application issue.

<Ümit Yalçınalp> JAXB 2.0 does this already

David Ezell: I can't call a competing vendor and tell them the message they are sending me is invalid because it fails to deserialize in JAXB and expect them to help me debug the problem. However, if I can tell them it fails schema validation at such-and-such a point, I'll be much more likely to get positive action.

<Steven Ericsson-Zenith> per "we would not have a web if it were strict" - it is one thing to display public data - another to build reliable applications that depend on data contracts

<scribe> Chair: The minutes should show that versioning took up all of the available time.

<Mary Holstege> right, so for some applications, what you make of partial or complete invalidity is to barf. fine choice

<David Orchard> or partially barf?

[The workshop is recessed for lunch until 13:30 Pacific Time]

<Douglas Purdy> David, I saw that you had a question about my statement about durable messages. I don't think that I explained my scenario clearly enough — I was just referring to processors that rip the envelope and look at the body solely.

Industry Way Forward

Noah Mendelsohn: Tim Berners-Lee asked me to convey his perspective to you. Specifically, Tim strongly believes that schemas should be about constraining the "sentential forms" of documents. In other words, schemas help you separate the set of all possible documents into two piles: those that are correctly formed per the schema and those that are not. Tim believes it is typically a mistake to view the content of documents as op-code like instructions for processing. Some people use schemas to implement that operation-oriented model, and Tim wants in most cases to discourage it.

<Chris Ferris> http://mathworld.wolfram.com/Grammar.html

Philippe Le Hégaret does a W3C 101 presentation [slides]

Philippe Le Hégaret: Articles are on the W3C website and are typically linked from the group page. WAI is for accessibility … Might want to discuss that these (slide looking around) are working groups rather than interest groups. I18n working group does not currently develop specifications but they wanted the option. The wiki is writable by the world… which is an issue. The i18n group produces articles for users and technical spec writers. There is a concern about patent policy and the wiki. That is, outside contributors haven't agreed to the patent policy. RSS has been working quite well for i18n to reach their users. Considerations: the resource problem is the biggest issue (e.g. if there is a decision to write an article, someone has to write it).

Jonathan Marsh: If more work has to be done to write up Henry's validate twice, then maybe an incubator group would be a good way to do that. Incubator groups are somewhat like task forces.

Ashok Malhotra, Oracle [slides]

Ashok Malhotra: support entities, schema evolution, remove UPA. Need to think about how to sell XML Schema. #1 thing: Marketing!

Alexander Milowski: Can you take a class? Yes, in the information systems schools and not the CS departments.

Ashok Malhotra: Need more books (e.g. more than one). A better primer… Some published best practices aren't necessarily "best practices". Need better testing and certification.

Questions: Who ought to do this?. and need resources.

<Chris Ferris> is this the "coach" to which Ashok was referring? it is Regex Coach and written in Lisp. The Regex Coach - interactive regular expressions

Philippe Le Hégaret: The W3C is not experienced in certification. NIST has experience doing that.

Question: What does SQL do? Is there some kind of SQL certification?

Jeff Mischkinsky: NIST had some kind of SQL certification. but that was in the 80's. There was a certification program with fairly good coverage. But that's a very expensive thing to run. And it is hard to do. Right now, there is no SQL certification program.

<Paul Downey> thought all ISO STANDARDS had conformance suites

One thing missing on the slides: library of reusable components (e.g. a type library).

?: Registries aren't popular at the W3C. Registry implies a unique thing… Repository is where you put it. Registry has a unique identification.

Noah Mendelsohn: Producing schemas is not the same as testing a processor that uses schemas… A lot of care has to be put into describing what kinds of software are considered a processor and what they do. There is a layer of work that needs to be done… Type libraries may be an ideal thing for an incubator group.

Soumitra Sengupta, Microsoft [slides]

Soumitra Sengupta: entrusted with the team that builds the core components. Co-occurrence constraints are important enough that they'd like to see them layered on schema. The format for Word will be XML going forward. Visio is the only Office format that won't move from binary files to XML. When confused on simple types, it defaults to string. In Office 2003, there are some restrictions that may not be there in the next version. InfoPath has something similar to co-occurrence constraints. There has been a tremendous amount of work on schema conformance in the next System.XML and MSXML 6. A better formal specification would be helpful. XML to object/XML to relational is not an easy problem and will not get fixed easily. Trying to change the schema to solve this problem is not the right thing. 300,000+ developers are using XSD. At XML 2005 they will present a full study of what they've found. There is tremendous value in XML Schema. Don't just change it… Office supporting the full XSD spec will increase its use. The file saved in the next version will be a zipped XML document. XSD is providing real value in spite of the pain. XSD 1.0 should still be the basic foundation of exchange. Will collaborate with all vendors for interoperability. Strongly believe data binding should be out of scope for the W3C. Profiling is not a bad thing. Microsoft had a different position at one point.

Noah Mendelsohn: When Douglas Purdy talked about profiles, profiles are a good thing to optimize around… you appear to be going further.

Soumitra Sengupta: Profiles are a good thing but middleware should support the whole thing. Investing a lot in best practices… should include actual schemas that are working.

Philippe Le Hégaret: Lots of people say the spec is complex, but no one mentions the primer. Are people reading the primer?

several in the room: Everyone reads the primer and then the spec…

Philippe Le Hégaret: People have said "just say no to WSDL2"… just look at the spec… but it is hard to make those kind of specs easier.

several in the room: The primer is great for your first introduction… then you need to find your local guru to understand further.

Michael Sperberg-McQueen: What we need is "readers" for XML Schema. Examples need to be relevant to the reader. This is not a new problem… every language has this challenge at the beginning…

Paul Biron: My memory, we had part 1 and part 2 and one member said "I'm going to write a primer" and then it was published as part 0. So, if someone thinks we need a reader, then write one.

Jonathan Marsh: Now every working group should have a primer material and that's not core to the WG and so the work stretches out for years. Constrain the WG, try not to do primers, and then have outreach groups to do these things. At this point, you've done a primer, 5 years of experts, there's a market for that material,… but the schema group now offers producing the errata and getting all the nits worked out.

Jonathan Marsh: If interoperability is a priority, test suites and errata would be very valuable.

<David Orchard> I will point out that in WSDL 2.0 people have not generally wanted to do work on the primer, but people have now started using the primer examples for making their points and it's changed WG members' opinions on issues.

Noah Mendelsohn: Clarification work is needed for both users and spec readers. Have had reports from people who said that it was very painful but that it does answer the questions. One of the traps we could run into is that "doing x or y that is different" could run into problems. This spec took many passes and this is the best we came up with… One of the problems is that we'll change something that was correct, even if hard to find, when we change the spec for clarity. My view, at this point, is that the best is to do selective clarification. Fixes to the spec where necessary. In selected more major areas, clarification of certain areas (e.g. imports and includes). Point: the original specification doesn't go away… The risk in clarifying the spec is that to change one thing you might have to change a lot of the spec.

Chris Ferris: Often, people such as myself go to the XSD spec and chase down the rules to figure out a particular problem and become enlightened, but there is no record of that so others can benefit. A wiki where people could post their findings would be useful in helping to clarify issues for others' benefit.

Noah Mendelsohn: The FAQ started that way…

Chris Ferris: If it was easier for the public to contribute that might provide a lot of value.

Soumitra Sengupta: What we do is write the tests…

Noah Mendelsohn: The spec does get easier to read once you learn how it's organized. I suggest we might want to publish some information on how to use the specification to answer your questions. Might be a good piece of work for the Schema WG to produce.

Michael Sperberg-McQueen: If you want to know if an instance is valid, you start in section 5.2. Not everyone knows that.

(?): go through and actually apply the rules… to figure that out, the rules aren't cross-related. It is clumsy… maybe if you start from the right spot knowing what you are looking for… you have to watch for these edge cases isn't actually called out… and follow a lot of dead ends.

Noah Mendelsohn: want an annotated XSV…

Derek Denny-Brown: tool vendors encounter issues in different order…

Michael Sperberg-McQueen: if we could safely refactor the document, we'd do it in a flash. There are different opinions on what safe, refactoring, etc. means…

Jonathan Marsh: What's done is done. Not clear if the pain is increasing or dropping.

Paul Downey: What about new implementers? Do they need to spend 5 years too?

Noah Mendelsohn: Mystified that there aren't more books.

David Ezell: wanted to respond to "chickening out"… don't feel that we're chickening out. Feeling that since we've failed to write a clear thing, we may not be the right people to write this. There are two important reader groups: Einsteins and Elvises… Need more material for Elvis…

<Henry Thompson> I would like to get a sense from this group if we added one more thing to structures beyond the status quo, namely weakened wildcards (we already have subsumption for particle restriction agreed), nothing more to datatypes (i.e. the impending last-call draft), and NOTHING MORE, and declared victory on 1.1, with a promise to focus subsequent effort on 1) Layering co-constraints on top, separately; 2) Best practices for versioning, working with what we've got; 3) lot

<David Orchard> A few years ago people would use XFront's schema design guides regularly..

[session ended]

<Henry Thompson> It's my bedtime; I hope someone will feed in my comment at the appropriate time. Michael Sperberg-McQueen/Noah Mendelsohn/Mary Holstege: do you understand what I'm suggesting? I feel that we could easily use at least 50% of the (hopefully expanded) WG's time for a year putting out Best Practice Versioning Notes just on how to make the best use of our existing design, and if we did that for a year or 18 months then we might have enough concrete experience to look at XML Schema V for Versioning. And I hear a lot of people asking for those Notes here today

<Erik Johnson> Are the Best Practices Notes public?

Paul Downey suggests 15 minutes for each of three topics and 45 minutes for wrap-up

Three topics are validation, profiles and UPA

Validation/Code Generation

Jonathan Marsh: This is related to LC124 in WSDL.

Paul Downey: Is the code binding a requirement?

Michael Sperberg-McQueen: Presents David Orchard's example (last name, first name, any element can appear in between or after)

Noah Mendelsohn: Code binding is an important requirement, but some people are using it as an implied requirement dumbing down XML and schema to meet the needs of more traditional programming languages. Mixed content and choice were both in XML before schema was hatched. Both are tremendously important. "Code binding" should not be a euphemism for getting rid of such key features of XML.

Dana Florescu: Sympathetic to code binding, but it is not the only use case (XQuery)

David Orchard: First name, last name, and a wildcard that leaves open the possibility of extension with a middle name. There are actually two examples. Middle name (when it occurs in the middle) is separate.

Douglas Purdy: To make changes to schema to support data binding is leaky. This leads to the profile discussion. However, W3C making a change for writing generators is not a good idea.

Discussion about what UPA is…

Ashok Malhotra: Prevents backtracking.

<Kohsuke Kawaguchi> RELAX NG proved once and for all that you can do without back-tracking even if you don't have UPA

Paul Downey: What is the difference in typing and binding?

Noah Mendelsohn: Binding is where you will put it in a Java bean, for example.

Michael Sperberg-McQueen: With UPA you can use a simpler construction and build deterministically. The automaton is deterministic under UPA.

<Kohsuke Kawaguchi> I mentioned that you might save a little in constructing a state machine but you'll pay a big cost of checking UPA

Paul Biron: The other thing UPA does concerns appinfos attached to elements: if I get a content model that violates UPA, which one do I use? It prevents this case.

Derek Denny-Brown: We have tools that depend on appinfo, annotations. I will be uncomfortable in loosening the restriction.

Paul Downey: How does this interact with databinding? JAXB uses Appinfo. In an ambiguous content model, it will be problematic. I like determinism.
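The UPA points above (determinism, and the first name / last name / wildcard example) can be illustrated with a minimal sketch. It is not the algorithm from the XML Schema specification: it assumes a flat sequence only, occurrence bounds limited to 0, 1, or unbounded, and a wildcard that matches every element name. The names `Particle`, `WILDCARD`, and `upa_conflicts` are hypothetical. The idea is the usual one for position automata: after each particle, collect the particles that could match the next element, and report any two that could match the same name.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

WILDCARD = "*"  # hypothetical marker: an xs:any assumed to match every element name


@dataclass(frozen=True)
class Particle:
    name: str                      # element name, or WILDCARD
    min_occurs: int = 1            # this sketch only handles 0 or 1
    max_occurs: Optional[int] = 1  # 1, or None for unbounded


def overlaps(a: Particle, b: Particle) -> bool:
    """True if some element name could be matched by both particles."""
    return a.name == WILDCARD or b.name == WILDCARD or a.name == b.name


def candidates_from(model: List[Particle], start: int) -> List[int]:
    """Indices of particles that could match the next element,
    when matching resumes at position `start` of the sequence."""
    out = []
    for i in range(start, len(model)):
        out.append(i)
        if model[i].min_occurs > 0:  # a required particle blocks everything after it
            break
    return out


def upa_conflicts(model: List[Particle]) -> List[Tuple[int, int]]:
    """Pairs of particle indices that compete for the same next element."""
    conflicts = set()
    # state -1 is the start; state i means particle i has just matched
    for state in range(-1, len(model)):
        nxt = candidates_from(model, state + 1)
        if state >= 0 and model[state].max_occurs is None:
            nxt.insert(0, state)  # an unbounded particle can match again
        for x in range(len(nxt)):
            for y in range(x + 1, len(nxt)):
                if overlaps(model[nxt[x]], model[nxt[y]]):
                    conflicts.add((nxt[x], nxt[y]))
    return sorted(conflicts)


# Wildcard in the middle: it competes with the required "last" element.
between = [Particle("first"), Particle(WILDCARD, 0, None), Particle("last")]
# Wildcard at the end: nothing follows it, so nothing competes.
after = [Particle("first"), Particle("last"), Particle(WILDCARD, 0, None)]

print(upa_conflicts(between))
print(upa_conflicts(after))
```

Under these assumptions the "wildcard in the middle" model reports a conflict between the wildcard and `last`, while the "wildcard at the end" model reports none. Real schemas add namespaces, nested groups, choices, and numeric bounds, all of which this sketch ignores.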

Profile

Paul Downey: Are profiles useful or harmful?

Douglas Purdy: In a perfect world we will not have a profile. Programming model will be aligned with schema, but it is not the case. Pressure from customers dictates what we do. We have used the lessons learned from SOAP builders and adopted them in the WS-I Basic Profile.

<David Orchard> It's not nearly as rosy as that because some vendors have profiled the profile.

Douglas Purdy: No attributes, use element, xsd primitives, sequence,… can become a profile. There are two different scenarios: code first vs. schema first.

Chris Ferris: If WS-I did not preclude the stuff outside the profile, then it will be possible to guarantee a reasonable programming model. We could say this and it will not preclude others from using other features.

Douglas Purdy: The key is that tools should not fail outside the boundary. The vendors should support all the schemas.

Chris Ferris: It is not an interop problem, it is a convenience problem.

David Ezell: I keep hearing IIOP. Does the profile solve a problem or introduce a set of conventions?

Paul Downey: There is a strong market for tools that support a particular feature set. If you stay within a feature set, you will get a good programming experience.

<Chris Ferris> there is never a guarantee of interoperability

Ashok Malhotra: Another way is to say always use element form qualified, etc. These are user guidance.

Jon Calladine: If the profiles can not be verified by tools, they are not that useful.

Ümit Yalçınalp: +1
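The point about tool verification can be made concrete with a small sketch of a profile checker. The profile itself is an assumption made up for illustration, loosely following the feature list mentioned earlier in this discussion (no attributes, elements in a sequence); no such profile was defined at the workshop. The names `FORBIDDEN`, `SAMPLE`, and `profile_violations` are hypothetical. The checker uses only Python's standard `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical profile: XML Schema constructs treated as outside the subset.
FORBIDDEN = {"attribute", "choice", "all", "any", "anyAttribute", "redefine"}


def profile_violations(xsd_text):
    """Return local names of forbidden XML Schema constructs, in document order."""
    root = ET.fromstring(xsd_text)
    prefix = "{" + XS + "}"
    found = []
    for elem in root.iter():
        tag = elem.tag
        # Only look at elements in the XML Schema namespace.
        if isinstance(tag, str) and tag.startswith(prefix):
            local = tag[len(prefix):]
            if local in FORBIDDEN:
                found.append(local)
    return found


# A small schema that steps outside the assumed profile by declaring an attribute.
SAMPLE = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="Name">
    <xs:sequence>
      <xs:element name="first" type="xs:string"/>
      <xs:element name="last" type="xs:string"/>
    </xs:sequence>
    <xs:attribute name="lang" type="xs:string"/>
  </xs:complexType>
</xs:schema>"""

print(profile_violations(SAMPLE))
```

A checker like this only reports that a schema leaves the subset; in line with the "optimize for the profile, but support everything" position voiced in the room, a tool could warn rather than reject.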

Noah Mendelsohn: Optimize for the profile, but support everything…

<Chris Ferris> +1 to what noah said

Mark Nottingham: There are languages PHP, Python…

Noah Mendelsohn: A simple subset of the schema will be easy to write a parser for.

Paul Biron: Worried about profiling. We have profiles of the WS-I profiles for use in HL7… and I don't understand why, because those specs are so easy to implement I don't see why our people need profiles, but they want them.

Douglas Purdy: When we engage with customers, users do not want DOM.

Ümit Yalçınalp: +1

<Mark Nottingham> +1

Paul Biron: Why can not the tool vendors do more?

Jonathan Marsh: Another thing a profile does is to encourage vendors to extend the patterns of use.

<Chris Ferris> Soumitra mentioned in his talk two experimental efforts at incorporating XML into Java and C#, namely XJ and Cω

Douglas Purdy: We like to represent graphs, but no one can understand semantics. As long as we can use a different mapping but preserve our semantics, we are OK.

Paul Downey: how many think that a schema profile is a good idea?

Answer: 16


<Mark Nottingham> 30 in the room (not counting Chairs and W3C Team)

Discussion on what the next question should be…

Noah Mendelsohn: The profile we are discussing is for a specific purpose: data binding. There are other uses of profiles.

Paul Downey: If a profile is done, should it be done at the w3c?

Steven Ericsson-Zenith: The question is whether the mechanism is defined by the w3c and the profile is defined elsewhere…

Ümit Yalçınalp: There are two questions. Explicit support in the language for the definition of a profile vs. definition of a constrained set of XML Schema features which is helpful for data binding.

Dana Florescu: Layers of compatibility was not a good experience for XQuery. Everyone will implement everything anyway.

??: Profile will help getting the spec together. For example, profiles that define extensions will help reduce the complexity of the spec and understanding of the requirements.

scribe: There needs to be one source of profiles, preferably w3c.

David Ezell: A process question: We are running out of time and we cannot resolve what the profile should be. We need to focus on the specific question. Is the W3C the right place to do this?

<Chris Ferris> my view is that if the web services activity were the venue for a profile that defined the sweet spot of xsd 1.0 that made for a pleasant user experience, I could support that, but I also think that ws-i may be a better venue since we already are set up to develop profiles. wiki good

Ümit Yalçınalp: +1 wiki

Michael Sperberg-McQueen: Time for us to move to the summary.

Noah Mendelsohn: It has been a repeated experience in W3C that we do better with concrete proposals. That is hard to ask from the users. Nonetheless, the same approach applies…

Henry Thompson: weakened wildcards, simplified definition of restriction, datatypes in current draft. Can we declare victory? This is a proposed program for XML Schema 1.1

scribe: would like to focus on (1) layered co-constraints (2) best practices for versioning.

<David Orchard> Worried that this doesn't set Schema up for what it needs going forward wrt versioning. Weakened wildcards is at best a partial solution. Multiple namespace documents are very common.

Derek Denny-Brown: How does it work with existing processors?

David Ezell: our intention has been not to cause unnecessary pain.

<David Orchard> Can fix versioning by doing additions that are optional and thus compatible.

Discussion on what w3c should do…

scribe: should there be Schema 1.1 or not?

Noah Mendelsohn: What does it mean? What should the WG do then if 1.1 were dropped?

Michael Sperberg-McQueen: The effect of dropping 1.1 would be that maintenance of the test suite and errata, as well as preparation of promotional material, will be the focus of the WG.

Meeting adjourned

Ways forward
General W3C Send people Hmh?
Nested profiles 2 1 0 1
Domain specific profiles 17 13 8
Wiki 22 21 18
Focus on Errata 23 7
Focus on test suite 22 9
Henry's proposal 2 6
Best practice versioning Note 21 18 4 1
Marketing documents 16 5 0 2
Type libraries 16 11 1 1
Finish 1.1 11 5 0

[adjourned]


Minutes formatted by David Booth's scribe.perl version 1.126 (CVS log)
$Date: 2005/07/12 19:15:03 $