XML Schema Patterns for Databinding WG F2F

27-28 February 2006

W3C Technical Plenary, Cannes-Mandelieu La-Napoule


See also: IRC log: Monday, Tuesday
Log is incomplete due to networking issues.


Paul Downey (BT)
Jon Calladine (BT)
Yves Lafon (W3C)
Observers Present for part of meeting:
Philippe Le Hégaret (W3C)
David Ezell (NACS)
Paul Thorpe (OSS Nokalva)
Michael Sperberg-McQueen (W3C)
Carine Bournez (W3C)



trackbot, init

<trackbot> Tracking ISSUEs and ACTIONs from http://www.w3.org/2005/06/tracker/databinding/

trackbot, start meeting

<trackbot> Date: 27 February 2006


discussion about participation, goals for this meeting

pauld: wants to document 'state of the art' in Basic patterns and work towards an interop fest during CR, later this year

discussion on roadmap

pauld: would like to identify schema features used by each document
... will enumerate 'features'

discussion of process, low participation is a worry from a 'mindshare' pov. we do have some people on the list who are adding value

pauld: basic patterns is really aimed at schema authors - here's what works well now with 'mainstream' tools
... 'mainstream' is a loaded term
... advanced is aimed at implementers. We have few implementers in the working group

<scribe> ACTION: pdowney to enumerate schema 'features' for the roadmap [recorded in http://www.w3.org/2006/02/27-databinding-minutes.html#action01]

<trackbot> Created ACTION-17 - Enumerate schema 'features' for the roadmap [on Paul Downey - due 2006-03-06].

jonc: identification of patterns - we need to enlist help from schema gurus

How we work

Philippe joins us and we discuss how we can make progress

in essence we are doing unglamorous QA work

it's tough at the bottom :)

<Yves> http://www.w3.org/2005/06/tracker/databinding/issues/4

pauld: issues list is member visible

<scribe> ACTION: yves to write a wget script to publish issues on the public page [recorded in http://www.w3.org/2006/02/27-databinding-minutes.html#action02]

<trackbot> Created ACTION-18 - Write a wget script to publish issues on the public page [on Yves Lafon - due 2006-03-06].

ISSUE-4 - Collection of Databinding Implementations

pauld: suggest writing a small XSLT to rebuild Schema WG list

yves: and the subset is a list within the world of Web services

<scribe> ACTION: pdowney to build list of implementations [recorded in http://www.w3.org/2006/02/27-databinding-minutes.html#action03]

<trackbot> Created ACTION-19 - Build list of implementations [on Paul Downey - due 2006-03-06].

yves: flag list to SOAPbuilders

pauld: missing tools should ping us

ISSUE-1 Scope of structures


discussion on mapping specific data structures - a pattern for vector doesn't have to be represented as a vector

not in business of saying this pattern MUST be presented as 'a vector' in all toolkits

pauld: or specifying turn-round patterns

document contains three sections - issues, schema patterns and data structures

Philippe: don't forget you can publish versions of a Rec 1.0, 1.1, etc
... what about testing this section

pauld: we can't test how each pattern is represented internally

discussion of echo test

pauld: test consists of a schema, a schema instance and an echo input document; the toolkit deserialises, then serialises, and the client compares infosets
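
To make the shape of the echo test concrete, here is a minimal Python sketch. It is an illustration only, not something shown at the meeting: `toolkit_echo` is a hypothetical stand-in for a real databinding toolkit, the sample document is invented, and comparing Canonical XML output is used as an approximation of an infoset comparison.

```python
import xml.etree.ElementTree as ET

def toolkit_echo(document: str) -> str:
    """Hypothetical stand-in for a toolkit: deserialise, then serialise."""
    root = ET.fromstring(document)                 # deserialise into a tree
    return ET.tostring(root, encoding="unicode")   # serialise it back out

def same_infoset(sent: str, received: str) -> bool:
    """Approximate an infoset comparison by comparing Canonical XML forms."""
    return (ET.canonicalize(sent, strip_text=True)
            == ET.canonicalize(received, strip_text=True))

# Invented echo input document (assumption, not from the test suite).
sent = '<order id="42" currency="GBP"><item>widget</item></order>'
received = toolkit_echo(sent)
print(same_infoset(sent, received))
```

Canonicalisation hides differences such as attribute order and empty-element syntax, which is why it is a reasonable proxy here; a real client would also need to decide how to treat insignificant whitespace.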

Philippe: sounds like you could re-use ws-addressing CR test suite

discussion of xs:any and how that might impact 'the bar' for basic patterns

ISSUE-12 identifying a conformant schema

<Yves> http://www.w3.org/2005/06/tracker/databinding/issues/12

<Yves> let's produce one using xslt and close the issue

<Philippe> W3C can host XSLT validation service

ISSUE-16 - Multiple patterns for a single data structure

RESOLUTION: first version of Basic patterns will not include quantification, may be added in later versions

ISSUE-2 - WSDL and the Test Suite

we could take schemas from the wild and identify elements, but more likely to create feature based testing

requires work: one schema per pattern, we can build one WSDL per pattern or one 'mother of all WSDLs' for all of the patterns

How we work, Redux

discussion of anti-patterns

testsuite, again

dezell: rather than just echo, how about an 'operation', e.g. add two numbers, count elements, concat type values

discussion of pattern checker

dezell: maybe you could look at XQuery as well as XSLT
... declarative XML processing language
... in 'selling' mode, so may be able to offer resource!

<Yves> Agenda: http://lists.w3.org/Archives/Public/public-xsd-databinding/2006Feb/0046.html

ISSUE-10 - mapping element and type names

pauld: introduces the issue

dezell: programming language to programming language is one problem
... xml is the superset? are there characters in programming languages not allowed in XML?

pauld: perl names begin with $, @, % etc
... NCName ::= (Letter | '_') (NCNameChar)*
... will recommending that developers can use any XML NCName for elements and types give 'good experience' with 'state of the union' toolkits?
... recommending schema authors stick to US-ASCII would work best with the C programming language, but as a W3C WG, we can't go against i18n

yves: even if we allowed Japanese variable names, it wouldn't help non-Japanese programmers
... we could add text to say 'consider the audience of your schemas'

jonc: protect ourselves with that statement, but guarantee US-ASCII

pauld: worry about hand waving, when Kanji element names won't give 'good experience' with many tools
... maybe a 'warning' from the test suite, would be good enough

yves: not even US-ASCII - alphanumeric, not CR, LF, + * @ etc

dezell: we should say there is a general purpose mechanism, but we should also allow to advise a subset
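
The 'warning' idea can be sketched in Python as follows. This is a hedged illustration: the regular expression is one possible conservative subset (US-ASCII letters, digits and underscore), not text agreed by the WG, and `name_warnings` and the sample names are hypothetical.

```python
import re

# One possible conservative subset: US-ASCII letters, digits and underscore,
# not starting with a digit. This is deliberately narrower than NCName.
SAFE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def name_warnings(names):
    """Return the names that fall outside the conservative ASCII subset."""
    return [n for n in names if SAFE_NAME.fullmatch(n) is None]

# Every sample name is a legal NCName; only some map directly onto
# identifiers in languages such as C.
print(name_warnings(["orderId", "order-id", "order.id", "注文", "_private"]))
```

A checker along these lines would flag a name without rejecting the schema, which matches the 'general mechanism plus advised subset' position.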

RESOLUTION: ISSUE-6 closed, will use same text as ISSUE-10

ISSUE-19: Advice against using the 'all' model group

dezell: sequence should be used
- when you care about order of elements,
- and when you don't care!
... 'all' otoh is useful when order has meaning, i.e. 'a' before 'b' is different in meaning to 'b' 'a'

pauld: brilliant!, but cites example of reflection of Java class

dezell: can databinding tools keep a record of the schema defined sequence
... schema 1.1 working group working on the 'all' issue
... is it better to put the burden of ordering on the sender or the receiver?

jonc: sender is usually best, e.g. receiver may not be schema aware
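
A minimal Python sketch of 'sender makes good': the sender sorts child elements into the schema-declared order before serialising, so a receiver that is not schema aware sees a stable sequence. The `person` type, its declared order and `reorder_children` are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical declared order for a hypothetical 'person' type.
DECLARED_ORDER = ["name", "email", "phone"]

def reorder_children(element, declared_order):
    """Sort child elements in place into the schema-declared order.

    Raises KeyError for a child that the declared order does not list.
    """
    rank = {tag: i for i, tag in enumerate(declared_order)}
    element[:] = sorted(element, key=lambda child: rank[child.tag])
    return element

person = ET.fromstring(
    "<person><phone>123</phone><name>Ann</name>"
    "<email>a@example.org</email></person>"
)
reorder_children(person, DECLARED_ORDER)
print([child.tag for child in person])
```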

pauld: surely it 'depends', cites XDR (v) IDL 'sender makes good' (v) receiver makes good

dezell: ordering is useful, lack of ordering can lead to interoperability problems and can impose a processing burden
... sees providing a default value as being closely related

jonc: would like all default values to be filled in ahead

dezell: many in the schema WG would agree with you

pauld: are we suggesting 'all' doesn't appear in our basic patterns?

yves: well, everything is ordered in memory

jonc: do toolkits generate 'all'?

pauld: not sure

<scribe> ACTION: jcalladi to report on code-first experiences with 'all' and toolkits [recorded in http://www.w3.org/2006/02/27-databinding-minutes.html#action05]

<trackbot> Created ACTION-21 - Report on code-first experiences with 'all' and toolkits [on Jonathan Calladine - due 2006-03-06].

<trackbot> Tracking ISSUEs and ACTIONs from http://www.w3.org/2005/06/tracker/databinding/

let's go to break - back at 15:45 to talk about xsi:nil

ISSUE-7 xsi:nil and minOccurs=0

dezell: rec stands on its own, but can offer context for nil in the language
... distinction between absence of an element and null value - null is 'infectious'
... designation as null or nil, expresses half-hearted deprecation of data

jonc: paul biron has differences between not knowing and not existing

pauld: let's drill down into combinations http://www.flickr.com/photos/psd/111538769/in/set-72057594072218284/


dezell: analogous to floating point and NaN
... it's purely semantics

yves: should we document the background?

discussion of history .. nil is mea culpa in the document, not here means "not here"

jonc: minOccurs="0" nillable="true" does at least help mask toolkits which send nil regardless of the schema

pauld: let's work on the minOccurs/nillable truth table
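
As a starting point for such a table, the following Python sketch enumerates the four minOccurs/nillable combinations and the instance forms each one permits. It only encodes what XML Schema allows (minOccurs="0" permits absence, nillable="true" permits xsi:nil="true"); it says nothing about what any of these forms should mean to a toolkit, and the function name and wording of the labels are assumptions.

```python
from itertools import product

def allowed_forms(min_occurs_zero: bool, nillable: bool):
    """Instance forms a schema-valid document may use for one element."""
    forms = ["present with a value"]
    if min_occurs_zero:
        forms.append("element absent")
    if nillable:
        forms.append('present with xsi:nil="true"')
    return forms

for min_zero, nillable in product([False, True], repeat=2):
    label = (f'minOccurs="{0 if min_zero else 1}" '
             f'nillable="{str(nillable).lower()}"')
    print(f"{label}: {', '.join(allowed_forms(min_zero, nillable))}")
```

The last row is the contentious one: with both set, a sender has two different ways to say 'no value', which is the interoperability question under discussion.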

dezell: are you sanctioning what works over best practice?
... XML Schema says little on the semantics of xsi:nil
... spec allows attributes on a nil='true' element!!!

yves: use-case of an empty vector over no vector at all

pauld: maybe we could allow all combinations, but document avoiding 'nillable="true"/minOccurs="0"' as not being good practice

dezell: noodles on Yves's use-case; we could advocate against the bad practice
... how do we get interoperability, even in a dark corner - do we say what each case "means"?

yves: can I craft a counter example to demonstrate the interop issue? I can try to produce something based upon the vector example

<scribe> ACTION: yves to craft vector counter example for ISSUE-7

<trackbot> Created ACTION-22 - Craft vector counter example for ISSUE-7 [on Yves Lafon - due 2006-03-06].

pauld: decision we have to make surrounds minOccurs=0 and nillable=true. Do we a) include it in BASIC patterns b) if we include it, do we add advice to avoid it

dezell: I would only cite nillable='true' on simple content

pauld: sounds like an excellent suggestion
... worry about the effect on the fixed array of objects use-case

dezell: suggests adding conservative on sending, liberal on reception rule at start of document

pauld: Postel's law

yves: HTML horror!

<scribe> ACTION: jcalladi to craft text for ISSUE-7, table followed by 'best practice' against using nil minOccurs=0 and nillable

dezell: xsi:type in combination with a union makes sense


move both issues to advanced patterns

ISSUE-23 Use of Mixed Content datatype

pauld: what does it 'mean' (to a toolkit) to say mixed='true'

msmq: likely to be handed a DOM, what does a DOM mean to an OO developer

yves: it's generic, and lacks types

pauld: essence of schema is adding types, I could imagine being given a PSVI annotated DOM node as being useful

msmq: natural language prose, XML allows me to add in phrase level annotations, tagging data would then give me a bag of typed data

pauld: we're trying to document the art of the possible, my experience is tools bail when they hit mixed

dezell: by bail you mean drop to DOM

pauld: yes, (on a good day). We're not seeing 'strings' containing objects

ISSUE-17 - Advice for representing a duration

dezell: schema 1.1 working group looking at calculation of duration from two datetime values. Unambiguous representations. We took out leap seconds from calculations.

pauld: do people have good experiences using xs:duration

dezell: XQuery has shaken a lot of the bugs out of this area
... a combination of a duration and a datetime isn't ambiguous

pauld: ws-addressing chose to use longInteger seconds

yves: documenting in BASIC that a duration is best represented as a whole number of units: minutes, seconds, hours, years
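
One way to read 'a whole number of units' is to restrict xs:duration to a single unit. The Python sketch below assumes whole, non-negative seconds and the lexical form PT<n>S, which is a valid xs:duration; the function names and the choice of seconds are assumptions, not a WG decision.

```python
import re

# Restricted lexical form: a whole number of seconds, e.g. PT90S.
_WHOLE_SECONDS = re.compile(r"PT([0-9]+)S")

def seconds_to_duration(seconds: int) -> str:
    """Format a non-negative whole number of seconds as an xs:duration."""
    if seconds < 0:
        raise ValueError("this sketch only handles non-negative durations")
    return f"PT{seconds}S"

def duration_to_seconds(lexical: str) -> int:
    """Parse the restricted PT<n>S form back to whole seconds."""
    match = _WHOLE_SECONDS.fullmatch(lexical)
    if match is None:
        raise ValueError(f"not in the restricted PT<n>S form: {lexical!r}")
    return int(match.group(1))

print(seconds_to_duration(90))
print(duration_to_seconds("PT90S"))
```

Avoiding months and years sidesteps the ambiguity of variable-length units, at the cost of large numbers for long durations.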

dezell: derivative types of xs:duration works

ednote work taking place in XML Schema 1.1 WG

ISSUE-23 - Use of Mixed Content datatype

dezell: what do we do about sending HTML, mixed='true'

pauld: escaping seems attractive to many

yves: they base64 encode it!

discussion of appealing to i18n, markup may be required for right to left.

pauld: cites ws-addressing log file use-case where BOM and encoding caused problems, but saw great value in keeping contained XML as XML

msmq: proposition - suggest mixed content drops to DOM, should be easy to implement
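
To show what 'drops to DOM' hands a developer, here is a small Python sketch using the standard library's minidom. The sample document and `describe_children` are invented for illustration; the point is that mixed content arrives as an untyped, interleaved list of text and element nodes rather than as typed fields.

```python
from xml.dom import minidom

# Invented mixed-content sample (assumption).
doc = minidom.parseString("<note>Call <b>Ann</b> before <i>noon</i>.</note>")

def describe_children(element):
    """List the interleaved text and element children of a DOM element."""
    described = []
    for node in element.childNodes:
        if node.nodeType == node.TEXT_NODE:
            described.append(("text", node.data))
        elif node.nodeType == node.ELEMENT_NODE:
            described.append(("element", node.tagName))
    return described

print(describe_children(doc.documentElement))
```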

discussion of mixed content use-cases

yves: basic patterns is documenting state of the art

jonc: we have recent experience with a mainstream SIP server which rejects schemas that contain mixed, anywhere

dezell: tentative solution is not to include it in the BASIC patterns

pauld: for me mixed is key to XML, it's saying a node contains XML

want to say something, even if it's an anti-pattern

discussion around anti-patterns, what's so special about mixed?

dezell: 'indictment', Advanced could be seen as 'ammunition'

pauld: I see this is like the name mapping - we have to document 'what works' but what works is so abhorrent we have to document what SHOULD work and say 'buyer beware'

msmq: draws attention to Henry's sample collection 1023 schemas 133 contain 'mixed'

yves: not many of those are likely to be databinding

dezell: would bet most of those are to escape HTML

<scribe> ACTION: msmq to report on status of Schema Collection

RESOLUTION: close ISSUE-23 with advice that tools should support mixed as a DOM node, but YMMV

ISSUE-18 - Schema Authoring Style

discussion of venetian (v) russian doll, salami slice akin to a DTD.

how Java inner and anonymous types may be represented

jonc: BT best practice has been to use venetian blind

pauld: worries this may be out of date, we need to do some work with recent versions of tools

<scribe> ACTION: pauld to investigate tools with authoring styles

ISSUE-13 and ISSUE-14 - xs:default handling by generated types

msmq: if validation was faster, wouldn't people use it all the time?

yves: it's seen as being redundant as processing the data model adds checking

pauld: issue falls into two parts: xs:default (v) other features

discussion around sender / receiver performing validation and completing default values

pauld: I have a schema with 500 optional/default values, my 80 point is sending 5 values in a message, xs:default seems useful
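
A hedged Python sketch of a receiver completing defaults. The minutes do not say whether the defaults in question are on attributes or elements; this sketch assumes attribute defaults, where an absent attribute takes the declared default. The `ATTRIBUTE_DEFAULTS` table, `apply_defaults` and the sample message are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical defaults, standing in for xs:attribute default="..." declarations.
ATTRIBUTE_DEFAULTS = {"currency": "GBP", "priority": "normal", "giftWrap": "false"}

def apply_defaults(element, defaults):
    """Fill in any attribute the sender omitted; sent values are kept."""
    for name, value in defaults.items():
        element.attrib.setdefault(name, value)
    return element

# The sender transmits only the values that differ from the defaults.
order = ET.fromstring('<order id="42" priority="high"/>')
apply_defaults(order, ATTRIBUTE_DEFAULTS)
print(dict(order.attrib))
```

With many optional values and few sent per message, this keeps messages small but makes the receiver depend on having the same schema as the sender.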

msmq: topnode validation at least, seems like an obvious, minimal requirement to me, mapping to objects seems to have the opportunity to perform validation. making it survive invalid documents seems like more work.

dezell: error reporting seems implementation dependent
... statement "for interoperability, all documents MUST be schema valid"

msmq: Mary is interested in partial validity

pauld: versioning is to be discussed this afternoon, and that's a good use-case for partial validity. Can't we say "SHOULD"

dezell: MUST seems to be state of the art

pauld: hears three issues here: xs:default (we've drilled into) and partial validation (state of the art seems to be strict validation) and the versioning use-case (possibly advanced patterns)
... seems like we're in need of some understanding of how tools handle xs:default in the wild

dezell: 'skip' and 'lax' seems to build upon PSVI

msmq: 'skip' seems to be a drop to DOM

dezell: is there a Web services use-case for lax and skip?

msmq: versioning?

jonc: we have a use-case with WS-Notification - cites use-case

discussion of lax and strict and wrapping

pauld: sounds like an implicit choice, an xsi:type even

jonc: it's not my spec :-)

msmq: sounds like sub-classing, difference seems to be you're avoiding substitution groups

<scribe> ACTION: pdowney to raise a new issue on 'skip'

<scribe> ACTION: pdowney to raise a new issue on 'lax'

pauld: skip and lax seems to be similar to 'mixed'

jonc: Erik's issue is about defaulting creating a wrapper element

pauld: seems like a common bug which we can document as an issue

<scribe> ACTION: pdowney to investigate xs:default with tools

ISSUE-8 - Using patterns to constrain numerical types

msmq: some patterns may be harmless (ignore leading zeros) whereas others can be difficult

dezell: patterns could ask for even numbers, multiple of 10

msmq: forbidding some lexical forms as opposed to *all* lexical forms for a value. imagine incrementing a counter then trying to generate a lexical value - a pattern may prevent or make that task difficult

msmq: procedure is to generate a lexical value, then schema validate it
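
The 'generate, then validate' procedure can be sketched in Python as below. This is an illustration under assumptions: Python's `re.fullmatch` stands in for an XML Schema pattern facet (the two regex dialects are not identical), only a handful of the unboundedly many xs:integer lexical forms are tried, and the function names are hypothetical.

```python
import re

def lexical_forms(value: int):
    """A few of the unboundedly many xs:integer lexical forms for one value."""
    canonical = str(value)
    digits = canonical.lstrip("-")
    sign = "-" if value < 0 else ""
    forms = [canonical, f"{sign}0{digits}", f"{sign}00{digits}"]
    if value >= 0:
        forms.append(f"+{digits}")
    return forms

def serialise(value: int, pattern: str):
    """Return the first candidate form the pattern accepts, else None."""
    for form in lexical_forms(value):
        if re.fullmatch(pattern, form):
            return form
    return None

print(serialise(7, r"[0-9]+"))         # canonical form is accepted
print(serialise(7, r"0[0-9]+"))        # pattern forbids the canonical form
print(serialise(7, r"[0-9]*[02468]"))  # pattern forbids every form of 7
```

The middle case is the awkward one: the value is allowed, but the serialiser has to search for a form other than the canonical one, and a brute-force search has no natural stopping point.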

msmq: Patterns can get complex. Have a schema pattern to only allow modulo any number!

dezell: nice theorem and matches all of my empirical cases

dezell: BASIC seems like don't do it

pauld: wondering how to manage this issue. would like to close it for BASIC and open a separate issue for Advanced

msmq: patterns should be OK for strings, and other lexical forms. Will enumerate (as soon as network wakes up)
... decimal, precisionDecimal, etc
... strings, URIs and other one-to-one mappings (date times not including minutes and seconds, year, day, etc don't have leading zeros one-to-one mapping) should be OK
... all patterns are ok, so long as they don't remove a canonical form

dezell: boolean?

msmq: if you removed '0' and 'false' then 'true' and '1' would be allowed, checking to see if it can be serialised would be OK in that case. writing a pattern of '0'|'true' would require an oracle, possible for boolean, but much harder for 'float'.

dezell: '0'|'1' or 'false'|'true' seems to be the common use-case. A pattern could prevent leading zeros, why would that be bad

msmq: a System 370 editor pattern may always lead to a value with a leading zero. A brute-force approach doesn't work for numerical forms, they may be almost infinite.

yves: my pattern allows odd numbers, adding leading zeros will never help ..

dezell: in the advanced we should explain if a pattern eliminates any lexical form, it should eliminate *all* lexical forms of that value. You still have to serialize and then check.

pauld: Sounds like we're moving towards a point where we can craft a concrete proposal.


ISSUE-20 - Extension of collections

Paul introduces topic, cites chartered requirement to consider additional elements as a use-case

discussion of ignoring additional content.

dezell: extension element, with a model group

pault: ideally you know what types you're going to extend

pauld: the patterns with extension wrapper leads to a staircase of elements

dezell: it's horrible, no matter what you do

pault: sounds like it comes from ASN.1 - cites pattern of extending at the end

dezell: schema 1.1 is looking at extensions at the suffix, or I prefer the validate twice - remove unknowns. turns all type derivation into restriction
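
One possible reading of 'remove unknowns' is sketched below in Python: a v1 receiver strips child elements it does not recognise, and the remainder could then be validated against the v1 schema. The validation passes themselves are not shown, and the v1 content model, `strip_unknown` and the sample message are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical v1 content model for a hypothetical 'person' type.
KNOWN_V1_CHILDREN = {"name", "email"}

def strip_unknown(element, known):
    """Remove child elements that the v1 receiver does not know about."""
    for child in list(element):   # copy, since we mutate while iterating
        if child.tag not in known:
            element.remove(child)
    return element

# A v2 sender has added a trailing 'pager' element.
v2_message = ET.fromstring(
    "<person><name>Ann</name><email>a@example.org</email>"
    "<pager>555</pager></person>"
)
strip_unknown(v2_message, KNOWN_V1_CHILDREN)
print([child.tag for child in v2_message])
```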

pauld: we went down that road with WSDL 2.0 LC124, 'unknown' has to be just that - 'unknown' in the schema

pault: ASN.1 has extensibility implied - adds to the end of each type implicitly

dezell: Schema 1.1 is likely to have good stuff, but not for this WG

discussion of adding to end of structure, contrived extension patterns, v1-v2-v3 evolutions. Sadly detailed notes were lost thanks to pauld's mac crashing and our not being able to use IRC :-(

pault: can't we suggest extension at end in a different namespace

pauld: do you mean identify extension for v2 in v1 structures?
... coming to the conclusion we won't have a basic pattern for versioning, but we'll have to show our working

ISSUE-7 - xsi:nil and minOccurs=0 xsi:nil, Redux

Jonc presents proposed text for ISSUE-7 (projected, not on the list)

pauld: minOccurs=0 and nil=true is a design point

dezell: we discussed this at length, bottom line is nil use-case for database vendors sending missing (unexpected) values (e.g. mirroring databases)

pauld: wants to call out the design point between nil=true and minOccurs=0 as one of our documented issues
... wants to avoid normative advice to toolkit implementers in BASIC patterns

jonc: will continue to work on text and post to list

ISSUE-15 - ASN.1 null datatype

pault: null datatype useful in a choice of types in which one may be explicitly missing
... also seen bitstring like applications, for citing missing values

pauld: sounds like yet another semantically different null, minOccurs=0

pault: choice itself could be optional, other ways of making it absent.
... some people wanted to say this explicitly - I always have a value
... it's never sent in packed encodings

pauld: seems like it has use-cases, mapping to ASN.1 is very interesting for fast-XML etc, but how do current toolkits deal with it? Is it suitable for BASIC patterns?

jonc: worries if this may become an alternative to minOccurs=0 or nillable for empty content

pauld: doesn't seem as attractive as minOccurs or nillable, but has utility for ASN.1

jonc: what happens if you hang attributes off the element?

pault: from ASN.1 perspective, attribute no different to element

pauld: likely to be mapped to a class - my concern is an empty element may not work with tools
... let's park this

<scribe> ACTION: pdowney to investigate null patterns on tools

<scribe> ACTION: pdowney to raise an issue on more than one way to express empty or missing content

General ways forward

pauld: worries about depth of knowledge from within WG, especially wrt practical implementation of schema

pauld: starting to worry less about content, but more about written style of our specs.

yves: specification guidelines from the QA activity will help, may be able to enlist editorial guidance.

<scribe> ACTION: pdowney to plan two Editorial/F2F meetings in Europe

pauld: we need to move forward with the text. noodling on splitting the weekly call to have 15min/45min editorial/WG split

pauld: thanks pault, and would like to record thanks to our other observers for their valuable input
... and thanks the W3C for hosting!

yves: W3C is sorry about the network!


Summary of Action Items

[NEW] ACTION: pdowney to enumerate schema 'features' for the roadmap
[NEW] ACTION: yves to write a wget script to publish issues on the public page
[NEW] ACTION: pdowney to build list of implementations
[NEW] ACTION: jcalladi to report on code-first experiences with 'all' and toolkits
[NEW] ACTION: yves to craft vector counter example for ISSUE-7
[NEW] ACTION: jcalladi to craft text for ISSUE-7, table followed by 'best practice' against using nil minOccurs=0 and nillable
[NEW] ACTION: msmq to report on status of Schema Collection
[NEW] ACTION: pauld to investigate tools with authoring styles
[NEW] ACTION: pdowney to raise a new issue on 'skip'
[NEW] ACTION: pdowney to raise a new issue on 'lax'
[NEW] ACTION: pdowney to investigate xs:default with tools
[NEW] ACTION: pdowney to investigate null patterns on tools
[NEW] ACTION: pdowney to raise an issue on more than one way to express empty or missing content
[NEW] ACTION: pdowney to plan two Editorial/F2F meetings in Europe
[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.127 (CVS log)
$Date: 2006/03/28 16:51:28 $