Comments on July 26, 2006 Versioning Draft from Marc de Graauw on 2006-08-29 (www-tag@w3.org from August 2006)

From: Marc de Graauw <marc@marcdegraauw.com>
Date: Tue, 29 Aug 2006 14:59:07 +0200
To: "'David Orchard'" <dorchard@bea.com>, <www-tag@w3.org>
Message-ID: <006f01c6cb6a$ea0de970$fd00a8c0@MARCNOTE>
Hi David,

Below are some more comments on the July 26 Versioning Draft. BTW, I am
co-responsible for the versioning standards in the Dutch Healthcare and
Criminal Justice Exchanges, which are currently being developed. As such,
your document provides very valuable input and I want to stress how much I
appreciate the existence of such a document and the effort you've put into
it.

1.1, par. 3: "The set of information in a language almost always has
semantics." I believe the use of the term "Information Set" here and in
other parts is confusing. In an XML context one tends to equate this with
the Information Set as defined in XML Infoset [1]. However, if an XML
document (Text) is just syntax, the XML Infoset is just syntax too (in fact
[1] does not even contain the word "semantics"). Your use of "Information"
and "Information Set" is very semantic in nature, as shown by the direct
association with semantics in the diagram, as well as by the Act of
Consumption which impacts the consumer. So I think you should make clear
that "Information Set" is not the same as an XML Infoset but a semantic
notion, or use another term.

1.1.1, par. 6: "The strings could be compatible but the information conveyed
is not compatible." This sentence is unclear to me; I think it needs
elaboration.

1.1.1, last par.: "We have shown that forwards and backwards
compatibility..." should probably read "We have shown that *both* forwards
and backwards compatibility...". In general, I sometimes struggled with your
use of the word "compatibility", as you often use it to mean "both forwards
and backwards compatibility", which you also term "full compatibility".
Backwards compatibility (without forwards compatibility) is of course a form
of compatibility as well, so I think the text would become clearer if you
always used "full compatibility" when you mean "both forwards and backwards
compatibility", and used "compatibility" to mean any form of compatibility:
full, forwards or backwards.

1.1.1.1, par. 3: "It may be very difficult for a language designer to know
many different language flavours are in existance." should be "It may be
very difficult for a language designer to know *how* many different language
flavours are in exist*e*nce."

1.1.1.1, par. 4: "If a flavour of a language was also used for consumption,
it would have create an instance that is valid according to the Language V1
rules." should be "If a flavour of a language was also used for
*production*, it *should* have *to* create an instance that is valid
according to the Language V1 rules."

3.1: "If the language can be extended in a compatible way, then a few
specific schema design choices must be followed." Further on you describe
the possibility of transforming new (extended) instances into older
instances. If a language makes such a transformation (stripping all unknown
content) required, the Schemas do not need extensibility (with wildcards),
so the "must" in this sentence is too strong.
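
As a rough illustration of the kind of transformation I mean, here is a
minimal Python sketch. The element names and the KNOWN_V1 set are
hypothetical and not taken from your draft; namespaces and unknown
attributes are ignored for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical V1 vocabulary: the element names a V1 Schema declares.
KNOWN_V1 = {"name", "given", "family"}

def strip_unknown(elem):
    """Remove, in place, every descendant element whose tag is not in KNOWN_V1."""
    for child in list(elem):  # iterate over a copy, since we mutate elem
        if child.tag in KNOWN_V1:
            strip_unknown(child)
        else:
            elem.remove(child)
    return elem

# A V2 instance carrying an extension element that V1 does not know.
v2_instance = (
    "<name><given>Jan</given><family>Jansen</family>"
    "<middle>P</middle></name>"
)

# After stripping, the instance only contains V1 content and could be
# validated against a closed V1 Schema without wildcards.
v1_instance = ET.tostring(
    strip_unknown(ET.fromstring(v2_instance)), encoding="unicode"
)
```

With such a step in front of validation, the V1 Schema itself can stay
closed, which is why I think the "must" is too strong.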

4, Good Practice #1: "Languages SHOULD be designed for extensibility." I
feel this is a bit too strong. Most exchange languages I know of do not
implement extensibility mechanisms in the way you describe, and although
this is a SHOULD, not a MUST, it still means a lot of well-functioning
languages violate this Good Practice. Extensibility should be an option for
a language designer, not a SHOULD. You yourself show with your discussions
of closed systems and security languages that there are perfectly good
reasons for not using extensibility. I myself work in Healthcare, and a
Must-Ignore default for medical information is often not the way to go
either...

5, Example 3: I think you should mention the XML Schema (non)determinism
pitfalls here if you show the Schema; at present you only mention, in the
last paragraph of 5, "Attribute extensions do not have non-determinism
issues..." without further explanation. The problem is important enough to
mention here.

8: The Chapter title should be "Version Identification Strategies *Using
Namespaces*" since in the previous part you describe alternatives not using
namespaces (although you recommend using namespaces).

8, numbered list:
http://www.oasis-open.org/archives/ubl/200511/msg00014.html and its
predecessors list languages which use those various approaches; maybe these
could serve as illustrative examples. In addition, HL7v3 currently has a
single namespace for all versions and uses (several flavours of) version
indicators in the instance.

As for approach 1, "all components in new namespace(s) for each version",
when we proposed this we met with severe opposition from our clients. Since
a new namespace means new Schemas, consumers which do not use any features
of the new language are faced with an upgrade if they want to process
messages from new producers, even when those messages contain only 'old'
information content. Since our clients' software release policy required
regression testing for all changes in software (including Schemas), this
would mean a serious effort without any benefits. I think this drawback
deserves mentioning here.

9: There is another strategy to versioning which you do not mention: a
producer simply lists in an instance which consumer versions may process the
message. A producer could thus simply say "Consumers who understand version
2 or 3 may process this message". The advantage is you don't need
mustUnderstand flags everywhere. If a newer version of a language L2
contains an optional item whose understanding is mandatory, the producer
could require L2 consumers if the optional item occurs, and L1 or L2
consumers if the optional item does not occur. Of course the number of
versions could theoretically become high, but in practice there often aren't
that many versions of a language: we have two XMLs, two SOAPs, two UBLs,
so this approach is feasible in practice. It works for forward
(in)compatibility since it requires a newer producer who knows the
capabilities of older consumers. 
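
A minimal Python sketch of this strategy follows. All names are
hypothetical: "processable_by" stands for whatever field the instance would
use to list versions, and "dosage_change" stands for an optional item
introduced in version 2 whose understanding is mandatory.

```python
def acceptable_versions(payload):
    """Producer side: decide which consumer versions may process the payload."""
    # Assumption: 'dosage_change' is optional in version 2, but a consumer
    # must understand it whenever it occurs.
    if "dosage_change" in payload:
        return {2}
    return {1, 2}

def produce(payload):
    """Build a message that lists the consumer versions allowed to process it."""
    message = dict(payload)
    message["processable_by"] = acceptable_versions(payload)
    return message

def may_process(consumer_version, message):
    """Consumer side: process only if the consumer's own version is listed."""
    return consumer_version in message["processable_by"]

old_style = produce({"patient": "p1"})
new_style = produce({"patient": "p1", "dosage_change": "halve"})
```

Here a version 1 consumer may process old_style but has to reject
new_style, and no item-level mustUnderstand flag is needed.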

In general, backwards (in)compatibility can be defined in the language
specification, since the implementer of a new consumer will know from the
specification which older versions of the language are processable by it,
but forwards (in)compatibility must be defined in the instance, since the
implementer of a newer producer may not know which versions the consumers
support.

10. You should probably mention one drawback of extensions: if multiple
parties "invent" the same (functional) extension and that functionality then
comes in a new version, getting the extensions back in sync in the new
version may meet with opposition. I don't say this is a reason for not using
extensions, but I think in fairness it should be mentioned.

I hope this is helpful, and I do want to stress again my appreciation for
this effort.

Regards,

Marc de Graauw
 
[1] http://www.w3.org/TR/xml-infoset/

| -----Original Message-----
| From: www-tag-request@w3.org [mailto:www-tag-request@w3.org] 
| On Behalf Of David Orchard
| Sent: donderdag 27 juli 2006 0:41
| To: www-tag@w3.org
| Subject: Updated versioning finding
| 
| I've added the section I mentioned on partial understanding.  
| I also tied this in with the internet robustness principle 
| and provided a couple of good practice notes around "liberal" 
| in accepting and "conservative" in producing.
| 
| It's pretty rough around the edges, but I think it really 
| completes the story about compatibility.
| 
| http://www.w3.org/2001/tag/doc/versioning
| 
| Cheers,
| 
| Dave
| 
|
Received on Tuesday, 29 August 2006 12:59:35 UTC