Re: Formulate erratum text on versioning for the web architecture document

Hello Noah,

On Feb 18, 2009, at 10:22 PM, ext noah_mendelsohn@us.ibm.com wrote:

> John Kemp wrote:
>
>> Was it the presence of the 'version' attribute in the specification of
>> XML, or the fact that it must say '1.1' in the case that an XML 1.1
>> instance was being exchanged?
>
> First, I think it's worth observing that at best the attribute did not
> solve a compatibility problem, since (unless I'm remembering some detail)
> no document that was otherwise legal in 1.0 and 1.1 had differing
> interpretations per the two specifications.  So, labeling a document 1.1
> really was a signal that "yes, I know I'm using new characters in the
> document below, and I meant it".

Yes, agreed.

> * The XML Recommendation did not provide any incremental forward
> compatibility with later versions.  If a document is labeled anything
> other than 1.0 (or no label, which defaults to 1.0), then the document
> must be rejected.  Labeling a document 1.1 thus provided insurance that it
> wouldn't be processable at all by the tens or hundreds of millions of
> already deployed XML processors.

So was XML 1.0 technically guilty of violating this AWWW best practice?

   "A specification SHOULD provide mechanisms that allow any party to
   create extensions."

>
> * The XML 1.1 Recommendation did suggest that the 1.1 version marker be
> used only if some 1.0-incompatible content was in the document.  Turns
> out, that's easy advice to give, but hard to implement.  It almost ensures
> that a general purpose application will have to make two passes in
> creating a document:  one pass to look for new characters, and the second
> to output it.  You can have a gigabyte of 1.0 compatible output and
> discover that, at the very end, some character you wanted to write is
> legal only in 1.1.  Well, you better not have written the very first line
> of the file, because that now has to have its version attribute changed.
> In practice, lots of 1.1-capable software (well, I'm not sure there was
> much 1.1-capable software, but a high percentage of what there was...)
> unconditionally applied the new label.
>
> Obviously, to answer your question directly, the attribute would have
> caused no trouble if it was merely treated as a comment.  Maybe or maybe
> not some less draconian compatibility rules could have been applied, and
> maybe or maybe not the attribute would have been helpful in implementing
> them.  That's at best undemonstrated, IMO.
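
As an aside, the two-pass problem you describe can be sketched in a few
lines of Python. This is only an illustration under simplifying
assumptions: the function names and the <doc> wrapper element are made up,
and the only 1.1-specific content it looks for is the set of C0 control
characters that XML 1.1 permits as character references and XML 1.0
forbids outright.

```python
import re

# C0 control characters that XML 1.1 permits (as character references)
# but that XML 1.0 forbids entirely. Tab, LF and CR are legal in both.
ONLY_IN_XML_1_1 = re.compile(r"[\x01-\x08\x0b\x0c\x0e-\x1f]")


def needs_xml_1_1(chunks):
    """First pass: scan every chunk for characters that XML 1.0 rejects."""
    return any(ONLY_IN_XML_1_1.search(chunk) for chunk in chunks)


def escape(chunk):
    """Escape markup and write 1.1-only controls as character references."""
    chunk = chunk.replace("&", "&amp;").replace("<", "&lt;")
    return ONLY_IN_XML_1_1.sub(lambda m: "&#x%X;" % ord(m.group()), chunk)


def serialize(chunks):
    """Second pass: the declaration is written first, so the scan of the
    whole document must already be finished before any output starts."""
    version = "1.1" if needs_xml_1_1(chunks) else "1.0"
    body = "".join(escape(chunk) for chunk in chunks)
    return '<?xml version="%s"?>\n<doc>%s</doc>' % (version, body)
```

Note that `chunks` has to be a list (or something else that can be
iterated twice). A streaming writer that sees each chunk only once cannot
follow the "label 1.1 only when needed" advice without buffering, which is
presumably why software ended up applying the 1.1 label unconditionally.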

I think we're now bordering on talking about error handling with
respect to the presence of "non-conforming" content. That relates to
how the version attribute was used, not to the presence of the version
attribute in the format specification.

>
>
>> I read this line as suggesting that a format specification should
>> provide a mechanism for instances to indicate a version of the
>> specification to which the author of the instance believes the
>> instance complies.
>
> Me too.  For reasons such as the ones given above, I'm not convinced
> that's in general good advice.

If the author of a specification fails to provide an EXPLICIT
mechanism for indicating the format version in instances of the
format, then IMPLICIT versioning will occur. For example, an XML
element will contain a certain XML attribute in one version of the
language, and will not contain that attribute in another version of
the language. Of course, that may happen when people author instances
anyway, so the intent of this best practice, I think, is to ensure
that the authors of a format specification think about version
indications (somewhat) separately from the actual changes that occur
in different versions of the language.
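
To make the explicit/implicit distinction concrete, here is a minimal
Python sketch. Everything about the format is invented for illustration:
a recipe document whose root element may carry a 'version' attribute, and
a 'unit' attribute on <ingredient> that (hypothetically) exists only in
version 2 of the language.

```python
import xml.etree.ElementTree as ET


def detect_version(xml_text):
    """Return (version, how) for a hypothetical recipe format.

    'how' is "explicit" when the instance labels itself, and "implicit"
    when the version has to be inferred from the content.
    """
    root = ET.fromstring(xml_text)

    # Explicit versioning: the author states which version is intended.
    label = root.get("version")
    if label is not None:
        return label, "explicit"

    # Implicit versioning: guess from features that only one version has.
    # Assumption: the 'unit' attribute was introduced in version 2.
    if any(item.get("unit") is not None for item in root.iter("ingredient")):
        return "2", "implicit"
    return "1", "implicit"
```

The implicit branch is exactly the kind of guesswork a consumer is left
with when the specification provides no version indication: it only works
if every change between versions happens to be detectable in the instance.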

It is right to question the use of version indications given the
failure in adoption of so many 1.X versions that did provide explicit
methods of versioning. We might ask "what is an explicit versioning
mechanism good for?" and attempt to document that, or provide specific
examples (as you have started below) where an explicit version
indication is useful.

>  Furthermore, I think it's only defensible
> if one can answer the sorts of questions raised in the recipe example in
> the TAG blog entry.  The AWWW suggests that a mechanism should be provided
> in the instance, without pointing out such points of confusion.  I'm not
> saying that providing for version information in the instance is always a
> mistake.  I do think it only makes sense when:
>
> * One can answer questions such as:  is an author responsible for naming
> any one version with which the document is compatible?  The newest?  The
> oldest?  More than one?
>
> * The rules for accepting, rejecting and interpreting the content of a
> document are shown to be (helpfully) influenced by the presence of the
> version information.
>
> The one case I'm convinced of is the one I mentioned earlier:  if you
> introduce incompatible changes, such that the same document is legal in
> more than one version, but that it means different things, then labeling
> the instance is essential.  For example, if an early version of a data
> format referenced arrays with one-based indices, and a later version
> changed to zero-based, it would be essential to label the instance in a
> way that would allow the intended interpretation to be discovered.

I agree with your points.
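
The array-index case can be sketched as follows. The version labels "1"
(one-based) and "2" (zero-based) and the function name are hypothetical;
the point is only that the same index value means different things, so a
consumer needs the label to pick the intended interpretation.

```python
def resolve_index(index, format_version):
    """Map an index found in an instance to a zero-based Python offset.

    Assumption for illustration: version "1" of the format used one-based
    indices and version "2" switched to zero-based indices.
    """
    if format_version == "1":
        return index - 1
    if format_version == "2":
        return index
    # Without a recognized label there is no safe interpretation.
    raise ValueError("unknown format version: %r" % (format_version,))


items = ["flour", "sugar", "eggs"]
first_by_v1 = items[resolve_index(1, "1")]  # "1" means the first item
first_by_v2 = items[resolve_index(1, "2")]  # "1" means the second item
```

Here the instance "1" is legal under both versions, and nothing in the
content itself reveals which reading the author meant.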

I would suggest that none of these points invalidates the best practice
"that a format specification should provide a mechanism for instances
to indicate a version of the specification to which the author of the
instance believes the instance complies."

However, they do raise interesting issues that, in my opinion, should  
be documented somewhere (if not already) as they provide an additional  
level of detail based on actual experience in performing language  
versioning on the Web.

Regards,

- johnk

Received on Thursday, 19 February 2009 14:49:11 UTC