<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE spec PUBLIC "-//W3C//DTD Specification V2.1//EN"
               "../xmlspec-v21/xmlspec.dtd" [
	<!-- ================================================================ -->
	<!ENTITY draft.day "12">
	<!ENTITY draft.month "12">
	<!ENTITY draft.monthname "December">
	<!ENTITY draft.year "2006">
	<!ENTITY iso6.doc.date "&draft.year;-&draft.month;-&draft.day;">
	<!ENTITY basename "http://www.w3.org/2001/tag/doc/versioning-xml">
	<!ENTITY draftname "&basename;-&draft.year;&draft.month;&draft.day;">
]>
<spec w3c-doctype="other">
	<?CVS $Id$?>
	<header>
		<title>[Editorial Draft] Extending and Versioning Languages: XML Languages</title>
		<w3c-designation>&basename;-&iso6.doc.date;</w3c-designation>
		<w3c-doctype>Draft TAG Finding</w3c-doctype>
		<pubdate>
			<day>&draft.day;</day>
			<month>&draft.monthname;</month>
			<year>&draft.year;</year>
		</pubdate>
		<publoc>
			<loc href="&draftname;.html">&draftname;.html</loc>  ( <loc href="&draftname;.xml">xml</loc> )
		</publoc>
		<latestloc>
			<loc href="&basename;">&basename;</loc>
		</latestloc>		

		<authlist>
			<author>
				<name>David Orchard</name>
				<affiliation>BEA Systems, Inc.</affiliation>
				<email href="mailto:David.Orchard@BEA.com">David.Orchard@BEA.com</email>
			</author>
		</authlist>
		<copyright>
			<p>
				<loc href="http://www.w3.org/Consortium/Legal/ipr-notice-20000612#Copyright">Copyright</loc> &#xA9; 2003
<loc href="http://www.w3.org/">W3C</loc>
				<sup>&#xAE;</sup>
(<loc href="http://www.lcs.mit.edu/">MIT</loc>,
<loc href="http://www.inria.fr/">INRIA</loc>,
<loc href="http://www.keio.ac.jp/">Keio</loc>),
All Rights Reserved. W3C
<loc href="http://www.w3.org/Consortium/Legal/ipr-notice-20000612#Legal_Disclaimer">liability</loc>,
<loc href="http://www.w3.org/Consortium/Legal/ipr-notice-20000612#W3C_Trademarks">trademark</loc>,
<loc href="http://www.w3.org/Consortium/Legal/copyright-documents-19990405">document use</loc>, and
<loc href="http://www.w3.org/Consortium/Legal/copyright-software-19980720">software licensing</loc>
rules apply.
</p>
		</copyright>
		<abstract>
			<p>This document extends the first part of the versioning finding with XML-specific versioning information and practices.</p>
		</abstract>
		<status>
			<p>This document has been developed for discussion by the
<loc href="/2001/tag/">W3C Technical Architecture Group</loc>. It does
not yet represent the consensus opinion of the TAG.</p>
			<p>Publication of this finding does not imply endorsement by the W3C
Membership. This is a draft document and may be updated, replaced or
obsoleted by other documents at any time.</p>
			<p>
				<loc href="/2001/tag/findings">Additional TAG findings</loc>, both
approved and in draft state, may also be available. The TAG expects to
incorporate this and other findings into a Web Architecture Document
that will be published according to the process of the <loc href="/Consortium/Process-20010719/tr#Recs">W3C Recommendation
Track</loc>.</p>
			<p>Please send comments on this finding to the publicly archived TAG
mailing list <loc href="mailto:www-tag@w3.org">www-tag@w3.org</loc>
(<loc href="http://lists.w3.org/Archives/Public/www-tag/">archive</loc>).</p>
		</status>
		<pubstmt>
			<p>Chicago, Vancouver, Mountain View, et al.: World-Wide Web Consortium,
Draft TAG Finding, 2003.</p>
		</pubstmt>
		<sourcedesc>
			<p>Created in electronic form.</p>
		</sourcedesc>
		<langusage>
			<language id="EN">English</language>
		</langusage>
		<revisiondesc>
			<slist>
				<sitem>2003-07-29: Published draft</sitem>
			</slist>
		</revisiondesc>
	</header>
	<body>
		<div1 id="introduction">
			<head>Introduction</head>
			<p>(Much reworking in progress.)  Extending and Versioning XML Languages Part 1 described extending and versioning languages.  This XML part focuses on XML and includes schema-language-specific aspects of extending and versioning XML.  The choices, decisions, and strategies described in Part 1 are augmented here with XML and schema instances.</p>
		<div2>
			<head>XML Terminology</head>
				<p>There are many different systems for exchanging texts in languages, such as SQL, Java, XML, ECMAScript, and C#.  We will briefly describe some key refinements to our lexicon for XML.  An XML language has a vocabulary that may use terms from one or more XML Namespaces (or none), each of which has a namespace name.
  <termdef id="dt-xml-language" term="xmllanguage">An <term>XML language</term> is an identifiable set of vocabulary terms with defined XML syntactic and semantic constraints.</termdef>  By XML language, we mean the set of elements and attributes, or instances, used by a particular application.  The Name Language, consisting of the name, given, and family terms, has a namespace for its terms.  We use the prefix "namens" to refer to that namespace.  The Name Language could also include terms from other vocabularies, such as Dublin Core or UBL.  These terms each have their own namespaces, illustrating that a language can comprise vocabularies from multiple namespaces.  An XML Namespace is a convenient container for collecting terms
that are intended to be used together within a language or across languages. It provides a mechanism for creating globally unique names.</p>
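				<p>As a minimal sketch of a language that draws on more than one namespace, the instance below combines Name Language terms with a Dublin Core term.  The namespace name bound to "namens" is assumed to be the one used in the examples later in this finding, and the use of dc:language here is purely illustrative.</p>
				<example>
					<head>Illustrative instance using terms from two namespaces</head>
					<eg><![CDATA[<!-- Name Language terms use the "namens" prefix;
     dc:language is borrowed from Dublin Core for illustration only -->
<namens:name xmlns:namens="http://www.example.org/name/1"
             xmlns:dc="http://purl.org/dc/elements/1.1/">
  <namens:given>Dave</namens:given>
  <namens:family>Orchard</namens:family>
  <dc:language>en</dc:language>
</namens:name>]]></eg>
				</example>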
				<p>We shall use the term <termref def="instance">instance</termref> when speaking of sequences of characters (that is, text) in XML.    <termdef id="instance" term="instance">An <term>instance</term> is a specific, discrete text in XML format.</termdef>   Documents are instances of a language.  In XML, they must have a root element.   A name text might have a name element as the root element.   Alternatively, the name vocabulary may be used by a language such as purchase orders.  The purchase order texts may contain name elements.  Thus instances of a language are always part of a text and may also be the entire text.  XML instances (and all other instances of markup languages) consist of markup and content.  In the name example, the given and family elements, including the end markers, are the markup.  The values between the start and end markers are the content.  An instance has an information model.  A variety of data models exist inside and outside the W3C; the one standardized by the W3C is the XML Infoset.</p>
				<p>The XML-related terms and their relationships are shown below.</p>
				<graphic source="ext-vers-xml-uml.png" alt="UML diagram of XML terms"/>
				<p/>
				<p>A stylesheet processor is a
consumer of the XML text that it is processing (the producer isn't
mentioned); in the Web services context the roles of producer and
consumer alternate as messages are passed back and forth.  Note that most Web service specifications provide definitions of
inputs and outputs. By our definitions, a Web service
that updates its output schema is considered a new producer. A Web service
that updates its input schema is a new consumer.  </p>
			</div2>
			<div2 id="vocabkind">
				<head>Kinds of XML Languages</head>
				<p>Ultimately, there are different kinds of XML languages. The
versioning approaches and strategies that are appropriate for one kind
of language may not be appropriate for another. Among the various kinds
of vocabularies, we find:</p>
				<ulist>
					<item>
						<p>
							<emph>Just Names</emph>: some languages don't actually identify elements
or attributes; they're just lists of names. QNames that identify words
in the WordNet database, or the names of functions and operators
in XPath 2.0, are examples of <quote>just names</quote> languages.
</p>
					</item>
					<item>
						<p>
							<emph>Standalone</emph>: languages designed to be used
more-or-less by themselves, for example XHTML, DocBook, or TEI.</p>
					</item>
					<item>
						<p>
							<emph>Containers</emph>: languages designed to be used as a
wrapper or framework for some other language or payload, for example
SOAP or WSDL.
</p>
					</item>
					<item>
						<p>
							<emph>Container Extensions</emph>: languages designed to extend
or augment a particular class of container. Specifications that extend SOAP by defining SOAP
header blocks, for example, to provide security, asynchrony or reliable messaging
are examples of container extension languages.
</p>
						<p>There are a couple of types of XML extension languages: element extension and attribute extension.

<ulist>
								<item>
									<p>
										<emph>Element Extension</emph>.  Languages that are elements.  SOAP, etc. are element extensions.  </p>
								</item>
								<item>
									<p>
										<emph>Attribute or type Extensions</emph>.  Languages that are types or attributes.  These languages must exist in the context of an element.  Sometimes called "parasite" languages as they require a "host" element.  XLink is an example.</p>
								</item>
							</ulist>
						</p>
					</item>
					<item>
						<p>
							<emph>Mixtures</emph>: languages designed for, or often used for,
encapsulating some semantics inside another language. For example, MathML
might be mixed inside of another language.
</p>
					</item>
				</ulist>
				<p>This is by no means an exhaustive list. Nor are these categories
completely clear cut. MathML can certainly be used standalone, for
example, and languages like SVG are a combination of standalone,
containers, and mixtures.</p>
			</div2>
			</div1>
			<div1 id="example">
			<head>Example</head>
			<p>Suppose that you have designed a language for handling personal
information consisting of a single "Name" element.
The first version of the Name contains a "given" and a "family" element.
There are a variety of strategies for extensibility and versioning, detailed later.  This example simply shows the "new components in new namespace" strategy.  We use this strategy because it is probably the simplest strategy that works using W3C XML Schema.</p>
			<example>
				<head>New components in new namespace(s) schema Version 1</head>
				<eg><![CDATA[<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
      targetNamespace="http://www.example.org/name/1" 
      xmlns:name="http://www.example.org/name/1"> 

  <xs:complexType name="name">
    <xs:sequence>
      <xs:element name="given" type="xs:string"/>
      <xs:element name="family" type="xs:string"/>
      <xs:any namespace="##other" processContents="lax" 
              minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:anyAttribute/>
  </xs:complexType>
</xs:schema>]]></eg>
			</example>
			<p>The language designer and 3rd parties can now
use different namespaces for their versions.  The language designer makes a variety of choices, particularly the schema language, that affect their strategy for namespaces.  </p>
			<p>In general, an extension can be defined by a new
specification that makes a normative reference to the earlier specification and
then defines the new element. No permission should be needed from the authors
of the specification to make such an extension. In fact, the major design
point of XML namespaces is to allow decentralized extensions. The corollary is
that permission is required for extensions in the same namespace. A namespace
has an owner; non-owners changing the meaning of something can be harmful.</p>
			<p>Attribute extensions can be in any namespace because, in XML Schema, attributes are not subject to the determinism (Unique Particle Attribution) constraints that apply to elements.   In XML Schema, attributes are always unordered, and the mechanism for associating attributes with schema types
differs from the model group mechanism used for
elements.  We will discuss this important issue later in the finding.</p>
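			<p>A hedged illustration of an attribute extension follows.  The "ext" namespace and its "preferred" attribute are hypothetical and are not defined by this finding.  Note that an attribute wildcard written without a processContents attribute defaults to strict processing, so a validator would need a declaration for ext:preferred; a schema author who wants undeclared attributes to pass might instead specify processContents="lax" on the wildcard.</p>
			<example>
				<head>Illustrative attribute extension instance</head>
				<eg><![CDATA[<!-- ext:preferred is a hypothetical third-party attribute extension -->
<name xmlns="http://www.example.org/name/1"
      xmlns:ext="http://www.example.com/ext/1"
      ext:preferred="true">
  <given>Dave</given>
  <family>Orchard</family>
</name>]]></eg>
			</example>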
		</div1>
			<div1 id="identify">
			<head>Identifying and Extending XML Languages</head>
			<p>Designing extensibility into languages typically results
in systems that are more loosely coupled.  Extensibility allows authors to
change instances without going through a centralized authority, and may allow
the centralized authority greater opportunities for versioning.  The common
characteristic of a compatible change is the use of extensibility.</p>
			<p>A prime example of the benefits of extensibility is
HTML.  The first version of HTML was designed for extensibility; it said that
"unknown markup" may be encountered.  An example of this in action is the
addition of the IMG tag by the Mosaic browser team.  </p>

			<p>The first rule introduced in this Finding relating to
extensibility is:</p>
			<p role="practice">
Allow Extensibility rule: XML Languages SHOULD be extensible.</p>
			<p>A fundamental requirement for extensibility is to be able to
determine the language of elements and attributes. XML Namespaces
[<loc href="#XMLNamespaces">XML Namespaces 1.0</loc>] provide a mechanism for associating a
URI with an XML element or attribute name, thus specifying the
language of the name. This also serves to prevent name collisions.</p>
			<p>HTML did not have the ability to distinguish between the languages
of extensions.  This meant that authors could produce the same element
name but with different interpretations, and software would have no way of
determining which interpretation was applicable.  This is a great part of the
motivation to move from HTML to the XML vocabulary of HTML, XHTML.</p>
			<p>W3C XML Schema [<loc href="#XMLSchemaPart2">Part 2</loc>] provides a
mechanism called a wildcard, &lt;xs:any&gt;,
for controlling where elements from certain namespaces are allowed.  The
wildcard indicates that elements in specified namespaces are allowed in
texts where the wildcard occurs.  This allows extension
in a well-defined manner. A consumer of extended texts can
identify and, depending upon its processing model, safely ignore the extensions
it doesn't understand.</p>
			<p>&lt;xs:any&gt; uses the namespace attribute to control what namespaces
extension elements can come from. The most interesting values for this
attribute are: ##any, which means one can
extend the schema using an element from any possible namespace; ##other, which only allows extension elements from
namespaces other than the target namespace of the schema; and ##targetNamespace, which only allows extension
elements from the target namespace of the schema.</p>
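			<p>The three values can be sketched as alternative wildcard particles.  This is a fragment only: it assumes the xs prefix is bound to the XML Schema namespace and that one of the particles is placed inside a model group such as xs:sequence.</p>
			<example>
				<head>Illustrative wildcard namespace values</head>
				<eg><![CDATA[<!-- Extension elements from any namespace (or none) -->
<xs:any namespace="##any" minOccurs="0" maxOccurs="unbounded"/>

<!-- Extension elements only from namespaces other than the target namespace -->
<xs:any namespace="##other" minOccurs="0" maxOccurs="unbounded"/>

<!-- Extension elements only from the target namespace of the schema -->
<xs:any namespace="##targetNamespace" minOccurs="0" maxOccurs="unbounded"/>]]></eg>
			</example>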
			<p>&lt;xs:any&gt; uses the processContents attribute to control how an XML
Schema processor validates extension elements.   Permissible values are "lax" (validate
any elements for which a declaration is available but ignore all other elements),
"strict" (validate all elements), and "skip" (validate no elements). This Finding
recommends "lax" validation, as it is the most flexible and is the typical
choice for Web services specifications.</p>
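			<p>As a sketch, the same wildcard is shown below with each processContents value.  Again this is a fragment that assumes the xs prefix is bound to the XML Schema namespace; only one of the particles would appear at a given point in a content model.</p>
			<example>
				<head>Illustrative processContents values</head>
				<eg><![CDATA[<!-- lax: validate an extension element when a declaration for it can be found,
     otherwise accept it unvalidated -->
<xs:any namespace="##other" processContents="lax"
        minOccurs="0" maxOccurs="unbounded"/>

<!-- strict (the default when the attribute is omitted): a declaration must be
     available and the extension element must be valid against it -->
<xs:any namespace="##other" processContents="strict"
        minOccurs="0" maxOccurs="unbounded"/>

<!-- skip: extension elements only need to be well-formed -->
<xs:any namespace="##other" processContents="skip"
        minOccurs="0" maxOccurs="unbounded"/>]]></eg>
			</example>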
			<p>RDF/OWL and RELAX NG are two other popular technologies for schema design.  They have different mechanisms for allowing and controlling schema evolution.</p>
		</div1>
		<div1 id="ident">
			<head>Version Identification Strategies using XML Namespaces</head>
			<p>There is a large variety of version identification designs. They range from many namespaces to a single namespace for all versions of a language.  A few
of the most common are listed below and described in more detail
later.</p>
			<olist>
				<item>
					<p>all components in new namespace(s) for each version</p>
					<p>i.e.,
version 1 consists of namespaces a + b, version 1.1 consists of
namespaces c + d; or version 1 consists of namespace a, version 1.1
consists of namespace b.</p>
				</item>
				<item>
					<p>all new components in new namespace(s) for each compatible version</p>
					<p>i.e., version 1 consists of namespaces a +
b; version 1.1 consists of namespaces a + b + c; version 2.0 consists
of namespaces d + e.</p>
				</item>
				<item>
					<p>all new components in existing or new namespace(s) for each
compatible version</p>
					<p>i.e., version 1 consists of namespace a, version 1.1
consists of namespace a, version 2 consists of namespace b; or version
1 consists of namespace a, version 1.1 consists of namespace a +
b.</p>
				</item>
				<item>
					<p>all new components in existing or new namespace(s) for each
version and a version identifier</p>
					<p>i.e., version 1 consists of namespaces a
+ b + version attribute "1", version 2 consists of namespaces c + d +
version attribute "2".</p>
				</item>
				<item>
					<p>all components in existing namespace(s) for each
version (compatible and incompatible) and a version identifier</p>
					<p>i.e., version 1 consists of namespace a
+ version attribute "1.0", version 1.1 consists of namespace a +
version attribute "1.1", version 2.0 consists of namespace a + version attribute "2.0".</p>
				</item>
			</olist>
			<p>Whatever the design chosen, the language designer must
decide the component name, namespace name, and any version identifier for new
and all existing components. The trade-offs between the decisions relate to
the importance of:</p>
			<ulist>
				<item>
					<p>Supporting compatible evolution.</p>
				</item>
				<item>
					<p>Using namespaces to identify compatible components.
Changing a namespace name is typically a very invasive change.</p>
				</item>
				<item>
					<p>Having a complete schema for the language. We will see how
some designs preclude a full schema description.</p>
				</item>
				<item>
					<p>Using generic tools that rely only on XML and namespaces (precluding vocabulary-specific
version identifiers).  This itself is a trade-off because some generic XML tools (like XPath) are more difficult to use with multiple namespaces containing the same "thing", like XHTML's P element.</p>
				</item>
			</ulist>
			<p>Elaborating on these designs is illustrative.</p>
			<div2>
				<head>Version
Strategy: all components in new namespace(s) for each version
(#1)</head>
				<p>The following names would be valid:</p>
				<example>
					<head>All components in new namespace(s) instances</head>
					<eg><![CDATA[<name xmlns="http://www.example.org/name/1">
  <given>Dave</given>
  <family>Orchard</family>
</name>

<name xmlns="http://www.example.org/name/2">
  <given>Dave</given>
  <family>Orchard</family>
  <middle>Bryce</middle>
</name>

<name xmlns="http://www.example.org/name/3">
  <given>Dave</given>
  <family>Orchard</family>
  <mid:middle xmlns:mid="http://www.example.org/name/3/mid/1">Bryce</mid:middle>
</name>

<name xmlns="http://www.example.org/name/3">
  <given>Dave</given>
  <family>Orchard</family>
  <middiffdomain:middle xmlns:middiffdomain="http://www.example.com/mid/1">Bryce</middiffdomain:middle>
</name>]]></eg>
				</example>
				<p>The 2<sup>nd</sup> example shows all the components in the same new namespace. The 3<sup>rd</sup> and 4<sup>th</sup>
examples show an additional middle element in 2 different namespace names. The middle element in the 3<sup>rd</sup> example comes from a namespace name that is
in the same domain as the name element’s new namespace name.  One reason for 2 namespaces is to modularize the language. The 4<sup>th</sup>
example shows a namespace name from a different domain for the middle. It is
probable that the mid:middle was created by the name author, and the
middiffdomain:middle was created by a 3rd party.</p>
<p>This strategy generally results in incompatible changes.  When an older consumer receives new texts in the new namespace, most of its software will break; schema validation, for example, fails without the new schema.  Achieving forwards compatibility requires careful selection of technologies, such as XPath expressions that are namespace agnostic.  For some systems that have adopted this strategy, making each change incompatible is the design goal.</p>
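				<p>One way to write namespace-agnostic XPath expressions is to match on local names only.  The XSLT 1.0 stylesheet below is a minimal sketch of that idea, chosen here for illustration; it extracts the given and family names from any of the instances above regardless of the namespace name of the name element.  The cost is that it can no longer distinguish a name element from an unrelated element that happens to share the local name.</p>
				<example>
					<head>Illustrative namespace-agnostic stylesheet</head>
					<eg><![CDATA[<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- local-name() ignores the namespace name, so name/1, name/2 and name/3 all match -->
    <xsl:value-of select="/*[local-name()='name']/*[local-name()='given']"/>
    <xsl:text> </xsl:text>
    <xsl:value-of select="/*[local-name()='name']/*[local-name()='family']"/>
  </xsl:template>
</xsl:stylesheet>]]></eg>
				</example>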
			</div2>
			<div2>
				<head>Version
Strategy: all new components in new namespace(s) for each compatible version
(#2)</head>
				<p>In this strategy, the following names would be valid:</p>
				<example>
					<head>New components in new namespace(s) instances</head>
					<eg><![CDATA[<name xmlns="http://www.example.org/name/1">
  <given>Dave</given>
  <family>Orchard</family>
</name>

<name xmlns="http://www.example.org/name/1">
  <given>Dave</given>
  <family>Orchard</family>
  <mid:middle xmlns:mid="http://www.example.org/name/mid/1">Bryce</mid:middle>
</name>

<name xmlns="http://www.example.org/name/1">
  <given>Dave</given>
  <family>Orchard</family>
  <middiffdomain:middle xmlns:middiffdomain="http://www.example.com/mid/1">Bryce</middiffdomain:middle>
</name>]]></eg>
				</example>
				<p>The 2<sup>nd</sup> and 3<sup>rd</sup>
examples show an additional middle element in 2 different namespace names. The
first middle, in the 2<sup>nd</sup> example, comes from a namespace name that is
in the same domain as the name element’s namespace name. The 3<sup>rd</sup>
example shows a completely different namespace name for the middle. It is
probable that the mid:middle was created by the name author, and the
middiffdomain:middle was created by a 3rd party.</p>
			</div2>
			<div2>
				<head>Version
Strategy: all new components in new or existing namespace(s) for each compatible version
(#3)</head>
				<p>In this strategy, the following names would be valid:</p>
				<example>
					<head>New components in new or existing namespace(s) instances</head>
					<eg><![CDATA[<name xmlns="http://www.example.org/name/1">
  <given>Dave</given>
  <family>Orchard</family>
</name>

<name xmlns="http://www.example.org/name/1">
  <given>Dave</given>
  <family>Orchard</family>
  <middle>Bryce</middle>
</name>

<name xmlns="http://www.example.org/name/1">
  <given>Dave</given>
  <family>Orchard</family>
  <mid:middle xmlns:mid="http://www.example.org/name/mid/1">Bryce</mid:middle>
</name>

<name xmlns="http://www.example.org/name/1">
  <given>Dave</given>
  <family>Orchard</family>
  <middiffdomain:middle xmlns:middiffdomain="http://www.example.com/mid/1">Bryce</middiffdomain:middle>
</name>]]></eg>
				</example>
				<p>The 2<sup>nd</sup> example shows the use of the optional
middle name in the name namespace. The 3<sup>rd</sup> and 4<sup>th</sup>
examples show an additional middle element in 2 different namespace names. The
first middle, in the 3<sup>rd</sup> example, comes from a namespace name that is
in the same domain as the name element’s namespace name. The 4<sup>th</sup>
example shows a completely different namespace name for the middle. It is
probable that the mid:middle was created by the name author, and the
middiffdomain:middle was created by a 3rd party.</p>
			</div2>
			<div2>
				<head>Version Strategy: all new
components in existing or new namespace(s) for each version and a version
identifier (#4)</head>
				<p>Using a version identifier, the name instances would
change to show the version of the name they use, such as:</p>
				<example>
					<head>New components in existing or new namespace(s)
with version identifier instances</head>
					<eg><![CDATA[<name xmlns="http://www.example.org/name/1" version="1.0">
  <given>Dave</given>
  <family>Orchard</family>
</name>

<name xmlns="http://www.example.org/name/1" version="1.1">
  <given>Dave</given>
  <family>Orchard</family>
  <middle>Bryce</middle>
</name>

<name xmlns="http://www.example.org/name/1" version="1.1">
  <given>Dave</given>
  <family>Orchard</family>
  <mid:middle xmlns:mid="http://www.example.org/name/mid/1">Bryce</mid:middle>
</name>

<name xmlns="http://www.example.org/name/1" version="1.0">
  <given>Dave</given>
  <family>Orchard</family>
  <mid:middle xmlns:mid="http://www.example.org/name/mid/1">Bryce</mid:middle>
</name>

<name xmlns="http://www.example.org/name/1" version="2.0">
  <given>Dave</given>
  <family>Orchard</family>
  <mid:middle xmlns:mid="http://www.example.org/name/mid/1">Bryce</mid:middle>
</name>

<name xmlns="http://www.example.org/name/2" version="2.0">
  <given>Dave</given>
  <family>Orchard</family>
  <middle>Bryce</middle>
</name>]]></eg>
				</example>
				<p>The last two examples show that the middle is now a mandatory
part of the name.   This is indicated by just the version number or a new namespace plus version number.</p>
				<p>A significant downside with using version identifiers is that
software that supports both versions of the name must perform special
processing on top of XML and namespaces. For example, many components
"bind" XML types into particular programming language types. Custom
software must process the version attribute before using any of the
"binding" software. In Web services, toolkits often take SOAP body
content, parse it into types and invoke methods on the types. There
are rarely "hooks" for the custom code to intercept processing between
the "SOAP" processing and the "name" processing. Further, if version
attributes are used by any 3rd party extensions (say
mid:middle has a version), then the schema cannot refer to the correct
middle.</p>
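				<p>The kind of special processing described above can be sketched as a dispatch on the version attribute.  The XSLT 1.0 stylesheet below is only an illustration under stated assumptions: the set of supported versions ("1.0" and "1.1") and the decision to stop on any other version are choices made for this sketch, not requirements of this finding.</p>
				<example>
					<head>Illustrative dispatch on a version identifier</head>
					<eg><![CDATA[<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:n="http://www.example.org/name/1">
  <xsl:output method="text"/>

  <!-- Versions this consumer understands; the predicate gives this
       template a higher default priority than the fallback below -->
  <xsl:template match="n:name[@version='1.0' or @version='1.1']">
    <xsl:value-of select="concat(n:given, ' ', n:family)"/>
  </xsl:template>

  <!-- Any other (or missing) version identifier is treated as not understood -->
  <xsl:template match="n:name">
    <xsl:message terminate="yes">Unsupported name version</xsl:message>
  </xsl:template>
</xsl:stylesheet>]]></eg>
				</example>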
			</div2>
			<div2>
				<head>Version Strategy: all
components in existing namespace(s) for each version and a version
identifier (#5)</head>
				<p>Using a version identifier, the name instances would
change to show the version of the name they use, such as:</p>
				<example>
					<head>All components in existing namespace(s)
with version identifier instances</head>
					<eg><![CDATA[<name xmlns="http://www.example.org/name/1" version="1.0">
  <given>Dave</given>
  <family>Orchard</family>
</name>

<name xmlns="http://www.example.org/name/1" version="1.1">
  <given>Dave</given>
  <family>Orchard</family>
  <middle>Bryce</middle>
</name>

<name xmlns="http://www.example.org/name/1" version="2.0">
  <given>Dave</given>
  <family>Orchard</family>
  <middle>Bryce</middle>
</name>]]></eg>
				</example>
				<p>The 2<sup>nd</sup> example shows that the middle is an optional part of the name.  The last example shows that the middle is a mandatory
part of the name. </p>
				<p>A downside with using new namespace names is that some tools, like XPath, can be harder to use in the face of new namespace names.  Software that extracts the given and family name based upon the expanded name will often break if a new namespace name is used. </p>
			</div2>
		</div1>
		<div1>
			<head>Indicating Incompatible changes</head>
			<p>Given adoption of the Must Ignore rule, it is often the
case that the creator of an extension or a new version wants to require that
the consumer understand the extension, overriding the Must Ignore rule. The
previous section showed how a version author could use new namespace names,
element names, or version numbers to indicate an incompatible change. An
extension author does not have these mechanisms available for indicating an
incompatible or mandatory extension. A language provider that wants to allow
extension authors to indicate an incompatible extension must provide a mechanism
for indicating that consumers must understand the extension.</p>
			<p role="practice">Provide Must Understand Rule: Container languages
SHOULD provide a "Must Understand" model for indicating extensions that override a default Must Ignore Rule.</p>
			<p>This rule and the Must Ignore rule work together to
provide a stable and flexible processing model for extensions.</p>
			<p>Must Understand flag</p>
			<p>Arguably the simplest and most flexible override
technique is a Must Understand flag that
indicates whether the item must be understood. The <loc href="#soap12">SOAP</loc>,
<loc href="#wsdl11">WSDL</loc>, and <loc href="#ws-policy">WS-Policy</loc>
attributes and values for specifying that an item must be understood are, respectively: soap:mustUnderstand="1", wsdl:required="1", and
wsp:Usage="wsp:Required". SOAP is probably
the most common case of a container that provides a Must
Understand model. The default value of the SOAP flag is 0,
which is effectively the Must Ignore rule.</p>
			<p>A language designer can re-use an existing Must
Understand model by constraining their language to an existing Must Understand
model. A number of Web services specifications have done this by specifying
that the components are SOAP header blocks, which explicitly brings in the SOAP
Must Understand model.</p>
			<p>A language designer can design a Must Understand model
into their language. A Must Understand flag allows the producer to insert
extensions into the container and use the Must Understand attribute to
override the Must
Ignore rule. This allows producers to extend instances without
changing the extension element’s parent’s namespace, retaining backwards
compatibility. Obviously the consumer must be extended to handle new extensions,
but there is now a loose coupling between the language’s processing model and
the extension’s processing model. A schema that provides a Must Understand flag is shown below:</p>
			<example>
				<head>New components in new namespace(s)
with Must Understand</head>
				<eg><![CDATA[<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
      targetNamespace="http://www.example.org/name/1" 
      xmlns:name="http://www.example.org/name/1"> 

  <xs:complexType name="name">
    <xs:sequence>
      <xs:element name="given" type="xs:string"/>
      <xs:element name="family" type="xs:string"/>
      <xs:any namespace="##other" processContents="lax" 
            minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute ref="name:mustUnderstand"/>
    <xs:anyAttribute/>
  </xs:complexType>

  <xs:attribute name="mustUnderstand" type="xs:boolean"/>
</xs:schema>]]></eg>
			</example>
			<p>An example of an instance of a 3rd party
indicating that a middle component is an incompatible change:</p>
			<example>
				<head>New components in existing or new namespace(s)
instance with Must Understand</head>
				<eg><![CDATA[<name xmlns="http://www.example.org/name/1"
      xmlns:name="http://www.example.org/name/1">
  <given>Dave</given>
  <family>Orchard</family>
  <mid:middle xmlns:mid="http://www.example.org/name/mid/1"
                name:mustUnderstand="true">
      Bryce
  </mid:middle>
</name>]]></eg>
			</example>
			<p>Specification of a Must Understand flag must be treated
carefully as it can be computationally expensive. Typically a processor will
either scan for Must Understand components to ensure it can process
the entire text, or incrementally process the instance and be prepared to
roll back or undo any processing if a Must Understand component that it does not understand is found.</p>
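			<p>The first approach, scanning before processing, can be sketched in XSLT 1.0 as follows.  The sketch assumes a consumer that understands only the name namespace and that accepts both "true" and "1" as flag values; these assumptions are for illustration and are not prescribed by this finding.</p>
			<example>
				<head>Illustrative scan for Must Understand components</head>
				<eg><![CDATA[<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:name="http://www.example.org/name/1">

  <xsl:template match="/">
    <!-- Scan the whole text first: a flagged element in a namespace
         this consumer does not understand stops all processing -->
    <xsl:if test="//*[(@name:mustUnderstand='true' or @name:mustUnderstand='1')
                      and namespace-uri() != 'http://www.example.org/name/1']">
      <xsl:message terminate="yes">An extension marked Must Understand is not understood</xsl:message>
    </xsl:if>
    <!-- Only reached when every flagged component is understood -->
    <xsl:apply-templates/>
  </xsl:template>
</xsl:stylesheet>]]></eg>
			</example>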
			<p>There are other refinements related to Must Understand. 
One example is providing an element that indicates which extension namespaces
must be understood, which avoids the scan of the instance for Must Understand
flags.</p>
			<p>It is also possible to re-use the SOAP processing model with its mustUnderstand attribute.</p>
			<example>
				<head>Using SOAP Must Understand</head>
				<eg><![CDATA[<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope">
  <soap:Body>
    <name xmlns="http://www.example.org/name/1">
      <given>Dave</given>
      <family>Orchard</family>
    </name>
  </soap:Body>
</soap:Envelope>

<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope">
  <soap:Header>
    <mid:middle xmlns:mid="http://www.example.org/name/mid/1"
                soap:mustUnderstand="true">
      Bryce
    </mid:middle>
  </soap:Header>
  <soap:Body>
    <name xmlns="http://www.example.org/name/1">
      <given>Dave</given>
      <family>Orchard</family>
    </name>
  </soap:Body>
</soap:Envelope>]]></eg>
			</example>
			<div2>
				<head>Type changes</head>
				<p>The various schema languages have a variety of defined mechanisms to indicate extensions that are incompatible.  For example, XML Schema provides Type Extension and Substitution Groups.  These mechanisms are described in Part 2.</p>
			</div2>
		</div1>
		
		<div1 id="identUsingSchema">
			<head>Schemas for Version Identification Strategies</head>
			<div2>
				<head>Version
Strategy: all components in new namespace(s) for each version
(#1)</head>
				<p>Using XML Schema 1.0, the name owner might like to write a schema such as:</p>
				<example>
					<head>All components in new namespace(s) schema</head>
				<eg><![CDATA[<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
      targetNamespace="http://www.openuri.org/name/1" 
      xmlns:name="http://www.openuri.org/name/1"> 

  <xs:complexType name="name">
    <xs:sequence>
      <xs:element name="first" type="xs:string"/>
      <xs:element name="last" type="xs:string"/>
      <xs:any namespace="##other" processContents="lax" 
              minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:anyAttribute/>
  </xs:complexType>
</xs:schema>]]></eg>
	
				</example>
				<p>The next version of the schema, with middle name added, might look like:</p>
			<example>
				<head>New components in new namespace(s) with Type Extension schema</head>
				<eg><![CDATA[<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
      targetNamespace="http://www.openuri.org/name/2" 
      xmlns:name="http://www.openuri.org/name/1"
      xmlns:name2="http://www.openuri.org/name/2">

  <xs:import namespace="http://www.openuri.org/name/1"/>

  <xs:complexType name="name">
    <xs:complexContent>
      <xs:extension base="name:name">
        <xs:sequence>
          <xs:element name="middle" type="xs:string" minOccurs="0"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
</xs:schema>]]></eg>
			</example>
			<p>This schema is illegal because the middle element in the second name
namespace and the wildcard with ##other are non-deterministic. The next strategy
discusses non-determinism in more detail. An alternative is to omit the wildcard
entirely and rely upon subtyping for extension. But this prevents any kind of
compatible evolution, as both sides must have the new schema to understand the
type. The language designer has to choose between allowing compatible
extensibility/versioning or incompatible extensibility when subtyping is used.</p>
<p>Because of the type extension determinism problem, the language designer cannot re-use the existing name definition.  They must create a new schema without any reference to the previous schema.</p>
<example>
					<head>All components in new namespace(s) schema, no-reuse of v1 schema</head>
				<eg><![CDATA[<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
      targetNamespace="http://www.openuri.org/name/2" 
      xmlns:name="http://www.openuri.org/name/2"> 

  <xs:complexType name="name">
    <xs:sequence>
      <xs:element name="first" type="xs:string"/>
      <xs:element name="last" type="xs:string"/>
      <xs:element name="middle" type="xs:string" minOccurs="0"/>
      <xs:any namespace="##other" processContents="lax" 
              minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:anyAttribute/>
  </xs:complexType>
</xs:schema>]]></eg>	
		</example>
<p>The new namespace for all components does not allow compatible evolution by the language designer, unless they choose to put new components in a new namespace, which is strategy #2.  Additionally, the version 2 schema cannot re-use the existing type definition.</p>
	</div2>
	
	
			<div2>
				<head>Version Strategy: all new components in new namespace(s) for each compatible version (#2)</head>
				<p>We previously saw how re-use by importing and extending schemas with wildcards is not possible.  In this strategy, the schema designer attempts to insert the new extension in the existing schema definition, like:</p>
				<example>
					<head>Illegal New components in new namespace(s) schema</head>
				<eg><![CDATA[<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
      targetNamespace="http://www.openuri.org/name/1" 
      xmlns:name="http://www.openuri.org/name/1"
      xmlns:mid="http://www.openuri.org/name/mid/1">

  <xs:complexType name="name">
    <xs:sequence>
      <xs:element name="first" type="xs:string"/>
      <xs:element name="last" type="xs:string"/>
      <xs:element ref="mid:middle" minOccurs="0"/>
      <xs:any namespace="##other" processContents="lax" 
              minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:anyAttribute/>
  </xs:complexType>
</xs:schema>

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
      targetNamespace="http://www.openuri.org/name/mid/1" 
      xmlns:mid="http://www.openuri.org/name/mid/1"> 
  <xs:element name="middle" type="xs:string"/>
</xs:schema>]]></eg>
	
				</example>
				
				<p>However, the determinism constraint of XML Schema,
described in more detail later, prevents this from working. The problem arises
when an optional element is followed by a wildcard. In this example, this
occurs when an optional element is added in a new version and extensibility is
still desired. This is an ungentle introduction to the difference between
extensibility and versioning. An optional middle name added in a subsequent
version is a good example. Consumers should be able to continue processing if
they don’t understand an additional optional middle name, and we want to keep
the extensibility point in the new version. But we cannot write a schema that
contains both the optional middle name and a wildcard for extensibility. The
previous schema is roughly what is desired using wildcards, but it is illegal
because of the determinism constraint.</p>
<p>The author has 4 options for the v2 schema for name and middle, listed below and detailed
subsequently:</p>
				<olist>
					<item>
						<p>optional middle, extensibility retained, but name type does not refer to
middle;</p>
					</item>
					<item>
						<p>optional middle, extensibility is lost, name type refers to middle;</p>
					</item>
					<item>
						<p>required middle, extensibility retained, name type refers to middle but
compatibility is lost (essentially strategy #1);</p>
					</item>
					<item>
						<p>no update to the Schema</p>
					</item>
				</olist>
				<p>If they leave the middle as optional and retain the
extensibility point, the best schema that they can write is:</p>
				<example>
					<head>New components in new namespace(s) schema V2,
no change to name type</head>
					<eg><![CDATA[<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
      targetNamespace="http://www.openuri.org/name/1" 
      xmlns:name="http://www.openuri.org/name/1"> 

  <xs:complexType name="name">
    <xs:sequence>
      <xs:element name="first" type="xs:string"/>
      <xs:element name="last" type="xs:string"/>
      <xs:any namespace="##other" processContents="lax" 
              minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:anyAttribute/>
  </xs:complexType>
</xs:schema>

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
      targetNamespace="http://www.openuri.org/name/mid/1" 
      xmlns:mid="http://www.openuri.org/name/mid/1"> 
  <xs:element name="middle" type="xs:string"/>
</xs:schema>]]></eg>
				</example>
				<p>This is not a very helpful XML Schema change. The problem is that
they cannot insert the reference to the optional mid:middle element
in the name schema and retain the extensibility point because of the
aforementioned determinism constraint.</p>
				<p>The core of the problem is that there is no mechanism for
constraining the content of a wildcard. For example, imagine that ns1 contains
foo and bar. Using just W3C XML Schema constructs, it is not possible to take
the SOAP schema (an example of a schema with a wildcard) and require that the
ns1:foo element must be a child of the header element and that ns1:bar must not
be a child of the header element. Indeed, the need for this functionality
spawned some of the WSDL functionality.</p>
				<p>They could decide to lose the extensibility point (option
#2), such as:</p>
				<example>
					<head>New components in new namespace(s) schema V2,
no extensibility</head>
					<eg><![CDATA[<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
      targetNamespace="http://www.openuri.org/name/1" 
      xmlns:name="http://www.openuri.org/name/1"
      xmlns:mid="http://www.openuri.org/name/mid/1">

  <xs:complexType name="name">
    <xs:sequence>
      <xs:element name="first" type="xs:string"/>
      <xs:element name="last" type="xs:string"/>
      <xs:element ref="mid:middle" minOccurs="0"/>
    </xs:sequence>
    <xs:anyAttribute/>
  </xs:complexType>
</xs:schema>

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
      targetNamespace="http://www.openuri.org/name/mid/1" 
      xmlns:mid="http://www.openuri.org/name/mid/1"> 
  <xs:element name="middle" type="xs:string"/>
</xs:schema>]]></eg>
				</example>
				<p>This does lose the possibility for forwards-compatible evolution.</p>
				<p>Option #3 is adding a required middle. The author
must indicate that the change is incompatible, which can be done by creating a
new namespace name for the name element.  This is essentially strategy #1, new namespace for all components.</p>
				<p>The downsides of the 3 options for new components in new
namespace name(s) design have been described. Additionally, the design can
result in specifications and namespaces that are inappropriately factored, as
related constructs will be in separate namespaces.</p>
<p>The "new components in new namespace(s)" versioning strategy implies a rule for namespaces of:</p>
			<p role="practice">Allow Extensions in Other Namespace rule: The
extensibility point SHOULD at least allow for extension in other
namespaces.</p>
			<p>The rule for allowing extensibility:</p>
			<p role="practice">Full Extensibility rule: All XML Elements
SHOULD allow for element extensibility after element definitions, and
allow any attributes.</p>
			</div2>
			<div2>
				<head>Version Strategy: All new components
in existing or new namespace(s) for each compatible version(#3)</head>
				<p>It is possible to create Schemas with additional optional
components. This requires re-using the namespace name for optional components
and special schema design techniques. The re-using namespace rule is:</p>
				<p role="practice">Re-use namespace names Rule: If a backwards
compatible change can be made to a specification, then the old
namespace name SHOULD be used in conjunction with XML’s extensibility
model.</p>
				<p>It is important to note that a new namespace name is not required whenever a
specification evolves, as it is in strategies #1 and #2; rather, a new namespace name is required only if an incompatible change is made.  Strategy #1 uses a new namespace for all existing components and any additions. Strategy #2 uses a new namespace for all additions.  Strategy #3 re-uses namespaces for compatible extensions.</p>
				<p role="practice">New namespaces to break Rule: A new namespace name
is used when backwards compatibility is not permitted, that is
software MUST break if it does not understand the new language
components.</p>
				<p>Earlier examples showed that it is not possible to have a
wildcard with ##any (or even ##targetNamespace) following optional elements in
the target namespace. The solution to this problem is to introduce an element
in the schema that will always appear if the extension appears. The content
model of the extensibility point is then that element plus the extension. There
are two styles for this. The first was published in an earlier version of this
Finding in December 2003. It uses an Extensibility element with the extensions
nested inside. The second was published in July 2004, then updated on MSDN. 
It uses a Sentry or Marker element with extensions following it.</p>
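				<p>The second style is not shown elsewhere in this section. The following is a minimal sketch of the general idea, not necessarily the published design: the element name Sentry, its empty type, and the use of elementFormDefault="qualified" are assumptions made for illustration. The marker is required whenever target namespace extensions appear, so the wildcard that follows it never competes with an optional element.</p>
				<example>
					<head>Sketch of a Sentry element followed by extensions (hypothetical)</head>
					<eg><![CDATA[<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
      targetNamespace="http://www.openuri.org/name/1"
      xmlns:name="http://www.openuri.org/name/1"
      elementFormDefault="qualified">

  <xs:complexType name="name">
    <xs:sequence>
      <xs:element name="first" type="xs:string"/>
      <xs:element name="last" type="xs:string"/>
      <!-- Assumed marker: target namespace extensions may only follow it -->
      <xs:sequence minOccurs="0">
        <xs:element name="Sentry">
          <xs:complexType/>
        </xs:element>
        <xs:any namespace="##targetNamespace" processContents="lax"
                minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:any namespace="##other" processContents="lax"
              minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:anyAttribute/>
  </xs:complexType>
</xs:schema>

<!-- An instance using the marker -->
<name xmlns="http://www.openuri.org/name/1">
  <first>Dave</first>
  <last>Orchard</last>
  <Sentry/>
  <middle>Bryce</middle>
</name>]]></eg>
				</example>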
				<p>A name type
with extension elements is:</p>
				<example>
					<head>New components in existing or new namespace(s)
with Extension Type Schema version 1</head>
					<eg><![CDATA[<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
      targetNamespace="http://www.openuri.org/name/1" 
      xmlns:name="http://www.openuri.org/name/1"> 

  <xs:complexType name="name">
    <xs:sequence>
      <xs:element name="first" type="xs:string"/>
      <xs:element name="last" type="xs:string"/>
      <xs:element name="Extension" type="name:ExtensionType" 
                 minOccurs="0" maxOccurs="1"/>
      <xs:any namespace="##other" processContents="lax" 
              minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:anyAttribute/>
  </xs:complexType>

  <xs:complexType name="ExtensionType">
    <xs:sequence>
      <xs:any processContents="lax" minOccurs="1" 
              maxOccurs="unbounded" namespace="##targetNamespace"/>
    </xs:sequence>
    <xs:anyAttribute/>
  </xs:complexType> 
</xs:schema>]]></eg>
				</example>
				<p>Because each extension in the target namespace is inside an
Extension element, each subsequent target namespace extension will
increase nesting by another layer. While this layer of nesting per
extension is not desirable, it is what can be accomplished today when
applying strict XML Schema validation. It seems to at least this
author that potentially having multiple nested elements is worthwhile
if multiple compatible revisions can be made to a language. This
technique allows validation of extensions in the target namespace while
retaining validation of the target namespace itself.</p>
				<p>The previous schema allows the following sample name:</p>
				<example>
					<head>New components in existing or new namespace(s)
with Extension Type instances</head>
					<eg><![CDATA[<name xmlns="http://www.openuri.org/name/1">
  <first>Dave</first>
  <last>Orchard</last>
  <Extension>
    <middle>Bryce</middle>
  </Extension>
</name>]]></eg>
				</example>
				<p>The namespace author can create a schema for this type:</p>
				<example>
					<head>New components in existing or new namespace(s)
with Extension Type Schema version 2</head>
					<eg><![CDATA[<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
      targetNamespace="http://www.openuri.org/name/1" 
      xmlns:name="http://www.openuri.org/name/1"> 

  <xs:complexType name="name">
    <xs:sequence>
      <xs:element name="first" type="xs:string"/>
      <xs:element name="last" type="xs:string"/>
      <xs:element name="Extension" type="name:middleExtensionType" 
                 minOccurs="0" maxOccurs="1"/>
      <xs:any namespace="##other" processContents="lax" 
              minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:anyAttribute/>
  </xs:complexType>

  <xs:complexType name="middleExtensionType">
    <xs:sequence>
       <xs:element name="middle" type="xs:string"/>
       <xs:element name="Extension" type="name:ExtensionType" 
                   minOccurs="0" maxOccurs="1"/>
    </xs:sequence>
    <xs:anyAttribute/>
  </xs:complexType> 

  <xs:complexType name="ExtensionType">
    <xs:sequence>
      <xs:any processContents="lax" minOccurs="1" 
              maxOccurs="unbounded" namespace="##targetNamespace"/>
    </xs:sequence>
    <xs:anyAttribute/>
  </xs:complexType> 
</xs:schema>]]></eg>
				</example>
				<p>The advantage of this design technique is that a forwards
and backwards compatible Schema V2 can be written. The V2 schema can validate
documents with or without the middle, and the V1 schema can validate documents
with or without the middle.</p>
				<p>Further, the re-use of the same namespace has better
tooling support. Many applications use a single schema to create the
equivalent programming constructs. These tools often work best with single
namespace support for the “generated” constructs. The re-use of the namespace
name allows at least the namespace author to make changes to the namespace and
perform validation of the extensions.</p>
				<p>An obvious downside of this approach is the complexity of
the schema design. Another downside is that changes are linear, so 2
potentially parallel extensions must be nested rather than parallel.</p>
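				<p>To illustrate the linear nesting, the following sketch assumes a hypothetical later revision that adds a suffix element, which is not part of this Finding's examples. Each compatible revision adds one more Extension layer, and the instance is intended to remain acceptable to the version 1 ExtensionType wildcard.</p>
				<example>
					<head>Sketch of a second nested extension (hypothetical suffix element)</head>
					<eg><![CDATA[<name xmlns="http://www.openuri.org/name/1">
  <first>Dave</first>
  <last>Orchard</last>
  <Extension>
    <middle>Bryce</middle>
    <!-- Assumed second revision: nested one layer deeper -->
    <Extension>
      <suffix>Jr</suffix>
    </Extension>
  </Extension>
</name>]]></eg>
				</example>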
			</div2>
			<div2>
				<head>Version Strategy: all new
components in existing or new namespace(s) for each version and a version
identifier(#4)</head>
				<p>Using a version identifier, the name instances would
change to show the version of the name they use, such as:</p>
				<example>
					<head>New components in existing or new namespace(s)
with version identifier instances</head>
					<eg><![CDATA[<name xmlns="http://www.openuri.org/name/1" version="1.0">
  <first>Dave</first>
  <last>Orchard</last>
</name>

<name xmlns="http://www.openuri.org/name/1" version="1.0">
  <first>Dave</first>
  <last>Orchard</last>
  <middle>Bryce</middle>
</name>

<name xmlns="http://www.openuri.org/name/1" version="1.1">
  <first>Dave</first>
  <last>Orchard</last>
  <mid1:middle xmlns:mid1="http://www.openuri.org/name/mid/1">Bryce</mid1:middle>
</name>

<name xmlns="http://www.openuri.org/name/1" version="1.0">
  <first>Dave</first>
  <last>Orchard</last>
  <mid2:middle xmlns:mid2="http://www.example.org/name/mid/1">Bryce</mid2:middle>
</name>

<name xmlns="http://www.openuri.org/name/1" version="2.0">
  <first>Dave</first>
  <last>Orchard</last>
  <mid1:middle xmlns:mid1="http://www.openuri.org/name/mid/1">Bryce</mid1:middle>
</name>]]></eg>
				</example>
				<p>The last example shows that the middle is now a mandatory
part of the name. As with strategy #2, the schema for the optional middle cannot
fully express the content model. A schema for the mandatory middle is:</p>
				<example>
					<head>New components in existing or new namespace(s)
with version identifier schema v2, incompatible change</head>
					<eg><![CDATA[<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
      targetNamespace="http://www.openuri.org/name/1" 
      xmlns:name="http://www.openuri.org/name/1"
      xmlns:mid="http://www.openuri.org/name/mid/1">

  <xs:complexType name="name">
    <xs:sequence>
      <xs:element name="first" type="xs:string"/>
      <xs:element name="last" type="xs:string"/>
      <xs:element name="middle" type="xs:string" minOccurs="0"/>
      <xs:element ref="mid:middle"/>
      <xs:any namespace="##other" processContents="lax" 
              minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:anyAttribute/>
  </xs:complexType>
</xs:schema>

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
      targetNamespace="http://www.openuri.org/name/mid/1" 
      xmlns:mid="http://www.openuri.org/name/mid/1"> 
  <xs:element name="middle" type="xs:string"/>
</xs:schema>]]></eg>
				</example>
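				<p>The schema above does not declare the version attribute that the instances carry. One possible declaration is sketched below, assuming an unqualified attribute whose values look like "1.0" or "2.0"; the versionType name and its pattern are assumptions, not part of this Finding.</p>
				<example>
					<head>Sketch of a version attribute declaration (hypothetical)</head>
					<eg><![CDATA[<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
      targetNamespace="http://www.openuri.org/name/1"
      xmlns:name="http://www.openuri.org/name/1">

  <xs:complexType name="name">
    <xs:sequence>
      <xs:element name="first" type="xs:string"/>
      <xs:element name="last" type="xs:string"/>
      <xs:any namespace="##other" processContents="lax"
              minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <!-- Assumed declaration of the version identifier -->
    <xs:attribute name="version" type="name:versionType"/>
    <xs:anyAttribute/>
  </xs:complexType>

  <!-- Assumed lexical form: major.minor, for example 1.0 or 2.0 -->
  <xs:simpleType name="versionType">
    <xs:restriction base="xs:string">
      <xs:pattern value="[0-9]+\.[0-9]+"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>]]></eg>
				</example>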
				<p>A significant downside with using version identifiers is that
software that supports both versions of the name must perform special
processing on top of XML and namespaces. For example, many components
“bind” XML types into particular programming language types. Custom
software must process the version attribute before using any of the
“binding” software. In Web services, toolkits often take SOAP body
content, parse it into types and invoke methods on the types. There
are rarely “hooks” for the custom code to intercept processing between
the “SOAP” processing and the “name” processing. Further, if version
attributes are used by any 3rd party extensions (say
mid:middle has a version), then the schema cannot refer to the correct
middle.</p>
			</div2>
			
	</div1>
		<div1>
			<head>Indicating Incompatible changes</head>
			<div2>
				<head>Must Understand</head>
			<p>A schema for a Must Understand flag is provided below:</p>
			<example>
				<head>New components in existing or new namespace(s)
with Extension Type Schema and Must Understand</head>
				<eg><![CDATA[<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
      targetNamespace="http://www.openuri.org/name/1" 
      xmlns:name="http://www.openuri.org/name/1"> 

  <xs:complexType name="name">
    <xs:sequence>
      <xs:element name="first" type="xs:string"/>
      <xs:element name="last" type="xs:string"/>
      <xs:element name="middle" type="xs:string" minOccurs="0"/>
      <xs:element name="Extension" type="name:ExtensionType" 
                 minOccurs="0" maxOccurs="1"/>
      <xs:any namespace="##other" processContents="lax" 
            minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute ref="name:mustUnderstand"/>
    <xs:anyAttribute/>
  </xs:complexType>

  <xs:complexType name="ExtensionType">
    <xs:sequence>
      <xs:any processContents="lax" minOccurs="1" 
             maxOccurs="unbounded" namespace="##targetNamespace"/>
    </xs:sequence>
    <xs:anyAttribute/>
  </xs:complexType>

  <xs:attribute name="mustUnderstand" type="xs:boolean"/>
</xs:schema>]]></eg>
			</example>
			<p>An example of an instance of a 3rd party
indicating that a middle component is an incompatible change:</p>
			<example>
				<head>New components in existing or new namespace(s)
instance with Must Understand</head>
				<eg><![CDATA[<name xmlns="http://www.openuri.org/name/1"
      xmlns:name="http://www.openuri.org/name/1">
  <first>Dave</first>
  <last>Orchard</last>
  <mid2:middle xmlns:mid2="http://www.example.org/name/mid/1"
               name:mustUnderstand="true">
      Bryce
  </mid2:middle>
</name>]]></eg>
			</example>
	</div2>
	<div2>
		<head>Type extension</head>
	
			<p>Another option for indicating mandatory requirements is
allowing extension authors to use other schema mechanisms for extending the
main type, such as type extension. The language designer allows for type
extension and must specify that type extensions must be understood.  However, strategy #1 showed that it is impossible to write a schema that extends an existing type that has a wildcard with a component in a new namespace.  Type extension therefore does not enable extension authors
to indicate incompatible extensions.</p>
</div2>
<div2>
	<head>Substitution Groups</head>

			<p>Another mechanism for extending a type in XML Schema is
substitution groups. Substitution groups enable an element to be declared as
substitutable for another. This can only be used for incompatible extensions
as the consumer must understand the substitution type. Substitution groups
require that elements are available for substitution, so the name designer must
have provided a name element in addition to the name type.</p>
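			<p>A minimal sketch of such a substitution follows. It assumes the name designer declares a global name element over a type without a wildcard, and that an extension author owns the hypothetical namespace http://www.example.org/name/ext/1; the nameWithMiddle type and the ext prefix are illustrative names only.</p>
			<example>
				<head>Sketch of a substitution group for name (hypothetical extension namespace)</head>
				<eg><![CDATA[<!-- Language designer's schema: global element acts as the head -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
      targetNamespace="http://www.openuri.org/name/1"
      xmlns:name="http://www.openuri.org/name/1"
      elementFormDefault="qualified">

  <xs:complexType name="name">
    <xs:sequence>
      <xs:element name="first" type="xs:string"/>
      <xs:element name="last" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <xs:element name="name" type="name:name"/>
</xs:schema>

<!-- Extension author's schema: element substitutable for name:name -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
      targetNamespace="http://www.example.org/name/ext/1"
      xmlns:ext="http://www.example.org/name/ext/1"
      xmlns:name="http://www.openuri.org/name/1"
      elementFormDefault="qualified">

  <xs:import namespace="http://www.openuri.org/name/1"/>

  <xs:complexType name="nameWithMiddle">
    <xs:complexContent>
      <xs:extension base="name:name">
        <xs:sequence>
          <xs:element name="middle" type="xs:string"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <xs:element name="name" type="ext:nameWithMiddle"
              substitutionGroup="name:name"/>
</xs:schema>]]></eg>
			</example>
			<p>Wherever a schema refers to name:name, an instance may then carry ext:name instead, and a consumer that does not know the extension schema cannot validate it.</p>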
			<p>Substitution groups do allow a single extension author to
indicate that their changes are mandatory. The limitation is that the
extension author has now taken over the type’s extensibility. A visual way of
imagining this is that the type tree has now been moved from the language designer
over to the extension author. And the language designer probably does not
want their type to be “hijacked”.</p>
			<p>However, this is not substantially different from an
extension being marked with a “Must Understand”. In either case, whether the
extensions are higher up in the tree (sometimes called top-typing) or lower in
the tree (bottom-typing), a new type is effectively created.</p>
			<p>The difference is that there can only be 1 element at the
top of an element hierarchy. If multiple mandatory extensions are added, then
the only way to compose them together is at the bottom of the type because that
is where the extensibility is.</p>
			<p>Substitution groups do not allow a language designer and
an extension author to incompatibly change the language as they end up
conflicting over what to call the name element. Thus substitution groups are a
poor mechanism for allowing an extension author to indicate that their changes
are incompatible. A Must Understand flag is a superior method because it
allows multiple extension authors to mix their mandatory extensions with a
language designer’s versioning strategy. Hence language designers should
prevent substitution groups and provide a Must Understand flag or other model
when they wish to allow 3rd parties to make incompatible changes.</p>
			<p>In some cases, a language does not provide a Must Understand
mechanism. In the absence of a Must Understand model, the only way to force consumers
to reject a message if they don’t understand the extension namespace is to
change the namespace name of the root element, but this is rarely desirable.</p>
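			<p>As an illustration only, the following instance assumes that http://www.openuri.org/name/2 is chosen as the new namespace name for the root element, as in strategy #1. A consumer that knows only the version 1 namespace will not recognise the root element at all.</p>
			<example>
				<head>Sketch of forcing rejection by changing the root namespace name</head>
				<eg><![CDATA[<name xmlns="http://www.openuri.org/name/2">
  <first>Dave</first>
  <last>Orchard</last>
  <mid2:middle xmlns:mid2="http://www.example.org/name/mid/1">Bryce</mid2:middle>
</name>]]></eg>
			</example>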
</div2>
		</div1>
		<div1>
			<head>Determinism</head>
			<p>This Finding has devoted considerable space to
deterministic content models, so it is worth describing the W3C XML
Schema determinism rules in more detail. The reader is reminded that
other XML schema languages such as RELAX NG
do not use these rules and so do not suffer from the contortions one is forced
through when using W3C XML Schema. XML DTDs and W3C XML Schema have a rule
that requires schemas to have deterministic content models. From the XML 1.0
specification:</p>
			<p>“For example, the content model ((b, c) | (b, d)) is
non-deterministic, because given an initial b the XML processor cannot
know which b in the model is being matched without looking ahead to
see which element follows the b.”</p>
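			<p>Expressed as DTD declarations, the quoted model and a deterministic rewrite of it might look like the following sketch. The element name a is hypothetical; the rewrite collapses the two references to b into one.</p>
			<example>
				<head>Sketch of a non-deterministic content model and a deterministic equivalent</head>
				<eg><![CDATA[<!-- Non-deterministic: after reading b, the processor cannot tell which b matched -->
<!ELEMENT a ((b, c) | (b, d))>

<!-- Deterministic: a single b, followed by a choice of c or d -->
<!ELEMENT a (b, (c | d))>]]></eg>
			</example>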
			<p>The use of ##any means there are some schemas that we might like to
express, but that aren’t allowed.
</p>
			<ulist>
				<item>
					<p>Wildcards with ##any, where minOccurs does not equal
maxOccurs, are not allowed before an element declaration. An instance
of the element would be valid for the ##any or the element. ##other
could be used.</p>
				</item>
				<item>
					<p> The element before a wildcard with ##any must have
cardinality of maxOccurs equals its minOccurs. If these were
different, say minOccurs="1" and maxOccurs="2", then the optional
occurrences could match either the element definition or the ##any. As
a result of this rule, the minOccurs must be greater than zero.</p>
				</item>
				<item>
					<p>Derived types that add element definitions after a wildcard
with ##any must be avoided. If a derived type adds an element
definition after the wildcard, then an instance of the added element
definition could match either the wildcard or the derived element
definition.</p>
				</item>
			</ulist>
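			<p>The second rule in the list above can be sketched as follows. The type names optionalBeforeAny and requiredBeforeAny are hypothetical, and the first type is deliberately one that a W3C XML Schema processor is expected to reject.</p>
			<example>
				<head>Sketch of an element before a ##any wildcard</head>
				<eg><![CDATA[<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
      targetNamespace="http://www.openuri.org/name/1"
      xmlns:name="http://www.openuri.org/name/1">

  <!-- Not allowed: an instance middle element could match either the
       optional element declaration or the ##any wildcard -->
  <xs:complexType name="optionalBeforeAny">
    <xs:sequence>
      <xs:element name="last" type="xs:string"/>
      <xs:element name="middle" type="xs:string" minOccurs="0"/>
      <xs:any namespace="##any" processContents="lax"
              minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Allowed: minOccurs equals maxOccurs (both default to 1) for middle,
       so the first middle always matches the element declaration -->
  <xs:complexType name="requiredBeforeAny">
    <xs:sequence>
      <xs:element name="last" type="xs:string"/>
      <xs:element name="middle" type="xs:string"/>
      <xs:any namespace="##any" processContents="lax"
              minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>]]></eg>
			</example>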
			<p role="practice">Be Deterministic rule: Use of wildcards MUST be
deterministic. Location of wildcards, namespace of wildcard
extensions, minOccurs and maxOccurs values are constrained, and type
restriction is controlled.</p>

			<p>As shown earlier, a common design pattern is to provide
an extensibility point (not an element) allowing any namespace at the end of
a type. This is typically done with &lt;xs:any
namespace="##any"&gt;.</p>
			<p>Determinism makes this unworkable as a complete solution
in many cases. Firstly, the extensibility point can only occur after required
elements in the original schema, limiting the scope of extensibility in the
original schema. Secondly, backwards compatible changes require that the added
element is optional, which means it has minOccurs="0". 
Determinism prevents us from placing an element with minOccurs="0"
before an extensibility point of ##any. Thus, when adding an element at an
extensibility point, the author can make the element optional and lose the
extensibility point, or the author can make the element required and lose
backwards compatibility.</p>
		</div1>
		
		<div1>
			<head>Other technologies</head>
			<p>The W3C XML Schema Working Group has heard and taken to heart
many of these concerns. They have plans to remedy some of these issues in XML
Schema 1.1 [<loc href="#daveowritings">21</loc>]. They are currently looking at a
“weak wildcard” model, which solves some but not all of the problems. There is
no public Working Draft of a Schema 1.1 with improved extensibility or
versioning at the time of writing this Finding.</p>
			<p>A simple analysis of doing compatible extensibility and
versioning using RDF and OWL is available [<loc href="#daveowritings">21</loc>]. 
In general, RDF and OWL offer superior mechanisms for extensibility and
versioning. RDF and OWL explicitly allow extension components to be added to
components. Further, the RDF and OWL model builds in the notion of “Must
Ignore Unknowns”, as an RDF/OWL processor will absorb the extra components but
do nothing with them. An extension author can require that consumers
understand the extension by changing the type using a type extension
mechanism.   </p>
			<p>RELAX NG is another schema language. It explicitly allows
extension components to be added to other components as it does not have the
non-determinism constraint.</p>
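			<p>As a sketch in the RELAX NG XML syntax, the name example with an optional middle followed by an open extensibility point might be written as below. The pattern name anyElement is an assumption chosen for illustration; the grammar shows the combination that the W3C XML Schema determinism rules forbid.</p>
			<example>
				<head>Sketch of an optional middle followed by a wildcard in RELAX NG</head>
				<eg><![CDATA[<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         ns="http://www.openuri.org/name/1">
  <start>
    <element name="name">
      <element name="first"><text/></element>
      <element name="last"><text/></element>
      <optional>
        <element name="middle"><text/></element>
      </optional>
      <!-- Extensibility point directly after an optional element -->
      <zeroOrMore>
        <ref name="anyElement"/>
      </zeroOrMore>
    </element>
  </start>

  <!-- Matches any element, with any attributes and any content -->
  <define name="anyElement">
    <element>
      <anyName/>
      <zeroOrMore>
        <choice>
          <attribute><anyName/></attribute>
          <text/>
          <ref name="anyElement"/>
        </choice>
      </zeroOrMore>
    </element>
  </define>
</grammar>]]></eg>
			</example>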
		</div1>
		<div1>
			<head>Conclusion</head>
			<p>This Finding describes a number of questions, decisions
and rules for using XML, W3C XML Schema, and XML Namespaces in language
construction and extension. The main goal of the set of rules is to allow
language designers to know their options for language design, and ideally make backwards-
and forwards-compatible changes to their languages to achieve loose coupling
between systems.</p>
		</div1>
		<div1>
			<head>References</head>
	<blist>
				<bibl id="FOLDOC" href="http://wombat.doc.ic.ac.uk/foldoc/" key="FOLDOC">
					<titleref>Free Online Dictionary of Computing</titleref>.
</bibl>
				<bibl id="FlexXMLP" href="http://www.upnp.org/download/draft-goland-fxpp-01.txt" key="FlexXMLP">
					<titleref>Flexible XML Processing Profile</titleref>.
</bibl>
				<bibl id="rfc1521" href="http://www.ietf.org/rfc/rfc1521.txt" key="MIME">
					<titleref>RFC 1521, MIME</titleref>.
</bibl>
				<bibl id="rfc1866" href="http://www.ietf.org/rfc/rfc1866.txt" key="HTML 2.0">
					<titleref>RFC 1866, HTML 2.0</titleref>.
</bibl>
				<bibl id="WebDAVXMLIgnorePost" href="http://lists.w3.org/Archives/Public/w3c-dist-auth/1997AprJun/0190.html" key="WebDAV XMLIgnore post">Yaron Goland. <titleref>XML Ignore proposed for WebDAV</titleref>
				</bibl>
				<bibl id="rfc2518" href="http://www.ietf.org/rfc/rfc2518.txt" key="WebDAV">
					<titleref>RFC 2518, WebDAV</titleref>
				</bibl>
				<bibl id="html40" href="http://www.w3.org/TR/1998/REC-html40-19980424/" key="HTML 4.0">
					<titleref>HTML 4.0</titleref>.
</bibl>
				<bibl id="TBL-MandExt" href="http://www.w3.org/DesignIssues/Mandatory.html" key="TBL Mandatory Extensions">Berners-Lee.
<titleref>Web Architecture: Mandatory extensions</titleref>.
</bibl>
				<bibl id="TBL-Extensible" href="http://www.w3.org/DesignIssues/Extensible.html" key="TBL Extensible languages">Berners-Lee.
<titleref>Web Architecture: Extensible languages</titleref>.
</bibl>
				<bibl id="TBL-Evolution" href="http://www.w3.org/DesignIssues/Evolution.html" key="TBL Evolution">Berners-Lee.
<titleref>Web Architecture: Evolvability</titleref>.
</bibl>
				<bibl id="Note-extlang" href="http://www.w3.org/TR/1998/NOTE-webarch-extlang-19980210" key="Web Architecture: Extensible Languages">Berners-Lee and Connolly, ed.
<titleref>Web Architecture: Extensible Languages</titleref>
World Wide Web Consortium, 1998.</bibl>
				<bibl id="WD-doctypes" href="http://www.w3.org/MarkUp/WD-doctypes" key="HTML Document types">Connolly, ed.
<titleref>HTML Document dialects</titleref>
World Wide Web Consortium, 1996.</bibl>
			<bibl id="soap12" href="http://www.w3.org/TR/SOAP/" key="SOAP 1.2">
					<titleref>W3C
Recommendation, SOAP 1.2 Part 1: Messaging Framework</titleref>
				</bibl>
				<bibl id="wsdl11" href="http://www.w3.org/TR/WSDL/" key="WSDL 1.1">
					<titleref>W3C Note, WSDL 1.1</titleref>
				</bibl>
				<bibl id="ws-policy" href="http://www.w3.org/Submissions/WS-Policy/" key="WS-Policy 1.2">
					<titleref>W3C Note, WS-Policy 1.2</titleref>
				</bibl>
				<bibl id="XML1-0" href="http://www.w3.org/TR/REC-xml" key="XML 1.0">
					<titleref>W3C Recommendation, XML 1.0</titleref>
				</bibl>
				<bibl id="Xinclude" href="http://www.w3.org/TR-Xinclude" key="XInclude">
					<titleref>W3C Working Draft, XML Inclusions</titleref>
				</bibl>
				<bibl id="XMLNamespaces" href="http://www.w3.org/TR/REC-xml-names" key="XML Namespaces">
					<titleref>W3C Recommendation, XML Namespaces</titleref>
				</bibl>
				<bibl id="XMLSchemaPart2" href="http://www.w3.org/TR/xmlschema-2" key="XML Schema Part 2">
					<titleref>W3C Recommendation, XML Schema, Part 2</titleref>
				</bibl>
				<bibl id="XMLSchemaWildcardTestCollection" href="http://www.w3.org/XML/2001/05/xmlschema-test-collection/result-ms-wildcards.htm" key="XML Schema Wildcard Test Collection">
					<titleref>XML Schema Wildcard Test collection</titleref>
				</bibl>
				
				<bibl id="XFrontSchemaBestPractices" href="http://www.xfront.com/BestPracticesHomepage.html" key="XFront Schema Best Practices">
					<titleref>XFront Schema Best Practices</titleref>
				</bibl>
				<bibl id="XMLDOTCOMSchemaDesignPatterns" href="http://www.xml.com/pub/a/2002/07/03/schema_design.html" key="XML.com Schema Design Patterns">Dare Obasanjo. <titleref>XML.com Schema design patterns</titleref>
				</bibl>
				<bibl id="daveowritings" href="http://www.pacificspirit.com/Authoring/Compatibility" key="Dave Orchard writings on Extensibility and Versioning"><titleref>Dave Orchard writings on extensibility and versioning</titleref>
				</bibl>

			</blist>
		</div1>
		<div1 id="ack">
			<head>Acknowledgements</head>
			<p>The author thanks the many reviewers who have
contributed to the finding, particularly David Bau, William Cox, Ed Dumbill,
Chris Ferris, Yaron Goland, Hal Lockhart, Mark Nottingham, Jeffrey Schlimmer,
Cliff Schmidt, and Norman Walsh.</p>
		</div1>
		
	</body>
</spec>

