Copyright © 2005 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.
This document defines a set of simple types to describe abstract content, e.g. in an XML Information Set [XML Information Set] or in an abstract model (e.g. WSDL 2.0's component model [WSDL 2.0 Core Language]).
The types are defined so as to be largely independent of the version of XML used when serializing the abstract content as an XML document.
This document is an editors' copy that has no official standing.
This document is a draft intended to be reviewed by the Web Services Description Working Group for possible publication as a Working Group Note and has no formal status.
The author feels that it contains useful information that is worth publishing for the benefit of others.
1 Introduction
1.1 Notational Conventions
2 Background
3 Definition of the Simple Types
3.1 string Type
3.2 Token Type
3.3 NCName Type
3.4 anyURI Type
3.5 QName Type
3.6 boolean Type
3.7 int Type
3.8 unsignedLong Type
3.9 anyType Type
4 Serialization as various versions of XML
4.1 Serialization as XML 1.0 and Relationship with XML Schema 1.0 Datatypes
4.2 Example of the serialization as different versions of XML of a simple type: stype:NCName
4.2.1 XML 1.0 serialization of an stype:NCName
4.2.2 XML 1.1 serialization of an stype:NCName
5 Using the simple types
6 Interoperability considerations
7 References
7.1 Normative References
7.2 Informative References
8 Acknowledgements
This document defines a set of simple types commonly used in Web services specifications. They are defined independently of any version of XML.
This document is an example of how to allow specifications to be abstracted from a particular version of XML, in particular XML 1.0. Other types MAY be added to this document depending on the feedback received.
The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC2119 [IETF RFC 2119].
This specification uses a number of namespace prefixes throughout, listed in the table below. Note that the choice of any namespace prefix is arbitrary and not semantically significant (see [XML Information Set]).
Prefix | Namespace | Notes |
---|---|---|
stype | "http://www.w3.org/2005/08/17-xml-simp-types" | Defined in this document. |
xs | "http://www.w3.org/2001/XMLSchema" | Defined in the W3C XML Schema specification [XML Schema: Structures], [XML Schema: Datatypes]. |
The use of XML Schema 1.0 datatypes [XML Schema: Datatypes] to define properties in a specification mandates an XML 1.0 [XML 1.0] serialization and prevents an XML 1.1 [XML 1.1] serialization, because the definitions of datatypes in [XML Schema: Datatypes] depend on XML 1.0 productions [XML 1.0].
This unfortunate side-effect of XML Schema datatypes unnecessarily prevents certain specifications from being compatible with XML version 1.1, and probably with other versions of XML that the community may come up with in the future.
A previous Working Draft of WSDL 2.0 defined simple types independent of any particular version of XML in order to free itself from an unnecessary dependency on XML 1.0, making the XML Schema defined with [XML Schema: Structures] for WSDL 2.0 normative only for the XML 1.0 serialization.
However, the Working Group later took the decision that this additional layer of abstraction was too complex and decided to go back to defining its properties with XML Schema datatypes.
This document captures the method used in the 2004-08-03 Working Draft of WSDL 2.0 and explains the objectives it was trying to reach, as it is believed that this technique for writing specifications independent of a particular version of XML has merit.
This specification provides its own definition of those types, patterned after [XML Schema: Datatypes] but independent of it. This allows processors to accept descriptions serialized using a mechanism that is not compatible with [XML Schema: Datatypes], such as XML 1.1 [XML 1.1].
All types defined in this section are formally assigned to the "http://www.w3.org/2005/08/17-xml-simp-types" namespace. All references to them in this specification are made via qualified names that use the stype prefix. It should be noted though that there is no schema (in the sense of [XML Schema: Structures]) for that namespace, because the types defined here go beyond the capabilities of XML Schema to describe.
The value space of each type defined in this section is a superset of the value space of the type with the same name defined by XML Schema [XML Schema: Datatypes]. In particular, the value space of the stype:string type is a strict superset of the value space of xs:string, as shown by the one-character string consisting exclusively of the #x0 character.
Note:
The small list of types provided here is believed to cover those used by WSDL 2.0 [WSDL 2.0 Core Language] and WS-Addressing 1.0 [WS-Addressing 1.0 - Core], [WS-Addressing 1.0 - SOAP Binding]. Other simple types may be defined.
The value space of the stype:string type consists of finite-length sequences of characters in the range #x0-#x10FFFF inclusive, where a character is an atomic unit of text as specified by ISO/IEC 10646 [ISO/IEC 10646] and Unicode [Unicode].
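The difference between stype:string and a string that can be written in an XML 1.0 document can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the ranges are transcribed from the Char production of XML 1.0.

```python
def is_xml10_char(ch: str) -> bool:
    """Return True if ch matches the Char production of XML 1.0."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def xml10_serializable(value: str) -> bool:
    """Return True if every character of an stype:string value is legal in XML 1.0."""
    return all(is_xml10_char(ch) for ch in value)
```

Under these assumptions, `xml10_serializable("\x00")` is false even though the one-character string #x0 belongs to the value space of stype:string.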
The value space of the stype:Token type is the subset of the value space of the stype:string type consisting of strings that do not contain the line feed (#xA) or tab (#x9) characters, that have no leading or trailing spaces (#x20), and that have no internal sequences of two or more spaces.
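A membership test for the stype:Token value space, following the conditions stated above, might look like the sketch below. The function name `is_token` is hypothetical and not part of this specification.

```python
def is_token(value: str) -> bool:
    """Return True if value satisfies the stype:Token conditions stated above."""
    # No line feed (#xA) or tab (#x9) characters.
    if "\n" in value or "\t" in value:
        return False
    # No leading or trailing spaces (#x20).
    if value.startswith(" ") or value.endswith(" "):
        return False
    # No internal sequence of two or more spaces.
    return "  " not in value
```

For instance, `is_token("a b")` holds, while `is_token("a  b")` and `is_token(" a")` do not.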
The value space of the stype:NCName type is the subset of the value space of the stype:Token type consisting of tokens that do not contain the space (#x20) and ':' characters.
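The definition above can be expressed as a short check. This is a sketch under the assumption that an implementation represents values as Python strings; the helper names are hypothetical.

```python
def is_token(value: str) -> bool:
    """stype:Token: no line feed or tab, no leading, trailing or doubled spaces."""
    if "\n" in value or "\t" in value:
        return False
    if value.startswith(" ") or value.endswith(" "):
        return False
    return "  " not in value


def is_ncname(value: str) -> bool:
    """stype:NCName: an stype:Token containing neither a space nor a colon."""
    return is_token(value) and " " not in value and ":" not in value
```

Note that `is_ncname("1abc")` holds under this definition, which illustrates that the value space is wider than what any particular version of XML accepts as a name.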
The value space of the stype:anyURI type consists of all International Resource Identifiers (IRI) as defined by [IETF RFC 3987].
The value space of the stype:QName type consists of the set of 2-tuples whose first component is of type stype:anyURI and whose second component is of type stype:NCName.
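One possible in-memory representation of such a 2-tuple is sketched below. The names `QName` and `make_qname` are hypothetical, and the namespace component is not validated as an IRI here; only the stype:NCName constraints on the second component are checked.

```python
from typing import NamedTuple


class QName(NamedTuple):
    namespace_name: str  # an stype:anyURI value (assumed valid, not checked here)
    local_name: str      # an stype:NCName value


def make_qname(namespace_name: str, local_name: str) -> QName:
    """Build a QName, rejecting local names outside the stype:NCName value space."""
    # A Token with no spaces at all and no colon contains none of these characters.
    if any(ch in "\n\t :" for ch in local_name):
        raise ValueError(f"not an stype:NCName: {local_name!r}")
    return QName(namespace_name, local_name)
```

Because the value is a plain tuple, two values compare equal exactly when both components are equal; no namespace prefix takes part in the comparison.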
The value space of the stype:boolean type consists of the two distinct values true and false.
An instance of a datatype that is defined as boolean can have the following legal literals {true, false, 1, 0}.
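The mapping from the four legal literals to the two values can be sketched as a lookup table. The function name `parse_boolean` is hypothetical.

```python
_BOOLEAN_LITERALS = {"true": True, "1": True, "false": False, "0": False}


def parse_boolean(literal: str) -> bool:
    """Map one of the legal literals {true, false, 1, 0} to its boolean value."""
    try:
        return _BOOLEAN_LITERALS[literal]
    except KeyError:
        raise ValueError(f"not a legal boolean literal: {literal!r}") from None
```

Literals are matched exactly, so a string such as `"TRUE"` is rejected in this sketch.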
The value space of the stype:int type consists of the infinite set {…,-2,-1,0,1,2,…} representing the standard mathematical concept of the integer numbers.
An instance of a datatype that is defined as int has a lexical representation consisting of a finite-length sequence of decimal digits (#x30-#x39) with an optional leading sign ("-" or "+"). If the sign is omitted, "+" is assumed.
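A parser for this lexical representation might be sketched as follows. The name `parse_int` is hypothetical. The explicit pattern is used because Python's built-in `int()` is more permissive than the representation described above (it also accepts surrounding whitespace and underscores, for example).

```python
import re

# Optional sign followed by one or more decimal digits (#x30-#x39).
_INT_LITERAL = re.compile(r"[+-]?[0-9]+")


def parse_int(literal: str) -> int:
    """Parse the lexical representation of an stype:int value."""
    if _INT_LITERAL.fullmatch(literal) is None:
        raise ValueError(f"not a legal int literal: {literal!r}")
    return int(literal)
```

With this sketch, `parse_int("+7")` and `parse_int("7")` yield the same value, matching the rule that an omitted sign is taken to be "+".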
The value space of the stype:unsignedLong type consists of the set {0,1,2,…,18446744073709551615} of integer numbers.
unsignedLong has a lexical representation consisting of a finite-length sequence of decimal digits (#x30-#x39).
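The same approach extends to stype:unsignedLong, adding a check against the upper bound of the value space. This is a sketch; the names `UNSIGNED_LONG_MAX` and `parse_unsigned_long` are hypothetical.

```python
import re

UNSIGNED_LONG_MAX = 18446744073709551615  # equals 2**64 - 1

# One or more decimal digits (#x30-#x39), with no sign.
_DIGITS = re.compile(r"[0-9]+")


def parse_unsigned_long(literal: str) -> int:
    """Parse the lexical representation of an stype:unsignedLong value."""
    if _DIGITS.fullmatch(literal) is None:
        raise ValueError(f"not a legal unsignedLong literal: {literal!r}")
    value = int(literal)
    if value > UNSIGNED_LONG_MAX:
        raise ValueError(f"out of range for unsignedLong: {literal!r}")
    return value
```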
The value space of the stype:anyType type consists of any combination of element, processing instruction, unexpanded entity reference, character, and comment information items as defined by [XML Information Set].
When serializing as a particular version of XML, such as XML 1.0 [XML 1.0] or XML 1.1 [XML 1.1], the set of characters allowed by the simple types defined in section 3 Definition of the Simple Types is restricted to the characters allowed by that version of XML.
When serializing the information to XML 1.0 [XML 1.0], the simple types defined in section 3 Definition of the Simple Types map naturally to well-known datatypes defined in [XML Schema: Datatypes], which add further constraints on the serialized content:
Simple type | Corresponding schema type for an XML 1.0 serialization |
---|---|
stype:string | xs:string |
stype:Token | xs:token |
stype:NCName | xs:NCName |
stype:anyURI | xs:anyURI |
stype:QName | xs:QName |
stype:boolean | xs:boolean |
stype:int | xs:int |
stype:unsignedLong | xs:unsignedLong |
stype:anyType | xs:anyType |
As an example, consider when an stype:NCName may be serialized as XML 1.0 and as XML 1.1.
An stype:NCName MAY be serialized in an XML 1.0 document if it is only composed of the characters allowed by XML 1.0, i.e. matching the NCName production from the Namespaces in XML specification [Namespaces in XML].
An stype:NCName MAY be serialized in an XML 1.1 document if it is only composed of the characters allowed by XML 1.1, i.e. matching the NCName production from the Namespaces in XML 1.1 specification [Namespaces in XML 1.1].
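For the XML 1.1 case, the check can be sketched as below. The code point ranges are transcribed from the NameStartChar and NameChar productions of XML 1.1 with the colon removed, which is how Namespaces in XML 1.1 derives NCName; they should be verified against those specifications before being relied on. The function and table names are hypothetical. The corresponding XML 1.0 check is not reproduced here.

```python
# NameStartChar of XML 1.1, without ":".
_NAME_START_RANGES = [
    (0x41, 0x5A), (0x5F, 0x5F), (0x61, 0x7A),
    (0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0x2FF),
    (0x370, 0x37D), (0x37F, 0x1FFF), (0x200C, 0x200D),
    (0x2070, 0x218F), (0x2C00, 0x2FEF), (0x3001, 0xD7FF),
    (0xF900, 0xFDCF), (0xFDF0, 0xFFFD), (0x10000, 0xEFFFF),
]

# Characters NameChar allows in addition to NameStartChar:
# "-", ".", digits, #xB7, and two ranges of combining and connector characters.
_NAME_EXTRA_RANGES = [
    (0x2D, 0x2E), (0x30, 0x39), (0xB7, 0xB7),
    (0x300, 0x36F), (0x203F, 0x2040),
]


def _in_ranges(ch: str, ranges) -> bool:
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in ranges)


def serializable_as_xml11_ncname(value: str) -> bool:
    """Return True if value matches the NCName production for XML 1.1."""
    if not value or not _in_ranges(value[0], _NAME_START_RANGES):
        return False
    return all(
        _in_ranges(ch, _NAME_START_RANGES) or _in_ranges(ch, _NAME_EXTRA_RANGES)
        for ch in value[1:]
    )
```

Under this sketch, a value such as `"1abc"` belongs to the stype:NCName value space as defined in section 3.3 but cannot be serialized as an XML 1.1 name, because a digit is not permitted as the first character.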
Typically, a specification with a dependency on XML 1.0 [XML 1.0] will have defined its content using types from XML Schema 1.0 Part 2 [XML Schema: Datatypes], and provided a normative XML 1.0 schema [XML Schema: Structures].
In order to allow XML versioning independence, types defined by this specification SHOULD be used. The XML 1.0 schema defined SHOULD be declared normative for XML 1.0 serializations only.
Note:
Because this document has not gone through the W3C Recommendation Track Process, and has therefore not received wide review, a normative reference to it is difficult.
Conformance to a specification defined independently of any version of XML does not require an implementation to accept documents using all existing versions of XML, unless specifically called out.
Conformance is considered for processing documents using the XML version supported by the implementation.
The original idea for defining types independent of a version of XML was proposed by Jonathan Marsh.
The core content of this document is extracted from the 2004-08-03 Working Draft of WSDL 2.0 Part 1. The editors of that specification were:
Roberto Chinnici, Sun Microsystems
Martin Gudgin, Microsoft
Jean-Jacques Moreau, Canon
Jeffrey Schlimmer, Microsoft
Sanjiva Weerawarana, IBM Research
Commenters on this part of the WSDL 2.0 specification are acknowledged, as well as Richard Ishida and Felix Sasaki for their feedback.