The tests described in this document provide an initial set of metrics for determining how well a particular implementation conforms to the W3C XML 1.0 (Second Edition) Recommendation. The XML Conformance Test Suite is intended to complement that Recommendation. All interpretations of the Recommendation are subject to confirmation by the W3C XML Group.
Conformance tests can be used by developers, content creators, and users alike to increase their level of confidence in product quality. In circumstances where interoperability is necessary, these tests can also be used to determine that differing implementations support the same set of features.
The XML Test Suite was transferred from OASIS to W3C and is being augmented to reflect the current work of the W3C XML Core Working Group, including resolved issues related to the Recommendation and published Errata. This report provides supporting documentation for all the tests included in the test suite. Sources from which these tests have been collected include: James Clark XMLTEST cases, 18-Nov-1998; Fuji Xerox Japanese Text Tests; Sun Microsystems XML Tests; OASIS/NIST TESTS, 1-Nov-1998; IBM XML Tests.
Two basic types of test are presented here: Binary Tests and Output Tests.
Binary conformance tests are documents which are grouped into one of four categories. Given a document in a given category, each kind of XML parser must treat it consistently and either accept it (a positive test) or reject it (a negative test). It is in that sense that the tests are termed "binary". The XML 1.0 (Second Edition) Recommendation talks in terms of two types of XML processor: validating ones, and nonvalidating ones. There are two differences between these types of processors:
There are two types of external entity: parameter entities, which hold definitions that affect validation and other processing, and general entities, which hold marked-up text. There are therefore five kinds of XML processor: validating processors, and four kinds of nonvalidating processor, distinguished by which combination of external entity types they include.
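The count of five processor kinds can be checked with a short enumeration. This is only an illustrative sketch: the function name and the label strings are assumptions made for the example and do not appear in the Recommendation or in the test suite.

```python
from itertools import product


def processor_kinds():
    """Enumerate the five processor kinds (labels are illustrative only)."""
    kinds = ["validating"]
    # A nonvalidating processor may independently read or ignore each of the
    # two external entity types: parameter entities (PEs) and general
    # entities (GEs). That gives 2 x 2 = 4 nonvalidating combinations.
    for reads_pe, reads_ge in product((False, True), repeat=2):
        pe = "read" if reads_pe else "ignored"
        ge = "read" if reads_ge else "ignored"
        kinds.append(f"nonvalidating (external PEs {pe}, external GEs {ge})")
    return kinds


for kind in processor_kinds():
    print(kind)
```

Three of the four nonvalidating combinations skip at least one type of external entity, which corresponds to the "External Entities Ignored (3 cases)" column of the test matrix.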
| Document type | Nonvalidating: External Entities Ignored (3 cases) | Nonvalidating: External Entities Read | Validating |
|---|---|---|---|
| Valid Documents | accept | accept | accept |
| Invalid Documents | accept | accept | reject |
| Non-WF Documents | reject | reject | reject |
| WF Errors tied to External Entity | accept (varies) | reject | reject |
| Documents with Optional Errors | (not specified) | (not specified) | (not specified) |
At this time, the XML community primarily uses parsers which are in the rightmost two columns of this table, calling them Well Formed XML Parsers (or "WF Parsers") and Validating XML Parsers. A second test matrix could be defined to address the variations in the types of XML processor which do not read all external entities. That additional matrix is not provided here at this time.
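A test harness might encode the binary test matrix as a lookup table so that each test's expected outcome can be derived from its document category and the kind of processor under test. The sketch below is one possible encoding; the category keys, column names, and function name are hypothetical labels chosen for this example.

```python
# Column labels (hypothetical names for the three columns of the matrix).
COLUMNS = ("nonvalidating-ignore", "nonvalidating-read", "validating")

# Expected outcome per document category, in COLUMNS order.
# None stands for "(not specified)".
MATRIX = {
    "valid": ("accept", "accept", "accept"),
    "invalid": ("accept", "accept", "reject"),
    "not-wf": ("reject", "reject", "reject"),
    "wf-error-in-external-entity": ("accept (varies)", "reject", "reject"),
    "optional-error": (None, None, None),
}


def expected_outcome(category, processor):
    """Look up the expected result for a document category and processor kind."""
    return MATRIX[category][COLUMNS.index(processor)]


print(expected_outcome("invalid", "validating"))
print(expected_outcome("invalid", "nonvalidating-read"))
```

Keeping "(not specified)" as `None` lets a harness skip those cases rather than report a pass or a failure.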
The XML 1.0 (Second Edition) Recommendation places a number of requirements on XML processors, to ensure that they report information to applications as needed. Such requirements are testable. Validating processors are required to report slightly more information than nonvalidating ones, so some tests will require separate output files. Some of the information that must be reported will not be reportable without reading all the external entities in a particular test. Many of the tests for valid documents are paired with an output file as the canonical representation of the input file, to ensure that the XML processor provides the correct information.
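An output test can be thought of as comparing what the processor reports against the paired output file. The following sketch assumes a byte-for-byte comparison and takes the canonicalization step as a caller-supplied function; both choices, along with all names used, are assumptions for illustration, since this report does not prescribe how the comparison is implemented.

```python
from pathlib import Path
from typing import Callable


def run_output_test(
    input_path: Path,
    expected_path: Path,
    canonicalize: Callable[[bytes], bytes],
) -> bool:
    """Return True if the canonical form of the input matches the output file.

    `canonicalize` is a placeholder for whatever drives the processor under
    test and serializes the information it reports.
    """
    actual = canonicalize(input_path.read_bytes())
    # Assumption: the comparison is exact, byte for byte.
    return actual == expected_path.read_bytes()
```

Because validating processors report slightly more information than nonvalidating ones, a harness built this way would select the expected file that matches the kind of processor being tested.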
This section of this report contains descriptions of test cases, each of which fits into the categories noted above. Each test case includes a document of one of the types in the binary test matrix above (e.g. valid or invalid documents).
In some cases, an output file, as described in Section 2.2, will also be associated with a valid document, which is used for output testing. If such a file exists, it will be noted at the end of the description of the input document.
The description for each test case is presented as a two part table. The right part describes what the test does. This description is intended to have enough detail to evaluate diagnostic messages. The left part includes:
All conforming XML 1.0 Processors are required to accept valid documents, reporting no errors. In this section of this test report are found descriptions of test cases which fit into this category.
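As a rough illustration of a positive binary test, the sketch below uses Python's standard `xml.etree.ElementTree` parser, which does not validate, as a stand-in for the processor under test. The helper name `accepts` and the sample documents are assumptions made for this example and are not files from the test suite.

```python
import xml.etree.ElementTree as ET


def accepts(document: bytes) -> bool:
    """Return True if the parser reports no error for the document."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True


# A well-formed document, followed by one with a missing end tag.
print(accepts(b"<doc>text</doc>"))
print(accepts(b"<doc>text"))
```

A real harness would run every document in this category through the processor and flag any that it fails to accept.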
|
Tests an XML document consisting of a prolog followed by an element and then Misc. There is an output test associated with this input file. |
|
Test demonstrates that although whitespace can be used to set apart markup for greater readability it is not necessary. There is an output test associated with this input file. |
|
Test demonstrates that extra whitespace is not intended for inclusion in the delivered version of the document. There is an output test associated with this input file. |
|
Test demonstrates that a line break within CDATA will be normalized. There is an output test associated with this input file. |
|
A combination of carriage return line feed in an external entity must be normalized to a single newline. There is an output test associated with this input file. |
|
A carriage return (also CRLF) in an external entity must be normalized to a single newline. There is an output test associated with this input file. |
|
A carriage return (also CRLF) in an external entity must be normalized to a single newline. There is an output test associated with this input file. |
|
A carriage return (also CRLF) in an external entity must be normalized to a single newline. There is an output test associated with this input file. |
|
This tests normalization of end-of-line characters (CRLF) within entities to LF, primarily as an output test. There is an output test associated with this input file. |
|
Tests definition of an internal entity holding a carriage return character reference, which must not be normalized before reporting to the application. Line break normalization only occurs when parsing external parsed entities. There is an output test associated with this input file. |
|
Test demonstrates the use of optional character and content particles within mixed element content. The test also shows the use of an external entity and that a carriage return line feed combination in an external entity must be normalized to a single newline. There is an output test associated with this input file. |
|
Test demonstrates the use of a public identifier with an external entity. The test also shows that a carriage return line feed combination in an external entity must be normalized to a single newline. There is an output test associated with this input file. |
|
Tests LanguageID with Langcode - Subcode. There is an output test associated with this input file. |
|
Duplicate of test ibm33v01.xml. There is an output test associated with this input file. |
|
Tests ISO639Code. There is an output test associated with this input file. |
|
Tests IanaCode. There is an output test associated with this input file. |
|
Tests UserCode. There is an output test associated with this input file. |
|
Tests SubCode. There is an output test associated with this input file. |
|
Tests a lowercase ISO language code. There is an output test associated with this input file. |
|
Tests an ISO language code with a subcode. There is an output test associated with this input file. |
|
Tests an uppercase ISO language code. There is an output test associated with this input file. |
|
Tests an IANA language code with a subcode. There is an output test associated with this input file. |
|
Tests a user language code with a subcode. There is an output test associated with this input file. |
|
Tests a user language code. There is an output test associated with this input file. |
|
This test case covers legal character ranges plus discrete legal characters for production 02. |
|
Various Misc items where they can occur. |
|
Test demonstrates that characters outside of the normal ASCII range can be used as element content. There is an output test associated with this input file. |
|
Test demonstrates that characters outside of the normal ASCII range can be used as element content. There is an output test associated with this input file. |
|
The document is encoded in UTF-16 and uses some name characters well outside of the normal ASCII range. There is an output test associated with this input file. |
|
The document is encoded in UTF-8 and the text inside the root element uses two non-ASCII characters, encoded in UTF-8 and each of which expands to a Unicode surrogate pair. There is an output test associated with this input file. |
|
Tests all 4 legal white space characters - #x20 #x9 #xD #xA |
|
Empty EntityValue is legal. There is an output test associated with this input file. |
|
Tests a normal EntityValue. There is an output test associated with this input file. |
|
Tests EntityValue referencing a Parameter Entity. There is an output test associated with this input file. |
|
Tests EntityValue referencing a General Entity. There is an output test associated with this input file. |
|
Tests EntityValue with a combination of GE, PE and text; the GE used is declared in student.dtd. There is an output test associated with this input file. |
|
Tests empty AttValue with double quotes as the delimiters. There is an output test associated with this input file. |
|
Tests empty AttValue with single quotes as the delimiters. There is an output test associated with this input file. |
|
Tests AttValue with double quotes as the delimiters and a single quote inside. There is an output test associated with this input file. |
|
Tests AttValue with single quotes as the delimiters and a double quote inside. There is an output test associated with this input file. |
|
Tests AttValue with a GE reference and double quotes as the delimiters. There is an output test associated with this input file. |
|
Tests AttValue with a GE reference and single quotes as the delimiters. There is an output test associated with this input file. |
|
Tests AttValue with mixed references and text content in double quotes. There is an output test associated with this input file. |
|
Tests AttValue with mixed references and text content in single quotes. There is an output test associated with this input file. |
|
Tests empty SystemLiteral using the double quotes. There is an output test associated with this input file. |
|
Tests empty SystemLiteral using the single quotes. There is an output test associated with this input file. |
|
Tests regular SystemLiteral using the single quotes. There is an output test associated with this input file. |
|
Tests regular SystemLiteral using the double quotes. There is an output test associated with this input file. |
|
Tests empty SystemLiteral using the double quotes. There is an output test associated with this input file. |
|
Tests empty SystemLiteral using the single quotes. There is an output test associated with this input file. |
|
Tests regular SystemLiteral using the double quotes. There is an output test associated with this input file. |
|
Tests regular SystemLiteral using the single quotes. There is an output test associated with this input file. |
|
Tests PubidChar with all legal PubidChar in a PubidLiteral. There is an output test associated with this input file. |
|
Makes sure that PUBLIC identifiers may have some strange characters. NOTE: The XML editors have said that the XML specification errata will specify that parameter entity expansion does not occur in PUBLIC identifiers, so that the '%' character will not flag a malformed parameter entity reference. There is an output test associated with this input file. |
|
Valid public IDs. |
|
Uses a legal XML 1.0 name consisting of a single colon character (disallowed by the latest XML Namespaces draft). There is an output test associated with this input file. |
|
The document is encoded in UTF-8 and the name of the root element type uses non-ASCII characters. There is an output test associated with this input file. |
|
Various satisfactions of the Names production in a NAMES attribute. |
|
Various valid Nmtokens in an attribute list declaration. |
|
Various satisfactions of an NMTOKENS attribute value. |
|
Valid EntityValues. Except for entity references, markup is not recognized. |
|
Test demonstrates that extra whitespace is normalized into a single space character. There is an output test associated with this input file. |
|
Test demonstrates that an attribute can have a null value. There is an output test associated with this input file. |
|
Test demonstrates that the Attribute in a Start-tag can consist of numerals along with special characters. There is an output test associated with this input file. |
|
Test demonstrates that all lower case letters are valid for the Attribute in a Start-tag. There is an output test associated with this input file. |
|
Test demonstrates that all upper case letters are valid for the Attribute in a Start-tag. There is an output test associated with this input file. |
|
Test demonstrates that PubidChar can be used for element content. There is an output test associated with this input file. |
|
Test demonstrates the use of a parameter entity reference within an attribute list declaration. There is an output test associated with this input file. |
|
Tests CharData with an empty string. There is an output test associated with this input file. |
|
Tests CharData with a white space character. There is an output test associated with this input file. |
|
Tests CharData with a general text string. There is an output test associated with this input file. |
|
Valid use of character data, comments, processing instructions and CDATA sections within the start and end tag. |
|
Test demonstrates that character data is valid element content. There is an output test associated with this input file. |
|
Test demonstrates character references can be used for element content. There is an output test associated with this input file. |
|
Comments may contain any legal XML characters; only the string "--" is disallowed. There is an output test associated with this input file. |
|
Tests empty comment. There is an output test associated with this input file. |
|
Tests comment with regular text. There is an output test associated with this input file. |
|
Tests comment with one dash inside. There is an output test associated with this input file. |
|
Tests comment with more comprehensive content. There is an output test associated with this input file. |
|
Comments don't get parameter entity expansion. There is an output test associated with this input file. |
|
Test demonstrates that comments are valid element content. There is an output test associated with this input file. |
|
Test demonstrates that comments are valid element content and that all characters before the double-hyphen right angle combination are considered part of the comment. There is an output test associated with this input file. |
|
Tests PI definition with only PITarget name and nothing else. There is an output test associated with this input file. |
|
Tests PI definition with only PITarget name and a white space. There is an output test associated with this input file. |
|
Tests PI definition with PITarget name and text that contains question mark and right angle. There is an output test associated with this input file. |
|
Tests PITarget name. There is an output test associated with this input file. |
|
Test demonstrates a valid comment and that it may appear anywhere in the document including at the end. There is an output test associated with this input file. |
|
Test demonstrates a valid comment and that it may appear anywhere in the document including the beginning. There is an output test associated with this input file. |
|
Test demonstrates a valid processing instruction. There is an output test associated with this input file. |
|
Test demonstrates a valid processing instruction and that it may appear at the beginning of the document. There is an output test associated with this input file. |
|
Test demonstrates that extra whitespace within a processing instruction will be normalized into a single space character. There is an output test associated with this input file. |
|
Test demonstrates that extra whitespace within a processing instruction is converted into a single space character. There is an output test associated with this input file. |
|
Test demonstrates that Processing Instructions are valid element content. There is an output test associated with this input file. |
|
Test demonstrates that Processing Instructions are valid element content and there can be more than one. There is an output test associated with this input file. |
|
Expands a general entity which contains a CDATA section with what looks like a markup declaration (but is just text since it's in a CDATA section). There is an output test associated with this input file. |
|
Tests CDSect with CDStart CData CDEnd. There is an output test associated with this input file. |
|
Tests CDStart. There is an output test associated with this input file. |
|
Tests CDATA with empty string. There is an output test associated with this input file. |
|
Tests CDATA with regular content. There is an output test associated with this input file. |
|
Tests CDEnd. There is an output test associated with this input file. |
|
Test demonstrates that all text within a valid CDATA section is considered text and not recognized as markup. There is an output test associated with this input file. |
|
Test demonstrates that CDATA sections are valid element content. There is an output test associated with this input file. |
|
Test demonstrates that CDATA sections are valid element content and that ampersands may occur in their literal form. There is an output test associated with this input file. |
|
Test demonstrates that CDATA sections are valid element content and that everything between the CDStart and CDEnd is recognized as character data, not markup. There is an output test associated with this input file. |
|
Attribute defaults with a DTD have special parsing rules, different from other strings. That means that characters found there may look like an undefined parameter entity reference "within a markup declaration", but they aren't ... so they can't be violating the PEs in Internal Subset WFC. There is an output test associated with this input file. |
|
Parameter entity references are NOT RECOGNIZED in default attribute values. |
|
Tests prolog with XMLDecl and doctypedecl. There is an output test associated with this input file. |
|
Tests prolog with doctypedecl. There is an output test associated with this input file. |
|
Tests prolog with Misc doctypedecl. There is an output test associated with this input file. |
|
Tests prolog with doctypedecl Misc. There is an output test associated with this input file. |
|
Tests prolog with XMLDecl Misc doctypedecl. There is an output test associated with this input file. |
|
Tests prolog with XMLDecl doctypedecl Misc. There is an output test associated with this input file. |
|
Tests prolog with XMLDecl Misc doctypedecl Misc. There is an output test associated with this input file. |
|
Tests XMLDecl with VersionInfo only. There is an output test associated with this input file. |
|
Tests XMLDecl with VersionInfo EncodingDecl. There is an output test associated with this input file. |
|
Tests XMLDecl with VersionInfo SDDecl. There is an output test associated with this input file. |
|
Tests XMLDecl with VersionInfo and a trailing whitespace char. There is an output test associated with this input file. |
|
Tests XMLDecl with VersionInfo EncodingDecl SDDecl. There is an output test associated with this input file. |
|
Tests XMLDecl with VersionInfo EncodingDecl SDDecl and a trailing whitespace. There is an output test associated with this input file. |
|
Tests VersionInfo with single quote. There is an output test associated with this input file. |
|
Tests VersionInfo with double quote. There is an output test associated with this input file. |
|
Tests EQ with =. There is an output test associated with this input file. |
|
Tests EQ with = and spaces on both sides. There is an output test associated with this input file. |
|
Tests EQ with = and space in front of it. There is an output test associated with this input file. |
|
Tests EQ with = and space after it. There is an output test associated with this input file. |
|
Tests VersionNum 1.0. There is an output test associated with this input file. |
|
Tests Misc with comment. There is an output test associated with this input file. |
|
Tests Misc with PI. There is an output test associated with this input file. |
|
Tests Misc with white spaces. There is an output test associated with this input file. |
|
Tests doctypedecl with internal DTD only. There is an output test associated with this input file. |
|
Tests doctypedecl with external subset and combinations of different markup declarations and PEReferences. There is an output test associated with this input file. |
|
Tests markupdecl with combinations of elementdecl, AttlistDecl, EntityDecl, NotationDecl, PI and comment. There is an output test associated with this input file. |
|
Tests WFC: PE in internal subset as a positive test. There is an output test associated with this input file. |
|
Tests extSubset with extSubsetDecl only in the dtd file. There is an output test associated with this input file. |
|
Tests extSubset with TextDecl and extSubsetDecl in the dtd file. There is an output test associated with this input file. |
|
Tests extSubsetDecl with combinations of markupdecls, conditionalSects, PEReferences and white spaces. There is an output test associated with this input file. |
|
Constructs an <!ATTLIST...> declaration from several PEs. There is an output test associated with this input file. |
|
XML decl and doctypedecl. |
|
Just doctypedecl. |
|