This simplified version of XML is a variant of the proposed XML language. It removes some features and adds a few others. It mainly differs from XML in these aspects:
Separation between the lexical part and the grammar part means that it is much easier to parse. The grammar has just 16 productions and is LL(1).
There is no distinction between <foo></foo> and <foo/>; they can be used interchangeably.
Documents can be nested: there may be <!doctype> declarations in the middle of a document.
Default attribute declarations are scoped: they are valid until the end of the current element or doctype.
Whitespace handling is simplified: a newline immediately before a `<' and a newline immediately after a `>' are ignored; no other whitespace is ignored by the parser (i.e., all other whitespace is passed on to the application).
Only character entities are allowed; no other types of entities exist.
There is no internal document type subset. The allowed structure of the document can only be specified in a separate document.
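The whitespace rule in the list above can be illustrated with a minimal sketch. It is not the actual parser: the function name `strip_ignorable_newlines` is hypothetical, a newline is assumed to be a single `\n` character, and the rule is applied to the raw text rather than to a token stream, so a `<` or `>` inside an attribute value or comment would be treated like a tag delimiter.

```python
import re

# A newline directly before '<' or directly after '>' is dropped.
# All other whitespace (spaces, tabs, other newlines) is kept.
_IGNORABLE_NEWLINE = re.compile(r"\n(?=<)|(?<=>)\n")


def strip_ignorable_newlines(text: str) -> str:
    """Remove the newlines that the simplified whitespace rule ignores.

    Assumption: the input is raw document text and '\n' is the only
    newline convention in use.
    """
    return _IGNORABLE_NEWLINE.sub("", text)


if __name__ == "__main__":
    sample = "<a>\n  text\n</a>\n"
    print(repr(strip_ignorable_newlines(sample)))
```

Note that the two spaces before `text` survive: only the newlines adjacent to the tag delimiters are removed, and everything else would be passed on to the application.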
A few examples of XML files:
A document with all the information in the attributes.
The current document in XML syntax.
I'm still working on a replacement for the DTD syntax that would allow content models and attribute sets for elements to differ based on context. It would basically be an EBNF variant, with a few handy abbreviations for common constructs like start tags and end tags.
Last modified: Fri Jul 11 21:22:14 MET DST