Class XMLParser

java.lang.Object
   |
   +----XMLTokenizer
           |
           +----XMLParser

public class XMLParser
extends XMLTokenizer
Parse an XML file and construct a tree. At certain points during parsing, the parser calls callback functions. An application must implement the XMLParserUser interface, which declares the functions the XMLParser object will call. At the moment, the only callbacks are for errors.

Grammar differences

The grammar used in this parser is different from the one in the first XML draft. This is the grammar (lowercase = nonterminal, uppercase = terminal):

 document: prolog element misc*
 prolog: encodingdecl? misc* [doctypedecl misc*]? [dtdsummary misc*]?
 misc: COMMENT | PI
 doctypedecl: DOCTYPE NAME extid? GT
 attribute: NAME EQ LITERAL
 etag: ETAGO NAME GT
 content: [element | PCDATA | ms | PI | COMMENT]*
 element: LT NAME attribute* [GT content etag | EMPTY]
 dtdsummary: [idinfo | defaultinfo]+
 encodingdecl: ENCODING EQ qencoding ENDPI
 extid: LITERAL
 ms: MSSTART MSDATA MSEND
 qencoding: LITERAL
 idinfo: IDINFO NAME EQ quotedpairs NAME EQ quotedpairs ENDPI
 quotedpairs: LITERAL
 defaultinfo: DEFAULT NAME [NAME EQ LITERAL]* ENDPI
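
To make the notation concrete, the element, attribute, content and etag rules can be turned into a small recursive-descent recognizer. This is only a sketch under stated assumptions, not the XMLParser source: Tok, Node and ElementSketch are hypothetical names, the tokens are supplied as an array rather than read from XMLTokenizer, "/>" is assumed to arrive as the EMPTY token, and the ms alternative of content is left out.

```java
import java.util.ArrayList;
import java.util.List;

class ElementSketch {
    // Hypothetical token kinds, named after the terminals of the grammar.
    enum Tok { LT, NAME, EQ, LITERAL, GT, EMPTY, ETAGO, PCDATA, COMMENT, PI, EOF }

    // Hypothetical tree node; the real XMLNode class is not shown here.
    static class Node {
        final String label;
        final List<Node> children = new ArrayList<>();
        Node(String label) { this.label = label; }
    }

    private final Tok[] toks;
    private int pos = 0;

    private ElementSketch(Tok[] toks) { this.toks = toks; }

    private Tok peek() { return pos < toks.length ? toks[pos] : Tok.EOF; }

    private void expect(Tok t) {
        if (peek() != t) {
            throw new IllegalStateException("expected " + t + " but found " + peek());
        }
        pos++;
    }

    // element: LT NAME attribute* [GT content etag | EMPTY]
    private Node element() {
        expect(Tok.LT);
        expect(Tok.NAME);
        Node n = new Node("element");
        while (peek() == Tok.NAME) attribute(n);
        if (peek() == Tok.EMPTY) {
            pos++;                      // empty element, no content
        } else {
            expect(Tok.GT);
            content(n);
            etag();
        }
        return n;
    }

    // attribute: NAME EQ LITERAL
    private void attribute(Node parent) {
        expect(Tok.NAME);
        expect(Tok.EQ);
        expect(Tok.LITERAL);
        parent.children.add(new Node("attribute"));
    }

    // content: [element | PCDATA | ms | PI | COMMENT]*   (ms omitted in this sketch)
    private void content(Node parent) {
        while (true) {
            switch (peek()) {
                case LT:     parent.children.add(element()); break;
                case PCDATA: pos++; parent.children.add(new Node("pcdata")); break;
                case PI:
                case COMMENT: pos++; break;   // skipped, not stored
                default:     return;          // anything else ends the content
            }
        }
    }

    // etag: ETAGO NAME GT
    private void etag() {
        expect(Tok.ETAGO);
        expect(Tok.NAME);
        expect(Tok.GT);
    }

    // Parse exactly one element from the given tokens.
    static Node parse(Tok... toks) {
        ElementSketch p = new ElementSketch(toks);
        Node n = p.element();
        p.expect(Tok.EOF == p.peek() ? Tok.EOF : Tok.EOF); // rejects trailing tokens
        return n;
    }

    public static void main(String[] args) {
        // Tokens for <a x="1">text<b/></a>
        Node root = parse(Tok.LT, Tok.NAME, Tok.NAME, Tok.EQ, Tok.LITERAL, Tok.GT,
                          Tok.PCDATA, Tok.LT, Tok.NAME, Tok.EMPTY,
                          Tok.ETAGO, Tok.NAME, Tok.GT);
        System.out.println(root.children.size()); // prints 3
    }
}
```

Each nonterminal becomes one method, and each method decides what to do by looking at a single token of lookahead, which is all this part of the grammar needs.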
 

Some of the differences are:

Entities other than character entities are not permitted. Character entities are handled invisibly by the tokenizer and are not reported to the parser.

Character entities are permitted in element content.

Validation

The parser doesn't validate.

Error messages are also not very helpful. This is a hand-written parser, so it was easier to use only insert() and not delete(). A tool should be used to generate the director sets for resynchronizing after a syntax error.
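
The difference between the two recovery styles can be sketched as follows. The bodies of insert() and delete() are not shown in this document, so this is one plausible reading and not the parser's actual code: "insert" is taken to mean that a missing token is reported and then treated as if it had been present, and nothing is ever skipped. InsertRecoverySketch and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class InsertRecoverySketch {
    private final String[] toks;
    private int pos = 0;
    private final List<String> errors = new ArrayList<>();

    private InsertRecoverySketch(String[] toks) { this.toks = toks; }

    private String peek() { return pos < toks.length ? toks[pos] : "EOF"; }

    // Insertion-style recovery: if the expected token is absent, record an
    // error and carry on as though it had been there. No input is consumed.
    private void expect(String t) {
        if (peek().equals(t)) {
            pos++;
        } else {
            errors.add("inserted missing " + t + " before " + peek());
        }
    }

    // etag: ETAGO NAME GT
    private void etag() {
        expect("ETAGO");
        expect("NAME");
        expect("GT");
    }

    // Run the etag rule over the tokens and return the recorded errors.
    static List<String> recover(String... toks) {
        InsertRecoverySketch p = new InsertRecoverySketch(toks);
        p.etag();
        return p.errors;
    }

    public static void main(String[] args) {
        System.out.println(recover("ETAGO", "NAME"));
        // prints [inserted missing GT before EOF]
    }
}
```

Under this reading, a stray extra token is never consumed, so one mistake can trigger a run of follow-on errors. Deleting tokens until one from a precomputed synchronizing set appears would avoid that, which is where generated director sets would help.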

See Also:
XMLTokenizer

Constructor Index

 o XMLParser(InputStream, XMLParserUser, int[], XMLNode[])
Construct a new XMLParser object, giving an InputStream to read from and an object that implements XMLParserUser.
 o XMLParser(InputStream, XMLParserUser, String, int[], XMLNode[])
Construct a new XMLParser object, giving an InputStream to read from and an object that implements XMLParserUser.

Constructors

 o XMLParser
 public XMLParser(InputStream aStream,
                  XMLParserUser aUser,
                  int nrerrors[],
                  XMLNode tree[]) throws IOException, UnknownEncoding
Construct a new XMLParser object, giving an InputStream to read from and an object that implements XMLParserUser.

Parameters:
aStream - a byte stream
aUser - an object that implements the callbacks
nrerrors - an output parameter for the number of errors
tree - an output parameter for the XML tree
Throws: UnknownEncoding
if the encoding is neither UTF8 nor ISO8859-1
 o XMLParser
 public XMLParser(InputStream aStream,
                  XMLParserUser aUser,
                  String encoding,
                  int nrerrors[],
                  XMLNode tree[]) throws IOException, UnknownEncoding
Construct a new XMLParser object, giving an InputStream to read from and an object that implements XMLParserUser. Also set the default encoding of the input stream.

Parameters:
aStream - a byte stream
aUser - an object that implements the callbacks
encoding - a string such as "UTF8", "ISO8859-1", etc.
nrerrors - an output parameter for the number of errors
tree - an output parameter for the XML tree
Throws: UnknownEncoding
if the encoding is neither UTF8 nor ISO8859-1
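
A constructor cannot return a value, and Java passes arguments by value, so nrerrors and tree are declared as arrays that the callee fills in. The idiom can be sketched with a stand-in method. OutParamSketch and parseInto are hypothetical and do no real parsing, and the assumption that results are stored at index 0 of a one-element array is not stated in this documentation.

```java
class OutParamSketch {
    // Hypothetical stand-in for a call that reports its results through
    // array parameters, in the style of the nrerrors and tree arguments.
    static void parseInto(String text, int[] nrerrors, String[] tree) {
        boolean ok = text.startsWith("<") && text.endsWith(">");
        nrerrors[0] = ok ? 0 : 1;                  // assumed: result at index 0
        tree[0] = ok ? "tree for " + text : null;  // assumed: result at index 0
    }

    public static void main(String[] args) {
        // The caller allocates one-element arrays and reads them afterwards.
        int[] nrerrors = new int[1];
        String[] tree = new String[1];
        parseInto("<doc/>", nrerrors, tree);
        System.out.println(nrerrors[0] + " errors, " + tree[0]);
        // prints 0 errors, tree for <doc/>
    }
}
```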