XimpleWare W3C Position Paper
jzhang@ximpleware.com
Klovette@ximpleware.com
Abstract
Despite its promise as the foundation technology for
next-generation Web-based applications, XML faces many issues that have
so far slowed its adoption in the contexts of Web services, B2B and EAI.
Most recent efforts to replace XML with binary surrogates have been directed
toward solving the “verbosity” and/or processing-performance problems of XML
at the expense of XML’s human readability. XimpleWare introduces a new way
of optimizing XML processing performance without compromising the “view
source principle.” We describe a procedure by which one can attach the parsed
state of an XML document, as a small binary file, to the original
document, so that servers at the receiving end can directly access, query
and update the XML document without parsing it again. Furthermore,
XimpleWare’s processing model can be mapped directly onto an FPGA or ASIC to
achieve sustained Gigabit performance.
Position
Performance: The Bigger Problem
Although verbosity and performance of XML are two oft-mentioned
issues, usually only one of the two arises as the dominant bottleneck
within a particular context. Whereas verbosity matters more in a low-bandwidth,
high-latency environment such as mobile computing, for a large class
of applications for which available bandwidth ranges from sufficient to abundant,
it is the performance of XML processing that has become the glaring limiting
factor. This is in part attributable to enterprises’ network infrastructure
build-up over the past decade, during which the rate of bandwidth commoditization
has significantly outpaced Moore’s law for computing power. Various
research reports and publicly available benchmarks have shown that (1) servers’
XML processing throughput amounts to only a small fraction of network
bandwidth, and (2) comparable applications based on XML underperform ones based
on proprietary middleware by orders of magnitude. For those applications, XML
processing performance presents a significant barrier to adoption and is clearly
the bigger problem.
Binary XML: At What Cost?
Binary XML generally refers to alternative XML info-set serialization formats
(other than text) with the goals of optimizing processing performance and/or
bandwidth requirements. However, those goals are often achieved at the expense
of human readability. XML is, by design, human-readable; this simplifies
many aspects of XML application programming and helps lower the barriers
to learning. Adopting binary XML would force XML programmers to give up
the luxury of reading the wire format to quickly figure out how things work
or what goes wrong, and to go back to the “dark ages” of CORBA and DCOM.
Binary XML also often mandates the presence of a schema, which is precisely
why the previous generation of distributed computing has been considered
rigid and brittle. Departure from such “tight coupling” is what makes the Web
such a great success: a Web browser makes few assumptions about what
an HTML page is supposed to look like. In other words, the coupling between
a Web browser and a Web server is “loose.” XML was chosen as the wire format
because it brings a similar value proposition to the Web services paradigm.
Why XML Processing Is Slow
There are at least two factors contributing to the lackluster performance
of XML processing, summarized as follows:
- Inefficiency of traditional text processing techniques
Current XML processing inherits heavily from traditional, LEX-style
text processing techniques, which require that tokenization be done by
picking apart the original XML document into many string objects.
This is both slow and wasteful of memory because of the inherent
inefficiency of managing many small memory blocks (a contrast is sketched
in the code after this list).
- Limitation of general-purpose architecture
Modern processors are designed to be flexible, i.e., able to perform
many different types of tasks, but they do no single task particularly
well. Many of their design features, such as the sequential execution
model, deep pipelines and multi-level memory hierarchies, become
liabilities for high-performance text processing tasks, which require a small
set of operations to be performed repetitively over large amounts of data.
This is an area where custom hardware, designed to perform a
small set of operations at very high speed, can come in and help.
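To make the first point concrete, the following minimal Java sketch contrasts
the two tokenization styles. The names and the packed (offset, length) record
layout are purely illustrative assumptions, not XimpleWare’s actual format:
the point is that the string-based style allocates one heap object per token,
whereas the offset-based style keeps the whole token table in a single flat
array and materializes text only on demand.

    import java.util.ArrayList;
    import java.util.List;

    public class TokenSketch {

        // Traditional LEX-style approach: each token is materialized as a
        // String, creating one small heap object per token.
        static List<String> stringTokens(String xml, int[][] bounds) {
            List<String> tokens = new ArrayList<String>();
            for (int[] b : bounds) {
                tokens.add(xml.substring(b[0], b[1])); // allocation per token
            }
            return tokens;
        }

        // Offset/length approach: each token is an (offset, length) pair
        // packed into one long, so the token table is a single flat array.
        static long[] offsetTokens(int[][] bounds) {
            long[] tokens = new long[bounds.length];
            for (int i = 0; i < bounds.length; i++) {
                tokens[i] = ((long) bounds[i][0] << 32) | (bounds[i][1] - bounds[i][0]);
            }
            return tokens;
        }

        // A token's text is recovered lazily, only when actually needed.
        static String tokenText(String xml, long token) {
            int offset = (int) (token >>> 32);
            int length = (int) token;
            return xml.substring(offset, offset + length);
        }
    }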
Programmers working with XML also often confront the dilemma
of picking the right processing model. Generally people like DOM (Document
Object Model) because it offers a tree view of the XML document and is a
natural, easy way to work with XML. But DOM’s data structure building
is slow and quite resource-intensive, making it unsuitable for most high
performance applications. SAX (Simple API for XML) is faster and consumes
less memory, but doesn’t provide much structural information about an XML document.
As a result, programmers using SAX often have to maintain state
information manually, which can be quite tedious for a complex XML document;
the handler sketch below illustrates the burden. In light
of those issues, XML luminary James Clark, in a recent interview (
http://www-106.ibm.com/developerworks/xml/library/x-jclark.html?dwzone=xml),
points out that one of the challenges for XML is to "Improve XML processing
models. Right now, developers are generally caught between the inefficiencies
of DOM and the unfamiliar feel of SAX. An API that offers the best of both
is needed."
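Even a small example shows the state-keeping burden. The handler below uses
the standard org.xml.sax API; the element names “item” and “price” are
hypothetical. It must maintain its own ancestor stack just to know where in
the document a piece of text occurred, something a tree API gives for free.

    import java.util.ArrayDeque;
    import java.util.Deque;
    import org.xml.sax.Attributes;
    import org.xml.sax.helpers.DefaultHandler;

    public class PriceHandler extends DefaultHandler {
        // Manually maintained state: the stack of currently open elements.
        private final Deque<String> path = new ArrayDeque<String>();
        private final StringBuilder text = new StringBuilder();

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            path.push(qName);
            text.setLength(0);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String local, String qName) {
            // React only to <price> elements nested under an <item>.
            if ("price".equals(qName) && path.contains("item")) {
                System.out.println("price: " + text.toString().trim());
            }
            path.pop();
        }
    }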
A New Processing Model
Based on the premise that server throughput is, in a
typical enterprise data center setting, much lower than the available bandwidth,
XimpleWare has chosen to focus its efforts on improving XML processing performance
rather than reducing the bandwidth requirement. To address this challenge,
we propose a new XML processing model that works as follows. The entire
document is maintained in memory, and a user gets a complete structural view of
the document by navigating a separate binary file (analogous to an
index in a DBMS, it allows the DOM/XPath implementation to quickly find
the requested section of the text document), which is typically around 30% to 50%
of the size of the XML document. Software generation of the binary file
is comparable to SAX in performance; a custom hardware implementation can
achieve sustained performance of over 100 MB/sec, sufficient to keep pace
with a Gigabit connection. Also, because the binary file is inherently persistent,
one can attach it to the XML document so that application
servers at the receiving end can directly process the XML data without
parsing it again.
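As a rough illustration of the persistence claim, and again assuming the packed
(offset, length) token layout from the earlier sketch, the companion index is
just a flat array of integers and can be dumped and reloaded verbatim. The file
layout below is hypothetical, not XimpleWare’s actual format.

    import java.io.*;

    public class CompanionIndex {

        // Persist the token table next to the XML document; since the
        // tokens are plain integers, this is a straight array dump.
        static void save(long[] tokens, File f) throws IOException {
            DataOutputStream out = new DataOutputStream(
                    new BufferedOutputStream(new FileOutputStream(f)));
            try {
                out.writeInt(tokens.length);
                for (long t : tokens) out.writeLong(t);
            } finally {
                out.close();
            }
        }

        // A receiver loads the table and can then address the original
        // XML text directly, with no re-parsing of the document itself.
        static long[] load(File f) throws IOException {
            DataInputStream in = new DataInputStream(
                    new BufferedInputStream(new FileInputStream(f)));
            try {
                long[] tokens = new long[in.readInt()];
                for (int i = 0; i < tokens.length; i++) tokens[i] = in.readLong();
                return tokens;
            } finally {
                in.close();
            }
        }
    }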
Its immediate advantages are:
- Retention of human readability – Because the processing
performance is gained by attaching a separate file to the XML document,
one doesn’t lose the human readability of XML. In the worst case, even
if the application isn’t equipped to understand the binary file, it can always
fall back on the original XML text and use whatever XML processing API it
prefers.
- No schema required – Because our processing model is
an innovation starting at the tokenization level, it is vocabulary and schema
agnostic and therefore fits in with the “loose coupling” aspect of the
grand vision of Web services.
In addition, we plan to move Schema validation on chip, where it can be done
in parallel with parsing and incur no penalty in parsing performance. This
is achievable because a Schema is, in essence, nothing more than a finite
state machine, which is well suited for hardware implementation.
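A toy software analogue of the idea, for an invented content model
“name, price+”: validation is nothing more than a table of state transitions,
which is exactly the kind of logic that maps cleanly onto hardware.

    public class ContentModelFsm {
        // States: 0 = expect <name>; 1 = expect first <price>;
        //         2 = expect more <price> or end; -1 = error.
        static int step(int state, String element) {
            switch (state) {
                case 0: return "name".equals(element) ? 1 : -1;
                case 1:
                case 2: return "price".equals(element) ? 2 : -1;
                default: return -1;
            }
        }

        static boolean accepts(String[] children) {
            int state = 0;
            for (String e : children) {
                state = step(state, e);
                if (state == -1) return false;
            }
            return state == 2; // accepting only after at least one <price>
        }
    }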
How does our processing model compare with DOM or SAX?
In short, our processing model is more DOM-like in that it loads everything
in memory and allows DOM-type random access within the XML document at
much lower memory consumption. What’s more, by keeping the XML document
in its serialized form, dynamic updates or modifications to the document don’t
require re-serialization of irrelevant parts of the document, resulting
in a dramatic serialization performance improvement (see the splice sketch
below). This is again superior to traditional text processing techniques,
which require the round trip of taking apart the document and putting
everything back, regardless of the processing need.
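A minimal sketch of such an update, assuming the byte offset and length of the
affected token are known (in practice they would come from the companion
index): only the modified span is rewritten, while the untouched prefix and
suffix of the serialized document are copied verbatim.

    import java.nio.charset.Charset;

    public class SpliceUpdate {
        static byte[] replaceSpan(byte[] doc, int offset, int length, String newText) {
            byte[] repl = newText.getBytes(Charset.forName("UTF-8"));
            byte[] out = new byte[doc.length - length + repl.length];
            System.arraycopy(doc, 0, out, 0, offset);            // untouched prefix
            System.arraycopy(repl, 0, out, offset, repl.length); // new content
            System.arraycopy(doc, offset + length,               // untouched suffix
                             out, offset + repl.length,
                             doc.length - offset - length);
            return out;
        }
    }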
Preliminary Benchmark Performance
Our preliminary performance test was designed to compare the data structure
building of our Java-based processing solution with two similar types of
processing technology: Xerces DOM and XMLCursor. The test platform is
an Athlon XP 1900+ machine with 512 MB of RAM running Red Hat Linux,
kernel version 2.4. The test file is a single 100 KB purchase order
document taken from BEA’s XMLCursor performance benchmark package.
Processor type                                   Xerces 2.3   XMLCursor   XimpleWare
Average time for building data structure (ms)          42.2        25.1         15.4

Table 1: Preliminary performance figures for data structure building
The SAX-like performance of our processing model is “by design,” and the
figures shown above are preliminary and subject to further optimization.
We intend to provide more up-to-date results by the time of the workshop.
The navigation performance, which we were still working on at the time
of submission, is on par with Xerces and XMLCursor; we intend to show that
at the workshop as well.
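For reference, a timing harness of the kind behind Table 1 might look like the
hypothetical loop below; the file name, iteration count and parse() stub are
illustrative, with the parser under test (Xerces DOM, XMLCursor or ours)
plugged into parse().

    import java.io.DataInputStream;
    import java.io.File;
    import java.io.FileInputStream;

    public class ParseBench {
        public static void main(String[] args) throws Exception {
            File f = new File("po_100k.xml"); // hypothetical test document
            byte[] doc = new byte[(int) f.length()];
            DataInputStream in = new DataInputStream(new FileInputStream(f));
            in.readFully(doc);
            in.close();

            int iterations = 100;
            long start = System.nanoTime();
            for (int i = 0; i < iterations; i++) {
                parse(doc); // build the data structure under test
            }
            long avgMs = (System.nanoTime() - start) / iterations / 1000000L;
            System.out.println("avg build time: " + avgMs + " ms");
        }

        static void parse(byte[] doc) {
            // plug in Xerces DOM, XMLCursor, or the XimpleWare parser here
        }
    }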
Potential Applications
By allowing efficient consumption of the binary file by XML processors that
require optimized processing, without losing the standard text file, this
processing model could be particularly useful in a number of scenarios:
- XML message load balancers, which can boost XML processing throughput
by quickly generating the binary file and attaching it to the XML document;
- XML database servers, which could build the binary file as part of the
storage/indexing process and make it available upon retrieval to speed up
downstream DOM and XPath processing;
- XML intermediary applications operating at near network speed, for which
the processing model is probably most efficient, since usually only a limited
amount of update is required while most of the XML payload is left unmodified.
Q/A Section
1. What work has your organization done in this area? (We are particularly
interested in measurements!)
We (XimpleWare) are focusing on optimizing XML processing
performance rather than the bandwidth requirement of XML, and we
achieve our goal without compromising the “view source principle.”
2. What goals do you believe are most important
in this area? (e.g. reducing bandwidth usage; reducing parse time; simplifying
APIs or data structures, or other goals)
We think that processing performance and a simpler API are the
most important goals in this area.
3. What sort of documents have you studied the most?
(e.g. gigabyte-long relational database table dumps; 20-MByte telephone
exchange repair manuals; 2 KByte web service requests)
Our processing model works with any XML file. For our benchmarks,
we used XML files ranging from 10 KB to 1 MB in size containing
fairly complex structures.
4. What sorts of applications did you have in mind?
Our technology is applicable wherever performance is important.
Possible areas of application include XML intermediaries (firewalls,
data routers, etc.), XML middleware applications, and XML database applications.
5. If you implemented something, how did you ensure
that internationalization and accessibility were not compromised?
Since our processing model maintains the XML document in its
original form, it goes wherever XML goes. Neither internationalization
nor accessibility is compromised.
6. How does your proposal differ from using gzip on
raw XML?
Unlike gzip, we do not optimize the bandwidth requirement of XML;
our initial focus is processing performance.
7. Does your solution work with any XML? How is it
affected by choice of Schema language? (e.g. W3C XML Schema, DTD, Relax
NG)
Yes. Because our technology is an innovation at the tokenization
level, it works with any XML and is schema agnostic.
8. How important to you are random access within a
document, dynamic update and streaming, and how do you see a binary format
as impacting these issues?
They are very important. In terms of raw speed, SAX is much faster
than DOM; however, when factoring in the overhead of state management,
SAX’s real-world performance can be a lot slower than its raw performance.
In that regard, our binary “companion” file combines the best of both
DOM and SAX.
Summary
While the processing performance of XML is a very important issue that will
directly impact its adoption as a platform-independent, interoperable
and open data/document encoding format, a good solution should not compromise
the human readability that lies at the core of XML’s value proposition.
By pioneering a hardware-accelerated XML processing model with these
properties, XimpleWare hopes to help alleviate XML’s performance
issue, and to do so without violating the “view-source principle.”