The World Wide Success That Is XML

Most of the XML Working Groups have been closed by now; this year saw XQuery and XSLT close, their work successfully completed.

As we wind down work on standardizing the XML stack at W3C, it’s worth looking at some of what we have accomplished and why. W3C XML, the Extensible Markup Language, is one of the world’s most widely used formats for representing and exchanging information. The final XML stack is more powerful and easier to work with than many people know, especially for people who might not have used XML since its early days.

Today, XML tools work with JSON, with linked data, with documents, with large databases (both SQL/relational and NoSQL), with the Internet of Things and in automobiles and aircraft and music players. There are even XML shoes. It’s everywhere.

XML can be stored in very efficient databases and processed with a highly optimized query language (XQuery, and its younger cousin JSONiQ), can be transformed with an efficient declarative tree manipulation language (XSLT 3), orchestrated in pipelines (XProc), delivered with one of the most effective compression schemes around (EXI, with low entropy server-side parsing), formatted to PDF with both XSL-FO and CSS, and all of these things can be done both with proprietary applications and with open source software.

How did we get here?

The Web SGML Working Group was formed to solve a specific problem: to agree on a subset, or profile, of the large and complex SGML specification that could be shared on the Web and displayed in browser plugins. There were two such plugins at the time, one from SoftQuad (Panorama) and one from EBT/Inso that was never released. Unfortunately, it was difficult to construct an SGML document that both plugins would display; there was a clear need for a standard.

We were not trying to replace HTML. We weren't even expecting native XML support in Web browsers. Nor were we trying to make a format for interchange of data or for remote procedure calls.

XML has some redundancy in its syntax. We knew from experience with SGML that documents are generally hard to test, unlike program data, and the redundancy helped to catch errors early and could save up to 80% of support costs (we measured it at SoftQuad). The redundancy, combined with grammar-based checking using schemas of various sorts, helped to improve the reliability of XML systems. And the built-in support for multilingual documents with xml:lang was a first, and an enduring success.
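As a rough illustration of the two points above, here is a minimal sketch using Python's standard-library `xml.etree.ElementTree` parser. The sample documents and the helper names `check_well_formed` and `effective_langs` are made up for this example. It shows a mismatched closing tag being rejected at parse time, and `xml:lang` being inherited by child elements that do not override it.

```python
import xml.etree.ElementTree as ET

# Namespace URI that the reserved "xml" prefix is always bound to.
XML_NS = "http://www.w3.org/XML/1998/namespace"


def check_well_formed(text):
    """Return (True, None) if text parses, else (False, parser message)."""
    try:
        ET.fromstring(text)
        return True, None
    except ET.ParseError as err:
        return False, str(err)


def effective_langs(elem, inherited=None):
    """Yield (tag, language) pairs, applying xml:lang inheritance."""
    # An element without its own xml:lang takes the nearest ancestor's value.
    lang = elem.get(f"{{{XML_NS}}}lang", inherited)
    yield elem.tag, lang
    for child in elem:
        yield from effective_langs(child, lang)


# Hypothetical sample documents, not taken from any real system.
good = '<doc xml:lang="en"><p>Hello</p><p xml:lang="fr">Bonjour</p></doc>'
bad = '<doc xml:lang="en"><p>Hello</doc>'  # <p> is never closed

ok, _ = check_well_formed(good)
broken_ok, message = check_well_formed(bad)
langs = list(effective_langs(ET.fromstring(good)))

print(ok, broken_ok, message)
print(langs)
```

The named end tag is the redundancy at work: because `</doc>` does not match the open `<p>`, the parser stops immediately instead of guessing at the author's intent.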

XML, XSL-FO, XSLT, XQuery, XML Schema, XProc, EXI: all of these Working Groups included world experts and had strong industry representation. They were guided by experienced chairs.

Most of the work has finished: people are using the specifications in production and the rate of errata has slowed to a crawl. XQuery, XSLT and EXI ended this year. But just because the specification work is ending doesn’t mean XML is ending! It means XML is at a stage where the technology is mature and widely deployed. People aren’t reporting many new problems because the problems have already been worked out.

For sure some of the more recently-published specifications are still rolling out: XSLT 3 is very recent, but there was good implementation experience when it was published as a Recommendation. EXI Canonicalisation was published as a Recommendation this past June, and because EXI can be used to send just about any stream of parse events over the wire much more efficiently than compressing the interchange syntax, this spec was eagerly awaited.

But for the most part, it’s time to sit back and enjoy the ability to represent information, process it, and interchange it, with robustness and efficiency. There are lots of opportunities to explore in making good, sensible use of XML technologies.

XML is everywhere.

Thank you to all who have contributed.

Liam Quin, leaving W3C this week after almost 17 years with XML.
