The World Wide Success That Is XML
Most of the XML Working Groups have been closed by now; this year saw XQuery and XSLT close, their work successfully completed.
As we wind down work on standardizing the XML stack at W3C, it’s worth looking at some of what we have accomplished and why. W3C XML, the Extensible Markup Language, is one of the world’s most widely used formats for representing and exchanging information. The final XML stack is more powerful and easier to work with than many people know, especially people who might not have used XML since its early days.
Today, XML tools work with JSON, with linked data, with documents, with large databases (both SQL/relational and NoSQL), with the Internet of Things and in automobiles and aircraft and music players. There are even XML shoes. It’s everywhere.
XML can be stored in very efficient databases and processed with a highly optimized query language (XQuery, and its younger cousin JSONiq), can be transformed with an efficient declarative tree manipulation language (XSLT 3), orchestrated in pipelines (XProc), delivered with one of the most effective compression schemes around (EXI, with low entropy server-side parsing), and formatted to PDF with both XSL-FO and CSS. All of these things can be done both with proprietary applications and with open source software.
How did we get here?
The Web SGML Working Group was formed to solve a specific problem: to agree on a subset, or profile, of the large and complex SGML specification that could be shared on the Web and displayed in browser plugins. There were two such plugins at the time, one from SoftQuad (Panorama) and one from EBT/Inso that was never released. Unfortunately it was difficult to construct an SGML document that both plugins would display - there was a clear need for a standard.
We were not trying to replace HTML. We weren't even expecting native XML support in Web browsers. Nor were we trying to make a format for interchange of data or for remote procedure calls.
XML has some redundancy in its syntax. We knew from experience with SGML that documents are generally hard to test, unlike program data, and the redundancy helped to catch errors early and could save up to 80% of support costs (we measured it at SoftQuad). The redundancy, combined with grammar-based checking using schemas of various sorts, helped to improve the reliability of XML systems. And the built-in support for multilingual documents with xml:lang was a first, and an enduring success.
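As a rough illustration of how that redundancy catches errors early, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The sample documents and the check_well_formed helper are hypothetical, made up for this example. Because every end tag must repeat the name of its start tag, a misspelled end tag is rejected at parse time rather than silently producing a wrong document.

```python
import xml.etree.ElementTree as ET

# Namespace URI that the reserved "xml" prefix is bound to.
XML_NS = "http://www.w3.org/XML/1998/namespace"


def check_well_formed(text):
    """Return None if text parses as XML, otherwise the parser's error message."""
    try:
        ET.fromstring(text)
        return None
    except ET.ParseError as err:
        return str(err)


# Hypothetical sample documents, invented for this sketch.
good = '<note xml:lang="en"><title>Hello</title><body>Hi there</body></note>'
# The end tag below is misspelled, so it does not match <title>.
bad = '<note xml:lang="en"><title>Hello</titel><body>Hi there</body></note>'

good_error = check_well_formed(good)
bad_error = check_well_formed(bad)

# xml:lang is exposed as an attribute in the XML namespace.
lang = ET.fromstring(good).get("{" + XML_NS + "}lang")

print("good document error:", good_error)
print("bad document error:", bad_error)
print("language of good document:", lang)
```

This only checks well-formedness; grammar-based checking against a schema would be a further step and is not shown here.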
XML, XSL-FO, XSLT, XQuery, XML Schema, XProc, EXI: all of these Working Groups included world experts and had strong industry representation. They were guided by experienced chairs.
Most of the work has finished: people are using the specifications in production and the rate of errata has slowed to a crawl. XQuery, XSLT and EXI ended this year. But just because the specification work is ending doesn’t mean XML is ending! It means XML is at a stage where the technology is mature and widely deployed. People aren’t reporting many new problems because the problems have already been worked out.
For sure some of the more recently-published specifications are still rolling out: XSLT 3 is very recent, but there was good implementation experience when it was published as a Recommendation. EXI Canonicalisation was published as a Recommendation this past June, and because EXI can be used to send just about any stream of parse events over the wire much more efficiently than compressing the interchange syntax, this spec was eagerly awaited.
But for the most part, it’s time to sit back and enjoy the ability to represent information, process it, and interchange it, with robustness and efficiency. There are lots of opportunities to explore in making good, sensible use of XML technologies.
XML is everywhere.
Thank you to all who have contributed.
Liam Quin, leaving W3C this week after almost 17 years with XML.
I use https://www.w3.org a lot to check my websites for HTML and CSS errors. A must-have tool. I recommend it to every SEO expert and webmaster.
There are now about 13,600 Google hits for the string "XML is dead".
And 233,000 hits for "Google is Dead", go figure ;)
And 84,400 hits saying that "XML is alive" (whoever said that… :)
Thanks for your great contributions, Liam!
.. and 563 hits for "XML is sexy." (mind you, 13,100,000 if I omit surrounding quotes ;-) )
Liam, thanks ever so much for all your work over nearly two decades inside and outside of the W3C community. I don't like using search engine metrics for any substantive discussion; I prefer to look at the long list of accomplishments you bring to the table. It makes me believe there is a future for XML in many places, such as office formats, files that are still produced by the billions every week. I have no reason to doubt that this work will continue.
I wish you personally a great next step in your career and look forward to your future endeavours.
In my world of geography, XML is very much a staple - in the guise of "Geography Markup Language" (GML). It sits at the heart of national & international Spatial Data Infrastructures, such as the European INSPIRE - with coordinated application schema development across 27 countries and 30+ themes of information relevant to the environment.
Yes, "younger" programmers like JSON, but XML is still very much alive.
As a designer who uses XML markup, mainly for web design, I find it gives me a way to separate data from formatting. Having used and learned quite a few different dev skills, I find XML really easy to understand, which is great! JSON is always mentioned at work, but it's something I haven't used myself yet.
Back in the day, I took a look at SGML and found it too complicated to get into. XML came along, I read Bob DuCharme's Annotated Specification, and it quickly became clear that here was a practical solution to a great many problems of data separation, structure and manipulability.
With the selectability of XPath, the pattern-based transformations of XSLT and the robust typing of XSD, XML was now the ideal format for richly structured data and an error-constraining intermediary between database and multiple publishing outputs. This was enhanced when SQL databases started building in serious XML support.
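To give a flavour of that selectability, here is a small sketch using Python's standard xml.etree.ElementTree module, which implements only a limited subset of XPath. The course catalog is hypothetical data invented for this example, not anything from a real system.

```python
import xml.etree.ElementTree as ET

# Hypothetical course catalog, invented for this sketch.
catalog = ET.fromstring("""
<catalog>
  <course id="c1" level="intro"><title>XML Basics</title></course>
  <course id="c2" level="advanced"><title>XSLT Patterns</title></course>
  <course id="c3" level="intro"><title>XPath Essentials</title></course>
</catalog>
""")

# Select the title of every course whose level attribute is "intro".
# The path is evaluated relative to the <catalog> root element.
intro_titles = [
    title.text
    for title in catalog.findall("./course[@level='intro']/title")
]

print(intro_titles)
```

A full XPath or XQuery processor can express far richer selections; the point here is only that the selection is stated declaratively instead of being written as nested loops.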
I see an XML-based course publishing HTML subset I helped build many years ago is still going strong.
If the problems that the XML family of technologies solves were explained clearly to unfamiliar developers (web, database etc.) then there might be more uptake. I have seen pretty horrible data manipulation efforts in procedural, markdown, declarative and scripting languages that could be replaced by elegant, efficient XML solutions that offload appropriate work to the XML processor and take advantage of pipeline-based flexibility and maintainability.
I see XML as a massive success: it is not only treated as the backbone of asynchronous communications but can also stand in for a database, at least for small data.
Thank you for your reply. I agree, of course. Note that “small” is subjective: there are multi-terabyte and probably multi-petabyte databases whose transfer syntax was XML.
The XML databases actually store indexed forests of data, not the actual pointy-bracket syntax, just as relational databases don't store CSV files :) And XQuery is available from SQL in the major commercial databases, as well as being used as the native query language for “XML-native” databases.
Thanks again for commenting.
I think XML is a success and will not disappear any time soon. For example, even today many companies depend on XML for API solutions (SOAP). I personally think SOAP/XML is better than the REST approach.