DraconianErrorHandling

From HTML WG Wiki

Many people now assume that draconian error handling was somehow inherent in the design of XML, but it was not. The members of the group working on the spec were split into two camps with differing views on error handling:

  • the camp of so-called "Draconians" (which included Tim Bray, Jon Bosak and others), which favored the requirement that all well-formedness errors be treated as fatal errors -- in contrast to the approach to error handling already used in existing Web browsers, which Tim Bray referred to as a "race-to-the-bottom a la HTML"
  • the camp of so-called "Tolerants" (James Clark, Eliot Kimber, Michael Sperberg-McQueen, Dave Hollander, and others), which favored a form of non-fatal error handling closer to what Web browsers already used when processing HTML
In May of 1997, the group took a vote, which the Draconians won 7-4, and draconian error handling thus became forever enshrined in XML.
However, despite the outcome of the vote, even some members of the group continued to advocate for non-draconian error handling -- most notably the group's Technical Lead, James Clark, who wrote:
"I think users and application builders should have a choice with what they do with invalid data. I cannot see how a user or application builder can be disadvantaged by being provided with this choice, and I therefore plan to continue to provide it even if the spec says that this is non-conforming."

For a summary that includes quotations from a number of people from both camps who took part in the discussion, see Mark Pilgrim's "The history of draconian error handling in XML".

When the XHTML 1.0 Recommendation was published, one of its inherent principles was draconian error handling -- something previously foreign to HTML and seen by many as fundamentally not Web-friendly -- since it was at the time (and continues to be) at odds with how existing browsers actually process existing HTML content.
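
As a rough illustration of the difference (a sketch only, not taken from any of the sources quoted on this page), the following Python snippet feeds the same not-well-formed markup to a strict XML parser and to a tolerant HTML tokenizer, both from the standard library. The function names parse_draconian and parse_tolerant, the TextCollector class, and the sample markup are made up for this example. Python's html.parser is also not a browser-grade HTML parser; it merely shows the non-fatal style of handling.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical sample input: the <b> element is never closed,
# so this is not well-formed XML.
MARKUP = "<p>Hello <b>world</p>"


def parse_draconian(markup):
    """Return the text content, or None if the markup is not well-formed.

    The XML parser treats any well-formedness error as fatal and
    raises ParseError instead of producing a partial result.
    """
    try:
        root = ET.fromstring(markup)
    except ET.ParseError:
        return None
    return "".join(root.itertext())


class TextCollector(HTMLParser):
    """Collect character data, ignoring tag mismatches."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def parse_tolerant(markup):
    """Return the text content, recovering from markup errors."""
    collector = TextCollector()
    collector.feed(markup)
    collector.close()
    return "".join(collector.chunks)


if __name__ == "__main__":
    print(parse_draconian(MARKUP))  # None: the whole document is rejected
    print(parse_tolerant(MARKUP))   # Hello world
```

With the draconian approach the reader gets nothing from a document containing a single error; with the tolerant approach the reader still sees the text, at the cost of the parser having to decide how to recover.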

January 2004 saw a flurry of blog postings and mailing-list postings related to dealing with the problem of not-well-formed and invalid content in syndicated feeds. Among others, Mark Pilgrim and Ian Hickson advocated the position that draconian error handling was a counterproductive way of dealing with the problem.

Mark Pilgrim wrote:

"Various people have tried to mandate this principle out of existence, some going so far as to claim that Postel’s Law should not apply to XML, because (apparently) the three letters X, M, and L are a magical combination that signal a glorious revolution that somehow overturns the fundamental principles of interoperability."

"The client is the wrong place to enforce data integrity. It’s just the wrong place. I hear otherwise intelligent people claim that if everyone did it, it would be okay. No, if everyone did it, it would be even worse. If you want to do it, of course I can’t stop you. But think about who it will hurt."

Ian Hickson wrote:

"Authors will write invalid content regardless. If the specification doesn't say what should happen, then once there is a dominant browser, its error handling (whether intentionally designed or just a side-effect of the implementation) will become the de facto standard... Specifications should explicitly state what the error recovery rules are. They should state what the authors must not do, and then tell implementors what they must do when an author does it anyway... Specifications should ensure that compliant implementations interoperate, whether the content is valid or not."

Later in 2004, Mark Pilgrim published an article at XML.com titled "XML on the Web Has Failed":

  • XML on the Web has failed. Miserably, utterly, completely. Multiple specifications and circumstances conspire to force 90% of the world into publishing XML as ASCII. But further widespread bugginess counteracts this accidental conspiracy, and allows XML to fulfill its first promise ("publish in any character encoding") at the expense of the second ("draconian error handling will save us").
  • The promise of "draconian error handling will save us" was an empty promise. Whatever interoperability we now enjoy has not been rooted in draconian error handling. Draconianism was a grand experiment, and maybe it could have ensured interoperability if clients hadn't been so buggy and the creators of XML had understood how it interacted with other specifications (like MIME) from the beginning, instead of being blindsided by it years later.