When to standardize, especially an RDF API




The HTML 4.01 specification has an IMG element, but there is no normative dependency on the PNG or GIF or JPEG specifications. "What good is an HTML user agent that doesn't support GIFs?!?" you might ask. And you wouldn't be alone. From the early days of W3C, there have been calls for a standard "web browser profile" component specification that listed which URI schemes (http, ftp, mailto, ...) and which formats (HTML 3.2, GIF, ...) and so on a standard web browser should support. It always seemed to me that the market would sort that out by itself and that any standard W3C could put in place would be perennially out of date and irrelevant.

According to the Web Architecture document, orthogonal specifications are a good thing. Section 5.1, Orthogonal Specifications, says:

When two specifications are orthogonal, one may change one without requiring changes to the other, even if one has dependencies on the other. For example, although the HTTP specification depends on the URI specification, the two may evolve independently. This orthogonality increases the flexibility and robustness of the Web.

W3C inherited from the IETF a bias for specifying interfaces rather than components; i.e. data formats and protocols rather than software modules. I gather that in TV/consumer electronics, there are useful component standards for Web User Agents. But note that an in-car voice browser, a screen reader, or good old lynx supports neither PNG nor GIF, and while their marketplaces are perhaps smaller than those of desktop or mobile screen-oriented browsers, they're pretty much first-class citizens as far as W3C specifications, especially Web Architecture, are concerned.

APIs are more like interfaces than components, but they tend to be tied to platforms. The IETF has a cultural bias for on-the-wire formats over APIs, and for good reasons, I think. With OMG specs, at least the early CORBA specs, lots of products conformed to the spec (or at least claimed to) without actually interoperating with each other. It wasn't until the arrival of IIOP, an on-the-wire CORBA format, that the rubber hit the road and the interoperability issues got addressed.

Meanwhile, the IETF's aversion to APIs is not without exception: witness GSSAPI. And the W3C has been doing Javascript APIs in the form of the DOM since the early days of XML. Some argue that the DOM specs are ugly, and I tend to agree. SAX and JDOM and the libxml2 pull API have a more elegant feel. But with DHTML and AJAX, one has to wonder: did the W3C DOM Recommendations do more harm or good? Sometimes a mediocre standard is better than no standard. It's clear to me that HTML standardization does more good than harm, but I don't pretend that the HTML design is a thing of beauty.

W3C also started out with a bias against standardizing programming languages. The principle of least power is a part of the lore that the TAG has recently adopted in a finding. For those reasons, when the Javascript designers were looking for a standardization forum in 1996 or so, I let it go to ECMA rather than arguing that it should be done at W3C. The fact that XSLT is Turing-complete flew under the radar a little bit at first; the WG was able to negotiate requirements by noting that its intended scope was formatting XML documents, not transformations in general. And I heard many times that people who don't see themselves writing Java programs are happy to develop XSLT transformations. But I had very strong misgivings about crossing that line. By the time I was reviewing XQuery/XPath 2.0 functions and operators, I disregarded any claims about narrow scope and looked at it as the standard library for the new computing platform that it is.

And now with the Rich Client Activity and the Web API WG, we're fully engaged in standardization of Javascript APIs with no pretense about language independence. It remains to be seen whether we're actually going to tackle enough of the security policy issues to standardize a real platform or whether we need to just leave that to the market for a while. But enough of the right people seem to be involved in the work on XMLHttpRequest to make me think we're doing more good than harm there. I haven't seen enough test cases for my taste yet, but I gather they're on the way.

I don't do much Javascript hacking myself, but I gather it's an unholy mess of incompatibilities. "Where was W3C when XMLHttpRequest was being designed in the first place?" you might ask. Maybe we were asleep at the wheel and we could and should have prevented the mess. But maybe we were in "mostly harmless" or "first do no harm" mode, letting the market establish what's really needed. I was dead set on tackling multi-namespace integration in the first version of XML Schema, but in hindsight it's pretty clear to me that we should have gone a little slower, i.e. started with a smaller scope.

The question of when and whether to standardize an RDF API has been hanging in the air for a decade or so. My personal experience with Python APIs for RDF suggests that, for example, there's a core of cwm and rdflib and redland that is the same except for a few coin-toss issues. And there are several mature Java APIs, and the tabulator has an RDF store in Javascript. Meanwhile, SPARQL is maturing; maybe, like SQL, the string format of queries (and other operations) is the main thing we need to standardize. A survey on Standardizing a Semantic Web API for Javascript is open. Please let us know what you think.
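
As a rough illustration of what such a shared core might look like, here is a minimal sketch of an in-memory triple store with add, remove, and pattern matching. Everything in it (the `TinyStore` class, its method names, and the `ex:` and `dc:` example terms) is hypothetical and chosen for illustration; it is not the actual API of cwm, rdflib, or redland.

```python
from typing import Iterator, Optional, Set, Tuple

# A triple is (subject, predicate, object); plain strings stand in for
# URIs and literals to keep this sketch short.
Triple = Tuple[str, str, str]


class TinyStore:
    """Hypothetical in-memory triple store, for illustration only."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, s: str, p: str, o: str) -> None:
        """Add one triple; adding a duplicate has no effect."""
        self._triples.add((s, p, o))

    def remove(self, s: str, p: str, o: str) -> None:
        """Remove one triple if it is present."""
        self._triples.discard((s, p, o))

    def match(
        self,
        s: Optional[str] = None,
        p: Optional[str] = None,
        o: Optional[str] = None,
    ) -> Iterator[Triple]:
        """Yield triples matching the pattern; None acts as a wildcard."""
        for t in self._triples:
            if (
                (s is None or t[0] == s)
                and (p is None or t[1] == p)
                and (o is None or t[2] == o)
            ):
                yield t


store = TinyStore()
store.add("ex:post", "dc:creator", "ex:author")
store.add("ex:post", "dc:title", '"When to standardize"')
store.add("ex:other", "dc:creator", "ex:author")

# Which subjects have ex:author as their dc:creator?
by_author = sorted(t[0] for t in store.match(p="dc:creator", o="ex:author"))
```

The "coin-toss issues" would be choices such as whether a wildcard is spelled `None` or something else, or whether `match` returns triples or bindings. A query-string approach sidesteps those choices by moving the pattern into a standardized string, leaving only a thin "run this query" call to vary by platform.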

There are precious few "W3C should never do XYZ" rules that I think are worth setting in stone. While we will naturally attract work that is like what we have done before, any place we can get a critical mass of the marketplace to get together and do the hard work of testing, internationalization, and accessibility in a reasonably timely, fair and accountable way is a place where W3C should be able to do more good than harm.
