This is a draft of the editorial for the Mar/Apr 1997 issue of Web Apps Magazine, ISSN #1090-2287. See also the Jan/Feb 1997 Languages Issue and May/Jun 1997 Databases Issue editorials, and more about WebApps.

Dan Connolly
Austin, TX
Created Thu Feb 13 08:52:55 CST


I must take this opportunity to dispel a myth that is all too pervasive in the scientific and product literature surrounding the Web: that distributed objects are something that can be, or some day will be, added to the Web. Distributed objects are the very heart of the Web, and have been since its invention.

HTTP was designed as a distributed realization of the Objective C (originally Smalltalk) message passing infrastructure: the first few bytes of every HTTP request are a method name: GET or POST. The term Uniform Resource Locator is just the result of squeezing the term object reference through the IETF standardization process.
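
To make the analogy concrete, here is a minimal sketch (in Python, purely for illustration; the host name below is a placeholder, not a real endpoint) of an HTTP request viewed as a message send: the method selector goes out first on the wire, followed by the reference to the object that should receive it.

    import socket

    def send_message(host, path, method="GET"):
        # Dispatch 'method' to the remote object identified by host and path.
        sock = socket.create_connection((host, 80))
        try:
            request = "%s %s HTTP/1.0\r\nHost: %s\r\n\r\n" % (method, path, host)
            sock.sendall(request.encode("ascii"))
            chunks = []
            while True:
                data = sock.recv(4096)
                if not data:
                    break
                chunks.append(data)
            return b"".join(chunks)
        finally:
            sock.close()

    # reply = send_message("example.com", "/")   # a GET message sent to a remote object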

The notion that some web resources are 'static' while others are 'dynamic' or 'computed on the fly' is an unfortunately limited understanding of the system, based on the very common experience of studying the CERN and NCSA httpd implementations. The distinction between 'static' and 'dynamic' web pages is not in any of the HTTP, HTML, or URL specifications at all. By definition, all web resources are opaque objects that respond to a GET method.
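
A small sketch may sharpen the point. In the toy code below (mine, not anything drawn from the specifications), a resource backed by a file on disk and a resource computed on the fly present exactly the same face to the code that dispatches GET; the caller cannot tell which is which, and has no reason to care.

    import time

    class FileResource:
        # A 'static' page: GET returns bytes stored on disk.
        def __init__(self, path):
            self.path = path
        def GET(self):
            with open(self.path, "rb") as f:
                return f.read()

    class ClockResource:
        # A 'dynamic' page: GET computes its representation on the fly.
        def GET(self):
            return time.ctime().encode("ascii")

    def serve(resource):
        # The dispatching code neither knows nor cares how GET is implemented.
        return resource.GET()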

So much for the notion that the Web has yet to become a distributed object system...

Don't get me wrong: I won't pretend that the Web is a highly advanced distributed object system. Quite the contrary: it's the minimum amount of distributed object technology necessary to get the job done.

Well... some jobs anyway.

The Web we see today is a tribute to the power of opaque objects: some resources are implemented by simple file servers, some by perl scripts, some by expert systems, some by three-tiered database systems, and for all I know, some by COBOL programs. And some of them are powered by advanced object technology applications. The power is that the same clients interoperate with them all through the same simple interface consisting of GET and POST.

But are there some jobs that just can't be done through that interface? More to the point, are there some jobs that are so impeded by this simple object technology that they aren't feasible, whereas they would be feasible with a bit more advanced object technology integrated into the Web?

The lowest common denominator is so low that a lot of time and energy is spent dumbing down the service interface of systems to this level; meanwhile, even more energy is spent injecting smarts back into the interface on the client side. It would seem much cheaper and more efficient to conduct the communication at a higher level.
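
As a sketch of that round trip (the markup and the dollar figure are invented for illustration): the server flattens a structured answer into presentation markup, and the client spends its effort recovering the structure on the other end.

    import re

    def server_render(balance):
        # The service flattens a structured value into presentation markup.
        return "<html><body>Your balance is <b>$%.2f</b></body></html>" % balance

    def client_scrape(html):
        # The client injects the smarts back in by pattern-matching the markup.
        match = re.search(r"\$([0-9]+\.[0-9]{2})", html)
        return float(match.group(1)) if match else None

    assert client_scrape(server_render(42.50)) == 42.5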

But how often does the server's notion of "higher level" coincide with the client's notion of "higher level"? Clearly there is a wide variety of higher level interfaces, and we need some sort of interface discovery and negotiation. What language should we express these interfaces in? Or should we pass them by reference, with an option to dereference the interface identifier, à la some sort of interface repository?
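
One can imagine the negotiation step looking something like the following sketch; every interface identifier in it is invented, and the discovery mechanism itself is left as a black box, which is precisely the open question.

    HTTP_FORMS = "urn:example:http-forms"   # the lowest common denominator

    def negotiate(client_speaks, server_offers):
        # Pick the richest interface both parties claim to implement,
        # falling back to plain HTTP plus forms when nothing richer is shared.
        for interface in client_speaks:     # ordered from richest to plainest
            if interface in server_offers:
                return interface
        return HTTP_FORMS

    client = ["urn:example:ledger-v2", "urn:example:ledger-v1", HTTP_FORMS]
    server = {"urn:example:ledger-v1", HTTP_FORMS}
    assert negotiate(client, server) == "urn:example:ledger-v1"

The fallback to plain HTTP and forms is the design choice that keeps the exchange working even when the two parties share nothing richer.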

What are the odds that the client and the server share any richer interface than HTTP and forms? For parties with a manually established relationship, the odds are quite high. But such situations don't call for any new infrastructure -- just creative applications design. In order for dynamic interface discovery and negotiation to be useful, we need to see a marketplace of interfaces, from commonly used interfaces and frameworks to highly specialized but widely known interfaces.

The fascinating question that comes to mind is: does this commerce in rich interfaces have an overall positive or negative impact on the robustness of the system? As the richness of an interface increases, so does its complexity, and down go the odds that the two parties have implemented the subtleties in a compatible way. And I haven't even raised the specter of lifecycle issues. The Web is in continuous operation after all: the old clients and servers never completely go away; any deployment is a commitment to essentially eternal backwards compatibility.

What fascinates me is the tension between classical and chaotic systems. Object technology reminds me of classical mathematics: complexity management by the use of formal systems built from primitive axioms (à la an object system), with inference rules structured on top (à la classes). The whole thing is very left-brain and anal. Very much my style, I have to admit.

But it's also very fragile: one goof, and suddenly you've got both P=true and P=false and the whole thing comes crumbling down. A distributed object system is subject to the same sort of catastrophic failure, if parties blindly assume that other parties adhere strictly to the stated interfaces and protocols. Fault tolerance is essential to robustness.

OK, so that's no great revelation: we all know that there is no such thing as a free lunch, and that distribution doesn't come for free.

But richer interfaces only compound the cost of fault tolerance. For each interface, the possible failures have to be enumerated (usually through painful experience) and addressed. The HTTP interface is shared by the whole community, and that community shares the burden of building this knowledge about fault tolerance. The cost of developing this knowledge seems like a damper on the marketplace of interfaces.

Achieving fault tolerance takes much longer than achieving plain correctness with respect to the interface specification -- I routinely hear reports of two or three times as much, and even ten times is not terribly rare. So one step forward in complexity management from richer interface description is two, three, or ten steps backward in total development cost.

It's no wonder that a lot of developers choose the quick-and-dirty, organic, chaotic approach to applications development. They simply accept the complexity of distributed applications development as a fact and deal with it as a craftsman or artist, feeling their way around. Given the rapid pace of change in the Web, they shy away from development tools and methodologies, expecting the next application to be sufficiently different that the investment will not pay off.

But what will be the trend as distributed objects go global, as developers can collaborate from across the planet almost as easily as across the hall, exchanging software components without even exchanging names, and as applications have the total resources of the network at their disposal?

The whole Web is clearly a chaotic system -- predicting its behaviour is more like predicting the weather than predicting the path of a bullet. But the benefits of classical complexity management in the small, i.e. the use of object technology within applications and within software development organizations, are beyond doubt.

It seems to me that the part of the world that's changing too fast for object technology and tools to help is the smaller fraction of the 80/20 rule. The bulk of most applications is not novel, and while rich interface description may be costly to develop initially, reuse justifies the cost.

The Web is an information system designed to help people exchange information, which means that it is designed to model the real world of ideas, communication, relationships, and social systems. The real world seems to be a system composed of objects with well-defined boundaries: cells with cell walls, organisms, families, companies, and nations. The economies of scale in this system clearly rely on well-defined interfaces: social norms between people in families, policies in companies, laws in nations.

If this analogy is at all useful for charting the evolution of the Web, I see a lot of unexploited territory in the application of object technology to the Web. It seems the Web is still in a primitive state of formation: loose bands of individuals hunting and gathering. I look forward to great nations in this new medium.