Modern Web Programming

Java is a new object-oriented programming language developed at Sun Microsystems to solve a number of problems in modern programming practice
-- java@java.sun.com
HotJava Home Page

The Web is becoming a complex distributed information system. To address the current market demand, we need to deploy security mechanisms, caching, stylesheets, distributed indexing, and other technologies. And on the horizon we see mobile agents, realtime interaction, and virtual reality.

Meanwhile, we're dealing with interoperability problems because of brokenness like fixed string buffer sizes in HTTP servers and HTML comment parsing bugs.

It's time for a revolution in web software development technology. This is my take on what we need and what technologies fill those needs.

Key Features in Development of Distributed Hypermedia Applications

Interfaces/modules

The applications tend to be large. Interfaces and modules help a lot with complexity management and reuse.

Good ideas: Modula-3 namespace management (Ada-9x seems similar, and adds hierarchical interface names). Objective C and Java overcome the need for multiple-inheritance by using protocols/interfaces (something like Beta patterns?). Python follows suit, as does Xerox's ILU.

Tcl and C are really poor here. Perl, scheme48, Java, and OMG IDL are workable but less than ideal.
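To make the protocols/interfaces idea concrete, here is a minimal sketch in present-day Python. The names `Fetcher`, `Cacheable`, and `StaticPage` are made up for illustration and come from no real library. The point is only that one class can satisfy two interfaces without inheriting from either.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Fetcher(Protocol):
    """Hypothetical interface: anything that can fetch a document by address."""

    def fetch(self, address: str) -> str: ...


@runtime_checkable
class Cacheable(Protocol):
    """Hypothetical interface: anything that can report a cache key."""

    def cache_key(self) -> str: ...


class StaticPage:
    """Satisfies both interfaces structurally, with no multiple inheritance."""

    def __init__(self, address: str, body: str) -> None:
        self.address = address
        self.body = body

    def fetch(self, address: str) -> str:
        # Return the body only for the address this page was created with.
        return self.body if address == self.address else ""

    def cache_key(self) -> str:
        return "static:" + self.address


def describe(obj: object) -> list[str]:
    """List which of the two hypothetical interfaces an object satisfies."""
    names = []
    if isinstance(obj, Fetcher):
        names.append("Fetcher")
    if isinstance(obj, Cacheable):
        names.append("Cacheable")
    return names
```

Note that the runtime check only looks for the presence of the methods, not their signatures; a static type checker is needed for the stronger guarantee.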

Thread interface

GUI callbacks, client/server stuff, etc. fit well with threads.

Supported by: Java, Modula-3, Ada-9X, some ML implementations. Flaky support in Python. scheme48 thread support is nice, except that no thread switches can occur in foreign function calls. Hmmm... is that a good thing or a bad thing? The important thing is that the I/O library is thread-aware.
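As a rough illustration of why threads fit this kind of work, the sketch below runs several blocking fetches at once using Python's standard `threading` and `queue` modules. `slow_fetch` is a hypothetical stand-in for a blocking network read; a real one would do socket I/O.

```python
import queue
import threading


def slow_fetch(address: str) -> str:
    """Hypothetical stand-in for a blocking network read."""
    return "contents of " + address


def fetch_in_background(addresses: list[str]) -> dict[str, str]:
    """Fetch every address on its own thread and collect the results."""
    results: queue.Queue = queue.Queue()  # thread-safe hand-off to the caller

    def worker(address: str) -> None:
        results.put((address, slow_fetch(address)))

    threads = [threading.Thread(target=worker, args=(a,)) for a in addresses]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    # Each worker has put exactly one (address, body) pair on the queue.
    return dict(results.get() for _ in addresses)
```

In a GUI program the caller would not join the threads inline; it would return to the event loop and pick up results from the queue as they arrive.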

Exceptions

Counting on folks to check error codes is a disaster waiting to happen.

Modula-3 leads the way again. Pretty much everything but C supports them, though I have nothing good to say about portability of systems that use C++ exceptions.
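A small sketch of the difference, in Python: the parser below raises instead of returning an error code, so a caller that forgets to check cannot silently carry on with a bad value. `FetchError` and `parse_status_line` are illustrative names, not part of any library.

```python
class FetchError(Exception):
    """Hypothetical error type for a malformed server response."""


def parse_status_line(line: str) -> int:
    """Return the numeric status from a line such as 'HTTP/1.0 200 OK'.

    A malformed line raises FetchError rather than returning a sentinel
    like -1 that the caller might forget to test for.
    """
    parts = line.split()
    if len(parts) < 2 or not parts[1].isdigit():
        raise FetchError("malformed status line: %r" % line)
    return int(parts[1])
```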

Safe programming support (garbage collection, type safety)

The bulk of application functionality can and should be coded this way.

C and C++ lack these features. Hmmm... Does Ada-9x support garbage collection? At least there's purify...

Systems programming support (unsafe stuff, for fast bit-twiddling)

Gotta have it.

TCP/IP networking interface

flatten/serialize/pickle interface

For building RPC systems, hypermedia databases, and for just saving state across sessions.

This was a design goal for Modula-3, and it is realized nicely. Java seems to be designed to support this, though I haven't seen how it works exactly (I hear they're working on CORBA support, for example.)
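For the simplest of these uses, saving state across sessions, a minimal sketch with Python's standard `pickle` module looks like this. The shape of the state (a history list and a bookmarks table) is an assumption made up for the example.

```python
import pickle


def save_state(state: dict) -> bytes:
    """Flatten in-memory state to bytes that can be written to disk or a socket."""
    return pickle.dumps(state)


def restore_state(data: bytes) -> dict:
    """Rebuild the state from bytes produced by save_state.

    Only use this on data you wrote yourself: unpickling bytes from an
    untrusted source can execute arbitrary code.
    """
    return pickle.loads(data)


# Hypothetical session state for a browser-like application.
session = {
    "history": ["/index.html", "/papers/ilu.html"],
    "bookmarks": {"ILU": "/papers/ilu.html"},
}
```

The caveat in the docstring matters for RPC use: a flattening format that is fine for local state needs more care before it goes over the wire between parties that do not trust each other.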

Objects with Inheritance

So much for C and Tcl (though the object-oriented extensions to Tcl are becoming quite mature).

Object-oriented, thread-aware I/O library (files, sockets, pipes)

Modula-3 has one with a formal specification (in Larch)! Tcl, C, C++, and Python are hurting here.

Window/Menu/Mouse API

Java's awt package looks mighty good. The Modula-3 Trestle toolkit is very nice, but it's a little bulky, and it's only supported on X platforms. A port to Windows/NT is underway, I understand.

Platform Support: Unix/Mac/PC. Don't forget Windows-NT!

I'm willing to live without Win3.1 support, since Win95 has fairly clean 32-bit, preemptive multitasking support. I am somewhat out of touch on the Mac platform support issues.

It seems all these languages were implemented on Unix originally. Some of them, Perl for example, are more closely tied to Unix than others. But it runs on the Mac and DOS quite nicely, I hear. The real trick to platform support is not so much the compiler or runtime as the user interface library (and to some extent, the networking API).

Browser Extensibility

The CCI proposal from NCSA and NCAPI from Netscape seek to address the need to integrate third party software into web browsers. I think these proposals are somewhat short-sighted.

I think these should be replaced by, or at least integrated with, desktop message-bus systems like AppleEvents and OLE. The X platform has no such technology widely deployed, but CDE is supposed to include ToolTalk.

AppleEvents and OLE are single-host technologies. They integrate with RPC systems for distributed operation. I think OLE uses DCE RPC. I think AppleEvents is related to OpenDoc, which is related to CORBA.

The right answer here, the way I see it, is to write your protocol description in ILU, and have the machine write stubs for whatever platform you need to deploy on. As to the on-the-wire protocol, I'd say CORBA's TCP-based protocol might be the way to go. But I need to look into its security features. Integrity and confidentiality are essential.

HTTP-NG and the Web as a Distributed Object System

Once authentication and pipes are supported in ILU, I see no reason not to re-implement everything from CCI to HTTP, along with the Harvest protocols using ILU. We'd get:

Automatically generated stubs for C, C++, Modula-3, Lisp, and python. I think Ada is in the works. Java support would be straightforward, as would perl5.
A solution to the current connection-per-request situation in HTTP. The ILU runtime caches TCP connections in a manner that's invisible to the application programmer, and yet efficient in its use of system resources like file descriptors.
An efficient CGI replacement. Imagine a CGI-workalike using ILU (kinda like Simon Spero's BGI idea). Using ILU's in-memory transport, an HTTP server could call perl/python/tcl/scheme scripts in its own address space. As I understand it, the various Sybase/Oracle/Verity gateways all have CGI programs that are just stubs that make an RPC to a long-running process. An RPC straight from the HTTP server to their server would eliminate conjuring up all those environment variables and forking.
CORBA interoperability: the ILU folks are building support for the CORBA TCP-based protocol, and they support an IDL-to-ISL interface definition translator. Why is this important? The database world is big on CORBA. OpenDoc interoperability might be a big win.
DCE RPC interoperability. This has been demonstrated, in case we find it necessary or useful, though the current level of support is low. DCE and CORBA aren't interoperable on the wire, but the ILU runtime design supports multiple transports, so a gateway would be trivial.
Transport independence. ILU method calls can go over anything from SunRPC/UDP to DCE RPC. When ATM comes to town, the transition should be straightforward. Off-brand networks like DECNET can be supported too.
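The in-memory transport and transport independence points above can be sketched in a few lines of Python. To be clear, this is not ILU's API; `Transport`, `InMemoryTransport`, `LoopbackWireTransport`, and the `search` stub are all hypothetical names, and the "wire" transport only encodes and decodes locally to stand in for a network hop. What it shows is that the calling code is the same whichever transport is plugged in.

```python
import json
from typing import Callable, Protocol

Handler = Callable[[str], str]


class Transport(Protocol):
    """Hypothetical transport interface: carry one method call, return the reply."""

    def call(self, method: str, argument: str) -> str: ...


class InMemoryTransport:
    """Dispatch straight to a handler in the caller's own address space."""

    def __init__(self, handlers: dict[str, Handler]) -> None:
        self._handlers = handlers

    def call(self, method: str, argument: str) -> str:
        return self._handlers[method](argument)


class LoopbackWireTransport:
    """Encode the call as bytes, as a network transport would, then decode and dispatch."""

    def __init__(self, handlers: dict[str, Handler]) -> None:
        self._handlers = handlers

    def call(self, method: str, argument: str) -> str:
        request = json.dumps({"method": method, "argument": argument}).encode("utf-8")
        decoded = json.loads(request.decode("utf-8"))  # what the far end would see
        reply = self._handlers[decoded["method"]](decoded["argument"])
        return json.loads(json.dumps(reply))  # round-trip the reply as well


def search(transport: Transport, query: str) -> str:
    """Client stub: identical no matter which transport carries the call."""
    return transport.call("search", query)
```

With the in-memory transport the server-side script runs without a fork or a pile of environment variables, which is the CGI-replacement case; swapping in a real network transport would not touch `search` or its callers.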

Applets and Agents

While I am 100% against using a Turing machine to represent document structure (a la PostScript, or even nroff or TeX), there are clear benefits to "scripting" in multimedia applications.

While Safe-Tcl would allow HTML forms to be replaced by objects with arbitrary semantics, it doesn't provide the radical improvement that Java's bytecode technology enables: the Java technology makes it feasible to download support for new protocols, compression or encryption formats, and other fine-grain computations.
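A loose local analogue of loading support for a new compression format on demand can be sketched in Python with `importlib`. This only loads code already installed on the machine, so it does not show the download or verification steps that make the Java approach interesting. The `CODEC_MODULES` table is a hypothetical mapping chosen for the example.

```python
import importlib
from typing import Callable

# Hypothetical table mapping an encoding name to the standard-library
# module that happens to implement it.
CODEC_MODULES = {
    "gzip": "gzip",
    "deflate": "zlib",
    "bzip2": "bz2",
}


def load_decoder(encoding: str) -> Callable[[bytes], bytes]:
    """Import the module for an encoding only when it is first needed."""
    module = importlib.import_module(CODEC_MODULES[encoding])
    return module.decompress
```

The application core never names a particular codec; supporting a new format means adding one table entry and making the module available.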

And the way Java and HTML (or SGML in general) can be combined, so that the document structure remains declarative, while supporting arbitrary additional semantics looks extremely powerful.

I hope to see (if not write) DSSSL-Lite style sheet implementations in the form of a loadable Java class.

The scheme48 virtual machine provides similar features, plus support for continuations, though I wonder about the interactions between continuations and the foreign-function interface.

Hmmm... and the Java loader does some verification that they claim increases security. I wonder if the scheme48 technology can support this feature.

Obliq has some really nice mechanisms for controlling access to filesystem and processor resources. And I wonder whether Obliq's support for migrating computations across the network can be achieved with Java+ILU or scheme48+ILU.

The real difference between Java and scheme48 is simply engineering resources. Both systems have a compiler, a runtime system, a networking API, and a foreign-function interface including stub generator. But Java has a complete class library, including their Abstract Window Toolkit, which looks like it might be the holy grail: the cross-platform GUI API.

The licensing of scheme48 is probably more "net-friendly," but I believe that Sun will not make Java any more proprietary than, say, PostScript.

Perhaps the Tk bindings for scheme48 will come out soon, and the two systems will compete neck-and-neck. I am hoping that Java or scheme48 bytecode compilers for Modula-3, python, perl5, smalltalk, ML, guile, Icon, Rexx, Obliq, ... are developed in the near term. The tricky part is mapping the library APIs -- sometimes the object models are inconsistent, or the namespaces clash.

Note that ILU (or CORBA) is a critical feature, currently missing from both systems. Just as you want to be able to combine code from many sources into your web user agent, you will want your web user agent computation to be distributed across hosts, or at least across address spaces on the same host.

For example, I think that a faceless, long-running Personal Information Manager server will be an increasingly popular part of the desktop environment: at first, it will just implement the on-disk cache in browsers like Netscape (and Chimera?). Then it might keep a fulltext index of everything you read and write. It will learn your likes and dislikes, and your habits. It will learn to fetch documents before you knew you wanted them.

Your web browser, news reader, and mail user agent will communicate with this server. Eventually, this PIM server will communicate with a work-group server. The lines between a browser, proxy server, workgroup database, and public server will fade.
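The full-text index such a server might keep can be sketched as a small inverted index. This is a toy under obvious assumptions (everything in memory, words split on non-alphanumerics, every query word must match); `FullTextIndex` is a made-up name.

```python
import re
from collections import defaultdict


def _words(text: str) -> list[str]:
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


class FullTextIndex:
    """Toy inverted index: word -> set of document addresses containing it."""

    def __init__(self) -> None:
        self._postings: dict[str, set[str]] = defaultdict(set)

    def add(self, address: str, text: str) -> None:
        """Record every word of a document under its address."""
        for word in _words(text):
            self._postings[word].add(address)

    def search(self, query: str) -> set[str]:
        """Return the addresses of documents containing every query word."""
        words = _words(query)
        if not words:
            return set()
        result = set(self._postings.get(words[0], set()))
        for word in words[1:]:
            result &= self._postings.get(word, set())
        return result
```

A browser, news reader, and mail agent would each call `add` as documents pass through them, and any of them could call `search`.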

Specification and Testing

Just as ILU is a key technology that allows much of the programming of distributed objects to be automated, I believe we can benefit from another bridge technology: a formalism that allows us to mathematically specify interfaces and reason about them.

The Larch toolset supports development of formal theories: definitions, axioms, and theorems. The toolset includes LP, the larch prover, which aids in constructing and checking proofs using these theories.

Theorem proving has a reputation of intractability in the computing industry, but the Larch toolset is extremely practical. It includes tools like lclint, which allows you to do lint-like checking of ANSI C programs without developing any Larch theory at all. And once you've developed a theory for your application, you can specify abstract interfaces and check for many common programming errors that end up being violations of that spec.
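Larch checks specifications statically, which a short example can't reproduce. As a much weaker stand-in, the sketch below attaches a precondition and a postcondition to a function and checks them at run time. The `contract` decorator and `read_into_buffer` are hypothetical names; nothing here is Larch or lclint syntax.

```python
import functools


def contract(pre=None, post=None):
    """Wrap a function so its stated pre- and postconditions are checked on each call."""

    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if pre is not None and not pre(*args, **kwargs):
                raise AssertionError("precondition of %s violated" % func.__name__)
            result = func(*args, **kwargs)
            if post is not None and not post(result, *args, **kwargs):
                raise AssertionError("postcondition of %s violated" % func.__name__)
            return result

        return wrapper

    return decorate


@contract(
    pre=lambda data, size: size > 0,                      # caller must ask for room
    post=lambda result, data, size: len(result) <= size,  # never exceed the bound
)
def read_into_buffer(data: bytes, size: int) -> bytes:
    """Copy at most `size` bytes, the bound that fixed-size buffers depend on."""
    return data[:size]
```

The difference from the Larch approach is when the error shows up: here a violation is caught only if a test or a user happens to trigger it, whereas a static checker reports it before the program runs.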

I believe that tools like this will make radical improvements in code quality and ease of maintenance.

But meanwhile, there is always room for traditional regression testing. For example, we plan to write Tcl bindings for the libwww API, and use the deja-gnu toolset to develop regression test suites based on those bindings.

need a conclusion here

Daniel W. Connolly
$Id: web-programming.html,v 1.2 1996/12/09 03:31:52 jigsaw Exp $