Standardizing a Web-based Application Environment

Abstract

Motorola believes that a standards-based, extensible application environment is essential to expanding the reach of the Web. Such an environment should foster innovation without sacrificing interoperability.

1. Introduction

The World Wide Web is, among other things, a universal networked application platform. Applications can be accessed using various browser clients running on various computing platforms, located anywhere an Internet connection can be had. Until recently, those clients typically ran on desktop personal computers having a physical connection to the network. Web clients for mobile devices with wireless network connections have now emerged, expanding the possibilities of the Web as an application platform.

We, and no doubt many others, would like to see the World Wide Web evolve to become ubiquitous and transparent. Users should be able to access information and services on the Web from anywhere at any time. In addition, information and services should be accessible through any number of different Web clients running on any number of devices. The WAP Forum has moved today's Web down the path toward ubiquity by extending it to mobile devices connected via wireless networks. But much work remains to be done to evolve the Web into the premier platform for networked applications.

As the mobile Web application market matures, competitive pressures and user expectations will drive application developers to differentiate their product offerings by providing value-added features. The W3C and the WAP Forum must anticipate this and offer a standards-based application environment that meets their needs. This environment must ensure interoperability while fostering innovation. In particular, it must enable

As customized content will not be addressed in the workshop, it will not be part of this position paper.

2. Application Environment

For the Web to be ubiquitous, Web-access devices must be everywhere, in all aspects of our lives. For this to occur, Web-access devices must become small and portable. As Web-enabled devices evolve from today's desktop computers to such things as cellular telephones, car radios and personal organizers, the challenge will be to provide a common application authoring environment across a diverse range of devices.

The existing "standard" Web application environment consists of HTML, JavaScript and an ad-hoc collection of standard graphics file formats, processed by an HTML browser. Java applets and browser plug-ins can be used to incorporate multimedia content and to extend the expressiveness or functionality of a user interface. However, these extensions are often device-specific and require special installation, and may therefore be met with apprehension by novice users. This tends to drive the application developer back towards a lowest-common-denominator approach.

What is needed is an extensible, standards-based Web application environment that provides application developers with the tools they need to develop innovative products, without sacrificing interoperability. The environment should

The application environment should support a standard extensibility framework. As new media types, user agents or supplemental services emerge, they can be integrated into the environment in a backward-compatible manner; no existing applications will break. A generic interface for user agents and services will provide the necessary extensibility. Each new user agent and service can specify how it uses and extends the generic interface.

The preferred interface for integrating new services into the environment is an event model combined with a service API.
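As an illustration only, the combination of an event model and a service API might be sketched as follows. Every name here (EventBus, Service, MessagingService, Environment, the "messaging.sent" event) is hypothetical and is not drawn from any W3C or WAP Forum specification. The sketch assumes that each new service extends one generic interface and reports results as events, so that registering a new service does not disturb existing applications.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class EventBus:
    """Hypothetical event model: services publish events, applications subscribe."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> int:
        """Deliver the payload to every subscriber; return how many were notified."""
        handlers = list(self._handlers.get(event_type, []))
        for handler in handlers:
            handler(payload)
        return len(handlers)


class Service:
    """Hypothetical generic service interface that each new service extends."""

    name = "generic"

    def attach(self, bus: EventBus) -> None:
        self.bus = bus

    def invoke(self, operation: str, **args) -> None:
        raise NotImplementedError


class MessagingService(Service):
    """Example extension: adds one operation and one event type of its own."""

    name = "messaging"

    def invoke(self, operation: str, **args) -> None:
        if operation != "send":
            raise ValueError(f"unsupported operation: {operation}")
        self.bus.publish("messaging.sent", {"to": args["to"], "body": args["body"]})


class Environment:
    """Hypothetical application environment holding the bus and the services."""

    def __init__(self) -> None:
        self.bus = EventBus()
        self._services: Dict[str, Service] = {}

    def register(self, service: Service) -> None:
        service.attach(self.bus)
        self._services[service.name] = service

    def has_service(self, name: str) -> bool:
        return name in self._services

    def service(self, name: str) -> Service:
        return self._services[name]


def demo() -> List[dict]:
    """An application subscribes to an event, then calls the service API."""
    env = Environment()
    env.register(MessagingService())
    received: List[dict] = []
    env.bus.subscribe("messaging.sent", received.append)
    env.service("messaging").invoke("send", to="alice", body="hello")
    return received
```

In this sketch an application can call has_service before using an optional feature, and an event with no subscribers is simply dropped, which is one way the backward-compatibility goal could be approached.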

The standardization of an extensible application environment would need to include at least the

3. Multimodal Applications

For the Web to be transparent, Web-access devices must be hidden, embedded into other products or everyday things. For this to occur, more natural interfaces must be developed. Speech input and output, combined with natural language recognition and integrated with diverse media types, seem to be the best approach. The challenge will be to seamlessly integrate a multimedia voice interface into the application environment.

One aspect of Web applications poised for rapid advancement is the user interface. Because advanced services will often be accessed through constrained devices, new and more efficient input and output mechanisms will be needed to deliver a quality user experience. Speech input and output tightly integrated with other media will be used to provide a natural, intuitive interface. As mobile Web-access devices evolve to deliver speech and other advanced user interfaces, we must ensure that applications can access new features in a standard manner.

Input Modalities

We support the work of the W3C Voice Browser Working Group and believe that speech input and output are important to providing efficient and intuitive interfaces to Web applications and services.

As today's mobile Web-access devices cannot economically support speech recognition, application architectures that employ distributed speech recognition (DSR) should be enabled. DSR splits the speech recognition engine into a front-end part, which resides on the Web client, and a back-end part, which resides on a server. Taken together, the two parts form the Speech Agent.

To support multimodal requirements, such as synchronization between input and output modalities, an interface is needed between the browser and the Speech Agent. Such an interface should be standardized, so that the browser and the Speech Agent can be implemented and can evolve independently.

We believe that an event model is the appropriate mechanism to implement the interface between the browser and the Speech Agent. An event model will sufficiently abstract the Speech Agent, hiding from the application developer the details of any particular implementation.

In particular, the fact that the Speech Agent is distributed between client and server can be hidden. To the developer, rich Web-access devices that implement the Speech Agent completely in the client appear the same as constrained devices that employ DSR. As a result, applications may not have to be rewritten as speech recognition technology advances.
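A minimal sketch of this abstraction follows, under stated assumptions. The class names, the on_recognition event hook, and the toy extract_features and recognize functions are all hypothetical stand-ins; a real front end would compute acoustic features and a real back end would run a recognizer on a server. The point is only that the application code in run_application is identical whether the Speech Agent is local or split by DSR.

```python
from typing import Callable, List

RecognitionHandler = Callable[[str], None]


class SpeechAgent:
    """Hypothetical browser-facing interface: results arrive only as events."""

    def __init__(self) -> None:
        self._handlers: List[RecognitionHandler] = []

    def on_recognition(self, handler: RecognitionHandler) -> None:
        self._handlers.append(handler)

    def _emit(self, text: str) -> None:
        for handler in self._handlers:
            handler(text)

    def hear(self, samples: List[int]) -> None:
        raise NotImplementedError


def extract_features(samples: List[int]) -> List[int]:
    """Stand-in front end (toy rule, not real signal processing)."""
    return [abs(s) for s in samples]


def recognize(features: List[int]) -> str:
    """Stand-in back end (toy rule, not a real recognizer)."""
    return "yes" if sum(features) > 10 else "no"


class LocalSpeechAgent(SpeechAgent):
    """Rich device: front end and back end both run in the client."""

    def hear(self, samples: List[int]) -> None:
        self._emit(recognize(extract_features(samples)))


class DistributedSpeechAgent(SpeechAgent):
    """Constrained device: front end on the client, back end on a server."""

    def __init__(self, send_to_server: Callable[[List[int]], str]) -> None:
        super().__init__()
        # send_to_server is an assumed transport callable supplied by the platform.
        self._send_to_server = send_to_server

    def hear(self, samples: List[int]) -> None:
        features = extract_features(samples)          # runs on the client
        self._emit(self._send_to_server(features))    # runs on the server


def run_application(agent: SpeechAgent, samples: List[int]) -> List[str]:
    """Application code: unaware of where recognition actually happens."""
    results: List[str] = []
    agent.on_recognition(results.append)
    agent.hear(samples)
    return results
```

Here the server round trip is simulated by passing recognize directly as the transport callable; in practice it would be a network call.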

Output Modalities

We believe that integration of SMIL into various presentation markup languages is the appropriate mechanism for supporting multiple output modalities. Integration with input modalities should be achieved with an event model.

Dialog Markup Language

We support the work of the W3C to define a dialog markup language and encourage its joint development with the WAP Forum. We believe such a language should define an application programming model that is independent of input and output modalities. This would help ensure a common authoring environment across devices, while allowing developers to provide innovative user interfaces on those platforms that support them. An application could achieve lowest-common-denominator interoperability while seamlessly providing advanced features on those platforms on which they are supported.
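To make the idea of a modality-independent programming model concrete, the following sketch separates a dialog definition from the modality that presents it. It is an assumption-laden illustration, not a proposal for the markup language itself: Field, Dialog, run_dialog and scripted_modality are hypothetical names, and the "modalities" are scripted stand-ins rather than real keypad or speech interfaces.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Field:
    """One piece of information the dialog must collect."""
    name: str
    prompt: str


@dataclass
class Dialog:
    """Hypothetical dialog definition; it says nothing about modality."""
    fields: List[Field]


def run_dialog(dialog: Dialog, ask: Callable[[str], str]) -> Dict[str, str]:
    """Collect every field using whatever modality the platform supplies."""
    return {field.name: ask(field.prompt) for field in dialog.fields}


def scripted_modality(
    label: str, answers: Dict[str, str]
) -> Tuple[Callable[[str], str], List[str]]:
    """Build a stand-in ask function that logs how each prompt was presented."""
    transcript: List[str] = []

    def ask(prompt: str) -> str:
        transcript.append(f"{label}: {prompt}")
        return answers[prompt]

    return ask, transcript


# The same dialog is authored once and run through two different modalities.
booking = Dialog(fields=[Field("city", "Which city?"), Field("day", "Which day?")])
answers = {"Which city?": "Chicago", "Which day?": "Friday"}

ask_by_display, display_log = scripted_modality("display", answers)
ask_by_speech, speech_log = scripted_modality("speech", answers)

display_result = run_dialog(booking, ask_by_display)
speech_result = run_dialog(booking, ask_by_speech)
```

Both runs yield the same collected data while the presentation differs, which is the property the dialog language would need to preserve across devices.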

4. Conclusion

To achieve the goal of a universal, ubiquitous and transparent Web, the W3C, the WAP Forum and others should work together to create an application environment with an extensibility framework that allows the seamless introduction of new technology. The ongoing voice browser work provides an excellent use case and testbed for such a framework. The evolution of the Wireless Application Environment to include services such as telephony, messaging, synchronization and persistent storage further provides an excellent set of use cases.