Position paper for the W3C/WAP Workshop on Web Device Independent Authoring

Ralph B. Case, Stéphane H. Maes and T. V. Raman

Ralph B. Case (caser@us.ibm.com)

IBM Software Group, WebSphere Transcoding Publisher

Stéphane H. Maes (smaes@us.ibm.com) and T. V. Raman (tvraman@us.ibm.com)

IBM Research, Human Language Technologies Group

Introduction

This position paper outlines an approach to authoring once for the future Mobile Internet, where information is expected to be accessible anytime, anywhere, through any device, and where the user can at any given time choose the device and modality best suited to his or her abilities and capabilities at that moment.

The problem of device-independent authoring should not be dissociated from the issues of authoring applications to be rendered in different modalities or multiple synchronized modalities [1].

We first define some terminology:

• channel: a particular device or a particular modality.

• multi-channel applications: applications designed for ubiquitous access through different channels, one channel at a time.

• multi-modal applications: multi-channel applications where multiple channels are simultaneously available and synchronized.

For the “new web”, new content and applications are developed with the intent of delivering them through many different channels with different characteristics. Therefore the application must be adapted to each channel. Since new devices and content emerge continuously, this adaptation must be made to work for new devices not originally envisioned. In addition, it is important to be able to adapt existing content that may not have been created with this multi-channel deployment model in mind.

Multiple authoring

Content targeted at multiple channels can be created by multiple authoring:

• Hand authoring of the application in each target channel.

• Authoring of style sheet transformations of a common representation (device-independent) into the different target presentation languages (final form).

In addition, for multi-modal applications, the developer must also specify the synchronization of the different channels.

Single Authoring For the Mobile Internet

We feel there is a strong need for a language that supports single authoring across a large variety of devices and modalities. As a first step, we would like to focus the W3C/WAP workshop on collecting requirements for such a language.

Motivation For Single Authoring

Single authoring is motivated by the need to author, maintain, and revise content for delivery to an ever-increasing plethora of end-user devices. Hand authoring of the target pages leads to the “M times N problem”: an application composed of M “pages” to be accessed via N devices requires M x N authoring steps and results in M x N presentation pages to maintain. Generic separation of content from presentation results in non-reusable style sheets and a similar M x N problem with the style sheets. Using an intermediate format with two-step adaptation calls for only M + N reusable transformations to be defined. An appropriately defined standard common intermediate format allows the M content-to-intermediate authoring steps or transformations, one for each “page”, to be defined by content domain experts, while the N intermediate-to-device transformations can be programmed by device experts.
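The arithmetic behind this argument can be made concrete. The following is a minimal sketch only: the function names and the sample values of M and N are illustrative assumptions, not figures from an actual deployment.

```python
def hand_authoring_cost(pages: int, devices: int) -> int:
    """Presentation pages to author and maintain when each page is
    written by hand for each device (the M x N problem)."""
    return pages * devices


def intermediate_format_cost(pages: int, devices: int) -> int:
    """Reusable transformations needed with a common intermediate format:
    M content-to-intermediate plus N intermediate-to-device (M + N)."""
    return pages + devices


if __name__ == "__main__":
    m, n = 50, 8  # hypothetical application: 50 pages, 8 target devices
    print(hand_authoring_cost(m, n))       # 400
    print(intermediate_format_cost(m, n))  # 58

    # Cost of supporting one additional device under each approach.
    print(hand_authoring_cost(m, n + 1) - hand_authoring_cost(m, n))            # 50
    print(intermediate_format_cost(m, n + 1) - intermediate_format_cost(m, n))  # 1
```

Under these assumed numbers, adding a ninth device costs fifty new hand-authored pages but only one new intermediate-to-device transformation.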

Because of the increasing rate at which new devices are becoming available, the system must be able to adapt content for new devices that were not envisioned when the content was created. In addition, it is important to be able to adapt existing content that may not have been created with this multi-channel deployment model in mind.

Finally, multiple authoring becomes an even more complex problem if synchronization is needed across channels. As described below, single authoring simplifies the authoring of synchronization.

Single Document -- Multiple Views

We advocate a programming approach that separates specific content from presentation, thereby enabling reusable style sheets for default presentation in the final form. Specialization can then be performed in-line or via channel-specific style sheets.

The underlying principle of single authoring is the Model-View-Controller (MVC) principle:

• The channel-independent description of the application constitutes the model.

• Channels are views of this model. These views are obtained by transforming the model representation into its final target form. The final forms are rendered in channel-specific legacy browsers (e.g. WAP browser, Web / HTML browser, C-HTML browser, HDML browser, VoiceXML voice browser, etc.).

• The user interacts with the view through the browser.

Separating content from presentation in order to achieve content re-use is now the accepted way of deploying future information on the World Wide Web. In the current W3C architecture, such separation is achieved by representing content in XML that is then transformed to appropriate final-form presentations via XSL transforms. Other transformation mechanisms could be considered.
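As an illustration of one document with multiple views, the sketch below derives a visual view and a voice view from a single channel-independent description. Everything in it is an assumption made for the example: the element names in the model, the two Python functions standing in for XSL transforms, and the voice output, which is a plain prompt string rather than real VoiceXML.

```python
import xml.etree.ElementTree as ET

# Hypothetical channel-independent model: content plus an abstract
# "choose one option" interaction, with no presentation markup.
MODEL = """
<app>
  <message>Welcome to the flight service.</message>
  <select name="action" prompt="What would you like to do?">
    <option value="book">Book a flight</option>
    <option value="status">Check flight status</option>
  </select>
</app>
"""


def to_html(model: ET.Element) -> str:
    """Visual view: a paragraph, a label and a drop-down list."""
    form = ET.Element("form")
    ET.SubElement(form, "p").text = model.findtext("message")
    choice = model.find("select")
    ET.SubElement(form, "label").text = choice.get("prompt")
    html_select = ET.SubElement(form, "select", name=choice.get("name"))
    for option in choice.findall("option"):
        item = ET.SubElement(html_select, "option", value=option.get("value"))
        item.text = option.text
    return ET.tostring(form, encoding="unicode")


def to_speech(model: ET.Element) -> str:
    """Voice view: the same interaction rendered as a spoken prompt."""
    choice = model.find("select")
    spoken = ", or ".join(option.text for option in choice.findall("option"))
    return f"{model.findtext('message')} {choice.get('prompt')} Say: {spoken}."


# One transformation per channel; supporting a new channel adds one entry.
VIEWS = {"html": to_html, "voice": to_speech}

if __name__ == "__main__":
    root = ET.fromstring(MODEL.strip())
    for channel, transform in VIEWS.items():
        print(channel, "->", transform(root))
```

The model is authored once; each entry in the hypothetical `VIEWS` table plays the role that a channel-specific style sheet would play in an XSL-based deployment.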

Appropriate factorization

We believe that single authoring can be achieved by recognizing that, in addition to form and content, there is a third component independent of the target channel: the interaction, which lies at the heart of turning static information presentations into interactive information.

Separation Of Concerns -- ease of application development and maintenance

Single authoring for a multiplicity of interfaces and deployment environments necessarily involves addressing presentation issues specific to each channel, e.g., designing the look and feel for the visual presentation and the sound and feel for the auditory presentation. We believe that a single authoring framework should allow these concerns to be cleanly separated so that:

• Content can be created and maintained without presentation concerns.

• Presentation rules, including content transformations and style sheets, can be maintained for specific channels and deployment environments without adversely affecting other aspects of the system.

• Content and style can be independently maintained and revised.

• The result can be specialized for a specific device or channel.

It is not guaranteed that one system for content adaptation will be practical for all applications. It is necessary to allow vendors the flexibility to develop different adaptation solutions to foster innovation and exploit new technologies.

Some adaptation solutions may require separation of adaptation functions or steps along several dimensions:

• Different steps of an adaptation solution may be logically separate. For example, one step may summarize content, while another step converts the markup language, while another step adapts the interaction model or application flow.

• Different steps may be physically separate. For example, some adaptation may be done at a server, some at a gateway or proxy and some on the client device.

• Finally, different steps may be temporally separate. For example, different versions of content may be produced at authoring time for different classes of devices, while the final adaptation is done at content delivery time based on the specific features of the target device.

In each of these cases, the different adaptation steps may be produced by different specialized vendors. The final solution can then be produced by a third-party integrator. Some standardization of the formats and protocols to connect the adaptation steps is required.
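The separation of adaptation steps described above can be sketched as a chain of independent functions that share one exchange format. This is a minimal sketch under stated assumptions: the `Document` type, its fields, and the three step functions are hypothetical stand-ins for whatever formats and protocols a standard would define.

```python
from dataclasses import dataclass, replace
from functools import reduce
from typing import Callable, Tuple


@dataclass(frozen=True)
class Document:
    """Hypothetical exchange format passed from one adaptation step to the next."""
    markup: str                  # markup language name, e.g. "html" or "wml"
    paragraphs: Tuple[str, ...]  # textual content
    flow: str                    # application flow, e.g. "single-page"


Step = Callable[[Document], Document]


def summarize(max_paragraphs: int) -> Step:
    """Content step: naively keep only the first paragraphs."""
    return lambda doc: replace(doc, paragraphs=doc.paragraphs[:max_paragraphs])


def convert_markup(target: str) -> Step:
    """Markup step: relabel the document for the target language."""
    return lambda doc: replace(doc, markup=target)


def adapt_flow(flow: str) -> Step:
    """Interaction step: switch the application flow."""
    return lambda doc: replace(doc, flow=flow)


def pipeline(*steps: Step) -> Step:
    """Compose steps, possibly supplied by different vendors, into one step."""
    return lambda doc: reduce(lambda current, step: step(current), steps, doc)


if __name__ == "__main__":
    source = Document("html", ("intro", "details", "footnotes"), "single-page")
    # Temporal separation: one stage runs at authoring time for a device
    # class, the other at delivery time for the specific target device.
    authoring_time = pipeline(summarize(2))
    delivery_time = pipeline(convert_markup("wml"), adapt_flow("one-item-per-card"))
    print(delivery_time(authoring_time(source)))
```

Because every step consumes and produces the same type, the steps can be written, deployed, and scheduled independently, which is the role the standardized connecting formats would play.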

Synchronized Multi-modal Views

Multi-modality can be considered as a particular type of channel.

During multi-modal or multi-device interactions, the MVC principle becomes especially relevant. The user interacts via the controller on a given view. Instead of modifying the view directly, his or her actions update the state of the model. The model then updates all registered views, which keeps them synchronized. Details can be found in [2].
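The update cycle just described can be sketched in a few lines. This is an illustrative sketch, not the mechanism of [2]: the class names, the method names, and the channel labels are assumptions made for the example.

```python
from typing import Dict, List


class InteractionModel:
    """Channel-independent application state shared by all views."""

    def __init__(self) -> None:
        self._state: Dict[str, str] = {}
        self._views: List["View"] = []

    def register(self, view: "View") -> None:
        self._views.append(view)

    def update(self, field: str, value: str) -> None:
        """Change the state, then refresh every registered view."""
        self._state[field] = value
        for view in self._views:
            view.refresh(dict(self._state))


class View:
    """One channel's presentation of the model (for example GUI or voice)."""

    def __init__(self, channel: str, model: InteractionModel) -> None:
        self.channel = channel
        self.model = model
        self.rendered: Dict[str, str] = {}
        model.register(self)

    def user_input(self, field: str, value: str) -> None:
        """Controller role: user actions go to the model, not to the view."""
        self.model.update(field, value)

    def refresh(self, state: Dict[str, str]) -> None:
        self.rendered = state


if __name__ == "__main__":
    model = InteractionModel()
    gui = View("html", model)
    voice = View("voicexml", model)
    voice.user_input("destination", "Paris")  # spoken in the voice channel
    print(gui.rendered)                       # {'destination': 'Paris'}
```

An input given in one channel reaches the other channel only through the shared model, so no channel-to-channel synchronization code has to be authored.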

Single authoring for delivery to a multiplicity of synchronized target devices and environments has one final crucial advantage. As we evolve towards devices that deliver multi-modal user interaction, single authoring enables the generation of tightly synchronized presentations across different channels, without requiring re-authoring of the multi-channel applications. The MVC principle guarantees that these applications are also ready for synchronization across channels.

Such synchronization allows user intent expressed in a given channel to be propagated to all the interaction components of a multi-modal system. We speak of tightly coupled multi-modal interactions, as opposed to loosely coupled multi-modal interactions, where each channel has its own model that periodically synchronizes with the models associated with the other channels. A tightly coupled solution can support a wide range of synchronization granularities. It also allows optimization of the interaction, by allowing a given interaction to take place in the channel that is best suited to it and to fall back to another channel when the preferred one is unavailable or not capable enough.

Recommendation

We recommend the constitution of a Working Group within one of the leading standards organizations to address single authoring languages and frameworks for multi-channel interactions, including multi-modal interactions.

Device-independent authoring is an integral part of the multi-modal web issues and should not be addressed separately.

The resulting standards activities should address both the authoring of new channel-independent content and the adaptation of existing content to be channel-independent.

Requirements

This section contains our requirements for a device independent authoring language for the multi-modal web. As we consider that single authoring is the key requirement of the multi-modal web, these requirements are also the requirements for single authoring of multi-channel and multi-modal applications.

• XML compliant

• Vendor neutral

• Any tool developer can target it or use it as an input representation

• It can be used not only to express data within an application, but also to pass it to a network services provider, portal, or directly to an end-user device

• A single authoring process should handle both multi-channel applications and multi-modal applications

• Supports channel-independent interaction descriptions

• Can be mapped using style sheets to an open-ended set of device-specific markups including VoiceXML, WML, CHTML, HTML and others

• Supports full-function programming, enabling rich-function devices such as desktop browsers

• Extensible to allow new interaction or presentation model abstractions

• Can accommodate channel- or device-specific specialization either in-line, as annotations, or using style sheets

• Supports a developer-definable hierarchy of channels and devices

• Supports specification of data models in XForms / XML Schema to model the data that can be manipulated by the end user

• Enables fine-grain synchronization of multi-modal interaction

• Can accommodate both synchronous and asynchronous data exchange, and connected as well as disconnected operation
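Two of the requirements above, the developer-definable hierarchy of channels and channel-specific specialization through style sheets, can be combined as in the following sketch. The channel names, the hierarchy, and the style sheet file names are all hypothetical, and the lookup rule (use the nearest ancestor that defines a style sheet) is one possible design rather than a proposed standard behaviour.

```python
from typing import Dict, List, Optional

# Hypothetical developer-defined hierarchy: each channel names its parent.
PARENT: Dict[str, Optional[str]] = {
    "any": None,
    "visual": "any",
    "voice": "any",
    "small-screen": "visual",
    "desktop-browser": "visual",
    "wap-phone": "small-screen",
}

# Hypothetical style sheets registered at some levels of the hierarchy.
STYLE_SHEETS: Dict[str, str] = {
    "any": "default.xsl",
    "small-screen": "compact.xsl",
    "voice": "dialog.xsl",
}


def ancestry(channel: str) -> List[str]:
    """Return the channel followed by its ancestors, most specific first."""
    chain: List[str] = []
    current: Optional[str] = channel
    while current is not None:
        chain.append(current)
        current = PARENT[current]
    return chain


def resolve_style_sheet(channel: str) -> str:
    """Pick the most specific style sheet available for a channel."""
    for name in ancestry(channel):
        if name in STYLE_SHEETS:
            return STYLE_SHEETS[name]
    raise LookupError(f"no style sheet defined for {channel!r} or its ancestors")


if __name__ == "__main__":
    print(resolve_style_sheet("wap-phone"))        # compact.xsl
    print(resolve_style_sheet("desktop-browser"))  # default.xsl
```

With such a hierarchy, a device that was not envisioned at authoring time only needs to be attached to an existing parent channel to inherit a usable default presentation.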

In addition, for multi-modal rendering, we recommend leveraging the forthcoming DOM Level 2 specifications to enable the implementation of MVC with legacy browsers [1]. This implies that the supported channel-specific languages must have a standardized DOM Level 2 specification.

References

[1] S. H. Maes and T. V. Raman, “Position paper for the W3C/WAP Workshop on the Multi-modal Web”, IBM position paper, W3C/WAP Joint Workshop on the Multi-modal Web, Hong Kong, September 2000.

[2] S. H. Maes and T. V. Raman, “Multi-modal interaction in the Age of Information Appliance”, in Proceedings of ICME 2000, New York, USA, July 2000.