Ralph B. Case, Stéphane H. Maes and T. V. Raman
smaes@us.ibm.com
tvraman@us.ibm.com
IBM Research
Human Language Technologies Group
caser@us.ibm.com
IBM Software Group
WebSphere Transcoding Publisher
Introduction
This position paper outlines an approach to authoring content
once for the future Mobile Internet, where information is expected to be
accessible anytime, anywhere, through any device, and where the user can at any
given time choose the device and modality best suited to his or her
abilities and capabilities at that moment.
The problem of device-independent authoring should not
be dissociated from the issues of authoring applications to be rendered in
different modalities or multiple synchronized modalities [1].
We first define some terminology:
channel: denotes a particular device or a particular modality.
multi-channel applications: applications designed for ubiquitous access through different channels, one channel at a time.
multi-modal applications: multi-channel applications where multiple channels are simultaneously available and synchronized.
For the “new web”, new content and applications are
developed with the intent of delivering them through many different channels with
different characteristics. The application must therefore be adapted to each
channel. Since new devices and content emerge continuously, this
adaptation must also work for new devices that were not originally envisioned.
In addition, it is important to be able to
adapt existing content that may not have been created with this multi-channel
deployment model in mind.
Multiple Authoring
Content targeted at multiple channels can be created
by multiple authoring:
Hand authoring of the application in each target channel.
Authoring of style sheet transformations of a common (device-independent) representation into the different target presentation languages (final form).
In addition, for multi-modal applications, the
developer must also specify the synchronization of the different channels.
Single Authoring for the Mobile Internet
We feel there is a strong need for a language that
supports single authoring across a large variety of devices and
modalities. As a first step, we would
like to focus the W3C/WAP workshop on collecting requirements for such a
language.
Motivation for Single Authoring
Single authoring is motivated by the need to author,
maintain, and revise content for delivery to an ever-increasing variety of
end-user devices. Hand authoring of the
target pages leads to the “M times N
problem”: an application composed
of M “pages” to be accessed via N devices requires M x N authoring steps and
results in M x N presentation pages to maintain. Generic separation of content
from presentation alone results in
non-reusable style sheets and a similar M x N problem with the style
sheets. Using an intermediate format
with two-step adaptation calls for only M + N reusable transformations to be
defined. An appropriately defined standard
common intermediate format allows the M content-to-intermediate authoring steps
or transformations, one for each “page”, to be defined by content domain
experts, while the N intermediate-to-device transformations can be programmed by
device experts.
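The difference in scale can be made concrete with a small calculation. The sketch below simply counts the artifacts described above; the function names and the sample values of M and N are illustrative assumptions, not figures from a real deployment.

```python
def hand_authored_pages(pages: int, devices: int) -> int:
    """Hand authoring: one presentation page per (page, device) pair."""
    return pages * devices


def two_step_transformations(pages: int, devices: int) -> int:
    """Intermediate format: M content-to-intermediate transformations
    plus N intermediate-to-device transformations."""
    return pages + devices


def cost_of_new_device(pages: int, two_step: bool) -> int:
    """Number of artifacts to author when one more device is added."""
    return 1 if two_step else pages


if __name__ == "__main__":
    # Sample (M, N) pairs chosen only for illustration.
    for m, n in [(10, 3), (50, 8), (200, 20)]:
        print(f"M={m:3d} N={n:2d}  "
              f"M x N = {hand_authored_pages(m, n):4d}  "
              f"M + N = {two_step_transformations(m, n):3d}")
```

Under these assumptions, adding one device to an application of M pages costs M new hand-authored pages, but only one new intermediate-to-device transformation in the two-step approach.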
Because of the increasing rate at which new devices are becoming available, the system must be able to adapt content for new devices that were not envisioned when the content was created. In addition, it is important to be able to adapt existing content that may not have been created with this multi-channel deployment model in mind.
Finally, multiple authoring becomes an even more complex
problem when synchronization is needed across channels. As described below,
single authoring simplifies the authoring of
the synchronization.
Single Document -- Multiple Views
We advocate a programming approach that separates
application-specific content from presentation, which enables reusable style
sheets for default presentation in the final form. Specialization can then be
performed in-line or via channel-specific style sheets.
The underlying principle of single authoring is the
Model-View-Controller (MVC) paradigm:
The channel-independent description of the application constitutes the model.
Channels are views of this model. These views are obtained by transforming the model representation into its final target form. The final forms are rendered in channel-specific legacy browsers (e.g. WAP browser, Web / HTML browser, C-HTML browser, HDML browser, VoiceXML voice browser, etc.).
The user interacts with the view through the browser.
Separating content from presentation in order to achieve content re-use is now the accepted way of deploying future information on the World Wide Web. In the current W3C architecture, such separation is achieved by representing content in XML that is then transformed to appropriate final-form presentations via XSL transforms. Other transformation mechanisms could be considered.
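As an illustration of one model yielding several views, the sketch below renders a single channel-independent description into an HTML fragment and a VoiceXML-like fragment. Everything in it is an assumption made for the example: the `<interaction>` vocabulary is invented, plain Python functions stand in for the XSL transforms, and the generated fragments are simplified and not validated against the HTML or VoiceXML specifications.

```python
import xml.etree.ElementTree as ET

# Hypothetical channel-independent model (invented vocabulary).
MODEL = """
<interaction>
  <select name="drink" prompt="What would you like to drink?">
    <option value="coffee">Coffee</option>
    <option value="tea">Tea</option>
  </select>
</interaction>
""".strip()


def to_html(model: ET.Element) -> str:
    """Visual view: each <select> becomes a labelled HTML select box."""
    form = ET.Element("form")
    for sel in model.findall("select"):
        label = ET.SubElement(form, "label")
        label.text = sel.get("prompt")
        box = ET.SubElement(form, "select", name=sel.get("name"))
        for opt in sel.findall("option"):
            item = ET.SubElement(box, "option", value=opt.get("value"))
            item.text = opt.text
    return ET.tostring(form, encoding="unicode")


def to_voicexml(model: ET.Element) -> str:
    """Voice view: each <select> becomes a field with a spoken prompt."""
    vxml = ET.Element("vxml", version="1.0")
    form = ET.SubElement(vxml, "form")
    for sel in model.findall("select"):
        field = ET.SubElement(form, "field", name=sel.get("name"))
        prompt = ET.SubElement(field, "prompt")
        prompt.text = sel.get("prompt")
        for opt in sel.findall("option"):
            item = ET.SubElement(field, "option", value=opt.get("value"))
            item.text = opt.text
    return ET.tostring(vxml, encoding="unicode")


# One transformation per channel; the model itself is authored once.
VIEWS = {"html": to_html, "voicexml": to_voicexml}


def render(model_xml: str, channel: str) -> str:
    """Transform the model into the final form for the given channel."""
    return VIEWS[channel](ET.fromstring(model_xml))


if __name__ == "__main__":
    for channel in VIEWS:
        print(channel, "->", render(MODEL, channel))
```

Supporting a further channel in this sketch means adding one entry to `VIEWS`; the model is left untouched.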
Appropriate Factorization
We believe that single authoring can be achieved by
realizing that in addition to form and content there is a third component
independent of the target channel, the interaction, that lies at the heart of
turning static information presentations into interactive information.
Separation of Concerns -- Ease of Application Development and Maintenance
Single authoring for a multiplicity of interfaces and
deployment environments necessarily involves addressing presentation issues
specific to each channel, e.g., designing the look and feel for the
visual presentation and the sound and feel for the auditory presentation. We
believe that a single authoring framework
should allow these concerns to be cleanly separated so that:
Content can be created and maintained without presentation concerns.
Presentation rules, including content transformations and style sheets, can be maintained for specific channels and deployment environments without adversely affecting other aspects of the system.
Content and style can be independently maintained and revised.
The result can be specialized for a specific device or channel.
It is not guaranteed that one system for content
adaptation will be practical for all applications. It is necessary to allow vendors the flexibility to develop
different adaptation solutions to foster innovation and exploit new
technologies.
Some adaptation solutions may require separation of
adaptation functions or steps along several dimensions:
Different steps of an adaptation solution may be logically separate. For example, one step may summarize content, while another step converts the markup language, while another step adapts the interaction model or application flow.
Different steps may be physically separate. For example, some adaptation may be done at a server, some at a gateway or proxy and some on the client device.
Finally, different steps may be temporally separate. For example, different versions of content may be produced at authoring time for different classes of devices, while the final adaptation is done at content delivery time based on the specific features of the target device.
In each of these cases, the different adaptation steps
may be produced by different specialized vendors. The final solution can then be produced by a third-party
integrator. Some standardization of the
formats and protocols to connect the adaptation steps is required.
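A minimal sketch of such a chain of separate steps follows. It assumes a shared `Content` record as the common format passed between steps; the record, the step names, and the locations are all hypothetical, and the markup conversion is reduced to relabeling so that the example stays short.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Content:
    markup: str  # label of the markup language, e.g. "html" or "wml"
    body: str    # plain text payload (real markup handling is omitted)


@dataclass(frozen=True)
class Step:
    name: str
    location: str  # where the step runs: "server", "gateway" or "client"
    apply: Callable[[Content], Content]


def summarize(max_words: int) -> Callable[[Content], Content]:
    """Return a step function that keeps only the first max_words words."""
    def run(content: Content) -> Content:
        words = content.body.split()
        return replace(content, body=" ".join(words[:max_words]))
    return run


def relabel_markup(target: str) -> Callable[[Content], Content]:
    """Stand-in for markup conversion: only the markup label changes."""
    def run(content: Content) -> Content:
        return replace(content, markup=target)
    return run


def run_pipeline(content: Content,
                 steps: List[Step]) -> Tuple[Content, List[Tuple[str, str]]]:
    """Apply each step in order and record which step ran where."""
    trace = []
    for step in steps:
        content = step.apply(content)
        trace.append((step.name, step.location))
    return content, trace


# Hypothetical deployment: steps could come from different vendors.
PIPELINE = [
    Step("summarize", "server", summarize(5)),
    Step("convert markup", "gateway", relabel_markup("wml")),
]

if __name__ == "__main__":
    page = Content("html", "one two three four five six seven")
    print(run_pipeline(page, PIPELINE))
```

Because every step consumes and produces the same record type, steps can be reordered, replaced, or moved to another location independently, which is the role a standardized interchange format would play between vendors.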
Synchronized Multi-modal Views
Multi-modality can be considered as a particular type of channel.
During multi-modal or multi-device interactions, the
MVC principle becomes especially relevant.
The user interacts via the controller on a given view. Instead of modifying
the view directly, his or her
actions update the state of the model,
which in turn updates all registered views so that they remain
synchronized. Details can be found in
[2].
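The update cycle just described can be sketched with a simple observer arrangement. This is an illustrative assumption about how the registration and notification might look, not the mechanism defined in [2]; the class and method names are invented for the example.

```python
class InteractionModel:
    """Shared model: holds the state and notifies every registered view."""

    def __init__(self):
        self._state = {}
        self._views = []

    def register(self, view):
        self._views.append(view)

    def update(self, field, value):
        # The model changes first; every view is then refreshed from it.
        self._state[field] = value
        for view in self._views:
            view.refresh(dict(self._state))


class View:
    """One channel (e.g. GUI or voice) presenting the shared model."""

    def __init__(self, channel, model):
        self.channel = channel
        self.rendered = {}
        self._model = model
        model.register(self)

    def user_input(self, field, value):
        # Controller role: user actions go to the model, not to the view.
        self._model.update(field, value)

    def refresh(self, state):
        self.rendered = state


if __name__ == "__main__":
    model = InteractionModel()
    gui = View("html", model)
    voice = View("voicexml", model)
    gui.user_input("city", "Paris")    # typed in the visual channel
    voice.user_input("date", "today")  # spoken in the voice channel
    print(gui.rendered)
    print(voice.rendered)
```

Input given in either channel becomes visible in both, because neither view holds state of its own beyond what the model last sent it.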
Single authoring for delivery to a multiplicity of
synchronized target devices and environments has one final crucial
advantage. As we evolve towards devices
that deliver multi-modal user interaction, single authoring enables the
generation of tightly synchronized presentations across different channels,
without requiring re-authoring of the multi-channel applications. The MVC
principle ensures that these
applications are also ready for synchronization across channels.
Such synchronization allows user intent expressed in a given channel to be propagated to all the interaction components of a multi-modal system. We speak of tightly coupled multi-modal interactions, as opposed to loosely coupled multi-modal interactions in which each channel has its own model that periodically synchronizes with the models associated with the other channels. A tightly coupled solution can support a wide range of synchronization granularities. It also allows the interaction to be optimized: a given interaction can take place in the channel best suited to it, and can revert to another channel when that channel is unavailable or not capable enough.
Recommendation
We recommend the constitution of a Working Group
within one of the leading standards organizations to address single authoring
languages and frameworks for multi-channel interactions, including multi-modal
interactions.
Device-independent authoring is an integral part of
the multi-modal web issues and should not be addressed separately.
The resulting standards activities should address both
the authoring of new channel-independent content and the adaptation of
existing content to be channel-independent.
Requirements
This section contains our requirements for a
device-independent authoring language for the multi-modal web. Since we
consider single authoring to be the
key requirement of the multi-modal web, these are also the
requirements for single authoring of multi-channel and multi-modal
applications.
XML compliant
Vendor neutral
Any tool developer can target it or use it as an input representation
It can be used not only to express data within an application, but also to pass it to a network services provider, portal, or directly to an end-user device
A single authoring process should handle both multi-channel applications and multi-modal applications
Supports channel-independent interaction description
Can be mapped using style sheets to an open-ended set of device-specific markups including VoiceXML, WML, CHTML, HTML and others
Supports full-function programming, enabling rich-function devices such as desktop browsers
Extensible to allow new interaction or presentation model abstractions
Can accommodate channel- or device-specific specialization either in-line, as annotations, or using style sheets
Supports a developer-definable hierarchy of channels and devices
Supports specification of data models in XForms / XML Schema to model the data that can be manipulated by the end user
Enables fine-grain synchronization of multi-modal interaction
Can accommodate both synchronous and asynchronous data exchange, and connected as well as disconnected operation
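One possible reading of the requirements on a developer-definable channel hierarchy and on style-sheet specialization is sketched below: a device inherits the style sheet of its nearest ancestor unless a more specific one is supplied. The hierarchy, the channel names, and the style sheet file names are all hypothetical and serve only to illustrate the lookup.

```python
from typing import Dict, Optional

# Hypothetical hierarchy: each channel or device names its parent.
PARENT: Dict[str, Optional[str]] = {
    "any": None,
    "visual": "any",
    "voice": "any",
    "html-desktop": "visual",
    "wml-phone": "visual",
    "voicexml-phone": "voice",
}

# Hypothetical style sheets; only some nodes are specialized.
STYLE_SHEETS: Dict[str, str] = {
    "any": "default.xsl",
    "visual": "visual.xsl",
    "wml-phone": "wml.xsl",
}


def resolve_style_sheet(channel: str) -> str:
    """Walk up the hierarchy until a node with a style sheet is found."""
    node: Optional[str] = channel
    while node is not None:
        if node in STYLE_SHEETS:
            return STYLE_SHEETS[node]
        node = PARENT[node]
    raise KeyError(f"no style sheet found for channel {channel!r}")


if __name__ == "__main__":
    for name in ("wml-phone", "html-desktop", "voicexml-phone"):
        print(name, "->", resolve_style_sheet(name))
```

In this sketch a new device only needs an entry in `PARENT` to obtain a default presentation; a dedicated style sheet can be added later without touching the content.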
In addition, for multi-modal rendering, we recommend leveraging the forthcoming
DOM Level 2 specifications to enable the implementation of MVC with legacy
browsers [1]. This implies that the
supported channel-specific languages must have a standardized DOM Level 2
specification.
References
[1] S. H. Maes and T. V. Raman, “Position paper for the W3C/WAP Workshop on the Multi-modal Web”, IBM position paper, W3C/WAP Joint Workshop on the Multi-modal Web, Hong Kong, September 2000.
[2] S. H. Maes and T. V. Raman, “Multi-modal interaction in the Age of Information Appliance”, in Proceedings ICME 2000, New York, USA, July 2000.