"One model and many presentations" in wireless applications

Thomas Lee, William W. Song, Ken Sung, Frank Tong

E-Business Technology Institute, The University of Hong Kong

Hong Kong, PRC

(Position Paper)

Email: {ytlee, wsong, ksung, ftong}@eti.hku.hk

With the rapid development of Internet connectivity for different devices, "one content, many presentations" has become a central issue in the development of the various markup languages that describe content and presentation. XML is widely accepted for describing Web content because of its simplicity, well-formedness, and extensibility. However, presenting an XML document in a browser requires a style sheet that defines how the document is printed or displayed. In other words, since XML separates content from appearance, an XML document requires a separate presentation description. CSS (Cascading Style Sheets) is one such presentation description, and XSL (Extensible Stylesheet Language) is another. Moreover, as more and more wireless devices such as PDAs, WAP phones, and pagers are connected to the Web, tailoring Web information to fit these devices is key to extending the use of the Internet. Likewise, actively acquiring messages from WAP devices and transferring them to other Web users, who may have access only to ordinary browsers, is an important issue for (short) message management, exchange, and logging.

An outcome of wireless industry collaboration, WAP aims to bring Internet content and advanced data services to digital cellular phones and other wireless devices by creating a global wireless protocol specification that works and scales across different networks.

Let us consider an example:

A user connects to an inventory system with a WAP phone, obtains the inventory data, and sends an instruction (message) to replenish the inventory. The whole process takes place over the Web/Internet. The inventory data is stored in a database. On the Web the data is represented in XML, and on the WAP phone it is presented in WML.

From this example, two major problems, among others, can be identified: scalability and diversity. Scalability concerns how to scale the rich presentation description used for ordinary Internet browsers down to the limited presentation support of wireless devices. For example, a normal PC browser can display an image in color, while a WAP phone can show only a black-and-white image. The presentation description for the color display must therefore be transformed into one that preserves the main content on the black-and-white device. Diversity means that wireless terminals have distinct and varied features, each with its own merits. For example, cellular phones can deliver voice and short text messages, PDAs can display low-resolution pictures, and so on.

At ETI (E-Business Technology Institute), we are working to address the issue of "one document, many presentations". The issue has two aspects: presentation formatting and transcoding, and general presentation description.

Presentation formatting and transcoding

Information access on the Internet or the World Wide Web is no longer restricted to the browsers on PCs, and has been extended to a growing variety of information appliances, including cellular phones and television sets.

Different kinds of appliances possess different combinations of characteristics, such as CPU power, screen size, data transfer rate, and user input interface. Effective content presentation and efficient content representation have become two major challenges confronting Internet content providers.

The Extensible Markup Language (XML) and the many technologies built on it were devised to address these two challenges. Unlike HyperText Markup Language (HTML), XML allows content providers to specify content and presentation in separate documents, dividing content preparation and content presentation into two layers. As long as the two document-handling processes agree on a common interface between the layers, a change in one process will not affect the other. This common interface is usually specified by a Document Type Definition (DTD). When content is prepared in XML that complies with a particular DTD, different stylesheets, e.g. in XSL Transformations (XSLT), can be used to format the same content into presentation documents in different markup languages, e.g. HTML and WML. However, this does not solve all problems.
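The two-layer idea can be illustrated with a minimal sketch. It is written in Python rather than XSLT only so that it runs with a standard library alone; the two formatter functions stand in for two stylesheets. The inventory element names, the attribute names, and the function names are assumptions made for illustration and do not come from any actual ETI system.

```python
import xml.etree.ElementTree as ET

# Hypothetical content document; the element and attribute names are
# illustrative assumptions, not an actual inventory DTD.
CONTENT = """<inventory>
  <item name="Widget" quantity="12"/>
  <item name="Gadget" quantity="3"/>
</inventory>"""


def to_html(root):
    """Format the inventory as an HTML table for a PC browser."""
    page = ET.Element("html")
    table = ET.SubElement(ET.SubElement(page, "body"), "table")
    for item in root.iter("item"):
        row = ET.SubElement(table, "tr")
        ET.SubElement(row, "td").text = item.get("name")
        ET.SubElement(row, "td").text = item.get("quantity")
    return ET.tostring(page, encoding="unicode")


def to_wml(root):
    """Format the same inventory as a single WML card for a WAP phone."""
    deck = ET.Element("wml")
    card = ET.SubElement(deck, "card", {"id": "inventory", "title": "Inventory"})
    for item in root.iter("item"):
        ET.SubElement(card, "p").text = f"{item.get('name')}: {item.get('quantity')}"
    return ET.tostring(deck, encoding="unicode")


# One content document, two presentation documents.
root = ET.fromstring(CONTENT)
html_page = to_html(root)
wml_deck = to_wml(root)
```

Neither formatter changes the content document, and a new formatter can be added without touching the existing ones, provided the content keeps to the agreed structure.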

Firstly, this approach does not significantly improve the efficiency of content preparation. Different classes of XML content usually use different DTDs. For instance, the XML content for a customer profile and the content for a product catalog are very likely described by different DTDs. A stylesheet is required for each combination of content DTD and markup language. Therefore, this approach can only reduce the complexity from M content sets by N markup languages to P groups of content sets by N markup languages, that is, from M × N presentation documents to P × N stylesheets. Here P should be much less than M, since each of the P groups contains a number of the M content sets. Furthermore, writing stylesheets can be a complicated process with a steep learning curve. This problem has led to our "One Document Many Presentations" R&D efforts.
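The counting argument can be made concrete with a small sketch. The file names, DTD names, and function names below are hypothetical and chosen only to illustrate the M, P, and N of the paragraph above.

```python
# Hypothetical content sets, each mapped to the DTD it complies with.
CONTENT_DTDS = {
    "customer_profile_1.xml": "customer.dtd",
    "customer_profile_2.xml": "customer.dtd",
    "catalog_spring.xml": "catalog.dtd",
    "catalog_autumn.xml": "catalog.dtd",
    "catalog_clearance.xml": "catalog.dtd",
}
MARKUP_LANGUAGES = ["HTML", "WML"]


def documents_without_stylesheets(content_dtds, languages):
    """M x N: one hand-written presentation document per content set and language."""
    return len(content_dtds) * len(languages)


def stylesheets_required(content_dtds, languages):
    """P x N: one stylesheet per distinct DTD and language."""
    distinct_dtds = set(content_dtds.values())
    return len(distinct_dtds) * len(languages)


m_by_n = documents_without_stylesheets(CONTENT_DTDS, MARKUP_LANGUAGES)
p_by_n = stylesheets_required(CONTENT_DTDS, MARKUP_LANGUAGES)
```

With five content sets, two DTDs, and two markup languages, the work drops from ten presentation documents to four stylesheets. The saving grows with the number of content sets per DTD, but every new DTD still costs N stylesheets.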

Secondly, stylesheet formatting techniques apply only to text, not to multimedia content, which is mostly binary data. More often than not, multimedia data, especially rich media such as animation, panoramas, and 3D graphics, requires its colors and other levels of detail to be scaled down for display on low-computing-power devices such as handheld devices and mobile phones. This is where the Multimedia Transcoding Technology developed by ETI plays a key role.
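As a rough illustration of what scaling down colors involves, the following sketch reduces an RGB image to a one-bit black-and-white image. It is not ETI's Multimedia Transcoding Technology, whose details are not described here; the function name, the pixel representation, and the threshold value are assumptions, and the brightness weights are the common ITU-R BT.601 luma coefficients.

```python
def to_black_and_white(pixels, threshold=128):
    """Reduce rows of (r, g, b) pixels to rows of 0 (black) or 1 (white).

    Each pixel is first converted to a brightness value using the
    ITU-R BT.601 luma weights, then compared against the threshold.
    """
    result = []
    for row in pixels:
        out_row = []
        for r, g, b in row:
            luma = (299 * r + 587 * g + 114 * b) // 1000
            out_row.append(1 if luma >= threshold else 0)
        result.append(out_row)
    return result


# A tiny 2 x 2 test image: white, black, pure red, pure green.
image = [
    [(255, 255, 255), (0, 0, 0)],
    [(255, 0, 0), (0, 255, 0)],
]
bw_image = to_black_and_white(image)
```

A stylesheet cannot express this kind of operation, because it works on the binary image data rather than on the markup that refers to it.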

Device independent presentation description for various browsing devices

Many different devices are now available for various purposes. People who want to publish their content to these devices must write a different presentation document (description) for each device, because different devices handle rich media in different ways. For example, a video presentation works well on a PocketPC. On a Palm, the video may have to be downgraded to a slide show with no audio output. On a WAP phone, the presentation may even be reduced to a text description. Moreover, different presentations are defined in different description languages, such as HTML for PC browsers and WML for WAP devices. This is not only very time consuming but also prone to conflicts in connotation and structure. More importantly, designers become frustrated when they have to write different descriptions to present rich media on different devices, because the requirements vary greatly from one presentation device to another.

For example, suppose that a designer wants to publish a video presentation on multiple devices. For devices with a fast processor, the designer can transfer the video directly. For devices with a medium-speed processor, the designer may have to reformat the video as a slide show. For devices that can display only text, the designer may have to extract the captions from the video to form a text paragraph. Doing all of this formatting by hand for each presentation would be quite frustrating.
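The decision the designer currently makes by hand can be sketched as a simple capability lookup. The Device class, its fields, and the three capability levels are hypothetical; a real device profile would carry far more detail.

```python
from dataclasses import dataclass


@dataclass
class Device:
    """Hypothetical device profile used only for this illustration."""
    name: str
    processor: str          # assumed levels: "fast", "medium", or "slow"
    text_only: bool = False


def choose_presentation(device):
    """Pick the richest form of a video presentation the device can handle."""
    if device.text_only:
        return "text"
    if device.processor == "fast":
        return "video"
    if device.processor == "medium":
        return "slide show"
    return "text"


devices = [
    Device("PocketPC", processor="fast"),
    Device("Palm", processor="medium"),
    Device("WAP phone", processor="slow", text_only=True),
]
choices = {d.name: choose_presentation(d) for d in devices}
```

A device-independent description would let this selection be made once, by the transformation software, instead of by each designer for each presentation.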

In order to help the presentation designers, we consider a standard modeling language to handle rich media. This language should meet the following requirements:

At ETI, we are attempting to define such a modeling language based on XML. The following simple example of this markup (tag modeling) language captures the panorama media type.

<panorama MaxLOD='1' MinLOD='0.3' InitLOD='1'>
<photo src='panorama.jpg'/>
<narration text='The buildings you can see in Victoria Harbor'/>
<link x='120' y='343' tooltip='the University of Hong Kong'
      href='http://www.hku.hk' />
<link x='1221' y='734' tooltip='Hong Kong Bank'
      href='http://www.hsbc.com' />
</panorama>

The above script describes a panorama view of Victoria Harbor, in which Hong Kong Bank and the University of Hong Kong can be found. The script has the following advantages. First, it does not specify the size of the panorama on the screen. Instead, the size is calculated from the screen size of the device and the values of the panorama tag attributes. This approach enables the panorama to be displayed on devices with different screen sizes.
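The paper does not give the formula for this calculation, so the following is only one plausible reading, stated as an assumption: the LOD attributes are treated as scale factors, the photo is scaled to fit the screen, and the resulting factor is clamped to the range from MinLOD to MaxLOD. The photo dimensions, the screen sizes, and the function name are likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# The panorama element from the example, with quoted attribute values.
PANORAMA = """<panorama MaxLOD="1" MinLOD="0.3" InitLOD="1">
  <photo src="panorama.jpg"/>
</panorama>"""


def display_size(panorama, photo_size, screen_size):
    """Return the (width, height) at which to render the panorama photo.

    Assumed rule, not taken from the paper: find the scale factor that
    fits the photo on the screen, then clamp it to [MinLOD, MaxLOD].
    """
    photo_w, photo_h = photo_size
    screen_w, screen_h = screen_size
    fit = min(screen_w / photo_w, screen_h / photo_h)
    min_lod = float(panorama.get("MinLOD"))
    max_lod = float(panorama.get("MaxLOD"))
    lod = max(min_lod, min(max_lod, fit))
    return round(photo_w * lod), round(photo_h * lod)


root = ET.fromstring(PANORAMA)
photo = (2000, 800)  # hypothetical pixel size of panorama.jpg
desktop = display_size(root, photo, (800, 600))
handheld = display_size(root, photo, (160, 160))
```

Under this reading the handheld result is larger than its screen, because MinLOD stops the image from shrinking further; the device would then show a scrollable window onto the panorama. Other interpretations of the LOD attributes are equally possible.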

Second, the script contains both narration and tool-tips. If the device supports only text display, the script can provide the following content to be seen on the device:

The buildings you can see in Victoria Harbor

Third, the script allows links on the panorama. If the device supports user interaction, the user will be able to issue instructions to navigate around the panorama image. In addition, note that the script contains no browser-dependent information.
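The second and third advantages can be sketched as two small functions over the same script: one extracts the narration for a text-only device, and the other lists the links for a device that supports interaction. The function names are illustrative assumptions, and the script is repeated here with quoted attribute values so that a standard XML parser accepts it.

```python
import xml.etree.ElementTree as ET

PANORAMA = """<panorama MaxLOD="1" MinLOD="0.3" InitLOD="1">
  <photo src="panorama.jpg"/>
  <narration text="The buildings you can see in Victoria Harbor"/>
  <link x="120" y="343" tooltip="the University of Hong Kong"
        href="http://www.hku.hk"/>
  <link x="1221" y="734" tooltip="Hong Kong Bank"
        href="http://www.hsbc.com"/>
</panorama>"""


def text_fallback(panorama):
    """Return the narration text for a device that can display only text."""
    narration = panorama.find("narration")
    return narration.get("text") if narration is not None else ""


def hotspots(panorama):
    """Return (x, y, tooltip, href) for each link, for interactive devices."""
    return [
        (int(link.get("x")), int(link.get("y")),
         link.get("tooltip"), link.get("href"))
        for link in panorama.findall("link")
    ]


root = ET.fromstring(PANORAMA)
fallback = text_fallback(root)
links = hotspots(root)
```

Both functions read the same device-independent description; only the choice of which one to call depends on the device.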

Next step

We will consider a formal definition of the general presentation description language, enabling it to bridge the gap between XML content representation and the various device-specific presentation languages, such as WML, VoiceML, and SMIL. The transcoders or mappings are implemented as transformation modules to support both XML content design and presentation design.