Towards a Generic XML Content Presentation Model

Web Applications and Compound Documents Workshop
Position Paper

Michael Pediaditakis and David Shrimpton
Computing Laboratory, University of Kent, United Kingdom.
30 April, 2004


Web applications are typically built on proprietary, mutually incompatible technologies. We propose a step towards device neutral, interoperable applications based on a common model for XML document presentation. The proposed model features a stable core that enables stable browser implementations and a highly extensible upper layer that supports new functionality without browser modifications. We summarize the problems of compound documents, which such a presentation model must handle, by separating their integration requirements according to whether or not the languages involved are natively supported by the browser. We conclude with future directions and standardization requirements for generic, device independent applications and for a generic integration model for compound documents.

1. Introduction

Web applications are typically used to enhance web sites by providing functionality that is not accessible using widely supported languages such as XHTML. The processing model of a web application and the associated data are downloaded either independently (e.g. plug-ins) or as a whole (e.g. an applet). In general, however, the processing models of the various web applications are mutually incompatible browser extensions that share no common data representation.

We believe that the heavy use of web applications to extend browser functionality is a habit of the pre-XML era, when a browser extension was needed for practically everything not in the HTML specification. XML is a generic data representation language in which any type of presentation, no matter how complex the graphics and the interaction, can be described. Current web browsers, aiming to provide all the required functionality, resemble generic middleware applications more than purpose specific ones. Consequently, the data representation and the processing power already exist. What is missing is a well defined, generic XML content presentation model that defines the presentation of XML data in a generic and preferably device neutral manner. Such a model will enable the evolution of current web applications into device neutral and interoperable components that reuse existing functionality to achieve arbitrarily complex presentation and interaction levels.

It is clear that the sum of the required functionality should not (or indeed cannot) be defined within a single language specification. Instead, it should be provided in a modular and extensible way using a variety of XML languages with additional languages built on top of these. Any non-trivial application would require compound XML documents that combine the above languages. Consequently, a generic specification of how compound documents are presented is required for a generic XML presentation model. The basic problem with compound documents is that their meaning is generally undefined concerning both the core XML processing tasks, such as validation, and higher level tasks such as inter-language user interface event communication[Lee02, TAG04].

Our research focuses on the development of a generic model for the presentation of compound XML documents on a variety of devices. We have already presented a transformation approach[PeS03] and we are currently working on a more generic model that is based on a generic and device neutral browser presentation model. Section 2 describes our views and efforts on how current web applications can evolve into device neutral interoperable technologies based on a generic XML content presentation model. Section 3 focuses on the compound document processing aspect of the above model.

2. Web Applications

Typically, a web application is used instead of a document for presentation and interaction functionality not supported by native browser languages or for more precise control of the presentation. However, the above mostly reflects common practice and not any real difference between web applications and documents. If there is a generic XML document presentation model, which can be precisely controlled in a device independent manner (e.g. using scripting), the difference between web applications and documents disappears. We, therefore, propose that web applications can evolve to generic interoperable components if a sufficiently expressive presentation model is established for XML documents.

2.1 A presentation model

A generic presentation and interaction model requires a well defined set of core components, with their associated syntax and behaviour, and a generic processing model that enables extensions to this set. Ideally, the set of components exposes well defined interfaces that can be used to manipulate the presentation in a device independent manner.

2.1.1 A set of low level semantics

The set of core components is essentially the set of languages that a browser may support. It is clearly impossible to include all the components required for current and future web applications. However, at an adequate abstraction level, the core components can sufficiently support the majority of the required extensions. They must be sufficiently abstract to be common across devices and sufficiently low level to give full control over the presentation. These components, essentially, provide the lower level functionality upon which more complex components can be built. Therefore, we will refer to them as the low level semantics.

We have performed a preliminary investigation, based on the requirements for a generic presentation model and existing language features, to specify the required low level semantics set. The resulting components belong to the following categories:

The set of low level semantics for the Web must also consider a variety of devices. This can be done at two levels. Firstly, the above set should be separated into modules (similarly to XHTML Basic, XHTML Mobile etc). A device profile (such as CC/PP), in addition to the basic characteristics of a device, can describe the modules supported by the device. Such a profile allows dynamic content creation and the mapping of more complex (higher level) semantics to ones that fit the specific device. This mapping is partially illustrated in XMLPipe, a transformation engine for device neutral content presentation[PeS03]. The second level of device specific support comes from a device independent API to device dependent implementations of the low level semantics. Device neutral scripting can be used (at presentation time) to query and control device dependent information that cannot be easily described in a profile (such as text layout algorithms etc).
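The first level described above, choosing a mapping for a higher level construct from the modules a device reports, can be sketched as follows. This is a minimal illustration under stated assumptions and not the XMLPipe implementation: the DeviceProfile class, the module names and the MAPPINGS table are all hypothetical, and a real CC/PP profile is an RDF document rather than a Python object.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DeviceProfile:
    """Hypothetical, simplified stand-in for a CC/PP-style device profile."""
    name: str
    screen_width: int
    modules: frozenset = field(default_factory=frozenset)


# Hypothetical table: each high level construct lists candidate mappings,
# ordered from richest to simplest, with the modules each mapping needs.
MAPPINGS = {
    "image-map": [
        ("svg-regions", {"svg", "scripting"}),
        ("image-plus-links", {"image", "hyperlink"}),
        ("text-links", {"text", "hyperlink"}),
    ],
}


def choose_mapping(construct, profile):
    """Return the first (richest) mapping whose modules the device supports."""
    for mapping_name, required in MAPPINGS[construct]:
        if required <= profile.modules:  # subset test
            return mapping_name
    return None  # no mapping fits this device


desktop = DeviceProfile(
    "desktop", 1280,
    frozenset({"text", "image", "hyperlink", "svg", "scripting"}))
phone = DeviceProfile("phone", 176, frozenset({"text", "image", "hyperlink"}))
pager = DeviceProfile("pager", 80, frozenset({"text"}))

for device in (desktop, phone, pager):
    print(device.name, choose_mapping("image-map", device))
```

Ordering the candidates from richest to simplest keeps the selection rule trivial; a device that supports none of the required modules yields no mapping, which a content server could treat as a signal to fall back to a different document.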

A combination of existing recommendations can be used as the low level semantics set, provided that their interoperation and a presentation-time access API are precisely defined. For instance, a combination of CSS + XHTML + SVG + MathML + XSL-FO + SMIL + XForms + JavaScript covers a wide range of the functionality above.

2.1.2 Generic behaviour model

In order to define a generic behaviour model for web application presentation and interaction we must bound the required behaviour. Concerning the presentation, most applications are limited to two dimensional graphics and the notion of time and sound. There is already significant research, in the multimedia/document formatting area, on generic models for organizing presentable components (graphics and sound) in a timed two-dimensional model. It is feasible to develop such a generic model for the Web. Concerning the interaction, there is no constraint on the various forms of interaction that a device can provide. However, a combination of predefined interactive low level semantics and a consistent way of modifying the properties of any presentable object upon interaction can be established. The result can be a sufficiently generic interaction model.

2.1.3 Generic processing model

An extensible processing model must allow documents to contain higher level constructs (henceforth: high level semantics) that combine low level semantics to enhance the browser functionality. We propose the following processing sequence:

  1. Document parsing to a DOM tree (High Level Semantics)
  2. Validation
  3. Transformation to a low level semantics representation
  4. Instantiation of the device dependent low level semantics presentable objects
  5. Orchestrate the presentation (spatio-temporal component layout)
  6. Present the document
  7. Process any user/remote interaction

The above model is based on a stable set of low level semantics. High level semantics are supported by transformation to the low level ones. For instance, a high level concept of an image-map can be transformed to a combination of an image and a link orchestrated by scripting.
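The first three steps of the sequence, applied to the image-map example, can be sketched as follows. This is only an illustration under stated assumptions: the element names (imagemap, area, group, image, link) form a hypothetical vocabulary, the validation is a toy check rather than schema validation, and steps 4 to 7 (instantiation, orchestration, presentation and interaction) are omitted.

```python
import xml.etree.ElementTree as ET

# Hypothetical high level vocabulary: an <imagemap> with clickable <area>s.
HIGH_LEVEL = """
<page>
  <imagemap src="campus.png">
    <area shape="rect" coords="0,0,50,50" href="library.xml"/>
    <area shape="rect" coords="50,0,100,50" href="labs.xml"/>
  </imagemap>
</page>
"""


def validate(root):
    """Step 2 (toy check): every <area> must carry href and coords."""
    for area in root.iter("area"):
        if "href" not in area.attrib or "coords" not in area.attrib:
            raise ValueError("area needs href and coords")


def to_low_level(root):
    """Step 3: rewrite each <imagemap> as an <image> plus <link> elements."""
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag != "imagemap":
                continue
            group = ET.Element("group")
            ET.SubElement(group, "image", src=child.get("src"))
            for area in child.findall("area"):
                ET.SubElement(group, "link", target=area.get("href"),
                              region=area.get("coords"))
            parent[index] = group  # replace the high level node in place
    return root


root = ET.fromstring(HIGH_LEVEL)   # step 1: parse to a tree
validate(root)                     # step 2: validate the high level document
low = to_low_level(root)           # step 3: transform to low level semantics
print(ET.tostring(low, encoding="unicode"))
```

Because the browser only ever sees the output of step 3, a new high level language needs a new transformation but no change to the client.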

The internal representation can be based on the Document Object Model, orchestrated using DOM Events where behaviour is attached to objects using a binding syntax such as XBL[Hya01]. The low level semantics interoperate (orchestration, presentation) using the well defined behaviour model. The interoperation of high level semantics is an interesting issue that we discuss in Section 3.

The above model, essentially, provides a distributed browsing platform with a stable device dependent core, a device neutral interface (the low level semantics) and a continuously evolving set of high level semantics (in terms of supported XML languages). This allows both a stable client that need not be continuously updated and an architecture that continuously adapts to new applications.

2.3 Current state and future directions

Although the foundation for the proposed generic presentation model exists, there is a significant amount of work still to be done. The low level semantics already exist in current recommendations. However, a generic presentation and processing model is missing. A first step would be to standardize how existing language modules can be integrated. Then, a well defined syntax for application profiles, a presentation model and a well defined API for the presentable objects can follow (this might be an additional part of DOM-3).

We are currently experimenting with the above ideas in a pilot browser implementation. It uses a minimalistic set of low level semantics that resides at a lower level than existing recommendations (containing concepts such as scroll areas, pointers etc). A well defined API to these semantics can be accessed by Java code (in the document or in a separate behaviour specification document), and each device supplies device specific implementations of this API. The results are promising, but there is still work to be done.
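The split between a device neutral API and device specific implementations can be sketched as follows. This is not the API of the pilot browser, which is accessed from Java; the sketch is in Python for brevity, and the ScrollArea interface, its methods and the two device classes are hypothetical names chosen for illustration.

```python
from abc import ABC, abstractmethod


class ScrollArea(ABC):
    """Hypothetical device neutral interface for a scrollable region."""

    def __init__(self, content_height, viewport_height):
        self.content_height = content_height
        self.viewport_height = viewport_height
        self.offset = 0

    def scroll_by(self, delta):
        """Device neutral logic: clamp the offset to the scrollable range."""
        limit = max(0, self.content_height - self.viewport_height)
        self.offset = min(max(self.offset + delta, 0), limit)
        return self.offset

    @abstractmethod
    def describe(self):
        """Device specific: report how the region is shown to the user."""


class DesktopScrollArea(ScrollArea):
    def describe(self):
        return f"scrollbar at {self.offset}px"


class PhoneScrollArea(ScrollArea):
    def describe(self):
        # A small screen pages through the content one viewport at a time.
        return f"page {self.offset // self.viewport_height + 1}"


def page_down(area):
    """Script written once against the interface, run on any device."""
    area.scroll_by(area.viewport_height)
    return area.describe()


print(page_down(DesktopScrollArea(1000, 400)))
print(page_down(PhoneScrollArea(300, 100)))
```

The behaviour script (page_down) depends only on the abstract interface, so the same script can accompany a document to any device that supplies an implementation.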

3. Compound Documents

The multitude of XML languages and the need for compound documents introduce the requirement for a generic integration model for all the XML processing stages (e.g. the ones in Section 2.1.3). The requirements of such a model differ between integration within a well defined set of languages (low level semantics) and integration of languages which are continuously created and updated by independent language authors (high level semantics).

High level semantics prohibit the use of specific integration profiles (such as XHTML + SVG + MathML), since the number of profiles that would have to be developed grows exponentially with the number of languages. We have proposed a simple integration model for the validation and transformation processing steps of independently developed languages in [PeS03]. It is based on the classification of language elements and attributes into three types of "handled constructs". A document fragment of a language rooted at a handled construct can be integrated at specific points into a document tree of another language (according to some simple rules). The syntax and integration information for each language is found using RDDL[BoB02] links at the URI of the language namespace. Thus, no processing instructions or document specific processing sequences are required. The result is successful document presentation that covers cases such as XHTML + SVG + MathML without the need for specific profiles.
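As a rough illustration of where such integration points arise, the sketch below locates the elements of a compound document at which the namespace (and hence the language) changes. It is not the model of [PeS03]: it does not classify handled constructs, apply the integration rules or follow RDDL links, and the function names are hypothetical. It only shows that fragment roots can be found from the namespaces alone, without processing instructions.

```python
import xml.etree.ElementTree as ET

COMPOUND = """
<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <p>Area of a circle:</p>
    <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>r</mi></math>
    <svg xmlns="http://www.w3.org/2000/svg"><circle r="10"/></svg>
  </body>
</html>
"""


def namespace_of(element):
    """ElementTree stores tags as '{namespace}local'; extract the namespace."""
    if element.tag.startswith("{"):
        return element.tag[1:].split("}", 1)[0]
    return ""


def fragment_roots(root):
    """Yield (host namespace, fragment root) wherever the language changes."""
    for parent in root.iter():
        for child in parent:
            if namespace_of(child) != namespace_of(parent):
                yield namespace_of(parent), child


roots = [(host, el.tag) for host, el in fragment_roots(ET.fromstring(COMPOUND))]
for host, tag in roots:
    print(host, "hosts", tag)
```

Each fragment root found this way is a candidate for per-language validation and transformation; the namespace URI is also the place where, in the proposed model, the syntax and integration information for that language would be looked up.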

Concerning the low level semantics, profiles are good only as short term solutions, due to the multitude of module combinations required for the various devices. Their applicability can be enhanced by validators that use device profiles to retrieve the set of supported elements. A long term solution would require a generic integration model such as the one for high level semantics. A well defined presentation model (Section 2) is the basis of their interoperation (events etc). However, we have noticed that, due to the multitude of the required presentation abstractions, the layout processing cannot be efficiently described in procedural terms. It has been shown that most web layout problems can be efficiently addressed in a declarative manner using linear constraints[BLM00]. We believe that incorporating a constraint system in the presentation model will enhance the interoperation of the low level semantics in a modular manner.
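A small worked example may make the declarative style concrete. The sketch below states a two column layout as three linear equalities and solves them with a generic routine; the page width, gutter and proportions are assumed values, and the solver is plain Gaussian elimination rather than the algorithm of [BLM00]. Practical constraint-based layout typically also involves inequalities and preferences, which this sketch does not handle.

```python
from fractions import Fraction


def solve(rows):
    """Solve a square linear system given as augmented rows [a1..an | b]."""
    n = len(rows)
    m = [[Fraction(v) for v in row] for row in rows]
    for col in range(n):
        # Find a row with a non-zero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return [row[-1] for row in m]


PAGE_WIDTH = 600  # assumed device width
GUTTER = 20       # assumed gap between the columns

# Unknowns: sidebar width s, main column width c, main column left edge x.
constraints = [
    # s + c = PAGE_WIDTH - GUTTER   (columns and gutter fill the page)
    [1, 1, 0, PAGE_WIDTH - GUTTER],
    # c - 3s = 0                    (main column is three times the sidebar)
    [-3, 1, 0, 0],
    # x - s = GUTTER                (main column starts after sidebar + gutter)
    [-1, 0, 1, GUTTER],
]

sidebar, column, column_x = solve(constraints)
print(sidebar, column, column_x)  # 145 435 165
```

Changing PAGE_WIDTH for a different device changes the solution without any change to the constraints themselves, which is the modularity argued for above.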

4. Conclusion

Summarizing, we propose that in order to move from a multitude of incompatible web applications to an equally or even more expressive set of interoperable technologies, a standard presentation model for XML documents must be established. Moreover, we have expressed the different integration requirements for compound documents depending on their direct support by the browser.

Our current research examines the problem of generic XML content presentation. The proposed validation and transformation of compound documents have been implemented in the XMLPipe platform[PeS03]. We are currently investigating more generic approaches based on a constraint based presentation processing model that might lead to a generic way of handling the majority of XML content on a variety of devices.


[BoB02] Jonathan Borden and Tim Bray, Resource Directory Description Language (RDDL), W3C Note, February 2002.
[BLM00] Alan Borning, Richard Kuang-Hsu Lin and Kim Marriott, Constraint-based document layout for the Web, in Multimedia Systems, Vol 8, pp 177-189, Springer-Verlag New York, Inc.
[Hya01] David Hyatt, XBL - XML Binding Language, W3C Note, February 2001.
[Lee02] Tim Berners-Lee, Axioms of Web Architecture: the meaning of a document, W3C, December 2002.
[PeS03] Michael Pediaditakis and David Shrimpton, Device neutral pipelined processing of XML documents, in Proceedings of the Twelfth International World Wide Web Conference, May 2003.
[TAG04] Technical Architecture Group, Technical Architecture Group Issues, W3C, April 2004.