This Report summarizes the work and results of the Model-Based User Interfaces Incubator Group (MBUI-XG). The MBUI-XG hopes to enable a new generation of Web authoring tools and runtimes that will make it much easier to create tomorrow's Web applications and to tailor them to a wide range of user preferences, device capabilities and environments. To achieve this, the MBUI-XG has evaluated research on model-based user interface (MBUI) design as a framework for authoring Web applications, with a view to proposing work on related standards.

Introduction

Web application developers face increasing difficulties due to wide variations in device capabilities and in the details of the standards devices support, the need to support assistive technologies for accessibility, the demand for richer user interfaces (UIs), the proliferation of programming languages and libraries, and the need to contain costs and meet challenging schedules during the development and maintenance of applications.

Research on model-based design of context-sensitive UIs has sought to reduce the cost of developing and maintaining multi-target UIs through a layered architecture that separates out different concerns. This architecture focuses on design and isolates the implementation challenges posed by specific delivery channels. It enables developers to work top-down or bottom-up. The implementation, or "Final UI", can be generated automatically, subject to developer preferences or adaptation policies. This includes the notion of UI skins, where a particular style is applied to the models defined by the Concrete UI.

During the last year the W3C Model-Based UI XG has evaluated research on MBUIs, including end-to-end models that extend beyond a single Web page, and has assessed its potential as a framework for developing context-sensitive Web applications. This report gives an overview of the main results achieved by the Incubator Group. After this introduction, the main concepts and rationale of MBUI design are described, followed by a description of the CAMELEON Unified Reference Framework. Then, a complete example is shown. The report continues with a description of the meta-models ....

[section to be completed by JMCF]

Model-Based Approaches for User Interfaces

Introduction

The purpose of Model-Based Design is to identify high-level models that allow designers to specify and analyse interactive software applications at a more semantic level, rather than starting immediately at the implementation level. Designers can thus concentrate on the more important aspects without being distracted by implementation details, and rely on tools that keep the implementation consistent with the high-level choices. By using models that capture semantically meaningful aspects, designers can more easily manage the increasing complexity of interactive applications and analyse them both during development and when they have to be modified [[P05]].

In recent years model-based approaches have evolved to cope with the various challenges raised by the design and development of UIs in continuously evolving technological settings. We can identify several generations of work in this area [[PSS09]]. In the first generation of model-based approaches the focus was basically on deriving abstractions for graphical UIs (see for example UIDE [[FS94]]). At that time, UI designers focused mainly on identifying the aspects relevant to this kind of interaction modality.

Then, the approaches evolved into a second generation focusing on expressing the high-level semantics of the interaction: this was mainly supported through the use of task models and associated tools, aimed at expressing the activities that the users intend to accomplish while interacting with the application (see for example Adept [[J93]], GTA [[vdV94]], ConcurTaskTrees (CTT) [[P99]]).

Afterwards, thanks to the growing affordability of new interactive platforms, in particular mobile ones, the work of UI designers focused mainly on the relentless appearance of new devices on the market and the need to cope with their differing characteristics. As pointed out by Myers et al. (2000), the increasing availability of new interaction platforms renewed interest in model-based approaches as a way to allow developers to define the input and output needs of their applications, vendors to describe the input and output capabilities of their devices, and users to specify their preferences. However, to be effective, such approaches should still give designers good control over the final result.

Once the relevant abstractions for models have been identified, the next issue is specifying them through a suitable language that enables integration within development environments, facilitating the work of designers and developers. For this purpose, the notion of User Interface Description Language (UIDL), discussed later in this report, has emerged.

A model-driven approach to UI development can be seen as an engineering effort to systematize that development: high-level requirements are constructed and progressively transformed into specifications detailed and precise enough to be rendered or transformed into code. This type of approach is referred to in the Software Engineering literature as the transformational approach.

Any method and its supporting development tools are expected to support a flexible development life cycle effectively and efficiently, enforcing a minimum number of priority constraints. These constraints define which development artifacts must be specified before others, suggesting for example how and when to proceed from one development step to another.

The CAMELEON Reference Framework

The CAMELEON Unified Reference Framework [[CCB02]] [[CCTLBV03]] was produced by the EU-funded CAMELEON Project [[CAM-Proj]].

CAMELEON describes a framework that serves as a reference for classifying UIs that support multiple targets, or multiple contexts of use, in the field of context-aware computing. The CAMELEON Framework provides a unified understanding of context-sensitive UIs rather than a prescription of particular ways or methods of tackling the different steps of development.

The Context of Use

Context is an all-embracing term. Composed of “con” (with) and “text”, context refers to the meaning that must be inferred from the adjacent text. As a result, to be operational, context can only be defined in relation to a purpose, or finality [[CRO02]]. In the field of context-aware computing a widely used definition of Context is provided by [[Dey2000]]: Context is any information that can be used to characterize the situation of entities (i.e., a person, place or object) that are considered relevant to the interaction between a user and an application, including the user and the application themselves. Context is typically the location, identity and state of people, groups, and computational and physical objects.

While the above definition is rather general, thus encompassing many aspects, it is not directly operational. Hence, we hereby define the Context of Use of an interactive system as a dynamic, structured information space that includes the following entities:

  • a model of the User, U (who is intended to use, or is actually using, the system);
  • the hardware-software Platform, P (which includes the set of computing, sensing, communication, and interaction resources that bind together the physical environment with the digital world);
  • the social and physical Environment, E (where the interaction actually takes place).
Thus, a context of use is a triple (U, P, E).

The User represents the human being (or a human stereotype) who is interacting with the system. The characteristics that are relevant, and therefore modelled, depend heavily on the application domain. Specific examples are age, level of experience, permissions, preferences, tastes, disabilities, and short-term and long-term interests. In particular, perceptual, cognitive and action disabilities may be expressed in order to choose the best modalities for the rendering and manipulation of the interactive system.

The Platform is modeled in terms of resources, which in turn, determine the way information is computed, transmitted, rendered, and manipulated by users. Examples of resources include memory size, network bandwidth, and input and output interaction devices. [[CCB02]] distinguishes between elementary platforms (e.g. laptop, PDA, mobile phone), which are built from core resources (e.g. memory, display, processor) and extension resources (e.g. external displays, sensors, mice), and clusters, which are built from elementary platforms. Resources motivate the choice for a set of input and output modalities and, for each modality, the amount of information made available. W3C's Delivery Context Ontology [[DCONTOLOGY]] is intended to define a standard Platform Model.

The Environment denotes the set of objects, persons and events that are peripheral to the current activity but that may have an impact on the system's and/or the user's behaviour, either now or in the future [[CRO02]]. According to this definition, an environment may encompass the entire world. In practice, the boundary is set by domain analysts whose role is to elicit the entities that are relevant to the case at hand. Specific examples are: the user's location, ambient sound, lighting or weather conditions, available networks, nearby objects, the user's social networks, level of stress, etc.

The relationship between a UI and its contexts of use leads to the following definitions:

Multi-target (or multi-context) UI
A multi-target (or multi-context) UI supports multiple types of users, platforms and environments. Multi-user, multi-platform and multi-environment UIs are specific classes of multi-target UIs which are, respectively, sensitive to user, platform and environment variations. [[CCTLBV03]]
Adaptive UI
An Adaptive UI is a UI that is aware of the context of use and (automatically) reacts to changes in this context in a continuous way (for instance, by changing the UI presentation, contents, navigation or even behaviour).
Adaptable UI
An Adaptable UI can be tailored according to a set of predefined options. Adaptability normally requires explicit human intervention. Examples of UI adaptability can be found in word processors whose toolbars can be customized by end users.
Plastic UI
A Plastic UI is a multi-target UI that preserves usability across multiple targets. Usability is not intrinsic to a system. Usability can only be validated against a set of properties set up in the early phases of the development process. [[CCTLBV03]]

Abstraction Levels

The CAMELEON Reference Framework structures the development life cycle into four levels of abstraction, from the task specification to the running interface (see [[fig-cameleon]]):

  • The Task and Concepts level (corresponding to the Computation-Independent Model, CIM, in MDE) considers: (a) the logical activities (tasks) that need to be performed in order to reach the users’ goals and (b) the domain objects manipulated by these tasks. Tasks are often represented hierarchically, along with indications of the temporal relations among them and their associated attributes.
  • The Abstract User Interface (AUI) (corresponding to the Platform-Independent Model, PIM, in MDE) is an expression of the UI in terms of interaction spaces (or presentation units), independently of which interactors are available and even independently of the modality of interaction (graphical, vocal, haptic, …). An interaction space is a grouping unit that supports the execution of a set of logically connected tasks.
  • The Concrete User Interface (CUI) (corresponding to the Platform-Specific Model, PSM, in MDE) is an expression of the UI in terms of “concrete interactors”, which depend on the type of platform and media available and have a number of attributes that define more concretely how the UI should be perceived by the user. "Concrete interactors" are, in fact, an abstraction of the actual UI components generally included in toolkits.
  • The Final User Interface (FUI) (corresponding to the code level in MDE) consists of source code, in any programming or markup language (e.g. Java or HTML5). It can then be interpreted or compiled. A given piece of code will not always be rendered in the same manner, depending on the software environment (virtual machine, browser, …). For this reason, we consider two sublevels of the FUI: the source code and the running interface.

These levels are structured with a relationship of reification going from an abstract level to a concrete one, and a relationship of abstraction going from a concrete level to an abstract one. There can also be a relationship of translation between models at the same level of abstraction but conceived for different contexts of use. These relationships are depicted in [[fig-cameleon]].

CAMELEON Reference Framework
Relationships between components in the CAMELEON Reference Framework

User Interface Description Languages

After having identified the relevant abstractions for models, the next issue is specifying them through a suitable language that enables integration within development environments, so as to facilitate the work of designers and developers. For this purpose, the notion of User Interface Description Language (UIDL) has emerged as a means to express any of the aforementioned models.

A UIDL [[GGC09]] is a formal meta-language used in Human-Computer Interaction (HCI) to describe a particular UI independently of any implementation technology. As such, the UI might involve different interaction modalities (e.g., graphical, vocal, tactile, haptic, multimodal), interaction techniques (e.g., drag and drop) or interaction styles (e.g., direct manipulation, form filling, virtual reality). A common fundamental assumption of most UIDLs is that UIs are modelled as algebraic or model-theoretic structures that include a collection of sets of interaction objects together with behaviours over those sets. A UIDL can be used at several stages of the development process.

The design process for a UIDL encompasses the definition of several artefacts.

UIDL is a more general term than "User Interface Markup Language" (UIML), which is often defined as [[UIML-Def]]: a markup language that renders and describes graphical user interfaces and controls. Many of these markup languages are dialects of XML and depend upon a pre-existing scripting language engine, usually a JavaScript engine, for the rendering of controls and extra scriptability. Thus, as opposed to a UIML, a UIDL is not necessarily a markup language (although most UIDLs are) and does not necessarily describe a graphical user interface (although most UIDLs abstract only graphical user interfaces).

[[GGC09]] includes a table comparing the major UIDLs of today. Most UIDLs are limited in scope and/or usage, are no longer maintained, or are the property of companies that do not allow their use without paying royalties. These UIDLs are also very heterogeneous in terms of coverage, aims and goals, software support, etc. Hence, although many UIDLs have been introduced so far, there is still a need for a unified, standard UIDL that encompasses the fruitful experiences of the most recent of them.

Multi-Path Transformational UI Development

The variety of the approaches adopted in organizations and the rigidity of existing solutions provide ample motivation for a UI development paradigm that is flexible enough to accommodate multiple development paths and design situations while staying precise enough to manipulate the information and knowledge required for UI development. To alleviate these problems, the paradigm of multi-path UI development is introduced by [[LV09]]. Such a development paradigm is characterized by both a transformational approach and multiple development paths formed by different development steps. Thus, different development steps can be combined to form alternative development paths that are compatible with the organization's tools, constraints, conventions and contexts of use.

Transformation Steps

[[LV09]] describes different kinds of transformation steps:

  • Reification covers the inference process from high-level abstract descriptions to run-time code. The CAMELEON Reference Framework recommends a four-step reification process: a Concepts-and-Tasks Model is reified into an Abstract UI which in turn leads to a Concrete UI. A Concrete UI is then turned into a Final User Interface, typically by means of code generation techniques.
  • Code generation is a particular case of reification which transforms a Concrete UI Model into compilable or interpretable code.
  • Translation is an operation that transforms a description intended for a particular target into a description at the same abstraction level but aimed at a different target.
  • Reflection transforms a UI representation at a given level of abstraction to another UI representation at the same level of abstraction for the same context of use.
  • Abstraction is an operation intended to map a UI representation from one non-initial level of abstraction to a higher level of abstraction. In the context of reverse engineering, it is the opposite of reification.
  • Code reverse engineering is a particular case of abstraction from executable or interpretable code to models.

Development Paths

The transformation types introduced in the previous section are instantiated into development steps, which may be composed to form development paths. Several types of development paths are identified by [[LV09]]:

  • Forward engineering is a composition of reification and code generation enabling a transformation of a high-level viewpoint into a lower-level viewpoint.
  • Reverse engineering is a composition of abstractions and code reverse engineering enabling a transformation of a low-level viewpoint into a higher-level viewpoint.
  • Context of use adaptation is a composition of a translation with another type of transformation enabling a viewpoint to be adapted in order to reflect a change in the context of use of a UI.
  • Middle-out development: This term refers to a situation where a programmer starts development with a specification of the UI (no task or concept specification is built beforehand). Several contributions have shown that in reality a development cycle is rarely sequential and rarely begins with a task and domain specification. The literature on rapid prototyping converges on similar observations. Middle-out development denotes a development path starting in the middle of the development cycle, e.g. with the creation of a CUI or AUI model. After several iterations at this level (typically until customer satisfaction is reached) a specification is reverse engineered. From this specification the forward engineering path is followed.
  • Retargeting: This transition is useful in processes where an existing system should be retargeted, that is, migrated from one source computing platform to a target computing platform that poses different constraints. Retargeting is a composition of reverse engineering, context adaptation and forward engineering. In other words, the Final UI code is abstracted into a CUI (or an AUI). This CUI and/or AUI is reshuffled according to specific adaptation heuristics. From the reshuffled CUI and/or AUI specification a new interface code is created along a forward engineering process.

The CAMELEON Reference Framework promotes a four-step forward engineering development path starting with domain concepts and task modelling. Although research in HCI has promoted the importance of task modelling, practitioners often skip this stage and directly produce CUIs using prototyping tools such as Flash, because of the lack of tools allowing rapid prototyping from task models. This practice corresponds to the last two steps of the reification process recommended in the reference framework. Nonetheless, the framework can be instantiated with the number of reification steps that fits the designers' culture. In other words, designers can choose the entry point in the reification process that best fits their practice. If necessary, the missing abstractions higher in the reification process can be retrieved through reverse engineering.

A Simple Example

The following example is intended to provide a better understanding of the different layers of abstraction introduced by the CAMELEON Reference Framework. It illustrates how a simple web search interface can be modelled at different abstraction levels. At the task level, the activities to be performed by the user to reach the goal are modelled. The AUI level then serves to model the interactors and containers that can support the user's tasks; such interactors are platform and modality independent. At the CUI level, graphical concrete interactors and containers (window, textInput, button) are introduced. Finally, the CUI interactors are realized by means of HTML markup.

Cameleon Reference Framework instantiated
An instantiation of the CAMELEON Reference Framework
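
As a rough illustration of the lowest level, the sketch below shows how the graphical CUI interactors of this example (window, textInput, button) might be realized as HTML markup at the FUI level. The identifiers, labels and target URL are purely illustrative and are not produced by any particular tool.

  <!-- Hypothetical FUI for the web search example: the CUI window becomes a
       page containing a form, the textInput becomes a text field and the
       button becomes a submit control. All names are placeholders. -->
  <form action="/search" method="get">
    <label for="query">Search terms</label>
    <input type="text" id="query" name="q"/>
    <input type="submit" value="Search"/>
  </form>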

State of the Art

Context Models

Context of Use Model (NEXOF-RA)

[[fig-cusemodel]] graphically depicts the "Context of Use" Model proposed by the NEXOF-RA Project [[NEXOF-RA]]. This Model captures the Context of Use in which a user interacts with a particular computing platform in a given physical environment in order to achieve an interactive task.

Context of Use is the main entity and has been modelled as an aggregation of User, Platform and Environment, which are all Context Elements. A Context Element is an instance of Context Entity. A Context Property represents a characteristic of a Context Element or information about its state. A Context Property may be associated with zero or more instances of Context Value. Examples of Context Property instances are 'position', 'age' or 'cpuSpeed'. Context Property instances can be composed of other sub-properties; for example, the 'position' property is typically composed of 'latitude', 'longitude' and 'altitude'. Both a Context Property and a Context Value can be associated with metadata, represented by the Context Property Description and Context Value Description classes respectively. A Context Property can be obtained from different Context Providers. A Device Description Repository (DDR) [[DDR-REQUIREMENTS]] [[DD-LANDSCAPE]] is a Context Provider of information about the "a priori known" characteristics of Platform Components, particularly devices or web browsers.

The model also describes a simple, abstract conceptual model for the Platform. A Platform can be represented as an aggregation of different Aspect [[DDR-SIMPLE-API]] instances (device, web browser, network, camera, ...), which are called Components. To this end the relationship has been split into three different aggregations: active, default and available. active indicates that a Component is "running"; for example, if a camera is "on", that Component is said to be "active". default identifies the Aspect instance that is referred to when no specific one is explicitly mentioned. Finally, available represents the "ready to run" Components; for example, when a device has more than one web browser installed, the "available web browsers" are all the installed web browsers that could potentially be run.

Context Meta-Model
Context of Use Model (NEXOF-RA Project)
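
To make the meta-model more tangible, the following fragment sketches what a serialized Context of Use instance might look like, with one Context Property per Context Element. The NEXOF-RA project does not prescribe this syntax; all element and attribute names below are assumptions that simply mirror the class names of the model above.

  <!-- Hypothetical serialization of a Context of Use instance
       (illustrative only; not defined by NEXOF-RA). -->
  <contextOfUse>
    <user>
      <contextProperty name="age">
        <contextValue>34</contextValue>
      </contextProperty>
    </user>
    <platform>
      <contextProperty name="cpuSpeed" provider="DDR">
        <contextValue unit="MHz">600</contextValue>
      </contextProperty>
    </platform>
    <environment>
      <contextProperty name="position">
        <contextProperty name="latitude"><contextValue>40.4</contextValue></contextProperty>
        <contextProperty name="longitude"><contextValue>-3.7</contextValue></contextProperty>
      </contextProperty>
    </environment>
  </contextOfUse>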

Platform Model : W3C's Delivery Context Ontology

The Delivery Context Ontology (DCO) [[DCONTOLOGY]] is a W3C specification (work in progress) which provides a formal model of the characteristics of the environment in which devices interact with the Web or other services. The Delivery Context, as defined by [[DI-GLOSS]], includes the characteristics of the Device, the software used to access the service and the Network providing the connection, among others. DCO is intended to be used as a concrete, standard Platform Model, although for convenience it also models some Environment entities.

[[fig-dco]] gives an overview of the main entities modelled by DCO. The root entity is the DeliveryContext class, which is linked to the currentDevice, currentUserAgent, currentNetworkBearer and currentRuntimeEnvironment. These are in fact active Components from the point of view of the Context of Use Model presented in the previous section. The Device class has been modelled as an aggregation of DeviceSoftware and DeviceHardware. In addition, DCO also models some elements of the Environment, such as the Location or the Networks present.

Delivery Context Ontology
Delivery Context Ontology : Main Entities
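
As an indication of how a delivery context instance might be expressed with these entities, the fragment below uses RDF/XML with the class and property names mentioned above. The namespace URI is a placeholder and the fragment is not a verified excerpt of the ontology; consult [[DCONTOLOGY]] for the normative vocabulary.

  <!-- Illustrative only: "dco" is bound to a placeholder namespace and the
       property names are taken from the figure, not from the ontology file. -->
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           xmlns:dco="http://example.org/dco#">
    <dco:DeliveryContext rdf:about="#thisContext">
      <dco:currentDevice rdf:resource="#myPhone"/>
      <dco:currentUserAgent rdf:resource="#myBrowser"/>
      <dco:currentNetworkBearer rdf:resource="#umts"/>
    </dco:DeliveryContext>
    <dco:Device rdf:about="#myPhone"/>
  </rdf:RDF>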

GUMO and UserML

GUMO and UserML are two formalisms proposed by [[HCK06]] to deal with the problem of representing generic user models. SITUATIONALSTATEMENTS and the exchange language UserML work at the syntactic level, while the general user model ontology GUMO has been developed (using OWL) at the semantic level. SITUATIONALSTATEMENTS represent partial descriptions of situations such as user model entries, context information or low-level sensor data. SITUATIONALSTATEMENTS follow a layered approach with meta-level information arranged in five boxes: mainpart, situation, explanation, privacy and administration. These boxes have an organizing and structuring function. An example of a SITUATIONALSTATEMENT represented in the UserML language can be seen below.

[Insert the figure with the XML code]
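
Pending that figure, the fragment below gives a purely illustrative impression of such a statement. The element names follow the five boxes and the triple division described in this section, but they are not a verified excerpt of the UserML schema.

  <!-- Illustrative SITUATIONALSTATEMENT: Peter's interest in football is low.
       Element names are assumptions based on the description in the text. -->
  <statement>
    <mainpart>
      <subject>Peter</subject>
      <auxiliary>hasInterest</auxiliary>
      <predicate>Football</predicate>
      <range>low-medium-high</range>
      <object>low</object>
    </mainpart>
    <situation>
      <start>2010-06-01T10:00:00</start>
      <durability>months</durability>
    </situation>
    <privacy>
      <owner>Peter</owner>
    </privacy>
  </statement>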

The main conceptual idea in the approach of SITUATIONALSTATEMENTS that influences the construction of the GUMO ontology is the division of descriptions of user model dimensions into three parts: auxiliary, predicate and range, as shown in the example below. The range attribute in fact offers a new degree of freedom to the ontology definition: it decouples the definition of the predicate from the possible range scales.

subject {UserModelDimension} object
subject {auxiliary,predicate,range} object

For example, if one wants to say something about the user’s interest in football, one could divide this so-called user model dimension into the auxiliary part “has interest”, the predicate part “Football” and the range part “low-medium-high”, as shown below. Likewise, if a system wants to express something like the user’s knowledge about Beethoven’s symphonies, one could divide this into the triple “has knowledge”, “Beethoven’s Symphonies” and “poor-average-good-excellent”.

Peter {hasInterest,Football,low-medium-high} low
Peter {hasKnowledge,Beethoven's Symphonies,poor-average-good-excellent} good

The implication of the examples above for the general user model ontology GUMO is the clear separation between user model auxiliaries, predicate classes and special ranges. A tricky problem is that virtually anything can become a predicate when the auxiliary is “interest” or “knowledge”.

Information in the situation box is responsible for the temporal and spatial embedding of the whole statement in the real physical world. This open approach makes it possible to handle the issue of history in user modelling and context awareness. In particular, the durability attribute carries the qualitative time span for which the statement is expected to be valid (minutes, hours, days, years). In most cases when user model dimensions or context dimensions are measured, one has a rough idea about the expected durability: for instance, emotional states normally change within hours, whereas personality traits won’t change within months.

The GUMO Ontology defines both User Model Dimensions (e.g. hasInterest, hasKnowledge, hasProperty, hasBelieve, hasPreference) and User Model Auxiliaries (e.g. Ability And Proficiency, Personality, Emotional State, Physiological State, Mental State). However, as stated above, it turned out that virtually any concept may be needed to express user model data. To overcome this issue the GUMO authors propose using the UBISWORLD Ontology. UBISWORLD can be used to represent parts of the real world such as an office, a shop, a museum or an airport; it represents persons, objects and locations, as well as times, events and their properties and features.

Task Models

ConcurTaskTrees (CTT)

CTT is a notation for task model specifications which has been developed to overcome limitations of notations previously used to design interactive applications. Its main purpose is to be an easy-to-use notation that can support the design of applications of any degree of complexity.

The main features of CTT are:

  • Hierarchical structure: it provides different levels of granularity and allows large and small task structures to be reused, at both a low and a high semantic level.
  • Graphical syntax: CTT task models are represented as icons and trees.
  • Concurrent notation: operators for temporal ordering are used to link subtasks at the same abstraction level. This ordering determines which tasks should be active at any given time.
  • Focus on activities: it allows designers to concentrate on the most relevant aspects when designing interactive applications that encompass both user- and system-related aspects, avoiding the low-level implementation details that at the design stage would only obscure the decisions to be taken.

This notation has shown two positive results:

  • an expressive and flexible notation able to represent concurrent and interactive activities, including the possibility of supporting cooperation among multiple users and possible interruptions;
  • a compact, understandable representation; a key aspect in the success of a notation is its ability to provide a great deal of information in an intuitive way without requiring excessive effort from the users of the notation. ConcurTaskTrees supports this, as demonstrated by its use by domain experts without a background in Computer Science.

A set of tools to develop task models in ConcurTaskTrees, analyse their content and generate corresponding UIs has been developed and is available at [[CTTE]].

The figure below shows a CTT task model describing an ATM UI. It has been modelled as two abstract tasks (depicted as clouds): EnableAccess and Access. There is an enabling temporal relationship (>>) between them, which indicates that the Access task can only be performed after the successful completion of the EnableAccess task. Looking at the EnableAccess task, it can be seen that it has been split into two interaction tasks (InsertCard, EnterPassword) and an application (system-performed) task (RequirePassword). Likewise, the Access task has been decomposed into WithdrawCash, DepositCash and GetInformation. These tasks are related by means of the choice ([]) operator, which indicates that different tasks can be chosen, but once a task is chosen the others are not available until it is finished. Note, in particular, the DecideAmount task, which is a user task representing a cognitive activity. The []>> symbol indicates an enabling-with-information-passing relationship, which means that there is also an information flow between the tasks concerned.

ATM Task Model
CTT Task Model for the User Interface offered by an ATM
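
To give an impression of how such a model can be serialized, the fragment below renders the ATM hierarchy and its temporal operators in XML. This is not the serialization used by the CTTE tool; the task categories and the placement of operators as attributes on the left-hand sibling are illustrative conventions only.

  <!-- Hypothetical XML rendering of the ATM task model described above. -->
  <task name="AccessATM" category="abstraction">
    <task name="EnableAccess" category="abstraction" operator=">>">
      <task name="InsertCard" category="interaction" operator=">>"/>
      <task name="RequirePassword" category="application" operator=">>"/>
      <task name="EnterPassword" category="interaction"/>
    </task>
    <task name="Access" category="abstraction">
      <task name="WithdrawCash" category="abstraction" operator="[]"/>
      <task name="DepositCash" category="abstraction" operator="[]"/>
      <task name="GetInformation" category="abstraction"/>
    </task>
  </task>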

ANSI/CEA-2018 Task Model Description Standard

ANSI/CEA-2018 [[Rich09]] defines an XML-based language for task model descriptions. The standard was published by CEA in November 2007 [[CEA2007]] and by ANSI in March 2008. In this standard a task model is defined as a formal description of the activities involved in completing a task, including both activities carried out by humans and those performed by machines. The standard defines the semantics and an XML notation for task models relevant to consumer electronics devices, but nothing prevents its use in broader domains.

The figure below shows an XML excerpt of an ANSI/CEA-2018 task model for playing music on an entertainment system consisting of a media server and a media player [[MBUI-CEA2018]]. The main task is decomposed into different subtasks (steps). Steps are sequential by default, in the order defined in the XML structure. Tasks have input and output slots representing the data to be communicated to other tasks. Restrictions over such data are expressed by means of preconditions and postconditions. Bindings specify the data flow between the input and output slots of a task and its subtasks, and between the subtasks themselves.
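
The excerpt itself is not reproduced here. As a rough indication of the notation's shape, the fragment below follows the structure just described (a task decomposed into steps, with slots and bindings); the element and attribute names approximate the published schema and should be checked against [[CEA2007]] before use.

  <!-- Sketch in the spirit of ANSI/CEA-2018; verify names against the standard. -->
  <taskModel xmlns="http://ce.org/cea-2018">
    <task id="PlayMusic">
      <input name="title" type="string"/>
      <subtasks id="playMusicSteps">
        <step name="find" task="FindMusic"/>
        <step name="play" task="PlayOnPlayer"/>
        <binding slot="$find.title" value="$this.title"/>
        <binding slot="$play.item" value="$find.result"/>
      </subtasks>
    </task>
    <task id="FindMusic">
      <input name="title" type="string"/>
      <output name="result" type="string"/>
    </task>
    <task id="PlayOnPlayer">
      <input name="item" type="string"/>
    </task>
  </taskModel>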

AUI Models

MARIA AUI

MARIA (Model-based lAnguage foR Interactive Applications) [[PSS09]] is a universal, declarative, multiple-abstraction-level language for service-oriented applications in ubiquitous environments. It provides a flexible dialogue and navigation model, a flexible data model that allows the association of various types of data with the various interactors, and support for recent techniques able to change the content of UIs asynchronously with respect to the user interaction.

[[fig-MARIA-aui]] shows the main elements of the abstract user interface meta-model (some details have been omitted for clarity). As can be seen, an interface is composed of one data model and one or more presentations. Each presentation is composed of a name, a number of possible connections, elementary interactors, and interactor compositions. The presentation is also associated with a dialogue model, which provides information about the events that can be triggered at a given time. The dynamic behaviour of the events, and the associated handlers, is specified using the CTT temporal operators (for example concurrency, mutually exclusive choice, sequentiality, etc.).

MARIA AUI Meta-Model
MARIA AUI Meta-Model

When an event occurs, it produces a set of effects (such as performing operations, calling services, etc.) and can change the set of currently enabled events (for example, an event occurring on one interactor can affect the behaviour of another interactor by disabling the availability of an event associated with it). The dialogue model can also be used to describe parallel interaction between the user and the interface. A connection indicates what the next active presentation will be when a given interaction takes place. It can be an elementary connection, a complex connection (when Boolean operators compose several connections) or a conditional connection (when specific conditions are associated with it). There are two types of interactor composition: grouping and relation. The latter has at least two elements (interactors or interactor compositions) that are related to each other.

In MARIA an interactor can be either an interaction object or an output-only object. The former can be of type selection, edit, control or interactive description, depending on the type of activity the user is supposed to carry out through the object. The control object is refined into two different interactors depending on the type of activity supported (navigator: navigate between different presentations; activator: trigger the activation of a functionality). An output-only interactor can be an object, description, feedback, alarm or text, depending on the information that the application provides to the user through it.

It is worth pointing out that further refinement of each of these interactors requires specifying platform-dependent characteristics, and is therefore carried out at the concrete level.

UsiXML AUI Meta-Model

UsiXML [[LVMBL04]] [[UsiXML]] is an XML-compliant markup language that aims to describe UIs for multiple contexts of use. UsiXML adheres to the MBUI approach by providing meta-models describing different aspects of the UI. At the time of writing, a new version of UsiXML is under development with the support of a Eureka ITEA2 project [[UsiXML-Proj]].

The figure below depicts the current version of the UsiXML meta-model for AUI description (work in progress). The class AUIObject is at the top of the hierarchy, representing the elements that populate an AUI Model. AUIInteractor and AUIContainer are subsumed by AUIObject. The latter defines a group of tasks that have to be presented together and may contain both AUIInteractors and other AUIContainers. An association class, AUIRelationship, makes it possible to define the kind of relationship (Ordering, Hierarchy, Grouping or Repetition) between an object and its container. AUIInteractionUnit is an aggregation of AUIObject and of Behaviour, specified by means of Listener, Event and Action. AUIInteractor is specialized into DataInteractor (for UI data input/output) and TriggerInteractor (for UI commands). Selection, Input and Output are data interactors. Concerning trigger interactors, Command is intended to launch any kind of action within the UI, whilst Navigator allows the interaction unit to be changed.

UsiXML AUI Meta-Model
Abstract User Interface Meta-Model in UsiXML

CUI Models

UsiXML CUI Meta-Model

[[fig-UsiXML-cui]] is the graphical representation of the UsiXML meta-model for the Concrete UI (work in progress). The root entity is CUIObject, which has been subclassed into CUIInteractor and CUIContainer. The relationship between interactors and containers is captured by the 'contains' relationship and the CUIRelationship association class. It is important to note that the meta-model includes specializations for the different modalities (graphical, tactile, vocal), as a CUI Model is modality-dependent. The Style class is intended to capture all the presentational attributes of a CUI Object. This design decouples the model from 'presentational vocabularies'.

UsiXML CUI Meta-Model
UsiXML CUI Meta-Model

[[fig-UsiXML-CUI-int]] depicts the hierarchy of GraphicalInteractor modelled by UsiXML. As can be observed, the typical graphical interactors found in conventional toolkits are included.

The UsiXML meta-models presented above are still under development, so many issues remain open, for example layout representation, bindings between the Domain Model and the AUI/CUI, and model modularization and extension.

UsiXML CUI Graphical Interactors
UsiXML CUI Graphical Interactors

A pragmatic approach to MBUIs : MyMobileWeb

MyMobileWeb [[MYMW]] is an open source, standards-based software framework that simplifies the rapid development of mobile web applications and portals. MyMobileWeb encompasses a set of technologies which enable the automatic adaptation to the target Delivery Context [[DI-GLOSS]], thus offering a harmonized user experience. Concerning Model-Based approaches, the technologies offered by MyMobileWeb are:

IDEAL2 is an XML-based language aimed at simplifying the creation of web applications and contents that adapt to their delivery context. IDEAL2 is easy for web developers to learn, modular and standards-compliant (it makes use of XForms [[XFORMS11]] and DISelect [[CSELECTION]]). By using IDEAL2, authors can concentrate on the application functionality without worrying about markup implementation languages or scripting capabilities. AUI interaction units (presentations) are described using XML elements that correspond to abstract containers (section, div) and abstract interactors (select, select1, input, output, menu, ...). Designers can force specific mappings between the AUI and the graphical, mobile CUI by means of attributes expressed using the CSS2 syntax. The decision on how an AUI element is finally rendered depends on the device and web browser identified at runtime. For example, a select1 element can be rendered as a drop-down list, a set of radio buttons or a navigation list with hyperlinks. Specific examples of the usage of IDEAL2 can be found in [[MyMw-Tut]].
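
The fragment below approximates what an IDEAL2 presentation might look like, built only from the abstract containers and interactors named above; the attribute used to force a concrete mapping, and its CSS2-like value, are illustrative assumptions, so [[MyMw-Tut]] should be consulted for the actual syntax.

  <!-- Approximation of an IDEAL2 presentation; the "style" mapping attribute
       is a hypothetical example of forcing an AUI-to-CUI mapping. -->
  <section id="searchFlights">
    <div>
      <select1 name="destination" style="appearance: radiogroup">
        <item value="MAD">Madrid</item>
        <item value="CDG">Paris</item>
      </select1>
      <input name="departureDate" label="Departure date"/>
      <output name="fare"/>
    </div>
  </section>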

SCXML is a W3C specification for describing state machines based on Harel State Tables. SCXML provides XML elements to define states, transitions between states and actions to be performed when certain conditions are met. According to the MyMobileWeb conventions, a state typically denotes that the user is interacting with a presentation; there are at least as many states as presentations. User interaction events (activate, submit, etc.) trigger transitions. Actions correspond to the executable content (application logic execution, navigation) that has to be launched when a transition is triggered or when a state is entered. Concrete examples of the usage of SCXML within MyMobileWeb are available at [[MyMw-Tut]].
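
A minimal sketch of such a state machine is shown below: one state per presentation, with transitions triggered by user-interaction events. The state and event names are illustrative and are not taken from a real MyMobileWeb application.

  <!-- Minimal SCXML sketch: navigation between three presentations. -->
  <scxml xmlns="http://www.w3.org/2005/07/scxml" version="1.0" initial="searchForm">
    <state id="searchForm">
      <!-- the user submits the search presentation -->
      <transition event="submit" target="resultList"/>
    </state>
    <state id="resultList">
      <!-- the user activates an item of the result list -->
      <transition event="activate" target="itemDetail"/>
      <transition event="back" target="searchForm"/>
    </state>
    <state id="itemDetail">
      <transition event="back" target="resultList"/>
    </state>
  </scxml>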

Other Work

Research-Driven

  • The Useware Markup Language (useML) is a notation for specifying enhanced task models in industrial environments and is part of the user-centered Useware engineering development process [[ZT08]]. Originally developed in 2003 [[MT08]], useML was enhanced in 2009 with several aspects concerning temporal operators, conditions and the optionality of tasks. useML has shown its applicability and usefulness in several other domains, e.g. automotive or medicine [[MTK07]]. useML is embedded in a model-based architecture for developing multimodal and multi-platform UIs. This model-based architecture was developed as an instance of the CAMELEON Reference Framework, using different abstraction layers and different UIDLs, e.g. DISL and UIML. For editing useML, the graphical useML editor Udit [[MSN09]] was developed. Furthermore, useML was extended to work in ambient intelligence factory environments such as the SmartFactoryKL [[BMGMZ09]], which enables the run-time generation of graphical UIs.
  • RIML [[DWGSZ03]] (Renderer-Independent Markup Language) is a markup language based on W3C standards that allows documents to be authored in a device-independent fashion. RIML is based on standards such as XHTML 2.0 and XForms. Special row and column structures are used in RIML to specify content adaptation; their semantics are enhanced to cover pagination and layout directives in case pagination needs to be done, making it possible to specify how to display a sequence of UI elements. Due to the use of XForms, RIML is device independent and can be mapped into an XHTML specification according to the target device.
  • TERESA [[PSM08]] is an XML-based language for describing UIs which has an associated authoring environment, Multimodal TERESA. It provides designers with the possibility of designing interfaces for a wide set of platforms, supporting various modalities.
  • UIML [[APB99]] (User Interface Markup Language) was one of the first model-based languages for multi-target UIs. A UI is decomposed into structure, style, contents and behaviour. It is, however, only partially compliant with the CAMELEON Reference Framework (e.g. it does not have any task or context model) or with MDE-UI (no transformation approach), and it has not been applied to obtain multi-target user interfaces or to support context-aware adaptation.
  • XIML [[EVP00]] [[EVP01]] (eXtensible Interface Markup Language) is composed of four types of components: models, elements, attributes, and relations between the elements. The presentation model is composed of several embedded elements, which correspond to the widgets of the UI, and of attributes of these elements representing their characteristics (colour, size, …). The relations at the presentation level are mainly the links between labels and the widgets those labels describe. XIML supports design, operation, organization and evaluation functions; it is able to relate the abstract and concrete data elements of an interface; and it enables knowledge-based systems to exploit the captured data.

Industry-Driven

  • XForms [[XFORMS11]] is a widely adopted W3C standard targeting the next generation of (web) form applications. Following the MVC design pattern, XForms applications operate on a model comprising data-oriented XML instances enhanced by, for example, validity restrictions and node computations expressed in XPath 1.0, together with a variety of data submission mechanisms. The event-based controller layer leverages XML Events and a rich set of predefined actions, reducing the need for imperative programming (JavaScript). The generic, device-independent and extensible set of user controls supports advanced interactions such as tabs and wizard-like page flow. For rendering purposes XForms markup is embedded into a presentation-oriented host language (HTML, SVG) and additionally formatted via CSS. XForms implementations range from stand-alone clients (XSmiles, Swiftfox) and browser plug-ins (XForms for Firefox) to client-side transcoding into DHTML (XSLTForms) and server-based solutions (Chiba, Orbeon). A powerful, thoroughly XML-based architecture arose from the combination of XForms clients and native XML database servers exposing (stored) XQuery statements through a REST interface (XRX). A minimal XForms sketch is shown after this list.
  • XUL The XML User Interface Language (XUL) [[XUL]] is a component of the Mozilla browser and related applications and is available as part of Gecko [[Gecko]]. With XUL and other Gecko components, developers can create sophisticated applications without special tools. XUL was designed for creating the user interface of Mozilla applications including the web browser, mail client and page editor. In XUL developers can describe a concrete UI using a markup language, use CSS style sheets to define appearance and JavaScript for behavior. Programming interfaces for reading and writing remote content over the network and for calling web services are also available. Unlike HTML, however, XUL provides a powerful set of interactors for creating menus, toolbars, tabbed panels, and hierarchical trees to give a few examples. This means that developers do not have to look for third party code or include a large block of JavaScript in their application just to handle a popup menu. XUL has all of these elements built-in. In addition, the elements are designed to look and feel just like those on the user's native platform, or designers can use standard CSS to create their own look.
  • XAML Microsoft's XAML (Extensible Application Markup Language) serves several purposes within the .NET framework: the declarative definition of visual user interfaces for desktop applications (via Windows Presentation Foundation) and the web (RIA via Silverlight), comprising a hierarchical model of 2D and 3D objects and media, flow control, data binding, eventing, transformations and styling through a templating mechanism. It may also describe long-running processes executed via Windows Workflow Foundation.
  • OpenLaszlo OpenLaszlo is an open source platform for the development and delivery of rich internet applications (RIAs). These are defined in LZX and JavaScript and deployed either as static, pre-compiled binaries (DHTML, Flash) or rendered dynamically by the OpenLaszlo Server into a device-sensitive OpenLaszlo client application (DHTML, Flash). The XML dialect LZX resembles HTML while supporting high-level UI elements (sliders, trees, grids) and action elements (animator), an XML-based data model, and declarative dependencies at the view level (constraints) or data level (data paths).
  • Flex Adobe's Flex comprises an open source framework for the creation and deployment of Flash-based applications. While leveraging ActionScript for implementation, Flex offers a higher-level XML syntax (MXML) for the declarative specification of application and user interface components. This covers features such as web service requests, data binding and validation, and a rich, extensible library of UI controls, containers and animation effects. MXML files are compiled into Flash bytecode (SWF) for platform-neutral execution on the client side, within browsers (via the Flash Player) or as standalone desktop applications (via the Adobe AIR runtime).
  • Collage IBM's Collage is a declarative programming language and runtime for the cumulative building of data-centric, reactive systems out of distributed web components. Nodes of the underlying RDF data model are dynamically typed, interpreted and updated in response to the occurrence of external and internal events (user input, service responses, data computations, etc.). Collage's recursive MVC approach allows for an arbitrary level of detail in the UI specification, ranging from abstract UI primitives (adopted from XForms) to concrete layout overlays.
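
As announced in the XForms item above, the following is a minimal sketch of an XForms document embedded in XHTML: a model with an XML instance, a typed bind, a submission, and controls bound to the instance. The instance content and submission URL are illustrative.

  <html xmlns="http://www.w3.org/1999/xhtml"
        xmlns:xf="http://www.w3.org/2002/xforms"
        xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <head>
      <title>XForms sketch</title>
      <xf:model>
        <xf:instance>
          <person xmlns="">
            <name/>
            <age/>
          </person>
        </xf:instance>
        <!-- declare that age must be an integer -->
        <xf:bind nodeset="age" type="xsd:integer"/>
        <xf:submission id="save" method="post" resource="http://example.org/persons"/>
      </xf:model>
    </head>
    <body>
      <xf:input ref="name"><xf:label>Name</xf:label></xf:input>
      <xf:input ref="age"><xf:label>Age</xf:label></xf:input>
      <xf:submit submission="save"><xf:label>Save</xf:label></xf:submit>
    </body>
  </html>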

Use Cases

This section presents a set of compelling use cases for which model-based design of UIs is particularly suitable.

Enabling advanced user-service interactions in a digital home

Digital home refers to a residence with devices that are connected through a computer network. A digital home has a network of consumer electronics, mobile and PC devices that cooperate transparently and simplify usability at home. All computing devices and home appliances conform to a set of interoperable standards so that everything can be controlled by means of an interactive system. A digital home system can offer many different electronic services.

These functionalities should be made available through context-sensitive service front-ends (SFEs). Such SFEs have to be capable of adapting to different computing platforms (touch points, web, mobile, TV, home PDA, DECT handset, …), users (children, teenagers, adults, the elderly, disabled people, …) and environments or situations (at home, away, at night, while music is playing, …). The final aim is to provide a seamless and unified user experience, which is critical in the digital home domain. Consider, for instance, a UI for the digital home adapted to different computing platforms: the layout and the UI widgets differ from platform to platform, the functionalities offered can vary depending on the situation, and further specific adaptations could be applied.

Our ambition is to develop standards that cater for the needs imposed by the scenario described. Dynamic variations in the computing platform, user and environment dimensions of the context of use will be automatically accommodated, thus supporting users in a more effective, personalized and consistent way. Furthermore, the engineering costs of developing context-sensitive SFEs for the digital home will be reduced and, lastly, the time to market will be improved.

Multi-Channel UIs in the Warehouse Consignment Process

In order to ensure just-in-time delivery of the relevant components to the respective assembly stations, many companies define a preceding consignment step. Here workers walk along the storage racks of a warehouse to collect the necessary parts. The workers of the succeeding assembly step strongly rely on a correct consignment. An incomplete consignment leads to downtime of the production line, which translates into losses for the company. In a similar manner, inaccurate picking affects a company's success and increases workers' frustration. The picking process therefore makes high quality and time demands on the workers. Especially the fact that warehouses employ unskilled workers to relieve the workload during peak times makes consignment a critical task and a bottleneck in the overall process. The problem is that these workers are often unfamiliar with warehouse settings; additionally, they neither know the products nor have the necessary skills to carry out the job on their own.

Imagine John, who works as an order picker in an automobile company. For large orders, John collects the relevant components using a cart. The necessary components and their location are shown in lists on a display mounted on the cart. Since John can orient himself only in subsections of the warehouse, he can additionally make use of a head-mounted display (HMD). Using the HMD, relevant components are no longer displayed as lists: John gets a visual representation of the number and location of the objects (storage rack and box number) he is looking for. Furthermore, John receives direct feedback on the HMD in the case of wrong or missing parts. For smaller orders or single parts, John often moves more efficiently without the large storage cart. In that case, John can read details about the relevant components, as well as their location, on a GUI running on his cell phone.

The described use case illustrates how proactive applications can provide unobtrusive and adequate help (e.g. missing parts, location of necessary parts, etc.) when the user needs it. In this way, the service time can be reduced while increasing the quality of service. Note that human-computer interaction can happen on manifold output devices, strongly depending on the context of use, e.g. the user's identity and preferences, the task size, or the display resolution. In this case both user input and output have to be considered when designing a multi-channel system. Whereas the use of several channels for processing the same information provides an increased bandwidth of information transfer, the development of multi-channel applications is still complex and expensive due to the lack of tools. Simplifying the development of adaptive multi-channel applications by providing an integrated development and runtime environment is therefore a key factor.

Migratory User Interfaces

Migratory user interfaces are interactive applications that can transfer among different devices while preserving their state, thereby giving the user the sense of a non-interrupted activity. The basic idea is that devices that can be involved in the migration process should be able to run a migration client, which allows the migration infrastructure to find such devices and learn their features. Such a client is also able to send the trigger event to the migration server when it is activated by the user. At that point the state of the source interface is transmitted to the server in order to be adapted and associated with the new UI automatically generated for the target device.

[[fig-migratory]] shows how the abstraction layers are exploited to support migratory UIs, by showing the various activities performed by the Migration Server. This solution has been developed in the EU OPEN Project. The migration approach assumes that various UI models at different abstraction levels are associated with the various devices involved in a migration; such UI models are stored and manipulated centrally, in the Migration Server.

The current architecture assumes that a desktop Web version of the application front-end exists and is available on the corresponding Application Server; this seems a reasonable assumption given the wide availability of this type of application. From such a final UI version for the desktop platform, the Migration Server automatically generates a logical, concrete UI description for the desktop platform through a reverse-engineering process. After having obtained such a concrete UI description for the desktop platform, the Migration Server performs a semantic redesign of the CUI [Paternò et al., 2008] in order to create a new concrete, logical description of the UI adapted to the target device.

The purpose of the semantic redesign is to preserve the semantics of the user interactions that should be available to the user, while adapting the structure of the UI to the resources available on the target device. It may happen that some task is not supported by the target device (e.g. a long video cannot be rendered on a limited mobile phone).

For all the tasks that can be supported, the semantic redesign identifies concrete techniques that preserve the semantics of the interaction but support it with the techniques most suitable for the new device (for example, on mobile devices it will replace interactors with others that provide the same type of input but occupy less screen space). In a similar way page splitting is also supported: when pages are too heavy for the target device they are split taking into account their logical structure, so that logically connected elements remain on the same page. Thus, groupings and relations are identified and some of them are allocated to newly created presentations, so that the corresponding page can be sustained by the target device.

The relationships among abstraction layers supporting migration

A detailed example of application of MBUI

This section describes a complete example that illustrates how to apply model-based design of UIs.

"Making a Hotel Reservation" is a task that can be decomposed into selecting arrival and departure dates and other subtasks.

At the abstract user interface level we need to identify the interaction objects needed to support such tasks. For example, for easily specifying arrival and departure days we need selection interaction objects.

When we move on to the concrete user interface, we need to consider the specific interaction objects supported. So, in a desktop interface, selection can be supported by a graphical list object. This choice is more effective than others because the list supports a single selection from a potentially long list of elements.

The final user interface is the result of these choices and others involving attributes such as the type and size of the font, the colours, and decoration images that, for example, can show the list in the form of a calendar.
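
For instance, on a desktop platform the arrival-date subtask could end up as something like the HTML fragment below; the identifiers and option values are placeholders, and the calendar-style decoration mentioned above would typically be layered on top of such markup with CSS or scripting.

  <!-- Illustrative FUI fragment: the CUI "graphical list" choice for the
       arrival date becomes an HTML select element. Values are placeholders. -->
  <label for="arrival">Arrival date</label>
  <select id="arrival" name="arrival">
    <option value="2010-07-01">1 July 2010</option>
    <option value="2010-07-02">2 July 2010</option>
    <option value="2010-07-03">3 July 2010</option>
  </select>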

Many transformations are possible among these four levels for each interaction platform considered: from higher-level descriptions to more concrete ones, or vice versa, or between the same level of abstraction for different types of platforms, or even any combination of these. Consequently, a wide variety of situations can be addressed. More generally, the possibility of linking user interface elements to more semantic aspects opens up the possibility of intelligent tools that can help in design, evaluation and run-time execution.

Task Model

AUI Model

CUI Model

Conclusions and Recommendations

Benefits of Model-Based UIs

In general it can be said that Model-Based Engineering provides the following benefits:

More specifically Model-Based UIs present the following advantages:

Challenges for Deployment

Suggested Standardization Work Items

Acknowledgements

This work was partially supported by the following R&D projects:

We also thank Thomas Ziegert and his team at SAP AG, who provided the "Warehouse Consignment" use case.