W3C

Multimodal Architecture and Interfaces

W3C Working Draft 1 December 2009

This version:
http://www.w3.org/TR/2009/WD-mmi-arch-20091201/
Latest version:
http://www.w3.org/TR/mmi-arch/
Previous version:
http://www.w3.org/TR/2008/WD-mmi-arch-20081016/
Editor:
Jim Barnett, Genesys Telecommunications Laboratories
Authors:
Michael Bodell, Microsoft
Deborah Dahl, Invited Expert
Ingmar Kliche, Deutsche Telekom AG, T-Com
Jim Larson, Invited Expert
Raj Tumuluri, Openstream
Moshe Yudkowsky, Invited Expert
Muthuselvam Selvaraj, HP (until 5 October 2009)
Brad Porter (until 2005, while at Tellme)
Dave Raggett (until 2007, while at W3C/Volantis)
T.V. Raman (until 2005, while at IBM)
Andrew Wahbe (until 2006, while at VoiceGenie)

Abstract

This document describes a loosely coupled architecture for multimodal user interfaces, which allows for co-resident and distributed implementations, and focuses on the role of markup and scripting, and the use of well defined interfaces between its constituents.

Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This document is the sixth Public Working Draft of "Multimodal Architecture and Interfaces", published on 1 December 2009 for review by W3C Members and other interested parties, and has been developed by the Multimodal Interaction Working Group as part of the W3C Multimodal Interaction Activity. The main differences from the previous draft are:

A diff-marked version of this document is also available for comparison purposes. Because many sections have been modified as a result of the above changes, the editors ask readers to read the whole document carefully and give comments.

Comments for this specification are welcomed and should have a subject starting with the prefix '[ARCH]'. Please send them to www-multimodal@w3.org, the public email list for issues related to Multimodal. This list is archived and acceptance of this archiving policy is requested automatically upon first post. To subscribe to this list send an email to www-multimodal-request@w3.org with the word subscribe in the subject line.

For more information about the Multimodal Interaction Activity, please see the Multimodal Interaction Activity statement.

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. The group does not expect this document to become a W3C Recommendation. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

Table of Contents

1 Summary
2 Overview
3 Design versus Run-Time considerations
    3.1 Markup and The Design-Time View
    3.2 Software Constituents and The Run-Time View
    3.3 Differences from Compound Document Formats
    3.4 Relationship to EMMA
4 Overview of Architecture
    4.1 Run-Time Architecture Diagram
    4.2 The Constituents
        4.2.1 The Interaction Manager
        4.2.2 The Data Component
        4.2.3 The Modality Components
        4.2.4 The Runtime Framework
            4.2.4.1 The Transport Layer
                4.2.4.1.1 Event and Information Security
        4.2.5 System and OS Security
        4.2.6 Examples
5 Interface between the Interaction Manager and the Modality Components
    5.1 Standard Life Cycle Events
        5.1.1 NewContextRequest
            5.1.1.1 NewContextRequest Properties
        5.1.2 NewContextResponse
            5.1.2.1 NewContextResponse Properties
        5.1.3 PrepareRequest
            5.1.3.1 PrepareRequest Properties
        5.1.4 PrepareResponse
            5.1.4.1 PrepareResponse Properties
        5.1.5 StartRequest
            5.1.5.1 StartRequest Properties
        5.1.6 StartResponse
            5.1.6.1 StartResponse Properties
        5.1.7 DoneNotification
            5.1.7.1 DoneNotification Properties
        5.1.8 CancelRequest
            5.1.8.1 CancelRequest Properties
        5.1.9 CancelResponse
            5.1.9.1 CancelResponse Properties
        5.1.10 PauseRequest
            5.1.10.1 PauseRequest Properties
        5.1.11 PauseResponse
            5.1.11.1 PauseResponse Properties
        5.1.12 ResumeRequest
            5.1.12.1 ResumeRequest Properties
        5.1.13 ResumeResponse
            5.1.13.1 ResumeResponse Properties
        5.1.14 ExtensionNotification
            5.1.14.1 ExtensionNotification Properties
        5.1.15 ClearContextRequest
            5.1.15.1 ClearContextRequest Properties
        5.1.16 ClearContextResponse
            5.1.16.1 ClearContextResponse Properties
        5.1.17 StatusRequest
            5.1.17.1 Status Request Properties
        5.1.18 StatusResponse
            5.1.18.1 StatusResponse Properties
6 Open Issues

Appendices

A Examples of Life-Cycle Events
    A.1 newContextRequest (from MC to IM)
    A.2 newContextResponse (from IM to MC)
    A.3 prepareRequest (from IM to MC, with external markup)
    A.4 prepareRequest (from IM to MC, inline VoiceXML markup)
    A.5 prepareResponse (from MC to IM, success)
    A.6 prepareResponse (from MC to IM, failure)
    A.7 startRequest (from IM to MC)
    A.8 startResponse (from MC to IM)
    A.9 doneNotification (from MC to IM, with EMMA result)
    A.10 doneNotification (from MC to IM, with EMMA "no-input" result)
    A.11 cancelRequest (from IM to MC)
    A.12 cancelResponse (from MC to IM)
    A.13 pauseRequest (from IM to MC)
    A.14 pauseResponse (from MC to IM)
    A.15 resumeRequest (from IM to MC)
    A.16 resumeResponse (from MC to IM)
    A.17 extensionNotification (formerly the data event, sent in both directions)
    A.18 clearContextRequest (from the IM to the MC)
    A.19 statusRequest (from the IM to the MC)
    A.20 statusResponse (from the MC to the IM)
B Event Schemas
    B.1 mmi.xsd
    B.2 mmi-datatypes.xsd
    B.3 mmi-attribs.xsd
    B.4 mmi-elements.xsd
    B.5 NewContextRequest.xsd
    B.6 NewContextResponse.xsd
    B.7 PrepareRequest.xsd
    B.8 PrepareResponse.xsd
    B.9 StartRequest.xsd
    B.10 StartResponse.xsd
    B.11 DoneNotification.xsd
    B.12 CancelRequest.xsd
    B.13 CancelResponse.xsd
    B.14 PauseRequest.xsd
    B.15 PauseResponse.xsd
    B.16 ResumeRequest.xsd
    B.17 ResumeResponse.xsd
    B.18 ExtensionNotification.xsd
    B.19 ClearContextRequest.xsd
    B.20 ClearContextResponse.xsd
    B.21 StatusRequest.xsd
    B.22 StatusResponse.xsd
C Ladder Diagrams
    C.1 Creating a Session
    C.2 Processing User Input
    C.3 Ending a Session
D Localization and Customization
E HTTP transport of MMI lifecycle events
    E.1 Lifecycle event transport from modality components to Interaction Manager
    E.2 Lifecycle event transport from IM to modality components (HTTP clients only)
    E.3 Lifecycle event transport from Interaction Manager to modality components (HTTP servers)
    E.4 Error handling
F Glossary
G Rules and Best Practices for Creating a MMI Modality Component
    G.1 Simple modality components
    G.2 Complex modality components
    G.3 Nested modality components
    G.4 Modality component rules
        G.4.1 Rule 1: Each modality component must implement all of the MMI life-cycle events.
        G.4.2 Rule 2: Identify other functions of the modality component that are relevant to the interaction manager.
        G.4.3 Rule 3: If the component uses media, specify the media format. For example, audio formats for speech recognition, or InkML for handwriting recognition.
        G.4.4 Rule 4: Specify protocols for use between the component and the IM (e.g., SIP or HTTP).
        G.4.5 Rule 5: Specify supported human languages, e.g., English, German, Chinese and locale, if relevant.
        G.4.6 Rule 6: Specify supporting languages required by the component, if any.
        G.4.7 Rule 7: Modality components sending data to the interaction manager must use the EMMA format where appropriate.
        G.4.8 Rule 8: Specify error codes and their meanings to be returned to the IM.
    G.5 Modality component Guidelines
        G.5.1 Guideline 1: Consider constructing a complex modality component with multiple functions if one function handles the errors generated by another function.
        G.5.2 Guideline 2: Consider constructing a complex modality component with multiple functions rather than several simple modality components if the functions need to be synchronized.
        G.5.3 Guideline 3: Consider constructing a nested modality component with multiple child modality components if the children modality components are frequently used together but do not handle the errors generated by the other children components and the children components do not need to be extensively synchronized.
    G.6 Example simple modality: Face Identification
        G.6.1 Functions of a Possible Face Identification Component
        G.6.2 Event Syntax
            G.6.2.1 Examples of events for starting the component
            G.6.2.2 Example output event
    G.7 Example simple modality: Form-filling using Handwriting Recognition
        G.7.1 Functions of a Possible Handwriting Recognition Component
        G.7.2 Event Syntax
            G.7.2.1 Examples of events for preparing the component
            G.7.2.2 Examples of events for starting the component
            G.7.2.3 Example output event
H References


1 Summary

This document describes a loosely coupled architecture for multimodal user interfaces, which allows for co-resident and distributed implementations, and focuses on the role of markup and scripting, and the use of well defined interfaces between its constituents.

2 Overview

This document describes the architecture of the Multimodal Interaction (MMI) framework [MMIF] and the interfaces between its constituents. The MMI Working Group is aware that multimodal interfaces are an area of active research and that commercial implementations are only beginning to emerge. Therefore we do not view our goal as standardizing a hypothetical existing common practice, but rather providing a platform to facilitate innovation and technical development. Thus the aim of this design is to provide a general and flexible framework providing interoperability among modality-specific components from different vendors - for example, speech recognition from one vendor and handwriting recognition from another. This framework places very few restrictions on the individual components or on their interactions with each other, but instead focuses on providing a general means for allowing them to communicate with each other, plus basic infrastructure for application control and platform services.

Our framework is motivated by several basic design goals:

Even though multimodal interfaces are not yet common, the software industry as a whole has considerable experience with architectures that can accomplish these goals. Since the 1980s, for example, distributed message-based systems have been common. They have been used for a wide range of tasks, including in particular high-end telephony systems. In this paradigm, the overall system is divided up into individual components which communicate by sending messages over the network. Since the messages are the only means of communication, the internals of components are hidden and the system may be deployed in a variety of topologies, either distributed or co-located. One specific instance of this type of system is the DARPA Hub Architecture, also known as the Galaxy Communicator Software Infrastructure [Galaxy]. This is a distributed, message-based, hub-and-spoke infrastructure designed for constructing spoken dialogue systems. It was developed in the late 1990s and early 2000s under funding from DARPA. This infrastructure includes a program called the Hub, together with servers which provide functions such as speech recognition, natural language processing, and dialogue management. The servers communicate with the Hub and with each other using key-value structures called frames.

Another recent architecture that is relevant to our concerns is the model-view-controller (MVC) paradigm. This is a well known design pattern for user interfaces in object oriented programming languages, and has been widely used with languages such as Java, Smalltalk, C, and C++. The design pattern proposes three main parts: a Data Model that represents the underlying logical structure of the data and associated integrity constraints, one or more Views which correspond to the objects that the user directly interacts with, and a Controller which sits between the data model and the views. The separation between data and user interface provides considerable flexibility in how the data is presented and how the user interacts with that data. While the MVC paradigm has been traditionally applied to graphical user interfaces, it lends itself to the broader context of multimodal interaction where the user is able to use a combination of visual, aural and tactile modalities.

3 Design versus Run-Time considerations

In discussing the design of MMI systems, it is important to keep in mind the distinction between the design-time view (i.e., the markup) and the run-time view (the software that executes the markup). At the design level, we assume that multimodal applications will take the form of multiple documents from different namespaces. In many cases, the different namespaces and markup languages will correspond to different modalities, but we do not require this. A single language may cover multiple modalities and there may be multiple languages for a single modality.

At runtime, the MMI architecture features loosely coupled software constituents that may be either co-resident on a device or distributed across a network. In keeping with the loosely-coupled nature of the architecture, the constituents do not share context and communicate only by exchanging events. The nature of these constituents and the APIs between them is discussed in more detail in Sections 3-5, below. Though nothing in the MMI architecture requires that there be any particular correspondence between the design-time and run-time views, in many cases there will be a specific software component responsible for each different markup language (namespace).

3.1 Markup and The Design-Time View

At the markup level, an application consists of multiple documents. A single document may contain markup from different namespaces if the interaction of those namespaces has been defined (e.g., as part of the Compound Document Formats Activity [CDF]). By the principle of encapsulation, however, the internal structure of documents is invisible at the MMI level, which defines only how the different documents communicate. One document has a special status, namely the Root or Controller Document, which contains markup defining the interaction between the other documents. Such markup is called Interaction Manager markup. The other documents are called Presentation Documents, since they contain markup to interact directly with the user. The Controller Document may consist solely of Interaction Manager markup (for example a state machine defined in CCXML [CCXML] or SCXML [SCXML]) or it may contain Interaction Manager markup combined with presentation or other markup. As an example of the latter design, consider a multimodal application in which a CCXML document provides call control functionality as well as the flow control for the various Presentation documents. Similarly, an SCXML flow control document could contain embedded presentation markup in addition to its native Interaction Management markup.

These relationships are recursive, so that any Presentation Document may serve as the Controller Document for another set of documents. This nested structure is similar to the 'Russian Doll' model of Modality Components, described below in 3.2 Software Constituents and The Run-Time View.

The different documents are loosely coupled and co-exist without interacting directly. Note in particular that there are no shared variables that could be used to pass information between them. Instead, all runtime communication is handled by events, as described below in 5.1 Standard Life Cycle Events.

Furthermore, it is important to note that the asynchronicity of the underlying communication mechanism does not impose the requirement that the markup languages present a purely asynchronous programming model to the developer. Given the principle of encapsulation, markup languages are not required to reflect directly the architecture and APIs defined here. As an example, consider an implementation containing a Modality Component providing Text-to-Speech (TTS) functionality. This Component must communicate with the Interaction Manager via asynchronous events (see 3.2 Software Constituents and The Run-Time View). In a typical implementation, there would likely be events to start a TTS play and to report the end of the play, etc. However, the markup and scripts that were used to author this system might well offer only a synchronous "play TTS" call, it being the job of the underlying implementation to convert that synchronous call into the appropriate sequence of asynchronous events. In fact, there is no requirement that the TTS resource be individually accessible at all. It would be quite possible for the markup to present only a single "play TTS and do speech recognition" call, which the underlying implementation would realize as a series of asynchronous events involving multiple Components.
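As a rough illustration of the TTS case, the following Python sketch wraps an asynchronous, event-based component in a blocking "play TTS" call. The classes `FakeTtsComponent` and `SyncTtsFacade`, the callback signature, and the dictionary-shaped events are assumptions made for this example; only the event names StartRequest, StartResponse, and DoneNotification are taken from the life-cycle events listed in this document.

```python
import threading


class FakeTtsComponent:
    """Hypothetical stand-in for a TTS Modality Component (not a real API)."""

    def __init__(self, send_to_im):
        self._send_to_im = send_to_im  # callback delivering events to the IM side

    def handle(self, event):
        # React to a StartRequest asynchronously, on a worker thread.
        if event["name"] == "StartRequest":
            threading.Thread(target=self._play, args=(event,)).start()

    def _play(self, event):
        self._send_to_im({"name": "StartResponse", "requestID": event["requestID"]})
        # ... audio rendering would happen here ...
        self._send_to_im({"name": "DoneNotification", "requestID": event["requestID"]})


class SyncTtsFacade:
    """Offers a blocking play_tts() built on the asynchronous event exchange."""

    def __init__(self):
        self.log = []                   # names of events received, in order
        self._done = threading.Event()
        self._component = FakeTtsComponent(self._on_event)

    def _on_event(self, event):
        self.log.append(event["name"])
        if event["name"] == "DoneNotification":
            self._done.set()

    def play_tts(self, text, timeout=5.0):
        self._done.clear()
        self._component.handle(
            {"name": "StartRequest", "requestID": "r1", "text": text}
        )
        # Block the caller until the play has finished (or the timeout expires).
        return self._done.wait(timeout)
```

The author-facing call is synchronous, while everything behind it remains an exchange of asynchronous events, which is the separation the paragraph above describes.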

Existing languages such as XHTML may be used as either the Controller Documents or as Presentation Documents. Further examples of potential markup components are given in 4.2.6 Examples.

3.2 Software Constituents and The Run-Time View

At the core of the MMI runtime architecture is the distinction between the Interaction Manager (IM) and the Modality Components, which is similar to the distinction between the Controller Document and the Presentation Documents. The Interaction Manager interprets the Controller Document while the individual Modality Components are responsible for specific tasks, particularly handling input and output in the various modalities, such as speech, pen, video, etc.

The Interaction Manager receives all the events that the various Modality Components generate. Those events may be commands or replies to commands, and it is up to the Interaction Manager to decide what to do with them, i.e., what events to generate in response to them. In general, the MMI architecture follows a 'targetless' event model. That is, the Component that raises an event does not specify its destination. Rather, it passes it up to the Runtime Framework, which will pass it to the Interaction Manager. The IM, in turn, decides whether to forward the event to other Components, or to generate a different event, etc.
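The 'targetless' event model can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not part of the architecture: the `InteractionManager` and `Component` classes, the `routes` table, and the `userSaid` event name are all hypothetical, and direct method calls stand in for the transport that would carry events in a real deployment.

```python
class InteractionManager:
    """Hypothetical IM: it alone decides what happens to each incoming event."""

    def __init__(self):
        self.components = {}  # component name -> component
        self.routes = {}      # event name -> names of components to notify

    def register(self, name, component):
        self.components[name] = component
        # The component gets a way to raise events, but no way to address them.
        component.raise_event = lambda event, src=name: self.on_event(src, event)

    def on_event(self, source, event):
        # The raising component named no destination; routing is the IM's choice.
        for target in self.routes.get(event["name"], []):
            if target != source:
                self.components[target].received.append(event)


class Component:
    """Hypothetical Modality Component that records what the IM forwards to it."""

    def __init__(self):
        self.received = []
        self.raise_event = None  # set by the IM on registration
```

In this sketch a speech component that raises `userSaid` has no knowledge of which components, if any, will see it. An event with no route is simply dropped by the IM.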

Modality Components are black boxes, required only to implement the Modality Component Interface API which is described below. This API allows the Modality Components to communicate with the IM and hence with each other, since the IM is responsible for delivering events/messages among the Components. Since the internals of a Component are hidden, it is possible for an Interaction Manager and a set of Components to present themselves as a Component to a higher-level Interaction Manager. All that is required is that the IM implement the Component API. The result is a "Russian Doll" model in which Components may be nested inside other Components to an arbitrary depth. Nesting components in this manner is one way to produce a 'complex' Modality Component, namely one that handles multiple modalities simultaneously. However, it is also possible to produce complex Modality Components without nesting, as discussed in 4.2.3 The Modality Components.
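The "Russian Doll" nesting can be pictured with a small Python sketch. The one-method `handle` interface and the class names are assumptions for illustration only; the actual Component API is the set of life-cycle events defined later in this document.

```python
class LeafComponent:
    """Hypothetical simple Modality Component with a one-method Component API."""

    def __init__(self, name):
        self.name = name

    def handle(self, event):
        return [f"{self.name} handled {event}"]


class NestedComponent:
    """An inner IM plus child Components, presented upward as one Component."""

    def __init__(self, name, children):
        self.name = name
        self._children = children  # hidden from the outer Interaction Manager

    def handle(self, event):
        # The inner IM's policy (here: broadcast to all children) is an
        # internal detail that the outer IM never sees.
        results = []
        for child in self._children:
            results.extend(child.handle(event))
        return results
```

Because `NestedComponent` exposes the same `handle` method as a leaf, an outer Interaction Manager cannot tell the two apart, and nesting can continue to any depth.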

In addition to the Interaction Manager and the modality components, there is a Runtime Framework that provides infrastructure support, in particular a transport layer which delivers events among the components.

Because we are using the term 'Component' to refer to a specific set of entities in our architecture, we will use the term 'Constituent' as a cover term for all the elements in our architecture which might normally be called 'software components'.

3.3 Differences from Compound Document Formats

The W3C Compound Document Formats Activity [CDF] is also concerned with the execution of user interfaces written in multiple languages. However, the CDF group focuses on defining the interactions of specific sets of languages within a single document, which may be defined by inclusion or by reference. The MMI architecture, on the other hand, defines the interaction of arbitrary sets of languages in multiple documents. From the MMI point of view, mixed markup documents defined by CDF specifications are treated like any other documents, and may be either Controller or Presentation Documents. Finally, note that the tightly coupled languages handled by CDF will usually share data and scripting contexts, while the MMI architecture focuses on a looser coupling, without shared context. The lack of shared context makes it easier to distribute applications across a network and also places minimal constraints on the languages in the various documents. As a result, authors will have the option of building multimodal applications in a wide variety of languages for a wide variety of deployment scenarios. We believe that this flexibility is important for the further development of the industry.

3.4 Relationship to EMMA

The Extensible MultiModal Annotation markup language [EMMA] is a set of specifications for multimodal systems, and provides details of an XML markup language for containing and annotating the interpretation of user input. For example, a user of a multimodal application might use both speech to express a command and a keystroke gesture to select or draw command parameters. The Speech Recognition Modality would express the user command using EMMA to indicate the input source (speech). The Pen Gesture Modality would express the command parameters using EMMA to indicate the input source (pen gestures). Both modalities may include timing information in the EMMA notation. Using the timing information, a fusion module combines the speech and pen gesture information into a single EMMA notation representing both the command and its parameters. The use of EMMA enables the separation of the recognition process from the information fusion process, and thus enables reusable recognition modalities and general purpose information fusion algorithms.
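The timing-based fusion step can be sketched as follows. This is only an illustration under stated assumptions: `Interpretation` is a hypothetical, simplified stand-in for an EMMA interpretation rather than the EMMA data model, the mode strings and content keys are invented for the example, and the overlap test with a `slack_ms` tolerance is one plausible policy, not one prescribed by EMMA or by this architecture.

```python
from dataclasses import dataclass


@dataclass
class Interpretation:
    """Hypothetical, simplified stand-in for one EMMA interpretation."""
    mode: str      # input source, e.g. "voice" or "ink"
    start: int     # start time in milliseconds
    end: int       # end time in milliseconds
    content: dict  # application-specific semantics


def overlaps(a, b, slack_ms=500):
    # Two inputs are fusion candidates if their time spans overlap,
    # allowing a small gap (slack_ms is an assumed tuning value).
    return a.start <= b.end + slack_ms and b.start <= a.end + slack_ms


def fuse(speech, gesture):
    """Merge a spoken command with gesture parameters if they co-occur in time."""
    if not overlaps(speech, gesture):
        return None
    return Interpretation(
        mode="voice+ink",
        start=min(speech.start, gesture.start),
        end=max(speech.end, gesture.end),
        content={**speech.content, **gesture.content},
    )
```

Since the fusion function only looks at the annotated timing and content, it does not need to know how either recognizer works, which is the separation the paragraph above attributes to EMMA.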

4 Overview of Architecture

Here is a list of the Constituents of the MMI architecture. They are discussed in more detail below.

4.1 Run-Time Architecture Diagram

architecture diagram

4.2 The Constituents

This section presents the responsibilities of the various constituents of the MMI architecture.

4.2.1 The Interaction Manager

The Interaction Manager (IM) is the sub-component of the Runtime Framework that is responsible for handling all events that the other Components generate. Normally there will be specific markup associated with the IM instructing it how to respond to events. This markup will thus contain a lot of the most basic interaction logic of an application. Existing languages such as SMIL, CCXML, SCXML, or ECMAScript can be used for IM markup as an alternative to defining special-purpose languages aimed specifically at multimodal applications. In a future draft of this specification, we may define the interface between the IM and the Runtime Framework, with the goal of making it easy to plug in different IM languages into a given Framework. However, the current draft does not specify such an API so that the Runtime Framework and IM appear as a single unit to the Modality Components.

The IM fulfills multiple functions. For example, it is responsible for synchronization of data and focus, etc., across different Modality Components as well as the higher-level application flow that is independent of Modality Components. It also maintains the high-level application data model and may handle communication with external entities and back-end systems. In the future we may split these functions apart and define different components for each of them. However, for the moment, we leave them rolled up in a single monolithic Interaction Manager component. We note that state machine languages such as SCXML are a good choice for authoring such a multi-function component, since state machines can be composed. Thus it is possible to define a high-level state machine representing the overall application flow, with lower-level state machines nested inside it handling the cross-modality synchronization at each phase of the higher-level flow.

Due to the Russian Doll model, Components may contain their own Interaction Managers to handle their internal events. However, these Interaction Managers are not visible to the top-level Runtime Framework or Interaction Manager.

If the Interaction Manager does not contain an explicit handler for an event, any default behavior that has been established for the event will be respected. If there is no default behavior, the event will be ignored. (In effect, the Interaction Manager's default handler for all events is to ignore them.)
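The fallback order described above (explicit handler, then any established default behavior, then ignore) can be written as a short dispatch routine. The class name and the two lookup tables are assumptions for this sketch; the specification does not define how handlers or defaults are stored.

```python
class InteractionManager:
    """Hypothetical IM dispatch: explicit handler, else default, else ignore."""

    def __init__(self):
        self.handlers = {}  # event name -> handler supplied by the IM markup
        self.defaults = {}  # event name -> default behavior established for it

    def dispatch(self, event):
        name = event["name"]
        if name in self.handlers:
            return self.handlers[name](event)
        if name in self.defaults:
            return self.defaults[name](event)
        return None  # no handler and no default: the event is ignored
```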

4.2.2 The Data Component

The Data Component is a sub-component of the Runtime Framework which is responsible for storing application-level data. The Interaction Manager is a client of the Data Component and must be able to access and update the Data Component as part of its control flow logic, but Modality Components do not have direct access to it. Since Modality Components are black boxes, they may have their own internal Data Components and may interact directly with backend servers. However, the only way that Modality Components can share data among themselves and maintain consistency is via the Interaction Manager. It is therefore good application design practice to divide data into two logical classes: private data, which is of interest only to a given Modality Component, and public data, which is of interest to the Interaction Manager or to more than one Modality Component. Private data may be managed as the Modality Component sees fit, but all modification of public data, including submission to back end servers, should be entrusted to the Interaction Manager.

For the initial version of this specification, we do not define an interface between the Data Component and the Interaction Manager. This amounts to treating the Data Component as part of the Interaction Manager. (Note that this means that the data access language will be whatever one the IM provides.) The Data Component is shown with a dotted outline in the diagram above, because it is only logically distinct. However, at some point in the future, we may define the interface between the Data Component and the Interaction Manager and require support for a specific data access language, independent of the Interaction Manager.

4.2.3 The Modality Components

Modality Components, as their name would indicate, are responsible for controlling the various input and output modalities on the device. They are therefore responsible for handling all interaction with the user(s). Their only responsibility is to implement the interface defined in 5 Interface between the Interaction Manager and the Modality Components. Any further definition of their responsibilities must be highly domain- and application-specific. In particular we do not define a set of standard modalities or the events that they should generate or handle. Platform providers are allowed to define new Modality Components and are allowed to place into a single Component functionality that might logically seem to belong to two different modalities. Thus a platform could provide a handwriting-and-speech Modality Component that would accept simultaneous voice and pen input. Such combined Components permit a much tighter coupling between the two modalities than the loose interface defined here. Furthermore, Modality Components may be used to perform general processing functions not directly associated with any specific interface modality, for example, dialog flow control or natural language processing.

In most cases, there will be specific markup in the application corresponding to a given modality, specifying how the interaction with the user should be carried out. However, we do not require this and specifically allow for a markup-free modality component whose behavior is hard-coded into its software.

4.2.4 The Runtime Framework

The Runtime Framework is a cover term for all the infrastructure services that are necessary for successful execution of a multimodal application. This includes starting the components, handling communication, and logging, etc. For the most part, this version of the specification leaves these functions to be defined in a platform-specific way, but we do specifically define a Transport Layer which handles communications between the components.

4.2.4.1 The Event Transport Layer

The Event Transport Layer is responsible for delivering events among the IM and the Modality Components. Clearly, there are multiple transport mechanisms (protocols) that can be used to implement a Transport Layer, and different mechanisms may be used to communicate with different Modality Components. Thus the Transport Layer consists of one or more transport mechanisms linking the IM to the various Modality Components.

We place the following requirements on all transport mechanisms:

  1. Events must be delivered reliably. In particular, the event delivery mechanism must report an error if an event cannot be delivered, for example if the destination endpoint is unavailable.
  2. Events must be delivered to the destination in the order in which the source generated them. There is no guarantee on the delivery order of events generated by different sources. For example, if Modality Component M1 generates events E1 and E2 in that order, while Modality Component M2 generates E3 and then E4, we require that E1 be delivered to the Runtime Framework before E2 and that E3 be delivered before E4, but there is no guarantee on the ordering of E1 or E2 versus E3 or E4.
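
The two requirements above can be illustrated with a small in-memory sketch. It is not a Transport Layer definition: the class, its method names, and the use of one FIFO queue per source are assumptions chosen for the example. Delivery to an unregistered endpoint raises an error (requirement 1), and each source's events reach the destination in the order they were generated, while the interleaving between sources is left open (requirement 2).

```python
from collections import deque
from typing import Deque, Dict, List, Tuple


class DeliveryError(Exception):
    """Raised when an event cannot be delivered (requirement 1)."""


class InMemoryTransport:
    """Hypothetical transport: one FIFO queue per source preserves per-source order."""

    def __init__(self) -> None:
        self.endpoints: Dict[str, List[Tuple[str, str]]] = {}  # target -> received (source, event)
        self.queues: Dict[str, Deque[Tuple[str, str]]] = {}    # source -> pending (target, event)

    def register(self, target: str) -> None:
        self.endpoints[target] = []

    def send(self, source: str, target: str, event: str) -> None:
        if target not in self.endpoints:
            raise DeliveryError("destination endpoint unavailable: " + target)
        self.queues.setdefault(source, deque()).append((target, event))

    def flush(self, source: str) -> None:
        # Deliver the pending events of one source in generation order.
        queue = self.queues.get(source, deque())
        while queue:
            target, event = queue.popleft()
            self.endpoints[target].append((source, event))


transport = InMemoryTransport()
transport.register("IM")
transport.send("M1", "IM", "E1")
transport.send("M2", "IM", "E3")
transport.send("M1", "IM", "E2")
transport.send("M2", "IM", "E4")
transport.flush("M2")  # sources may be serviced in any order
transport.flush("M1")
inbox = transport.endpoints["IM"]
```

In this run the events of M2 happen to arrive before those of M1, which is permitted; what is guaranteed is only that E1 precedes E2 and E3 precedes E4.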

For a sample definition of a Transport Layer relying on HTTP, see E HTTP transport of MMI lifecycle events. In the current draft, this definition is provided as an example only, but in future drafts we may require support for this and possibly other Transport Layer definitions.

4.2.4.1.1 Event and Information Security

Events will often carry sensitive information, such as bank account numbers or health care information. In addition, events must be reliable to both sides of a transaction: for example, if an event carries an assent to a financial transaction, both sides of the transaction must be able to rely on that assent.

We do not currently specify delivery mechanisms or internal security safeguards used by the Modality Components and the Interaction Manager. However, we believe that any secure system will have to meet the following requirements at a minimum:

The following two optional requirements can be met by using the W3C's XML-Signature Syntax and Processing specification [XMLSig].

  1. Authentication. The event delivery mechanism should be able to ensure that the identities of the components in an interaction are known.
  2. Integrity. The event delivery mechanism should be able to ensure that the contents of events have not been altered in transit.

    The remaining optional requirements for event delivery and information security can be met by following other industry-standard procedures.

  3. Authorization. A component should provide a method to ensure only authorized components can connect to it.
  4. Privacy. The event delivery mechanism should provide a method to keep the message contents secure from any unauthorized access while in transit.
  5. Non-repudiation. The event delivery mechanism, in conjunction with the components, may provide a method to ensure that if a message is sent from one constituent to another, the originating constituent cannot repudiate the message that it sent and that the receiving constituent cannot repudiate that the message was received.
6.1.2 Multiple Protocols

Multiple protocols may be necessary to implement these requirements. For example, TCP/IP and HTTP provide reliable event delivery, but additional protocols such as TLS or HTTPS could be required to meet security requirements.

4.2.5 System and OS Security

This architecture does not and will not specify the internal security requirements of a Modality Component or Runtime Framework.

4.2.6 Examples

For the sake of concreteness, here are some examples of components that could be implemented using existing languages. Note that we are mixing the design-time and run-time views here, since it is the implementation of the language (the browser) that serves as the run-time component.

  • CCXML [CCXML] could be used as both the Controller Document and the Interaction Manager language, with the CCXML interpreter serving as the Runtime Framework and Interaction Manager.
  • SCXML [SCXML] could be used as the Controller Document and Interaction Manager language.
  • In an integrated multimodal browser, the markup language that provided the document root tag would define the Controller Document while the associated scripting language could serve as the Interaction Manager.
  • XHTML [XHTML] could be used as the markup for a Modality Component.
  • VoiceXML [VoiceXML] could be used as the markup for a Modality Component.
  • SVG [SVG] could be used as the markup for a Modality Component.
  • SMIL [SMIL] could be used as the markup for a Modality Component.

5 Interface between the Interaction Manager and the Modality Components

The most important interface in this architecture is the one between the Modality Components and the Interaction Manager. Modality Components communicate with the IM via asynchronous events. Components must be able to raise events and to handle events that are delivered to them asynchronously. It is not required that components use these events internally, since the implementation of a given Component is a black box to the rest of the system. In general, it is expected that Components will raise events both automatically (i.e., as part of their implementation) and under mark-up control. The disposition of events is the responsibility of the IM. That is, the Component that raises an event does not specify which Component it should be delivered to or even whether it should be delivered to any Component at all. Rather, that determination is left up to the Interaction Manager.

5.1 Standard Life Cycle Events

The Multimodal Architecture defines the following basic life-cycle events which must be supported by all Modality Components. These events allow the Interaction Manager to invoke Modality Components and receive results from them. They thus form the basic interface between the IM and the Modality Components. Note that the 'Extension' event offers extensibility since it contains arbitrary XML content and may be raised by either the IM or the Modality Components at any time once the context has been established. For example, an application relying on speech recognition could use the 'Extension' event to communicate recognition results or the fact that speech had started, etc.

The concept of 'context' is basic to the events described below. A context represents a single extended interaction with one (or possibly more) users. In a simple unimodal case, a context can be as simple as a phone call or SSL session. Multimodal cases are more complex, however, since the various modalities may not all be used at the same time. For example, in a voice-plus-web interaction, e.g., web sharing with an associated VoIP call, it would be possible to terminate the web sharing and continue the voice call, or to drop the voice call and continue via web chat. In these cases, a single context persists across various modality configurations. In general, we intend for 'context' to cover the longest period of interaction over which it would make sense for components to store state or information.

For examples of the concrete XML syntax for all these events, see A Examples of Life-Cycle Events.

5.1.1 NewContextRequest

Optional event that a Modality Component may send to the IM to request that a new context be created. If this event is sent, the IM must respond with the NewContextResponse event.

5.1.1.1 NewContextRequest Properties
  • RequestID. An arbitrary identifier generated by the Modality Component and used to identify this request.
  • Media. One or more valid media types indicating the media to be associated with the context.
  • Source. URI representing the address of the sender of the event.
  • Target. URI representing the address of the destination of the event.
  • Data. Optional additional data.

5.1.2 NewContextResponse

Sent by the IM in response to the NewContextRequest message.

5.1.2.1 NewContextResponse Properties
  • RequestID. Matches the RequestID in the NewContextRequest event.
  • Status. An enumeration of Success or Failure. If the value is Success, the NewContextRequest has been accepted and a new context identifier will be included (see below). If the value is Failure, no context identifier will be included and further information will be included in the StatusInfo field.
  • Context. A URI identifying the new context. Empty if Status is Failure.
  • Media. One or more valid media types indicating the media to be associated with the context. Note that these do not have to be identical to the ones contained in the NewContextRequest.
  • StatusInfo. If Status equals Failure, this field holds further information.
  • Source. URI representing the address of the sender of the event.
  • Target. URI representing the address of the destination of the event.
  • Data. Optional additional data.
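
As an informal illustration of how the NewContextResponse properties relate to the request, the sketch below builds a response from a request. The dictionary representation, the function name, the `urn:example:context:` URI scheme, the failure text, and the choice to swap Source and Target in the reply are all assumptions made for the example; this specification only requires that RequestID match and that Context be empty on Failure.

```python
import uuid


def handle_new_context_request(request: dict, accept: bool = True) -> dict:
    """Hypothetical IM-side handler returning a NewContextResponse as a dict."""
    response = {
        "RequestID": request["RequestID"],  # must match the request
        "Source": request["Target"],        # assumption: the reply reverses the addresses
        "Target": request["Source"],
        "Media": list(request["Media"]),    # may differ from the requested media
    }
    if accept:
        response["Status"] = "Success"
        # Assumed URI scheme; any URI identifying the new context would do.
        response["Context"] = "urn:example:context:" + uuid.uuid4().hex
    else:
        response["Status"] = "Failure"
        response["Context"] = ""            # empty when Status is Failure
        response["StatusInfo"] = "context could not be created"
    return response


request = {
    "RequestID": "request-1",
    "Source": "someURI",
    "Target": "someOtherURI",
    "Media": ["media1", "media2"],
}
accepted = handle_new_context_request(request)
rejected = handle_new_context_request(request, accept=False)
```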

5.1.3 PrepareRequest

An optional event that the IM may send to allow the Modality Components to pre-load markup and prepare to run. Modality Components are not required to take any particular action in response to this event, but they must return a PrepareResponse event.

5.1.3.1 PrepareRequest Properties
  • RequestID. An arbitrary identifier generated by the IM and used to identify this request.
  • Context. A unique URI designating this context. Note that the IM may re-use the same context value in successive calls to Start if they are all within the same session/call.
  • ContentURL. Optional URL of the content that the Modality Component should execute. Includes standard HTTP fetch parameters such as max-age, max-stale, fetchtimeout, etc. Incompatible with content.
  • Content. Optional inline markup for the Modality Component to execute. Incompatible with contentURL. Note that it is legal for both contentURL and content to be empty. In such a case, the Modality Component will revert to its default hard-coded behavior, which could consist of returning an error event or of running a preconfigured or hard-coded script.
  • Source. URI representing the address of the sender of the event.
  • Target. URI representing the address of the destination of the event.
  • Data. Optional additional data.

A given component may only execute a single StartRequest at one time (see 5.1.5 StartRequest). However, the Interaction Manager may send multiple PrepareRequest events to a Modality Component for the same Context, each referencing a different ContentURL or containing different in-line Content, before sending a StartRequest. In this case, the Modality Component should prepare to run any of the specified content. The subsequent StartRequest event will determine which specific content the Modality Component should execute.

5.1.4 PrepareResponse

Sent by the Modality Component in response to the Prepare event. Modality Components that return a PrepareResponse event with Status of 'Success' should be ready to run with close to 0 delay upon receipt of the Start event.

5.1.4.1 PrepareResponse Properties
  • RequestID. Matches the RequestID in the PrepareRequest event.
  • Context. Must match the value in the Prepare event.
  • Status. Enumeration: Success or Failure.
  • StatusInfo. If Status equals Failure, this field holds further information (examples: NotAuthorized, BadFormat, MissingURI, MissingField).
  • Source. URI representing the address of the sender of the event.
  • Target. URI representing the address of the destination of the event.
  • Data. Optional additional data.

5.1.5 StartRequest

The IM sends this event to invoke a Modality Component. The Modality Component must return a StartResponse event in response. If the IM has sent a previous Prepare event, it may leave the contentURL and content fields empty, and the Modality Component will use the values from the Prepare event. If the IM includes new values for these fields, the values in the Start event override those in the Prepare event.

5.1.5.1 StartRequest Properties
  • RequestID. An arbitrary identifier generated by the IM and used to identify this request.
  • Context. A unique URI designating this context. Note that the IM may re-use the same context value in successive calls to Start if they are all within the same session/call.
  • ContentURL. Optional URL of the content that the Modality Component should execute. Includes standard HTTP fetch parameters such as max-age, max-stale, fetchtimeout, etc. Incompatible with content.
  • Content. Optional inline markup for the Modality Component to execute. Incompatible with contentURL. Note that it is legal for both contentURL and content to be empty. In such a case, the Modality Component will either use the values provided in the preceding Prepare event, if one was sent, or revert to its default hard-coded behavior, which could consist of returning an error event or of running a preconfigured or hard-coded script.
  • Source. URI representing the address of the sender of the event.
  • Target. URI representing the address of the destination of the event.
  • Data. Optional additional data.
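
The precedence among the content fields can be sketched as a small function. This is illustrative only: the function name, the dictionary representation of events, and the ("default", None) return value standing for the component's hard-coded behavior are assumptions of the example. Values in the Start event win; otherwise the values from the preceding Prepare event are used; otherwise the component falls back to its default behavior.

```python
from typing import Optional, Tuple


def resolve_content(start: dict, prepared: Optional[dict]) -> Tuple[str, Optional[str]]:
    """Return which content a Modality Component would run (hypothetical helper)."""
    # Check the Start event first, then the preceding Prepare event, if any.
    for request in (start, prepared or {}):
        url = request.get("ContentURL")
        inline = request.get("Content")
        if url and inline:
            raise ValueError("ContentURL and Content are incompatible")
        if url:
            return ("ContentURL", url)
        if inline:
            return ("Content", inline)
    # Both fields empty everywhere: default hard-coded behavior.
    return ("default", None)


prepared = {"ContentURL": "http://example.com/dialog.vxml"}  # assumed example URL
from_prepare = resolve_content({}, prepared)
overridden = resolve_content({"Content": "<vxml/>"}, prepared)
fallback = resolve_content({}, None)
```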

If the Interaction Manager sends multiple StartRequests to a given Modality Component before it receives a DoneNotification, each such request overrides the earlier ones. Thus if a Modality Component receives a new StartRequest while it is executing a previous one, it should cancel the execution of the previous StartRequest, producing a suitable DoneNotification, and begin executing the content specified in the most recent StartRequest. If it is unable to cancel the execution of the previous StartRequest, the Modality Component should reject the new StartRequest, returning a suitable failure code in the StartResponse.
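
The override rule above can be sketched from the Modality Component's side. The class, the `can_cancel` flag, the outbox list, and the specific Status and StatusInfo values are assumptions made for illustration; in particular, this specification asks only for a "suitable" DoneNotification and failure code without fixing their contents.

```python
from typing import List, Optional


class ModalityComponentSketch:
    """Hypothetical component that executes at most one StartRequest at a time."""

    def __init__(self, can_cancel: bool = True) -> None:
        self.can_cancel = can_cancel
        self.running: Optional[str] = None  # RequestID of the StartRequest in execution
        self.outbox: List[dict] = []        # events sent back to the IM

    def on_start_request(self, request: dict) -> None:
        if self.running is not None:
            if not self.can_cancel:
                # Unable to cancel the earlier request: reject the new one.
                self.outbox.append({"event": "StartResponse",
                                    "RequestID": request["RequestID"],
                                    "Status": "Failure",
                                    "StatusInfo": "previous StartRequest could not be cancelled"})
                return
            # Cancel the earlier request and report its end (assumed status values).
            self.outbox.append({"event": "DoneNotification",
                                "RequestID": self.running,
                                "Status": "Failure",
                                "StatusInfo": "cancelled by a newer StartRequest"})
        self.running = request["RequestID"]
        self.outbox.append({"event": "StartResponse",
                            "RequestID": request["RequestID"],
                            "Status": "Success"})


mc = ModalityComponentSketch()
mc.on_start_request({"RequestID": "r1"})
mc.on_start_request({"RequestID": "r2"})

stubborn = ModalityComponentSketch(can_cancel=False)
stubborn.on_start_request({"RequestID": "r1"})
stubborn.on_start_request({"RequestID": "r2"})
```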

5.1.6 StartResponse

The Modality Component must send this event in response to the Start event.

5.1.6.1 StartResponse Properties
  • RequestID. Matches the RequestID in the StartRequest event.
  • Context. Must match the value in the Start event.
  • Status. Enumeration: Success or Failure.
  • StatusInfo. If Status equals Failure, this field holds further information (examples: NotAuthorized, BadFormat, MissingURI, MissingField).
  • Source. URI representing the address of the sender of the event.
  • Target. URI representing the address of the destination of the event.
  • Data. Optional additional data.

5.1.7 DoneNotification

Returned by the Modality Component to indicate that it has reached the end of its processing.

5.1.7.1 DoneNotification Properties
  • RequestID. Matches the RequestID of the StartRequest event.
  • Context. Must match the value in the Start event.
  • Status. Enumeration: Success or Failure.
  • StatusInfo. If Status equals Failure, this field holds further information.
  • Source. URI representing the address of the sender of the event.
  • Target. URI representing the address of the destination of the event.
  • Data. Optional additional data.

Note that the DoneNotification lifecycle event is optional. The DoneNotification event is intended to indicate the completion of processing that the Interaction Manager previously initiated with a StartRequest. For example, a voice modality component might use the DoneNotification event to indicate the completion of a recognition task. In this case the DoneNotification event might carry the recognition result expressed using EMMA. However, there may be tasks which do not have a specific end. For example, a graphical modality component may be instructed by the Interaction Manager to display information using a StartRequest. Such a task does not necessarily have a specific end, and thus the graphical modality component might never send a DoneNotification event to the Interaction Manager. That means the graphical modality component will display the screen until it receives another StartRequest (or some other lifecycle event) from the Interaction Manager.

5.1.8 CancelRequest

Sent by the IM to stop processing in the Modality Component. The Modality Component must return CancelResponse.

5.1.8.1 CancelRequest Properties
  • RequestID. An arbitrary identifier generated by the IM and used to identify this request.
  • Context. Must match the value in the Start event.
  • Immediate. Boolean value indicating whether a hard stop is requested.
  • Source. URI representing the address of the sender of the event.
  • Target. URI representing the address of the destination of the event.
  • Data. Optional additional data.

5.1.9 CancelResponse

Returned by the Modality Component in response to the Cancel command.

5.1.9.1 CancelResponse Properties
  • RequestID. Matches the RequestID in the CancelRequest event.
  • Context. Must match the value in the Start event.
  • Status. Enumeration: Success or Failure.
  • StatusInfo. If Status equals Failure, this field holds further information.
  • Source. URI representing the address of the sender of the event.
  • Target. URI representing the address of the destination of the event.
  • Data. Optional additional data.

5.1.10 PauseRequest

Sent by the IM to suspend processing by the Modality Component. Implementations may ignore this command if they are unable to pause, but they must return PauseResponse.

5.1.10.1 PauseRequest Properties
  • RequestID. An arbitrary identifier generated by the IM and used to identify this request.
  • Context. Must match the value in the Start event.
  • Immediate. Boolean value indicating whether a hard pause is requested.
  • Source. URI representing the address of the sender of the event.
  • Target. URI representing the address of the destination of the event.
  • Data. Optional additional data.

5.1.11 PauseResponse

Returned by the Modality Component in response to the Pause command.

5.1.11.1 PauseResponse Properties
  • RequestID. Matches the RequestID in the PauseRequest event.
  • Context. Must match the value in the Start event.
  • Status. Enumeration: Success or Failure.
  • StatusInfo. If Status equals Failure, this field holds further information.
  • Source. URI representing the address of the sender of the event.
  • Target. URI representing the address of the destination of the event.
  • Data. Optional additional data.

5.1.12 ResumeRequest

Sent by the IM to resume paused processing by the Modality Component. Implementations may ignore this command if they are unable to pause, but they must return ResumeResponse.

5.1.12.1 ResumeRequest Properties
  • RequestID. An arbitrary identifier generated by the IM and used to identify this request.
  • Context. Must match the value in the Start event.
  • Source. URI representing the address of the sender of the event.
  • Target. URI representing the address of the destination of the event.
  • Data. Optional additional data.

5.1.13 ResumeResponse

Returned by the Modality Component in response to the Resume command.

5.1.13.1 ResumeResponse Properties
  • RequestID. Matches the RequestID in the ResumeRequest event.
  • Context. Must match the value in the Start event.
  • Status. Enumeration: Success or Failure.
  • StatusInfo. If Status equals Failure, this field holds further information.
  • Source. URI representing the address of the sender of the event.
  • Target. URI representing the address of the destination of the event.
  • Data. Optional additional data.

5.1.14 ExtensionNotification

This event may be generated by either the IM or the Modality Component. It is used to encapsulate application-specific events that are extensions to the framework defined here. For example, if an application containing a voice modality wanted that modality component to notify the Interaction Manager when speech was detected, it would cause the voice modality to generate an Extension event (with a 'name' of something like 'speechDetected') at the appropriate time.

5.1.14.1 ExtensionNotification Properties
  • RequestID. An arbitrary identifier assigned by the component that sends the event.
  • Name. The name of this event. This is an application-specific value.
  • Context. Must match the value in the Start event.
  • Source. URI representing the address of the sender of the event.
  • Target. URI representing the address of the destination of the event.
  • Data. Optional additional data.

5.1.15 ClearContextRequest

Sent by the IM to indicate that the specified context is no longer active and that any resources associated with it may be freed. (More specifically, the next time that the IM uses the specified context ID, it should be understood as referring to a new context.)

5.1.15.1 ClearContextRequest Properties
  • RequestID. An arbitrary identifier generated by the IM and used to identify this request.
  • Context. Must match the value in the Start event.
  • Source. URI representing the address of the sender of the event.
  • Target. URI representing the address of the destination of the event.
  • Data. Optional additional data.

5.1.16 ClearContextResponse

Returned by the Modality Component in response to the ClearContext command.

5.1.16.1 ClearContextResponse Properties
  • RequestID. Matches the RequestID in the ClearContextRequest event.
  • Context. Must match the value in the Start event.
  • Status. Enumeration: Success or Failure.
  • StatusInfo. If Status equals Failure, this field holds further information.
  • Source. URI representing the address of the sender of the event.
  • Target. URI representing the address of the destination of the event.
  • Data. Optional additional data.

5.1.17 StatusRequest

The StatusRequest message and the corresponding StatusResponse are intended to provide keep-alive functionality, informing the IM about the presence of the various Modality Components. Note that neither message is required to be tied to a Context: each may be either linked to a specific context or sent to the underlying server independent of any user interaction. In the former case, the IM is inquiring about the status of the specific interaction (i.e., context). In the latter case, it is in effect asking the underlying server whether it could start a new Context if requested to do so.

The StatusRequest message is sent from the IM to a Modality Component. By waiting for an implementation-dependent period of time for a StatusResponse message, the IM may determine if the Modality Component is active.

5.1.17.1 StatusRequest Properties
  • RequestID. An arbitrary identifier generated by the IM and used to identify this request.
  • Context. Optional specification of the context for which the status is requested. If it is not present, the request is directed to the underlying server that would host a new context if one were created.
  • RequestAutomaticUpdate. A boolean value indicating whether the Modality Component should send ongoing StatusResponse messages without waiting for additional StatusRequest messages from the IM.
  • Source. URI representing the address of the sender of the event.
  • Target. URI representing the address of the destination of the event.
  • Data. Optional additional data.

5.1.18 StatusResponse

Sent by the Modality Component to the Interaction Manager. If automatic updates are enabled, the Modality Component may send multiple StatusResponse messages in response to a single StatusRequest message.

5.1.18.1 StatusResponse Properties
  • RequestID. Matches the RequestID in the StatusRequest event.
  • AutomaticUpdate. A boolean indicating whether the Modality Component will keep sending StatusResponse messages in the future without waiting for another StatusRequest message.
  • Context. An optional specification of the context for which the status is being returned. If not present, the response represents the status of the underlying server.
  • Status. An enumeration of 'Alive' or 'Dead'. The meaning of these values depends on whether the 'context' parameter is present. If it is, 'Alive' means that the specified session is still active and capable of handling new life cycle events, while 'Dead' means that the context has terminated and no further interaction with the user is available using it. If the 'context' parameter is not provided, the status refers to the underlying server. A value of 'Alive' indicates that the Modality Component is able to handle subsequent Prepare and Start messages. If status is 'Dead', it is not able to handle such requests. Thus the status of 'Dead' indicates that the Modality Component is going off-line. If the IM receives a StatusResponse message with status of 'Dead', it may continue to send StatusRequest messages, but it may not receive a response to them until the Modality Component comes back online.
  • Source. URI representing the address of the sender of the event.
  • Target. URI representing the address of the destination of the event.
  • Data. Optional additional data.
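
The four meanings of the Status property (Alive or Dead, with or without a Context) can be summarized in a short sketch. The function name, the dictionary representation, and the returned descriptive strings are assumptions of the example, not part of this specification.

```python
def interpret_status(response: dict) -> str:
    """Describe what a StatusResponse means for the IM (hypothetical helper)."""
    alive = response["Status"] == "Alive"
    context = response.get("Context")
    if context is not None:
        # Status refers to one specific context.
        if alive:
            return "context " + context + " can handle new life cycle events"
        return "context " + context + " has terminated"
    # No Context: status refers to the underlying server.
    if alive:
        return "component can handle Prepare and Start messages"
    return "component is going off-line"


per_context = interpret_status({"Status": "Dead", "Context": "URI-1"})
per_server = interpret_status({"Status": "Alive"})
```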

6 Open Issues

Issue (confidential_event_data):

We are considering adding a field to life-cycle events indicating that the event contains confidential data (such as bank account numbers or PINs) which should not be implicitly logged by the platform or made potentially available to third parties in any way. Note that this is a separate requirement from the security requirements placed on the event transport protocol in 4.2.4.1.1 Event and Information Security. We would like feedback from potential implementers and users of this standard as to whether such a feature would be useful and how it should be defined.

Resolution:

None recorded.

A Examples of Life-Cycle Events

In this specification we use elements from a fictional "dcont" namespace in some examples. The W3C Ubiquitous Web Application Working Group (UWA-WG) is developing such an ontology and expects to define a "dcont" namespace. The examples below are informative only and may, unintentionally, be incompatible with the work of the UWA-WG. For authoritative information on a (future) "dcont" namespace, please consult the Delivery Context Ontology specification.

A.1 newContextRequest (from MC to IM)

(The definition of "media" and the details of the media element will be discussed in the next draft.)


<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
   <mmi:newContextRequest source="someURI" target="someOtherURI" requestID="request-1">
       <media id="mediaID1">media1</media>
       <media id="mediaID2">media2</media>

   </mmi:newContextRequest>
</mmi:mmi>
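A Modality Component could assemble an event like the one above programmatically. The following is a minimal sketch using Python's standard library; the helper name `build_new_context_request` is hypothetical, and the unqualified `media` children simply mirror the example rather than any normative definition.

```python
import xml.etree.ElementTree as ET

MMI_NS = "http://www.w3.org/2008/04/mmi-arch"
# Ask ElementTree to serialize the namespace with the "mmi" prefix.
ET.register_namespace("mmi", MMI_NS)


def build_new_context_request(source, target, request_id, media):
    """Return a newContextRequest event serialized as a string.

    `media` is a sequence of (id, value) pairs, one per media element.
    """
    root = ET.Element(f"{{{MMI_NS}}}mmi", {"version": "1.0"})
    request = ET.SubElement(
        root,
        f"{{{MMI_NS}}}newContextRequest",
        {"source": source, "target": target, "requestID": request_id},
    )
    for media_id, value in media:
        # The example uses media elements without a namespace prefix.
        element = ET.SubElement(request, "media", {"id": media_id})
        element.text = value
    return ET.tostring(root, encoding="unicode")


xml_text = build_new_context_request(
    "someURI", "someOtherURI", "request-1",
    [("mediaID1", "media1"), ("mediaID2", "media2")],
)
print(xml_text)
```

Note that a newContextRequest carries no context attribute, since the context is only assigned by the IM in its response.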
  

A.2 newContextResponse (from IM to MC)

<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
   <mmi:newContextResponse source="someURI" target="someOtherURI" requestID="request-1" status="success" context="URI-1">
       <media>media1</media>
       <media>media2</media>

   </mmi:newContextResponse>
</mmi:mmi>
 

A.3 prepareRequest (from IM to MC, with external markup)

<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
   <mmi:prepareRequest source="someURI" target="someOtherURI" context="URI-1" requestID="request-1">
       <mmi:contentURL href="someContentURI" max-age="" fetchtimeout="1s"/>
   </mmi:prepareRequest>
</mmi:mmi>

A.4 prepareRequest (from IM to MC, inline VoiceXML markup)

<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0"
xmlns:vxml="http://www.w3.org/2001/vxml">
   <mmi:prepareRequest source="someURI" target="someOtherURI" context="URI-1" requestID="request-1" >
       <mmi:content>
           <vxml:vxml version="2.0">
               <vxml:form>
                   <vxml:block>Hello World!</vxml:block>
               </vxml:form>
           </vxml:vxml>
       </mmi:content>

   </mmi:prepareRequest>
</mmi:mmi>
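The two prepareRequest examples differ only in whether the markup is referenced through contentURL or carried inline in content. A possible way to build either form is sketched below. The helper names are hypothetical, and the sketch deliberately covers only these two alternatives, not the data alternative that the schema in Appendix B also lists.

```python
import xml.etree.ElementTree as ET

MMI_NS = "http://www.w3.org/2008/04/mmi-arch"
VXML_NS = "http://www.w3.org/2001/vxml"
ET.register_namespace("mmi", MMI_NS)
ET.register_namespace("vxml", VXML_NS)


def build_prepare_request(source, target, context, request_id,
                          content_url=None, inline_content=None):
    """Build a prepareRequest carrying either a contentURL or inline content."""
    # Exactly one of the two content forms must be supplied.
    if (content_url is None) == (inline_content is None):
        raise ValueError("supply exactly one of content_url or inline_content")

    root = ET.Element(f"{{{MMI_NS}}}mmi", {"version": "1.0"})
    request = ET.SubElement(
        root,
        f"{{{MMI_NS}}}prepareRequest",
        {"source": source, "target": target,
         "context": context, "requestID": request_id},
    )
    if content_url is not None:
        ET.SubElement(request, f"{{{MMI_NS}}}contentURL", {"href": content_url})
    else:
        content = ET.SubElement(request, f"{{{MMI_NS}}}content")
        content.append(inline_content)
    return root


def hello_world_vxml():
    """Return the VoiceXML fragment used in the inline example."""
    vxml = ET.Element(f"{{{VXML_NS}}}vxml", {"version": "2.0"})
    form = ET.SubElement(vxml, f"{{{VXML_NS}}}form")
    block = ET.SubElement(form, f"{{{VXML_NS}}}block")
    block.text = "Hello World!"
    return vxml


inline_event = build_prepare_request(
    "someURI", "someOtherURI", "URI-1", "request-1",
    inline_content=hello_world_vxml(),
)
print(ET.tostring(inline_event, encoding="unicode"))
```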
  

A.5 prepareResponse (from MC to IM, success)

<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
   <mmi:prepareResponse source="someURI" target="someOtherURI" context="someURI" requestID="request-1" status="success"/>

</mmi:mmi>
 

A.6 prepareResponse (from MC to IM, failure)

<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
   <mmi:prepareResponse source="someURI" target="someOtherURI" context="someURI" requestID="request-1" status="failure">
       <mmi:statusInfo>
           NotAuthorized
       </mmi:statusInfo>

   </mmi:prepareResponse>
</mmi:mmi>
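On the receiving side, the IM has to distinguish the success and failure forms shown in A.5 and A.6. The sketch below shows one way this might be done. The function name `response_outcome` is an assumption, and treating the whitespace-trimmed text of statusInfo as the failure reason is a simplification for this example.

```python
import xml.etree.ElementTree as ET

MMI_NS = "http://www.w3.org/2008/04/mmi-arch"


def response_outcome(xml_text):
    """Return (event name, succeeded, status info text or None)."""
    root = ET.fromstring(xml_text)
    if len(root) != 1:
        raise ValueError("expected exactly one life-cycle event")
    event = root[0]
    # Strip the "{namespace}" part that ElementTree puts in front of the name.
    name = event.tag.rpartition("}")[2]

    info_element = event.find(f"{{{MMI_NS}}}statusInfo")
    info = None
    if info_element is not None and info_element.text:
        info = info_element.text.strip() or None
    return name, event.get("status") == "success", info


FAILURE = """<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
   <mmi:prepareResponse source="someURI" target="someOtherURI" context="someURI" requestID="request-1" status="failure">
       <mmi:statusInfo>
           NotAuthorized
       </mmi:statusInfo>
   </mmi:prepareResponse>
</mmi:mmi>"""

print(response_outcome(FAILURE))
```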
 

A.7 startRequest (from IM to MC)

<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
   <mmi:startRequest source="someURI" target="someOtherURI" context="URI-1" requestID="request-1">
       <mmi:contentURL href="someContentURI" max-age="" fetchtimeout="1s"/>
   </mmi:startRequest>
</mmi:mmi>

A.8 startResponse (from MC to IM)

<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
   <mmi:startResponse source="someURI" target="someOtherURI" context="someURI" requestID="request-1" status="failure">
       <mmi:statusInfo>
           NotAuthorized
       </mmi:statusInfo>

   </mmi:startResponse>
</mmi:mmi>

A.9 doneNotification (from MC to IM, with EMMA result)

This requestID corresponds to the requestID of the "startRequest" event that started it.

<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0"
xmlns:emma="http://www.w3.org/2003/04/emma">
   <mmi:doneNotification source="someURI" target="someOtherURI" context="someURI" status="success" requestID="request-1" >
       <mmi:data>
           <emma:emma version="1.0">
               <emma:interpretation id="int1" emma:medium="acoustic" emma:confidence=".75" emma:mode="voice" emma:tokens="flights from boston to denver">
                   <origin>Boston</origin>
                   <destination>Denver</destination>
               </emma:interpretation>
           </emma:emma>
       </mmi:data>

   </mmi:doneNotification>
</mmi:mmi>
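An IM receiving a doneNotification typically needs to pull the recognition result out of the embedded EMMA document. The following sketch shows one possible approach with Python's standard library. The function name and the shape of the returned dictionary are assumptions for illustration, and the application-specific children (origin, destination) are treated as simple name/text slots.

```python
import xml.etree.ElementTree as ET

EMMA_NS = "http://www.w3.org/2003/04/emma"


def extract_interpretation(xml_text):
    """Return the first EMMA interpretation in an event as a dict, or None."""
    root = ET.fromstring(xml_text)
    interpretation = root.find(f".//{{{EMMA_NS}}}interpretation")
    if interpretation is None:
        return None

    confidence = interpretation.get(f"{{{EMMA_NS}}}confidence")
    return {
        "id": interpretation.get("id"),
        "confidence": float(confidence) if confidence is not None else None,
        "tokens": interpretation.get(f"{{{EMMA_NS}}}tokens"),
        # emma:no-input="true" marks a result in which nothing was heard.
        "no_input": interpretation.get(f"{{{EMMA_NS}}}no-input") == "true",
        "slots": {child.tag: child.text for child in interpretation},
    }


DONE = """<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0"
xmlns:emma="http://www.w3.org/2003/04/emma">
   <mmi:doneNotification source="someURI" target="someOtherURI" context="someURI" status="success" requestID="request-1">
       <mmi:data>
           <emma:emma version="1.0">
               <emma:interpretation id="int1" emma:medium="acoustic" emma:confidence=".75" emma:mode="voice" emma:tokens="flights from boston to denver">
                   <origin>Boston</origin>
                   <destination>Denver</destination>
               </emma:interpretation>
           </emma:emma>
       </mmi:data>
   </mmi:doneNotification>
</mmi:mmi>"""

result = extract_interpretation(DONE)
print(result["slots"], result["confidence"])
```

The same function can be applied to a "no-input" result, in which case `no_input` is true and `slots` is empty.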
 

A.10 doneNotification (from MC to IM, with EMMA "no-input" result)

<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0"
xmlns:emma="http://www.w3.org/2003/04/emma">
   <mmi:doneNotification source="someURI" target="someOtherURI" context="someURI" status="success" requestID="request-1" >
       <mmi:data>
           <emma:emma version="1.0">
               <emma:interpretation id="int1" emma:no-input="true"/>
           </emma:emma>
       </mmi:data>

   </mmi:doneNotification>
</mmi:mmi>
   

A.11 cancelRequest (from IM to MC)

<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
   <mmi:cancelRequest source="someURI" target="someOtherURI" context="someURI" requestID="request-1"/>
</mmi:mmi>

A.12 cancelResponse (from MC to IM)

<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
   <mmi:cancelResponse source="someURI" target="someOtherURI" context="someURI" requestID="request-1" status="success"/>

</mmi:mmi>
 

A.13 pauseRequest (from IM to MC)

<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
   <mmi:pauseRequest context="someURI" source="someURI" target="someOtherURI" immediate="true" requestID="request-1"/>
</mmi:mmi>

A.14 pauseResponse (from MC to IM)

<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
   <mmi:pauseResponse source="someURI" target="someOtherURI" context="someURI" requestID="request-1" status="success"/>
</mmi:mmi>

A.15 resumeRequest (from IM to MC)

<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
   <mmi:resumeRequest context="someURI" source="someURI" target="someOtherURI" requestID="request-1"/>
</mmi:mmi>

A.16 resumeResponse (from MC to IM)

<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
   <mmi:resumeResponse source="someURI" target="someOtherURI" context="someURI" requestID="request-2" status="success"/>

</mmi:mmi>
   

A.17 extensionNotification (formerly the data event, sent in both directions)

<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0"> 
   <mmi:extensionNotification name="appEvent" source="someURI" target="someOtherURI" context="someURI" requestID="request-1">
       <applicationdata/> 

   </mmi:extensionNotification>
</mmi:mmi>
    

A.18 clearContextRequest (from the IM to the MC)

<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
   <mmi:clearContextRequest source="someURI" target="someOtherURI" context="someURI" requestID="request-2"/>

</mmi:mmi>
   

A.19 statusRequest (from the IM to the MC)

<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0"> 
   <mmi:statusRequest requestAutomaticUpdate="true" source="someURI" target="someOtherURI" requestID="request-3" context="aToken"/>

</mmi:mmi>
   

A.20 statusResponse (from the MC to the IM)

<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0"> 
   <mmi:statusResponse automaticUpdate="true" status="alive" source="someURI" target="someOtherURI" requestID="request-3" context="aToken"/> 
</mmi:mmi>

B Event Schemas

B.1 mmi.xsd

<?xml version="1.0" encoding="UTF-8"?>

<xs:schema xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" 
        xmlns:xs="http://www.w3.org/2001/XMLSchema" 
        targetNamespace="http://www.w3.org/2008/04/mmi-arch">

        <xs:annotation>
                <xs:documentation xml:lang="en">
                         Schema definition for MMI Life cycle events version 1.0
                </xs:documentation>
        </xs:annotation>
        <xs:include schemaLocation="mmi-attribs.xsd"/>

        <xs:include schemaLocation="NewContextRequest.xsd"/>
        <xs:include schemaLocation="NewContextResponse.xsd"/>
        <xs:include schemaLocation="ClearContextRequest.xsd"/>
        <xs:include schemaLocation="ClearContextResponse.xsd"/>
        <xs:include schemaLocation="CancelRequest.xsd"/>
        <xs:include schemaLocation="CancelResponse.xsd"/>
        <xs:include schemaLocation="DoneNotification.xsd"/>
        <xs:include schemaLocation="ExtensionNotification.xsd"/>
        <xs:include schemaLocation="PauseRequest.xsd"/>
        <xs:include schemaLocation="PauseResponse.xsd"/>
        <xs:include schemaLocation="PrepareRequest.xsd"/>
        <xs:include schemaLocation="PrepareResponse.xsd"/>
        <xs:include schemaLocation="ResumeRequest.xsd"/>
        <xs:include schemaLocation="ResumeResponse.xsd"/>
        <xs:include schemaLocation="StartRequest.xsd"/>
        <xs:include schemaLocation="StartResponse.xsd"/>
        <xs:include schemaLocation="StatusRequest.xsd"/>
        <xs:include schemaLocation="StatusResponse.xsd"/>
        <xs:element name="mmi">
                <xs:complexType>
                        <xs:choice>
                                <xs:sequence>
                                        <xs:element ref="mmi:newContextRequest"/>
                                </xs:sequence>
                                <xs:sequence>
                                        <xs:element ref="mmi:newContextResponse"/>
                                </xs:sequence>
                                <xs:sequence>
                                        <xs:element ref="mmi:clearContextRequest"/>
                                </xs:sequence>
                                <xs:sequence>
                                        <xs:element ref="mmi:clearContextResponse"/>
                                </xs:sequence>
                                <xs:sequence>
                                        <xs:element ref="mmi:cancelRequest"/>
                                </xs:sequence>
                                <xs:sequence>
                                        <xs:element ref="mmi:cancelResponse"/>
                                </xs:sequence>
                                <xs:sequence>
                                        <xs:element ref="mmi:doneNotification"/>
                                </xs:sequence>
                                <xs:sequence>
                                        <xs:element ref="mmi:extensionNotification"/>
                                </xs:sequence>
                                <xs:sequence>
                                        <xs:element ref="mmi:pauseRequest"/>
                                </xs:sequence>
                                <xs:sequence>
                                        <xs:element ref="mmi:pauseResponse"/>
                                </xs:sequence>
                                <xs:sequence>
                                        <xs:element ref="mmi:prepareRequest"/>
                                </xs:sequence>
                                <xs:sequence>
                                        <xs:element ref="mmi:prepareResponse"/>
                                </xs:sequence>
                                <xs:sequence>
                                        <xs:element ref="mmi:resumeRequest"/>
                                </xs:sequence>
                                <xs:sequence>
                                        <xs:element ref="mmi:resumeResponse"/>
                                </xs:sequence>
                                <xs:sequence>
                                        <xs:element ref="mmi:startRequest"/>
                                </xs:sequence>
                                <xs:sequence>
                                        <xs:element ref="mmi:startResponse"/>
                                </xs:sequence>
                                <xs:sequence>
                                        <xs:element ref="mmi:statusRequest"/>
                                </xs:sequence>
                                <xs:sequence>
                                        <xs:element ref="mmi:statusResponse"/>
                                </xs:sequence>
                        </xs:choice>
                        <xs:attributeGroup ref="mmi:mmi.version.attrib"/>
                </xs:complexType>
        </xs:element>
</xs:schema>
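The choice in mmi.xsd means that every mmi envelope carries exactly one life-cycle event. A lightweight pre-check of that rule might look like the sketch below. It is not a substitute for validating against the schema: the function name `check_envelope` is hypothetical, the version is compared as the literal string "1.0" rather than as a decimal, and the unqualified version attribute follows the examples in Appendix A.

```python
import xml.etree.ElementTree as ET

MMI_NS = "http://www.w3.org/2008/04/mmi-arch"

# Event names taken from the xs:choice in mmi.xsd.
ALLOWED_EVENTS = frozenset({
    "newContextRequest", "newContextResponse",
    "clearContextRequest", "clearContextResponse",
    "cancelRequest", "cancelResponse",
    "doneNotification", "extensionNotification",
    "pauseRequest", "pauseResponse",
    "prepareRequest", "prepareResponse",
    "resumeRequest", "resumeResponse",
    "startRequest", "startResponse",
    "statusRequest", "statusResponse",
})


def check_envelope(xml_text):
    """Return a list of problems found in an mmi envelope (empty if none)."""
    problems = []
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{MMI_NS}}}mmi":
        problems.append("root element is not mmi in the MMI namespace")
    if root.get("version") != "1.0":
        problems.append("version attribute is not '1.0'")

    children = list(root)
    if len(children) != 1:
        problems.append(f"expected exactly one event, found {len(children)}")
    for child in children:
        namespace, _, local_name = child.tag.rpartition("}")
        if namespace != "{" + MMI_NS or local_name not in ALLOWED_EVENTS:
            problems.append(f"unexpected child element: {child.tag}")
    return problems


PAUSE = (
    '<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">'
    '<mmi:pauseRequest context="someURI" source="someURI" target="someOtherURI"'
    ' immediate="true" requestID="request-1"/>'
    '</mmi:mmi>'
)

print(check_envelope(PAUSE))
```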

B.2 mmi-datatypes.xsd

<?xml version="1.0" encoding="UTF-8"?>

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" 
        xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" 
        targetNamespace="http://www.w3.org/2008/04/mmi-arch">

        <xs:annotation>
                <xs:documentation xml:lang="en">
                         general Type definition schema for MMI Life cycle events version 1.0
                </xs:documentation>
        </xs:annotation>
        <xs:include schemaLocation="mmi-attribs.xsd"/>
        <xs:simpleType name="versionType">
                <xs:restriction base="xs:decimal">
                        <xs:enumeration value="1.0"/>
                </xs:restriction>
        </xs:simpleType>
        <xs:simpleType name="mediaContentTypes">
                <xs:restriction base="xs:string">
                        <xs:enumeration value="media1"/>
                        <xs:enumeration value="media2"/>
                </xs:restriction>
        </xs:simpleType>
        <xs:simpleType name="mediaAttributeTypes">
                <xs:restriction base="xs:string">
                        <xs:enumeration value="mediaID1"/>
                        <xs:enumeration value="mediaID2"/>
                </xs:restriction>
        </xs:simpleType>
        <xs:simpleType name="sourceType">
                <xs:restriction base="xs:string"/>
        </xs:simpleType>
        <xs:simpleType name="targetType">
                <xs:restriction base="xs:string"/>
        </xs:simpleType>
        <xs:simpleType name="requestIDType">
                <xs:restriction base="xs:string"/>
        </xs:simpleType>
        <xs:simpleType name="contextType">
                <xs:restriction base="xs:string"/>
        </xs:simpleType>
        <xs:simpleType name="statusType">
                <xs:restriction base="xs:string">
                        <xs:enumeration value="success"/>
                        <xs:enumeration value="failure"/>
                </xs:restriction>
        </xs:simpleType>
        <xs:simpleType name="statusResponseType">
                <xs:restriction base="xs:string">
                        <xs:enumeration value="alive"/>
                        <xs:enumeration value="dead"/>
                </xs:restriction>
        </xs:simpleType>
        <xs:simpleType name="immediateType">
                <xs:restriction base="xs:boolean"/>
        </xs:simpleType>
        <xs:complexType name="contentURLType">
                <xs:attribute name="href" type="xs:anyURI" use="required"/>
                <xs:attribute name="max-age" type="xs:string" use="optional"/>
                <xs:attribute name="fetchtimeout" type="xs:string" use="optional"/>
        </xs:complexType>
        <xs:complexType name="contentType">
                <xs:sequence>
                        <xs:any namespace="http://www.w3.org/2001/vxml" processContents="skip" maxOccurs="unbounded"/>
                </xs:sequence>
        </xs:complexType>
        <xs:complexType name="emmaType">
                <xs:sequence>
                        <xs:any namespace="http://www.w3.org/2003/04/emma" processContents="skip" maxOccurs="unbounded"/>
                </xs:sequence>
        </xs:complexType>
        <xs:complexType name="anyComplexType" mixed="true">
                <xs:complexContent mixed="true">
                        <xs:restriction base="xs:anyType">
                                <xs:sequence>
                                        <xs:any processContents="skip" minOccurs="0" maxOccurs="unbounded"/>
                                </xs:sequence>
                        </xs:restriction>
                </xs:complexContent>
        </xs:complexType>
        

</xs:schema>

B.3 mmi-attribs.xsd

<?xml version="1.0" encoding="UTF-8"?>

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" 
        xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" 
        targetNamespace="http://www.w3.org/2008/04/mmi-arch" 
        attributeFormDefault="qualified">

        <xs:annotation>
                <xs:documentation xml:lang="en">
                         general Type definition schema for MMI Life cycle events version 1.0
                </xs:documentation>
        </xs:annotation>
        <xs:include schemaLocation="mmi-datatypes.xsd"/>
        <xs:attributeGroup name="media.id.attrib">
                <xs:attribute name="id" type="mmi:mediaAttributeTypes" use="required"/>
        </xs:attributeGroup>
        <xs:attributeGroup name="mmi.version.attrib">
                <xs:attribute name="version" type="mmi:versionType" use="required"/>
        </xs:attributeGroup>
        <xs:attributeGroup name="source.attrib">
                <xs:attribute name="source" type="mmi:sourceType" use="required"/>
        </xs:attributeGroup>
        <xs:attributeGroup name="target.attrib">
                <xs:attribute name="target" type="mmi:targetType" use="optional"/>
        </xs:attributeGroup>
        <xs:attributeGroup name="requestID.attrib">
                <xs:attribute name="requestID" type="mmi:requestIDType" use="required"/>
        </xs:attributeGroup>
        <xs:attributeGroup name="context.attrib">
                <xs:attribute name="context" type="mmi:contextType" use="required"/>
        </xs:attributeGroup>
        <xs:attributeGroup name="context.optional.attrib">
                <xs:attribute name="context" type="mmi:contextType" use="optional"/>
        </xs:attributeGroup>

        <xs:attributeGroup name="immediate.attrib">
                <xs:attribute name="immediate" type="mmi:immediateType" use="required"/>
        </xs:attributeGroup>
        <xs:attributeGroup name="status.attrib">
                <xs:attribute name="status" type="mmi:statusType" use="required"/>
        </xs:attributeGroup>
        <xs:attributeGroup name="statusResponse.attrib">
                <xs:attribute name="status" type="mmi:statusResponseType" use="required"/>
        </xs:attributeGroup>
        <xs:attributeGroup name="extension.name.attrib">
                <xs:attribute name="name" type="xs:string" use="required"/>
        </xs:attributeGroup>
        <xs:attributeGroup name="requestAutomaticUpdate.attrib">
                <xs:attribute name="requestAutomaticUpdate" type="xs:boolean" use="required"/>
        </xs:attributeGroup>
        <xs:attributeGroup name="automaticUpdate.attrib">
                <xs:attribute name="automaticUpdate" type="xs:boolean" use="required"/>
        </xs:attributeGroup>
        <xs:attributeGroup name="group.allEvents.attrib">
                <xs:attributeGroup ref="mmi:source.attrib"/>
                <xs:attributeGroup ref="mmi:target.attrib"/>

                <xs:attributeGroup ref="mmi:requestID.attrib"/>
                <xs:attributeGroup ref="mmi:context.attrib"/>
        </xs:attributeGroup>
        <xs:attributeGroup name="group.allResponseEvents.attrib">
                <xs:attributeGroup ref="mmi:group.allEvents.attrib"/>
                <xs:attributeGroup ref="mmi:status.attrib"/>
        </xs:attributeGroup>
        

</xs:schema>

B.4 mmi-elements.xsd

<?xml version="1.0" encoding="UTF-8"?>

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" 
        xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" 
        targetNamespace="http://www.w3.org/2008/04/mmi-arch" 
        attributeFormDefault="qualified">

        <xs:annotation>
                <xs:documentation xml:lang="en">
                         general elements definition schema for MMI Life cycle events version 1.0
                </xs:documentation>
        </xs:annotation>
        <xs:include schemaLocation="mmi-datatypes.xsd"/>
        
        <!-- ELEMENTS -->
        <xs:element name="statusInfo" type="mmi:anyComplexType"/>
        <xs:element name="media">
                <xs:complexType>
                        <xs:simpleContent>
                                <xs:extension base="mmi:mediaContentTypes">
                                        <xs:attributeGroup ref="mmi:media.id.attrib"/>
                                </xs:extension>
                        </xs:simpleContent>
                </xs:complexType>
        </xs:element>
</xs:schema>

B.5 NewContextRequest.xsd

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" 
        xmlns:xs="http://www.w3.org/2001/XMLSchema" 
        targetNamespace="http://www.w3.org/2008/04/mmi-arch" 
        attributeFormDefault="qualified" 
        elementFormDefault="qualified">

        <xs:annotation>
                <xs:documentation xml:lang="en">
                         NewContextRequest schema for MMI Life cycle events version 1.0
                </xs:documentation>
        </xs:annotation>
        <xs:include schemaLocation="mmi-datatypes.xsd"/>
        <xs:include schemaLocation="mmi-attribs.xsd"/>
        <xs:include schemaLocation="mmi-elements.xsd"/>

        <xs:element name="newContextRequest">
                <xs:complexType>
                        <xs:sequence>
                                <xs:element ref="mmi:media" maxOccurs="unbounded"/>
                                <xs:element name="data" type="mmi:anyComplexType"/>
                        </xs:sequence>
                        <xs:attributeGroup ref="mmi:source.attrib"/>
                        <xs:attributeGroup ref="mmi:target.attrib"/>
                        <xs:attributeGroup ref="mmi:requestID.attrib"/>
                </xs:complexType>
        </xs:element>
</xs:schema>

B.6 NewContextResponse.xsd

<?xml version="1.0" encoding="UTF-8"?>

<xs:schema xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" 
        xmlns:xs="http://www.w3.org/2001/XMLSchema" 
        targetNamespace="http://www.w3.org/2008/04/mmi-arch" 
        attributeFormDefault="qualified" 
        elementFormDefault="qualified">

        <xs:annotation>
                <xs:documentation xml:lang="en">
                         NewContextResponse schema for MMI Life cycle events version 1.0
                </xs:documentation>
        </xs:annotation>
        <xs:include schemaLocation="mmi-datatypes.xsd"/>
        <xs:include schemaLocation="mmi-attribs.xsd"/>
        <xs:include schemaLocation="mmi-elements.xsd"/>
        
        <xs:element name="newContextResponse">
                <xs:complexType>
                        <xs:sequence>
                                <xs:element ref="mmi:media" minOccurs="0" maxOccurs="unbounded"/>
                                <xs:element ref="mmi:statusInfo" minOccurs="0"/>
                        </xs:sequence>
                        <xs:attributeGroup ref="mmi:group.allResponseEvents.attrib"/>

                </xs:complexType>
        </xs:element>
</xs:schema>

B.7 PrepareRequest.xsd

<?xml version="1.0" encoding="UTF-8"?>

<xs:schema xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" 
        xmlns:xs="http://www.w3.org/2001/XMLSchema" 
        targetNamespace="http://www.w3.org/2008/04/mmi-arch" 
        attributeFormDefault="qualified" 
        elementFormDefault="qualified">

        <xs:annotation>
                <xs:documentation xml:lang="en">
                         PrepareRequest schema for MMI Life cycle events version 1.0. 
                         The optional PrepareRequest event is an event that the Runtime Framework may send 
                         to allow the Modality Components to pre-load markup and prepare to run (e.g. in case of 
                         VXML VUI-MC). Modality Components are not required to take any particular action in 
                         response to this event, but they must return a PrepareResponse event.
                </xs:documentation>
        </xs:annotation>
        <xs:include schemaLocation="mmi-datatypes.xsd"/>
        <xs:include schemaLocation="mmi-attribs.xsd"/>
        
        <xs:element name="prepareRequest">
                <xs:complexType>
                        <xs:choice>
                                <xs:sequence>
                                        <xs:element name="contentURL" type="mmi:contentURLType"/>
                                </xs:sequence>
                                <xs:sequence>
                                        <xs:element name="content" type="mmi:anyComplexType"/>
                                        <!-- only vxml permitted ?? -->
                                </xs:sequence>
                                <!-- data really needed ?? -->
                                <xs:sequence>
                                        <xs:element name="data" type="mmi:anyComplexType"/>
                                </xs:sequence>
                        </xs:choice>
                        <xs:attributeGroup ref="mmi:group.allEvents.attrib"/>
                        <xs:attributeGroup ref="mmi:target.attrib"/>

                </xs:complexType>
        </xs:element>
</xs:schema>
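The following is a minimal sketch, not part of the specification, of how a sender might assemble a prepareRequest instance that fits the schema above. The element names and the namespace are taken from the schema. The attribute names (source, target, context, requestID) and the href attribute on contentURL are assumptions, because the referenced attribute groups and contentURLType are defined in other schema files. All URLs and identifiers are placeholders.

```python
import xml.etree.ElementTree as ET

MMI_NS = "http://www.w3.org/2008/04/mmi-arch"  # targetNamespace of the schema above
ET.register_namespace("mmi", MMI_NS)


def q(local_name: str) -> str:
    """Return the namespace-qualified (Clark notation) form of a local name."""
    return "{%s}%s" % (MMI_NS, local_name)


def build_prepare_request(source: str, target: str, context: str,
                          request_id: str, content_url: str) -> ET.Element:
    # Attribute names are assumptions; the schema only tells us that they
    # must be namespace-qualified (attributeFormDefault="qualified").
    request = ET.Element(q("prepareRequest"), {
        q("source"): source,
        q("target"): target,
        q("context"): context,
        q("requestID"): request_id,
    })
    # The xs:choice allows exactly one of contentURL, content or data.
    # The href attribute name is an assumption about contentURLType.
    ET.SubElement(request, q("contentURL"), {q("href"): content_url})
    return request


message = build_prepare_request(
    source="http://example.org/im", target="http://example.org/vui",
    context="ctx-1", request_id="req-1",
    content_url="http://example.org/welcome.vxml")
print(ET.tostring(message, encoding="unicode"))
```

Because the schema uses an xs:choice, a sender that wants to pass inline markup would replace the contentURL child with a content child rather than add a second child.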

B.8 PrepareResponse.xsd

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" 
        xmlns:xs="http://www.w3.org/2001/XMLSchema" 
        targetNamespace="http://www.w3.org/2008/04/mmi-arch" 
        attributeFormDefault="qualified" 
        elementFormDefault="qualified">

        <xs:annotation>
                <xs:documentation xml:lang="en">
                         PrepareResponse schema for MMI Life cycle events version 1.0
                </xs:documentation>
        </xs:annotation>
        <xs:include schemaLocation="mmi-datatypes.xsd"/>
        <xs:include schemaLocation="mmi-attribs.xsd"/>
        <xs:include schemaLocation="mmi-elements.xsd"/>
        
        <xs:element name="prepareResponse">
                <xs:complexType>
                        <xs:sequence>
                                <xs:element name="data" minOccurs="0" type="mmi:anyComplexType"/>
                                <xs:element ref="mmi:statusInfo" minOccurs="0"/>
                        </xs:sequence>
                        <xs:attributeGroup ref="mmi:group.allResponseEvents.attrib"/>
                        <xs:attributeGroup ref="mmi:target.attrib"/>

                </xs:complexType>
        </xs:element>
        

</xs:schema>

B.9 StartRequest.xsd

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" 
        xmlns:xs="http://www.w3.org/2001/XMLSchema" 
        targetNamespace="http://www.w3.org/2008/04/mmi-arch" 
        attributeFormDefault="qualified" 
        elementFormDefault="qualified">

        <xs:annotation>
                <xs:documentation xml:lang="en">
                         StartRequest schema for MMI Life cycle events version 1.0. 
                         The Runtime Framework sends the event StartRequest to invoke a Modality Component 
                         (to start loading a new GUI resource or to start the ASR or TTS). The Modality Component 
                         must return a StartResponse event in response. If the Runtime Framework has sent a previous
                         PrepareRequest event, it may leave the contentURL and content fields empty, and the Modality
                         Component will use the values from the PrepareRequest event. If the Runtime Framework includes 
                         new values for these fields, the values in the StartRequest event override those in the 
                         PrepareRequest event.
                </xs:documentation>
        </xs:annotation>
        <xs:include schemaLocation="mmi-datatypes.xsd"/>
        <xs:include schemaLocation="mmi-attribs.xsd"/>
        
        <xs:element name="startRequest">
                <xs:complexType>
                        <xs:choice>
                                <xs:sequence>
                                        <xs:element name="contentURL" type="mmi:contentURLType"/>
                                </xs:sequence>
                                <xs:sequence>
                                        <xs:element name="content" type="mmi:anyComplexType"/>
                                        <!-- only vxml permitted ?? -->
                                </xs:sequence>
                                <!-- data really needed ?? -->
                                <xs:sequence>
                                        <xs:element name="data" type="mmi:anyComplexType"/>
                                </xs:sequence>
                        </xs:choice>
                        <xs:attributeGroup ref="mmi:group.allEvents.attrib"/>
                        <xs:attributeGroup ref="mmi:target.attrib"/>

                </xs:complexType>
        </xs:element>
</xs:schema>
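The override rule described in the StartRequest documentation can be sketched as follows. This is an illustration under stated assumptions rather than a normative algorithm: ContentSpec and effective_content are hypothetical names, and the error raised when neither event carries content is a choice made for this sketch, not behaviour defined by the schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ContentSpec:
    """Hypothetical holder for the contentURL and content fields of an event."""
    content_url: Optional[str] = None
    content: Optional[str] = None

    def is_empty(self) -> bool:
        return self.content_url is None and self.content is None


def effective_content(prepare: Optional[ContentSpec],
                      start: ContentSpec) -> ContentSpec:
    """Return the content a Modality Component should run for a StartRequest."""
    if not start.is_empty():
        # New values in the StartRequest override those of the PrepareRequest.
        return start
    if prepare is not None and not prepare.is_empty():
        # Empty StartRequest: fall back to the earlier PrepareRequest.
        return prepare
    # Assumption of this sketch: treat "nothing to run" as an error.
    raise ValueError("neither PrepareRequest nor StartRequest supplied content")


prepared = ContentSpec(content_url="http://example.org/welcome.vxml")
print(effective_content(prepared, ContentSpec()).content_url)
print(effective_content(prepared, ContentSpec(content="<vxml/>")).content)
```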

B.10 StartResponse.xsd

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" 
        xmlns:xs="http://www.w3.org/2001/XMLSchema" 
        targetNamespace="http://www.w3.org/2008/04/mmi-arch" 
        attributeFormDefault="qualified" 
        elementFormDefault="qualified">

        <xs:annotation>
                <xs:documentation xml:lang="en">
                         StartResponse schema for MMI Life cycle events version 1.0
                </xs:documentation>
        </xs:annotation>
        <xs:include schemaLocation="mmi-datatypes.xsd"/>
        <xs:include schemaLocation="mmi-attribs.xsd"/>
        <xs:include schemaLocation="mmi-elements.xsd"/>
        
        <xs:element name="startResponse">
                <xs:complexType>
                        <xs:sequence>
                                <xs:element name="data" minOccurs="0" type="mmi:anyComplexType"/>
                                <xs:element ref="mmi:statusInfo" minOccurs="0"/>
                        </xs:sequence>
                        <xs:attributeGroup ref="mmi:group.allResponseEvents.attrib"/>
                        <xs:attributeGroup ref="mmi:target.attrib"/>

                </xs:complexType>
        </xs:element>
</xs:schema>

B.11 DoneNotification.xsd

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" 
        xmlns:xs="http://www.w3.org/2001/XMLSchema" 
        targetNamespace="http://www.w3.org/2008/04/mmi-arch" 
        attributeFormDefault="qualified" 
        elementFormDefault="qualified">

        <xs:annotation>
                <xs:documentation xml:lang="en">
                         DoneNotification schema for MMI Life cycle events version 1.0. 
                         The DoneNotification event is intended to be used by the Modality Component to indicate that
                         it has reached the end of its processing. For the VUI-MC it can be used to return the ASR
                         recognition result (or the status info: noinput/nomatch) and TTS/Player done notification. 
                </xs:documentation>
        </xs:annotation>
        <xs:include schemaLocation="mmi-datatypes.xsd"/>
        <xs:include schemaLocation="mmi-attribs.xsd"/>
        <xs:include schemaLocation="mmi-elements.xsd"/>
        
        <xs:element name="doneNotification">
                <xs:complexType>
                        <xs:sequence>
                                <xs:element name="data" type="mmi:anyComplexType"/>
                                <xs:element ref="mmi:statusInfo" minOccurs="0"/>
                        </xs:sequence>
                        <xs:attributeGroup ref="mmi:group.allResponseEvents.attrib"/>
                        <xs:attributeGroup ref="mmi:target.attrib"/>

                </xs:complexType>
        </xs:element>
        

</xs:schema>

B.12 CancelRequest.xsd

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" 
        xmlns:xs="http://www.w3.org/2001/XMLSchema" 
        targetNamespace="http://www.w3.org/2008/04/mmi-arch" 
        attributeFormDefault="qualified" 
        elementFormDefault="qualified">

        <xs:annotation>
                <xs:documentation xml:lang="en">
                         CancelRequest schema for MMI Life cycle events version 1.0. 
                         The CancelRequest event is sent by the Runtime Framework to stop processing in the Modality 
                         Component (e.g. to cancel ASR or TTS/Playing). The Modality Component must return with a 
                         CancelResponse message. 
                </xs:documentation>
        </xs:annotation>
        <xs:include schemaLocation="mmi-datatypes.xsd"/>
        <xs:include schemaLocation="mmi-attribs.xsd"/>
        
        <xs:element name="cancelRequest">
                <xs:complexType>
                        <xs:attributeGroup ref="mmi:group.allEvents.attrib"/>
                        <xs:attributeGroup ref="mmi:target.attrib"/>

                        <xs:attributeGroup ref="mmi:immediate.attrib"/>
                        <!-- no elements -->
                </xs:complexType>
        </xs:element>
</xs:schema>

B.13 CancelResponse.xsd

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" 
        xmlns:xs="http://www.w3.org/2001/XMLSchema" 
        targetNamespace="http://www.w3.org/2008/04/mmi-arch" 
        attributeFormDefault="qualified" 
        elementFormDefault="qualified">

        <xs:annotation>
                <xs:documentation xml:lang="en">
                         CancelResponse schema for MMI Life cycle events version 1.0. 
                         The CancelRequest event is sent by the Runtime Framework to stop processing in the Modality 
                         Component (e.g. to cancel ASR or TTS/Playing). The Modality Component must return with a 
                         CancelResponse message. 
                </xs:documentation>
        </xs:annotation>
        <xs:include schemaLocation="mmi-datatypes.xsd"/>
        <xs:include schemaLocation="mmi-attribs.xsd"/>
        <xs:include schemaLocation="mmi-elements.xsd"/>
        
        <xs:element name="cancelResponse">
                <xs:complexType>
                        <xs:sequence>
                                <xs:element ref="mmi:statusInfo" minOccurs="0"/>
                        </xs:sequence>
                        <xs:attributeGroup ref="mmi:group.allResponseEvents.attrib"/>
                        <xs:attributeGroup ref="mmi:target.attrib"/>

                </xs:complexType>
        </xs:element>
</xs:schema>

B.14 PauseRequest.xsd

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" 
        xmlns:xs="http://www.w3.org/2001/XMLSchema" 
        targetNamespace="http://www.w3.org/2008/04/mmi-arch" 
        attributeFormDefault="qualified" 
        elementFormDefault="qualified">

        <xs:annotation>
                <xs:documentation xml:lang="en">
                         PauseRequest schema for MMI Life cycle events version 1.0. 
                         The PauseRequest event is sent by the Runtime Framework to pause processing of a Modality 
                         Component (e.g. to pause ASR or TTS/Playing). The Modality Component must return a 
                         PauseResponse message. 
                </xs:documentation>
        </xs:annotation>
        <xs:include schemaLocation="mmi-datatypes.xsd"/>
        <xs:include schemaLocation="mmi-attribs.xsd"/>
        
        <xs:element name="pauseRequest">
                <xs:complexType>
                        <xs:attributeGroup ref="mmi:group.allEvents.attrib"/>
                        <xs:attributeGroup ref="mmi:target.attrib"/>

                        <xs:attributeGroup ref="mmi:immediate.attrib"/>
                        <!-- no elements -->
                </xs:complexType>
        </xs:element>
</xs:schema>

B.15 PauseResponse.xsd

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" 
        xmlns:xs="http://www.w3.org/2001/XMLSchema"
        targetNamespace="http://www.w3.org/2008/04/mmi-arch" 
        attributeFormDefault="qualified" 
        elementFormDefault="qualified">

        <xs:annotation>
                <xs:documentation xml:lang="en">
                         PauseResponse schema for MMI Life cycle events version 1.0. 
                         The PauseRequest event is sent by the Runtime Framework to pause the processing of
                         the Modality Component (e.g. to pause ASR or TTS/Playing). The Modality Component 
                         must return a PauseResponse message. 
                </xs:documentation>
        </xs:annotation>
        <xs:include schemaLocation="mmi-datatypes.xsd"/>
        <xs:include schemaLocation="mmi-attribs.xsd"/>
        <xs:include schemaLocation="mmi-elements.xsd"/>
        
        <xs:element name="pauseResponse">
                <xs:complexType>
                        <xs:sequence>
                                <xs:element ref="mmi:statusInfo" minOccurs="0"/>
                        </xs:sequence>
                        <xs:attributeGroup ref="mmi:group.allResponseEvents.attrib"/>
                        <xs:attributeGroup ref="mmi:target.attrib"/>

                </xs:complexType>
        </xs:element>
</xs:schema>

B.16 ResumeRequest.xsd

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" 
        xmlns:xs="http://www.w3.org/2001/XMLSchema" 
        targetNamespace="http://www.w3.org/2008/04/mmi-arch" 
        attributeFormDefault="qualified" 
        elementFormDefault="qualified">

        <xs:annotation>
                <xs:documentation xml:lang="en">
                         ResumeRequest schema for MMI Life cycle events version 1.0. 
                         The ResumeRequest event is sent by the Runtime Framework to resume a previously suspended 
                         processing task of a Modality Component. The Modality Component must return with a 
                         ResumeResponse message. 
                </xs:documentation>
        </xs:annotation>
        <xs:include schemaLocation="mmi-datatypes.xsd"/>
        <xs:include schemaLocation="mmi-attribs.xsd"/>
        
        <xs:element name="resumeRequest">
                <xs:complexType>
                        <xs:attributeGroup ref="mmi:group.allEvents.attrib"/>
                        <xs:attributeGroup ref="mmi:target.attrib"/>

                        <xs:attributeGroup ref="mmi:immediate.attrib"/>
                        <!-- no elements -->
                </xs:complexType>
        </xs:element>
</xs:schema>

B.17 ResumeResponse.xsd

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" 
        xmlns:xs="http://www.w3.org/2001/XMLSchema" 
        targetNamespace="http://www.w3.org/2008/04/mmi-arch" 
        attributeFormDefault="qualified" 
        elementFormDefault="qualified">

        <xs:annotation>
                <xs:documentation xml:lang="en">
                         ResumeResponse schema for MMI Life cycle events version 1.0. 
                         The ResumeRequest event is sent by the Runtime Framework to resume a previously suspended 
                         processing task of a Modality Component. The Modality Component must return with a 
                         ResumeResponse message. 
                </xs:documentation>
        </xs:annotation>
        <xs:include schemaLocation="mmi-datatypes.xsd"/>
        <xs:include schemaLocation="mmi-attribs.xsd"/>
        <xs:include schemaLocation="mmi-elements.xsd"/>
        
        <xs:element name="resumeResponse">
                <xs:complexType>
                        <xs:sequence>
                                <xs:element ref="mmi:statusInfo" minOccurs="0"/>
                        </xs:sequence>
                        <xs:attributeGroup ref="mmi:group.allResponseEvents.attrib"/>
                        <xs:attributeGroup ref="mmi:target.attrib"/>

                </xs:complexType>
        </xs:element>
</xs:schema>

B.18 ExtensionNotification.xsd

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" 
        xmlns:xs="http://www.w3.org/2001/XMLSchema" 
        targetNamespace="http://www.w3.org/2008/04/mmi-arch" 
        attributeFormDefault="qualified" 
        elementFormDefault="qualified">

        <xs:annotation>
                <xs:documentation xml:lang="en">
                         ExtensionNotification schema for MMI Life cycle events version 1.0. 
                         The extensionNotification event may be generated by either the Runtime Framework or the 
                         Modality Component and is used to communicate (presumably changed) data values to the 
                         other component. For example, if the VUI-MC has signaled a recognition result for a field 
                         displayed on the GUI, the Runtime Framework will use this event to send a command to the 
                         GUI-MC to update the GUI with the recognized value. 
                </xs:documentation>
        </xs:annotation>
        <xs:include schemaLocation="mmi-datatypes.xsd"/>
        <xs:include schemaLocation="mmi-attribs.xsd"/>
        
        <xs:element name="extensionNotification">
                <xs:complexType>
                        <xs:sequence>
                                <xs:element name="data" type="mmi:anyComplexType"/>
                        </xs:sequence>
                        <xs:attributeGroup ref="mmi:group.allEvents.attrib"/>
                        <xs:attributeGroup ref="mmi:target.attrib"/>

                        <xs:attributeGroup ref="mmi:extension.name.attrib"/>
                </xs:complexType>
        </xs:element>
        

</xs:schema>

B.19 ClearContextRequest.xsd

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" 
        xmlns:xs="http://www.w3.org/2001/XMLSchema" 
        targetNamespace="http://www.w3.org/2008/04/mmi-arch" 
        attributeFormDefault="qualified" 
        elementFormDefault="qualified">

        <xs:annotation>
                <xs:documentation xml:lang="en">
                         ClearContextRequest schema for MMI Life cycle events version 1.0
                </xs:documentation>
        </xs:annotation>
        <xs:include schemaLocation="mmi-datatypes.xsd"/>
        <xs:include schemaLocation="mmi-attribs.xsd"/>
        <xs:include schemaLocation="mmi-elements.xsd"/>
        <xs:element name="clearContextRequest">
                <xs:complexType>
                        <xs:sequence>
                                <xs:element ref="mmi:media" minOccurs="0" maxOccurs="unbounded"/>
                        </xs:sequence>
                        <xs:attributeGroup ref="mmi:group.allEvents.attrib"/>
                        <xs:attributeGroup ref="mmi:target.attrib"/>

                </xs:complexType>
        </xs:element>
</xs:schema>

B.20 ClearContextResponse.xsd

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" 
        xmlns:xs="http://www.w3.org/2001/XMLSchema" 
        targetNamespace="http://www.w3.org/2008/04/mmi-arch" 
        attributeFormDefault="qualified" 
        elementFormDefault="qualified">

        <xs:annotation>
                <xs:documentation xml:lang="en">
                         ClearContextResponse schema for MMI Life cycle events version 1.0
                </xs:documentation>
        </xs:annotation>
        <xs:include schemaLocation="mmi-datatypes.xsd"/>
        <xs:include schemaLocation="mmi-attribs.xsd"/>
        <xs:include schemaLocation="mmi-elements.xsd"/>
        
        <xs:element name="clearContextResponse">
                <xs:complexType>
                        <xs:sequence>
                                <xs:element ref="mmi:media" minOccurs="0" maxOccurs="unbounded"/>
                                <xs:element ref="mmi:statusInfo" minOccurs="0"/>
                        </xs:sequence>
                        <xs:attributeGroup ref="mmi:group.allResponseEvents.attrib"/>
                        <xs:attributeGroup ref="mmi:target.attrib"/>

                </xs:complexType>
        </xs:element>
</xs:schema>

B.21 StatusRequest.xsd

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" 
        xmlns:xs="http://www.w3.org/2001/XMLSchema" 
        targetNamespace="http://www.w3.org/2008/04/mmi-arch" 
        attributeFormDefault="qualified" 
        elementFormDefault="qualified">

        <xs:annotation>
                <xs:documentation xml:lang="en">
                         StatusRequest schema for MMI Life cycle events version 1.0. 
                         The StatusRequest message and the corresponding StatusResponse are intended to provide keep-alive 
                         functionality, informing the Runtime Framework about the presence of the various modality components. 
                         Note that both messages are not tied to any context and may thus be sent independent of any user 
                         interaction.

                </xs:documentation>
        </xs:annotation>
        <xs:include schemaLocation="mmi-datatypes.xsd"/>
        <xs:include schemaLocation="mmi-attribs.xsd"/>
        <xs:include schemaLocation="mmi-elements.xsd"/>
        <xs:element name="statusRequest">
                <xs:complexType>
                        <xs:attributeGroup ref="mmi:context.optional.attrib"/>
                        <xs:attributeGroup ref="mmi:source.attrib"/>
                        <xs:attributeGroup ref="mmi:target.attrib"/>
                        <xs:attributeGroup ref="mmi:requestID.attrib"/>
                        <xs:attributeGroup ref="mmi:requestAutomaticUpdate.attrib"/>
                        <!-- no elements -->
                </xs:complexType>
        </xs:element>
</xs:schema>
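One way a Runtime Framework could use the StatusRequest/StatusResponse pair for keep-alive purposes is sketched below. The schema defines only the messages. The PresenceTracker class, its timeout policy and the component names are assumptions made for illustration.

```python
from typing import Dict, List


class PresenceTracker:
    """Hypothetical helper that records which Modality Components answered recently."""

    def __init__(self, timeout_seconds: float) -> None:
        self.timeout_seconds = timeout_seconds
        self._last_seen: Dict[str, float] = {}

    def on_status_response(self, source: str, now: float) -> None:
        # Record the time at which a StatusResponse from `source` arrived.
        self._last_seen[source] = now

    def present(self, now: float) -> List[str]:
        # A component counts as present if it answered within the timeout.
        return sorted(name for name, seen in self._last_seen.items()
                      if now - seen <= self.timeout_seconds)


tracker = PresenceTracker(timeout_seconds=30.0)
tracker.on_status_response("gui-mc", now=0.0)
tracker.on_status_response("vui-mc", now=20.0)
print(tracker.present(now=40.0))  # only vui-mc answered within the last 30 s
```

Times are passed in explicitly so that the sketch stays deterministic. A real implementation would more likely read a monotonic clock.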

B.22 StatusResponse.xsd

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" 
        xmlns:xs="http://www.w3.org/2001/XMLSchema" 
        targetNamespace="http://www.w3.org/2008/04/mmi-arch" 
        attributeFormDefault="qualified" 
        elementFormDefault="qualified">

        <xs:annotation>
                <xs:documentation xml:lang="en">
                         StatusResponse schema for MMI Life cycle events version 1.0. 
                         The StatusRequest message and the corresponding StatusResponse are intended to provide keep-alive 
                         functionality, informing the Runtime Framework about the presence of the various modality components. 
                         Note that both messages are not tied to any context and may thus be sent independent of any user 
                         interaction.

                </xs:documentation>
        </xs:annotation>
        <xs:include schemaLocation="mmi-datatypes.xsd"/>
        <xs:include schemaLocation="mmi-attribs.xsd"/>
        <xs:include schemaLocation="mmi-elements.xsd"/>

        
        <xs:element name="statusResponse">
                <xs:complexType>
                        <xs:attributeGroup ref="mmi:context.optional.attrib"/>
                        <xs:attributeGroup ref="mmi:source.attrib"/>
                        <xs:attributeGroup ref="mmi:target.attrib"/>
                        <xs:attributeGroup ref="mmi:requestID.attrib"/>
                        <xs:attributeGroup ref="mmi:requestAutomaticUpdate.attrib"/>
                        <!-- no elements -->
                </xs:complexType>
        </xs:element>
</xs:schema>

C Ladder Diagrams

C.1 Creating a Session

The following ladder diagram shows a possible message sequence during session creation. We assume that the Runtime Framework and an Interaction Manager session are already up and running. The user starts a multimodal session, for example by starting a web browser and fetching a given URL.

The initial document contains scripts which provide the modality component functionality (e.g. understanding XML-formatted life-cycle events) and message transport capabilities (e.g. AJAX, depending on the exact system implementation).

After loading the initial documents (and scripts), the modality component implementation issues an mmi:newContextRequest message to the IM. The IM may load a corresponding markup document, if necessary, and initializes and starts a new session.

In this scenario the Interaction Manager logic issues a number of mmi:startRequest messages to the various modality components. One message is sent to the graphical modality component (GUI) to instruct it to load an HTML document. Another message is sent to a voice modality component (VUI) to play a welcome message.

In this example the voice modality component has to create a VoiceXML session. As VoiceXML 2.1 does not provide an external event interface, a CCXML session will be used for external asynchronous communication. The voice modality component therefore uses the session creation interface of CCXML 1.0 to create a session and start a corresponding script. This script will then make a call to a phone at the user's device (which could be a regular phone or a SIP soft phone). This scenario illustrates the use of a SIP phone, which may reside on the user's mobile handset.

After successful setup of the CCXML session and the voice connection, the voice modality component instructs the CCXML browser to start a VoiceXML dialog and passes it a corresponding VoiceXML script. The VoiceXML interpreter will execute the script and play out the welcome message. After the execution of the VoiceXML script has finished, the voice modality component notifies the Interaction Manager using the mmi:done event.

session creation ladder

C.2 Processing User Input

The next diagram gives an example of a possible message flow while processing user input. In the given scenario the user wants to enter information using the voice modality component. To start the voice input the user has to use the "push-to-talk" button. The "push-to-talk" button (which might be a hardware button or a soft button on the screen) generates a corresponding event when pushed. This event is issued as an mmi:extension event towards the Interaction Manager. The Interaction Manager logic sends an mmi:startRequest to the voice modality component. This mmi:startRequest message contains a URL which points to a corresponding VoiceXML script. The voice modality component again starts a VoiceXML interpreter using the given URL. The VoiceXML interpreter loads the document and executes it. Now the system is ready for the user input. To notify the user that voice input is available, the Interaction Manager might send an event to the GUI upon receiving the mmi:startResponse event (which indicates that the voice modality component has started to execute the document). Note that this step is not shown in the diagram.

The VoiceXML interpreter captures the user's voice input and uses a speech recognition engine to recognize the utterance. The speech recognition result will be represented as an EMMA document and sent to the Interaction Manager using the mmi:done message. The Interaction Manager logic sends an mmi:extension message to the GUI modality component to instruct it to display the recognition result.

session creation ladder

C.3 Ending a Session

In the following scenario a modality component instance will be destroyed as a reaction to a user input, e.g. because the user chose to change to GUI-only mode. In this case an mmi:clearContextRequest will be issued to the voice modality component. The voice modality component wrapper will then destroy the CCXML (and VoiceXML) session.

The application logic (i.e. the IM) may also decide to indicate the removed voice functionality and disable an icon on the screen which indicates the availability of the voice modality.

session creation ladder

D Localization and Customization

The MMI architecture specification describes a set of lifecycle events which define the basic interface between the interaction management and the modality components. The startRequest lifecycle event defines the "content" and "contentURL" elements which may contain markup code (or references to markup code). The markup has to be executed by the modality component. Using the "content" or "contentURL" elements makes the lifecycle event dependent on a specific modality component implementation. In other words, the interaction manager has to issue different startRequests, depending on which markup a GUI modality component may be able to process.

But multimodal applications may want to support different modality component implementations, such as HTML or Flash, for the same application. In this case the interaction manager should be independent of the modality component implementation and hence not generate a markup specific lifecycle event (e.g. containing a link to HTML or even HTML content), but a further abstracted description of the command.

Furthermore, localization needs to be taken into account. If the interaction manager sends markup code to the modality component (or references to it), this markup code should not contain any dependencies on the user's language. Instead the interaction manager needs to send the locale information to the modality component and let it select the appropriate strings.

Here is an example showing how these two issues could be addressed within the lifecycle events. This example uses a generic data structure to carry the locale information (within the xml:lang attribute) and the data to be visualized at a GUI.


<mmi:mmi xmlns:mmi="http://www.w3.org/TR/mmi-arch" xmlns:xml="http://www.w3.org/XML/1998/namespace" version="1.0">
    <mmi:startRequest mmi:requestID="1.237204761416E12" mmi:context="IM_dcc3c320-9e88-44fe-b91d-02bd02fba1e3" mmi:target="GUI">
        <mmi:contentURL>login</mmi:contentURL>
        <mmi:data>
            <gui resourceid="login" xml:lang="de-DE">
                <data id="back" enabled="false"/>
                <data id="next" enabled="false"/>
            </gui>
        </mmi:data>
    </mmi:startRequest>
</mmi:mmi>

This startRequest carries a generic <gui> structure as its payload which contains a "resourceid" and the xml:lang information. The "resourceid" has to be interpreted by the modality component (either to load an HTML document or a corresponding dialog, e.g. if it is a Flash application), whereas "xml:lang" is used by the modality component to select the appropriate string tables.

The content of the <gui> structure is an application specific (but generic) description of data to be used by the modality component. This could contain a description of the status of GUI elements (such as "enabled" or "disabled") or a list of items to be displayed. The following example shows a startRequest to display a list of songs. The list of songs is dynamic and will be loaded from a backend system. The representation of the song list is agnostic to the modality component implementation. It is the responsibility of the modality component to interpret the structure and to display its content appropriately.


<mmi:mmi xmlns:mmi="http://www.w3.org/TR/mmi-arch" xmlns:xml="http://www.w3.org/XML/1998/namespace" version="1.0">
    <mmi:startRequest mmi:requestID="1.23720967758E12" mmi:context="IM_dcc3c320-9e88-44fe-b91d-02bd02fba1e3" mmi:target="GUI">
        <mmi:contentURL>songSelection</mmi:contentURL>
        <mmi:data>
            <gui resourceid="songSelection" xml:lang="de-DE">
                <data id="back" enabled="true"/>
                <data id="next" enabled="false"/>
                <data id="titleList" selected="" enabled="true">
                    <items>
                        <item id="10">
                            <arg name="artist"><![CDATA[One artist]]>
                            </arg>
                            <arg name="title"><![CDATA[This is the title]]>
                            </arg>
                            <arg name="displayName"><![CDATA[Title]]>
                            </arg>
                            <arg name="price"><![CDATA[0.90]]>
                            </arg>
                        </item>
                        <item id="11">
                            <arg name="artist"><![CDATA[Another artist]]>
                            </arg>
                            <arg name="title"><![CDATA[Yet another title]]>
                            </arg>
                            <arg name="displayName"><![CDATA[2nd title]]>
                            </arg>
                            <arg name="price"><![CDATA[0.90]]>
                            </arg>
                        </item>
                    </items>
                </data>                            
            </gui>
        </mmi:data>
    </mmi:startRequest>
</mmi:mmi>
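
As a minimal sketch of how a GUI modality component might interpret such a payload, the following Python reads the <gui> structure, selects a string table by xml:lang and collects the state of each <data> element. The string tables, the function name render_plan and the fallback locale "en-US" are assumptions made for illustration only; this document does not define them. The embedded event is a shortened variant of the login example above.

```python
import xml.etree.ElementTree as ET

MMI_NS = "http://www.w3.org/TR/mmi-arch"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical string tables owned by the modality component.
STRINGS = {
    "de-DE": {"back": "Zurück", "next": "Weiter"},
    "en-US": {"back": "Back", "next": "Next"},
}

EVENT = """<mmi:mmi xmlns:mmi="http://www.w3.org/TR/mmi-arch" version="1.0">
  <mmi:startRequest mmi:requestID="1" mmi:context="ctx-1" mmi:target="GUI">
    <mmi:contentURL>login</mmi:contentURL>
    <mmi:data>
      <gui resourceid="login" xml:lang="de-DE">
        <data id="back" enabled="false"/>
        <data id="next" enabled="true"/>
      </gui>
    </mmi:data>
  </mmi:startRequest>
</mmi:mmi>"""


def render_plan(event_xml, default_lang="en-US"):
    """Turn the generic <gui> payload into a locale-specific display plan."""
    root = ET.fromstring(event_xml)
    # <gui> itself is not in the MMI namespace; only its ancestors are.
    gui = root.find(f"{{{MMI_NS}}}startRequest/{{{MMI_NS}}}data/gui")
    lang = gui.get(XML_LANG, default_lang)
    table = STRINGS.get(lang, STRINGS[default_lang])
    widgets = {}
    for data in gui.findall("data"):
        widget_id = data.get("id")
        widgets[widget_id] = {
            "label": table.get(widget_id, widget_id),  # fall back to the id
            "enabled": data.get("enabled") == "true",
        }
    return {"resource": gui.get("resourceid"), "lang": lang, "widgets": widgets}
```

The interaction manager only names the resource and the locale; which document or dialog "login" maps to, and which strings are shown, stays inside the modality component.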

E HTTP transport of MMI lifecycle events

The "Multimodal Architecture and Interfaces" specification supports deployments in a variety of topologies, either distributed or co-located. In a distributed deployment, a protocol for transporting lifecycle events needs to be defined. HTTP is the major protocol of the web: it is widely adopted, supported by many programming languages and used in particular by web browsers. Technologies like AJAX provide asynchronous transmission of messages for web browsers and make it possible to build modality components on top of them in distributed environments. This chapter describes how the HTTP protocol should be used for MMI lifecycle event transport in distributed deployments. Modality components and the Interaction Manager need an HTTP processor to send and receive MMI lifecycle events. The following picture illustrates a possible modularization of the Runtime Framework, the Interaction Manager and the Modality Components. It shows internal lifecycle event interfaces (which abstract from the transport layer) and the HTTP processors. The HTTP processors are responsible for assembling and disassembling HTTP requests, which carry MMI lifecycle event representations as payloads.

The following chapters describe how the HTTP protocol should be used to transport MMI lifecycle events.

HTTP defines the concept of client and server [RFC2616]. One possible deployment of the multimodal architecture is shown in the following figure:

In this deployment scenario the Interaction Manager acts as an HTTP server, whereas modality components are HTTP clients, sending HTTP requests to the Interaction Manager. But other configurations are possible.

E.1 Lifecycle event transport from modality components to Interaction Manager

The multimodal architecture specification requires asynchronous bi-directional event transmission. To achieve this (in the given scenario, where modality components are HTTP clients and the Interaction Manager acts as an HTTP server) separate, parallel HTTP requests (referred to as send and receive channels in the picture) are used to send and receive lifecycle events.

Modality components use HTTP/POST requests to send MMI lifecycle events to the IM. The request contains the following URL request parameters:

  • context (or token )
  • source

The lifecycle event itself is contained in the body of the HTTP/POST request. The Content-Type header field of the HTTP/POST request has to be set according to the lifecycle event format, e.g. “text/xml”.

The URL request parameters context and source are equivalent to the respective MMI lifecycle event attributes. The context MUST be used whenever available. The context is only unknown to the modality component during startup of a multimodal session, as the context will be returned from the Interaction Manager to the Modality Component with the newContextResponse lifecycle event. Hence, when sending a newContextRequest, the context is unknown. Therefore a token is used to associate the newContextRequest and newContextResponse messages.

The token is a unique id (preexisting knowledge, e.g. generated by the modality component during registration) to identify the channel between a modality component and the Interaction Manager.

Once the context is exchanged, the context MUST be used with subsequent requests and the token MUST NOT be used anymore.
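
The parameter rules above can be sketched from the modality component's side as follows. This is an illustrative sketch, not a normative client: the function name build_event_post and the Interaction Manager URL are hypothetical, and the request is only constructed, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_event_post(im_url, source, event_xml, context=None, token=None):
    """Build (but do not send) the POST that carries one lifecycle event."""
    if context is not None:
        # Once a context is known it is used and the token is dropped.
        params = {"context": context, "source": source}
    elif token is not None:
        # Only before the newContextResponse has delivered a context.
        params = {"token": token, "source": source}
    else:
        raise ValueError("either context or token is required")
    return Request(
        f"{im_url}?{urlencode(params)}",
        data=event_xml.encode("utf-8"),         # lifecycle event in the body
        headers={"Content-Type": "text/xml"},   # matches the event format
        method="POST",
    )
```

A caller would pass only the token when sending the newContextRequest and switch to the context for every later event.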

The response (to an HTTP/POST request, which carries a lifecycle event from a Modality Component to the Interaction Manager) MUST NOT contain any content and the HTTP response code MUST be “204 No Content”.

The HTTP processor of the Interaction Manager is expected to handle POST requests (which contain lifecycle events sent from the modality component to the Interaction Manager) as follows:

  • use the context (or token ) parameter to identify the corresponding interaction manager session
  • read lifecycle event from request body
  • forward MMI event to corresponding Interaction Manager session
  • return "204 No Content" HTTP status code in case of success or 4XX/5XX codes in case of failure (see error handling section below)
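
The steps above can be sketched as a single function, independent of any particular HTTP server library. The dictionaries used to look up sessions and the choice of 400 for failures are assumptions of this sketch; the text only requires some 4XX or 5XX code on failure.

```python
def handle_event_post(params, body, sessions, tokens):
    """Return the HTTP status code for one POSTed lifecycle event.

    params   -- URL request parameters as a dict
    body     -- request body holding the lifecycle event
    sessions -- hypothetical map: context -> list acting as the session inbox
    tokens   -- hypothetical map: token -> list, used before a context exists
    """
    # Step 1: identify the Interaction Manager session.
    if "context" in params:
        inbox = sessions.get(params["context"])
    else:
        inbox = tokens.get(params.get("token"))
    if inbox is None or "source" not in params:
        return 400  # assumption: any 4XX code would satisfy the text
    # Step 2: read the lifecycle event from the request body.
    if not body:
        return 400
    # Step 3: forward the event to the session.
    inbox.append((params["source"], body))
    # Step 4: success is reported as "204 No Content" with an empty body.
    return 204
```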

E.2 Lifecycle event transport from IM to modality components (HTTP clients only)

Modality components which are not HTTP servers (such as modality components built on top of web browsers) are not able to receive HTTP requests. Thus, to receive MMI events from the Interaction Manager, such modality components need to poll for events. The modality component has to send an HTTP/GET request to the Interaction Manager to request the next MMI event. To reduce delay and network traffic, the HTTP processor of the Interaction Manager may block the HTTP request for a certain time (a long-lived HTTP request). The modality component may control the maximum delay using the optional parameter timeout (in milliseconds). The request contains the following URL request parameters:

  • context (or token )
  • source
  • timeout (optional)

See the discussion of the parameter context in the previous chapter. The parameter source describes the source of the request, i.e. the modality component's id. The parameter timeout is optional and describes the maximum delay in milliseconds. Only non-negative integer values are allowed for the parameter timeout. A request with timeout set to “0” returns immediately. The Interaction Manager may limit the timeout to a (platform specific) maximum value. If the parameter timeout is absent, the Interaction Manager uses a platform specific default.

The HTTP response body contains the lifecycle event as a string. The HTTP response header MUST contain the Content-Type header field, which describes the format of the lifecycle event string (e.g. “text/xml”).

The HTTP processor of the Interaction Manager is expected to handle HTTP/GET requests (which are used by the Modality Component to receive lifecycle events) as follows:

  • use context (or token ) parameter to identify the corresponding Interaction Manager session
  • use source parameter to identify modality component id
  • check for corresponding events (i.e. are there events to send from Interaction Manager to this particular Modality Component). This step might be blocking for a certain time (according to timeout parameter) to optimize network performance.
  • generate HTTP response containing lifecycle event string (and set Content-Type header field appropriately). Use "200 OK" HTTP status code in case an event is contained in the response, “204 No Content” in case of timeout or 4XX/5XX codes in case of failure (see error handling section below)

The following figure shows a sequence of HTTP requests:

If the IM receives an HTTP/GET request containing an invalid token or context, it MUST return a 409 (Conflict) response code.
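
The polling behaviour, including the timeout handling and the 409 response, might be sketched as follows. The per-component queues, the constants MAX_TIMEOUT_MS and DEFAULT_TIMEOUT_MS, and the use of 400 for a malformed timeout or unknown source are assumptions of this sketch and not part of this specification.

```python
import queue

MAX_TIMEOUT_MS = 30_000      # hypothetical platform specific maximum
DEFAULT_TIMEOUT_MS = 10_000  # hypothetical platform specific default


def handle_event_get(params, outboxes):
    """Return (status, headers, body) for one polling GET.

    outboxes -- hypothetical map: context or token -> {source -> queue.Queue}
    """
    session = outboxes.get(params.get("context", params.get("token")))
    if session is None:
        return 409, {}, ""  # invalid token or context
    outbox = session.get(params.get("source"))
    if outbox is None:
        return 400, {}, ""  # assumption: unknown modality component id
    try:
        timeout_ms = int(params.get("timeout", DEFAULT_TIMEOUT_MS))
    except ValueError:
        return 400, {}, ""
    if timeout_ms < 0:
        return 400, {}, ""
    timeout_ms = min(timeout_ms, MAX_TIMEOUT_MS)
    try:
        if timeout_ms == 0:
            event = outbox.get_nowait()  # return immediately
        else:
            event = outbox.get(timeout=timeout_ms / 1000)  # blocking wait
    except queue.Empty:
        return 204, {}, ""  # timed out without an event
    return 200, {"Content-Type": "text/xml"}, event
```

After each response, whether 200 or 204, the modality component would issue the next GET so that a receive channel is always open.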

E.3 Lifecycle event transport from Interaction Manager to modality components (HTTP servers)

For modality components, which are HTTP servers themselves, the Interaction Manager needs to send a lifecycle event through an HTTP/POST request. The request contains the following parameters:

  • context
  • target

See the discussion of parameters in the previous chapters. Again, the parameter target is equivalent to the corresponding MMI lifecycle event attribute and describes the receiver of the event. Hence, the receiver of the HTTP request uses this parameter to identify the corresponding modality component.

E.4 Error handling

Various MMI lifecycle events (especially response events) contain Status and StatusInfo fields. These fields should be used for error indication whenever possible. However, a failure during delivery of a lifecycle event needs to be indicated using HTTP response codes.

The HTTP processor of the Interaction Manager has to use HTTP response codes to indicate success or errors during request handling. If a request is processed successfully (successful in terms of transport, i.e. an event has been successfully delivered) a 2XX status code (e.g. "204 No Content") has to be returned. Transport related errors, which lead to failure in delivery of a lifecycle event, are indicated using 4XX or 5XX response codes. 4XX error codes refer to "client errors" (wrong parameters etc.), whereas 5XX error codes indicate server errors (see also HTTP response codes in [RFC2616]).

The treatment of transport errors is up to the implementation, but the implementation should make errors visible to author code (e.g. by raising an event within the Interaction Manager when a lifecycle event has not been successfully delivered to a Modality Component).
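
One way to make transport errors visible to author code is sketched below. The outcome labels and the on_error callback are hypothetical; an implementation is free to surface failures differently.

```python
def classify_delivery(status):
    """Map an HTTP status code to a transport-level delivery outcome."""
    if 200 <= status < 300:
        return "delivered"
    if 400 <= status < 500:
        return "client-error"   # e.g. wrong parameters
    if 500 <= status < 600:
        return "server-error"
    return "unexpected"


def report_delivery(status, event_id, on_error):
    """Call on_error(event_id, outcome, status) unless delivery succeeded."""
    outcome = classify_delivery(status)
    if outcome != "delivered":
        on_error(event_id, outcome, status)
    return outcome
```

Errors reported inside a successfully delivered response event (via its Status and StatusInfo fields) are outside the scope of this sketch, since they are not transport failures.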

F Glossary

E Use Case Discussion

This section presents a detailed example of an implementation of this architecture. For the sake of concreteness, it specifies a number of details that are not included in this document. It is based on the MMI use case document [MMIUse], specifically the second use case, which presents a multimodal in-car application for giving driving directions. Three languages are involved in the design view:

  • The Controller/Interaction Manager markup language. We will not specify this language but will assume that it is capable of representing a reasonably powerful state machine.
  • The graphical language. We will assume that this is HTML.
  • The voice language. We will assume that this is VoiceXML. For concreteness, we will use VoiceXML 2.0 [VXML], but will also note differences in behavior that might occur with a future version of VoiceXML.

The remainder of the discussion involves the run-time view. The numbered items are taken from the "User Action/External Input" field of the event table. The appended comments are based on the working group's discussion of the use case.

1. User presses button on wheel to start application.

Comment: The Runtime Framework submits to a pre-configured URL and receives a session cookie in return. This cookie will be included in all subsequent submissions. Now the Runtime Framework loads the DCCI framework, retrieves the default user and device profile and submits them to a (different) URL to get the Controller Document. UAPROF can be used for standard device characteristics (screen size, etc.), but it is not extensible and does not cover user preferences. The DCCI group is working on a profile definition that provides an extensible set of attributes and can be used here. Once the initial profile submission is made, only updates get sent in subsequent submissions.

Once the Runtime Framework loads the Controller, it notes that it references both VoiceXML and HTML documents. Therefore it makes sure that the corresponding Modality Components are loaded, and then sends Prepare for each Component. These events contain the Context ID and the Component-specific markup (VoiceXML or HTML). If the markup was included in the root document, it is delivered in-line in the event. However, if the main document referenced the Component-specific markup via URL, only the URL is passed in the event. Once the Modality Components receive the Prepare event, they parse their markup, initialize their resources (ASR, TTS, etc.) and return PrepareResponse events. The IM responds with Start events and the application is ready to interact with the user.

2. The user interacts in an authentication dialog.

Comment: The Runtime Framework sends the Start command to the VoiceXML Modality Component, which executes a Form asking the user to identify himself. In VoiceXML 3.0, the Form might make use of speaker verification as well as speech recognition. Any database access or other back-end interaction is handled inside the Form. In VoiceXML 2.0, the recognition results (which include the user's identity) will be returned to the IM by the <exit> tag along with a namelist. This would mean that the specific logical Modality Component instance had exited, so that any further voice interactions would have to be handled by a separate logical Modality Component corresponding to a separate Presentation Document. In VoiceXML 3.0, however, it would be possible for the Modality Component instance to send a recognition result event to the IM without exiting. It would then be sitting there, waiting for the IM to send it another event to trigger further processing. Thus in VoiceXML 3.0, all the voice interactions in the application could be handled by a single Markup Component (section of VoiceXML markup) and a single logical Modality Component.

Recognition can be done locally, remotely (on the server) or distributed between the device and the server. By default, the location of event handling is determined by the markup. If there is a local handler for an event specified in the document, the event is handled locally. If not, the event is forwarded to the server. Thus if the markup specifies a speech-started event handler, that event will be consumed locally. Otherwise it will be forwarded to the server. However, remote ASR requires more than simply forwarding the speech-started event to the server because the audio channel must be established. This level of configuration is handled by the device profile, but can be overridden by the markup. Note that the remote server might contain a full VoiceXML interpreter as well as ASR capabilities. In that case, the relevant markup would be sent to the server along with the audio. The protocol used to control the remote recognizer and ship it audio is not part of the MMI specification (but may well be MRCP).

Open Issue: The previous paragraph about local vs remote event handling is retained from an earlier draft. Since the Modality Component is a black box to the Runtime Framework, the local vs remote distinction should be internal to it. Therefore the event handlers would have to be specified in the VoiceXML markup. But no such possibility exists in VoiceXML 2.0. One option would be to make the local vs remote distinction vendor-specific, so that each Modality Component provider would decide whether to support remote operations and, if so, how to configure them. Alternatively, we could define the DCCI properties for remote recognition, but make it optional that vendors support them. In either case, it would be up to the VoiceXML Modality Component to communicate with the remote server, etc. Newer languages, such as VoiceXML 3.0, could be designed to allow explicit markup control of local vs remote operations. Note that in the most complex case, there could be multiple simultaneous recognitions, some of which were local and some remote. This level of control is most easily achieved via markup, by attaching properties to individual grammars. DCCI properties are more suitable for setting global defaults.

When the IM receives the recognition result event, it parses it and retrieves the user's preferences from the DCCI component, which it then dispatches to the Modality Components, which adjust their displays, output, default grammars, etc. accordingly. In VoiceXML 2.0, each of the multiple voice Modality Components will receive the corresponding event.

3. Initial GPS input.

Comment: DCCI configuration determines how often GPS update events are raised. On the first event, the IM sends the HTML Modality Component a command to display the initial map. On subsequent events, a handler in the IM markup determines if the automobile's location has changed enough to require an update of the map display. Depending on device characteristics, the update may require redrawing the whole map or just part of it.

This particular step in the use case shows the usefulness of the Interaction Manager. One can imagine an architecture lacking an IM in which the Modality Components communicate with each other directly. In this case, all Modality Components would have to handle the location update events separately. This would mean considerable duplication of markup and calculation. Consider in particular the case of a VoiceXML 2.0 Form which is supposed to warn the driver when he goes off course. If there is an IM, this Form will simply contain the off-course dialog and will be triggered by an appropriate event from the IM. In the absence of the IM, however, the Form will have to be invoked on each location update event. The Form itself will have to calculate whether the user is off-course, exiting without saying anything if he is not. In parallel, the HTML Modality Component will be performing a similar calculation to determine whether to update its display. The overall application is simpler and more modular if the location calculation and other application logic is placed in the IM, which will then invoke the individual Modality Components only when it is time to interact with the user.

Note on the GPS. We assume that the GPS raises four types of events: On-Course Updates, Off-Course Alerts, Loss-of-Signal Alerts, and Recovery of Signal Notifications. The Off-Course Alert is covered below. The Loss-of-Signal Alert is important since the system must know if its position and course information is reliable. At the very least, we would assume that the graphical display would be modified when the signal was lost. An audio earcon would also be appropriate. Similarly, the Recovery of Signal Notification would cause a change in the display and possibly an audio notification. This event would also contain an indication of the number of satellites detected, since this determines the accuracy of the signal: three satellites are necessary to provide x and y coordinates, while a fourth satellite allows the determination of height as well. Finally, note that the GPS can assume that the car's location does not change while the engine is off. Thus when it starts up it will assume that it is at its last recorded location. This should make the initialization process quicker.

4. User selects option to change volume of on-board display using touch display.

Comment: The HTML Modality Component raises an event, which the IM catches. Depending on the IM language, it may be able to call the DCCI interface directly (e.g. as executable content in SCXML). If it cannot, the IM would generate an event to modify the relevant DCCI property and the Runtime Framework (Adapter) would be responsible for converting it into the appropriate function call, which has the effect of resetting the output volume.

5. User presses button on steering wheel (to start recognition).

Comment: The interesting question here is whether the button-push event is visible at the application level. One possibility is that the button-push simply turns on the mike and is thus invisible to the application. In that case, the voice modality component must already be listening for input with no prespeech timeout set. On the other hand, if there is an explicit button-push event, the IM could catch it and then invoke the speech component, which would not need to have been active in the interim. The explicit event would also allow for an update of the graphical display.

6. User says destination address. (May improve recognition accuracy by sending grammar constraints to server based on a local dialog with the user instead of allowing any address from the start.)

Comment: Assuming V3 and explicit markup control of recognition, the device would first perform local recognition, then send the audio off for remote recognition if the confidence was not high enough. The local grammar would consist of 'favorites' or places that the driver was considered likely to visit. The remote grammar would be significantly larger, possibly including the whole continent.

When the IM is satisfied with the confidence levels, it ships the n-best list off to a remote server, which adds graphical information for at least the first choice. The server may also need to modify the n-best list, since items that are linguistically unambiguous may turn out to be ambiguous in the database (e.g., "Starbucks"). Now the IM instructs the HTML component to display the hypothesized destination (first item on n-best list) on the screen and instructs the speech component to start a confirmation dialog. Note that the submission to the remote server should be similar to the <data> tag in VoiceXML 2.1 in that it does not require a document transition. (That is, the remote server should not have to generate a new IM document/state machine just to add graphical information to the n-best list.)

7. User confirms destination.

Comment: Local recognition of grammar built from n-best list. The original use case states that the device sends the destination information to the server, but that may not be necessary since the device already has a map of the hypothesized destination. However, if the confirmation dialog resulted in the user choosing a different destination (i.e., not the first item on the n-best list), it might be necessary to fetch graphical/map information for the selected destination. In any case, all this processing is under markup control.

8. GPS Input at regular intervals.

Comment: On-Course Updates. Event handler in the IM decides if location has changed enough to require update of graphical display.

9. GPS Input at regular intervals (indicating driver is off course).

Comment: This is probably an asynchronous Off-Course Alert, rather than a synchronous update. In either case, the GPS determines that the driver is off course and raises a corresponding event which is caught by the IM. Its event handler updates the display and plays a prompt warning the user. Note that both these updates are asynchronous. In particular, the warning prompt may need to pre-empt other audio (for example, the system might be reading the user's email back to him).

10. N/A

Comment: The IM sends a route request to the server, requesting it to recalculate the route based on the new (unexpected) location. This is also part of the event handler for the off-course event. There might also be a speech interaction here, asking the user if he has changed his destination.

11. Alert received on device based on traffic conditions.

Comment: This is another asynchronous event, just like the off-course event. It will result in asynchronous graphical and verbal notifications to the user, possibly pre-empting other interactions. The difference between this event and the off-course event is that this one is generated by the remote server. To receive it, the IM must have registered for it (and possibly other event types) when the driver chose his destination. Note that the registration is specific to the given destination since the driver does not want to receive updates about routes he is not planning to take.

12. User requests recalculation of route based on current traffic conditions.

Comment: Here the recognition can probably be done locally. The recalculation of the route is then done by the server, which sends the updated route and graphical information to the device.

13. GPS Input at regular intervals.

Comment: On-Course updates as discussed above.

14. User presses button on steering wheel.

Comment: Recognition started. Whether this is local or remote recognition is determined by markup and/or DCCI defaults established at the start of application. The use case does not specify whether all recognition requires a button push. One option would be to require the button push only when the driver is initiating the interaction. This would simplify the application in that it would not have to be listening constantly to background noise or side chatter just in case the driver issued a command. In cases where the system had prompted the driver for input, the button push would not be necessary. Alternatively, a special hot-word could take the place of the button push. All of these options are compatible with the architecture described in this document.

15. User requests new destination by destination type while still depressing button on steering wheel (may improve recognition accuracy by sending grammar constraints to server based on a local dialog with the us

Comment: Local and remote recognition as before, with IM sending n-best list to server, which adds graphical information for at least the first choice.

16. User confirms destination via a multiple interaction dialog to determine exact destination.

Comment: Local disambiguation dialog, as above. At the end, user is asked if this is a new destination.

17. User indicates that this is a stop on the way to original destination.

Comment: Device sends request to server, which provides updated route and display info. The IM must keep track of the original destination so that it can request a new route to it after the driver reaches his intermediate destination.

18. GPS Input at regular intervals.

Comment: As above.

G Best Practices for Creating an MMI Modality Component

G.1 Simple modality components

Modality components can be classified into one of three categories: simple, complex or nested.

A simple modality component presents information to a user or captures information from a user as directed by an interaction manager. A simple modality component is atomic in that it cannot be partitioned into two or more simple modality components that send events among themselves. A simple modality component is like a black box in that the interaction manager cannot directly access any function inside the black box other than by using life-cycle events.

A simple modality component might contain functionality to present one of the following types of information to the user or user agent. For example:

  • TTS—generates synthetic speech from a text string
  • Audio replay—replays an audio file to a user
  • GUI presentation—presents HTML on a display device.
  • Ink replay—replays one or more ink strokes
  • Video replay—replays one or more video clips

A simple modality component might contain functionality to capture one of the following types of information from the user or user agent as directed by a complex modality or interaction manager:

  • Audio capture—records user utterances
  • ASR—captures text from the user by using a grammar to convert spoken voice into text
  • DTMF—captures digits from a user by using a grammar to interpret the sounds created by a touch-tone keypad on a phone
  • Ink capture—captures one or more ink strokes
  • Ink recognition—captures one or more ink strokes and interprets them as text by using a grammar.
  • Speaker verification—determines if a user is who the user claims to be by comparing spoken voice characteristics with the voice characteristics known to be associated with the user
  • Speaker identification—determines who a speaker is by comparing spoken voice characteristics with a set of preexisting voice characteristics of several individuals.
  • Face verification—determines if a user is who the user claims to be by comparing face patterns with the face patterns known to be associated with the user
  • Face identification—determines who a person is by comparing face pattern characteristics with a set of preexisting face patterns of several individuals
  • GPS—captures the current GPS location of a device.
  • Keyboard or mouse—captures information entered by the user using a keyboard or mouse.

Figure 1: Two simple modality components

Figure 1 illustrates two simple modality components: an ASR modality for capturing input from the user and a TTS modality for presenting output to the user. Note that all information exchanged between the two modality components must be sent as life-cycle events to the interaction manager, which forwards them to the other modality component.
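The routing constraint shown in Figure 1 can be sketched in Python. This is only an illustration under stated assumptions, not part of the MMI Architecture: the class names `SimpleComponent` and `InteractionManager`, the `deliver` method, and the dictionary used to represent an event are all hypothetical.

```python
class SimpleComponent:
    """Hypothetical simple modality component: a black box that only sees events."""

    def __init__(self, name):
        self.name = name
        self.received = []  # events delivered to this component by the IM

    def on_event(self, event):
        self.received.append(event)


class InteractionManager:
    """Hypothetical IM: the only path between simple modality components."""

    def __init__(self):
        self.components = {}

    def register(self, component):
        self.components[component.name] = component

    def deliver(self, source, target, name, data=None):
        # Components never hold references to each other; every event
        # passes through the IM, which forwards it to the target.
        if source not in self.components:
            raise KeyError("unknown source: " + source)
        event = {"name": name, "source": source, "data": data}
        self.components[target].on_event(event)
        return event


im = InteractionManager()
asr = SimpleComponent("asr")
tts = SimpleComponent("tts")
im.register(asr)
im.register(tts)

# Information from the ASR component reaches the TTS component only via the IM.
im.deliver("asr", "tts", "extensionNotification", data="please repeat")
```

Because neither component stores a reference to the other, replacing the TTS component with a different one would require changes only in the IM registration.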

G.2 Complex modality components

A complex modality component may contain functionality of two or more simple modality components, for example:

  • GUI—presents information to the user, and captures keystrokes and mouse movements
  • VXML—presents a VoiceXML dialog to the user that both presents speech to the user and captures the user's speech
  • GUI/VUI—enables user to both speak and listen, and read and type.

Figure 2: A basic modality component with two functions

Figure 2 illustrates a complex modality component containing two functions, ASR and TTS. The ASR and TTS functions within the complex modality component may communicate directly with each other, in addition to exchanging life-cycle events with the interaction manager.

G.3 Nested modality components

A nested modality component is a set of modality components and a script (possibly written in SCXML) that manages them. The script communicates with the child modality components using life-cycle events. The script communicates with the interaction manager using only life-cycle events. The child modality components may not communicate directly with each other.

Figure 3: A nested modality component with two child modality components, ASR and TTS.

Figure 3 illustrates a nested modality component with two child modality components, ASR and TTS.

In effect, the script within a nested modality component acts as an interaction manager for the child modality components, so a nested modality component is itself a nested interaction manager. This is the so-called "Russian Doll" model of nested interaction managers.
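The "Russian Doll" idea can be sketched in a few lines of Python. The sketch is illustrative only: the classes `Child` and `NestedComponent`, the use of plain strings as events, and the rule that each request is answered by the same name with "Response" in place of "Request" are simplifying assumptions, not definitions from the MMI Architecture.

```python
class Child:
    """Hypothetical child modality component managed by a nested component."""

    def __init__(self, name):
        self.name = name
        self.log = []  # names of the events this child has received

    def on_event(self, event_name):
        self.log.append(event_name)
        # Simplification: answer every request with the matching response name.
        return event_name.replace("Request", "Response")


class NestedComponent:
    """Hypothetical nested component whose script plays the IM role for its children."""

    def __init__(self, children):
        self.children = list(children)

    def on_event(self, event_name):
        # The script fans the event out to each child and collects the replies;
        # the children never talk to each other or to the outer IM.
        replies = [child.on_event(event_name) for child in self.children]
        # The outer IM sees a single response, as if from one component.
        return replies[0] if replies else event_name.replace("Request", "Response")


asr, tts = Child("asr"), Child("tts")
nested = NestedComponent([asr, tts])
reply = nested.on_event("pauseRequest")
```

From the outer interaction manager's point of view, `nested` behaves like any other modality component, which is what allows nesting to an arbitrary depth.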

G.4 Modality component rules

The following rules guarantee that modalities are portable from interaction manager to interaction manager.

G.4.1 Rule 1: Each modality component must implement all of the MMI life-cycle events.

The MMI life-cycle events are the mechanism through which a modality component communicates with the interaction manager. The MC author must define how the modality component will respond to each life-cycle event. A modality component must respond to every life-cycle event it receives from the interaction manager in the cases where a response is required, as defined by the MMI Architecture. For example, if a modality component presents a static display, it must respond to a <pauseRequest> event with a <pauseResponse> event even if the static display modality component does nothing else in response to the <pauseRequest> event.
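The static display case can be sketched as follows, using the pauseRequest/pauseResponse pair from the life-cycle event tables. This is a hedged illustration rather than a normative implementation: the helper name `respond`, the URIs, and the rule that the response name is derived by swapping "Request" for "Response" are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

MMI_NS = "http://www.w3.org/2008/04/mmi-arch"
ET.register_namespace("mmi", MMI_NS)


def respond(request_xml, own_uri):
    """Build the matching *Response for a life-cycle *Request (hypothetical helper)."""
    root = ET.fromstring(request_xml)
    request = root[0]  # the single life-cycle event inside <mmi:mmi>
    local = request.tag.split("}")[1]  # strip the "{namespace}" part of the tag
    if not local.endswith("Request"):
        raise ValueError("not a request event: " + local)

    reply_root = ET.Element("{%s}mmi" % MMI_NS, {"version": "1.0"})
    ET.SubElement(
        reply_root,
        "{%s}%s" % (MMI_NS, local[: -len("Request")] + "Response"),
        {
            "source": own_uri,
            "context": request.get("context"),
            "requestID": request.get("requestID"),
            # A static display has nothing to pause, but it still answers.
            "status": "success",
        },
    )
    return ET.tostring(reply_root, encoding="unicode")


pause = (
    '<mmi:mmi xmlns:mmi="%s" version="1.0">'
    '<mmi:pauseRequest source="uri:IM" context="URI-1" requestID="request-7"/>'
    "</mmi:mmi>" % MMI_NS
)
reply = respond(pause, "uri:staticDisplay")
```

The response copies the context and requestID of the request so that the interaction manager can correlate the two events.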

For each life-cycle event, define the parameters and syntax of the event's "data" element that will be used in performing that function. For example, the <startRequest> event for a speech recognition modality component might include parameters like timeout, confidence threshold, max n-best, and grammar.

G.4.2 Rule 2: Identify other functions of the modality component that are relevant to the interaction manager.

Define an <extensionNotification> event to communicate these functions to and from the interaction manager.

G.4.3 Rule 3: If the component uses media, specify the media format. For example, audio formats for speech recognition, or InkML for handwriting recognition.

G.4.4 Rule 4: Specify protocols for use between the component and the IM (e.g., SIP or HTTP).

G.4.5 Rule 5: Specify supported human languages (e.g., English, German, Chinese) and locale, if relevant.

G.4.6 Rule 6: Specify supporting languages required by the component, if any.

For example:

  • SSML for a speech synthesis simple modality component
  • SRGS and SISR for a speech recognition simple modality component
  • VoiceXML 2.1, SSML, SRGS, and SISR for a speech complex modality component

G.4.7 Rule 7: Modality components sending data to the interaction manager must use the EMMA format where appropriate.

If a modality component captures or generates information, then it should format the information using the EMMA format and use an extension event to send that information to the interaction manager.

G.4.8 Rule 8: Specify error codes and their meanings to be returned to the IM.

The MC developer must specify all error codes that are specific to the component. If the MC is based on another technology, the developer can provide a reference to that technology specification. For instance, if the MC is based on VoiceXML, a reference to the VoiceXML spec for VoiceXML errors can be included instead of listing each VoiceXML error.

Errors such as XML errors and MMI protocol errors must be handled in accordance with the rules laid out in the MMI architecture. These errors do not need to be documented.

G.5 Modality component Guidelines

The following guidelines should be helpful for modality authors to make modalities portable from interaction manager to interaction manager.

G.5.1 Guideline 1: Consider constructing a complex modality component with multiple functions if one function handles the errors generated by another function.

For example, if the ASR fails to recognize a user's utterance, the TTS function may present a prompt asking the user to try again. As another example, if the ASR fails to recognize a user's utterance, a GUI function might display the n-best list on a screen so the user can select the desired word. Efficiency concerns may also indicate that two modality components be combined into a single complex modality component.

G.5.2 Guideline 2: Consider constructing a complex modality component with multiple functions rather than several simple modality components if the functions need to be synchronized.

For example, a TTS function must be synchronized with a visual talking head so that the lip movements match the words. As another example, a TTS function presents information about each graphical item that the user places "in focus." Again, efficiency concerns may indicate that the TTS and talking head be combined into a single complex modality component.

G.5.3 Guideline 3: Consider constructing a nested modality component with multiple child modality components if the child modality components are frequently used together but do not handle the errors generated by the other child components and do not need to be extensively synchronized.

Writing an application using a nested modality component may be easier than writing the same application using multiple modality components if the nested modality component hides much of the complexity of managing the children modality components.

G.6 Example simple modality: Face Identification

G.6.1 Functions of a Possible Face Identification Component

Consider a theoretical face identification modality component that takes an image or images of a face and returns the set of possible matches and the confidence of the face identification software in each match. An API to that modality component would include events for starting the component, providing data, and for receiving results back from the component.

This particular example includes the information needed to run this component in the "startRequest" and "doneNotification" events; that is, in this example no "extensionNotification" events are used, although extensionNotification events could be part of another modality component's API. This example assumes that an image has already been acquired from some source; however, another possibility would be to also include image acquisition in the operation of the component.

Depending on the capabilities of the modality component, other possible information that might be included would be the algorithm to be used or the image format to expect. We emphasize that this is just an example to indicate the kinds of information that might be used by a multimodal application that includes face recognition. The actual interface used in real applications should be defined by experts in the field.

The use case is a face identification component that identifies one of a set of employees on the basis of face images.

The MMI Runtime Framework could use the following events to communicate with such a component.

Table 1: Component behavior of Face Identification with respect to modality component rules.
Rule | Component Information
Rule 1: Each modality component must implement all of the MMI life-cycle events. | See Table 2 for the details of the implementation of the life-cycle events.
Rule 2: Identify other functions of the modality component that are relevant to the interaction manager. | All the functions of the component are covered in the life-cycle events; no other functions are needed.
Rule 3: If the component uses media, specify the media format. | The component uses the jpeg format for images to be identified and for its image database.
Rule 4: Specify protocols supported by the component for transmitting media (e.g. SIP). | The component uses HTTP for transmitting media.
Rule 5: Specify supported human languages. | This component does not support any human languages.
Rule 6: Specify supporting languages required by the component. | This component does not require any markup languages.
Rule 7: Modality components sending data to the interaction manager must use the EMMA format. | This component uses EMMA.

Table 2: Component behavior of face identification for each life-cycle event.
Life Cycle Event | Component Implementation
newContextRequest | (Standard) The component requests a new context from the IM.
newContextResponse | (Standard) The component starts a new context and assigns the new context id to it.
prepareRequest | The component prepares resources to be used in identification, specifically, the image database.
prepareResponse | (Standard) If the database of known users is not found, the error message "known users not found" is returned in the <statusInfo> element.
startRequest | The component starts processing if possible, using a specified image, image database, threshold, and limit on the size of nbest results to be returned.
startResponse | (Standard) If the database of known users is not found, the error message "known users not found" is returned in the <statusInfo> element.
doneNotification | Identification results in EMMA format are reported in the "data" field. The mode is "photograph", the medium is "visual", the function is "identification", and verbal is "false".
cancelRequest | This component stops processing when it receives a "cancelRequest". It always performs a hard stop whether or not the IM requests a hard stop.
cancelResponse | (Standard)
pauseRequest | This component cannot pause.
pauseResponse | <statusInfo> field is "cannot pause".
resumeRequest | This component cannot resume.
resumeResponse | <statusInfo> field is "cannot resume".
extensionNotification | This component does not use "extensionNotification". It ignores any "extensionNotification" events that are sent to it by the IM.
clearContextRequest | (Standard)
clearContextResponse | (Standard)
statusRequest | (Standard)
statusResponse | The component returns a standard life-cycle response. The "automaticUpdate" attribute is "false", because this component does not supply automatic updates.

Note: "(Standard)" means that the component does not do anything over and above the actions specified by the MMI Architecture.

G.6.2 Event Syntax

G.6.2.1 Examples of events for starting the component

To start the component, the IM sends a startRequest event to the face identification component, asking it to start an identification. The example assumes that images found at a certain URI are to be identified by comparing them against a known set of employees found at another URI. The confidence threshold of the component is set to .5 and the IM requests a maximum of five possible matches.

<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
  <mmi:startRequest source="uri:RTFURI" context="URI-1" requestID="request-1">
    <mmi:data>
      <face-identification-parameters threshold=".5" unknown="someURI" known="uri:employees" max-nbest="5"/>
    </mmi:data>
  </mmi:startRequest>
</mmi:mmi>
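As a rough sketch of how an interaction manager might assemble such a startRequest programmatically, the following Python uses the standard library's ElementTree. The function name `build_start_request` and its keyword defaults are assumptions for illustration; the element and attribute names are taken from the example above.

```python
import xml.etree.ElementTree as ET

MMI_NS = "http://www.w3.org/2008/04/mmi-arch"
ET.register_namespace("mmi", MMI_NS)


def build_start_request(source, context, request_id, unknown, known,
                        threshold=0.5, max_nbest=5):
    """Serialize a face identification startRequest (hypothetical helper)."""
    root = ET.Element("{%s}mmi" % MMI_NS, {"version": "1.0"})
    start = ET.SubElement(
        root,
        "{%s}startRequest" % MMI_NS,
        {"source": source, "context": context, "requestID": request_id},
    )
    data = ET.SubElement(start, "{%s}data" % MMI_NS)
    # The parameter element is component-specific and carries no namespace here.
    ET.SubElement(data, "face-identification-parameters", {
        "threshold": str(threshold),
        "unknown": unknown,
        "known": known,
        "max-nbest": str(max_nbest),
    })
    return ET.tostring(root, encoding="unicode")


xml_text = build_start_request(
    "uri:RTFURI", "URI-1", "request-1",
    unknown="someURI", known="uri:employees",
)
```

Building the event from a tree rather than by string concatenation keeps the namespace declaration and the closing tags consistent.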

As part of support for the life-cycle events, a modality component is required to respond to a startRequest event with a startResponse event. Here's an example of a startResponse from the face identification component to the IM informing the IM that the face identification component has successfully started.

<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
  <mmi:startResponse source="uri:faceURI" context="URI-1" requestID="request-1" status="success"/>
</mmi:mmi>

Here's an example of a startResponse event from the face identification component to the IM in the case of failure, with an example failure message. In this case the failure message indicates that the known images cannot be found.

<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
  <mmi:startResponse source="uri:faceURI" context="URI-1" requestID="request-1" status="failure">
    <mmi:statusInfo>
      known users not found
    </mmi:statusInfo>
  </mmi:startResponse>
</mmi:mmi>
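On the receiving side, an interaction manager needs to distinguish the success and failure forms of the startResponse. The following Python sketch shows one way to do that; the function name `check_start_response` and the tuple it returns are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

NS = {"mmi": "http://www.w3.org/2008/04/mmi-arch"}


def check_start_response(xml_text):
    """Return (ok, message) for a startResponse event (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    response = root.find("mmi:startResponse", NS)
    if response is None:
        raise ValueError("no startResponse in message")
    ok = response.get("status") == "success"
    # statusInfo is only expected on failure; default to an empty string.
    info = response.findtext("mmi:statusInfo", default="", namespaces=NS)
    return ok, info.strip()


failure = """
<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
  <mmi:startResponse source="uri:faceURI" context="URI-1"
                     requestID="request-1" status="failure">
    <mmi:statusInfo>
      known users not found
    </mmi:statusInfo>
  </mmi:startResponse>
</mmi:mmi>
"""
result = check_start_response(failure)
```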

G.6.2.2 Example output event

Here's an example of an output event, sent from the face identification component to the IM, using EMMA to represent the identification results. Two results with different confidences are returned.

<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
  <mmi:doneNotification source="uri:faceURI" context="URI-1" status="success" requestID="request-1">
    <mmi:data>
      <emma:emma version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma">
        <emma:one-of emma:medium="visual" emma:verbal="false" emma:mode="photograph" emma:function="identification">
          <emma:interpretation id="int1" emma:confidence=".75">
            <person>12345</person>
            <name>Mary Smith</name>
          </emma:interpretation>
          <emma:interpretation id="int2" emma:confidence=".6">
            <person>67890</person>
            <name>Jim Jones</name>
          </emma:interpretation>
        </emma:one-of>
      </emma:emma>
    </mmi:data>
  </mmi:doneNotification>
</mmi:mmi>
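A consumer of this event would typically want the candidate matches ordered by confidence. The Python sketch below extracts them from the EMMA payload. It assumes that both the `mmi` and `emma` prefixes are declared in the message and that the application elements `person` and `name` carry no namespace; the function name `ranked_matches` is hypothetical.

```python
import xml.etree.ElementTree as ET

EMMA_NS = "http://www.w3.org/2003/04/emma"
INTERPRETATION = "{%s}interpretation" % EMMA_NS
CONFIDENCE = "{%s}confidence" % EMMA_NS


def ranked_matches(xml_text):
    """Return (confidence, person, name) tuples, highest confidence first."""
    root = ET.fromstring(xml_text)
    matches = []
    for interp in root.iter(INTERPRETATION):
        matches.append((
            float(interp.get(CONFIDENCE, "0")),
            interp.findtext("person"),
            interp.findtext("name"),
        ))
    return sorted(matches, key=lambda m: m[0], reverse=True)


done = """
<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
  <mmi:doneNotification source="uri:faceURI" context="URI-1"
                        status="success" requestID="request-1">
    <mmi:data>
      <emma:emma version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma">
        <emma:one-of emma:medium="visual" emma:verbal="false"
                     emma:mode="photograph" emma:function="identification">
          <emma:interpretation id="int2" emma:confidence=".6">
            <person>67890</person>
            <name>Jim Jones</name>
          </emma:interpretation>
          <emma:interpretation id="int1" emma:confidence=".75">
            <person>12345</person>
            <name>Mary Smith</name>
          </emma:interpretation>
        </emma:one-of>
      </emma:emma>
    </mmi:data>
  </mmi:doneNotification>
</mmi:mmi>
"""
matches = ranked_matches(done)
```

The sample deliberately lists the lower-confidence interpretation first, to show that the ordering comes from the sort and not from document order.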

This is an example of EMMA output in the case where the face image doesn't match any of the employees.

<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
  <mmi:doneNotification source="uri:faceURI" context="URI-1" status="success" requestID="request-1">
    <mmi:data>
      <emma:emma version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma">
        <emma:interpretation id="int1" emma:confidence="0.0"
         emma:uninterpreted="true" emma:medium="visual" emma:mode="photograph"
         emma:function="identification"/>
      </emma:emma>
    </mmi:data>
  </mmi:doneNotification>
</mmi:mmi>

G.7 Example simple modality: Form-filling using Handwriting Recognition

G.7.1 Functions of a Possible Handwriting Recognition Component

Consider an ink recognition modality component for Handwriting Recognition (HWR) that takes digital ink written using an electronic pen or stylus, performs recognition, and returns the recognized text. An API to such a modality component would include events for initializing the component, for requesting recognition by providing digital ink data, and for receiving the recognized text (possibly an n-best list) back from the component, as shown in the figure below.

Figure 4: Example of Japanese handwriting recognition

This example assumes that handwriting ink is captured, represented in W3C InkML format, and sent to the IM with a request for recognition to text. The following sequence of events explains the ink recognition request.

  1. The IM requests the ink recognition modality by sending the "prepareRequest" event along with the parameters for configuring the HWR system.
  2. The ink recognition modality responds with the "prepareResponse" event, which carries the status of the configuration of the HWR system.
  3. The IM sends the "startRequest" event to the ink recognition modality, where the event's data field contains the InkML data to be recognized.
  4. Once the recognition is completed, the ink recognition modality reports the results to the IM using the "doneNotification" event along with the recognition choices (n-best list).
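The ordering of these four events can be checked mechanically, for example in a test harness for a modality component. The Python sketch below is one possible approach; the function name `follows_sequence` and the representation of a trace as a list of event names are assumptions made for illustration.

```python
# Event order described in the numbered steps above.
EXPECTED = ["prepareRequest", "prepareResponse", "startRequest", "doneNotification"]


def follows_sequence(observed, expected=EXPECTED):
    """True if the expected events occur, in order, within the observed trace.

    Other events (for example a startResponse) may be interleaved.
    """
    remaining = iter(observed)
    # "name in remaining" consumes the iterator up to the first match,
    # so each expected event must appear after the previous one.
    return all(name in remaining for name in expected)


good_trace = ["prepareRequest", "prepareResponse",
              "startRequest", "startResponse", "doneNotification"]
bad_trace = ["startRequest", "prepareRequest",
             "prepareResponse", "doneNotification"]
```

Here `follows_sequence(good_trace)` holds because the startResponse is simply skipped over, while `follows_sequence(bad_trace)` does not because the startRequest precedes the preparation step.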

The use case is a form-filling application which accepts handwriting input provided by the user on the form fields. The inputs are recognized and displayed back as text in the corresponding fields. An ink capture modality may be used to capture the ink and send it to the IM for recognition. The communication between the ink capture modality and the IM is not covered here for the sake of brevity. The following section explains the details of the communication between the MMI Runtime Framework (RTF) of the IM and the ink recognition modality.

Table 3: Component behavior of Ink modality with respect to modality component rules.
Rule | Component Information
Rule 1: Each modality component must implement all of the MMI life-cycle events. | See Table 4 for the details of the implementation of the life-cycle events.
Rule 2: Identify other functions of the modality component that are relevant to the interaction manager. | All the functions of the component are covered in the life-cycle events; no other functions are needed.
Rule 3: If the component uses media, specify the media format. | The component uses W3C InkML format to represent handwriting data (digital ink).
Rule 4: Specify protocols supported by the component for transmitting media (e.g. SIP). | The component uses HTTP for transmitting media. Other standard protocols such as TCP may also be explored.
Rule 5: Specify supported human languages. | Virtually any human language script can be supported based on the HWR component capability.
Rule 6: Specify supporting languages required by the component. | W3C InkML for representing the handwriting data.
Rule 7: Modality components sending data to the interaction manager must use the EMMA format. | This component uses EMMA.

Table 4: Component behavior of handwriting recognition for each life-cycle event.
Life Cycle Event | Component Implementation
newContextRequest | (Standard) The component requests a new context from the IM.
newContextResponse | (Standard) The component starts a new context and assigns the new context id to it.
prepareRequest | The component prepares resources to be used in recognition. Based on the 'script' parameter, it first selects an appropriate recognizer. It also configures the recognizer with other parameters, such as the recognition confidence threshold and the limit on the size of the n-best results to be returned, when available.
prepareResponse | (Standard) If the component fails to find a recognizer matching the requested language script, a relevant error message is returned in the <statusInfo> element.
startRequest | The component performs recognition of the handwriting input.
startResponse | (Standard) The status of recognition as "success" or "failure" is returned in the <statusInfo> element.
doneNotification | Recognition results in EMMA format are reported in the "data" field. The mode is "ink", the medium is "tactile", the function is "transcription", and verbal is "true".
cancelRequest | This component stops processing when it receives a "cancelRequest". It always performs a hard stop irrespective of the IM request.
cancelResponse | (Standard)
pauseRequest | This component cannot pause.
pauseResponse | <statusInfo> field is "cannot pause".
resumeRequest | This component cannot resume.
resumeResponse | <statusInfo> field is "cannot resume".
extensionNotification | This component does not use "extensionNotification". It ignores any "extensionNotification" events that are sent to it by the IM.
clearContextRequest | (Standard)
clearContextResponse | (Standard)
statusRequest | (Standard)
statusResponse | The component returns a standard life-cycle response. The "automaticUpdate" attribute is "false", because this component does not supply automatic updates.

Note: "(Standard)" means that the component does not do anything over and above the actions specified by the MMI Architecture.

G.7.2 Event Syntax

G.7.2.1 Examples of events for preparing the component

The IM sends a prepareRequest event to the ink recognition component. The ink recognition component selects an appropriate recognizer that matches the given language script, which in this example is set to "English_Lowercase". The "RecoGrammar.xml" grammar file contains constraints that aid the recognizer. The confidence threshold of the component is set to .7 and the IM requests a maximum of five possible matches. Depending on the capability of the recognizer, other parameters, such as a 'user profile' that contains user-specific information, can also be provided.


<mmi:mmi version="1.0" xmlns:mmi="http://www.w3.org/2008/04/mmi-arch">
  <mmi:prepareRequest source="uri:RTFURI" context="URI-1" requestID="request-1">
    <mmi:data>
      <ink-recognition-parameters grammar="RecoGrammar.xml" threshold=".7" script="English_Lowercase" max-nbest="5"/>
    </mmi:data>
  </mmi:prepareRequest>
</mmi:mmi>

As part of support for the life cycle events, a modality component is required to respond to a prepareRequest event with a prepareResponse event. Here's an example of a prepareResponse from the ink recognition component to the IM informing the IM that the ink recognition component has successfully initialized.


<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
  <mmi:prepareResponse source="uri:inkRecognizerURI" context="URI-1" requestID="request-1" status="success"/>
</mmi:mmi>

Here's an example of a prepareResponse event from the ink recognition component to the IM in the case of failure, with an example failure message. In this case the failure message indicates that the language script is not supported.


<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
  <mmi:prepareResponse source="uri:inkRecognizerURI" context="URI-1" requestID="request-1" status="failure">
    <mmi:statusInfo>
      Given language script not supported
    </mmi:statusInfo>
  </mmi:prepareResponse>
</mmi:mmi>
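The component-side handling of prepareRequest described above (select a recognizer from the 'script' parameter, then report success or failure) might look roughly like the following Python sketch. The `SUPPORTED_SCRIPTS` set, the function name `handle_prepare`, and the example script name "Klingon" are hypothetical; only the element names, attribute names, and the failure message come from the examples in this section.

```python
import xml.etree.ElementTree as ET

MMI_NS = "http://www.w3.org/2008/04/mmi-arch"
NS = {"mmi": MMI_NS}
ET.register_namespace("mmi", MMI_NS)

# Hypothetical set of language scripts for which a recognizer is installed.
SUPPORTED_SCRIPTS = {"English_Lowercase", "English_Uppercase"}


def handle_prepare(xml_text, own_uri="uri:inkRecognizerURI"):
    """Answer a prepareRequest with a prepareResponse (illustrative only)."""
    root = ET.fromstring(xml_text)
    request = root.find("mmi:prepareRequest", NS)
    params = request.find("mmi:data/ink-recognition-parameters", NS)
    ok = params is not None and params.get("script") in SUPPORTED_SCRIPTS

    reply_root = ET.Element("{%s}mmi" % MMI_NS, {"version": "1.0"})
    response = ET.SubElement(reply_root, "{%s}prepareResponse" % MMI_NS, {
        "source": own_uri,
        "context": request.get("context"),
        "requestID": request.get("requestID"),
        "status": "success" if ok else "failure",
    })
    if not ok:
        # Explain the failure in <statusInfo>, as in the example above.
        info = ET.SubElement(response, "{%s}statusInfo" % MMI_NS)
        info.text = "Given language script not supported"
    return ET.tostring(reply_root, encoding="unicode")


def make_request(script):
    """Build a prepareRequest for the given script (test fixture)."""
    return (
        '<mmi:mmi xmlns:mmi="%s" version="1.0">'
        '<mmi:prepareRequest source="uri:RTFURI" context="URI-1" requestID="request-1">'
        '<mmi:data><ink-recognition-parameters grammar="RecoGrammar.xml" '
        'threshold=".7" script="%s" max-nbest="5"/></mmi:data>'
        '</mmi:prepareRequest></mmi:mmi>' % (MMI_NS, script)
    )


accepted = handle_prepare(make_request("English_Lowercase"))
rejected = handle_prepare(make_request("Klingon"))
```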
G.7.2.2 Examples of events for starting the component

To start the component and recognize the handwriting data, a startRequest event from the IM to the ink recognition component is sent. The data field of the event contains InkML representation of the ink data.

Along with the ink, additional information such as the reference coordinate system and the capture device's resolution may also be provided in the InkML data. The example below shows that the ink strokes have X and Y channels and the ink has been captured at a resolution of 1000 DPI. The example ink data contains strokes of the Japanese character "手" (te) which means "hand".


<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
  <mmi:startRequest source="uri:RTFURI" context="URI-1" requestID="request-1">
    <mmi:data>
      <ink:ink xmlns:ink="http://www.w3.org/2003/InkML">
        <ink:definitions>
          <ink:context id="device1Context">
            <ink:traceFormat id="strokeFormat">
              <ink:channel name="X" type="decimal">
                <ink:channelProperty name="resolution" value="1000" units="1/in"/>
              </ink:channel>
              <ink:channel name="Y" type="decimal">
                <ink:channelProperty name="resolution" value="1000" units="1/in"/>
              </ink:channel>
            </ink:traceFormat>
          </ink:context>
        </ink:definitions>
       <ink:traceGroup contextRef="#device1Context">
        <ink:trace>
          106 81, 105 82, 104 84, 103 85, 101 88, 100 90, 99 91, 97 97,
          89 105, 88 107, 87 109, 86 110, 84 111, 84 112, 82 113, 78 117,
          74 121, 72 122, 70 123, 68 125, 67 125, 66 126, 65 126, 63 127,
          57 129, 53 133, 47 135, 46 136, 45 136, 44 137, 43 137, 43 137
        </ink:trace>
        <ink:trace>
          28 165, 29 165, 31 165, 33 165, 35 164, 37 164, 38 164, 40 163,
          42 163, 45 163, 49 162, 51 162, 53 162, 56 162, 58 162, 64 160,
          69 160, 71 159, 74 159, 76 159, 78 159, 86 157, 91 157, 95 157,
          96 157, 99 157, 101 157, 103 157, 109 155, 111 155, 114 155,
          116 155, 119 155, 121 154, 124 154, 126 154, 127 154, 129 154,
          131 154, 134 153, 135 153, 136 153, 137 153, 138 153, 139 153,
          140 153, 141 153, 142 153, 143 153, 144 153, 145 153, 145 153  
        </ink:trace>
        <ink:trace>
          10 218, 12 218, 14 218, 20 216, 25 216, 28 216, 31 216, 34 216,
          37 216, 45 216, 53 216, 58 215, 60 215, 63 215, 68 215, 72 215,
          74 215, 77 215, 85 212, 88 212, 94 210, 100 208, 105 208, 107 208,
          109 208, 110 208, 111 207, 114 207, 115 207, 119 207, 121 207,
          123 207, 124 207, 128 206, 130 205, 131 205, 134 205, 136 205,
          137 205, 138 205, 139 204, 140 204, 141 204, 142 204, 143 204,
          144 204, 145 204, 146 204, 147 204, 148 204, 149 204, 150 204,
          151 203, 152 203, 153 203, 154 203, 155 203, 156 203, 158 203,
          159 202, 160 202, 161 202, 162 202, 163 202, 164 202, 165 202,
          166 202, 167 202, 168 202, 169 202, 170 202, 171 202, 172 202,
          173 202, 173 201, 173 201
        </ink:trace>
        <ink:trace>
          78 128, 78 127, 79 127, 79 128, 80 129, 80 130, 81 132, 82 133,
          82 134, 83 135, 84 137, 85 139, 86 141, 87 142, 88 144, 89 146,
          94 152, 95 153, 96 155, 98 160, 99 162, 100 165, 101 167, 101 169,
          102 173, 102 176, 102 181, 102 183, 102 185, 102 186, 104 192,
          104 195, 104 197, 104 199, 104 201, 104 203, 104 205, 104 206,
          104 207, 104 208, 104 209, 104 210, 104 211, 104 213, 104 214,
          104 215, 104 216, 104 217, 104 218, 104 220, 103 222, 102 223,
          102 224, 102 223, 102 224, 103 225, 103 228, 103 229, 103 230,
          103 231, 103 232, 103 233, 103 236, 103 239, 103 242, 103 243,
          103 247, 103 248, 102 249, 102 250, 102 251, 101 251, 100 253,
          99 255, 99 256, 98 257, 97 258, 97 259, 96 260, 96 261, 95 262,
          95 263, 94 264, 94 265, 93 266, 93 267, 92 268, 91 269, 91 270,
          90 271, 90 272, 89 273, 89 274, 88 275, 88 276, 87 276, 87 277,
          86 277, 86 278, 85 279, 85 280, 84 281, 83 282, 82 284, 82 285,
          81 285, 80 286, 79 287, 78 288, 77 288, 77 289, 76 290, 75 290,
          75 291, 74 291, 74 290, 74 289, 74 288, 74 287, 73 287, 73 286,
          73 285, 72 284, 72 281, 71 280, 70 279, 70 278, 69 277, 68 276,
          67 275, 65 274, 62 272, 60 271, 59 271, 58 270, 57 270, 56 269,
          55 268, 54 268, 53 267, 52 267, 51 267, 49 267, 48 267, 48 266,
          48 266  
        </ink:trace>
       </ink:traceGroup>
      </ink:ink>
    </mmi:data>
  </mmi:startRequest>
</mmi:mmi>
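A recognizer receiving this event first has to turn the trace text into point lists. The Python sketch below does this for the simple encoding used in the example, where each sample is an explicit "X Y" pair and samples are separated by commas. InkML permits other trace encodings that this sketch does not handle, and the function name `parse_traces` is an assumption for illustration.

```python
import xml.etree.ElementTree as ET

INK_NS = "http://www.w3.org/2003/InkML"


def parse_traces(ink_xml):
    """Return one list of (x, y) tuples per <ink:trace> element.

    Assumes explicit values in X, Y channel order, as in the example above.
    """
    root = ET.fromstring(ink_xml)
    strokes = []
    for trace in root.iter("{%s}trace" % INK_NS):
        points = []
        for sample in (trace.text or "").split(","):
            if not sample.strip():
                continue  # tolerate a trailing comma or an empty trace
            x, y = sample.split()
            points.append((float(x), float(y)))
        strokes.append(points)
    return strokes


ink = """
<ink:ink xmlns:ink="http://www.w3.org/2003/InkML">
  <ink:traceGroup>
    <ink:trace>
      106 81, 105 82, 104 84
    </ink:trace>
    <ink:trace>
      28 165, 29 165
    </ink:trace>
  </ink:traceGroup>
</ink:ink>
"""
strokes = parse_traces(ink)
```

Because `root.iter` searches the whole tree, the same function works whether it is given the bare InkML document or the complete startRequest event that wraps it.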

As part of support for the life cycle events, a modality component is required to respond to a startRequest event with a startResponse event. Here's an example of a startResponse from the ink recognition component to the IM informing the IM that the ink recognition component has successfully started.


<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
  <mmi:startResponse source="uri:inkRecognizerURI" context="URI-1" requestID="request-1" status="success"/>
</mmi:mmi>

Here's an example of a startResponse event from the ink recognition component to the IM in the case of failure, with an example failure message. In this case the failure message indicates that the recognition failed due to invalid data format of the handwriting data.


<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
  <mmi:startResponse source="uri:inkRecognizerURI" context="URI-1" requestID="request-1" status="failure">
    <mmi:statusInfo>
      Invalid data format
    </mmi:statusInfo>
  </mmi:startResponse>
</mmi:mmi>
G.7.2.3 Example output event

Here's an example of an output event, sent from the ink recognition component to the IM, using EMMA to represent the recognition results. Two results with different confidences are returned.


<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
  <mmi:doneNotification source="uri:inkRecognizerURI" context="URI-1" status="success" requestID="request-1">
    <mmi:data>
      <emma:emma version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma">
        <emma:one-of emma:medium="tactile" emma:verbal="true"
         emma:mode="ink" emma:function="transcription">
          <emma:interpretation id="int1" emma:confidence=".8">
            <text> 手 </text>
          </emma:interpretation>
          <emma:interpretation id="int2" emma:confidence=".7">
           <text> 于 </text>
          </emma:interpretation>
        </emma:one-of>
      </emma:emma>
    </mmi:data>
  </mmi:doneNotification>
</mmi:mmi>

This is an example of EMMA output in the case where the recognizer is unable to find a suitable match to the input handwriting. The EMMA output contains an empty interpretation result.


<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">
  <mmi:doneNotification source="uri:inkRecognizerURI" context="URI-1" status="success" requestID="request-1" >
    <mmi:data>
      <emma:emma version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma">
        <emma:interpretation id="int1" emma:confidence="0.0"
         emma:medium="tactile" emma:verbal="true" emma:mode="ink"
         emma:function="transcription" emma:uninterpreted="true"/>
      </emma:emma>
    </mmi:data>
  </mmi:doneNotification>
</mmi:mmi>
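In the form-filling use case, the application needs either the best recognized text or an indication that nothing was recognized. The following Python sketch covers both of the EMMA outputs shown above. The function name `best_text` and the decision to return `None` for an uninterpreted result are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

EMMA_NS = "http://www.w3.org/2003/04/emma"
INTERPRETATION = "{%s}interpretation" % EMMA_NS
CONFIDENCE = "{%s}confidence" % EMMA_NS
UNINTERPRETED = "{%s}uninterpreted" % EMMA_NS


def best_text(xml_text):
    """Return the highest-confidence transcription, or None if there is none."""
    root = ET.fromstring(xml_text)
    best = None
    for interp in root.iter(INTERPRETATION):
        if interp.get(UNINTERPRETED) == "true":
            continue  # the recognizer found no suitable match
        confidence = float(interp.get(CONFIDENCE, "0"))
        text = (interp.findtext("text") or "").strip()
        if best is None or confidence > best[0]:
            best = (confidence, text)
    return None if best is None else best[1]


HEAD = ('<mmi:mmi xmlns:mmi="http://www.w3.org/2008/04/mmi-arch" version="1.0">'
        '<mmi:doneNotification source="uri:inkRecognizerURI" context="URI-1" '
        'status="success" requestID="request-1"><mmi:data>'
        '<emma:emma version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma">')
TAIL = '</emma:emma></mmi:data></mmi:doneNotification></mmi:mmi>'

recognized = (HEAD +
              '<emma:one-of emma:medium="tactile" emma:verbal="true" '
              'emma:mode="ink" emma:function="transcription">'
              '<emma:interpretation id="int1" emma:confidence=".8">'
              '<text> 手 </text></emma:interpretation>'
              '<emma:interpretation id="int2" emma:confidence=".7">'
              '<text> 于 </text></emma:interpretation>'
              '</emma:one-of>' + TAIL)
unrecognized = (HEAD +
                '<emma:interpretation id="int1" emma:confidence="0.0" '
                'emma:medium="tactile" emma:verbal="true" emma:mode="ink" '
                'emma:function="transcription" emma:uninterpreted="true"/>' + TAIL)

field_value = best_text(recognized)
empty_value = best_text(unrecognized)
```

A form-filling application could leave the field unchanged, or prompt the user to write again, when the result is `None`.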

H References

CDF
"Compound Document by Reference Framework 1.0", Timur Mehrvarz et al., editors. World Wide Web Consortium, 2006.
CCXML
"Voice Browser Call Control: CCXML Version 1.0", R.J. Auburn, editor. World Wide Web Consortium, 2005.
DCCI
"Delivery Context Interfaces (DCCI) Accessing Static and Dynamic Properties", Keith Waters, Rafah Hosn, Dave Raggett, Sailesh Sathish, and Matt Womer, editors. World Wide Web Consortium, 2004.
EMMA
"Extensible multimodal Annotation markup language (EMMA)", Michael Johnston et al., editors. World Wide Web Consortium, 2005. EMMA is an XML format for annotating application specific interpretations of user input with information such as confidence scores, time stamps, input modality and alternative recognition hypotheses.
Galaxy
"Galaxy Communicator". Galaxy Communicator is an open source hub and spoke architecture for constructing dialogue systems that was developed with funding from Defense Advanced Research Projects Agency (DARPA) of the United States Government.
MMIF
"W3C Multimodal Interaction Framework", James A. Larson, T.V. Raman and Dave Raggett, editors. World Wide Web Consortium, 2003.
MMIUse
"W3C Multimodal Interaction Use Cases", Emily Candell and Dave Raggett, editors. World Wide Web Consortium, 2002.
RFC2616
"Hypertext Transfer Protocol -- HTTP/1.1", R. Fielding et al., editors. IETF, 1999.
SCXML
"State Chart XML (SCXML): State Machine Notation for Control Abstraction", Jim Barnett et al., editors. World Wide Web Consortium, 2006.
SMIL
"Synchronized Multimedia Integration Language (SMIL 2.1)", Dick Bulterman et al., editors. World Wide Web Consortium, 2005.
SVG
"Scalable Vector Graphics (SVG) 1.1 Specification", Jon Ferraiolo et al., editors. World Wide Web Consortium, 2003.
VoiceXML
"Voice Extensible Markup Language (VoiceXML) Version 2.0", Scott McGlashan et al., editors. World Wide Web Consortium, 2004.
XHTML
"XHTML 1.0 The Extensible HyperText Markup Language (Second Edition)", Steven Pemberton et al., editors. World Wide Web Consortium, 2004.
XMLSig
"XML-Signature Syntax and Processing", Eastlake et al., editors. World Wide Web Consortium, 2001.