Multimodal Interaction Working Group Charter

The mission of the Multimodal Interaction Working Group, part of the Multimodal Interaction Activity, is to develop open standards that enable the following vision:

Summary Table

End date 31 January 2009
Confidentiality Proceedings are Member-only, but the group sends regular summaries of ongoing work to the public mailing list.
Initial Chairs Deborah Dahl
Initial Team Contacts Kazuyuki Ashimura (FTE %: 40)
Usual Meeting Schedule Teleconferences: Weekly
Face-to-face: 3-4 per year


The primary goal of this Charter is to develop W3C Recommendations that enable multimodal interaction with mobile phones and other devices with limited resources. For rapid adoption on a global scale, it should be possible to add simple multimodal capabilities to existing markup languages in a way that is backwards compatible with widely deployed devices, and which builds upon widespread familiarity with existing Web technologies. The standards should be scalable to enable richer capabilities for subsequent generations of multimodal devices.

Users will be able to provide input via speech, handwriting, or keystrokes, with output presented via displays, pre-recorded and synthetic speech, audio, and tactile mechanisms such as mobile phone vibrators and Braille strips. Application developers will be able to provide an effective user interface for whichever modes the user selects. To encourage rapid adoption, the same content can be designed for use on both old and new devices. People with new phones will get to experience the multimodal capabilities, while users with old phones will get to use the keypad and/or stylus in the same way as now.

The specifications developed by the Multimodal Interaction Working Group under this charter must fall within the scope defined in section 3, and should be implementable on a royalty-free basis, see section 9.

The target audience of the Multimodal Interaction Working Group includes a range of organizations across industry sectors, such as:

Multimodal interfaces on mobile devices
Multimodal applications are of particular interest for mobile devices. Speech offers a welcome means to interact with smaller devices, allowing one-handed and hands-free operation. Users benefit from being able to choose which modalities they find convenient in any situation. The Working Group should be of interest to companies developing smart phones and personal digital assistants or who are interested in providing tools and technology to support the delivery of multimodal services to such devices.
Automotive Telematics
With the emergence of dashboard integrated high resolution color displays for navigation, communication and entertainment services, W3C's work on open standards for multimodal interaction should be of interest to companies working on developing the next generation of in-car systems.
Multimodal interfaces in the office
Multimodal has benefits for desktops, wall mounted interactive displays, multi-function copiers, and other office equipment, offering a richer user experience and the chance to use speech and pens as alternatives to the mouse and keyboard. W3C's standardization work in this area should be of interest to companies developing client software and application authoring technologies, and who wish to ensure that the resulting standards live up to their needs.
Multimodal interfaces in the home
In addition to desktop access to the Web, multimodal interfaces are expected to add value to remote control of home entertainment systems, as well as finding a role for other systems around the home. Companies involved in developing embedded systems and consumer electronics should be interested in W3C's work on multimodal interaction.

The Working Group's initial focus was on use cases and requirements. This led to the publication of the W3C Multimodal Interaction Framework, and in turn to work on Extensible Multi-Modal Annotations (EMMA), and InkML, an XML language for ink traces. The Working Group has also worked on integration of composite multimodal input; dynamic adaptation to the user, device, and environmental conditions; modality component interfaces; and a study of current approaches to interaction management. The Natural Language Semantics Markup Language for the Speech Interface Framework is now obsolete, having been replaced by work on EMMA.

The Working Group is now being re-chartered for a further two years.


All work items carried out under this Charter must fall within the scope defined by this section.

The aim is to develop W3C Recommendations that will enable the widespread deployment of multimodal applications on mobile phones and other devices, and to complete existing work on EMMA and InkML.

Multimodal Architecture and Interfaces

Future Web applications will allow developers to define applications in terms of markup of their own choosing, with the means to define the corresponding runtime behavior in terms of scriptable objects and shadow markup, such as SVG for visualization.

To assist with realizing this goal, the Multimodal Interaction Working Group is tasked with providing a loosely coupled architecture for multimodal user interfaces, which allows for co-resident and distributed implementations, and focuses on the role of markup and scripting, and the use of well defined interfaces between its constituents. The framework is motivated by several basic design goals:

  • Encapsulation. The architecture should make no assumptions about the internal implementation of components, which will be treated as black boxes.
  • Distribution. The architecture should support both distributed and co-hosted implementations.
  • Extensibility. The architecture should facilitate the integration of new modality components. For example, given an existing implementation with voice and graphics components, it should be possible to add a new component (such as a biometric security component) without modifying the existing components.
  • Recursiveness. The architecture should allow for nesting, so that an instance of the framework consisting of several components can be packaged up to appear as a single component to a higher-level instance of the architecture.
  • Modularity. The architecture should provide for the separation of data, control, and presentation.
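As a hypothetical illustration of the interfaces these design goals imply, an interaction manager might control a black-box modality component by exchanging life cycle event messages. The element names, attributes, and namespace below are drawn from working drafts of the Multimodal Architecture and are illustrative only, not normative:

```xml
<!-- Hypothetical sketch: the interaction manager asks a voice modality
     component to start rendering a prompt. Attribute values (Context,
     Source, Target, RequestID, href) are invented sample data. -->
<mmi:StartRequest xmlns:mmi="http://www.w3.org/2008/04/mmi-arch"
                  Context="ctx-1"
                  Source="interactionManager"
                  Target="voiceComponent"
                  RequestID="req-42">
  <mmi:ContentURL href="prompts/welcome.vxml"/>
</mmi:StartRequest>
```

Because each component is treated as a black box behind such messages, the same exchange works whether the component is co-resident or running on a remote server.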

Where practical this should leverage existing W3C work. We expect to reach Recommendation status by the end of the charter.

Multimodal Authoring

This task (1) validates that the Multimodal Architecture and Interfaces and associated development languages can be used to create multimodal user interfaces, and (2) suggests standard approaches, techniques, and principles for developing multimodal user interfaces on mobile phones in particular, as well as on other devices. The approach must allow simple multimodal capabilities to be added to existing markup languages in a way that is backwards compatible with widely deployed mobile devices, and which builds upon widespread familiarity with existing Web technologies. The standards should be scalable to enable richer capabilities on subsequent generations of multimodal devices. This involves the separation of the user interface from the application, to enable different user interfaces according to the user's preferences and the capabilities available to the device. Work is expected on:

  • collaboration with other W3C Working Groups on enabling application developers to provide user interface skins for whichever modes of interaction are selected.
  • support for effective user interfaces for various modes of interaction, in terms of contextual prompts, constrained text input, and declarative event handlers, taking account of uncertainties in user input.
  • the re-use of existing markup languages for prompts and constraints on user input.
  • the use of scripts to enable the customization of the user interface based upon previous user input.

This is not expected to be a Recommendation-track document but a Working Group Note. We may fold it into an informative appendix of the Multimodal Architecture and Interfaces specification. The first draft is expected in June 2007 and the second draft in December 2007.

The Working Group will investigate and recommend how various W3C languages can be extended for use in a multimodal environment using the multimodal life cycle events. We may prepare W3C Notes on how the following languages can participate in multimodal applications by incorporating the life cycle events from the multimodal architecture: XHTML, VoiceXML, MathML, SMIL, SVG, InkML, and other languages that can be used in a multimodal environment. The Working Group is also interested in investigating how CSS and the Delivery Context Interfaces (DCI) can be used to support multimodal interaction applications, and, if appropriate, may write a W3C Note.

Extensible Multi-Modal Annotations (EMMA)

EMMA is being developed as a data exchange format for the interface between input processors and interaction management systems. It will define the means for recognizers to annotate application-specific data with information such as confidence scores, time stamps, input mode (e.g. key strokes, speech, or pen), alternative recognition hypotheses, and partial recognition results. EMMA is a target data format for the semantic interpretation specification being developed in the W3C Voice Browser Activity, which describes annotations to speech grammars for extracting application-specific data as a result of speech recognition. EMMA supersedes earlier work on the Natural Language Semantics Markup Language in the Voice Browser Activity.
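As an illustrative sketch based on EMMA working drafts, a speech recognizer might wrap its interpretation of a spoken travel request as follows. The application-specific payload elements (origin, destination) and all attribute values are invented sample data; the emma-prefixed elements and annotations follow the draft specification:

```xml
<emma:emma version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma">
  <!-- One recognition hypothesis, annotated with mode, confidence,
       timestamps, and the recognized token string -->
  <emma:interpretation id="int1"
      emma:medium="acoustic" emma:mode="voice"
      emma:confidence="0.75"
      emma:start="1087995961542" emma:end="1087995963542"
      emma:tokens="flights from boston to denver">
    <origin>Boston</origin>
    <destination>Denver</destination>
  </emma:interpretation>
</emma:emma>
```

An interaction manager consuming this document can inspect the annotations (for example, rejecting hypotheses below a confidence threshold) without needing to understand the application-specific payload.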

InkML - an XML language for ink traces

InkML provides a range of features to support real-time ink streaming, multi-party interactions, and annotated ink archival. Applications may make use of as much or as little information as required, from minimalist applications using only simple traces to more complex problems, such as signature verification or calligraphic animation, requiring full dynamic information. As a platform-neutral format for digital ink, InkML can support collaborative or distributed applications in heterogeneous environments, ranging from courier signature verification to distance education. The specification is the product of four years of work by a cross-sector working group with input from Apple, Corel, HP, IBM, and Motorola, as well as invited experts from academia and other sources.
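For illustration, a minimal InkML document capturing a single pen stroke as a sequence of X/Y coordinate pairs might look like the following; the coordinate values are invented sample data:

```xml
<ink xmlns="http://www.w3.org/2003/InkML">
  <!-- One pen stroke: "X Y" coordinate pairs, separated by commas -->
  <trace>
    10 0, 9 14, 8 28, 7 42, 6 56, 6 70, 8 84, 8 98, 8 112
  </trace>
</ink>
```

Richer applications can add timing and pressure channels to each trace, which is what enables the "full dynamic information" uses such as signature verification mentioned above.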

Maintenance work

Following the publication of EMMA and InkML as Recommendations, the Working Group will maintain the specifications; that is, it will respond to questions and requests on the public mailing list and issue errata as needed. The Working Group will also consider publishing additional versions of the specifications, depending on such factors as feedback from the user community and any requirements generated for EMMA and InkML by the Multimodal Architecture and Interfaces work and the Multimodal Authoring work.

Success Criteria


The following documents are expected to become W3C Recommendations:

The following document is a note, which may be folded into an informative appendix of the Multimodal Architecture and Interfaces specification:

The following notes may be revised depending upon the interest of the working group members:


This Working Group is chartered to last until 31 January 2009. The first face to face meeting after re-chartering will be held in May or June 2007.

Here is a list of milestones identified at the time of re-chartering. Others may be added later at the discretion of the Working Group. The dates are for guidance only and subject to change.

Note: The group will document significant changes from this initial schedule on the group home page.
Specification                          | FPWD      | LC            | CR             | PR            | Rec
Multimodal Architecture and Interfaces | Completed | December 2007 | June 2008      | December 2008 | February 2009
EMMA                                   | Completed | Completed     | January 2007   | August 2007   | September 2007
InkML                                  | Completed | Completed     | September 2007 | March 2008    | April 2008


W3C-related activities

These are W3C activities that may be asked to review documents produced by the Multimodal Interaction Working Group, or which may be involved in closer collaboration as appropriate to achieving the goals of the Charter.

External groups

This is an indication of external groups with complementary goals to the Multimodal Interaction activity. W3C has formal liaison agreements with some of them, e.g. OMA and VoiceXML Forum.


To be successful, the Multimodal Interaction Working Group is expected to have 10 or more active participants for its duration. Effective participation in the Multimodal Interaction Working Group is expected to consume one work day per week for each participant, and two days per week for editors. The Multimodal Interaction Working Group will also allocate the necessary resources for building test suites for each specification.

In order to make rapid progress, the Multimodal Interaction Working Group is organized into several subgroups, each working on a separate document. Working Group members may participate in one or more subgroups.

Participants are reminded of the Good Standing requirements of the W3C Process.

To become a participant of the Working Group, a representative of a W3C Member organization must be nominated by their Advisory Committee Representative as described in the W3C Process. The associated IPR disclosure must further satisfy the requirements specified in the W3C Patent Policy (5 February 2004 Version).

Experts from appropriate communities may also be invited to join the working group, following the provisions for this in the W3C Process.

Working Group participants are not obligated to participate in every work item, however the Working Group as a whole is responsible for reviewing and accepting all work items.

Face to face meetings will be arranged 3 to 4 times a year. The Chair will make Working Group meeting dates and locations available to the group in a timely manner according to the W3C Process. The Chair is also responsible for providing publicly accessible summaries of Working Group face to face meetings, which will be announced on www-multimodal@w3.org.


This group primarily conducts its work on the Member-only mailing list w3c-mmi-wg@w3.org (archive). Certain topics need coordination with external groups; the Chair and the Working Group can agree to discuss these topics on a public mailing list. The archived mailing list www-multimodal@w3.org is used for public discussion of W3C proposals from the Multimodal Interaction Working Group and for public feedback on the group's deliverables.

Information about the group (deliverables, participants, face-to-face meetings, teleconferences, etc.) is available from the Multimodal Interaction Working Group home page.

All proceedings of the Working Group (mail archives, telecon minutes, face-to-face minutes) will be available to W3C Members. Summaries of face-to-face meetings will be sent to the public list.

Decision Policy

As explained in the Process Document (section 3.3), this group will seek to make decisions when there is consensus. When the Chair puts a question and observes dissent, after due consideration of different opinions, the Chair should record a decision (possibly after a formal vote) and any objections, and move on.

This charter is written in accordance with Section 3.4, Votes of the W3C Process Document and includes no voting procedures beyond what the Process Document requires.

Patent Policy

This Working Group operates under the W3C Patent Policy (5 February 2004 Version). To promote the widest adoption of Web standards, W3C seeks to issue Recommendations that can be implemented, according to this policy, on a Royalty-Free basis.

For more information about disclosure obligations for this group, please see the W3C Patent Policy Implementation.

About this Charter

This is a draft charter for review by the W3C Membership.

This charter for the Multimodal Interaction Working Group has been created according to section 6.2 of the Process Document. In the event of a conflict between this document or the provisions of any charter and the W3C Process, the W3C Process shall take precedence.

Please also see the previous charter for this group.

Note: This charter was modified on 26 November 2007 to include the informative note in section 4.1 referring readers to the home page of the group for updated milestone information.

Deborah Dahl, Chair, Multimodal Interaction Working Group
Kazuyuki Ashimura, Multimodal Interaction Activity Lead
Dave Raggett, Multimodal Interaction Working Group staff contact

$Date: 2007/11/26 23:18:30 $