Workshop on W3C's Multimodal Architecture and
Interfaces — Summary
On 16-17 November 2007 the W3C Multimodal Interaction Working Group held a
Workshop on W3C's Multimodal Architecture and Interfaces in Fujisawa,
Japan, hosted by W3C/Keio.
The minutes of the workshop are available on the W3C Web server:
http://www.w3.org/2007/08/mmi-arch/minutes.html
There were 25 attendees from the
following organizations:
- ACCESS
- Conversational Technologies
- Deutsche Telekom Laboratories
- T-Systems
- IBM
- INRIA
- KDDI R&D Laboratories
- Kyoto Institute of Technology
- Hewlett-Packard Labs India
- Intervoice Inc.
- Microsoft Windows Division
- Openstream Inc
- Opera Software
- Toyohashi University of Technology
- Polytechnic University
- University of Tampere
- W3C
The motivation of the W3C Multimodal Interaction Working Group for
holding the workshop included:
- There is a great need for multimodal input and output modes today,
especially for hand-held portable devices with small displays and
small or nonexistent keypads.
- Accessibility to the Web must be extended so that users can
dynamically select the most appropriate modes of interaction
depending on:
- their condition
- their environment
- their modality preferences
- A general and flexible framework should be provided to guarantee
application authors interoperability among modality-specific
components from different vendors - for example, speech recognition
from vendor A and handwriting recognition from vendor B.
This workshop was narrowly focused on identifying and prioritizing
requirements for extensions and additions to the MMI Architecture to
better support speech, GUI, Ink and other Modality Components. Topics
discussed during the Workshop included:
- Multimodal application authoring (Modality Component specific
grammar, standard authoring approaches, synchronizing multiple
modalities, error handling, etc.)
- Architecture design (latency of communication, integration with Web
browsers, fusion/fission of data, device capability, higher level
control language, etc.)
- User experience (accessibility, user information, application
context, multiple users, etc.)
- Topics that need further clarification (role of Interaction Manager,
application specific management, direct communication between
modality components, etc.)
Through the workshop we generated a list of issues and requirements
concerning the current MMI Architecture:
http://www.w3.org/2007/08/mmi-arch/topics.html
The major "takeaways" are:
- Multimodal applications use various modalities, including GUI,
speech and handwriting. Some Modality Components,
e.g. kinesthetic sensor input on mobile devices, need
modality-specific grammars for converting user input into concrete events.
- Because all data must be communicated between the
Interaction Manager and the Modality Components, communication
latency may be problematic for real-time applications (see the
sketch after this list).
- Integration of multimodal applications with ordinary Web browsers is
a key question. Integrating them as a browser plug-in might be a
quick first step.
- The capabilities of each handset and the user's preferences should
be available to the Interaction Manager to allow the application to
be adapted to accommodate both the handset and the user.
- The W3C Multimodal Interaction Working Group will review these new
topics and use the list as a guideline for future enhancements to the
MMI Architecture.
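To make the latency concern concrete, here is a minimal, purely
hypothetical sketch in Python (the class, method and event names are
our own illustration, not part of the MMI specification): every user
input is routed from a Modality Component through the Interaction
Manager and back, so each interaction pays at least one full round
trip.

    # Hypothetical sketch, not the W3C MMI API: shows why routing every
    # event through the Interaction Manager adds latency per interaction.
    import time


    class ModalityComponent:
        """Stand-in for e.g. a speech or ink component."""

        def __init__(self, name, processing_delay_s):
            self.name = name
            self.processing_delay_s = processing_delay_s

        def handle_event(self, event):
            # Simulate recognition/rendering work inside the component.
            time.sleep(self.processing_delay_s)
            return {"source": self.name, "echo": event}


    class InteractionManager:
        """Routes all events between components through itself."""

        def __init__(self):
            self.components = {}

        def register(self, component):
            self.components[component.name] = component

        def dispatch(self, target, event):
            # Measure one round trip from the IM to a component and back.
            start = time.perf_counter()
            response = self.components[target].handle_event(event)
            latency_ms = (time.perf_counter() - start) * 1000
            return response, latency_ms


    if __name__ == "__main__":
        im = InteractionManager()
        im.register(ModalityComponent("speech", processing_delay_s=0.05))
        response, latency_ms = im.dispatch("speech", {"type": "userInput"})
        print(f"round trip via Interaction Manager: {latency_ms:.1f} ms")

In a real deployment the Modality Components may run on separate
devices or servers, so network latency is added on every hop as well,
which is the concern raised in the takeaway above.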
The Call for Participation, the Logistics, the Presentation Guideline
and the Agenda are also available on the W3C Web server.
Deborah Dahl
and
Kazuyuki Ashimura,
Workshop Co-chairs