Next steps for W3C work on Multimodal Standards

What is W3C doing with regard to standards for multimodal user interfaces to the Web? This page sets out what has already been done and what W3C plans to do in the future.

What are multimodal user interfaces?

Traditional Web browsers present a visual rendering of Web pages written in HTML, and allow you to interact through the keyboard and a pointing device such as a mouse, trackball, touch pad or stylus. Voice user interfaces, by contrast, present information using a combination of synthetic speech and pre-recorded audio, and allow you to interact via spoken commands or phrases. You may also be able to use touch-tone (DTMF) keypads.

Multimodal user interfaces support multiple modes of interaction:

Electronic ink is the term for information that describes the motion of a stylus in terms of position, velocity and pressure. It can be used for handwriting and gesture recognition.
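As a rough illustration of that description, the sketch below models a single pen stroke as a sequence of timestamped samples. The names `InkSample` and `speeds`, the field layout and the units are assumptions made for this example only; they are not taken from any W3C specification. Velocity is not stored but approximated from successive positions.

```python
from dataclasses import dataclass
from math import hypot
from typing import List


@dataclass
class InkSample:
    """One stylus reading (hypothetical layout, for illustration only)."""
    x: float         # horizontal position, arbitrary device units
    y: float         # vertical position, arbitrary device units
    t: float         # time of the reading, in seconds
    pressure: float  # normalised pen pressure, assumed to lie in 0..1


def speeds(stroke: List[InkSample]) -> List[float]:
    """Approximate stylus speed between each pair of successive samples."""
    result = []
    for a, b in zip(stroke, stroke[1:]):
        dt = b.t - a.t
        distance = hypot(b.x - a.x, b.y - a.y)
        # Guard against duplicate or out-of-order timestamps.
        result.append(distance / dt if dt > 0 else 0.0)
    return result


stroke = [
    InkSample(0.0, 0.0, 0.0, 0.2),
    InkSample(3.0, 4.0, 0.5, 0.6),
    InkSample(3.0, 4.0, 1.0, 0.4),
]
print(speeds(stroke))  # [10.0, 0.0]
```

A handwriting or gesture recogniser would typically consume sequences like `stroke`, using position, derived speed and pressure as input features.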

Here are just a few ideas for ways to exploit multimodal user interfaces:

What has been done already?

The W3C Voice Browser working group published a set of requirements for multimodal interaction in July 2000. The working group also invited participants to demonstrate proof-of-concept examples of multimodal applications. A number of such demonstrations were shown at the working group's face-to-face meeting held in Paris in May 2000.

To get a feeling for future work, the W3C together with the WAP Forum held a joint workshop on the Multimodal Web in Hong Kong on 5-6 September 2000. This workshop addressed the convergence of W3C and WAP standards, and the emerging importance of speech recognition and synthesis for the Mobile Web. The workshop's recommendations encouraged W3C to set up a multimodal working group to develop standards for multimodal user interfaces for the Web.

Why isn't multimodal being addressed in the Voice Browser working group?

Although the Voice Browser working group developed requirements for multimodal interaction, the pressure of work on spoken dialogs and related specifications has made it impractical to devote time to further work on multimodal standards. As a result, W3C now expects to create a new multimodal working group later this year.

Next steps

To ensure that the new multimodal working group can act swiftly to fulfil commercial requirements, W3C member organizations are invited to submit detailed proposals to W3C for the markup language and synchronization protocols needed to support multimodal interaction. Submissions should consider the following points:

W3C Members are encouraged to collaborate on proposals, as this will make it easier to ascertain broad industry support. In late summer 2001, a charter for a multimodal working group will be drawn up based upon the proposals that attract the broadest industry backing.

Some ideas that have been suggested include:

Information on how to make a submission to W3C can be found here.


Dave Raggett <dsr@w3.org>, Voice Browser Activity lead, February 2001