Discovery for Multimodal Interaction with Multi-Device Systems

Multi-device systems include a variety of devices, where each device provides services that make the best use of its unique capabilities. A smartphone is more convenient than a large monitor for a touch user interface, but a large monitor works better for displaying information to a group of people. Why not take advantage of what each does best, and control the large display through the smartphone?

But an application that uses multiple devices needs a way to find them. Capabilities that are local to a device are easy to find – a developer can just tell the application to start using the microphone when it needs to capture speech input. Permanent multi-device configurations, like a television and its remote controls, can also be configured at development time. But what about ad hoc multi-device systems where the relationship between devices and their services is dynamic?

The Multimodal Interaction Working Group has just published a note that starts to address this issue (Registration & Discovery of Multimodal Modality Components) by putting together some use cases and requirements from the point of view of multimodal interaction.

For example:

1. Smart Cars: Car functions could be managed by other devices, allowing the user to control and personalize the car’s various adjustments — seat, navigation, radio, or temperature settings — by using their mobile device, storing their preferences in the cloud or on the device.

2. Intelligent Conference Rooms: Conference rooms are full of sophisticated technology: projectors, audio systems, interactive whiteboards, and so on. We’ve all experienced too many frustrating interruptions when a new presenter has to attach their laptop to the projector, or a remote participant tries to use the video system. A standard way for participants’ devices to discover and use the conference room’s services could avoid a lot of frustration.

3. Health Notifiers: There are many kinds of medical sensors, and new ones are constantly showing up. Every sensor has its own user interface, which may be very complex. In many cases users have to undergo special training to operate the device and interpret its displays, which raises the possibility of medical errors caused by user mistakes. However, a multimodal interface on a mobile device could provide standardized options to control sensor operations by voice or gesture (“start reading blood pressure now”). Sophisticated applications could also synthesize information from multiple sensors (for example, blood pressure and heart rate) and transmit a notification to the doctor in case of an unusual combination of readings.

4. Smart Home: Home appliances, entertainment equipment, thermostats, and other intelligent home devices can be controlled and monitored through a user’s own device, either at home or remotely. Adult children could monitor the systems at an elderly parent’s home, for example, to make sure that the temperature is not getting too high or too low.

5. Public Spaces: Users would be able to access public information such as a map of a shopping mall, information about an historical site, or information about the exhibits in a museum, through their own mobile devices. Public multi-device/multi-user applications could also be used just for fun, for example, for musical interaction or for controlling a lighting display.
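To make the idea of ad hoc registration and discovery a little more concrete, here is a minimal sketch of a conference-room scenario. It is only an illustration under stated assumptions: the `Component` and `Registry` names, the capability strings, and the in-memory design are all hypothetical and are not taken from the note. A real system would have to discover devices over a network rather than in a single process.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """Hypothetical description of one device and the services it offers."""
    name: str
    capabilities: frozenset


class Registry:
    """Hypothetical in-memory registry standing in for a discovery mechanism."""

    def __init__(self):
        self._components = {}

    def register(self, component):
        # A device announces itself when it joins the system.
        self._components[component.name] = component

    def unregister(self, name):
        # A device leaves; unknown names are ignored so leaving twice is harmless.
        self._components.pop(name, None)

    def discover(self, capability):
        # Names of all currently registered components that offer the capability.
        return sorted(
            c.name for c in self._components.values()
            if capability in c.capabilities
        )


registry = Registry()
registry.register(Component("projector", frozenset({"display"})))
registry.register(Component("phone", frozenset({"touch", "speech-input"})))
registry.register(Component("whiteboard", frozenset({"display", "touch"})))

displays = registry.discover("display")        # while the projector is present
registry.unregister("projector")               # the projector is switched off
displays_after = registry.discover("display")  # the result reflects the change
```

The point of the sketch is that the application asks for a capability rather than a specific device, so the answer can change as devices come and go.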

We’re just starting this work, so comments and suggestions for other use cases are very welcome. Please send comments to the multimodal interaction mailing list.