MMI/Use Cases

Template for Use Cases

This is a generic template for proposing use cases.

Submitter(s): company (name), ...

Tracker Issue ID: ISSUE-NN

Category: How would you categorize this issue?

gap in existing specifications (=> IG to draft a proposal for an existing WG), or how does this work in existing architecture?
what implications does this UC have for discovery and registration?
require new specification/WG (=> IG to draft a proposal for W3C Director)
can be addressed as part of a guidelines document to be produced by the IG (=> The proponent should draft some text for the document)

Objects

Functions of Objects
Object	Identify	Track	Accountability
Museum Exhibit	Museum Exhibit ID (exhibit content)	@	@
Visitor	@	@	@
Employee	@	@	@

Actions

change the content depending on the visitors
visitor->museum exhibit
- Functions
  - Identify: visitors (children/matured, accessibility, languages, etc.)
  - Track: visitor near-by (location; patterns of behavior; situation of the visitor, e.g., whether in a dangerous area or not)
  - Accountability: counting visitors

Description:

high level description/overview of the goals of the use case
Schematic Illustration (devices involved, work flows, etc) (Optional)
Implementation Examples (e.g. JS code snippets) (Optional)

Motivation:

Explanation of benefit to ecosystem
Why were you not able to use only existing standards to accomplish this?

Dependencies:

other use cases, proposals or other ongoing standardization activities which this use case is dependent on or related to

What needs to be standardized:

What are the new requirements, i.e. functions that don't exist, generated by this use case?

Comments:

any relevant comment that is good to track related to this use case

Use Cases

UC-1: Synchronizing video stream and HMI (e.g., remote surgery)

Submitter(s): W3C (Kaz Ashimura)

Reviewer(s): Masahiro, Kosuke, Shinya

Tracker Issue ID: ISSUE-NN

Category: 1. gap with existing specification

Description:

The ability to synchronize (1) a video stream using holography and (2) advanced HMI like robot arm for remote surgery.

That would require precise synchronization for realtime interaction. It would also be desirable to have a 3-D display and an HMI that can control the image from any angle through intuitive operation, regardless of the user's knowledge of the system.

A typical example is the smart HMI from the movie "Minority Report".
It is expected to handle various devices provided by multiple vendors.

Motivation:

This is being requested by the TV industry.
It may be possible to achieve this with existing web standards but this should be investigated.

Dependencies:

There are other activities going on in the W3C related to this use case. For example, Multi-Device Community Group, Web and TV Interest Group, Second Screen Working Group and possibly the Web of Things IG as well. It may be the case that these other efforts will be able to support the required functionality, but if there needs to be any MMI work, we should coordinate with the other groups. Also there is a proposal that there could be an HTML5 new features community group, and we should coordinate with that.

Gaps

What can't be done with the existing mechanism?
We need to find a 3-D gesture representation. If one exists, we can use it internally in EMMA application semantics. This needs to be investigated.
we could investigate the Media Fragments API for annotating video
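
As a starting point for the Media Fragments idea above, a temporal fragment can address a time range within a video so that an annotation can point at it. The following is a minimal sketch only: the helper name and the example URL are hypothetical, and it uses the "t=start,end" form (in seconds) of the Media Fragments URI temporal dimension.

```python
def temporal_fragment(video_url: str, start_s: float, end_s: float) -> str:
    """Return video_url with a temporal media fragment (#t=start,end) appended."""
    if start_s < 0 or end_s <= start_s:
        raise ValueError("need 0 <= start < end")
    base = video_url.split("#", 1)[0]  # drop any fragment already on the URL
    return f"{base}#t={start_s:g},{end_s:g}"


# Hypothetical annotation that points at seconds 10 to 20 of a surgery video.
annotation = {
    "target": temporal_fragment("http://example.org/surgery.webm", 10, 20),
    "note": "incision starts",
}
```

Whether such a fragment is precise enough for the realtime synchronization discussed in this use case still needs to be investigated.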

What needs to be standardized:

New features? APIs? data model? language?

Comments:

Anything you want to add

UC-2: Interaction with Robots

Submitter(s): Kaz & Debbie

Reviewer(s): Masahiro, Kosuke, Shinya

Tracker Issue ID: @@@

Category: @@@

Description:

controlling robots (including both industrial robots and personal robots) using multimodal interface like voice and gesture
A typical example is interaction of multiple pet robots, e.g., Pepper, AIBO, Nao (French), Robi (Japanese) and human(s). Interoperable robot interaction?

may use BehaviorML or VRML for the MC
Behavior Tree could be used for IM
possible requirements identified during TPAC breakout session:
- offline operation
- scheduled operation
- realtime operation
- authorization
- discovery and vocabularies
- human to machine
- machine to machine
- privacy
- virtual robot/agent (software) to robot
- asynchronous operation
- messaging standards
- graceful degradation
- should be possible to control other devices through a robot instead of through a smartphone or remote

Motivation:

Interactive and adaptive control is useful in the case of accidents or errors
The number of elderly people will increase in the near future, and they would need help from (or would enjoy interaction with) pet robots.

Dependencies:

List of possible related standardization activities...

Gaps

What can't be done with the existing mechanism?

The current MMI Life Cycle events don't have the concept of scheduling or events scheduled to occur in the future. The simplest approach is to use the ExtensionNotification event, but that's not very standardized. Other ideas would be to use SMIL (but that's mostly for media), MIDI, SCXML, use the EMMA output functionality, or introduce a new attribute into the Life Cycle events. In addition to starting at a certain time, scheduling also introduces the idea of recurring events, for example, every day at 5:00.
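
To make the recurring case concrete, the sketch below computes when a daily event (for example, every day at 5:00) is next due. It is only an illustration of the scheduling requirement; the function name is hypothetical and nothing here is part of the current Life Cycle events.

```python
from datetime import datetime, time, timedelta


def next_daily_occurrence(now: datetime, at: time) -> datetime:
    """Return the next moment strictly after `now` whose clock time is `at`."""
    candidate = datetime.combine(now.date(), at)
    if candidate <= now:
        # Today's slot has already passed, so the event recurs tomorrow.
        candidate += timedelta(days=1)
    return candidate
```

A scheduler would call this again after each firing to obtain the following occurrence.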

Options

Data field of ExtensionNotification: This could be used, but the idea of scheduling is very generic and should be standardized.
SMIL: SMIL is used for presenting synchronized media, so it doesn't seem appropriate for scheduling Life Cycle events, which may not be user-facing.
EMMA: The reasons against using the EMMA output functionality are similar to the arguments against using SMIL. EMMA output is designed for output that will be consumed by a user.
MIDI: MIDI is also designed for output to a user.
SCXML: the "delay" and "delayexpr" attributes of "<send>" may be suitable for scheduling.
Add a new common field or fields to the Life Cycle events.
MC's could also have an internal scheduling capability, independent of the IM
EMMA 2.0 output could also be responsible for scheduling, using "emma:planned-output", perhaps using SMIL, should look at starting relative to other events
could also use EMMA 2.0 for scheduling interactions that don't directly affect UI (scheduling DVR, for example)

The choice between options 5 (SCXML) and 6 (a new common field in the Life Cycle events) depends at least in part on whether the IM or the MC is responsible for the scheduled action. If the IM is responsible for scheduling, SCXML "delay" would be more appropriate, because the IM would keep track of the schedule and send the Life Cycle events at the correct times. On the other hand, if the MC is responsible, it can be sent, for example, a StartRequest with a "delay", and the MC will be responsible for starting at the correct time. It may be that both are needed. If we want to maintain the independence of the MMI Architecture from SCXML, we can just say the IM (however it's implemented) is responsible for scheduling.
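
The two responsibilities can be contrasted in a small sketch. This is an illustration under stated assumptions, not a proposal: the class and function names are hypothetical, events are plain dicts rather than a standardized serialization, and the "delay" field on StartRequest does not exist in the current Life Cycle events.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Scheduled:
    due: float                          # time (in seconds) at which to send
    event: dict = field(compare=False)  # the Life Cycle event to send


class IMScheduler:
    """IM-side scheduling: the IM holds events and releases them when due."""

    def __init__(self):
        self._queue = []

    def schedule(self, event: dict, now: float, delay: float) -> None:
        heapq.heappush(self._queue, Scheduled(now + delay, event))

    def due_events(self, now: float) -> list:
        """Pop and return every event whose due time has been reached."""
        ready = []
        while self._queue and self._queue[0].due <= now:
            ready.append(heapq.heappop(self._queue).event)
        return ready


def start_request_with_delay(context: str, target: str, delay: float) -> dict:
    """MC-side scheduling: the event is sent at once and the MC waits.

    The "delay" field is hypothetical (the "new common field" option).
    """
    return {"event": "StartRequest", "context": context,
            "target": target, "delay": delay}
```

In the first case the MC never sees the schedule; in the second the IM does not need a timer at all, which is why both may turn out to be useful.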

What needs to be standardized:

New features? APIs? data model? language?

Comments:

Anything you want to add

UC-3: Wearable devices

Submitter(s): Kaz

Reviewer(s): Shinya, Yasuhiro

Tracker Issue ID: @@@

Category: @@@

Description:

Abstract:
- controlling wearable devices (like sphygmomanometer or game devices) using multimodal interface like voice and gesture
  - may get connected with home network when the user comes home
  - may interact with a head-mounted display like Google Glass
  - doctors might want to use
- controlling devices, e.g., music instruments, by eye tracking or brain wave
- e.g.) Eye Play the Piano

Problem:
- There are various wearable devices, but currently the usage of wearable devices depends on identification and personalization using vendor specific account information, e.g., Google account or Apple ID. It also depends on smartphone's connectivity outside home.

Possible technology areas to attack:
- There is a question on how to identify and manage the capability of each wearable device and what kind of interaction could be made between users and devices.
- MMI could be used to identify and manage devices and interactions.
- For that, what are the key requirements to solve the issues:
  - make devices on different platforms interact with each other?
  - a smarter mechanism to identify multiple devices and let them interact with each other without each developer's specifying device capability explicitly

Related slides (Japanese): http://www.slideshare.net/mawarimichi/ss-37559346

Motivation:

@@@

Dependencies:

List of possible related standardization activities...

Gaps

What can't be done with the existing mechanism?

What needs to be standardized:

New features? APIs? data model? language?

Comments:

Anything you want to add

UC-4: MMI output devices: Controlling 3D Printers using a remote HMI

Submitter(s): Kaz

Reviewer(s): @@@

Tracker Issue ID: @@@

Category: @@@

Description:

interaction with a 3D printer or vending machine using multimodal interface like voice and gesture
controlling a 3D printer using multiple modalities including gesture and speech.
may be applied to automatic cooker
this is an output device for MMI

Motivation:

@@@

Dependencies:

List of possible related standardization activities...

Gaps

What can't be done with the existing mechanism?

What needs to be standardized:

New features? APIs? data model? language?

Comments:

Anything you want to add

UC-5: MMI input devices: Video Cameras work with HMI and sensors

Submitter(s): Kaz

Reviewer(s): @@@

Tracker Issue ID: @@@

Category: @@@

Description:

interaction with a video camera, e.g., back view monitor for cars, using multimodal interface like voice and gesture
a video camera outside the entrance detects intruders and lets us know via an HMI
a video camera could be installed on a drone and controlled using gesture, speech, etc.

Motivation:

@@@

Dependencies:

List of possible related standardization activities...

Gaps

What can't be done with the existing mechanism?

What needs to be standardized:

New features? APIs? data model? language?

Comments:

Anything you want to add

UC-6: Multimodal Guide Device at Museums

Submitter(s): Kaz

Reviewer(s): Shinya, Masahiro, Kosuke

Tracker Issue ID: @@@

Category: @@@

Description:

not only audio but also any kind of modalities can be used, e.g., input from the user using gesture, speech, shaking the device
interaction between the user and the device is the key
location and orientation are identified, and appropriate service is provided

Motivation:

@@@

Dependencies:

List of possible related standardization activities...

Gaps

What can't be done with the existing mechanism?

What needs to be standardized:

New features? APIs? data model? language?

Comments:

This UC could be applied to other places/scenarios than museums.

UC-7: Multimodal Appliances

Submitter(s): Kaz (possible support from Dirk)

Reviewer(s): Shinya, Masahiro, Kosuke, Ryuichi

Tracker Issue ID: @@@

Category: @@@

Description:

control home appliances like rice cooker using multimodal interface like speech and gesture
user, location and orientation are identified, and appropriate service is provided
e.g., digital TV with speech interface
HMI which helps people use appliances at home easily, e.g., assisting people's sight by enlarging the image, assisting their hearing by clear audio and haptic interface
wearable devices and AR-ready displays would be nice to include. maybe a smart car could also be a target.
see also Dirk's paper at the SCXML workshop
we should consider what the (significant) merit of using a multimodal interface for multiple-device integration is
- for example, we could integrate multiple devices and multiple Web services so that we can place an order for a pizza
- this is related to UC-2, interaction with robots. robots could be used as smart/friendly/easy-to-use remotes for appliances
- we can talk to a pet robot and ask him/her to turn on the TV, change the temperature of the air conditioner, or get a can of juice from the fridge.
- it is a kind of "personal agent" which remembers each person's preferences

Motivation:

@@@

Dependencies:

List of possible related standardization activities...

Gaps

What can't be done with the existing mechanism?

What needs to be standardized:

New features? APIs? data model? language?

Comments:

Anything you want to add

UC-8: MIDI-based Speech Synthesizer

Submitter(s): Kaz

Reviewer(s): Masahiro, Shinya, Kosuke

Tracker Issue ID: @@@

Category: @@@

Description:

control MIDI-based voice generator with a prosody generator to make it a speech synthesizer using multimodal interface
not only speech synthesizers but also music synthesizers, music play devices, sirens at factories could be included
the following is a list of example input modalities: gesture, movement of body parts (lips, eyes, eyebrows, winking), humming, wearable suit
this use case reminds us of possible "semantic interpretation for MMI" which allows us to specify the semantics of the input
an existing example is Yamaha's Miburi system (demo video)

Motivation:

@@@

Dependencies:

List of possible related standardization activities...

Gaps

What can't be done with the existing mechanism?

What needs to be standardized:

New features? APIs? data model? language?

Comments:

Anything you want to add

UC-9: Collaboration of multiple video cameras

Submitter(s): Kaz

Reviewer(s): @@@

Tracker Issue ID: @@@

Category: @@@

Description:

radio-controlled cars and helicopters collaboratively work with each other
each of them has video camera
need to identify their position and direction

Motivation:

@@@

Dependencies:

List of possible related standardization activities...

Gaps

What can't be done with the existing mechanism?

What needs to be standardized:

New features? APIs? data model? language?

Comments:

would be better to be merged with UC-5

UC-10: Geolocation device, e.g., GPS, as a MC for Location-based Services

Submitter(s): Kaz

Reviewer(s): Kosuke

Tracker Issue ID: @@@

Category: @@@

Description:

radio-controlled cars and helicopters collaboratively work with each other
each of them has video camera
need to identify their position and direction

Motivation:

@@@

Dependencies:

List of possible related standardization activities...

Gaps

What can't be done with the existing mechanism?

What needs to be standardized:

New features? APIs? data model? language?

Comments:

Anything you want to add

UC-11: Smart Power Meter

Submitter(s): Kaz

Reviewer(s):

Tracker Issue ID: @@@

Category: @@@

Description:

smart power meter

Motivation:

@@@

Dependencies:

List of possible related standardization activities...

Gaps

What can't be done with the existing mechanism?

What needs to be standardized:

New features? APIs? data model? language?

Comments:

Anything you want to add

UC-12: Smart Car Platform

Submitter(s): Kaz

Reviewer(s):

Tracker Issue ID: @@@

Category: @@@

Description:

Car navigation system, HTML5-based IVI, smartphones and sensor devices within car controlled using MMI lifecycle events and (JSON version of) EMMA over WebSocket
MMI lifecycle events could be used to integrate multiple devices within a car (and outside the car)
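
A rough idea of what such a message might look like is sketched below. Everything about the JSON layout is an assumption: the field nesting, the addresses and the "command" payload are hypothetical, and only the event name (ExtensionNotification), its fields and the EMMA annotation names are borrowed from the existing specifications. The sketch stops at producing the text that would be sent as a WebSocket text frame.

```python
import json


def make_extension_notification(source, target, context, request_id, name, data):
    """Build a Life Cycle event as a plain dict (hypothetical JSON layout)."""
    return {
        "mmi": {
            "ExtensionNotification": {
                "Source": source,
                "Target": target,
                "Context": context,
                "RequestID": request_id,
                "Name": name,
                "Data": data,
            }
        }
    }


# Hypothetical JSON rendering of an EMMA interpretation of a spoken command.
emma_payload = {
    "emma:interpretation": {
        "emma:medium": "acoustic",
        "emma:mode": "voice",
        "emma:confidence": 0.82,
        "emma:tokens": "set temperature to 22",
        "command": {"action": "setTemperature", "value": 22},
    }
}

# Serialized message from a speech MC in the car to the car's IM.
message = json.dumps(make_extension_notification(
    source="ws://car.example/mc/speech",
    target="ws://car.example/im",
    context="ctx-1",
    request_id="req-42",
    name="userInput",
    data=emma_payload,
))
```

The same envelope could carry input from the navigation system, a smartphone or a sensor, with only the Source and Data fields changing.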

Motivation:

@@@

Dependencies:

List of possible related standardization activities...

Gaps

What can't be done with the existing mechanism?

What needs to be standardized:

New features? APIs? data model? language?

Comments:

Anything you want to add

UC-13: MMI Automatic visual data annotator

Submitter(s): Helena

Reviewer(s):

Tracker Issue ID: @@@

Category: @@@

Description:

@@@

Motivation:

In image recognition systems, the coordination between a semantic web inference engine and the low-level attributes recognition services can be synchronized by using MMI lifecycle events.
MMI lifecycle events could be used to integrate complementary and concurrent services within a recognition process

Dependencies:

List of possible related standardization activities...

Gaps

Currently it is very complex to produce well synchronized semantic fusion for multimodal recognition in the fashion media.
The recognition process is highly specialized in vision techniques and lacks the inference strengths provided by high-level vision based on semantic web ontologies

What needs to be standardized:

New features? APIs? data model? language?

Comments:

Anything you want to add

UC-14: Multimodal e-Text

Submitter(s): Kaz

Reviewer(s):

Tracker Issue ID: @@@

Category: @@@

Description:

Multiple modalities to be used to access the contents for e-Learning
MMI Architecture could be used to synchronize those multiple modalities.

Motivation:

Text books for e-Learning mainly use text and graphics using, e.g., HTML, CSS, SVG and MathML.
However, from the viewpoint of accessibility and better understanding for learners it would be better to have multimodal interface, e.g., speech recognition/synthesis, handwriting/gesture recognition, to access the e-Learning contents.

Dependencies:

HTML5, CSS
SMIL, SSML, SRGS/SISR, PLS
Second Screen, Multi-device timing
Annotation

Gaps

TBD

What needs to be standardized:

New features? APIs? data model? language?

Comments:

(nothing so far)

UC-15: English standardized tests through an MMI interface. OPENPAU project

Submitter(s): Jesus Garcia-Laborda, Teresa Magal-Royo

Reviewer(s):

Tracker Issue ID:

Category: Computer Aided Learning language, Accessibility, MMI educational interfaces, Second language skill acquisition.

Description: OPENPAU is a research project, funded by the Spanish Government for 3 years, that has worked on a navigation interface (keyboard, voice and touchscreen) to develop English tests using Moodle©.

1. Multiple modalities have been used to access the contents for second language acquisition e-learning.

2. The MMI Architecture has been used to synchronize multiple modalities of input navigation.

Motivation:

Specific developments to improve the accessibility of educational content for language learning environments by using MMI interfaces or via keyboard, voice and touchscreen.

Dependencies:

Gaps

1. Specific developments in Moodle© (OS) that allow access to educational content accessible via keyboard, voice and touchscreen synchronously.

2. Specific examinations or tests of second-language skill acquisition in Moodle© (OS) that allow access to educational content via keyboard, voice and touchscreen synchronously.

What needs to be standardized:

6. Specific educational developments in Moodle© (OS) that allow access to content via keyboard, voice and touch synchronously.

7. User interfaces oriented to conducting tests that allow the use of integrated MMI navigation.

Comments:

UC-16: Remote watching using video camera and MMI interfaces

Submitter(s): Kaz

Reviewer(s): Masahiro, Kosuke, Shinya

Tracker Issue ID: @@@

Category: @@@

Description:

watching factories and homes remotely using sensors and video cameras
the output could be a notification in the Web browser on our mobiles, or simply sirens/audio signals at the factory or the home

should be useful to support aged people, children, etc.
can use both specific vendor-proprietary sensors and standard ones
question on how to integrate more than one factory or home at once
- may be good to have a meta IM at the headquarters of the company which communicates with the sub IMs from all the factories of the company
this discussion makes me think about who in the home should be the IM, and how all the devices inside the home should be integrated
- maybe we could have a sub IM for each room which handles the devices within its assigned room
- that implies each IM needs to talk with the others (via the main IM of the home) to check which device is located in which room, and to negotiate with each other to recognize that some device, like a tablet TV, has been moved from the living room to the bedroom
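
The room-level arrangement above can be sketched as follows. This is only an illustration of the idea: the class names, the method names and the "tablet-tv" device are hypothetical, and the negotiation between IMs is reduced to a single report to the main IM of the home.

```python
class RoomIM:
    """Sub IM that handles the devices within one room."""

    def __init__(self, room: str):
        self.room = room
        self.devices = set()


class HomeIM:
    """Main IM of the home; the sub IMs talk to each other through it."""

    def __init__(self, rooms):
        self.rooms = {room: RoomIM(room) for room in rooms}

    def register(self, device: str, room: str) -> None:
        self.rooms[room].devices.add(device)

    def locate(self, device: str):
        """Return the room whose sub IM currently handles the device, if any."""
        for sub_im in self.rooms.values():
            if device in sub_im.devices:
                return sub_im.room
        return None

    def report_seen(self, device: str, room: str):
        """A sub IM reports a device in its room; hand it over and return the old room."""
        previous = self.locate(device)
        if previous is not None and previous != room:
            self.rooms[previous].devices.discard(device)
        self.rooms[room].devices.add(device)
        return previous
```

A meta IM at a company headquarters could hold one such structure per factory in the same way.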

Motivation:

Dependencies:

List of possible related standardization activities...

Gaps

What can't be done with the existing mechanism?

What needs to be standardized:

New features? APIs? data model? language?

Comments:

Anything you want to add

UC-17: User interface as a sensor

Submitter(s): Helena

Reviewer(s):

Tracker Issue ID: @@@

Category: @@@

Description:

Many smart pictures are on the web, and each one is a sensor that detects user interaction
It's necessary to detect a click on the controller, the brand name, the , the scrolling on the page and the suggestion
It's important for the link to be correct
It's important to detect and record scrolling because it means that the user is reading, so the MC will store the event and the position
There are also passive sensors. A smart picture MC can be created
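
A minimal sketch of such a smart picture MC is shown below, assuming the MC simply keeps a local log of interaction events and their positions. The class, the field names and the event kinds are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SmartPictureMC:
    picture_id: str
    log: list = field(default_factory=list)

    def record(self, kind: str, position: int) -> None:
        """Store one interaction event (e.g. "click" or "scroll") and its position."""
        self.log.append({"picture": self.picture_id, "kind": kind, "position": position})

    def max_scroll(self) -> int:
        """Furthest scroll position seen so far, as a rough sign of how far the user read."""
        return max((e["position"] for e in self.log if e["kind"] == "scroll"), default=0)
```

How and when the stored events are reported to the IM is left open here.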

Motivation:

Dependencies:

List of possible related standardization activities...

Gaps

What can't be done with the existing mechanism?

Requirements

be able to update data on the modality component
be able to update data on the IM
needs a "suspended state"
needs a virtualization process to complete the sensing process
use of similarity data
bidirectional communication between Resources Manager and Modality Components
interface changes managed by the system (Resources Manager) and not user input (the MC)

What needs to be standardized:

New features? APIs? data model? language?

Comments:

Anything you want to add

UC-18: Smart remote for appliances

Submitter(s): Kaz, Debbie, Helena

Reviewer(s):

Tracker Issue ID: @@@

Category: @@@

Description:

We could talk with a cute pet robot and we could ask him/her to control specific appliances or bring us a snack. The robot could interact with the world or with devices.
The robot could be an intermediary between the user and devices.
The robot could also be considered a kind of "user agent" that can do things for the user.

Motivation:

Dependencies:

List of possible related standardization activities...

Gaps

What can't be done with the existing mechanism?

Requirements

What needs to be standardized:

New features? APIs? data model? language?

Comments:

Anything you want to add

UC-19: Handling millions of components

Submitter(s): Kaz, Debbie, Helena

Reviewer(s): Masahiro, Shinya

Tracker Issue ID: @@@

Category: @@@

Description:

everyone could have a personal user agent modality component
the IM could handle millions of devices
typical example is having a whole remote orchestra
car rental agency needs to handle many users and many cars
there could be an IM for the car rental company as well as an IM for each car
the car rental agency IM could talk with the user agent component for each user
the user agent could tell the car rental agency what the user's preferences are (this user is very tall, etc.)
a university could interact with the user agents for each student and staff
the car could interact with the parking lot manager about available spaces
lots of privacy and security issues

Motivation:

Dependencies:

List of possible related standardization activities...

Gaps

What can't be done with the existing mechanism?
Are there any problems with scaling?

Requirements

need a specific time keeper (time code generator) like the conductor of an orchestra
need more than one level of accuracy for time management, depending on each use case
if there are slow MCs, need to delay other MCs to synchronize all the MCs
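
The last requirement can be illustrated with a small calculation: if the time keeper knows each MC's latency, every faster MC is delayed by its difference from the slowest one. The function name and the latency figures are hypothetical, and how the latencies would be measured is not addressed.

```python
def alignment_delays(latencies_ms: dict) -> dict:
    """Return the extra delay each MC needs so that all MCs line up with the slowest."""
    if not latencies_ms:
        return {}
    slowest = max(latencies_ms.values())
    return {mc: slowest - latency for mc, latency in latencies_ms.items()}


# Hypothetical remote orchestra: the guitar MC is the slowest, so it gets no delay.
delays = alignment_delays({"piano": 20, "guitar": 120, "vocal": 60})
```

The accuracy needed for these delays would differ per use case, as noted above.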

What needs to be standardized:

New features? APIs? data model? language?

Comments:

Anything you want to add

UC-20: Emergency evacuation information and notification system

Submitter(s): Kaz, Debbie

Reviewer(s):

Tracker Issue ID: @@@

Category: @@@

Description:

people might have different preference to get information on fire evacuation
what if the building is a big apartment or hotel, or a small house with two floors?
possibly multiple places on fire within a town

barrier-free safety

would apply to many kinds of buildings: houses, apartments, office buildings, hotels or other public spaces

or even more general than that, you want it to be aware of where the fire is, for example and route people away from that

could be more general than fire

there are many relevant sensors from the WoT group, e.g., camera, heat, water or something for earthquakes; smoke detectors, all reporting their state to the IM which would notify the users

a friendly robot could help with this
- it could help you physically or tell you things
- could help children or other people who can't use a smartphone
- could talk with the home server and ask the server to open the door
- the robot could talk with the parents first
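
The flow described above, with sensors reporting their state to the IM and the IM notifying each user in their preferred way, could look roughly like this. The state names, the modality names and the message wording are all hypothetical, and routing people away from the hazard is reduced to naming the places to avoid.

```python
ALARM_STATES = {"smoke", "heat", "flood", "quake"}  # hypothetical state names


def build_notifications(sensor_reports: dict, user_prefs: dict) -> list:
    """Return (user, modality, message) tuples for every user if any sensor alarms.

    sensor_reports: {sensor_id: {"location": str, "state": str}}
    user_prefs:     {user: preferred modality, e.g. "speech" or "text"}
    """
    hazards = sorted(
        report["location"]
        for report in sensor_reports.values()
        if report["state"] in ALARM_STATES
    )
    if not hazards:
        return []
    message = "Avoid: " + ", ".join(hazards)
    return [(user, modality, message) for user, modality in sorted(user_prefs.items())]
```

A robot MC could receive the "speech" notifications on behalf of children or other people who can't use a smartphone.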

Motivation:

Dependencies:

List of possible related standardization activities...

Gaps

What can't be done with the existing mechanism?
Are there any problems with scaling?

Requirements

What needs to be standardized:

New features? APIs? data model? language?

Comments:

Anything you want to add

UC-21: Collaborative session by remote players

Submitter(s): Kaz, Masahiro, Shinya

Reviewer(s):

Tracker Issue ID: @@@

Category: @@@

Description:

multiple players collaboratively play a musical session
- piano player in US
- guitar player in France
- singer in Japan

a computer-based musical instrument could be included
- e.g.) Yamaha Disklavier

Motivation:

Dependencies:

List of possible related standardization activities...

Gaps

What can't be done with the existing mechanism?
Are there any problems with scaling?

Requirements

What needs to be standardized:

New features? APIs? data model? language?

Comments:

Anything you want to add