Copyright © 2017 W3C® (MIT, ERCIM, Keio, Beihang). W3C liability, trademark and document use rules apply.
This document is addressed to people who want to develop Modality Components for Multimodal Applications distributed either over a local network or "in the cloud". With this goal, in a multimodal system implemented over a network according to the Multimodal Architecture Specification, the system must discover and register its Modality Components in order to monitor and preserve the overall state of the distributed elements and to configure the technical conditions needed for the interaction. Therefore, Modality Components can be composed with automation mechanisms in order to adapt the Application to the state of the surrounding environment.
Beware. This specification is no longer in active maintenance and the Multimodal Interaction Working Group does not intend to maintain it further.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.
This document has been published as a Working Group Note to reflect the fact that the Multimodal Interaction Working Group is no longer progressing "Discovery & Registration of Multimodal Modality Components: State Handling" along the W3C Recommendation Track. A record of discussion relating to this specification can be found in the Multimodal Interaction Working Group's email archive.
The email list was www-multimodal@w3.org.
The changes from the previous Working Draft are (1) removal of "State Handling" from the title since the document now describes not only state handling but also an annotation vocabulary, (2) addition of the normative parts about states and about a vocabulary for the annotation of Modality Components, and (3) clarifications and modifications based on public comments. A diff-marked version of this document is also available for comparison purposes.
This document has been developed by the Multimodal Interaction Working Group of the W3C Multimodal Interaction Activity.
This document was published by the Multimodal Interaction Working Group as a Working Group Note. If you wish to make comments regarding this document, please send them to the Working Group's email list. Publication as a Working Group Note does not imply endorsement by the W3C Membership.
This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy . W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy .
Sections in this document that are not marked as Normative are Informative. This document is governed by the 1 September 2015 W3C Process Document.
To the best of our knowledge, there is no standardized way to build a web Application that can dynamically combine and control discovered components by querying a registry built on the types of the modalities and their states. This document covers three needs on Discovery & Registration for this kind of web Application implemented following the Multimodal Architecture Specification.
First, we propose to define a new component responsible for the management of the state of a Multimodal System, extending the control layer already defined in the Multimodal Architecture Specification (Table 1 col. 1). This component will be responsible for handling the messages exchanged in order to declare the presence (or absence) of the Modality Components of the system.
Second, this document presents an adaptive push/pull mechanism, needed to inform the system about the changes in the state of the Modality Components (Table 1 col. 2). These changes are not necessarily related to the interaction context itself, but they can affect it, for example, in the case of the unavailability of a given Modality Component. And finally, to allow the advertisement of the state of the Modality Components by using the adaptive mechanism, two new events are needed (Table 1 col. 3).
The semantics of these new events is not directly related to the interaction context but to the system's configuration; for this reason a new component responsible for the management of the state of the Multimodal System is needed.
Resources Handling | A new direction | Events for System's updates |
The state management through events and the pull mechanism must be supported by a dedicated component, responsible for the management of the state of the Modality Components in the Multimodal System. | An adaptive pull mechanism needed to inform periodically of the availability or other kind of evolution on the state of the Modality Components. | A new event and a new notification to support the pull mechanism and the advertisement, registering, search and update of Modality Component's availability. |
In the current state of the Multimodal Architecture Specification, the events that are responsible for handling the control of the user-system interaction, like Prepare or Start, must be triggered only by the Interaction Manager and sent to the Modality Components. As a result, a Modality Component cannot send a StartRequest or a PrepareRequest to the Interaction Manager. In both cases the Modality Component depends on the Interaction Manager to begin the interaction cycle by raising an event, originated by an internal command or in reaction to a previous notification sent by a Modality Component (Figure 1).
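For reference, a minimal non-normative sketch of this Interaction-Manager-initiated exchange, written with the life-cycle event markup used later in this document (the source, target and identifier values are hypothetical):

<!-- The Interaction Manager opens the interaction cycle; the Modality Component can only answer. -->
<mmi:mmi xmlns:mmi="https://www.w3.org/2008/04/mmi-arch" version="1.0">
<mmi:startRequest source="IM_1" target="MC_1" context="c_1" requestID="r_1"/>
</mmi:mmi>

<!-- The Modality Component replies, but could not have initiated a StartRequest itself. -->
<mmi:mmi xmlns:mmi="https://www.w3.org/2008/04/mmi-arch" version="1.0">
<mmi:startResponse source="MC_1" target="IM_1" context="c_1" requestID="r_1" status="success"/>
</mmi:mmi>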
A Modality Component may send a NewContextRequest to the Interaction Manager to request the creation of a new context of interaction. The interaction can be started by different Modality Components independently. Nevertheless, to start an interaction the Modality Component needs to already be part of the system and to be registered, given that a context represents a single extended interaction with one or more Modality Components.
This means that the Multimodal System has two complementary phases: the runtime phase (defined by the execution of one or multiple interaction cycles), and the system configuration phase (defined by the loading of components and their monitoring and adaptation in real-time).
The semantics of the NewContextRequest event is different and mostly oriented to the interaction phase, while the registering process is part of a previous phase, when even the presence of the user is not mandatory. This phase is designed for a system that will handle one or more interaction processes at the same time.
In addition, in the current state of the MMI Recommendation, the Interaction Manager is supposed to know, ahead of time, the address and port of all the Modality Components available in the system. In consequence, the preparation of the media or the start of the interaction cycle also currently implies the setting up of a "multimodal session" that is not completely defined at the current stage of the specification.
The Extension and Status notifications are dedicated to the exchange of interaction data, while the data exchanged in a discovery process mostly precedes any interaction between the user and the system. During the configuration phase (or a reconfiguration following a change of the overall state), the system prepares and registers the information about the Modality Components (availability, technical characteristics, cost). In this way, all this information might be used in the future, when the user-system interaction actually takes place.
In other terms, the semantics of the two existing notifications differ from the features needed for discovery. The communication protocol paradigm (the flow of messages always initiated by the Interaction Manager) is not sufficient if the Recommendation is used to address use cases evolving in dynamic environments, as described in use cases like [UC 2.1] Personal Externalized Interfaces: Smart Cars, [UC 3.1] Public Spaces: Interactive Spaces or [UC 3.2] Public Spaces: In-Office Events Assistance in MMI Use Cases, or some of the use cases described in our current charter.
In all these cases, the Modality Components enter and quit the multimodal system dynamically, and they must declare their existence, availability and capabilities to the system in some way:
In the first case, [UC 2.1] Personal Externalized Interfaces: Smart Cars , the Modality Components provided by a smartphone must be detected by the multimodal system to relate these features to the features provided by the Modality Components in the car.
In the second case, [UC 3.1] Public Spaces: Interactive Spaces , the discovery of the Modality Components installed on the client's smartphone can affect the behavior of the multimodal application in the public space.
In the third case, [UC 3.2] Public Spaces: In-Office Events Assistance, the announcement and discovery of the Modality Component capabilities in a smart conference room can allow the attendees to access some of the multimodal services provided by the conference room, providing a fine-grained adaptation of the application features to the state of the multimodal interaction environment.
For all these reasons the current document addresses the need for support for discovery and registration in very dynamic environments like the ones described above, by proposing a resources manager, a new flow of messages and two events specifically designed to carry discovery and registration data.
A Modality Component's discovery protocol needs a mechanism tracing the relevant session data to be handled on the control layer. This is the first of the responsibilities for a Resources Manager. This manager is responsible for handling the evolution of the "multimodal session" [See: Functions of Session Component in W3C Multimodal Interaction Framework ] and the modifications in any of the participants of the system that could affect its global state. This component is also aware of the system's capabilities, like the address of modalities, their availability or their processing state.
The inclusion of the Resources Manager responds to the functional requirement concerning the management of the interaction cycles locally and globally, the requirement of an appropriate real-time sensing for dynamic uses; and, partially, to the requirement concerning the support of processing of dynamic and incomplete data. [See: MMI Framework requirements ]
The Resources Manager is nested in the control layer of the multimodal system (turquoise in Figure 2), which is slightly different from the proposal of a Session Component described in the W3C Multimodal Interaction Framework.
In the MVC model (Figure 3), the Controller translates the user's actions into method calls on the Model. The Model broadcasts a notification to the View and to the Controller to inform that its state has changed. The View queries the Model to determine the exact change. Upon reception of the response, the View updates the display according to the information received. Thus, in the MVC pattern, the View is directly linked with its controller, but it can also query and communicate with the Model.
In this pattern, the Model offers a registration mechanism so that multiple Views and Controllers can express their interest in the Model through anonymous callbacks. This allows an easy implementation of multiple renderings of the same domain concepts either on one local device or across multiple distributed devices.
The Resources Manager described in the current document allows the management of the states of the Modality Component (which represents the MVC View in the MMI Architecture), putting this function in the control layer (dark gray in Figure 4). The Resources Manager translates the user's actions into method calls on the Data Component, as the MVC pattern proposes. In addition, while the Interaction Manager handles the user interaction, the Resources Manager takes care of the state of the system, the type and availability of the Modality Components and the state of the multimodal session.
The Modality Component's communication and request of state information is restricted to exchanges with the Control layer, as the MMI Recommendation defines. The Model broadcasts a notification to the Resources Manager (Figure 4), and then the Resources Manager informs the Modality Component that the state has changed, using a flow of messages through an UpdateNotification or a CheckUpdateResponse. Upon reception of the UpdateNotification or the CheckUpdateResponse, the Modality Component updates the user interface according to the information received.
Thus, the Resources Manager delivers information about the state and the resources of the multimodal system during and outside the interaction cycle. Some of its responsibilities can be:
The Resources Manager can also process and serialize in data structures the traces of external and internal phenomena. Depending on the complexity of the implementation, the application can store in the Data Component:
Requirements | |
Distribution | The |
Advertisement | The |
Discovery | The |
Registration | The |
Querying | The flow of queries transit through the |
According to the current MMI life-cycle events protocol, the command of Modality Components is initiated by the Interaction Manager, which means that if there is an HTTP client-server implementation, it can be designed following a push notification technique.
In the communication protocol designed for the MMI life-cycle events , the direction of the message flow (mostly from the Interaction Manager to the Modality Components) is suggested by the specification through the description of the control events, even if the specific communication mechanism is not currently described in detail in the normative section and it is, for the moment, implementation dependent.
This document describes the flow of messages in both directions, which is needed for the Discovery & Registration of Modality Components. With this proposal, the MMI architecture will respond more accurately to architectural requirements like completeness, extensibility, integratability and interoperability concerning the relations allowed between requesters and providers of messages.
Our intention is to allow multimodal developers to use a communication flow initiated by Modality Components arriving dynamically to the system: an extension that authorizes the Modality Component client to request or provide new data from the server, using, for example, form submissions or AJAX-based technologies with the XMLHttpRequest object.
With this mechanism the change in the state of the multimodal session (i.e. the dynamic inclusion of new distributed modalities) is instigated from the Modality Component itself.
After a certain period, the Modality Component's client requests the Resources Manager (e.g. hosted on a server), which notifies the Modality Component about changes on the user interface displayed with other distant components or in the data related to the overall state of the system, eventually causing the Modality Component's state to evolve, for example, by putting it on stand-by. The connection is closed after each transfer and the Modality Component is told when to open a new connection, and what data to fetch when it does so.
The inclusion of this new direction in the flow of messages is the best option for tightly coupled clients to which the Resources Manager has reliable access.
Nevertheless, adding a new direction in the message flow can raise issues related to the risk of high network traffic reducing the overall performance.
In a distributed multimodal system, Modality Components can be idle for a long time if no interaction happens or the situation is not optimal to allow a specific type of interaction. Given that the data rate is very low during this period, it is not necessary to keep the client requesting all the time.
For fine-tuning the Modality Component's requests we propose a new attribute: the timeout attribute. The sleep value of this attribute can reduce the requesting time by putting the client (e.g. a Modality Component using recognition services) into a periodic sleep state. This allows controlling the frequency of the requests used to update the state data in the Modality Component.
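A non-normative sketch, assuming the CheckUpdate events and the Timeout element defined in the normative sections below: during a quiet period, the Resources Manager can answer a poll with a response that tells the Modality Component to sleep before its next request (all addresses and values are hypothetical).

<mmi:mmi xmlns:mmi="https://www.w3.org/2008/04/mmi-arch" version="1.0">
  <mmi:CheckUpdateResponse mmi:Source="URIForRM" mmi:Target="URIForMC"
      mmi:RequestID="request-42" mmi:State="AVAILABLE"
      mmi:UpdateType="MONITORING" mmi:AutomaticUpdate="false">
    <!-- sleep 60 s before polling again; registration valid for 1 hour; then poll every 5 s -->
    <mmi:Timeout sleep="60000" validity="3600000" interval="5000"/>
  </mmi:CheckUpdateResponse>
</mmi:mmi>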
Requirements | |
Distribution | Modality Components can be distributed in a centralized way, a hybrid way or a fully decentralized way in order to support distributed processing [MMI-A14], [MMI-A15] and distributed input / output synchronization [MMI-A13].
Given the number of devices that could be used, a more flexible way to recognize and include the device in the multimodal system's registry requires to adding a new direction in the flow of messages to allow an announcement of modalities coming from the device every time an important change occurs. This reduces the number of permanent connections, and allows a more pertinent monitoring of the availability and changes on Modality Components at the session level [MMI-A6] . |
Advertisement | With a pull mechanism, the unique identifier of the Modality Component, its name, address, port number, its embedded services, constructor, version and lifetime can be announced when important changes affecting this information occur. This proactive updating of information facilitates the management of scalable multimodal systems across wide ranges of devices, and supports the application's adaptability [MMI-G2] and the coordination capabilities of the multimodal session [MMI-I8]. It also supports the announcement of evolution in the user profile or user preferences [MMI-G13] - [MMI-G14].
A new direction in the flow of messages also supports the extensibility of the system, through the active announcement of the new modalities or new devices and capabilities to be dynamically added [MMI-I12] - [MMI-O8] . This implies the management of external input events during the announcement process [MMI-A16] . |
Discovery | This new direction in the flow of messages facilitates the mediated and passive discovery of Modality Components. Functions can be partitioned and distributed across several servers or devices that notify periodically their availability and general state [MMI-C1] and [MMI-C2].
It also facilitates the deployments using mobile networks, preventing bandwidth limitations and delays because the embedded Modality Component itself can announce and update its current state. [MMI-R1] and [MMI-R2] . |
Registration | Using a new direction in the flow of messages, the updates to the register are triggered by changes dynamically declared by the Modality Component itself without the need of a persistent connection to update data that is not very frequently modified.
This also helps in the registration of high level information used to specify the preconditions and effects produced by the addition of this new Modality Component to the system or its unavailability [MMI-G15] It also supports the registering of other information that does not change very often, like the semantics of some kinds of inputs, or any specification of the meaning of the embedded modalities implemented in the Modality Component to be registered [MMI-I13] |
Querying | To enable information gathering in a multimodal system, the simplest strategy is to have all Modality Components providing a continuous stream of all the data that they gather to the Interaction Manager. However, for many types of applications where only a small subset of the collected information is likely to be useful, updated or pertinent, this simple approach can become very inefficient. For this reason, |
With these mechanisms of communication a Modality Component can register its services for a specific period of time. This is the basis for the handling of the Modality Component's state. Every Modality Component can have a life-time that begins at discovery and ends at a date provided at registration. If the Modality Component does not re-register the service before its lifetime expires, the Modality Component's index is purged. This depends on the parameters given by the Application logic, the distribution of the Modality Components or the context of interaction.
When the lifetime has no end, the Modality Component is part of the multimodal system indefinitely. In contrast, in more dynamic environments, a limited life-time can be associated with the Modality Component, and if it is not renewed before expiration, the Modality Component will be assumed to no longer be part of the multimodal system. Thus, by the use of this kind of registering, the multimodal system can implement a procedure to confirm its global state and update the <<inventory>> of the components that could eventually participate in the interaction cycle. Therefore, registering involves some Modality Components' timeout information, which can be always exchanged between components and, in the case of a dynamic environment, can be updated from time to time.
For this reason, a registration renewal mechanism is needed. We define a renewal mechanism based on the use of the timeout attribute and two new events: the CheckUpdate event and the UpdateNotification, used in conjunction with an automatic process that ensures periodical requests.
The UpdateNotification provides a mechanism:
A dedicated data structure is defined for registration: the timeout attribute. A timeout is an ordered list of three elements:
Each Modality Component can sleep for some time, and then wake up and check to see if there are changes planned on the systems side (by requesting the component responsible for the management of the system states). During sleeping, the client turns off checkUpdate requests, and sets a timer to awake itself later.
The sleep value is calculated by the Resources Manager (on the server side, for example) based on the context-awareness level of the multimodal system. It can be static and defined with a set of basic rules or more dynamic, linked to the semantic analysis of the environment.
The second element of the timeout tuple is the communication life-time. A Modality Component leaves the multimodal system when its life-time is exceeded and needs to restart its registering mechanism to obtain a new Modality Component ID and timeout pace. This supports periodic updates of the availability of the Component (e.g. authorization) or the renewal of its metadata (See Figure 6).
The third element is the communication interval, which is modulated according to the multimodal system's needs by a set of static rules or by a prediction mechanism used in the state handler Component. This element informs the Modality Component beforehand about the frequency of requests that can be allowed by the recipient component (a Resources Manager in a server, for example) in the current conditions. This value is exchanged on each request, which means that it can be changed at any moment in the multimodal session.
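Putting the three elements together, and assuming the attribute names used by the Timeout element in the normative section below (sleep, validity and interval), a timeout triplet could look like:

<!-- sleep period: 1 s; communication life-time (validity): 6 minutes; communication interval: 0.5 s -->
<mmi:Timeout sleep="1000" validity="360000" interval="500"/>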
The communication intervals will be synchronized, because the Modality Component knows the exact publish interval beforehand according to a time pattern. In this way, data coherence is ensured and network performance maintained. Since the Resources Manager has access to all the state data, it can, for example, use a prediction algorithm implemented in the Data Component to foresee a time when the data is going to change. The Resources Manager then attaches this time value in the timeout triplet to the outgoing data, allowing the data synchronization.
Finally, if the Resources Manager prediction is wrong and a change still occurs in the data, the Resources Manager can push the change to the Modality Component if it knows its address, using the push technique proposed here. In this case the push command is handled as an interruption of the default pull update mechanism. In this way, the system maintains its reliability.
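A non-normative sketch of such an interruption, assuming the UpdateNotification event defined below and a Resources Manager that knows the Modality Component's address (all values are hypothetical):

<!-- Pushed by the Resources Manager to the Modality Component because the data changed
     earlier than predicted; the new Timeout triplet re-synchronizes the pull cycle. -->
<mmi:mmi xmlns:mmi="https://www.w3.org/2008/04/mmi-arch" version="1.0">
  <mmi:UpdateNotification mmi:Source="URIForRM" mmi:Target="URIForMC"
      mmi:RequestID="request-7" mmi:UpdateType="DATAUPDATE" mmi:State="AVAILABLE">
    <mmi:Timeout sleep="0" validity="360000" interval="500"/>
  </mmi:UpdateNotification>
</mmi:mmi>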
This section is Normative.
The CheckUpdate events are the request/response pair, CheckUpdateRequest and CheckUpdateResponse. CheckUpdateRequest and CheckUpdateResponse are used to check to see if there are any changes in the system. They share the Context, Source, Target, RequestID and Data fields with MMI Life Cycle Events. A CheckUpdate event MUST include Source, Target and RequestID. It MAY include a Data field. It MAY also include a Context field, if the event pertains to a specific context.
In addition, both CheckUpdate events MUST include the additional fields UpdateType, State, and Timeout. The CheckUpdateResponse MUST also include the field "AutomaticUpdate". The CheckUpdate events can be sent from either the Modality Component to the Resources Manager or from the Resources Manager to the Modality Components.
An attribute that MUST indicate the type of check to be performed. Some values can be: Handshake, Monitoring, Reporting, DataCheck, Resuming, Leaving. These values are application specific.
An attribute that MUST indicate the state of the requesting component and its value. The values MUST be: Alive, Loading, Registering, Available, Idle, Busy Waiting, Processing, Unavailable, Unregistered. (See Figure 7)
A Modality Component MUST be in Alive state when it is already started and ready to be identified and registered on the multimodal system.
A Modality Component MAY be in Loading state if it is currently loading resources that it will need to function.
A Modality Component MAY be in Registering state when it has already requested a registration id in the multimodal system through the Resources Manager.
A Modality Component MUST be in Available state when it is already registered, ready to function and not busy.
A Modality Component MAY be in Idle state when it is already registered, functioning and waiting for a user input.
A Modality Component MAY be in Busy Waiting state when it is already registered, functioning and waiting for a system's event or a system's response.
A Modality Component MUST be in Processing state when it is already registered and processing some task. The process could be any multimodal or unimodal task like transferring, searching, recognizing or any other kind of process. The processing state is related to a given multimodal session (the same Modality Component can handle multiple tasks in parallel from different users and sessions).
A Modality Component MUST be in Unregistered state if the system's rules command the unregistration and the Modality Component is no longer authorized to interact with the system (for example if it has to update its access credentials).
A Modality Component MUST be in Unavailable state when it has a failure, or when it is unregistered and it does not update its registration, or when it lacks resources or must reload them. In short, when the Modality Component is no longer able to correctly ensure its task.
The following list shows the flow between these nine states:
The component MUST pass from the ALIVE state to Loading, Registering or Available state.
The component MUST pass from the LOADING state to Registering or Available state.
The component MUST pass from the REGISTERING state only to Available state.
The component MUST pass from the AVAILABLE state to Idle, Busy Waiting or Unavailable state.
The component MUST pass from the IDLE state to Busy Waiting, Processing or Unavailable state.
The component MUST pass from the BUSY WAITING state to Processing, Idle, Unregistered or Unavailable state.
The component MUST pass from the PROCESSING state to Processing, Idle or Unavailable state.
The component MUST pass from the UNREGISTERED state only to Unavailable state.
Some examples of this flow between the states are:
- Unauthorized Component
ALIVE | AVAILABLE | BUSY WAITING | UNREGISTERED | UNAVAILABLE |
First the Modality Component announces that it is ALIVE and declares its AVAILABLE state to the system. After sending this announcement, the Modality Component enters the BUSY WAITING state, waiting for a response from the system. The system does not allow the Modality Component to continue joining the system (it is no longer authorized to join the system), so the Component passes to an UNREGISTERED state, and becomes UNAVAILABLE. |
- Failure of a Registered Component
ALIVE | REGISTERING | AVAILABLE | IDLE | PROCESSING | UNAVAILABLE |
The Modality Component is ALIVE and announces its address and port to the Resources Manager, which registers this data allowing the component to pass to the REGISTERED state. The Modality Component passes to the AVAILABLE state. When the system is ready to interact with a user, the Modality Component passes to the IDLE state, waiting for a user action. If the user interacts with the Modality Component, it passes to a PROCESSING state, but then the current process fails and the component becomes UNAVAILABLE. |
- Unavailability of a Registered Component
ALIVE | REGISTERING | AVAILABLE | IDLE | PROCESSING | BUSY WAITING | UNAVAILABLE |
The Modality Component is ALIVE and announces its address and port. It is allowed to pass to the REGISTERED state. The Modality Component passes to the AVAILABLE state. When the system is ready to interact with a user, the Modality Component passes to the IDLE state, waiting for a user action. The user interacts with the Modality Component and it passes to a PROCESSING state. The process needs an exchange with another component on the system, so it waits for a response. After a certain time with no response (or after a response making it impossible to continue the process), the current process fails and the component becomes UNAVAILABLE. |
- Registration of a Component needing multimodal resources
ALIVE | LOADING | REGISTERING | AVAILABLE | IDLE | PROCESSING | BUSY WAITING | PROCESSING | IDLE |
The Modality Component is ALIVE. It needs to load some resources, passing to the LOADING state. Then the Modality Component announces its address, port and resources and is allowed to pass to the REGISTERED state. The Modality Component passes to the AVAILABLE state. When the system is ready to interact with a user, the Modality Component passes to the IDLE state, waiting for a user action. The user interacts with the Modality Component and it passes to a PROCESSING state. The process communicates with another component on the system, and receives a response. The process ends and then the Modality Component returns to its IDLE state to wait for another user interaction. |
A boolean-valued attribute indicating whether the state of the Modality Component will be automatically updated by UpdateNotification events, or whether the Modality Component will keep sending UpdateNotification events in the future without waiting for another CheckUpdateRequest event. If the Resources Manager is temporarily unavailable, the Modality Component will continue to send messages according to the last interval defined by the last timeout information received.
An element with a src attribute to link to external complementary metadata and an info attribute for inline data. The metadata (non-functional information) is complementary to the data, which is functional information.
An element used to temporize the exchanges between components. The values of this element are defined by the Resources Manager. These values can be changed by a Modality Component if the Modality Component arrives into a state that makes it impossible to preserve the pace of communication (i.e. error, failure, unavailability). This element MUST include three attributes. It MUST include a sleep attribute to define the "communication sleep period", a validity attribute to represent the "communication validity period" in milliseconds and an interval attribute to express the "communication interval" in milliseconds. Example:
<mmi:Timeout sleep="1000" validity="5000" interval="500"/>
<mmi:mmi xmlns:mmi="https://www.w3.org/2008/04/mmi-arch" version="1.0"> <mmi:CheckUpdateRequest mmi:Source="URIForMC" mmi:Target="URIForRM" mmi:RequestID="request-1" mmi:State="LOADING" mmi:UpdateType="HANDSHAKE" mmi:AutomaticUpdate="true"> <mmi:metadata src="URIForMetadata" info="{medium:{acoustic}, modality:{acoustic:SPEECH}}" /> <mmi:Timeout sleep="0" validity="500" interval="500"/> </mmi:CheckUpdateRequest> </mmi:mmi>
<mmi:mmi xmlns:mmi="https://www.w3.org/2008/04/mmi-arch" version="1.0"> <mmi:CheckUpdateResponse mmi:Source="URIForRM" mmi:Target="URIForMC" mmi:RequestID="request-1" mmi:State="REGISTERED" mmi:UpdateType="HANDSHAKE" mmi:AutomaticUpdate="true" mmi:data="MCRegistrationID Data"> <mmi:Timeout sleep="1000" validity="360000" interval="500"/> </mmi:CheckUpdateResponse> </mmi:mmi>
This section is Normative.
The UpdateNotification event informs other system components (periodically or not) about changes in the state of a Component. If automatic updates are enabled, the Component may send multiple UpdateNotification messages after a single CheckUpdateRequest message. It shares the Context, Source, Target, RequestID and Data fields with MMI Life Cycle Events. An UpdateNotification event MUST include Source, Target, and RequestID. It MAY include a Data field. It MAY also include a Context field, if the notification pertains to a specific context.
In addition, an UpdateNotification MUST include the additional fields UpdateType, State, and Timeout. The UpdateNotification event can be sent from either the Modality Component to the Resources Manager or from the Resources Manager to the Modality Components.
An attribute that MUST indicate the type of check to be performed. Some values can be: Reporting, in the case of an important change to the Modality Component that needs to be reported to the Resources Manager, like a noise situation in some audio capture, for example. An update notification can also be triggered when the Modality Component uses or produces new data: in this case the UpdateType can be DataUpdate. Finally, a Modality Component can need to inform other components about some user interface changes, for example when the load of some data is finished and this affects the user interface display. In this case the UpdateType will be InterfaceUpdate.
An attribute that MUST indicate the state of the requesting component and its value. These values correspond to the values supported by the CheckUpdate event: Alive, Loading, Registering, Available, Idle, Busy Waiting, Processing, Unavailable, Unregistered.
An element used to indicate the pace of the notification process when automatic updates are enabled. This element MUST include three attributes. It MUST include a sleep attribute to define the "communication sleep period", a validity attribute to represent the "communication validity period" in milliseconds and an interval attribute to express the "communication interval" in milliseconds.
<mmi:mmi xmlns:mmi="https://www.w3.org/2008/04/mmi-arch" version="1.0"> <mmi:UpdateNotification mmi:Source="URIForMC" mmi:Target="URIForRM" mmi:RequestID="request-1" mmi:UpdateType="REPORTING" mmi:State="BUSY WAITING"> <mmi:Timeout sleep="1000" validity="360000" interval="200"/> </mmi:UpdateNotification> </mmi:mmi>
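For comparison, a non-normative sketch of a notification with the InterfaceUpdate type described above, sent when the loading of some display data has finished (addresses and values are hypothetical):

<mmi:mmi xmlns:mmi="https://www.w3.org/2008/04/mmi-arch" version="1.0"> <mmi:UpdateNotification mmi:Source="URIForMC" mmi:Target="URIForRM" mmi:RequestID="request-2" mmi:UpdateType="INTERFACEUPDATE" mmi:State="PROCESSING"> <mmi:Timeout sleep="1000" validity="360000" interval="200"/> </mmi:UpdateNotification> </mmi:mmi>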
Requirements | |
Distribution |
For
notification
of
failures,
progress
or
delays
in
distributed
processing
[MMI-A14]
the
UpdateNotification
ensures
periodical
requests
informing
other
components
if
any
change
occurs
in
the
Modality
Component's
state.
This
can
support,
for
example,
grammar
updates
or
image
recognition
updates
for
a
subset
of
differential
data
(the
general
recognized
image
is
the
same
but
one
little
part
of
the
image
has
changed,
i.
e.
the
face
is
the
same
but
there
is
a
smile)
On the other hand, if a Modality Component is waiting for some processing provided by other distributed component, the checkUpdate Event allows the recovery of progress information and the fine tuning of requests by changing the timeout attribute. This enhances input/output synchronization in distributed environments [MMI-A13] . |
Advertisement | The use of the timeout attribute helps in the management of the validity of the advertised data. If a Modality Component communication is out-of-date, the system can infer that the data has the risk of being inaccurate or invalid. |
Discovery | The UpdateNotification and the checkUpdate Event support mediated and passive discovery of Modality Components, by allowing servers or devices to announce their capabilities at bootstrapping and notify or check periodically availability and session state changes [MMI-C1] . |
Registration | The UpdateNotification and the checkUpdate Event tuned by a timeout mechanism for pull requests allow the dynamic registration and update of the information about the capabilities of the Modality Component [MMI-G2] or the user preferences [MMI-G13] and profile [MMI-G14] collected on the device. |
Querying | The checkUpdate Event allows the recovery of a small subset of the information provided by the interaction manager or the data component, to maintain up to date the data in the Modality Components as in the Data Component. |
This proposal is designed to support the annotation of Modality Components, to allow their discovery and registering in a multimodal system. The focus is the dynamic discovery of Modality Components as services, using generic information about the underlying properties and types of processes. This information is provided by an announcement and a description (a capabilities manifest, for example) advertised in some network. In this document we will illustrate this point with an example of a multimodal greeting service in a smart environment.
The Modality Components can be described with a document whose complexity evolves depending on the application needs. This description can be limited to indications about the Input and Output interfaces, or be more detailed, describing functional and non-functional properties inspired by some of the Extensible Multimodal Annotation Markup Language (EMMA) properties [W3C-EMMA 2009] like emma:function, emma:media-type, emma:medium and emma:mode.
The meaning of the terms for a controlled vocabulary, in the form of a Glossary for the annotation of Modality Components, is divided into two parts (Figure 2): subsumption terms and behavior terms.
Subsumption concerns the attributes classifying the Modality Components. It is structured with metadata classifying the Modality Component according to its membership or association with a Multimodal class in conformance to the modes handled by the System. This first description allows discovery filtering for a precise target mode. There are four properties:
The first term is the intentional "Name" of the service. It is used to announce the service in the network. It is a compound name in three parts and provides the semantics about the implementation of the component, its most important attribute and its most important role. For example: SVG_COUNTER_DISPLAY, HTML_VIDEO_CONTROL, JS_FACIAL_SINTHESIZER.
Based on this term, the Modality Component's capabilities can be classified from a high level perspective; for example, we can infer that the first component is part of the device class "TEXT_DISPLAYS", and the second of the class "MEDIA_CONTROLLERS". The triplet is inspired by the intentional name schema [ADJIE-1999] and shows hierarchical tree relationships between general concepts (including some negative differentiating aspects). These names are intentional; they describe the intent of the Modality Component and its implementation in the form of a tuple of attributes.
The functions are the technical entities supporting a limited number of modalities according to the semantics of the message and the capabilities of the support itself. A Modality Component acts as a complex set of functions. Each function uses one or more modalities that realize some mode. For example, in Figure 2 the Avatar uses a 3D mesh modality through a visual mode. The functions term defines a list of functions used in the service, ordered by importance and by mode. For example, a gesture recognizer service uses the sign language function, using the single hand gesture modality that is executed in the haptic mode and is perceived in the visual mode.
Finally, the operations term is the IOPE list of the Modality Component Capabilities. IOPE means Inputs, Outputs, Preconditions and Effects of a service [YU-2007] [OWL-S].
In Figure 2 the "Face Synthesizer Service" acts in some mode that is perceived by a final user through a modality that is part of some functions, i.e. a face synthesis service acts in the visual mode that is perceived through a 3D mesh modality that is part of an avatar function.
Thus, for the "Face Synthesizer" service illustrated in Figure 2 the Modality Component's description (description.js document) shows an operation description. It could be a list of other expressions but we propose the smile operation as an example:
{ "name": "VRML_FACE_SYNTHESIZER", "affiliation": "ANIMATED_3D_RENDERER", "version": "1.0", "endpoints": { "1.0" : { "description":"http://localhost:5000/vrml_face_synthesizer/1-0/description.js", "uri": "http://localhost:5000/vrml_face_synthesizer/1-0/" } }, "modalities":{ "visual":["REALTIME_SINTHESIZER"] } }, "functions":{ "visual":["VR_GRAPHICS"] }, "operations": { "smile": { "method":"POST", "endpoint":"http://localhost:5000/vrml_face_synthesizer/1-0", "documentation": "Operation to change the expression to a smiling face. ", "metadata": {"emotion":"emotionML_uri","behavior":"behaviorML_uri"}, "input": { "key": { "position": 1, "metadata": { "Content-Type":{ "cognitive":["text/plain"] } }, "documentation": "The user key to acces this API" }, "event": { "position": 0, "metadata": { "Content-Type":{ "cognitive":["ExtensionNotification","StartRequest"] } }, "documentation": "If the event type is extension, the service returns just true or fail (for a steady smile, for example). If the event type is start request (for a time-controlled smile), the service can receive the starting time and returns the acceleration info." "data": { "metadata": { "Content-Type":{ "cognitive":["data/integer","data/time"] } }, "documentation": "If the event's data is a notification, the event will include the easing integer value for the acceleration. If the event is a StartRequest the event can also include the start time in milliseconds for the smile process." } } }, "output": { "event": { "position": 0, "metadata": { "Content-Type":{ "cognitive":["StartResponse"] } }, "documentation": "The type of response event.", "data": { "metadata": { "Content-Type":{ "cognitive":["data/integer" } }, "documentation": "In the case of a startRequest, a confirmation of the starting time of the animation." } } }, "preconditions": {"documentation": "No precondition is needed other than the loading of the face visual data."}, "effects": {"documentation": "Asynchronous modality. It will not block the rest of the application rendering."} } } }
This description can be parsed before the execution of the service, in a discovery process. To call the service and execute a smile operation, the service query with a POST method must be structured as follows:
POST /vrml_face_synthesizer/1-0 HTTP/1.1
Host: localhost:5000
Content-Type: text/xml
<?xml version="1.0"?> <smile>
<input>
<event>
<mmi:mmi xmlns:mmi="https://www.w3.org/2008/04/mmi-arch" version="1.0">
<mmi:startRequest source="IM_1" target="smile" context="c_1" requestID="r_1">
<mmi:data>
<ease value="0.5"/>
<starting_time value="300"/>
</mmi:data>
</mmi:startRequest>
</mmi:mmi>
</event>
</input> </smile>
The smile tag represents the operation that has been requested, the input tag expresses that this is a request, and the event tag contains the MMI Lifecycle event to control the operation. There can be multiple MMI events inside the input and output elements to support concurrent or parallel commands to the interface. The MMI Lifecycle event sent to the operation provided by the Modality Component can be any of the events defined to handle inputs in the MMI specification.
The POST response of the service will be:
<?xml version="1.0"?> <smile>
<output> <event>
<mmi:mmi xmlns:mmi="https://www.w3.org/2008/04/mmi-arch" version="1.0"> <mmi:startResponse source="smile" target="IM_1" context="c_1" requestID="r_1" status="success" /> </mmi:mmi>
</event>
</output> </smile>
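Similarly, since the description.js above also lists ExtensionNotification as an accepted event type for the smile operation, a steady smile could hypothetically be requested with a POST body such as the following (a non-normative sketch; the easing value is only an example):

<?xml version="1.0"?> <smile>
<input>
<event>
<mmi:mmi xmlns:mmi="https://www.w3.org/2008/04/mmi-arch" version="1.0">
<mmi:extensionNotification source="IM_1" target="smile" context="c_1" requestID="r_2">
<mmi:data>
<ease value="0.5"/>
</mmi:data>
</mmi:extensionNotification>
</mmi:mmi>
</event>
</input> </smile>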
The possible GET request to the REST endpoint for the same service could be:
GET /vrml_face_synthesizer/1-0/IM_1/c_1/event/startRequest/r_1/smile?data[ease]=0.5&data[starting_time]=300 HTTP/1.1
Host: localhost:5000
The possible JSON response to the REST request:
{ "output": {
"event": [{
"mmi": "startResponse",
"context": "c_1", "source": "smile", "target": "IM_1",
"requestID": "r_1",
"status": "success",
"data": {}
}]
}
}
Security techniques are separated from the current communication protocol, in the architecture as well as in this document: we assume that the network is private. Security issues for this protocol in public networks will be addressed later.
Also, this document focuses on the flow of messages and the building blocks needed to support this flow. The details of the communication between the Interaction Manager and the State Manager, as well as the interfaces between the Data Component and the State Manager, will be described later.
Another open issue is the management of multiple instances of the Interaction Manager and the flow of messages between them, the Resources Manager and multiple Modality Components.
Finally, a common vocabulary for the description of the Modality Component's attributes, needed to register and compose them, is an important subject to be treated in order to allow better interoperability between multimodal systems. Vocabulary and Capabilities will be addressed in a subsequent document.
The authors wish to acknowledge the contributions by all the members of the Multimodal Interaction Working Group, in particular, Kazuyuki Ashimura, the W3C Team Contact for the Working Group.
Finally, the authors would also like to acknowledge the people outside of the MMI Working Group who helped with the process of developing this document, especially Jean-Claude Moissinac and Isabelle Demeure.