Breakout Session Notes

Areas to advance within W3C

W3C Workshop on Web Device Independent Authoring
Wednesday 4 October 2000

Group 4 - Interaction

Notes taken by Gregory Rosmaita

Interaction -- Group Leader: Amir

questions to be addressed:
1. can we come up with templates for interactions that
span devices?  (forms, voice interaction)

what are the devices?

what are the channels?

what are the modalities?

accessibility -- is it a subset of modality?
Amir: piggy-back issue; same functionality
Roger G: overlap between interaction and
accessibility; if we define modalities the user can
choose between (e.g., text-only, voice-only), that
helps accessibility

WAP is display based, so how do we achieve accessibility?

Amir: multi-modal devices
Roger: voice-only mode and graphic & button mode

Amir: consider modalities as concepts; not devices

what is mobile?  wireless data?  voice?

WAP text-only;

Amir: dialog classes: poor text, rich text, etc.

Ian: UA grouping: text, color, image, animation, audio,
and speech; animation is a hybrid
Amir: speech can be synthesized, audio cannot; these
are content types; text volume -- can you display a
large amount of text or only a small amount?
Ian: now we are talking about content types

missing: interaction models, device constraints
IJ: pre-rendering of text -- is text a set of glyphs that
are rendered, or something that can be output in any
modality?

color: device capability
graphics: device capability
tactile: output modality; a particular rendering, like a
speech rendering of something; tactile is a rendering of
text

input types
are they device dependent?
RG: can describe these at a level of description higher
than device type
Amir: device capabilities
content input or means of doing it?
Amir: numerical keypad a means of entering text?
DP: today, yes
RG: the quality the device offers
Amir: if you need a means of entering text and only
have a numeric keypad, you have a problem
RG: high resolution versus low resolution

different fidelities; additional taxonomy -- can display
images, but how well is a subset

do text and numbers always go together, or can
numbers stand alone?

selection across modalities; specialized interaction

concentrate on functionality; independent of input

RG: with a mouse you can make choices; with an
onscreen keyboard you can make text; we are talking
about coordinate-based choice/selection

IJ: distinguish device input (voice, text, pointer, etc.)
from input metaphors (how they are activated)

Amir: what does the author need?
IJ: coordinates are only one way to input; equivalent in
functionality; the fact that they came through different
devices implies that there is a device layer; some
cases in which coordinates are required, some places
where authors only think they are required

continuous choice versus discrete choice
Amir: as an author, I want to achieve certain goals
(selection) and have several means of achieving that end

RG: 2 layers: authoring intent; mapping onto device
capability

when is speech in itself useful other than for human
consumption; speech-to-text; speech-to-audio

points and strokes

single point coordinate; stroke as vector; local

IJ: abstract data types rather than formatting
Amir: whatever I am using, is under the grammar of.

capabilities that allow us to map

what about navigation elements?  where do I want to go
next; is that a data type or an interaction?

applicative state (where am I?)

describe a form independently of the device; want the
pieces of info that relate to each other to work
together; the output modality will be different
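
A rough sketch of that idea (not from the discussion; the field
names and render functions below are hypothetical): the form is
described once as related pieces of information, and each modality
renders the same description differently.

```python
# Hypothetical sketch: a form described once, independent of device;
# the output modality is decided at rendering time.
form = {
    "title": "Contact details",
    "fields": [
        {"name": "name",  "type": "text",   "label": "Your name"},
        {"name": "phone", "type": "number", "label": "Phone number"},
        {"name": "when",  "type": "choice", "label": "Best time to call",
         "options": ["morning", "afternoon", "evening"]},
    ],
}

def render_visual(form):
    """Render the same description as a simple on-screen form."""
    lines = [form["title"]]
    for field in form["fields"]:
        lines.append(f"{field['label']}: [{field['type']}]")
    return "\n".join(lines)

def render_voice(form):
    """Render the same description as a serial voice dialogue of prompts."""
    return [f"Please say your {field['label'].lower()}." for field in form["fields"]]

print(render_visual(form))
print(render_voice(form))
```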

serial input and random input -- granularity of
validation; missing fields can be checked later;

RG: focus on capabilities -- if we assume an authoring
process where intent is mapped onto a facility, the
author has to think about the capabilities of the device
being generated for; a more abstract authoring model in
the future: have a set of capabilities not specific to
each device, but define a capability layer; that
abstraction gives you a good intermediate form which
the author can relate to

capabilities
image rendering
speech output
display
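
A minimal sketch of the capability layer RG describes (illustrative
only; the capability and device names are made up): the author
targets capabilities rather than devices, and a mapping step checks
which of those capabilities each device offers.

```python
# Illustrative capability layer: authored intent names capabilities,
# not devices; each device declares the capabilities it supports.
DEVICE_CAPABILITIES = {
    "desktop_browser": {"display", "image_rendering", "keyboard_input"},
    "wap_phone":       {"display", "keypad_input"},
    "voice_gateway":   {"speech_output", "speech_input"},
}

def supported(needed, device):
    """Return which of the author's required capabilities the device offers."""
    return needed & DEVICE_CAPABILITIES.get(device, set())

needed = {"display", "image_rendering"}
for device in DEVICE_CAPABILITIES:
    print(device, "->", supported(needed, device))
```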

RG: what does the user think they've got in the
device?  devices have screens, so they can display
things -- map to some visual representation; devices
that have speakers are expected to emit audio; devices
with keyboards are expected to accept keyboard input

TV can't show a static page; need to send a real-time
stream, presented continuously;

Amir: same from the author's point of view; want to say
"display page" and let the responsibility for refreshing
and rendering be someone else's problem; the author
may just want text

DP: state -- you have filled out X of Y; you must fill out
Z

author needs to know capabilities, including state and
permanence

Amir: intention of author and capability of device;
need for mapping between the 2

1. locale (language, currency, time, geographical
position)
2. personal preferences (text input, voice input, etc.)
3. text: richness
4. image
5. general audio (sounds)
6. speech
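
One purely illustrative way to carry the six categories above as a
profile the mapping layer could consult (all structure and values
here are made up, not drawn from any specification):

```python
# Hypothetical profile covering the six categories listed above.
profile = {
    "locale": {"language": "en", "currency": "USD", "position": "unknown"},
    "preferences": {"text_input": True, "voice_input": False},
    "text": {"richness": "poor"},                      # poor text vs. rich text
    "image": {"supported": False},
    "audio": {"supported": True},                      # general sounds
    "speech": {"output": True, "recognition": False},
}

def can_render(profile, category, attribute):
    """Check one authored requirement, e.g. ('image', 'supported')."""
    return bool(profile.get(category, {}).get(attribute))

print(can_render(profile, "image", "supported"))   # False
print(can_render(profile, "speech", "output"))     # True
```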

voice recognition -- is it a capability?

RG: itemizing these and getting them into the right
layers is not a trivial task -- is it a useful one?

there is no single right answer -- no correct list -- it is
never going to be right for all authors or all devices

if authors have the mapping and the list, they can
separate content from presentation

will authors target particular channels and ignore
others?

Amir: UIML does this
RG: if we can come up with a set that is good for
authors to author to, people developing new devices
have a set of requirements to pick from -- a list of
capabilities to sell to the authoring community; it's not
going to be an exhaustive list, but it's valuable to
authors and developers; implement it for a given device
(this device is good for A, B, & C)

device independent -- "can generate content that can
be displayed on any device" is not true; the author can
choose a subset

if I am an author who wants apps to run on all devices
where the capabilities exist, that's what I'm going to do;

need to classify devices so authors can understand
groupings

authors need to develop own classifications; standard
taxonomy useful, but shouldn't be mandatory

lack of consistency between devices of
similar/identical capabilities

RG: document types; document models; develop
something appropriate for delivery on a specific device;
how does this impact W3C?  implications for CC/PP
(intended to address some of these capabilities)
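
A hedged illustration of the CC/PP idea mentioned here (CC/PP
itself expresses profiles in RDF; this is not its syntax, and the
attribute names are invented): the device or proxy supplies
attribute-value pairs, and the server picks a suitable variant.

```python
# Not CC/PP syntax; just the idea of choosing a content variant from a
# capability profile sent with the request. Attribute names are made up.
def choose_variant(profile):
    if not profile.get("supports_images", False):
        return "text-only"
    if profile.get("screen_width", 0) < 200:
        return "small-screen"
    return "full"

print(choose_variant({"supports_images": False}))                       # text-only
print(choose_variant({"supports_images": True, "screen_width": 120}))   # small-screen
print(choose_variant({"supports_images": True, "screen_width": 1024}))  # full
```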

IJ: author wants to author; doesn't care about
capabilities, just communication

Amir: where do you want device independence?

IJ: up to the author; have to be able to extend the
system of categories; CC/PP offers basic profiles, but
the author doesn't want to author with a particular
device in mind, just wants to reach the broadest
audience possible

if want to develop without any knowledge of device,
can't leap between classes

Y axis: "effort"; X axis: channels -- in an ideal world
you have the same effort for all channels; cover all
channels with equal effort; the developer decides where
to invest effort to achieve high quality for specific
devices and may not care/know about others

low quality on all or high quality on many?

provide useful taxonomy of modalities and capabilities
to help authors think about this issue

want to author independently for all mobile devices,
but don't expect it to display well on other devices; do
we want independent focus on broad categories or a
lowest-common-denominator (LCD) solution?

give authors the ability to express capabilities then
map to devices

IJ: not clear when you have to go that far versus rely on
the device

as author want to author for classes

SUMMATION
1. targets for authors:
a) device capabilities -- is it the starting point?
b) need
c) extensibility
d) useful profile
e) author ignorance of devices a reality;
IJ: the author needs to be able to choose the depth of detail
RG: fill the gap between author and device; fill in
automatically for all devices, or concentrate on a class;
with a strong process you can do more, but getting a
strong process that maps across modalities is hard; are
there good descriptive levels between author and
device that allow the author to work to a certain layer
and let something else map to the other devices?
DP: how do I check it?  this is what it will look like on
A, B, & C
Amir: or, can it be rendered and interacted with on
each modality?

2. device classes and characteristics

FINAL SUMMARIZATION
Amir: need a taxonomy; the ability to classify intentions
(display an image by actually displaying an image or
through a description)

GJR: the author may intend that the user see an image,
but is actually attempting to communicate a specific
idea that can be communicated/represented in a
number of modalities

Amir: good point; look at UIML to ascertain whether it
meets device independence requirements, particularly
its layers of abstraction; what is the author's intent?  to
communicate a specific idea, and UIML gives us a
mechanism with which to do this
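
A closing sketch of that intent-vs-rendering distinction
(illustrative only; nothing here is UIML, and the structure and
names are hypothetical): one authored intent with alternative
representations, realized per modality.

```python
# Hypothetical: one intent, several renderings; the device (or user)
# determines which modality actually carries the idea.
intent = {
    "purpose": "communicate the quarterly sales trend",
    "renderings": {
        "image":  "sales-trend.png",
        "text":   "Sales rose steadily from July through September.",
        "speech": "Sales rose steadily from July through September.",
    },
}

def realize(intent, available_modalities):
    """Pick the first rendering the device can actually present."""
    for modality in ("image", "text", "speech"):
        if modality in available_modalities:
            return modality, intent["renderings"][modality]
    return None, intent["purpose"]  # fall back to the bare intent

print(realize(intent, {"speech"}))
print(realize(intent, {"image", "text"}))
```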