The results of this questionnaire are available to anybody. In addition, answers are sent to the following email addresses: ashimura@w3.org, dahl@conversational-technologies.com, jim@larson-tech.com.
This questionnaire was open from 2007-05-06 to 2007-06-30.
18 answers have been received.
To what client device(s) should we target multimodal application specification languages?
Please rank your priorities. Note that 5 represents your highest priority and 1 your lowest priority.
Choice | 1 | 2 | 3 | 4 | 5 | 6
---|---|---|---|---|---|---
Desktop PC's | 4 | 2 | 1 | 1 | 7 | 2 |
Laptops | 3 | 1 | 2 | 4 | 5 | 2 |
Cell phones | 1 | 3 | 12 | 1 | ||
Home appliances: environment (thermostat, lights, etc.) | 2 | 2 | 5 | 5 | 2 | 1 |
Home appliances: entertainment (TV, radio, video recorder, gaming devices, etc.) | 1 | 1 | 7 | 7 | 1 | |
Home appliances: food management (microwave, refrigerator, coffeepot, etc.) | 5 | 6 | 2 | 1 | 3 | |
Home appliances: security (outside lights, locks, etc.) | 2 | 1 | 4 | 4 | 3 | 3 |
Home appliances: communication (telephone, chat, etc.) | 1 | 5 | 2 | 8 | 1 | |
Kiosks | 3 | 2 | 1 | 5 | 2 | 4 |
Automobiles | 1 | 5 | 2 | 8 | 1 | |
Others (Please specify below) | 1 | 1 | 1 | 14 |
Averages:
Choice | Average |
---|---|
Desktop PC's | 3.65 |
Laptops | 3.76 |
Cell phones | 4.76 |
Home appliances: environment (thermostat, lights, etc.) | 3.35 |
Home appliances: entertainment (TV, radio, video recorder, gaming devices, etc.) | 4.35 |
Home appliances: food management (microwave, refrigerator, coffeepot, etc.) | 3.18 |
Home appliances: security (outside lights, locks, etc.) | 3.82 |
Home appliances: communication (telephone, chat, etc.) | 4.12 |
Kiosks | 3.76 |
Automobiles | 4.18 |
Others (Please specify below) | 5.65 |
Responder | Desktop PC's | Laptops | Cell phones | Home appliances: environment (thermostat, lights, etc.) | Home appliances: entertainment (TV, radio, video recorder, gaming devices, etc.) | Home appliances: food management (microwave, refrigerator, coffeepot, etc.) | Home appliances: security (outside lights, locks, etc.) | Home appliances: communication (telephone, chat, etc.) | Kiosks | Automobiles | Others (Please specify below) | Comments |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Lawrence Catchpole | 5 | 5 | 5 | 2 | 3 | 4 | 2 | 5 | 5 | 3 | 6 | |
Patrick Nepper | 5 | 5 | 5 | 1 | 2 | 1 | 1 | 3 | 4 | 3 | 6 | |
hiromi honda | ||||||||||||
Nicholas Jones | 3 | 4 | 5 | 3 | 4 | 3 | 3 | 3 | 3 | 3 | 4 | The falling cost of simple computing combined with wireless networking means that it's likely there will be a very wide range of future devices that don't fit into today's neat categories such as "cellphone" or "home entertainment device". Furthermore, I expect that in the longer-term future, user interfaces will become disassociated from individual devices, with multiple devices participating in a user experience. E.g. use cases like "follow me" video calls which switch between the mobile phone and the TV as I move around the house. Therefore specifications should be as open as possible to new devices and new multi-device interaction paradigms. |
KATERINA PASTRA | 5 | 5 | 3 | 4 | 5 | 3 | 3 | 5 | 5 | 5 | 6 | |
Simon Harper | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 5 | This seems to be the wrong way of looking at things to me. I think that by asking these questions you pre-judge a use case / scenario. This kind of decision - by developers - is exactly the kind of thing that has created problems in the Disability community from the outset. Is MM interaction important? If the answer is yes, then we should make sure all devices can handle this at a rating of 5. |
Jose ROUILLARD | 5 | 4 | 5 | 3 | 4 | 3 | 3 | 3 | 1 | 2 | 3 | Healthcare (patient augmented room, smart night table ...) |
Kostas Karpouzis | 5 | 1 | 4 | 1 | 5 | 1 | 4 | 5 | 4 | 4 | 6 | |
Alex Pfalzgraf | 1 | 2 | 5 | 4 | 4 | 4 | 4 | 4 | 4 | 5 | 6 | |
Norbert Reithinger | 1 | 3 | 5 | 4 | 5 | 3 | 4 | 4 | 1 | 5 | 6 | Multimodal interaction will be most beneficial outside the traditional desktop: i.e. while on the go (mobile phone, car) and when controlling remote devices (home appliances). The Wii's success with multimodal interaction (compared to the technically superior but traditional PS3) shows that gaming is also quite interesting for MMI. |
Jan Alexanderson | 6 | 6 | 4 | 5 | 5 | 6 | 5 | 5 | 6 | 5 | 6 | |
Quan Nguyen | 5 | 5 | 5 | 4 | 4 | 1 | 4 | 1 | 1 | 5 | 6 | |
Ali Choumane | 2 | 4 | 5 | 3 | 4 | 6 | 6 | 5 | 6 | 5 | 6 | |
Garland Phillips | 1 | 1 | 5 | 3 | 4 | 3 | 6 | 3 | 4 | 3 | 6 | |
Hirotaka Ueda | 4 | 4 | 5 | 4 | 5 | 3 | 3 | 5 | 2 | 3 | 6 | |
Massimo Romanelli | 1 | 1 | 4 | 5 | 5 | 5 | 5 | 5 | 6 | 5 | 6 | |
Daniel Sonntag | 2 | 3 | 5 | 2 | 4 | 1 | 1 | 3 | 2 | 4 | 6 | |
Yoshitaka Tokusho | 5 | 5 | 5 | 3 | 5 | 1 | 5 | 5 | 4 | 5 | 6 |
What input and output methods should be standardized?
Please rank your priorities. Note that 5 represents your highest priority and 1 your lowest priority.
Choice | 1 | 2 | 3 | 4 | 5 | 6
---|---|---|---|---|---|---
Graphical display | 2 | 4 | 10 | 1 | ||
Simple DTMF phone keypad | 2 | 2 | 3 | 6 | 4 | |
Blackberry-style keypad | 1 | 6 | 4 | 1 | 1 | 4 |
Computer keyboard | 3 | 2 | 4 | 6 | 2 | |
Pointing device (mouse, joystick, touch screen, etc.) | 1 | 2 | 3 | 10 | 1 | |
Automatic Speech Recognition | 1 | 1 | 3 | 11 | 1 | |
Text to Speech Synthesis | 2 | 4 | 10 | 1 | ||
Audio capture and replay | 1 | 4 | 2 | 5 | 5 | |
still and video image capture and their interpretation (e.g. bar codes, QR codes, etc.) | 2 | 5 | 2 | 6 | 2 | |
Ink capture and handwriting recognition | 2 | 4 | 3 | 7 | 1 | |
Sensor input (GPS, temperature, wind velocity, etc.) | 2 | 3 | 2 | 5 | 3 | 2 |
Biometrics (please give examples below) | 2 | 3 | 1 | 1 | 3 | 7 |
Haptics (wii-style interface, etc.) | 2 | 1 | 1 | 10 | 3 | |
Others (please specify below) | 1 | 2 | 14 |
Averages:
Choice | Average |
---|---|
Graphical display | 4.59 |
Simple DTMF phone keypad | 4.06 |
Blackberry-style keypad | 3.41 |
Computer keyboard | 3.59 |
Pointing device (mouse, joystick, touch screen, etc.) | 4.41 |
Automatic Speech Recognition | 4.59 |
Text to Speech Synthesis | 4.47 |
Audio capture and replay | 4.53 |
still and video image capture and their interpretation (e.g. bar codes, QR codes, etc.) | 4.06 |
Ink capture and handwriting recognition | 3.94 |
Sensor input (GPS, temperature, wind velocity, etc.) | 3.59 |
Biometrics (please give examples below) | 4.24 |
Haptics (wii-style interface, etc.) | 4.65 |
Others (please specify below) | 5.76 |
Responder | Graphical display | Simple DTMF phone keypad | Blackberry-style keypad | Computer keyboard | Pointing device (mouse, joystick, touch screen, etc.) | Automatic Speech Recognition | Text to Speech Synthesis | Audio capture and replay | still and video image capture and their interpretation (e.g. bar codes, QR codes, etc.) | Ink capture and handwriting recognition | Sensor input (GPS, temperature, wind velocity, etc.) | Biometrics (please give examples below) | Haptics (wii-style interface, etc.) | Others (please specify below) | Comments |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Lawrence Catchpole | 5 | 5 | 3 | 5 | 5 | 5 | 5 | 3 | 5 | 5 | 2 | 2 | 6 | 6 | |
Patrick Nepper | 5 | 5 | 2 | 3 | 3 | 5 | 5 | 5 | 2 | 4 | 4 | 6 | 5 | 6 | |
hiromi honda | |||||||||||||||
Nicholas Jones | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | Biometrics might include both current and new technologies such as EMG and expression recognition. Referring to my earlier comment, the user interface may (must) be able to integrate multiple clues to the user's intention. E.g. if an accelerometer on a phone suggests I'm running and a GPS shows I'm in an airport, then I'm probably running to catch a plane. This implies the use of certain UI modalities, e.g. voice interactions only. The questionnaire suggests some implicit assumptions which might not be correct during the next decade. E.g. that the UI is always intentional - the user specifies exactly what he/she wants to do. Whereas UIs will likely become more deductive, assembling clues from multiple sensors. Also, referring to my earlier comment, there is no reason for a UI to be bound to a single interaction model for the duration of an interaction. |
KATERINA PASTRA | 5 | 3 | 3 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 6 | |
Simon Harper | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 5 | I think there should not be a standard input and output method, but a standard method FOR input output which can be used for all of the technologies above. It may be more difficult to implement now, but it will pay off in the future. |
Jose ROUILLARD | 5 | 5 | 6 | 5 | 5 | 5 | 5 | 5 | 4 | 5 | 4 | 3 | 3 | 6 | |
Kostas Karpouzis | 4 | 5 | 1 | 5 | 5 | 5 | 4 | 3 | 3 | 1 | 1 | 1 | 2 | 6 | |
Alex Pfalzgraf | 4 | 3 | 4 | 1 | 5 | 5 | 5 | 6 | 5 | 4 | 4 | 6 | 5 | 6 | |
Norbert Reithinger | 3 | 1 | 2 | 1 | 5 | 3 | 2 | 4 | 2 | 4 | 3 | 1 | 5 | 6 | |
Jan Alexanderson | 4 | 3 | 3 | 3 | 5 | 5 | 5 | 6 | 6 | 3 | 4 | 6 | 5 | 6 | |
Quan Nguyen | 5 | 6 | 6 | 5 | 5 | 4 | 4 | 4 | 5 | 5 | 3 | 6 | 5 | 6 | |
Ali Choumane | 5 | 6 | 6 | 1 | 1 | 5 | 5 | 6 | 3 | 5 | 6 | 6 | 6 | 4 | gaze as input |
Garland Phillips | 4 | 2 | 2 | 2 | 3 | 5 | 5 | 3 | 3 | 3 | 2 | 2 | 2 | 6 | |
Hirotaka Ueda | 5 | 5 | 3 | 3 | 4 | 4 | 4 | 3 | 3 | 3 | 4 | 4 | 4 | 6 | |
Massimo Romanelli | 5 | 6 | 2 | 6 | 5 | 2 | 2 | 6 | 4 | 1 | 1 | 6 | 5 | 6 | |
Daniel Sonntag | 3 | 2 | 2 | 2 | 4 | 4 | 4 | 2 | 3 | 3 | 2 | 2 | 5 | 6 | |
Yoshitaka Tokusho | 5 | 1 | 2 | 3 | 4 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 6 |
Which combinations of the above modalities are you interested in?
Please rank your priorities. Note that 5 represents your highest priority and 1 your lowest priority.
Choice | 1 | 2 | 3 | 4 | 5 | 6
---|---|---|---|---|---|---
voice dialog components (Speech recognition, Speech synthesis) | 6 | 10 | 1 | |||
Lip-reading (audio-visual speech recognition) | 3 | 3 | 6 | 1 | 2 | 2 |
GPS and voice (location-based services) | 1 | 5 | 2 | 5 | 3 | 1 |
other combinations (please specify below) | 1 | 9 | 7 |
Averages:
Choice | Average |
---|---|
voice dialog components (Speech recognition, Speech synthesis) | 4.71 |
Lip-reading (audio-visual speech recognition) | 3.12 |
GPS and voice (location-based services) | 3.41 |
other combinations (please specify below) | 5.29 |
Responder | voice dialog components (Speech recognition, Speech synthesis) | Lip-reading (audio-visual speech recognition) | GPS and voice (location-based services) | other combinations (please specify below) | Comments |
---|---|---|---|---|---|
Lawrence Catchpole | 5 | 3 | 3 | 6 | |
Patrick Nepper | 5 | 3 | 4 | 5 | GUI+Voice+Haptics realized e.g. by XHTML+Voice plus NFC integration |
hiromi honda | |||||
Nicholas Jones | 5 | 2 | 5 | 5 | Combinations of any and all input technologies may be used going beyond those suggested here. E.g. combining eye tracking and voice recognition so when I say "stop" we can deduce that I'm probably instructing the device I'm looking towards. Another issue to be addressed is what happens when there are multiple devices / applications monitoring the same source of user input. E.g. watching my expression or listening to my voice. How do we disambiguate which input is routed to which system? |
KATERINA PASTRA | 4 | 3 | 4 | 6 | |
Simon Harper | 6 | 6 | 6 | 5 | Non-conventional forms such as sign language, etc. |
Jose ROUILLARD | 5 | 3 | 4 | 3 | Gesture + voice (Bolt like, SVG ?). |
Kostas Karpouzis | 4 | 5 | 1 | 6 | |
Alex Pfalzgraf | 5 | 2 | 5 | 5 | Graphics + Speech (ASR/TTS) + Pointing Devices/Haptics + Sensor Input |
Norbert Reithinger | 4 | 1 | 3 | 5 | Most interesting will be haptic devices (e.g. accelerometers in smartphones) and the semantics and interpretation of the signals. Voice is mature and will be successful within its current boundaries unless recognition quality gets better in real-life environments. |
Jan Alexanderson | 5 | 3 | 4 | 5 | Input: speech and (hand or pen) gestures Output: speech, graphics, other sounds, vibrations etc. |
Quan Nguyen | 5 | 3 | 2 | 6 | |
Ali Choumane | 4 | 4 | 4 | 5 | (Speech recognition, handwriting recognition, speech synthesis, visual) |
Garland Phillips | 5 | 1 | 2 | 6 | |
Hirotaka Ueda | 4 | 2 | 2 | 6 | |
Massimo Romanelli | 5 | 6 | 2 | 5 | The interplay of speech, gesture and haptics is a basic requirement for spontaneous and successful human interaction with devices of any kind, depending on scenario and content (multimedia manipulation, menu navigation). |
Daniel Sonntag | 4 | 1 | 2 | 5 | Haptics on device/touchscreen and speech |
Yoshitaka Tokusho | 5 | 5 | 5 | 6 |
Where do things run?
Please rank your priorities. Note that 5 represents your highest priority and 1 your lowest priority.
Choice | 1 | 2 | 3 | 4 | 5 | 6
---|---|---|---|---|---|---
On the client | 3 | 2 | 1 | 2 | 7 | 2 |
On the server | 4 | 2 | 4 | 2 | 3 | 2 |
On the client and also on the server (e.g. a simple ASR on the client and a powerful ASR on the server so that the device can function in a limited mode when not connected) | 2 | 4 | 6 | 4 | 1 | |
Distributed between client and server (e.g. extract speech features on the client, send to the server where powerful ASR does the rest) | 2 | 2 | 4 | 6 | 3 | |
Others (please specify below) | 1 | 16 |
Averages:
Choice | Average |
---|---|
On the client | 3.82 |
On the server | 3.24 |
On the client and also on the server (e.g. a simple ASR on the client and a powerful ASR on the server so that the device can function in a limited mode when not connected) | 3.76 |
Distributed between client and server (e.g. extract speech features on the client, send to the server where powerful ASR does the rest) | 4.12 |
Others (please specify below) | 5.94 |
Responder | On the client | On the server | On the client and also on the server (e.g. a simple ASR on the client and a powerful ASR on the server so that the device can function in a limited mode when not connected) | Distributed between client and server (e.g. extract speech features on the client, send to the server where powerful ASR does the rest) | Others (please specify below) | Comments |
---|---|---|---|---|---|---|
Lawrence Catchpole | 5 | 2 | 3 | 4 | 6 | |
Patrick Nepper | 5 | 1 | 1 | 1 | 6 | |
hiromi honda | ||||||
Nicholas Jones | 5 | 5 | 5 | 5 | 5 | Given the huge variation in computing capability from the simplest clients (such as a sensor node with a single-pixel display) to the most complex (e.g. a PC), no assumptions can be made about UI distribution. Also, why can't we have the UI operating in a peer-to-peer mode, distributed across multiple clients? This fits the model I mentioned earlier where many devices may co-operate in an interaction. |
KATERINA PASTRA | 1 | 5 | 5 | 5 | 6 | |
Simon Harper | 5 | 1 | 1 | 1 | 6 | |
Jose ROUILLARD | 6 | 6 | 6 | 6 | 6 | |
Kostas Karpouzis | 4 | 2 | 4 | 6 | 6 | |
Alex Pfalzgraf | 2 | 5 | 4 | 5 | 6 | |
Norbert Reithinger | 5 | 1 | 3 | 2 | 6 | As the mobile client changes frequently (c.f. the model changes for smartphones), client-side solutions must be reusable and adhere to standards not tied to a specific vendor. |
Jan Alexanderson | 6 | 6 | 4 | 4 | 6 | |
Quan Nguyen | 5 | 3 | 3 | 6 | 6 | |
Ali Choumane | 1 | 1 | 3 | 5 | 6 | |
Garland Phillips | 5 | 3 | 5 | 4 | 6 | |
Hirotaka Ueda | 3 | 3 | 4 | 4 | 6 | |
Massimo Romanelli | 2 | 4 | 4 | 5 | 6 | |
Daniel Sonntag | 1 | 3 | 4 | 2 | 6 | I think on this abstract level this is more a question of belief than a concrete consideration. |
Yoshitaka Tokusho | 4 | 4 | 5 | 5 | 6 |
What formats should be used to exchange data among processors?
Please rank your priorities. Note that 5 represents your highest priority and 1 your lowest priority.
Choice | 1 | 2 | 3 | 4 | 5 | 6
---|---|---|---|---|---|---
Flat file | 4 | 3 | 3 | 1 | 2 | 4 |
ECMAScript/JavaScript | 8 | 3 | 2 | 1 | 3 | |
XML | 1 | 1 | 4 | 9 | 2 | |
EMMA | 1 | 1 | 7 | 4 | 4 | |
Others (please specify below) | 1 | 16 |
Averages:
Choice | Average |
---|---|
Flat file | 3.35 |
ECMAScript/JavaScript | 2.47 |
XML | 4.59 |
EMMA | 4.53 |
Others (please specify below) | 5.94 |
Responder | Flat file | ECMAScript/JavaScript | XML | EMMA | Others (please specify below) | Comments |
---|---|---|---|---|---|---|
Lawrence Catchpole | 3 | 3 | 5 | 2 | 6 | |
Patrick Nepper | 6 | 6 | 6 | 6 | 6 | |
hiromi honda | ||||||
Nicholas Jones | 5 | 2 | 5 | 4 | 5 | This may depend on the granularity and timeliness of the information. E.g. some form of RPC might be best for real-time interactions. Also it may depend on the capability of the participating devices. E.g. sensor nodes probably don't want to parse XML. |
KATERINA PASTRA | 3 | 3 | 5 | 4 | 6 | |
Simon Harper | 5 | 1 | 5 | 6 | 6 | |
Jose ROUILLARD | 2 | 1 | 4 | 4 | 6 | |
Kostas Karpouzis | 2 | 1 | 4 | 5 | 6 | |
Alex Pfalzgraf | 1 | 1 | 5 | 5 | 6 | |
Norbert Reithinger | 1 | 1 | 2 | 5 | 6 | Prestructured XML schemata like EMMA help a lot with interoperability. The script-based solutions lack clarity. |
Jan Alexanderson | 1 | 1 | 3 | 4 | 6 | |
Quan Nguyen | 1 | 6 | 5 | 4 | 6 | |
Ali Choumane | 6 | 1 | 4 | 6 | 6 | |
Garland Phillips | 6 | 6 | 6 | 6 | 6 | |
Hirotaka Ueda | 2 | 2 | 5 | 3 | 6 | |
Massimo Romanelli | 6 | 1 | 5 | 5 | 6 | |
Daniel Sonntag | 3 | 2 | 4 | 4 | 6 | |
Yoshitaka Tokusho | 4 | 4 | 5 | 4 | 6 |
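XML and EMMA rank highest among the exchange formats above. As a point of reference, the sketch below shows the general shape of an EMMA document for annotating user input (element and attribute names follow the W3C EMMA drafts of the period); the flight-query payload and the confidence value are hypothetical:

```xml
<!-- Hypothetical EMMA result for a spoken flight query.
     The application payload (<destination>) is illustrative only. -->
<emma:emma version="1.0" xmlns:emma="http://www.w3.org/2003/04/emma">
  <emma:interpretation id="interp1"
                       emma:medium="acoustic"
                       emma:mode="voice"
                       emma:confidence="0.8"
                       emma:tokens="flights to boston">
    <destination>Boston</destination>
  </emma:interpretation>
</emma:emma>
```

The annotations (medium, mode, confidence) are what make such a container useful across modalities: the same wrapper can carry ink, keypad, or speech interpretations.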
What languages should be reused in multimodal applications to perform the following functions? Some possibilities are listed with each function.
Please rank your priorities. Note that 5 represents your highest priority and 1 your lowest priority.
Choice | 1 | 2 | 3 | 4 | 5 | 6
---|---|---|---|---|---|---
Dialog management(SCXML, VoiceXML,CCXML, scripting, server-side programs) | 2 | 1 | 4 | 9 | 1 | |
Synchronization of modalities (SCXML, VoiceXML, CCXML, scripting, server-side programs) | 2 | 1 | 5 | 6 | 3 | |
Presentation (HTML/XHTML, SSML, SVG, SMIL, VoiceXML, SRGS, SISR, PLS) | 1 | 3 | 4 | 8 | 1 | |
Data extraction (XSLT, XPath, XLink, DOM) | 2 | 7 | 4 | 2 | 2 | |
user input representation (EMMA, InkML, ECMAScript objects, generic XML) | 1 | 7 | 7 | 2 | ||
Access to modalities (a set of standardized APIs) | 2 | 1 | 4 | 4 | 4 | 2 |
Others (please specify below) | 17 |
Averages:
Choice | Average |
---|---|
Dialog management(SCXML, VoiceXML,CCXML, scripting, server-side programs) | 4.35 |
Synchronization of modalities (SCXML, VoiceXML, CCXML, scripting, server-side programs) | 4.41 |
Presentation (HTML/XHTML, SSML, SVG, SMIL, VoiceXML, SRGS, SISR, PLS) | 4.29 |
Data extraction (XSLT, XPath, XLink, DOM) | 3.71 |
user input representation (EMMA, InkML, ECMAScript objects, generic XML) | 4.59 |
Access to modalities (a set of standardized APIs) | 3.76 |
Others (please specify below) | 6.00 |
Responder | Dialog management(SCXML, VoiceXML,CCXML, scripting, server-side programs) | Synchronization of modalities (SCXML, VoiceXML, CCXML, scripting, server-side programs) | Presentation (HTML/XHTML, SSML, SVG, SMIL, VoiceXML, SRGS, SISR, PLS) | Data extraction (XSLT, XPath, XLink, DOM) | user input representation (EMMA, InkML, ECMAScript objects, generic XML) | Access to modalities (a set of standardized APIs) | Others (please specify below) | Comments |
---|---|---|---|---|---|---|---|---|
Lawrence Catchpole | 5 | 5 | 5 | 4 | 5 | 6 | 6 | |
Patrick Nepper | 5 | 5 | 5 | 3 | 4 | 1 | 6 | Please reconsider XHTML+Voice. I believe it was a very promising step forward! |
hiromi honda | ||||||||
Nicholas Jones | 5 | 4 | 5 | 3 | 5 | 5 | 6 | I haven't thought about this in detail, but we probably have a variety of different roles which may need different languages. E.g. orchestrating a UI involving multiple devices may need some high-level framework, and talking to a GPS needs a low-level API. |
KATERINA PASTRA | 5 | 5 | 5 | 3 | 4 | 5 | 6 | |
Simon Harper | 4 | 6 | 5 | 4 | 6 | 5 | 6 | |
Jose ROUILLARD | 5 | 4 | 4 | 4 | 4 | 3 | 6 | |
Kostas Karpouzis | 4 | 5 | 5 | 3 | 5 | 3 | 6 | |
Alex Pfalzgraf | 5 | 5 | 4 | 5 | 5 | 4 | 6 | |
Norbert Reithinger | 5 | 5 | 3 | 3 | 5 | 3 | 6 | |
Jan Alexanderson | 2 | 2 | 2 | 2 | 4 | 4 | 6 | Dialog management: most of these things, e.g. VoiceXML, are too limited to be really useful for natural and intuitive applications. All the things mentioned lack a discourse memory, which is vital to our applications. Synchronization is relevant, but not using VoiceXML (I don't understand that, btw). Data extraction is irrelevant. |
Quan Nguyen | 5 | 6 | 6 | 6 | 6 | 6 | 6 | |
Ali Choumane | 6 | 6 | 4 | 6 | 5 | 5 | 6 | |
Garland Phillips | 2 | 3 | 3 | 3 | 4 | 3 | 6 | |
Hirotaka Ueda | 4 | 4 | 5 | 3 | 4 | 4 | 6 | |
Massimo Romanelli | 5 | 4 | 3 | 5 | 5 | 1 | 6 | |
Daniel Sonntag | 3 | 2 | 5 | 2 | 3 | 2 | 6 | |
Yoshitaka Tokusho | 4 | 4 | 4 | 4 | 4 | 4 | 6 |
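SCXML appears above as a candidate for both dialog management and synchronization of modalities. A minimal sketch of an SCXML state chart (the state and event names here are hypothetical) shows the flavor of that approach:

```xml
<!-- Hypothetical dialog skeleton: state and event names are illustrative. -->
<scxml xmlns="http://www.w3.org/2005/07/scxml" version="1.0" initial="welcome">
  <state id="welcome">
    <!-- Advance once any user input (voice, keypad, ink, ...) arrives -->
    <transition event="user.input" target="confirm"/>
  </state>
  <state id="confirm">
    <transition event="user.yes" target="done"/>
    <transition event="user.no" target="welcome"/>
  </state>
  <final id="done"/>
</scxml>
```

Because transitions fire on abstract events rather than on a specific modality, the same chart can coordinate several input components at once.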
For each of the functions above, which should be based on XML and which should be based on other approaches?
Please check the box if you think it should be XML based.
Choice | All responders |
---|---|
Results | |
Dialog management | 12 |
Presentation | 13 |
Data extraction | 9 |
User input representation | 14 |
Access to modalities | 11 |
Others (please specify below) | 2 |
Responder | XML based or not | Comments |
---|---|---|
Lawrence Catchpole | | |
Patrick Nepper | | |
hiromi honda | | |
Nicholas Jones | | XML is fine, but IMHO can't be the only format. Many devices (like cellphones or dumb sensor nodes) won't be able to deal with XML easily. |
KATERINA PASTRA | | |
Simon Harper | | |
Jose ROUILLARD | | |
Kostas Karpouzis | | |
Alex Pfalzgraf | | |
Norbert Reithinger | | |
Jan Alexanderson | | XML or not is irrelevant. The only thing that is good about XML is schemata. Otherwise it is just as good with Lisp lists, Prolog lists, or any other well-definable syntactic formalism. |
Quan Nguyen | | |
Ali Choumane | | |
Garland Phillips | | |
Hirotaka Ueda | | |
Massimo Romanelli | | |
Daniel Sonntag | | |
Yoshitaka Tokusho | | |
What use cases are important to you?
Responder | Use cases |
---|---|
Lawrence Catchpole | |
Patrick Nepper | (Location-based) Mobile access to the web, accessibility for disabled people and the elderly. |
hiromi honda | |
Nicholas Jones | Everything? |
KATERINA PASTRA | Multimodal fusion for indexing, retrieval and summarization |
Simon Harper | Use cases are misleading IMO, because they pre-judge the user and are filtered through technologists. In real life the number of use cases is so extreme that designing with this in mind is the only real solution. |
Jose ROUILLARD | |
Kostas Karpouzis | |
Alex Pfalzgraf | Web-based services via smartphone (esp. location-based services); combination of home appliance control with web-based services; combination of car control with web-based services (Car2X) and sensor input |
Norbert Reithinger | Use cases that will dominate the multimodal area will address mobile users that are not tied to a big screen or other stationary devices. |
Jan Alexanderson | |
Quan Nguyen | |
Ali Choumane | |
Garland Phillips | |
Hirotaka Ueda | Various devices around user coordinate with each other and provide various services according to the user-context. |
Massimo Romanelli | Interaction with smartphones, multimedia manipulation |
Daniel Sonntag | Different end devices run by same dialogue engine. |
Yoshitaka Tokusho | The personal computing area is most important because we spend most of our working time in front of a personal computer, working on the computer display. |
If you already have a version of multimodal applications/solutions, how important is the ability to plug-in/reuse the modality components you have already developed?
Please rank your priorities. Note that 5 represents your highest priority and 1 your lowest priority.
Choice | 1 | 2 | 3 | 4 | 5 | 6
---|---|---|---|---|---|---
ability to plug-in/reuse the modality components | 1 | 1 | 2 | 9 | 4 |
Averages:
Choice | Average |
---|---|
ability to plug-in/reuse the modality components | 4.76 |
Responder | ability to plug-in/reuse the modality components | Comments |
---|---|---|
Lawrence Catchpole | 5 | |
Patrick Nepper | 5 | We have already created some prototypes using VoiceXML and XHTML+Voice, respectively. Besides some minor problems with X+V, we are satisfied with their simplicity and degree of abstraction (e.g. as opposed to EMMA). |
hiromi honda | ||
Nicholas Jones | 6 | |
KATERINA PASTRA | 5 | |
Simon Harper | 1 | |
Jose ROUILLARD | 5 | |
Kostas Karpouzis | 5 | |
Alex Pfalzgraf | 4 | |
Norbert Reithinger | 5 | I was involved in multiple multimodal projects, and (partial) reusability is an absolute must. See also my remark above about model changes. |
Jan Alexanderson | 5 | |
Quan Nguyen | 5 | |
Ali Choumane | 4 | |
Garland Phillips | 6 | |
Hirotaka Ueda | 6 | |
Massimo Romanelli | 5 | |
Daniel Sonntag | 3 | |
Yoshitaka Tokusho | 6 |
Please feel free to add any comments, justifications, and additional responses that you feel are important.
Responder | General comments |
---|---|
Lawrence Catchpole | |
Patrick Nepper | Again, I would like to see a revitalization of XHTML+Voice. Both XHTML and VoiceXML have already proven their advantages for their respective modalities. Combining the two to build a multimodal specification language seems very promising to me (as far as our prototype implementations are concerned). |
hiromi honda | |
Nicholas Jones | I think maybe there needs to be an explicit statement about how UI design is expected to evolve so we can discuss the target. I suspect we have at least 3 stages: (1) simple UI (like a GUI); (2) composite UI involving lots of input/output technologies; (3) environmental UI involving lots of separate devices in an intelligent environment. |
KATERINA PASTRA | |
Simon Harper | |
Jose ROUILLARD | |
Kostas Karpouzis | |
Alex Pfalzgraf | |
Norbert Reithinger | Excuse any typos and glitches of a non-native speaker :-) |
Jan Alexanderson | |
Quan Nguyen | |
Ali Choumane | |
Garland Phillips | |
Hirotaka Ueda | |
Massimo Romanelli | |
Daniel Sonntag | I think the form is well-structured, but more concrete scenarios should be considered. I think different integration environments and characteristics, e.g. time, effort, and skills, naturally result in different standardisation requirements. |
Yoshitaka Tokusho |