Or a network of Web services

III.2. Input Modalities

Need Grammars: SRGS and InkML

SRGS

<one-of>
  <item>Michael</item>
  <item>Yuriko</item>
  <item>Mary</item>
  <item>Duke</item>
  <item><ruleref uri="#otherNames"/></item>
</one-of>

<one-of><item>1</item> <item>2</item> <item>3</item></one-of>

<one-of>
  <item weight="10">small</item>
  <item weight="2">medium</item>
  <item>large</item>
</one-of>

<one-of>
  <item weight="3.1415">pie</item>
  <item weight="1.414">root beer</item>
  <item weight=".25">cola</item>
</one-of>
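
The fragments above are alternatives lists, not complete grammars. As a rough sketch of how the first one might sit inside a full SRGS 1.0 XML grammar: the rule id "name", the language tag and the contents of the "otherNames" rule are assumptions made for illustration, not taken from the slides.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: rule id "name" and the body of "otherNames" are hypothetical -->
<grammar xmlns="http://www.w3.org/2001/06/grammar"
         version="1.0" xml:lang="en-US" mode="voice" root="name">

  <!-- Root rule: matches exactly one of the listed names -->
  <rule id="name" scope="public">
    <one-of>
      <item>Michael</item>
      <item>Yuriko</item>
      <item>Mary</item>
      <item>Duke</item>
      <!-- Delegates to the local rule defined below -->
      <item><ruleref uri="#otherNames"/></item>
    </one-of>
  </rule>

  <!-- Local rule referenced above; these names are invented placeholders -->
  <rule id="otherNames">
    <one-of>
      <item>Alice</item>
      <item>Bob</item>
    </one-of>
  </rule>
</grammar>
```

The weight attribute shown in the later fragments is a relative hint to the recognizer: an item with a larger weight is treated as more likely than its siblings, and an item without one defaults to a weight of 1.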

Defining handwritten gestures and grammar: InkML

<ink>
   <trace>
     10 0 9 14 8 28 7 42 6 56 6 70 8 84 8 98 8 112 9 126 10 140
     13 154 14 168 17 182 18 188 23 174 30 160 38 147 49 135
     58 124 72 121 77 135 80 149 82 163 84 177 87 191 93 205
   </trace>
   <trace>
     130 155 144 159 158 160 170 154 179 143 179 129 166 125
     152 128 140 136 131 149 126 163 124 177 128 190 137 200
     150 208 163 210 178 208 192 201 205 192 214 180
   </trace>
   <trace>
     227 50 226 64 225 78 227 92 228 106 228 120 229 134
     230 148 234 162 235 176 238 190 241 204
   </trace>
   <trace>
     282 45 281 59 284 73 285 87 287 101 288 115 290 129
     291 143 294 157 294 171 294 185 296 199 300 213
   </trace>
   <trace>
     366 130 359 143 354 157 349 171 352 185 359 197
     371 204 385 205 398 202 408 191 413 177 413 163
     405 150 392 143 378 141 365 150
   </trace>
</ink>

III.3. Annotating and Combining Input

EMMA: annotating input

Goals:

EMMA: example

<emma:emma emma:version="1.0"
 xmlns:emma="http://www.w3.org/2003/04/emma#"> 
  <emma:one-of emma:id="r1" 
      emma:start="2003-03-26T00:00:00.15"
      emma:end="2003-03-26T00:00:00.2">
    <emma:interpretation emma:id="int1" emma:confidence="0.75" > 
      <origin>Boston</origin>
      <destination>Denver</destination>
      <date> 
         <emma:absolute-timestamp
          emma:start="2003-03-26T00:00:00.15"
          emma:end="2003-03-26T00:00:00.2"/> 
         03112003 
      </date>  
    </emma:interpretation>
    <emma:interpretation emma:id="int2" emma:confidence="0.68" >
      <origin>Austin</origin>
      <destination>Denver</destination>
      <date>03112003</date>
    </emma:interpretation>
  </emma:one-of>
</emma:emma>

III.4. Compositing

Making one EMMA document out of two
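
As a hedged illustration of what such a combination might look like, the sketch below merges a spoken destination and a pen-selected origin into one interpretation. It uses the element and attribute names of the later EMMA 1.0 Recommendation (emma:derivation, emma:derived-from with composite="true"), which differ from the draft syntax in the example above; the ids and the split between voice and ink are assumptions.

```xml
<!-- Sketch only: ids and the voice/ink split are hypothetical -->
<emma:emma version="1.0"
    xmlns:emma="http://www.w3.org/2003/04/emma">
  <!-- Combined result: takes the origin from ink and the destination from voice -->
  <emma:interpretation id="combined1"
      emma:medium="acoustic tactile" emma:mode="voice ink">
    <emma:derived-from resource="#speech1" composite="true"/>
    <emma:derived-from resource="#pen1" composite="true"/>
    <origin>Boston</origin>
    <destination>Denver</destination>
  </emma:interpretation>
  <!-- The two single-modality inputs the combined result was built from -->
  <emma:derivation>
    <emma:interpretation id="speech1"
        emma:medium="acoustic" emma:mode="voice">
      <destination>Denver</destination>
    </emma:interpretation>
    <emma:interpretation id="pen1"
        emma:medium="tactile" emma:mode="ink">
      <origin>Boston</origin>
    </emma:interpretation>
  </emma:derivation>
</emma:emma>
```

Keeping the original inputs under emma:derivation lets a consumer trace each value in the combined interpretation back to the modality that produced it.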

III.5. The Dynamic Properties Framework

The System and Environment (S+E) specification defines an API for accessing system properties, e.g.:

DPF: example

<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:ev="http://www.w3.org/2001/xml-events">
  <head>
    <title>GPS location example</title>
    <script type="text/javascript">
    <![CDATA[
      SystemEnvironment.location.format="zip code";
      SystemEnvironment.location.updateFrequency="20s";
    ]]>
    </script>
    <script defer="defer" type="text/javascript" 
      ev:event="se:locationUpdate">
      <![CDATA[
        var field = document.getElementById("location");
        var zipcode = SystemEnvironment.location;
        field.childNodes[0].nodeValue = zipcode;
      ]]>
    </script>
  </head>
  <body>
    <h1>Track your location as you walk</h1>
    <p>Your current zip code is: <span id="location">(please
    wait)</span></p>
  </body>
</html>

III.6. Interaction Manager

The manager...

...

Interaction Manager (2)

...and shapes the interaction accordingly:

sometimes with a little help from the application author...

Part IV: Writing Multimodal Web Content

Existing web pages and applications will still work but won't provide:

So extensions will be useful.

SALT

Speech Application Language Tags
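
A minimal sketch of what SALT markup embedded in an XHTML page might look like, written from the SALT 1.0 element set (prompt, listen, grammar, bind). The element ids, the button and the "//stock" result path are assumptions for illustration; only the grammar file name is borrowed from the stock-trading listing in this section.

```xml
<!-- Sketch only: ids, the button and the bind path are hypothetical -->
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:salt="http://www.saltforum.org/2002/SALT">
  <body>
    <p>
      <!-- Visual field that receives the recognized stock name -->
      <input id="stock" name="stock" type="text"/>
      <!-- Clicking the button plays the prompt and starts recognition -->
      <input type="button" value="Speak"
             onclick="askStock.Start(); listenStock.Start();"/>
    </p>

    <salt:prompt id="askStock">Please say the stock name.</salt:prompt>
    <salt:listen id="listenStock">
      <salt:grammar src="./g_stock.grxml"/>
      <!-- Copy part of the recognition result into the text field -->
      <salt:bind targetelement="stock" value="//stock"/>
    </salt:listen>
  </body>
</html>
```

Unlike a VoiceXML form, SALT adds no dialog flow of its own: the page's script and events decide when each prompt and listen element starts.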

VoiceXML: example

<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
    <form>
        <field name="stock">
            <grammar src="./g_stock.grxml"/>
            <help> Please just say stock name. </help>
            Please say the stock name.
        </field>
        <field name="op">
            <grammar src="./g_op.grxml"/>
            <help> Please just say buy or sell. </help>
            Do you want to buy or sell?
        </field>
        <field name="quantity"> 
            <grammar src="./g_quant.grxml"/>
            <help> Please just say number of shares. </help>
            How many shares?
        </field>
        <field name="price">
            <grammar src="./g_price.grxml"/>
            <help> Please just say price. </help>
            What's the price?
        </field>
    </form>
</vxml>

XHTML+Voice

<?xml version="1.0"?>
<html 
xmlns="http://www.w3.org/1999/xhtml" 
xmlns:vxml="http://www.w3.org/2001/vxml"
xmlns:ev="http://www.w3.org/2001/xml-events"
xmlns:xv="http://www.voicexml.org/2002/xhtml+voice"
>
  <head>
    <title>XHTML+Voice Example</title>
    <!-- voice handler -->
    <vxml:form id="sayHello">
      <vxml:block><vxml:prompt xv:src="#hello"/>
      </vxml:block>
    </vxml:form>
  </head>
  <body>
    <h1>XHTML+Voice Example</h1>
    <p id="hello" ev:event="click" ev:handler="#sayHello">
      Hello World!
    </p>
  </body>
</html>

Focus on CSS-MMI

Extensions to CSS for multimodal interaction

Designing an application's interaction can be viewed as styling it

CSS-MMI: a simple HTML file

<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Daily Horoscope</title>
  </head>
  <body>
    <form action="http://example.com/horoscope">
    Your star sign?
       <input id="sign" type="text" name="sign" />
    </form>
  </body>
</html>

CSS-MMI: example stylesheet

#sign:focus {
  prompt: "What is your star sign?";
  grammar: Aries | Taurus | Gemini | Cancer;
  reprompt: 1.5s;
}

or per-modality:

@media speech {
  /* hypothetical selector, added so the declarations form a complete rule */
  #confirm:focus {
    prompt: "Do you confirm?";
    grammar: yes | yeah {yes} | sure {yes} | no | nah {no};
  }
}

Conclusion

Join The Future!
