15:48:12 RRSAgent has joined #htmlspeech
15:48:12 logging to http://www.w3.org/2011/09/29-htmlspeech-irc
15:48:19 Zakim has joined #htmlspeech
15:48:31 trackbot, start telcon
15:48:55 zakim, this will be htmlspeech
15:48:55 ok, burn; I see INC_(HTMLSPEECH)11:30AM scheduled to start 18 minutes ago
15:52:01 Meeting: HTML Speech Incubator Group Teleconference
15:52:12 Date: 29 September 2011
15:55:17 INC_(HTMLSPEECH)11:30AM has now started
15:55:24 +Dan_Burnett
15:57:48 +??P25
15:57:49 mbodell has joined #htmlspeech
15:57:49 zakim, I am Dan_Burnett
15:57:49 ok, burn, I now associate you with Dan_Burnett
15:58:25 zakim, ??P25 is Bjorn_Bringert
15:58:25 +Bjorn_Bringert; got it
15:58:49 +??P30
15:58:53 +Michael_Bodell
15:59:01 Zakim, ??P30 is Olli_Pettay
15:59:02 +Olli_Pettay; got it
15:59:17 Zakim, nick smaug is Olli_Pettay
15:59:17 ok, smaug, I now associate you with Olli_Pettay
16:00:02 +Milan_Young
16:00:03 Milan has joined #HtmlSpeech
16:00:31 +Debbie_Dahl
16:00:36 ddahl has joined #htmlspeech
16:00:46 +Dan_Druta
16:01:13 DanD has joined #htmlspeech
16:01:38 ehlen has joined #htmlspeech
16:03:08 -Bjorn_Bringert
16:03:10 +Patrick_Ehlen
16:03:12 zakim, nick DanD is Dan_Druta
16:03:12 ok, burn, I now associate DanD with Dan_Druta
16:03:43 Scribe: Dan_Druta
16:03:52 ScribeNick: DanD
16:03:56 Agenda: http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Sep/0042.html
16:04:20 Chair: Michael_Bodell
16:04:25 + +44.794.417.aaaa
16:04:50 Charles has joined #htmlspeech
16:05:59 +Charles_Hemphill
16:06:26 zakim, aaaa is Bjorn_Bringert
16:06:26 +Bjorn_Bringert; got it
16:07:04 topic: Web API
16:07:24 Discussing SpeechInputResults
16:07:30 Bjorn's mail: http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Aug/0033.html
16:07:40 Satish's proposal: http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Sep/0034.html
16:07:41 mbodell: speech results discussion
16:07:49 My mail: http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Sep/0043.html
16:09:01 bringert: I did not understand what the semantics were
16:10:01 mbodell: Three interfaces like in Bjorn's proposal
16:10:44 mbodell: You get a speech result and inside you get an array of results
16:11:22 bringert: so the results are a history
16:11:30 +[Microsoft]
16:11:40 burn: So the result number will not decrease
16:11:48 Robert has joined #htmlspeech
16:12:01 mbodell: It can decrease if you get corrections
16:12:37 bringert: what is the benefit of having everything?
16:12:53 burn: It's not the history. It's the combined one
16:13:16 mbodell: in a simple case you get a concatenation
16:13:49 bringert: How do you know if this is preliminary?
16:14:02 mbodell: we can add a boolean
16:14:33 Charles: Don't you want if it's final
16:15:59 burn: when you get a Boolean that is marked as final it should mean that it's final for the end user
16:16:39 Milan: I can see results coming finalized but the recognizer to shuffle them around
16:17:05 bringert: It's up to the recognizer
16:17:20 Milan: let's have an example that shows that chunking
16:17:52 bringert: Why can't we have a preliminary that replaces the previous one until it hits final
16:18:14 Milan: this is more powerful and gives more flexibility
16:18:40 Milan: I thought the proposal was simpler as was discussed in the F2F
16:19:05 s/burn: when you get a Boolean that is marked as final it should mean that it's final for the end user/burn: when you get a Boolean in the event that is marked as final it means that particular indexed value in the results array is final/
16:19:27 trackbot has joined #htmlspeech
16:19:49 zakim, [Microsoft] is Robert_Brown
16:19:49 +Robert_Brown; got it
16:20:01 Milan: Why not the recognizer send another hypotesis?
16:21:05 s/send another hypotesis/send another complete hypothesis, replacing all preceding results/
16:21:22 Milan: The combined result would be more efficient (fewer headers)
16:21:55 mbodell: Better if you send an array
16:22:41 Milan: If you are to dictate an email, are we expecting to have lots of indexes?
16:22:47 s/mbodell: Better if you send an array/mbodell: result chunk size is not necessarily the same as finalizing chunk size/
16:23:10 Milan: we have an unrealistic example here
16:23:47 bringert: Michael, can you explain the chunking a bit better?
16:25:21 bringert: From a single result you get a single piece of semantics
16:26:11 mbodell: What can't you do with the array?
16:26:45 mbodell: in a non-continuous world it's not a problem
16:27:07 bringert: how do you see this from the UI point of view?
16:27:19 glen has joined #htmlspeech
16:27:35 mbodell: you concatenate them each time
16:27:46 +Glen_Shires
16:28:01 mbodell: The intent was to have the exact sequence
16:29:10 mbodell: if you have 3 results and one gets modified you get them
16:30:57 mbodell: you get a normal result anyway
16:31:26 ddahl: how does it work with nbest?
16:32:42 mbodell: I can see an API where you have results and one is wrong and gets replaced
16:33:12 mbodell: I agree it solves the simple use cases and it is more complicated for others
16:33:54 bringert: How do I know when to interpret the actions? We should get a final for everything
16:34:32 Charles: the UI wants to show just finals. You want finals to come at a reasonable pace
16:34:48 mbodell: there's a tradeoff
16:34:54 s/Charles: the UI/Charles: maybe the UI/
16:36:04 Milan: I know it's not going to change but maybe there's an external input that might change it
16:37:01 Milan: if the user changes one word or another it might trigger more changes and might send an updated array
16:37:35 burn: the reason I like final is that I can archive them
16:38:12 glen: If you have a command based on what the user said it might not be undoable
16:38:54 glen: there's the use case where the user doesn't care about preliminary. They just need the final
16:39:16 mbodell: this is more for online improvements
16:39:49 glen: the rerecognize should solve our problem
16:41:04 glen: if you have an 8 hour dictation and you have a correction in the first sentences, are you going to send the whole 8 hours?
16:41:26 mbodell: we should put a limit. We can solve this in the protocol
16:42:04 Milan: All I'm asking for is a definition of final
16:42:31 bringert: we should define final as something that never changes
16:43:22 Milan: if we add a proprietary API it would be a spec violation
16:43:58 mbodell: I'm not sold on final being final, but I'm OK
16:44:15 Milan: what would be the language in the spec then?
16:44:50 bringert: Final will be final and in the future we would add a correction event
16:45:29 bringert: we would add another callback
16:46:29 mbodell: it is possible we can represent the result array as a read-only array in the result event
16:46:38 evt.results would become evt.target.results
16:47:46 bringert: I'm fine with the way it is proposed right now
16:48:05 bringert: would this be only in continuous?
16:49:16 mbodell: in one shot the index would not be larger than one
16:50:30 +Michael_Johnston
16:50:49 MJ has joined #htmlspeech
16:51:54 bringert: Maybe we should send different events for continuous and for one shot
16:52:27 bringert: for one shot works and have to have a boolean
16:53:45 bringert: this is more the state of the request
16:54:24 bringert: the nice thing about having the in the request is that you don't have to look in the array
16:54:45 s/array/event/
16:55:34 bringert: do we have any outstanding issues?
16:56:10 glen: what's outstanding is what the reco object would look like. Working on the proposal
16:57:00 topic: TTS
16:57:10 TTS element section of proposal: http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Sep/att-0008/speechwepapi.html#tts-section
16:58:42 TTS JS API (not really filled in): http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Sep/att-0008/speechwepapi.html#speechoutputrequest-section
16:58:54 bringert: this is basically extending the HTML5 media element
17:00:26 mbodell: the media element is missing the mark
17:01:05 bringert: in the proposal we submitted I added another attribute "last mark"
17:01:25 Bjorn's TTS proposal: http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Feb/att-0022/htmltts-draft.html
17:02:17 mbodell: the time mark event is something new?
17:02:39 bringert: the last mark is new
17:03:13 bringert: you seek by time. Should last mark update then?
17:04:01 mbodell: in VoiceXML there's something similar
17:04:31 bringert: you will only update when you get a last mark
17:06:04 bringert: if you want clock time you can store the marks yourself
17:06:12 -Glen_Shires
17:06:58 bringert: What if I want to send text? I guess I can use a data URI
17:07:33 bringert: in my proposal I had a value element
17:08:24 bringert: I would propose to keep the value element in the spec
17:08:53 mbodell: the issue is that media requires a source
17:09:22 bringert: it's a media element we extend
17:11:52 bringert: we have the use case where we need to send the text
17:12:20 bringert: I'm fine with the way it is right now
17:12:36 mbodell: If we can avoid it, it would be great
17:13:04 bringert: we should loop in the HTML5 WG for advice
17:14:08 bringert: there's more language that has to be added
17:14:09 add "Implementations should support at least UTF-8 encoded text/plain and application/ssml+xml." and other likewise text
17:15:00 mbodell: do we need a SpeechInput Object?
17:16:33 burn: SSML 1 or SSML 1.1?
17:16:54 bringert: 1.1 is an extension, right?
17:17:26 bringert: We should state that implementations should support 1 and 1.1
17:17:52 burn: 1.1 gives more flexibility and has more intelligence
17:19:25 burn: when is a good time to start the sync discussion between API and protocol?
17:19:34 mbodell: next week
17:19:53 Robert: if people send questions beforehand it would be better
17:21:33 -Robert_Brown
17:21:34 -Bjorn_Bringert
17:21:35 -Olli_Pettay
17:21:37 -Milan_Young
17:21:43 -Dan_Druta
17:21:45 -Michael_Bodell
17:21:45 -Patrick_Ehlen
17:21:46 -Debbie_Dahl
17:21:48 ddahl has left #htmlspeech
17:21:49 -Dan_Burnett
17:21:58 -Charles_Hemphill
17:22:28 zakim, who's here?
17:22:28 On the phone I see Michael_Johnston
17:22:29 On IRC I see glen, trackbot, Robert, Milan, mbodell, Zakim, RRSAgent, burn, smaug
17:23:18 zakim, bye
17:23:18 leaving. As of this point the attendees were Dan_Burnett, Bjorn_Bringert, Michael_Bodell, Olli_Pettay, Milan_Young, Debbie_Dahl, Dan_Druta, Patrick_Ehlen, +44.794.417.aaaa,
17:23:18 Zakim has left #htmlspeech
17:23:22 ... Charles_Hemphill, Robert_Brown, Glen_Shires, Michael_Johnston
17:23:45 rrsagent, make log public
17:23:49 rrsagent, draft minutes
17:23:49 I have made the request to generate http://www.w3.org/2011/09/29-htmlspeech-minutes.html burn
17:24:10 s/, +44.794.417.aaaa//
17:24:21 rrsagent, draft minutes
17:24:21 I have made the request to generate http://www.w3.org/2011/09/29-htmlspeech-minutes.html burn
17:26:02 s/Patrick_Ehlen,/Patrick_Ehlen, Charles_Hemphill, Robert_Brown, Glen_Shires, Michael_Johnston/
17:27:57 s/mbodell: do we need a SpeechInput Object?/mbodell: do we need a SpeechOutput object?/
17:28:03 rrsagent, draft minutes
17:28:03 I have made the request to generate http://www.w3.org/2011/09/29-htmlspeech-minutes.html burn
18:27:38 smaug has joined #htmlspeech
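
Note on the Web API topic: the result-array semantics discussed there can be sketched roughly as below. This is only one reading of the discussion, not the group's API. It assumes each update targets one index in the results array, that the final flag applies to that indexed value only, that a final value never changes ("final will be final"), and that the displayed text is the concatenation of the entries. `ResultUpdate`, `Transcript`, and every member name are hypothetical and do not come from any of the proposals linked in the log.

```typescript
// Hypothetical shape of one recognition update; not taken from the proposals.
interface ResultUpdate {
  index: number;  // position in the results array this update refers to
  text: string;   // top hypothesis for that position
  final: boolean; // true: this indexed value will not change again
}

// Hypothetical helper that accumulates updates into a results array.
class Transcript {
  private readonly results: ResultUpdate[] = [];

  // Apply one update. Returns false when the update is rejected because
  // the targeted entry was already marked final.
  apply(update: ResultUpdate): boolean {
    const existing = this.results[update.index];
    if (existing !== undefined && existing.final) {
      return false; // assumption from the log: a final value never changes
    }
    this.results[update.index] = update;
    return true;
  }

  // Concatenate every entry received so far, preliminary or final.
  text(): string {
    return this.results
      .filter((r) => r !== undefined)
      .map((r) => r.text)
      .join(" ");
  }

  // Concatenate only the entries marked final, e.g. for archiving.
  finalText(): string {
    return this.results
      .filter((r) => r !== undefined && r.final)
      .map((r) => r.text)
      .join(" ");
  }
}
```

A page that only cares about finals would read `finalText()`, while a dictation UI would redraw from `text()` after every update. In the one-shot case raised in the log, only a single index would ever be filled.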