15:48:12 RRSAgent has joined #htmlspeech
15:48:12 logging to http://www.w3.org/2011/09/29-htmlspeech-irc
15:48:19 Zakim has joined #htmlspeech
15:48:31 trackbot, start telcon
15:48:55 zakim, this will be htmlspeech
15:48:55 ok, burn; I see INC_(HTMLSPEECH)11:30AM scheduled to start 18 minutes ago
15:52:01 Meeting: HTML Speech Incubator Group Teleconference
15:52:12 Date: 29 September 2011
15:55:17 INC_(HTMLSPEECH)11:30AM has now started
15:55:24 +Dan_Burnett
15:57:48 +??P25
15:57:49 mbodell has joined #htmlspeech
15:57:49 zakim, I am Dan_Burnett
15:57:49 ok, burn, I now associate you with Dan_Burnett
15:58:25 zakim, ??P25 is Bjorn_Bringert
15:58:25 +Bjorn_Bringert; got it
15:58:49 +??P30
15:58:53 +Michael_Bodell
15:59:01 Zakim, ??P30 is Olli_Pettay
15:59:02 +Olli_Pettay; got it
15:59:17 Zakim, nick smaug is Olli_Pettay
15:59:17 ok, smaug, I now associate you with Olli_Pettay
16:00:02 +Milan_Young
16:00:03 Milan has joined #HtmlSpeech
16:00:31 +Debbie_Dahl
16:00:36 ddahl has joined #htmlspeech
16:00:46 +Dan_Druta
16:01:13 DanD has joined #htmlspeech
16:01:38 ehlen has joined #htmlspeech
16:03:08 -Bjorn_Bringert
16:03:10 +Patrick_Ehlen
16:03:12 zakim, nick DanD is Dan_Druta
16:03:12 ok, burn, I now associate DanD with Dan_Druta
16:03:43 Scribe: Dan_Druta
16:03:52 ScribeNick: DanD
16:03:56 Agenda: http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Sep/0042.html
16:04:20 Chair: Michael_Bodell
16:04:25 + +44.794.417.aaaa
16:04:50 Charles has joined #htmlspeech
16:05:59 +Charles_Hemphill
16:06:26 zakim, aaaa is Bjorn_Bringert
16:06:26 +Bjorn_Bringert; got it
16:07:04 topic: Web API
16:07:24 Discussing SpeechInputResults
16:07:30 Bjorn's mail: http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Aug/0033.html
16:07:40 Satish's proposal: http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Sep/0034.html
16:07:41 mbodell: speech results discussion
16:07:49 My mail: http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Sep/0043.html
16:09:01 bringert: I did not understand what the semantics were
16:10:01 mbodell: Three interfaces, like in Bjorn's proposal
16:10:44 mbodell: You get a speech result and inside you get an array of results
16:11:22 bringert: so the results are a history
16:11:30 +[Microsoft]
16:11:40 burn: So the result number will not decrease
16:11:48 Robert has joined #htmlspeech
16:12:01 mbodell: It can decrease if you get corrections
16:12:37 bringert: what is the benefit to have everything?
16:12:53 burn: It's not the history. It's the combined one
16:13:16 mbodell: in a simple case you get a concatenation
16:13:49 bringert: How do you know if this is preliminary?
16:14:02 mbodell: we can add a boolean
16:14:33 Charles: Don't you want to know if it's final?
16:15:59 burn: when you get a Boolean that is marked as final it should mean that it's final for the end user
16:16:39 Milan: I can see results coming in finalized but the recognizer shuffling them around
16:17:05 bringert: It's up to the recognizer
16:17:20 Milan: let's have an example that shows that chunking
16:17:52 bringert: Why can't we have a preliminary that replaces the previous one until it hits final
16:18:14 Milan: this is more powerful and gives more flexibility
16:18:40 Milan: I thought the proposal was simpler, as was discussed in the F2F
16:19:05 s/burn: when you get a Boolean that is marked as final it should mean that it's final for the end user/burn: when you get a Boolean in the event that is marked as final it means that particular indexed value in the results array is final/
16:19:27 trackbot has joined #htmlspeech
16:19:49 zakim, [Microsoft] is Robert_Brown
16:19:49 +Robert_Brown; got it
16:20:01 Milan: Why not the recognizer send another hypotesis?
16:21:05 s/send another hypotesis/send another complete hypothesis, replacing all preceding results/
16:21:22 Milan: The combined result would be more efficient (less headers)
16:21:55 mbodell: Better if you send an array
16:22:41 Milan: If you are to dictate an email, are we expecting to have lots of indexes?
16:22:47 s/mbodell: Better if you send an array/mbodell: result chunk size is not necessarily the same as finalizing chunk size/
16:23:10 Milan: we have an unrealistic example here
16:23:47 bringert: Michael, can you explain the chunking a bit better?
16:25:21 bringert: From a single result you get a single piece of semantics
16:26:11 mbodell: What can't you do with the array?
16:26:45 mbodell: in a non continuous world it's not a problem
16:27:07 bringert: how do you see this from the UI point of view?
16:27:19 glen has joined #htmlspeech
16:27:35 mbodell: you concatenate them each time
16:27:46 +Glen_Shires
16:28:01 mbodell: The intent was to have the exact sequence
16:29:10 mbodell: if you have 3 results and one gets modified you get them
16:30:57 mbodell: you get a normal result anyway
16:31:26 ddahl: how does it work with nbest?
16:32:42 mbodell: I can see an API where you have results and one is wrong and gets replaced
16:33:12 mbodell: I agree it solves the simple use cases and it is more complicated for others
16:33:54 bringert: How do I know when to interpret the actions? We should get a final for everything
16:34:32 Charles: the UI wants to show just finals. You want finals to come at a reasonable pace
16:34:48 mbodell: there's a tradeoff
16:34:54 s/Charles: the UI/Charles: maybe the UI/
16:36:04 Milan: I know it's not going to change but maybe there's an external input that might change it
16:37:01 Milan: if the user changes one word or another it might trigger more changes and might send an updated array
16:37:35 burn: the reason I like final is that I can archive them
16:38:12 glen: If you have a command based on what the user said it might not be undoable
16:38:54 glen: there's the use case where the user doesn't care about preliminary. They just need the final
16:39:16 mbodell: this is more for online improvements
16:39:49 glen: the rerecognize should solve our problem
16:41:04 glen: if you have an 8-hour dictation and you have a correction in the first sentences, are you going to send the whole 8 hours?
16:41:26 mbodell: we should put a limit. We can solve this in the protocol
16:42:04 Milan: All I'm asking is a definition of final
16:42:31 bringert: we should define final as something that never changes
16:43:22 Milan: if we add a proprietary API it would be a spec violation
16:43:58 mbodell: I'm not sold on "final is final" but I'm OK with it
16:44:15 Milan: what would be the language in the spec then?
16:44:50 bringert: Final will be final and in the future we would add a correction event
16:45:29 bringert: we would add another call back
16:46:29 mbodell: it is possible we can represent the result array as a read-only array in the result event
16:46:38 evt.results would become evt.target.results
16:47:46 bringert: I'm fine with the way it is proposed right now
16:48:05 bringert: would this be only in continuous?
16:49:16 mbodell: in one shot the index would not be larger than one
16:50:30 +Michael_Johnston
16:50:49 MJ has joined #htmlspeech
16:51:54 bringert: Maybe we should send different events for continuous and for one shot
16:52:27 bringert: for one shot it works, and we have to have a boolean
16:53:45 bringert: this is more the state of the request
16:54:24 bringert: the nice thing about having it in the request is that you don't have to look in the array
16:54:45 s/array/event/
16:55:34 bringert: do we have any outstanding issues?
16:56:10 glen: what's outstanding is what the reco object would look like. Working on the proposal
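
Illustrative sketch, not from the call: one possible reading of the results-array semantics discussed under this topic, in which each indexed result carries a final flag, a final result never changes, and the UI re-concatenates the array on every event. Every type, field, and function name below is hypothetical and is not taken from any of the proposals linked in these minutes.

```typescript
// Hypothetical shapes; none of these names come from the draft proposals.
interface SketchResult {
  transcript: string; // recognized text for this chunk
  final: boolean;     // true once this indexed entry will no longer change
}

// Assumed event payload: the combined results so far (not a history).
interface SketchResultEvent {
  results: readonly SketchResult[];
}

// UI view: concatenate every entry each time an event arrives.
function renderTranscript(results: readonly SketchResult[]): string {
  return results.map((r) => r.transcript).join(" ");
}

// Entries that are safe to archive, assuming "final" means "never changes".
function archivable(results: readonly SketchResult[]): SketchResult[] {
  return results.filter((r) => r.final);
}

// Example: the second chunk is still preliminary.
const exampleEvent: SketchResultEvent = {
  results: [
    { transcript: "send mail", final: true },
    { transcript: "to bob", final: false },
  ],
};
console.log(renderTranscript(exampleEvent.results));
console.log(archivable(exampleEvent.results).length);
```

Under this reading, a one-shot recognition would simply deliver an array with a single entry, and a later correction event could be added without changing the meaning of the final flag.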
16:57:00 topic: TTS
16:57:10 TTS element section of proposal: http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Sep/att-0008/speechwepapi.html#tts-section
16:58:42 TTS JS API (not really filled in): http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Sep/att-0008/speechwepapi.html#speechoutputrequest-section
16:58:54 bringert: this is basically extending the HTML5 media element
17:00:26 mbodell: the media element is missing the mark
17:01:05 bringert: in the proposal we submitted I added another attribute "last mark"
17:01:25 Bjorn's TTS proposal: http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Feb/att-0022/htmltts-draft.html
17:02:17 mbodell: the time mark event is something new?
17:02:39 bringert: the last mark is new
17:03:13 bringert: you seek by time. Should last mark update then?
17:04:01 mbodell: in VoiceXML there's something similar
17:04:31 bringert: you will only update when you get a last mark
17:06:04 bringert: if you want clock time you can store the marks yourself
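
Illustrative sketch, not from the call: one way a page could "store the marks itself" to recover clock time. The minutes only mention a "last mark" attribute; how the page learns that the mark changed, and all names below, are assumptions made for illustration.

```typescript
// Hypothetical record kept by the page; not part of any proposal.
interface MarkRecord {
  name: string;        // mark name as reported by the TTS element
  wallClockMs: number; // clock time at which the page observed the mark
}

class MarkLog {
  private readonly records: MarkRecord[] = [];
  private readonly now: () => number;

  // The clock is injectable so the class can be exercised deterministically.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  // Call whenever the page observes that the last mark has changed.
  record(name: string): void {
    this.records.push({ name, wallClockMs: this.now() });
  }

  // Clock time of the most recent occurrence of a mark, if it was seen.
  timeOf(name: string): number | undefined {
    let found: number | undefined = undefined;
    for (const r of this.records) {
      if (r.name === name) found = r.wallClockMs;
    }
    return found;
  }
}
```

Because the log keeps every observation, seeking backwards and replaying a mark just appends a newer entry, and `timeOf` reports the latest one.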
17:06:12 -Glen_Shires
17:06:58 bringert: What if I want to send text? I guess I can use a data URI
17:07:33 bringert: in my proposal I had a value element
17:08:24 bringert: I would propose to keep the value element in the spec
17:08:53 mbodell: the issue is that media requires a source
17:09:22 bringert: it's a media element we extend
17:11:52 bringert: we have the use case where we need to send the text
17:12:20 bringert: I'm fine with the way it is right now
17:12:36 mbodell: If we can avoid it, that would be great
17:13:04 bringert: we should loop in the HTML5 WG for advice
17:14:08 bringert: there is more language that has to be added
17:14:09 add "Implementations should support at least UTF-8 encoded text/plain and application/ssml+xml. " and other similar text
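
Illustrative sketch, not from the call: how inline text could be packed into a data URI for a media-style source, using the two content types named above. The helper name and type alias are hypothetical; whether the TTS element would accept such a URI as its source is exactly the open question in this discussion.

```typescript
// The two content types named in the minutes; the alias itself is hypothetical.
type SpeechContentType = "text/plain" | "application/ssml+xml";

// Build a data URI carrying UTF-8 text. encodeURIComponent percent-encodes
// the UTF-8 bytes of the string, so the charset parameter matches the payload.
function toDataUri(content: string, type: SpeechContentType): string {
  return `data:${type};charset=utf-8,${encodeURIComponent(content)}`;
}

const plainUri = toDataUri("Hello world", "text/plain");
const ssmlUri = toDataUri("<speak>Hi</speak>", "application/ssml+xml");
console.log(plainUri);
console.log(ssmlUri);
```

A dedicated value attribute, as in Bjorn's earlier proposal, would avoid this encoding step for the common case of speaking a short string.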
17:15:00 mbodell: do we need a SpeechInput Object?
17:16:33 burn: SSML 1 or SSML 1.1?
17:16:54 bringert: 1.1 is an extension, right?
17:17:26 bringert: We should state that implementations should support 1 and 1.1
17:17:52 burn: 1.1 gives more flexibility and has more intelligence
17:19:25 burn: when is a good time to start the sync discussion between API and protocol?
17:19:34 mbodell: next week
17:19:53 Robert: it would be better if people sent questions beforehand
17:21:33 -Robert_Brown
17:21:34 -Bjorn_Bringert
17:21:35 -Olli_Pettay
17:21:37 -Milan_Young
17:21:43 -Dan_Druta
17:21:45 -Michael_Bodell
17:21:45 -Patrick_Ehlen
17:21:46 -Debbie_Dahl
17:21:48 ddahl has left #htmlspeech
17:21:49 -Dan_Burnett
17:21:58 -Charles_Hemphill
17:22:28 zakim, who's here?
17:22:28 On the phone I see Michael_Johnston
17:22:29 On IRC I see glen, trackbot, Robert, Milan, mbodell, Zakim, RRSAgent, burn, smaug
17:23:18 zakim, bye
17:23:18 leaving. As of this point the attendees were Dan_Burnett, Bjorn_Bringert, Michael_Bodell, Olli_Pettay, Milan_Young, Debbie_Dahl, Dan_Druta, Patrick_Ehlen, +44.794.417.aaaa,
17:23:18 Zakim has left #htmlspeech
17:23:22 ... Charles_Hemphill, Robert_Brown, Glen_Shires, Michael_Johnston
17:23:45 rrsagent, make log public
17:23:49 rrsagent, draft minutes
17:23:49 I have made the request to generate http://www.w3.org/2011/09/29-htmlspeech-minutes.html burn
17:24:10 s/, +44.794.417.aaaa//
17:24:21 rrsagent, draft minutes
17:24:21 I have made the request to generate http://www.w3.org/2011/09/29-htmlspeech-minutes.html burn
17:26:02 s/Patrick_Ehlen,/Patrick_Ehlen, Charles_Hemphill, Robert_Brown, Glen_Shires, Michael_Johnston/
17:27:57 s/mbodell: do we need a SpeechInput Object?/mbodell: do we need a SpeechOutput object?/
17:28:03 rrsagent, draft minutes
17:28:03 I have made the request to generate http://www.w3.org/2011/09/29-htmlspeech-minutes.html burn
18:27:38 smaug has joined #htmlspeech