See also: IRC log
<burn> trackbot, start telcon
<burn> Date: 29 September 2011
<burn> Scribe: Dan_Druta
<burn> ScribeNick: DanD
<mbodell> Discussing SpeechInputResults
<mbodell> Bjorn's mail: http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Aug/0033.html
<mbodell> Satish's proposal: http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Sep/0034.html
mbodell: speech results discussion
<mbodell> My mail: http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Sep/0043.html
bringert: I did not understand what the semantics were
mbodell: Three interfaces, like in Bjorn's proposal
... You get a speech result and inside you get an array of results
bringert: so the results are a history
burn: So the result number will not decrease
mbodell: It can decrease if you get corrections
bringert: what is the benefit of having everything?
burn: It's not the history. It's the combined one
mbodell: in a simple case you get a concatenation
bringert: How do you know if this is preliminary?
mbodell: we can add a boolean
Charles: Don't you want to know if it's final?
burn: when you get a Boolean in the event that is marked as final it means that particular indexed value in the results array is final
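The semantics burn describes can be sketched in TypeScript as follows. This is only an illustration under stated assumptions: the names `ResultEvent`, `resultIndex`, `isFinal`, and `ResultTracker` are hypothetical, and the group had not agreed on an interface at this point in the call.

```typescript
// Hypothetical event shape; the names are illustrative, not from any draft.
interface ResultEvent {
  results: string[];   // the combined results so far, not a history
  resultIndex: number; // which entry of results this event is about
  isFinal: boolean;    // true if results[resultIndex] is final
}

// Keeps the latest results array and remembers which indexes were marked final.
class ResultTracker {
  private results: string[] = [];
  private finalFlags: boolean[] = [];

  handle(evt: ResultEvent): void {
    // Each event carries the whole current array, so replace our copy.
    this.results = evt.results.slice();
    if (evt.isFinal) {
      this.finalFlags[evt.resultIndex] = true;
    }
  }

  isFinal(index: number): boolean {
    return this.finalFlags[index] === true;
  }

  current(): readonly string[] {
    return this.results;
  }
}
```

The final flag applies to one indexed entry only; other entries in the same array may still be preliminary.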
Milan: I can see results coming in finalized but the recognizer shuffling them around
bringert: It's up to the recognizer
Milan: let's have an example that shows that chunking
bringert: Why can't we have a preliminary that replaces the previous one until it hits final?
Milan: this is more powerful and gives more flexibility
... I thought the proposal was simpler, as was discussed in the F2F
... Why not have the recognizer send another complete hypothesis, replacing all preceding results?
... The combined result would be more efficient (fewer headers)
mbodell: result chunk size is not necessarily the same as finalizing chunk size
Milan: If you are to dictate an email, are we expecting to have lots of indexes?
... we have an unrealistic example here
bringert: Michael, can you explain the chunking a bit better?
... From a single result you get a single piece of semantics
mbodell: What can't you do with the array?
... in a non-continuous world it's not a problem
bringert: how do you see this from the UI point of view?
mbodell: you concatenate them each time
... The intent was to have the exact sequence
... if you have 3 results and one gets modified you get them
... you get a normal result anyway
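The concatenation mbodell describes could look like the following TypeScript sketch. The helper name `renderTranscript` and the sample utterances are invented for illustration; the only thing taken from the discussion is that each update carries the full current array, that an entry can be corrected, and that the number of entries can decrease.

```typescript
// Hypothetical helper: joins the current results array into display text.
function renderTranscript(results: readonly string[]): string {
  return results.join(" ");
}

// Each update carries the full, current array, so the UI re-renders from scratch.
const updates: string[][] = [
  ["send a"],
  ["send a", "male to"],        // preliminary mis-recognition
  ["send a", "mail to", "Bob"], // middle entry corrected, one entry added
  ["send a mail to Bob"],       // recognizer re-chunks: the count decreases
];

// One rendered string per update, in arrival order.
const rendered: string[] = updates.map((u) => renderTranscript(u));
```

Because the UI always renders the whole array, a correction to an earlier entry needs no special handling on the page.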
ddahl: how does it work with nbest?
mbodell: I can see an API where you have results and one is wrong and gets replaced
... I agree it solves the simple use cases and it is more complicated for others
bringert: How do I know when to interpret the actions? We should get a final for everything
Charles: maybe the UI wants to show just finals. You want finals to come at a reasonable pace
mbodell: there's a tradeoff
Milan: I know it's not going to change but maybe there's an external input that might change it
... if the user changes one word or another it might trigger more changes and might send an updated array
burn: the reason I like final is that I can archive them
glen: If you have a command based on what the user said it might not be undoable
... there's the use case where the user doesn't care about preliminary. They just need the final
mbodell: this is more for online improvements
glen: the rerecognize should solve our problem
... if you have an 8-hour dictation and you have a correction in the first sentences, are you going to send the whole 8 hours?
mbodell: we should put a limit. We can solve this in the protocol
Milan: All I'm asking for is a definition of final
bringert: we should define final as something that never changes
Milan: if we add a proprietary API it would be a spec violation
mbodell: I'm not sold on "final is final" but I'm OK with it
Milan: what would be the language in the spec then?
bringert: Final will be final and in the future we would add a correction event
... we would add another callback
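A minimal TypeScript sketch of "final is final", tying together burn's wish to archive finals and bringert's definition of final as something that never changes. The class `FinalArchive` and its methods are hypothetical. A later correction event, if one were added, would need its own path and is not modelled here.

```typescript
// Hypothetical archive: once an index is stored as final, it never changes.
class FinalArchive {
  private finals: (string | undefined)[] = [];

  // Stores a final result. Returns false, and changes nothing, if the index
  // already holds a final result, since final is defined as never changing.
  archive(index: number, text: string): boolean {
    if (this.finals[index] !== undefined) {
      return false;
    }
    this.finals[index] = text;
    return true;
  }

  // Returns the archived final for an index, or undefined if none exists yet.
  get(index: number): string | undefined {
    return this.finals[index];
  }
}
```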
mbodell: it is possible we can represent the result array as a read-only array in the result event
<smaug> evt.results would become evt.target.results
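A sketch of what moving the array onto the request might look like, in TypeScript. Everything here is an assumption made for illustration: `RecognitionRequest`, `deliver`, and the handler shape are hypothetical, and only the `evt.target.results` access pattern comes from smaug's comment.

```typescript
// Hypothetical request object; the class and member names are assumptions.
class RecognitionRequest {
  private current: string[] = [];

  // Read-only view for callers; only the request itself replaces the array.
  get results(): readonly string[] {
    return this.current;
  }

  // Stand-in for the recognizer delivering a new combined result array.
  deliver(
    results: string[],
    onResult: (evt: { target: RecognitionRequest }) => void
  ): void {
    this.current = results.slice();
    onResult({ target: this });
  }
}

const request = new RecognitionRequest();
let seen = "";
request.deliver(["hello", "world"], (evt) => {
  // The handler reads the array from the request rather than from the event.
  seen = evt.target.results.join(" ");
});
```

With this shape the results stay readable from the request after the event has been handled.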
bringert: I'm fine with the way it is proposed right now
... would this be only in continuous?
mbodell: in one shot the index would not be larger than one
bringert: Maybe we should send different events for continuous and for one shot
... for one shot it works and we have to have a boolean
... this is more the state of the request
... the nice thing about having them in the request is that you don't have to look in the event
... do we have any outstanding issues?
glen: what's outstanding is what the reco object would look like. Working on the proposal
<mbodell> TTS element section of proposal: http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Sep/att-0008/speechwepapi.html#tts-section
<mbodell> TTS JS API (not really filled in): http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Sep/att-0008/speechwepapi.html#speechoutputrequest-section
bringert: this is basically extending the HTML5 media element
mbodell: the media element is missing the mark
bringert: in the proposal we submitted I added another attribute "last mark"
<bringert> Bjorn's TTS proposal: http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Feb/att-0022/htmltts-draft.html
mbodell: the time mark event is something new?
bringert: the last mark is new
... you seek by time. Should last mark update then?
mbodell: in VoiceXML there's something similar
bringert: you will only update when you get a last mark
... if you want clock time you can store the marks yourself
... What if I want to send text? I guess I can use a data URI
... in my proposal I had a value element
... I would propose to keep the value element in the spec
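bringert's suggestion of storing the marks yourself to recover clock time could be sketched in TypeScript as below. The names `MarkRecord` and `MarkLog` are hypothetical, and how a page would actually be notified of a mark is left open, since the proposal only describes a "last mark" attribute.

```typescript
// Hypothetical record of a TTS mark and the wall-clock time it was reached.
interface MarkRecord {
  name: string;
  reachedAtMs: number;
}

class MarkLog {
  private records: MarkRecord[] = [];

  // Call from whatever handler fires when playback reaches a mark.
  onMark(name: string, nowMs: number = Date.now()): void {
    this.records.push({ name: name, reachedAtMs: nowMs });
  }

  // The most recent mark, mirroring the proposed "last mark" attribute.
  lastMark(): string | undefined {
    const last = this.records[this.records.length - 1];
    return last === undefined ? undefined : last.name;
  }

  // Clock time at which a given mark was most recently reached, if ever.
  timeOf(name: string): number | undefined {
    for (let i = this.records.length - 1; i >= 0; i--) {
      const rec = this.records[i];
      if (rec !== undefined && rec.name === name) {
        return rec.reachedAtMs;
      }
    }
    return undefined;
  }
}
```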
mbodell: the issue is that media requires a source
bringert: it's a media element we extend
... we have the use case where we need to send the text
... I'm fine with the way it is right now
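The data URI route bringert mentions for sending text could be sketched in TypeScript like this. The function name `textToDataUri` is hypothetical, and whether a TTS element would accept such a value as its source is an assumption, not something the group decided.

```typescript
// Builds a data: URI carrying UTF-8 plain text, for use as a media source value.
function textToDataUri(text: string): string {
  // encodeURIComponent percent-encodes the text as UTF-8 bytes.
  return "data:text/plain;charset=utf-8," + encodeURIComponent(text);
}

const src: string = textToDataUri("Hello, world");
```

This keeps the "media requires a source" constraint satisfied without a separate value element, at the cost of the page having to encode the text itself.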
mbodell: If we can avoid it, that would be great
bringert: we should loop in the HTML5 WG for advice
... there is more language that has to be added
<mbodell> add "Implementations should support at least UTF-8 encoded text/plain and application/ssml+xml." and other likewise text
mbodell: do we need a SpeechOutput object?
burn: SSML 1 or SSML 1.1?
bringert: 1.1 is an extension, right?
... We should state that implementations should support 1 and 1.1
burn: 1.1 gives more flexibility and has more intelligence
... when is a good time to start the sync discussion between API and protocol?
mbodell: next week
Robert: it would be better if people sent questions beforehand
Present: Dan_Burnett, Bjorn_Bringert, Michael_Bodell, Olli_Pettay, Milan_Young, Debbie_Dahl, Dan_Druta, Patrick_Ehlen, Charles_Hemphill, Robert_Brown, Glen_Shires, Michael_Johnston
Agenda: http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Sep/0042.html
Minutes URL (as guessed by scribe.perl): http://www.w3.org/2011/09/29-htmlspeech-minutes.html