15:51:23 RRSAgent has joined #htmlspeech
15:51:23 logging to http://www.w3.org/2011/04/28-htmlspeech-irc
15:51:32 Zakim has joined #htmlspeech
15:51:39 trackbot, start telcon
15:51:41 RRSAgent, make logs public
15:51:43 Zakim, this will be
15:51:43 I don't understand 'this will be', trackbot
15:51:44 Meeting: HTML Speech Incubator Group Teleconference
15:51:44 Date: 28 April 2011
15:51:45 zakim, this will be htmlspeech
15:51:45 ok, burn; I see INC_(HTMLSPEECH)12:00PM scheduled to start in 9 minutes
15:51:58 INC_(HTMLSPEECH)12:00PM has now started
15:52:00 +Michael_Bodell
15:52:01 Chair: Dan Burnett
15:52:15 Agenda: http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Apr/0059.html
15:53:19 zakim, code?
15:53:19 the conference code is 48657 (tel:+1.617.761.6200 tel:+33.4.26.46.79.03 tel:+44.203.318.0479), burn
15:53:35 +[Voxeo]
15:53:49 zakim, [Voxeo] is Dan_Burnett
15:53:49 +Dan_Burnett; got it
15:53:55 zakim, I am Dan_Burnett
15:53:55 ok, burn, I now associate you with Dan_Burnett
15:57:26 +??P31
15:57:26 -??P31
15:57:26 +??P31
15:57:46 Zakim, ??P31 is Olli_Pettay
15:57:46 +Olli_Pettay; got it
15:57:51 +[Microsoft]
15:58:05 zakim, [Microsoft] is Robert_Brown
15:58:05 +Robert_Brown; got it
15:58:15 Zakim, nick smaug_ is Olli_Pettay
15:58:15 ok, smaug_, I now associate you with Olli_Pettay
15:58:29 Robert has joined #htmlspeech
15:58:36 zakim, nick Robert is Robert_Brown
15:58:36 ok, burn, I now associate Robert with Robert_Brown
15:59:41 +Charles_Hemphill
15:59:41 ddahl has joined #htmlspeech
15:59:45 +Milan_Young
15:59:57 + +1.760.705.aaaa - is perhaps AZ
16:00:22 +Debbie_Dahl
16:00:29 zakim, aaaa is Bjorn_Bringert
16:00:29 sorry, burn, I do not recognize a party named 'aaaa'
16:00:40 Milan has joined #htmlspeech
16:00:40 zakim, nick ddahl is Debbie_Dahl
16:00:40 ok, burn, I now associate ddahl with Debbie_Dahl
16:00:54 Scribe: Robert Brown
16:01:01 ScribeNick: Robert
16:01:13 Charles_Hemphill has joined #htmlspeech
16:03:20 Agenda: http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Apr/0059.html
16:03:46 topic: F2F logistics
16:04:02 Bjorn: nothing new logistically
16:04:12 Burn: will send revised schedule
16:04:25 topic: updated report draft
16:04:58 Raj has joined #htmlspeech
16:05:04 final report draft: http://www.w3.org/2005/Incubator/htmlspeech/live/NOTE-htmlspeech-20110426.html
16:05:40 burn: no new comments
16:05:49 zakim, +1.760.705.aaaa is Bjorn_Bringert
16:05:54 topic: new design decisions
16:06:17 Zakim has joined #htmlspeech
16:06:25 zakim, +1.760.705.aaaa is Bjorn_Bringert
16:06:25 sorry, bringert, I do not recognize a party named '+1.760.705.aaaa'
16:06:52 zakim, who's on the phone?
16:06:53 sorry, burn, I don't know what conference this is
16:06:54 On IRC I see Raj, Charles_Hemphill, Milan, ddahl, Robert, RRSAgent, burn, bringert, smaug_, trackbot
16:07:40 zakim, this is htmlspeech
16:07:40 ok, burn; that matches INC_(HTMLSPEECH)12:00PM
16:07:59 zakim, who's on the phone?
16:07:59 On the phone I see Michae, Dan_Burnett, Olli_Pettay, Robert_Brown, Charles_Hemphill, Milan_Young, AZ, Debbie_Dahl, +1.818.237.aaaa
16:08:13 zakim, AZ is now Bjorn_Bringert
16:08:13 I don't understand 'AZ is now Bjorn_Bringert', burn
16:08:59 zakim, AZ is Bjorn_Bringert
16:08:59 +Bjorn_Bringert; got it
16:09:27 +Michael_Johnston
16:09:42 Bjorn: previously we only looked at the intersection of the proposals; is there anything that's in two proposals but not the third? e.g. continuous recognition
16:10:09 +[IPcaller]
16:10:52 zakim, [IPCaller] is Raj_Tumuluri
16:10:52 +Raj_Tumuluri; got it
16:10:55 Milan: any requirement that we support this?
16:11:01 Regrets: Dan Druta
16:11:16 zakim, aaaa is Patrick_Ehlen
16:11:16 +Patrick_Ehlen; got it
16:11:37 zakim, who's on the phone?
16:11:37 On the phone I see Michae, Dan_Burnett, Olli_Pettay, Robert_Brown, Charles_Hemphill, Milan_Young, Bjorn_Bringert, Debbie_Dahl, Patrick_Ehlen, Michael_Johnston, Raj_Tumuluri
16:11:51 zakim, Michae is Michael_Bodell
16:11:51 +Michael_Bodell; got it
16:12:28 Michael_ has joined #htmlspeech
16:12:35 burn: will add continuous recognition to the list of topics to discuss
16:13:33 zakim, who's noisy?
16:13:33 I am sorry, burn; I don't have the necessary resources to track talkers right now
16:13:40 Bjorn: only removed it from the Google proposal because it was difficult to do, and we may want to do it in a later version
16:14:51 Michael: recapped two scenarios stated by Bjorn: 1) continuous speech; 2) open mic
16:15:56 Bjorn: proposed that we all agree this is a requirement
16:16:32 Milan: we were vague about what the interim events requirement meant, whether it included results
16:17:18 burn: satish is trying to join, but zakim says the conference code isn't valid
16:17:31 Burn: [after discussion] proposes Michael adds this as a new requirement (or requirements) to the report
16:17:32 zakim, code?
16:17:32 the conference code is 48657 (tel:+1.617.761.6200 tel:+33.4.26.46.79.03 tel:+44.203.318.0479), burn
16:17:56 Michael: sure, but will also check to see whether we just need to clarify an existing requirement
16:18:17 satish has joined #htmlspeech
16:18:41 Bjorn: this is also a design topic
16:20:01 ehlen has joined #htmlspeech
16:20:12 burn: will do
16:20:48 Bjorn: Robert is there anything else in the Microsoft proposal that should be considered as a design decision?
16:21:00 Robert: nothing apparent, will review again in coming week
16:21:22 Bjorn: should we start work on a joint proposal then?
16:22:03 Burn: proposes that we now go to the list of issues to discuss and discuss them
16:22:50 Bjorn: more items for discussion from Microsoft proposal
16:22:55 zakim, bringert is Bjorn_Bringert
16:22:55 sorry, burn, I do not recognize a party named 'bringert'
16:23:08 zakim, nick bringert is Bjorn_Bringert
16:23:08 ok, burn, I now associate bringert with Bjorn_Bringert
16:23:21 zakim, nick burn is Daniel_Burnett
16:23:21 sorry, burn, I do not see a party named 'Daniel_Burnett'
16:23:33 zakim, nick Charles_Hemphill is Charles_Hemphill
16:23:33 ok, burn, I now associate Charles_Hemphill with Charles_Hemphill
16:23:41 ... MS proposal supports multiple grammars, but Google & Mozilla only support one
16:23:47 zakim, nick ddahl is Debbie_Dahl
16:23:47 ok, burn, I now associate ddahl with Debbie_Dahl
16:24:04 zakim, nick ehlen is Patrick_Ehlen
16:24:04 ok, burn, I now associate ehlen with Patrick_Ehlen
16:24:05 Olli: Mozilla proposal allows multiple parallel recognitions, each with its own grammar
16:24:32 zakim, nick Charles_Hemphill is Charles_Hemphill
16:24:32 ok, burn, I now associate Charles_Hemphill with Charles_Hemphill
16:25:07 MichaelJohnston: can't reference an SLM from SRGS, so multiple grammars are required
16:25:27 Bjorn: proposes topic: Should we support multiple simultaneous grammars?
16:25:32 zakim, nick Milan is Milan_Young
16:25:32 ok, burn, I now associate Milan with Milan_Young
16:25:40 zakim, nick Raj is Raj_Tumuluri
16:25:40 ok, burn, I now associate Raj with Raj_Tumuluri
16:25:47 ... proposes topic: which timeout parameters should we have?
16:25:48 zakim, nick Robert is Robert_Brown
16:25:49 ok, burn, I now associate Robert with Robert_Brown
16:26:05 zakim, nick smaug_ is Olli_Pettay
16:26:07 ok, burn, I now associate smaug_ with Olli_Pettay
16:26:10 yeah, Mozilla proposal should have some timeouts
16:26:49 Bjorn: emulating speech input is a requirement, but it's only present in the Microsoft proposal
16:27:48 Michael: proposes topic: some way for the application to provide feedback information to the recognizer
16:29:07 Bjorn: does anybody disagree that this is a requirement we agree on?
16:29:43 Burn: proposes requirement: "it must be possible for the application author to provide feedback on the recognition result"
16:30:27 Debbie: need to discuss the result format
16:31:42 Michael: seems like general agreement on EMMA, with notion of other formats available
16:32:07 Olli: EMMA as a DOM document? Or as a JSON object?
16:32:30 MichaelJohnston: multimodal working group has been discussing JSON representations of EMMA
16:32:53 ... there are some issues, such as losing element/attribute distinction
16:33:24 ... straight translation to JSON is a little ugly
16:33:56 Michael: existing proposals include simple representations as alternatives to EMMA
16:34:48 MichaelJohnston: For more nuanced things, let's not reinvent solutions to the problems EMMA already solves
16:35:31 Milan: would rather not have EMMA mean XML, since that implies the app needs a parser
16:35:58 Debbie: sounds like we agree on EMMA, but need to discuss how it's represented, simplified formats, etc.
16:37:09 Milan: a good idea to agree that an EMMA result available through a DOM object is a baseline agreement
16:37:52 Bjorn: it's okay to provide the EMMA DOM, but we should also have the simple access mechanism that all three proposals have
16:38:08 Burn: would rather have XML or JSON, but not the DOM
16:38:22 Michael: if you have XML, you can feed it into the DOM
16:39:30 Burn: it's a minor objection, if everybody else agrees on the DOM, I'm okay with that
16:39:40 Bjorn: maybe just provide both
16:40:40 MichaelJohnston: EMMA will also help with more sophisticated multimodal apps, for example using ink. The DOM will be more convenient to work with.
16:41:17 Burn: proposed agreement: "both DOM and XML text representations of EMMA must be provided"
16:41:25 ... haven't necessarily agreed that that is all
16:43:17 Bjorn: we already appear to agree, based on proposals: "recognition results must also be available in the javascript objects where the result is a list of recognition result items containing utterance, confidence and interpretation."
16:43:54 Michael: may need to be tweaked to accommodate continuous recognition
16:44:40 Burn: add "at least" to Bjorn's proposed requirement
16:45:24 Burn: added a statement "note that this will need to be adjusted based on any decision regarding support for continuous recognition"
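[Scribe note: a minimal sketch of the result shape agreed above ("a list of recognition result items containing utterance, confidence and interpretation"). The property names and values are illustrative assumptions, not an agreed API surface:]

```javascript
// Hypothetical recognition result list, per the agreed requirement:
// each item carries an utterance string, a confidence score, and an
// application-level interpretation. Field names are assumptions.
const results = [
  { utterance: "flights to boston", confidence: 0.92,
    interpretation: { intent: "search_flights", destination: "BOS" } },
  { utterance: "lights in boston", confidence: 0.41,
    interpretation: null }
];

// An application might pick the top hypothesis above a threshold.
const best = results.find(r => r.confidence > 0.5);
console.log(best.utterance); // "flights to boston"
```

[A continuous-recognition variant would presumably deliver such lists incrementally, per the note that the requirement "will need to be adjusted".]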
16:45:59 Milan: would like to add a discussion topic around generic parameters to the recognition engine
16:47:19 Burn: related to existing topic on the list, but will add
16:47:46 Milan: also need to agree on standard parameters, such as speed-vs-accuracy
16:48:07 Burn: will generalize the timeouts discussion to include other parameters
16:49:18 MichaelJohnston: which parameters should be expressed in the JavaScript API, and what can go in the URI? What sorts of conflicts could occur?
16:49:55 Bjorn: URI parameters are engine specific
16:50:44 MichaelJohnston: for example, if we agreed that the way standard parameters are communicated is via the URI, they could come from the URI, or from the JavaScript
16:51:35 Michael: need to discuss the API/protocol to the speech engine, and how standard parameters are conveyed
16:52:23 Bjorn: we need to discuss the protocol, it's not in the list
16:53:11 Burn: will add it to the list
16:54:11 Milan: are the grammars referred to by HTTP URI?
16:54:56 Burn: existing requirement says "uri" which was intended to represent URLs and URNs
16:55:16 Milan: would like to mandate that HTTP is definitely supported; there are lots of others that may work
16:56:26 Robert: should we have a standard set of built-in grammars/topics?
16:56:49 Bjorn: in the Google proposal we had "builtin:" URIs
16:58:03 Burn: "a standard set of common tasks/grammars should be supported. details TBD"
16:58:24 Burn: need a discussion topic about what these are
16:59:11 Robert: what about inline grammars?
16:59:36 Bjorn: data URIs would work for that, and perhaps we should agree about that
16:59:48 Charles: would like to see inline grammars remain on the table
17:00:36 Burn: will add a discussion about inline grammars
17:01:39 Burn: we all agree on the functionality that inline grammars would give
17:02:02 MichaelJohnston: one target user is "mom & pop developers" who would provide simple grammars
17:03:00 Burn: discussion topic: "what is the mechanism for authors to directly include grammars within their HTML document? Is this inline XML, data URI or something else?"
17:03:07 -Patrick_Ehlen
17:04:01 Robert: use case: given that HTML5 supports local storage, the data from which a grammar is constructed may only be located on the local device
17:04:45 Bjorn: proposes that we mandate data URIs, just for consistency with the rest of HTML
17:05:02 Burn: no objections, so will record as an agreement
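[Scribe note: an illustrative sketch of the data-URI approach agreed above, embedding a small SRGS grammar inline. The grammar content and variable names are examples, not part of any proposal:]

```javascript
// Hypothetical example: an inline SRGS yes/no grammar carried in a
// data: URI (media type application/srgs+xml), as agreed for
// consistency with the rest of HTML.
const srgs =
  '<grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0" root="yn">' +
  '<rule id="yn"><one-of><item>yes</item><item>no</item></one-of></rule>' +
  '</grammar>';

// Percent-encode the XML so it is safe inside a URI.
const grammarUri = "data:application/srgs+xml," + encodeURIComponent(srgs);

// The resulting URI could then be passed wherever a grammar URI is
// accepted, alongside http: and builtin: grammars.
console.log(grammarUri.slice(0, 26)); // "data:application/srgs+xml,"
```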
17:05:50 Michael: need to discuss the ability to do re-recognition
17:06:00 Burn: related to the topic of recognition from a file
17:06:13 Bjorn: both are fine discussion topics
17:08:08 Burn: [discussion about whether there's anything to discuss around endpointing], already implied in existing discussion topic
17:08:40 Bjorn: context block?
17:09:24 Burn: discussion topic: "do we need a recognition context block capability?" and if we end up deciding yes, we'll discuss the mechanism
17:09:40 Milan: how do we specify a default recognizer?
17:09:47 Bjorn: don't specify it at all
17:10:25 ... since it's the default
17:11:13 Michael: need some canonical string to specify user agent default, so we could switch back to it (could be empty string)
17:12:08 ... whereas the way we specify a local one may be similar to the way we specify a remote engine
17:12:39 Bjorn: for local engines do we need to specify the engine or the criteria?
17:12:49 Burn: SSML does it this way
17:13:09 Bjorn: is there a use case for specifying criteria?
17:14:01 Burn: in Tropo API, language specification can specify a specific engine
17:16:03 Burn: this is a scoping issue. e.g. in SSML a voice is used in the scope of the enclosing element
17:16:30 ... in HTML could say that the scope is the input field, or the entire form
17:17:04 Bjorn: in all the proposals, scoping is to a javascript object
17:18:22 Bjorn: are there any other criteria for local recognizers than speed-vs-accuracy?
17:19:22 Charles: different microphones will have different profiles
17:20:01 Raj: how do we discover characteristics of installed engines
17:20:54 Michael: selection = discovery?
17:21:08 Burn: in SSML, some people wanted discovery
17:21:24 Bjorn: use cases?
17:21:47 Michael: selection of existing acoustic and language models
17:22:29 Robert: there's a blurry line between what a recognizer is, and what a parameter is
17:23:26 Michael: topic: "how to specify default recognition"
17:23:38 Michael: topic: "how to specify local recognizers"
17:24:01 Michael: topic: "do we need to specify engines by capability?"
17:25:26 Raj: or "how do we specify the parameters to the local recognizer?"
17:26:08 Burn: want to back up to "what is a recognizer, and what parameters does it need?"
17:26:56 ... call something a recognizer, and treat other things related to it as parameters of that recognizer
17:27:43 Bjorn: the API probably doesn't need to specify a recognizer. speech and parameters go somewhere and results come back
17:29:35 Burn: what is the boundary between selecting a recognizer and selecting the parameters of a recognizer
17:30:04 Milan: we need to discuss audio streaming
17:30:22 Burn: topic: "do we support audio streaming and how?"
17:30:30 Milan: Let's discuss audio streaming
17:30:52 -Bjorn_Bringert
17:30:53 -Olli_Pettay
17:30:53 -Debbie_Dahl
17:30:54 -Milan_Young
17:31:02 -Michael_Bodell
17:31:03 -Raj_Tumuluri
17:31:03 -Michael_Johnston
17:31:08 -Charles_Hemphill
17:31:13 zakim, who's on the phone?
17:31:13 On the phone I see Dan_Burnett, Robert_Brown
17:31:14 -Robert_Brown
17:31:21 -Dan_Burnett
17:31:22 INC_(HTMLSPEECH)12:00PM has ended
17:31:24 Attendees were Dan_Burnett, Olli_Pettay, Robert_Brown, Charles_Hemphill, Milan_Young, Debbie_Dahl, +1.818.237.aaaa, Bjorn_Bringert, Michael_Johnston, Raj_Tumuluri, Patrick_Ehlen,
17:31:26 ... Michael_Bodell
17:31:34 rrsagent, draft minutes
17:31:34 I have made the request to generate http://www.w3.org/2011/04/28-htmlspeech-minutes.html burn
17:31:39 rrsagent, make minutes public
17:31:39 I'm logging. I don't understand 'make minutes public', burn. Try /msg RRSAgent help
17:31:48 rrsagent, make log public
18:45:26 ddahl has left #htmlspeech
19:34:55 Zakim has left #htmlspeech