15:51:23 RRSAgent has joined #htmlspeech
15:51:23 logging to http://www.w3.org/2011/04/28-htmlspeech-irc
15:51:32 Zakim has joined #htmlspeech
15:51:39 trackbot, start telcon
15:51:41 RRSAgent, make logs public
15:51:43 Zakim, this will be
15:51:43 I don't understand 'this will be', trackbot
15:51:44 Meeting: HTML Speech Incubator Group Teleconference
15:51:44 Date: 28 April 2011
15:51:45 zakim, this will be htmlspeech
15:51:45 ok, burn; I see INC_(HTMLSPEECH)12:00PM scheduled to start in 9 minutes
15:51:58 INC_(HTMLSPEECH)12:00PM has now started
15:52:00 +Michael_Bodell
15:52:01 Chair: Dan Burnett
15:52:15 Agenda: http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Apr/0059.html
15:53:19 zakim, code?
15:53:19 the conference code is 48657 (tel:+1.617.761.6200 tel:+33.4.26.46.79.03 tel:+44.203.318.0479), burn
15:53:35 +[Voxeo]
15:53:49 zakim, [Voxeo] is Dan_Burnett
15:53:49 +Dan_Burnett; got it
15:53:55 zakim, I am Dan_Burnett
15:53:55 ok, burn, I now associate you with Dan_Burnett
15:57:26 +??P31
15:57:26 -??P31
15:57:26 +??P31
15:57:46 Zakim, ??P31 is Olli_Pettay
15:57:46 +Olli_Pettay; got it
15:57:51 +[Microsoft]
15:58:05 zakim, [Microsoft] is Robert_Brown
15:58:05 +Robert_Brown; got it
15:58:15 Zakim, nick smaug_ is Olli_Pettay
15:58:15 ok, smaug_, I now associate you with Olli_Pettay
15:58:29 Robert has joined #htmlspeech
15:58:36 zakim, nick Robert is Robert_Brown
15:58:36 ok, burn, I now associate Robert with Robert_Brown
15:59:41 +Charles_Hemphill
15:59:41 ddahl has joined #htmlspeech
15:59:45 +Milan_Young
15:59:57 + +1.760.705.aaaa - is perhaps AZ
16:00:22 +Debbie_Dahl
16:00:29 zakim, aaaa is Bjorn_Bringert
16:00:29 sorry, burn, I do not recognize a party named 'aaaa'
16:00:40 Milan has joined #htmlspeech
16:00:40 zakim, nick ddahl is Debbie_Dahl
16:00:40 ok, burn, I now associate ddahl with Debbie_Dahl
16:00:54 Scribe: Robert Brown
16:01:01 ScribeNick: Robert
16:01:13 Charles_Hemphill has joined #htmlspeech
16:03:20 Agenda: http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Apr/0059.html
16:03:46 topic: F2F logistics
16:04:02 Bjorn: nothing new logistically
16:04:12 Burn: will send revised schedule
16:04:25 topic: updated report draft
16:04:58 Raj has joined #htmlspeech
16:05:04 final report draft: http://www.w3.org/2005/Incubator/htmlspeech/live/NOTE-htmlspeech-20110426.html
16:05:40 burn: no new comments
16:05:49 zakim, +1.760.705.aaaa is Bjorn_Bringert
16:05:54 topic: new design decisions
16:06:17 Zakim has joined #htmlspeech
16:06:25 zakim, +1.760.705.aaaa is Bjorn_Bringert
16:06:25 sorry, bringert, I do not recognize a party named '+1.760.705.aaaa'
16:06:52 zakim, who's on the phone?
16:06:53 sorry, burn, I don't know what conference this is
16:06:54 On IRC I see Raj, Charles_Hemphill, Milan, ddahl, Robert, RRSAgent, burn, bringert, smaug_, trackbot
16:07:40 zakim, this is htmlspeech
16:07:40 ok, burn; that matches INC_(HTMLSPEECH)12:00PM
16:07:59 zakim, who's on the phone?
16:07:59 On the phone I see Michae, Dan_Burnett, Olli_Pettay, Robert_Brown, Charles_Hemphill, Milan_Young, AZ, Debbie_Dahl, +1.818.237.aaaa
16:08:13 zakim, AZ is now Bjorn_Bringert
16:08:13 I don't understand 'AZ is now Bjorn_Bringert', burn
16:08:59 zakim, AZ is Bjorn_Bringert
16:08:59 +Bjorn_Bringert; got it
16:09:27 +Michael_Johnston
16:09:42 Bjorn: previously we only looked at the intersection of the proposals; is there anything that's in two proposals but not the third? e.g. continuous recognition
16:10:09 +[IPcaller]
16:10:52 zakim, [IPCaller] is Raj_Tumuluri
16:10:52 +Raj_Tumuluri; got it
16:10:55 Milan: any requirement that we support this?
16:11:01 Regrets: Dan Druta
16:11:16 zakim, aaaa is Patrick_Ehlen
16:11:16 +Patrick_Ehlen; got it
16:11:37 zakim, who's on the phone?
16:11:37 On the phone I see Michae, Dan_Burnett, Olli_Pettay, Robert_Brown, Charles_Hemphill, Milan_Young, Bjorn_Bringert, Debbie_Dahl, Patrick_Ehlen, Michael_Johnston, Raj_Tumuluri
16:11:51 zakim, Michae is Michael_Bodell
16:11:51 +Michael_Bodell; got it
16:12:28 Michael_ has joined #htmlspeech
16:12:35 burn: will add continuous recognition to the list of topics to discuss
16:13:33 zakim, who's noisy?
16:13:33 I am sorry, burn; I don't have the necessary resources to track talkers right now
16:13:40 Bjorn: only removed it from the Google proposal because it was difficult to do, and we may want to do it in a later version
16:14:51 Michael: recapped two scenarios stated by Bjorn: 1) continuous speech; 2) open mic
16:15:56 Bjorn: proposed that we all agree this is a requirement
16:16:32 Milan: we were vague about what the interim events requirement meant, whether it included results
16:17:18 burn: satish is trying to join, but zakim says the conference code isn't valid
16:17:31 Burn: [after discussion] proposes Michael adds this as a new requirement (or requirements) to the report
16:17:32 zakim, code?
16:17:32 the conference code is 48657 (tel:+1.617.761.6200 tel:+33.4.26.46.79.03 tel:+44.203.318.0479), burn
16:17:56 Michael: sure, but will also check to see whether we just need to clarify an existing requirement
16:18:17 satish has joined #htmlspeech
16:18:41 Bjorn: this is also a design topic
16:20:01 ehlen has joined #htmlspeech
16:20:12 burn: will do
16:20:48 Bjorn: Robert is there anything else in the Microsoft proposal that should be considered as a design decision?
16:21:00 Robert: nothing apparent, will review again in coming week
16:21:22 Bjorn: should we start work on a joint proposal then?
16:22:03 Burn: proposes that we now go to the list of issues to discuss and discuss them
16:22:50 Bjorn: more items for discussion from Microsoft proposal
16:22:55 zakim, bringert is Bjorn_Bringert
16:22:55 sorry, burn, I do not recognize a party named 'bringert'
16:23:08 zakim, nick bringert is Bjorn_Bringert
16:23:08 ok, burn, I now associate bringert with Bjorn_Bringert
16:23:21 zakim, nick burn is Daniel_Burnett
16:23:21 sorry, burn, I do not see a party named 'Daniel_Burnett'
16:23:33 zakim, nick Charles_Hemphill is Charles_Hemphill
16:23:33 ok, burn, I now associate Charles_Hemphill with Charles_Hemphill
16:23:41 ... MS proposal supports multiple grammars, but Google & Mozilla only support one
16:23:47 zakim, nick ddahl is Debbie_Dahl
16:23:47 ok, burn, I now associate ddahl with Debbie_Dahl
16:24:04 zakim, nick ehlen is Patrick_Ehlen
16:24:04 ok, burn, I now associate ehlen with Patrick_Ehlen
16:24:05 Olli: Mozilla proposal allows multiple parallel recognitions, each with its own grammar
16:24:32 zakim, nick Charles_Hemphill is Charles_Hemphill
16:24:32 ok, burn, I now associate Charles_Hemphill with Charles_Hemphill
16:25:07 MichaelJohnston: can't reference an SLM from SRGS, so multiple grammars are required
16:25:27 Bjorn: proposes topic: Should we support multiple simultaneous grammars?
16:25:32 zakim, nick Milan is Milan_Young
16:25:32 ok, burn, I now associate Milan with Milan_Young
16:25:40 zakim, nick Raj is Raj_Tumuluri
16:25:40 ok, burn, I now associate Raj with Raj_Tumuluri
16:25:47 ... proposes topic: which timeout parameters should we have?
16:25:48 zakim, nick Robert is Robert_Brown
16:25:49 ok, burn, I now associate Robert with Robert_Brown
16:26:05 zakim, nick smaug_ is Olli_Pettay
16:26:07 ok, burn, I now associate smaug_ with Olli_Pettay
16:26:10 yeah, Mozilla proposal should have some timeouts
16:26:49 Bjorn: emulating speech input is a requirement, but it's only present in the Microsoft proposal
16:27:48 Michael: proposes topic: some way for the application to provide feedback information to the recognizer
16:29:07 Bjorn: does anybody disagree that this is a requirement we agree on?
16:29:43 Burn: proposes requirement: "it must be possible for the application author to provide feedback on the recognition result"
16:30:27 Debbie: need to discuss the result format
16:31:42 Michael: seems like general agreement on EMMA, with notion of other formats available
16:32:07 Olli: EMMA as a DOM document? Or as a JSON object?
16:32:30 MichaelJohnston: multimodal working group has been discussing JSON representations of EMMA
16:32:53 ... there are some issues, such as losing element/attribute distinction
16:33:24 ... straight translation to JSON is a little ugly
16:33:56 Michael: existing proposals include simple representations as alternatives to EMMA
16:34:48 MichaelJohnston: For more nuanced things, let's not reinvent solutions to the problems EMMA already solves
16:35:31 Milan: would rather not have EMMA mean XML, since that implies the app needs a parser
16:35:58 Debbie: sounds like we agree on EMMA, but need to discuss how it's represented, simplified formats, etc.
16:37:09 Milan: a good idea to agree that an EMMA result available through a DOM object is a baseline agreement
16:37:52 Bjorn: it's okay to provide the EMMA DOM, but we should also have the simple access mechanism that all three proposals have
16:38:08 Burn: would rather have XML or JSON, but not the DOM
16:38:22 Michael: if you have XML, you can feed it into the DOM
16:39:30 Burn: it's a minor objection, if everybody else agrees on the DOM, I'm okay with that
16:39:40 Bjorn: maybe just provide both
16:40:40 MichaelJohnston: EMMA will also help with more sophisticated multimodal apps, for example using ink. The DOM will be more convenient to work with.
16:41:17 Burn: proposed agreement: "both DOM and XML text representations of EMMA must be provided"
16:41:25 ... haven't necessarily agreed that that is all
16:43:17 Bjorn: we already appear to agree, based on proposals: "recognition results must also be available in the javascript objects where the result is a list of recognition result items containing utterance, confidence and interpretation."
16:43:54 Michael: may need to be tweaked to accommodate continuous recognition
16:44:40 Burn: add "at least" to Bjorn's proposed requirement
16:45:24 Burn: added a statement "note that this will need to be adjusted based on any decision regarding support for continuous recognition"
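[Scribe note: a minimal sketch of the result shape agreed above ("a list of recognition result items containing utterance, confidence and interpretation"). The property names and values are illustrative assumptions, not an agreed API surface:]

```javascript
// Hypothetical recognition result list, per the agreed requirement:
// each item carries an utterance string, a confidence score, and an
// application-level interpretation. Field names are assumptions.
const results = [
  { utterance: "flights to boston", confidence: 0.92,
    interpretation: { intent: "search_flights", destination: "BOS" } },
  { utterance: "lights in boston", confidence: 0.41,
    interpretation: null }
];

// An application might pick the top hypothesis above a threshold.
const best = results.find(r => r.confidence > 0.5);
console.log(best.utterance); // "flights to boston"
```

[A continuous-recognition variant would presumably deliver such lists incrementally, per the note that the requirement "will need to be adjusted".]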
16:45:59 Milan: would like to add a discussion topic around generic parameters to the recognition engine
16:47:19 Burn: related to existing topic on the list, but will add
16:47:46 Milan: also need to agree on standard parameters, such as speed-vs-accuracy
16:48:07 Burn: will generalize the timeouts discussion to include other parameters
16:49:18 MichaelJohnston: which parameters should be expressed in the JavaScript API, and what can go in the URI? What sorts of conflicts could occur?
16:49:55 Bjorn: URI parameters are engine specific
16:50:44 MichaelJohnston: for example, if we agreed that the way standard parameters are communicated is via the URI, they could come from the URI, or from the JavaScript
16:51:35 Michael: need to discuss the API/protocol to the speech engine, and how standard parameters are conveyed
16:52:23 Bjorn: we need to discuss the protocol, it's not in the list
16:53:11 Burn: will add it to the list
16:54:11 Milan: are the grammars referred to by HTTP URI?
16:54:56 Burn: existing requirement says "uri" which was intended to represent URLs and URNs
16:55:16 Milan: would like to mandate that HTTP is definitely supported; there are lots of others that may work
16:56:26 Robert: should we have a standard set of built-in grammars/topics?
16:56:49 Bjorn: in the Google proposal we had "builtin:" URIs
16:58:03 Burn: "a standard set of common tasks/grammars should be supported. details TBD"
16:58:24 Burn: need a discussion topic about what these are
16:59:11 Robert: what about inline grammars?
16:59:36 Bjorn: data URIs would work for that, and perhaps we should agree about that
16:59:48 Charles: would like to see inline grammars remain on the table
17:00:36 Burn: will add a discussion about inline grammars
17:01:39 Burn: we all agree on the functionality that inline grammars would give
17:02:02 MichaelJohnston: one target user is "mom & pop developers" who would provide simple grammars
17:03:00 Burn: discussion topic: "what is the mechanism for authors to directly include grammars within their HTML document? Is this inline XML, data URI or something else?"
17:03:07 -Patrick_Ehlen
17:04:01 Robert: use case: given that HTML5 supports local storage, the data from which a grammar is constructed may only be located on the local device
17:04:45 Bjorn: proposes that we mandate data URIs, just for consistency with the rest of HTML
17:05:02 Burn: no objections, so will record as an agreement
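[Scribe note: an illustrative sketch of the data-URI approach agreed above, embedding a small SRGS grammar inline. The grammar content and variable names are examples, not part of any proposal:]

```javascript
// Hypothetical example: an inline SRGS yes/no grammar carried in a
// data: URI (media type application/srgs+xml), as agreed for
// consistency with the rest of HTML.
const srgs =
  '<grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0" root="yn">' +
  '<rule id="yn"><one-of><item>yes</item><item>no</item></one-of></rule>' +
  '</grammar>';

// Percent-encode the XML so it is safe inside a URI.
const grammarUri = "data:application/srgs+xml," + encodeURIComponent(srgs);

// The resulting URI could then be passed wherever a grammar URI is
// accepted, alongside http: and builtin: grammars.
console.log(grammarUri.slice(0, 26)); // "data:application/srgs+xml,"
```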
17:05:50 Michael: need to discuss the ability to do re-recognition
17:06:00 Burn: related to the topic of recognition from a file
17:06:13 Bjorn: both are fine discussion topics
17:08:08 Burn: [discussion about whether there's anything to discuss around endpointing], already implied in existing discussion topic
17:08:40 Bjorn: context block?
17:09:24 Burn: discussion topic: "do we need a recognition context block capability?" and if we end up deciding yes, we'll discuss the mechanism
17:09:40 Milan: how do we specify a default recognizer?
17:09:47 Bjorn: don't specify it at all
17:10:25 ... since it's the default
17:11:13 Michael: need some canonical string to specify user agent default, so we could switch back to it (could be empty string)
17:12:08 ... whereas the way we specify a local one may be similar to the way we specify a remote engine
17:12:39 Bjorn: for local engines do we need to specify the engine or the criteria?
17:12:49 Burn: SSML does it this way
17:13:09 Bjorn: is there a use case for specifying criteria?
17:14:01 Burn: in Tropo API, language specification can specify a specific engine
17:16:03 Burn: this is a scoping issue. e.g. in SSML a voice is used in the scope of the enclosing element
17:16:30 ... in HTML could say that the scope is the input field, or the entire form
17:17:04 Bjorn: in all the proposals, scoping is to a javascript object
17:18:22 Bjorn: are there any other criteria for local recognizers than speed-vs-accuracy?
17:19:22 Charles: different microphones will have different profiles
17:20:01 Raj: how do we discover characteristics of installed engines
17:20:54 Michael: selection = discovery?
17:21:08 Burn: in SSML, some people wanted discovery
17:21:24 Bjorn: use cases?
17:21:47 Michael: selection of existing acoustic and language models
17:22:29 Robert: there's a blurry line between what a recognizer is, and what a parameter is
17:23:26 Michael: topic: "how to specify default recognition"
17:23:38 Michael: topic: "how to specify local recognizers"
17:24:01 Michael: topic: "do we need to specify engines by capability?"
17:25:26 Raj: or "how do we specify the parameters to the local recognizer?"
17:26:08 Burn: want to back up to "what is a recognizer, and what parameters does it need?"
17:26:56 ... call something a recognizer, and treat other things related to it as parameters of that recognizer
17:27:43 Bjorn: the API probably doesn't need to specify a recognizer. speech and parameters go somewhere and results come back
17:29:35 Burn: what is the boundary between selecting a recognizer and selecting the parameters of a recognizer
17:30:04 Milan: we need to discuss audio streaming
17:30:22 Burn: topic: "do we support audio streaming and how?"
17:30:30 Milan: Let's discuss audio streaming
17:30:52 -Bjorn_Bringert
17:30:53 -Olli_Pettay
17:30:53 -Debbie_Dahl
17:30:54 -Milan_Young
17:31:02 -Michael_Bodell
17:31:03 -Raj_Tumuluri
17:31:03 -Michael_Johnston
17:31:08 -Charles_Hemphill
17:31:13 zakim, who's on the phone?
17:31:13 On the phone I see Dan_Burnett, Robert_Brown
17:31:14 -Robert_Brown
17:31:21 -Dan_Burnett
17:31:22 INC_(HTMLSPEECH)12:00PM has ended
17:31:24 Attendees were Dan_Burnett, Olli_Pettay, Robert_Brown, Charles_Hemphill, Milan_Young, Debbie_Dahl, +1.818.237.aaaa, Bjorn_Bringert, Michael_Johnston, Raj_Tumuluri, Patrick_Ehlen,
17:31:26 ... Michael_Bodell
17:31:34 rrsagent, draft minutes
17:31:34 I have made the request to generate http://www.w3.org/2011/04/28-htmlspeech-minutes.html burn
17:31:39 rrsagent, make minutes public
17:31:39 I'm logging. I don't understand 'make minutes public', burn. Try /msg RRSAgent help
17:31:48 rrsagent, make log public
18:45:26 ddahl has left #htmlspeech
19:34:55 Zakim has left #htmlspeech