IRC log of htmlspeech on 2011-04-28
Timestamps are in UTC.
- 15:51:23 [RRSAgent]
- RRSAgent has joined #htmlspeech
- 15:51:23 [RRSAgent]
- logging to http://www.w3.org/2011/04/28-htmlspeech-irc
- 15:51:32 [Zakim]
- Zakim has joined #htmlspeech
- 15:51:39 [burn]
- trackbot, start telcon
- 15:51:41 [trackbot]
- RRSAgent, make logs public
- 15:51:43 [trackbot]
- Zakim, this will be
- 15:51:43 [Zakim]
- I don't understand 'this will be', trackbot
- 15:51:44 [trackbot]
- Meeting: HTML Speech Incubator Group Teleconference
- 15:51:44 [trackbot]
- Date: 28 April 2011
- 15:51:45 [burn]
- zakim, this will be htmlspeech
- 15:51:45 [Zakim]
- ok, burn; I see INC_(HTMLSPEECH)12:00PM scheduled to start in 9 minutes
- 15:51:58 [Zakim]
- INC_(HTMLSPEECH)12:00PM has now started
- 15:52:00 [Zakim]
- +Michael_Bodell
- 15:52:01 [burn]
- Chair: Dan Burnett
- 15:52:15 [burn]
- Agenda: http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Apr/0059.html
- 15:53:19 [burn]
- zakim, code?
- 15:53:19 [Zakim]
- the conference code is 48657 (tel:+1.617.761.6200 tel:+33.4.26.46.79.03 tel:+44.203.318.0479), burn
- 15:53:35 [Zakim]
- +[Voxeo]
- 15:53:49 [burn]
- zakim, [Voxeo] is Dan_Burnett
- 15:53:49 [Zakim]
- +Dan_Burnett; got it
- 15:53:55 [burn]
- zakim, I am Dan_Burnett
- 15:53:55 [Zakim]
- ok, burn, I now associate you with Dan_Burnett
- 15:57:26 [Zakim]
- +??P31
- 15:57:26 [Zakim]
- -??P31
- 15:57:26 [Zakim]
- +??P31
- 15:57:46 [smaug_]
- Zakim, ??P31 is Olli_Pettay
- 15:57:46 [Zakim]
- +Olli_Pettay; got it
- 15:57:51 [Zakim]
- +[Microsoft]
- 15:58:05 [burn]
- zakim, [Microsoft] is Robert_Brown
- 15:58:05 [Zakim]
- +Robert_Brown; got it
- 15:58:15 [smaug_]
- Zakim, nick smaug_ is Olli_Pettay
- 15:58:15 [Zakim]
- ok, smaug_, I now associate you with Olli_Pettay
- 15:58:29 [Robert]
- Robert has joined #htmlspeech
- 15:58:36 [burn]
- zakim, nick Robert is Robert_Brown
- 15:58:36 [Zakim]
- ok, burn, I now associate Robert with Robert_Brown
- 15:59:41 [Zakim]
- +Charles_Hemphill
- 15:59:41 [ddahl]
- ddahl has joined #htmlspeech
- 15:59:45 [Zakim]
- +Milan_Young
- 15:59:57 [Zakim]
- + +1.760.705.aaaa - is perhaps AZ
- 16:00:22 [Zakim]
- +Debbie_Dahl
- 16:00:29 [burn]
- zakim, aaaa is Bjorn_Bringert
- 16:00:29 [Zakim]
- sorry, burn, I do not recognize a party named 'aaaa'
- 16:00:40 [Milan]
- Milan has joined #htmlspeech
- 16:00:40 [burn]
- zakim, nick ddahl is Debbie_Dahl
- 16:00:40 [Zakim]
- ok, burn, I now associate ddahl with Debbie_Dahl
- 16:00:54 [burn]
- Scribe: Robert Brown
- 16:01:01 [burn]
- ScribeNick: Robert
- 16:01:13 [Charles_Hemphill]
- Charles_Hemphill has joined #htmlspeech
- 16:03:46 [Robert]
- topic: F2F logistics
- 16:04:02 [Robert]
- Bjorn: nothing new logistically
- 16:04:12 [Robert]
- Burn: will send revised schedule
- 16:04:25 [Robert]
- topic: updated report draft
- 16:04:58 [Raj]
- Raj has joined #htmlspeech
- 16:05:04 [burn]
- final report draft: http://www.w3.org/2005/Incubator/htmlspeech/live/NOTE-htmlspeech-20110426.html
- 16:05:40 [Robert]
- burn: no new comments
- 16:05:49 [bringert]
- zakim, +1.760.705.aaaa is Bjorn_Bringert
- 16:05:54 [Robert]
- topic: new design decisions
- 16:06:17 [Zakim]
- Zakim has joined #htmlspeech
- 16:06:25 [bringert]
- zakim, +1.760.705.aaaa is Bjorn_Bringert
- 16:06:25 [Zakim]
- sorry, bringert, I do not recognize a party named '+1.760.705.aaaa'
- 16:06:52 [burn]
- zakim, who's on the phone?
- 16:06:53 [Zakim]
- sorry, burn, I don't know what conference this is
- 16:06:54 [Zakim]
- On IRC I see Raj, Charles_Hemphill, Milan, ddahl, Robert, RRSAgent, burn, bringert, smaug_, trackbot
- 16:07:40 [burn]
- zakim, this is htmlspeech
- 16:07:40 [Zakim]
- ok, burn; that matches INC_(HTMLSPEECH)12:00PM
- 16:07:59 [burn]
- zakim, who's on the phone?
- 16:07:59 [Zakim]
- On the phone I see Michae, Dan_Burnett, Olli_Pettay, Robert_Brown, Charles_Hemphill, Milan_Young, AZ, Debbie_Dahl, +1.818.237.aaaa
- 16:08:13 [burn]
- zakim, AZ is now Bjorn_Bringert
- 16:08:13 [Zakim]
- I don't understand 'AZ is now Bjorn_Bringert', burn
- 16:08:59 [burn]
- zakim, AZ is Bjorn_Bringert
- 16:08:59 [Zakim]
- +Bjorn_Bringert; got it
- 16:09:27 [Zakim]
- +Michael_Johnston
- 16:09:42 [Robert]
- Bjorn: previously only looked at intersection of proposals. Is there anything that's in two proposals but not the third? e.g. continuous recognition
- 16:10:09 [Zakim]
- +[IPcaller]
- 16:10:52 [burn]
- zakim, [IPCaller] is Raj_Tumuluri
- 16:10:52 [Zakim]
- +Raj_Tumuluri; got it
- 16:10:55 [Robert]
- Milan: any requirement that we support this?
- 16:11:01 [burn]
- Regrets: Dan Druta
- 16:11:16 [burn]
- zakim, aaaa is Patrick_Ehlen
- 16:11:16 [Zakim]
- +Patrick_Ehlen; got it
- 16:11:37 [burn]
- zakim, who's on the phone?
- 16:11:37 [Zakim]
- On the phone I see Michae, Dan_Burnett, Olli_Pettay, Robert_Brown, Charles_Hemphill, Milan_Young, Bjorn_Bringert, Debbie_Dahl, Patrick_Ehlen, Michael_Johnston, Raj_Tumuluri
- 16:11:51 [burn]
- zakim, Michae is Michael_Bodell
- 16:11:51 [Zakim]
- +Michael_Bodell; got it
- 16:12:28 [Michael_]
- Michael_ has joined #htmlspeech
- 16:12:35 [Robert]
- burn: will add continuous recognition to the list of topics to discuss
- 16:13:33 [burn]
- zakim, who's noisy?
- 16:13:33 [Zakim]
- I am sorry, burn; I don't have the necessary resources to track talkers right now
- 16:13:40 [Robert]
- Bjorn: only removed it from Google proposal because it's difficult to do, and may want to do it in a later version
- 16:14:51 [Robert]
- Michael: recapped two scenarios stated by Bjorn: 1) continuous speech; 2) open mic
- 16:15:56 [Robert]
- Bjorn: proposed that we all agree this is a requirement
- 16:16:32 [Robert]
- Milan: we were vague about what the interim events requirement meant, whether it included results
- 16:17:18 [bringert]
- burn: satish is trying to join, but zakim says the conference code isn't valid
- 16:17:31 [Robert]
- Burn: [after discussion] proposes Michael adds this as a new requirement (or requirements) to the report
- 16:17:32 [burn]
- zakim, code?
- 16:17:32 [Zakim]
- the conference code is 48657 (tel:+1.617.761.6200 tel:+33.4.26.46.79.03 tel:+44.203.318.0479), burn
- 16:17:56 [Robert]
- Michael: sure, but will also check to see whether we just need to clarify an existing requirement
- 16:18:17 [satish]
- satish has joined #htmlspeech
- 16:18:41 [Robert]
- Bjorn: this is also a design topic
- 16:20:01 [ehlen]
- ehlen has joined #htmlspeech
- 16:20:12 [satish]
- burn: will do
- 16:20:48 [Robert]
- Bjorn: Robert, is there anything else in the Microsoft proposal that should be considered as a design decision?
- 16:21:00 [Robert]
- Robert: nothing apparent, will review again in coming week
- 16:21:22 [Robert]
- Bjorn: should we start work on a joint proposal then?
- 16:22:03 [Robert]
- Burn: proposes that we now go to the list of issues to discuss and discuss them
- 16:22:50 [Robert]
- Bjorn: more items for discussion from Microsoft proposal
- 16:22:55 [burn]
- zakim, bringert is Bjorn_Bringert
- 16:22:55 [Zakim]
- sorry, burn, I do not recognize a party named 'bringert'
- 16:23:08 [burn]
- zakim, nick bringert is Bjorn_Bringert
- 16:23:08 [Zakim]
- ok, burn, I now associate bringert with Bjorn_Bringert
- 16:23:21 [burn]
- zakim, nick burn is Daniel_Burnett
- 16:23:21 [Zakim]
- sorry, burn, I do not see a party named 'Daniel_Burnett'
- 16:23:33 [burn]
- zakim, nick Charles_Hemphill is Charles_Hemphill
- 16:23:33 [Zakim]
- ok, burn, I now associate Charles_Hemphill with Charles_Hemphill
- 16:23:41 [Robert]
- ... MS proposal supports multiple grammars, but the Google and Mozilla proposals support only one
- 16:23:47 [burn]
- zakim, nick ddahl is Debbie_Dahl
- 16:23:47 [Zakim]
- ok, burn, I now associate ddahl with Debbie_Dahl
- 16:24:04 [burn]
- zakim, nick ehlen is Patrick_Ehlen
- 16:24:04 [Zakim]
- ok, burn, I now associate ehlen with Patrick_Ehlen
- 16:24:05 [Robert]
- Olli: Mozilla proposal allows multiple parallel recognitions, each with its own grammar
- 16:24:32 [burn]
- zakim, nick Charles_Hemphill is Charles_Hemphill
- 16:24:32 [Zakim]
- ok, burn, I now associate Charles_Hemphill with Charles_Hemphill
- 16:25:07 [Robert]
- MichaelJohnston: can't reference an SLM from SRGS, so multiple grammars are required
- 16:25:27 [Robert]
- Bjorn: proposes topic: Should we support multiple simultaneous grammars?
- 16:25:32 [burn]
- zakim, nick Milan is Milan_Young
- 16:25:32 [Zakim]
- ok, burn, I now associate Milan with Milan_Young
- 16:25:40 [burn]
- zakim, nick Raj is Raj_Tumuluri
- 16:25:40 [Zakim]
- ok, burn, I now associate Raj with Raj_Tumuluri
- 16:25:47 [Robert]
- ... proposes topic: which timeout parameters should we have?
- 16:25:48 [burn]
- zakim, nick Robert is Robert_Brown
- 16:25:49 [Zakim]
- ok, burn, I now associate Robert with Robert_Brown
- 16:26:05 [burn]
- zakim, nick smaug_ is Olli_Pettay
- 16:26:07 [Zakim]
- ok, burn, I now associate smaug_ with Olli_Pettay
- 16:26:10 [smaug_]
- yeah, Mozilla proposal should have some timeouts
- 16:26:49 [Robert]
- Bjorn: emulating speech input is a requirement, but it's only present in the Microsoft proposal
- 16:27:48 [Robert]
- Michael: proposes topic: some way for the application to provide feedback information to the recognizer
- 16:29:07 [Robert]
- Bjorn: does anybody disagree that this is a requirement we agree on?
- 16:29:43 [Robert]
- Burn: proposes requirement: "it must be possible for the application author to provide feedback on the recognition result"
- 16:30:27 [Robert]
- Debbie: need to discuss the result format
- 16:31:42 [Robert]
- Michael: seems like general agreement on EMMA, with notion of other formats available
- 16:32:07 [Robert]
- Olli: EMMA as a DOM document? Or as a JSON object?
- 16:32:30 [Robert]
- MichaelJohnston: multimodal working group has been discussing JSON representations of EMMA
- 16:32:53 [Robert]
- ... there are some issues, such as losing element/attribute distinction
- 16:33:24 [Robert]
- ... straight translation to JSON is a little ugly
- 16:33:56 [Robert]
- Michael: existing proposals include simple representations as alternatives to EMMA
- 16:34:48 [Robert]
- MichaelJohnston: For more nuanced things, let's not reinvent solutions to the problems EMMA already solves
- 16:35:31 [Robert]
- Milan: would rather not have EMMA mean XML, since that implies the app needs a parser
- 16:35:58 [Robert]
- Debbie: sounds like we agree on EMMA, but need to discuss how it's represented, simplified formats, etc.
- 16:37:09 [Robert]
- Milan: it would be a good idea to agree, as a baseline, that an EMMA result is available through a DOM object
- 16:37:52 [Robert]
- Bjorn: it's okay to provide the EMMA DOM, but we should also have the simple access mechanism that all three proposals have
- 16:38:08 [Robert]
- Burn: would rather have XML or JSON, but not the DOM
- 16:38:22 [Robert]
- Michael: if you have XML, you can feed it into the DOM
- 16:39:30 [Robert]
- Burn: it's a minor objection, if everybody else agrees on the DOM, I'm okay with that
- 16:39:40 [Robert]
- Bjorn: maybe just provide both
- 16:40:40 [Robert]
- MichaelJohnston: EMMA will also help with more sophisticated multimodal apps, for example using ink. The DOM will be more convenient to work with.
- 16:41:17 [Robert]
- Burn: proposed agreement: "both DOM and XML text representations of EMMA must be provided"
- 16:41:25 [Robert]
- ... haven't necessarily agreed that that is all
- 16:43:17 [Robert]
- Bjorn: we already appear to agree, based on proposals: "recognition results must also be available in JavaScript objects, where the result is a list of recognition result items containing utterance, confidence and interpretation."
- 16:43:54 [Robert]
- Michael: may need to be tweaked to accommodate continuous recognition
- 16:44:40 [Robert]
- Burn: add "at least" to Bjorn's proposed requirement
- 16:45:24 [Robert]
- Burn: added a statement "note that this will need to be adjusted based on any decision regarding support for continuous recognition"
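
A minimal sketch of the result shape described in Bjorn's proposed requirement, assuming a plain list of items. The names `RecognitionResultItem` and `bestResult` are hypothetical and do not come from any of the three proposals, the 0..1 confidence range is an assumption, and the shape may change depending on the continuous recognition decision.

```typescript
// Hypothetical shape: field names follow the wording of the proposed
// requirement ("utterance, confidence and interpretation"), not a published API.
interface RecognitionResultItem {
  utterance: string;       // recognized text
  confidence: number;      // assumed here to lie in the range 0..1
  interpretation: unknown; // semantic interpretation; format left open
}

// Return the item with the highest confidence, or undefined for an empty list.
function bestResult(
  items: RecognitionResultItem[],
): RecognitionResultItem | undefined {
  return items.reduce<RecognitionResultItem | undefined>(
    (best, item) =>
      best === undefined || item.confidence > best.confidence ? item : best,
    undefined,
  );
}

// Illustrative data only; not output from a real recognizer.
const sample: RecognitionResultItem[] = [
  { utterance: "call bob", confidence: 0.62, interpretation: { who: "bob" } },
  { utterance: "call rob", confidence: 0.81, interpretation: { who: "rob" } },
];

console.log(bestResult(sample)?.utterance);
```

The "at least" wording means an implementation could attach further fields to each item; the interface above lists only the three that were named.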
- 16:45:59 [Robert]
- Milan: would like to add a discussion topic around generic parameters to the recognition engine
- 16:47:19 [Robert]
- Burn: related to existing topic on the list, but will add
- 16:47:46 [Robert]
- Milan: also need to agree on standard parameters, such as speed-vs-accuracy
- 16:48:07 [Robert]
- Burn: will generalize the timeouts discussion to include other parameters
- 16:49:18 [Robert]
- MichaelJohnston: which parameters should be expressed in the JavaScript API, and what can go in the URI? What sorts of conflicts could occur?
- 16:49:55 [Robert]
- Bjorn: URI parameters are engine specific
- 16:50:44 [Robert]
- MichaelJohnston: for example, if we agreed that the way standard parameters are communicated is via the URI, they could come from the URI, or from the JavaScript
- 16:51:35 [Robert]
- Michael: need to discuss the API/protocol to the speech engine, and how standard parameters are conveyed
- 16:52:23 [Robert]
- Bjorn: we need to discuss the protocol, it's not in the list
- 16:53:11 [Robert]
- Burn: will add it to the list
- 16:54:11 [Robert]
- Milan: are the grammars referred to by HTTP URI?
- 16:54:56 [Robert]
- Burn: existing requirement says "uri" which was intended to represent URLs and URNs
- 16:55:16 [Robert]
- Milan: would like to mandate that HTTP is definitely supported. There are lots of other schemes that may work.
- 16:56:26 [Robert]
- Robert: should we have a standard set of built-in grammars/topics?
- 16:56:49 [Robert]
- Bjorn: in the Google proposal we had "builtin:" URIs
- 16:58:03 [Robert]
- Burn: "a standard set of common tasks/grammars should be supported. details TBD"
- 16:58:24 [Robert]
- Burn: need a discussion topic about what these are
- 16:59:11 [Robert]
- Robert: what about inline grammars?
- 16:59:36 [Robert]
- Bjorn: data URIs would work for that, and perhaps we should agree about that
- 16:59:48 [Robert]
- Charles: would like to see inline grammars remain on the table
- 17:00:36 [Robert]
- Burn: will add a discussion about inline grammars
- 17:01:39 [Robert]
- Burn: we all agree on the functionality that inline grammars would give
- 17:02:02 [Robert]
- MichaelJohnston: one target user is "mom & pop developers" who would provide simple grammars
- 17:03:00 [Robert]
- Burn: discussion topic: "what is the mechanism for authors to directly include grammars within their HTML document? Is this inline XML, data URI or something else?"
- 17:03:07 [Zakim]
- -Patrick_Ehlen
- 17:04:01 [Robert]
- Robert: use case: given that HTML5 supports local storage, the data from which a grammar is constructed may only be located on the local device
- 17:04:45 [Robert]
- Bjorn: proposes that we mandate data URIs, just for consistency with the rest of HTML
- 17:05:02 [Robert]
- Burn: no objections, so will record as an agreement
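
A minimal sketch of how an inline grammar could be carried in a data URI, assuming the grammar is SRGS XML and is percent-encoded rather than base64-encoded. The helper name `grammarToDataUri` and the sample grammar are hypothetical; the minutes do not say how a grammar URI would be handed to the recognizer.

```typescript
// Hypothetical helper: wraps grammar text in a data: URI (RFC 2397 form
// "data:<mediatype>,<percent-encoded data>").
function grammarToDataUri(
  grammarXml: string,
  mediaType: string = "application/srgs+xml",
): string {
  return `data:${mediaType},${encodeURIComponent(grammarXml)}`;
}

// Illustrative yes/no grammar built in the page, e.g. from local data.
const yesNoGrammar = `<grammar xmlns="http://www.w3.org/2001/06/grammar"
         version="1.0" xml:lang="en-US" root="yesno">
  <rule id="yesno"><one-of><item>yes</item><item>no</item></one-of></rule>
</grammar>`;

const grammarUri = grammarToDataUri(yesNoGrammar);
console.log(grammarUri.slice(0, 40));
```

Because the grammar text is assembled in script, this also fits Robert's use case where the data behind the grammar exists only on the local device.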
- 17:05:50 [Robert]
- Michael: need to discuss the ability to do re-recognition
- 17:06:00 [Robert]
- Burn: related to the topic of recognition from a file
- 17:06:13 [Robert]
- Bjorn: both are fine discussion topics
- 17:08:08 [Robert]
- Burn: [discussion about whether there's anything to discuss around endpointing], already implied in existing discussion topic
- 17:08:40 [Robert]
- Bjorn: context block?
- 17:09:24 [Robert]
- Burn: discussion topic: "do we need a recognition context block capability?" and if we end up deciding yes, we'll discuss the mechanism
- 17:09:40 [Robert]
- Milan: how do we specify a default recognizer?
- 17:09:47 [Robert]
- Bjorn: don't specify it at all
- 17:10:25 [Robert]
- ... since it's the default
- 17:11:13 [Robert]
- Michael: need some canonical string to specify user agent default, so we could switch back to it (could be empty string)
- 17:12:08 [Robert]
- ... Whereas how we specify a local one may be similar to the way to specify the remote engine
- 17:12:39 [Robert]
- Bjorn: for local engines do we need to specify the engine or the criteria?
- 17:12:49 [Robert]
- Burn: SSML does it this way
- 17:13:09 [Robert]
- Bjorn: is there a use case for specifying criteria?
- 17:14:01 [Robert]
- Burn: in Tropo API, language specification can specify a specific engine
- 17:16:03 [Robert]
- Burn: this is a scoping issue. e.g. in SSML a voice is used in the scope of the enclosing element
- 17:16:30 [Robert]
- ... in HTML could say that the scope is the input field, or the entire form
- 17:17:04 [Robert]
- Bjorn: in all the proposals, scoping is to a JavaScript object
- 17:18:22 [Robert]
- Bjorn: are there any other criteria for local recognizers than speed-vs-accuracy?
- 17:19:22 [Robert]
- Charles: different microphones will have different profiles
- 17:20:01 [Robert]
- Raj: how do we discover characteristics of installed engines?
- 17:20:54 [Robert]
- Michael: selection = discovery?
- 17:21:08 [Robert]
- Burn: in SSML, some people wanted discovery
- 17:21:24 [Robert]
- Bjorn: use cases?
- 17:21:47 [Robert]
- Michael: selection of existing acoustic and language models
- 17:22:29 [Robert]
- Robert: there's a blurry line between what a recognizer is, and what a parameter is
- 17:23:26 [Robert]
- Michael: topic: "how to specify default recognition"
- 17:23:38 [Robert]
- Michael: topic: "how to specify local recognizers"
- 17:24:01 [Robert]
- Michael: topic: "do we need to specify engines by capability?"
- 17:25:26 [Robert]
- Raj: or "how do we specify the parameters to the local recognizer?"
- 17:26:08 [Robert]
- Burn: want to back up to "what is a recognizer, and what parameters does it need?"
- 17:26:56 [Robert]
- ... call something a recognizer, and call other things related to that a recognizer
- 17:27:43 [Robert]
- Bjorn: the API probably doesn't need to specify a recognizer. speech and parameters go somewhere and results come back
- 17:29:35 [Robert]
- Burn: what is the boundary between selecting a recognizer and selecting the parameters of a recognizer?
- 17:30:04 [Robert]
- Milan: we need to discuss audio streaming
- 17:30:22 [Robert]
- Burn: topic: "do we support audio streaming and how?"
- 17:30:30 [Milan]
- Milan: Let's discuss audio streaming
- 17:30:52 [Zakim]
- -Bjorn_Bringert
- 17:30:53 [Zakim]
- -Olli_Pettay
- 17:30:53 [Zakim]
- -Debbie_Dahl
- 17:30:54 [Zakim]
- -Milan_Young
- 17:31:02 [Zakim]
- -Michael_Bodell
- 17:31:03 [Zakim]
- -Raj_Tumuluri
- 17:31:03 [Zakim]
- -Michael_Johnston
- 17:31:08 [Zakim]
- -Charles_Hemphill
- 17:31:13 [burn]
- zakim, who's on the phone?
- 17:31:13 [Zakim]
- On the phone I see Dan_Burnett, Robert_Brown
- 17:31:14 [Zakim]
- -Robert_Brown
- 17:31:21 [Zakim]
- -Dan_Burnett
- 17:31:22 [Zakim]
- INC_(HTMLSPEECH)12:00PM has ended
- 17:31:24 [Zakim]
- Attendees were Dan_Burnett, Olli_Pettay, Robert_Brown, Charles_Hemphill, Milan_Young, Debbie_Dahl, +1.818.237.aaaa, Bjorn_Bringert, Michael_Johnston, Raj_Tumuluri, Patrick_Ehlen,
- 17:31:26 [Zakim]
- ... Michael_Bodell
- 17:31:34 [burn]
- rrsagent, draft minutes
- 17:31:34 [RRSAgent]
- I have made the request to generate http://www.w3.org/2011/04/28-htmlspeech-minutes.html burn
- 17:31:39 [burn]
- rrsagent, make minutes public
- 17:31:39 [RRSAgent]
- I'm logging. I don't understand 'make minutes public', burn. Try /msg RRSAgent help
- 17:31:48 [burn]
- rrsagent, make log public
- 18:45:26 [ddahl]
- ddahl has left #htmlspeech
- 19:34:55 [Zakim]
- Zakim has left #htmlspeech