IRC log of htmlspeech on 2010-11-11

Timestamps are in UTC.

16:31:50 [RRSAgent]
RRSAgent has joined #htmlspeech
16:31:50 [RRSAgent]
logging to
16:36:36 [burn]
zakim, nick burn is DanBurnett
16:36:36 [Zakim]
sorry, burn, I do not see a party named 'DanBurnett'
16:40:11 [burn]
rrsagent, bye
16:41:28 [burn]
rrsagent, draft minutes
16:41:28 [RRSAgent]
I have made the request to generate burn
16:41:31 [burn]
rrsagent, bye
16:57:03 [marc]
marc has joined #htmlspeech
16:57:08 [Zakim]
INC_(HTMLSPEECH)12:00PM has now started
16:57:15 [Zakim]
16:57:46 [burn]
zakim, code?
16:57:46 [Zakim]
the conference code is 48657 (tel:+1.617.761.6200 tel:+ tel:+44.203.318.0479), burn
16:58:32 [ddahl]
ddahl has joined #htmlspeech
16:58:32 [Zakim]
16:58:37 [Zakim]
16:59:14 [Zakim]
16:59:29 [ddahl]
trackbot start telcon
16:59:54 [burn]
zakim, mute P7
16:59:54 [Zakim]
sorry, burn, I do not know which phone connection belongs to P7
17:00:02 [mbodell]
mbodell has joined #htmlspeech
17:00:04 [ddahl]
trackbot, start telcon
17:00:04 [burn]
zakim, mute ??P7
17:00:04 [Zakim]
??P7 should now be muted
17:00:06 [Zakim]
17:00:07 [trackbot]
RRSAgent, make logs public
17:00:09 [trackbot]
Zakim, this will be
17:00:09 [Zakim]
I don't understand 'this will be', trackbot
17:00:10 [trackbot]
Meeting: HTML Speech Incubator Group Teleconference
17:00:10 [trackbot]
Date: 11 November 2010
17:00:11 [marc]
17:00:16 [burn]
zakim, unmute ??P7
17:00:16 [Zakim]
??P7 should no longer be muted
17:00:22 [smaug_]
Zakim, ??P14 is Olli_Pettay
17:00:22 [Zakim]
+Olli_Pettay; got it
17:00:32 [smaug_]
Zakim, smaug is Olli_Pettay
17:00:32 [Zakim]
sorry, smaug_, I do not recognize a party named 'smaug'
17:00:35 [Zakim]
+ +39.335.766.aaaa
17:00:38 [smaug_]
Zakim, smaug_ is Olli_Pettay
17:00:38 [Zakim]
sorry, smaug_, I do not recognize a party named 'smaug_'
17:00:39 [burn]
zakim, ??P7 is Marc_Schroeder
17:00:40 [Zakim]
+Marc_Schroeder; got it
17:00:43 [ddahl]
chair: Dan_Burnett
17:00:55 [burn]
zakim, ??P2 is Dan_Burnett
17:00:55 [Zakim]
+Dan_Burnett; got it
17:00:56 [Zakim]
17:01:10 [burn]
zakim, nick burn is Dan_Burnett
17:01:10 [Zakim]
ok, burn, I now associate you with Dan_Burnett
17:01:18 [smaug_]
17:01:29 [smaug_]
Zakim, nick smaug_ is Olli_Pettay
17:01:29 [Zakim]
ok, smaug_, I now associate you with Olli_Pettay
17:01:40 [marc]
zakim, nick marc is Marc_Schroeder
17:01:40 [Zakim]
ok, marc, I now associate you with Marc_Schroeder
17:01:45 [burn]
zakim, who's on the phone?
17:01:45 [Zakim]
On the phone I see [Microsoft], Dan_Burnett, Marc_Schroeder, Debbie_Dahl, Olli_Pettay, +39.335.766.aaaa, Michael_Bodell
17:01:51 [ddahl]
zakim, nick ddahl is Debbie_Dahl
17:01:51 [Zakim]
ok, ddahl, I now associate you with Debbie_Dahl
17:01:55 [burn]
zakim, aaaa is Paolo_Baggia
17:01:55 [Zakim]
+Paolo_Baggia; got it
17:02:00 [Zakim]
17:02:05 [Zakim]
+ +1.760.705.aabb
17:02:35 [burn]
zakim, aabb is Bjorn_Bringert
17:02:37 [Zakim]
+Bjorn_Bringert; got it
17:03:06 [Milan]
Milan has joined #htmlspeech
17:03:16 [burn]
zakim, who's on the phone?
17:03:16 [Zakim]
On the phone I see [Microsoft], Dan_Burnett, Marc_Schroeder, Debbie_Dahl, Olli_Pettay, Paolo_Baggia, Michael_Bodell, Milan_Young, Bjorn_Bringert
17:03:26 [ddahl]
rrsagent, start logging
17:03:26 [RRSAgent]
I'm logging. I don't understand 'start logging', ddahl. Try /msg RRSAgent help
17:04:01 [Zakim]
+ +1.425.391.aacc
17:04:38 [burn]
zakim, [Microsoft] is temporarily Robert_Brown
17:04:39 [Zakim]
+Robert_Brown; got it
17:04:59 [burn]
zakim, aacc is Dan_Druta
17:04:59 [Zakim]
+Dan_Druta; got it
17:05:11 [burn]
zakim, who's here?
17:05:20 [Zakim]
On the phone I see Robert_Brown, Dan_Burnett, Marc_Schroeder, Debbie_Dahl, Olli_Pettay, Paolo_Baggia, Michael_Bodell, Milan_Young, Bjorn_Bringert, Dan_Druta
17:05:23 [Zakim]
On IRC I see Milan, mbodell, ddahl, marc, RRSAgent, Zakim, burn, smaug_, trackbot
17:05:47 [burn]
zakim, nick Milan is Milan_Young
17:05:47 [Zakim]
ok, burn, I now associate Milan with Milan_Young
17:05:59 [burn]
zakim, nick ddahl is Debbie_Dahl
17:05:59 [Zakim]
ok, burn, I now associate ddahl with Debbie_Dahl
17:06:13 [burn]
zakim, nick mbodell is Michael_Bodell
17:06:13 [Zakim]
ok, burn, I now associate mbodell with Michael_Bodell
17:06:56 [ddahl]
topic: minute takers and expectations
17:07:30 [ddahl]
scribe: ddahl
17:08:03 [ddahl]
dan: important to start and end on time
17:08:49 [ddahl]
... in the future we will track when people take minutes, also will prefer to select people who join late and haven't taken minutes
17:09:16 [bringert]
bringert has joined #htmlspeech
17:09:20 [ddahl]
...will start asking newer people in the coming weeks
17:09:52 [ddahl]
...suggestion that we check on f2f minutes and requirements draft
17:10:25 [ddahl]
...send minor corrections to minutes by email, or bring up major corrections now.
17:10:55 [ddahl]
requirements draft sent out based on f2f discussion by Michael Bodell
17:11:48 [smaug_]
17:11:49 [ddahl]
michael: started new section and struck out requirements that have been deleted. the new draft just covers f2f
17:12:08 [ddahl]
dan: concerns about requirements?
17:12:12 [burn]
agenda is at
17:12:33 [burn]
zakim, who is noisy?
17:12:44 [Zakim]
burn, listening for 10 seconds I heard sound from the following: Marc_Schroeder (85%), Paolo_Baggia (14%)
17:13:02 [ddahl]
marc: should this requirements document be at a stable uri?
17:13:20 [ddahl]
michael: would like to do this, but still need CVS access
17:13:32 [ddahl]
dan: we're going through this process as quickly as we can
17:13:54 [ddahl]
...this is supposed to be more of a live draft
17:14:01 [ddahl]
...comments on requirements?
17:14:25 [ddahl]
paolo: would like to know more about section 4, is that the place where the new changes will be placed?
17:14:30 [ddahl]
michael: yes
17:14:58 [ddahl]
we go through this, all of section 3 will be moved, but section 4 will replace it
17:15:16 [ddahl]
...confusing to mix old and new requirements
17:15:40 [ddahl]
dan: paolo, were you asking if there was a way to link old and new requirements?
17:16:00 [ddahl]
paolo: yes, and in the old one there was explanation
17:16:30 [ddahl]
michael: was mainly trying to capture what we agreed on rather than include a lot of text
17:16:42 [ddahl]
ones link back to old ones
17:17:13 [marc]
zakim, who is noisy?
17:17:24 [Zakim]
marc, listening for 10 seconds I could not identify any sounds
17:17:40 [ddahl]
dan: this approach reduces confusion about which are the new requirements, when we finish the old requirements I will ask if there's anything that didn't get captured, then it will be safe to remove old requirements.
17:17:50 [ddahl]
topic: requirements
17:18:07 [ddahl]
dan: start with R21
17:18:55 [ddahl]
michael: status is that there was mostly agreement that it was out of scope, was not sure what the scenario was that Eric Johansson had in mind.
17:19:53 [ddahl]
dan: recommend drawing a line through this requirement and letting Eric raise concerns
17:20:11 [burn]
zakim, nick bringert is Bjorn_Bringert
17:20:11 [Zakim]
ok, burn, I now associate bringert with Bjorn_Bringert
17:20:22 [ddahl]
bjorn: i think Eric agreed that it was out of scope
17:20:30 [ddahl]
dan: he may want to continue discussion
17:20:38 [ddahl]
dan: R6
17:21:08 [ddahl]
michael: we already covered this in discussion of R27
17:21:17 [burn]
for the minutes, also note that we explicitly had consensus on the call to remove R21
17:21:46 [ddahl]
dan: does anyone object to removing R6?
17:21:50 [ddahl]
(no objections)
17:21:59 [ddahl]
topic: R17
17:22:29 [ddahl]
michael: little discussion, but some bled over from R18
17:22:50 [ddahl]
...should have requirement that API for recognition should not introduce unneeded latency
17:23:15 [ddahl]
s/dan: R6/topic:R6
17:23:34 [ddahl]
dan: any more comments on R17?
17:23:55 [bringert]
my connection dropped, dialing again
17:23:56 [ddahl]
...any objections to Michael's wording to replace R17
17:24:25 [ddahl]
michael: two requirements, one that Bjorn proposed and one that Michael proposed
17:25:13 [ddahl]
...first one is that applications can start processing captured audio right away and second says that applications can introduce unneeded latency
17:25:37 [mbodell]
s/can introduce/should not/
17:25:50 [ddahl]
dan: any objections to replacing original R17 with these two requirements?
17:25:59 [ddahl]
(no objections)
17:26:01 [smaug_]
"Implementations should be allowed to start processing captured audio before the capture completes." and "The API to do recognition should not introduce unneeded latency."
17:26:13 [ddahl]
topic: R18
17:26:14 [bringert]
(w3c doesn't answer the phone), no objection from me
17:26:37 [burn]
bjorn, just try again. zakim does this sometimes
17:27:23 [ddahl]
dan: we acknowledge no objections to R17 (including from bjorn)
17:28:00 [ddahl]
...we will do what we can on R18, will get started while waiting for bjorn
17:28:29 [Zakim]
+ +44.122.546.aadd
17:28:31 [ddahl]
dan: any objections?
17:29:08 [burn]
zakim, aadd is Bjorn_Bringert
17:29:08 [Zakim]
+Bjorn_Bringert; got it
17:29:54 [mbodell]
Implementations should be allowed to start playing back synthesized speech before the complete results of the speech synthesis request are available.
17:30:25 [ddahl]
dan: any objections to adding this requirement?
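The streaming requirement above (playback may start before the complete synthesis result is available) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: `synthesize`, `play_streaming`, and the word-sized "chunks" are hypothetical stand-ins, not an API discussed on the call.

```python
from typing import Iterator, List


def synthesize(text: str, log: List[str]) -> Iterator[bytes]:
    """Hypothetical synthesizer: yields one fake audio chunk per word."""
    for word in text.split():
        log.append(f"synth {word}")
        # Stand-in for an encoded audio chunk; real audio is out of scope here.
        yield word.encode("utf-8")


def play_streaming(chunks: Iterator[bytes], log: List[str]) -> None:
    """Hypothetical player: consumes each chunk as soon as it is produced."""
    for chunk in chunks:
        log.append(f"play {chunk.decode('utf-8')}")


log: List[str] = []
play_streaming(synthesize("hello world", log), log)
print(log)
```

Because the synthesizer is a generator, the log interleaves synthesis and playback steps, which is the behaviour the requirement permits; an implementation that buffers the whole result first would also be allowed, since the wording is "should be allowed to".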
17:30:47 [ddahl]
michael: other proposals that may have come out of this discussion.
17:31:21 [ddahl]
... other issues about what user agent must allow, codecs, not sure how to tackle
17:31:38 [ddahl]
dan: there are others where we're close to consensus, so let's start with those
17:32:00 [ddahl]
...some of the ones proposed by Milan were addressing other wording on email list
17:32:20 [ddahl]
milan: first two, about writing and reading to the audio buffer, are handled
17:32:42 [ddahl]
bjorn: we might have consensus on the requirement about unneeded latency
17:32:49 [ddahl]
...for TTS
17:33:15 [ddahl]
marc: we agree with this
17:33:49 [ddahl]
michael: that takes care of the first 3, and the others aren't needed
17:34:10 [ddahl]
...then we had user agents must not filter results from the application
17:34:41 [burn]
no, not "the others aren't needed" -- instead, the next 3 are not needed
17:35:02 [burn]
... through filtering results
17:35:17 [ddahl]
bjorn: the next two are that the ua must not interfere with result or timing
17:35:42 [burn]
oops, i meant through passing parameters
17:35:58 [burn]
yes, we are now discussing results and timing
17:36:17 [ddahl]
bjorn: you might have some results that are suspicious
17:36:27 [ddahl]
...that the ua might want to filter
17:36:39 [ddahl]
marc: what speaks in favor of these requirements?
17:37:23 [ddahl]
milan: concerned about the ua thinking that it knows best about the speech interaction and deleting something that was part of the protocol between the speech resource and the web application
17:38:14 [ddahl]
michael: api needs to be extensible so that additional functionality is supported?
17:39:09 [ddahl]
milan: actually wants to make sure that if an EMMA result is sent to the web app the ua must not change it because it doesn't know what it is? also wants events fired by the speech resource to make it into the user space.
17:39:22 [ddahl]
bjorn: what kind of event are you talking about?
17:39:35 [burn]
actually, that was olli
17:39:42 [burn]
(not bjorn)
17:39:59 [mbodell]
Yeah, I think earlier too about suspicious results was also olli
17:40:14 [ddahl]
milan: if we have a "start of speech" event, as long as our API is flexible enough we can add new things
17:40:31 [ddahl]
s/bjorn: what/olli: what
17:41:12 [ddahl]
s/bjorn: you/olli: you
17:42:13 [ddahl]
milan: API should be flexible so that new events don't break apps
17:42:47 [ddahl]
bjorn: how about if we say that it should be possible to add new information to speech recognition results
17:43:16 [ddahl]
...then for events we would have to require that speech server specific events would be able to be returned to the web app
17:44:01 [ddahl]
milan: web applications need to be able to continue to run even if they aren't expecting them
17:44:15 [ddahl]
bjorn: can't expect that app will never crash
17:44:34 [ddahl]
speech server should be able to return implementation-specific events
17:44:55 [ddahl]
...two new requirements
17:45:32 [bringert]
new requirement 1: speech recognition implementations should be allowed to add implementation specific information to speech recognition results
17:46:29 [bringert]
new requirement 2: speech recognition implementations should be allowed to fire implementation specific events
17:47:41 [ddahl]
milan: previously we had discussed that there was a particular ordering of events, if we just have events being fired in between those events, we might destabilize applications
17:48:24 [ddahl]
bjorn: the apps that don't care about events won't register for them. and if we say "a before b" that doesn't mean that there's anything in between.
17:49:11 [ddahl]
milan: if speech server doesn't generate "start of speech" does the ua insert it?
17:49:21 [ddahl]
bjorn: we haven't defined events
17:49:25 [bringert]
s/that doesn't mean that there's anything/that doesn't mean that there's not anything/
17:49:32 [Zakim]
17:49:41 [ddahl]
michael: it could be the ua or the speech resource that generates some of these events
17:50:18 [ddahl]
dan: any objections to adding these two requirements as stated on IRC
17:50:44 [ddahl]
milan: are we going to allow TTS to fire events? these just talk about recognition.
17:50:53 [Zakim]
17:50:56 [ddahl]
marc: it would make sense to have that?
17:51:33 [ddahl]
michael: rather than changing R2 to speech resources would it make sense to have new requirement for TTS events?
17:51:49 [ddahl]
bjorn: other use cases for TTS events other than "mark"?
17:52:11 [ddahl]
marc: yes, "mark" is too coarse-grained for lip synchronization
17:52:26 [mbodell]
new requirement 3: speech synthesis implementations should be allowed to fire implementation specific events
17:52:55 [ddahl]
dan: sounds like we have agreement on these three
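A minimal Python sketch of the behaviour these three requirements aim for: implementation-specific events and extra result fields can be delivered without breaking applications that never registered for them. The `SpeechEventDispatcher` class and the `x-vendor-*` names are hypothetical and were not defined by the group.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List, Tuple

Handler = Callable[[Dict[str, object]], None]


class SpeechEventDispatcher:
    """Hypothetical dispatcher; only registered event types reach the app."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def on(self, event_type: str, handler: Handler) -> None:
        self._handlers[event_type].append(handler)

    def fire(self, event_type: str, detail: Dict[str, object]) -> int:
        """Deliver an event and return how many handlers received it."""
        handlers = self._handlers.get(event_type, [])
        for handler in handlers:
            handler(detail)
        return len(handlers)


received: List[Tuple[str, object]] = []
dispatcher = SpeechEventDispatcher()
dispatcher.on("result", lambda d: received.append(("result", d["text"])))

# An implementation-specific event nobody registered for is simply dropped.
ignored = dispatcher.fire("x-vendor-lipsync", {"viseme": "AA"})
# A standard event carrying an extra implementation-specific field still works;
# the handler reads only the field it knows about.
delivered = dispatcher.fire("result", {"text": "hello", "x-vendor-score": 0.9})
print(ignored, delivered, received)
```

This mirrors the point made in the discussion that applications which do not care about an event will not register for it, so events fired between the ones they expect do not affect them.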
17:53:07 [ddahl]
michael: back to R18
17:53:30 [ddahl]
dan: should be split into two, first is exactly as worded
17:54:21 [ddahl]
michael: i.e. requires support for remote services, e.g. by HTTP 1.1
17:54:45 [ddahl]
dan: yes, second is that speech services and ua may negotiate on other protocols
17:55:13 [ddahl]
dan: first one is mandatory support for HTTP 1.1
17:55:30 [ddahl]
michael: does HTTP 1.1 include https?
17:55:59 [ddahl]
marc: this seems like a low-level technical requirement, should not discuss now
17:56:30 [ddahl]
dan: two issues -- we use a generic term "communication" but we might need to distinguish control and media
17:56:59 [ddahl]
IETF there is a lot of discussion for using HTTP for streaming, when should you use that or not
17:57:27 [ddahl]
...we might not want to define these protocols, just pick from what's available, but may not want to pick now
17:57:47 [ddahl]
marc: would like to settle on some existing protocol, but now is too early
17:58:04 [ddahl]
michael: likes rewording with "such as"
17:58:30 [ddahl]
olli: likes "lowest common denominator"
17:58:45 [smaug_]
that wasn't me
17:58:57 [mbodell]
that was robert
17:59:13 [ddahl]
s/olli: likes/robert: likes
18:01:00 [ddahl]
marc: requirement should be: the communication between the ua and the speech server must allow for some lowest common denominator like HTTP 1.1
18:01:07 [ddahl]
18:01:43 [ddahl]
dan: we want to require a lowest common denominator protocol
18:02:14 [ddahl]
bjorn: we want to avoid mismatch between browser and speech server communication
18:02:18 [burn]
i said "require a mandatory-to-support communication protocol, such protocol TBD"
18:03:45 [ddahl]
michael: the communication between the ua and the speech server must require a mandatory-to-support lowest common denominator such as HTTP 1.1, TBD
18:03:51 [ddahl]
18:05:07 [ddahl]
dan: the reason for this is that four months from now it might be construed as requiring HTTP 1.1
18:05:21 [ddahl]
michael: do we have agreement on this?
18:06:01 [ddahl]
...ok what about the second sentence?
18:07:02 [ddahl]
dan: we could write a requirement but may not end up considering it as something that we need to do now
18:07:20 [ddahl]
...we don't want to prevent negotiation in the future
18:07:56 [ddahl]
robert: negotiation sounds like a runtime handshake but what we really want is the freedom to use something else if something better shows up
18:08:55 [ddahl]
telephony negotiation is important, but in web apps you just ask for what you want and if it doesn't work, try something else
18:09:05 [ddahl]
dan: this concept is important to capture
18:09:13 [ddahl]
michael: requirement text?
18:09:23 [burn]
what i had proposed initially was: "UAs and speech services may negotiate on use of other protocols for communication."
18:10:34 [ddahl]
... another wording: "UAs and speech services may agree to use alternate protocols for communication."
18:10:41 [ddahl]
dan: agreement?
18:10:49 [ddahl]
(no objections)
18:10:58 [mbodell]
agree with another wording
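The two protocol requirements (a mandatory-to-support baseline plus freedom to use alternates) can be sketched as "ask for what you want and, if it doesn't work, try something else". The following Python is a sketch under stated assumptions: `choose_protocol`, `try_connect`, and the protocol name `x-example-stream` are hypothetical, and `http/1.1` is only a placeholder for the baseline the group left TBD.

```python
from typing import Callable, List, Optional


def choose_protocol(
    preferred: List[str],
    try_connect: Callable[[str], bool],
    baseline: str = "http/1.1",  # placeholder; the actual baseline is TBD
) -> Optional[str]:
    """Try preferred protocols in order, then fall back to the baseline."""
    candidates = [p for p in preferred if p != baseline] + [baseline]
    for protocol in candidates:
        if try_connect(protocol):
            return protocol
    return None


# Hypothetical speech service that only supports the baseline protocol.
service_supports = {"http/1.1"}
attempts: List[str] = []


def try_connect(protocol: str) -> bool:
    attempts.append(protocol)
    return protocol in service_supports


chosen = choose_protocol(["x-example-stream"], try_connect)
print(chosen, attempts)
```

No runtime handshake is modelled here; the client just attempts each protocol in turn, which matches the non-negotiated style described for web apps.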
18:12:18 [ddahl]
marc: another one for R18, should we have one for implementation data for TTS, a mirror image to the new one we added for ASR
18:12:31 [ddahl]
robert: we added that at the f2f
18:12:40 [ddahl]
michael: two more
18:12:59 [ddahl]
...require ua's to expose an API for local speech services
18:13:13 [ddahl]
bjorn: i don't see this as necessary
18:13:20 [ddahl]
michael: i agree
18:13:35 [ddahl]
dan: do you think that whatever API we define should be sufficient?
18:14:05 [ddahl]
bjorn: the ua is free to talk to any speech service whenever it wants
18:14:14 [burn]
my question is whether even in the local case that the single api we are defining should be required
18:14:26 [burn]
... and that UAs can optimize the behavior
18:14:46 [burn]
or are you saying that the UA can do anything it wants any way it wants when the resource is local?
18:17:17 [ddahl]
bjorn: as long as the specification is specified, if the web app requests a local speech service but it's not available, what happens
18:17:17 [Zakim]
18:18:14 [ddahl]
milan: we have a stereotype that a local speech service is embedded, but it could be a plug in. so if the app requests a local service, that should be honored.
18:18:20 [ddahl]
bjorn: what if it's not there?
18:18:52 [ddahl]
milan: if it's a plugin you might ask the user if they want to install a plugin
18:19:34 [ddahl]
bjorn: to make that work we would have to specify a plugin language
18:20:07 [ddahl]
milan: agree, but if a local service is requested it should be used
18:20:38 [ddahl]
michael: isn't this an example of the web app requesting an alternate speech service?
18:20:53 [ddahl]
milan: don't expect to require downloading of code
18:21:06 [RobertBrown]
RobertBrown has joined #htmlspeech
18:21:13 [ddahl]
bjorn: how could this work without downloading code?
18:21:29 [ddahl]
milan: we don't need to specify, just say that there has to be some mechanism
18:22:15 [ddahl]
bjorn: what if we say that web app could point to a local service, but we shouldn't say that the ua has to try to install a local service
18:22:54 [ddahl]
??: what if some vendor only makes plugins for, e.g. IE
18:23:06 [smaug_]
18:23:06 [bringert]
that was olli
18:23:41 [ddahl]
michael: we should have a requirement that speech services can specify a local service
18:23:56 [mbodell]
new req: Speech services that can be specified by web apps must include local speech services.
18:24:59 [ddahl]
dan: if we agree on this requirement, if someone wants to propose additional requirements we can discuss
18:26:16 [ddahl]
michael: if the ua refuses resource it must inform web app (we already have this one)
18:26:31 [burn]
so we have agreed to add this new requirement
18:26:38 [ddahl]
...don't distinguish between network and local
18:26:56 [ddahl]
milan: what does "default speech resource" mean?
18:27:25 [burn]
we are now discussing the related FPR9 and 10 to understand whether we need to add anything else
18:27:32 [ddahl]
michael: in the future the author should be able to just ask for speech service without specifying one
18:28:09 [ddahl]
bjorn: a browser must provide speech service, it doesn't have to build its own
18:28:24 [smaug_]
I may not agree with the requirement
18:28:33 [ddahl]
dan: it sounds like we do have agreement with this requirement
18:28:35 [smaug_]
I need to think about it a bit
18:29:09 [Milan]
Who is smaug?
18:29:13 [ddahl]
michael: can we remove R18 and discuss codecs later
18:29:14 [smaug_]
smaug is Olli
18:29:26 [bringert_]
bringert_ has joined #htmlspeech
18:29:44 [ddahl]
dan: we could keep R18 just for that last point
18:30:12 [ddahl]
dan: milan's concerns about R18 haven't been addressed
18:30:27 [Zakim]
18:30:29 [Zakim]
18:30:31 [Zakim]
18:30:32 [Zakim]
18:30:33 [Zakim]
18:30:43 [ddahl]
call next week
18:31:15 [ddahl]
rrsagent, who was present?
18:31:15 [RRSAgent]
I'm logging. Sorry, nothing found for 'who was present'
18:31:21 [Zakim]
18:31:28 [ddahl]
rrsagent, present?
18:31:28 [RRSAgent]
I'm logging. Sorry, nothing found for 'present'
18:31:38 [ddahl]
18:32:04 [ddahl]
zakim, list participants?
18:32:04 [Zakim]
I don't understand your question, ddahl.
18:32:15 [ddahl]
zakim, list participants
18:32:15 [Zakim]
As of this point the attendees have been Debbie_Dahl, Olli_Pettay, +39.335.766.aaaa, Marc_Schroeder, Dan_Burnett, Michael_Bodell, Paolo_Baggia, Milan_Young, +1.760.705.aabb,
18:32:18 [Zakim]
... Bjorn_Bringert, +1.425.391.aacc, Robert_Brown, Dan_Druta, +44.122.546.aadd
18:32:45 [ddahl]
rrsagent, make logs public
18:32:52 [ddahl]
rrsagent, format minutes
18:32:52 [RRSAgent]
I have made the request to generate ddahl
18:38:44 [bringert]
bringert has joined #htmlspeech
18:43:27 [burn]
18:45:35 [mbodell]
18:49:20 [Zakim]
18:49:22 [Zakim]
18:49:25 [Zakim]
18:49:29 [ddahl]
ddahl has left #htmlspeech
18:54:26 [Zakim]
disconnecting the lone participant, Bjorn_Bringert, in INC_(HTMLSPEECH)12:00PM
18:54:29 [Zakim]
INC_(HTMLSPEECH)12:00PM has ended
18:54:34 [Zakim]
Attendees were Debbie_Dahl, Olli_Pettay, +39.335.766.aaaa, Marc_Schroeder, Dan_Burnett, Michael_Bodell, Paolo_Baggia, Milan_Young, +1.760.705.aabb, Bjorn_Bringert, +1.425.391.aacc,
18:54:37 [Zakim]
... Robert_Brown, Dan_Druta, +44.122.546.aadd
20:39:57 [Zakim]
Zakim has left #htmlspeech