16:40:32 RRSAgent has joined #htmlspeech
16:40:32 logging to http://www.w3.org/2010/12/16-htmlspeech-irc
16:40:49 Zakim has joined #htmlspeech
16:43:58 zakim, this will be htmlspeech
16:43:58 ok, burn; I see INC_(HTMLSPEECH)12:00PM scheduled to start in 17 minutes
16:44:06 zakim, code?
16:44:06 the conference code is 48657 (tel:+1.617.761.6200 tel:+33.4.26.46.79.03 tel:+44.203.318.0479), burn
16:54:28 INC_(HTMLSPEECH)12:00PM has now started
16:54:35 +Michael_Bodell
16:54:52 mbodell_ has joined #htmlspeech
16:55:13 +??P1
16:55:41 Zakim, ??P1 is Olli_Pettay
16:55:41 +Olli_Pettay; got it
16:55:59 bringert has joined #htmlspeech
16:56:03 Zakim, nick smaug is Olli_Pettay
16:56:03 sorry, smaug_, I do not see 'smaug' on this channel
16:56:08 Zakim, nick smaug_ is Olli_Pettay
16:56:08 ok, smaug_, I now associate you with Olli_Pettay
16:56:45 marc has joined #htmlspeech
16:57:01 +Milan_Young
16:57:23 + +44.122.546.aaaa
16:57:35 Milan has joined #htmlspeech
16:57:48 Zakim, +44.122.546.aaaa is Bjorn_Bringert
16:57:48 +Bjorn_Bringert; got it
16:57:58 Zakim, I am Bjorn_Bringert
16:57:58 ok, bringert, I now associate you with Bjorn_Bringert
16:58:07 burn has joined #htmlspeech
16:58:25 zakim, code
16:58:25 I don't understand 'code', burn
16:58:28 zakim, code?
16:58:28 the conference code is 48657 (tel:+1.617.761.6200 tel:+33.4.26.46.79.03 tel:+44.203.318.0479), burn
16:58:45 +[IPcaller]
16:58:46 +Dan_Burnett
16:58:57 trackbot, start telcon
16:58:58 +[Microsoft]
16:58:59 RRSAgent, make logs public
16:59:00 zakim, I am IPCaller
16:59:00 ok, marc, I now associate you with [IPcaller]
16:59:01 Zakim, this will be
16:59:02 Meeting: HTML Speech Incubator Group Teleconference
16:59:02 Date: 16 December 2010
16:59:02 I don't understand 'this will be', trackbot
16:59:06 Chair: Dan Burnett
16:59:22 Agenda: http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2010Dec/0144.html
16:59:24 ddahl has joined #htmlspeech
16:59:33 zakim, i am Dan_Burnett
16:59:33 ok, burn, I now associate you with Dan_Burnett
16:59:56 +Debbie_Dahl
17:00:19 zakim, who is on the phone?
17:00:19 On the phone I see Michael_Bodell, Olli_Pettay, Milan_Young, Bjorn_Bringert, [IPcaller], Dan_Burnett, [Microsoft], Debbie_Dahl
17:00:23 Robert has joined #htmlspeech
17:00:34 zakim, I am IPcaller
17:00:34 ok, marc, I now associate you with [IPcaller]
17:00:47 zakim, [Microsoft] is Robert_Brown
17:00:47 +Robert_Brown; got it
17:01:06 zakim, [IPcaller] is Marc_Schroeder
17:01:06 +Marc_Schroeder; got it
17:03:44 Scribe: Robert_Brown
17:03:54 ScribeNick: Robert
17:06:52 topic: last week's minutes
17:07:10 Dan: (no comments) last week's minutes approved
17:07:43 topic: comments on the newest version of the requirements draft
17:07:50 Dan: no comments
17:08:17 topic: require encryption http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2010Dec/0099.html
17:09:12 michael: not much mail on this, Bjorn agreed in mail, no other mail comments.
seems reasonable
17:09:22 proposed req: Web application must be able to encrypt communications to remote speech service
17:09:26 Dan: asked for objections, no objections voiced
17:09:51 topic: require best practices http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2010Dec/0107.html
17:10:42 Milan: not sure we're aligned on the emphasis behind this requirement. maybe should put it on hold. some people are prioritising schedule ahead of features.
17:10:57 ...: put it on hold and see how the other issues we discuss this week play out
17:11:17 s/...:/... /
17:11:49 zakim, nick mbodell_ is Michael_Bodell
17:11:49 ok, burn, I now associate mbodell_ with Michael_Bodell
17:12:09 Bjorn: has anybody had experience where this sort of requirement is needed? it seems redundant
17:12:10 -Bjorn_Bringert
17:12:26 I got disconnected
17:12:53 +Bjorn_Bringert
17:12:54 zakim, nick Milan is Milan_Young
17:12:55 ok, burn, I now associate Milan with Milan_Young
17:13:07 zakim, nick bringert is Bjorn_Bringert
17:13:07 ok, burn, I now associate bringert with Bjorn_Bringert
17:13:08 Dan: sometimes to prevent avoiding certain architectures
17:13:44 zakim, nick ddahl is Debbie_Dahl
17:13:44 ok, burn, I now associate ddahl with Debbie_Dahl
17:13:46 Milan: intended to avoid the sessions/sockets issue. but let's get on with discussing the other topics and get back to this one
17:14:10 topic: require support for text interpretation http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2010Dec/0122.html
17:14:40 Bjorn: i wouldn't consider it high priority, but okay keeping it for now
17:14:50 Dan: this is certainly in scope
17:15:26 Bjorn: it's already possible and doesn't need a new requirement. just use an xmlhttp request.
17:15:38 Dan: there may be some benefit to having a unified approach
17:16:00 Bjorn: agreed there's a benefit but not high priority
17:16:29 Dan: looks like we have consensus on keeping it
17:16:34 proposed req: Web applications must be able to request NL interpretation based only on text input (no audio sent).
17:16:59 topic: re-recognition http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2010Dec/0133.html
17:17:22 Michael: a fair bit of discussion in mail, but it seems people are okay keeping this
17:18:07 Bjorn: okay to have as a requirement, lower priority, if I was making the proposal I wouldn't add it because of the added complexity
17:18:26 proposed req: Web applications must be able to request recognition based on previously sent audio.
17:18:43 Michael: no objections? [resounding silence...]
17:19:09 dan: consensus
17:19:22 topic: concept of session http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2010Dec/0130.html
17:19:41 Michael: discussion on whether we need it and whether cookies support it?
17:19:53 Milan: not thrilled, but okay to call this one good enough
17:20:07 ... cookie gets 90% of use cases
17:20:37 Bjorn: do you want to add a requirement like existing mechanisms should be used to manage sessions or something like that
17:20:46 Milan: how about the way it's worded now?
17:21:00 Bjorn: text in original email is okay with me
17:21:06 burn has changed the topic to: #htmlspeech agenda: http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2010Dec/0144.html (burn)
17:21:24 Olli: okay with me too
17:21:50 Robert nervous about definition of word session
17:21:57 robert: wants to confirm meaning of "session". different from what we do in web apps?
17:22:40 robert: is there any use case?
17:22:56 bjorn: yes. could consider a speech API that does not pass on cookies that are set
17:23:15 milan: e.g. a native agent proposal. user agent would be required to tack on cookies
17:23:32 robert: can live with this.
details will become apparent with the proposals
17:23:56 bjorn: IETF specs use the notion of "stateful session" when discussing cookies
17:24:01 proposed req: Web application and speech services must have a means of binding session information to communications.
17:24:09 michael: sounds like we have consensus
17:24:59 topic: modify FPR30 to remove "UA" http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2010Dec/0111.html
17:25:47 Bjorn: okay with Milan's restatement in mail
17:25:59 Michael: concerned that this breaks our privacy requirements
17:26:15 Milan: but that's broken (paraphrase)
17:26:38 Michael: if I'm the only one who's nervous I'm okay taking Milan's text
17:27:15 Bjorn: if those mechanisms don't satisfy privacy requirements, we can look at improving them.
17:28:28 Marc: is it part of our specification to make a position on who does it?
17:28:56 Bjorn: xmlhttp talks about web app but implies UA requirements
17:29:34 Michael: objections?
17:29:58 Dan: nervous but won't object. in prioritisation we may need to be more precise
17:30:01 proposed change: fpr30 becomes Web applications must be allowed at least one form of communication with a particular speech service that is supported in all UAs.
17:30:18 my question was about confirming that at this stage we are not taking any decision how the communication between the web app and the speech service is realised, whether the UA plays a standardised role or not.
17:30:19 Dan: agreed, move on
17:30:30 confirmed that this decision is *not* taken at this stage.
17:30:41 the new requirement is better because it makes this less explicit.
17:30:53 topic: cancelling requests. http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2010Dec/0134.html
17:31:21 Bjorn: besides efficiency, are there any reasons to add the requirement?
17:31:52 Michael: existing requirements relate to this (barge-in)
17:32:23 Milan: it's efficiency.
but if you were going to do real barge-in in most of your transactions, it would be an issue
17:32:51 Bjorn: if the client wants to stop sending audio, it can send a marker saying it's done
17:32:59 Milan: that's what I'm asking for
17:33:19 Bjorn: sender cancelling is easy with HTTP. receiver cancelling is difficult
17:33:44 Milan: how would end of speech be indicated
17:34:02 Bjorn: some sort of end-of-audio packet, which handles the sender cancelling
17:35:03 ... why do we need this?
17:35:25 Milan: the user agent may not be able to detect when done
17:35:37 Bjorn: would server or client do that?
17:35:45 Milan: the client
17:36:27 Anthapu has joined #htmlspeech
17:36:27 Bjorn: should split into two discussions: 1 client aborting recognition (fine and required and trivial); 2 client aborting synthesis
17:37:03 ... implied by FPR17
17:37:14 Michael: that says the user can abort it
17:37:42 Bjorn: need a separate requirement that web application should be able to cancel audio capture
17:37:54 + +1.732.507.aabb
17:38:07 Marc: we used the term "abort" intentionally, with privacy concerns in mind
17:38:24 Bjorn: duplicate FPR17, replacing user with web app
17:38:29 proposed new req: While capture is happening, there must be an obvious way for the web application to abort the capture and recognition process.
17:38:50 s/obvious //
17:39:01 s/an way/a way/
17:39:08 Bjorn: fine with what Michael typed
17:39:35 ... [no other objections] let's move on to synthesis
17:40:13 ... client wants to abort playing of long synthesized speech. if there's no way for the client to signal the server, the only option is to tear down the connection
17:40:28 ... this may have latency implications to establish a new connection
17:41:27 Milan: there's a lot of work that goes into establishing a TCP socket. Email triage is a good example. App reads a few sentences of a message then the user interrupts
17:42:10 ...
it would be awkward if the mail app just read the first sentence
17:42:46 Bjorn: or the app could read a sentence at a time until it decides to move to the next message
17:43:24 Milan: not asking for interruption (existing requirement), but to cancel it all the way to the server
17:43:45 Bjorn: reluctant to add a requirement of going all the way to the server
17:44:14 Bjorn: propose "web application must be able to abort TTS output"
17:44:30 Milan: but Bjorn already has to do this for reco, why not TTS?
17:45:07 Bjorn: reco is required, and the sender aborts by sending up a token. this is different, because the receiver is aborting
17:45:49 Milan: but with reco, the server is sending back ack's while the client is speaking, so there is a bi-directional mechanism
17:46:07 Bjorn: are you saying a bidirectional communication is already required?
17:46:26 Milan: we have the requirement that speech has begun and streaming
17:46:33 Bjorn: speech detection is done on the client
17:46:50 Milan: nervous about detection in the client
17:47:17 ... FPR21 apps should be notified when capture starts
17:47:52 ... until we have reco, we can't say that speech has begun, and we can't do hotword from the client
17:48:01 zakim, who's noisy?
17:48:12 burn, listening for 10 seconds I could not identify any sounds
17:48:14 Bjorn: notify -that- speech has begun, not -when- it has begun
17:48:31 -Marc_Schroeder
17:48:35 Yep
17:49:01 +??P3
17:49:19 zakim, ??P3 is Marc_Schroeder
17:49:19 +Marc_Schroeder; got it
17:49:26 Milan: this is part of the problem of not having detailed descriptions on this. I brought this up back in the F2F meeting, but didn't catch the nuance of the word "that"
17:49:32 zakim, nick marc is Marc_Schroeder
17:49:32 ok, burn, I now associate marc with Marc_Schroeder
17:50:07 Bjorn: no assumption that detection runs on the client, but also no exclusion of this
17:50:20 Milan: but if it runs on the server, then you need bi-direction communication
17:50:35 ...
and if so, it doesn't seem to be a stretch to say we need this for synthesis
17:51:02 Bjorn: i agree with the analysis, but probably wouldn't propose an API for this
17:51:24 Michael: we should agree on whether or not it's a requirement, then prioritise in the next stage
17:51:43 proposed req: Web application must be able to programmatically abort tts output.
17:52:22 Bjorn: can we agree that it's a requirement for the web app to abort TTS, without any specific requirement on how this affects the server
17:52:35 Milan: sounds fine
17:52:44 Michael: (silence) sounds like we have consensus
17:53:23 Bjorn: so the other requirement is that when the client aborts TTS, it should not need to tear down the connection
17:53:57 Marc: is this about functionality or efficiency? if it's about efficiency, the discussion should occur later, when we discuss implementation
17:54:12 Milan: but it's so fundamental it would be crippling not to have this
17:54:30 Bjorn: how about "aborting TTS should be efficient"?
17:54:34 Milan: okay
17:54:48 proposed req: Aborting the synthesis should be efficient.
17:55:02 Michael: sounds like we have consensus
17:55:19 Bjorn: "TTS output" rather than "synthesis"
17:56:21 ... one is the effect on the user experience, the other is the effect on efficiency
17:56:26 s/the synthesis/the TTS output/
17:57:16 topic: discussion about API, device tag, etc http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2010Dec/0142.html
17:57:33 Michael: is there a set of requirements out of that discussion?
17:57:53 Bjorn: no it's a proposal
17:58:17 Milan: it shows a lot of promise and if we started early we could get done sooner
17:58:31 Bjorn: there's some serious politics going on there
17:59:30 Michael: WHATWG doesn't really represent all browser manufacturers
18:00:32 Milan: could the audio working group handle this?
18:00:44 Michael: they're more about mixing and analysis, rather than capture
18:01:47 ...
IE wouldn't tackle this area until it's under some w3c group
18:02:11 Milan: it would be in our group's interest to get some sort of audio capture API into HTML
18:02:39 oops, that should have been Bjorn
18:03:19 Michael: UI is geared around web cam capture
18:03:42 Milan: people have been working on audio capture since 2005, and we only started this year
18:03:49 Michael: but the use cases are different
18:04:10 Bjorn: is there an audio chat scenario?
18:05:41 Bjorn: could we specify an API required for speech without it being general purpose?
18:05:54 Michael: we should propose what we need and explain why we need it
18:06:53 Bjorn: if we don't have a general API for app-specified network recognition, we can still have reco with the default recognizer
18:07:45 Olli: would it be easiest to co-author it with the whatwg and then propose that the HTML wg pick it up
18:07:53 Bjorn: that's my preference
18:08:37 Marc: if the browser captured audio according to the requirements for speech recognition, then we wouldn't need any specific device API
18:09:11 Michael: an alternative is to finish discussing requirements, then look at proposals, for which there may be a spectrum of approaches
18:09:27 Bjorn: there's no reason to exclude a particular approach at this point
18:09:53 Milan: concerned that device API has a promise and if we don't work together it won't happen
18:12:26 Marc: we're expected to look at the pros and cons of various options and maybe make a decision, or if not, at least recommend options
18:13:54 Dan: people can propose more requirements later on, but we should move on to prioritization
18:15:22 ...
begin prioritization in January, but between now and then, review the requirements and talk about those you don't feel are clear enough for you to prioritize
18:18:32 Michael: please send description text where you think it's missing
18:21:26 Milan: would prefer that the chairs propose a description and participants riff on that
18:23:27 Dan: prioritization is a function that will naturally work out issues at the next level of detail
18:24:09 ... So the first thing people should do is review the requirements, and if you can't prioritize, start a conversation
18:24:53 Michael: I will send out another update soon, and you'll have a couple of weeks to review as Dan suggests
18:25:08 Milan: it'll be chaos. 50 requirements. 6 groups here
18:26:39 Dan: if this turns out to not work, we'll change strategies
18:27:05 ... but I think we'll probably have a very small number of threads
18:28:04 ... Plan to have calls at the same timeslot in January, in case we need them
18:29:06 Marc: Michael, could you restructure the list of requirements by topic?
18:29:35 Michael: will move section 3 to an appendix, and can potentially reorder section 4. I'll make an attempt
18:30:08 - +1.732.507.aabb
18:30:09 ... I'll see what factors out
18:31:04 Great work everybody!
18:32:27 -Marc_Schroeder
18:32:28 -Olli_Pettay
18:32:29 -Milan_Young
18:32:30 -Bjorn_Bringert
18:32:30 -Debbie_Dahl
18:32:32 -Michael_Bodell
18:32:33 -Dan_Burnett
18:32:39 -Robert_Brown
18:32:40 INC_(HTMLSPEECH)12:00PM has ended
18:32:42 Attendees were Michael_Bodell, Olli_Pettay, Milan_Young, Bjorn_Bringert, Dan_Burnett, Debbie_Dahl, Robert_Brown, Marc_Schroeder, +1.732.507.aabb
18:32:49 marc has left #htmlspeech
18:33:15 zakim, bye
18:33:15 Zakim has left #htmlspeech
18:33:21 rrsagent, make log public
18:33:29 rrsagent, draft minutes
18:33:29 I have made the request to generate http://www.w3.org/2010/12/16-htmlspeech-minutes.html burn
18:33:37 ddahl has left #htmlspeech
18:46:29 smaug_ has joined #htmlspeech
18:57:27 smaug_ has joined #htmlspeech