14:57:56 RRSAgent has joined #hcg
14:57:56 logging to http://www.w3.org/2011/01/14-hcg-irc
14:57:58 RRSAgent, make logs member
14:57:58 Zakim has joined #hcg
14:58:00 Zakim, this will be HTML_CG
14:58:00 ok, trackbot, I see HTML_CG()10:00AM already started
14:58:01 Meeting: Hypertext Coordination Group Teleconference
14:58:01 Date: 14 January 2011
14:58:08 darobin has joined #hcg
14:58:13 rrsagent, make logs public
14:58:44 +TomB
14:59:01 Zakim, code?
14:59:01 the conference code is 4824 (tel:+1.617.761.6200 tel:+33.4.26.46.79.03 tel:+44.203.318.0479), glazou
14:59:05 Chair: Chris
14:59:34 Regrets: Cameron McCormack, Erik Dahlström, Art Barstow, Charles McCathieNevile, Frederick Hirsch, Lofton Henderson
14:59:40 janina has joined #hcg
14:59:57 zakim, call janina
14:59:57 ok, janina; the call is being made
14:59:58 +Janina
15:00:13 +ChrisL
15:00:21 zakim, who is here?
15:00:21 On the phone I see Michael_Bodell, TomB, Janina, ChrisL
15:00:22 On IRC I see janina, darobin, Zakim, RRSAgent, mbodell, ChrisL, Steven, kaz, ArtB, glazou, plinss, shepazu, ed, trackbot
15:00:25 "this passcode is not valid"
15:00:28 +Art_Barstow
15:00:46 +glazou
15:00:49 ah finally
15:00:52 glazou, i used 4824 and it worked
15:00:54 zakim, dial steven-617
15:00:54 ok, Steven; the call is being made
15:00:55 +Steven
15:01:07 +Bert
15:01:20 Bert has joined #hcg
15:01:26 ddahl has joined #hcg
15:01:37 -glazou
15:01:48 +??P10
15:02:03 zakim, ??P10 is Kaz
15:02:03 +Kaz; got it
15:02:20 ddahl1 has joined #hcg
15:02:50 +Debbie_Dahl
15:03:06 +glazou
15:03:13 ah that was the french bridge
15:03:18 the US one is in better shape
15:03:32 zakim, who is here?
15:03:32 On the phone I see Michael_Bodell, TomB, Janina, ChrisL, Art_Barstow, Steven, Bert, Kaz, Debbie_Dahl, glazou
15:03:33 On IRC I see ddahl1, ddahl, Bert, janina, darobin, Zakim, RRSAgent, mbodell, ChrisL, Steven, kaz, ArtB, glazou, plinss, shepazu, ed, trackbot
15:03:40 +??P16
15:03:51 Zakim, ??P16 is me
15:03:51 +darobin; got it
15:05:12 good start for the new year darobin
15:05:56 zakim,pick a victim
15:05:56 Not knowing who is chairing or who scribed recently, I propose Debbie_Dahl
15:06:24 zakim, who is here?
15:06:24 On the phone I see Michael_Bodell, TomB, Janina, ChrisL, Art_Barstow, Steven, Bert, Kaz, Debbie_Dahl, glazou, darobin
15:06:27 On IRC I see ddahl1, ddahl, Bert, janina, darobin, Zakim, RRSAgent, mbodell, ChrisL, Steven, kaz, ArtB, glazou, plinss, shepazu, ed, trackbot
15:06:27 scribe: ddahl
15:06:41 chair: ChrisL
15:07:16 topic: Audio on the Web
15:07:35 http://lists.w3.org/Archives/Member/w3c-html-cg/2011JanMar/0005.html
15:08:25 michaelB: in MMI, VoiceBrowser, and HTML SpeechXG. wants to capture audio in a way that user can interact with it. some proposals have capture and then upload, but that doesn't satisfy our use case.
15:08:39 chris: other requirements for speech?
15:08:58 +Shepazu
15:09:16 -Bert
15:09:20 michaelB: yes, endpointing, echo cancellation, playback for speech synthesis, and tying playback to barge-in
15:09:22 Bert: the french bridge...
15:09:51 the French bridge is hosed, call in on the US one if you can
15:10:07 +Bert
15:10:18 http://lists.w3.org/Archives/Member/w3c-html-cg/2011JanMar/0003.html
15:10:27 janina: our requirements came from making HTML 5 video and audio accessible
15:11:40 ...video description uses secondary audio channel, used in broadcasting, different on the web, also looking for a way to play two binary resources, not necessarily the same length
15:12:10 ...another one is the need to control volume and panning separately, or direct them to a secondary audio device
15:13:13 chrisL: if something's being broadcast to a group, then different people might have different needs
15:13:20 doug: how would that work?
15:13:55 ...are you accessing different devices, how would you discover different devices?
15:14:41 janina: I don't know if the browser knows, but the OS knows. I know it's discoverable on Linux, mapping the OS resources to the browser
15:15:09 chrisL: a kind of labeling so that different things go to different devices
15:15:38 ...also synchronization of multiple audio streams, SMIL does this, but not HTML5 audio
15:16:21 janina: HTML5 seems to assume that files have the same timespan, but that might not be true for video description or for different languages
15:16:38 chrisL: especially problematic for longer files
15:17:03 janina: SMIL seems to work well, is used in Daisy Consortium
15:17:51 ...we could take as much of SMIL for the use cases we need and leave the rest behind
15:18:10 chrisL: similar to what we did with SVG
15:20:27 doug: Audio XG -- audio api is an api for reading and writing to the live audio stream, one implementation that will be in Firefox, we just give access to the raw bits, a more sophisticated implementation in WebKit, also has a higher-level ability to manipulate audio in the browser
15:21:09 ...we will make a WG, have been mostly talking about WebKit approach, should things be done in the browser or with script libraries
15:21:21 chrisL: is script
fast enough?
15:22:11 s/enough/enough for helper methods/
15:22:14 doug: I don't know, would be better to use helper methods in mobile devices because of processing constraints
15:25:15 -TomB
15:25:28 http://lists.w3.org/Archives/Member/w3c-html-cg/2011JanMar/0004.html
15:25:51 Present: Michael_Bodell, Dan_Burnett, Janina, ChrisL, Art_Barstow, Steven, Kaz, Debbie_Dahl, glazou, darobin, Shepazu, Bert
15:26:29 ddahl: Our primary use case involving audio is input and output of speech, mainly for interaction
15:26:51 ... but also recording, like for voicemail. so need to capture speech and to stream it
15:26:53 RRSAgent, make minutes
15:26:53 I have made the request to generate http://www.w3.org/2011/01/14-hcg-minutes.html ArtB
15:27:01 ... not just batch capture
15:27:19 i/ddahl: Our/scribenick: ChrisL/
15:27:41 ... support arbitrary processing - speech
15:27:41 recognition, speech understanding, speech-to-speech translation, emotion
15:27:41 detection, speaker verification, language/gender/age identification, medical
15:27:41 diagnosis
15:27:59 ddahl: will not support arbitrary translation
15:28:11 ... need to control format and sampling rate
15:28:34 ... capture speech on mobile or desktop or over telephone (last is a VB requirement)
15:28:52 ddahl: Able to combine semantics of speech with other inputs, like circling an area
15:28:52 and saying "Italian restaurants near here"
15:29:09 ddahl: control volume of output, pause and resume
15:29:23 ... local or distributed cloud-based processing
15:29:45 ... audio file output, tts, positioning of inputs and outputs
15:30:12 ddahl: multiple microphones? like a big meeting room and record the whole meeting
15:30:21 ChrisL: multichannel or mixing?
15:30:25 ddahl: both
15:31:04 ddahl: no use cases around capturing non-speech audio, for mm, but important for others
15:31:37 ChrisL: ability to determine if an audio input is speech or non-speech
15:33:06 q+ to mention logistics of microphone access spec
15:33:09 scribenick: ddahl
15:33:15 michaelB: also have concerns around security and privacy
15:33:36 i/michaelB:/scribenick: ddahl1/
15:33:38 ...a microphone is like a keyboard, what are user expectations and behavior
15:33:49 ...need to mix with functional requirements
15:34:23 chrisL: you can imagine some way of notifying the user that speech is being recorded.
15:34:49 janina: in the news today was a story about spyware on smartphones
15:35:10 michael: also need to be able to notify user in non-visual environments
15:35:35 doug: maybe a vibratory signal could signal when microphone is on
15:35:57 ...nothing about privacy in the charter, but the spec will mention privacy
15:37:26 ...charter basically has microphone access. lots of discussion about access to microphone. DAP WG is chartered to do it but hasn't done it. Audio WG will work on it if necessary
15:37:41 DAP is doing something about this
15:37:46 RTC will help as well
15:37:52 more than happy to work with Audio
15:37:55 chrisL: comments from robin on DAP?
15:38:06 and in fact we've done it
15:38:11 just not at the level required yet
15:38:22 but certainly can push further
15:39:10 very basic access: http://dev.w3.org/2009/dap/camera/
15:39:19 more advanced: http://dev.w3.org/2009/dap/camera/Overview-API.html
15:39:31 and we want to do more advanced still, but will need some security model for it
15:39:50 RTC == real time Web
15:40:03 it's not called camera, URIs are opaque dammit :)
15:40:12 chrisL: "camera" spec sounds like it should be visual
15:40:14 http://www.w3.org/TR/media-capture-api/
15:40:26 http://www.w3.org/TR/html-media-capture/
15:40:35 (same links, for people who read URIs)
15:40:41 that is correct
15:40:44 michaelB: doesn't cover the streaming case for audio
15:40:52 we're working on that, but it's harder security wise
15:41:16 we're also synching with HTML WG
15:41:42 chrisL: separate specs, capture vs. streaming?
15:41:46 yes, they build atop one another whenever possible
15:42:30 michaelB: maybe could be separate, but could be the same spec.
working on proposals in HTML-SpeechXG, reviewing proposals
15:42:41 feeeeeeeeeeeedback
15:42:46 we wantsssss feeeeeeeeeeeeeeeeeeeedback
15:43:06 I may not be able to speak today, but I can read :)
15:43:15 chrisL: HTML-speech XG should send email to DAP
15:43:16 DAP: public-device-apis@w3.org
15:43:35 Web Audio API from Chris Rogers: http://chromium.googlecode.com/svn/trunk/samples/audio/specification/specification.html
15:43:39 -Steven
15:43:46 -Janina
15:43:47 -darobin
15:43:47 -Kaz
15:43:49 -Michael_Bodell
15:43:49 -glazou
15:43:51 -Debbie_Dahl
15:43:51 -ChrisL
15:43:52 -Art_Barstow
15:43:58 rrsagent, format minutes
15:43:58 I have made the request to generate http://www.w3.org/2011/01/14-hcg-minutes.html ddahl1
15:43:59 -Shepazu
15:44:02 -Bert
15:44:03 HTML_CG()10:00AM has ended
15:44:04 Attendees were Michael_Bodell, TomB, Janina, ChrisL, Art_Barstow, glazou, Steven, Bert, Kaz, Debbie_Dahl, darobin, Shepazu
15:45:50 rrsagent, make logs member
15:46:41 RRSAgent, make logs world
15:46:43 rrsagent, make logs public
15:47:04 glazou has left #hcg
15:50:44 RRSAgent, stop