W3C

- DRAFT -

Audio Working Group Teleconference

27 Oct 2014

Agenda: https://www.w3.org/2011/audio/wiki/F2F_Oct_2014

See also: IRC log

Attendees

Present
Regrets
Chair
joe, mdjp
Scribe
olivier, philcohen, hongchan

Contents


<trackbot> Date: 27 October 2014

<olivier> Meeting: Audio Working Group face-to-face meeting

<olivier> Scribe: olivier

mdjp: thought it might be worth starting with a brief intro on where we are and where we are going in W3C process

2014 W3C Process -> http://www.w3.org/2014/Process-20140801/

mdjp: next step for us is to get to Candidate Recommendation
... that means freezing the v1 scope, resolving issues and completing editing of the WD
... explains next step after that - Proposed Recommendation, and how to get there

v1 Feature issues

Github repository of v1-tagged issues -> https://github.com/WebAudio/web-audio-api/issues?q=is%3Aopen+is%3Aissue+label%3A%22V1+%28TPAC+2014%29%22

<mdjp> https://docs.google.com/spreadsheets/d/1lBnjJI7_-wVznwuvwoaylu69-S2pelsTUGqu9zl2y-M/edit?usp=sharing

[Jerry Smith joins, round of intro]

joe: want to talk about criteria for v1/v2
... we should be stern with ourselves about what to change/keep at this point
... there's a large number of things we can put off; it will make us feel bad, but otherwise we would not get out the door

ChrisLilley: note we can also have a category of things we *think* are v1 but we're not sure

mdjp: looking at table at https://docs.google.com/spreadsheets/d/1lBnjJI7_-wVznwuvwoaylu69-S2pelsTUGqu9zl2y-M/edit?usp=sharing

joe: start with 113 - audioWorkers

<BillHofmann> Good morning.

113 -> https://github.com/WebAudio/web-audio-api/issues/113

cwilso: 113 is largely under control, the biggest issue at this point is in issue 1 (inputs and outputs in Audio Workers)
... my goal was to be able to reimplement everything in audioworkers except the inputs and outputs
... splitter and merger are a pain in that regard - they have a non-predefined number of i/o, and dynamically change number of channels

Issue 1 -> https://github.com/WebAudio/web-audio-api/issues/1

cwilso: does everyone understand inputs/outputs and channels in the spec?

joe: suggests change [missed]

cwilso: problem is that output channels can be dynamically changed

joe: if you leave out the dynamic bit, what would audioprocess look like?

cwilso: you'd need to define the number of channels somewhere

joe: like in the constructor

[Philippe Cohen from Audyx joins]

ChrisL: why are dynamic channels a problem?

cwilso: that happens in the audioworker
... and the only time you can do anything is when onaudioprocess event fires
... every connection has the opportunity to cause this upmix/downmix
... if we didn't care about dynamic channels, we're done, because we let you define channels

joe: what if you could pre-specify number of inputs and channels

cwilso: problem is that channels is per input
... so we'd end with an array of array of float32 buffers

joe: propose we pass an array per input, and have arrays of arrays of buffers, organised by channel

cwilso: harder for outputs
... for inputs, easy because you are handed an array of arrays

joe: how do native nodes deal with this?

cwilso: internally it only cares about the output to the destination
... question is 1- do we want to represent multiple connections in addition to multiple channels, and 2- do we want dynamic channels?
... probably both yes, but needs to be figured out
... so we can replicate splitter and merger node behaviour

joe: worth aiming for a fully scoped audioworker that does all that
... would argue for most complete approach

cwilso: assuming no need to have dynamic change to number of inputs
... all predefined

RESOLUTION: agreement on need for support for multiple connections at instantiation, and for changing the number of channels after instantiation
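
For illustration: a minimal sketch of what a multi-input, multi-channel audioprocess callback could look like under this resolution. The AudioWorker proposal described here never shipped as specified, so the event shape and names below are assumptions, not final API.

    // Hypothetical AudioWorker script, per the shape discussed above.
    // e.inputs is assumed to be an array of inputs (fixed at
    // instantiation), each an array of Float32Array channel buffers
    // (channel counts possibly changing between callbacks).
    onaudioprocess = function (e) {
      for (var i = 0; i < e.inputs.length; i++) {
        var input = e.inputs[i];     // one connection/input
        var output = e.outputs[i];   // the matching output
        for (var ch = 0; ch < input.length; ch++) {
          output[ch].set(input[ch]); // pass-through copy
        }
      }
    };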

<cwilso> "This is a magical node." - Joe, referring to PannerNode.

Next issue - 372 - rework pannerNode https://github.com/WebAudio/web-audio-api/issues/372

joe: suggest deprecating it

cwilso: was an attempt to bind together different scenarios - both control of mixer and 3D game with spatialization
... issue 368 is about default currently HRTF https://github.com/WebAudio/web-audio-api/issues/368
... some of it (doppler) was made when there were only buffersourcenodes
... completely broken with things like a live microphone
... also - none of the parameters are audioparams, they're floats

shepazu: is there some reason it was done this way?

padenot: looking at openAL - games was a big use case at that point

cwilso: given the above I agree to tear this node apart
... we need a panner and a spatialization node, separately
... plus rip doppler out completely, can be replicated with a delaynode

BillHofmann: I hear a proposal for a stereo panner
... and a proposal to deprecate the pannernode

cwilso: agree on need for stereo panner
... spatialization still has value, especially as it has been implemented already
... would want to re-specify, to have xyz as audioparams

BillHofmann: question about whether it should be the last one

shepazu: could be a best practice

cwilso: advise authors to do so - not a requirement

joe: hear consensus on a new stereo panner
... and a spatialization feature for v2
... suggest stripping equalpower from the node
... so hear consensus to replace pannernode with a new stereo panner node + spatialization node, with audio parameters, remove doppler

olivier: not clear whether group wants the spatialization to be in v1 or v2?

matt: unsure from our perspective. We have convolvernode

joe: can the new spatialization be pushed to v2? Especially since we currently have convolvernode
... would suggest deprecating for v1, fix it in v2

cwilso: unless we fix it no point in deprecating in v1 - games would be using it

ChrisL: agree and think the new cleaned-up node should be in v1, possibly marked as at risk
... it sets a clear direction

RESOLUTION: clearly deprecate current pannernode. Add spatialization without doppler. Add a stereo panner node.
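For illustration: this resolution later surfaced in the API as StereoPannerNode, with an a-rate pan AudioParam, while PannerNode gained positionX/Y/Z AudioParams and lost doppler. A sketch, assuming those shipped shapes:

    var ctx = new AudioContext();

    // Equal-power stereo panning with an automatable AudioParam.
    var panner = ctx.createStereoPanner();
    panner.pan.setValueAtTime(-1, ctx.currentTime);              // hard left
    panner.pan.linearRampToValueAtTime(1, ctx.currentTime + 2);  // sweep right

    // Spatialization with position exposed as AudioParams.
    var spat = ctx.createPanner();
    spat.positionX.setValueAtTime(3, ctx.currentTime);
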

<shepazu> http://www.w3.org/TR/mediacapture-depth/

shepazu: depth track - do we need to take it into consideration?

joe: it could, in the future

Next - Github 359 - Architectural layering: output stream access? - https://github.com/WebAudio/web-audio-api/issues/359

cwilso: question is how do you get access to input and output
... translates to streams

<cwilso> https://streams.spec.whatwg.org/

cwilso: questioning the need for streams API

Streams API -> https://streams.spec.whatwg.org/

cwilso: is is really designed as a sink
... different model to how web audio works (polling for data)
... does not really solve the balance between low latency and avoiding glitching

joe: how does this relate to getUserMedia

harald: discussion in group about output devices and how to connect a mediastream to a specific output device

joe: anything actionable for v1?

cwilso: agree it's not mature enough yet. still agree you need some access to audio

joe: agree it is fundamental, but acknowledge it will be a multi-group thing

cwilso: architectural layering seems more important than arbitrary date
... significant broken bit

TENTATIVE: Github 359 is fundamental but we may not resolve it for v1

next is Github 358 - Inter-app audio - https://github.com/WebAudio/web-audio-api/issues/358

mdjp: tricky issue - on the one hand this sounds like a case of "plugins are bad"
... but there is demand from the industry

cwilso: 2 separate but related issues
... on the one hand massively popular, adopted plugin systems (VST, etc)
... massive investment
... this is how most inter-application audio is done
... people have a lot of these around, and not being able to use effects people own is a problem
... separately, there's a question of whether we allow people to replicate what they do

ChrisL: are you talking about sandboxing?

joe: feels like a web-wide issue

cwilso: audio is a very low latency, high bandwidth connection - makes it different from other kinds of applications

joe: does audioworker dispense with it by allowing loading scripts from other domains?

olivier: what's the use case for this to be in v1?

cwilso: relates very closely to a number of our key use cases

ChrisL: one key difference with other parts of the web is user acceptance of the model

joe: want to be careful jumping this divide. The fact that this relates to our use cases does not necessarily oblige a v1 release to cover those present-day platforms
... would be a good thing, but may not be something we MUST do
... it will be controversial if we pull plugins in and make it a first class citizen

cwilso: are you talking about plugging into VSTs and rack effects, or the general question of inter-app plugins

joe: audioworker could be the solution to the generic question of javascript "plugins"

cwilso: don't know whether it actually works for it

BillHofmann: question of whether plugins will be native or web-based
... might be worth renaming to not create allergic reaction to "plugins"

joe: my belief is that there could be a standard built upon audioworker

<BillHofmann> (note speaking as a matter of personal opinion, not as Dolby)

mdjp: seems to be consensus that plugin arch is important, question is whether v1 or v2

(only cwilso raises hand for preference to v1)

cwilso: we are at a point where we are looking back at coverage of our use cases - we might want to revisit the use cases

olivier: suggest splitting the two issues - multi-app behaviour and VSTs etc

mdjp: suggest 3 steps
... 1 review use cases and implications of doing this or not
... 2 if we do it, how

cwilso: will have to have an answer ready when we stamp something as v1
... as to how this will be done

RESOLUTION: split GH 358 into two, start thinking about our answer

[discussion about v1, living standard etc]

shepazu: note that v1 is where patents lie
... having a live editor's draft is a good idea

Next - Github 351 - Describe how Nodes created from different AudioContexts interact (or don't) - https://github.com/WebAudio/web-audio-api/issues/351

cwilso: suggest that nodes should not interact with other contexts
... but buffers should work across contexts
... remember - a context works at a specific sample rate
... any node with a notion of a sample buffer would break if you pass it across contexts
... whereas audiobuffers have sample rate built in
... one of the two ways of getting a buffer (decodeaudiodata) would be harder
... but we have a separate issue for that
... proposal is to say "if you try connecting nodes across context, throw an exception; audiobuffers on the other hand can"

olivier: is there any case against this?

cwilso: would lose the ability to reuse a graph
... without recreating it

RESOLUTION: agree to cwilso's proposal - "if you try connecting nodes across context, throw an exception; audiobuffers on the other hand can"
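For illustration, a sketch of the resolved behaviour; in the spec as it later shipped, connecting nodes across contexts throws an InvalidAccessError, while an AudioBuffer, which carries its own sample rate, can be reused across contexts:

    var ctxA = new AudioContext();
    var ctxB = new AudioContext();

    var src = ctxA.createBufferSource();
    try {
      src.connect(ctxB.destination);   // nodes must not cross contexts
    } catch (e) {
      // InvalidAccessError expected here
    }

    // An AudioBuffer is context-independent and can be shared.
    var buffer = ctxA.createBuffer(2, 44100, 44100);
    var srcB = ctxB.createBufferSource();
    srcB.buffer = buffer;              // fine: the buffer knows its sample rate
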

Next - Github 12 - Need a way to determine AudioContext time of currently audible signal - https://github.com/WebAudio/web-audio-api/issues/12

joe: this comes from the fact that on some devices there was built in latency at OS level, and it was impossible to discover it
... no way of asking the platform "is there any latency we should know about?"
... problem with scheduling is that there is a time skew that is not discoverable

ChrisL: why is it not discoverable
... schedule something and measure?

cwilso: some of it is not measurable/not reported, e.g. bluetooth headset

olivier: there is a suggestion from srikumar here - https://github.com/WebAudio/web-audio-api/issues/12#issuecomment-52006756

joe: similar to my suggestion

(group looks at proposal - some doubts about point 3)

joe: suggestion of new attribute of audiocontext describing the UA's best guess of the signal heard by the listener right now
... in audiocontext time
... it's a time in the past, different from currenttime which is the time of the next processing

<cwilso> additionally, I think we should expose the best guess at the time the currentTime block will play in performance.now time.

olivier: will need to be more precise to be testable

RESOLUTION: add two things - new AudioContext attribute exposing UA's best guess at real context time being heard now on output device (this will normally be a bit behind currentTime, and not quantized). Also new attribute expressing DOM timestamp corresponding to currentTime. - see https://github.com/WebAudio/web-audio-api/issues/12#issuecomment-60651781
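For illustration: in the API as it later shipped, this resolution surfaced as AudioContext.getOutputTimestamp(), which pairs a context time with a DOM timestamp. A sketch, assuming that final shape:

    var ctx = new AudioContext();
    var ts = ctx.getOutputTimestamp();
    // ts.contextTime: context time of the signal being heard right now
    // (normally a bit behind ctx.currentTime, and not quantized)
    // ts.performanceTime: the same instant on the performance.now() clock
    console.log(ts.contextTime, ts.performanceTime);
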

Next - Github 78 - HTMLMediaElement synchronisation - https://github.com/WebAudio/web-audio-api/issues/78

cwilso: suggest to close "not our problem"

(consensus to close, crafting close message)

joe: essentially duplicate of 257?

RESOLUTION: Close Github 78

Next - Github 91 - WaveTable normalization - https://github.com/WebAudio/web-audio-api/issues/91

ChrisL: is this the highest or the sum that is normalised to 1?

joe: sum

[discussion about normalising and band-limitation]

cwilso: need to look at actual normalization algorithm
... suggested resolution - need a parameter to turn off normalization; also - need better explanation of periodicwave

RESOLUTION: add additional optional parameter to createPeriodicWave() to enable/disable normalization; better describe real and imag; document the exact normalization function - https://github.com/WebAudio/web-audio-api/issues/91#issuecomment-60655020
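For illustration: in the shipped API the additional optional parameter became a constraints dictionary with a disableNormalization member. A sketch, assuming that shape:

    var ctx = new AudioContext();
    // real/imag are Fourier coefficients; index 0 (DC) is ignored.
    var real = new Float32Array([0, 0]);
    var imag = new Float32Array([0, 1]);   // a single sine partial
    var wave = ctx.createPeriodicWave(real, imag,
                                      { disableNormalization: true });
    var osc = ctx.createOscillator();
    osc.setPeriodicWave(wave);
    osc.connect(ctx.destination);
    osc.start();
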

[break for lunch]

Next - Need to provide hints to increase buffering for power consumption reasons - GH 348 - https://github.com/WebAudio/web-audio-api/issues/348

padenot: in some use cases better to not run at the lowest possible latency
... multiple proposals in the thread; one is to tell the context what the preferred buffer size is
... another is to use channel API
... and Jer has a strawman where he kind of uses the channel API
... my position would be not to explicitely pick a buffer size
... the UA has typically a better idea of appropriate buffer size

cwilso: major concern was conflating this with other behaviour such as pausing for a phone call
... (stop the context altogether)
... agree that the requested number is not necessarily what you would get

padenot: basically we need low latency / save battery

cwilso: and balance the two
... and turn that dial

olivier: use case for typical developer? I see how that's useful for implementer...

padenot: example of an audio player with a visualiser - no need for low latency there

joe: if this were to be implemented as a "the UA may..." would that be acceptable?

cwilso: right thing to do

padenot: agree

cwilso: not sure it should be at constructor level

padenot: tricky on some platforms if you want to make it glitch-less
... worried about putting too many parameters on the constructor

RESOLUTION: do something similar to Jer's proposal at https://github.com/WebAudio/web-audio-api/issues/348#issuecomment-53757682 - but as a property bag options object passed to the constructor

cwilso: expose it, make it readonly and decide later whether we make it dynamic
... expose the effects, not the whole property bag
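
For illustration: this resolution later surfaced as the latencyHint member of AudioContextOptions, with the effect exposed read-only as baseLatency. A sketch, assuming that final shape:

    // Ask the UA to favour battery life over latency; the UA decides
    // the actual buffering, and the effect (not the hint) is exposed
    // read-only as ctx.baseLatency.
    var ctx = new AudioContext({ latencyHint: 'playback' });
    console.log(ctx.baseLatency);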

Next - two related issues

joe: two issues - Github 264 and 132

Use AudioMediaStreamTrack as source rather than ill-defined first track of MediaStream -> https://github.com/WebAudio/web-audio-api/issues/264

Access to individual AudioTracks -> https://github.com/WebAudio/web-audio-api/issues/132

joe: looking at 132 first
... would it make sense to change the API to be track-based for both
... also this comment from Chris suggesting one output per track for MediaElementSourceNode https://github.com/WebAudio/web-audio-api/issues/132#issuecomment-51366048

padenot: tracks are not ordered

joe: tracks have a kind, id and label

cwilso: yes they're identifiable
... the problem is "we say first" and they have an unordered label list...

joe: we could rename and require an id

cwilso: take the track instead

joe: agree

cwilso: suggest we keep the same name and factory method, and add another that takes a track
... which would be unambiguous
... seems to be agreement on adding the interface that takes a track
... question is what we do with the old one
... and essentially deciding "what is the first track"
... first id in alphabetical order?

ChrisL: justification for keeping the "first track" interface?

cwilso: works today without people having to go read another spec and pick a track - especially since most of the time there is only one track
... we could clearly explain not to use the old system if there may be more than one track

ChrisL: what if we rickrolled them if they don't specify the track?

Harald: throw an exception if trying to use this method when there is more than one track

joe: at least it looks like mediaElement and MediaStream are congruent in their treatment of tracks
... (as far as we are concerned)

RESOLUTION: keep the same node, keep the same factory method but add a second signature that takes a track; also define "first"

cwilso: "it doesn't need to make sense, it just needs to be definitive"

Next - (ChannelLayouts): Channel Layouts are not sufficiently defined - Github 109 - https://github.com/WebAudio/web-audio-api/issues/109

BillHofmann: the reason this is relevant is for downmixing
... any more we want to cover for v1?

mdjp: do we have a limit?

cwilso: 32

olivier: mention a spec (AES?) attempting to name/describe channel layouts

BillHofmann: proposal to defer to v2

cwilso: propose to remove the statement about "other layouts" from the spec

RESOLUTION: stick to currently supported layouts, rewrite the statement about other layouts to clarify expectations that we are not planning any more layouts for v1

Next - Lack of support for continuous playback of javascript synthesized consecutive audio buffers causes audio artifacts. - GH265 - https://github.com/WebAudio/web-audio-api/issues/265

joe: agree that audioworker is the current solution to this problem

group discussing https://github.com/WebAudio/web-audio-api/issues/300 (Configurable sample rate for AudioContext)

RESOLUTION: close wontfix
... bump up priority of GH300

Next - Unclear behavior of sources scheduled at fractional sample frames - GH332 - https://github.com/WebAudio/web-audio-api/issues/332

joe: propose we not do it, and specify that we are not doing it

RESOLUTION: edit spec to stipulate that all sources are always scheduled to occur on a whole (rounded) sample frame

Next - OfflineAudioContext onProgress - GH302 - https://github.com/WebAudio/web-audio-api/issues/302

joe: seems like a showstopper to me if you need to create tens of thousands of notes
... onprogress would allow JIT instantiation

cwilso: if you want to do sync graph manipulation best thing may be to pause, modify, then resume
... you do have to schedule it
... you could schedule a pause every n seconds, and use the statechange callback

joe: single interval would be fine
... given use cases I have seen

RESOLUTION: introduce way to tell offlineaudiocontext to pause automatically at some predetermined interval
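For illustration: this later shipped as OfflineAudioContext.suspend(time), which pauses rendering at a given context time so the graph can be modified synchronously. A sketch:

    var offline = new OfflineAudioContext(2, 44100 * 10, 44100);

    // Pause rendering at t=1s, schedule the next batch, then resume.
    offline.suspend(1).then(function () {
      // ...instantiate more sources here, just in time...
      offline.resume();
    });

    offline.startRendering().then(function (buffer) {
      // rendered AudioBuffer
    });
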

Next - Musical pitch of an AudioBufferSourceNode cannot be modulated - GH333 - https://github.com/WebAudio/web-audio-api/issues/333

joe: not sure this is v1 level
... would be nice to have detune for audiobuffersourcenode
... not great at the moment
... but suggest we defer this

mdjp: fair use case

cwilso: it does feel like something we forgot, not particularly hard

RESOLUTION: Add detune AudioParam in cents, analogous to Oscillator, at a-rate
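For illustration: as shipped, AudioBufferSourceNode.detune is an AudioParam denominated in cents. A sketch (someBuffer is assumed to be a previously decoded AudioBuffer):

    var src = ctx.createBufferSource();
    src.buffer = someBuffer;
    src.detune.value = 1200;                                   // up one octave
    src.detune.linearRampToValueAtTime(0, ctx.currentTime + 1); // glide back
    src.connect(ctx.destination);
    src.start();
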

Next - Map AudioContext times to DOM timestamps - GH340 - https://github.com/WebAudio/web-audio-api/issues/340

RESOLUTION: see GH12 for description of new AudioContext time attribute

Next - Configurable sample rate for AudioContext - GH300 - https://github.com/WebAudio/web-audio-api/issues/300

RESOLUTION: the new options object argument to the realtime AudioContext will now accept an optional sample rate

cwilso: may want to round it
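
For illustration: together with the options bag above, this later shipped as the sampleRate member of AudioContextOptions:

    var ctx = new AudioContext({ sampleRate: 48000 });
    console.log(ctx.sampleRate);   // 48000 if the UA can honour it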

[break]

<scribe> ScribeNick: philcohen

issues prioritisation

Chris: brings up the a-rate and k-rate issue

<olivier> https://github.com/WebAudio/web-audio-api/issues/55

RESOLUTION: k-rate and not a-rate, per Chris's proposal

Chris brings up issue https://github.com/WebAudio/web-audio-api/issues/337: DecodeAudioData

Chris: Use case is that the Audio API needs a decoder of its own, since it cannot use the Media API for this purpose; that API has its own design goals

Bill: Additional related issues: 371, 337, 30 & 7

Chris: https://github.com/WebAudio/web-audio-api/issues/30 we should do

Joe: accepted 30 for escalation

Chris: issue https://github.com/WebAudio/web-audio-api/issues/359

Joe: we will discuss this tomorrow at 12 with Harald from the device task force
... De-zippering https://github.com/WebAudio/web-audio-api/issues/76

Chris: we have it built in today

Paul: defined in the specs as part of the Gain node

Olivier: thought we already have a resolution

Chris: made a resolution back in January

Joe: So what are we missing today?

Chris: not convinced we can define the use case where it will be applied

Joe: Does not want the API to de-zipper by itself; developers should be responsible for that

RESOLUTION: We reverse the January decision; the issue is reopened

Joe: proposing De-zippering is OFF

ChrisL: supporting it

RESOLUTION: Cancel automatic de-zippering; developers will use the API when needed

Olivier: should add informative material to guide developers

Joe: OK
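
For illustration: with automatic de-zippering off, authors smooth parameter changes themselves, e.g. with setTargetAtTime. A sketch (gain is assumed to be an existing GainNode):

    // Instead of gain.gain.value = 0.5 (which now takes effect abruptly),
    // approach the target exponentially over ~30 ms to avoid zipper noise.
    gain.gain.setTargetAtTime(0.5, ctx.currentTime, 0.03);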

Chris: issue https://github.com/WebAudio/web-audio-api/issues/6

AudioNode.disconnect() needs to be able to disconnect only one connection

Chris: Connect allow selective connection but disconnect is not selective and destroy all output

Joe: it's bad

RESOLUTION: Just do it!
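
For illustration: disconnect() later gained selective overloads taking a destination (and optionally output/input indices). A sketch:

    var src = ctx.createOscillator();
    var dry = ctx.createGain();
    var wet = ctx.createConvolver();
    src.connect(dry);
    src.connect(wet);

    src.disconnect(wet);   // selectively drop one connection; dry remains
    src.disconnect();      // the no-argument form still drops everything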

https://github.com/WebAudio/web-audio-api/issues/367 Connecting AudioParam of one AudioNode to another Node's AudioParam

Joe: concern we are inventing a new way to connect that will create a heavy load on implementors

Chris: Use case: create a source with a DC offset

RESOLUTION: not in V1

https://github.com/WebAudio/web-audio-api/issues/39 MediaRecorder node

Chris: should not be done, since we already have MediaRecorder via MediaStream

Paul: What about offline AudioContext?

Chris: great V2 feature

RESOLUTION: Deferring to V2

https://github.com/WebAudio/web-audio-api/issues/13 - A NoiseGate/Expander node would be a good addition to the API

Chris: Pretty common use case
... Doable in AudioWorker

Paul: Can the dynamics compressor be used for that?

Chris: prefers to make it a separate node

Matt: suggesting a decision: additional node in V1, name to be finalized (Expander, Dynamics Compressor)
... testing requires work

Chris: anyhow we have a lot to do in testing

RESOLUTION: Approved to include this in V1

Related: DynamicsCompressor node should enable sidechain compression https://github.com/WebAudio/web-audio-api/issues/246

Chris: Joe's position is to not include this in V1; he detailed how to achieve it

Paul: Connecting two inputs can achieve it

Chris: Two input connections (signal + control)

RESOLUTION: no new node: DynamicsCompressor can have an optional 2nd input for a control signal, for V1.
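
For illustration only: a hypothetical use of the optional second input per this resolution. This shape is an assumption drawn from the minutes, not shipped API; DynamicsCompressorNode has a single input in the published spec.

    var comp = ctx.createDynamicsCompressor();
    programBus.connect(comp, 0, 0);   // input 0: the signal to be compressed
    kickBus.connect(comp, 0, 1);      // input 1 (hypothetical): control/key signal
    comp.connect(ctx.destination);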

editorial issues ready for review

<olivier> https://github.com/WebAudio/web-audio-api/issues?q=is%3Aopen+is%3Aissue+label%3A%22Ready+for+Review%22

<hongchan> cwilso: moving onto issues on AnalyserNode

<hongchan> https://github.com/WebAudio/web-audio-api/issues/330

<hongchan> …: issue 1 - processing block size

<hongchan> …: issue 2 - smoothing

<hongchan> https://github.com/WebAudio/web-audio-api/issues/377

<hongchan> …: issue 3 - analyser FFT size

<hongchan> https://github.com/WebAudio/web-audio-api/issues/375

<hongchan> …: needed for visualization and robust pitch detection; we might want to crank it up to 8k.

<hongchan> mdjp: it is necessary to lay out some explanation of the trade-offs of FFT size.

<hongchan> olivier: if commercial audio software supports 32k, the Web Audio API should do it too.

<hongchan> cwilso: consensus is 32k.

<hongchan> cwilso: the minimum size of frame for FFT should be 128.

<hongchan> RESOLUTION: Specify which 32 samples to use. Last 32 has been identified.

<hongchan> cwilso: next issue: smoothing performed on method call

<hongchan> https://github.com/WebAudio/web-audio-api/issues/377

<hongchan> cwilso: ray suggested a problem caused by non-consecutive smoothing executions.

<hongchan> TPAC RESOLUTION: Clarify analysis frame only on getFrequencyData
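
For illustration: fftSize later shipped with a maximum of 32768, matching the consensus above, and analysis happens when a get*FrequencyData method is called. A sketch:

    var analyser = ctx.createAnalyser();
    analyser.fftSize = 32768;                      // maximum per the consensus
    var bins = new Float32Array(analyser.frequencyBinCount);  // 16384 bins
    analyser.getFloatFrequencyData(bins);          // analysis happens on call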

<hongchan> https://github.com/WebAudio/web-audio-api/issues/308

<hongchan> Issues on shared methods in offlineAudioContext

<hongchan> cwilso: polling data from Media API faster than real-time is not possible

<hongchan> TPAC Resolution: Adopt ROC's original suggestion of making both the offline and realtime AudioContexts inherit from an abstract base class that doesn't contain the methods in question.

<hongchan> Note: Ask Cameron McCormack to pronounce on best way to describe in WebIDL.

<hongchan> https://github.com/WebAudio/web-audio-api/issues/268

<hongchan> Not relevant any more. Closing.

<hongchan> Noise Reduction should be a float.

<hongchan> https://github.com/WebAudio/web-audio-api/issues/243

<hongchan> Closing issue 243.

<hongchan> Moving onto Issue 128 - https://github.com/WebAudio/web-audio-api/issues/128

<hongchan> using .value setter is sort of a training wheel, so it shouldn't be used for serious parameter control.

<hongchan> mdjp: No behavioral changes on API. Editorial changes.

<hongchan> Moving onto Issue 73 - https://github.com/WebAudio/web-audio-api/issues/73

<hongchan> cwilso: introspective nodes should not be introduced

<hongchan> mdjp: closing.

<hongchan> Moving onto issue 317 - https://github.com/WebAudio/web-audio-api/issues/317

<olivier> ScribeNick: hongchan

<olivier> Scribe: olivier, philcohen, hongchan

<olivier> Meeting: Audio WG f2f meeting - TPAC 2014

<olivier> Scribenick: hongchan

Summary of Action Items

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.138 (CVS log)
$Date: 2014-10-28 00:31:23 $
