W3C

- DRAFT -

W3C Audio WG f2f
26 Mar 2013

See also: IRC log

Attendees

Present
Regrets
Chair
SV_MEETING_CHAIR
Scribe
gmandyam, shepazu, chrislowis

Contents


<gmandyam> scribenick: gmandyam

Olivier: first agenda item is the question of the feature set for the WebAudio API
... there have been a number of discussions on the complexity/extent of WebAudio feature set
... discussion has ranged from how we could simplify API to freezing current set of features
... refer to sec. 1.3 (API Overview) of the current WebAudio spec for current features

crogers: The current spec is stable for a Version 1, but there have been suggestions for a limited feature set for a Version 1. The current spec is semi-complete. We can add more features, but this should be done after the Mozilla implementation is further along.

ehsan: The current feature set covers too much, and from Mozilla it would make sense to have a smaller feature set. We cannot ship a feature unless it is fully implemented (even though it can be included in nightly builds).


joe: from an app developer perspective, there are not a lot of features that can be put off to V2.

olivier: we did put out a use cases and requirements doc, but it has not helped us in prioritizing features. There are no features currently that are not useful.

cwilso: I went through this process before in the context of discussion with MS about WebAudio. Pulling out features changes the scope of use cases that we can meet. e.g. we could pull out WaveTable, but it affects synthesizer capability. There are a couple of additional features that should be added to the spec (e.g. an expander node) that can go into V2.

olivier: we should set up a post-V1 roadmap.

cwilso: It would be useful to understand the scope of implementation, too - i.e. how big is the cost of implementation?

<shepazu> (for reference: http://www.w3.org/TR/webaudio-usecases/ )

olivier: Ehsan, in addition - please distinguish between the node processing model, and the complexity in implementing individual nodes.

crogers: Convolver node, spatial panning node, dynamic compressor nodes are the hardest nodes to implement. The rest of the nodes are simpler. Would recommend to Mozilla to prioritize node development based on complexity of implementation. Developers are using most if not all of the features in Chrome currently, so we can't easily remove nodes. But we can designate nodes as V1.

olivier: We should arrive at a decision on critical features for V1 today.

ehsan: it is not easy for an app to query for supported features in WebAudio, thus making it difficult to cut down features in an implementation. This will be a problem if multiple versions of WebAudio are deployed. Maybe an extensions model like WebGL could be the way to go.
... our survey of game developers has yielded a priority list of processing nodes. This is driving our development. The current scope of WebAudio prevents us from shipping near-term. Focusing on a subset of features and use cases will allow us to ship sooner.
... there is not a hard cost in implementing specific nodes. The complexity is based on implementing the entire spec, and not having access to the relevant Webkit code is a hindrance.
... the actual implementation time is about a few hours per node.
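The feature-query gap ehsan raises can be worked around today with duck-typing on the context object; a minimal sketch (the helper name and the mock context are hypothetical, not part of the spec):

```javascript
// Hypothetical helper: a node type is "supported" if its factory
// method exists on the (possibly vendor-prefixed) context object.
function supportsNode(ctx, factoryName) {
  return typeof ctx[factoryName] === "function";
}

// A mock stands in for a real AudioContext so the sketch is
// self-contained; in a browser you would pass the real context.
const mockContext = {
  createGain: function () {},
  createDelay: function () {},
};

console.log(supportsNode(mockContext, "createGain"));      // true
console.log(supportsNode(mockContext, "createConvolver")); // false
```

This only detects the factory method's presence, not its conformance level, which is why an explicit extensions model like WebGL's may still be preferable.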

shepazu: we need a means of runtime detection of feature support (agree with ehsan that this is a shortcoming of the current spec).
... W3C requires two browser implementations of V1 of any given spec before work can begin on a V2.
... Developers shouldn't read W3C specs - they should go to other articles (e.g. HTML5Rocks).

crogers: Developer documentation has been very difficult for WebAudio. Developers have had to rely on the specs.

cwilso: we haven't had enough developer uptake to get more developer documentation from non-W3C sources.

shepazu: we should consider developer documentation + specs as a whole ecosystem. Let us separate what developers can use from the spec work - developers could rely on documentation for guidance on actual browser support for features.

ehsan: the difficulty is matching nodes we are developing with the behavior of the same nodes in Webkit. The spec does not provide sufficient details to allow us to implement nodes that would conform to the same level as Webkit.

olivier: ehsan, do you have a rough idea of what you consider a good feature subset?

<ehsan_> https://etherpad.mozilla.org/webaudio

ehsan: Will provide link to a feature priority list.
... this is from our survey of game developers.
... three groupings of features: V1 must haves, not sures, and post V1. The not sure features are features that game developers will use if available but are not critical.
... The post-V1 features (AnalyserNode, ChannelSplitter and ChannelMerger nodes) are of no interest to game developers.

joe: from a music developer's POV, I came up with a similar list to Mozilla's re: V1/V2 split. OfflineAudioContext would be the only difference.
... OscillatorNode as an example - would use if it is there but I can still come up with a suitable music app without it.

ehsan: We may deprioritize OfflineAudioContext because of difficulties in implementing in Gecko architecture.

crogers: There are uses for developers (e.g. upload to server), and it is critical for our test cases. We compare the node processing versus identical JS processing using OfflineAudioContext to verify nodes.
... ScriptProcessor node has a latency, and a testbed could be modified to account for this latency but it adds difficulty to automated testing.
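The offline-rendering test approach crogers describes could be sketched as follows; the comparison helper is pure JavaScript, while the browser wiring (constructor signature per the 2013 draft, possibly vendor-prefixed) is an assumption shown in comments:

```javascript
// Pure helper (runs anywhere): compare a rendered buffer against a
// reference JS implementation within a tolerance.
function buffersMatch(rendered, reference, epsilon) {
  if (rendered.length !== reference.length) return false;
  for (let i = 0; i < rendered.length; i++) {
    if (Math.abs(rendered[i] - reference[i]) > epsilon) return false;
  }
  return true;
}

// Browser-side sketch (names per the 2013 draft; treat as assumptions):
// const offline = new OfflineAudioContext(1, 44100, 44100);
// ... build the node graph under test against `offline` ...
// offline.oncomplete = function (e) {
//   const rendered = e.renderedBuffer.getChannelData(0);
//   console.assert(buffersMatch(rendered, jsReference, 1e-4));
// };
// offline.startRendering();
```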

olivier: OfflineAudioContext will be revisited during the testing discussion.
... (to all group participants) please examine Mozilla's list and come up with your own version of a V1 list. Decide if we can take Mozilla's list as a discussion starting point.

crogers: Channel Splitter/Merger node is not so popular, but AnalyserNode is popular for visualizers.
... Developers will wonder why AnalyserNode is not available in Firefox.

olivier: We should focus on what developers need, as opposed to what has been used in some demos.

ehsan: the WebAudio demos and test cases are all using Webkit prefixed API and will not work in Firefox anyways.

crogers: You can do some of the things in AnalyserNode with ScriptProcessorNode, so it makes sense for a potential post-V1 feature.
... ConvolverNode is the most difficult node to implement. The spec language is sufficient.
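Replicating part of AnalyserNode's role with ScriptProcessorNode, as crogers suggests, might look like this: a pure level-metering helper plus the browser wiring in comments (method name per the draft; older WebKit builds used createJavaScriptNode, so treat the name as an assumption):

```javascript
// Pure helper: RMS level of one block of samples.
function rms(samples) {
  let sum = 0;
  for (let i = 0; i < samples.length; i++) sum += samples[i] * samples[i];
  return Math.sqrt(sum / samples.length);
}

// Browser wiring sketch (assumed API names):
// const sp = context.createScriptProcessor(1024, 1, 1);
// sp.onaudioprocess = function (e) {
//   const level = rms(e.inputBuffer.getChannelData(0));
//   // drive a visualizer from `level` ...
// };
```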

ehsan: We have an implementation of PannerNode in Gecko, but no DelayNode. I don't want to focus on implementation difficulty, but on critical use cases.

crogers: We have the ability to track in Chrome the node usage stats. We could track this for a couple of months in aid of prioritizing features.

ehsan: Tracking node stats with existing web pages is misleading since many of the pages are demo pages and are meant to test as many features as possible.

chrislowis: it would be better to collect node stats based on domain (ergo type of applications)

ehsan: that is a hard problem to solve (tracking node usage by app type)

crogers: it might be possible to track domains along with node usage, but there are limits as to what we can expose.

cwilso: I have approached WebAudio development from the perspective of a developer who is not an audio expert. For instance, I couldn't convert a mono signal to stereo signal without the channel splitter node.
... We can group use cases into the following app types: gaming, advanced gaming, synthesis, input, audio processing, media players, digital audio workstation. Cutting features will remove whole classes of apps.
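The mono-to-stereo case cwilso mentions could be sketched like this; the upmix helper is pure JavaScript, and the graph wiring (unprefixed names assumed, which in 2013 may carry a webkit prefix) is in comments:

```javascript
// Pure helper: naive mono-to-stereo upmix by duplicating the channel.
function upmixMonoToStereo(mono) {
  return { left: mono.slice(), right: mono.slice() };
}

// Graph-API sketch: route the single output into both merger inputs.
// const merger = context.createChannelMerger(2);
// monoSource.connect(merger, 0, 0); // output 0 -> merger input 0 (L)
// monoSource.connect(merger, 0, 1); // output 0 -> merger input 1 (R)
// merger.connect(context.destination);
```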

olivier: I don't hear massive disagreement with the Mozilla list (though I ack. cwilso's concerns on ChannelSplitterNode).
... Even with the Mozilla split there are only a few features that would be removed from V1.

chrislowis: Maybe it is better to do the "painful" cut of whole classes of apps to define V1 rather than specific nodes without app context.

<chrislowis> CSS 1.0 core features: http://www.w3.org/TR/REC-CSS1/#terminology

cwilso: We went through this with the CSS spec. Is there a goal of having some set of devices that does not do all of these features?

<chrislowis> scribeNick: chrislowis

cwilso: do we think we'll never want a class of devices that can, for example, use the ConvolverNode?
... I'm conscious that we need to make this easier for other browser implementers.

shepazu: if we do decide on a split we should make sure we pass it by other browser implementers.

olivier: I saw the opinion of Mozilla change the moment the implementation work started?

ehsan_: the size of the spec is a problem when implementing everything from scratch.

<wistoch> definitely we want convolverNode in mobile devices.

ehsan_: we were able to piggy-back on our work on mediastreams which made certain things easier. We shouldn't assume that that is the case for everyone.

shepazu: have you been looking at the webkit source code?

ehsan_: yes, I've looked at it a lot to understand things. But I haven't borrowed any of the source code yet.
... but we should fix the spec so that no one needs to look at webkit source code.
... I've been trying to catch all of the pain points and raise bugs and so on.
... on the mailing list and in bugzilla and with Chris R. The real problem is that sometimes the amount of issues is too great, so sometimes I make a decision but wouldn't want to just make a change to the spec.
... the overhead of contributing to the spec is a little too high, so I haven't contributed as much as I would like.

shepazu: when we talk about "versions" in the spec, we're often talking about "levels" so that new versions don't change old ones.

cwilso: no one would claim now to implement CSS 1.0; they would implement CSS 2, for example.
... whereas what we have isn't a simple set of concentric "rings" of features, it's a complex Venn diagram.

joe: originally I thought this discussion was about implementation effort, but it sounds like it's more about the difficulty in accurately specifying the features.

olivier: it's worth noting that we seem to have discovered that it's difficult to split apart the (etherpad doc) set of features.

ehsan_: with my gecko hat on, if I had a spec I could use to implement things in a very straightforward way, we'd have already shipped web audio. It's fair to assume that there will be other browser vendors who will have similar issues. Also it's a large spec to approach and wrap your head around to begin with.
... my personal pet peeve is the ScriptProcessorNode, and I'd be happy to not implement it - but I've been convinced otherwise!
... the other issue I have is with the use of floating point numbers, as we try not to assume that a given device will have a fpu. But I don't believe that is a battle worth fighting for at this point in the spec.

<ehsan_> chrislowis: clarification on ScriptProcessorNode, my objections have been on audio processing in javascript on the main thread, but I have been convinced that web developers want to do that since they can't access everything in a worker yet

Web Audio API walk through and live edit

<joe> 4.3 AudioSourceNode can be removed; over-prescriptive of implementation

<joe> (This is a record of spec review decisions from the F2F 3/26/2013)

<joe> 1. Introduction: link "use cases" to the Use Cases document

<joe> 1.1 Features need updating to reflect current contents of spec

<joe> 1.3 Oscillator, WaveShaper missing; others may be also

<joe> (1.3 = API Overview)

<joe> 2. Conformance: need to note use of MUST that is "RFC-legal" as opposed to common English usage

<joe> 3. Terminology and Algorithms: remove this section and treat algorithm exactness on a case-by-case basis elsewhere in specific sections

<joe> 4. The Audio API:

<joe> WebIDL should be construed as normative

<joe> 4.1 AudioContext: remove audio constructor example line of code

<joe> Update Bug 20698 to differentiate the latency-discovery issue (already filed) from follow-on questions of audio clock drift and granularity which may not affect user experience to same degree

<joe> AudioContext.createBuffer (synchronous) will be deprecated in favor of decodeAudioData (asynchronous)

<joe> decodeAudioData to take an optional 4th argument that disables automatic sample rate conversion

<joe> decodeAudioData "Audio file data can be in any of the formats supported by the audio element" => "...can be accepted in formats containing only audio data (w/o video)"

<joe> (to avoid the overhead of dealing with video containers that have an audio track)

<joe> Implementations are to omit functions from DOM bindings that are not implemented (e.g. createXXXNode where XXX isn't supported)

<joe> Ehsan will specify exception types to be returned by AudioContext methods where these weren't already given in spec

<joe> The sections about AudioContext and AudioNode lifetime will be construed as informative

<joe> AudioDestinationNode does not always talk to audio hardware (e.g. in offline case); fix wording that refers to this

<joe> AudioContext.createBuffer(): upper limit of 96k has been raised

<joe> AudioContext.createBuffer(): clarification: only the version of the method taking an AudioBuffer is to be deprecated

<joe> 2nd paragraph of decodeAudioData ("is preferred...") to be removed

<joe> General: Need to refer to XHR specification in this specification

<joe> Need to create an issue focusing on the question of modifying the ArrayBuffer passed to decodeAudioData

<olivier> http://www.w3.org/2011/audio/wiki/F2F_Mar_2013#Agenda

<joe> http://www.w3.org/2011/audio/wiki/F2F_Mar_2013

<olivier> http://www.w3.org/2011/audio/wiki/F2F_Mar_2013#Agenda

<mdjp> OfflineAudioContext should be an EventTarget - define event type to pass rendered audio

<mdjp> OfflineAudioContext - detail needs adding to the spec around startRendering(); clarification required on how multiple offline/online contexts interact

<mdjp> Shared audio buffers between contexts - use case - large sample libraries should be sharable.

<mdjp> clarification: OfflineAudioContext renders as quickly as possible (not real time)

<mdjp> chris to complete OfflineAudioContext spec

<mdjp> proposal - recorderNode (real time). Already possible with scriptProcessorNode - but a dedicated node would be more convenient.

<mdjp> 4.2. The AudioNode Interface - text is out of date for Fan-In "in order to handle this fan-in…"

<mdjp> 4.2. The AudioNode Interface - set aside discussion on where block size limits should be defined in the spec and whether or not the 128 sample value is appropriate

<mdjp> 4.2.1 - delete the AudioSourceNode attribute from AudioNode

<mdjp> The connect-to-AudioParam method - add detail on connecting an audio node to a non-audio node and why you would do it (an LFO example could be added to the graph routing introduction)
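The LFO example referred to above could be sketched as follows; the modulation math is pure JavaScript, while the node-to-AudioParam wiring (unprefixed names assumed, values illustrative) is shown in comments:

```javascript
// Pure math behind an LFO: a gain value modulated by a sine.
function lfoGain(base, depth, rateHz, t) {
  return base + depth * Math.sin(2 * Math.PI * rateHz * t);
}

// Node-to-AudioParam wiring sketch (assumed API names):
// const lfo = context.createOscillator();   // sine by default
// lfo.frequency.value = 5;                  // 5 Hz tremolo
// const amp = context.createGain();
// lfo.connect(amp.gain);                    // AudioNode -> AudioParam
// source.connect(amp);
// amp.connect(context.destination);
// lfo.start(0);
```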

<mdjp> Define behaviour when disconnect is called on an audio node connected to an AudioParam

<mdjp> disconnect method - AudioNode - add more information on dis/connect scheduling and behaviour. Lay out the steps that the audio engine needs to perform when dis/connect is called. ehsan to assist

<mdjp> channel count missing in IDL for AudioNode

<mdjp> Move information on multi channel (9) to audio node definition. Collate information on handling channel allocations into one place

<mdjp> compromise - duplicate partial interface in audio node definition and link to channel handling information

<mdjp> maxChannels - review 32 channel limitation on scriptProcessor, buffer and destination node

<mdjp> set aside discussion on delayNode - how to deal with change in number of input while live - how to allocate/deallocate buffers and maintain state

<mdjp> this may affect other nodes (biquad)

<mdjp> spec channel count for each node. Split out examples that represent defaults and put them in the node definition.

<mdjp> AudioParam - min/maxValue do not need to be exposed as attributes.

<mdjp> AudioParam - "intrinsic value" unclear. Move current text (4.5.1) higher in the spec and reword.

<mdjp> AudioParam - remove computedValue attribute.

<mdjp> Clarify "dezippering" for AudioParam (note this is already mentioned in gainNode)

<mdjp> AudioParam - (future) document initial time constant and algorithm for dezippering, allow it to be disabled. (explains how we make it "sound good")

<mdjp> Resolution - ^^ this has been recommended and we may do this in the future.
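The dezippering discussed above is commonly implemented as a one-pole exponential smoother toward the target value; a sketch under that assumption (the time constant and sample rate are illustrative, not values from the spec):

```javascript
// One-pole smoother: each sample moves the state toward the target by
// a factor set by the time constant, removing audible "zipper" steps.
function makeDezipper(timeConstant, sampleRate) {
  const coeff = Math.exp(-1 / (timeConstant * sampleRate));
  let state = 0;
  return function step(target) {
    state = target + coeff * (state - target);
    return state;
  };
}

// Illustrative values: 10 ms time constant at 44.1 kHz.
const step = makeDezipper(0.01, 44100);
let v = 0;
for (let i = 0; i < 44100; i++) v = step(1.0);
// after one second of samples, v has settled very close to 1.0
```

Allowing this to be disabled, as recommended, would matter for applications that schedule their own parameter curves.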

<mdjp> note - AudioParam methods (4.5.2) not reviewed at this time.

<mdjp> 4.5 - add explanation of a/k rate to cross reference in node definition.

<mdjp> consideration - allow users to choose between k/a rate but provide defaults

<mdjp> raise bug ticket to record all documentation that is considered developer documentation. E.g. 4.9 "(for one-shot sounds and other short audio clips)."

<mdjp> AudioProcessingEvent - remove node attribute

http://resource.isvr.soton.ac.uk/FDAG/VAP/images/anec_kemar.gif

<mdjp> Panner - include informative note on HRTF to point implementation in the right direction

mdjp: I think there's a chapter on binaural in the MIT Press "Computer Music Tutorial" book?

<mdjp> Panner - add information on why the panner is hard coded to 2 channel only

<mdjp> chrislowis yes - although I think there may be some better references I can dig out.

<mdjp> action - mdjp to investigate use cases for soundfield panning model

<trackbot> Error finding '-'. You can review and register nicknames at <http://www.w3.org/2011/audio/track/users>.

<mdjp> Add informative section on how parameter changes are scheduled and applied to samples

mdjp: I added your nick name, might take a while to register in the system.

<mdjp> chrislowis - so did I!

<mdjp> ConvolverNode - add implementation page as a note or informative section in the spec.

<mdjp> BiQuad - resolve reference to third party implementation. Host locally or find permanent reference

<mdjp> BiQuad - some default values are missing from the spec

<olivier> (type and detune)

<mdjp> biquad - remove wikipedia links to filter types

<olivier> (but we should use that in dev doc)

Summary of Action Items

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.137 (CVS log)
$Date: 2013-03-26 23:56:05 $
