Re: TPAC F2F and Spec Proposals

Media Streaming seems to be in the business of routing media signals as
binary data from one client to another, and handling the exceptions along
the way: a kind of pipeline infrastructure for video and audio media around
the web.

It is also very clear that lots of users all around the web want to use
audio in the browser for synthesizing music and creating games with 3D
spatialized sound. Based on the input from the community I have seen while
giving talks at conferences, the Web Audio API covers the set of use cases
for the common developer more completely than the Audio Data API or the
Media Streaming API.

On top of this, the number of cool audio-visual demos, hacks and experiments
flying around the web is bringing direct commercial interest to web-services
companies right down to the start-up level.

I disagree with Hixie's statement that it is too soon for an audio API of
the Web Audio API's complexity, for three reasons:

1) The market is already knocking at the door asking to use these features -
we have a chance to create more jobs and a richer web in the process.
2) No one is re-inventing how digital audio works for either games or music,
so standardization does not have to be *overly* slow, though obviously care
should be taken.
3) I think there are enough specifications still on the standards track that
require common audio functionality (HTML Media, Voice Browser, Web Audio
Data) that we should have had a common audio API much earlier.

In my mind the browser should already have the capability to mix, level,
mute and pan audio sources. We should have learned this from Flash
advertising already: "Oh wow! You've been selected as the 1 millionth
visitor!" is a failure scenario we hear as we scramble to find which tab to
close.
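
For what it's worth, the per-source control I mean is already expressible
with the Web Audio API's node graph. A rough sketch (method names such as
createGainNode/createPanner follow the current editor's draft and may well
change; Chrome currently exposes the context as webkitAudioContext):

    // Per-source mixing: level, mute and pan one <audio> element.
    const ctx: any = new (window as any).webkitAudioContext();

    const source = ctx.createMediaElementSource(
        document.querySelector('audio')!);   // e.g. the offending ad
    const gain = ctx.createGainNode();       // per-source level / mute
    const panner = ctx.createPanner();       // per-source pan

    source.connect(gain);
    gain.connect(panner);
    panner.connect(ctx.destination);

    gain.gain.value = 0;                     // mute just this source...
    panner.setPosition(-1, 0, 0);            // ...or push it hard left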

I think we need to learn from what has developed naturally in the OS
already. We currently have:

1) An OS-level sound mixer controlling a) devices, and b) applications.
2) An application framework for media file playback: the video/audio player.
3) An application framework for developing audio software for games or
music: DirectX, OpenAL, etc.

I think we need a more complete Browser Audio Framework that can be broken
down into the following components:

1) A browser UI and architecture for controlling audio -- at a tab and
device level. It would not be a pressing matter to standardize this
functionality, and it could be done independently by each browser vendor.
2) A "Web Audio Data API" with high-resolution timing, 3D spatialization of
sources, and standardized effects and algorithms for music and games, which
accepts inputs from other APIs (a rough sketch follows this list).
3) A common "Sound Mixer API" for the window that allows for panning,
mixing, muting, and creating JavaScript sinks and worker threads. RTC, Web
Audio Data and HTML Media elements would play back through the Sound Mixer
API (a purely hypothetical interface sketch follows the diagram link below).
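
For component 2, here is the kind of thing I mean, sketched against the
current Web Audio API draft (again, treat every method name as provisional):

    // Sample-accurate scheduling plus 3D positioning of a source.
    const ctx: any = new (window as any).webkitAudioContext();

    function playAt(buffer: any, time: number,
                    x: number, y: number, z: number) {
      const src = ctx.createBufferSource();
      src.buffer = buffer;

      const panner = ctx.createPanner();
      panner.setPosition(x, y, z);      // place the source in 3D space

      src.connect(panner);
      panner.connect(ctx.destination);
      src.noteOn(time);                 // scheduled on the audio clock,
    }                                   // not on a JavaScript timer

    // e.g. fire a footstep half a second from now, two metres to the right
    // (footstepBuffer standing in for some decoded AudioBuffer):
    // playAt(footstepBuffer, ctx.currentTime + 0.5, 2, 0, 0);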

I have created a diagram to visualize this concept here:
http://f1lt3r.com/w3caudio/Browser%20Audio%20Routing.jpg
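
To make component 3 concrete, a purely hypothetical interface sketch --
every name below is invented for illustration; nothing like this exists in
any draft:

    // Hypothetical per-window mixer; all names are made up.
    interface SoundMixerChannel {
      gain: number;       // 0.0 (mute) .. 1.0
      pan: number;        // -1.0 (left) .. 1.0 (right)
      muted: boolean;
    }

    interface SoundMixer {
      channels: SoundMixerChannel[];
      // Anything that makes noise in the window gets a channel here:
      // HTML media elements, Web Audio graphs, RTC streams.
      addChannel(source: HTMLMediaElement | AudioNode): SoundMixerChannel;
      // A JavaScript sink: hand the mixed output to a worker for custom
      // processing before it reaches the device.
      createSink(worker: Worker): void;
    }

    // Imagined usage:
    const mixer = (window as any).soundMixer as SoundMixer;
    mixer.addChannel(document.querySelector('audio')!).muted = true;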

With this in mind, I think the most pressing concern right now is a Sound
Mixer API, then a Web Audio Data API, and finally (who knows how far out
this would be) an overhaul of the browser's internal audio architecture,
adding UI features to the UA.

Would be interested to hear people's feedback on this idea.

Alistair

On Tue, Oct 18, 2011 at 1:36 PM, Chris Rogers <crogers@google.com> wrote:

>
>
> On Mon, Oct 17, 2011 at 11:28 PM, Jussi Kalliokoski <
> jussi.kalliokoski@gmail.com> wrote:
>
>> On Tue, Oct 18, 2011 at 3:47 AM, Chris Rogers <crogers@google.com> wrote:
>>
>>>
>>>
>>> On Mon, Oct 17, 2011 at 4:23 PM, Olli Pettay <Olli.Pettay@helsinki.fi> wrote:
>>>
>>>> On 10/14/2011 03:47 AM, Robert O'Callahan wrote:
>>>>
>>>>> The big thing it doesn't have is a library of native effects like the
>>>>> Web Audio API has, although there is infrastructure for specifying
>>>>> named
>>>>> native effects and attaching effect parameters to streams. I would love
>>>>> to combine my proposal with the Web Audio effects.
>>>>>
>>>>
>>>>
>>>> As far as I see Web Audio doesn't actually specify the effects in any
>>>> way, I mean the algorithms, so having two implementations to do the
>>>> same thing would be more than lucky. That is not, IMO, something we
>>>> should expose to the web, at least not in the audio/media core API.
>>>>
>>>
>>> I'm a bit perplexed by this statement.  The AudioNodes represent
>>> established audio building blocks used in audio engineering for decades.
>>>  They have very mathematically precise algorithms.  Audio engineering and
>>> computer music theory has a long tradition, and has been well studied.
>>>
>>> Chris
>>>
>>
>> I have to display my ignorance here, as I've yet to see a resource where
>> you'd have all these effects and their algorithms defined universally. I've
>> seen a lot of different implementations and even more algorithms for doing
>> these things, especially filters and reverbs. I haven't heard of a single
>> universally accepted algorithm for reverberation or spatialization. Makes
>> sense to say that for example a gain effect is standard, but even then
>> there's logarithmic gain effects and linear gain effects. The only things
>> that stand as standards in my mind are FFT and various window functions,
>> which aren't effects per se. Chris, maybe you can point us to a reference
>> where these effects are in fact defined in detail?
>>
>> Jussi
>>
>
> Convolution [1] [2] is the technique used for reverberation.  It's defined
> with mathematical precision and allows a practically infinite number of
> reverberation (and other special effects) depending on the impulse response
> file used.  Given a specific impulse response it generates an exact effect.
>  For gain effects, the AudioGainNode can have its gain adjusted according to
> standard linear/log curves, or with completely arbitrary curves (which are
> precisely defined as values in an ArrayBuffer).
>
> Chris
>
> [1] http://en.wikipedia.org/wiki/Convolution
> [2] http://en.wikipedia.org/wiki/Convolution_reverb
>
>
>

Received on Tuesday, 18 October 2011 18:57:24 UTC