ISSUE-7: (Bugzilla 17419) Power of Two FFTs for RealtimeAnalyserNode

potanalyser

(Bugzilla 17419) Power of Two FFTs for RealtimeAnalyserNode

State:
CLOSED
Product:
Web Audio API
Raised by:
Jussi Kalliokoski
Opened on:
2012-03-22
Description:
Should arbitrary FFTs sizes be possible in the RealtimeAnalyserNode? Currently they are limited to a Power of Two.

Jussi suggests we may have a better API if we change this behavior; performance being one of the considerations.


http://lists.w3.org/Archives/Public/public-audio/2012JanMar/0464.html
Related Actions Items:
No related actions
Related emails:
  1. [web-audio-api] Power of Two FFTs for RealtimeAnalyserNode (#226) (from notifications@github.com on 2013-09-11)
  2. [Bug 17419] New: Power of Two FFTs for RealtimeAnalyserNode (from bugzilla@jessica.w3.org on 2012-06-05)
  3. Minutes of audio WG teleconference, 2012-04-16 (from olivier.thereaux@bbc.co.uk on 2012-04-19)
  4. Re: [Agenda] W3C Audio WG Teleconference, 16th April 2012 (from tmichel@w3.org on 2012-04-14)
  5. Re: [Agenda] W3C Audio WG Teleconference, 16th April 2012 (from gabriel.cardoso@inria.fr on 2012-04-14)
  6. Re: [Agenda] W3C Audio WG Teleconference, 16th April 2012 (from cwilso@google.com on 2012-04-13)
  7. [Agenda] W3C Audio WG Teleconference, 16th April 2012 (from olivier.thereaux@bbc.co.uk on 2012-04-13)
  8. Telecon Minutes from Today (from al@signedon.com on 2012-03-26)
  9. Re: Schedule: Next Telecon - Mon 26th March (from joe@noteflight.com on 2012-03-26)
  10. Schedule: Next Telecon - Mon 26th March (from al@signedon.com on 2012-03-23)
  11. Re: fftSize requirement for RealtimeAnalyserNode (from jussi.kalliokoski@gmail.com on 2012-03-22)
  12. Re: fftSize requirement for RealtimeAnalyserNode (from al@signedon.com on 2012-03-22)
  13. Audio-ISSUE-7: Power of Two FFTs for RealtimeAnalyserNode [Web Audio API] (from sysbot+tracker@w3.org on 2012-03-22)

Related notes:

Several things arose from the telecon when this issue was discussed:


1) A resolution was made in the that the 2^N behavior (power-of-two) should stand as-is in the Web Audio API for this version of the spec.

The two considerations were performance and the ability to work with arbitrary time-scales closer to a particular tempo. With regards to performance: the position was held by Chris Rogers that the power-of-two FFT size is more efficient. With regards to arbitrary time-scales (not 2^n), the point was made that using Keizer or other window on the FFT was enough to get the frequency data that would be required to meet the use-cases.


2) Jussi agreed that given the use-case, 2^N behavior would be sufficient. Jussi went on to suggest a further FFT node, which would behave differently from the RealtimeAnalyserNode. The following action was taken by Jussie to write up scenario, requirements, and proposal for FFT node case.

http://www.w3.org/2011/audio/track/actions/42


3) It was noted that the web-audio api documentation is out of date with regards to: A) the web-audio api having a fixed range of FFT sizes (128 - 2048)... and B) there is no description of the exception that is thrown when a JavaScript developer attempts to create an out-of-range FFT size.

I created an action for this but I do not have access to the web-audio api document.

Would anyone like to document this for the group?

https://www.w3.org/2011/audio/track/actions/41




From the minutes:

http://lists.w3.org/Archives/Public/public-audio/2012JanMar/0506.html

[AGENDUM 3]

Zakim: agendum 3. "ISSUE-7: Power of Two FFTs for RealtimeAnalyserNode"
taken up [from Alistair]

al: two things here. One is documentation. Seems like there's a range for
size of FFT and this is not in the docs.
al: https://www.w3.org/2011/audio/track/issues/7
al: it would be advantageous if size were not limited to power of 2

jussi: as I said on thread, usually when not running in real time one
usually wants to run on arbitrary time windows
... for these kinds of processes arbitrary FFT sizes would be good

al: are you also saying drop the words "real time"

jussi: if it doesn't affect the behvaior

crogers: one technique is to use a window; even if your FFT is a power of
2, you use a window on a smaller number of samples
... there is an implementation cost for arbitrary sizes, it gets complex.
in the analysis work I've done, it's always been sufficient to use a 2^N
FFT with a smaller sample window

al: question for both of you: does this alter the performance or accuracy
of results?

crogers: performance is certainly different with an arbitrary size, it's
not an FFT any more it's a DFT.
... weird size xforms require math that is a lot slower

jussi: you can usually get away with any non-crazy size

crogers: I haven't seen people using these strange size transforms.
Instead people use Kaiser windows, and so on.

jussi: most FFT libraries don't have anything other than 2^N sizes.

crogers: this node was designed more for real time analysis e.g.
visualizers than for audio processing

al: can we use this for faster than realtime output?

crogers: not right now, really. how would we do faster-than-realtime
output for frequency analysis?

al: I could throw a UC out there. If you were to do some sort of xform to
drive how audio was set up in the future based on how it is now

crogers: a developer once wanted analysis frame faster than RT and store
the results in a range of analysis frames to display as a spectrogram
... you can't really do this faster than RT right now
... the graph in the web audio API is always dealing with what's
happening right now in the time domain. you can't pass freq domain data
around the graph
... that gets really complicated really fast
... the current node is designed primarily for visualizers
... for true spectral-domain processing the current node isn't usable
right now

al: I've been talking with an effects processing company and they're
interested in the W3C audio work.
... they are visualizing different frequencies ahead of time and
adaptively adjusting the audio in response
... would this play into what we're discussing?

shepazu: the current node is for one particular use. we know that there
will be a pile of things that this node and API won't do
... I'm already hearing people say, "do this and this and this" that go
beyond. But implementors are saying, "give us something we can build
straightforwardly"
... I think we should at this point put a pin in this particular point
and make it clear that the case we're optimizing for
... is the case that crogers already spoke to, of visualizers etc. does
this make sense?

crogers: in answer to al, in these types of apps, I've worked on this
kind of thing before at IRCAM. it would analyze a sound file
... and draw a spectrograph that you could draw on to create time-varying
filters, etc.
... I am interested in those kinds of apps but going back to what Doug is
saying, [the Web Audio API] graph is optimized for time-varying signals.
... it gets really complicated when you're tossing in frequency-domain
data as well
... you can certainly do all these things in JS though

shepazu: another part of my point is we dont have to solve everything in
v1. We'll find out what we need as people experiment with
... what we put out. There will be more specs to come.

jussi: I suggest that we change the name of the issue. For the given UCs
the 2^N restriction makes sense. Suggest we propose a new node
... that simply does an FFT and converts from time to freq domain and back
... you could put an FFT node, then a delay, then a reverse FFT

crogers: that would be like a phase vocoder engine. if you are doing freq
domain processing then you have to work with overlapping portions of
incoming audio
... you have to move a sliding window
... this is all very cool but it's more complex than just adding some new
node types

jussi: I am actually talking about non-realtime processing


crogers: I've actually seen people writing these time stretching
algorithms in JS
... you can do this offline

joe: we should make sure these new suggestions are linked to use cases,
to avoid going off track

jussi: if we add an fft node [loud tone intervenes]
... I was going to say if we add an FFT node it's best if it's in the 2nd
version of the spec

shepazu: to be concrete about it: we should have a UC and requirement on
this to take it further
... it sounds like you have a specific suggestion, and you can also put
in your suggested solution to the requirement e.g. an FFT node


[ACTION]

shepazu: action: jussi to write up scenario, requirements, and proposal
for FFT node case
* trackbot noticed an ACTION. Trying to create it.
trackbot: Created ACTION-42 - Write up scenario, requirements, and
proposal for FFT node case [on Jussi Kalliokoski - due 2012-04-02].

al: are we in general agreement that we don't need to make the FFT size
arbitrary

jussi: we don't need to. it doesn't help any of the current UCS


[RESOLUTION]

joe: RESOLUTION: an arbitrary size FFT is not needed for version 1
shepazu: Resolution: an arbitrary-size FFT is not needed for version 1
(per Issue-7)


RESOLUTION: an arbitrary size FFT is not needed for version 1

Alistair MacDonald, 27 Mar 2012, 17:03:44

Migrated to Bugzilla:
https://www.w3.org/Bugs/Public/show_bug.cgi?id=17419

Olivier Thereaux, 7 Jun 2012, 14:15:51

Display change log ATOM feed


Matthew Paradis <matthew.paradis@bbc.co.uk>, Raymond Toy <rtoy@google.com>, Chairs, Chris Lilley <chris@w3.org>, Staff Contact
Tracker: documentation, (configuration for this group), originally developed by Dean Jackson, is developed and maintained by the Systems Team <w3t-sys@w3.org>.
$Id: 7.html,v 1.1 2019/11/12 13:32:38 carcone Exp $