17378 – (AudioBufferSourceNodePlaybackRate): AudioBufferSourceNode.playbackRate not strictly defined

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Status:	CLOSED WONTFIX

Alias:	None

Product:	AudioWG
Classification:	Unclassified
Component:	Web Audio API
Version:	unspecified
Hardware:	PC All

Importance:	P2 normal
Target Milestone:	TBD
Assignee:	This bug has no owner yet - up for the taking
QA Contact:	This bug has no owner yet - up for the taking

Reported:	2012-06-05 12:00 UTC by Philip Jägenstedt
Modified:	2014-10-28 17:17 UTC
CC List:	5 users

Description Philip Jägenstedt 2012-06-05 12:00:25 UTC

Audio-ISSUE-93 (AudioBufferSourceNodePlaybackRate): AudioBufferSourceNode.playbackRate not strictly defined [Web Audio API]

http://www.w3.org/2011/audio/track/issues/93

Raised by: Philip Jägenstedt
On product: Web Audio API

While it's fairly easy to guess what it should do, playbackRate is not actually well defined. In particular, it should be clear what playbackRate 0 means and whether or not negative rates are allowed. (Given an AudioParam oscillating between -1 and 1 it would be possible to remain in PLAYING_STATE perpetually.)
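
To illustrate the parenthetical concern, here is a minimal sketch. It assumes, hypothetically, that the playback position simply accumulates the rate once per output frame and cannot move before frame 0; the function name and parameters are invented for illustration and are not part of the Web Audio API.

```python
import math

def max_position_reached(buffer_frames, lfo_hz, sample_rate, seconds):
    """Accumulate a playback rate oscillating in [-1, 1].

    Returns (furthest frame reached, whether the end of the buffer was hit).
    Assumption: the position is clamped at frame 0 and playback ends only
    when the position reaches buffer_frames.
    """
    position = 0.0
    furthest = 0.0
    for n in range(int(seconds * sample_rate)):
        rate = math.sin(2.0 * math.pi * lfo_hz * n / sample_rate)
        position = max(0.0, position + rate)
        furthest = max(furthest, position)
        if position >= buffer_frames:
            return furthest, True
    return furthest, False

# A 1 Hz oscillation never gets far into a 44100-frame buffer,
# so under these assumptions the node would never finish playing.
furthest, ended = max_position_reached(44100, 1.0, 1000, 10)
```

Under these assumptions the position swings back and forth near the start of the buffer and never reaches the end, which is the "perpetually playing" case the report asks the spec to address.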

Comment 1 Ehsan Akhgari [:ehsan] 2013-08-01 17:24:34 UTC

I think it probably makes sense to specify a 0 playbackRate to produce silence, and for negative values to play the buffer backwards, that is, from duration to offset (or from loopEnd to loopStart).

Also, note that there is another way that this node can perform resampling: when a doppler shift is applied to it in the presence of a PannerNode.  I think it makes sense to specify what needs to happen based on the product of these two ratios.

Another point brought up on today's call was the handling of values larger than one. I think that can be settled non-controversially by specifying that the final computed sampling rate ratio is multiplied by the sampling rate of the AudioBufferSourceNode's buffer to determine the target sampling rate that the resampler should use.
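
As a worked illustration of the arithmetic described in this comment, a minimal sketch follows. The function and parameter names are hypothetical, and treating the doppler shift as a plain multiplicative ratio is an assumption made only for this example.

```python
def target_sample_rate(playback_rate, doppler_ratio, buffer_sample_rate):
    """Combine both ratios, then scale by the buffer's own sampling rate.

    Assumption: the doppler shift can be expressed as a single ratio
    that multiplies with playbackRate.
    """
    combined_ratio = playback_rate * doppler_ratio
    return combined_ratio * buffer_sample_rate

# playbackRate 2.0, no doppler shift, 44100 Hz buffer
rate = target_sample_rate(2.0, 1.0, 44100)  # 88200.0
```

A negative or zero playbackRate gives a negative or zero result here, which is exactly the case this bug asks the spec to define.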

Comment 2 Olivier Thereaux 2013-08-02 09:12:14 UTC

Per our meeting on 2013-08-01 (http://www.w3.org/2013/08/01-audio-minutes.html), this is a call for volunteers to suggest a patch to the web audio API spec to define the expected behaviour when setting negative values for AudioBufferSourceNode.playbackRate.

Comment 3 Ehsan Akhgari [:ehsan] 2013-08-02 14:25:44 UTC

(In reply to comment #2)
> Per our meeting on 2013-08-01
> (http://www.w3.org/2013/08/01-audio-minutes.html), this is a call for
> volunteers to suggest a patch to the web audio API spec to define the
> expected behaviour when setting negative values for
> AudioBufferSourceNode.playbackRate.

Hmm, this is what I was hoping to do in comment 1.  :-)  Do we absolutely need a patch here?  I think it probably makes sense to have the basic discussion first and then move to the exact prose when everybody is on the same page about what we want.  Do you agree?

Comment 4 Joe Berkovitz / NF 2013-08-02 15:21:38 UTC

I was thinking about volunteering a patch but reached the same conclusion as Ehsan: we need to have a basic discussion first.

The basic outline of my proposal is different from Ehsan's but similar in spirit (I think):

- define an "effective sampling rate" that is the product of the playbackRate AudioParam and the base sampling rate of the underlying AudioBuffer.

(NB I would not include the effect of downstream resampling-like effects such as doppler shifts as I think this may lead to confusion over the behavior of graphs with branched routing. It seems harder for developers to predict what will happen.)

- At each sample frame, determine a buffer playback quantum that advances a notional "playback cursor" associated with the buffer. The quantum is in fractional sample frames and is (effective sampling rate) / (context sampling rate).

- Prior to starting playback the cursor is initialized to the starting offset of the buffer. When playing, the cursor advances by the playback quantum; a positive quantum moves the cursor forwards while a negative quantum moves the cursor backwards. The data window between the previous and newly updated cursor position is used to interpolate the value of the next sample frame. If the cursor moved forwards, sample frames in the window are considered in forward order; if backwards, sample frames in the window are considered in reverse order. Such interpolation may take advantage of the contents of the previously calculated data window for the buffer [I'm not going to attempt a description of how interpolation actually works here].

- Within the loopStart/loopEnd region, the cursor can only advance by cycling through the loop forwards or backwards. The data window for resampled rendering includes the contents of these cycles.

- If prior to the loop region, the cursor cannot move backwards past offset 0 of the buffer -- i.e. the left edge of the data window is clamped to offset 0.

- If the cursor moves forwards past the end of the buffer, playback ends (I am sure the spec has better language describing this transition).

- If the cursor does not move during the rendering of a frame because the effective sampling rate is 0, then the data window starts and ends at a single point within the buffer, so the output is the single [interpolated] value at that buffer offset. This implies that a zero playback rate will produce a DC output with the value at the current notional position of the cursor.
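
The cursor rules listed above can be sketched roughly as follows. This is only an illustration under stated assumptions: the proposal deliberately leaves the interpolator unspecified, so linear interpolation is swapped in here as a stand-in; one sample is read at the cursor before each advance; and the names `sample_at` and `render` are hypothetical.

```python
import math

def sample_at(buffer, pos):
    """Stand-in interpolator: linear interpolation at a fractional frame."""
    i = min(int(math.floor(pos)), len(buffer) - 1)
    j = min(i + 1, len(buffer) - 1)
    return buffer[i] + (buffer[j] - buffer[i]) * (pos - i)

def render(buffer, buffer_rate, context_rate, rates, offset=0.0, loop=None):
    """Advance a notional playback cursor once per output frame.

    rates holds one playbackRate value per output frame; loop is an
    optional (loop_start, loop_end) pair in frames.
    Returns (samples, ended).
    """
    cursor = float(offset)
    out = []
    for rate in rates:
        out.append(sample_at(buffer, cursor))
        in_loop = loop is not None and loop[0] <= cursor < loop[1]
        # Playback quantum: (effective sampling rate) / (context sampling rate).
        cursor += (rate * buffer_rate) / context_rate
        if in_loop:
            # Cycle through the loop region, forwards or backwards.
            cursor = loop[0] + (cursor - loop[0]) % (loop[1] - loop[0])
        elif cursor < 0.0:
            cursor = 0.0  # cannot move backwards past offset 0
        elif cursor >= len(buffer):
            return out, True  # moved forwards past the end: playback ends
    return out, False

ramp = [0.0, 1.0, 2.0, 3.0]
held, ended = render(ramp, 44100, 44100, [1.0, 1.0, 0.0, 0.0, 0.0])
# held == [0.0, 1.0, 2.0, 2.0, 2.0]: a zero rate holds a DC value
```

With a loop of `(1, 3)` and a rate of -1 starting at offset 2, the same function cycles backwards through frames 2 and 1 without ever leaving the loop region.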

Comment 5 Ehsan Akhgari [:ehsan] 2013-08-02 19:18:05 UTC

Doesn't this assume a linear interpolating resampler?  The resampler that we use in Gecko is much more complicated (and of higher quality as a result) than that!  (It's the libspeex resampler.)

If we're going to make it possible for implementations to compete on the resampler quality, assuming the resampling algorithm seems like a mistake.

Comment 6 Joe Berkovitz / NF 2013-08-02 19:34:03 UTC

@Ehsan: I had no intention of assuming any particular algorithm (and tried to call this out -- sorry if it was unclear). Of course linear interpolation is not a preferred choice. I suppose that a literal interpretation of my proposal could suggest linear interpolation, but that was not the intention.

The proposal specifies a sequence of {data window, effective sampling rate} pairs with fractional sample-offset boundaries that form the input to an arbitrary interpolation algorithm. How the interpolator makes use of this sequence is not a concern. A nonlinear interpolator can work with as much of the sequence as it likes, processing arbitrarily large batches of data points at a time.
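
As a rough illustration of that idealized sequence, the sketch below produces one {data window, effective sampling rate} pair per output frame and leaves interpolation entirely to the consumer. The function name is hypothetical, loop handling is omitted, and clamping the cursor at offset 0 is the only boundary rule modelled.

```python
def data_windows(buffer_rate, context_rate, rates, offset=0.0):
    """Return a list of ((window_start, window_end), effective_rate) pairs.

    Window boundaries are fractional frame offsets into the buffer; a
    window whose end precedes its start means the cursor moved backwards.
    """
    cursor = float(offset)
    windows = []
    for rate in rates:
        effective = rate * buffer_rate
        nxt = max(0.0, cursor + effective / context_rate)
        windows.append(((cursor, nxt), effective))
        cursor = nxt
    return windows

windows = data_windows(44100, 44100, [1.0, 0.5, 0.0, -1.0])
# [((0.0, 1.0), 44100.0), ((1.0, 1.5), 22050.0),
#  ((1.5, 1.5), 0.0), ((1.5, 0.5), -44100.0)]
```

An interpolator of any order could consume as many of these pairs at once as it needs, which is why the proposal does not have to name a particular algorithm.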

Of course in practice an implementor would probably not accumulate such a sequence and apply an interpolation algorithm to it; this is an idealized behavior for specification purposes.

If this approach turns out to be too naïve I welcome an improved recasting of it. I think the important aspect of it has to do with the way that playback progress through the buffer is affected by a time-varying playback rate, and I found an idealized cursor the easiest way to express this progress.

Comment 7 Chris Wilson 2013-08-05 01:49:06 UTC

+1 to Joe's general idea - I also do NOT agree that playbackRate < 0 should change where the cursor starts; other than that, I think we're all on the same page.

Comment 8 Joe Berkovitz / NF 2013-08-05 17:51:36 UTC

Just to amplify Chris's comment: apart from my attempt to tease out a more detailed spec of playbackRate, the main behavioral difference in my proposal from Ehsan's is that a negative playbackRate does not cause playback to start at a different point than it would have otherwise. playbackRate determines the time derivative of a "playback path" through the buffer, but not the origin of that path, which remains the buffer offset as specified in the start() call (which defaults to 0).

If we want the ability to start playing a buffer from the end, I think there's a clearer and more explicit way to do that: attach that interpretation to a negative "offset" parameter passed to AudioBufferSourceNode.start(). I don't feel strongly that we need that feature but I do think we should avoid overloading the meaning of playbackRate w/r/t start offsets.
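
The distinction between the origin of the playback path and its direction can be sketched as below. This is a simplified illustration that assumes the buffer and the context share a sampling rate, so the cursor moves by exactly `rate` frames per output frame; `cursor_path` is a hypothetical helper, not an API.

```python
def cursor_path(offset, rates, buffer_frames):
    """List the cursor positions visited.

    offset alone fixes where the path begins; each rate only sets the
    size and direction of the next step.
    """
    cursor = float(offset)
    path = [cursor]
    for rate in rates:
        cursor = max(0.0, cursor + rate)  # clamped at offset 0
        if cursor >= buffer_frames:
            break  # moved past the end: playback ends
        path.append(cursor)
    return path

from_start = cursor_path(0, [-1.0] * 3, 4)  # [0.0, 0.0, 0.0, 0.0]
from_end = cursor_path(3, [-1.0] * 3, 4)    # [3.0, 2.0, 1.0, 0.0]
```

In this model a negative rate on its own never relocates the cursor to the end of the buffer; reverse playback of the whole buffer needs an explicit starting offset.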

Comment 9 Ehsan Akhgari [:ehsan] 2013-08-08 03:21:06 UTC

I think I was unclear about what I meant, sorry about that.  In the first paragraph of comment 1, I meant to describe the cursor jump boundaries, not that the playback should *start* at `duration'.  In other words, I meant to propose exactly the same thing as Joe described better in terms of the cursor concept.  In light of comment 6, I believe we're mostly proposing the same thing (with my proposal intentionally not talking about the details of the resampling, and with Joe's proposal doing a much better job describing the cursor concept, etc.)

Comment 10 Olivier Thereaux 2014-10-28 17:14:10 UTC

Web Audio API issues have been migrated to Github. 
See https://github.com/WebAudio/web-audio-api/issues

Comment 11 Olivier Thereaux 2014-10-28 17:17:03 UTC

Closing. See https://github.com/WebAudio/web-audio-api/issues for up to date list of issues for the Web Audio API.