This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 17366 - (OscillatorTypes): Oscillator types are not defined
Summary: (OscillatorTypes): Oscillator types are not defined
Status: CLOSED WONTFIX
Alias: None
Product: AudioWG
Classification: Unclassified
Component: Web Audio API
Version: unspecified
Hardware: PC All
Importance: P2 normal
Target Milestone: WebAudio LC1
Assignee: paul@paul.cx
QA Contact: This bug has no owner yet - up for the taking
Reported: 2012-06-05 11:52 UTC by Philip Jägenstedt
Modified: 2014-10-28 17:16 UTC
6 users



Attachments
Test page that shows the output of different OscillatorNode implementations (34.63 KB, application/x-zip)
2013-08-30 18:04 UTC, paul@paul.cx
Details

Description Philip Jägenstedt 2012-06-05 11:52:48 UTC
Audio-ISSUE-80 (OscillatorTypes): Oscillator types are not defined [Web Audio API]

http://www.w3.org/2011/audio/track/issues/80

Raised by: Philip Jägenstedt
On product: Web Audio API

https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/specification.html#Oscillator

The type constants SINE, SQUARE, SAWTOOTH and TRIANGLE are not defined at all except for their names. A definition would need to specify the ideal waveform including the phase. Preferably, they are specified mathematically.
Comment 1 Ralph Giles 2013-08-12 20:44:31 UTC
I noticed that Chrome's implementation has a number of oddities, like ducking the square wave to avoid clipping, and the triangle wave starting at a different phase than the others.

I'd like to see those regularized rather than standardized as-is.
Comment 2 paul@paul.cx 2013-08-30 18:01:38 UTC
I'd like to get the group's opinion on this, as this is a very noticeable problem. This node is not specified, and implementations have taken different routes to implement the basic waveforms (sine, square, triangle, sawtooth).
Ralph Giles has written a tool to visualize what the implementations are doing. I've modified it to highlight some of the differences, and attached it to this bug. Simply unzip it and load the HTML file in a browser that has OscillatorNode.

First, how the implementations currently work, from reading the code:
- In WebKit/Blink, an inverse Fourier transform is performed to generate a buffer that is then looped over.
- In Gecko, we generate the waveform in the time domain directly.

Then, the perceptible differences:
- All the basic waveforms have the same phase across implementations, except the triangle on WebKit/Blink (which has a +PI/4 phase offset, and maybe an off-by-one). While not noticeable by ear when the signal is not mixed, it means that mixing a triangle OscillatorNode with an OscillatorNode of another type will yield different results across implementations. It forces authors to special-case the triangle oscillator when they want to intentionally dephase two oscillators.
- Blink/WebKit seems to duck its signal (look at the RMS values at the bottom of the page, for pure signals) to avoid the clipping caused by the ripples near the discontinuities.

We should really spec the phase of the basic waveforms.

`phi(t)` is the normalized phase at time `t` (which is in samples), in the range [0, 1). `frequency(t)` and `detune(t)` are the values of the `frequency` and `detune` AudioParams at time `t`. They combine like so:

finalFrequency = frequency * pow(2, detune / 1200.) [per spec]
phi = fmod(t * finalFrequency / AudioContext.sampleRate, 1.0)
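The phase computation above can be sketched as plain JavaScript (my reading of the proposal, not normative spec text; `t` is in samples and `phi` is normalized to [0, 1)):

```javascript
// finalFrequency = frequency * 2^(detune / 1200)  [per spec]
function finalFrequency(frequency, detune) {
  return frequency * Math.pow(2, detune / 1200);
}

// Fractional part of the number of elapsed cycles at sample t.
function phi(t, frequency, detune, sampleRate) {
  const cycles = t * finalFrequency(frequency, detune) / sampleRate;
  return cycles - Math.floor(cycles); // fmod(cycles, 1.0) for cycles >= 0
}

// e.g. a 441 Hz oscillator at 44100 Hz completes one cycle every 100 samples:
phi(0, 441, 0, 44100);  // 0
phi(25, 441, 0, 44100); // 0.25
```

A detune of 1200 cents is exactly one octave, so `finalFrequency(440, 1200)` gives 880.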

Now, the actual definitions:

- "name of the waveform"
amplitude for phase `phi`
(description of the waveform, to be replaced by images in the actual spec text)

- "sine":
sin(2.0 * Pi * phi)
(starts at zero, 1.0 at 0.25, zero at 0.5, -1.0 at 0.75)

- "square":
1.0 if phi is less than 0.5, -1.0 otherwise
(1.0 at zero, -1.0 at 0.5)

- "sawtooth":
2.0 * phi if phi is less than 0.5, 2.0 * (phi - 1.0) otherwise
(0.0 at phi == 0, rising to 1.0 just below phi == 0.5, jumping to -1.0 at phi == 0.5, returning to 0.0 as phi approaches 1.0)

- "triangle": 
if phi is less than 0.25
  4.0 * phi
else if phi is less than 0.75
  1.0 - 4.0 * (phi - 0.25)
else
  4.0 * (phi - 0.75) - 1.0
(0.0 at phi == 0, 1.0 at phi == 0.25, -1.0 at phi == 0.75, 0.0 at phi == 1.0)
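The piecewise definitions above can be sketched as ideal (non-band-limited) reference waveforms, with `phi` normalized to [0, 1) as described; again a sketch of this proposal, not spec text:

```javascript
// Ideal reference waveforms, phi in [0, 1).
function sine(phi)     { return Math.sin(2 * Math.PI * phi); }
function square(phi)   { return phi < 0.5 ? 1.0 : -1.0; }
function sawtooth(phi) { return phi < 0.5 ? 2.0 * phi : 2.0 * (phi - 1.0); }
function triangle(phi) {
  if (phi < 0.25) return 4.0 * phi;                // 0 -> 1
  if (phi < 0.75) return 1.0 - 4.0 * (phi - 0.25); // 1 -> -1
  return 4.0 * (phi - 0.75) - 1.0;                 // -1 -> 0
}
```

With these conventions all four waveforms start at their zero crossing (square, which has no zero sample, starts at +1.0), so mixing two oscillators of different types gives a predictable result.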

Then, onto the ducking problem: I think it would be valuable to have the same RMS level for all waveform types, across implementations, so authors don't have to fine-tune with a gain node to get a uniform output. The RMS level of each basic waveform under the above definitions is well defined; we could use those values as a reference.
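For reference, the RMS levels of the ideal waveforms over one period have closed forms: 1/sqrt(2) ≈ 0.707 for sine, 1.0 for square, and 1/sqrt(3) ≈ 0.577 for sawtooth and triangle. A numerical sanity check (my sketch, using the waveform definitions from this comment):

```javascript
// Ideal waveforms, phi in [0, 1).
const sine     = phi => Math.sin(2 * Math.PI * phi);
const square   = phi => (phi < 0.5 ? 1.0 : -1.0);
const sawtooth = phi => (phi < 0.5 ? 2.0 * phi : 2.0 * (phi - 1.0));
const triangle = phi =>
  phi < 0.25 ? 4.0 * phi :
  phi < 0.75 ? 1.0 - 4.0 * (phi - 0.25) :
               4.0 * (phi - 0.75) - 1.0;

// RMS over one densely sampled period.
function rms(waveform, n = 100000) {
  let sumSq = 0;
  for (let i = 0; i < n; i++) {
    const s = waveform(i / n);
    sumSq += s * s;
  }
  return Math.sqrt(sumSq / n);
}

rms(sine);     // ≈ 0.707 (1/sqrt(2))
rms(square);   // 1.0
rms(sawtooth); // ≈ 0.577 (1/sqrt(3))
```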
Comment 3 paul@paul.cx 2013-08-30 18:04:32 UTC
Created attachment 1392 [details]
Test page that shows the output of different OscillatorNode implementations
Comment 4 Marcus Geelnard (Opera) 2013-09-02 07:40:27 UTC
Paul, I think that specifying the amplitude (i.e. time-domain signal) the way you suggest requires that an implementation does not deal with frequency folding.

An important point of the oscillator node is that it is capable of producing a high quality signal without folding effects. For instance, this requires that the signal is allowed to have ripples (for every wave type except the sine).

I agree that we need to specify:

- The phase of the signal (and it should be consistent between wave forms).
- The signal strength (RMS, or something else).

On the other hand, I'm not sure how to specify the actual time-domain amplitude level. As I've mentioned before, there are a wide range of options for implementing the signal generation logic.

One option could be to specify the signal amplitude as a continuous time signal (i.e. with infinite sample rate), and then specify in what ways an implementation is allowed to sample this continuous time signal.
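The tension Marcus describes can be seen with a toy band-limited square wave built from a truncated Fourier series: keeping only the first few odd harmonics removes aliasing energy but makes the time-domain signal overshoot ±1.0 near the discontinuities (the Gibbs phenomenon). This is a sketch of the general idea, not any browser's actual synthesis method:

```javascript
// Square wave from its first `harmonics` odd Fourier partials:
// (4/pi) * sum over odd k of sin(2*pi*k*phi) / k
function bandlimitedSquare(phi, harmonics) {
  let s = 0;
  for (let k = 1; k <= harmonics; k += 2) {
    s += Math.sin(2 * Math.PI * k * phi) / k;
  }
  return (4 / Math.PI) * s;
}

// Peak absolute value over one densely sampled period.
function peak(waveform, n = 20000) {
  let p = 0;
  for (let i = 0; i < n; i++) {
    p = Math.max(p, Math.abs(waveform(i / n)));
  }
  return p;
}

peak(phi => bandlimitedSquare(phi, 15)); // > 1.0: clips if played back naively
```

This is why a spec that pins down the exact time-domain amplitude would effectively forbid this kind of anti-aliased synthesis, and why implementations that do band-limit may be tempted to duck the output.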

Also, I think that we should decide whether or not it's OK for implementations to use different signal generation methods (e.g. trade quality for performance), or if all implementations must use a specific signal generation method.
Comment 5 Chris Wilson 2013-09-03 19:27:00 UTC
(In reply to comment #4)
> Paul, I think that specifying the amplitude (i.e. time-domain signal) the
> way you suggest requires that an implementation does not deal with frequency
> folding.
> 
> An important point of the oscillator node is that it is capable of producing
> a high quality signal without folding effects. For instance, this requires
> that the signal is allowed to have ripples (for every wave type except the
> sine).

Yes.  It would be a definite quality hit if we were to do this as suggested. Chris (Rogers) spent a ton of time on the anti-aliasing in WebKit/Blink's oscillators; I would not want to lose that. We could specify one particular method (e.g. what WebKit and Blink do), allow either the mathematical definition or a single precise implementation, or define the math and say that actual implementations should approximate it but may improve upon it. Note there was another bug about this, and Chris added text on it quite a while ago - http://www.w3.org/2011/audio/track/issues/85?changelog.

> I agree that we need to specify:
> 
> - The phase of the signal (and it should be consistent between wave forms).
> - The signal strength (RMS, or something else).

Yes to both of these.  Note that the triangle's phase is apparently less standardized elsewhere, but I agree that the phase in our current implementation looks like a bad choice. I think the current phase is off by PI/2 from what it would need to be to match the other waveforms, though?

On RMS - I'm not convinced RMS is a good idea, because I use oscillators to move between values all the time, and I'd want a predictable value.  Perhaps it would be best to consistently duck by some amount?  IIRC, Chris' implementation initially had oscillators go between -0.5 and 0.5, not sure what thought process led to him changing to -1 to 1 with some ducking.

Without looking at the code, I think we may be ducking sawtooth waves as well (since they, too, may clip).
Comment 6 Ralph Giles 2013-09-03 21:03:14 UTC
(In reply to comment #4)
> Paul, I think that specifying the amplitude (i.e. time-domain signal) the
> way you suggest requires that an implementation does not deal with frequency
> folding.

Of course there need to be ripples. But do we need to pre-duck the waveform to avoid clipping during naive playback, or can that be a problem for content authors? This API uses float samples, so there's no problem with excursions beyond 1.0 in the oscnode output; it can be adjusted later by a gain node, etc.

> Also, I think that we should decide whether or not it's OK for
> implementations to use different signal generation methods (e.g. trade
> quality for performance), or if all implementations must use a specific
> signal generation method.

This is a more serious question. Do we mind if synths sound slightly different? What about using an oscnode as an lfo, or an animation driver, like Chris suggested? Definite values are more important then.
Comment 7 Chris Wilson 2013-09-03 21:08:51 UTC
(In reply to comment #6)
> Of course there need to be ripples. But do we need to pre-duck the waveform to
> avoid clipping during naive playback, or can that be a problem for content
> authors? This API uses float samples, so there's no problem with excursions
> beyond 1.0 in the oscnode output; it can be adjusted later by a gain node,
> etc.

It certainly CAN be a problem for content authors; if they don't adjust with a gain node at some point in the chain, they WILL get clipping distortion.  On the plus side, a -1 to 1 oscillator is so relatively loud that I expect most developers DO adjust with a gain node already.  :)

Much like de-zippering, it's really in how much we want to try to make the default case sound good, vs. predictability for advanced use.

> > Also, I think that we should decide whether or not it's OK for
> > implementations to use different signal generation methods (e.g. trade
> > quality for performance), or if all implementations must use a specific
> > signal generation method.
> 
> This is a more serious question. Do we mind if synths sound slightly
> different? What about using an oscnode as an lfo, or an animation driver,
> like Chris suggested? Definite values are more important then.

I do want to separate the issues of signal generation method (e.g. anti-aliasing vs simple math) and the ducking/level issue.
Comment 8 Olivier Thereaux 2014-10-28 17:13:50 UTC
Web Audio API issues have been migrated to Github. 
See https://github.com/WebAudio/web-audio-api/issues
Comment 9 Olivier Thereaux 2014-10-28 17:16:44 UTC
Closing. See https://github.com/WebAudio/web-audio-api/issues for the up-to-date list of issues for the Web Audio API.