- KVRAF
- 3486 posts since 7 Sep, 2002

I'm not specifically working on it at this time, just wanted to hear an opinion. I'm envisioning coding an efficient alias-free wavetable oscillator this way:

1. I create a base wavetable (let's say, a sawtooth) at a predefined base frequency which is an integer fraction of the Nyquist frequency. It's antialiased, of course, and its higher harmonics fade away into the noise floor; I make sure the wavetable length is chosen so that a zero crossing happens at its end. I replicate this wavetable several times, then upsample it by a factor of 2 with a linear-phase filter.

2. I create "low-pass filtered" versions (with a steep linear-phase filter) of this wavetable, with corner frequency placed at Nyquist/2, Nyquist/4, Nyquist/8, etc. Note that all wavetables "run" at 2X sample rate.

3. I discard samples at the beginning and end of the resulting wavetables to remove ripples from the low-pass filter. So, if in step 1 I make 5 replications, in the end I leave only the 3 center replications, which "loop" well.

4. When doing real-time playback-rate modulation, I choose the two adjacent versions that bracket the current rate. I use a Hermite spline on both versions to obtain interpolated values, then cross-mix the two interpolated values based on the rate. I don't need the adjacent-version cross-mix when the playback rate becomes low.

5. I apply 2X halfband polyphase downsampling to the result.

Will this work, or am I overlooking something?
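Step 4 can be sketched like this (a minimal sketch with hypothetical names: `mipmaps` would be the stack of filtered tables from step 2, and `level` a fractional mipmap index derived from the playback rate):

```python
def hermite4(frac, xm1, x0, x1, x2):
    """4-point, 3rd-order Hermite (Catmull-Rom) interpolation between x0 and x1."""
    c1 = 0.5 * (x1 - xm1)
    c2 = xm1 - 2.5 * x0 + 2.0 * x1 - 0.5 * x2
    c3 = 0.5 * (x2 - xm1) + 1.5 * (x0 - x1)
    return ((c3 * frac + c2) * frac + c1) * frac + x0

def read_table(table, phase):
    """Hermite-interpolated read; phase in [0, 1), table treated as cyclic."""
    n = len(table)
    pos = phase * n
    i = int(pos)
    frac = pos - i
    xm1, x0, x1, x2 = (table[(i + k) % n] for k in (-1, 0, 1, 2))
    return hermite4(frac, xm1, x0, x1, x2)

def osc_sample(mipmaps, phase, level):
    """Crossfade between two adjacent mipmap levels.

    'level' is a fractional mipmap index derived from the playback rate,
    e.g. level = max(0.0, log2(rate)); the integer part selects the
    brighter table, the fractional part sets the mix toward the duller one."""
    k = min(int(level), len(mipmaps) - 2)
    mix = level - k
    a = read_table(mipmaps[k], phase)
    b = read_table(mipmaps[k + 1], phase)
    return a + mix * (b - a)
```

The cross-mix is phase-aligned since both tables are read at the same phase, so the fade only morphs the spectrum.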


- KVRist
- 48 posts since 28 Jan, 2013, from Oakland

This seems a bit too complicated, so maybe I missed something?

Here's my approach:

1. Start with a waveform at 2048 samples that you want to anti-alias

2. Create 11 band-limited versions of the waveform, one for every octave of harmonics, using (FFT -> clear frequencies -> IFFT)

So the first waveform has just the fundamental, the second has the fundamental and first harmonic, the third has 4 frequencies, the fourth has 8 frequencies, and so on. The last waveform is the original waveform.

3. For playing, run the oscillator at 2x oversampling. Use the waveform that doesn't alias at your 2x oversampling samplerate.

4. Apply a half-band polyphase IIR decimator

For #3, every ~50 ms I choose a new waveform. If it's different from the last waveform, I fade from the last waveform to the new one over the next 50 ms. In my tests, lots of other wavetable synths use this fade approach.

After doing tests on other synths, I think Serum has an extremely similar algorithm. Massive is similar but might use an FIR decimator instead judging by the group delay. Alchemy uses way more than 11 band-limited versions.
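Step 2 can be sketched like this (a naive DFT stands in for a real FFT library, and all names are illustrative):

```python
import cmath

def dft(x):
    """Naive DFT; acceptable only because table generation runs offline."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft_real(bins):
    """Inverse DFT, keeping the real part (bins are conjugate-symmetric)."""
    n = len(bins)
    return [sum(bins[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]

def bandlimited_versions(wave, levels):
    """FFT -> zero harmonics above a cutoff -> IFFT, one version per octave.

    Version v keeps harmonics 1 .. 2**v (DC is dropped), so version 0 is just
    the fundamental and the last version approaches the original waveform."""
    n = len(wave)
    spectrum = dft(wave)
    out = []
    for v in range(levels):
        keep = min(2 ** v, n // 2 - 1)
        bins = [0j] * n
        for k in range(1, keep + 1):
            bins[k] = spectrum[k]
            bins[n - k] = spectrum[n - k]   # mirror bin keeps the result real
        out.append(idft_real(bins))
    return out
```

In production you'd use a proper FFT; the point is only that zeroing bins above the cutoff gives perfectly periodic band-limited tables with no filter ringing to trim.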


- u-he
- 22230 posts since 7 Aug, 2002, from Berlin

I'd favour CPU over RAM, i.e. have more wavetables, e.g. 3 or 4 per octave, and skip the oversampling part. It's easy to keep aliasing beyond 18kHz this way. I'd crossfade only if necessary. For static waveforms, switching is fine.

A well known, very popular synth has 2 pre-calculated bandlimited wavetables per octave and just switches. Each wavetable is cleverly positioned so that end of bands and lowest possible aliasing meet at around 17kHz (when used at 44.1kHz). I have never seen anyone complain ever.

I recommend not overthinking it. Approach from bottom specs, test a lot, listen to people, adjust if necessary.
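The switching-only selection described above can be sketched like this (the constants here are illustrative assumptions, not the actual values of any particular synth):

```python
import math

def table_index(freq_hz, base_hz=20.0, tables_per_octave=3, num_tables=32):
    """Pick a pre-computed band-limited table for a given oscillator frequency.

    Assumes tables are generated 'tables_per_octave' per octave starting at
    'base_hz', each band-limited so its aliasing stays in the top of the
    spectrum within its band. Rounding up means we never pick a table that
    is too bright for the current pitch."""
    if freq_hz <= base_hz:
        return 0
    steps = math.ceil(math.log2(freq_hz / base_hz) * tables_per_octave)
    return min(steps, num_tables - 1)
```

With 3-4 tables per octave the brightness loss at each switch is small enough that plain switching, without crossfades, is usually inaudible.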


- KVRian
- 831 posts since 19 Dec, 2010

I recommend FFT/iFFT for the filtering part as well. Since FFT is inherently periodic, it does exactly what you want without the need to apply any further tricks, and the CPU cost is rather low.

Regarding switching, in theory it can be audible. But even with just one waveform per octave it requires some effort to come up with a situation where this is really a problem. Plus you can optimize the transitions if you want to. Using 3-4 per octave as Urs recommended is certainly plenty.

Richard


Synapse Audio Software - www.synapse-audio.com

- KVRist
- 48 posts since 28 Jan, 2013, from Oakland

Urs wrote:I'd favour CPU over RAM, i.e. have more wavetables, e.g. 3 or 4 per octave, and skip the oversampling part.

Ah yeah, if you only have a few waveforms, generating way more band-limited versions (and not oversampling) is probably best. I've got one of those wavetable synths with hundreds of frames, so I have to do octaves or I'll use an insane amount of RAM.

I actually could probably get away with spacing them at 1.5 octaves and allow the oversampled version to alias a little bit into the frequency area we're filtering on the downsample.

- KVRian
- 1397 posts since 13 Oct, 2003, from Prague, Czech Republic

I used one mipmap per octave and simply switched between them. I also used 2x oversampling, which I then converted to the final sample rate. No-one ever noticed a thing. Even when I try to listen really carefully for where the switching happens (a very tiny "snap!" sound), it's really hard to find the location by ear. So it works great.

Since I'm working mostly on hardware, I had an extra issue which software usually doesn't: if you use the same mipmap switch frequency going up and going down in pitch, you'll end up with a nasty crackling sound when controlling the oscillator pitch with an analog CV. This is easily fixed by using different switch frequencies for going up and down (i.e., hysteresis).
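The up/down asymmetry can be sketched as a small hysteresis selector (one table per octave assumed; the margin value and base frequency are illustrative choices):

```python
class MipmapSelector:
    """Mipmap switching with hysteresis: the frequency at which we switch
    up differs from the one at which we switch back down, so jitter on an
    analog pitch CV can't toggle rapidly between two tables."""

    def __init__(self, base_hz=20.0, margin=1.06):   # margin ~ one semitone
        self.base_hz = base_hz
        self.margin = margin
        self.index = 0   # table i covers [base * 2^i, base * 2^(i+1))

    def update(self, freq_hz):
        # move up while freq is above the current band's top (plus margin)
        while freq_hz > self.base_hz * 2.0 ** (self.index + 1) * self.margin:
            self.index += 1
        # move down while freq is below the current band's bottom (minus margin)
        while self.index > 0 and freq_hz < self.base_hz * 2.0 ** self.index / self.margin:
            self.index -= 1
        return self.index
```

A pitch hovering right at a band edge now stays on whichever table it last settled on instead of crackling back and forth.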


- u-he
- 22230 posts since 7 Aug, 2002, from Berlin

mtytel wrote:Urs wrote:I'd favour CPU over RAM, i.e. have more wavetables, e.g. 3 or 4 per octave, and skip the oversampling part.

Ah yeah, if you only have a few waveforms, generating way more band-limited versions (and not oversampling) is probably best. I've got one of those wavetable synths with hundreds of frames, so I have to do octaves or I'll use an insane amount of RAM.

I actually could probably get away with spacing them at 1.5 octaves and allow the oversampled version to alias a little bit into the frequency area we're filtering on the downsample.

With scannable wavetables of 100+ frames, instead of the waveform I'd simply store the spectrum of each. Then, each time after processing 256 or so samples, I'd copy a spectrum to a new table, zero any harmonics which would alias, run an iFFT and crossfade between old table and new until another is due.

- KVRAF
- 4955 posts since 11 Feb, 2006, from Helsinki, Finland

One possible approach (if you really don't want to fade between mipmaps) is to just pick a mipmap per note and then keep it no matter what. This results in some aliasing if there are wide-range pitch bends or modulation, but it avoids the problem with clicks at higher frequencies.

Anyway... if you are fading and you're doing some fixed-block modulation scheme, one approach is to simply pick one mipmap per block and do a fade over the block if the mipmap is different from the previous one. This way you don't need to spend CPU for mipmap interpolation unless there is some pitch modulation and the mipmap fade time stays predictable no matter what.
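A minimal sketch of that block-based fade, with a stub oscillator standing in for a real one (all names hypothetical):

```python
class TableOsc:
    """Tiny stub oscillator so the fade logic is runnable:
    two constant 'tables' stand in for real mipmaps."""
    def __init__(self, tables, freq):
        self.tables, self.freq = tables, freq
        self.phase, self.prev_idx = 0.0, 0

    def mipmap_for(self, freq):
        return 1 if freq > 0.25 else 0              # toy selection rule

    def read(self, index, phase):
        table = self.tables[index]
        return table[int(phase * len(table)) % len(table)]

def render_block(osc, block_size):
    """One mipmap decision per block; if it changed since the last block,
    fade linearly from the old table to the new one across the block."""
    new_idx = osc.mipmap_for(osc.freq)
    out = []
    for i in range(block_size):
        s_new = osc.read(new_idx, osc.phase)
        if new_idx != osc.prev_idx:
            t = (i + 1) / block_size                # fade position in block
            s_old = osc.read(osc.prev_idx, osc.phase)
            out.append(s_old + t * (s_new - s_old))
        else:
            out.append(s_new)
        osc.phase = (osc.phase + osc.freq) % 1.0
    osc.prev_idx = new_idx
    return out
```

The second table read only happens during a transition block, so the steady-state cost is a single table lookup per sample.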



- KVRist
- 48 posts since 28 Jan, 2013, from Oakland

Urs wrote:With scannable wavetables of 100+ frames, instead of the waveform I'd simply store the spectrum of each. Then, each time after processing 256 or so samples, I'd copy a spectrum to a new table, zero any harmonics which would alias, run an iFFT and crossfade between old table and new until another is due.

I think it depends what features you want. I've got a unison feature where the voices scan the wavetable at different points. Running an iFFT for every voice, every 256 samples would probably make a CPU (c/f)ry.

A scannable wavetable synth I know of that might do this is Europa, since it doesn't look like it aliases even in the inaudible high end, and it has that spectral filter after the oscillator.

- u-he
- 22230 posts since 7 Aug, 2002, from Berlin

mtytel wrote:Urs wrote:With scannable wavetables of 100+ frames, instead of the waveform I'd simply store the spectrum of each. Then, each time after processing 256 or so samples, I'd copy a spectrum to a new table, zero any harmonics which would alias, run an iFFT and crossfade between old table and new until another is due.

I think it depends what features you want. I've got a unison feature where the voices scan the wavetable at different points. Running an iFFT for every voice, every 256 samples would probably make a CPU (c/f)ry.

I do this with unison, too. The crossfade between old and new is independent of oscillator phase / readout position. (Zebra has worked like this since... forever...)

- KVRAF
- 4955 posts since 11 Feb, 2006, from Helsinki, Finland

Urs wrote:mtytel wrote:Urs wrote:With scannable wavetables of 100+ frames, instead of the waveform I'd simply store the spectrum of each. Then, each time after processing 256 or so samples, I'd copy a spectrum to a new table, zero any harmonics which would alias, run an iFFT and crossfade between old table and new until another is due.

I think it depends what features you want. I've got a unison feature where the voices scan the wavetable at different points. Running an iFFT for every voice, every 256 samples would probably make a CPU (c/f)ry.

I do this with unison, too. The crossfade between old and new is independent of oscillator phase / readout position. (Zebra has worked like this since... forever...)

Another possible variation of this scheme is to compute the iDFT only for the time instants you actually want to sample, using a trigonometric recurrence to get the higher harmonics as powers of the fundamental sinusoid. The nice thing about this is that it has a fixed cost per sample (for a given number of harmonics; obviously you can bail out when you reach Nyquist), which makes it attractive as your mod-frames get shorter: your iFFT work increases while at the same time you end up using a smaller and smaller fraction of the computed results.

It's almost certainly not profitable for a mod-rate of 256 samples, but for one of my prototypes running mod-frames of 64 samples I found it somewhat faster (optimised with SIMD obviously; getting a good computational order is not entirely trivial, but it's something that maps to SIMD really nicely). The obvious downside is that it doesn't allow cheap unison playing back the same table at different positions, but then again for that particular prototype I specifically wanted the spectral effects on per-unison-voice basis anyway (eg. unison index being a mod-source).
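A scalar sketch of that recurrence (the SIMD layout is omitted; names are illustrative):

```python
import cmath
import math

def additive_sample(phase, amps):
    """Sum of harmonic sines with one complex multiply per harmonic:
    harmonic k's phasor is the k-th power of the fundamental phasor,
    built by a running product instead of per-harmonic sin() calls.
    In a real oscillator you'd stop once k * f0 passes Nyquist."""
    w = cmath.exp(2j * math.pi * phase)   # fundamental phasor
    z = w
    total = 0.0
    for a in amps:
        total += a * z.imag               # imaginary part = sine component
        z *= w                            # advance to the next harmonic
    return total
```

One transcendental call per sample plus one multiply-add per harmonic, regardless of the mod-frame length, which is why it wins when frames get short.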


- KVRist
- 462 posts since 4 Apr, 2010

If I may offer, I have a series of articles that discusses tradeoffs and develops a simple but effective wavetable oscillator, very much along the lines of what Urs suggests. The only difference is that I develop a one-table-per-octave oscillator with the aliasing point at one-third the sample rate (14.7 kHz at 44.1k), with the implication that you can use more tables per octave to improve it if you feel the need.

http://www.earlevel.com/main/category/digital-audio/oscillators/wavetable-oscillators/?order=ASC


My audio DSP blog: earlevel.com

- KVRist
- 291 posts since 6 Apr, 2008

Urs wrote:mtytel wrote:

I think it depends what features you want. I've got a unison feature where the voices scan the wavetable at different points. Running an iFFT for every voice, every 256 samples would probably make a CPU (c/f)ry.

I do this with unison, too. The crossfade between old and new is independent of oscillator phase / readout position. (Zebra has worked like this since... forever...)

What size of FFT are we talking about? I played with this idea in the past but figured it to be too expensive. I was assuming a worst-case size of 16384 (1024 harmonics and 8x oversampling to reduce interpolation artifacts). Maybe with this scheme, one should ditch the FFT-oversampling and use the critically bandlimited 2048-sample wavetable with a higher-order interpolation?

- KVRian
- 1017 posts since 9 Jan, 2006

I wonder what kind of interpolation schemes are used? Especially when working at the base sample rate.

From some tests I did resampling a sine wave, I found that, IIRC, a sinc kernel length of over 70 was needed to keep interpolation artifacts very low. I was designing the interpolator for high-quality conversions, so keeping the noise floor 100 dB down would likely be overkill for a synth. What noise floor would be acceptable in general for an oscillator? Below 60 dB?

Anyway, one observation I've made in discussions of interpolators is that noise is discussed in terms of bandlimiting (the source bandwidth, the filtering properties of the interpolation kernel, etc.), but there's not much discussion of the interpolation noise itself.


- KVRAF
- 3486 posts since 7 Sep, 2002

matt42 wrote:I wonder what kind of interpolation schemes are used? Especially if working at base sample rate.

From some tests I did resampling a sine wave, I found that, IIRC, a sinc kernel length of over 70 was needed to keep interpolation artifacts very low. I was designing the interpolator for high-quality conversions, so keeping the noise floor 100 dB down would likely be overkill for a synth. What noise floor would be acceptable in general for an oscillator? Below 60 dB?

Anyway, one observation I've made in discussions of interpolators is that noise is discussed in terms of bandlimiting (the source bandwidth, the filtering properties of the interpolation kernel, etc.), but there's not much discussion of the interpolation noise itself.

You can easily use a 4- or 6-point spline if the wavetable is oversampled (low-pass filtered for mipmapping). Splines are sufficiently good on an oversampled signal, because "audio" splines are usually optimized to produce results close to sinc interpolation. When used on oversampled wavetables, most of the splines' phase problems reside in a spectral region with no signal.
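A quick way to check this claim numerically, assuming a Catmull-Rom 4-point Hermite (one common "audio" spline):

```python
import math

def hermite4(frac, xm1, x0, x1, x2):
    """4-point, 3rd-order Hermite (Catmull-Rom) interpolation."""
    c1 = 0.5 * (x1 - xm1)
    c2 = xm1 - 2.5 * x0 + 2.0 * x1 - 0.5 * x2
    c3 = 0.5 * (x2 - xm1) + 1.5 * (x0 - x1)
    return ((c3 * frac + c2) * frac + c1) * frac + x0

def max_interp_error(cycles_per_table, table_len=64, probes=997):
    """Worst-case Hermite error reading a sine stored at 'cycles_per_table'
    cycles per table; fewer cycles per table = more oversampling headroom."""
    w = 2.0 * math.pi * cycles_per_table / table_len
    tbl = [math.sin(w * i) for i in range(table_len)]
    worst = 0.0
    for p in range(probes):
        pos = p * table_len / probes
        i, frac = int(pos), pos - int(pos)
        pts = [tbl[(i + k) % table_len] for k in (-1, 0, 1, 2)]
        worst = max(worst, abs(hermite4(frac, *pts) - math.sin(w * pos)))
    return worst
```

The worst-case error collapses as the table becomes more oversampled relative to its top harmonic, which is exactly why a short spline suffices on mipmapped tables.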