Comparing phase truncation and interpolation in wavetable synth

DSP, Plugin and Host development discussion.

Post

Hello everyone, I am a hardware guy, but I have started looking into wavetable synthesis, something widely discussed in this forum. I have been reading about the common methods to reduce aliasing and improve sound quality, and I wanted to compare phase truncation and linear interpolation.

At the moment I use 128-sample wavetables with a sampling frequency of 44.1 kHz. The interpolation code looks like this (I know it can be reduced to a simpler equation, but I wanted to give it a try):

Code: Select all

float phase;                                 // current read position in the table
int ind_a = int(phase);                      // sample index below the read position
int ind_b = ind_a + 1;                       // sample index above the read position
float frac_1 = phase - float(ind_a);         // distance past ind_a
float frac_2 = float(ind_b) - phase;         // distance to ind_b
if (ind_b == WAVETB_LEN) ind_b = 0;          // equivalent to wrapping up the table

float output = wavetb[ind_a]*frac_1 + wavetb[ind_b]*frac_2;
The strange thing is that I have noticed that the sound quality is better when just truncating the phase (??) and the aliasing seems to increase when interpolating.

Do you think there is something I am not considering? Many thanks!

Post

I am reading each sample off a single-cycle waveform. I've tried all the different interpolation methods: linear, cosine, Catmull-Rom, sinc, etc., and in my case I've found little to no difference. Only 4x or 16x oversampling followed by downsampling has really made any difference in aliasing or sound quality, as it were.

Please note that I do everything wrong. :lol: I started my plugin to be different and it really is, from the ground up.

As a side note, I believe many of the older hardware samplers used one of these two methods: either rounding the pointer down to the nearest sample or interpolating between two samples, and, paired with their DACs, they sounded great! Sinc interpolation didn't come until later, when samplers were 16+ bit and 44.1k and were kinda boring sounding... (<-my opinion here :D )
I started on Logic 5 with a PowerBook G4 550 MHz. I now have a MacBook Air M1 and it's ~165x faster! So, why is my music not proportionally better? :(

Post

ACR_Crew wrote: Tue Jun 02, 2020 7:28 pm [...]
The strange thing is that I have noticed that the sound quality is better when just truncating the phase (??) and the aliasing seems to increase when interpolating.

Do you think there is something I am not considering? Many thanks!
If it sounds worse with interpolation then I would assume that your interpolation algorithm has a bug. One thing you should check, for example, is that on each loop iteration

Code: Select all

frac_1 + frac_2 == 1
should hold. With the code that you are using to compute the fractions, I am not really sure that's the case. If you have one fractional part frac_a computed correctly, then the other can be computed as

Code: Select all

frac_b = 1 - frac_a
which might turn into more efficient code as well (although compilers seem to be almost magical these days and you never know).
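
Putting those two together, a minimal sketch of the single-weight form (reusing the variable names from your post) could look like this:

Code: Select all

float phase;                                 // current read position in the table
int ind_a = int(phase);                      // sample index below the read position
int ind_b = ind_a + 1;
if (ind_b == WAVETB_LEN) ind_b = 0;          // wrap before reading
float frac = phase - float(ind_a);           // 0..1, distance past ind_a
// the weight grows towards the sample we are moving towards:
float output = wavetb[ind_a] + frac * (wavetb[ind_b] - wavetb[ind_a]);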

How do you test the aliasing? I think the best way is to put a sine wave into your wavetable and play high notes. That way you can see all the aliased images next to the sine's frequency with a spectrum analyzer. If you use a wavetable with lots of harmonics, e.g. a saw wave, they might cover the aliasing and make it hard to judge.
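
For example, a quick test harness along these lines (just a sketch; the table size, note frequency and buffer length are placeholders) makes the images easy to spot:

Code: Select all

#include <cmath>
#include <vector>

int main()
{
    const int   N  = 128;                   // wavetable size
    const float fs = 44100.0f;              // sample rate
    const float f0 = 5000.0f;               // deliberately high test note
    const float pi = 3.14159265f;

    std::vector<float> wavetb(N);
    for (int i = 0; i < N; ++i)             // pure sine: any other peak is aliasing
        wavetb[i] = std::sin(2.0f * pi * float(i) / float(N));

    std::vector<float> out(65536);
    float phase = 0.0f;
    float inc   = f0 * float(N) / fs;       // table steps per output sample
    for (float& s : out) {
        s = wavetb[int(phase)];             // truncation; swap in interpolation here
        phase += inc;
        if (phase >= float(N)) phase -= float(N);
    }
    // feed 'out' to a spectrum analyzer: everything except f0 is aliasing
}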

I guess that you already know that you should later increase the buffer size of the wavetable entries to something like 2048 and also have several buffers for different frequency ranges? If you keep the buffer size at 128 then low notes will sound very dull.

In case you haven't found the page yet, here are some great articles about wavetables and potential implementations:
https://www.earlevel.com/main/category/ ... cillators/
Passed 303 posts. Next stop: 808.

Post

With 128 samples you're going to get a lot of aliasing, and it's generally not enough to, say, capture a sawtooth properly across the frequency range when playing low notes. 2048 seems to be a very popular size for wavetables. I've found that even then there's still aliasing at higher frequencies unless you oversample more. You haven't mentioned it, but you will need to use an FFT to band-limit the tables and store different wavetables for different note frequencies. I've found that one table every 2 or 3 semitones is OK.
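
To illustrate the band-limiting step (just a sketch, using a naive O(N²) DFT for clarity where offline you'd use a real FFT library): keep only the harmonics a given note range can afford and resynthesize the table:

Code: Select all

#include <cmath>
#include <vector>

// Band-limit one single-cycle table to 'maxHarm' harmonics.
std::vector<float> bandLimit(const std::vector<float>& in, int maxHarm)
{
    const int    N  = int(in.size());
    const double pi = 3.14159265358979323846;
    std::vector<float> out(N, 0.0f);

    for (int k = 1; k <= maxHarm; ++k) {
        double re = 0.0, im = 0.0;
        for (int n = 0; n < N; ++n) {        // analysis: project onto bin k
            re += in[n] * std::cos(2.0 * pi * k * n / N);
            im -= in[n] * std::sin(2.0 * pi * k * n / N);
        }
        for (int n = 0; n < N; ++n)          // synthesis: add bin k back in
            out[n] += float((2.0 / N) * (re * std::cos(2.0 * pi * k * n / N)
                                       - im * std::sin(2.0 * pi * k * n / N)));
    }
    return out;                              // DC is dropped, which is usually
}                                            // what you want for an oscillator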

The earlevel article is a pretty good starting place, although his stuff does alias. It's easily fixed by storing larger wavetables for more semitones.

I agree with BlitBit regarding the phase calculation.

Post

JustinJ wrote: Tue Jun 02, 2020 9:39 pm 2048 seems to be a very popular size for wavetables. I've found that even then there's still aliasing at higher frequencies unless you oversample more. You haven't mentioned it, but you will need to use an FFT to band-limit the tables and store different wavetables for different note frequencies. I've found that one table every 2 or 3 semitones is OK.
2048 is popular because it's sort of a "sweet spot" where you have enough samples to cover the audible spectrum down to about 20Hz (eg. 2048*20Hz = 40960Hz which gives a Nyquist around 20kHz). With mipmapping (eg. 1 or 2 per octave) this is enough to give you reasonably clean results with linear interpolation except for the very lowest frequencies (where you have the full or near full set of harmonics), but at that point the "ugly stuff" is usually pretty low level at high frequencies compared to the higher level fundamental at low frequencies so it works out pretty well.

That said, it is absolutely CRITICAL to use mipmaps with cheap (eg. linear) interpolation methods.

Post

BlitBit wrote: Tue Jun 02, 2020 9:02 pm How do you test the aliasing? I think the best way should be to put a sine wave into your wavetable and to play high notes. That way you can see all the aliased images next to the frequency of the sine with a spectrum analyzer. If you use a wavetable with lots of harmonics, e.g. a saw wave, they might cover the aliasing and make it hard to judge.
I would like to argue that saw-waves are actually a better test, because they will also test whether you are band-limiting properly.

The trick with saw-waves (or anything else complex) is to use high enough resolution with an FFT analyzer, put the analyzer in "linear frequencies" mode (rather than the usual logarithmic) and then play a fundamental somewhere around 1-2k and sweep the frequency around slowly: anything that moves too fast or the wrong direction is aliasing. :)

In fact, you can do this even with fairly "noisy" and inharmonic sounds (eg. in FM synthesis, ringmod, filter distortion). In these cases, simply by looking at the spectrum you often don't know what is "correct" and what is aliasing, but when you sweep the whole thing up and down a bit in frequency, it immediately becomes clear which partials move as intended and which ones don't.

Post

Hello everyone, thank you for your inputs. Reading the code again, I realized that there was a mistake in the final interpolation formula: the frac factors were swapped. Wow, now I can hear a massive difference in the sound. :love:

I will now implement cubic interpolation. However, I would also like to try oversampling; I still need to figure out how to implement it. If I understood correctly, I should run the phase accumulator faster and then decimate to go back to 44.1k. Is there any example available that explains the process?

Thank you very much!

Post

My 2 cents is that cubic interpolation (in this particular case anyway; it's worthwhile in other situations) is largely a waste of time, because you can get a bigger improvement simply by using larger wavetables. YMMV.

Post

Testing for aliasing is easy. Raise the pitch, then the aliasing will dive in pitch. And vice versa.
We are the KVR collective. Resistance is futile. You will be assimilated.
My MusicCalc is served over https!!

Post

mystran wrote: Wed Jun 03, 2020 12:17 am 2048 is popular because it's sort of a "sweet spot" where you have enough samples to cover the audible spectrum down to about 20Hz (eg. 2048*20Hz = 40960Hz which gives a Nyquist around 20kHz). With mipmapping (eg. 1 or 2 per octave) this is enough to give you reasonably clean results with linear interpolation except for the very lowest frequencies (where you have the full or near full set of harmonics), but at that point the "ugly stuff" is usually pretty low level at high frequencies compared to the higher level fundamental at low frequencies so it works out pretty well.

That said, it is absolutely CRITICAL to use mipmaps with cheap (eg. linear) interpolation methods.
Yes indeed, it covers the frequency range well.

It all depends on how forensic and fussy you're going to get. I used Voxengo's SPAN, setting the floor to something like -130 dB, and used a saw wave as a test. I didn't want to see any aliasing coming through. 2048 works well for most of the range but IIRC, increasing to 4096 (essentially more oversampling at higher frequencies) completely nails it. Well, for a pragmatic definition of 'completely'.

I also found that 1 or 2 per octave wasn't enough. Extremely low-level aliasing appears as you interpolate between maps within an octave. That's because each map is band-limited and by interpolating between them you're either going to have frequencies that extend just beyond Nyquist (and hence alias) or you're going to fall short of Nyquist, which leaves a gap in frequencies just before Nyquist.
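
For concreteness, by "interpolating between maps" I mean something like this sketch (readLinear and tables are just stand-in names here):

Code: Select all

#include <vector>

// Linearly-interpolated read from one single-cycle table, phase in [0, N).
static float readLinear(const std::vector<float>& t, float phase)
{
    int   a    = int(phase);
    int   b    = (a + 1) % int(t.size());
    float frac = phase - float(a);
    return t[a] + frac * (t[b] - t[a]);
}

// Crossfade between the two band-limited maps that bracket the current note;
// 'mipPos' is fractional, e.g. 3.4 means 40% of the way from map 3 to map 4.
float renderSample(const std::vector<std::vector<float>>& tables,
                   float phase, float mipPos)
{
    int   m = int(mipPos);
    float t = mipPos - float(m);            // blend towards the darker map
    float a = readLinear(tables[m],     phase);
    float b = readLinear(tables[m + 1], phase);
    return a + t * (b - a);                 // the blend region is where the
}                                           // slight aliasing/dulling lives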

One thing I did wonder about was whether it's viable to do the FFT band-limiting in realtime. I know synths like Avenger, Thorn and Europa have spectral filters which fall sideways out of doing this.
But I didn't fancy the idea of calculating an FFT every 16 or 32 samples for every voice to handle modulation!

Post

JustinJ wrote: Thu Jun 04, 2020 11:55 am I also found that 1 or 2 per octave wasn't enough. Extremely low-level aliasing appears as you interpolate between maps within an octave. That's because each map is band-limited and by interpolating between them you're either going to have frequencies that extend just beyond Nyquist (and hence alias) or you're going to fall short of Nyquist, which leaves a gap in frequencies just before Nyquist.
I agree that just 1 per octave is probably not good enough, but with 2 per octave you can get reasonable results. While this leaves a "gap", you can cut its effective size in half by adjusting your interpolation so that you intentionally allow aliasing within that gap. While it isn't perfect, you can see this type of scheme in some popular commercial synths.
JustinJ wrote: Thu Jun 04, 2020 11:55 am One thing I did wonder about was whether it's viable to do the FFT band-limiting in realtime. I know synths like Avenger, Thorn and Europa have spectral filters which fall sideways out of doing this.
But I didn't fancy the idea of calculating an FFT every 16 or 32 samples for every voice to handle modulation!
I could be wrong, but I believe U-he has said that Zebra computes a new band-limited waveform using FFT every N samples (can't remember which value of N, but I think it was fairly high). That said, when I personally tried such a scheme, I found that if you want a reasonably high modulation rate, it actually turns out to be faster to use a direct DFT to compute only those samples (eg. 16 or 32 or whatever) that you need, rather than paying the cost of a full FFT wavetable where you only use a few data points. This is especially so for higher notes where you can cut the harmonic series short, but it's workable even with a full series of 1024 or whatever harmonics you want.

Now, in order for direct DFT to be faster, you certainly need to optimize it. You want to compute the sinusoids using some trigonometric recurrence (ie. derive the sinusoids for the higher harmonics from the sinusoids computed for the lower harmonics) and you also want to SIMD the whole thing (which is fortunately pretty easy). That said, not only does it avoid the uneven CPU load (unless you add latency) from having to FFT full tables, the result is also higher quality as you don't need to interpolate the wavetable at all, because you're directly computing the exact samples that you want.
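
A scalar sketch of that recurrence (no SIMD, 1/k amplitudes for a saw-like spectrum; just to show the idea):

Code: Select all

#include <cmath>

// One output sample, summing 'numHarm' harmonics directly, with only one
// sin/cos evaluation thanks to the recurrence
//   sin((k+1)*p) = 2*cos(p)*sin(k*p) - sin((k-1)*p)
float dftSample(float phase /* radians */, int numHarm)
{
    float c     = 2.0f * std::cos(phase);
    float sPrev = 0.0f;                     // sin(0*phase)
    float sCur  = std::sin(phase);          // sin(1*phase)
    float sum   = sCur;                     // harmonic 1, amplitude 1
    for (int k = 2; k <= numHarm; ++k) {
        float sNext = c * sCur - sPrev;     // sin(k*phase)
        sum  += sNext / float(k);           // 1/k rolloff, saw-like
        sPrev = sCur;
        sCur  = sNext;
    }
    return sum;
}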

Post

ACR_Crew wrote: Tue Jun 02, 2020 7:28 pm At the moment I use 128-sample wavetables with a sampling frequency of 44.1 kHz.
Others have covered the details, and you might have figured out what I'm going to say now on your own, but...

Two things you need table size for:

1) The number of harmonics.

You need more than two samples per harmonic, so a 128-sample table gives you fewer than 64 possible harmonics. That means for a 40 Hz sawtooth, the top harmonic is only around 2.5 kHz. Not only is that on the dull side, but the top will move up and down with the note as though through a lowpass filter, unlike an analog sawtooth, whose harmonics extend throughout your hearing range at any fundamental frequency.

2) Lower error (noise).

Even in the best-case scenario (a sine table), 64 samples only buys you about a -60 dB RMS noise floor with linear interpolation, which is pretty marginal for audio use. For extended harmonics (a band-limited saw table) you need a lot more than that, since the table is effectively shorter.

2048 is about the minimum table size practical for synths, full range and harmonics, even using higher-quality interpolation; better to at least double it. While on paper that's not nearly enough for linear interpolation, practically speaking it's not so bad. But for sines, use at least 256 samples (it's worth going to 512 and being practically perfect for audio), and don't bother playing with less than 2048 for arbitrary wavetables.
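
Putting rough numbers on both points (a sketch; the error line uses the standard worst-case bound h²·max|f''|/8 for linear interpolation of a sine):

Code: Select all

#include <cmath>
#include <cstdio>

int main()
{
    const double pi = 3.14159265358979323846;

    // Point 1: a 128-sample table holds at most 63 harmonics, so a 40 Hz saw
    // tops out around 40 * 63 = 2520 Hz.
    std::printf("top harmonic of 40 Hz saw: %g Hz\n", 40.0 * (128 / 2 - 1));

    // Point 2: worst-case linear-interpolation error for a sine table of size N.
    for (int N : {64, 256, 512, 2048}) {
        double err = 0.125 * std::pow(2.0 * pi / N, 2.0);
        std::printf("N=%5d  peak error ~ %6.1f dB\n", N, 20.0 * std::log10(err));
    }   // prints roughly -58, -82, -94 and -119 dB
}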
Last edited by earlevel on Fri Sep 18, 2020 5:20 pm, edited 1 time in total.
My audio DSP blog: earlevel.com

Post

Hello everyone,

I am back after some months on the same topic. As I mentioned, linear interpolation gave good results, while the cubic one (taken from MusicDSP) seems to improve the low octaves a bit but does not improve the high ones.

I still have issues with the notes from C5 up where I have clear aliasing.

My wavetables are 128 samples wide and my sampling frequency is 44k. If I understand correctly, increasing their size would have a beneficial effect on the sonic performance. If so, why?

I have been reading a lot about oversampling and how it can improve performance; however, I have doubts about how it is implemented in practice. My synthesis engine runs at 44k, generating one sample at a time. If I do 2x oversampling, I guess I need to calculate two samples, but what is not clear to me is which phase accumulator to use. There must be something trivial that I am missing.

Thank you in advance

Post

I answered the 128-sample table question back in June, in the message above your post.
My audio DSP blog: earlevel.com

Post

ACR_Crew wrote: Fri Sep 18, 2020 12:05 pm [...]
I still have issues with the notes from C5 up where I have clear aliasing.

My wavetables are 128 samples wide and my sampling frequency is 44k. If I understand correctly, increasing their size would have a beneficial effect on the sonic performance. If so, why?
128 samples fit 64 harmonics. Now imagine that you put a saw wave with 64 harmonics into the wavetable. What would be the perfect frequency to reproduce it? It's the frequency at which you play one sample after the other as samples are requested. This means that you can imagine the 44100 samples requested in one second as being filled with consecutive copies of your wavetable. How often does a wavetable of size 128 fit into 44100 samples? 44100/128 = 344.53125 times, so at about 344.5 Hz you will get a saw signal with perfect harmonics up to Nyquist (22.05 kHz in this case).

If you only have one wavetable of that small size, it means that if you play lower notes, the upper harmonics will also shift downwards in the spectrum and the oscillator will sound progressively duller, because there are no more harmonics near the Nyquist frequency. For example, if you play a 172 Hz note, the highest frequency in the signal will be around 11 kHz; a note at 86 Hz will only have harmonics up to about 5.5 kHz, and so on.

If you play higher notes with that one wavetable you will get immediate aliasing effects because the highest harmonics would like to "go to the right" but are reflected at Nyquist.

The solution is to have multiple wavetables of larger size which are progressively filtered with regard to the upper harmonics. If you do the calculations above with a wavetable size of 2048 samples, you will find that the lowest note with a full set of harmonics is much lower (44100/2048 ≈ 21.5 Hz).
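
A sketch of building such a set by additive synthesis (hypothetical helper names; one table per octave, halving the harmonic count each level):

Code: Select all

#include <cmath>
#include <vector>

// Build one band-limited saw cycle with 'numHarm' harmonics.
static std::vector<float> makeSawTable(int tableLen, int numHarm)
{
    const double pi = 3.14159265358979323846;
    std::vector<float> t(tableLen, 0.0f);
    for (int k = 1; k <= numHarm; ++k)
        for (int n = 0; n < tableLen; ++n)
            t[n] += float(std::sin(2.0 * pi * k * n / tableLen) / k);
    return t;
}

// One table per octave: level 0 has the full series, each level halves it.
std::vector<std::vector<float>> makeMipmaps(int tableLen /* e.g. 2048 */)
{
    std::vector<std::vector<float>> mips;
    for (int h = tableLen / 2 - 1; h >= 1; h /= 2)
        mips.push_back(makeSawTable(tableLen, h));
    return mips;
}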
ACR_Crew wrote: Fri Sep 18, 2020 12:05 pm I have been reading a lot about oversampling and how it can improve performance; however, I have doubts about how it is implemented in practice. My synthesis engine runs at 44k, generating one sample at a time. If I do 2x oversampling, I guess I need to calculate two samples, but what is not clear to me is which phase accumulator to use. There must be something trivial that I am missing.

Thank you in advance
If you run at twice the sample rate then the phasor will simply run half as fast. So if you play a note at 44.1 kHz where your phasor steps 40 degrees per sample, then at 2x oversampling (88.2 kHz) your phasor will simply step 20 degrees per sample. Because more samples are requested per second, you will go through your wavetable at a slower per-sample pace for the same frequency. This also helps with the interpolation, as it produces smaller errors.
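
In code the per-block structure might look like this (a sketch; renderOscSample stands for your interpolated table read, and the two-tap average is only a crude stand-in for a proper halfband decimation filter):

Code: Select all

float renderOscSample(float phase);         // hypothetical: your interpolated read

// Render one block at 2x the host rate, then decimate back down to 44.1k.
// 'inc44k' is the per-sample phase increment you would use at 44.1 kHz.
void renderBlock(float* out, int numSamples, float& phase,
                 float inc44k, float tableLen)
{
    const float inc2x = 0.5f * inc44k;      // phasor runs half as fast at 88.2k
    for (int i = 0; i < numSamples; ++i) {
        float a = renderOscSample(phase);   // first of the two oversampled points
        phase += inc2x; if (phase >= tableLen) phase -= tableLen;
        float b = renderOscSample(phase);   // second oversampled point
        phase += inc2x; if (phase >= tableLen) phase -= tableLen;
        out[i] = 0.5f * (a + b);            // crude decimation; use a halfband
    }                                       // FIR here for real rejection
}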
Passed 303 posts. Next stop: 808.
