- KVRian
- 996 posts since 9 Jan, 2006

Hi everyone,

I'm designing an interpolator to downsample a signal that is already bandlimited to the target Nyquist, by a non-integer ratio. The ratio is fixed, so I don't need to worry about modulation.

I had a couple of assumptions. The frequency response of the kernel should be flat throughout the passband, and as the signal is bandlimited to the target Nyquist I don't need to worry so much about attenuating/removing frequencies above that point. Is this correct?

However, I found that using smaller kernel lengths increases distortion (I don't think it's aliasing, but I could be wrong?) whilst increasing the length reduces the noise. What is the cause of this distortion?

On the implementation side I wanted to get testing quickly, so I just used a lookup table and LERP the coefficients. Smaller table resolutions increase the noise, but increasing the size to a certain level seems to be accurate enough, as further increases don't improve distortion. To get similar results with polynomials I'd guess I'd have to use fairly high-order piecewise functions? Any other techniques I could use?

Quality-wise, downsampling at a ratio of about 1.8, I found that with a kernel length of around 25 samples distortions are below 60/70 dB (I'm not on my dev machine right now to check). Ideally I'd like to get down below 100. How much difference would the window make? Currently I was using Blackman-Nuttall, or something similar. Any other tips for high-quality results? Presumably further bandlimited upsampling before the interpolation could work, but might not be the most efficient method? And obviously I can keep increasing the kernel length.

Edit: I may have to change the test procedure, as non-integer sine periods will presumably cause leakage during analysis.

Thanks for any help/advice,

Matt

- KVRian
- 996 posts since 9 Jan, 2006

keithwood wrote: Have you done a Bode plot of your filter kernels to see how they compare to an ideal low pass? At 25 samples for a kernel I assume you're using a FIR designer rather than windowing a sinc?

Didn't check that yet, but that's really the part I'm having a hard time understanding. Why do I need to apply a steep filter to a band-limited signal?

For example, if I have a signal at 88200 with no content above 22050 and want to downsample to 44100, an integer ratio, I won't need to filter at all. Why do I need a decent lowpass for non-integer ratios? For example, if I wanted to downsample to 48000.

I guess it's something to do with the way spectral images overlap at integer vs non-integer ratios, but I'm not getting it.

- KVRAF
- 1792 posts since 29 May, 2012

matt42 wrote: Why do I need to apply a steep filter to a band limited signal?

If there is a problem, then that signal is not bandlimited. You already know that, but there is probably something hard to debug about why it is not bandlimited - maybe something about exactly how that interpolation algorithm picks a "sample" to return.

~stratum~

- KVRist
- 51 posts since 24 Dec, 2015, from Bristol, UK

matt42 wrote: Didn't check that yet, but that's really the part I'm having a hard time understanding. Why do I need to apply a steep filter to a band limited signal?

Got me there. In theory decimation should be enough as you've effectively already low passed.

How are you doing your non-integer ratio? Upsampling then downsampling, or something else?

- KVRian
- 996 posts since 9 Jan, 2006

Well, the plan is for a resampler which would upsample, then perform interpolated downsampling to the target sample rate. For initial development I'm using a wavetable of sines at various frequencies and downsampling that. I can verify that the table contains sines of the correct frequencies. I'll have to look again at the interpolation algo and see if there are any bugs in there. It works fine at integer ratios, strangely enough; it must be a weird bug that only shows up at fractional positions.

- KVRist
- 44 posts since 4 Sep, 2014

Interpolation can cause aliasing, and a lowpass that isn't robust enough (too wide a passband, not enough attenuation) can cause aliasing as well by not reducing the foldover enough. Just because you're producing sines doesn't mean it's bandlimited, and the amount of aliasing produced by the interpolation depends on the algorithm and how many taps the kernel has.

Better interpolation algorithms do make a difference. I've been experimenting with an 8x oversampler by cascading a 2x filter with 4x polynomial interpolation. I can see strong aliasing in upper frequencies when using linear interpolation. Trilinear is better, maybe by 10 dB, but a much stronger improvement comes with a 6-pt 3rd order Hermite, a good 20-30 dB improvement. Haven't tested higher-order algorithms, but I would expect to see improvement over that.
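For anyone who wants to reproduce this sort of comparison, here is a minimal sketch (Python, helper names are mine, and note it uses a 4-point 3rd-order Catmull-Rom rather than the 6-point variant mentioned above) that measures worst-case interpolation error on a pure sine:

```python
import math

def lerp(y0, y1, t):
    """2-point linear interpolation at fractional position t in [0, 1)."""
    return y0 + t * (y1 - y0)

def hermite4(ym1, y0, y1, y2, t):
    """4-point, 3rd-order (Catmull-Rom) interpolation at fractional t."""
    c1 = 0.5 * (y1 - ym1)
    c2 = ym1 - 2.5 * y0 + 2.0 * y1 - 0.5 * y2
    c3 = 0.5 * (y2 - ym1) + 1.5 * (y0 - y1)
    return ((c3 * t + c2) * t + c1) * t + y0

f = 0.05                                   # test sine, cycles/sample
x = [math.sin(2 * math.pi * f * n) for n in range(64)]

err_lin = err_her = 0.0
for n in range(2, 61):
    t = 0.5                                # worst case offset for linear
    true = math.sin(2 * math.pi * f * (n + t))
    err_lin = max(err_lin, abs(lerp(x[n], x[n + 1], t) - true))
    err_her = max(err_her,
                  abs(hermite4(x[n - 1], x[n], x[n + 1], x[n + 2], t) - true))
```

At this frequency the linear midpoint error works out to roughly 1 - cos(pi*f) of full scale, while the 3rd-order kernel comes out substantially lower, in line with the dB improvements described above.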

Check the "Polynomial Interpolators for High-Quality Resampling of Oversampled Audio" document for a very good treatment of the issue. I believe that the SNR value in the Summary on p 60 gives a good approximation of the amount of aliasing introduced; it seems to be in the same ballpark as what I'm seeing in the spectrogram in Matlab.

http://yehar.com/blog/wp-content/upload ... 8/deip.pdf

- KVRist
- 51 posts since 24 Dec, 2015, from Bristol, UK

Just to reiterate what sault said: if you're upsampling by zero-stuffing then lowpassing, the quality of your lowpass interpolator at the upsampling stage is really important (and could be the key to your problem). Spectral replications appear when you zero-stuff, so it doesn't matter whether your original signal is bandlimited. For example, if you've got a signal sampled at 44.1kHz and you upsample by 4, you will have spectral images centred around 44.1kHz, 88.2kHz, 132.3kHz, etc. Even if the original signal is lowpassed at 1kHz, you will still get spectral images after upsampling it.
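To make the image replication concrete, here's a small self-contained sketch (plain Python with a naive DFT, purely illustrative): a sine at bin 3 of a 64-sample frame is zero-stuffed by 4, after which the tone shows up at every bin congruent to +-3 modulo 64 - the spectrum is simply repeated 4 times:

```python
import cmath, math

def dft_mag(x):
    """Naive O(N^2) DFT magnitude; fine for a small demo."""
    N = len(x)
    return [abs(sum(x[n] * cmath.exp(-2j * math.pi * k * n / N)
                    for n in range(N))) for k in range(N)]

N, L, k = 64, 4, 3                      # frame length, upsample factor, sine bin
x = [math.sin(2 * math.pi * k * n / N) for n in range(N)]

# zero-stuffing: insert L-1 zeros between consecutive samples
x_up = []
for s in x:
    x_up.append(s)
    x_up.extend([0.0] * (L - 1))

mags = dft_mag(x_up)
# the baseband tone sits at bin k, with images at every bin congruent
# to +-k modulo N, because X_up[m] = X[m mod N]
peaks = sorted(i for i, m in enumerate(mags) if m > 1.0)
print(peaks)  # [3, 61, 67, 125, 131, 189, 195, 253]
```

Without the lowpass after zero-stuffing, all of those image tones survive into the next stage.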

- KVRist
- 31 posts since 18 Sep, 2011

Rather off-topic, but are there open SRC algorithms for non-integer ratios which do NOT use the typical up-/downsampling approach?

Just directly sum the (windowed, SRC-Nyquist-band-limited) sinc functions (aligned to the source sample positions) evaluated at the destination sample-rate positions (non-integer).

I could imagine that this is much more efficient than upsampling/downsampling for higher SRC quality.

With the up/downsampling approach most of the calculated information is just thrown away.
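A naive sketch of that direct approach (Python, unoptimized, function names are made up, Blackman window picked arbitrarily): each output sample is a dot product of the nearby input samples with a windowed sinc evaluated at the exact fractional offset, with the cutoff set to the lower of the two Nyquists:

```python
import math

def sinc(u):
    return 1.0 if u == 0.0 else math.sin(math.pi * u) / (math.pi * u)

def resample_direct(x, ratio, half_taps=16):
    """Direct windowed-sinc SRC, ratio = out_rate / in_rate.
    No intermediate up/downsampling: the kernel is evaluated at the
    exact fractional offset needed for each output sample."""
    cutoff = min(1.0, ratio)            # band-limit to the lower Nyquist
    y = []
    for m in range(int(len(x) * ratio)):
        t = m / ratio                   # output position in input samples
        acc = 0.0
        for n in range(max(0, int(t) - half_taps + 1),
                       min(len(x), int(t) + half_taps + 1)):
            d = t - n
            # Blackman window over |d| <= half_taps
            w = (0.42 + 0.5 * math.cos(math.pi * d / half_taps)
                 + 0.08 * math.cos(2 * math.pi * d / half_taps))
            acc += x[n] * cutoff * sinc(cutoff * d) * w
        y.append(acc)
    return y

ratio = 48000 / 88200                   # a non-integer downsampling ratio
x = [math.sin(2 * math.pi * 0.01 * n) for n in range(2000)]
y = resample_direct(x, ratio)
```

In practice you'd tabulate the kernel per-branch rather than call sin() per tap, but the cost per output sample is the same as one FIR branch either way - which is the sense in which the explicit up/downsample route throws most of its work away.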

- KVRian
- 996 posts since 9 Jan, 2006

sault wrote: Just because you're producing sines doesn't mean it's bandlimited

Well, in my case it does. As I know the sample rate and don't generate sines above the target Nyquist, it is bandlimited.

sault wrote: Check the "Polynomial Interpolators for High-Quality Resampling of Oversampled Audio"

Thanks, it's a decent paper that I'm aware of, but I'd rather build my own windowed sinc interpolator.

keithwood wrote: Just to reiterate what sault said, if you're upsampling by zero stuffing then low passing, the quality of your low pass interpolator at the upsampling stage is really important (and could be the key to your problem) as spectral replications will appear when you zero stuff, so it won't matter if your original signal is band limited i.e. if you've got a signal sampled at 44.1kHz and you upsample by 4, then you will have spectral images centred around 44.1kHz, 88.2kHz, 132.3, etc so even if the original signal is low passed at 1kHz you will get spectral images after upsampling it.

At the moment I'm just dealing with a signal bandlimited to the target Nyquist, for downsampling only. Later, once the downsampler/interpolator is working correctly, the upsampling stage will be zero padding and a FIR filter. This will be the easy part: simply design the FIR with over 100 dB attenuation by the required frequency.

- Mr Entertainment
- 12017 posts since 29 Apr, 2002, from i might peeramid

matt, song of my heart and light of my world, fondest lover, gentle and understanding,

interpolation, basically, *is* all about non-integer conversion. if it wasn't, then it wouldn't be interpolation, would it.

there's a bug in your implementation

you come and go, you come and go. amitabha xoxos.net free vst. neither a follower nor a leader be

tagore "where roads are made i lose my way"

where there is certainty, consideration is absent.

- KVRAF
- 4888 posts since 11 Feb, 2006, from Helsinki, Finland

matt42 wrote:I'm designing an interpolator to downsample a signal, that is already bandlimited to target nyquist, by a non integer ratio. Also the ratio would be fixed, so I don't need to worry about modulation.

If modulation is not a concern, you just need a regular interpolator that band-limits to the target sampling rate (well, half of it anyway), or some approximation of the same.

I had a couple of assumptions. The frequency response of the kernel should be flat throughout the passband. Also as the signal is bandlimited to the target nyquist I don't need to worry so much about attenuating/removing frequencies above that point. Is this correct?

For integer rates this works, because you can just throw away some of the samples. Unfortunately, for non-integer rates you need to interpolate between the actual sample points. This can be thought of, in terms of a rational fraction N/M, as upsampling by M followed by downsampling by N (and with sufficiently large M this can give you an arbitrarily good finite-precision approximation of irrational rates). Unfortunately the upsampling part of the process involves zero-stuffing that replicates the spectra, so you're stuck with the usual brickwall filtering. In practice you obviously want to interpolate directly, but the same rules and the same aliasing still apply, and the ideal interpolation kernel looks like a sinc (surprise).
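The rational-fraction view is easy to make concrete (Python sketch; `rational_factors` is a made-up name):

```python
import math

def rational_factors(fs_in, fs_out):
    """Express fs_out/fs_in as an irreducible fraction M/N: conceptually,
    upsample by M (zero-stuff), brickwall lowpass, keep every Nth sample."""
    g = math.gcd(fs_in, fs_out)
    M, N = fs_out // g, fs_in // g      # up-factor M, down-factor N
    return M, N

print(rational_factors(88200, 48000))   # (80, 147): up 80, down 147
print(rational_factors(88200, 44100))   # (1, 2): plain decimation by 2
```

So 88200 to 48000 is conceptually "up 80, brickwall, down 147", while 88200 to 44100 degenerates to decimation by 2 - which is why the integer case needs no extra filter when the input is already bandlimited.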

However I found that using smaller kernel lengths increases distortions (don't think it's aliasing, but could be wrong?) whilst increasing the length reduces the noise. What is the cause of this distortion?

It can be aliasing or it can be noise from the kernel interpolation.

On the implementation side I wanted to get testing quickly, so I just used a lookup table to LERP the coefficients. Smaller table resolution increases noise, but increasing the size to a certain level seems to be accurate enough as further increases don't improve distortions. To get similar results with polynomials, I guess, I'd have to use highish order piecewise functions? Any other techniques I could use?

Generally speaking, for interpolation (with a precomputed kernel) it's helpful to think in terms of two different numbers: the number of taps-per-branch (how many input samples you look at) and the number of branches. The number of taps-per-branch will determine your attenuation and transition bandwidth (together with design method or window, as per usual for brickwall filters), so this will control the amount of aliasing that happens.

The number of branches will determine the time-domain accuracy of the sampling points (assuming you can't have one branch for every possible time-point, which would be the case for N:M rational situations where N and M are small). If the timing error varies (eg. you just pick the closest branch), this will introduce some noise (depending on magnitude of the error). If you interpolate the kernel itself (other than "nearest"), then the interpolation error from this secondary interpolation will result in some time-variation in the kernel response that will similarly cause some (hopefully less) noise.

This isn't very rigorous I'm afraid, but the point is, when you increase the taps-per-branch (ie. the amount of input you consider) you reduce aliasing and when you increase the number of branches (ie. the time-resolution of the interpolation kernel) you reduce noise. Either one could be a problem. To test which one you're getting, try a slowly frequency-modulated saw and see if you're getting partials moving in the wrong direction (aliasing) or just lots of noise (probably caused by jitter).
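As a sketch of the taps-per-branch / branches split described above (Python, illustrative names, Blackman window standing in for whatever design method you prefer), with linear interpolation of the kernel between branches:

```python
import math

def sinc(u):
    return 1.0 if u == 0.0 else math.sin(math.pi * u) / (math.pi * u)

def build_table(taps=32, branches=64, cutoff=1.0):
    """Polyphase table of a Blackman-windowed sinc: branch b holds the
    kernel sampled at fractional offset b/branches."""
    half = taps // 2
    table = []
    for b in range(branches + 1):       # one extra row so the last branch lerps
        frac = b / branches
        row = []
        for i in range(taps):
            d = i - (half - 1) - frac   # offset from the interpolation point
            w = (0.42 + 0.5 * math.cos(math.pi * d / half)
                 + 0.08 * math.cos(2 * math.pi * d / half))
            row.append(cutoff * sinc(cutoff * d) * w)
        table.append(row)
    return table

def coeffs_at(table, frac):
    """Coefficients for fractional offset frac in [0, 1), lerped between the
    two nearest branches - the 'secondary interpolation' of the kernel."""
    branches = len(table) - 1
    pos = frac * branches
    b = int(pos)
    t = pos - b
    return [a + t * (c - a) for a, c in zip(table[b], table[b + 1])]
```

A quick sanity check for the noise/jitter component: the DC gain of any branch (and of anything lerped between branches) should sit close to 1 regardless of the fractional offset.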

Quality wise down sampling at a ratio of about 1.8 I found that with a kernel length of around 25 samples distortions are below 60/70 dB (I'm not on my dev machine right now to check).

Are you talking about the number of input samples or the total kernel length here? Depending on the transition width you're willing to accept, having 25 taps per branch might be fine. I usually use 32 for most purposes, but generally speaking this would be at the lower of the two sampling rates, and a lower cutoff generally requires a proportionally longer kernel for the same quality (so for 2x it'd be 64, for 4x it'd be 128, and so on). Even then, for a rate of 1.8 the 25 taps (per branch) might be acceptable.

As for the total kernel length, some people suggest that with linear interpolation 64 branches might be enough, but honestly I'd usually go for somewhat more. Somewhere around 4k to 8k branches you can consider dropping the kernel interpolation too (though lerping might be cheaper than the increased cache footprint, YMMV). I'm not entirely sure (and too tired to think) how the actual ratio affects this, as I'm generally used to designing at the lower rate (and using scatter-filters for non-integer downsampling, since this works better in the time-varying case; it's less efficient if you only need a fixed rate though).

Ideally I'd like to get down below 100. How much difference would the window make? Currently I was using Blackman-Nuttall, or something similar. Any other tips for high quality results? Presumably further bandlimited upsampling before the interpolation could work, but might not be the most efficient method? And obviously I can keep increasing the kernel length.

Doing an explicit upsampling + downsampling process is IMHO just a huge waste of CPU with very little to gain and plenty to lose.

As for the other part, for windowed sinc designs of reasonable length the window determines the attenuation more or less directly. There are plenty to choose from, but if you're lazy, try Kaiser and push the alpha up until you get something that works for you (very large values will run into numerical problems, but you should be able to get over 200dB with double precision, at the cost of a rather wide transition bandwidth).
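A Kaiser-windowed sinc needs nothing more than a power series for the modified Bessel function (Python sketch, function names are mine):

```python
import math

def i0(x):
    """Modified Bessel function of the first kind, order 0 (power series)."""
    s, term, k = 1.0, 1.0, 1
    while term > 1e-16 * s:
        term *= (x / (2 * k)) ** 2      # term_k = term_{k-1} * (x/2)^2 / k^2
        s += term
        k += 1
    return s

def kaiser_sinc(taps, alpha, cutoff):
    """Kaiser-windowed sinc kernel; larger alpha buys more stopband
    attenuation at the cost of a wider transition band."""
    half = (taps - 1) / 2
    h = []
    for n in range(taps):
        d = n - half
        w = i0(alpha * math.sqrt(max(0.0, 1 - (d / half) ** 2))) / i0(alpha)
        u = cutoff * d
        s = 1.0 if u == 0 else math.sin(math.pi * u) / (math.pi * u)
        h.append(cutoff * s * w)
    return h

h = kaiser_sinc(33, 9.0, 0.5)           # 33-tap half-band example
```

The series for i0 behaves fine in double precision for moderate alpha; for the very large alphas mentioned above you'd want to be more careful numerically.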

Hopefully some of this was useful.

<- plugins | forum

- KVRian
- 996 posts since 9 Jan, 2006

xoxos wrote: matt, song of my heart and light of my world, fondest lover, gentle and understanding,

Well, someone's been on the sherry.

xoxos wrote: interpolation, basically, *is* all about non-integer conversion

This might be correct on a basic level. But your statement is wrong on a factual level.

xoxos wrote: there's a bug in your implementation

No shit Einstein. What would I be asking for otherwise?

- KVRian
- 996 posts since 9 Jan, 2006

mystran wrote: It can be aliasing or it can be noise from the kernel interpolation.

This is a big part of my question. Other than aliasing due to poor bandlimiting, can a kernel cause distortions, and how? Take linear interpolation: it's just a triangular kernel. Sure, it doesn't bandlimit well and will filter the passband, but would it add distortion to the source signal (assuming the source is bandlimited to the target Nyquist) if we use it to downsample to a target sample rate?

Thanks, mystran, for the detailed reply. That's my main question right now. There's probably more there I'd like to discuss, but I'll have to get back later. Really appreciate your response on this

- Mr Entertainment
- 12017 posts since 29 Apr, 2002, from i might peeramid

matt42 wrote:What would I be asking for otherwise?

irony. but i'm glad to hear the part about the person being wrong is still true.

hey, it's been great reading your posts.

good luck with your sinc interpolation implementation.

you come and go, you come and go. amitabha xoxos.net free vst. neither a follower nor a leader be

tagore "where roads are made i lose my way"

where there is certainty, consideration is absent.
