Designing sinc interpolator for non-integer ratio down sampling of band limited signal

DSP, Plugin and Host development discussion.
RELATED
PRODUCTS

Post

Greatly appreciated, mystran. An incredibly useful explanation. Will probably ask clarifications later.

Post

sault wrote: Imagine you are upsampling by 2... it doesn't work like this

Code: Select all

0 0 1 0 2 0 3 0 4 0 5 0
a b c   e f g
  a b c   e f g
    a b c   e f g
it does look like this, though

Code: Select all

0 0 1 0 2 0 3 0 4 0 5 0
a b c d e f g                = upsampled value between 1 and 2
  a b c d e f g              = upsampled "2" value
    a b c d e f g
Thanks sault. Was discussing upsampling with the sinc tuned to the source samplerate nyquist. So if we zero-stuff the source audio for 2X upsample, the sinc would be tuned to a quarter of the destination samplerate for a "half band filter". Ferinstance, given a source samplerate of 44.1 k, the destination samplerate is 88.2 k, the destination bandwidth is 44.1 k, and the half-bandwidth is 22.05 k.

I don't know if all thangs called "half-band filters" fit the same definition, but the ones I looked at long ago, the response was defined -6 dB at the half-band frequency (1/4 of the samplerate). Even if a half-band filter cutoff slope is crazy steep, it is still only -6 dB at the half-band frequency.

So if an application needs up/downsampling in order to add lots of distortion, then a half-band filter might not be so useful. Maybe you would want to tune the sinc so that it filters -100 dB at 22.05 kHz, rather than -6 dB at 22.05 kHz. But I'm usually trying to keep signals as clean possible, so in that case halfband filters can be useful, perhaps especially if "helped out" with some IIR filters.

For sake of example, here is the first online FIR calculator google turned up-- https://fiiir.com/

If I enter-- filter type: windowed sinc FIR, window type: Kaiser, samplerate: 88200, Cutoff Frequency: 22050, Transition Bandwidth: 22050, Stopband Attenuation: 80 dB

Which yields a half-band filter. The app claims -6 dB at 22.05 kHz and -3 dB at 20 kHz, using 23 coefficients.

With the sinc tuned to half-bandwidth-- Except for the center coff, half of the coefficients are zero. Lets use a notation 00 = zero coefficient or known-zero sample, NZ = Non-zero coefficient or possibly non-zero sample. Some of the source samples POSSIBLY could be zero but we need to multiply those anyway.

Code: Select all

2X Zero-stuffed audio stream, aligned to filter a source sample-- First line is audio, second line is kernel. Only ONE multiplication can possibly have a non-zero result, which is the kernel center coff.
NZ 00 NZ 00 NZ 00 NZ 00 NZ 00 NZ 00 NZ 00 NZ 00 NZ 00 NZ 00 NZ 00 NZ 00 NZ 00 NZ
   NZ 00 NZ 00 NZ 00 NZ 00 NZ 00 NZ NZ NZ 00 NZ 00 NZ 00 NZ 00 NZ 00 NZ
Kernel center-----------------------^^

2X Zero-stuffed audio stream, aligned to filter a zero-stuffed sample-- First line is audio, second line is kernel. Only 12 multiplications can possibly have a non-zero result. We can omit the other multiplications, including the kernel-center multiplication.
NZ 00 NZ 00 NZ 00 NZ 00 NZ 00 NZ 00 NZ 00 NZ 00 NZ 00 NZ 00 NZ 00 NZ 00 NZ 00 NZ
      NZ 00 NZ 00 NZ 00 NZ 00 NZ 00 NZ NZ NZ 00 NZ 00 NZ 00 NZ 00 NZ 00 NZ
Kernel center--------------------------^^
So far as I know, not many sinc alignments allow such large calculation savings. Note that when calculating the zero-stuffed values, the kernel has an odd-length but we are performing an even number of multiplications. Half before the target sample and the other half after the target sample.

Though "probably" any window would work with the "lanczos style" interpolation method-- So far as I can tell, the "lanczos-style" interpolation tunes the sinc to the source samplerate nyquist frequency. For the above half-band filter in the 2X zero-stuffed audio-- If we need an odd-length sinc filter of about the same time duration (therefore about the same filter frequency) at a 44.1 k samplerate-- Rather than the example 23 tap sinc filter at 88.2 k samplerate-- Starting from the center coefficient, delete every alternate coff on both sides of the center, yielding this 44.1 k samplerate kernel tuned to 22.05 kHz-- A kernel of length 11 rather than 23.

00 00 00 00 00 NZ 00 00 00 00 00

That is when the kernel is integer-aligned on any sample at the source sample rate. If the center coff happens to be scaled to 1.0, then no multiplication is even necessary. Just keep the original value.

The nyquist-tuned sinc is zero for all integer sample locations EXCEPT the center coefficient. And it is non-zero in-between the integer sample locations.

So if we want to interpolate for instance the fractional sample location 0.5, we slide the sinc over by 0.5 and calculate what values the sinc would have at all the integer sample locations. We would get 10 non-zero coefficients to multiply against the source samples, half of them up to and including sample location 0.0, and the other half of coffs starting from sample location 1.0. The center coefficient is 1.0. But we do not know the value of the interpolated sample value at sample location 0.5. That's what we are trying to find out. If we just assume we are zero-stuffing that unknown value, then we would be multiplying the center coff by 0.0 anyway, and the result is guaranteed zero, so there's no need to multiply the center coff.

So with the 2X zero-stuffed halfband filter, the kernel is length 23 and the unknown zero-stuffed samples require 12 multiplications to solve. With the non-zero-stuffed interpolation working at the original samplerate, finding any arbitrary interpolation in-between any pair of samples would need 10 multiplications, five on each side of the target location. Or you could use 12 multiplications the same as the zero-stuffed halfband filter, 6 multiplies on each side of the target location. Which would be a kernel length of 13.

So it looks like the "lanczos style" interpolation (even if using some other window) would do about the same number of math operations as the zero-stuffed halfband filter, and maybe it would perform "about the same" too, if you use the same window in constructing both kernels?

Post

@JCJR, I think the transition band should start at 20kHz and end at Nyquist.

Post

keithwood wrote:@JCJR, I think the transition band should start at 20kHz and end at Nyquist.
Thanks keithwood.

I agree that would be much nearer to ideal. Mainly wanted to use a "concrete example" in an ignorant comparison between zero-stuffed halfband filter vs interpolation. Didn't want a concrete example with a zillion coefficients. Was just thinking out loud trying to understand it.

Maybe that lanczos-approach interpolation (with the sinc tuned to the source samplerate nyquist) has about the same kind of "warts" as a 2X zero-stuffed upsample using a half-band filter? Or maybe not.

So far as I know, those kinds of halfband filters, even if they have a very steep rolloff above the cutoff frequency, they are about -6 dB at the cutoff frequency. So maybe a "lanczos-approach" interpolation with sinc tuned to nyquist, would also be -6 dB at nyquist, regardless how steep the attenuation might be achieved above nyquist? Dunno.

Getting real steep attenuation right at nyquist would need more calculations so far as I know. To save on cpu cycles, maybe one could set the cutoff frequency around 10 kHz rather than 20 kHz (target the max attenuation at 22.05 kHz), and IIR boost to try to flatten the response up to maybe 15 kHz or whatever. Many people can't hear much past 10 or 15 kHz, and for instance any music eventually encoded to mp3 will brickwall in the ballpark of about 15 kHz depending on the mp3 bitrate. So depending on the application and audience, it might be "overly ambitious" to try to go flat up to 20 kHz and then roll off to zilch by 22.05 kHz? Dunno.

Or if using a halfband filter that is only -6 dB at 1/4 of the samplerate, one could specify a longer FIR, steeper rolloff than the previous 23 coff example-- Then help it out with an IIR elliptical filter tuned in the ballpark of the 1/4 samplerate frequency.

Maybe Vadim Zavalishin's ZDF state-variable filter would be useful in that frequency range. Tap the SVF lowpass output and mix in "just the right amount" of highpass output. Which would give very steep attenuation near the center frequency, but less attenuation up higher. But up higher where the elliptical filter attenuation begins diminishing, the halfband filter has had a chance to kick-in and begin heavily attenuating the signal. Possibly the elliptical filter Q could also be adjusted to simultaneously help flatten the top octave in the halfband, or maybe not. There would be other IIR ways to try it, but the ZDF state variable might be a good tool to try.

Post

A windowed sinc kernel is generated by creating a sinc kernel with some specific cut-off, then windowing it with a certain algorithm, like Blackman-Harris or Kaiser or Lanczos. Lanczos, for audio, is not a good window. There is a reason why FIIIR (or any other design package you'll find) doesn't offer it. For audio it sucks. Stop focusing on it, you're wasting your time.

A sinc kernel tuned to a 1/integer cutoff has a zero every integer samples away from the center coefficient. This is known as a Nyquist or Lth band filter. A half-band filter is an L=2 Nyquist filter, they're the same thing. Part of the definition of a Nyquist filter is that at the cutoff the attenuation is -6 dB. That's the trade-off. Fewer calculations, fixed -6 at cutoff.

What are your design goals? If you want the least amount of CPU hit, you'll choose a Nyquist filter. If you want the least latency you'll probably choose a minimum phase kernel or use IIRs instead. If you want linear phase but want more attenuation at cutoff then you'll need to either pre-compute your symmetric lowpass coefficients or have algorithms to calculate them at runtime to match the user's samplerate (making sure to have an appropriate non-Lanczos window to go along with).

It really depends upon what you want. It's all a compromise between the number of operations you can afford, the attenuation and passband desired, group delay/latency, linear vs minimum phase, operational complexity, etc.
So depending on the application and audience, it might be "overly ambitious" to try to go flat up to 20 kHz and then roll off to zilch by 22.05 kHz?
If you're using a Nyquist filter, the passband is symmetric around cut-off. If your samplerate is 44.1 khz, Nyquist is 22.05 kHz, a passband starting at 20 khz will hit -6 at 22.05 khz and won't fully attenuate until 24.1 kHz, and the amount of attenuation will depend upon the number of taps and the window chosen.

When I wrote an oversampling library, the design criteria I used were a passband starting at 15 khz, approx 50 dB of attenuation, with the least number of taps it would take to make this happen because I wanted to be able to use it for real-time. I figured that people didn't really hear above 15 kHz, and that relaxed passband made it possible to run a very low latency operation. I ended up with a linear phase 4x upsampler that has less than 1 mS of latency, for example, and that was important for me at the time. Your needs may vary.

Post

sault wrote:A windowed sinc kernel is generated by creating a sinc kernel with some specific cut-off, then windowing it with a certain algorithm, like Blackman-Harris or Kaiser or Lanczos. Lanczos, for audio, is not a good window. There is a reason why FIIIR (or any other design package you'll find) doesn't offer it. For audio it sucks. Stop focusing on it, you're wasting your time.

A sinc kernel tuned to a 1/integer cutoff has a zero every integer samples away from the center coefficient. This is known as a Nyquist or Lth band filter. A half-band filter is an L=2 Nyquist filter, they're the same thing. Part of the definition of a Nyquist filter is that at the cutoff the attenuation is -6 dB. That's the trade-off. Fewer calculations, fixed -6 at cutoff.

What are your design goals? If you want the least amount of CPU hit, you'll choose a Nyquist filter. If you want the least latency you'll probably choose a minimum phase kernel or use IIRs instead. If you want linear phase but want more attenuation at cutoff then you'll need to either pre-compute your symmetric lowpass coefficients or have algorithms to calculate them at runtime to match the user's samplerate (making sure to have an appropriate non-Lanczos window to go along with).
Thanks sault

I guess my writing isn't very clear, because after it was mentioned several pages back that a sinc window isn't the best for audio, I've been referring to it as "lanczos style" interpolation making the assumption that the same kind of interpolation code could work if the coefficients are calculated with some other window than sinc. I might try the same interpolator object with blackman or kaiser or whatever, see what happens.

Though from that wikipedia windowing page mystran recommended-- https://en.wikipedia.org/wiki/Window_function Though there are some windows which look "better performing", many of the listed windows look worse than a sinc window. Sinc window appears "a little better than average" in side-lobe attenuation, and the central peak appears a bit narrower than most of the illustrated windows, which might in some cases be advantageous. Dunno. If I get done with other projects, will try the "lanczos-style" interpolation code with some alternate windows and see what happens. Just for curiosity.

Reasons the "lanczos style" interpolation looked interesting (not the sinc window itelf)--

For upsampling and downsampling, half-band filters look attractive because nearly half of the FIR coffs are zero. However, a quarter-band filter only has one zero coff per each three non-zero coffs, so a quarter-band filter of equivalent performance would use more multiply-add operations. Therefore it may be more cpu-efficient for instance to 4X upsample by doing a pair of 2X upsampling using half-band filters, rather than do a single 4X upsampling using a quarter-band filter. Also, when doing series 2X upsamplings, the subsequent filters don't necessarily need to be as "tight" as the first 2X upsample filter, possibly saving cpu cycles.

A factor which may not matter in practice, but I wondered about re the "series 2X upsample" method-- Because real world filters are never perfect, the first-stage of 2X upsample will make "random" errors on each interpolated sample. Sloppier filters making bigger errors, etc. So the second 2X upsample would also have some degree of numeric errors, and HALF of the source samples for the second 2X upsample pass, already include some degree of error-spread caused by the first-stage of 2X upsample.

So an 8X upsample would have three layers of "small" compounded calculation errors, and a 16X upsample would contain four layers of compounded "small" errors, etc.

On the other hand, if doing a 16 X upsample using that "lanczos style" of interpolation (not necessarily using a sinc window), using 15 FIR tables of nyquist filters tuned to fractional locations 0.0625, 0.125, 0.1875, etc-- All of the interpolated samples would contain slight numeric errors, determined by the quality of the FIR tables. But there would be no "compound errors" calculating errors based on previous errors. All the interpolated samples would be directly derived from the original source audio data, so possibly the "average amount of error" would be about the same for all of the interpolated samples.

The number of calculations per interpolated sample should be the same as a half-band filter of double the nyquist interpolator's kernel length and might be as efficient and "good performing" as a half-band filter which uses the same number of calculatons.

If the nyquist interpolator uses a too-short kernel with too much leakage, for instance it might be economically "helped out" on the resulting 16X stream by running an IIR notch filter at the 16X samplerate, with the notch tuned "a little higher" than the original source nyquist frequency, to help damp the leakage above the original nyquist.

Also, doing it thataway is programmatically simple, or so it seems to me. Regardless, it might be a silly idea. :)

Post

JCJR wrote:I've been referring to it as "lanczos style" interpolation making the assumption that the same kind of interpolation code could work if the coefficients are calculated with some other window than sinc
Perhaps the fact that lanczos is also known as a sinc window is causing some confusion? I was sloppy in the thread title, by sinc interpolator I meant windowed sinc - not a sinc window (well at least not exclusively). Also, yes, any window can be applied to the sinc function.
JCJR wrote:Reasons the "lanczos style" interpolation looked interesting (not the sinc window itelf)
I would suggest the term "windowed sinc" to avoid confusion.

Post

matt42 wrote:
JCJR wrote:I've been referring to it as "lanczos style" interpolation making the assumption that the same kind of interpolation code could work if the coefficients are calculated with some other window than sinc
Perhaps the fact that lanczos is also known as a sinc window is causing some confusion? I was sloppy in the thread title, by sinc interpolator I meant windowed sinc - not a sinc window (well at least not exclusively). Also, yes, any window can be applied to the sinc function.
JCJR wrote:Reasons the "lanczos style" interpolation looked interesting (not the sinc window itelf)
I would suggest the term "windowed sinc" to avoid confusion.
Thanks Matt

According to sault's last message the lanczos interpolation method might be called an L=1 nyquist filter, which I've no reason to doubt. Or using the nomenclature of "half-band" or "quarter-band" filter, maybe call it a "whole band" filter. :)

The frequency response isn't obvious for instance merely by specifying Nyquist as the cutoff frequency in FIIR designer. For one thing, the program doesn't allow typing in nyquist. It will always adjust the cutoff frequency slightly below nyquist, considering a nyquist cutoff frequency as user error. And of course a nyquist-frequency kernel aligned to the integer sample locations is just a bunch of zeros with a 1.0 in the middle anyway. The nyquist kernel coffs only become non-zero when aligned somewhere in-between the integer sample locations.

So that variant seems somewhat nonsensical if viewed as a filter, but makes sense viewed as an interpolator. The frequency response MAY be similar to a half-band filter of twice the kernel length? Or at least I suspect that if the kernel of an L = 1 nyquist filter is solved for fractional location 0.5, you would get about the same coff values as the non-zero coefficients of a half-band filter which is twice as long, assuming that the same window is used to calc both kernels.

IOW, an L=1 nyquist filter used for 2X oversample, of kernel length 13 (12 coffs used to solve each unknown sample), might perform similar to to a 2X oversample on zero-stuffed audio data, using a half-band filter of kernel length 23 (12 non-zero coffs used to solve each zero-stuffed sample).

Or the L=1 nyquist filter of kernel length 25 (24 coffs used to solve each unknown sample), might perform similar to a 2X oversample using a half-band filter of kernel length of 47 (24 non-zero coffs used to solve each zero-stuffed sample).

The spectral response of the L=1 nyquist filter might be similar at any arbitrary intersample interpolation location, not just the 0.5 half-way interpolation location? Or maybe it is more complicated, with different fractional interpolation locations yielding different spectral responses? Dunno.

Post

http://www.arc.id.au/FilterDesign.htmlwill let you design such a filter, with a kaiser window. But what is the point, in the case of a FIR? The center coefficient is valued 1.0 all others are zeroed. I suspect that the frequency responce will change depending on the fractional possition, if we design an interpolator with these parameters. Also the amount of varition will be inversley proportional to kernal length. Well thats my intuition, would have to test it to make sure. Unfortunatly the designer above only allows odd filter length, otherwise you could test the 0.5 offset kernel position (0.5 offset sinc function at least) using even kernel length. The best thing might be to design your own FIR/interpolator code and analyse the results.

Regarding kernals such as L=2, the optimisation of every other coefficient being zero isn't so interesting to me because I use SIMD optimisations for my FIR filters and loose the benifit of the zero coefficients (unless there's an optimised method to zero stuff an SIMD vector?) Plus being forced to exactly half nyquist isn't so nice. But of course each use case is different.
Last edited by matt42 on Wed Mar 29, 2017 10:05 pm, edited 1 time in total.

Post

JCJR wrote:According to sault's last message the lanczos interpolation method might be called an L=1 nyquist filter,
hmmm, could be the wording, but just to clarify - lanczos could be any ratio you choose and so could any other window.

Post

matt42 wrote:
JCJR wrote:According to sault's last message the lanczos interpolation method might be called an L=1 nyquist filter,
hmmm, could be the wording, but just to clarify - lanczos could be any ratio you choose and so could any other window.
Thanks Matt

That is possible I suppose. Am quite ignorant of the topic. So far as I can see, the lanczos interpolation described in this wikipedia article always tunes the sinc to nyquist-- https://en.wikipedia.org/wiki/Lanczos_resampling

When I was experimenting with it for oversampling, did not use huge kernels, but used bigger kernels than the ones shown in the article.

I'd think it a rather crappy interpolator if the spectrum would radically change according to the fractional interpolation location, but maybe so, and maybe it is therefore a crappy way to do it. :)

Post

Deleted
Last edited by matt42 on Wed Mar 29, 2017 11:13 pm, edited 3 times in total.

Post

JCJR wrote: So far as I can see, the lanczos interpolation described in this wikipedia article always tunes the sinc to nyquist-- https://en.wikipedia.org/wiki/Lanczos_resampling
Thanks. It could well be that "Lanczos Interpolation" refers to specifically L=1 cases. (I dunno, but seems so from that artical). But the lanczos window itself could validly be applied to any ratio.

Post Reply

Return to “DSP and Plugin Development”