A matrix approach to the Moog ladder filter

DSP, Plugin and Host development discussion.

Post

I've been unsatisfied with the current state of implementations of the Moog ladder filter. The basic implementation is to just cascade four one-pole IIR filters and loop the resonance back (with an added unit delay). This is conceptually simple, but the resonant peaks are not properly tuned, the Q decreases as the cutoff frequency goes up (the parameters are not independent), and the fact that each stage depends serially on the previous one is not great for fast implementation, especially on SIMD.

So I had another go at the math, and discovered that by encoding the entire state transition of the filter as a matrix, I could compute it more accurately, with no tuning problems at all. And the central operation of the new approach is a 4x4 matrix multiplication, which on any given SIMD architecture is going to be one of the most highly tuned and optimized operations possible.
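To make the contrast concrete, here is a rough Python/numpy sketch of both inner loops. The coefficient g, feedback gain k, and the matrix A and input vector B are just placeholders for illustration, not the values derived in the paper:

import numpy as np

def naive_ladder(x, g, k):
    # Naive ladder: four cascaded one-pole lowpasses, with the resonance
    # fed back through a unit delay. g = one-pole coefficient, k = feedback gain.
    y = np.zeros(4)
    out = np.empty_like(x)
    for n in range(len(x)):
        u = x[n] - k * y[3]        # feedback taken from last sample's output
        y[0] += g * (u - y[0])     # each stage depends serially on the previous one
        y[1] += g * (y[0] - y[1])
        y[2] += g * (y[1] - y[2])
        y[3] += g * (y[2] - y[3])
        out[n] = y[3]
    return out

def matrix_ladder(x, A, B):
    # State-space ladder: the whole per-sample update is one 4x4
    # matrix-vector product plus an input term (A is 4x4, B has length 4).
    y = np.zeros(4)
    out = np.empty_like(x)
    for n in range(len(x)):
        y = A @ y + B * x[n]       # single matvec: maps directly onto SIMD multiply-accumulates
        out[n] = y[3]
    return out

The point is that the body of the second loop is a single small matrix-vector product, which is exactly the kind of operation SIMD hardware is built for.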

I wrote a rough draft of a paper, which is not quite up to academic publication standards, but I think gets the idea across.

http://levien.com/ladder.pdf

The basic idea seems so simple I keep thinking it must have been done before, but I haven't found anything quite like it. I think the basic idea would work for discretization of other similar filter designs.

Should I try to publish this in a conference, maybe? Happy to hear any feedback that people might have.

Post

Non-linear state-space Moog filter models have already been proposed by Thomas Hélie:

here: http://dafx.ca/proceedings/papers/p_007.pdf

and here: http://recherche.ircam.fr/pub/dafx11/Papers/84_e.pdf

The second paper has a strong emphasis on stability analysis.

Post

raphx wrote:I've been unsatisfied with the current state of implementations of the Moog ladder filter. The basic implementation is to just cascade four one-pole IIR filters and loop the resonance back (with an added unit delay). This is conceptually simple, but the resonant peaks are not properly tuned, the Q decreases as the cutoff frequency goes up (the parameters are not independent)
i once wrote a paper on how to fix the tuning and resonance errors by taking the unit delay into account in the one-pole coefficient and feedback gain calculations:

http://www.rs-met.com/documents/dsp/Res ... Filter.pdf

however, nowadays, as i have the zero-delay-feedback technique in my toolbox, i consider this approach obsolete. you might be interested in vadim zavalishin's excellent book about this stuff:

http://www.kvraudio.com/forum/viewtopic.php?t=350246

Post

raphx wrote:Should I try to publish this in a conference, maybe? Happy to hear any feedback that people might have.
personally, i don't care whether something is published in a conference paper or just on a private website or blog or whatever. however, i certainly appreciate it when someone shares his research results. so, thanks for writing this paper. it's always good to see different approaches.

Post

i have one question: you say you exponentiate the transition matrix and can thereby obtain arbitrarily close approximations to the ideal behavior. can i interpret this exponentiation as a shortcut to oversampling - such that your time-delta becomes, for example, the 1024th fraction of the sampling period?

Post

Yes, you can think of it as specifying a 1/1024 time delta for the state transition matrix. The "trick" is that, instead of running the transition 1024 times to simulate running it for the full time delta, you can get the same result by squaring the matrix 10 times repeatedly. I'm not sure exactly what this is called, although I've seen a similar technique used for approximating eigenvectors.
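In numpy terms, with M standing in for some 1/1024-of-a-sample transition matrix (just a made-up one here), the trick looks like this:

import numpy as np

np.random.seed(0)
M = np.eye(4) + 0.001 * np.random.randn(4, 4)   # stand-in small-step transition matrix

A_slow = np.linalg.matrix_power(M, 1024)        # run the small step 1024 times...

A_fast = M.copy()
for _ in range(10):                             # ...or square it 10 times, since 2**10 == 1024
    A_fast = A_fast @ A_fast

print(np.allclose(A_slow, A_fast))              # True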

Thanks to both of you for the rich links. I'll read through them and figure out if what I've got still makes sense.

Post

raphx wrote:Yes, you can think of it as specifying a 1/1024 time delta for the state transition matrix. The "trick" is that, instead of running the transition 1024 times to simulate running it for the full time delta, you can get the same result by squaring the matrix 10 times repeatedly. I'm not sure exactly what this is called.
when i called it a "shortcut to oversampling", i meant that i get the same result as if i had operated the filter at 1024 times the actual samplerate. yes - i think we mean the same thing. nice trick indeed.

Post

Robin from www.rs-met.com wrote:
raphx wrote:Yes, you can think of it as specifying a 1/1024 time delta for the state transition matrix. The "trick" is that, instead of running the transition 1024 times to simulate running it for the full time delta, you can get the same result by squaring the matrix 10 times repeatedly. I'm not sure exactly what this is called.
when i called it a "shortcut to oversampling", i meant that i get the same result as if i had operated the filter at 1024 times the actual samplerate. yes - i think we mean the same thing. nice trick indeed.
I agree, this is a nice trick. What I'm surprised about is that the "naive" oversampling really works, because I would have expected it to introduce some kind of aliasing effect on the frequency response, causing it to distort (of course, no aliasing of the signal itself, since the system is LTI).

By "naive" oversampling I mean S&H for the upsampling and sample dropping for the downsampling. One probably could think of introducing better resamplers here, but then you can't simply square the matrix anymore.

Also, do I understand it right that the nonlinearities are applied without oversampling here? It would be interesting to know whether this noticeably changes the sound or not.

Yet another thought: as the oversampling factor increases, the transition matrix will probably converge. Maybe it's even possible to derive an analytical expression for the limit?

Regards,
{Z}

Post

Z1202 wrote:Yet another thought: as the oversampling factor increases, the transition matrix will probably converge.
i had the same thought. what i wonder about is: as we increase the oversampling factor, the delta-t variable should approach zero, which makes the transition matrix approach the identity matrix. i must have something wrong.

or wait: probably, the alpha is inversely related to delta-t?

Post

The transition matrix does converge, and I finally found the mathematical concept I was looking for: matrix exponentiation. This is known to be a general solution to systems of linear differential equations.

Basically, the full transition matrix looks like this:

A = lim_{n->\infty} (I + J/n)^n

where J is the Jacobian of the system. This is the obvious generalization of the usual definition of exp(x).
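Writing n as a power of two makes the connection to the squaring trick above explicit:

(I + J/2^k)^(2^k) = (...((I + J/2^k)^2)^2 ...)^2    [k squarings]

so the 1/1024-step matrix squared 10 times is just the n = 1024 term of this limit, i.e. an approximation to exp(J) that improves as k grows.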

Here's the best paper I found on computing matrix exponentials:

http://eprints.ma.man.ac.uk/634/01/cove ... 06_394.pdf

And this one is also interesting:

http://www.cs.cornell.edu/cv/researchpdf/19ways+.pdf

And the use of the matrix exponential is known in IIR filtering (this reference compares direct, sample-and-hold, and bilinear techniques for discretizing a continuous-time filter - and it briefly addresses the question that Z1202 raised about the effect that S&H has on the frequency response):

http://www.idsc.ethz.ch/Courses/signals ... otes10.pdf

So if I do publish this, I'll be sure to update it to be in terms of the correct mathematical concepts.

From this literature, it looks like "scaling and squaring" is definitely considered one of the basic techniques for solving matrix exponentiation, although I see that the most sophisticated techniques start off using a Pade approximant first, followed by repeated squaring. For this particular application, it seems from experiment that it converges quickly even without that, and I personally find the simplicity of the pure squaring technique very appealing.
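For example, here is a quick numpy check against scipy's expm (which internally uses the Pade-plus-squaring approach). The Jacobian below is the standard linearized ladder - four one-pole stages at cutoff wc with feedback gain k into the first stage - which may differ in detail from the exact matrix used in the paper:

import numpy as np
from scipy.linalg import expm

wc = 2 * np.pi * 1000 / 44100        # cutoff in radians per sample
k = 3.5                              # feedback (resonance) gain
J = wc * np.array([[-1.0,  0.0,  0.0,   -k],
                   [ 1.0, -1.0,  0.0,  0.0],
                   [ 0.0,  1.0, -1.0,  0.0],
                   [ 0.0,  0.0,  1.0, -1.0]])

s = 10
A = np.eye(4) + J / 2**s             # Euler step at 1/1024 of a sample period
for _ in range(s):
    A = A @ A                        # pure repeated squaring, no Pade approximant

print(np.max(np.abs(A - expm(J))))   # small: the two agree to several decimal places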

Lastly, to Z1202's question on the effect of oversampling: I think listening tests are the only way to really resolve that question. We know that without any oversampling, you're still doing a tanh() on the input before the rest of the filter, so any harmonics it creates above the Nyquist frequency will be aliased. Other than that, from looking at plots it seems robust, but plots and sounds may well tell different stories. Certainly for the highest quality you would want to oversample, but computing 5 tanh approximations per (oversampled) sample is turning into a lot of computation, and I want this to run smoothly on cell phones, with polyphony at that.

Post

I've read the links (at least at a pretty good skim) and have a much better idea what's new, what's old, and how to understand this stuff. Also, it's nice to meet you more deeply - I didn't realize that Z1202 was the same Vadim that wrote that book.

I meant to say matrix exponential above, not matrix exponentiation. The usual notation for this is exp(A). The scaling-and-squaring method is particularly simple but is not necessarily the fastest or most numerically stable (especially with single precision arithmetic, which you want for SIMD, while my numpy prototype is using doubles). I think it's worthwhile to separate that out and treat it as a primitive.

The discretization technique is called "impulse invariance" (which is somehow related to the matched z-Transform, but I don't quite understand this yet). The biggest drawback by far is the aliasing of the impulse response. In this particular application, we win because the impulse response we're sampling has a strong (24dB/octave) lowpass slope. So the amount of energy that gets converted into aliasing is small enough not to matter. For a high-pass filter, the technique probably won't work well. But, as Vadim points out, it's almost certainly possible to resample the impulse response with a more sophisticated filter, at the expense of losing the simple exponential structure.

The zero-delay stuff with the bilinear transform is definitely valid. Both seem to strongly preserve the essential properties of the filter, including stability. I think that at heart, discretization forces a tradeoff. You can either have frequency warping and no aliasing (BLT), or aliasing and no frequency warping (impulse invariance). For sharp lowpass filters, the latter feels like a good tradeoff.

If you're oversampling (which you have to do to suppress aliasing from the nonlinear components), then I think the aliasing of the impulse response becomes even less of a problem. But I still maintain that it's useful to do the linear case without it.

To me, the biggest advantage of the matrix exponential approach is the simplicity, especially going from the filter topology and parameters to the state space transition matrix. The exponentiation implicitly solves the entire system of equations in one go, in strong analogy to the way that 1 + x + x^2 + x^3 ... solves 1/(1-x) for |x| < 1.

So I'm still very much liking this approach for handling the Moog filter with varying control parameters. You absolutely can build a delay-free network with BLT-transformed one pole filters, and then do the frequency warping to preserve the resonant frequency (or, alternatively, do a LUT to invert the natural BLT frequency warping), but I'm not sure anyone would characterize that as simple.

The other design choice is whether to use direct IIR forms (for example, in cascade, as in Antti's 2004 paper), or a state transition matrix operating directly on the state space. If you're optimizing strictly for the number of multiplies, the direct forms are obviously superior, and I think this is why they dominate the classical literature. On modern hardware, a cascade of IIR's gives you serial dependencies, which can cause pipeline stalls and make it harder to parallelize. By contrast, the state space approach requires N^2 multiplies (as opposed to 2N for the direct form IIR evaluation), but these multiplies are arranged in an absolutely ideal form. And N=4 is a special sweet spot for your typical SIMD chip.

So I think these are the wins and losses:

Win: Simple conversion from filter parameters to state transition matrix
Win: Accurate tuning, no frequency warping
Win: Q is completely independent of frequency, no compensation needed
Win: Generalizes to other (low-pass) filter topologies
Win: State space evaluation is faster than cascaded IIR's

Lose: The impulse response is sampled without band-limiting, so the frequency response is aliased
Lose: Number of multiplies is higher than IIR, maybe slower on scalar hardware

I'm certainly not going to claim that the state space approach or the use of matrix exponential is new. The Hélie references clearly show both. But it's not a simple approach. Does anyone here know how to translate that approach into working, efficient SIMD code? I'm willing to accept that it's possible, but can only see the way murkily.

So overall, this is still looking pretty appealing to me as an implementation technique for the Moog and now I'm thinking the SVF bandpass and lowpass as well.

Post

The discussion is far over my head but could you tell me the meanings of SIMD, SVF, LTI?

Thanks!

Post

Sure. SIMD = Single Instruction, Multiple Data. It's a popular extension to computer architectures that lets them do a bunch of operations in parallel, rather than one at a time. Typically multiply-accumulate (one multiply, one add) is one of the operations that's made to go insanely faster than regular C code. It's difficult to program, but the speedup is often worth it. SSE, AVX, and NEON are all implementations of the SIMD concept, and they have more similarities than differences, but it's (sadly) not really possible to write code for one and have it work across different CPU families.

SVF = State Variable Filter. It's one of the other major filter types in analog synthesizers, besides the Moog ladder filter. One advantage it has is that it can support highpass, bandpass, and notch modes, all using the same circuitry. To my ears, it doesn't hold as much resonant energy as a ladder filter when swept, but this is somewhat subjective.

LTI = Linear Time Invariant systems. These are dramatically easier to analyze and manipulate than systems that have nonlinearities. Among other things, if you do LTI systems right you can never actually add aliasing - any frequency that comes out is a frequency that was present in the input.

Don't ask me to explain Volterra kernels, though :)

Post

So practically speaking where/how do you get access to SIMD?

Are graphics processors considered SIMD? Are they available as PCI daughter cards? Or do you need to get a prototyping kit (eg that you hook to your PC, download code and so on) or even make your own prototype?

Post

Swiss Frank wrote:So practically speaking where/how do you get access to SIMD?

Are graphics processors considered SIMD? Are they available as PCI daughter cards? Or do you need to get a prototyping kit (eg that you hook to your PC, download code and so on) or even make your own prototype?
Most common processors support SIMD now - both the Intel chips you find in PCs and the ARM chips in mobile devices like the iPad. You can code for them directly in their native instruction set, but more often you use higher-level compiler intrinsics, or even higher-level libraries that wrap them, like Apple's Accelerate library or Intel's Performance Primitives package.

You can get an idea here:

http://software.intel.com/en-us/article ... algorithms

However, if you're just getting started with DSP coding I'd recommend you don't worry about this just yet.
