KVR Audio

Kallyn · Post by **Kallyn** » Mon Feb 12, 2024 11:57 pm

Hey!
Please forgive me if at any point of this post I am an idiot, I am very new to DSP.

I would like to make a Python script for FL Studio's Fruity Convolver to automatically convert a loaded sample to a Minimum Phase IR. I am trying to replicate the functionality of MCabinet that Au5 does in this video (which as I understand, converts the audio sample to a minimum phase impulse response). From a bit of research, I have found that they are using a Linear Phase filter which is converted to Minimum Phase through some operations. I was able to find this website which has the full conversion from linear phase to minimum phase within 4 lines.

My issue is, I don't understand what any of these steps are actually doing. I would use a library and skip this whole process, but I would like to distribute this script so others can use it without having to set up a crazy library. Ideally, all the code would be contained within one single .pyscript file that people can run on a selection of audio and get the minimum phase IR.

When I first found this rabbit hole, I created a script which would take the FFT of the first 2^n samples in a selection, find the absolute values of all the complex numbers, and do the IFFT on this new array of only real-valued data, which would result in a sample that was as close to an impulse as I could make it, but retained the tonal aspects of the source sample. Since then, I have learned a lot about FFTs and I realized I was doing it without windowing, which made the quality of the output a lot worse. Once I implement that, I am assuming that I need to perform the operations from the Codepad site on this output to get the minimum phase IR. Is this correct? Can this be reasonably programmed without external libraries?

Again, forgive me if I am rattling nonsense because I really don't know what I am doing, I am just using my knowledge of music production more than my year of programming experience to try and find a solution. I would greatly appreciate some guidance in the right direction.

Thanks!
Kallyn

AnalogGuy1 · Post by **AnalogGuy1** » Tue Feb 13, 2024 7:17 am

Sorry if this sounds overly simplified; I'm not sure where you are in your learning. A FIR filter's difference equation will look like y[n] = a0 x[n] + a1 x[n-1] + a2 x[n-2] + ... aN x[n-N] for an Nth order filter, where x[n] is your input signal and y[n] is your output signal.

In the frequency domain, that would be, using the same coefficients,
H(z) = a0 + a1 z^-1 + a2 z^-2 + ... aN z^-N

If you solve the above equation for all the values of z for which H(z)=0 you will have the zeros of your transfer function. They may be complex, so you can plot them in the complex plane: real axis is horizontal, imaginary is vertical. The zeros will always be either real or in complex conjugate pairs (e.g. in polar coordinates, if there's one at 2 angle 30 degrees, there will be one at 2 angle -30 degrees.)

If the filter is linear phase, they'll also be in reciprocal magnitude pairs, so, for instance, one at 2 angle 30 degrees will also have one at 1/2 angle 30 degrees (and therefore by the first rule also at 2 angle -30 deg and 1/2 angle -30 deg). For every zero outside the complex unit circle, there will be one zero inside it.

If the filter is minimum phase, all the zeros will be inside the unit circle. To convert a linear phase to minimum phase filter, take each zero outside the complex unit circle (with a magnitude of greater than 1) and reflect it to its reciprocal position inside the unit circle (e.g. a 2 angle 30 becomes 1/2 angle 30). That's all there is. If you want to find the difference equation, just do the polynomial multiplication of those zeros, as you factored to find them.

At least, in theory. In reality, there's numerical stability issues for bigger filters, so you want to convert things not all at once into a single huge polynomial, but into little stages. IIR filters are typically converted into biquads aka second order stages; FIR will tolerate bigger stages before having numerical issues.

Z1202 · Post by **Z1202** » Tue Feb 13, 2024 7:55 am

What's said above is correct, but IMHO gives little insight to what the linked code is doing. I'm not too much into this area, but it looks to me like the code is using the idea of the cepstrum, which is a common trick to avoid polynomial factoring for minimum phase conversion. There are other forum members with a more in-depth knowledge on these matters, so I'll leave it to them to comment on the details.

Edit: actually, I think i made some mistake in my explanation (well, as I said, I'm not too familiar with this area). It's probably not cepstrum per se, but the Hilbert transform of the log spectrum, which is kind of performing an equivalent (I think) task here. But it's more or less the same idea, maybe even computationally identical (depending on how they implement the hilbert function).

Kallyn · Post by **Kallyn** » Wed Feb 14, 2024 12:18 am

AnalogGuy1 wrote: ↑Tue Feb 13, 2024 7:17 am Sorry if this sounds overly simplified; I'm not sure where you are in your learning.

You are very right in oversimplifying, I have no idea what I'm doing.

My intent is to scan a selection of an audio file, taking an FFT of each window in the scan (using proper windowing functions etc etc), and storing an array containing the maximum amplitude of each bin, completely ignoring phase. Once I have the max amplitude of each bin for the entire selection, I will somehow turn that back into audio again. I was originally just doing an inverse FFT right there, getting a linear phase impulse with lots of smearing. Are you saying that I can get the zeros of the array of max magnitudes, do that unit circle flipping thing, and do an ifft to get the minimum phase impulse?

Assuming that's right, are there any python libraries that can do all this for me but don't require an intense setup process? Otherwise, I'm in for a LOT of programming.

mystran · Post by **mystran** » Wed Feb 14, 2024 10:12 am

Kallyn wrote: ↑Wed Feb 14, 2024 12:18 am Are you saying that I can get the zeros of the array of max magnitudes, do that unit circle flipping thing, and do an ifft to get the minimum phase impulse?

Just forget this whole root-finding business. You'd have to find those roots numerically, likely in arbitrary precision math to deal with numerical instability and it might still not work. I believe in order to find duplicat

Fortunately, just about the only situation where one typically needs to have a true minimum-phase filter is when you actually want to invert it by turning the FIR into an inverse IIR. In this case we need the zeroes to be inside the unit circle so that when they become poles, they are causally stable.

Rest of the time, what we want is "more or less" minimum-phase and that's where the cepstrum approach comes in. The short version (there are a few equivalent variations) is take FFT, take logarithm of magnitude, take IFFT into cepstrum, zero out negative time-offsets, take FFT (we now have log magnitude in real component and phase in imaginary component), take complex exponent to get a "normal" spectrum and then take IFFT back into time-domain. I have no idea what "hilbert" does in numpy exactly, so I'm guessing the code does some slight variation of the general idea (in a sense the "zero one half" business is a Hilbert transform of the log-spectrum done with FFT, so perhaps "hilbert" either saves some code or uses an alternative method), but the point is, numerically the result is "more or less" minimum-phase. It suffers from temporal aliasing, so it won't be perfect, but some zero-padding before you start will help with that.

Since there's a bunch of math involved, doing this "manually" in Python would likely be slow and there are a few details that can be tricky to get right... so if the code works, I'd just use it as-is. If you want an actual indepth explanation of why this method gives you minimum-phase, I'd probably consult some actual mathematicians... but for engineering purposes the important thing is mostly that it works "reasonably well" as long as you don't need provably minimum-phase for stability purposes.

Z1202 · Post by **Z1202** » Wed Feb 14, 2024 1:55 pm

mystran wrote: ↑Wed Feb 14, 2024 10:12 am If you want an actual indepth explanation of why this method gives you minimum-phase, I'd probably consult some actual mathematicians...

Off the top of my head, the guess is kinda following. A causal FIR transfer function can be expanded into a product of factors (1-z_i/z), or (z-z_i)/z. We can more or less ignore the denominator (it doesn't affect the minimum phase property) and concentrate on the numerator, which is a product of (z-z_i). By taking a log of it we turn the product into a sum of log(z-z_i). Now log(z-z_i) has a singularity (acting in a way like a pole) at z=z_i. For "minimum phase zeros" this singularity is inside the unit circle (where I'd guess we obtain a causal sequence upon inverse z-transform, although off the top of my head I'm not 100% sure how to demonstrate it, but for a usual pole that would be the case already), for "maximum phase zeros" we respectively obtain an anti-causal sequence. By flipping (or erasing, in case of symmetric inverse z-transform result) the anti-causal sequence we turn it into a causal one, respectively obtaining "minimum phase zeros". The fact that we have a sum of terms log(z-z_i) combined with the linearity of z-transform allows us to transform all "maximum phase zeros" at once, by simply flipping/erasing the entire anti-causal part). Technically we can use Fourier transform instead of z-transform, which is what the linked code is doing.

There are some details which I skipped (the causality of log(z-z_i)'s transform as well as the n log(z) term arising from the pole), so there might still be some formal loopholes.

mystran · Post by **mystran** » Wed Feb 14, 2024 4:09 pm

Z1202 wrote: ↑Wed Feb 14, 2024 1:55 pm By flipping (or erasing, in case of symmetric inverse z-transform result) the anti-causal sequence we turn it into a causal one, respectively obtaining "minimum phase zeros".

As a practical matter, because we must somehow deal with the case where we happen to have a zero exactly (or numerically speaking "too close") to the unit circle, it generally makes sense to take the logarithm of the magnitude rather than a proper complex logarithm, because this way you can just bias your logarithms in one direction by actually computing .5*log(re*re+im*im+eps) which effectively turns the IR into linear phase first, which means the cepstrum is necessarily symmetric.

Then (assuming my memory serves me correctly; this is the sort of algorithm you work out once and then you just treat it as a black box) by erasing one half of the cepstrum, we actually lose half the magnitude unless we compensate (that's what the "flipping" strategy avoids, I think), but conveniently there's a .5 term in front of the logarithm that we can drop to obtain said compensation.

Since after log(re*re+im*im+eps) we have purely real spectrum, I guess we could further optimize by replacing the IFFT by IDCT-I? That one I haven't tried, but it seems like it could save a few cycles (well, if we assume it takes half the time of a real-to-half-complex transform which we do 4.. then we'd save about 1/8 of the total transform work?) if you ever need to do this in a tight loop.

Kallyn · Post by **Kallyn** » Wed Feb 14, 2024 9:37 pm

mystran wrote: ↑Wed Feb 14, 2024 10:12 am The short version (there are a few equivalent variations) is take FFT, take logarithm of magnitude, take IFFT into cepstrum, zero out negative time-offsets, take FFT (we now have log magnitude in real component and phase in imaginary component), take complex exponent to get a "normal" spectrum and then take IFFT back into time-domain.

AHA! These are words I understand. Supposing I can find an FFT algorithm that doesn't take too much time or memory, I can't imagine those steps should take terribly long. That would mean the main time/mem drain should be the other steps. I'll look into libraries or algorithms to do this when I get the chance.

Bear in mind this isn't for real-time processing, this is a script to run within Fruity Convolver. It doesn't have to be too fast, just fast enough that it doesn't become tedious to use on many different samples.

mystran · Post by **mystran** » Thu Feb 15, 2024 4:46 am

Kallyn wrote: ↑Wed Feb 14, 2024 9:37 pm
mystran wrote: ↑Wed Feb 14, 2024 10:12 am The short version (there are a few equivalent variations) is take FFT, take logarithm of magnitude, take IFFT into cepstrum, zero out negative time-offsets, take FFT (we now have log magnitude in real component and phase in imaginary component), take complex exponent to get a "normal" spectrum and then take IFFT back into time-domain.
AHA! These are words I understand. Supposing I can find an FFT algorithm that doesn't take too much time or memory, I can't imagine those steps should take terribly long. That would mean the main time/mem drain should be the other steps. I'll look into libraries or algorithms to do this when I get the chance.

Given that the code you linked in the original does all the required steps on one line using numpy and doesn't really seem to require anything else (for the minimum-phase conversion) .. I'd probably just use the same approach if you're doing a Python script.

Kallyn · Post by **Kallyn** » Thu Feb 15, 2024 6:20 am

I'm SO CLOSE but I'm getting overflow errors at the last step!!! :(((
I took to ChatGPT to help explain what you suggested and with some of my own interpretation I was able to get this code:

Code: Select all

fftOutput = fft(input) # take fft

fftOutput = [math.log(abs(i)) for i in fftOutput] # take log of magnitude of the fft

cepstrum = ifft(fftOutput) # take ifft into cepstrum
        
for i in range(len(cepstrum)//2, len(cepstrum)):
   cepstrum[i] = 0 # zero out negative time-offsets
   # chatGPT helped me decipher this step, could I have done something wrong?
        
fftCep = fft(cepstrum) # take fft of cepstrum with no negative time-offsets

spectrum = [cmath.exp(complex(i.real, i.imag)) for i in fftCep]
# take complex exponent to get "normal" spectrum
# ^^ THIS STEP CAUSES OVERFLOW ERROR

output = ifft(spectrum) # take fft back into time-domain

I can tell it is getting SO close, if I can just figure out how to get past the overflow error. I was hoping I could scale something somewhere, but I either get an overflow error or complete silence in my output, no in between.

mystran wrote: ↑Thu Feb 15, 2024 4:46 am
Kallyn wrote: ↑Wed Feb 14, 2024 9:37 pm
mystran wrote: ↑Wed Feb 14, 2024 10:12 am The short version (there are a few equivalent variations) is take FFT, take logarithm of magnitude, take IFFT into cepstrum, zero out negative time-offsets, take FFT (we now have log magnitude in real component and phase in imaginary component), take complex exponent to get a "normal" spectrum and then take IFFT back into time-domain.
AHA! These are words I understand. Supposing I can find an FFT algorithm that doesn't take too much time or memory, I can't imagine those steps should take terribly long. That would mean the main time/mem drain should be the other steps. I'll look into libraries or algorithms to do this when I get the chance.
Given that the code you linked in the original does all the required steps on one line using numpy and doesn't really seem to require anything else (for the minimum-phase conversion) .. I'd probably just use the same approach if you're doing a Python script.

I'd love to do that, but I can't use external libraries like scipy or numpy. I plan to make this available for people without programming experience to use, and I know people won't want to download giant python libraries just to use this little script. Plus, as far as I can tell, I've already got most of the steps complete without external libraries. As i mentioned before, this definitely doesn't need to be real-time, and the version I have made works in 1 or 2 seconds which is definitely fine for my use case.

mystran · Post by **mystran** » Thu Feb 15, 2024 12:15 pm

Kallyn wrote: ↑Thu Feb 15, 2024 6:20 am I'm SO CLOSE but I'm getting overflow errors at the last step!!! ((

See my remarks about "too close to zero" above. The logarithm of a zero is -inf (or I guess the point at infinity for the complex case). The fix is to add a small bias that is large enough that the logarithm doesn't get large enough to actually cause mayhem. Something like 1e-10 (= -200dB) should usually be workable in double precision, though making it slightly bigger (eg. 1e-6 maybe) can reduce rounding errors in the FFTs when you're (probably) not trying to get 200dB SNR anyway.

Once you actually get results and they don't look right.. there's a slight detail I skipped: if you're just zeroing the negative offsets without touching the positive ones, you might need to multiply the DC-bin and the Nyquist-bin (or offset 0 and N/2 depending on how you look at it) by 0.5 'cos in a sense they are half positive- and half negative-side.

Kallyn · Post by **Kallyn** » Thu Feb 15, 2024 5:37 pm

Turns out I forgot to scale the output of my IFFT so I was trying to take the exponential of numbers > 10000, which is obviously gonna give overflow. I did still add the 1e-6 before the log step just in case people run into errors in the future. I could tell that the issue didn't come from taking the log of 0 because the line that kept giving errors was the one that converted it back to the normal spectrum by taking the exponential of the cepstrum fft. It's a REALLY good feeling finally having a working prototype. I will look into finding FFT algorithms that are more efficient time- and space-wise, as well as check out that IDCT transform you mentioned previously. Thanks so much for your help! I'll credit all of you in the script :)))))

juha_p · Post by **juha_p** » Thu Feb 15, 2024 8:47 pm

Here's one I've bumped to (Octave code):

Code: Select all

function [hn_min] = lp_fir_2_mp_fir(hn_lin)

% This function helps in converting the linear phase 
% FIR filter to minimum phase FIR filter. It essentially 
% brings all zeros which are outside the unit circle to 
% inside of unit circle.

r 	= roots(hn_lin);
k	= abs(r) > 1.01;
r1  	= r(k);		% roots outside of unit circle.
r2	= r(~k);		% roots on or within the unit circle
s	= [1./r1; r2];
hn_min	= poly(s);
hn_min	= hn_min*sqrt(sum(hn_lin.^2))/sqrt(sum(hn_min.^2));

mystran · Post by **mystran** » Fri Feb 16, 2024 3:05 am

Kallyn wrote: ↑Thu Feb 15, 2024 5:37 pm as well as check out that IDCT transform you mentioned previously.

Just to make sure: this was just a wild idea and I'm not sure if this would actually work...

Kallyn · Post by **Kallyn** » Sat Feb 17, 2024 1:02 am

mystran wrote: ↑Fri Feb 16, 2024 3:05 am
Kallyn wrote: ↑Thu Feb 15, 2024 5:37 pm as well as check out that IDCT transform you mentioned previously.
Just to make sure: this was just a wild idea and I'm not sure if this would actually work...

Yeah I tried to implement it and it gave me something that did NOT look viable... The implementation I found also took about 10 times longer than the FFT. I figured I'd just keep with what I have.

I was actually able to find a better FFT algorithm, its iterative and works probably twice as fast. I'm curious whether there's an FFT algorithm that works faster if I only care about the magnitude and not the phase, since that's all I need. I'm not unhappy with the performance, though, but if I ever want to make a real time version then I might want to find a better algorithm.

Converting Linear Phase FIR to Minimum Phase FIR, for a complete newbie