stability of all pole filter for linear prediction

DSP, Plugin and Host development discussion.

Post

i have a linear predictive coding mock-up working pretty well, but it remains touchy maintaining a balance between enough input gain to get proper bandwidth and keeping it under the limit where it explodes horribly.

as i understand it, the method of bandwidth expansion is simply scaling all coefficients by some constant less than 1, which would amount to something like an overall gain control for the impulse response. is that correct? it seems that would be less effective than something like a window function on the outer coefficients to ensure the impulse response decays like it is supposed to
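(for reference, the textbook form of bandwidth expansion scales the k-th coefficient by gamma^k rather than scaling every coefficient by the same constant: that multiplies every pole radius by gamma and pulls the poles toward the origin, whereas a flat gain on all coefficients leaves the pole positions unchanged. a minimal sketch; the helper name `bandwidth_expand` is made up for illustration:)

```python
def bandwidth_expand(a, gamma=0.995):
    """Scale the k-th LPC coefficient by gamma**k (a[0] is the leading 1).

    This multiplies every pole radius by gamma, pulling the poles toward
    the origin and widening the formant bandwidths; a flat gain applied
    to all coefficients equally would leave the pole positions unchanged.
    """
    return [c * gamma ** k for k, c in enumerate(a)]

# toy case: a single real pole at z = 0.999, uncomfortably close to the
# unit circle; after expansion it sits at 0.999 * 0.995, safely inside
a = [1.0, -0.999]
print(bandwidth_expand(a))
```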

i have the 40 sample impulse response going, zero padded, into an fft display, and that seems to give a pretty good indication of the frequency response... but apparently the trick is to keep the poles from flying off the unit circle? that is a concept at the limits of my understanding; i know it has to do with complex numbers and ends up relating to fourier stuff, though i suspect an explanation may fly over my head
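(as an aside: "poles inside the unit circle" can be checked without any root finding. the step-down, i.e. inverse Levinson, recursion turns the denominator coefficients back into reflection coefficients, and the all-pole filter is stable exactly when every one of them has magnitude below 1. a sketch; the name `is_stable` is made up:)

```python
def is_stable(a, eps=1e-12):
    """Stability test for an all-pole filter 1/A(z) without root finding.

    Runs the step-down (inverse Levinson) recursion on the denominator
    coefficients a = [1, a1, ..., ap]; the filter is stable exactly when
    every reflection coefficient produced along the way has |k| < 1.
    """
    a = list(a[1:])                          # drop the leading 1
    while a:
        k = a[-1]                            # last coeff = reflection coeff
        if abs(k) >= 1.0 - eps:
            return False
        # reduce the order by one
        a = [(a[i] - k * a[len(a) - 2 - i]) / (1.0 - k * k)
             for i in range(len(a) - 1)]
    return True

print(is_stable([1.0, -0.9]))        # pole at 0.9         -> True
print(is_stable([1.0, -1.8, 0.9]))   # pole radius ~0.95   -> True
print(is_stable([1.0, -2.05, 1.05])) # pole at 1.05        -> False
```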

really, the more general advice about how to keep an all pole resynthesis filter stable, the merrier. cheers

Post

a quick aside: can anyone provide a big picture illustration of why it could be that punk bill burroughs provides clear diction but james brown and lou reed immediately blow up the filter? could it be to do with the lower register? could this indicate a need to pre-whiten the signal?

Post

given that there are no replies, maybe you should rephrase the question?

p.s. an all pole filter is an all pole filter after all?? (that's what I'd say without actually understanding the context)
~stratum~

Post

kamalmanzukie wrote:a quick aside: can anyone provide a big picture illustration of why it could be that punk bill burroughs provides clear diction but james brown and lou reed immediately blow up the filter? could it be to do with the lower register? could this indicate a need to pre-whiten the signal?
I don't know about LPC filters, but here's something I noticed many years ago experimenting with old analog "vocal eliminator" circuits. The classic effect was simple indeed: just phase-flip one of the stereo channels, and anything center-panned in the stereo mix is cancelled out.

Because bass, kick, snare and some other instruments are commonly center-panned along with the common center-panning of vocals, a dodge to try to retain some of the bass, kick, snare and such while cancelling vocals was to attenuate the lows and highs of one of the channels, so that frequencies above and below the vocal range do not cancel out, keeping some of that information in the "vocal eliminated" output.

I used bass and treble tuning knobs in the devices I built, so that as much of the highs and lows could be retained as possible while cancelling as much of the lead vocal as possible. It was always a tradeoff: if you wanted a little more bass in the output, you often also had to allow a little more lead vocal leakage, adjusting the hipass and lowpass knobs to let more vocal through in exchange for more bass or snare.

So anyway, I hadn't noticed before those experiments, but some vocalists took up such a wide audio spectrum that it wasn't really possible to notch out much of the vocals without also cancelling nearly all the bass, kick and snare. I am not a big Elvis fan, but the man is an excellent example of the effect. His voice shared spectrum with bass on one end and snare/hats on the other end. To do a good Elvis vocal cancel (using that particular crude technique) you had to cancel just about the entire audio band.

I don't recall critically testing JB or Lou Reed under similar circumstances, but especially early JB probably had a big voice covering a huge audio spectrum.

Dunno how that phenomenon would interact with LPC. Just reminiscing that some vocalists do cover a wide audio spectrum, though you wouldn't necessarily notice just by listening, unless the width of the spectrum makes some kind of difference to the processing you are trying to accomplish.

Post

yeah, the above examples were spoken word with no music to worry about cancelling out, but other than that it sounds about right. looking at spectrographs of the respective vocal samples shows quite a bit of variability, though i'm not sure how much of that is due to the source audio not being great on some of them, and i've read that noisy signals can make linear prediction unstable

one solution, as far as i can tell, seems to be pitch tracking of the fundamental and comb filtering the audio signal, which would be sort of the opposite of a vocal eliminator, but using phase instead of polarity

anyway, i seem to have worked out something like a crude 40 sample fourier-esque analysis that seems to track the formant shifts, so maybe from there there's a way to derive something more stable. though that doesn't help with the coefficient calculation getting stuck when the filter blows up.

i mean cell phone carriers use linear predictive coding and you never hear them blow up, so there's gotta be a way

Post

I don't know much about the LPC factors or what you want to do with them, but maybe somewhere in there is "rough voice" vs "smooth voice". Some strong singers have "angelic pure" voice, sometimes male baritones but more likely tenors. Or female "pure tone" altos or sopranos vs "rough" such as Aretha Franklin or Koko Taylor.

The "rough" voices I suppose have a bigger noise component, but I personally like the timbre and expressiveness of rough voices better. I mean it's a matter of taste but I'd rather hear "rough" Ray Charles or Tom Waits than "pure" (Yes vocalist) Jon Anderson or (Chicago vocalist) Pete Cetera. Adding natural noise or natural vocal tract distortion would widen the spectral footprint. OTOH with "young elvis" a lot of the songs were not real gritty (except the rockers), but even the smooth angelic elvis song tones were thick as a brick with a very wide spectral footprint.

I am very ignorant of vocals but recall reading long ago a "naive" analysis of the vocal tract: the vocal cords as a pulse train stimulus, with the chest, nose, throat and mouth resonating and tone-shaping the rough vocal cord pulse train. So maybe even "pure tone" angelic voices are strongly noise-stimulated but "purified" by the physical filtering mechanisms.

On the other hand not all rough voices are "good". Most rough voices suck. The "better than pure tone" rough voice seems a fairly rare thing.

Post

that's the basis of linear predictive coding: the first step produces an 'error' signal, which is supposed to be the glottal source but in practice is more a whitened version of the signal, or maybe it would be more accurate to say the frequency response is flattened, which leaves you with a waveform with clearly defined pitch + noise excitation. each stage in this process gives a single coefficient for the resonances.
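(for reference, the batch textbook version of this, autocorrelation followed by the Levinson-Durbin recursion rather than the adaptive lattice discussed in the thread, can be sketched in a few lines; function names are made up and this is an illustration, not the Reaktor patch in question:)

```python
import math, random

def autocorr(x, p):
    # biased autocorrelation, lags 0..p
    return [sum(x[i] * x[i + lag] for i in range(len(x) - lag))
            for lag in range(p + 1)]

def levinson(r):
    """Levinson-Durbin: autocorrelation r[0..p] -> ([1, a1..ap], error power).

    Each order update yields one reflection coefficient k, matching the
    one-coefficient-per-stage structure of the lattice form."""
    p = len(r) - 1
    a, err = [1.0], r[0]
    for m in range(1, p + 1):
        k = -sum(a[j] * r[m - j] for j in range(m)) / err
        a = [1.0] + [a[j] + k * a[m - j] for j in range(1, m)] + [k]
        err *= 1.0 - k * k
    return a, err

def residual(x, a):
    # inverse (analysis) filter: e[n] = sum_j a[j] * x[n - j]
    p = len(a) - 1
    return [sum(a[j] * x[n - j] for j in range(p + 1))
            for n in range(p, len(x))]

# toy input: a decaying resonance plus a little noise
random.seed(0)
x = [math.sin(0.3 * n) * 0.98 ** n + 0.01 * random.gauss(0, 1)
     for n in range(400)]
a, _ = levinson(autocorr(x, 10))
e = residual(x, a)
# the residual carries far less energy than the input: the resonance
# has been absorbed into the coefficients, leaving pitch + noise
print(sum(v * v for v in e) / sum(v * v for v in x))
```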

so yeah, i think you're right in that your 'rough' voice has stronger resonances, and thinking about it that is probably where i am running into trouble. the lower resonances especially seem to be very destructive, so maybe pre-whitening is the way to go. i've always found william burroughs' voice to be almost ideal in this regard, something to do with his nasally sort of tone

i know that the strength of formants has to do with register: the lower register of a male voice has much stronger formants, but if that same singer has a strong falsetto it can be a much purer sound. freddie mercury is a good example

Post

stratum wrote:given that there are no replies, maybe you should rephrase the question?

p.s. an all pole filter is an all pole filter after all?? (that's what I'd say without actually understanding the context)
yeah, i'd say in general my way of phrasing things is not conducive to generating a lot of feedback, based on experience. probably it is wishful thinking to solve a problem like this without a stronger grasp of filter theory anyway

the problem in a nutshell is that the bandwidth seems to be dependent on input level: too low and there is not enough filtering, too hot and it explodes. from what i can gather this is because the impulse response fails to decay, which is to say the filter has gone unstable, with a pole pushed outside the unit circle.
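(worth noting: in exact arithmetic the LPC coefficients cannot depend on input level at all, because scaling the input by g scales every autocorrelation lag by g squared, and the coefficients are ratios of lags. a quick first-order sanity check, with a made-up helper name:)

```python
import math

def first_coeff(x):
    # order-1 LPC coefficient a1 = r1 / r0, a ratio of autocorrelation lags
    r0 = sum(v * v for v in x)
    r1 = sum(x[i] * x[i + 1] for i in range(len(x) - 1))
    return r1 / r0

x = [math.sin(0.3 * n) for n in range(1000)]
# scaling the input by 10 scales r0 and r1 by 100 each, so the ratio
# (and likewise every higher-order coefficient) is unchanged
print(first_coeff(x))
print(first_coeff([10.0 * v for v in x]))
```

so if the bandwidth tracks the input level in practice, the suspect is clipping, leakage or fixed-point behaviour inside the implementation rather than the underlying math.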

Post

kamalmanzukie wrote:
stratum wrote:given that there are no replies, maybe you should rephrase the question?

p.s. an all pole filter is an all pole filter after all?? (that's what I'd say without actually understanding the context)
yeah, i'd say in general my way of phrasing things is not conducive to generating a lot of feedback, based on experience. probably it is wishful thinking to solve a problem like this without a stronger grasp of filter theory anyway
Are you sure it's about the filter theory? I'd guess it comes down to something simple and straightforward about the pole locations (albeit understanding exactly why it is so is obviously deeper). Maybe the algorithm that you use to set those pole locations is incorrect? And if it is correct, maybe the code is running into numerical issues?
~stratum~

Post

stratum wrote:Are you sure it's about the filter theory? I guess it says something simple and straightforward about the pole locations (albeit, understanding exactly why it is so is obviously deeper). Maybe the algorithm that you use to set those pole locations is incorrect? If that's correct maybe the code is running into numerical issues?

the algorithm is an adaptive lattice that generates forward and backward coefficients simultaneously. it's not my code, but as far as i can tell it seems to follow the block diagram. the bit i find weird is that with n stages you get a set of coefficients that decays, and then as far as i can tell it applies those coefficients forwards and backwards at the same time in the filter

it would make more sense to me that for n analysis stages there would be 2n stages for the filter, with the coefficients applied forward and backward, the result looking like a linear phase impulse response. that's the only thing i can think of. i hope that makes sense

the only numerical issue i can think of is that all the coefficients are clipped on output, but that's after the integrator, which may well shoot to 150 or return a NaN

i don't have the code, this is built in reaktor. would pictures be of help?

Post

I don't know Reaktor, but surely posting a picture of the blocks would help; there are no doubt people here who know about them.
~stratum~

Post

Maybe this is one of those "filter cutoff too low/high --> filter blows up" issues which can happen more easily with traditional IIR filters with unit delays in them?

Post

for what it's worth, the question appears to be about this http://www.cs.tut.fi/~sgn14006/PDF/S03-LP.pdf
~stratum~

Post

Something to keep in mind: if your implementation involves working with direct-form polynomials, those can go unstable due to coefficient quantisation pretty easily once the order gets higher and the desired poles move closer to the unit circle. Relatively small errors (due to finite precision) in the higher-order terms can throw off the whole thing pretty badly, even if the actual final processing structure is something like a ladder that doesn't really suffer significantly from this problem.
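(a rough illustration of the point, under the assumption that the denominator is built as a product of stable second-order sections and then flattened into one direct-form polynomial; `is_stable` here is the reflection-coefficient test, requiring |k| < 1 at every step of the step-down recursion:)

```python
import math

def is_stable(a, eps=1e-12):
    # step-down recursion: stable iff every reflection coefficient |k| < 1
    a = list(a[1:])
    while a:
        k = a[-1]
        if abs(k) >= 1.0 - eps:
            return False
        a = [(a[i] - k * a[len(a) - 2 - i]) / (1.0 - k * k)
             for i in range(len(a) - 1)]
    return True

def polymul(p, q):
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

# order-12 denominator: six conjugate pole pairs at radius 0.99,
# clustered at low frequencies -- the hard case for direct form
a = [1.0]
for theta in (0.1, 0.2, 0.3, 0.4, 0.5, 0.6):
    a = polymul(a, [1.0, -2.0 * 0.99 * math.cos(theta), 0.99 ** 2])

print(is_stable(a))                    # True: built from stable sections
a_q = [round(c, 3) for c in a]         # quantise coefficients to 1e-3
print(is_stable(a_q))                  # at this order, often no longer stable
```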

As far as input level dependence goes... how about running the thing through an AGC first? I'd imagine that in traditional voice compression applications this would usually be part of the signal chain for other reasons anyway. If you don't actually want to compress the dynamics down, you could still flatten the level for the LPC stage and then restore it afterwards with an inverse envelope?
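(a minimal sketch of the flatten-then-restore idea; `envelope`, `with_agc` and `lpc_process` are made-up names, and the envelope follower is the crudest possible one-pole peak tracker:)

```python
def envelope(x, attack=0.01, release=0.0005, floor=1e-6):
    # crude one-pole peak follower, floored to avoid division by zero
    env, out = floor, []
    for v in x:
        coeff = attack if abs(v) > env else release
        env += coeff * (abs(v) - env)
        out.append(max(env, floor))
    return out

def with_agc(x, lpc_process):
    env = envelope(x)
    flat = [v / e for v, e in zip(x, env)]   # level-flattened input
    y = lpc_process(flat)                    # the LPC stage sees a steady level
    return [v * e for v, e in zip(y, env)]   # inverse envelope restores dynamics

# with an identity "processor" the round trip returns the input
# (up to floating-point rounding)
y = with_agc([0.5, -0.4, 0.3, -0.2], lambda s: s)
print(y)
```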

Post

Maybe compare to: https://sourceforge.net/p/mda-vst/code/ ... alkBox.cpp

This line stops any peaks in the autocorrelation from being accidentally higher than the peak at zero lag:
r[0] *= 1.001f; //stability fix
but I guess there are other positions in the code where a similar fix could be applied.
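(seen at first order, the fix amounts to adding a small white-noise floor to the spectrum, which bounds the predictor pole away from the unit circle; a toy illustration, nothing here is TalkBox code:)

```python
# a perfectly predictable signal gives r1 == r0, i.e. a first-order
# predictor pole exactly on the unit circle
r0, r1 = 1.0, 1.0
print(r1 / r0)      # 1.0: marginally stable, any rounding tips it over

r0 *= 1.001         # the TalkBox-style fix: tiny white-noise floor
print(r1 / r0)      # just under 1: the pole is pulled inside the circle
```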
