Linear or nonlinear filter for speech processing?

DSP, Plugin and Host development discussion.

Post

I'm building a filter for speech, and I'm trying to decide if linear or non-linear filters are better, given my need to store large numbers of pre-computed coefficients. I can tolerate a delay of a second or so, so if I went linear I could easily make it zero phase, and that's my first instinct. But why is there so much bias against using an IIR filter, like a Butterworth, that could potentially achieve a similar gain-magnitude response at a far lower order?

The standard answers for linear phase are "in the time domain a linear-phase filter preserves the signal morphology" and "a zero-phase filter keeps the linear-phase advantages while also removing the N/(2*Fs)-second group delay" - but so what? My ear doesn't hear in the time domain; it hears in the frequency domain, with cochlear innervation at approximately harmonically-related frequencies. Some say "but IIR filters exhibit lots of ringing in their impulse response", but that's either an artifact of a low-order filter or of finite-precision arithmetic. A higher-order filter will have a close-to-ideal response, so, if implemented in double precision, it won't ring any more or less than a linear-phase FIR with similar gain-magnitude performance.
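
For concreteness, here's a rough Python/SciPy sketch of the two options I'm weighing (the 16 kHz rate and 3.4 kHz cutoff are made-up stand-ins, not my actual spec):

Code:

import numpy as np
from scipy import signal

fs = 16000                        # assumed sample rate
x = np.random.randn(fs)           # stand-in for one second of speech

# Option A: zero-phase IIR - run a Butterworth forward and then backward.
# Forward-backward filtering cancels the phase response (at the cost of being
# non-causal, which is fine if ~1 s of latency is acceptable), but it applies
# the magnitude response twice, so design for half the desired dB attenuation.
sos = signal.butter(8, 3400, btype='low', fs=fs, output='sos')
y_zero_phase = signal.sosfiltfilt(sos, x)

# Option B: linear-phase FIR.  A symmetric FIR of order N delays everything by
# N/2 samples, i.e. N/(2*fs) seconds, which can be trimmed off afterwards.
order = 800
h = signal.firwin(order + 1, 3400, fs=fs)
y_linear_phase = signal.lfilter(h, 1.0, x)
delay_seconds = order / (2 * fs)  # 25 ms here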

My interest in IIRs is that, because of hardware architecture limitations, I'll need to pre-compute a large number of filter coefficients, and that would take a lot less space for an IIR implementation.
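
To put a rough number on the storage argument (same made-up spec as above; the IIR and FIR specs aren't matched exactly, the point is just the order of magnitude):

Code:

from scipy import signal

fs = 16000
sos = signal.butter(8, 3400, btype='low', fs=fs, output='sos')
iir_storage = sos.size            # 4 second-order sections x 6 coefficients = 24

# Kaiser-window estimate for a comparable FIR: 60 dB stopband, 300 Hz transition.
numtaps, beta = signal.kaiserord(60.0, 300 / (fs / 2))
fir_storage = numtaps             # on the order of a couple hundred taps

print(iir_storage, fir_storage)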

Post

What do you mean by "filter for speech"? What are you trying to achieve exactly?

Post

Filtering speech recordings. I'm mentioning that it is speech to indicate that certain types of processing artifacts would make it unintelligible, but I'm questioning the general consensus that IIR filters are bad for this because the non-constant group delay would make it appear morphologically different on an oscilloscope. My ears work in the frequency transform domain; I hear a pure sinewave as a single tone, not as a continuously variable warble as on an oscope, so I'd think the magnitude of the frequency domain behavior would be more important than the relative phases of the component harmonics.
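
As a sanity check on that claim, here's a little NumPy sketch: two signals built from the same ten harmonics, one with aligned phases and one with scrambled phases, have identical magnitude spectra even though they look completely different on a scope:

Code:

import numpy as np

fs, f0, dur = 16000, 200.0, 0.5
t = np.arange(int(fs * dur)) / fs
harmonics = np.arange(1, 11)
amps = 1.0 / harmonics                     # simple 1/k amplitude roll-off

rng = np.random.default_rng(0)
phases_aligned = np.zeros(harmonics.size)
phases_scrambled = rng.uniform(0, 2 * np.pi, harmonics.size)

def synth(phases):
    return sum(a * np.sin(2 * np.pi * k * f0 * t + p)
               for a, k, p in zip(amps, harmonics, phases))

x_a, x_b = synth(phases_aligned), synth(phases_scrambled)

# Same magnitude spectrum...
print(np.allclose(np.abs(np.fft.rfft(x_a)), np.abs(np.fft.rfft(x_b)), atol=1e-6))
# ...but very different waveform shape in the time domain.
print(np.max(np.abs(x_a - x_b)))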

Post

Some processes used in speech processing do use IIR filters. For instance, LPC speech compression uses linear prediction to find the best-fit all-pole IIR filter for each block. A lot of classic speech synthesis algorithms (such as the one used by Stephen Hawking) use IIR filters.
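
If it helps, here's a minimal sketch of the LPC idea (autocorrelation method, order 12, a random frame standing in for 20 ms of speech - not anyone's production code):

Code:

import numpy as np
from scipy.linalg import solve_toeplitz
from scipy.signal import lfilter

def lpc_coeffs(frame, p=12):
    """Fit A(z) = 1 - a1*z^-1 - ... - ap*z^-p so that 1/A(z) models the frame."""
    windowed = frame * np.hamming(len(frame))
    r = np.correlate(windowed, windowed, mode='full')[len(windowed) - 1:]
    a_tail = solve_toeplitz((r[:p], r[:p]), r[1:p + 1])   # Yule-Walker equations
    return np.concatenate(([1.0], -a_tail))

rng = np.random.default_rng(0)
frame = rng.standard_normal(320)          # stand-in for one 20 ms frame at 16 kHz
a = lpc_coeffs(frame)
residual = lfilter(a, [1.0], frame)       # analysis: FIR A(z) whitens the frame
resynth = lfilter([1.0], a, residual)     # synthesis: all-pole IIR 1/A(z) rebuilds it
print(np.allclose(resynth, frame))        # True - residual + 12 coefficients carry the frame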

A lot of the natural processes that cause filtering (such as sound bouncing around your throat to produce vowels) produce a result that is much closer to minimum-phase filtering than to linear-phase filtering. Typical IIR designs (Butterworth, Chebyshev, etc.) are minimum-phase as well, so they can sound more natural than linear-phase filtering in some applications.
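
One quick way to check that for a standard design is to look at where its poles and zeros land (sketch below uses an 8th-order Butterworth as an example):

Code:

import numpy as np
from scipy import signal

z, p, k = signal.butter(8, 0.3, btype='low', output='zpk')
print(np.max(np.abs(p)) < 1.0)           # poles strictly inside the unit circle: stable
print(np.max(np.abs(z)) <= 1.0 + 1e-12)  # zeros on/inside the unit circle: minimum phase
                                         # (Butterworth zeros sit exactly at z = -1)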

IIR and linear-phase filters preserve different aspects of the signal. A linear-phase filter preserves the relative phase relationships between harmonics even when it attenuates them by different amounts. An IIR filter preserves edges: if your waveform has a jump/impulse/slope-change, it survives IIR filtering (with some wiggling after the fact as the filter does its job). Conversely, a linear-phase filter can easily smear that edge into a sum of wiggling waveforms, with side-effects such as pre-echo. This is relevant to voice, because you can clearly see this edge-followed-by-wiggling pattern in vocal waveforms.
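
A quick way to see the pre-echo effect is to push a step through both kinds of filter (rough sketch, with an arbitrary 1 kHz low-pass standing in for the real processing):

Code:

import numpy as np
from scipy import signal

fs = 16000
x = np.zeros(2000)
x[1000:] = 1.0                            # an "edge" at sample 1000

# Linear-phase FIR low-pass, with the N/2-sample delay trimmed off so the
# edge nominally stays at sample 1000.
order = 400
h = signal.firwin(order + 1, 1000, fs=fs)
y_fir = np.convolve(x, h)[order // 2 : order // 2 + len(x)]

# Causal Butterworth IIR low-pass.
sos = signal.butter(4, 1000, btype='low', fs=fs, output='sos')
y_iir = signal.sosfilt(sos, x)

# The delay-compensated FIR output starts rising and rippling *before* the
# edge (pre-echo); the causal IIR output is exactly zero there and does all
# of its ringing after the edge.
print(np.max(np.abs(y_fir[:1000])), np.max(np.abs(y_iir[:1000])))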

Post

AnalogGuy1 wrote: Mon Jan 20, 2020 8:39 pm
"linear or non-linear filters are better"
I want to point out that "linear or non-linear filter" means something very different from "linear or non-linear phase". Strictly speaking, the whole transfer function (including the phase) is a concept that only really makes sense in the first place for linear shift-invariant (also known as "time-invariant" when sampling over time) filters.
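
To make the distinction concrete, here's a tiny numerical sketch: a Butterworth IIR is a perfectly *linear* filter (superposition holds) even though its *phase* is non-linear (non-constant group delay), whereas something like a tanh clipper is a genuinely non-linear filter:

Code:

import numpy as np
from scipy import signal

rng = np.random.default_rng(0)
x1, x2 = rng.standard_normal(1000), rng.standard_normal(1000)

b, a = signal.butter(4, 0.2)                  # linear filter, non-linear phase
butter = lambda x: signal.lfilter(b, a, x)
clipper = lambda x: np.tanh(x)                # non-linear (memoryless) "filter"

# Superposition test: does f(x1 + x2) equal f(x1) + f(x2)?
print(np.allclose(butter(x1 + x2), butter(x1) + butter(x2)))     # True
print(np.allclose(clipper(x1 + x2), clipper(x1) + clipper(x2)))  # False

# ...and the Butterworth's group delay varies with frequency (non-linear phase).
w, gd = signal.group_delay((b, a))
print(gd.min(), gd.max())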
