plusfer
KVRist

46 posts since 1 Apr, 2016
I've tried both methods, but these sound the same (?)
What is the difference?

`y[n] = b0 * x[n] + b1 * x[n-1] + b2 * x[n-2] - a1 * y[n-1] - a2 * y[n-2]`

and
`out = b0 * in + z0; z0 = z1 + b1 * in - a1 * out; z1 = b2 * in - a2 * out;`
Last edited by plusfer on Fri Oct 06, 2017 7:03 am, edited 1 time in total.
matt42
KVRian

997 posts since 9 Jan, 2006
The first biquad is direct form 1 (DF1). The second is transposed direct form 2 (TDF2). There are also TDF1 and DF2, which IIRC are not as stable numerically.

The main difference between DF1 and TDF2 is that DF1 will handle parameter modulation much better than TDF2, which typically breaks when modulated. DF1 isn't perfect in this regard either: it also has modulation artifacts, which become more obvious at higher update rates.

TDF2 can produce faster code, which makes it the better choice if you don't need to modulate the filter.

And, yes, they will sound the same. You should be able to look at the equations, or perhaps block diagrams of these forms, and deduce that the resulting output, if you feed both the exact same input, will be exactly the same (assuming no modulation). Algorithmically the difference is in how they store and handle the input and output samples, but at the output you have the same inputs and outputs multiplied by the same coefficients - they just took different paths to get there.
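To make that concrete, here is a minimal C++ sketch (not from the original posts) of both forms side by side. The names `Coeffs`, `DF1`, `TDF2` and `maxDifference` are made up for illustration, and the coefficients are assumed to be normalized so that a0 = 1. With fixed coefficients the two outputs should agree up to floating-point rounding, since the same terms are summed in a different order.

```cpp
#include <cmath>
#include <cstddef>

// Normalized biquad coefficients (a0 == 1), named as in the two snippets.
struct Coeffs { double b0, b1, b2, a1, a2; };

// Direct form 1: remembers the last two inputs and the last two outputs.
struct DF1 {
    double x1 = 0.0, x2 = 0.0, y1 = 0.0, y2 = 0.0;
    double tick(const Coeffs& c, double in) {
        const double out = c.b0 * in + c.b1 * x1 + c.b2 * x2
                         - c.a1 * y1 - c.a2 * y2;
        x2 = x1; x1 = in;    // shift the input history
        y2 = y1; y1 = out;   // shift the output history
        return out;
    }
};

// Transposed direct form 2: two states, each holding a partial sum of
// already-scaled inputs and outputs.
struct TDF2 {
    double z0 = 0.0, z1 = 0.0;
    double tick(const Coeffs& c, double in) {
        const double out = c.b0 * in + z0;
        z0 = z1 + c.b1 * in - c.a1 * out;
        z1 = c.b2 * in - c.a2 * out;
        return out;
    }
};

// Largest absolute difference between the two forms over the first
// n samples of the impulse response.
double maxDifference(const Coeffs& c, std::size_t n) {
    DF1 a;
    TDF2 b;
    double worst = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double in = (i == 0) ? 1.0 : 0.0;
        const double d = std::fabs(a.tick(c, in) - b.tick(c, in));
        if (d > worst) worst = d;
    }
    return worst;
}
```

Calling `maxDifference` with any stable coefficient set should return something at the level of rounding noise, which is why the two versions sound the same.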
plusfer
KVRist

46 posts since 1 Apr, 2016
Very clear explanation, thanks!
Miles1981
KVRian

1343 posts since 26 Apr, 2004, from UK
matt42 wrote: The main difference between DF1 and TDF2 is that DF1 will handle parameter modulation much better than TDF2, which typically breaks when modulated. DF1 isn't perfect in this regard either: it also has modulation artifacts, which become more obvious at higher update rates.

For me, the experience is exactly the opposite.
Due to the state inside TDF2, it is inherently more stable than DF1.
And DF1 due to the lack of state is faster in terms of memory accesses and stalls (TDF2 is clearly bad in that regard!).
Max M.
KVRist

241 posts since 20 Apr, 2005, from Moscow, Russian Federation
Miles1981 wrote: And DF1 due to the lack of state is faster in terms of memory accesses and stalls

This is a huge oversimplification. Sure, we can pretend there's no state in DF1 since its 4 state cells are shared with its I/O values, but they are still there. And there's no generic rule for which one requires more memory accesses, as there is an infinity of external factors and contexts. After all, in many cases (e.g. in a small tight loop) it's much easier to keep the two state vars of DF2T right in registers than to deal with the four extra I/O var reads of DF1, and that's not even counting the more complex cases where all the caching/write-buffer trickery of bigger loops comes into play.

Miles1981 wrote: TDF2 is clearly bad in that regard!

So nope. (I don't think we're going to throw concrete examples at each other, as yet again there are infinite variations of particular implementations, counting all the possible process-buffer/process-tick/n-filters-per-each methods. So just: no, you can't say it like this, it would be hugely misleading.)
matt42
KVRian

997 posts since 9 Jan, 2006
Miles1981 wrote:Due to the state inside TDF2, it is inherently more stable than DF1.
What exactly do you mean by more stable? I'm not talking about numerical stability. If you modulate TDF2, then the powers of z each end up getting modulated by different filter states!
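A small experiment shows the two forms parting ways as soon as the coefficients change. This is only a sketch with made-up names (`Coeffs`, `Result`, `compareUnderSwitch`), not code from anyone in the thread: both forms get a constant input of 1.0, the coefficient set is swapped at one sample, and the largest difference between the outputs is recorded before and after the swap. It says nothing about which form sounds better under modulation, only that they stop being equivalent.

```cpp
#include <cmath>
#include <cstddef>

// Normalized biquad coefficients (a0 == 1).
struct Coeffs { double b0, b1, b2, a1, a2; };

// Largest |DF1 - TDF2| seen before and from the coefficient switch on.
struct Result { double beforeSwitch; double afterSwitch; };

// Uses `first` for samples [0, switchAt) and `second` from switchAt on.
Result compareUnderSwitch(const Coeffs& first, const Coeffs& second,
                          std::size_t switchAt, std::size_t total) {
    double x1 = 0.0, x2 = 0.0, y1 = 0.0, y2 = 0.0;  // DF1 history
    double z0 = 0.0, z1 = 0.0;                      // TDF2 states
    Result r{0.0, 0.0};
    for (std::size_t n = 0; n < total; ++n) {
        const Coeffs& c = (n < switchAt) ? first : second;
        const double in = 1.0;

        // DF1: the new coefficients are applied to raw past samples.
        const double df1 = c.b0 * in + c.b1 * x1 + c.b2 * x2
                         - c.a1 * y1 - c.a2 * y2;
        x2 = x1; x1 = in;
        y2 = y1; y1 = df1;

        // TDF2: z0 and z1 still contain products formed with the
        // coefficients that were active when they were written.
        const double tdf2 = c.b0 * in + z0;
        z0 = z1 + c.b1 * in - c.a1 * tdf2;
        z1 = c.b2 * in - c.a2 * tdf2;

        const double d = std::fabs(df1 - tdf2);
        if (n < switchAt) {
            if (d > r.beforeSwitch) r.beforeSwitch = d;
        } else {
            if (d > r.afterSwitch) r.afterSwitch = d;
        }
    }
    return r;
}
```

With any two different stable coefficient sets, `beforeSwitch` should stay at rounding level while `afterSwitch` becomes clearly nonzero.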
mystran
KVRAF

4927 posts since 11 Feb, 2006, from Helsinki, Finland
Direct form filters in general are terrible when it comes to modulation, so who cares really. I can't really see how DF1 would really be computationally (or rather in terms of memory or register pressure) more efficient than TDF2, but if you want smooth modulation then you should really be using something that isn't direct form at all.
matt42
KVRian

997 posts since 9 Jan, 2006
mystran wrote:Direct form filters in general are terrible when it comes to modulation, so who cares really. I can't really see how DF1 would really be computationally (or rather in terms of memory or register pressure) more efficient than TDF2, but if you want smooth modulation then you should really be using something that isn't direct form at all.
True that. While outside the bounds of the original question I think that's a more than valid point.

I have a method (can't say I invented it, more like put it together from other sources) to arrive at a ZDF/trapezoidal integrated filter. No one seemed to like it as it wasn't a circuit model, but the whole point was an easy way to arrive at linear filters that are stable under modulation. Totally off topic, but, yeah.
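For readers who haven't seen one, here is a rough C++ sketch of a trapezoidal-integrated state variable filter in the commonly circulated formulation. It is not the method referred to above (that was never posted here), and the names `TrapezoidalSVF`, `set` and `tickLowpass` are invented for this example. The states are the two integrator memories rather than scaled past samples, which is the usual argument for why coefficient changes behave better than in a direct form.

```cpp
#include <cmath>

// Trapezoidal (ZDF) state variable filter, low-pass output only.
// A sketch under stated assumptions, not a reference implementation.
struct TrapezoidalSVF {
    double ic1eq = 0.0, ic2eq = 0.0;            // integrator states
    double a1 = 0.0, a2 = 0.0, a3 = 0.0;        // derived coefficients

    // May be called at any time; only coefficients change, not states.
    void set(double cutoffHz, double q, double sampleRate) {
        const double pi = 3.14159265358979323846;
        const double g = std::tan(pi * cutoffHz / sampleRate);  // prewarped gain
        const double k = 1.0 / q;                               // damping
        a1 = 1.0 / (1.0 + g * (g + k));
        a2 = g * a1;
        a3 = g * a2;
    }

    double tickLowpass(double in) {
        const double v3 = in - ic2eq;
        const double v1 = a1 * ic1eq + a2 * v3;          // band-pass node
        const double v2 = ic2eq + a2 * ic1eq + a3 * v3;  // low-pass node
        ic1eq = 2.0 * v1 - ic1eq;                        // trapezoidal state update
        ic2eq = 2.0 * v2 - ic2eq;
        return v2;
    }
};
```

Feeding a constant input should settle the low-pass output at that same value, which is a quick sanity check that the update equations are wired correctly.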
Miles1981
KVRian

1343 posts since 26 Apr, 2004, from UK
Max M. wrote:
Miles1981 wrote: And DF1 due to the lack of state is faster in terms of memory accesses and stalls

This is a huge oversimplification. Sure, we can pretend there's no state in DF1 since its 4 state cells are shared with its I/O values, but they are still there. And there's no generic rule for which one requires more memory accesses, as there is an infinity of external factors and contexts. After all, in many cases (e.g. in a small tight loop) it's much easier to keep the two state vars of DF2T right in registers than to deal with the four extra I/O var reads of DF1, and that's not even counting the more complex cases where all the caching/write-buffer trickery of bigger loops comes into play.

Miles1981 wrote: TDF2 is clearly bad in that regard!

So nope. (I don't think we're going to throw concrete examples at each other, as yet again there are infinite variations of particular implementations, counting all the possible process-buffer/process-tick/n-filters-per-each methods. So just: no, you can't say it like this, it would be hugely misleading.)

It's not an oversimplification.
And yes, you can count the number of memory accesses in both cases, for a DF1 and a TDF2 implementation. Turn it around as much as you want, you can count them directly from the algorithm.
And stalls are what kill performance nowadays.
Max M.
KVRist

241 posts since 20 Apr, 2005, from Moscow, Russian Federation
Miles1981 wrote:It's not an oversimplification.
And yes, you can count the number of memory accesses in both cases, for a DF1 and a TDF2 implementation. Turn it around as much as you want, you can count them directly from the algorithm.

I wonder if you actually read what I wrote. Otherwise, okay, let's count (as I hinted above, I'll keep the DF2T state vars in registers):
DF2T: 1 mem read (1 input) + 1 mem write (1 output) per iteration/tick.
DF1: 5 mem reads (3 inputs + 2 outputs) + 1 mem write (1 output) per iteration/tick.

So?
(It's not a problem to optimize the DF1 loop to keep its states in registers as well, reducing it to the same 1 read/1 write per tick, but this requires 4 registers instead of the 2 that DF2T needs.)
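As a sketch of what that counting looks like in code (C++, with invented names `Coeffs`, `processTDF2` and `processDF1`): both loops below touch memory once for the input and once for the output per sample, and differ only in how many values they carry between iterations. Whether those locals really end up in registers is up to the compiler and the target, so treat this as an illustration of the argument rather than a benchmark.

```cpp
#include <cstddef>

// Normalized biquad coefficients (a0 == 1).
struct Coeffs { float b0, b1, b2, a1, a2; };

// TDF2 block loop: two carried values (z0, z1), one load and one store
// per sample. `state` holds z0, z1 between calls.
void processTDF2(const Coeffs& c, const float* in, float* out,
                 std::size_t n, float state[2]) {
    float z0 = state[0], z1 = state[1];
    for (std::size_t i = 0; i < n; ++i) {
        const float x = in[i];                // the only load
        const float y = c.b0 * x + z0;
        z0 = z1 + c.b1 * x - c.a1 * y;
        z1 = c.b2 * x - c.a2 * y;
        out[i] = y;                           // the only store
    }
    state[0] = z0; state[1] = z1;
}

// DF1 block loop with its history kept in locals: same one load and one
// store per sample, but four carried values instead of two.
// `state` holds x[n-1], x[n-2], y[n-1], y[n-2] between calls.
void processDF1(const Coeffs& c, const float* in, float* out,
                std::size_t n, float state[4]) {
    float x1 = state[0], x2 = state[1], y1 = state[2], y2 = state[3];
    for (std::size_t i = 0; i < n; ++i) {
        const float x = in[i];
        const float y = c.b0 * x + c.b1 * x1 + c.b2 * x2
                      - c.a1 * y1 - c.a2 * y2;
        x2 = x1; x1 = x;
        y2 = y1; y1 = y;
        out[i] = y;
    }
    state[0] = x1; state[1] = x2; state[2] = y1; state[3] = y2;
}
```

A naive DF1 that indexes `in[i-1]`, `in[i-2]`, `out[i-1]` and `out[i-2]` directly would correspond to the 5 reads + 1 write count above.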
Miles1981
KVRian

1343 posts since 26 Apr, 2004, from UK
Max M. wrote: I wonder if you actually read what I wrote. Otherwise, okay, let's count (as I hinted above, I'll keep the DF2T state vars in registers):
DF2T: 1 mem read (1 input) + 1 mem write (1 output) per iteration/tick.
DF1: 5 mem reads (3 inputs + 2 outputs) + 1 mem write (1 output) per iteration/tick.

So?
(It's not a problem to optimize the DF1 loop to keep its states in registers as well, reducing it to the same 1 read/1 write per tick, but this requires 4 registers instead of the 2 that DF2T needs.)

Don't make it too obvious, the numbers you are giving are ridiculous. If you knew how to count memory accesses, it would be obvious that the two cases are identical. The difference is what you say about the number of registers, but that doesn't work for anything bigger than 2nd order, and even for a second-order filter you have to make sure that you have enough registers to hold everything (because you also need the 5 coefficients, the loop index, and somewhere to hold the output result before moving it to memory, so 9 registers).
So DF1 puts less pressure on the hardware and uses fewer memory accesses in the general case. Obviously, 2nd order is easily optimized in both cases.
Max M.
KVRist

241 posts since 20 Apr, 2005, from Moscow, Russian Federation
Miles1981 wrote:but it doesn't work for anything bigger than 2nd order ...

I hope you do realize that a filter of any order can be implemented as a cascade of 2nd-order sections, and each section may have its own tight loop. (I'm not even counting that with 16-year-old SSE2 we already have 8 registers holding 4 floats or 2 doubles each, and things like the loop counter and I/O pointers are in GP registers anyway.)
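A minimal C++ sketch of that idea, with made-up names (`Section`, `processCascade`): the buffer is run in place through each 2nd-order section in turn, and every section gets its own inner loop that carries only two state values. This is one possible arrangement among many, not a claim about what performs best.

```cpp
#include <cstddef>
#include <vector>

// One 2nd-order TDF2 section: normalized coefficients plus its two states.
struct Section { double b0, b1, b2, a1, a2, z0, z1; };

// Processes `buf` in place through every section in order. Each section
// runs its own tight loop over the whole buffer with z0/z1 in locals,
// then writes its states back for the next call.
void processCascade(std::vector<Section>& sections, double* buf, std::size_t n) {
    for (Section& s : sections) {
        double z0 = s.z0, z1 = s.z1;
        for (std::size_t i = 0; i < n; ++i) {
            const double x = buf[i];
            const double y = s.b0 * x + z0;
            z0 = z1 + s.b1 * x - s.a1 * y;
            z1 = s.b2 * x - s.a2 * y;
            buf[i] = y;
        }
        s.z0 = z0;
        s.z1 = z1;
    }
}
```

A quick way to check the plumbing is to use sections with b1 = 1 and every other coefficient 0: each one is a pure one-sample delay, so two of them should delay an impulse by two samples.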

Miles1981 wrote: ... it would be obvious that the two cases are identical.

Aha, "identical". So at least it's no longer "TDF2 is clearly bad", as before.

Well, I see you prefer to stick to your once-decided-binary-logic-dont-want-to-bother-anymore-thing ("A is good and B is bad"), so let's just stop here. The OP at least got some food for thought, and I don't see any point in continuing to argue while newer and newer conditions are brought in ("what about a 64th-order filter? what about 999 parallel filters? What about a 487-FPU?").
