Floating type for DSP
-
- KVRian
- Topic Starter
- 814 posts since 26 May, 2013 from France, Sisteron
Hi,
I wonder if the C++ type float is suited for DSP development, or if double is more appropriate?
I heard that after a few effects chained, you can start to notice error accumulation.
Is it something you observed?
Thanks.
-
- KVRAF
- 3080 posts since 17 Apr, 2005 from S.E. TN
There were one or more papers years ago related to the common Motorola 56K fixed-point DSP chip, as best I recall.
The computational size could be either 24- or 48-bit fixed, and it was concluded that for IIR filters, especially at low frequencies where some coefficients can be very small, performance is much better at 48-bit fixed.
Personally, I try to do as many computations in double as possible, and keep temp results in doubles whenever it is convenient. But even if that is a proper decision, I'm quickly falling further behind. I have a bunch of objects and functions written in FPU asm for speed, and someday, if I ever bother to compile 64-bit apps, I will most likely have to either go back to high-level code or rewrite all that crap in SSE, which as best I understand has some differences in container size and such.
- KVRAF
- 7892 posts since 12 Feb, 2006 from Helsinki, Finland
It's worth noting that a numerically poor algorithm (direct forms anyone?) in double precision can be worse than a good algorithm in single precision.
-
- KVRian
- 1000 posts since 1 Dec, 2004
Imho, both float and double are appropriate. It's not very common to see cases where float has insufficient precision, except as an accumulator for sample position and a few other similar "time accumulator" cases. As a format for the audio stream from effect to effect, I think it's perfect.
It's also not very common to see an appreciable speed gain when switching from double to float: you get to process 4 floats at the same time instead of 2 doubles if you use SSE, but using SSE isn't very common. Afaik the most important speed gain is that it takes half as much space (so it eats up half as much memory bandwidth and it uses half as much cache).
-
- KVRAF
- 3080 posts since 17 Apr, 2005 from S.E. TN
abique wrote: Is SSE something that can be generated for you by the compiler, or do you need to write asm yourself?
Thanks.
I'd like to know that as well. Well, something like it.
The limited reading I've done gives the impression that with some compilers, the default is usually SSE when compiling for 64-bit. In some compilers, the default appears to be the FPU when compiling for 32-bit. Then it seems that some compilers give you a choice of either FPU or SSE for 32-bit.
Additionally, it seems that some allow a switch to compile for the FPU in 64-bit mode. I hadn't studied it much, and didn't even know whether the FPU will work or not in 64-bit mode.
I guess I'd have to rework my asm to be 64-bit compatible in addressing modes and such, but if it is indeed possible to use the FPU in 64-bit mode, it ought to save me a lot of work if I ever build for 64-bit.
- KVRAF
- 7892 posts since 12 Feb, 2006 from Helsinki, Finland
Well, if you tell your compiler that it's allowed to use SSE (or SSE2 in the case of doubles) then basically all floating point math will automatically result in SSE code (in the sense that the SSE unit will be used instead of the FPU). In 64-bit this is always the case.
But what MadBrain was referring to with 2-way vs. 4-way is the SIMD operations that work on multiple values at a time, and for the most part that involves you writing those manually. There's no need to use asm, you can use compiler intrinsics and let the compiler still deal with things like register allocation for you.
Now... some compilers do have some sort of auto-vectorization that is supposed to do at least some of this automatically, in cases where it's reasonably straightforward, turning regular floating point code into SIMD code. YMMV.
-
- KVRist
- 231 posts since 15 Apr, 2012 from Toronto, ON
mystran wrote: There's no need to use asm, you can use compiler intrinsics and let the compiler still deal with things like register allocation for you.
I believe you still need to watch out for register spilling, however. If memory serves, there are 8 registers for SSE and 16 for AVX, and if your intrinsics use more than that, the compiler will have to push and pop temporary values onto the stack, which will basically wipe out the performance gains you get from SIMD operations.
- KVRAF
- 7892 posts since 12 Feb, 2006 from Helsinki, Finland
LemonLime wrote: I believe you still need to watch out for register spilling, however. If memory serves, there are 8 registers for SSE and 16 for AVX, and if your intrinsics use more than that, the compiler will have to push and pop temporary values onto the stack, which will basically wipe out the performance gains you get from SIMD operations.
It's no different from any other code you write; if you have too many live variables (the actual number of variables is usually irrelevant) you end up with some of them spilled. That's why compilers do register allocation in the first place: its purpose is basically to try to find a way to minimize the spill costs. Typical heuristics will usually spill something long-lived, though, since one can then allocate the same registers to a large number of short-lived temporaries.
Anyway, you have 8 registers for SSE in 32-bit, and 16 in 64-bit. These are the same registers that the compiler will use for floating point code too, when it's not using x87.
-
- KVRAF
- 7401 posts since 17 Feb, 2005
mystran wrote: It's worth noting that a numerically poor algorithm (direct forms anyone?) in double precision can be worse than a good algorithm in single precision.
Of course! Using fewer operations to do the same thing always gives the edge to less noise.
I have been researching it for a little while; just now I have been looking at rounding error as a function of the input domain. If you have a 6-bit integer input (converted to float with no error), you can use a maximum of 4 multiplicands with intermediate results under 2^24 with 0!! error from the mantissa.
The same arrangement in double could use a 13-bit integer domain.
- KVRAF
- 7892 posts since 12 Feb, 2006 from Helsinki, Finland
camsr wrote: Of course! Using fewer operations to do the same thing always gives the edge to less noise.
Actually it's not quite that simple. It's the magnitudes of the values that typically matter, not so much the number of operations, and often you can improve things by actually doing a bit more calculation!
For example, take the basic 2D rotation algorithm that can be used as a generator for exponential sinusoids (x is the cosine, y is the sine, obviously):
Code: Select all
newX = oldX * cos(p) + oldY * sin(p)
newY = oldY * cos(p) - oldX * sin(p)
The same thing can be written instead as:
Code: Select all
tmpX = oldX * (-2*sin(p/2)^2) + oldY * sin(p)
tmpY = oldY * (-2*sin(p/2)^2) - oldX * sin(p)
newX = oldX + tmpX;
newY = oldY + tmpY;
It's basically the same problem with cosines that causes direct forms to perform so poorly at low frequencies, and by using something like the modified coupled form (which can be extended to a general filter) or a state variable filter, you get rid of the cosines and the whole low-frequency problem basically vanishes... but in terms of raw number of operations, those will be worse than your typical direct form.
In these cases, the extra cost is fairly negligible though. In other cases, you might end up with much uglier trade-offs (where the cost of keeping precision might be much larger).
-
- KVRAF
- 7401 posts since 17 Feb, 2005
mystran wrote: Actually it's not quite that simple. It's the magnitudes of the values that typically matter, not so much the number of operations, and often you can improve things by actually doing a bit more calculation!
Well, I was speaking strictly in terms of multiplication. Addition is an entirely different problem, one that suffers when addends are farther from 0.
Since addition's resolution is limited by the domain AND the mantissa, it only stands to reason that values more true to the number line are more accurate. Multiplication is different in that we can use the exponent to do 100% accurate ops over the entire domain; for example, by adding or subtracting the exponent, a perfect multiplication by a power of 2 can be had. The mantissa once again limits how many values will actually fall on a "perfect float". What I am checking on is how much range is shrunk by multiplying sequentially, and how far a float can go with a*b*c*d*... before ANY quantization is introduced.
- KVRAF
- 7892 posts since 12 Feb, 2006 from Helsinki, Finland
camsr wrote: Since addition's resolution is limited by the domain AND the mantissa, it only stands to reason that values more true to the number line are more accurate. Multiplication is different in that we can use the exponent to do 100% accurate ops over the entire domain; for example, by adding or subtracting the exponent, a perfect multiplication by a power of 2 can be had.
Oh right... but even with multiplies, one would typically like any sensitive coefficients to be small, so they can be represented accurately [edit: meaning a small deviation in values of large scale should not result in a large deviation in the results]. Returning to my previous example, simply storing the (fixed) cosine coefficient for a small angle (so it's close to 1, forcing a "large" exponent) will result in more deviation than storing something like -2*sin(p/2)^2, which stays close to zero all the way. So in this case you can lose precision even before any "runtime" calculation is actually done.