Integer is King? - final thoughts about the EQ challenge
- KVRAF
- 12615 posts since 7 Dec, 2004
if the smaller int version does have "tighter bass", it might be because the quantization noise in the int version is more balanced, and the representation of numbers is linear.
the large int version will have less quantization noise. it might be interesting to test the larger int version with some white noise mixed in at each multiplication equal to the amplitude of the quantization noise you would get by using the smaller int format. that may produce equal results for both int versions.
for the float version, you may have issues because if you add a very small float number to a large float number, it is equivalent to not doing any addition at all. the small number gets quantized away completely and doesn't even leave a trace. a very small int number, by contrast, would still make a difference over time, with the quantization noise acting as a dither.
the effects using float and double are extremely small, however the numbers with very low integration rates are also very small. over time, the errors in the float version can build up. this would be most noticeable in very low and very high frequency settings, and with a large amount of memory in the form of resonance/feedback in this case.
these are the same reasons that my filter becomes unstable (well, this is my theory) in both float and double versions, yet works fine even at very low int bit depths.
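To make the "quantized away" effect concrete, here is a minimal sketch in Python. It assumes IEEE 754 single precision, emulated by round-tripping through `struct`, and a hypothetical Q31-style fixed-point scaling of 2**31 counts per unit; neither choice comes from the EQ code discussed in this thread.

```python
import struct

def f32(x: float) -> float:
    """Round a Python float (a double) to the nearest IEEE 754 single."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

big = f32(0.5)
small = f32(1e-8)   # smaller than half the float32 spacing near 0.5 (2**-25)

# float32: the small term is rounded away by the addition.
float_sum = f32(big + small)
print("float32 sum unchanged:", float_sum == big)

# 32-bit fixed point (assumed Q31-style scale): the same addition is exact.
SCALE = 2**31
big_q = SCALE // 2               # represents 0.5
small_q = round(1e-8 * SCALE)    # a few counts
int_sum = big_q + small_q
print("fixed-point change in counts:", int_sum - big_q)
```

This only shows a single addition. Whether the effect is audible in a real filter is exactly what the rest of the thread argues about.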
-
- KVRian
- 770 posts since 2 Apr, 2003
aciddose wrote: for the float version, you may have issues because if you add a very small float number to a large float number, it is equivalent to not doing any addition at all. the small number gets quantized away completely and doesnt even leave a trace, unlike how a very small int number would produce over time a difference through quantization noise by acting as a dither.
Not when the difference between exponents is less than the bit length of the mantissa, which when you're using 64 bit floats is unlikely to be an issue compared with 64 bit ints, and certainly compared to 32 bit ints.
-
- KVRAF
- 1907 posts since 29 Oct, 2003
@aciddose - re: cables
For balanced, I only experimented with 15m cable.
My thoughts: for input, use the "superbal" config and "tighten" it, so it will be in the approx. 4k range, and "tighten" the driver as well. Superbal is down here:
http://www.dself.dsl.pipex.com/ampins/b ... lanced.htm
Also, for the driver (you can get some inspiration from "quasi-floating" on the same page), it's worth trying pseudo current-feedback (like on the "floating" driver, where output AND feedback are taken from after the series resistor, 75E in Self's schematic). I'd also use pulldowns to minimize the LR impact of the cable.
There's also some info from Jensen, but it uses a different approach:
http://www.jensen-transformers.com/an/ingenaes.pdf
This one gives you an idea how to "harden" the output of a single opamp (5532's in this case):
http://www.jensen-transformers.com/as/as056.pdf
-
- KVRist
- 86 posts since 14 Jan, 2007 from around
aciddose wrote: if the smaller int version does have "tighter bass", it might be because the quantization noise in the int version is more balanced, and the representation of numbers is linear.
How come this balanced quantisation noise is better for my bass?
aciddose wrote: for the float version, you may have issues because if you add a very small float number to a large float number, it is equivalent to not doing any addition at all. the small number gets quantized away completely and doesnt even leave a trace, unlike how a very small int number would produce over time a difference through quantization noise by acting as a dither.
So mixing in float would mean big amplitudes truncate small amplitudes constantly? Signals losing and regaining their "presence", all the way along, hundreds of times per second depending on what other signals they are with? That getting worse with every track you add? And on top of that the big truncation at the end when it all has to be fixed?
How is one supposed to control that?
aciddose wrote: the effects using float and double are extremely small, however the numbers with very low integration rates are also very small. over time, the errors in the float version can build up. this would be most noticeable in very low and very high frequency settings, and with a large amount of memory in the form of resonance/feedback in this case.
Why would the errors build up on high and low freqs? What does the amount of memory have to do with all that?
thorKz
-
- KVRist
- 86 posts since 14 Jan, 2007 from around
Oh. And what is a number's integration rate?
And all those effects that are "far below the hearing level", as everybody calls them, are not somewhere hidden far behind the signal, they are right there on the surface of the signal! Wanted to say that for a long time.
thorKz
- KVRAF
- 12615 posts since 7 Dec, 2004
"Not when the difference between exponents is less than the bit length of the mantissa, which when you're using 64 bit floats is unlikely to be an issue compared with 64 bit ints, and certainly compared to 32 bit ints."
well, when you're using a very low frequency it means you'll be operating with very small numbers. let's take 50 Hz at 96 kHz for example in a simple lossy integrator. the whole calculation will end up as equal to the floating point representation of
exp(-2*pi * 50 / 96000)
the error for a float version (1:8:23), stored as
(sign*(1+mantissa/2^23))*(2^(exponent-127))
with [1][126][8333794], is
0.00000002541812071502541143185416
the maximum error for an int version is
1 / (2^31)
= 0.0000000004656612873077392578125
54 times more error for float here.
for double (1:11:52), stored as
(sign*(1+mantissa/2^52))*(2^(exponent-1023))
with [1][1022][4474171814146206], the error is
4.2972895747497294664630987600103e-18
obviously the error is less than for 32bit int.
however, double has 40 times more error than for 64bit int.
whether an error at -151db vs. -186db matters can be argued i'm sure, however it seems thorkz can hear the difference and as it is cumulative in the case of the filters used here i'm not surprised. remember, over the period of one cycle it becomes -102db, and over two cycles already -96db if the error is summed every cycle. if it is multiplied.. then it requires only 17 cycles to become 100%.
this is for a simple lossy integrator. generally the error will grow much faster than this through multiple stages. i may be using a crazy method to find how fast the error grows, however it does grow and it does start from a level above the lowest level of human hearing.
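The representation error of this one coefficient can be checked directly. The sketch below is an illustration, not the poster's method: it treats the double-precision value as the reference, emulates float32 through `struct`, and assumes a Q31 encoding that rounds to the nearest count with a scale of 2**31 (the post above divides by 0x7FFFFFFF instead). `to_db` is a helper introduced here for the dB figures.

```python
import math
import struct
from fractions import Fraction

def f32(x: float) -> float:
    """Round a double to the nearest IEEE 754 single."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def to_db(x: float) -> float:
    """Amplitude ratio expressed in decibels."""
    return 20.0 * math.log10(x)

# Coefficient from the post: 50 Hz at a 96 kHz sample rate (double as reference).
coeff = math.exp(-2.0 * math.pi * 50.0 / 96000.0)

# float32 (1:8:23): pull the raw fields out of the bit pattern.
coeff_f32 = f32(coeff)
bits = struct.unpack("<I", struct.pack("<f", coeff))[0]
exponent_field = (bits >> 23) & 0xFF
mantissa_field = bits & 0x7FFFFF
err_f32 = abs(Fraction(coeff_f32) - Fraction(coeff))

# Assumed Q31 fixed point: nearest integer count at 2**31 counts per unit.
SCALE = 2**31
coeff_q31 = round(coeff * SCALE)
err_q31 = abs(Fraction(coeff_q31, SCALE) - Fraction(coeff))

print("float32 fields:", exponent_field, mantissa_field)
print("float32 error :", float(err_f32), "=", to_db(float(err_f32)), "dB")
print("Q31 error     :", float(err_q31), "=", to_db(float(err_q31)), "dB")
print("one Q31 step  :", to_db(2.0**-31), "dB")
```

These are errors in storing a single coefficient. How they propagate through a filter over many samples depends on the filter structure and is not something this sketch measures.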
- KVRAF
- 12615 posts since 7 Dec, 2004
my "maximum error for int" is a bit crazy. the actual error for that particular number (to be fair, counting it rather than the minimum representable number) is
exp(-2*pi * 50 / 96000) - (2140467509 / 0x7FFFFFFF)
0.0000000002739325828206115792938
which.. wait.. is odd since it is less than
0.0000000004656612873077392578125
what am i doing wrong?
ah, or is it that that particular number is closer to a whole 32 bit int fraction than the smallest space between two ints. so in fact, it is more fair to use the smallest space between two ints in this case since it is the worst case scenario.
so, in this particular case, with these particular numbers float is 108 times worse than int in terms of the scale for error. in different situations you'll get different results and i'm sure float can in many cases be more accurate than int. (er.. actually if we're dealing with numbers in a reasonable range like we'd see in filters, int i think will always be significantly more accurate)
when you get into high frequencies you have different issues. at very high frequencies like near nyquist you'll get a "wobble" in the amplitude (a beat frequency) of a frequency which is a whole fraction of the sample rate. (/2, /3, /4, etc)
since the beat frequency can be very low, you'll run into the same problem as with the very low frequency example i've already posted. the "beats" will get phase-synced and you'll end up with the frequency locking against whole fractions of whole fractions as it moves around. this doesn't sound nice, even if you can't really pick it out when listening other than "it feels better" when it isn't happening. most important, it can cause all hell to break loose in a function which requires that this doesn't happen, like many filters.
this is the reason many filters will break down at high or low frequencies in float, yet remain stable in int.
- KVRAF
- 12615 posts since 7 Dec, 2004
thorkz, i'm sorry that i really cant explain how this takes effect in an integrator. i'll give a simple example but i do not think you will understand.
an integrator is when we take a number/buffer "memory" and add to that with a scaled input signal every sample.
buffer = buffer + input * rate
the rate is the integration rate. if we wanted to go from 0 to 1 in 100 samples, we would use rate = 1/100
a simple lowpass filter is called a lossy integrator. it is an integrator, only rather than scaling the input signal, we scale the difference between the input signal and the memory
buffer = buffer + (input - buffer) * rate
so, if you think about this, we're adjusting the amplitude of the changes in the waveform. so if we decrease the changes, it will make the waveform change more slowly, a lowpass filter.
filters are created usually by combinations of integrators, lossy integrators and other stages.
so, like you asked:
"So mixing in float would mean big amplitudes truncate small amplitudes constantly?"
yes, this is exactly what happens. for normal things like mixing, the error is so small that it does not make a difference. in integrators, like used in filters it does make a difference over time.
if you're using a "scale" (the rate above) which is very small, and the "memory" is large, that means that the scaled term might be ignored completely and no change will happen.
if we are doing
(input - memory) * scale
what will happen is that in order for the result to be more than zero, the (input - memory) difference must become big enough. so we will get many samples where there is no change, then suddenly a big change will happen.
if we were using int, the small changes would happen constantly and we would not have an issue.
what i'm describing does not normally happen. rather than having nothing happen (no change), in a real filter it will change with a slight error. either too much, or not enough. over time this will mean that sometimes the "memory" is too high and sometimes too low. what this will lead to is an error signal appearing. this can be noise, or it can be specific frequencies depending upon exactly how the filter works.
this error signal is also too small to matter much in most filters, however just like the integration error was cumulative, the error signals can also be cumulative. if you have a filter with many stages the error can be allowed to build up to a large level which eventually becomes something you can hear.
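The lossy integrator above, `buffer = buffer + (input - buffer) * rate`, can be run in both number formats to see where the updates stop having any effect. This is a toy sketch under stated assumptions: float32 is emulated by rounding after every operation, the fixed-point version uses a hypothetical Q31 scale with a truncating right shift, and `RATE` and `STEPS` are arbitrary values chosen for illustration. It is not the filter from the EQ challenge.

```python
import struct

def f32(x: float) -> float:
    """Round a double to the nearest IEEE 754 single."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def step_f32(buffer: float, inp: float, rate: float) -> float:
    """One lossy-integrator update, rounded to float32 after each operation."""
    diff = f32(inp - buffer)
    return f32(buffer + f32(diff * rate))

def step_q31(buffer: int, inp: int, rate_q: int) -> int:
    """Same update in Q31 fixed point; the product is truncated by the shift."""
    return buffer + (((inp - buffer) * rate_q) >> 31)

RATE = 1e-3       # hypothetical integration rate
STEPS = 100_000   # long enough for both versions to settle

# float32 version, driven by a constant input of 1.0
rate_f = f32(RATE)
buf_f = f32(0.0)
for _ in range(STEPS):
    buf_f = step_f32(buf_f, 1.0, rate_f)

# Q31 version, driven by the largest positive value (just under 1.0)
SCALE = 2**31
target_q = SCALE - 1
rate_q = round(RATE * SCALE)
buf_q = 0
for _ in range(STEPS):
    buf_q = step_q31(buf_q, target_q, rate_q)

residual_f32 = 1.0 - buf_f
residual_q31 = (target_q - buf_q) / SCALE
stalled_f32 = step_f32(buf_f, 1.0, rate_f) == buf_f
stalled_q31 = step_q31(buf_q, target_q, rate_q) == buf_q

print("float32: stalled =", stalled_f32, " distance from input =", residual_f32)
print("Q31    : stalled =", stalled_q31, " distance from input =", residual_q31)
```

In this particular setup both versions stop short of the input once the scaled difference drops below what the format can add to the buffer, and the Q31 version stops closer. With a different signal level, rate, or rounding mode the picture can change, so treat the numbers as one data point.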
- KVRAF
- 12615 posts since 7 Dec, 2004
and.. the difference in using int.
float numbers are exponential. very small numbers have finer absolute resolution than very large numbers. this is why when you add a very small number to a very large number, the small number might be ignored: such a small number doesn't fit between two of the larger numbers.
ints are linear, meaning that you can always add any two numbers and, as long as the result doesn't overflow, it will always be perfect. there will never be a rounding error in addition or subtraction.
because of this difference, error signals in floating point will have a bias. depending upon how the filter/function works, the resulting numbers can be too small or too large. over time this will cause the numbers to "creep" in one direction and the error will be allowed to build up.
in int, errors are also possible. with int we do not suffer from errors caused by a non-linear representation, but we do suffer from quantization error which can take effect during scaling, (multiplication and division). the distribution of these errors however will not be biased into one direction. instead, the error will usually be noise, often close to white noise. this means that over time the error will tend to cancel itself out and will be unable to "creep" like in float code.
-
- KVRian
- 770 posts since 2 Apr, 2003
I'm rather confused by your calculation for maximum error, especially with respect to your previous statement regarding adding small values to large ones, in which case it is the difference in magnitude between the two values being added which matters.
If you're not talking about that and purely about the error in your coefficient, then again your calculations seem wrong. You can simply approach fixed point as the same as floating point with a fixed exponent. A floating point coefficient always has a precision of 24 bits, giving it at the worst case 256 times more error than a 32 bit int. However at best case its error is hundreds of bits better, it all depends on the range you set your int to represent.
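The "floating point with a fixed exponent" view can be put into numbers by comparing the gap between adjacent values in each format. The sketch below assumes float32 with a 24-bit significand (normal range only, subnormals ignored) and a Q31 fixed-point format with a constant step of 2**-31; `f32_spacing` is a helper introduced here for illustration.

```python
import math

def f32_spacing(x: float) -> float:
    """Gap between adjacent float32 values near x (normal range only)."""
    _, e = math.frexp(abs(x))   # abs(x) = m * 2**e with 0.5 <= m < 1
    return 2.0 ** (e - 24)      # 24-bit significand

Q31_STEP = 2.0 ** -31           # constant gap for the assumed Q31 format

for x in (1.0, 0.75, 2.0**-8, 1e-6):
    ratio = f32_spacing(x) / Q31_STEP
    print(f"x = {x:<12g} float32 gap / Q31 gap = {ratio:g}")
```

At magnitudes of 1.0 and above the float32 gap is 256 times the Q31 step, matching the worst case quoted above. The two formats are equally fine for magnitudes around 2**-8, and below that float32 is finer.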
-
Christian Budde
- KVRAF
- Topic Starter
- 1538 posts since 14 May, 2004 from Europe
Hi there,
I had some similar thoughts about that. But the error should be too small to be audible. I mean we are talking about an error of -151dB. Even if it would be much higher, you still have to apply a nearly fullscale signal to have the advantage for Ints. This would additionally mask the error I think.
However, one thing I could do is to repeat the SNR measurement for a very low (or very high) frequency input signal. Right now I only checked 1 kHz. Let's see if it unveils something...
Christian
- KVRAF
- 12615 posts since 7 Dec, 2004
i'm only talking about one range, -1.0 to 1.0. in float, we throw away a single bit, half the exponent range goes unused.
it is a lot to take in and i'm not surprised if it is confusing, this is coming off the top of my head and i'm not the best writer.
as far as i can see however, my calculation for error above is correct. the numbers i've used represent the difference in magnitude, i've just stripped away the redundant or useless parts of the function.
the difference in magnitude is exactly what i'm talking about in my above posts. for very low frequencies, you'll be attempting to integrate very small numbers. i came up with an example which is partly worst case and partly best case (on both sides, float and int) and as far as i can see, equal and fair. it demonstrates that the average error level in float for very low frequencies in an integrator will be much higher than for int.
if you're sticking to a single range like -1.0 ... +1.0, it isnt possible for float to be more accurate than a 32 bit int. in fact, it isnt ever possible in any circumstance for it to be more accurate. it can only be equally accurate in special circumstances, like if you were adding a very small number to a very small number. while that is more accurate for float while you continue working with only very small numbers, immediately when you mix small and large numbers float becomes far less accurate. if you take into consideration real situations (like working in a constantly moving range of numbers between minimum and maximum, small and large, large and small) rather than extreme cases (like adding two tiny numbers) it should be obvious that float is never more accurate than int.
Christian;
i know it is very small, i addressed that. that level which i calculated remember is only the error introduced in a single sample step. we're making 96000 steps per second and the error will add up over time. it is too small to matter in simple situations like mixing two signals, however in situations where you have complex memories and interactions between linear and non-linear elements the error can grow many times faster. sort of like summation vs. multiplication vs. exponentiation. it depends upon the context. many filters are on the "like exponentiation" side.
most importantly i pointed out that while the floating point error will accumulate quite happily, int quantization error will tend to cancel out over time. so we're not just talking about one error level vs. another here, we're talking about a considerable fundamental difference between the two systems.
-
- KVRian
- 770 posts since 2 Apr, 2003
ironically, for the value you chose, given that you are using the int to represent a range of +/- 1 (which can be deduced from your use of 2^31), it actually has a higher error than the float representation.
-
Christian Budde
- KVRAF
- Topic Starter
- 1538 posts since 14 May, 2004 from Europe
Btw, the SNR for the 24-bit version was only -79 dB, which may be the reason why thorKz heard that difference.
However, if thorKz is still around, could you please tell us what settings you used for the EQ? Did you use some extreme settings? Or only peak EQs? The type of filter matters at least, since it can make the 24-bit version even worse, especially if you attenuate and then boost again.
