Parameter input range

DSP, Plugin and Host development discussion.

Post

camsr wrote: Sat Mar 30, 2024 11:26 pm If a plugin dev wanted to change their mapping, for example to extend the range of a filter cutoff or a delay line, how could they do this without breaking previously saved states?
unpopular opinion: If you change a plugin that much, it is simply no longer 100% compatible with the previous version. Plugin developers need discipline to avoid creating this type of dilemma for our customers.
One approach is to release a new plugin that is independent of the old one. Another technique is to have e.g. a switch on the GUI "Classic Mode" that restores the original functionality.
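One way to make the "extend the range" case concrete is to store a state version with the preset and remap old values on load, so old projects keep their sound. Everything below is a hypothetical sketch with made-up names and ranges (a cutoff extended from 20 kHz to 40 kHz, exponential mapping assumed), not any particular plugin API:

```python
# Hypothetical sketch: remapping a saved normalized cutoff when a
# plugin's cutoff range is extended from 20..20000 Hz to 20..40000 Hz.
# All names, ranges, and the version scheme are made up for illustration.
import math

MIN_HZ = 20.0
OLD_MAX_HZ = 20000.0
NEW_MAX_HZ = 40000.0

def old_norm_to_hz(x):
    # exponential (pitch-linear) mapping over the old range
    return MIN_HZ * (OLD_MAX_HZ / MIN_HZ) ** x

def hz_to_new_norm(hz):
    return math.log(hz / MIN_HZ) / math.log(NEW_MAX_HZ / MIN_HZ)

def load_state(version, normalized_cutoff):
    """Convert a saved normalized value to the current mapping."""
    if version < 2:
        # old state: reinterpret the value so the audible cutoff is unchanged
        return hz_to_new_norm(old_norm_to_hz(normalized_cutoff))
    return normalized_cutoff
```

A v1 state saved at full cutoff (20 kHz) then loads as roughly 0.909 under the new mapping, so it still sounds like 20 kHz instead of jumping to 40 kHz.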

Post

camsr wrote: Mon Apr 01, 2024 10:04 pm I want to argue more about the established 0.f - 1.f range as being sub-optimal in it's usage of available floating point bits, but I would need to do some example calculations first to support my statements. From memory, this range (in SP FP) is utilizing 24-bits, and modulating only 30 (or 31) bits in total. This is something to consider when the normalized range is scaled.
For single-precision float (assuming IEEE 754), the mantissa (or significand) uses 23 bits and the exponent uses 8 bits. Additionally, there is 1 bit reserved for the sign of the number, making a total of 32 bits used to represent a float.
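For readers who want to poke at the layout directly, here is a small sketch (Python's struct assumed) that splits a single-precision value into the three fields described above:

```python
# Pull apart the IEEE 754 single-precision layout:
# 1 sign bit, 8 exponent bits, 23 stored mantissa bits.
import struct

def float_bits(x):
    (u,) = struct.unpack("<I", struct.pack("<f", x))
    sign = u >> 31
    exponent = (u >> 23) & 0xFF       # biased by 127
    mantissa = u & 0x7FFFFF           # 23 stored bits; leading 1 is implicit
    return sign, exponent, mantissa

# 1.0f is stored as sign 0, biased exponent 127, mantissa 0:
# the implicit leading 1 supplies the whole value.
print(float_bits(1.0))
```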

So you're basically playing with 23 bits for the 0 to 1 range. Setting aside automation recorded in the DAW, you have two more limiting factors:

1. You move the UI parameter using the mouse. Your UI API handler has a range (a sensitivity factor) that sets the maximum throw of the mouse, in screen pixels, for a full knob revolution. I think this is around 200 pixels by default in VSTGUI 3, which is nothing really considering today's screen resolutions. I've personally doubled it for a bit more resolution, and I still think it's low. I don't know what other GUI APIs do. JUCE? Vector based? I don't know.

You could set larger values of course, but at some point it becomes annoying, as the user has to keep moving the mouse over a long distance. Or you can allow something like Shift+tweak (i.e. zoom) for fine tuning (setZoomFactor() even allows you to change the range for it), but that's a different thing altogether.

2. You move the parameter using a MIDI controller. Which is even more limiting: 7 bits, 128 values. Pfffff, nothing. Sure, you can do 14-bit, but good luck finding a good controller for that. Most users don't have one.

Of course you can do smoothing internally in double precision, but that's never like having the full resolution tactile at the user's hands. Point is, the 23 bits of float mantissa are not really the limiting factor when you think about it. Except for DAW-originating automation.
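To put numbers on the controller bottleneck, a quick back-of-envelope comparison (just illustrative arithmetic) of 7-bit and 14-bit CC step sizes against the float spacing near 1.0:

```python
# Step sizes of MIDI controllers driving a normalized 0..1 parameter,
# compared with the float32 spacing just below 1.0 (2^-24).
step_7bit = 1.0 / 127.0     # 128 values -> steps of ~0.0079
step_14bit = 1.0 / 16383.0  # 16384 values -> steps of ~0.000061
float_spacing = 2.0 ** -24  # ~6e-8

# Even a 14-bit controller is roughly a thousand times coarser
# than the float format itself.
print(step_14bit / float_spacing)
```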
www.solostuff.net
Advice is heavy. So don’t send it like a mountain.

Post

camsr wrote: Mon Apr 01, 2024 10:04 pm From memory, this range (in SP FP) is utilizing 24-bits, and modulating only 30 (or 31) bits in total.
If you're talking about e.g. modulation of the audio amplitude, the target (output audio) is in the same floating point format, with a similarly capped range.
We are the KVR collective. Resistance is futile. You will be assimilated.
My MusicCalc is served over https!!

Post

S0lo wrote: Tue Apr 02, 2024 1:11 amFor single-precision float (assuming IEEE 754), the mantissa (or significand) uses 23 bits and the exponent uses 8 bits. Additionally, there is 1 bit reserved for the sign of the number, making a total of 32 bits used to represent a float.

So your basically playing with 23 bits for the 0 to 1 range...
Pardon the trivia, but the mantissa is normalized and the leading 1 is omitted, so you have 24 bits of mantissa, not counting sign.

But it works out better than plain 24-bit—overall, at least. Start at 1.0 and work downward: 1.0 to 0.5 is the worst-performing span, with 24 bits of mantissa. From 0.5 to 0.25 you again have 24 bits, but the steps are half the size (resolution doubled). The same happens from 0.25 to 0.125, and so on, each span getting finer. (Well, it wouldn't be complete without saying that when you get super close to 0, you start losing mantissa bits. But that's inconsequential, because the step size there is absurdly small.)
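The per-binade doubling is easy to verify by round-tripping doubles through single precision (a quick sketch using Python's struct; the halfway cases shown assume round-to-nearest-even):

```python
import struct

def f32(x):
    """Round a Python double to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

# Spacing just below 1.0 is 2^-24: one full step is representable,
# but half a step rounds back to 1.0.
assert f32(1.0 - 2**-24) != 1.0   # exactly one step below 1.0
assert f32(1.0 - 2**-25) == 1.0   # half a step: rounds to 1.0

# In [0.25, 0.5) the spacing has halved to 2^-25, so the same
# offset below 0.5 *is* representable there.
assert f32(0.5 - 2**-25) != 0.5
```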

But I agree with your analysis, so this is a trivial point, as I said—anything you set it with would have much less than 24-bits of precision. But I'll add another argument to it:

While we could argue that 32-bit integer is a better fit, the fact that the worst case is 24 bits of precision, effectively getting better and better towards zero, is pretty good. And 24 bits of precision exceeds most physical constraints. For instance, if you digitized a real knob that swept from 0.0 to 1.0 volts, you could not get 24 bits of precision, regardless of the quality of the potentiometer, voltage source, or ADC. Johnson–Nyquist noise alone makes it impossible.
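The thermal-noise claim can be sanity-checked with the Johnson–Nyquist formula v = sqrt(4·kB·T·R·B). The component values below (10 kΩ source, 300 K, 20 kHz bandwidth) are assumed illustrative figures, not from the post:

```python
# Johnson-Nyquist noise of an assumed 10 kOhm source at room temperature
# over an audio bandwidth, versus one 24-bit LSB on a 1 V range.
k_B = 1.380649e-23        # Boltzmann constant, J/K
T, R, B = 300.0, 10e3, 20e3

v_noise = (4 * k_B * T * R * B) ** 0.5   # ~1.8 microvolts RMS
lsb_24bit = 1.0 / 2**24                  # ~60 nanovolts on a 1 V range

# The noise floor is tens of LSBs tall: the last few bits are pure noise.
print(v_noise / lsb_24bit)
```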

So despite 32-bit float being a strange fit in some ways, it has more than enough precision, and since we're probably going to multiply it by another float, it's handy to have it be a float so it doesn't require a conversion.
My audio DSP blog: earlevel.com

Post

camsr wrote: Mon Apr 01, 2024 10:04 pm I want to argue more about the established 0.f - 1.f range as being sub-optimal in it's usage of available floating point bits
Maybe suboptimal in terms of getting the most resolution out of the available bits - but I'd say optimal in terms of convenience. It was a design decision that had to take this trade-off into account, and I happen to think it was the right one. If you need to work with a normalized range, then 0..1 seems the most natural and most convenient choice.

...although other choices are not entirely unheard of. OpenGL, for example, uses -1..+1 for its "normalized device coordinates". According to mystran's analysis, that should give us two more bits of resolution (in the worst cases near -1 and +1) compared to 0..1. To really use the full range, one would have to use something like -3.4*10^38..+3.4*10^38 - now that would be really inconvenient indeed, even though it's "optimal" in your sense.

...different desiderata pull the design decisions in opposing directions, and making informed choices about the compromise that works best in a given situation - that's a big part of what engineering is all about. With VST, Steinberg's engineers made some design decisions that I disagree with - but this is not one of them.
My website: rs-met.com, My presences on: YouTube, GitHub, Facebook

Post

Music Engineer wrote: Tue Apr 02, 2024 9:41 am OpenGL, for example, uses -1..+1 for its "normalized device coordinates". According to mystran's analysis, that should give us two more bits of resolution (in the worst cases near -1 and +1) when compared to 0..1.
Slightly offtopic, but can't resist:

The fact that OpenGL defines clip-space z ranging from -1 to 1 is actually bad for precision in the z-buffer.

Basically, the way the perspective transform normally works, we're essentially storing 1-1/z (well, close enough; this assumes the far plane at infinity, but you actually do that so.. whatever) in the depth buffer, and the near plane is where 1/z=1, so the range we're storing is [0,1]. The mapping is non-linear, so most of the range is very close to the near plane, which in theory is nice if we're using fixed point, but the effect is typically a "bit" (read: totally) too strong, so the precision issues with the z-buffer are practically always far from the camera. With a fixed-point z-buffer, the best you can do is move your near plane slightly further out.. even a small change here will improve precision far away a ton.

With floating point, you actually get better precision overall with what is known as "reverse-z", where you store 1 at the near plane and 0 at the far plane (i.e. store 1/z rather than 1-1/z and invert your depth comparisons; sorry, I forgot the details, but I think that's how it worked). This way the 1/z mapping increases resolution near the near plane, and the better precision of floats near zero increases the resolution towards the far plane, where we lost a bunch due to the non-linear 1/z mapping.

Unfortunately, when we're working with (classic) OpenGL, our clip-space is [-1,1] and not [0,1], so our best floating point precision lands in the middle (of 1/z, so that's something like twice the near-plane distance; the range that always works fine no matter what you do with your z), where it's totally not useful. Then the hardware takes the OpenGL clip-space and remaps it to the conventional [0,1] range to actually compute the z-buffer values (screen-space z is still [0,1])... but we already lost the precision when we multiplied by the projection matrix!

So.. there's actually ARB_clip_control (core in 4.5) to tell OpenGL to use the normal [0,1] range precisely so that you can then use reverse z-buffer to actually get good precision with a floating point z-buffer.
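A toy illustration of why the [0,1] plus reverse-z combination wins (float32 emulated via round-trips; near plane at z=1 and far plane at infinity assumed):

```python
import struct

def f32(x):
    """Round a Python double to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

# Two objects at clearly different (very large) depths:
z1, z2 = 1e8, 2e8

# Conventional depth 1 - 1/z clusters near 1.0, where float32 steps
# are 2^-24: both values collapse onto exactly 1.0.
d1, d2 = f32(1 - 1/z1), f32(1 - 1/z2)
assert d1 == d2 == 1.0            # depth fighting: indistinguishable

# Reverse-z depth 1/z lands near 0.0, where float32 spacing is tiny:
# the two depths stay clearly distinct.
r1, r2 = f32(1/z1), f32(1/z2)
assert r1 != r2
```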

The moral of this story with regards to audio is that when it comes to numerical precision, just randomly making assumptions that something is good or bad is not what you want to do. What you actually want to do is look at the math and see if there is something where you actually end up losing lots of precision... because it's not always obvious exactly where that precision is needed the most.

Post

Interesting insights. I didn't know this. It's always good to learn new things. :tu:
My website: rs-met.com, My presences on: YouTube, GitHub, Facebook

Post

A wise decision would have been to use integers instead of floats: at least 14 bits, for full MIDI compatibility. But we all know that Steinberg doesn't care about 'unimportant' industry standards like MIDI. They cook their own soup.
We do not have a support forum on KVR. Please refer to our official location: https://www.tone2.com/faq.html

Post

Integers for normalized parameters? ...hmmmm... seems inconvenient in the context of an environment where the actual DSP algorithms typically run on floating point numbers. For integers, one would probably choose an unsigned 32-bit format - that seems to be what MIDI 2.0 uses. The uniform resolution throughout the range may be advantageous for some types of parameters but disadvantageous for others - and we would have to do a lot of int-to-float conversions. I think I'd still prefer the float 0..1 convention, but I guess I could live with another convention, too :neutral: :shrug:
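For what it's worth, the int-to-float conversion itself is lossless in this direction, since a double's 53-bit significand holds any 32-bit integer exactly. A tiny sketch (the MIDI 2.0-style full-scale constant is an assumption for illustration):

```python
# Normalize an unsigned 32-bit controller value to 0..1 as a double.
# Every uint32 is exactly representable in a double, so the only
# rounding happens in the final division.
FULL_SCALE = 0xFFFFFFFF  # assumed MIDI 2.0-style 32-bit full scale

def u32_to_norm(value):
    return value / FULL_SCALE

# Endpoints map exactly:
print(u32_to_norm(0), u32_to_norm(FULL_SCALE))
```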
My website: rs-met.com, My presences on: YouTube, GitHub, Facebook

Post

Music Engineer wrote: Tue Apr 02, 2024 4:44 pm Integers for normalized parameters? ...hmmmm....seems to be inconvenient in the context of an environment where the actual DSP algorithms typically run on floating point numbers. For integers, one would probably choose an unsigned 32 bit format - that seems to be what MIDI 2.0 uses. The uniform resolution throughout the range may be advantageous for some types of parameters but disadvantageous for others - and we would have to do a lot of int-to-float conversions. I think, I'd still prefer the float 0..1 convention but I guess, I could live with another convention, too :neutral: :shrug:
Agreed. My post was a little long, I'll recap:

1. 24-bit mantissa is enough for anything—change my mind. (Picture the guy sipping his coffee at the folding table with the statement on a sign—let me know what it won't cover that could possibly be audible.)

2. The calculations we'll use these numbers for are likely floating point.

Sure, we use more bits than needed, but 16 bits is dicey and 24-bit data is awkward, so it's going to be 32 anyway.

Bottom line: Pretty hard to make a compelling case to change existing implementations of normalized from float32 0..1 to something else.
My audio DSP blog: earlevel.com

Post

earlevel wrote: Tue Apr 02, 2024 5:13 pm 1. 24-bit mantissa is enough for anything—change my mind.
Nah. I agree with that. And yes - it's awkward to deal with 24 bits on regular PCs, so the next higher "natural" option would be 32. 16 is also a kind of "natural" word size but that would be a bit too stingy, so 32 it is.
Last edited by Music Engineer on Tue Apr 02, 2024 9:22 pm, edited 3 times in total.
My website: rs-met.com, My presences on: YouTube, GitHub, Facebook

Post

Hi earlevel :)
earlevel wrote: Tue Apr 02, 2024 7:43 am
S0lo wrote: Tue Apr 02, 2024 1:11 amFor single-precision float (assuming IEEE 754), the mantissa (or significand) uses 23 bits and the exponent uses 8 bits. Additionally, there is 1 bit reserved for the sign of the number, making a total of 32 bits used to represent a float.

So your basically playing with 23 bits for the 0 to 1 range...
Pardon the trivia, but the mantissa is normalized and the leading 1 is omitted, so you have 24-bits of mantissa, not counting sign.

But it works out better than plain 24-bit—overall ........
That's why I said:
S0lo wrote: Tue Apr 02, 2024 1:11 am the mantissa (or significand) uses 23 bits
I didn't say the mantissa is 23 bits. Sure, there is 1 bit that is omitted because it's always one. And sure, the mantissa is normalized. References:

https://blog.demofox.org/2017/11/21/flo ... precision/
"32 bit floats use 1 bit for sign, 8 bits for exponent and 23 bits for mantissa. Whatever number is encoded........"

https://learn.microsoft.com/en-us/cpp/c ... w=msvc-170
"Single-precision values with float type have 4 bytes, consisting of a sign bit, an 8-bit excess-127 binary exponent, and a 23-bit mantissa......."

https://users.cs.fiu.edu/~downeyt/cop2400/float.htm
"1 bit for the sign, 8 bits for the exponent, 23 bits for the mantissa
However, since the leading bit in the mantissa is never stored, then there are actually 24 bits for the mantissa. Pretty sneaky."



I'm sure you're already aware of all that. The references are for newcomers reading this thread.
www.solostuff.net
Advice is heavy. So don’t send it like a mountain.

Post

S0lo wrote: Tue Apr 02, 2024 7:41 pm I didn't say mantissa is 23 bits. Sure there is 1 bit that is omitted because it's always one. And sure mantissa is normalized.
Well, in a sense the mantissa is 24 bits only 23 of which are stored (with the first one implied, unless the number is denormal, which we'd rather not exist at all when it comes to audio). In another sense the mantissa is 23 bits, because that's how many bits are actually allocated to store it. Depends on whether you think of values or bit patterns.

Either way, the important part is that a single precision float can store unsigned integers of up to 24 bits exactly.. and for signed we can actually store the equivalent of 25-bit(!) 2s complement integers exactly (because the integer loses one bit to sign and we don't), so really a single-precision float is strictly one bit better than a signed 24-bit fixed point over the whole range (and obviously a whole lot better at lower signal amplitudes).
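The exact-integer claim is easy to check by round-tripping through single precision (sketch using Python's struct):

```python
import struct

def f32(x):
    """Round a Python double to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

# Every integer up to 2^24 fits in single precision exactly...
assert f32(2**24) == 2**24
assert f32(2**24 - 1) == 2**24 - 1

# ...but 2^24 + 1 does not: it rounds back down to 2^24.
assert f32(2**24 + 1) == 2**24

# With the sign bit, the exactly-representable integers span
# -(2^24)..+(2^24), the equivalent of a 25-bit two's-complement range.
assert f32(-(2**24)) == -(2**24)
```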

So.. just in order to be controversial, I'm going to count both the sign bit and the implied leading one as part of the mantissa, so that I can claim it's 25 bits. In doing so I have managed to pack 25+8=33 bits into a 32-bit word, so clearly we have achieved about 3% compression, and if we perform this conversion about 23 times recursively, we can compress any file to less than 50% of its original size. Please discuss.

Post

Can we discuss 32-bit FP DACs instead? :lol:

The sign bit is a special bit that offers a range [-1,1] with double the number of values of [0,1], and it also offers value inversion, which means it can perform an operation implicitly. That would be useful in many mixing operations. Not for reference, just for insight: the FP range [0,1] covers just about 1/4 of all the values an SP FP can take, and with the sign bit that doubles to 1/2 of all possible values. I was thinking before that if the normalized upper limit were instead 8388608.f (23 exponent values up from 1.f), that would offer even more possible values. But now we remember that the mantissa is only 23 bits (or is it really 24?), and its conversion to fixed-point integer is limited by the mantissa WHEN CONVERTING BETWEEN TYPES. What that means is: having 8388608 integer values represented as a float is directly castable to the same 23 (or 24) bit range that the integer is capable of.

I should just check the IEEE 754 applet on the web (I am lazy :hihi:), but I am pretty sure that having the top of the SP FP range at 1 inclusive calls upon the MSB of the exponent, and that means 24 bits without the sign, 25 with. And thank you again, mystran, for your general experience.

So to reiterate what I was saying: having 1/4 or 1/2 of the possible values available, even if most of them may never be mapped, will in some circumstances change the precision of scaling and shifting those values, and in some cases may yield greater result precision.

Post

BertKoor wrote: Tue Apr 02, 2024 7:02 am
camsr wrote: Mon Apr 01, 2024 10:04 pm From memory, this range (in SP FP) is utilizing 24-bits, and modulating only 30 (or 31) bits in total.
If you're talking about eg modulation of the audio amplitude, the target (output audio) is in the same floating point format with a similar capped range as well.
What I meant was: if you only count up to 8388607 in an unsigned 32-bit int, the top 9 MSBs never change in value.
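In other words (trivial check):

```python
# Counting 0..8388607 (= 2^23 - 1) in a 32-bit unsigned int:
# the count never reaches bit 23, so bits 23..31 (the 9 MSBs)
# stay zero the whole time.
top = 8388607                  # 2**23 - 1
assert top >> 23 == 0          # the 9 MSBs are all zero
assert (top + 1) >> 23 == 1    # one more, and bit 23 finally flips
```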
