In a dither over "dither"

DSP, Plugin and Host development discussion.

Post

Just musing here, but thinking back on the analog days, and the early digital days... With analog recording, there was no hard limit at "0 dB" (relative to full scale), you could lay it down hot. When digital came out, there was controversy over what exact value should be considered "0 dB", whether it should allow for +18 dB headroom, for instance. I guess that might be part of why I find the +1.0 paranoia funny. We've already excluded a lot more than that singular value, from the analog perspective. :phones:
My audio DSP blog: earlevel.com

Post

Fender19 wrote: Sat Sep 19, 2020 6:40 pm The correct conversion is:

Code:

integernumber = (1<<15) * floatnumber;
if (floatnumber < 0.) integernumber -= 1;
No. If whatever conversion you are doing involves branches, then it is almost certainly wrong.

The canonical solution is simple:
(int) roundf(x * 32767)

That's equivalent to:
(int) floorf(x * 32767 + 0.5f)

We can bias so that cast is equivalent to floor for the range of interest:
(((int) ((x * 32767) + 32768.5f)) - 32768)

The last one is usually fastest. I checked the usual suspects in Godbolt and only GCC seems able to avoid a library call for the others although realistically this is probably not going to be a performance bottleneck.
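
As a minimal sketch, the three expressions above can be written as small functions and compared. The helper names are hypothetical and not from the thread. One caveat shows up when doing this: on an exact negative tie such as x = -0.5 (which scales to -16383.5), round() goes away from zero while the floor and bias forms round toward positive infinity, so the results differ by one count at that point.

```cpp
#include <cmath>

// Hypothetical helper names; the expressions are the three quoted above.

// Round to nearest, ties away from zero.
inline int convRound(float x) {
    return static_cast<int>(std::round(x * 32767.0f));
}

// Round to nearest via floor, ties toward positive infinity.
inline int convFloor(float x) {
    return static_cast<int>(std::floor(x * 32767.0f + 0.5f));
}

// Adding 32768.5 makes the value positive for x in [-1, 1], where the cast
// (truncation toward zero) behaves like floor; subtracting 32768 removes the bias.
inline int convBias(float x) {
    return static_cast<int>((x * 32767.0f) + 32768.5f) - 32768;
}
```

For example, all three return 32767 for x = 1.0f and 8192 for x = 0.25f, while x = -0.5f gives -16384 from convRound and -16383 from the other two.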
Last edited by mystran on Sat Sep 19, 2020 8:31 pm, edited 1 time in total.

Post

earlevel wrote: Sat Sep 19, 2020 7:11 pm I guess that might be part of why I find the +1.0 paranoia funny.
Personally I simply wish that everyone would do the same thing, whatever it is.

The whole clipping thing is kinda strawman anyway because as soon as you add dither even 0x7fff as a scaling factor will clip at +1.0 and noise shaping could require pretty much arbitrary headroom depending on what sort of filter you want to use.

Post

mystran wrote: Sat Sep 19, 2020 8:23 pm
earlevel wrote: Sat Sep 19, 2020 7:11 pm I guess that might be part of why I find the +1.0 paranoia funny.
Personally I simply wish that everyone would do the same thing, whatever it is.

The whole clipping thing is kinda strawman anyway because as soon as you add dither even 0x7fff as a scaling factor will clip at +1.0 and noise shaping could require pretty much arbitrary headroom depending on what sort of filter you want to use.
I agree it should be consistent. I'll add, though, that one of the fortunate things about floats is that even 32-bit floats can encode digitized values from ADC and to DAC exactly. If you do no processing, you play back exactly the same thing you recorded (without relying on scaling and rounding). Seems a shame to dispense with that for seemingly no advantage.
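
A minimal sketch of that exact round trip, assuming 16-bit samples and a scale factor of 32768 in both directions (the helper names are hypothetical). Because 32768 is a power of two, the divide and the multiply only change the float exponent, so no rounding occurs.

```cpp
#include <cstdint>

// Hypothetical helpers: scale by the power of two 32768 in both directions.
inline float int16ToFloat(std::int16_t v) {
    return static_cast<float>(v) / 32768.0f;
}

inline int floatToInt(float f) {
    return static_cast<int>(f * 32768.0f);
}

// Returns true if every 16-bit value survives int -> float -> int unchanged.
inline bool roundTripIsExact() {
    for (int v = -32768; v <= 32767; ++v) {
        float f = int16ToFloat(static_cast<std::int16_t>(v));
        if (floatToInt(f) != v) return false;
    }
    return true;
}
```

With this scaling, -32768 maps to exactly -1.0f, and +32767 maps to a value just below +1.0f.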

I don't get what you mean by "the whole clipping thing" being a straw man (which is a fallacy). I can think of specific clipping arguments that could be a straw man, but the whole thing...I mean, clipping is what it is, it's just the consequence of a limit. Have a particular clipping argument in mind? Maybe you simply mean clipping is irrelevant to the issue...

Post

earlevel wrote: Sat Sep 19, 2020 7:11 pm Just musing here, but thinking back on the analog days, and the early digital days... With analog recording, there was no hard limit at "0 dB" (relative to full scale), you could lay it down hot. When digital came out, there was controversy over what exact value should be considered "0 dB", whether it should allow for +18 dB headroom, for instance. I guess that might be part of why I find the +1.0 paranoia funny. We've already excluded a lot more than that singular value, from the analog perspective. :phones:
I'm not concerned with being under scale in digital but even 1 sample OVER scale - caused by an incorrect integer calculation - can cause the signal to erratically INVERT. Ever heard THAT sound? It's 10x worse than clipping.
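
A small sketch of how that inversion can arise, assuming the scaled sample is stored in a 16-bit field without clamping (the helper name is hypothetical). A full-scale input of +1.0 multiplied by (1<<15) gives 32768, which is one count past the 16-bit maximum of 32767; keeping only the low 16 bits turns it into -32768.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: keep only the low 16 bits of a wider integer and
// reinterpret them as a signed 16-bit sample, mimicking an unclamped store.
inline std::int16_t wrapTo16(int v) {
    // Conversion to an unsigned type is modular and well defined.
    std::uint16_t bits = static_cast<std::uint16_t>(v);
    std::int16_t out;
    std::memcpy(&out, &bits, sizeof out);  // reinterpret the same 16 bits
    return out;
}
```

So wrapTo16(32767) stays at 32767, but wrapTo16(32768) comes out as -32768: a positive peak flips to negative full scale.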
Last edited by Fender19 on Sat Sep 19, 2020 9:29 pm, edited 1 time in total.

Post

mystran wrote: Sat Sep 19, 2020 8:15 pm No. If whatever conversion you are doing involves branches, then it is almost certainly wrong.
Actually, yes, it's NOT right. (1<<15) * float will overflow and wrap at full scale. Should be:

Code:

outint = (int)((32767) * normfloatvalue);//truncate to 16 bit signed int (+32767/-32768)
if (normfloatvalue < 0.f) outint--;
Added it to my noise shaping code and it cleaned it right up. Not my invention, BTW - it came from MusicDSP. Claims it's faster than "floor".

Post

earlevel wrote: Sat Sep 19, 2020 9:11 pm I don't get what you mean by "the whole clipping thing" being a straw man (which is a fallacy). I can think of specific clipping arguments that could be a straw man, but the whole thing...I mean, clipping is what it is, it's just the consequence of a limit. Have a particular clipping argument in mind? Maybe you simply mean clipping is irrelevant to the issue...
What I mean is that talking about one specific integer value is irrelevant, when correct conversion requires dither (with or without noise shaping) and the dither would clip anyway if you push the source signal all the way to floating point full scale. Without knowing how the signal is going to be dithered (or rather how the noise is shaped) you cannot make judgements on how much it will clip and once you've added dither, you already know the target bitdepth and at that point you can already directly monitor whether or not any samples got clipped in the integer format instead.
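
To illustrate the point, here is a minimal sketch that quantizes with an explicit dither value and reports whether clamping was needed. The helper name, the 32767 scale factor, and passing the dither in as a parameter (in LSBs) are assumptions made for the example; a real implementation would draw the dither from a noise generator.

```cpp
#include <cmath>

// Hypothetical helper: scale to 16-bit, add a dither value given in LSBs,
// round to nearest, and flag whether the result had to be clamped.
inline int quantizeWithDither(float x, float ditherLsb, bool& clipped) {
    float scaled = x * 32767.0f + ditherLsb;
    int q = static_cast<int>(std::floor(scaled + 0.5f));
    clipped = (q > 32767) || (q < -32768);
    if (q > 32767) q = 32767;
    if (q < -32768) q = -32768;
    return q;
}
```

With x = 1.0f and no dither the result is 32767 and nothing clips. With a dither value of +0.75 LSB, which is inside the span of a typical triangular dither, the same input rounds to 32768 and has to be clamped, even though the scale factor was 0x7fff.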

Post

mystran wrote: Sat Sep 19, 2020 10:10 pm
earlevel wrote: Sat Sep 19, 2020 9:11 pm I don't get what you mean by "the whole clipping thing" being a straw man (which is a fallacy). I can think of specific clipping arguments that could be a straw man, but the whole thing...I mean, clipping is what it is, it's just the consequence of a limit. Have a particular clipping argument in mind? Maybe you simply mean clipping is irrelevant to the issue...
What I mean is that talking about one specific integer value is irrelevant, when correct conversion requires dither (with or without noise shaping) and the dither would clip anyway if you push the source signal all the way to floating point full scale. Without knowing how the signal is going to be dithered (or rather how the noise is shaped) you cannot make judgements on how much it will clip and once you've added dither, you already know the target bitdepth and at that point you can already directly monitor whether or not any samples got clipped in the integer format instead.
Sure, I just wanted to make sure we were saying the same thing in that regard. And digital maximums aren't the same as signal maximums anyway. Most people leave some headroom, but if you really care you need to check the output against an oversampled meter.

Post

Fender19 wrote: Sat Sep 19, 2020 9:21 pm Should be:

Code:

outint = (int)((32767) * normfloatvalue);//truncate to 16 bit signed int (+32767/-32768)
if (normfloatvalue < 0.f) outint--;
Added it to my noise shaping code and it cleaned it right up. Not my invention, BTW - it came from MusicDSP. Claims it's faster than "floor".
You still missed the more important point where if you use floor() without 0.5 offset (in order to make it round() instead), your output will have -0.5 lsb worth of DC.
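
The DC offset can be checked numerically. The sketch below (a hypothetical helper, not from the thread) averages the quantization error over an evenly spaced sweep of values, once for plain floor() and once for floor() with the 0.5 offset.

```cpp
#include <cmath>

// Mean quantization error, in LSBs, over a dense and evenly spaced sweep.
// `offset` is the value added before floor(): 0.0 for plain floor,
// 0.5 for round-to-nearest.
inline double meanQuantError(double offset) {
    const int stepsPerLsb = 64;
    const int lsbRange = 8;  // sweep covers [-8, 8) LSBs
    double sum = 0.0;
    int count = 0;
    for (int k = -lsbRange * stepsPerLsb; k < lsbRange * stepsPerLsb; ++k) {
        double y = (k + 0.5) / stepsPerLsb;  // midpoints, so no exact ties
        sum += std::floor(y + offset) - y;
        ++count;
    }
    return sum / count;
}
```

Here meanQuantError(0.0) comes out at -0.5 LSB, while meanQuantError(0.5) comes out at zero.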

In terms of performance, round() and floor() apparently are usually library calls, but as I pointed out you can use truncation without branches if you add a positive bias temporarily. The two adds to bias will almost certainly cost you less than the comparison and the branch mispredictions every time the signal crosses zero. Note that when the signal is zero and there is only dither, every sample will cause a branch misprediction with 50% probability (i.e. it's potentially about 10 times slower to branch than to bias).

Using SIMD intrinsics (with hardware set to the desired rounding mode) is even faster, but obviously makes the code less portable.
Last edited by mystran on Sat Sep 19, 2020 10:30 pm, edited 1 time in total.

Post

Fender19 wrote: Sat Sep 19, 2020 9:21 pm Actually, yes, it's NOT right. (1<<15) * float will overflow and wrap at full scale. Should be:

Code:

outint = (int)((32767) * normfloatvalue);//truncate to 16 bit signed int (+32767/-32768)
if (normfloatvalue < 0.f) outint--;
Er...first, I've given reason why I think 32768 is the right multiplier, but even if you want to go with 32767, then your comment should be +32767/-32767 for +1.0/-1.0. But more importantly, plugins typically don't enforce +1.0/-1.0—EQ plugins can boost the signal level, and if you feed it a full-level signal, the output will be greater than +/-1.0, and it's bad form for the EQ plugin to clip itself, because the next plugin or channel gain might bring it back in line, with no harm done. If plugins clipped everything, that would kill that nice feature of floating point.

So, you should do the gain (32768, 32767, whatever), then something like "if (val > 32767) val = 32767; else if (val < -32768) val = -32768;", or equivalent to clip to range.
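
A minimal sketch of that gain-then-clip order. The helper name and the default scale of 32768 are assumptions, and the rounding step is borrowed from the rounding discussion elsewhere in the thread rather than from this post. The clamp is done in float before the cast, so values far outside the range never reach the integer conversion.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical helper: scale, round to nearest, then clamp to 16-bit range.
inline std::int16_t floatToInt16Clamped(float x, float scale = 32768.0f) {
    float scaled = std::floor(x * scale + 0.5f);
    // Clamp while still in float so the cast below is always in range.
    scaled = std::max(-32768.0f, std::min(32767.0f, scaled));
    return static_cast<std::int16_t>(scaled);
}
```

An input of 2.0f, such as an EQ boost might produce, clamps to 32767 only at this final conversion stage; upstream plugins stay free to pass values beyond +/-1.0.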

And the second line is still bad, sorry. If you don't think so, please run through an example of why you think it does something good.

Post

earlevel wrote: Sat Sep 19, 2020 10:28 pm And the second line is still bad, sorry. If you don't think so, please run through an example of why you think it does something good.
Actually it works, sort of, assuming you wanted floor() and don't care which way the threshold values go. But you really want round(), so you need an offset, and the branch mispredicts at a high rate, so really, don't do it.

Note that branching to clip is fine, as you're not expected to clip at random with probability close to 50%, so these will usually predict just fine, but one should generally avoid branches on the sign bit of audio whenever possible, since these predict really poorly.

Post

mystran wrote: Sat Sep 19, 2020 8:15 pm The canonical solution is simple:
(int) roundf(x * 32767)

That's equivalent to:
(int) floorf(x * 32767 + 0.5f)

We can bias so that cast is equivalent to floor for the range of interest:
(((int) ((x * 32767) + 32768.5f)) - 32768)

The last one is usually fastest. I checked the usual suspects in Godbolt and only GCC seems able to avoid a library call for the others although realistically this is probably not going to be a performance bottleneck.
Nice. Will give these a try - better than "if()" for sure. So it looks like Wavelab is just plain truncating whereas other DAWs are converting float to int using round(), floor() or ceil(), etc.? They are grabbing that 1/2 lsb whereas Wavelab is not.

Post

Why has no one mentioned a driving backbeat or a b9 #11 chord?

Post

mystran wrote: Sat Sep 19, 2020 8:15 pm That's equivalent to:
(int) floorf(x * 32767 + 0.5f)

We can bias so that cast is equivalent to floor for the range of interest:
(((int) ((x * 32767) + 32768.5f)) - 32768)

The last one is usually fastest. I checked the usual suspects in Godbolt and only GCC seems able to avoid a library call for the others although realistically this is probably not going to be a performance bottleneck.
YES! That's it!

I revised my code to use rounding rather than truncating and now it works in Wavelab and Reaper and Cubase and Pro Tools with (1<<15) scale factor (-90.3dBFS peak noise level).

Here's my final result (for testing, not yet optimized):

Code:

void Dither::Process(float inL, float inR, float& outL, float& outR, bool ditheron, bool noiseshapeon)
{
	inL += noiseshapeon * 0.5f * (errorL + errorL - errorLprev);
	inR += noiseshapeon * 0.5f * (errorR + errorR - errorRprev);
	
	float outLI = inL + ditheron * dither();
	float outRI = inR + ditheron * dither();

	outLint = (int)floor(32767 * outLI + 0.5f); // round to nearest 16-bit step (result is not clamped here)
	outRint = (int)floor(32767 * outRI + 0.5f);

	errorLprev = errorL;
	errorRprev = errorR;

	outL = (float)outLint / 32768;
	outR = (float)outRint / 32768;

	errorL = inL - outL;
	errorR = inR - outR;
}
The optional noiseshaping loop is something I picked up from a post in MusicDSP.org. It works really nicely.

IMO the int back to float conversion near the end still isn't quite right (can't get to +1.0) so I will keep looking at that, but it works pretty well as-is.

Anyhow, YES - that 1/2 lsb matters! Thank you Mystran! :tu:
Last edited by Fender19 on Sun Sep 20, 2020 3:42 am, edited 1 time in total.

Post

mystran wrote: Sat Sep 19, 2020 10:32 pm
earlevel wrote: Sat Sep 19, 2020 10:28 pm And the second line is still bad, sorry. If you don't think so, please run through an example of why you think it does something good.
Actually it works, sort of.. assuming you wanted floor() and don't care which way the threshold values go, but you really want a round(), so you need an offset and the branch is mispredict at high rates, so .. like really don't do it.
Well, "works"—I'm just giving the way I think it should work, of course. Rounding is fine, but that's not the issue I'm referring to. Run -1.0 through that, which should be 0x8000 as far as I'm concerned, and it comes out 0xFFFF7FFF (out of 16-bit range) and needs to get clipped to 0xFFFF8000. And there is a one-count gap just below 0. Sorry, I don't see how subtracting 1 is fixing anything.

PS—Oops, I see he was using 32767 instead of 32768, so that's not accurate. But again, I think 32767 is the wrong thing. If you calculate the smallest possible symmetrical square wave, for instance, instead of being 0x0001/0xFFFF to the DAC, it turns into half the expected amplitude, 0x0000/0xFFFF. Anyway, people can do what they want, but I have doubts that CoreAudio drivers, for instance, are doing that scaling to the DAC, it doesn't make much sense to me. I think it's conflating different issues.
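
A quick sketch of that square-wave example, assuming the conversion behaves like floor() (which is what truncate-then-decrement does for non-integer negative values). The helper name is hypothetical.

```cpp
#include <cmath>

// Hypothetical helper: scale by `scale` and take floor(), standing in for
// the truncate-then-decrement snippet discussed above.
inline int scaleAndFloor(float x, float scale) {
    return static_cast<int>(std::floor(x * scale));
}
```

For the smallest symmetrical square wave, +/-(1.0f / 32768.0f), a scale of 32767 gives 0 and -1 (0x0000/0xFFFF), while a scale of 32768 gives +1 and -1 (0x0001/0xFFFF).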
