In a dither over "dither"

earlevel
KVRian
579 posts since 4 Apr, 2010

Post Tue Sep 15, 2020 2:00 pm

Also agree with what mystran said (even if autocorrect wants me to type "minstrel"). Frankly, some of these dither options some of these plugins give seem to be an assertion they understand the problem better than "you" do, even if the options are unlikely to improve anything. Option 1 (low): why? If the signal going through your plugin really does have "natural noise" of sufficient level (it probably will if you recorded it), it's almost certainly Gaussian (thermal noise from mixer, etc.) and will do fine at dithering by itself, but it won't hurt you to have option 2 enabled. If it doesn't, you should go for #2 anyway. Why muddy the water with a suboptimal choice? Anyway, look again at my video. Optimal dither level practically eliminates correlation of the error with the signal, and anything above just increases the noise level. Truncation distortion is an issue of bit thresholds: if you have to get on your 10-foot roof, you need a ladder that's long enough. Too short by much and it won't do the job; way too long is ok, but doesn't improve the task and has drawbacks.
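The "optimal level" point can be checked numerically. The following is a rough sketch, not code from the video: it works in LSB units, assumes a round-to-nearest quantizer, and uses an arbitrary integer hash (`hash32`, a name made up here) as the noise source. `errorVariance` is likewise a hypothetical helper that measures the variance of the total error for a constant input.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Integer hash used as a noise source (an arbitrary choice for this sketch;
// any decent uniform generator would do).
uint32_t hash32(uint32_t x)
{
    x ^= x >> 17; x *= UINT32_C(0xed5ad4bb);
    x ^= x >> 11; x *= UINT32_C(0xac4c1b51);
    x ^= x >> 15; x *= UINT32_C(0x31848bab);
    x ^= x >> 14;
    return x;
}

// Variance (in LSB^2) of the total error round(x + d) - x for a constant
// input x given in LSB, with TPDF dither d spanning +/- gain LSB.
double errorVariance(double xLsb, double gain, int n = 200000)
{
    assert(n > 0);
    double sum = 0.0, sumSq = 0.0;
    for (int i = 0; i < n; ++i) {
        uint32_t h = hash32(static_cast<uint32_t>(i) + 1u);
        // Difference of two 16-bit uniform values gives a triangular PDF.
        double d = gain * (double(h & 0xffff) - double(h >> 16)) / 65535.0;
        double e = static_cast<double>(std::lround(xLsb + d)) - xLsb;
        sum += e;
        sumSq += e * e;
    }
    double mean = sum / n;
    return sumSq / n - mean * mean;
}
```

With `gain` = 1 (2 LSB peak to peak) the variance should come out near 0.25 LSB² whatever the input offset is. With `gain` = 0.5 it swings between roughly 0 and 0.25 depending on the input, which is the noise modulation, and with `gain` = 2 it is constant again but near 0.75, so the only thing the extra level buys is more noise.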

Anyway, maybe you can share how you're truncating the samples.
My audio DSP blog: earlevel.com

Fender19
KVRian
511 posts since 30 Aug, 2012

Post Tue Sep 15, 2020 3:58 pm

earlevel wrote:
Tue Sep 15, 2020 2:00 pm
Anyway, maybe you can share how you're truncating the samples.
I will share more than that!

I created a low level audio clip in my DAW, added dither and exported two times as a 16 bit wave file - one time with my (Mystran's) dither code applied and the other time with Steinberg's "16bit, flat TPDF, no noise shaping" dither applied.

I then imported those two wave files side-by-side into Reaper and applied 60dB of gain to both tracks so we can hear the dither. Screenshot of that here: https://drive.google.com/file/d/1Hi11I9 ... sp=sharing

As you can see in the screenshot the peak level and RMS levels of both dither signals during the intro before the music starts are exactly the same - however note that the CREST factors are different. Not sure how that's possible since crest factor = Peak/RMS - however this shows that SOMETHING is different. Also note that the two tracks start growing in amplitude in different places even though they are the exact same clips at the exact same gain and bit depth. Two different dithers is the only difference.

Now, for the real proof they are not the same - please listen very closely to the following clips:

This is clip #1 with my dither (Mystran's code) applied. Note how the noise level increases as the music fades in - producing a "gritty" sound. https://drive.google.com/file/d/1CeFwbK ... sp=sharing I thought "TPDF" was not supposed to have "noise modulation" like this - so why is the noise level growing with the signal?

This is clip #2 with the Steinberg "flat TPDF, no noise shaping" dither applied. The noise level stays constant as the audio fades in and there is NO grittiness: https://drive.google.com/file/d/1eiF4bH ... sp=sharing

So, how do we explain this? Both dithers are "flat TPDF" yet they SOUND different and even LOOK different.

BTW - I had to "gain up" the mystran dither code by 6dB to get these results.

```cpp
//scale = 1.f / double(1 << 15);
scale = 1.f / double(1 << 14);
```
Before doing that the dither noise was completely gated out before the audio fade in.
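To put numbers on that change, here is a small sketch. The helper name `peakDbfs` is made up for illustration, and it assumes the normalized dither peaks at ±1 before scaling.

```cpp
#include <cassert>
#include <cmath>

// Peak level in dBFS of a normalized [-1, +1] dither after multiplying by
// `scale`. Hypothetical helper for illustration, not part of the plugin.
double peakDbfs(double scale)
{
    assert(scale > 0.0);
    return 20.0 * std::log10(scale);
}

// peakDbfs(1.0 / (1 << 15)) is about -90.3 dBFS (the suggested scale)
// peakDbfs(1.0 / (1 << 14)) is about -84.3 dBFS (the doubled scale)
```

Doubling the scale is a 6.02 dB increase, which matches the "gain up by 6dB" described above.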

mystran
KVRAF
5894 posts since 12 Feb, 2006 from Helsinki, Finland

Post Tue Sep 15, 2020 4:13 pm

earlevel wrote:
Tue Sep 15, 2020 2:00 pm
Frankly, some of these dither options some of these plugins give seem to be an assertion they understand the problem better than "you" do, even if the options are unlikely to improve anything.
There are some legit choices that can be made, like whether you prefer your TPDF to be white or blue and/or whether you want to do some noise shaping (which has its own pros and cons), but the general trend with all of these is that they just move the noise around from one part of the spectrum to another, without really changing the total noise power.
Preferred pronouns would be "it/it" because according to this country, I'm a piece of human trash.

earlevel
KVRian
579 posts since 4 Apr, 2010

Post Tue Sep 15, 2020 4:50 pm

mystran wrote:
Tue Sep 15, 2020 4:13 pm
earlevel wrote:
Tue Sep 15, 2020 2:00 pm
Frankly, some of these dither options some of these plugins give seem to be an assertion they understand the problem better than "you" do, even if the options are unlikely to improve anything.
There are some legit choices that can be made, like whether you prefer your TPDF to be white or blue and/or whether you want to do some noise shaping (which has its own pros and cons), but the general trend with all of these is that they just move the noise around from one part of the spectrum to another, without really changing the total noise power.
Sure, I wasn't going so far as noise shaping. Noise shaping does something real, and fairly profound. (That said, most people would be best served to pick TPDF and forget about it. TPDF will always sound good, and the benefits of noise shaping at 16 bits and up are questionable for music recorded at reasonable levels and dynamics; if not, you probably have more serious problems than your dither. Just my opinion.)

mystran
KVRAF
5894 posts since 12 Feb, 2006 from Helsinki, Finland

Post Tue Sep 15, 2020 6:38 pm

For completeness, here is the complete code I suggested:

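```cpp
#include <cstdint>

// Running sample counter that feeds the hash (assumed to be plug-in state;
// its declaration was not part of the original snippet).
static uint32_t counter = 0;

// Integer hash used as a stateless white-noise source.
uint32_t hash(uint32_t x)
{
    x ^= x >> 17;
    x *= UINT32_C(0xed5ad4bb);
    x ^= x >> 11;
    x *= UINT32_C(0xac4c1b51);
    x ^= x >> 15;
    x *= UINT32_C(0x31848bab);
    x ^= x >> 14;
    return x;
}

// Normalized TPDF dither in [-1, +1]: difference of two 16-bit uniform values.
double dither()
{
    uint32_t x = hash(++counter);
    return (x & 0xffff) * (1. / 0xffff) - (x >> 16) * (1. / 0xffff);
}

...

// 2 LSB peak-to-peak spread at 16 bits
scale = 1. / double(1<<15);

out0 = in0 + scale * dither();
out1 = in1 + scale * dither();
```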
The hash is good enough that it can be split into components like this and 16-bit values should be plenty. Subtraction cancels the DC offset, resulting in normalized TPDF in range [-1,+1] which is then scaled down by 15 bits, resulting in a 2 lsb spread. Feel free to divide the components by 0x10000 rather than 0xffff if having an exclusive range of ]-1,1[ makes you feel better inside for whatever reason; such tiny differences make no practical difference whatsoever.
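A quick way to sanity-check those claims is to measure the generator directly. This is only a sketch: `hash32` is a copy of the hash above under a different name, and `Stats` and `measureDither` are hypothetical helpers. A triangular distribution on [-1, +1] should show a mean near 0 and a variance near 1/6.

```cpp
#include <cassert>
#include <cstdint>

static uint32_t counter = 0;

// Copy of the hash above, renamed so the snippet stands alone.
uint32_t hash32(uint32_t x)
{
    x ^= x >> 17; x *= UINT32_C(0xed5ad4bb);
    x ^= x >> 11; x *= UINT32_C(0xac4c1b51);
    x ^= x >> 15; x *= UINT32_C(0x31848bab);
    x ^= x >> 14;
    return x;
}

// Normalized TPDF dither in [-1, +1], as in the code above.
double dither()
{
    uint32_t x = hash32(++counter);
    return (x & 0xffff) * (1. / 0xffff) - (x >> 16) * (1. / 0xffff);
}

struct Stats { double min, max, mean, variance; };

// Collect basic statistics over n dither samples (illustrative helper).
Stats measureDither(int n)
{
    assert(n > 0);
    Stats s{ 1e9, -1e9, 0.0, 0.0 };
    double sum = 0.0, sumSq = 0.0;
    for (int i = 0; i < n; ++i) {
        double d = dither();
        if (d < s.min) s.min = d;
        if (d > s.max) s.max = d;
        sum += d;
        sumSq += d * d;
    }
    s.mean = sum / n;
    s.variance = sumSq / n - s.mean * s.mean;
    return s;
}
```

Calling `measureDither(200000)` and inspecting the four fields is enough to confirm the range and the absence of a DC offset before any scaling is applied.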

Other than the final scale factor (which is a handy thing to have if you want to target multiple bit-depths), this is what I also used to make the autocorrelation plot (hash TPDF) earlier.

There are approximately 4 valid ways to write a quantizer (so in theory the DAW could be doing various things), but in practice the differences these make in terms of plain dither are completely irrelevant if you're not quantizing yourself. If you want to do noise shaping, then you need the actual quantizer in the loop and in this case you really need to be able to predict bit-for-bit what the DAW is going to produce from your output. Fortunately in practice with 16-bit audio you should usually be able to assume scaling by 0x7fff and rounding to nearest.
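For reference, the "scale by 0x7fff and round to nearest" case might look like the sketch below. The function name `quantize16` and the clamping are assumptions for illustration; what a particular DAW actually does has to be verified against its output.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// One plausible float -> 16-bit conversion: scale by 0x7fff, round to
// nearest, clamp. What a given DAW really does is an assumption here.
int16_t quantize16(double x)
{
    assert(std::isfinite(x));
    long q = std::lround(x * 32767.0);
    if (q > 32767) q = 32767;
    if (q < -32768) q = -32768;
    return static_cast<int16_t>(q);
}
```

With this quantizer anything beyond half an LSB from zero already produces a non-zero code, for example `quantize16(0.6 / 32767.0)` gives 1 and `quantize16(-0.6 / 32767.0)` gives -1.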

One thing to watch out for is that the DAW should be told not to do any sort of auto-bypass or ignoring silence or anything like that, because sometimes "silence" is detected at slightly higher levels than dither alone (although I suppose this is usually more of a problem when dithering to 24 bits). If your framework supports reporting things like tails, make sure they are set to infinite (e.g. at least legacy IPlug defaults to "none" for VST3 and this could potentially cause issues).

earlevel
KVRian
579 posts since 4 Apr, 2010

Post Tue Sep 15, 2020 10:42 pm

Fender19 wrote:
Tue Sep 15, 2020 3:58 pm
earlevel wrote:
Tue Sep 15, 2020 2:00 pm
Anyway, maybe you can share how you're truncating the samples.
I will share more than that!...
OK, obviously something is amiss. The noise shouldn't modulate with the signal, obviously, so I don't know what is going on.

To be clear, I asked how you were truncating—I wasn't sure whether you were actually truncating (I use "truncating" in the sense of reducing the word size—whether it includes rounding or not), or just adding the appropriate dither noise and letting the truncation happen later by whatever means (when the file is bounced to disk, etc.). But if you truncate within your plugin it lends itself to more complete testing. The output of your plugin will accurately convey the sound of the reduced/dithered output. (Ideally with unity gain from your plugin to DAC, but it's not going to make a noticeable difference if we're talking 16-bit.)

Anyway, I gave my suggestions for testing. If you follow them, you'll know at what stage things went wrong. Beats doing the whole process, listening, and wondering what went wrong. :wink:

Fender19
KVRian
511 posts since 30 Aug, 2012

Post Wed Sep 16, 2020 10:45 am

mystran wrote:
Tue Sep 15, 2020 6:38 pm
For completeness, here is the complete code I suggested:
That is the exact code I used in the sound files and screenshots I posted above (with the exception of the scale factor which did not work as written).

You said “it works in FL” but did you export as a 16bit file and then listen to that file at enough gain to hear the dither?

Something in the theory here is not working properly IN PRACTICE. There is something missing/not right.
earlevel wrote:
Tue Sep 15, 2020 10:42 pm
Anyway, maybe you can share how you're truncating the samples.
I am exporting a 32bit float DAW project to a 16bit wave file. The DAW truncates it. It’s a process I’ve used hundreds of times to produce 16bit masters for CDs and, AFAIK, how everyone else does it as well.
earlevel wrote:
Tue Sep 15, 2020 10:42 pm
I gave my suggestions for testing. If you follow them, you'll know at what stage things went wrong.
I have done all of those steps MANY times and see nothing “going wrong”. Levels are correct, noise is flat white, etc., but the SOUND/result is not right.

Please try the code in a DAW as I did here and let’s compare results.

mystran
KVRAF
5894 posts since 12 Feb, 2006 from Helsinki, Finland

Post Wed Sep 16, 2020 11:27 am

Fender19 wrote:
Wed Sep 16, 2020 10:45 am
You said “it works in FL” but did you export as a 16bit file and then listen to that file at enough gain to hear the dither?
Single very low-level low-frequency sine-wave (to sanity check noise modulation) + dither test-plug, export 16-bit -> no dither from FL -> load into Edison -> normalize -> check the waveform looks as expected, dither spread looks as expected -> play into a spectrum analyzer -> check there are no peaks other than the single sine-wave and flat noise floor. And after this I rendered the auto-correlation just to make sure.

Sure, I did also "listen" to it, but the ear isn't particularly great for analyzing stuff like this.

earlevel
KVRian
579 posts since 4 Apr, 2010

Post Wed Sep 16, 2020 12:51 pm

Fender19 wrote:
Wed Sep 16, 2020 10:45 am
earlevel wrote:
Tue Sep 15, 2020 10:42 pm
Anyway, maybe you can share how you're truncating the samples.
I am exporting a 32bit float DAW project to a 16bit wave file. The DAW truncates it. It’s a process I’ve used hundreds of times to produce 16bit masters for CDs and, AFAIK, how everyone else does it as well.
Well, I'm suggesting that if you do it all inside your plugin, including truncation, you can test and verify it all inside your plugin. Also, truncating in the plugin lets you hear the result—otherwise, you're just monitoring signal plus noise you added, and not any effects of the truncation until you've printed—doesn't seem like a good idea for a mastering plugin, so I have doubts everyone does it that way. In fact, without looking I know Ozone reduces the bits internally, it has a bit meter.
earlevel wrote:
Tue Sep 15, 2020 10:42 pm
I gave my suggestions for testing. If you follow them, you'll know at what stage things went wrong.
I have done all of those steps MANY times and see nothing “going wrong”. Levels are correct, noise is flat white, etc., but the SOUND/result is not right.

Please try the code in a DAW as I did here and let’s compare results.
I would, really, if I had time. I understand you're frustrated; I'm just suggesting that you take a step-by-step approach to debugging. There are very few steps, and each is easily verifiable. I'm sure you've done other testing on your own, but I don't know exactly what you've done or how; most of what you're posting here involves doing the whole thing and asking why it isn't working. I'm sure you're just tripping over something small, but I've had no problems writing plugins that dither, and obviously the same goes for mystran, so neither one of us is likely to come up with a common "gotcha" you might be hitting.

I think one of the most fundamental sanity checks is to generate the final dithered and truncated output, subtract the input, and output that. (You can do something similar with DAW routing and inverted summing, but more to go wrong, and if you don't bit-reduce you're not really listening to the right thing.)
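As a rough illustration of that sanity check, a debug path inside the plugin could dither, bit-reduce, and then output only the residual. Everything here is a sketch under assumptions: `processResidual` and `hash32` are made-up names, the dither is the hash-based TPDF posted earlier in the thread, and the quantizer scales by 2^(bits-1) and rounds to nearest.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

static uint32_t counter = 0;

// Integer hash used as the noise source (same constants as the code
// posted earlier in the thread).
uint32_t hash32(uint32_t x)
{
    x ^= x >> 17; x *= UINT32_C(0xed5ad4bb);
    x ^= x >> 11; x *= UINT32_C(0xac4c1b51);
    x ^= x >> 15; x *= UINT32_C(0x31848bab);
    x ^= x >> 14;
    return x;
}

// Normalized TPDF dither in [-1, +1], i.e. +/- 1 LSB once added in LSB units.
double dither()
{
    uint32_t x = hash32(++counter);
    return (x & 0xffff) * (1. / 0xffff) - (x >> 16) * (1. / 0xffff);
}

// Debug path: dither, quantize to `bits` inside the plug-in, then output
// only the residual (quantized minus input). Scaling by 2^(bits-1) and
// round-to-nearest are assumptions of this sketch.
void processResidual(const float* in, float* out, std::size_t n, int bits = 16)
{
    assert(bits >= 2 && bits <= 24);
    const double steps = double(1 << (bits - 1));  // LSBs per unit amplitude
    for (std::size_t i = 0; i < n; ++i) {
        double q = std::round(in[i] * steps + dither()) / steps;
        out[i] = static_cast<float>(q - in[i]);
    }
}
```

If the dither is doing its job, the residual should sound like steady hiss with no trace of the program material, stay within ±1.5 LSB, and average out to zero. Anything that follows the music points at the dither level or the quantizer.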

Fender19
KVRian
511 posts since 30 Aug, 2012

Post Thu Sep 17, 2020 10:12 pm

Well, now this is very odd. I tested the code in Reaper and in Cubase 10 and it works perfectly - just as mystran and earlevel say it should. 16-bit wave export is clean and quantize-free with just a tad of steady hiss (no modulation).

So, it appears my problem - all along - has been in WAVELAB. I have been using the "Monitor 16 Bit Dither" function to audition my dither plugin. It works perfectly with every other dither plugin but does NOT work right for some reason with MY plugin. https://drive.google.com/file/d/1g09xK6 ... sp=sharing

I am once again baffled. My peak meters are showing that the dither signal has a peak amplitude of -90.3dBFS. AFAIK, the "bottom" of 16-bit audio is -90.3dBFS (+/-15bits, 16th bit is a sign bit). So why is that noise signal AUDIBLE (and working) when exported as 16-bit in Reaper or Cubase?

If I export as 16-bit or monitor in 16-bit mode in Wavelab the dither noise is gated/truncated out as I assumed it SHOULD be - and that is what's causing the "gritty signal" I have been talking about. If I disable the dither and just let the unaltered source signal through my plugin it has UNITY gain in Wavelab - so where is the gain error?

So again, what is going on here? Something is off in levels. Maybe my plugin framework interface with Wavelab has issues? I am using iPlug and I HAVE had weird issues with Wavelab in the past. Any ideas?

Anyhow, Mystran's code DOES work. Thank you everyone for your help.

earlevel
KVRian
579 posts since 4 Apr, 2010

Post Thu Sep 17, 2020 11:45 pm

Well that's good news. WaveLab, huh...funny, but when I did my dither demonstration plugin for one of my videos, using IPlug, a friend testing it had issues with using it and exporting from WaveLab. But then again, there were issues in IPlug for VST and VST3, which I subsequently fixed. That was five years ago, I believe he said it worked after I did the fixes.

My experience with IPlug was that Steinberg software is the most unforgiving with VST/VST3. I've never used WaveLab, but Cubase required extra care to satisfy. It seems that they expect things a certain way, but the developer documentation they provide isn't that explicit. Anyway, I got the impression that most other DAW manufacturers give some latitude for variations in VST/3 plugins, while Steinberg expects you to make it work under their hosts.

kryptonaut
KVRian
767 posts since 25 Apr, 2011

Post Fri Sep 18, 2020 12:52 am

It looks to me as though WaveLab might be truncating towards zero (rather than rounding or truncating towards -infinity) when converting from signed float to signed int. I think that would explain why you had to add more noise than expected before seeing any output for a silent input, and why the noise level increases once the signal starts deviating from zero.

You could test this by adding a small DC offset to your signal, and then possibly adjusting the noise back down again to Mystran's suggested level.
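The difference between the three conversions, and why a DC offset would expose it, can be sketched as follows. The function names are invented for illustration and values are in LSB units; nothing here is known about WaveLab's actual code.

```cpp
#include <cassert>
#include <cmath>

// Three ways to turn a scaled sample (in LSB units) into an integer.
long towardZero(double v)   { return static_cast<long>(v); }  // plain C-style cast
long towardNegInf(double v) { return static_cast<long>(std::floor(v)); }
long toNearest(double v)    { return std::lround(v); }

// Number of distinct output codes produced when the input sweeps just
// under +/- 1 LSB (the span of the suggested dither) around `dcLsb`.
int distinctCodes(long (*convert)(double), double dcLsb)
{
    assert(convert != nullptr);
    long lo = convert(dcLsb - 0.999);
    long hi = convert(dcLsb + 0.999);
    return static_cast<int>(hi - lo + 1);
}
```

Around zero, `towardZero` yields a single code (the dither is gated out completely), `towardNegInf` yields two and `toNearest` yields three. With a DC offset of 2 LSB, `towardZero` also yields two codes, because truncation toward zero and truncation toward -infinity agree for positive values.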

kryptonaut
KVRian
767 posts since 25 Apr, 2011

Post Fri Sep 18, 2020 7:36 am

Ok, my curiosity was piqued. I ran an analysis over the two audio files - the manually dithered one (Mystran's code with 2x the recommended noise level), and the Steinberg dithered one.

I counted the number of occurrences of sample values between -5 and +5 'units' in 3 regions of each file - the quiet region at the start, a region where the tone was present, and another where the tone was louder.
The results were as shown:

```
         Manual dither (sections)     Steinberg dither (sections)
Value    Quiet   Medium     Loud      Quiet   Medium     Loud
   -5        0     2154     1667          0     2211     1662
   -4        0     2396     1689          0     2547     1750
   -3        0     2756     1730          0     3029     1837
   -2        0     3048     1837          0     3267     1911
   -1     8192     3302     1897       8602     3374     1863
    0    49171     7268     3952      49027     3596     1973
   +1     8173     3583     1825       8207     3693     1921
   +2        0     3446     1767          0     3724     1824
   +3        0     3064     1688          0     3237     1703
   +4        0     2439     1494          0     2754     1616
   +5        0     2017     1387          0     2229     1448
```
You can see that in the sections where there is signal present (Medium and Loud sections) there are roughly twice as many samples in the '0' bucket as you would expect for the Manual dither, whereas the Steinberg dither has a more reasonable-looking total in this bucket. In the quiet section the stats look the same for both dithers, despite the added noise level being twice what it 'should' be for the manual dither.

So it looks very much as though when WaveLab converts the original manually dithered floats down to 16-bit ints, it puts twice as many values into the '0' bucket as you might expect. This can be accounted for if it is truncating signed values towards 0 rather than rounding to the nearest whole number - so that any value between -0.999...+0.999 becomes 0 (this is what 'C' does by default).

I guess this also means that WaveLab must introduce some low-level crossover distortion when converting any floating point wave to integer format. If anyone has a copy it might be interesting to check that out.
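The doubled '0' bucket is easy to reproduce in a simulation. This is a sketch only: the slow ramp stands in for the tone, `hash32` and `countBuckets` are made-up names, and the noise source is the hash from Mystran's code.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Integer hash used as the noise source (same constants as Mystran's code).
uint32_t hash32(uint32_t x)
{
    x ^= x >> 17; x *= UINT32_C(0xed5ad4bb);
    x ^= x >> 11; x *= UINT32_C(0xac4c1b51);
    x ^= x >> 15; x *= UINT32_C(0x31848bab);
    x ^= x >> 14;
    return x;
}

struct Buckets { long minusOne, zero, plusOne; };

// Sweep a slow ramp from -8 to +8 LSB, add TPDF dither of +/- ditherLsb,
// convert to integer, and count how often codes -1, 0 and +1 come out.
Buckets countBuckets(bool truncateTowardZero, double ditherLsb, int n = 160000)
{
    assert(n > 0);
    Buckets b{0, 0, 0};
    for (int i = 0; i < n; ++i) {
        uint32_t h = hash32(static_cast<uint32_t>(i) + 1u);
        double d = ditherLsb * (double(h & 0xffff) - double(h >> 16)) / 65535.0;
        double v = -8.0 + 16.0 * i / n + d;
        long q = truncateTowardZero ? static_cast<long>(v) : std::lround(v);
        if (q == -1) ++b.minusOne;
        if (q == 0)  ++b.zero;
        if (q == 1)  ++b.plusOne;
    }
    return b;
}
```

With truncation toward zero, the 0 code collects everything in (-1, +1), an interval twice as wide as any other code's, so its count comes out at about double its neighbours. With rounding all three counts are about equal. For a silent input, ±2 LSB TPDF followed by truncation toward zero gives 0 about 75% of the time and ±1 about 12.5% each, the same split as ±1 LSB TPDF with rounding, which is consistent with the two quiet-section columns looking alike.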

mystran
KVRAF
5894 posts since 12 Feb, 2006 from Helsinki, Finland

Post Fri Sep 18, 2020 8:16 am

kryptonaut wrote:
Fri Sep 18, 2020 7:36 am
So it looks very much as though when WaveLab converts the original manually dithered floats down to 16-bit ints, it puts twice as many values into the '0' bucket as you might expect. This can be accounted for if it is truncating signed values towards 0 rather than rounding to the nearest whole number - so that any value between -0.999...+0.999 becomes 0 (this is what 'C' does by default).
Urgh.. Among the 4 or so theoretically valid ways to truncate, this would not be one of them.

That said, since a 32-bit float can represent 16-bit integers (with whatever exponent) exactly, those dither plugins that still work are probably doing the quantization internally and output the final 16-bit values directly, such that the actual rounding doesn't matter. It seems rather weird though if they truncate for no dither, yet correctly round for internal dither (which they certainly must do if the internal dither works).
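The exact-representation point can be demonstrated with a short sketch. The host-side conversions and the scale of 32768 are assumptions for illustration, and the function names are made up.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Plug-in side: an already-quantized 16-bit code written out as a float.
// Dividing by 32768 (a power of two) is exact in float.
float codeToFloat(int code)
{
    assert(code >= -32768 && code <= 32767);
    return static_cast<float>(code) / 32768.0f;
}

// Host side, assumed for illustration: scale by 32768 and convert either by
// truncation toward zero or by rounding to nearest.
int hostTruncate(float x) { return static_cast<int>(x * 32768.0f); }
int hostRound(float x)    { return static_cast<int>(std::lround(x * 32768.0f)); }

// True if both host conversions return every 16-bit code unchanged.
bool allCodesSurvive()
{
    for (int code = -32768; code <= 32767; ++code) {
        float x = codeToFloat(code);
        if (hostTruncate(x) != code || hostRound(x) != code) return false;
    }
    return true;
}
```

This only holds when the plug-in and the host agree on the scale. If the host multiplied by 0x7fff instead, code 1 would arrive as 32767/32768, and a truncating host would still turn it into 0.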

Not sure what you mean by 2x the recommended noise though, as anything other than 2lsb would still be wrong and if you need more to match the internal dither then rounding might not be the only thing broken.

Fender19
KVRian
511 posts since 30 Aug, 2012

Post Fri Sep 18, 2020 9:39 am

kryptonaut wrote:
Fri Sep 18, 2020 12:52 am
It looks to me as though WaveLab might be truncating towards zero (rather than rounding or truncating towards -infinity) when converting from signed float to signed int. I think that would explain why you had to add more noise than expected before seeing any output for a silent input, and why the noise level increases once the signal starts deviating from zero.

You could test this by adding a small DC offset to your signal, and then possibly adjusting the noise back down again to Mystran's suggested level.
Yes, I have tried that. Adding a DC offset or doubling the dither amplitude works (the test results I posted links to above used 2x mystran’s dither amplitude). So it appears what you are saying is what’s happening.

It appears Pro Tools may do the same thing. Their stock dither plugin - flat TPDF with noise shaping off - has a peak level of -84dBFS. My dither plugin (Mystran’s code) at -90.3dBFS peak does not work right there either unless raised by 2x to -84dBFS.

So the solution is to truncate in the plugin and not leave it up to the DAW?

I still have two other confusions about levels:

1) the lowest level a 16bit signed int can capture is -90.3dBFS. So why is the dither signal - at -90.3dBFS PEAK - fully audible in the 16bit wave files exported from Cubase and Reaper?

2) the peak level of a 16bit wave file in the positive excursion is 32767 so why are we scaling the dither signal by 1<<15 (32768) and not (1<<15) - 1 (32767)?
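The arithmetic behind both questions can be checked with a few lines. This is a sketch that assumes the exporter rounds to nearest, as the Reaper and Cubase results suggest; the helper names are made up.

```cpp
#include <cassert>
#include <cmath>

// Fraction of TPDF dither samples (spanning +/- peakLsb) whose magnitude
// exceeds the 0.5 LSB rounding threshold, from the triangular PDF.
double fractionAboveHalfLsb(double peakLsb)
{
    assert(peakLsb > 0.0);
    if (peakLsb <= 0.5) return 0.0;
    double t = 1.0 - 0.5 / peakLsb;  // normalized tail length
    return t * t;                    // both tails together
}

// Level difference in dB between scaling by 1/32767 and 1/32768.
double scaleDifferenceDb()
{
    return 20.0 * std::log10(32768.0 / 32767.0);
}
```

On question 1, a rounding quantizer switches to ±1 as soon as the dithered value passes ±0.5 LSB, so dither peaking at 1 LSB (-90.3 dBFS) still toggles the lowest bit about a quarter of the time even with silent input. On question 2, the two scale factors differ by less than a thousandth of a dB.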
