Calculating "Top of loudness range" ?

DSP, Plugin and Host development discussion.

Post

Wavelab has an option to normalize to "Top of loudness range".
According to the manual this means "the average loudest 3 second audio section".

https://steinberg.help/wavelab_yellowte ... log_r.html

There's a bit of further info in this thread, where PG (the main author of Wavelab, I think) replies:
https://www.steinberg.net/forums/viewto ... 89&t=63049

I'm trying to replicate this in a Reaper script, but I'm having some difficulty understanding how this is actually calculated.

Could anyone shed some light?

My main question for now is the following:

Via a script I can currently easily get the following values from a file:
- program loudness (aka integrated loudness)
- loudness range
- true peak
- maximum short term
- maximum momentary

Do these values suffice to calculate Top of loudness range?
From my current understanding, no; I'd need access to the actual samples and do some further calculations to get Top of loudness range. Am I correct?

Post

Pretty much; it smells like this requires processing the whole audio signal.

Post

Hi No_Use. What time period does your current "max short term" code measure? If not 3 seconds then maybe you could keep the code about the same except changing the time constants?

One way to measure loudness might be to measure the first three seconds, then clear variables and measure the next three seconds, etc. That would be flawed, though, because the loudest three seconds probably straddles a pair of the three-second bins.

I don't know whether it's right or wrong, but I would be inclined to use a continuous process which at every sample has an estimate of how loud it has been over the previous three seconds, up to the current sample location-- much like a specialized compressor envelope detector.

That way you could do a compare after calculating each new sample, saving the loudest reading every time, which should get the absolute loudest contiguous 3 seconds.
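
Here's a minimal Python sketch of that sliding-window idea. To be clear, it uses a plain mean-square level on raw samples, with no K-weighting or gating, so it only illustrates the window-and-compare mechanism, not a real EBU short-term measurement; the function name is just made up for illustration.

```python
# Sketch: per-sample sliding 3-second level with a running max.
# NOTE: plain mean-square, no K-weighting, so this is NOT a true EBU R128
# short-term loudness -- it only shows the windowing and comparing.
from collections import deque
import math

def sliding_max_3s(samples, sample_rate, window_sec=3.0):
    win_len = int(window_sec * sample_rate)
    window = deque(maxlen=win_len)    # squared samples in the last 3 seconds
    running_sum = 0.0                 # sum of those squared samples
    loudest_db = float("-inf")

    for x in samples:
        if len(window) == win_len:
            running_sum -= window[0]  # oldest sample is about to fall out
        sq = x * x
        window.append(sq)
        running_sum += sq

        if len(window) == win_len:    # only compare once the window is full
            level_db = 10.0 * math.log10(running_sum / win_len + 1e-20)
            if level_db > loudest_db:
                loudest_db = level_db

    return loudest_db
```

The same loop is also a natural place to tally each per-sample level into the histogram described below.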

PG mentions percentiles. You could create an array of "value cubbyholes" and count instances which fit in each cubbyhole while measuring the audio as above. Then easily find quartiles and percentiles and such. For instance an array of 960 cubbyhole counters could tally instances at 0.1 dB resolution from 0 to -96 dB. On each new sample calc the 3 second level and then increment the cubbyhole matching the current measurement.

After processing the entire file you could use incidence counts in the array to find percentiles and such to 0.1 dB resolution.
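
A possible shape for those cubbyholes, continuing the Python sketch above (again just an assumed layout: 960 bins of 0.1 dB each, from 0 dB down to -96 dB, with anything outside that range clamped to the nearest bin):

```python
# 960 "cubbyhole" counters: bin 0 = 0 dB, bin 959 = -95.9 dB.
BIN_WIDTH_DB = 0.1
NUM_BINS = 960

histogram = [0] * NUM_BINS

def tally(level_db):
    # Map a level in dB to a bin index and count one occurrence.
    idx = int(round(-level_db / BIN_WIDTH_DB))
    idx = min(max(idx, 0), NUM_BINS - 1)   # clamp out-of-range levels
    histogram[idx] += 1
```

Inside the per-sample loop above, each new 3-second level would be passed to tally(), so after the whole file is processed the array holds the full distribution of 3-second levels.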

Post

Thanks for the replies.

The "max short term" indeed measures over a sliding time window of 3 seconds as specified per R128 (actually EBU Tech 3341 document I think).
I should probably clarify that the loudness measurement itself according to these specs is already covered (including loudness range, short term max, momentary max), so that's not the problem.

The only thing I'm struggling with is calculating (or understanding what it actually means) "Top of loudness range", as this doesn't seem to be covered in the EBU tech docs (it seems like a thing 'invented' by Wavelab).

Some pseudo code could really help I think. :)

Post

Hi No_Use. I read your two links and your messages but probably do not understand the question, sorry.

If it doesn't mean "the one loudest contiguous 3 second interval (max short term loudness)"-- And maybe it doesn't because it says "top of loudness range" or maybe "average top of loudness range".

The author mentions percentiles and quartiles in the second thread, so possibly it could be an average of all short term measurements at or above the 99th percentile, or above the 90th, or maybe the average of all measurements in the top quartile (75th percentile and above). Maybe a flexible procedure would allow the user to select a desired range for "top of loudness range", if in fact that is what the author is getting at.

There are probably much better ways to handle percentiles than the simple brute-force methods I recall.

My earlier example used an array of 960 counters to accumulate the "bell curve distribution", a value histogram with 0.1 dB resolution over the range of -96 dB up to 0 dB. If you only want 1 dB accuracy then an array of about 96 elements should suffice.

For instance, saving data at per-sample density (with huge overlap between measurements, but perhaps that doesn't matter). If the song is 3 minutes long at a 44.1k sample rate, we need to process 3 seconds at the head before we have the first valid short term level. So if we count one measurement per sample from there to the end, we have counted (180 - 3) * 44100 measurements and tallied them "sorted by level" into our histogram array.

The sum of all elements in our histogram array for the 3-minute song should be 7805700 measurements. So if we want to find the approximate 99th percentile, we could add the counts of each histogram bin starting from the top (0 dB) until the sum of the top bins equals total N * 0.01 = 7805700 * 0.01 = 78057. If it so happens we have to add bins from 0 dB down to -11.3 dB before the sum reaches about 78057, then the 99th percentile is near -11.3 dB. If we want the average level of all measurements at or above the 99th percentile, we could do a multiply/add loop from the 0 dB bin down to the -11.3 dB bin to get the total of all those measurements, and then divide by their count (78057) to get the average.

If you wanted the average value of all measurements above 90th percentile, use the same code except search down from the top for 7805700 * 0.10 total measurements. Or for top quartile, do the same thing looking for the divider sum line 7805700 * 0.25.

Of course it would work just as well scanning the histogram from the bottom. In that case the 99th percentile dividing line would be total N * 0.99, rather than total N * 0.01.
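
As rough Python, walking the histogram from the loudest bin downward (the fraction argument would be 0.01 for the 99th percentile and up, 0.10 for the 90th, 0.25 for the top quartile; whether this matches what Wavelab actually averages is still a guess, and the function name is just illustrative):

```python
def top_of_range(histogram, fraction, bin_width_db=0.1):
    """Threshold and average level of the loudest `fraction` of measurements."""
    total = sum(histogram)
    target = total * fraction       # e.g. 7805700 * 0.01 = 78057 in the example

    count = 0
    weighted_sum = 0.0              # sum of (level_db * count) for the average
    threshold_db = 0.0

    for idx, n in enumerate(histogram):   # idx 0 = 0 dB, idx 959 = -95.9 dB
        level_db = -idx * bin_width_db
        count += n
        weighted_sum += n * level_db
        if count >= target:
            threshold_db = level_db       # e.g. -11.3 dB in the example above
            break

    average_db = weighted_sum / count if count else float("-inf")
    return threshold_db, average_db

# top_of_range(histogram, 0.01)  # 99th percentile and above
# top_of_range(histogram, 0.10)  # 90th percentile and above
# top_of_range(histogram, 0.25)  # top quartile
```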

Just wild guesses, for what it's worth.

Post

Hi JCJR,
If it doesn't mean "the one loudest contiguous 3 second interval (max short term loudness)"-- And maybe it doesn't because it says "top of loudness range" or maybe "average top of loudness range".
From my understanding, you're right.
"the one loudest contiguous 3 second interval" would indeed be max short term loudness so "top of loudness range" seems to differ in that it's 1. not a contiguous section used for calculation 2. based on some averaging calculation.

Admittedly the math involved in your further reply is a little over my head currently (I have to read up on percentiles) but it does provide a starting point to go further.

Thanks again.

Post

Hi No_Use. I probably explained the histogram thing too poorly for understanding. That part of statistics is very simple, though there are likely much more sophisticated ways to do it than I recall. I did a bit of simple stats programming in the last decade, but my main experience was back in the early 1970s: scientific stats in Fortran or PL/1 on mainframes and minicomputers, academic work and a bit of work attempting to analyze the effectiveness of local social service / counselling programs. Just sayin', compared to today programming knowledge was primitive back then even among experts, and I was never an expert then or now.

If you would like to try what I described, I could try to find time to make some simple pseudocode or JS example code. It isn't hard, mainly simple array looping. I don't know the languages for Reaper scripts.

Post

The basic idea of percentiles-- For instance, say we have test scores from 1000 students-- Maybe the lowest score was 12 and the highest score was 98. We could do it on paper-- Stack all 1000 tests in a big pile on the desk, sorted from lowest to highest, lowest score on the bottom and highest score on top. 99th percentile means that 99 percent of scores are lower than a person's test score. Top quartile means that 75 percent of scores are below the lowest score in the top quartile.

So with 1000 paper tests stacked in order, the top 10 tests on the stack are the 99th percentile. The 10th test we take off the top of the stack-- that 10th score is the bottom threshold for the 99th percentile. But distributions can be skewed-- not all are gaussian normal distributions. So if the test was really hard and there were only a couple of really smart students, maybe we have one score of 98 and then several scores going down to 77 even among the top 10 scores.

We could compute the arithmetic mean of the 99th percentile by adding those 10 top test scores and dividing the sum by 10.

If you want the 90th percentile and above, do the same but lift off the top 100 tests from the sorted stack of papers. 90 percent of the 1000 students scored lower than those top 100 test papers.
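
In quick Python terms, with made-up scores just to illustrate the stacked-papers arithmetic:

```python
import random

random.seed(0)
test_scores = [random.randint(12, 98) for _ in range(1000)]  # made-up class of 1000

scores = sorted(test_scores)              # lowest on the bottom, highest on top
n = len(scores)

top_1_percent  = scores[int(n * 0.99):]   # the top 10 "papers" on the stack
threshold_99   = top_1_percent[0]         # lowest score that still makes the 99th percentile
mean_99        = sum(top_1_percent) / len(top_1_percent)

top_10_percent = scores[int(n * 0.90):]   # top 100 papers (90th percentile and above)
top_quartile   = scores[int(n * 0.75):]   # top 250 papers (top quartile)
```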

Distributions could affect normalization. The classic, not-uncommon case of "loudest sample normalization" going wrong is a song with a low average level but a short, loud digital glitch somewhere in it. Maybe one lonely 0 dBFS click sample prevents raising the gain of the song at all.

That is statistically similar to a tough school test and only one smart student in the class. The average score might have been 32 but one lonely nerd scored 100, "blowing the curve". A skewed distribution.

Arithmetic mean might be as good as any for normalization, but there are two other common flavors of mean, explained in a Wikipedia article. For instance, the geometric mean tends to give a lower answer than the arithmetic mean on skewed data. So POSSIBLY geometric mean could be better in cases where maybe the 99th percentile cutoff might be -18 dB but there are one or two lonely 0 dB single-sample clicks in the song?
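
A quick made-up comparison of the two means, done on linear amplitudes (a geometric mean of negative dB numbers doesn't make sense directly); the numbers are purely illustrative:

```python
import math

# 99 "quiet" measurements plus one full-scale click, as linear amplitudes.
levels = [0.12] * 99 + [1.0]

arith_mean = sum(levels) / len(levels)                                 # ~0.129
geo_mean   = math.exp(sum(math.log(v) for v in levels) / len(levels))  # ~0.123

# The geometric mean is pulled up less by the single outlier,
# which is the effect described above for skewed distributions.
```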
