True Peak Detection

DSP, Plugin and Host development discussion.

Post

Thanks all for the good references, will study them.

If I can stay awake long enough to fix some old code, I just want to bring it into line with some recent ideas that seem to make good sense.

Seems reasonable practice, especially with audio to be encoded to a compressed format, to limit true peak to about -1 dB. Because true peaks can be several dB louder than the peak sample values, maybe the lazy way to do it would be to naively peak-limit at -4 dB or some other guesstimate.

However, it would be better to make a fairly good estimate of the actual peaks, since some music may have much louder intersample peaks than other music.

Post

juha_p wrote:
Aletheian-Alex wrote:If you have not come across this yet, there is a decent amount of relevant info in papers relating to the ITU's BS.1770 standard that might be helpful. Specifically, the "Algorithms to measure audio programme loudness and true-peak audio level" PDF here: https://www.itu.int/rec/R-REC-BS.1770/en. There is inter-sample peak stuff near the end and in the appendix that might be helpful.
http://techblog.izotope.com/2015/08/24/ ... detection/

EDIT: IIRC, here's an implementation for this http://kode54.foobar2000.org/ (foo_r128norm/scan)
Original author here, feel free to ask for clarification.

As to your question: if you're doing this for BS.1770 compliance, unfortunately the spec is quite ambiguous. It only says that you need to use a method for intersample peak detection that is similar or better in performance to the example algorithm, which uses 4x FIR upsampling with 12 taps per phase (48 taps total). In the blog post we have a wav file of the example taps if you want to explore it yourself; it looks like a Remez-exchange-designed linear-phase filter to me. No definition of "similar" or "better" is given, so it's really not clear what you are supposed to do as a meter implementor. Most meters I've looked at significantly outperform the algorithm given in the specification.
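For concreteness, here's roughly the structure that example algorithm implies (a sketch, not production code; the class name and zeroed coefficients are placeholders, the real 48 taps ship with the spec and in the wav file mentioned above):

```cpp
// 4x polyphase FIR interpolation (12 taps per phase, 48 taps total),
// taking the absolute maximum of the oversampled output.
#include <algorithm>
#include <array>
#include <cmath>
#include <cstddef>

class TruePeakMeter {
public:
    // coeffs[p][k] holds prototype tap h[4*k + p]: one 12-tap
    // sub-filter per output phase. Load the spec's 48 taps here.
    std::array<std::array<double, 12>, 4> coeffs{};

    void process(const double* in, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) {
            // Shift the 12-sample history and append the new sample.
            for (std::size_t k = 11; k > 0; --k) history[k] = history[k - 1];
            history[0] = in[i];

            // Each input sample yields 4 oversampled outputs, one per phase.
            for (const auto& phase : coeffs) {
                double acc = 0.0;
                for (std::size_t k = 0; k < 12; ++k) acc += phase[k] * history[k];
                peak = std::max(peak, std::fabs(acc));
            }
        }
    }

    double truePeakDb() const { return 20.0 * std::log10(peak); }

private:
    std::array<double, 12> history{};
    double peak = 0.0;
};
```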

Post

JCJR wrote:Reckon what would be the cheapest, fastest way to get a "good enough" intersample peak estimation for an audio level meter or limiter?

Supposedly at 44.1 or 48 kHz sample rate, oversampling by 4x would get fairly close; then again, getting super accurate would require much higher oversampling.
Hi Jim,

Just a note about something that everyone knows, but that we often overlook in the heat of thinking about things. The reason to oversample is, of course, to get data points that follow the underlying audio more closely (and as someone pointed out, it depends on whether you are trying to recreate the input, or the output and whatever hardware that might entail, if you wanted to be "perfect"); with the highest frequency low relative to the sample rate, there is less wiggle in between. But don't forget that it's the highest frequencies that are the biggest cause of (inter-sample) wiggles, yet they are typically limited if the audio is music or natural sounds (more of a "pink" spectrum).

That is, you probably care most about the peak level being accurate with regard to clipping when the audio (/music) is at its cleanest and most pleasantly listenable. And that's also when you will be far from the worst case as far as potential wiggles at a given sample rate. Transients in such music would typically be the worst case, but they are also where very slight clipping would be unnoticeable (and inter-sample-only clipping is inherently small with modest oversampling).

Nigel
My audio DSP blog: earlevel.com

Post

Thanks, Nigel.

On another forum FabienTDR posted a link to a paper about limiting for FM broadcast, describing cases of overs happening as a consequence of brickwall lowpass filtering. Supposedly the filtering in such things as MP3 encode/decode can also generate overs which were not present in the original source. IOW, it "exposes" hidden intersample peaks not so obvious in the original clean source audio. Dunno much about it.

As you say, lows tend to predominate in music, and lows would ordinarily be expected to have sample peaks very close to the "true peaks".

I was too busy to record for about 15 years but was recently trying to clean up some old recordings. I always tried to mix with a spectrum darker than pink. It seemed to me that even a pink mix tends to be unpleasantly bright, though I read that nowadays lots of music is close to pink. As best I recall, back in the day pink music tended to be such things as aggressive metal, which could get rather dang bright.

So anyway, I had one set of "browner than pink" old songs transferred digitally from DAT at 44.1 kHz into float files, and another group of old songs transferred analog from a PCM-501. In an overabundance of caution, I recorded the analog dubs at 24-bit, 96 kHz into float files. Because fidelity was already compromised, I didn't want it to get any worse because of the dub.

Yesterday I was level-matching all the files to -6 dB peak for mastering experiments. On the DAW master bus I had a sample peak meter and also a true peak meter plugin.

On my "browner than pink" 44.1K songs, there seemed to be about 1 dB difference (or less) with the true peak meter reading slightly higher. The <= 1 dB difference seemed fairly consistent across 11 songs.

On the 15 songs at 96 kHz sample rate (analog-dubbed from 44.1 kHz 16-bit PCM-501), both meters read very close, nearly identical. I didn't pay attention to whether the peak readings were EXACTLY identical on every song; the two meters just seemed close enough in agreement that it shouldn't matter much in the real world.

So the oversampling of the 96 kHz files seemed to help some, though even at 44.1 kHz, a 1 dB undershoot wouldn't be so bad unless you're trying to peak-limit real close to 0 dB.

I don't know how much worse true peak would disagree with sample peak on bright, loud, distorted music. Then again, if the music is already bright, loud, and distorted, maybe a little extra distortion from intersample peaks wouldn't much matter? :)

Post

Interesting thoughts, Jim.

A while back, I was thinking about writing a blog article addressing inter-sample peak claims from the manufacturer of a very expensive converter (one that claims to be able to allow for such peaks). Their article gave the example of the sine wave at quarter sample rate and worst-case phase (1.0, 1.0, -1.0, -1.0). Basically, I just wanted to show more realistic expectations of inter-sample peaks in real music. I didn't consider existing meters, but just planned to use 8x oversampling, which I supposed (an educated guess) would be more than enough. And I suppose 4x would be adequate.
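For what it's worth, that worst case is easy to verify numerically: the repeating (1.0, 1.0, -1.0, -1.0) pattern is exactly a quarter-sample-rate sine of amplitude sqrt(2) sampled at 45 degrees of phase, so every sample reads 0 dBFS while the reconstructed waveform peaks about 3 dB higher. A minimal check (my own sketch, not from the converter article):

```cpp
// The (1, 1, -1, -1) pattern equals sqrt(2)*sin(2*pi*(fs/4)*t + pi/4)
// at integer t: samples all sit at +/-1.0, yet the underlying sine
// peaks at sqrt(2), about +3.01 dBFS.
#include <cmath>
#include <cstdio>

int main() {
    const double pi = 3.14159265358979323846;
    const double amp = std::sqrt(2.0);  // true peak of the underlying sine
    for (int t = 0; t < 4; ++t)
        std::printf("x[%d] = %+.3f\n", t,
                    amp * std::sin(2.0 * pi * 0.25 * t + pi / 4.0));
    std::printf("true peak = %.3f (%+.2f dBFS)\n", amp, 20.0 * std::log10(amp));
}
```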

The bottom line is that I'd say it's pretty doubtful that adjusting sample levels to a 1.0 peak based on a near-perfect meter versus one that used decent 4x oversampling could yield a real difference in sound quality. To start, you'd have to adjust the peak to full scale with the "perfect" one, then have that sustained long enough to be heard, and compare that with the slight (if any) error on the 4x version. I just don't see that as likely.
My audio DSP blog: earlevel.com

Post

Yah, I don't know much about it.

When I've looked at spectrum plots of decoded MP3 songs, the response takes a nosedive somewhere between 10 kHz and 20 kHz. I don't recall looking at the spectrum plot of a low-bitrate MP3 lately; maybe those take a nosedive at lower frequencies.

Dunno if this high-frequency cut would officially be called a brickwall, but it looks rather sudden.

So the Gibbs phenomenon could expose peaks which were previously not there in the source. Though I don't know whether limiting against true peak by itself could always defend against peaks exposed by the Gibbs phenomenon or other effects.

Maybe insignificant, or not. Dunno.

Googling for "gibbs mp3 intersample peak" returns some hits. A few of the comments toward the middle of this thread contains some typical explanations. Maybe mountain == molehill. Or not. :)

https://www.reddit.com/r/audioengineeri ... ver_0dbfs/

It occurred to me that, for a peak limiter (not a meter), if using some kind of short FIR or curve fitter to seek ISPs, maybe it would be safe to entirely ignore ISPs when the raw samples are a few dB below the limiter threshold? Only calculate ISPs on samples near or above the limiter threshold, as in the sketch below. That ought to somewhat minimize CPU impact unless a song is running over-threshold most of the time?
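Something like this is what I have in mind (just a sketch; the parabolic fit is a crude stand-in for a proper short interpolating FIR, and the function and parameter names are mine):

```cpp
// Gating idea: only pay for ISP estimation when a raw sample comes
// within 'marginDb' of the limiter threshold.
#include <algorithm>
#include <cmath>
#include <cstddef>

// Crude ISP estimate: fit a parabola through three samples around
// index i and take the vertex value. Keeps the sketch self-contained.
static double estimateIspAround(const double* x, std::size_t n, std::size_t i) {
    if (i == 0 || i + 1 >= n) return std::fabs(x[i]);
    const double ym = x[i - 1], y0 = x[i], yp = x[i + 1];
    const double denom = ym - 2.0 * y0 + yp;
    if (denom == 0.0) return std::fabs(y0);        // three points in a line
    const double d = 0.5 * (ym - yp) / denom;      // vertex offset in samples
    if (std::fabs(d) > 1.0) return std::fabs(y0);  // vertex outside this window
    return std::fabs(y0 - 0.25 * (ym - yp) * d);   // parabola's peak value
}

double peakWithGatedIsp(const double* x, std::size_t n,
                        double thresholdDb, double marginDb) {
    const double gate = std::pow(10.0, (thresholdDb - marginDb) / 20.0);
    double peak = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double a = std::fabs(x[i]);
        peak = std::max(peak, a);
        if (a >= gate)  // only now bother looking between the samples
            peak = std::max(peak, estimateIspAround(x, n, i));
    }
    return peak;
}
```

In a real limiter the estimate would feed the gain computer rather than a running maximum, but the gating is the point: a margin of 3 to 4 dB covers even the pathological quarter-rate sine case.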

Post

There is no "true peak detection". Such an algorithm does not know if the signal was a square or a sine.

Post

Bad Bone wrote:There is no "true peak detection". Such an algorithm does not know if the signal was a square or a sine.
Why does it matter what the signal once was? We want to know the true peak of the current signal.

Post

I think what Bad Bone wanted to say is that you would need to reconstruct the signal analytically (or with continuous sinc interpolation) and then search for the true maximum. If you know the true expression, you can find the location and the value of this true maximum. Otherwise, it's a shot in the dark.

Post

In the case of high-frequency signals, if properly anti-aliased when computer-generated or recorded, regardless of whether the original signal was a high-frequency square wave, isn't it a sine wave inside the digital audio data? Anything above Nyquist has presumably been removed.

Even when using a high sample rate, 96 kHz or higher, wouldn't most processes brickwall-filter around 20 kHz to keep ultrasonics out of the speakers?

Post

A discrete signal is band-limited by definition. Each sample corresponds to a sinc function of infinite length, centered on the sample position and band-limited to half the sample rate.

https://en.wikipedia.org/wiki/Nyquist%E ... ng_theorem
Wikipedia wrote:The symbol T = 1/fs is customarily used to represent the interval between samples and is called the sample period or sampling interval. And the samples of function x(t) are commonly denoted by x[n] = x(nT) (alternatively "xn" in older signal processing literature), for all integer values of n. The mathematically ideal way to interpolate the sequence involves the use of sinc functions, like those shown in Fig 2. Each sample in the sequence is replaced by a sinc function, centered on the time axis at the original location of the sample, nT, with the amplitude of the sinc function scaled to the sample value, x[n]. Subsequently, the sinc functions are summed into a continuous function.
The resulting signal is the combination of these overlapped sinc functions and is defined exactly.
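To make that concrete, here is a minimal sketch of evaluating the reconstruction the quote describes at an arbitrary point between samples. The finite window (halfWidth, my own parameter) is a practical compromise: the ideal sum runs over every sample, and a bare truncation like this converges slowly unless windowed.

```cpp
// Truncated Whittaker-Shannon interpolation: sum sinc functions
// centered on nearby samples to evaluate x(t) between samples.
#include <cmath>
#include <cstddef>

static double sinc(double x) {
    const double pi = 3.14159265358979323846;
    if (x == 0.0) return 1.0;
    return std::sin(pi * x) / (pi * x);
}

// t is measured in samples: t = 2.5 lands halfway between samples 2 and 3.
double reconstructAt(const double* x, std::size_t n, double t, int halfWidth) {
    const long c = static_cast<long>(std::floor(t));
    double acc = 0.0;
    for (long k = c - halfWidth + 1; k <= c + halfWidth; ++k) {
        if (k < 0 || k >= static_cast<long>(n)) continue;  // outside = silence
        acc += x[k] * sinc(t - static_cast<double>(k));
    }
    return acc;
}
```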

The argument "doesn't know if the signal was A or B" is just plain stupid and demonstrates complete ignorance of the topic.
Free plug-ins for Windows, MacOS and Linux. Xhip Synthesizer v8.0 and Xhip Effects Bundle v6.7.
The coder's credo: We believe our work is neither clever nor difficult; it is done because we thought it would be easy.
Work less; get more done.

Post

Regarding the idea that "eight times oversampling should be enough": it may be in some cases, although this is a very bad way to think about the problem. You should look at it from an alternate perspective.

Consider that the worst-case error in time then becomes 1/8 of a sample in place of 1 sample at the original rate.

The error in amplitude relates to this ratio via some function; sinc is likely involved, but I'm not specifically aware and am too lazy to research it at the moment.
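(For a single sinusoid, at least, the relationship is easy to pin down; this is a back-of-envelope aside, not from any article. A sampled point landing $\Delta t$ away from the true crest of a sine of amplitude $A$ and frequency $f$ reads $A\cos(2\pi f\,\Delta t)$, so the undershoot is

$$\varepsilon = A\left(1 - \cos(2\pi f\,\Delta t)\right).$$

With $N\times$ oversampling the worst miss is half an oversampled interval, $\Delta t = 1/(2Nf_s)$; for $f = f_s/4$ and $N = 8$ that gives $\varepsilon = 1 - \cos(\pi/32) \approx 0.48\%$, an undershoot of roughly 0.04 dB.)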

You must also consider, however, that any interpolating filter used to produce the oversampled signal must be executed at least N times, where N equals the oversampling factor.

I would suggest that you could reach a far more accurate result by running a zero-finding (peak-finding, in this case) algorithm on the very same interpolating filter, in fewer than eight steps.

The article I linked was the result of a quick google search based upon this intuition, and it seems, yes, this has been done ages ago and demonstrated very well by people far more skilled than myself. A rough sketch of the idea follows.
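This sketch reuses the reconstructAt() truncated-sinc evaluator sketched earlier in the thread, and assumes the true crest sits near the biggest raw sample with |x(t)| unimodal in that two-sample window; that holds for typical material but isn't guaranteed in general:

```cpp
// Coarse-then-refine true peak: find the biggest raw sample, then
// golden-section search |x(t)| in the two samples around it. Twenty
// iterations narrow the peak position to ~1e-4 of a sample for about
// 22 interpolator evaluations -- far cheaper than heavy oversampling.
#include <algorithm>
#include <cmath>
#include <cstddef>

double reconstructAt(const double* x, std::size_t n, double t, int halfWidth);
// (definition as in the earlier sketch)

double refinedTruePeak(const double* x, std::size_t n) {
    // Stage 1: locate the biggest raw sample.
    std::size_t best = 0;
    for (std::size_t i = 1; i < n; ++i)
        if (std::fabs(x[i]) > std::fabs(x[best])) best = i;

    // Stage 2: golden-section search for the max of |x(t)| around it.
    const double gr = 0.6180339887498949;  // 1/phi
    double lo = (best > 0) ? best - 1.0 : 0.0;
    double hi = (best + 1 < n) ? best + 1.0 : static_cast<double>(n - 1);
    double a = hi - gr * (hi - lo), b = lo + gr * (hi - lo);
    double fa = std::fabs(reconstructAt(x, n, a, 32));
    double fb = std::fabs(reconstructAt(x, n, b, 32));
    for (int it = 0; it < 20; ++it) {
        if (fa > fb) {
            hi = b; b = a; fb = fa;
            a = hi - gr * (hi - lo);
            fa = std::fabs(reconstructAt(x, n, a, 32));
        } else {
            lo = a; a = b; fa = fb;
            b = lo + gr * (hi - lo);
            fb = std::fabs(reconstructAt(x, n, b, 32));
        }
    }
    return std::max({fa, fb, std::fabs(x[best])});
}
```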
Free plug-ins for Windows, MacOS and Linux. Xhip Synthesizer v8.0 and Xhip Effects Bundle v6.7.
The coder's credo: We believe our work is neither clever nor difficult; it is done because we thought it would be easy.
Work less; get more done.

Post

Furthermore, on the topic of optimization with regard to only performing the interpolation and zero-finding on curves which contain a peak: the article I linked describes this and provides an example implementation using exactly this optimization.
Free plug-ins for Windows, MacOS and Linux. Xhip Synthesizer v8.0 and Xhip Effects Bundle v6.7.
The coder's credo: We believe our work is neither clever nor difficult; it is done because we thought it would be easy.
Work less; get more done.

Post

aciddose wrote:Regarding the idea that "eight times oversampling should be enough": it may be in some cases, although this is a very bad way to think about the problem.
I think I'm the only one who said 8x. A "very bad" way to think about the problem? It was for offline processing, for a video looking at typical inter-sample overs in response to a manufacturer's claim. I could have used 64x or higher and not cared. I simply meant that, from my experience, 8x would be ample given typical musical spectra and typical mastered levels. A "very bad" way to look at it would have been to spend a lot of theoretical time on it :wink:
My audio DSP blog: earlevel.com

Post

Well, I would argue that it is the absolute worst case in terms of efficiency and accuracy, due to being a brute-force method. So take from that what you will.
Free plug-ins for Windows, MacOS and Linux. Xhip Synthesizer v8.0 and Xhip Effects Bundle v6.7.
The coder's credo: We believe our work is neither clever nor difficult; it is done because we thought it would be easy.
Work less; get more done.
