KVR Audio

wminor · Post by **wminor** » Sat Jul 10, 2010 4:45 pm

Hi,

I'm in the process of building a small application which analyses wave files and extracts their amplitude and pitch envelopes.

This is my first attempt at any sort of audio processing code, and I'm using C++.

My main question is about PCM data and volume levels. I have a basic understanding of PCM data, and I understand the concepts of sample rate and bit rate well. However, I am having a little trouble getting my head around exactly how the data in each sample relates to the dB scale. For example, I would like to be able to analyse a file and report its peak volume level. For a 16-bit file, I can easily check each sample and determine which has the greatest amplitude... but I have no idea if this is -0.1 dB or -3 dB. Equally, I would like to know how to, for example, increase the gain of a sample by a set number of dB.

I'm sure this is some very basic stuff, so if anyone could point me in the direction of some good resources that tackle these basics I would be most grateful. I have looked at some books but everything seems to be tackling more advanced things like FFT, which I am not ready to tackle yet.

Thanks for your help.

Borogove · Post by **Borogove** » Sat Jul 10, 2010 6:45 pm

The key thing is that PCM data is linear and the dB scale is logarithmic. In digital audio, we normally equate "full scale" PCM data with 0dB (sometimes referred to as 0dBFS). That is, for 16-bit integer data, a sine wave that touches +32767 at the peak and -32767* at the trough is 0dB. (For floating point we usually use +1.0 and -1.0 as "full scale" for, among other things, convenience in using trig functions.)

A 6dB change is a halving or doubling of amplitude. So if your 16-bit PCM peaks out at +16383, call that -6dB. (Technically it's 6.0206dB, i.e. 20*log10(2), but 6 is good enough for rock'n'roll).

To increase the gain of a sample by X dB, multiply the PCM value by pow( 2.0, X/6.0206 ), which is the same thing as pow( 10.0, X/20.0 ). i.e. gain +6dB means doubling the value of the sample, -6dB means halving it.
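As a rough illustration of the two points above, here is a minimal C++ sketch that reports the peak of a 16-bit buffer in dBFS and applies a gain given in dB. The helper names (`peakAbs16`, `peakDbfs16`, `dbToLinear`, `applyGainDb16`) are made up for this example, and it assumes 32767 is treated as full scale and that boosted samples should clip rather than wrap.

```cpp
#include <cmath>
#include <cstdint>
#include <cstdlib>
#include <limits>
#include <vector>

// Hypothetical helper: largest absolute sample value in a 16-bit buffer.
inline int peakAbs16(const std::vector<std::int16_t>& samples)
{
    int peak = 0;
    for (std::int16_t s : samples) {
        // Widen to int first so that -32768 can be negated safely.
        const int a = std::abs(static_cast<int>(s));
        if (a > peak) { peak = a; }
    }
    return peak;
}

// Hypothetical helper: peak level in dBFS, assuming 32767 is full scale (0 dB).
// A silent buffer has no finite dB value, so it is reported as -infinity.
inline double peakDbfs16(const std::vector<std::int16_t>& samples)
{
    const int peak = peakAbs16(samples);
    if (peak == 0) { return -std::numeric_limits<double>::infinity(); }
    return 20.0 * std::log10(peak / 32767.0);
}

// Hypothetical helper: gain in dB to a linear multiplier, 10^(dB/20).
inline double dbToLinear(double dB)
{
    return std::pow(10.0, dB / 20.0);
}

// Hypothetical helper: apply a gain in dB to 16-bit samples in place.
inline void applyGainDb16(std::vector<std::int16_t>& samples, double dB)
{
    const double gain = dbToLinear(dB);
    for (std::int16_t& s : samples) {
        double v = std::round(s * gain);
        // Clamp so that boosted samples clip instead of wrapping around.
        if (v > 32767.0)  { v = 32767.0; }
        if (v < -32768.0) { v = -32768.0; }
        s = static_cast<std::int16_t>(v);
    }
}
```

With these, a buffer peaking at 16383 reports roughly -6 dB, and `applyGainDb16(buffer, 6.0206)` roughly doubles every sample. A sample of exactly -32768 would report very slightly above 0 dB under the 32767 convention.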

wminor · Post by **wminor** » Sat Jul 10, 2010 8:23 pm

Thanks for the info, that's very helpful.

Just one more question on the same topic...

To 'normalise' PCM data (in the audio technology sense... adding an amount of gain to the whole file so it peaks at 0dB)... what would be the best approach?

I have toyed with the idea of working out the maximum value of the file, then dividing this into, for example, 32767 (assume 16 bit) to get a 'scale factor' of, for example, 1.3.... and then multiplying all samples by this amount to scale them up. Is this correct? The other alternative I can think of is simply subtracting the max value from 32767 and adding that result to each sample... which one of these should I be doing?

Thanks.

antto · Post by **antto** » Sat Jul 10, 2010 8:27 pm

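#include <cmath>

// Convert a linear amplitude to decibels; +1.0 = 0dB.
inline double amp2dB(const double amp)
{
    // Input must be positive. Anything below 1e-10 (which is -200dB)
    // is clamped to -200dB so that log10() never sees zero.
    if (amp < 0.0000000001) { return -200.0; }
    return (20.0 * std::log10(amp));
}

// Convert decibels to a linear amplitude; 0dB = 1.0.
inline double dB2amp(const double dB)
{
    // Same result as pow(10.0, dB * 0.05), i.e. 10^(dB/20),
    // written as exp(dB * ln(10)/20); the constant is ln(10)/20.
    return std::exp(dB * 0.115129254649702195134608473381376825273036956787109375);
}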

1. you can remove the IF statement if you want (log10(0) then gives -infinity instead of -200)..
2. i used exp() since i've heard it is a little bit faster than pow()
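For reference, the long constant passed to exp() is just ln(10)/20. A short derivation of why the exp() form and the pow() form agree:

$$
10^{\mathrm{dB}/20} = \left(e^{\ln 10}\right)^{\mathrm{dB}/20} = e^{\,\mathrm{dB}\cdot\ln(10)/20}, \qquad \frac{\ln 10}{20} \approx 0.1151292546
$$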

antto · Post by **antto** » Sat Jul 10, 2010 8:36 pm

so if i understand correctly, you have some audio in integer format and you wanna "normalize" it to 0dB?
that's easy, you can even use the dB converting formulas i posted

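0. (init) double max = -200.0; double tmp; (start from the amp2dB floor, not 0.0: with samples in -1.0..+1.0 every peak is at or below 0dB, so a max that starts at 0.0 would never update)
1. convert your integer audio to floating point, scaled so that full scale is +/-1.0 (i would use double, but whatever)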
2. for every sample do:

tmp = amp2dB(fabs(x));
max = (tmp > max ? tmp : max); // store the highest dB peak..
3. at the end, when you have processed the whole audio - max gives the maximum peak level in dB, so to normalize to 0dB you can do this:
0. (precalculate) double scale = dB2amp(max * -1.0);
1. multiply each sample by "scale"
2. convert back to integer if you want..
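The steps above could be put together roughly like this. It is only a sketch under a few assumptions: the function name `normalizeViaDb` is made up, 32767 is taken as full scale when converting to and from floating point, the peak tracker starts from a -200 dB floor, and a silent buffer is returned unchanged.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: normalise 16-bit samples so the peak sits at 0 dB by
// converting to double, finding the peak in dB, scaling, and converting back.
inline std::vector<std::int16_t> normalizeViaDb(const std::vector<std::int16_t>& in)
{
    // 1. Convert to floating point so that 32767 maps to 1.0 (assumed convention).
    std::vector<double> x(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) { x[i] = in[i] / 32767.0; }

    // 2. Track the highest peak in dB, starting from the -200 dB floor.
    double maxDb = -200.0;
    for (double v : x) {
        const double a = std::fabs(v);
        const double db = (a < 1e-10) ? -200.0 : 20.0 * std::log10(a);
        if (db > maxDb) { maxDb = db; }
    }

    // Silent input: there is nothing to scale, so return it unchanged.
    if (maxDb <= -200.0) { return in; }

    // 3. Linear gain that moves the peak to 0 dB: 10^(-maxDb/20).
    const double scale = std::pow(10.0, -maxDb / 20.0);

    // 4. Scale, then convert back to 16-bit with rounding and clamping.
    std::vector<std::int16_t> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        double v = std::round(x[i] * scale * 32767.0);
        if (v > 32767.0)  { v = 32767.0; }
        if (v < -32768.0) { v = -32768.0; }
        out[i] = static_cast<std::int16_t>(v);
    }
    return out;
}
```

Going through dB and back is not strictly necessary for normalising (dividing by the linear peak gives the same scale factor), but it does leave you with the peak level in dB as a by-product.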

wminor · Post by **wminor** » Mon Jul 12, 2010 8:51 pm

Thanks for your help. I'm not sure I understand 100% but I'll give it a try and if not I'll come back with more questions.

Thanks!

IIRs · Post by **IIRs** » Mon Jul 12, 2010 9:23 pm

wminor wrote:
I have toyed with the idea of working out the maximum value of the file, then dividing this into, for example, 32767 (assume 16 bit) to get a 'scale factor' of, for example, 1.3.... and then multiplying all samples by this amount to scale them up. Is this correct?

Looks ok to me.

wminor wrote:
The other alternative I can think of is simply subtracting the max value from 32767 and adding that result to each sample...

This would just result in DC offset.
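To make the difference between the two ideas concrete, here is a small C++ sketch. All names (`peakAbs`, `normalizeByScale`, `addHeadroom`, `meanValue`) are invented for the example, and it assumes 16-bit samples with 32767 as full scale. The additive version is included only to show the problem: it moves the whole waveform upwards without making it any louder.

```cpp
#include <cmath>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical helper: largest absolute sample value in the buffer.
inline int peakAbs(const std::vector<std::int16_t>& samples)
{
    int peak = 0;
    for (std::int16_t s : samples) {
        const int a = std::abs(static_cast<int>(s));
        if (a > peak) { peak = a; }
    }
    return peak;
}

// Clamp a value to the 16-bit range and convert it.
inline std::int16_t clamp16(double v)
{
    if (v > 32767.0)  { v = 32767.0; }
    if (v < -32768.0) { v = -32768.0; }
    return static_cast<std::int16_t>(v);
}

// Scale-factor normalisation: multiply every sample by 32767 / peak.
inline std::vector<std::int16_t> normalizeByScale(const std::vector<std::int16_t>& in)
{
    const int peak = peakAbs(in);
    if (peak == 0) { return in; }  // silence: nothing to scale
    const double scale = 32767.0 / peak;
    std::vector<std::int16_t> out;
    out.reserve(in.size());
    for (std::int16_t s : in) { out.push_back(clamp16(std::round(s * scale))); }
    return out;
}

// The additive alternative: add (32767 - peak) to every sample.
// Shown only to illustrate why it is the wrong tool.
inline std::vector<std::int16_t> addHeadroom(const std::vector<std::int16_t>& in)
{
    const int offset = 32767 - peakAbs(in);
    std::vector<std::int16_t> out;
    out.reserve(in.size());
    for (std::int16_t s : in) { out.push_back(clamp16(static_cast<double>(s) + offset)); }
    return out;
}

// Mean sample value, used here as a simple measure of DC offset.
inline double meanValue(const std::vector<std::int16_t>& samples)
{
    if (samples.empty()) { return 0.0; }
    double sum = 0.0;
    for (std::int16_t s : samples) { sum += s; }
    return sum / static_cast<double>(samples.size());
}
```

For an input alternating between +1000 and -1000, `normalizeByScale` gives +32767 / -32767 with a mean of 0, while `addHeadroom` gives 32767 / 30767: the swing is still only 2000 wide, and the mean has jumped to 31767.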

Some DSP basics... dealing with PCM data, volume levels...