KVR Audio

rickowens · Post by **rickowens** » Thu Mar 17, 2016 11:29 am

Hello! In order to quantify the broadband noise of an audio track, how
about implementing an adaptive filter? Does anyone know how can I proceed
(matlab or c ++)? I've seen lots of things around and it seems the only
viable option to estimate the broadband noise and I'd be interested to estimate the hiss.

Thank you in advance.

rickowens · Post by **rickowens** » Tue Apr 05, 2016 11:38 am

anyone has any suggestion?

BertKoor · Post by **BertKoor** » Tue Apr 05, 2016 11:53 am

If this were my project, I'd perform a FFT analysis of the audio. Then determine per frequency bin what the minimum amplitude within that bin is. The result (a set of minimum amplitudes) would be the fingerprint of the background noise. Compare that with a found maximum, and you know what the whole dynamical range is.

rickowens · Post by **rickowens** » Tue Apr 05, 2016 4:54 pm

I had already thought about it.. but does this method works for all genres of music? Or it could create problems? Some genres are mostly white noise..

JCJR · Post by **JCJR** » Tue Apr 05, 2016 6:02 pm

Hi Rick

I didn't earlier respond any further because it sounded like you had done more and better research on the topic than I ever did. Dunno much about it.

One thing I thought, maybe scholarly articles on information theory. Google shannon, information theory, etc.
https://en.wikipedia.org/wiki/Information_theory

Shannon et al seemed to focus a lot on signal entropy, signal-to-noise issues. I did some light searching re communications noise measurements, but most of the quick hits discussed measures where you know about the clean signal before noise was added. Typical method of measuring noise in audio gear is about the same. You inject a clean signal, measure the output, subtract the input signal, and what remains is the noise and distortion.

Which is useless if you haven't a clue about the original signal. You need a single-ended noise measure, with no knowledge of the original signal. Maybe digging deep enough there are zillions of scholarly articles on that, dunno.

That is why I earlier mentioned noise reduction programs. Single-ended noise reduction has to identify the noise in order to remove it. You seem to only want to measure the noise rather than eliminate it, but I figured if there are unusually successful single-ended noise reduction strategies, then the noise-reduction has to estimate and identify the noise in some way.

BertKoor's method is about what I would first try. The other day I ran across this short set of pages, searching for something else entirely--
http://www.iks.rwth-aachen.de/en/resear ... ing-rules/

It lists common approaches for identifying and removing noise. The last weighting rule on the list, psychoacoustical weighting, seems perhaps most-similar to Bert's suggestion.

It also seems simplest-- Assume that big-valued harmonics are signal, and small-valued harmonics are noise, and measure the level of the small-valued harmonics. Maybe average the level of the small-valued harmonics over the duration of the file, and average the level of the big-valued harmonics over the duration of the file. Finally make a simple signal-to-noise calculation.

This would fail miserably if a recording is so noisy that the NOISE is making big-valued harmonics. Which is the condition when single-ended noise reduction is most likely to fail-- When the noise is nearly as loud, or even louder than the signal.

Some of the other weighting rules in that list might be better, dunno. Maybe a less-successful NR method, which doesn't sound so good to the ear for noise reduction, but be somehow superior for simply measuring the noise content.

With harmonics tending to be loud most of the time, maybe there are solid ways to infer how much of that harmonic is noise and how much is signal, statistically speaking over the duration of the file.

Am guessing that intensive-enough searching would find lots of papers on each of those weighting rules in that link's list.

BertKoor · Post by **BertKoor** » Tue Apr 05, 2016 7:57 pm

rickowens wrote:I had already thought about it.. but does this method works for all genres of music? Or it could create problems?

Often the most simple approach works best, especially in the prototyping stage to prove the concept. Only after trying out such a prototype you can try to address flaws and improve it.

So... you thought about this approach and other approaches. That's only theory, but you have to turn that into something that supposedly works. Not nescessarily the best, just a starting point would be good, not?
What have you actually tried so far? What was succesful and what did not work? And why?

rickowens wrote:Some genres are mostly white noise..

Well, if that is what you are trying to prove...

rickowens · Post by **rickowens** » Thu Apr 07, 2016 2:16 pm

BertKoor wrote:
rickowens wrote:I had already thought about it.. but does this method works for all genres of music? Or it could create problems?
Often the most simple approach works best, especially in the prototyping stage to prove the concept. Only after trying out such a prototype you can try to address flaws and improve it.

So... you thought about this approach and other approaches. That's only theory, but you have to turn that into something that supposedly works. Not nescessarily the best, just a starting point would be good, not?
What have you actually tried so far? What was succesful and what did not work? And why?

rickowens wrote:Some genres are mostly white noise..
Well, if that is what you are trying to prove...

Ok! This is what I tried to do in matlab:

Code: Select all

noise=0;
sig=0;
for i=in:fin
        noise=noise+(abs((min(B(:,i)).^2)));
        sig=sig+(abs((max(B(:,i)).^2)));
end
snr=sig/noise

Where B is the frequencies matrix. Is this is exactly what you told me?

This actually works pretty good, except for some values and I can't quite figure out what is wrong..

What I tried to do before was: the Yule-Walker estimation of the autoregressive models using the output variance to quantify the broadband noise.. with poor results.
With the aim of perfecting it, I created a Matlab code that lets me see on a chart as each frequency occurs along the signal, in order to characterize the noise..
After that I read about the adaptive filters but I understand that I need, in addition to dirty signal, also the noise information that you want to go to find.. I could try to give input a white noise vector to go to "stimulate" the filter to then read the output error, but I don't know if it's something that makes sense.

The problem for me is that there are different types of broadband noise and I want to be able to find an approach that is standard for all kinds.

JCJR · Post by **JCJR** » Thu Apr 07, 2016 3:00 pm

One feature which might be useful, as a "practical kludge" to catch some kinds of common noise--

Not only low-level harmonics that are "a little different on every frame", but some noises may be remarkably constant levels thruout a file.

There may be some music tracks, of synth pad drones or whatever, which stay about the same level thruout the file. More likely individual tracks in a mix, rather than final release music. Most music will be constantly changing over time.

For instance hum-- If a file has steady hum then you might measure a very constant level of 60 Hz thruout the song. If music harmonics happen at 60 Hz, the level would occasionally get louder than this "minimum floor" of the hum, but it might stay very steady at the level of the hum, the harmonic never getting lower than some constant threshold.

In recordings from 60 Hz nations, the hum would be 60 Hz, POSSIBLY a little frequency at half, 30 Hz. And possibly multiples of 60 Hz going all the way up into the highs. When hum is hipassed and comes out of the tweeters it is often called buzz, but it is just little high-freq harmonic spikes based on 60 Hz. The high-freq buzz can happen from such as light dimmer interference getting into guitar amps or mixers, picked up by guitar pickups or poorly shielded mics or whatever.

The same artifacts would be based on 50 Hz in "50 Hz nations".

If the recording has been either accidentally or intentionally changed in speed, it might be some other exact number. A steady pattern of harmonics "in the ballpark of 60 Hz" or "in the ballpark of 50 Hz". Many recordings were done using different tape machines different studios, and it might be improbable that many of them were all at "exactly the same speed". And sometimes finished recordings were intentionally sped-up or slowed-down a little for artistic reasons. The artists/producers deciding at the last moment that the song really should have been a little faster than it was recorded. Or in the case of movie sounds or radio/tv advertisements, changed in speed to exactly fit the music to an exact play duration.

Just saying, maybe a feature that notices characteristics that "look like" hum. Relatively constant harmonics thruout the song, all related by a frequency multiple.

Some computer artifacts and machine artifacts can be relatively steady "whines" or "whistles". Or at least they can seem steady in level to the ear. I never tried to analyze them to know if they look so steady in frequency doman.

rickowens · Post by **rickowens** » Thu Apr 07, 2016 4:55 pm

JCJR wrote:One feature which might be useful, as a "practical kludge" to catch some kinds of common noise--

Not only low-level harmonics that are "a little different on every frame", but some noises may be remarkably constant levels thruout a file.

There may be some music tracks, of synth pad drones or whatever, which stay about the same level thruout the file. More likely individual tracks in a mix, rather than final release music. Most music will be constantly changing over time.

For instance hum-- If a file has steady hum then you might measure a very constant level of 60 Hz thruout the song. If music harmonics happen at 60 Hz, the level would occasionally get louder than this "minimum floor" of the hum, but it might stay very steady at the level of the hum, the harmonic never getting lower than some constant threshold.

In recordings from 60 Hz nations, the hum would be 60 Hz, POSSIBLY a little frequency at half, 30 Hz. And possibly multiples of 60 Hz going all the way up into the highs. When hum is hipassed and comes out of the tweeters it is often called buzz, but it is just little high-freq harmonic spikes based on 60 Hz. The high-freq buzz can happen from such as light dimmer interference getting into guitar amps or mixers, picked up by guitar pickups or poorly shielded mics or whatever.

The same artifacts would be based on 50 Hz in "50 Hz nations".

If the recording has been either accidentally or intentionally changed in speed, it might be some other exact number. A steady pattern of harmonics "in the ballpark of 60 Hz" or "in the ballpark of 50 Hz". Many recordings were done using different tape machines different studios, and it might be improbable that many of them were all at "exactly the same speed". And sometimes finished recordings were intentionally sped-up or slowed-down a little for artistic reasons. The artists/producers deciding at the last moment that the song really should have been a little faster than it was recorded. Or in the case of movie sounds or radio/tv advertisements, changed in speed to exactly fit the music to an exact play duration.

Just saying, maybe a feature that notices characteristics that "look like" hum. Relatively constant harmonics thruout the song, all related by a frequency multiple.

Some computer artifacts and machine artifacts can be relatively steady "whines" or "whistles". Or at least they can seem steady in level to the ear. I never tried to analyze them to know if they look so steady in frequency doman.

Thanks a lot, you gave me some important inputs! So do you suggest me to keep the minimum/maximum approach to the snr and then improve my algorithm with other features like for example the research of n60 Hz armonics?

JCJR · Post by **JCJR** » Thu Apr 07, 2016 5:41 pm

Hi Rick

I wasn't really suggesting much. Dunno much about it. Was just thinking that harmonics with random noise, each frame you measure, each harmonic would be a little louder or softer on each frame, because it is random. So I imagine you would look for that kind of noise with statistical methods, averaging of some kind or whatever. Maybe a smart guy would devise very fancy "averaging".

Maybe the same statistical methods would always measure steady hums or computer/machine whines properly without doing anything extra.

Was just mentioning some of the "steady" kinds of noise, in case those things would need sensing in a different fashion.

rickowens · Post by **rickowens** » Tue Apr 12, 2016 3:38 pm

I've done pretty much as you told me, doing ffts, finding the minimum amplitude for each bin for an approximation of the noise floor and the maximum amplitude for an approximation of the signal. Then I did the variances of both vectors and finally I made a simple signal to noise calculation.

This approach works better than other methods tried previously but still has many false negatives and false positives and I don't understand why. I mean that I can still don't classify well how much noise is present in a track.

Can I ask what could be the problems of this kind of method and how it can be improved?

Thank you in advance.

btw I'm doing like this:

Code: Select all

noise=0;
sig=0;
j=1;
for i=in:fin
        nois(j)=(abs((min(B(:,i)).^2)));
        sign(j)=(abs((max(B(:,i)).^2)));
        j=j+1;
end
snr=var(sign)/var(nois)

where B is the spectrogram matrix.

Synapse2k · Post by **Synapse2k** » Tue Apr 12, 2016 3:56 pm

What was originally known as Cool Edit now known as Adobe Audition 2015 can save you time and effort. It has built in noise reduction as does Sound Forge. The only thing missing from what I use known as Twisted Wave is noise reduction. It cost's $79 dollars and has very special voice synthesis, batch editing processes including format/sampling rate/bit rate conversion. it does most of what Adobe can do for the full price of $79 but it's also missing multi track and volume matching. I am not in the budget to prefer unneeded functions and the price of Adobe Audition is in the impossibly wasted range for a 3 year "cloud subscription" and not at all worth it. Sound Forge is only recommended if file conversion issues are fixed or avoided through sample rate/bit rate locking attempts if possible.

I tried the demo and was impressed in ways but always had sample rate and bit rate error's. I believe Zynaptiq also produce viable noise reduction and waves but the best choice is Sound Forge Pro or Sound Forge Mac for noise reduction. In relation to an Iron Forge the name.
I don't believe in any useless spectrogram. Sound Forge does it all but the bit rate sample rate issue leads to pitch problems in my demo I tried. I'm pretty sure you don't need an adaptive filter if it's simple code to make it. Waves is likely one of the best choices for noise reduction as it is also for "uncolored" limiting with no frills but loudening. That's right a limiter for loudening. You learn something new every day if you're smart. Don't code a noise filter in C with basic coding.

rickowens · Post by **rickowens** » Tue Apr 12, 2016 4:09 pm

Hi Synapse2k, actually I don't have to do a noise filter.. I only need to write something (now in Matlab) that allows me to classify different tracks according to background noise.

The method that BertKoor and JCJR suggested to me is pretty good and has a solid foundation! I just need to improve that.
And the research I've done in those days lead me to say that the spectrogram is the best way to do that..

Synapse2k · Post by **Synapse2k** » Tue Apr 12, 2016 4:16 pm

It's not called quantify then.

rickowens · Post by **rickowens** » Tue Apr 12, 2016 4:38 pm

Synapse2k wrote:It's not called quantify then.

Sorry, my english is not good.

I meant quantify the broadband noise in a music track.. Using basically the signal to noise ratio.

AUDIO BROADBAND NOISE DETECTION WITH ADAPTIVE FILTER