KVR Audio

Fender19 · Post by **Fender19** » Mon Aug 12, 2019 12:36 am

A lot of plugins these days - especially EQ plugins - include real time spectrum graphs in their display. Many of these plugins claim to be "light on resources" yet, AFAIK, FFTs are the only way to generate spectrum data - and FFTs are NOT light on resources!

So, are there less computationally-intensive means being used to generate these spectrum graphs?

xoxos · Post by **xoxos** » Mon Aug 12, 2019 12:47 am

?? all of my audio dsp has been done on single cores. i use the fft from dspguide.com, with some obvious optimisation. its not as fast as the common libraries for sure. but eg. for a 2x overlap, without much else going on i'd expect it to stay under 6% cpu. and of course for a graphic display the rate can be dropped some. i threw a 256 window display on the last standalone generator i coded, iirc it just captured a window every 16th note and so was basically invisible on cpu.

Fender19 · Post by **Fender19** » Mon Aug 12, 2019 1:02 am

xoxos wrote: ↑Mon Aug 12, 2019 12:47 am ?? all of my audio dsp has been done on single cores. i use the fft from dspguide.com, with some obvious optimisation. its not as fast as the common libraries for sure. but eg. for a 2x overlap, without much else going on i'd expect it to stay under 6% cpu. and of course for a graphic display the rate can be dropped some. i threw a 256 window display on the last standalone generator i coded, iirc it just captured a window every 16th note and so was basically invisible on cpu.

I guess I am thinking of processors like noise reduction where FFT heavily taxes CPUs. Perhaps it's all the computation before and after the FFT/iFFT conversions that causes that load.

So, keep the sample rate and resolution low and FFTs are not a problem for graphing. Will check out dspguide.com. Thanx!

vortico · Post by **vortico** » Tue Aug 13, 2019 4:44 pm

>FFTs are NOT light on resources

FFTs are very light on resources. For `float[256]` blocks, the computation of an FFT is <1ns per sample, which is faster than accessing the elements in memory, so it's essentially a free computation.

mystran · Post by **mystran** » Tue Aug 13, 2019 10:54 pm

Fender19 wrote: ↑Mon Aug 12, 2019 12:36 am A lot of plugins these days - especially EQ plugins - include real time spectrum graphs in their display. Many of these plugins claim to be "light on resources" yet, AFAIK, FFTs are the only way to generate spectrum data - and FFTs are NOT light on resources!

So, are there less computationally-intensive means being used to generate these spectrum graphs?

Another way to generate a spectrum graph is to use a bank of band-pass filters, but using an FFT is generally faster.

mystran · Post by **mystran** » Tue Aug 13, 2019 10:56 pm

vortico wrote: ↑Tue Aug 13, 2019 4:44 pm >FFTs are NOT light on resources

FFTs are very light on resources. For `float[256]` blocks, the computation of an FFT is <1ns per sample, which is faster than accessing the elements in memory, so it's essentially a free computation.

Well.. for good resolution at low frequencies you might want much longer FFTs (e.g. 4096 points or more), and then you end up with the problem that you have to overlap them to get a decent visual framerate..

But even then, drawing the resulting data is usually a whole lot more expensive than the FFTs.
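To put rough numbers on the resolution versus framerate trade-off described above, here is a small sketch. The function names are made up for illustration, and 44.1 kHz with a 30 fps display is only an assumed example configuration.

```python
def bin_spacing_hz(sample_rate_hz, fft_size):
    """Frequency distance between adjacent FFT bins."""
    return sample_rate_hz / fft_size


def frames_per_second_without_overlap(sample_rate_hz, fft_size):
    """How often a completely new block of fft_size samples becomes available."""
    return sample_rate_hz / fft_size


def hop_for_fps(sample_rate_hz, fps):
    """Number of new samples between successive FFTs for a target display rate."""
    return sample_rate_hz / fps


def overlap_fraction(fft_size, hop):
    """Fraction of each block shared with the previous one (0 means no overlap)."""
    return max(0.0, 1.0 - hop / fft_size)


# Assumed example: 4096-point FFT at 44.1 kHz, 30 fps display.
spacing = bin_spacing_hz(44100, 4096)                   # about 10.8 Hz per bin
plain_fps = frames_per_second_without_overlap(44100, 4096)  # about 10.8 frames/s
hop = hop_for_fps(44100, 30)                            # 1470 samples
overlap = overlap_fraction(4096, hop)                   # roughly 64 % overlap
```

Under those assumptions a 4096-point FFT without overlap only yields about 11 new spectra per second, which is why the blocks need to overlap by roughly two thirds to reach 30 fps.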

AnalogGuy1 · Post by **AnalogGuy1** » Tue Aug 13, 2019 11:42 pm

If you don't care about phase, a way to make FFT decomposition even faster is to pre-allocate a buffer whose size is a power of two (2^N) and fill it with incoming samples as a ring buffer. Then "unfold" the FFT computation for this specific length so there are no loops (it's easier to write code that writes the code to remove all the loops). This machine-written code that's specific to your FFT length will be ugly as anything, but I've found it runs much, much faster than looped code written for the general-length case. If you really need the phase, you can recover it, since reading the ring without re-ordering adds a linear phase proportional to the offset of your current sample in the ring.

On laptops that were out in 2006 I could do a 32k FFT on streaming data with a 30 fps screen update and still only use 4% of the processor core.
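The ring-buffer point can be checked numerically. The sketch below is not AnalogGuy1's code: it uses a plain recursive radix-2 FFT (slow, for illustration only), and `write_into_ring` and `unrotate_phase` are hypothetical helper names. It assumes no analysis window is applied; a window would have to be applied in time order, otherwise the magnitudes change too.

```python
import cmath
import math


def fft(x):
    """Plain recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def write_into_ring(signal, write_pos):
    """Store signal[i] at ring[(write_pos + i) % n], as a ring buffer would."""
    n = len(signal)
    ring = [0.0] * n
    for i, s in enumerate(signal):
        ring[(write_pos + i) % n] = s
    return ring


def unrotate_phase(spectrum, write_pos):
    """Undo the linear phase caused by transforming the ring without re-ordering."""
    n = len(spectrum)
    return [v * cmath.exp(2j * cmath.pi * k * write_pos / n)
            for k, v in enumerate(spectrum)]


N = 64
WRITE_POS = 23  # arbitrary ring offset chosen for the demonstration
signal = [math.sin(2 * math.pi * 5 * i / N) + 0.5 * math.cos(2 * math.pi * 12 * i / N)
          for i in range(N)]

ordered_spec = fft(signal)
ring_spec = fft(write_into_ring(signal, WRITE_POS))

# Magnitudes agree even though the ring was transformed "as stored".
mag_error = max(abs(abs(a) - abs(b)) for a, b in zip(ordered_spec, ring_spec))

# After removing the linear phase term the complex spectra agree as well.
full_error = max(abs(a - b)
                 for a, b in zip(ordered_spec, unrotate_phase(ring_spec, WRITE_POS)))
```

Both `mag_error` and `full_error` should come out at floating-point noise level, which is the circular-shift property of the DFT that the post relies on.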

Fender19 · Post by **Fender19** » Wed Aug 14, 2019 4:06 am

vortico wrote: ↑Tue Aug 13, 2019 4:44 pm >FFTs are NOT light on resources

FFTs are very light on resources. For `float[256]` blocks, the computation of an FFT is <1ns per sample, which is faster than accessing the elements in memory, so it's essentially a free computation.

4-6% CPU usage that some are stating here is not “light” IMO. Try to use a dozen plugins like that in a typical mixing scenario and see what happens!

What do you mean by “processes a sample in <1 ns”? On what speed of machine, with what overlap, etc.?

Can you clarify?

matt42 · Post by **matt42** » Wed Aug 14, 2019 4:49 am

Fender19 wrote: ↑Wed Aug 14, 2019 4:06 am 4-6% CPU usage that some are stating here is not “light” IMO. Try to use a dozen plugins like that in a typical mixing scenario and see what happens!

Perhaps, but I wouldn't keep a dozen plugins open all displaying spectral graphs at the same time and, presumably, your plugin will only perform these calculations while the interface is open.

vortico · Post by **vortico** » Wed Aug 14, 2019 6:28 am

4-6% CPU usage is definitely wrong. I don't know what FFT implementation they're using, but you should be getting <1ns/sample with FFT block sizes ranging from 2^5 to 2^18 with anything in the last 10 years that supports AVX. That'd be around 0.005% CPU at 44.1kHz.
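The percentage quoted above follows directly from the per-sample cost. A minimal sketch of the arithmetic, assuming a single core, one FFT's worth of work per input sample (no overlap), and taking the 1 ns/sample figure at face value (`cpu_percent` is a hypothetical helper name):

```python
def cpu_percent(ns_per_sample, sample_rate_hz, overlap_factor=1.0):
    """Share of one core spent on FFT work, in percent.

    overlap_factor is how many times each sample is transformed on average,
    e.g. 2.0 for 2x overlap. All inputs are assumed values, not measurements.
    """
    busy_seconds_per_second = ns_per_sample * 1e-9 * sample_rate_hz * overlap_factor
    return 100.0 * busy_seconds_per_second


no_overlap = cpu_percent(1.0, 44100)        # about 0.0044 %
four_x = cpu_percent(1.0, 44100, 4.0)       # about 0.018 %
```

Even with 4x overlap the estimate stays far below 1 % of a core, as long as the per-sample figure really holds for the FFT implementation in use.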

mystran · Post by **mystran** » Wed Aug 14, 2019 9:24 am

AnalogGuy1 wrote: ↑Tue Aug 13, 2019 11:42 pm Then "unfold" the FFT computation for this specific length so there's no loops (easier to write code that writes code to remove all the loops). This machine-written code that's specific to your loop length will be ugly as anything, but I've found it runs much, much faster than looped code written for the general-length case.

Something that should probably be mentioned is that a typical textbook radix-2 FFT can easily be a factor of 100 (or more) slower than a properly optimised FFT library. Optimising the twiddle computations (and using constant lookup tables for small blocks) is the first step. Completely unfolding (using split-radix) tends to be profitable only for small (sub-)blocks (e.g. 64 or less). As the blocks get larger, you become more and more memory bound, and you should basically run the largest radix (probably either 4 or 8 in practice) that you can fit into registers, then stride over blocks as large as possible without thrashing the L1 cache.
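As an illustration of that "first step" only, here is a sketch of an iterative radix-2 FFT that takes a twiddle table computed once per FFT size instead of calling `exp` inside the butterfly loop. It is plain Python, so it says nothing about real-world speed, and it does none of the larger-radix or cache-aware work mentioned above. `make_twiddles`, `fft_inplace` and `naive_dft` are names invented for this sketch.

```python
import cmath


def make_twiddles(n):
    """Precompute exp(-2*pi*i*k/n) for k < n/2 once per FFT size."""
    return [cmath.exp(-2j * cmath.pi * k / n) for k in range(n // 2)]


def fft_inplace(buf, twiddles):
    """Iterative radix-2 FFT on a list of complex values; len(buf) is a power of two."""
    n = len(buf)
    # Bit-reversal permutation so the butterflies can run in place.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:
            buf[i], buf[j] = buf[j], buf[i]
    # Butterfly stages: every twiddle is a table lookup, no trig calls here.
    size = 2
    while size <= n:
        half = size // 2
        step = n // size  # stride into the shared twiddle table
        for start in range(0, n, size):
            for k in range(half):
                a = buf[start + k]
                b = buf[start + k + half] * twiddles[k * step]
                buf[start + k] = a + b
                buf[start + k + half] = a - b
        size *= 2
    return buf


def naive_dft(x):
    """O(n^2) reference used only to check the result."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]


table = make_twiddles(16)  # computed once, reused for every 16-point block
block = [complex(i % 5, -(i % 3)) for i in range(16)]
result = fft_inplace(list(block), table)
check_error = max(abs(a - b) for a, b in zip(result, naive_dft(block)))
```

The table is built once and shared by every block of the same size, which is the part a textbook implementation often recomputes for every butterfly.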

But seriously, unless you want to spend a ton of time learning about all the possible optimisations and then tuning it, you should really just use some existing FFT library.

quikquak · Post by **quikquak** » Wed Aug 14, 2019 10:38 am

At first I thought 'Fender19' was thinking they need to be done every sample!

mystran · Post by **mystran** » Wed Aug 14, 2019 11:26 am

Another tip: if you don't need the spectrum data in DSP code (eg. conventional EQ where it's only used for visual display), then you can simply send the samples as-is into a wait-free queue and do all the FFT computations in the GUI thread. This way the audio processing cost is practically zero and even if you start running low on CPU, you'll only take a performance hit with GUI framerates rather than glitching the audio.
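The hand-off described above can be sketched as a single-producer/single-consumer ring buffer. The Python below only illustrates the index bookkeeping (one counter written only by the audio side, one only by the GUI side); Python itself gives no real-time or wait-free guarantees, and in an actual plugin this would be written in something like C++ with atomic counters. `SpscRing` is a hypothetical name.

```python
class SpscRing:
    """Single-producer/single-consumer ring buffer (index logic only)."""

    def __init__(self, capacity):
        # A power-of-two capacity lets a bit mask replace the modulo.
        assert capacity > 0 and capacity & (capacity - 1) == 0
        self.buf = [0.0] * capacity
        self.mask = capacity - 1
        self.write_count = 0  # advanced only by the producer (audio callback)
        self.read_count = 0   # advanced only by the consumer (GUI thread)

    def push(self, samples):
        """Producer side: copy what fits, drop the rest, never wait."""
        free = len(self.buf) - (self.write_count - self.read_count)
        n = min(free, len(samples))
        for i in range(n):
            self.buf[(self.write_count + i) & self.mask] = samples[i]
        self.write_count += n  # publish only after the data is written
        return n

    def pop(self, max_samples):
        """Consumer side: take whatever is available, up to max_samples."""
        available = self.write_count - self.read_count
        n = min(available, max_samples)
        out = [self.buf[(self.read_count + i) & self.mask] for i in range(n)]
        self.read_count += n  # hand the slots back to the producer
        return out


queue = SpscRing(8)
queue.push([1.0, 2.0, 3.0, 4.0, 5.0])   # audio callback hands over raw samples
gui_block = queue.pop(3)                # GUI timer later collects them for its FFT
```

If the GUI falls behind, `push` simply drops samples for the display while the audio path carries on, which matches the "lower GUI framerate rather than audio glitches" behaviour described in the post.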

DJ Warmonger · Post by **DJ Warmonger** » Wed Aug 14, 2019 11:35 am

matt42 wrote: ↑Wed Aug 14, 2019 4:49 am
Fender19 wrote: ↑Wed Aug 14, 2019 4:06 am 4-6% CPU usage that some are stating here is not “light” IMO. Try to use a dozen plugins like that in a typical mixing scenario and see what happens!
Perhaps, but I wouldn't keep a dozen plugins open all displaying spectral graphs at the same time and, presumably, your plugin will only perform these calculations while the interface is open.

Yep. It is also important to run the spectrum analysis in a separate thread so it doesn't block the audio stream and slow down audio buffer processing.

Fender19 · Post by **Fender19** » Wed Aug 14, 2019 3:27 pm

quikquak wrote: ↑Wed Aug 14, 2019 10:38 am At first I thought 'Fender19' was thinking they need to be done every sample!

No - I got past THAT misunderstanding long ago when I needed to run some big FIR filters. Doing the filtering by means of FFTs was “light years” faster.
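For readers who have not seen FIR filtering done through FFTs, here is a minimal sketch showing that multiplying spectra gives the same output as the direct convolution sum. It processes the whole signal at once with a slow recursive FFT; a real-time filter would normally work block by block (overlap-add or overlap-save) with an optimised FFT library, which this sketch does not attempt. All function names are invented for the example.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unscaled inverse when inverse=True)."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def direct_convolve(signal, taps):
    """Textbook FIR: len(signal) * len(taps) multiply-adds."""
    out = [0.0] * (len(signal) + len(taps) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(taps):
            out[i + j] += s * h
    return out


def fft_convolve(signal, taps):
    """Same result via zero-padding, FFT, multiply, inverse FFT."""
    out_len = len(signal) + len(taps) - 1
    n = 1
    while n < out_len:  # next power of two avoids circular wrap-around
        n *= 2
    a = fft(list(signal) + [0.0] * (n - len(signal)))
    b = fft(list(taps) + [0.0] * (n - len(taps)))
    y = fft([p * q for p, q in zip(a, b)], inverse=True)
    return [v.real / n for v in y[:out_len]]


signal = [1.0, 2.0, 3.0, 4.0, 5.0]
taps = [0.5, -0.25, 0.125]
difference = max(abs(a - b)
                 for a, b in zip(direct_convolve(signal, taps), fft_convolve(signal, taps)))
```

The direct form costs one multiply-add per tap per sample, so its cost grows linearly with filter length, while the FFT route grows roughly with the logarithm of the block size per sample. That is where the large speed-up for long filters comes from.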

Efficient real time spectrum graphing techniques?