KVR Audio

Elizabeth21 · Post by **Elizabeth21** » Mon May 25, 2015 1:16 pm

Hi, I hope this isn't off topic. I've seen FFT questions here before. I want to double the length of a short audio clip (around 1/10 second) while preserving the frequency spectrum, using these steps:

1. Perform Fast Fourier Transform.
2. Double the size of the result, using a simple resample function:

f[new] = f[old/2];

Do this for both the real and imaginary parts.

3. Perform inverse FFT.

It works, sort of. The waveforms are preserved, but the amplitudes scale from correct at the beginning, to zero in the middle, and back to correct at the end of the now doubled clip. I can correct these amplitudes using a scaling function, but I'm wondering if there's something else I'm doing wrong, like the way I resample my FFT data.
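The dip described here can be reproduced in a small experiment. The sketch below is only one reading of the three steps: it assumes a complex DFT and that each bin is copied into two adjacent bins of a double-length spectrum. The names `dft`, `idft`, and `stretch_spectrum` and the 64-sample test sine are made up for illustration, and a naive DFT stands in for a real FFT library to keep the example dependency-free. Under those assumptions the output works out to the clip played twice, multiplied by (1 + e^(iπt/N)) / 2, whose magnitude is |cos(πt / 2N)|: full level at the start, zero in the middle, full level at the end.

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) forward DFT, used only to avoid external dependencies."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spec):
    """Naive inverse DFT with the usual 1/N scaling."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def stretch_spectrum(x):
    """Assumed reading of steps 1-3: DFT, f_new[k] = f_old[k // 2], inverse DFT."""
    spec = dft(x)
    doubled = [spec[k // 2] for k in range(2 * len(spec))]
    return idft(doubled)

# Hypothetical test clip: 64 samples holding 5 full cycles of a sine.
N = 64
clip = [math.sin(2 * math.pi * 5 * t / N) for t in range(N)]
out = stretch_spectrum(clip)

# Prediction: the clip repeated twice, times (1 + e^{i*pi*t/N}) / 2.
predicted = [0.5 * (1 + cmath.exp(1j * math.pi * t / N)) * clip[t % N]
             for t in range(2 * N)]
max_error = max(abs(a - b) for a, b in zip(out, predicted))

# Magnitude of that factor: 1 at t = 0, 0 at t = N, back towards 1 at t = 2N.
envelope = [abs(math.cos(math.pi * t / (2 * N))) for t in range(2 * N)]
```

If this matches what the original code does, the envelope is a direct consequence of duplicating bins rather than a bug in the FFT calls: neighbouring bins of the double-length transform sit half an original bin apart, so each duplicated pair beats exactly once over the doubled clip.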

Honestly, the FFT is a great black box to me. I know it has something to do with moving back and forth between the time (amplitude over time) and frequency domains, but how and why it works is not something I understand. If anybody has any good links on this subject, I'd appreciate it.

Thanks.

fftdouble.png

aciddose · Post by **aciddose** » Mon May 25, 2015 1:25 pm

The DFT of a finite length sampled signal is only valid for that exact length of signal. If you change it, the DFT must also change.

This is similar to interpolation... How is the DFT to know how to fill in the missing data? You have to tell it what to do.

In other words, FT is not a solution to this problem. You end up with the same issues you had in the time domain, how to interpolate across known sections of the clip to fill in the unknown sections.

One of the simplest solutions is to cross blend, which is very similar to what the FT with "amplitude correction" (correcting the frequency domain data in some way) would do.
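A minimal sketch of what a cross blend could look like when repeating a clip, assuming a plain linear fade at each seam. The function name `crossfade_loop` and its `repeats` and `fade` parameters are hypothetical, not something from the post.

```python
def crossfade_loop(clip, repeats, fade):
    """Repeat `clip` `repeats` times, linearly blending `fade` samples at each seam."""
    if not 0 < fade <= len(clip) // 2:
        raise ValueError("fade must be positive and at most half the clip length")
    out = list(clip)
    for _ in range(repeats - 1):
        tail_start = len(out) - fade
        for i in range(fade):
            w = (i + 1) / (fade + 1)  # weight of the incoming copy, rising towards 1
            # Blend the end of what we have with the start of the next copy.
            out[tail_start + i] = (1 - w) * out[tail_start + i] + w * clip[i]
        # Append the rest of the next copy after the blended region.
        out.extend(clip[fade:])
    return out

# Usage: three copies of a 100-sample ramp with a 10-sample blend at each seam.
ramp = [i / 100 for i in range(100)]
looped = crossfade_loop(ramp, repeats=3, fade=10)
```

Each extra copy adds `len(clip) - fade` samples, so the result is a little shorter than a plain repeat. A linear fade keeps the summed gain at 1, which suits material that lines up in phase at the seam; other fade curves are a design choice this sketch leaves out.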

xoxos · Post by **xoxos** » Mon May 25, 2015 4:57 pm

i believe there will be similar issues, though i'd anticipate them being less severe: double window length for ifft, multiply phase x2. share results?

Elizabeth21 · Post by **Elizabeth21** » Tue May 26, 2015 1:03 pm

Thanks for your responses. Aciddose, your explanation is fascinating. Kind of an information theory concept. It's interesting that the FFT seems to "know" that it doesn't have enough information to give complete amplitudes, in successive repeating cycles. To me it's a very deep subject that I don't understand.

From a wave pattern generation point of view, I did look at what generating two very close frequencies looks like. They tend to cancel each other out periodically. This is one second of 440 Hz and 441 Hz at a 44100 Hz sample rate.

freq1.png
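The periodic cancellation in that plot follows from the identity sin a + sin b = 2 sin((a + b) / 2) cos((a - b) / 2). The sketch below checks it for the 440 Hz and 441 Hz case at 44100 Hz; the helper names `two_tones` and `beat_form` are made up for illustration.

```python
import math

SAMPLE_RATE = 44100
F1, F2 = 440.0, 441.0

def two_tones(n):
    """Sum of the two sines at sample index n."""
    t = n / SAMPLE_RATE
    return math.sin(2 * math.pi * F1 * t) + math.sin(2 * math.pi * F2 * t)

def beat_form(n):
    """The same signal written as a 440.5 Hz carrier times a slow envelope."""
    t = n / SAMPLE_RATE
    carrier = math.sin(2 * math.pi * (F1 + F2) / 2 * t)
    envelope = 2 * math.cos(2 * math.pi * (F2 - F1) / 2 * t)  # equals 2*cos(pi*t)
    return carrier * envelope

signal = [two_tones(n) for n in range(SAMPLE_RATE)]  # one second of audio

# Largest difference between the two forms over the whole second.
max_diff = max(abs(signal[n] - beat_form(n)) for n in range(SAMPLE_RATE))

# Peak level in the first 10 ms versus the 20 ms around the half-second mark.
start_peak = max(abs(s) for s in signal[:441])
middle_peak = max(abs(s) for s in signal[22050 - 441:22050 + 441])
```

The envelope 2 cos(πt) is zero at t = 0.5 s, so a 1 Hz spacing gives one null per second. Two components spaced half a DFT bin apart over a doubled clip would produce the same shape, with a single null in the middle of the clip.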

So my filling algorithm would be expected to generate such patterns. I suppose there's something about doubling that makes the 50% cycling pattern with the FFT.

Wow patterns, clicks, and clunks are basically my issue. I want to replicate short, simple sound clips to arbitrary lengths without getting these features. I wonder if there's a mathematical test for clicks and pops. I've looked around and there are some explanations like sudden variations in amplitude, but it's nowhere near that simple.

Thanks.

Elizabeth21 · Post by **Elizabeth21** » Tue May 26, 2015 3:31 pm

xoxos wrote: "double window length for ifft, multiply phase x2."

xoxos, I went to do what you're suggesting, but I don't know what you mean by "multiply phase x2." Do you mean instead of

f[new] = f[old/2];

something like:

f[new] = f[old/2+1];

or some sort of alternating offset?

Thanks.

kryptonaut · Post by **kryptonaut** » Tue May 26, 2015 4:23 pm

I don't think you're really going to gain anything by using FFT - I reckon you'd be better off cutting your input up into small overlapping snippets and then crossfading between them, repeating snippets as required to stretch the overall sound to the length you want. This is essentially what granular synthesis does.

You can also play back the snippets at different sample rates, so you effectively have separate control over pitch and time.

If you want to get fancy once it's basically working, you could try to adjust the overlaps when crossfading, in order to minimise any out-of-phase cancellation effects.
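A rough sketch of the overlapping-snippet idea, assuming Hann-windowed grains that are read from the input at a small hop and written to the output at a larger one. The name `stretch_ola` and the `factor` and `grain` parameters are hypothetical, and this version does none of the overlap adjustment mentioned above, so out-of-phase cancellation on tonal material is still possible.

```python
import math

def hann(n_points):
    """Periodic Hann window: copies offset by n_points / 2 sum to exactly 1."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n_points) for i in range(n_points)]

def stretch_ola(samples, factor=2.0, grain=256):
    """Time-stretch by overlap-adding windowed grains (no phase alignment)."""
    if len(samples) < grain:
        raise ValueError("input must be at least one grain long")
    hop_out = grain // 2          # output grains overlap by 50%
    hop_in = hop_out / factor     # input is read more slowly than it is written
    window = hann(grain)
    n_grains = int((len(samples) - grain) / hop_in) + 1
    out = [0.0] * ((n_grains - 1) * hop_out + grain)
    for g in range(n_grains):
        start_in = int(round(g * hop_in))
        start_out = g * hop_out
        for i in range(grain):
            out[start_out + i] += window[i] * samples[start_in + i]
    return out

# Usage: a 1/10 second 440 Hz clip at 44100 Hz, stretched to roughly double length.
clip = [math.sin(2 * math.pi * 440 * n / 44100) for n in range(4410)]
stretched = stretch_ola(clip, factor=2.0)
```

Because the window copies sum to 1, a steady input comes out at the same level away from the first and last half grain, which fade in and out. Grain size is a trade-off to tune by ear: short grains tend to track changes better, long grains tend to keep low frequencies intact.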

xoxos · Post by **xoxos** » Tue May 26, 2015 4:30 pm

i know it difficult to get used to because not used to it. xoxos mean what xoxos say.

here's pseudocode:

the_new_phase_of_a_thing = the_old_phase_of_a_thing * 2;
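Read literally, that pseudocode operates on the phase angle of each complex FFT bin rather than on the array index. A minimal sketch of that per-bin step, assuming bins are stored as complex numbers; the name `double_phase` is made up, and how the bins are laid out in the double-length spectrum is not specified in the thread and is not shown here.

```python
import cmath

def double_phase(bin_value):
    """Keep the magnitude of a complex bin and multiply its phase angle by 2."""
    magnitude, phase = cmath.polar(bin_value)
    return cmath.rect(magnitude, 2 * phase)

# Usage on a hypothetical list of bins (real + imaginary parts as complex numbers).
bins = [1 + 0j, 0 + 1j, 1 + 1j]
new_bins = [double_phase(b) for b in bins]
```

With separate real and imaginary arrays, the magnitude is `sqrt(re*re + im*im)` and the phase is `atan2(im, re)`. Doubling a wrapped phase is safe, since adding a multiple of 2π before doubling changes the result only by a multiple of 4π.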

xoxos · Post by **xoxos** » Tue May 26, 2015 4:38 pm

xoxos also say, remember first part of first thing me say too.

Using FFT to double length of audio clip