Using FFT to double length of audio clip


Post

Hi, I hope this isn't off topic; I've seen FFT questions here before. I want to double the length of a short (about 1/10 second) audio clip while preserving its frequency spectrum, using these steps:

1. Perform Fast Fourier Transform.
2. Double the size of the result, using a simple resample function:

f[new] = f[old/2];

Do this for both the real and imaginary parts.

3. Perform inverse FFT.

It works, sort of. The waveforms are preserved, but the amplitudes scale from correct at the beginning, to zero in the middle, and back to correct at the end of the now doubled clip. I can correct these amplitudes using a scaling function, but I'm wondering if there's something else I'm doing wrong, like the way I resample my FFT data.
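A possible explanation for that amplitude pattern, offered as a sketch and not something confirmed in the thread: if X is the N-point DFT of the clip x and the stretched spectrum is Y[m] = X[m/2] (integer division), then the 2N-point inverse DFT works out to y[n] = x[n mod N] * (1 + exp(i*pi*n/N)) / 2. That is the original clip repeated twice, multiplied by a factor whose magnitude is |cos(pi*n/(2N))|: 1 at the start, 0 in the middle, back towards 1 at the end. The Python below checks this numerically with a naive DFT; the names `dft`, `idft`, `stretch_spectrum` and the test clip are made up for the illustration.

```python
import cmath
import math

def dft(x):
    """Naive O(n^2) forward DFT (fine for a tiny demonstration)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Naive inverse DFT with the usual 1/n scaling."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def stretch_spectrum(X):
    """The resampling from the post: f[new] = f[old/2], real and imaginary parts."""
    return [X[m // 2] for m in range(2 * len(X))]

N = 32
# Hypothetical test clip: two sinusoids that fit the window exactly.
clip = [math.sin(2 * math.pi * 3 * t / N) + 0.5 * math.cos(2 * math.pi * 5 * t / N)
        for t in range(N)]

y = idft(stretch_spectrum(dft(clip)))

# Closed form: repeated clip times (1 + exp(i*pi*n/N)) / 2.
predicted = [clip[n % N] * (1 + cmath.exp(1j * math.pi * n / N)) / 2
             for n in range(2 * N)]
max_error = max(abs(a - b) for a, b in zip(y, predicted))

# Magnitude of the modulating factor: |cos(pi*n/(2N))|.
envelope = [abs(math.cos(math.pi * n / (2 * N))) for n in range(2 * N)]
```

If this reading is right, the pitch is unchanged because bin k of the N-point transform and bin 2k of the 2N-point transform are the same frequency, and the fade to zero comes from duplicating each bin into its odd-numbered neighbour. Note that the result is complex, since the stretched spectrum is no longer conjugate-symmetric; taking only the real part gives a cos^2 shaped envelope with the same null in the middle.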

Honestly, the FFT is a great black box to me. I know it has something to do with moving back and forth between the time and frequency domains, but how and why it works is not something I understand. If anybody has any good links on this subject, I'd appreciate it.

Thanks.
fftdouble.png

Post

The DFT of a finite length sampled signal is only valid for that exact length of signal. If you change it, the DFT must also change.

This is similar to interpolation... How is the DFT to know how to fill in the missing data? You have to tell it what to do.

In other words, FT is not a solution to this problem. You end up with the same issues you had in the time domain, how to interpolate across known sections of the clip to fill in the unknown sections.

One of the simplest solutions is to cross-blend (crossfade), which is very similar to what the FT with "amplitude correction" (correcting the frequency domain data in some way) would do.
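As a rough illustration of the cross-blend idea (a minimal sketch, not code from this thread), the clip can be repeated with the tail of the first copy faded out while the head of the second copy fades in. The function name `crossfade_repeat`, the `fade` parameter and the linear fade shape are all assumptions made for the example.

```python
def crossfade_repeat(clip, fade):
    """Play the clip twice, blending the last `fade` samples of the first
    copy with the first `fade` samples of the second copy.

    The result has 2 * len(clip) - fade samples, so it is slightly shorter
    than a true doubling; a linear fade is assumed here for simplicity.
    """
    n = len(clip)
    if not 0 < fade <= n:
        raise ValueError("fade must be between 1 and len(clip)")

    out = list(clip[:n - fade])            # untouched start of the first copy
    for i in range(fade):
        w = (i + 1) / (fade + 1)           # weight of the incoming copy, 0 < w < 1
        out.append((1.0 - w) * clip[n - fade + i] + w * clip[i])
    out.extend(clip[fade:])                # rest of the second copy
    return out

# Usage: a constant clip should stay constant through the blend.
doubled = crossfade_repeat([1.0] * 10, fade=4)
```

The fade weights always sum to 1, so a steady signal passes through unchanged; on real audio the blend region can still show partial cancellation if the two ends are out of phase.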
Free plug-ins for Windows, MacOS and Linux. Xhip Synthesizer v8.0 and Xhip Effects Bundle v6.7.
The coder's credo: We believe our work is neither clever nor difficult; it is done because we thought it would be easy.
Work less; get more done.

Post

i believe there will be similar issues, though i'd anticipate them being less severe: double window length for ifft, multiply phase x2. share results? :)
you come and go, you come and go. amitabha neither a follower nor a leader be tagore "where roads are made i lose my way" where there is certainty, consideration is absent.

Post

Thanks for your responses. Aciddose, your explanation is fascinating. Kind of an information theory concept. It's interesting that the FFT seems to "know" that it doesn't have enough information to give complete amplitudes, in successive repeating cycles. To me it's a very deep subject that I don't understand.

From a wave pattern generation point of view, I did look at what generating two very close frequencies looks like. They tend to cancel each other out periodically. This is one second of 440 Hz and 441 Hz at a 44100 Hz sample rate.
freq1.png
So my filling algorithm would be expected to generate such patterns. I suppose there's something about doubling that makes the 50% cycling pattern with the FFT.
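The cancellation in that plot follows from the identity sin(a) + sin(b) = 2 sin((a+b)/2) cos((a-b)/2): two equal-amplitude tones at 440 Hz and 441 Hz behave like a 440.5 Hz tone whose amplitude follows |2 cos(pi * 1 Hz * t)|, which reaches zero at t = 0.5 s. A small sketch to check this numerically (the function names are invented for the example):

```python
import math

SAMPLE_RATE = 44100
F1, F2 = 440.0, 441.0

def two_tone(n):
    """Sample n of the sum of the two sine waves."""
    t = n / SAMPLE_RATE
    return math.sin(2 * math.pi * F1 * t) + math.sin(2 * math.pi * F2 * t)

def product_form(n):
    """Same signal written as a carrier at the mean frequency times a slow cosine."""
    t = n / SAMPLE_RATE
    return 2 * math.sin(math.pi * (F1 + F2) * t) * math.cos(math.pi * (F2 - F1) * t)

def beat_envelope(n):
    """Slowly varying amplitude of the sum."""
    t = n / SAMPLE_RATE
    return abs(2 * math.cos(math.pi * (F2 - F1) * t))

# Compare the two forms on a subset of the one-second clip.
max_diff = max(abs(two_tone(n) - product_form(n)) for n in range(0, SAMPLE_RATE, 7))
mid_envelope = beat_envelope(SAMPLE_RATE // 2)   # half a second in
```

This may also be why the doubled FFT clip nulls at its midpoint: copying each bin into its neighbour creates pairs of components one bin apart in the longer transform, and two components one bin apart beat with a period equal to the whole window.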

Wow patterns, clicks, and clunks are basically my issue. I want to replicate short, simple sound clips to arbitrary lengths without getting these artifacts. I wonder if there's a mathematical test for clicks and pops. I've looked around and there are some explanations, like sudden variations in amplitude, but it's nowhere near that simple.

Thanks.

Post

xoxos wrote: "double window length for ifft, multiply phase x2."

xoxos, I went to do what you're suggesting, but I don't know what you mean by "multiply phase x2." Do you mean instead of

f[new] = f[old/2];

something like:

f[new] = f[old/2+1]; or some sort of alternating offset?

Thanks.

Post

I don't think you're really going to gain anything by using FFT - I reckon you'd be better off cutting your input up into small overlapping snippets and then crossfading between them, repeating snippets as required to stretch the overall sound to the length you want. This is essentially what granular synthesis does.

You can also play back the snippets at different sample rates, so you effectively have separate control over pitch and time.

If you want to get fancy once it's basically working, you could try to adjust the overlaps when crossfading, in order to minimise any out-of-phase cancellation effects.
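A minimal sketch of the overlapping-snippet approach described above, assuming Hann-windowed grains at 50% overlap and no phase alignment between grains (so it does nothing about the cancellation mentioned in the last paragraph). The names `granular_stretch`, `factor` and `grain` are made up for the example.

```python
import math

def hann(length):
    """Periodic Hann window; copies overlapped at length/2 sum to exactly 1."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / length) for i in range(length)]

def granular_stretch(samples, factor=2.0, grain=256):
    """Time-stretch by overlap-adding windowed grains.

    Grains are written every grain/2 output samples but read from the input
    `factor` times more slowly, so input material gets reused as needed.
    Assumes len(samples) >= grain and an even grain length.
    """
    hop_out = grain // 2
    window = hann(grain)
    out_len = int(len(samples) * factor)
    out = [0.0] * (out_len + grain)          # headroom for the last grain
    last_start = len(samples) - grain        # latest valid read position

    pos_out = 0
    while pos_out < out_len:
        pos_in = min(int(pos_out / factor), last_start)
        for i in range(grain):
            out[pos_out + i] += samples[pos_in + i] * window[i]
        pos_out += hop_out
    return out[:out_len]

# Usage: a constant input should come out constant after the initial fade-in.
stretched = granular_stretch([1.0] * 1024, factor=2.0, grain=256)
```

The first half grain of the output fades in because only one window covers it. Changing the playback rate of each grain, as suggested above, would need an extra resampling step that is not shown here.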

Post

i know it difficult to get used to because not used to it. xoxos mean what xoxos say.


here's pseudocode:

the_new_phase_of_a_thing = the_old_phase_of_a_thing * 2;
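
Read literally, that pseudocode operates on each FFT bin in polar form: keep the magnitude, double the angle. A small sketch of just that step follows; how it is meant to be combined with the longer IFFT window is not spelled out in the thread, so that part is left out, and the function name is invented for the example.

```python
import cmath
import math

def double_phase(bin_value):
    """Return a complex bin with the same magnitude and twice the phase."""
    magnitude, phase = cmath.polar(bin_value)   # phase is in (-pi, pi]
    return cmath.rect(magnitude, 2.0 * phase)   # rect() wraps the angle implicitly

# Usage: 1+1j has phase pi/4, so doubling it gives a purely imaginary bin.
doubled_bin = double_phase(1 + 1j)
```

Because the real and imaginary parts are not scaled independently, this is different from any index offset such as f[old/2+1]; the bin has to be converted to magnitude and phase first.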

Post

xoxos also say, remember first part of first thing me say too.
