KVR Audio

jackcampbell · Post by **jackcampbell** » Sun Apr 30, 2017 9:44 pm

Hi!

I'm working on a real-time convolution reverb effect for a game-sound engine (eventually using XAudio2's audio processing loop), but I'm trying to prototype it in MATLAB first. My grasp of this DSP stuff is getting better, but a lot of it is still fuzzy.

My prototype currently uses the overlap-save method with a precomputed DFT of the impulse response, and I'm trying to figure out how to adapt it to a real-time audio processing loop where you don't know in advance how long the input/output signals will be. I also plan to try the Gardner approach, with direct FIR convolution at the beginning to avoid the up-front latency (but I thought I could at least get a reverb working that just has the latency at the beginning?).

I guess my biggest question right now is about using the overlap-save method in a real-time setting: I'm pretty sure I'm paying the latency on every loop iteration rather than paying it once up front and using it to seamlessly concatenate successive blocks. Is there a way to do it like that, or should I take a different approach to convolution in this context? (Maybe overlap-add?)

Any help would be appreciated!

Note -- this is the convolution code in my current non-realtime prototype that works, and I am trying to adapt the loop to work in a real-time environment


[x, Fs] = audioread ('A.wav');
[h] = audioread('CCRMAStairwell.wav');

% PRE-PROCESS
%--------------------------------------------------------
% choose an initial input block size L (say, 2048)
% TODO: find a way to make this less arbitrary
L = 2048; % initial guess; recomputed below once N is fixed

% impulse response length in samples (rows), not channels
M = size(h, 1);
% size of the DFT: at least L + M - 1, rounded up to the next power of 2
% for efficiency with the FFT algorithm
N = 2^nextpow2(L+M-1);
% use all the room the FFT gives us: N - (M - 1) new samples per block
L = N - M + 1;

% pre-compute the DFT for the impulse response
% fft(h, N) zero-pads each channel of h to N samples, so no manual
% padding is needed
H = fft(h, N);

% buffer to save overlap in each iteration of the loop
% the last M-1 input samples are prepended to each new block of x
% BEFORE the block's DFT is multiplied with H
OLAP = zeros((M - 1), 1);

% end result: a two channel signal that is the length of X convolved with H
% -> Length(x) + Length(h) - 1
y = zeros(length(x) + M - 1, 2);
%--------------------------------------------------------


%OVERLAP-SAVE LOOP
%-----------------------------------------------
% for our purposes in this demo, the loop runs over enough blocks to cover
% the input plus the reverb tail. runs for length of program in real-time version

% index represents how many blocks total we will compute
% will be adjusted for real-time applications
i = 0;
while( i < (length(x) + M) / L)

    % compensate for when we reach the last chunk of samples
    % since it will be smaller than block size
    k = min(((i + 1)*(L)), length(x));
    
    % get the window we will be computing; block length L
    time_domain_input_window = (i * L) + 1:k;
    
    % get this window from the input signal
    x_r = x(time_domain_input_window);
    
    % pre-pend the saved overlap to the block
    x_r_overlap = vertcat(OLAP, x_r);
    
    % pad with zeros at the end to make up for uneven chunk of samples at
    % the end
    % (zeros(length(H) - length(x_r_overlap)) will only be non-zero in that
    % case
    x_r_zeropadded = vertcat(x_r_overlap, zeros(length(H) - length(x_r_overlap), 1));
    
    % save the last M - 1 samples to pre-pend the block on the next
    % iteration
    OLAP = x_r_zeropadded(end - (M - 2):end);
   
    % get DFT of the block
    Xm = fft(x_r_zeropadded, N);
    
    % convolve for left channel
    Ym_1 = Xm .* H(:,1);
    ym_1 = real(ifft(Ym_1));
    
    % convolve for right channel
    Ym_2 = Xm .* H(:,2);
    ym_2 = real(ifft(Ym_2));
    
    % combine into a stereo signal
    ym = [ym_1, ym_2];
    
    % place in the output buffer --
    % the first M - 1 samples of the new output block are corrupted by
    % circular wrap-around and are discarded; keep samples M..N (L of them)
    
    % and put this block into the appropriate L-sized chunk in the
    % concatenated output signal (both channels at once)
    y((i * L + 1):((i + 1) * L), :) = ym(M:N, :);
    
    i = i + 1;
end

% the last block can run past the end of the convolution; trim it
y = y(1:length(x) + M - 1, :);
%-----------------------------------------------

Miles1981 · Post by **Miles1981** » Sun Apr 30, 2017 11:14 pm

You should not do an FFT of your full impulse response, but cut it into pieces. You can have a look at my blog post on it: http://blog.audio-tk.com/2015/07/07/aud ... nvolution/
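For reference, here is a minimal MATLAB sketch of what cutting the impulse response into pieces can look like (uniformly partitioned overlap-save). It is not taken from the linked blog post. The block size `B`, the synthetic `x` and `h`, and all variable names are assumptions chosen for illustration, and it handles a single channel only.

```matlab
% Uniformly partitioned overlap-save convolution (mono sketch).
B = 64;                                   % block size = latency in samples (assumed)
x = sin(0.3 * (1:1000)');                 % synthetic input instead of a WAV file
h = (0.98 .^ (0:299)') .* cos(0.7 * (0:299)');   % synthetic impulse response

P  = ceil(length(h) / B);                 % number of impulse-response partitions
N  = 2 * B;                               % FFT size used for every partition
hp = zeros(B, P);
hp(1:length(h)) = h;                      % column p holds samples (p-1)*B+1 .. p*B
Hp = fft(hp, N);                          % N-by-P: one zero-padded spectrum per piece

nBlocks = ceil((length(x) + length(h) - 1) / B);
xPad = [x; zeros(nBlocks * B - length(x), 1)];
y    = zeros(nBlocks * B, 1);

fdl  = zeros(N, P);                       % frequency-domain delay line of input spectra
prev = zeros(B, 1);                       % previous input block (the saved overlap)

for n = 1:nBlocks
    cur  = xPad((n - 1) * B + (1:B));
    fdl  = [fft([prev; cur]), fdl(:, 1:P-1)];   % newest spectrum goes in column 1
    prev = cur;
    Y    = sum(fdl .* Hp, 2);             % multiply-accumulate over all partitions
    yBlk = real(ifft(Y));
    y((n - 1) * B + (1:B)) = yBlk(B+1:N); % first half is aliased, keep the second
end

y = y(1:length(x) + length(h) - 1);
maxErr = max(abs(y - conv(x, h)));        % difference from direct convolution
```

Each input block costs one forward FFT, `P` spectrum multiplications and one inverse FFT, and the latency is `B` samples rather than the length of the whole impulse response.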

nonnaci · Post by **nonnaci** » Mon May 01, 2017 3:02 am

If you have multithreading support, which is atypical for sound in games (sound gets the lowest CPU budget in my experience), then Gardner's zero-latency approach is easy. If not, the approach suffers from load-balancing difficulties because chip-optimized FFT routines cannot be interrupted. The better way is to stop doubling the non-uniform block sizes at some upper bound.

IMO, you should use algorithmic reverbs in games, because their parameters are easy to change in real time. With convolution reverbs, you'll need much more multi-threading hackery to smoothly blend different impulse responses.
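As a rough illustration of capping the doubling, the sketch below builds one possible partition schedule for an impulse response. It is not necessarily Gardner's exact layout: `irLength`, `headLen` and `maxBlock` are made-up values, the head is assumed to be handled by direct FIR convolution, and the schedule leaves no slack for FFT compute time.

```matlab
% Partition schedule: FFT segment sizes double until they reach maxBlock.
irLength = 48000;   % impulse-response length in samples (assumed)
headLen  = 64;      % samples handled by direct FIR convolution (assumed)
maxBlock = 4096;    % upper bound on the FFT block size (assumed)

offsets = [];       % start of each FFT segment within the impulse response
sizes   = [];       % block size used for that segment
pos = headLen;
blk = headLen;
while pos < irLength
    offsets(end + 1) = pos;
    sizes(end + 1)   = blk;
    pos = pos + blk;
    % Doubling keeps blk <= the offset of the next segment, so the time
    % needed to collect a block is hidden behind that segment's own delay.
    blk = min(2 * blk, maxBlock);
end
```

With these numbers the segments are 64, 128, 256, ... up to 4096, and every later segment stays at 4096, so the largest FFT is bounded no matter how long the impulse response is.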

Richard_Synapse · Post by **Richard_Synapse** » Mon May 01, 2017 6:35 pm

This is a typical problem that occurs when using FFT in realtime code. The solution is simple though: you need ring buffers on both the input and the output. You collect samples in your input buffer until your block is full, then process the block and add the output samples to your output buffer. This method will give you a latency of exactly the block size, regardless of how many samples are processed. Because of this, you can then add direct convolution on top of this to completely eliminate the latency if you want to. In a reverb this is not always necessary though, since you often want some predelay anyway.

Richard
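A minimal MATLAB sketch of the input/output buffering described above, assuming a host that delivers `hopIn` samples per callback while the FFT processes `L` samples at a time. Plain growing vectors stand in for the ring buffers (a real engine would use fixed-size circular buffers), and the signals, sizes and names are all hypothetical.

```matlab
% Streaming overlap-save driven by fixed-size "callbacks" (mono sketch).
hopIn = 100;                              % samples per callback (assumed host size)
L     = 256;                              % samples per FFT block = latency (assumed)
x = sin(0.3 * (1:5000)');                 % synthetic input
h = (0.98 .^ (0:199)') .* cos(0.7 * (0:199)');   % synthetic impulse response

M = length(h);
N = 2^nextpow2(L + M - 1);                % FFT size, so N - L >= M - 1
H = fft(h, N);
tail = zeros(N - L, 1);                   % last N-L input samples, kept between blocks

inFifo  = zeros(0, 1);                    % pending input samples
outFifo = zeros(L, 1);                    % primed with L zeros: latency paid once
y = zeros(size(x));

for start = 1:hopIn:length(x)
    idx = (start:min(start + hopIn - 1, length(x)))';
    inFifo = [inFifo; x(idx)];            % callback delivers new input

    while length(inFifo) >= L             % enough samples for one FFT block?
        blk    = [tail; inFifo(1:L)];
        inFifo = inFifo(L+1:end);
        tail   = blk(L+1:N);              % save the overlap for the next block
        yBlk   = real(ifft(fft(blk) .* H));
        outFifo = [outFifo; yBlk(N-L+1:N)];   % keep only the alias-free part
    end

    y(idx)  = outFifo(1:length(idx));     % callback takes exactly what it gave
    outFifo = outFifo(length(idx)+1:end);
end

ref = conv(x, h);
maxErr = max(abs(y - [zeros(L, 1); ref(1:length(x) - L)]));
```

The output is the direct convolution delayed by exactly `L` samples, and the output buffer never runs dry because it starts with `L` zeros in it.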

Real-time Convolution Reverb (for games)