VST Plug-In: How to implement a “lookahead” buffer?


Post

KarLoff wrote:Why should the host not be able to do anything, until it receives all pending outputs from the plug-in? That's just not true. The host simply feeds the available input samples, if any, into the first plug-in. Also, if any plug-in inside the chain returned some output, it forwards that output to the next plug-in in the chain. Otherwise not. Finally, if the last plug-in in the chain returned some output, it sends that output to the soundcard's output buffer (or the output file). That all is repeated in a large loop until the process ends.
if you have just one such plugin which returns less samples than it was fed with - how will the host fill the soundcard's output buffer? say goodbye to asio
please explain
and also, if you have a plugin which returns variable number of samples from block to block - wouldn't that sound like total crap at the end?
It doesn't matter how it sounds..
..as long as it has BASS and it's LOUD!

irc.libera.chat >>> #kvr

Post

antto wrote:if you have just one such plugin which returns less samples than it was fed with - how will the host fill the soundcard's output buffer? please explain
If, in one particular call, the plug-in got N samples of input and returned M samples of output, with M < N, then the host has only M samples to pass on. It's as simple as that. If M is too small (or even zero!) to feed the soundcard, then we simply can't feed the soundcard at this particular moment. Who says we need to feed the soundcard after every single plug-in call? If we don't have enough output samples (or even zero!) after a particular plug-in call, we simply have nothing to do at this particular moment. Instead, we make the next plug-in call - to get more output samples soon.

Note: At the beginning of a "LIVE" processing session, it may take a moment until we have enough data back from the plug-in to start feeding the soundcard. So we simply don't start feeding the soundcard until we do have enough data. That's the unavoidable delay you have in "LIVE" mode. But this must not be mixed up with an "artificial" delay introduced by the plug-in. Also it naturally doesn't exist in "file-based" processing mode.
antto wrote:and also, if you have a plugin which returns variable number of samples from block to block - wouldn't that sound like total crap at the end?
No, it doesn't. I'm not an ASIO expert; I'm more familiar with Win32 WaveOut. But the principle should be the same: the soundcard's output buffer works like a FIFO queue. At the beginning, we add some initial data to the FIFO queue and, once enough initial data is in the queue, we start playback. The only thing that matters is that the FIFO queue never runs out of data once playback has started! It doesn't matter at all how much data we add to the queue at once, or how often we append new data to the queue, as long as we make sure the queue doesn't run out of data...
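
To make the FIFO idea concrete, here is a minimal sketch. The `OutputFifo` class and its `primeThreshold` parameter are made up for illustration; this is not the WaveOut or ASIO API, just the queueing principle described above.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical queue sitting between the plug-in chain and the audio device.
class OutputFifo {
public:
    // primeThreshold: how many samples must be queued before playback starts.
    explicit OutputFifo(std::size_t primeThreshold)
        : primeThreshold_(primeThreshold), started_(false) {
        assert(primeThreshold > 0);
    }

    // Append however many samples the plug-in happened to return (may be zero).
    void push(const std::vector<float>& samples) {
        queue_.insert(queue_.end(), samples.begin(), samples.end());
        if (queue_.size() >= primeThreshold_) started_ = true;
    }

    // Called when the device wants `count` samples. Returns false before
    // playback has started or on underrun; `out` is then all zeros (silence).
    bool pop(std::size_t count, std::vector<float>& out) {
        out.assign(count, 0.0f);
        if (!started_ || queue_.size() < count) return false;
        for (std::size_t i = 0; i < count; ++i) {
            out[i] = queue_.front();
            queue_.pop_front();
        }
        return true;
    }

    std::size_t size() const { return queue_.size(); }
    bool started() const { return started_; }

private:
    std::size_t primeThreshold_;
    bool started_;
    std::deque<float> queue_;
};
```

The pushes can be any size. The only failure case is `pop` returning false after playback has started, which is exactly the "queue ran out of data" situation.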

Post

KarLoff wrote:
antto wrote:if you have just one such plugin which returns less samples than it was fed with - how will the host fill the soundcard's output buffer? please explain
If, in one particular call, the plug-in got N samples of input and returned M samples of output, with M < N, then the host has only M samples to pass on. It's as simple as that. If M is too small (or even zero!) to feed the soundcard, then we simply can't feed the soundcard at this particular moment. Who says we need to feed the soundcard after every single plug-in call? If we don't have enough output samples (or even zero!) after a particular plug-in call, we simply have nothing to do at this particular moment. Instead, we make the next plug-in call - to get more output samples soon.
Ok, what if you have a synth produce 256 samples of data, and it goes into both effect A and effect B, which are then mixed back together. Effect A returns 256 samples, that's fine. Effect B returns 128. This is a problem because the soundcard asked for 256. So we process another sound block. The synth produces yet another 256 samples of data, effect A returns yet another 256 samples (so we have 512 samples of data total), effect B returns 128 (which means we have 256 samples of effect B) and we've finally got enough to mix effect A with effect B and then output. But then we have 256 samples too much of effect A, and we have to go through the older samples first, so next buffer the output is going to be delayed by 256, and it's still going to produce twice as many samples as needed so the following blocks it will be delayed by 512, then 768, then 1024 and so on. Plus, the note data for the synth will probably go out of sync. This is bad.
Note: At the beginning of a "LIVE" processing session, it may take a moment until we have enough data back from the plug-in to start feeding the soundcard. So we simply don't start feeding the soundcard until we do have enough data. That's the unavoidable delay you have in "LIVE" mode. But this must not be mixed up with an "artificial" delay introduced by the plug-in. Also it naturally doesn't exist in "file-based" processing mode.
Remember that in live mode, your maximum acceptable latency is often VERY small. 20 ms of latency might not sound like much, but it makes keyboards totally unplayable! Typically you work with something like 96-sample buffers - which, at a 48 kHz sample rate, means you only have 2 ms to fill the soundcard buffer!
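
For reference, the arithmetic behind those numbers, as a small sketch. The 48 kHz sample rate is an assumption here, and both function names are made up for illustration.

```cpp
#include <cassert>

// Duration of one audio buffer in milliseconds.
// sampleRateHz is an assumed parameter; pick whatever rate the device runs at.
inline double bufferMillis(int frames, double sampleRateHz) {
    assert(frames >= 0 && sampleRateHz > 0.0);
    return 1000.0 * frames / sampleRateHz;
}

// Number of whole frames that fit into a latency budget given in milliseconds.
inline int framesForMillis(double millis, double sampleRateHz) {
    assert(millis >= 0.0 && sampleRateHz > 0.0);
    return static_cast<int>(millis * sampleRateHz / 1000.0);
}
```

At 48 kHz, `bufferMillis(96, 48000.0)` gives 2 ms, and a 20 ms budget corresponds to `framesForMillis(20.0, 48000.0)`, i.e. 960 frames.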

Furthermore, what if you run microphone input through effect B? Soundcard asks for 256 samples of output, you get 256 samples of input from the microphone, effect B returns 128 samples, and then you're screwed: you need a full 256 sample block for output or else you will have a drop out (if you don't give the soundcard enough data it will fill in zero and it will sound terrible). But you can't get 256 more samples out of the microphone input because these samples don't exist yet.

Post

KarLoff wrote:Why should the host not be able to do anything, until it receives all pending outputs from the plug-in? That's just not true. The host simply feeds the available input samples, if any, into the first plug-in. Also, if any plug-in inside the chain returned some output, it forwards that output to the next plug-in in the chain. Otherwise not. Finally, if the last plug-in in the chain returned some output, it sends that output to the soundcard's output buffer (or the output file). That all is repeated in a large loop until the process ends.
I think you're implicitly assuming a "spacebar" app, where a user just hits spacebar to play back something, e.g. as in an audio editor. A sequencer, however, has to sync MIDI and audio inputs too. Since a host has no way of knowing what the user does next, there is no free lookahead.

Richard
Synapse Audio Software - www.synapse-audio.com

Post

KarLoff wrote:Also it naturally doesn't exist in "file-based" processing mode.
Nobody cares about "file-based" processing, because that's a trivial process to solve: just load the freaking file into the plugin directly!

Post

MadBrain wrote:
KarLoff wrote:
antto wrote:if you have just one such plugin which returns less samples than it was fed with - how will the host fill the soundcard's output buffer? please explain
If, in one particular call, the plug-in got N samples of input and returned M samples of output, with M < N, then the host has only M samples to pass on. It's as simple as that. If M is too small (or even zero!) to feed the soundcard, then we simply can't feed the soundcard at this particular moment. Who says we need to feed the soundcard after every single plug-in call? If we don't have enough output samples (or even zero!) after a particular plug-in call, we simply have nothing to do at this particular moment. Instead, we make the next plug-in call - to get more output samples soon.
Ok, what if you have a synth produce 256 samples of data, and it goes into both effect A and effect B, which are then mixed back together. Effect A returns 256 samples, that's fine. Effect B returns 128. This is a problem because the soundcard asked for 256. So we process another sound block. The synth produces yet another 256 samples of data, effect A returns yet another 256 samples (so we have 512 samples of data total), effect B returns 128 (which means we have 256 samples of effect B) and we've finally got enough to mix effect A with effect B and then output. But then we have 256 samples too much of effect A, and we have to go through the older samples first, so next buffer the output is going to be delayed by 256, and it's still going to produce twice as many samples as needed so the following blocks it will be delayed by 512, then 768, then 1024 and so on. Plus, the note data for the synth will probably go out of sync. This is bad.
If the outputs of plug-in A and B are supposed to be mixed, but A returned 256 samples while B returned only 128, we will mix the 128 samples from B with the first 128 samples of A. The rest of A's samples need to wait in a temporary intermediate buffer, for now. We'll mix those with B's output as soon as we get more output from B. Now, if A always returned 256 samples (for an input of 256) and B always returned 128 (for an input of 256), more and more samples from A would accumulate in the temporary intermediate buffer. But in practice this doesn't happen! Unless B has an infinitely large "lookahead" buffer, which is impossible, B will be "saturated" at some point. This is the point where B starts returning 256 output samples for 256 input samples. Thus, at this point, the number of samples we need to keep in the temporary intermediate buffer has reached its maximum.
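
Here is a rough sketch of that behaviour. `LookaheadStage` and `Mixer` are hypothetical names and have nothing to do with the actual VST interface; the model simply assumes that B holds back a fixed number of samples and returns everything beyond that.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical effect "B": keeps the newest `lookahead` samples to itself
// and returns everything older than that (possibly nothing).
class LookaheadStage {
public:
    explicit LookaheadStage(std::size_t lookahead) : lookahead_(lookahead) {}

    std::vector<float> process(const std::vector<float>& in) {
        held_.insert(held_.end(), in.begin(), in.end());
        const std::size_t ready =
            held_.size() > lookahead_ ? held_.size() - lookahead_ : 0;
        const auto end = held_.begin() + static_cast<std::ptrdiff_t>(ready);
        std::vector<float> out(held_.begin(), end);
        held_.erase(held_.begin(), end);
        return out;
    }

private:
    std::size_t lookahead_;
    std::deque<float> held_;
};

// Mixes two streams that may arrive in unequal chunks; the surplus of the
// faster stream waits in an intermediate buffer.
class Mixer {
public:
    std::vector<float> mix(const std::vector<float>& a,
                           const std::vector<float>& b) {
        pendingA_.insert(pendingA_.end(), a.begin(), a.end());
        pendingB_.insert(pendingB_.end(), b.begin(), b.end());
        const std::size_t n = std::min(pendingA_.size(), pendingB_.size());
        std::vector<float> out(n);
        for (std::size_t i = 0; i < n; ++i) {
            out[i] = pendingA_.front() + pendingB_.front();
            pendingA_.pop_front();
            pendingB_.pop_front();
        }
        // After mixing, at most one of the two queues still holds samples.
        assert(pendingA_.empty() || pendingB_.empty());
        return out;
    }

    std::size_t backlogA() const { return pendingA_.size(); }

private:
    std::deque<float> pendingA_, pendingB_;
};
```

With a lookahead of 128 and blocks of 256, the first `mix` call yields 128 samples and every later call yields 256, while `backlogA()` stays at 128. That constant backlog equals B's lookahead, and it is also the delay the mixed signal picks up.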

Now about sending the data to the Soundcard, which is an unrelated issue: After each step, we mix as much output from A and B as possible at that point. For example, after the very first step, this would be 128 samples. At some point it will go up to 256 samples per step. Anyway, the samples that have already been mixed together, regardless of how many that is, can be forwarded to the next plug-in in the chain - or to the Soundcard (if there is no further plug-in in the chain). In theory, we can append as much data to the Soundcard's FIFO buffer as we have ready at a certain point. If, in practice, the smallest chunk of data that can be appended to the Soundcard's queue at once is fixed to N samples, then we simply collect all samples we have ready in a temporary intermediate buffer. As soon as at least N samples have accumulated in that buffer, we send them to the Soundcard.

MadBrain wrote:Furthermore, what if you run microphone input through effect B? Soundcard asks for 256 samples of output, you get 256 samples of input from the microphone, effect B returns 128 samples, and then you're screwed: you need a full 256 sample block for output or else you will have a drop out (if you don't give the soundcard enough data it will fill in zero and it will sound terrible). But you can't get 256 more samples out of the microphone input because these samples don't exist yet.
The microphone doesn't stop producing input after the first 256 samples, I suppose. So, if the plug-in returned only 128 samples for the first 256 samples of input, we keep those 128 samples. Then we feed the next 256 samples from the microphone into the plug-in, as soon as they come in. Again we keep as much output as we get back from the plug-in. And so on. As soon as we have accumulated enough output from the plug-in in the Soundcard's FIFO queue, we can start playback. From that point on, we append whatever we get from the plug-in to the Soundcard's queue.

mystran wrote:
KarLoff wrote:Also it naturally doesn't exist in "file-based" processing mode.
Nobody cares about "file-based" processing, because that's a trivial process to solve: just load the freaking file into the plugin directly!
Sorry, just because you don't care, you cannot say that nobody does. If you don't care, that's fine with me. But how does it help the discussion to tell us that you don't care? Do you always scan this forum for topics you don't care about, just to post that you don't care? :P

Anyway, there exist a large number of audio editors that are mainly intended for editing audio files (Audition, Acoustica, WaveLab, GoldWave - just to name a few). And the only plug-in interface that is (almost) universally supported by those audio editors is VST. So getting my VST plug-in to work properly with those audio editors is very important to me. As explained before, it is far from being as "trivial" as you imply. As it turns out, this is partly due to the vague specification of VST and partly due to incomplete VST support in some audio editors.

After all, the reason for this whole thread is that I wanted to learn the correct and most compatible way to implement a "lookahead" buffer in VST. I didn't intend to start a "philosophical" discussion about plug-in interfaces :shrug:

Post

Just telling you why VST is designed that way (it's designed for real time use, so many design choices are going to work against you).
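
For what it's worth, the usual way to live with that design is to keep the block size equal in both directions and delay the audio instead: the plug-in returns silence until its lookahead has filled, and tells the host how many samples of latency it adds (in VST 2.x that is what `setInitialDelay` is for, as far as I know). A minimal sketch, with `LookaheadDelay` being a made-up helper class and not part of any SDK:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: a fixed delay line that always writes exactly as many
// samples as it reads, emitting zeros until the lookahead has filled.
class LookaheadDelay {
public:
    explicit LookaheadDelay(std::size_t lookahead)
        : buffer_(lookahead, 0.0f), pos_(0) {}

    // Same-length in/out, like a block-based process callback.
    // Safe for in-place use (in == out) because the input is read first.
    void process(const float* in, float* out, std::size_t frames) {
        assert(in != nullptr && out != nullptr);
        for (std::size_t i = 0; i < frames; ++i) {
            const float x = in[i];
            if (buffer_.empty()) { out[i] = x; continue; }
            out[i] = buffer_[pos_];  // sample from `lookahead` frames ago
            buffer_[pos_] = x;       // newest sample, available for analysis now
            pos_ = (pos_ + 1) % buffer_.size();
        }
    }

    // The latency, in samples, that the plug-in would report to the host.
    std::size_t latency() const { return buffer_.size(); }

private:
    std::vector<float> buffer_;
    std::size_t pos_;
};
```

The detector (limiter, compressor, whatever) looks at the newest samples while the audio path plays the delayed ones. A host with delay compensation can then shift the other tracks by `latency()` samples.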

Post

The old long-gone mac premiere plugin spec had a non-realtime mode where the plugin takes over the host. Once the user clicks the "process" button in the plugin window, the plugin could inquire of the host how many samples in the song, ask for any number of samples in a random-access fashion, and then process the samples and send them back to the host to save back to file. The plugin was allowed to continue random-access messing with the audio, in any order it likes, until it finally gets done and returns control to the host, or alternately maybe the user would lose patience and click the esc key if it went on too long. :)

Maybe some of the other oddball audio-editor-specific plugin formats in current audio editor softwares have such features. I haven't surveyed the field.

It is a nice way to do things for nonrealtime editing. I wrote some non-realtime Premiere plugins long ago that would be near impossible in VST.

===

Dunno much about Pro Tools, surely it has latency compensation nowadays. I recall long in the past, when audio software was a new thing-- There was a time when Pro Tools didn't have latency compensation, but they included a delay plugin. In order to use a plugin with latency on one track, the user had to put compensating delay plugins on all the other tracks to get everything to line up.
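
The manual procedure boils down to simple arithmetic: every track gets delayed by the difference between the largest latency in the session and its own. A small sketch, with the function name made up for illustration:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Given each track's plug-in latency in samples, return the extra delay each
// track needs so that all tracks line up with the slowest one.
std::vector<int> compensatingDelays(const std::vector<int>& latencies) {
    if (latencies.empty()) return {};
    const int maxLatency =
        *std::max_element(latencies.begin(), latencies.end());
    std::vector<int> delays;
    delays.reserve(latencies.size());
    for (int latency : latencies) {
        assert(latency >= 0);
        delays.push_back(maxLatency - latency);
    }
    return delays;
}
```

For four tracks with latencies of 0, 512, 0 and 64 samples, the compensating delays come out as 512, 0, 512 and 448.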

===

The different-size-input-versus-output feature is common in time-pitch stretching libraries. Of course it would be almost impossible to offer a time-pitch stretch library unless the input and output sizes could be different. In the better ones, they also need a lookahead "head start" to get far enough ahead in real time so they can make good-sounding stretching.

So during ordinary operation of the library, cranking a track thru it-- If you have to relocate audio, you tell the lib to clear its buffers, then you send it a lot of data at the new location so it can get its headstart, and then depending on the time-pitch parameters, it might ask for a different number of samples, buffer-to-buffer, in order to return a constant number of samples per buffer. And another approach is to send it fixed blocks of samples, but it waits awhile before it starts sending anything back.
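
A sketch of the first approach (variable input, constant output). This is not the API of Elastique or of any other real library; `InputScheduler` is a hypothetical helper that only shows the bookkeeping, with the stretch factor given as a fraction so that no rounding error accumulates.

```cpp
#include <cassert>
#include <cstdint>

// Works out how many input frames to request so that a stretcher can return
// a constant number of output frames per buffer.
class InputScheduler {
public:
    // Stretch factor = num / den = output duration / input duration,
    // e.g. 3/2 makes the audio 50% longer.
    InputScheduler(std::uint64_t num, std::uint64_t den)
        : num_(num), den_(den), remainder_(0) {
        assert(num > 0 && den > 0);
    }

    // Input frames needed for the next `outFrames` output frames. The
    // fractional part is carried over, so the count varies between buffers.
    std::uint64_t framesNeeded(std::uint64_t outFrames) {
        const std::uint64_t scaled = outFrames * den_ + remainder_;
        remainder_ = scaled % num_;
        return scaled / num_;
    }

private:
    std::uint64_t num_, den_, remainder_;
};
```

With a factor of 3/2 and 256-frame output buffers, the requests come out as 170, 171, 171 input frames: 512 frames in for 768 frames out. A real library would additionally want a "head start" of extra input before the first output, which this sketch leaves out.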

I studied several of the nice libraries, and programmed a lot using Elastique, which IMO is a good sounding, well-behaved and well-written library.

Anyway, just saying that it is interesting that all the different time-pitch libraries that I looked at, have different approaches as how to handle the difficulty of different input-output buffer sizes.

It is a problem like, "where do we put the skunk works?" There is no ideal way to handle the problem, and it is a PITA to program regardless of the hand-shaking trivia between host and library. But it is rather unavoidable that you have to put up with it, for certain tasks. If you are gonna stretch time and pitch, it is a necessary evil.
