How much CPU would 10 envelopes per voice take? It seems a lot...

DSP, Plugin and Host development discussion.

Post

Nowhk wrote:
S0lo wrote:Well then, IMO your worst case is good!!.

Later on you could multi thread the voices, which should give you very good performance.
Its the worst case doing nothing :) Only envelopes...
So those voices are not playing?
www.solostuff.net
Advice is heavy. So don’t send it like a mountain.

Post

S0lo wrote:
Nowhk wrote:
S0lo wrote:Well then, IMO your worst case is good!!.

Later on you could multi thread the voices, which should give you very good performance.
Its the worst case doing nothing :) Only envelopes...
So those voices are not playing?
Voices are playing. Envelopes are playing/processing (i.e. calculating their value on each sample). But nothing else at all (no modulation on pitch and/or filter, no osc running, no filter at all).

Only envelopes processing, 6% :D
Is it still good?

Post

160 envelopes at 6%. I wouldn't say thats bad at all :)

Post

S0lo wrote:160 envelopes at 6%. I wouldn't say thats bad at all :)
Well, yes :) Maybe I can improve them even more ;)

Still not sure how I could smooth controls with this new configuration... :(
This way is wrong:

Code: Select all

for (int voiceIndex = 0; voiceIndex < PLUG_VOICES_BUFFER_SIZE; voiceIndex++) {
	Voice &voice = pVoiceManager->mVoices[voiceIndex];
	if (!voice.mIsPlaying) {
		continue;
	}
	
	int nbSamples = blockSize;			
	while (nbSamples > 0) {
		// smooth controls
		AdvancedParameter **ppParameter = mAdvancedParameters.GetList();
		for (int i = 0; i < mAdvancedParameters.GetSize(); i++, ppParameter++) {
			AdvancedParameter *pParameter = *ppParameter;
			pParameter->SmoothControl();
		}

		//
		
		nbSamples--;
	}
}
since only the first voice will be "correct" (if blockSize is "large" enough; otherwise it's a mashup of errors between smoothing and voices :D )...

Post

Nowhk wrote:
S0lo wrote:160 envelopes at 6%. I wouldn't say thats bad at all :)
Well, yes :) Maybe I can improve them even more ;)

Still not sure how I could smooth controls with this new configuration... :(
This way is wrong:

[ code excerpt ]

since only the first voice will be "correct" (if blockSize is "large" enough; otherwise it's a mashup of errors between smoothing and voices :D )...
This is how I suggest you do it:
(notice that you can do a TON of stuff at control rate with a single "clock" per voice, and simply ramp the final computed volumes at the very end of the modulation process with a simple linear ramp that can be processed very fast - this includes all the control smoothing stuff!)

Code: Select all

void MainIPlug::ProcessDoubleReplacing(double **inputs, double **outputs, int nFrames) {
	double *outputLeft = outputs[0];
	double *outputRight = outputs[1];
	memset(outputLeft, 0, nFrames * sizeof(double)); // outputLeft[i] = inputs[0][i];
	memset(outputRight, 0, nFrames * sizeof(double)); // outputRight[i] = inputs[1][i];

	// sync
	pTempoEngine->Sync();

	// buffer
	int samplesLeft = nFrames;
	while (samplesLeft > 0) {
		// events
		pMidiHandler->Process();
		pVoiceManager->Process();

		int blockSize = samplesLeft;
		blockSize = pMidiHandler->GetSamplesTillNextEvent(blockSize);
		blockSize = pVoiceManager->GetSamplesTillNextEvent(blockSize);

		// voices
		for (int voiceIndex = 0; voiceIndex < PLUG_VOICES_BUFFER_SIZE; voiceIndex++) {
			Voice &voice = pVoiceManager->mVoices[voiceIndex];
			if (!voice.mIsPlaying) {
				continue;
			}

			// init
			if (voice.mSample == 0.0) {
				// new voice restart the envelopes
				for (int envelopeIndex = 0; envelopeIndex < ENVELOPES_CONTAINER_NUM_ENVELOPE_MANAGER; envelopeIndex++) {
					VoiceParameters &voiceParameters = pEnvelopesManager->pEnvelope[envelopeIndex]->mVoiceParameters[voiceIndex];
					voiceParameters.mIsCompleted = false;
					voiceParameters.mControlRateIndex = 0;
					voiceParameters.mBlockStep = gBlockSize;
					voiceParameters.mStep = 0.0;
				}
			}

			double *left = outputLeft;
			double *right = outputRight;
			int nbSamples = blockSize;			
			while (nbSamples > 0) {		
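				// voiceParameters is out of scope here, so a per-voice counter is used instead
				// (mControlRateIndex on Voice is an assumed member, reset to 0 when the voice starts)
				if (voice.mControlRateIndex-- == 0) {
					voice.mControlRateIndex = PLUG_CONTROL_RATE - 1;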
	
					// Parameter smoothing is done by ramping all values AFTER
					// calculating all modulations, envelopes, LFOs etc
					// and only on parameters that are highly susceptible to zipper
					AdvancedParameter **ppParameter = mAdvancedParameters.GetList();
					for (int i = 0; i < mAdvancedParameters.GetSize(); i++, ppParameter++) {
						AdvancedParameter *pParameter = *ppParameter;
						pParameter->CalcNextValue();
					}

					// voice volume
					voice.mVoiceVolume.Process();

					// envelopes
					pEnvelopesManager->Process(voice);

					// PARAMETER RAMPING TIME!!!
					// Calculate final output from whole modulation process
					// and calc linear ramp speed over control block
					float tgVolume = voice.mVoiceVolume.output();
					float tgPanning = voice.mPanning;
					// etc repeat for everything that needs ramping

					if(voice.mInitialRampSkip) {
						// special skip for snappy attacks!
						voice.mInitialRampSkip = false;

						voice.mRampedVolume = tgVolume;
						voice.mRampedPanning = tgPanning;
						// etc repeat for everything that needs ramping

					}
					voice.mVolumeRamp = (tgVolume - voice.mRampedVolume) / PLUG_CONTROL_RATE;
					voice.mPanningRamp = (tgPanning - voice.mRampedPanning) / PLUG_CONTROL_RATE;
					// etc repeat for everything that needs ramping

				}

				// audio
				float leftOut = 0;
				float rightOut = 0;
				//pOscillator1->Process(voice, leftOut, rightOut);
				
				*left++ += leftOut * voice.mRampedVolume * (1 - voice.mRampedPanning);
				*right++ += rightOut * voice.mRampedVolume *  (1 + voice.mRampedPanning);
				
				voice.mRampedVolume += voice.mVolumeRamp;
				voice.mRampedPanning += voice.mPanningRamp;
				// etc repeat for everything that needs ramping
				
				nbSamples--;
			}

			voice.Increment(blockSize);
		}
		
		// fx (ProcessBlock)

		samplesLeft -= blockSize;
		outputLeft += blockSize;
		outputRight += blockSize;

		pMidiHandler->Flush(blockSize);
		pVoiceManager->Reset();
	}
}
The EnvelopesManager::Process function is now only called once per control block instead of once per sample, MUCH faster:

Code: Select all

void EnvelopesManager::Process(Voice &voice) {
	int voiceIndex = voice.mIndex;
	for (int envelopeIndex = 0; envelopeIndex < ENVELOPES_CONTAINER_NUM_ENVELOPE_MANAGER; envelopeIndex++) {
		Envelope &envelope = *pEnvelope[envelopeIndex];
		VoiceParameters &voiceParameters = envelope.mVoiceParameters[voiceIndex];

		// no loop/sustain mode ended cycle
		if (voiceParameters.mIsCompleted) {
			continue;
		}


		// calculate dynamic spline block
		if (voiceParameters.mBlockStep >= gBlockSize) {
			// loop
			if (voiceParameters.mStep >= envelope.mLoopLengthInSamples) {
				if (envelope.mLoopType == EnvelopeLoopType::ENVELOPE_LOOP_TYPE_LOOP || envelope.mLoopType == EnvelopeLoopType::ENVELOPE_LOOP_TYPE_MASTER) {
					voiceParameters.mStep = fmod(voiceParameters.mStep, envelope.mLoopLengthInSamples);
				}
				else {
					// update value
					double value = envelope.mAmps[envelope.mLoopPointIndex];
					value = envelope.mIsEnabled * ((1 + envelope.mIsBipolar) / 2.0 * value + (1 - envelope.mIsBipolar) / 2.0) * envelope.mAmount;
					envelope.mOutputConnector_CV.mPolyValue[voiceIndex] = value;

					voiceParameters.mIsCompleted = true;
					continue;
				}
			}

			// refresh section index
			unsigned int sectionIndex = 0;
			while (sectionIndex < envelope.mNumPoints - 1) {
				if (voiceParameters.mStep >= envelope.mSectionLengths[sectionIndex] && voiceParameters.mStep < envelope.mSectionLengths[sectionIndex + 1]) break;
				sectionIndex++;
			}

			// refresh section step
			unsigned int previousSectionsSteps = 0;
			int i = 0;
			while (i < sectionIndex) {
				previousSectionsSteps += envelope.mSectionLengths[i + 1] - envelope.mSectionLengths[i];
				i++;
			}
			double sectionStep = voiceParameters.mStep - previousSectionsSteps;

			// refresh  block index
			unsigned int blockIndex = (unsigned int)floor(sectionStep / gBlockSize);

			// refresh  section length
			double sectionLength = envelope.mSectionLengths[sectionIndex + 1] - envelope.mSectionLengths[sectionIndex];

			// refresh p0/p1
			int numBlocks = (int)(sectionLength / gBlockSize);
			double numBlocksFraction = 1.0 / numBlocks;
			double pos0 = blockIndex * numBlocksFraction;
			double pos1 = (blockIndex + 1) * numBlocksFraction;
			double a = 1.0 - (1.0 / envelope.mTensions[sectionIndex]);
			double p0 = pos0 / (pos0 + a * (pos0 - 1.0));
			double p1 = pos1 / (pos1 + a * (pos1 - 1.0));

			// refresh block start/end amp
			double sectionStartAmp = envelope.mAmps[sectionIndex];
			double sectionEndAmp = envelope.mAmps[sectionIndex + 1];
			double sectionDeltaAmp = sectionEndAmp - sectionStartAmp;
			voiceParameters.mBlockStartAmp = sectionStartAmp + p0 * sectionDeltaAmp;
			double blockEndAmp = sectionStartAmp + p1 * sectionDeltaAmp;

			// refresh block fraction/step
			voiceParameters.mBlockFraction = (blockEndAmp - voiceParameters.mBlockStartAmp) * (1.0 / gBlockSize);
			voiceParameters.mBlockStep = fmod(voiceParameters.mBlockStep, gBlockSize);
		}

		// update value
		double value = voiceParameters.mBlockStartAmp + (voiceParameters.mBlockStep * voiceParameters.mBlockFraction);
		value = envelope.mIsEnabled * ((1 + envelope.mIsBipolar) / 2.0 * value + (1 - envelope.mIsBipolar) / 2.0) * envelope.mAmount;
		envelope.mOutputConnector_CV.mPolyValue[voiceIndex] = value;
		
		// next phase
		voiceParameters.mBlockStep += envelope.mRate * PLUG_CONTROL_RATE;
		voiceParameters.mStep += envelope.mRate * PLUG_CONTROL_RATE;
	}
}

Post

Hi MadBrain, thanks for the reply :)
There are a few things in your code that I don't get at all. To begin with...

1° Point: you call pEnvelopesManager->Process(voice); at the voice's control rate, but inside EnvelopesManager::Process you also use a control rate before "calculate dynamic spline block". Does that make sense? A double control rate? :o

2° Point: if you call pEnvelopesManager->Process(voice) at control rate (from PDR, i.e. ProcessDoubleReplacing), the envelope phases will be updated at control rate, which messes up the progression. I think that's wrong? This is why I always call that function and only do the "heavy" math at the "internal" control rate: the phase needs to be updated per sample, always!

Unfortunately, that control rate (with the actual math I have) changes very little, because of the dependency the compiler sets up between voice and cache/branches (I think?!?!).
If you write something like this:

Code: Select all

void EnvelopesManager::Process(Voice &voice) {
	int voiceIndex = voice.mIndex;
	for (int envelopeIndex = 0; envelopeIndex < ENVELOPES_CONTAINER_NUM_ENVELOPE_MANAGER; envelopeIndex++) {
		Envelope &envelope = *pEnvelope[envelopeIndex];
		VoiceParameters &voiceParameters = envelope.mVoiceParameters[voiceIndex];

		// no loop/sustain mode ended cycle
		if (voiceParameters.mIsCompleted) {
			continue;
		}

		// control rate
		if (voiceParameters.mControlRateIndex-- == 0 && false) {
			voiceParameters.mControlRateIndex = PLUG_CONTROL_RATE - 1;

			// my stuff with access to the voiceParameters
		}

		// next phase
		voiceParameters.mBlockStep += envelope.mRate;
		voiceParameters.mStep += envelope.mRate;
	}
}
(where all my heavy math stuff will never be calculated, since there's that && false) it takes 5% of CPU instead of 6%. But if I remove the whole "if" statement (which basically does the same thing, since its body is never executed anyway) CPU falls to 3%. 2% for a "futile" if :) I think it's because the "unused" code contains a read of voiceParameters.mBlockStep, and that's the dependency which adds weight and CPU. But honestly, I'm not sure how I could separate the two things.

3° Point: I see what you mean with "PARAMETER RAMPING TIME!!!" (it's similar to what I already do), but there's the same problem: the parameter is "global", not per voice. When you do pParameter->CalcNextValue();, you are doing it for a single parameter shared across all voices. Example: if I have a parameter that goes from 0 to 1.0 in 100 samples, and I have 10 voices, it reaches the value 1.0 after 10 samples, not 100 (because, for every sample, there are 10 voices that will call CalcNextValue() on the same parameter). That's my trouble.

Do you see what I mean?
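To make the arithmetic of the 3° Point concrete, here is a minimal sketch. RampSmoother and both function names are made up for illustration and are not part of the plugin; the smoother is a plain 100-step linear ramp standing in for whatever CalcNextValue() really does.

```cpp
#include <cassert>

// Hypothetical smoother: a linear ramp from 0.0 to 1.0 over `totalSteps` calls to Advance().
struct RampSmoother {
    int stepsDone = 0;
    int totalSteps = 100;

    void Advance() { if (stepsDone < totalSteps) stepsDone++; }
    double Value() const { return static_cast<double>(stepsDone) / totalSteps; }
    bool Done() const { return stepsDone >= totalSteps; }
};

// The shared smoother is advanced inside the voice loop,
// so it moves numVoices steps per audio sample.
int SamplesUntilDoneAdvancingPerVoice(int numVoices) {
    assert(numVoices > 0);
    RampSmoother shared;
    int samples = 0;
    while (!shared.Done()) {
        for (int v = 0; v < numVoices; v++) shared.Advance();
        samples++;
    }
    return samples;
}

// The shared smoother is advanced once per audio sample;
// every voice would then read the same Value().
int SamplesUntilDoneAdvancingPerSample() {
    RampSmoother shared;
    int samples = 0;
    while (!shared.Done()) {
        shared.Advance();
        samples++;
    }
    return samples;
}
```

With 10 voices the first variant finishes the ramp in 10 samples, the second in the intended 100.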

Post

Nowhk wrote: 1° Point: you call pEnvelopesManager->Process(voice); at the voice's control rate, but inside EnvelopesManager::Process you also use a control rate before "calculate dynamic spline block". Does that make sense? A double control rate? :o
No it's just a mistake :-)
Nowhk wrote:2° Point: if you call pEnvelopesManager->Process(voice) at control rate (from PDR, i.e. ProcessDoubleReplacing), the envelope phases will be updated at control rate, which messes up the progression. I think that's wrong? This is why I always call that function and only do the "heavy" math at the "internal" control rate: the phase needs to be updated per sample, always!
I haven't looked into your code much at all, but that's just a question of making sure your envelope handling can adapt to whatever slower control rate it's run at. You already need to do this to adapt to different sampling rates (44.1 kHz vs 48 kHz vs 96 kHz etc). This means you'll probably have to handle the case where you advance through multiple steps in a single block, yes. Your code should be written in a way that the overall progression doesn't change.
Nowhk wrote:

Code: Select all

		if (voiceParameters.mControlRateIndex-- == 0 && false) {
(where all my heavy math stuff will never be calculated, since there's that && false) it takes 5% of CPU instead of 6%. But if I remove the whole "if" statement (which basically does the same thing, since its body is never executed anyway) CPU falls to 3%. 2% for a "futile" if :) I think it's because the "unused" code contains a read of voiceParameters.mBlockStep, and that's the dependency which adds weight and CPU. But honestly, I'm not sure how I could separate the two things.
It might be the updating and testing of voiceParameters.mControlRateIndex for every envelope, per voice. Since your control rate should be the same for all envelopes, why not use the same update counter for all your envelopes? Or better yet, share it with all of your modulation process.
Nowhk wrote:3° Point: I see what you mean with "PARAMETER RAMPING TIME!!!" (it's similar to what I already do), but there's the same problem: the parameter is "global", not per voice. When you do pParameter->CalcNextValue();, you are doing it for a single parameter shared across all voices. Example: if I have a parameter that goes from 0 to 1.0 in 100 samples, and I have 10 voices, it reaches the value 1.0 after 10 samples, not 100 (because, for every sample, there are 10 voices that will call CalcNextValue() on the same parameter). That's my trouble.

Do you see what I mean?
I see what you mean, it's just that I wonder:
- Do you really have to fade every single parameter?
- ...including the ones like pitch where you have to call pow() every time it changes?
- For volume, do you really need to fade every single parameter that goes into the calculation? (gain, envelope, lfo modulation, velocity, volume and expression CC, etc...) Why not just calculate per-voice per-block final volume once all of those are factored in using the unramped parameters, and ramp the final result per voice? (which would cover all of those parameters)
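A minimal sketch of that last suggestion, under stated assumptions: RampedGain and FinalVolume are hypothetical names, and the three factors are just placeholders for whatever goes into the volume calculation. All factors are combined unramped at control rate, and only the single per-voice result is ramped per sample.

```cpp
#include <cassert>

// Hypothetical per-voice state: one ramped value that covers gain, envelope,
// LFO, velocity, CCs and so on, because they are all folded into the target.
struct RampedGain {
    float current = 0.0f;   // gain applied to the audio right now
    float increment = 0.0f; // per-sample step toward the latest target

    // Called once per control block with the fully modulated (unramped) target.
    void SetTarget(float target, int blockLength) {
        assert(blockLength > 0);
        increment = (target - current) / blockLength;
    }

    // Called once per sample; returns the gain to use for this sample.
    float Next() {
        const float out = current;
        current += increment;
        return out;
    }
};

// Control-rate part: combine the unramped factors into a single target value.
float FinalVolume(float gain, float envelope, float velocity) {
    return gain * envelope * velocity;
}
```

Since the ramp state lives in the voice, no global parameter gets advanced from inside the voice loop.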

Post

MadBrain wrote:I haven't looked into your code much at all, but that's just a question of making sure your envelope handling can adapt to whatever slower control rate it's run at. You already need to do this to adapt to different sampling rates (44.1 kHz vs 48 kHz vs 96 kHz etc). This means you'll probably have to handle the case where you advance through multiple steps in a single block, yes. Your code should be written in a way that the overall progression doesn't change.
Yep, it already adapts to different sample rates, for example. Not sure what you mean by "handle the case where you advance through multiple steps in a single block": mRate is an "additive multiplier". If it's 1.0 @ sample rate 44100 and my envelope length is 1 second, it will increment by 1 sample at a time and take 44100 samples to complete a cycle:

Code: Select all

voiceParameters.mStep += envelope.mRate;
If I double the speed, the same length is reached in 22050 samples.
But that increment needs to be called on every sample, not at control rate (this is what I meant).
MadBrain wrote:It might be the updating and testing of voiceParameters.mControlRateIndex for every envelope, per voice. Since your control rate should be the same for all envelopes, why not use the same update counter for all your envelopes? Or better yet, share it with all of your modulation process.
Tried. I call the whole Envelope process at control rate, and increment only the phase on every sample. This way:

Code: Select all

// control rate
if (mControlRateIndex-- == 0) {
	mControlRateIndex = PLUG_CONTROL_RATE - 1;

	// envelopes
	pEnvelopesManager->Process(voice);
}

....

// increments
pEnvelopesManager->Increments(voice);

void EnvelopesManager::Increments(Voice &voice) {
	int voiceIndex = voice.mIndex;

	for (int envelopeIndex = 0; envelopeIndex < ENVELOPES_CONTAINER_NUM_ENVELOPE_MANAGER; envelopeIndex++) {
		Envelope &envelope = *pEnvelope[envelopeIndex];
		VoiceParameters &voiceParameters = envelope.mVoiceParameters[voiceIndex];

		// no loop/sustain mode ended cycle
		if (voiceParameters.mIsCompleted) {
			continue;
		}

		// next phase
		voiceParameters.mBlockStep += envelope.mRate;
		voiceParameters.mStep += envelope.mRate;
	}
}
But the performance doesn't change. It seems that these two "increments" are very expensive to perform (I'm still at 6%; without the increments, keeping the env process, I'm at 3%).

Maybe I should also move the increments inside the control rate check (where envelopes are processed) and do something like this?

Code: Select all

		voiceParameters.mBlockStep += envelope.mRate * PLUG_CONTROL_RATE;
		voiceParameters.mStep += envelope.mRate * PLUG_CONTROL_RATE;
Not sure how "elegant" this way is. And I still don't get why these increment operations are so heavy on the CPU :neutral:
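As a rough check of that idea, a small sketch comparing the two ways of advancing the phase (function names are hypothetical; it assumes the phase is only read at control rate and that mRate stays constant within the block):

```cpp
#include <cassert>

// Advance an envelope phase one sample at a time, numSamples times.
double AdvancePerSample(double step, double rate, int numSamples) {
    assert(numSamples >= 0);
    for (int i = 0; i < numSamples; i++) step += rate;
    return step;
}

// Advance the same phase once per control block.
double AdvancePerBlock(double step, double rate, int numSamples) {
    assert(numSamples >= 0);
    return step + rate * numSamples;
}
```

Both land on the same phase up to floating-point rounding, so the overall progression does not change as long as nothing needs the phase between two control ticks.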
MadBrain wrote: I see what you mean, it's just that I wonder:
- Do you really have to fade every single parameter?
- ...including the ones like pitch where you have to call pow() every time it changes?
- For volume, do you really need to fade every single parameter that goes into the calculation? (gain, envelope, lfo modulation, velocity, volume and expression CC, etc...) Why not just calculate per-voice per-block final volume once all of those are factored in using the unramped parameters, and ramp the final result per voice? (which would cover all of those parameters)
Yes, I need to (or at least, I think so).

On every sample, the modulation is calculated as param value + mod amount.
Mod amount will change at control rate (which is fine, as you suggested).
Param value can't change at "block size", because that can result in a huge step between blocks, introducing clicks. It must be smoothed.

Let's say I'm modulating internal volume (or pitch).
Block size is 256 and the mod amount is updated every 8 samples (32 control steps per block). Between block 0 and block 1 the vol param changes from 0.3 to 0.7.
The mod range within a block is instead 0.2. I'll have:

sample 0: 0.3 + 0.2/32
sample 1: 0.3 + 0.2/32
...
sample 7: 0.3 + 0.2/32
sample 8: 0.3 + 0.2/32 * 2
sample 9: 0.3 + 0.2/32 * 2
...
sample 15: 0.3 + 0.2/32 * 2
sample 16: 0.3 + 0.2/32 * 3
sample 17: 0.3 + 0.2/32 * 3
...
sample 255: 0.3 + 0.2/32 * 32
... end block 0. Value 0.5
... start block 1
sample 256: 0.7 + 0.2/32

And here I get a click, because I abruptly jump from 0.5 to 0.7 since it's not smoothed. That's why I also need to smooth the param during this progression. But globally, not per voice (since each voice must pick up, at every sample, the same smoothed value before adding the per-voice modulation).
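The size of that jump can be checked with a small sketch. It looks only at the param term (the modulation term is ignored), and SteppedParam, RampedParam and MaxStep are hypothetical helpers, not plugin code:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Two blocks of a parameter that is only updated at the block boundary.
std::vector<double> SteppedParam(int blockSize, double from, double to) {
    assert(blockSize > 0);
    std::vector<double> out(2 * blockSize, from);
    std::fill(out.begin() + blockSize, out.end(), to);
    return out;
}

// The same change, spread linearly over the second block.
std::vector<double> RampedParam(int blockSize, double from, double to) {
    assert(blockSize > 0);
    std::vector<double> out(2 * blockSize, from);
    for (int i = 0; i < blockSize; i++) {
        out[blockSize + i] = from + (to - from) * (i + 1) / blockSize;
    }
    return out;
}

// Largest jump between two consecutive samples of a control signal.
double MaxStep(const std::vector<double> &signal) {
    double maxStep = 0.0;
    for (std::size_t i = 1; i < signal.size(); i++) {
        maxStep = std::max(maxStep, std::fabs(signal[i] - signal[i - 1]));
    }
    return maxStep;
}
```

For a 256-sample block going from 0.3 to 0.7, the stepped version jumps by about 0.4 in a single sample, while the ramped one never moves more than 0.4/256 per sample.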

Post

Nowhk wrote:
MadBrain wrote: I see what you mean, it's just that I wonder:
- Do you really have to fade every single parameter?
- ...including the ones like pitch where you have to call pow() every time it changes?
- For volume, do you really need to fade every single parameter that goes into the calculation? (gain, envelope, lfo modulation, velocity, volume and expression CC, etc...) Why not just calculate per-voice per-block final volume once all of those are factored in using the unramped parameters, and ramp the final result per voice? (which would cover all of those parameters)
Yes, I need to (or at least, I think so).

On every sample, the modulation is calculated as param value + mod amount.
Mod amount will change at control rate (which is fine, as you suggested).
Param value can't change at "block size", because that can result in a huge step between blocks, introducing clicks. It must be smoothed.
Using your suggestion and a bit of logic, I got the whole process working :)

Before iterating over the voices, I create an array of ramp values for each param; then, at each sampleIndex (for each voice), I pass the current sample index and read the value from the pre-calculated ramp. Of course I reset sampleIndex when a new voice starts.

The only "thing" I don't like for the moment is that I allocate a huge array for each param, such as:

Code: Select all

double mParameterRamp[2560];

inline void CreateParameterRamp(int blockSize) {
	for (int sample = 0; sample < blockSize; sample++) {
		mParameterRamp[sample] = mParameterSmoother.Process(mParameterAmount);
	}
}
I fill it from 0 to blockSize, but I pre-allocate a "huge" buffer in case the DAW uses a larger audio buffer (which is unpredictable; it could be more than 2560, for example). I don't like it so much... should I just use a pointer?

Post

What I do in that kind of case is that I limit the number of samples per processing block in the top loop. Then, all your temp buffers can be of that size (since you're guaranteed that any larger block will be split by the main loop). Something like:

Code: Select all

#define MAX_PROCESS_BLOCK 64

Code: Select all

double mParameterRamp[MAX_PROCESS_BLOCK];

Code: Select all

      int blockSize = samplesLeft;
      blockSize = pMidiHandler->GetSamplesTillNextEvent(blockSize);
      blockSize = pVoiceManager->GetSamplesTillNextEvent(blockSize);
      if(blockSize > MAX_PROCESS_BLOCK)
            blockSize = MAX_PROCESS_BLOCK;
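Put together with the pre-calculated ramp, the idea could look roughly like this. It is a sketch only: kMaxProcessBlock stands in for MAX_PROCESS_BLOCK, OnePoleSmoother is a hypothetical stand-in for mParameterSmoother, and a "voice" is reduced to a single gain.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

constexpr int kMaxProcessBlock = 64; // stands in for MAX_PROCESS_BLOCK

// Hypothetical one-pole smoother standing in for mParameterSmoother.
struct OnePoleSmoother {
    double state = 0.0;
    double coeff = 0.05; // arbitrary smoothing amount
    double Process(double target) {
        state += coeff * (target - state);
        return state;
    }
};

// Adds (smoothed param * voice gain) for every voice into `out`, chunk by chunk.
// Returns the number of chunks the buffer was split into.
int RenderChunked(std::vector<double> &out, double paramTarget,
                  const std::vector<double> &voiceGains, OnePoleSmoother &smoother) {
    double ramp[kMaxProcessBlock]; // fixed size, independent of the host buffer
    int chunks = 0;
    int offset = 0;
    int samplesLeft = static_cast<int>(out.size());
    while (samplesLeft > 0) {
        const int blockSize = std::min(samplesLeft, kMaxProcessBlock);

        // Smooth the global parameter once per chunk, before touching any voice.
        for (int i = 0; i < blockSize; i++) ramp[i] = smoother.Process(paramTarget);

        // Every voice reads the same ramp, so the smoother advances once per sample.
        for (double gain : voiceGains) {
            for (int i = 0; i < blockSize; i++) out[offset + i] += gain * ramp[i];
        }

        offset += blockSize;
        samplesLeft -= blockSize;
        chunks++;
    }
    return chunks;
}
```

A 200-sample host buffer is handled as chunks of 64, 64, 64 and 8 samples, so the ramp array never has to grow with the DAW's buffer size.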

Post

It comes all down to Data Oriented Design. Some random resources about DOD:

Taken into the realm of a software synthesizer this means that you abandon the concept of an "object" (for instance, a voice). You no longer have a pointer to a voice object, instead you have a big array with ALL the envelopes, an array with ALL the filters, an array with ALL the oscillators. You update them by running tight loops over these arrays, preferably vectorized, in order to update many elements in parallel. A voice is then a distributed entity, it may allocate oscillators 52 and 53, filter 15 and envelopes 31 and 32, etc, you get the idea. That's about it.
 
EDIT:
Oh and of course Mike Acton's presentation at CppCon: http://www.youtube.com/watch?v=rX0ItVEVjHc
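A minimal sketch of what such a layout might look like for the envelope phases only. EnvelopeBank and VoiceSlots are hypothetical names and the fields are assumptions; a real synth would carry more per-envelope state.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical structure-of-arrays bank: one slot per envelope,
// regardless of which voice owns it.
struct EnvelopeBank {
    std::vector<double> step;          // current phase of each envelope, in samples
    std::vector<double> rate;          // per-sample phase increment
    std::vector<double> length;        // total length in samples
    std::vector<unsigned char> active; // 1 while running, 0 once completed

    explicit EnvelopeBank(std::size_t count)
        : step(count, 0.0), rate(count, 1.0), length(count, 0.0), active(count, 0) {}

    // Advance every envelope by numSamples in one tight loop over contiguous data.
    void Advance(int numSamples) {
        assert(numSamples >= 0);
        for (std::size_t i = 0; i < step.size(); i++) {
            step[i] += active[i] * rate[i] * numSamples; // inactive slots add 0
            if (step[i] >= length[i]) active[i] = 0;
        }
    }
};

// A voice no longer owns envelope objects; it only remembers which slots it uses.
struct VoiceSlots {
    std::size_t firstEnvelope;
    std::size_t numEnvelopes;
};
```

The loop body has no per-voice pointer chasing, which is the property that makes it a candidate for vectorization; whether a compiler actually vectorizes it would need to be measured.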

Post

MadBrain wrote:What I do in that kind of case is that I limit the number of samples per processing block in the top loop. Then, all your temp buffers can be of that size (since you're guaranteed that any larger block will be split by the main loop).
Awesome and smart! I'll try this (with 256 as buffer; 64 looks small, doesn't it?), and then do the "increments" per block instead of on each sample (as written above, a single increment per sample consumes lots of CPU). I'll let you know in the next few days...

Thanks again ;) You are helping me a lot!
Christian Schüler wrote:It comes all down to Data Oriented Design. Some random resources about DOD:
Lots of interesting stuff. I'll check it out. Thanks again!

Post

Christian Schüler wrote:Oh and of course Mike Acton's presentation at CppCon:
I saw that one a couple of months back and thought it was really interesting. I'd be curious to hear if anyone has any experience converting an OO design to a data oriented one and can say anything about what performance gains are possible, if any.

Post

It comes all down to Data Oriented Design.
I've seen the talk before but not really figured out how you'd apply it to a synth. It seems obvious right - get all the data lined up so the processor barely has to break a sweat.

But when I try and apply this to a synth architecture in my mind it starts to get a bit iffy.

So if anyone has done this successfully for all, or parts of a synth, I'd be very interested to know what worked and what didn't
Devious Machines
