KVR Audio

MadBrain · Post by **MadBrain** » Mon May 22, 2017 5:47 pm

Nowhk wrote:

	for (int envelopeIndex = 0; envelopeIndex < ENVELOPES_CONTAINER_NUM_ENVELOPE_MANAGER; envelopeIndex++) {
		Envelope &envelope = pEnvelopesContainer->pEnvelopeManager[envelopeIndex]->mEnvelope;
		VoiceParameters &voiceParameters = envelope.mVoiceParameters[voiceIndex];

This might be a bit slower than expected because it needs to:
- Load a pointer (pEnvelopesContainer)
- (If pEnvelopeManager is a std::vector, it needs to load the pointer to that data)
- Load a pointer from that indexed location (pEnvelopeManager[envelopeIndex])
- Load the indexed voice parameters from there

Generally this kind of indirection (3 or 4 levels instead of 1) is fine for control code (i.e. stuff that runs per block), but it tends to show up in the profiler if it runs per sample... Why not just put all your envelope data inside your voice object?
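
As a rough sketch of the "envelope data inside the voice" idea: the names below (`EnvelopeState`, `FlatVoice`, `kNumEnvelopes`, `advanceEnvelopes`) are hypothetical and not taken from the posted plugin. The only point is that the per-sample loop touches one contiguous object instead of chasing several pointers.

```cpp
#include <array>
#include <cstddef>

// Hypothetical constant and types, for illustration only.
constexpr std::size_t kNumEnvelopes = 10;

struct EnvelopeState {
    double value = 0.0;  // current envelope output
    double step  = 0.0;  // per-sample increment
};

struct FlatVoice {
    bool isPlaying = false;
    // Envelope state is stored by value, so it sits contiguously inside the voice.
    std::array<EnvelopeState, kNumEnvelopes> envelopes{};
};

// Advances every envelope of one voice by one sample and returns the sum
// of the envelope values. No pointer is dereferenced besides the voice itself.
inline double advanceEnvelopes(FlatVoice &voice) {
    double sum = 0.0;
    for (EnvelopeState &env : voice.envelopes) {
        env.value += env.step;
        sum += env.value;
    }
    return sum;
}
```

Whether this helps in practice depends on the real data layout, so it is worth confirming with a profiler.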

Another thing: you could probably have a single 128-sample counter for all of your envelopes instead of one per envelope. This way, 127 samples out of 128, no envelope code would run at all!
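
A minimal sketch of the shared counter, assuming a control rate of 128 samples. `SimpleEnvelope`, `EnvelopeBank` and `kControlRate` are made-up names for illustration, and the envelope update is a placeholder for the real math.

```cpp
#include <vector>

constexpr int kControlRate = 128;  // hypothetical: samples between envelope updates

struct SimpleEnvelope {
    double value = 0.0;
    double ratePerSample = 0.0;
    // Placeholder update: advance by one whole control block at once.
    void updateControlBlock() { value += ratePerSample * kControlRate; }
};

struct EnvelopeBank {
    std::vector<SimpleEnvelope> envelopes;
    int counter = 0;   // one counter shared by all envelopes
    long updates = 0;  // number of control-rate updates performed so far

    // Call once per sample. Envelope code only runs on 1 sample out of kControlRate.
    void tick() {
        if (counter == 0) {
            counter = kControlRate;
            for (SimpleEnvelope &e : envelopes) {
                e.updateControlBlock();
            }
            ++updates;
        }
        --counter;
    }
};
```

With this layout the per-sample cost for the remaining 127 samples is one compare and one decrement, regardless of how many envelopes exist.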

I'm starting to suspect that there's something slow that you're calling within your per-sample loop and it's just not in the code you're showing us. Something like:

- printf() calls (or cout)
- slow math functions like sqrt() or % or /
- malloc/free or anything that ends up calling that (new/delete, modifying an std::string etc)
- file access
- virtual function calls (ideally your inner per-sample loop should have no function calls at all)
- unaligned objects (for instance if you declare everything with #pragma pack())
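
To illustrate the point about `/` and `%` in per-sample code, here is a small sketch with hypothetical helper names (`fillRamp`, `advancePhase`). It hoists a division out of the loop and replaces a modulo-style wrap with a compare and subtract. Note that multiplying by a reciprocal can differ from a true division in the last bit.

```cpp
#include <cstddef>
#include <vector>

// Fills `out` with the ramp 0, 1/n, 2/n, ... using a single division in total.
inline void fillRamp(std::vector<double> &out) {
    const std::size_t n = out.size();
    if (n == 0) return;
    const double invN = 1.0 / static_cast<double>(n);  // hoisted out of the loop
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = static_cast<double>(i) * invN;
    }
}

// Advances a phase kept in [0, 1) without calling fmod().
// Only valid while 0 <= increment < 1.
inline double advancePhase(double phase, double increment) {
    phase += increment;
    if (phase >= 1.0) phase -= 1.0;
    return phase;
}
```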

Nowhk · Post by **Nowhk** » Mon May 22, 2017 6:55 pm

PurpleSunray wrote: But I can just repeat again to read (and understand) this:
https://en.wikipedia.org/wiki/Instructi ... ning

I'll do my best to grasp the whole concept. At the moment it's very low-level, and I can't see how it relates to my piece of code. Thank you for everything!!!!

MadBrain wrote: I'm starting to suspect that there's something slow that you're calling within your per-sample loop and it's just not in the code you're showing us. Something like:

- printf() calls (or cout)
- slow math functions like sqrt() or % or /
- malloc/free or anything that ends up calling that (new/delete, modifying an std::string etc)
- file access
- virtual function calls (ideally your inner per-sample loop should have no function calls at all)
- unaligned objects (for instance if you declare everything with #pragma pack())

None of these (I guess). Well, yes: I have many / operations.

Tomorrow I'll show you the whole code (which now runs at 7% with dynamic calc... before it was 22%... wooaaaww!!!!), so maybe you can help me a little bit more (if you want; you already did soooo much for me <3).

But first, I need to ask you another thing: now I'm iterating over the voices and then over the sample block (as you suggested).

What about parameter smoothing? It must run inside the per-sample loop, because the param gets smoothed a little (i.e. de-zippered) on every sample.

But if I run (for example) the smoothing on every sample for the first voice, then the second voice will just use the already smoothed value (nothing more to smooth; or it resumes from the last sample of the previous voice), which is wrong.

Do I need a per-voice array of smoothed params? It doesn't make much sense, since the knob and its smoothing are "global", not per voice. So, how do you usually manage it?

MadBrain · Post by **MadBrain** » Mon May 22, 2017 9:31 pm

In my code, typically only stuff that needs to be de-zippered gets smoothened, generally volumes (including panning), sometimes delay length (in algos like reverbs, choruses and physical models)... most of the time, pitch is not smoothened. Filter params can go both ways (but are generally not smoothened).

Generally I use really simple ramping (linear) and it's not done on the parameters but more like on the final volume, per modulation block. (which catches a lot of cases such as square wave modulator LFOs)
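
A minimal sketch of linear ramping applied to the final volume over one modulation block. The function name and signature are assumptions for illustration, not MadBrain's actual code.

```cpp
#include <cstddef>

// Multiplies `buffer` by a gain that moves linearly from `startGain` to
// `targetGain` across `numSamples` samples (the target is reached on the
// last sample). Returns the gain to use as `startGain` for the next block.
inline double applyGainRamp(double *buffer, std::size_t numSamples,
                            double startGain, double targetGain) {
    if (numSamples == 0) return startGain;
    const double step = (targetGain - startGain) / static_cast<double>(numSamples);
    double gain = startGain;
    for (std::size_t i = 0; i < numSamples; ++i) {
        gain += step;
        buffer[i] *= gain;
    }
    return targetGain;
}
```

Because the ramp is applied to the final per-block volume, a sudden jump in any upstream modulator (such as a square wave LFO) is spread over the block instead of producing a click.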

Nowhk · Post by **Nowhk** » Tue May 23, 2017 6:46 am

MadBrain wrote:In my code, typically only stuff that needs to be de-zippered gets smoothened, generally volumes (including panning), sometimes delay length (in algos like reverbs, choruses and physical models)... most of the time, pitch is not smoothened. Filter params can go both ways (but are generally not smoothened).

Generally I use really simple ramping (linear) and it's not done on the parameters but more like on the final volume, per modulation block. (which catches a lot of cases such as square wave modulator LFOs)

Yes, but let's take your PDR "skeleton":

Code:

void MainIPlug::ProcessDoubleReplacing(double **inputs, double **outputs, int nFrames) {
	double *outputLeft = outputs[0];
	double *outputRight = outputs[1];
	memset(outputLeft, 0, nFrames * sizeof(double)); // outputLeft[i] = inputs[0][i];
	memset(outputRight, 0, nFrames * sizeof(double)); // outputRight[i] = inputs[1][i];

	// sync
	pTempoEngine->Sync();

	// buffer
	int samplesLeft = nFrames;
	while (samplesLeft > 0) {
		// events
		pMidiHandler->Process();
		pVoiceManager->Process();

		int blockSize = pMidiHandler->GetSamplesTillNextEvent(samplesLeft);
		blockSize = pVoiceManager->GetSamplesTillNextEvent(blockSize);

		// voices
		for (int voiceIndex = 0; voiceIndex < PLUG_VOICES_BUFFER_SIZE; voiceIndex++) {
			Voice &voice = pVoiceManager->mVoices[voiceIndex];
			if (!voice.mIsPlaying) {
				continue;
			}

			double *left = outputLeft;
			double *right = outputRight;
			int nbSamples = blockSize;
			while (nbSamples > 0) {
				// envelopes

				// audio
				pOscillator1->Process(voice, left, right);
				
				left++;
				right++;
				
				nbSamples--;
			}

			voice.Increment(blockSize);
		}
		
		// fx (ProcessBlock)

		samplesLeft -= blockSize;
		outputLeft += blockSize;
		outputRight += blockSize;

		pMidiHandler->Flush(blockSize);
	}
}

and let's say I have some parameters whose knobs must be smoothed (I use this to smooth the knob value, changed by the GUI, for example):

Code:

AdvancedParameter **ppParameter = mAdvancedParameters.GetList();
for (int i = 0; i < mAdvancedParameters.GetSize(); i++, ppParameter++) {
	AdvancedParameter *pParameter = *ppParameter;
	pParameter->SmoothControl();
}

How would you smooth it, since the samples are processed block by block per voice? Only the first voice will be smoothed (if I place it within while (nbSamples > 0)).

When it reaches the second voice (and the following ones), it will introduce zipper noise (since it "jumps" to the knob's already smoothed "current" value).

Do you store the previous knob value and re-process (re-smooth) it for every voice? I don't think so (a waste of recalculation).
Or do you store the values (while processing the first voice) in an array and then just read them for every following voice?
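
The second option asked about here (smooth once, store, let every voice read) could look roughly like the sketch below. `SmoothedParam`, its one-pole `coeff`, and `addVoice` are hypothetical stand-ins, not the `AdvancedParameter` class from the plugin.

```cpp
#include <cstddef>

// Hypothetical global parameter with a simple one-pole smoother.
struct SmoothedParam {
    double current = 0.0;
    double target  = 0.0;
    double coeff   = 0.5;  // smoothing coefficient in (0, 1], chosen arbitrarily

    // Advances the smoother once per sample and writes the smoothed values
    // for the whole block into `out` (caller provides `blockSize` slots).
    void renderBlock(double *out, std::size_t blockSize) {
        for (std::size_t i = 0; i < blockSize; ++i) {
            current += coeff * (target - current);
            out[i] = current;
        }
    }
};

// Every voice reads the same pre-smoothed values, so voice order no longer matters.
inline void addVoice(const double *smoothed, std::size_t blockSize,
                     double voiceLevel, double *mix) {
    for (std::size_t i = 0; i < blockSize; ++i) {
        mix[i] += voiceLevel * smoothed[i];
    }
}
```

Usage would be: call `renderBlock` once per block before the voice loop, into a preallocated buffer, then pass that buffer to each voice. The smoothing cost is paid once per block rather than once per voice.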

Nowhk · Post by **Nowhk** » Tue May 23, 2017 10:13 am

Anyway, here you go with my whole ("optimized") ProcessDoubleReplacing function:

Code:

void MainIPlug::ProcessDoubleReplacing(double **inputs, double **outputs, int nFrames) {
	double *outputLeft = outputs[0];
	double *outputRight = outputs[1];
	memset(outputLeft, 0, nFrames * sizeof(double)); // outputLeft[i] = inputs[0][i];
	memset(outputRight, 0, nFrames * sizeof(double)); // outputRight[i] = inputs[1][i];

	// sync
	pTempoEngine->Sync();

	// buffer
	int samplesLeft = nFrames;
	while (samplesLeft > 0) {
		// events
		pMidiHandler->Process();
		pVoiceManager->Process();

		int blockSize = samplesLeft;
		blockSize = pMidiHandler->GetSamplesTillNextEvent(blockSize);
		blockSize = pVoiceManager->GetSamplesTillNextEvent(blockSize);

		// voices
		for (int voiceIndex = 0; voiceIndex < PLUG_VOICES_BUFFER_SIZE; voiceIndex++) {
			Voice &voice = pVoiceManager->mVoices[voiceIndex];
			if (!voice.mIsPlaying) {
				continue;
			}

			// init
			if (voice.mSample == 0.0) {
				// new voice restart the envelopes
				for (int envelopeIndex = 0; envelopeIndex < ENVELOPES_CONTAINER_NUM_ENVELOPE_MANAGER; envelopeIndex++) {
					VoiceParameters &voiceParameters = pEnvelopesManager->pEnvelope[envelopeIndex]->mVoiceParameters[voiceIndex];
					voiceParameters.mIsCompleted = false;
					voiceParameters.mControlRateIndex = 0;
					voiceParameters.mBlockStep = gBlockSize;
					voiceParameters.mStep = 0.0;
				}
			}

			double *left = outputLeft;
			double *right = outputRight;
			int nbSamples = blockSize;			
			while (nbSamples > 0) {			
				// smooth param here, even if that's the wrong place and method (since it will work only for first voice)
				AdvancedParameter **ppParameter = mAdvancedParameters.GetList();
				for (int i = 0; i < mAdvancedParameters.GetSize(); i++, ppParameter++) {
					AdvancedParameter *pParameter = *ppParameter;
					pParameter->SmoothControl();
				}

				// voice volume
				voice.mVoiceVolume.Process();

				// envelopes
				pEnvelopesManager->Process(voice);

				// audio
				//pOscillator1->Process(voice, left, right);
				
				left++;
				right++;
				
				nbSamples--;
			}

			voice.Increment(blockSize);
		}
		
		// fx (ProcessBlock)

		samplesLeft -= blockSize;
		outputLeft += blockSize;
		outputRight += blockSize;

		pMidiHandler->Flush(blockSize);
		pVoiceManager->Reset();
	}
}

And here's the envelope "heavy" Process code:

Code:

void EnvelopesManager::Process(Voice &voice) {
	int voiceIndex = voice.mIndex;
	for (int envelopeIndex = 0; envelopeIndex < ENVELOPES_CONTAINER_NUM_ENVELOPE_MANAGER; envelopeIndex++) {
		Envelope &envelope = *pEnvelope[envelopeIndex];
		VoiceParameters &voiceParameters = envelope.mVoiceParameters[voiceIndex];

		// no loop/sustain mode ended cycle
		if (voiceParameters.mIsCompleted) {
			continue;
		}

		// control rate
		if (voiceParameters.mControlRateIndex-- == 0) {
			voiceParameters.mControlRateIndex = PLUG_CONTROL_RATE - 1;

			// calculate dynamic spline block
			if (voiceParameters.mBlockStep >= gBlockSize) {
				// loop
				if (voiceParameters.mStep >= envelope.mLoopLengthInSamples) {
					if (envelope.mLoopType == EnvelopeLoopType::ENVELOPE_LOOP_TYPE_LOOP || envelope.mLoopType == EnvelopeLoopType::ENVELOPE_LOOP_TYPE_MASTER) {
						voiceParameters.mStep = fmod(voiceParameters.mStep, envelope.mLoopLengthInSamples);
					}
					else {
						// update value
						double value = envelope.mAmps[envelope.mLoopPointIndex];
						value = envelope.mIsEnabled * ((1 + envelope.mIsBipolar) / 2.0 * value + (1 - envelope.mIsBipolar) / 2.0) * envelope.mAmount;
						envelope.mOutputConnector_CV.mPolyValue[voiceIndex] = value;

						voiceParameters.mIsCompleted = true;
						continue;
					}
				}

				// refresh section index
				unsigned int sectionIndex = 0;
				while (sectionIndex < envelope.mNumPoints - 1) {
					if (voiceParameters.mStep >= envelope.mSectionLengths[sectionIndex] && voiceParameters.mStep < envelope.mSectionLengths[sectionIndex + 1]) break;
					sectionIndex++;
				}

				// refresh section step
				unsigned int previousSectionsSteps = 0;
				int i = 0;
				while (i < sectionIndex) {
					previousSectionsSteps += envelope.mSectionLengths[i + 1] - envelope.mSectionLengths[i];
					i++;
				}
				double sectionStep = voiceParameters.mStep - previousSectionsSteps;

				// refresh  block index
				unsigned int blockIndex = (unsigned int)floor(sectionStep / gBlockSize);

				// refresh  section length
				double sectionLength = envelope.mSectionLengths[sectionIndex + 1] - envelope.mSectionLengths[sectionIndex];

				// refresh p0/p1
				int numBlocks = (int)(sectionLength / gBlockSize);
				double numBlocksFraction = 1.0 / numBlocks;
				double pos0 = blockIndex * numBlocksFraction;
				double pos1 = (blockIndex + 1) * numBlocksFraction;
				double a = 1.0 - (1.0 / envelope.mTensions[sectionIndex]);
				double p0 = pos0 / (pos0 + a * (pos0 - 1.0));
				double p1 = pos1 / (pos1 + a * (pos1 - 1.0));

				// refresh block start/end amp
				double sectionStartAmp = envelope.mAmps[sectionIndex];
				double sectionEndAmp = envelope.mAmps[sectionIndex + 1];
				double sectionDeltaAmp = sectionEndAmp - sectionStartAmp;
				voiceParameters.mBlockStartAmp = sectionStartAmp + p0 * sectionDeltaAmp;
				double blockEndAmp = sectionStartAmp + p1 * sectionDeltaAmp;

				// refresh block fraction/step
				voiceParameters.mBlockFraction = (blockEndAmp - voiceParameters.mBlockStartAmp) * (1.0 / gBlockSize);
				voiceParameters.mBlockStep = fmod(voiceParameters.mBlockStep, gBlockSize);
			}

			// update value
			double value = voiceParameters.mBlockStartAmp + (voiceParameters.mBlockStep * voiceParameters.mBlockFraction);
			value = envelope.mIsEnabled * ((1 + envelope.mIsBipolar) / 2.0 * value + (1 - envelope.mIsBipolar) / 2.0) * envelope.mAmount;
			envelope.mOutputConnector_CV.mPolyValue[voiceIndex] = value;
		}

		// next phase
		voiceParameters.mBlockStep += envelope.mRate;
		voiceParameters.mStep += envelope.mRate;
	}
}

I'm at about 6% CPU right now (without audio/osc; only 10 envelopes running on 16 voices simultaneously).

Do you see any improvements?
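
One small observation on the "refresh section step" loop in the posted code: the differences it adds up telescope, so the sum equals `mSectionLengths[sectionIndex] - mSectionLengths[0]`. The sketch below shows the equivalence with hypothetical free functions, assuming the array holds integer section start positions (the element type is not shown in the post).

```cpp
#include <cstddef>
#include <vector>

// Loop form, mirroring the structure of the posted code.
inline unsigned int previousStepsLoop(const std::vector<unsigned int> &sectionStarts,
                                      std::size_t sectionIndex) {
    unsigned int sum = 0;
    for (std::size_t i = 0; i < sectionIndex; ++i) {
        sum += sectionStarts[i + 1] - sectionStarts[i];
    }
    return sum;
}

// Closed form: the intermediate terms cancel, leaving last minus first.
inline unsigned int previousStepsDirect(const std::vector<unsigned int> &sectionStarts,
                                        std::size_t sectionIndex) {
    return sectionStarts[sectionIndex] - sectionStarts[0];
}
```

Since that loop only runs once per spline block, the saving is probably small; it is a simplification more than an optimization.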

S0lo · Post by **S0lo** » Tue May 23, 2017 10:34 am

Here is a trick I sometimes use to discover what the bottleneck is in a piece of code.

Increase the number of voices to a ridiculous amount, for example 100 voices, while having only one envelope. How does that affect the CPU?

Now increase the number of envelopes to a ridiculous amount, say 100 envelopes, while playing only one voice (if that's possible). Again, how does that affect the CPU?

Whichever change worsened the CPU the most is probably the part of the code that MAY need to be optimized first.

I hope you got my point.
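
The scaling test described above can be sketched as a tiny timing harness. Everything here is hypothetical: `timeWorkload` and its inner loop are placeholders, and the real per-voice and per-envelope processing would go where the comment says.

```cpp
#include <chrono>
#include <cstddef>

struct ScalingResult {
    double seconds;   // wall-clock time spent in the loops
    double checksum;  // returned so the work cannot be optimized away entirely
};

// Runs a stand-in workload for the given voice and envelope counts.
inline ScalingResult timeWorkload(std::size_t numVoices, std::size_t numEnvelopes,
                                  std::size_t numSamples) {
    const auto start = std::chrono::steady_clock::now();
    double checksum = 0.0;
    for (std::size_t v = 0; v < numVoices; ++v) {
        for (std::size_t s = 0; s < numSamples; ++s) {
            for (std::size_t e = 0; e < numEnvelopes; ++e) {
                checksum += 1.0;  // placeholder: replace with real processing
            }
        }
    }
    const auto end = std::chrono::steady_clock::now();
    return {std::chrono::duration<double>(end - start).count(), checksum};
}
```

Comparing `timeWorkload(100, 1, n)` with `timeWorkload(1, 100, n)` on the real processing code shows whether the cost grows faster with voices or with envelopes.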

Nowhk · Post by **Nowhk** » Tue May 23, 2017 10:40 am

S0lo wrote: Here is a trick I sometimes use to discover what the bottleneck is in a piece of code.

Increase the number of voices to a ridiculous amount, for example 100 voices, while having only one envelope. How does that affect the CPU?

Now increase the number of envelopes to a ridiculous amount, say 100 envelopes, while playing only one voice (if that's possible). Again, how does that affect the CPU?

Whichever change worsened the CPU the most is probably the part of the code that MAY need to be optimized first.

I hope you got my point.

Well, there's an easier test than this: if I comment out this line:

Code:

pEnvelopesManager->Process(voice);

CPU drops to 1 or 2%. It's pretty obvious (looking at the math within the code) that the "bottleneck" here is processing the envelopes.

S0lo · Post by **S0lo** » Tue May 23, 2017 10:48 am

Nowhk wrote:
Well, there's an easier test than this: if I comment out this line:
Code:
pEnvelopesManager->Process(voice);
CPU drops to 1 or 2%. It's pretty obvious (looking at the math within the code) that the "bottleneck" here is processing the envelopes.

I haven't looked at the code in detail, honestly. Is it 10 EGs per voice, or 10 EGs in total?

Nowhk · Post by **Nowhk** » Tue May 23, 2017 10:52 am

S0lo wrote:
I haven't looked at the code in detail, honestly. Is it 10 EGs per voice, or 10 EGs in total?

It's 10 different envelopes per voice, using 16 voices. So a total of 160 envelopes...

S0lo · Post by **S0lo** » Tue May 23, 2017 10:54 am

Are those 10 EGs multistage (i.e. they have to be played in series)? Or are they just mixed to produce one complex EG?

Nowhk · Post by **Nowhk** » Tue May 23, 2017 10:56 am

S0lo wrote: Are those 10 EGs multistage (i.e. they have to be played in series)? Or are they just mixed to produce one complex EG?

Not sure what you mean. They are independent, per voice. I can link them to whatever I want within my plug, e.g. env 1 to osc 1 pitch, env 2 to filter cutoff; they will act "per voice".

S0lo · Post by **S0lo** » Tue May 23, 2017 11:02 am

Hmm, got you. You have a mass situation here.

Question: will the user ALWAYS want to use all 10 EGs at all times (given what you're doing with the plug)?

If not, how about processing only the ones he is actually using? Or are you already doing that?

Nowhk · Post by **Nowhk** » Tue May 23, 2017 11:26 am

S0lo wrote: Hmm, got you. You have a mass situation here.

Question: will the user ALWAYS want to use all 10 EGs at all times (given what you're doing with the plug)?

If not, how about processing only the ones he is actually using? Or are you already doing that?

Yeah of course, the system will use only the linked envelopes, only on active voices... I'm testing the worst case

S0lo · Post by **S0lo** » Tue May 23, 2017 11:32 am

Well then, IMO your worst case is good!!

Later on you could multithread the voices, which should give you very good performance.

Nowhk · Post by **Nowhk** » Tue May 23, 2017 11:46 am

S0lo wrote: Well then, IMO your worst case is good!!

Later on you could multithread the voices, which should give you very good performance.

It's the worst case while doing nothing else, only envelopes...

How much CPU should 10 envelopes per voice take? It seems like a lot...