PurpleSunray
KVRian
 
626 posts since 13 Mar, 2012

Postby PurpleSunray; Thu May 18, 2017 3:35 am Re: How much CPU would take 10 Envelopes per voice? It seems a lot...

@Nowhk:
Hm, I can't see much optimization here, to be honest.

Couple of things that come to my mind:

Code: Select all
// events
pMidiHandler->Process();
pVoiceManager->Process();

int blockSize = samplesLeft;
blockSize = pMidiHandler->GetSamplesTillNextEvent(blockSize);

Does this need to happen on every sample?
I assume pMidiHandler->Process(); pVoiceManager->Process(); pick up new midi notes and trigger the OSC.
Could this happen at control rate?
If yes, you can skip that GetSamplesTillNextEvent as well (the block size derives from the control rate: if your control rate is 1/16 of the sample rate, the block size is 16 frames, i.e. fixed length, fixed intervals).
Also, does this need to happen on the signal processing code path?
I don't know what pMidiHandler->Process() is doing, but if it processes MIDI messages, the signal processing path might not be the right place to do that.
Same for Flush at the end; I don't know what it does.
So let's assume you move MIDI handling and voice triggering out of that loop, and work with a control rate for triggering voices.
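
A minimal sketch of what I mean by fixed control-rate blocks. processInControlBlocks, the two callbacks and the block size are made up for illustration; they are not iPlug API:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Splits nFrames into fixed-size blocks. The control work (MIDI pickup,
// voice triggering) runs once per block, the audio work runs on the whole
// block without any event checks inside.
template <class ControlFn, class AudioFn>
void processInControlBlocks(std::size_t nFrames, std::size_t blockSize,
                            ControlFn onControlTick, AudioFn processAudio)
{
    assert(blockSize > 0); // a zero block size would never advance
    for (std::size_t offset = 0; offset < nFrames; offset += blockSize) {
        // The last block may be shorter than blockSize.
        const std::size_t len = std::min(blockSize, nFrames - offset);
        onControlTick();
        processAudio(offset, len);
    }
}
```

With nFrames = 40 and blockSize = 16 this gives three control ticks and audio blocks of 16, 16 and 8 frames.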

Next would be to reverse the loops.
Process a voice at once, not a sample at once.

To give a rough idea of how this could look instead:
Code: Select all
void MainIPlug::ProcessDoubleReplacing(double **inputs, double **outputs, int nFrames) {
   double *outputLeft = outputs[0];
   double *outputRight = outputs[1];
   double *inputLeft = inputs[0];
   double *inputRight = inputs[1];
   //
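   // Init outputs with 1.0
   // We will use outputLeft to first accumulate envelopes into it, so it should be 1.0 by default
   // Note: as written this only works for a single playing voice; with more voices,
   // accumulate each voice's envelopes in a scratch buffer and sum into the outputs.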
   for (int i = 0; i < nFrames; i++) {
      outputLeft[i] = outputRight[i] = 1.0f;
   }
   //
   // Process voices
   for (int voiceIndex = 0; voiceIndex < PLUG_VOICES_BUFFER_SIZE; voiceIndex++)
   {
      Voice &voice = pVoiceManager->mVoices[voiceIndex];
      if (!voice.mIsPlaying) {
         continue;
      }
      //
      // Collect envelopes
      for (int envelopeIndex = 0; envelopeIndex < ENVELOPES_CONTAINER_NUM_ENVELOPE_MANAGER; envelopeIndex++)
      {
         Envelope &envelope = pEnvelopesContainer->pEnvelopeManager[envelopeIndex]->mEnvelope;
         if (voice.mSample == 0.0) {
            envelope.Reset(voice);
         }
         // Envelope.Apply will multiply the Envelope values onto the outputLeft array.
         // so after that "Collect envelopes" loop is finished, outputLeft has the product of all Envelopes in it.
         envelope.Apply(outputLeft, voice.mSample, nFrames);
      }
      //
      // Apply envelopes
      for (int i = 0; i < nFrames; i++)
      {
         double env_mod = outputLeft[i];
         outputLeft[i] = inputLeft[i] * env_mod;
         outputRight[i] = inputRight[i] * env_mod;
      }
   }
}


Lots of little loops with almost no branching in them.

About the void Envelope::Process(Voice &voice) .. :o :o
I need some time first to understand what this actually does to the signal.
Last edited by PurpleSunray on Thu May 18, 2017 3:59 am, edited 3 times in total.
S0lo
KVRist
 
459 posts since 31 Dec, 2008

Postby S0lo; Thu May 18, 2017 3:38 am Re: How much CPU would take 10 Envelopes per voice? It seems a lot...

@Nowhk

Every function call you make, every variable assignment, memory access, pointer access, loop, repeated if whose outcome doesn't change, etc. is going to take a small bite out of your CPU. In the end you lose a lot.

The way I personally do high-performance programming is to lift some of the restrictions imposed by modularity, object orientation and so-called good programming practice (which is not always good anyway), and to concentrate on performance as the primary target. All other coding style objectives come second.

A few things to consider

1. Access memory in a serial fashion, not a random fashion. The more random the access, the more cache misses you will have, and the slower your program becomes.
2. Make as few function calls as possible, passing as few parameters as possible. I personally avoid almost all function calls in ProcessReplacing() unless they're absolutely unavoidable. And don't depend on the inline keyword; the compiler may well ignore it. Yes, the compiler knows a lot, but not everything.
3. Use the disassembler in your IDE to inspect the code. Not the debug code, the release code.
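
To illustrate point 1 with a toy example (the buffer layout and the function names are my assumption, not taken from your code): both functions compute the same sum, but the first walks memory front to back while the second jumps numFrames elements on every step. How much that costs depends on buffer size and CPU, so measure it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// buf is a flat buffer holding numVoices rows of numFrames samples each.

// Serial access: the inner loop touches consecutive addresses.
double sumSerial(const std::vector<double>& buf,
                 std::size_t numVoices, std::size_t numFrames)
{
    assert(buf.size() == numVoices * numFrames);
    double sum = 0.0;
    for (std::size_t v = 0; v < numVoices; ++v) {
        for (std::size_t i = 0; i < numFrames; ++i) {
            sum += buf[v * numFrames + i];
        }
    }
    return sum;
}

// Strided access: the inner loop jumps numFrames elements per step,
// so consecutive reads land far apart in memory.
double sumStrided(const std::vector<double>& buf,
                  std::size_t numVoices, std::size_t numFrames)
{
    assert(buf.size() == numVoices * numFrames);
    double sum = 0.0;
    for (std::size_t i = 0; i < numFrames; ++i) {
        for (std::size_t v = 0; v < numVoices; ++v) {
            sum += buf[v * numFrames + i];
        }
    }
    return sum;
}
```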

Finally, optimization can sometimes be counterintuitive: you do something that you really think should improve things, but the result gets worse for other complex reasons that you may have overlooked.
Last edited by S0lo on Thu May 18, 2017 3:49 am, edited 2 times in total.
PurpleSunray
KVRian
 
626 posts since 13 Mar, 2012

Postby PurpleSunray; Thu May 18, 2017 3:42 am Re: How much CPU would take 10 Envelopes per voice? It seems a lot...

camsr wrote:Can someone explain what an if statement branching to a continue directive is doing? :)

? Then the jump target is the start of the next loop iteration (the loop's increment and test), instead of somewhere in the middle of the loop body.
camsr
KVRAF
 
6659 posts since 16 Feb, 2005

Postby camsr; Thu May 18, 2017 4:05 am Re: How much CPU would take 10 Envelopes per voice? It seems a lot...

The use of multiple levels of indirection also incurs a slight overhead. I like to keep a pointer table, avoiding OO design for example, where it is known in advance where the access will occur.
PurpleSunray
KVRian
 
626 posts since 13 Mar, 2012

Postby PurpleSunray; Thu May 18, 2017 4:21 am Re: How much CPU would take 10 Envelopes per voice? It seems a lot...

@Nowhk:
About the Envelope::Process - it looks really "chaotic" to me. Lots of things done that do not need to be done there I think.

I would probably change the Envelope class to work something like this:
Code: Select all
double mEnvelope[ENVELOPE_LEN]; // <- that's your envelope curve

void Envelope::Generate()
{
   // that function is supposed to re-create mEnvelope with new settings
   // you want to call that if envelope changes, like if user changes the Attack knob.
}

void Envelope::Apply(double* env, int pos, int count)
{
   for (int i = 0; i < count; i++) {
      int env_pos = pos + i;
      env[i] *= mEnvelope[env_pos >= ENVELOPE_LEN ? ENVELOPE_LEN-1 : env_pos ];
   }
}


So instead of running a 60-line function on the signal processing code path, it comes down to:
Code: Select all
for (int i = 0; i < count; i++) {
      int env_pos = pos + i;
      env[i] *= mEnvelope[env_pos >= ENVELOPE_LEN ? ENVELOPE_LEN-1: env_pos ];
   }
Nowhk
KVRian
 
675 posts since 2 Oct, 2013

Postby Nowhk; Thu May 18, 2017 4:29 am Re: How much CPU would take 10 Envelopes per voice? It seems a lot...

PurpleSunray wrote:@Nowhk:
About the Envelope::Process - it looks really "chaotic" to me. Lots of things done that do not need to be done there I think.

I would probably change the Envelope class to work something like this:
Code: Select all
double mEnvelope[ENVELOPE_LEN]; // <- that's your envelope curve

void Envelope::Generate()
{
   // that function is supposed to re-create mEnvelope with new settings
   // you want to call that if envelope changes, like if user changes the Attack knob.
}

void Envelope::Apply(double* env, int pos, int count)
{
   for (int i = 0; i < count; i++) {
      int env_pos = pos + i;
      env[i] *= mEnvelope[env_pos >= ENVELOPE_LEN ? ENVELOPE_LEN-1 : env_pos ];
   }
}


So instead of running a 60-line function on the signal processing code path, it comes down to:
Code: Select all
for (int i = 0; i < count; i++) {
      int env_pos = pos + i;
      env[i] *= mEnvelope[env_pos >= ENVELOPE_LEN ? ENVELOPE_LEN-1 : env_pos ];
   }

The fact is: envelope points can be automated. I can Generate that envelope wavetable as you suggested, and just read/process the curve sample by sample (at a fixed rate/speed), but what if I'm modulating some point? (The points are "stretched/fitted", so changing one point changes the others.)

Even if I update at control rate (let's say every 16 samples), this means that every 16 samples I need to regenerate ALL envelope samples, which ends up costing a massive amount of CPU.

Let's say the envelope curve is 1 second long: this means iterating over and calculating 44100 samples (at a sample rate of 44100).
PurpleSunray
KVRian
 
626 posts since 13 Mar, 2012

Postby PurpleSunray; Thu May 18, 2017 4:41 am Re: How much CPU would take 10 Envelopes per voice? It seems a lot...

Nowhk wrote:The fact is: envelope points can be automated. I can Generate that envelope wavetable as you suggested, and just read/process the curve sample by sample (at a fixed rate/speed), but what if I'm modulating some point? (The points are "stretched/fitted", so changing one point changes the others.)

If I update at control rate (let's say every 16 samples), this means that every 16 samples I need to regenerate ALL envelope samples, which ends up costing a massive amount of CPU.

OK, if that envelope is going to get updates with every new block, you also need to refresh it every new block (I was thinking about a typical Amp envelope that stays the same most of the time, unless the user touches knobs).
Maybe worth thinking about: would it make sense to have an Envelope::Generate that can also refresh a subset of the curve, instead of calculating it all on demand? (idk)
Nowhk wrote:Let's say the envelope curve is 1 second long: this means iterating over and calculating 44100 samples (at a sample rate of 44100).

Not necessarily.
My envelopes (well not all, but most of them) have a fixed size.
Like, they have 4096 sample values. This is independent from sample rate or envelope len. Generate() will always generate 4096 values that represent this envelope curve.
On the Apply() this is interpolated to target sample rate.
You need to know the audio sample rate and the length of the envelope in samples. If you have that, you can simply interpolate the 4096 values to whatever length you need (my envelopes use linear interpolation; I never had the feeling the envs need cubic/quadratic or whatever interpolation).
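
Roughly like this (a simplified sketch, not the code from my plugin; applyInterpolated and its parameters are made up for the example). The table can have any fixed size, and the step between table positions is derived from the envelope length in audio samples:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// table:         fixed-size envelope curve (e.g. 4096 values)
// env:           buffer of `count` values that gets multiplied by the envelope
// pos:           position of env[0] inside the envelope, in audio samples
// envLenSamples: envelope length in audio samples (sample rate * seconds)
void applyInterpolated(const std::vector<double>& table, double* env,
                       std::size_t pos, std::size_t count,
                       std::size_t envLenSamples)
{
    assert(table.size() >= 2 && envLenSamples >= 2);
    const std::size_t last = table.size() - 1;
    // Table positions advanced per audio sample.
    const double step = static_cast<double>(last)
                      / static_cast<double>(envLenSamples - 1);
    for (std::size_t i = 0; i < count; ++i) {
        const double t = static_cast<double>(pos + i) * step;
        if (t >= static_cast<double>(last)) {
            env[i] *= table[last]; // past the end: hold the last value
            continue;
        }
        const std::size_t i0 = static_cast<std::size_t>(t);
        const double frac = t - static_cast<double>(i0);
        // Linear interpolation between the two neighbouring table values.
        env[i] *= table[i0] + frac * (table[i0 + 1] - table[i0]);
    }
}
```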

This is what optimizing DSP is often about: reduce processing complexity where it is not absolutely needed. I do not need a 24 kHz Nyquist for my envelopes, so there is no point in calculating them at 48 kHz. I'm fine with calculating 4096 envelope values and then interpolating to the sample rate when applying them.
Nowhk
KVRian
 
675 posts since 2 Oct, 2013

Postby Nowhk; Thu May 18, 2017 5:15 am Re: How much CPU would take 10 Envelopes per voice? It seems a lot...

PurpleSunray wrote:Maybe worth thinking about: would it make sense to have an Envelope::Generate that can also refresh a subset of the curve, instead of calculating it all on demand? (idk)

It's a stretched fit: a single change will propagate to all other points/splines. So I need to recalculate all points.

PurpleSunray wrote:Not necessarily.
My envelopes (well not all, but most of them) have a fixed size.
Like, they have 4096 sample values. This is independent from sample rate or envelope len. Generate() will always generate 4096 values that represent this envelope curve.

Still, I think it's not worth regenerating 4096 samples at block/control rate (if I have continuous automation). What do you think?
PurpleSunray
KVRian
 
626 posts since 13 Mar, 2012

Postby PurpleSunray; Thu May 18, 2017 6:01 am Re: How much CPU would take 10 Envelopes per voice? It seems a lot...

Nowhk wrote:Still, I think it's not worth regenerating 4096 samples at block/control rate (if I have continuous automation). What do you think?

Well, you want to optimize that code :D
I think generating 4096 samples instead of 48000 samples is about 11.7 times faster (48000 / 4096 ≈ 11.7, aka optimized by >1000%).

The whole process of calculating the envs at a different rate:
Let's use the easy case and assume you decided to use fixed-size envs of 2048 samples.
First you need a function to generate the envelope curve; let's call it Generate().
You know that this envelope will update very often, and it is very likely that there is an update before a full cycle is complete. So you want a Generate() function that can also update only a subset of that curve.
If, for whatever reason, there is no update to the env curve until the cycle is complete, applying it the second time does not need re-generation (you still have the curve values in mEnvelope).

Now on the processing, the following happens:

1) Have envelope parameters changed? (I would add a kind of "dirty" flag that is true when parameters have changed, until the curve is re-generated completely.) If yes, call Envelope::Generate to refresh the part of the envelope curve you want to process now.

Let's assume you want to process 4800 audio samples (100 ms).
Your audio sample rate is 48000 and the envelope length is 48000 samples (1000 ms).
Put differently: the env curve has 2048 values and is 1000 ms long. So 100 ms on this curve are about 204 values (2048 * 0.1 = 204.8).
This means that to process 4800 audio samples (100 ms), you need to (re-)generate 204 envelope samples (100 ms); they will be interpolated from 204 to 4800 in Apply().

2) Upsample to audio rate
This is a simple interpolation on the mEnvelope.
Next 204 values on the envelope curve have just been updated by Generate(), or they are still valid.
Interpolation will upsample it to audio rate and make 4800 samples of it.

3) Apply to audio signal
Now you have an envelope signal and an audio signal, both at the same rate. So applying it is an audio_sample*env_sample multiplication.


If at all possible, only 3) should be part of the audio signal processing path.
If 2) and 1) are also part of it, you will end up with a lot of ifs (that's the link back to the branching topic :D)
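
A rough sketch of step 1), under some assumptions: EnvCache, TableRange, tableRangeForBlock and refreshRange are hypothetical names, and the curve is just a placeholder ramp. Instead of a single dirty flag, this variant stamps every table entry with the parameter generation it was computed for, so a partial refresh stays correct when parameters change in the middle of a cycle:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct TableRange {
    std::size_t first; // inclusive
    std::size_t last;  // inclusive
};

// Which table entries does a block of `count` audio samples starting at
// `pos` need? One extra entry on the right is included for interpolation.
TableRange tableRangeForBlock(std::size_t pos, std::size_t count,
                              std::size_t envLenSamples, std::size_t tableLen)
{
    assert(count > 0 && envLenSamples > 0 && tableLen >= 2);
    const std::size_t maxIdx = tableLen - 1;
    const std::size_t first = pos * tableLen / envLenSamples;
    const std::size_t last = (pos + count - 1) * tableLen / envLenSamples + 1;
    return { std::min(first, maxIdx), std::min(last, maxIdx) };
}

struct EnvCache {
    std::vector<double> table;   // fixed-size curve
    std::vector<unsigned> stamp; // generation each entry was computed for
    unsigned generation = 1;     // bumped on every parameter change
    double peak = 1.0;           // placeholder parameter

    explicit EnvCache(std::size_t n) : table(n, 0.0), stamp(n, 0u) {}
    void setPeak(double p) { peak = p; ++generation; } // marks all entries stale
};

// Recomputes only the stale entries inside r; returns how many were redone.
std::size_t refreshRange(EnvCache& e, const TableRange& r)
{
    assert(r.first <= r.last && r.last < e.table.size());
    const double maxIdx = static_cast<double>(e.table.size() - 1);
    std::size_t redone = 0;
    for (std::size_t i = r.first; i <= r.last; ++i) {
        if (e.stamp[i] == e.generation) {
            continue; // still valid for the current parameters
        }
        // Placeholder curve: linear ramp from 0 to peak.
        e.table[i] = e.peak * static_cast<double>(i) / maxIdx;
        e.stamp[i] = e.generation;
        ++redone;
    }
    return redone;
}
```

For the numbers above (position 0, 4800 frames, a 48000-sample envelope and a 2048-entry table) the range comes out as 0..205: roughly the 204 values estimated above, plus the right-hand neighbour needed for interpolation.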
Nowhk
KVRian
 
675 posts since 2 Oct, 2013

Postby Nowhk; Thu May 18, 2017 6:27 am Re: How much CPU would take 10 Envelopes per voice? It seems a lot...

PurpleSunray wrote:Let's assume you want to process 4800 audio samples (100 ms).
Your audio sample rate is 48000 and the envelope length is 48000 samples (1000 ms).
Put differently: the env curve has 2048 values and is 1000 ms long. So 100 ms on this curve are about 204 values (2048 * 0.1 = 204.8).
This means that to process 4800 audio samples (100 ms), you need to (re-)generate 204 envelope samples (100 ms); they will be interpolated from 204 to 4800 in Apply().

That's exactly what I do now within the Process() function. The thing is that I do it always... because... well... the mod is continuous, so I need and want a real-time update.

PurpleSunray wrote:You know that this envelope will update very often, and it is very likely that there is an update before a full cycle is complete. So you want a Generate() function that can also update only a subset of that curve.

I know I will modulate it constantly (every 128 samples, always) for what I'm doing; that's why I placed my code to refresh "that part of curve" (look at the // refresh at block size code). That is where I refresh the curve.

You are right that if I don't have any modulation, it always calculates the same data (so I can "cache" it in a table). But it will modulate every 128 samples, "always". That's why I always calculate it (modulations will always be active, for some reasons :P).

Take this as "I'm evaluating the worst case".
I'm trying to optimize the worst case (which will actually be 90% of my cases).

Meanwhile, I changed a bit of code (removing for the moment the part where I refresh that part of the curve), trying to use the cache and removing if/else where possible, ending up with:

Code: Select all
bool isFirstVoiceSample = (voice.mSample == 0.0);
int nbSamples = blockSize;
while (nbSamples > 0) {
   // envelopes
   for (int envelopeIndex = 0; envelopeIndex < ENVELOPES_CONTAINER_NUM_ENVELOPE_MANAGER; envelopeIndex++) {
      Envelope &envelope = pEnvelopesContainer->pEnvelopeManager[envelopeIndex]->mEnvelope;
      VoiceParameters &voiceParameters = envelope.mVoiceParameters[voiceIndex];
      
      // new voice restart envelope
      if (isFirstVoiceSample) {
         voiceParameters.mBlockStep = gBlockSize;
         voiceParameters.mStep = 0.0;
      }

      // update value
      double value = voiceParameters.mBlockStartAmp + (voiceParameters.mBlockStep * voiceParameters.mBlockFraction);
      value = envelope.mIsEnabled * ((1 + envelope.mIsBipolar) / 2.0 * value + (1 - envelope.mIsBipolar) / 2.0) * envelope.mAmount;
      envelope.mOutputConnector_CV.mPolyValue[voiceIndex] = value;

      // next phase
      voiceParameters.mBlockStep += envelope.mRate;
      voiceParameters.mStep += envelope.mRate;
   }

   nbSamples--;
}

The same "result" previously took 7% CPU; now I'm at 4% (only one "if" at the moment; not sure if I can really avoid it, but as S0lo suggested, since it's executed every gBlockSize = 128 samples, it may be irrelevant).
Which is nice.

But for what it does, I guess it can be improved further... not sure where now heheh
Last edited by Nowhk on Thu May 18, 2017 7:37 am, edited 2 times in total.
PurpleSunray
KVRian
 
626 posts since 13 Mar, 2012

Postby PurpleSunray; Thu May 18, 2017 6:47 am Re: How much CPU would take 10 Envelopes per voice? It seems a lot...

Nowhk wrote:That's exactly what I do now within the Process() function. The thing is that I do it always... because... well... the mod is continuous, so I need and want a real-time update.

OK, then I missed that (that's what I meant with "chaotic" :P).
I saw the if (voiceParameters.mControlRateIndex-- == 0) but haven't looked at the number-crunching code below. I was looking for the interpolation from env to audio rate, but it's probably in there.

Nowhk wrote:I know I will modulate it constantly (every 128 samples, always) for what I'm doing; that's why I placed my code to refresh "that part of curve" (look at the // refresh at block size code). That is where I refresh the curve.

To be honest, the more I hear about it, the more I think Amp envelopes are the wrong solution to your problem.
An Amp envelope that "modulates constantly, every 128 samples" sounds more like Amplitude Modulation (AM) to me (an AM would be part of the OSC, not of the Amp).

Nowhk wrote:The same "result" previously took 7% CPU; now I'm at 4%

That's a quite nice improvement already :) :)
Nowhk
KVRian
 
675 posts since 2 Oct, 2013

Postby Nowhk; Thu May 18, 2017 7:14 am Re: How much CPU would take 10 Envelopes per voice? It seems a lot...

PurpleSunray wrote:To be honest, the more I hear about it, the more I think Amp envelopes are the wrong solution to your problem.
An Amp envelope that "modulates constantly, every 128 samples" sounds more like Amplitude Modulation (AM) to me (an AM would be part of the OSC, not of the Amp).

I never talked about Amp mod, you did :) My envelopes will modulate every kind of stuff (pitch, filter, and whatever I need, such as modular synth cables/pins). The core/key point is that they must be automatable, that's all ;)
PurpleSunray
KVRian
 
626 posts since 13 Mar, 2012

Postby PurpleSunray; Thu May 18, 2017 7:20 am Re: How much CPU would take 10 Envelopes per voice? It seems a lot...

Nowhk wrote:
PurpleSunray wrote:To be honest, the more I hear about it, the more I think Amp envelopes are the wrong solution to your problem.
An Amp envelope that "modulates constantly, every 128 samples" sounds more like Amplitude Modulation (AM) to me (an AM would be part of the OSC, not of the Amp).

I never talked about Amp mod, you did :) My envelopes will modulate every kind of stuff (pitch, filter, and whatever I need, such as modular synth cables/pins). The core/key point is that they must be automatable, that's all ;)


Yeah, but generic code is the opposite of optimized code.
Making a fast envelope that can do it all, from a slow filter cutoff modulation up to emulating AM/FM, will be hard.
Writing code that is perfect for "post-processing kind of modulation" and other code that is for AM/FM will be easier ;)
Nowhk
KVRian
 
675 posts since 2 Oct, 2013

Postby Nowhk; Thu May 18, 2017 7:32 am Re: How much CPU would take 10 Envelopes per voice? It seems a lot...

PurpleSunray wrote:Yeah, but generic code is the opposite of optimized code.
Making a fast envelope that can do it all, from a slow filter cutoff modulation up to emulating AM/FM, will be hard.
Writing code that is perfect for "post-processing kind of modulation" and other code that is for AM/FM will be easier ;)

Not sure why you talk about "fast", AM or FM envelopes. It's a generic LFO (with weird splines, that's why I call it an envelope), which runs at a slow rate (10 Hz max) and is not really "audio-sample based".

Every 128 samples I just calculate the next point, 128 samples ahead (look under the "// refresh at block size" code), then I just interpolate between these values:

Code: Select all
double value = voiceParameters.mBlockStartAmp + (voiceParameters.mBlockStep * voiceParameters.mBlockFraction);

and I scale it by a fixed amount and by polarity (or return 0.0 if it's not enabled):

Code: Select all
value = envelope.mIsEnabled * ((1 + envelope.mIsBipolar) / 2.0 * value + (1 - envelope.mIsBipolar) / 2.0) * envelope.mAmount;

I could add a control rate for calculating "value", but this will introduce an "if" (and from what I see right now, with control rate 8, nothing changes).
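
Written out as a standalone function, the scaling line is easier to follow (scaleEnvelope is just for illustration, and I'm assuming both flags are plain 0/1 integers): with isBipolar = 1 the value passes through unchanged, with isBipolar = 0 a value in [-1, 1] is mapped to [0, 1], and isEnabled = 0 zeroes the result.

```cpp
#include <cassert>

// Branchless scaling by enable flag, polarity and amount.
//   isBipolar == 1: (2 / 2) * value + 0     -> value
//   isBipolar == 0: (1 / 2) * value + 1 / 2 -> [-1, 1] becomes [0, 1]
double scaleEnvelope(double value, int isEnabled, int isBipolar, double amount)
{
    // The arithmetic only works if the flags are exactly 0 or 1.
    assert(isEnabled == 0 || isEnabled == 1);
    assert(isBipolar == 0 || isBipolar == 1);
    return isEnabled
         * ((1 + isBipolar) / 2.0 * value + (1 - isBipolar) / 2.0)
         * amount;
}
```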
PurpleSunray
KVRian
 
626 posts since 13 Mar, 2012

Postby PurpleSunray; Thu May 18, 2017 7:43 am Re: How much CPU would take 10 Envelopes per voice? It seems a lot...

Sorry, with "fast" I mean computing time on the CPU, not the speed of the envelope signal.
I mentioned it because such a "generate the curve into a cache if it changes" approach works perfectly for user-driven envelopes, but it will kill performance in your AM use case, or make it difficult to implement.
If you want to cover both use cases in the same Envelope class, it is probably going to be a compromise. But you don't want that. You'd rather have two Envelope classes that do the same to the outside but internally have different implementations, each optimized for a specific use case.
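
Something along these lines (only a sketch; TableEnvelope, LiveEnvelope and applyEnvelope are invented names, and the ramp in LiveEnvelope stands in for whatever your spline code computes). Both classes expose the same valueAt(), and the template picks the implementation at compile time, so there is no virtual call in the audio loop:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Variant 1: curve cached in a table, for envelopes that rarely change.
struct TableEnvelope {
    std::vector<double> table;
    double valueAt(std::size_t pos) const {
        assert(!table.empty());
        // Past the end of the table: hold the last value.
        return table[pos < table.size() ? pos : table.size() - 1];
    }
};

// Variant 2: value computed on the fly, for continuously modulated envelopes.
struct LiveEnvelope {
    double start;
    double slopePerSample;
    double valueAt(std::size_t pos) const {
        return start + slopePerSample * static_cast<double>(pos);
    }
};

// Same call site for both variants; Env is resolved at compile time.
template <class Env>
void applyEnvelope(const Env& env, double* buf,
                   std::size_t pos, std::size_t count)
{
    for (std::size_t i = 0; i < count; ++i) {
        buf[i] *= env.valueAt(pos + i);
    }
}
```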