Nowhk
KVRian
 
576 posts since 2 Oct, 2013

Postby Nowhk; Thu May 18, 2017 7:50 am Re: How much CPU would take 10 Envelopes per voice? It seems a lot...

PurpleSunray wrote: I mentioned it because a "regenerate the curve into a cache when it changes" approach works perfectly for user-driven envelopes. But it will kill performance in your AM use case, or make it difficult to implement.

I don't need the AM use case at all :) But I can't cache the curve either, because it's going to change every 128 samples. So why should I cache the generated curve samples if I then need to regenerate them? I just "generate" every 128 samples, and in between I interpolate. That's what I'm doing... but that requires an if/else, which seems to slow down the whole task a lot :(
PurpleSunray
KVRian
 
587 posts since 13 Mar, 2012

Postby PurpleSunray; Thu May 18, 2017 8:27 am Re: How much CPU would take 10 Envelopes per voice? It seems a lot...

Nowhk wrote: I don't need the AM use case at all :) But I can't cache the curve either, because it's going to change every 128 samples. So why should I cache the generated curve samples if I then need to regenerate them? I just "generate" every 128 samples, and in between I interpolate. That's what I'm doing... but that requires an if/else, which seems to slow down the whole task a lot :(


Well, that is exactly my point.

A normal envelope does not change with every chunk that comes out of the OSC.
It changes when an event occurs: the user pushes a button, a MIDI event comes in, a timer fires, whatever.
This is even true for an LFO. It can run as a wavetable osc, which is nothing other than what we do already: keep a waveform/curve in a table and then interpolate according to sample rate and frequency. So a "cached envelope" would work perfectly here: it contains one LFO cycle, which will not change unless some event changes the LFO settings. In the change-event handler you can regenerate the curve (outside of the audio signal code path).

Now for you this is different, because your envelope is not updated by any event; it is a kind of side channel to the OSC. Every 128 audio samples you get a modulation sample. That doesn't fit into the caching model above at all. There is nothing to cache.

So, for envelopes that are driven by events a cache would be cool.
But for "self-oscillating envelopes" it is nonsense.
Do you get my point about "generic vs. optimized code"?
An implementation that handles both will have "mk.." performance. To reach "wow" performance, you want a CachedEnvelope and a DynamicEnvelope class ;)
That is also one of the solutions to avoid ifs.
Instead of having an
Envelope::Process with an if(mCached), you have
CachedEnvelope::Process and DynamicEnvelope::Process, and no more need for an if(mCached) inside Process().
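
Editor's sketch of the split described above, under loud assumptions: only the names CachedEnvelope, DynamicEnvelope and Process come from the post. The table size, the Rebuild and SetDepth methods, and the simple ramp used as the "curve" are hypothetical placeholders, not anyone's actual plugin code. The point is only that each Process has a single code path, so no if(mCached) is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Event-driven envelope: the curve is rebuilt in Rebuild(), which is meant to
// be called from a parameter-change handler, never from the audio loop.
class CachedEnvelope {
public:
    explicit CachedEnvelope(std::size_t size) : mTable(size, 0.0) {
        assert(size >= 2);  // the ramp below divides by (size - 1)
    }

    // Off the audio path: regenerate the table (here a 0..depth ramp).
    void Rebuild(double depth) {
        const std::size_t n = mTable.size();
        for (std::size_t i = 0; i < n; ++i)
            mTable[i] = depth * static_cast<double>(i) / static_cast<double>(n - 1);
    }

    // Audio path: nearest-entry table read, phase in [0, 1].
    double Process(double phase) const {
        const std::size_t i =
            static_cast<std::size_t>(phase * static_cast<double>(mTable.size() - 1) + 0.5);
        return mTable[i];
    }

private:
    std::vector<double> mTable;
};

// Continuously modulated envelope: nothing to cache, so the value is computed
// from the current parameters on every call.
class DynamicEnvelope {
public:
    void SetDepth(double depth) { mDepth = depth; }
    double Process(double phase) const { return mDepth * phase; }

private:
    double mDepth = 1.0;
};
```

The caller decides once, when the envelope is set up, which of the two types it holds; the per-sample code then never has to ask.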
Nowhk
KVRian
 
576 posts since 2 Oct, 2013

Postby Nowhk; Thu May 18, 2017 9:15 am Re: How much CPU would take 10 Envelopes per voice? It seems a lot...

That's the point, dude :) How can I optimize my "Dynamic" Process call? That's what my whole post is about. I'm stuck with it... still 10% of CPU...
PurpleSunray
KVRian
 
587 posts since 13 Mar, 2012

Postby PurpleSunray; Thu May 18, 2017 9:22 am Re: How much CPU would take 10 Envelopes per voice? It seems a lot...

ahaha, looks like this is becoming more of a workshop than a forum thread xD

Can you explain to me how that envelope signal is being synthesized?
I assume it is the number-crunching code in Envelope::Process, but I don't get what it does :D
Nowhk
KVRian
 
576 posts since 2 Oct, 2013

Postby Nowhk; Thu May 18, 2017 11:44 am Re: How much CPU would take 10 Envelopes per voice? It seems a lot...

PurpleSunray wrote: ahaha, looks like this is becoming more of a workshop than a forum thread xD

Hahah, yeah! It's funny. And I'm really happy about your huge (huge) help. Thanks a loooot!!!

PurpleSunray wrote: Can you explain to me how that envelope signal is being synthesized?

Sure ;) Here's an explanation of my void Envelope::Process(Voice &voice).

Code: Select all
VoiceParameters &voiceParameters = mVoiceParameters[voice.mIndex];
if (voiceParameters.mIsCompleted) {
   return;
}

This first line takes the voice parameters (from an array), which contain the data for that envelope (different for each voice). I then check whether the bool mIsCompleted is true (meaning: if the envelope has type "no loop" and I have reached the end of the shape, it just needs to skip the rest and keep the last calculated value).

Now, let's examine the "CORE" of the envelope calculations.

Code: Select all
// refresh at block size
if (voiceParameters.mBlockStep >= gBlockSize) {
}

This branch is where I calculate the "next" point value. Once 128 samples (my block size) have passed, I calculate the next block value from the current block start. (I mean 128 samples of the shape, i.e. a chunk; when this actually runs depends on the processing rate: at 1 Hz it executes every 128 samples, at 2 Hz every 64 samples, and so on.)

Note: the whole envelope shape is divided into sections (a section lies between two "user-GUI" points; between two points I have a tension point, which determines the slope of the segment). Each section is in turn divided into blocks of 128 samples.

sectionIndex is the index of each section (if I have 10 points, there are 9 sections). sectionStep is the step (sample) within the current section. sectionLength is the length of the section (in samples). Since the envelope is divided into blocks, blockIndex is the index of the block within the current sectionIndex at the current sectionStep:

Code: Select all
unsigned int sectionIndex = RefreshSectionIndex(voiceParameters.mStep);
double sectionStep = RefreshSectionStep(sectionIndex, voiceParameters.mStep);
unsigned int blockIndex = RefreshBlockIndex(sectionStep);
double sectionLength = RefreshSectionLength(sectionIndex);

Note that all of these need to be calculated in real time, because if I move a point in the meantime (points are ultimately the only interface that can be modulated), the section/step/block change (ending up in the previous/next point/section).

After this, I use a sort of "homographic function" to determine the start/end point values of the section (from the current data, tension, etc.), and thus the start/end of the current block:

Code: Select all
int numBlocks = (int)(sectionLength / gBlockSize);
double numBlocksFraction = 1.0 / numBlocks;
double pos0 = blockIndex * numBlocksFraction;
double pos1 = (blockIndex + 1) * numBlocksFraction;
double a = 1.0 - (1.0 / mTensions[sectionIndex]);
double p0 = pos0 / (pos0 + a * (pos0 - 1.0));
double p1 = pos1 / (pos1 + a * (pos1 - 1.0));

double sectionStartAmp = mAmps[sectionIndex];
double sectionEndAmp = mAmps[sectionIndex + 1];
double sectionDeltaAmp = sectionEndAmp - sectionStartAmp;
voiceParameters.mBlockStartAmp = sectionStartAmp + p0 * sectionDeltaAmp;
double blockEndAmp = sectionStartAmp + p1 * sectionDeltaAmp;

voiceParameters.mBlockFraction = (blockEndAmp - voiceParameters.mBlockStartAmp) * (1.0 / gBlockSize);
voiceParameters.mBlockStep = fmod(voiceParameters.mBlockStep, gBlockSize);


Once I have them, it's easy: I just "interpolate" using the start, the current block step (which depends on the rate) and the end point. "Classic" linear interpolation:

Code: Select all
voiceParameters.mValue = (voiceParameters.mBlockStartAmp + (voiceParameters.mBlockStep * voiceParameters.mBlockFraction));
voiceParameters.mValue = mIsEnabled * ((1 + mIsBipolar) / 2.0 * voiceParameters.mValue + (1 - mIsBipolar) / 2.0) * mAmount;
mOutputConnector_CV.mPolyValue[voice.mIndex] = voiceParameters.mValue;

(This is the optimized version of the one I made this afternoon. It also multiplies by the envelope's amount and polarity, and just returns 0 if the envelope is !mIsEnabled.)

Once calculated, I store the output in a sort of "CV output pin" object (which will be linked to other targets later in the plugin).

Finished! I increment the phase of the envelope:

Code: Select all
voiceParameters.mBlockStep += mRate;
voiceParameters.mStep += mRate;

and I'm done 8)

That's the whole concept. I hope it is clearer now?
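
Editor's sketch of the per-block math walked through above, pulled into two small functions. The formulas (a = 1 - 1/tension, p = x / (x + a * (x - 1)), and the per-sample block increment) are taken from the posted code; the function and struct names (TensionCurve, BlockRamp, MakeBlockRamp) are hypothetical. Two things follow from the formulas themselves: a tension of 0.5 gives a straight line, and the denominator stays positive on [0, 1] only for tensions strictly between 0 and 1, which the sketch asserts. Also note that in the posted code numBlocks is an int cast of sectionLength / gBlockSize, so a section shorter than one block yields 0 and a division by zero; the sketch asserts against that rather than guessing at the intended handling.

```cpp
#include <cassert>

// Tension curve from the post: maps x in [0, 1] onto [0, 1].
// tension == 0.5 gives a = -1, so the result is exactly x (a straight line).
inline double TensionCurve(double x, double tension) {
    assert(tension > 0.0 && tension < 1.0);
    const double a = 1.0 - 1.0 / tension;
    return x / (x + a * (x - 1.0));
}

// What the "refresh" branch has to produce for one block.
struct BlockRamp {
    double startAmp;   // amplitude at the first sample of the block
    double increment;  // amplitude change per sample inside the block
};

inline BlockRamp MakeBlockRamp(double sectionStartAmp, double sectionEndAmp,
                               double tension, int blockIndex, int numBlocks,
                               int blockSize) {
    assert(numBlocks > 0 && blockSize > 0);
    assert(blockIndex >= 0 && blockIndex < numBlocks);
    const double pos0 = static_cast<double>(blockIndex) / numBlocks;
    const double pos1 = static_cast<double>(blockIndex + 1) / numBlocks;
    const double delta = sectionEndAmp - sectionStartAmp;
    const double amp0 = sectionStartAmp + TensionCurve(pos0, tension) * delta;
    const double amp1 = sectionStartAmp + TensionCurve(pos1, tension) * delta;
    return {amp0, (amp1 - amp0) / blockSize};
}
```

Within a block the output is then just startAmp plus blockStep times increment, which matches the linear interpolation line in the post.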
PurpleSunray
KVRian
 
587 posts since 13 Mar, 2012

Postby PurpleSunray; Thu May 18, 2017 12:40 pm Re: How much CPU would take 10 Envelopes per voice? It seems a lot...

Nearly :D

There is just one thing I do not get yet: where does this "dynamic effect" come in?

Right now I understand it like this:
The user draws points and sets a tension in between.
This is a curve you can sample as such, e.g. into 2048 values.
If the user changes a point or tension, re-create that curve.

When you want to apply the envelope to the audio signal, interpolate.
So the audio processing loop is down to maintaining an index and
float env_val = mEnvelope[i] + (mEnvelope[j] - mEnvelope[i]) * a;

If that curve is "vibrating" (to give an example of a dynamic effect), I would stack it.
There is still the user-drawn curve, which only changes / needs a re-calc if the user changes it.
The vibration is then a little function that is multiplied on top of the user-drawn curve.
So on the audio signal path you do only a small part of the math to build the env (the "dynamic" part); the user-drawn curve math is not on the audio signal code path but in the handler for the user event.
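
Editor's sketch of the "sample the curve into a table, then interpolate and stack" idea from this post. It assumes a pre-rendered table (e.g. 2048 values) that is only rebuilt when the user edits the curve; ReadCurve, StackedValue and modGain are hypothetical names, and a plain multiplication stands in for whatever the "vibration" function would be.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Linear interpolation into a pre-rendered curve; phase is in [0, 1].
inline double ReadCurve(const std::vector<double>& table, double phase) {
    assert(table.size() >= 2 && phase >= 0.0 && phase <= 1.0);
    const double pos = phase * static_cast<double>(table.size() - 1);
    // Clamp so that i + 1 stays inside the table when phase == 1.
    const std::size_t i = std::min(static_cast<std::size_t>(pos), table.size() - 2);
    const double a = pos - static_cast<double>(i);  // fractional position
    return table[i] + (table[i + 1] - table[i]) * a;
}

// "Stacked" modulation: the cached user curve is scaled by a cheap
// per-sample factor instead of being recomputed.
inline double StackedValue(const std::vector<double>& userCurve, double phase,
                           double modGain) {
    return ReadCurve(userCurve, phase) * modGain;
}
```

This only covers modulation that can be expressed as an operation on top of the finished curve. It does not cover modulating the curve's own points, which is the case the rest of the thread is about.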
Nowhk
KVRian
 
576 posts since 2 Oct, 2013

Postby Nowhk; Thu May 18, 2017 1:10 pm Re: How much CPU would take 10 Envelopes per voice? It seems a lot...

PurpleSunray wrote:Nearly :D

There is just one thing I do not get yet: where does this "dynamic effect" come in?

Points can be defined this way:

Code: Select all
double mLengths[gMaxNumPoints] = { 0.0, 0.1, 0.3, 0.6, 1.0 }; // x positions
double mAmps[gMaxNumPoints] = { 0.8, -0.5, 0.4, 0.2, 0.2 }; // y positions
double mTensions[gMaxNumPoints - 1] = { 0.5, 0.5, 0.5, 0.5 }; // tensions

This identifies (with the math above) 4 splines (5 points). Now let's say that some other envelopes modulate any (or even all) of these points (that is, the array values above, in any direction): the spline samples need to be recalculated. It changes a lot... every spline would be affected.

Or am I missing some math "scaling" magic?

PurpleSunray wrote: The vibration is then a little function that is multiplied on top of the user-drawn curve.

This would be the magic :D
PurpleSunray
KVRian
 
587 posts since 13 Mar, 2012

Postby PurpleSunray; Thu May 18, 2017 1:22 pm Re: How much CPU would take 10 Envelopes per voice? It seems a lot...

This identifies (with the math above) 4 splines (5 points). Now let's say that some other envelopes modulate any (or even all) of these points (that is, the array values above, in any direction): the spline samples need to be recalculated. It changes a lot... every spline would be affected.

OK.. got it now: an envelope can modulate the points of another envelope.
Up to now I was thinking only the user can modify points. :D

And you are sure there is no way to pre-calc this?
I mean, even the envelope that modifies another envelope will be drawn by the user at some point, right?
So when the user changes env0, which modulates env1, could you calculate an env3, which is the result of the env0->env1 modulation?
env3 is then what you use in the audio processing (a pre-calculated curve that only needs interpolation).

Edit:
If there is absolutely no way to reduce the math in the audio processing, well, then you are down to code-level optimization. But don't expect too much from it. Compilers are pretty good at optimizing already; even with hand-crafted SIMD code you usually don't get more than a couple of percent.

Where you can gain the most is by finding smarter ways to process the signals.
(Like: I would approach that modulation as a signal chain of its own. There is a modulation signal which you multiply with the audio. That is the audio processing function. Nothing more.
Where is the modulation signal coming from? Oh, that is a chain of 10 independently triggered envelope generators .. each modulating the other.. ending up in dozens of index and pos and slope and tension and .. calculations. Damn, that explodes my CPU. Can I synthesize that in any smarter way? A wavetable osc, where the user basically draws the waveform and the mod is about morphing that waveform? No, does not work. Can a gaintable osc do it? Or an <xyz> osc with that special post-processing, or idk, it just needs to be smarter than calculating a lot of slope curves from a couple of points.)
Last edited by PurpleSunray on Thu May 18, 2017 2:59 pm, edited 10 times in total.
camsr
KVRAF
 
6602 posts since 16 Feb, 2005

Postby camsr; Thu May 18, 2017 2:22 pm Re: How much CPU would take 10 Envelopes per voice? It seems a lot...

I've been glancing over this thread but I can't seem to find the problem... except for some ugly looking conditionals (I try to avoid nested loops).
Pre-rendering the amp mod envelopes is a bad idea. You should not store what you can calculate, as long as that calculation isn't incredibly long or resource heavy. Simply iterating over the envelope formula (piecewise over x?) should be sufficient even if the parameters are changed mid-iteration. To use more data versus more CPU instructions only wins if the CPU instruction stream is running full-bore through your algorithm, never waiting for caching to occur. Using too much data and too few instructions could be a major bottleneck depending on the algorithm complexity.

Usually you should do everything by the buffer slice, don't do checks inside the process loop, don't call functions, always inline, have everything available before the loop starts, and just expect the loop to use what it's given before it starts. This is the #1 optimization for any loop, because branching can be bad with deep pipeline CPUs.

But I suspect your problem is that there's actually a lot of data-per-sample to deal with, and that overhead is unavoidable. I coded a windowed RMS that calculated the RMS at sample-time from the surrounding 4800 samples, every sample. Only a few lines of code, but CPU time exceeded 60 percent.
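
Editor's sketch of the "do everything by the buffer slice" advice above: instead of testing for a block boundary on every sample, compute how many samples remain until the next boundary and run a branch-free inner loop for that many. This is a simplified illustration, not code from the thread: it assumes the envelope advances exactly one step per sample (a fractional rate would need the run length computed from the rate), and RampState, RenderEnvelope and the refresh callback are hypothetical names standing in for the per-block recalculation.

```cpp
#include <algorithm>
#include <cassert>

struct RampState {
    double value = 0.0;      // current output value
    double increment = 0.0;  // per-sample change within the current block
    int samplesLeft = 0;     // samples remaining before the next refresh
};

// refresh(state) is called once per block to set value/increment; it plays
// the role of the "calculate dynamic spline block" branch.
template <typename Refresh>
void RenderEnvelope(RampState& s, double* out, int numSamples, int blockSize,
                    Refresh refresh) {
    assert(blockSize > 0 && numSamples >= 0);
    int done = 0;
    while (done < numSamples) {
        if (s.samplesLeft == 0) {  // evaluated once per block, not per sample
            refresh(s);
            s.samplesLeft = blockSize;
        }
        const int run = std::min(s.samplesLeft, numSamples - done);
        for (int i = 0; i < run; ++i) {  // inner loop has no conditionals
            out[done + i] = s.value;
            s.value += s.increment;
        }
        s.samplesLeft -= run;
        done += run;
    }
}
```

The same restructuring applies to a "new voice restarts the envelope" check: it depends on the voice, not on the sample, so it can be handled once before the sample loop.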
Nowhk
KVRian
 
576 posts since 2 Oct, 2013

Postby Nowhk; Thu May 18, 2017 10:54 pm Re: How much CPU would take 10 Envelopes per voice? It seems a lot...

PurpleSunray wrote: And you are sure there is no way to pre-calc this?
I mean, even the envelope that modifies another envelope will be drawn by the user at some point, right?
So when the user changes env0, which modulates env1, could you calculate an env3, which is the result of the env0->env1 modulation?
env3 is then what you use in the audio processing (a pre-calculated curve that only needs interpolation).

I guess that's unpredictable. What if env1's rate is itself modulated? This means that the env2 cycle will be different every time, so I need to constantly change the wavetable anyway. Or what if env1 is modulated by an external source (e.g. DAW automation)? I can't know. Or even "worse", what if env1's params are modulated by some other "random" LFO shape? I don't see many solutions with pre-calc...

camsr wrote:Usually you should do everything by the buffer slice, don't do checks inside the process loop, don't call functions, always inline, have everything available before the loop starts, and just expect the loop to use what it's given before it starts. This is the #1 optimization for any loop, because branching can be bad with deep pipeline CPUs.

But some seem inevitable. Look at this piece of code (I'm within a voice iteration):

Code: Select all
bool isFirstVoiceSample = (voice.mSample == 0.0);
int nbSamples = blockSize;
while (nbSamples > 0) {
   // envelopes
   for (int envelopeIndex = 0; envelopeIndex < ENVELOPES_CONTAINER_NUM_ENVELOPE_MANAGER; envelopeIndex++) {
      Envelope &envelope = pEnvelopesContainer->pEnvelopeManager[envelopeIndex]->mEnvelope;
      VoiceParameters &voiceParameters = envelope.mVoiceParameters[voiceIndex];

      // new voice will restart the envelope
      if (isFirstVoiceSample) {
         voiceParameters.mBlockStep = gBlockSize;
         voiceParameters.mStep = 0.0;
      }

      // calculate dynamic spline block
      if (voiceParameters.mBlockStep >= gBlockSize) {
      }    

      // update value

      // next phase
   }

   nbSamples--;
}

(where, once I need a new chunk of samples, I update that part of the spline; and I check whether I need to restart the envelope because of a new voice).

How would you get rid of this? It's specific to voice && sample...
PurpleSunray
KVRian
 
587 posts since 13 Mar, 2012

Postby PurpleSunray; Fri May 19, 2017 12:16 am Re: How much CPU would take 10 Envelopes per voice? It seems a lot...

Hm, I also don't see how to optimize this any further at the algorithm level.
The requirement is that the env must handle any kind of input as a modulation input for its own parameters.
It's a very generic requirement, which will require a generic implementation.

So unless some math geek comes up with a genius idea for how to simplify the math here, I have no more ideas. What you want to do involves a lot of number crunching, so it will require a lot of CPU.

I think that is also the reason why I have not seen such a feature so far :D
In LFOTool you can also create custom envelope curves via points and shapes/tensions.
You can also modulate that envelope, but the modulation params are "swing", "phase", "PWM", "smooth"...
So that is all stuff that modulates the given curve; it is not about re-creating/recalculating the curve on every modulation tick (LFOTool causes almost no CPU load at all, even with all 12 envs enabled.. because they restricted what it can do to a set of use cases that can be optimized ;) ).
Nowhk
KVRian
 
576 posts since 2 Oct, 2013

Postby Nowhk; Fri May 19, 2017 12:36 am Re: How much CPU would take 10 Envelopes per voice? It seems a lot...

PurpleSunray wrote:I think that is also the reason why I have not seen such a feature so far :D
In LFOTool you can also create custom envelope curves via points and shapes/tensions.
You can also modulate that envelope, but modulation params are "swing", "phase", "PWM", "smooth"...
So that is all stuff that modulates the given curve; it is not about re-creating/recalculating the curve on every modulation tick (LFOTool causes no CPU load at all, even with all 12 envs enabled.. because they restricted what it can do to a set of use cases that can be optimized ;) ).

Not at all, dude :) In LFOTool you can automate points as well, however you want (by external modulation, for example).

Look here. Two point mods + 1 tension mod on a single envelope, with a global audio path (since there is no voice concept), and it's already at 4% CPU (modulating nothing except amplitude, because the other amounts are 0; if you multiply this by 12 = n envelopes and 16 = n voices, well... it's huge):

http://www.fastswf.com/MCucIso

I don't think it can use any kind of pre-calc here (it does exactly what I'm doing with my plug, except that in mine the points can also "stretch" the prev/next ones).
PurpleSunray
KVRian
 
587 posts since 13 Mar, 2012

Postby PurpleSunray; Fri May 19, 2017 1:15 am Re: How much CPU would take 10 Envelopes per voice? It seems a lot...

ahah, ok, then I have just not discovered that LFOTool feature so far :D

it's already at 4% CPU (modulating nothing except amplitude, because the other amounts are 0; if you multiply this by 12 = n envelopes and 16 = n voices, well... it's huge)

Well, then this somehow shows that they also haven't found a ground-breaking idea to reduce the computing complexity, but have the same problem you have.
So you probably have to live with it. Maybe do some code-level optimization to improve by a few %, but it won't bring any major performance jumps for sure. But the good thing is.. if you manage to come up with a better concept for doing this, you have a very unique feature nobody else has :D (I run 100 envelopes on 20 voices with 5% CPU load, HA! lol)
Nowhk
KVRian
 
576 posts since 2 Oct, 2013

Postby Nowhk; Fri May 19, 2017 1:21 am Re: How much CPU would take 10 Envelopes per voice? It seems a lot...

PurpleSunray wrote: ahah, ok, then I have just not discovered that LFOTool feature so far :D

No matter ;)

PurpleSunray wrote: Well, then this somehow shows that they also haven't found a ground-breaking idea to reduce the computing complexity, but have the same problem you have.
So you probably have to live with it. Maybe do some code-level optimization to improve by a few %, but it won't bring any major performance jumps for sure. But the good thing is.. if you manage to come up with a better concept for doing this, you have a very unique feature nobody else has :D (I run 100 envelopes on 20 voices with 5% CPU load, HA! lol)

;) I see!

So, what about removing those "if"s within the sample block iteration? How would you do this? I think it could really increase performance here.

Is there a way to substitute an if statement that does something / calls a function (like the voiceParameters.mBlockStep >= gBlockSize or isFirstVoiceSample checks above) with faster code?

I read about the "binary look-up table", something like this:

Code: Select all
// conditional
a = b ? c : d;

// ... which is...
if (b) a = c;
else a = d;

// to this
static const tipo lookup_table[] = { d, c }; // "tipo" = whatever type c and d have
a = lookup_table[b];

But in this case, if b = 1, I should call a function?!?!
Not sure if it is achievable...
PurpleSunray
KVRian
 
587 posts since 13 Mar, 2012

Postby PurpleSunray; Fri May 19, 2017 4:37 am Re: How much CPU would take 10 Envelopes per voice? It seems a lot...

But in this case, if b = 1, I should call a function?!?!

Then this is a condition that may cause a branch (you want to call a function if b = 1, so you actually want to branch).
You can remove the if by putting function pointers in that lookup_table.
Function pointer 0 points to an empty function (lookup_table[b]() will do nothing if b is 0), function pointer 1 points to the real function (lookup_table[b]() will call it if b is 1).
But it only removes the if statement, not the conditional branch. The CPU might jump to the empty function or to the real function, depending on b, so this is a conditional jump by definition, whether you write the if(b) or not.

But I can only repeat: read (and understand) this:
https://en.wikipedia.org/wiki/Instruction_pipelining

It is not about removing ifs or switches or whatever, but about keeping the pipeline as busy as possible.
Bubbles are bad, stalls are bad, flushes are very bad. You cannot avoid them completely, only try to minimize them.

Have you ever seen NOP instructions when stepping through assembly code and wondered what they are about?
It is an instruction that does nothing (wastes a cycle). On some architectures it is there because some instruction must wait until a previous one comes out of the pipeline (in x86 compiler output it is mostly alignment padding). During that time, your CPU processes NOPs (it does nothing.. instead of calculating useful stuff).
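
Editor's sketch of the function-pointer table described in this post, for illustration only. EnvState, DoNothing, RefreshBlock and Dispatch are hypothetical names. As the post says, this removes the if from the source but not the data-dependent jump: the call through the table is an indirect branch whose target depends on the condition, so it is not expected to be cheaper than a well-predicted if.

```cpp
struct EnvState {
    int refreshCount = 0;  // how many times the real handler ran
};

inline void DoNothing(EnvState&) {}                       // table entry for "false"
inline void RefreshBlock(EnvState& s) { ++s.refreshCount; }  // table entry for "true"

using Handler = void (*)(EnvState&);

// Selects the handler by index instead of by an if statement.
// A bool converts to 0 or 1, which is used directly as the table index.
inline void Dispatch(EnvState& s, bool condition) {
    static const Handler kTable[2] = { &DoNothing, &RefreshBlock };
    kTable[static_cast<int>(condition)](s);
}
```

Calling Dispatch(state, mBlockStep >= gBlockSize) would therefore behave like the original if, with the branch moved into the indirect call.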