How much CPU would take 10 Envelopes per voice? It seems a lot...

DSP, Plugin and Host development discussion.
Post Reply New Topic
RELATED
PRODUCTS

Post

SORRY WRONG POST :oops:
www.solostuff.net
Advice is heavy. So don’t send it like a mountain.

Post

S0lo wrote:
PurpleSunray wrote:If you have uncessary branches, you waste CPU. So you want to reduce it to a minimum.
And if you have necessary branches, you can save CPU.
Yes, but if you only have necessary branches, you have reducted to a minimum.
We talk about the unecessary ones
Last edited by PurpleSunray on Tue May 16, 2017 6:24 pm, edited 1 time in total.

Post

matt42 wrote:
PurpleSunray wrote:If you have uncessary branches, you waste CPU. So you want to reduce it to a minimum.

next
lol the same thing could be said for any function. If you have too many sin() calls you also waste CPU, best to keep those to the absolute minimum too, for example. Use of branches isn't a special case in this regard.
:wink:

Post

PurpleSunray wrote:
S0lo wrote:
PurpleSunray wrote:If you have uncessary branches, you waste CPU. So you want to reduce it to a minimum.
And if you have necessary branches, you can save CPU.
Yes, but if you only have necessary branches, you have reducted to a minimum.
We talk about the unecessary ones
Here is a counter example:

Code1:

Code: Select all

 function insertionSortR(array A, int n)
     if n>1
        insertionSortR(A,n-1)
        x = A[n]
        j = n-1
        while j >= 0 and A[j] > x
            A[j+1] = A[j]
            j = j-1
        end while
        A[j+1] = x
     end if
 end function
Code2:

Code: Select all

algorithm quicksort(A, lo, hi) is
    if lo < hi then
        p := partition(A, lo, hi)
        quicksort(A, lo, p)
        quicksort(A, p + 1, hi)

algorithm partition(A, lo, hi) is
    pivot := A[lo]
    i := lo - 1
    j := hi + 1
    loop forever
        do
            i := i + 1
        while A[i] < pivot

        do
            j := j - 1
        while A[j] > pivot

        if i >= j then
            return j

        swap A[i] with A[j]
Both codes do the same thing. "sorting". But Code1 (insertion sort) uses two conditionals (an if and a while). While code2 (quicksort) has four conditionals (2 ifs and 2 whiles). Yet code2 is well known to perform much faster (nlog(n)) than code1 (quadratic) on the average.

check here: https://en.wikipedia.org/wiki/Quicksort
and here: https://en.wikipedia.org/wiki/Insertion_sort
www.solostuff.net
Advice is heavy. So don’t send it like a mountain.

Post

Anyone know how to do a hard limit without conditional branch?
Let's say the "normal" way to do it is:
if (val > 1.f) { val = 1.f; }
Or maybe it's possible to select between two values instead, and have a faster result?

I think in this simple yet useful example, the branch is ok... it's very small and the resultant op is only an assignment from the instruction stream, no data is being accessed.

Post

@S0lo:
ok, now you lost me :?
how does a comparsion of two sorting algorithms contribute to our banch discussion?
Quicksort is no branch-optimized version of insertion-sort, or vice versa. Different dicussion.

Post

PurpleSunray wrote:@S0lo:
ok, now you lost me :?
how does a comparsion of two sorting algorithms contribute to our banch discussion?
Quicksort is no branch-optimized version of insertion-sort, or vice versa. Different dicussion.
I'm sure you know what I meant.
www.solostuff.net
Advice is heavy. So don’t send it like a mountain.

Post

camsr wrote:Anyone know how to do a hard limit without conditional branch?
I use code paths for SSE2 and AVX which for float and double is: _mm_max_ps / _mm256_max_ps / _mm_max_pd / _mm256_max_pd (plus min alternatives)

Post

S0lo wrote:
PurpleSunray wrote:@S0lo:
ok, now you lost me :?
how does a comparsion of two sorting algorithms contribute to our banch discussion?
Quicksort is no branch-optimized version of insertion-sort, or vice versa. Different dicussion.
I'm sure you know what I meant.
No, we talked about that you do not agree with that the goal should be to minimized branching.
Quicksort and insertion-sort have different performance, beacuse they sort differntly.
It's like telling me that SQL server is way faster than Orcale.. yes, ok accepted, idk... but still don't get the link to branching.

(but let's stop that yes / no game now.. ppl want to dicusss usefull things again as it seems)

Post

PurpleSunray wrote:but let's stop that yes / no game now.. ppl want to dicusss usefull things again as it seems)
No problem :)
www.solostuff.net
Advice is heavy. So don’t send it like a mountain.

Post

keithwood wrote:
camsr wrote:Anyone know how to do a hard limit without conditional branch?
I use code paths for SSE2 and AVX which for float and double is: _mm_max_ps / _mm256_max_ps / _mm_max_pd / _mm256_max_pd (plus min alternatives)
Thanks, I forgot about those.
I don't think my compiler (mingw) automatically converts a statement like I wrote above to use SSE intrinsics, even if I have specified the option -mfpmath=sse

Post

To hard limit without conditionals:

Code: Select all

double hardClip(double x)
{
     x = x - max(x - 1.0, 0.0);
     x = x - min(x + 1.0, 0.0);
     return x;
}
Clips to +- 1.0. Erm, maybe there's a better way, I haven't used this :D
Last edited by matt42 on Tue May 16, 2017 8:59 pm, edited 2 times in total.

Post

matt42 wrote:To hard limit without conditionals:

Code: Select all

x = x - max(x - 1.0, 0.0);
x = x - min(x + 1.0, 0.0);
Clips to +- 1.0. Erm, maybe there's a better way, I haven't used this :D
problem is max and min implementation it self uses conditionals.

Any way, you can do this:

x = min(x, 1);

Which actually does worse than a mere if statement. because the assignment is always performed
Last edited by S0lo on Tue May 16, 2017 8:15 pm, edited 1 time in total.
www.solostuff.net
Advice is heavy. So don’t send it like a mountain.

Post

S0lo wrote:
PurpleSunray wrote:If you have uncessary branches, you waste CPU. So you want to reduce it to a minimum.
And if you have necessary branches, you can save CPU. So you want to increase them to the sufficient but not more.
Except if the branches can be predicted (i.e. always take the same branch), this is false in any modern architecture, ARM included.

Post

S0lo wrote:Both codes do the same thing. "sorting". But Code1 (insertion sort) uses two conditionals (an if and a while). While code2 (quicksort) has four conditionals (2 ifs and 2 whiles). Yet code2 is well known to perform much faster (nlog(n)) than code1 (quadratic) on the average.

check here: https://en.wikipedia.org/wiki/Quicksort
and here: https://en.wikipedia.org/wiki/Insertion_sort
You are comparing apples and oranges. Of course if the complexity is different, this changes everything! We are talking about the same complexity in number of instructions.

Post Reply

Return to “DSP and Plugin Development”