How much CPU would take 10 Envelopes per voice? It seems a lot...
- KVRian
- 1253 posts since 31 Dec, 2008
-
- KVRian
- 853 posts since 13 Mar, 2012
Yes, but if you only have necessary branches, you have reducted to a minimum.S0lo wrote:And if you have necessary branches, you can save CPU.PurpleSunray wrote:If you have uncessary branches, you waste CPU. So you want to reduce it to a minimum.
We talk about the unecessary ones
Last edited by PurpleSunray on Tue May 16, 2017 6:24 pm, edited 1 time in total.
~~ ॐ http://soundcloud.com/mfr ॐ ~~
-
- KVRian
- 853 posts since 13 Mar, 2012
matt42 wrote:lol the same thing could be said for any function. If you have too many sin() calls you also waste CPU, best to keep those to the absolute minimum too, for example. Use of branches isn't a special case in this regard.PurpleSunray wrote:If you have uncessary branches, you waste CPU. So you want to reduce it to a minimum.
next
~~ ॐ http://soundcloud.com/mfr ॐ ~~
- KVRian
- 1253 posts since 31 Dec, 2008
Here is a counter example:PurpleSunray wrote:Yes, but if you only have necessary branches, you have reducted to a minimum.S0lo wrote:And if you have necessary branches, you can save CPU.PurpleSunray wrote:If you have uncessary branches, you waste CPU. So you want to reduce it to a minimum.
We talk about the unecessary ones
Code1:
Code: Select all
function insertionSortR(array A, int n)
if n>1
insertionSortR(A,n-1)
x = A[n]
j = n-1
while j >= 0 and A[j] > x
A[j+1] = A[j]
j = j-1
end while
A[j+1] = x
end if
end function
Code: Select all
algorithm quicksort(A, lo, hi) is
if lo < hi then
p := partition(A, lo, hi)
quicksort(A, lo, p)
quicksort(A, p + 1, hi)
algorithm partition(A, lo, hi) is
pivot := A[lo]
i := lo - 1
j := hi + 1
loop forever
do
i := i + 1
while A[i] < pivot
do
j := j - 1
while A[j] > pivot
if i >= j then
return j
swap A[i] with A[j]
check here: https://en.wikipedia.org/wiki/Quicksort
and here: https://en.wikipedia.org/wiki/Insertion_sort
www.solostuff.net
Advice is heavy. So don’t send it like a mountain.
Advice is heavy. So don’t send it like a mountain.
-
- KVRAF
- 7401 posts since 17 Feb, 2005
Anyone know how to do a hard limit without conditional branch?
Let's say the "normal" way to do it is:
if (val > 1.f) { val = 1.f; }
Or maybe it's possible to select between two values instead, and have a faster result?
I think in this simple yet useful example, the branch is ok... it's very small and the resultant op is only an assignment from the instruction stream, no data is being accessed.
Let's say the "normal" way to do it is:
if (val > 1.f) { val = 1.f; }
Or maybe it's possible to select between two values instead, and have a faster result?
I think in this simple yet useful example, the branch is ok... it's very small and the resultant op is only an assignment from the instruction stream, no data is being accessed.
-
- KVRian
- 853 posts since 13 Mar, 2012
@S0lo:
ok, now you lost me
how does a comparsion of two sorting algorithms contribute to our banch discussion?
Quicksort is no branch-optimized version of insertion-sort, or vice versa. Different dicussion.
ok, now you lost me
how does a comparsion of two sorting algorithms contribute to our banch discussion?
Quicksort is no branch-optimized version of insertion-sort, or vice versa. Different dicussion.
~~ ॐ http://soundcloud.com/mfr ॐ ~~
- KVRian
- 1253 posts since 31 Dec, 2008
I'm sure you know what I meant.PurpleSunray wrote:@S0lo:
ok, now you lost me
how does a comparsion of two sorting algorithms contribute to our banch discussion?
Quicksort is no branch-optimized version of insertion-sort, or vice versa. Different dicussion.
www.solostuff.net
Advice is heavy. So don’t send it like a mountain.
Advice is heavy. So don’t send it like a mountain.
- KVRist
- 91 posts since 24 Dec, 2015 from Bristol, UK
I use code paths for SSE2 and AVX which for float and double is: _mm_max_ps / _mm256_max_ps / _mm_max_pd / _mm256_max_pd (plus min alternatives)camsr wrote:Anyone know how to do a hard limit without conditional branch?
-
- KVRian
- 853 posts since 13 Mar, 2012
No, we talked about that you do not agree with that the goal should be to minimized branching.S0lo wrote:I'm sure you know what I meant.PurpleSunray wrote:@S0lo:
ok, now you lost me
how does a comparsion of two sorting algorithms contribute to our banch discussion?
Quicksort is no branch-optimized version of insertion-sort, or vice versa. Different dicussion.
Quicksort and insertion-sort have different performance, beacuse they sort differntly.
It's like telling me that SQL server is way faster than Orcale.. yes, ok accepted, idk... but still don't get the link to branching.
(but let's stop that yes / no game now.. ppl want to dicusss usefull things again as it seems)
~~ ॐ http://soundcloud.com/mfr ॐ ~~
- KVRian
- 1253 posts since 31 Dec, 2008
No problemPurpleSunray wrote:but let's stop that yes / no game now.. ppl want to dicusss usefull things again as it seems)
www.solostuff.net
Advice is heavy. So don’t send it like a mountain.
Advice is heavy. So don’t send it like a mountain.
-
- KVRAF
- 7401 posts since 17 Feb, 2005
Thanks, I forgot about those.keithwood wrote:I use code paths for SSE2 and AVX which for float and double is: _mm_max_ps / _mm256_max_ps / _mm_max_pd / _mm256_max_pd (plus min alternatives)camsr wrote:Anyone know how to do a hard limit without conditional branch?
I don't think my compiler (mingw) automatically converts a statement like I wrote above to use SSE intrinsics, even if I have specified the option -mfpmath=sse
-
- KVRian
- 1273 posts since 9 Jan, 2006
To hard limit without conditionals:
Clips to +- 1.0. Erm, maybe there's a better way, I haven't used this
Code: Select all
double hardClip(double x)
{
x = x - max(x - 1.0, 0.0);
x = x - min(x + 1.0, 0.0);
return x;
}
Last edited by matt42 on Tue May 16, 2017 8:59 pm, edited 2 times in total.
- KVRian
- 1253 posts since 31 Dec, 2008
problem is max and min implementation it self uses conditionals.matt42 wrote:To hard limit without conditionals:
Clips to +- 1.0. Erm, maybe there's a better way, I haven't used thisCode: Select all
x = x - max(x - 1.0, 0.0); x = x - min(x + 1.0, 0.0);
Any way, you can do this:
x = min(x, 1);
Which actually does worse than a mere if statement. because the assignment is always performed
Last edited by S0lo on Tue May 16, 2017 8:15 pm, edited 1 time in total.
www.solostuff.net
Advice is heavy. So don’t send it like a mountain.
Advice is heavy. So don’t send it like a mountain.
-
- KVRian
- 1379 posts since 26 Apr, 2004 from UK
Except if the branches can be predicted (i.e. always take the same branch), this is false in any modern architecture, ARM included.S0lo wrote:And if you have necessary branches, you can save CPU. So you want to increase them to the sufficient but not more.PurpleSunray wrote:If you have uncessary branches, you waste CPU. So you want to reduce it to a minimum.
-
- KVRian
- 1379 posts since 26 Apr, 2004 from UK
You are comparing apples and oranges. Of course if the complexity is different, this changes everything! We are talking about the same complexity in number of instructions.S0lo wrote:Both codes do the same thing. "sorting". But Code1 (insertion sort) uses two conditionals (an if and a while). While code2 (quicksort) has four conditionals (2 ifs and 2 whiles). Yet code2 is well known to perform much faster (nlog(n)) than code1 (quadratic) on the average.
check here: https://en.wikipedia.org/wiki/Quicksort
and here: https://en.wikipedia.org/wiki/Insertion_sort