mystran wrote: ↑Thu Nov 21, 2019 5:27 am
If you want to sanitise input, you can use std::isnan() for MSVC and __isnan() for clang and GCC. Note that both clang and GCC can optimise std::isnan() into a NOP when using fast-math (which is about as unhelpful as it gets, but whatever), so you really have to use __isnan() instead.

What about std::isfinite()?
Optimize plugin code for balanced load or least load?
-
- KVRAF
- 1607 posts since 12 Apr, 2002
- KVRAF
- 7890 posts since 12 Feb, 2006 from Helsinki, Finland
Z1202 wrote: ↑Thu Nov 21, 2019 10:44 am
What about std::isfinite()?

No idea, but probably the same as std::isnan(). I don't usually check for infinities explicitly, but rather clip very large input samples at a finite threshold, such that there is still some headroom before any internal computations would start producing infinities (i.e. I sanitise input in such a way that output is known to stay finite). Unlike NaNs, infinities don't really require any special handling if you're clipping anyway.
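A minimal sketch of the sanitising approach described above, under some assumptions: the helper names `isNanBits` and `sanitizeSample` and the clip threshold of 1.0e4 are hypothetical and not from the thread, and samples are assumed to be 32-bit IEEE-754 floats. The NaN test inspects the raw bit pattern rather than calling std::isnan(), which is one way to keep the check from being optimised away under fast-math; whether a given compiler preserves it should still be verified.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: NaN test on the raw IEEE-754 bit pattern.
// A float is NaN when the exponent bits are all ones and the mantissa is non-zero.
inline bool isNanBits(float x) {
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);        // well-defined type punning
    return (bits & 0x7FFFFFFFu) > 0x7F800000u;  // strip sign, compare against +inf
}

// Hypothetical helper: replace NaN with silence and clip everything else,
// infinities included, to a finite threshold (the default is an assumption).
inline float sanitizeSample(float x, float limit = 1.0e4f) {
    if (isNanBits(x)) return 0.0f;
    if (x > limit) return limit;
    if (x < -limit) return -limit;
    return x;
}
```

The threshold would be chosen so that the plugin's internal computations still have headroom before they could overflow.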
-
- KVRian
- Topic Starter
- 626 posts since 30 Aug, 2012
JCJR wrote: ↑Wed Nov 20, 2019 2:52 am
delayIdx *= ((delayIdx += 1) < DelayLength);
Assuming jsfx compiles a branchless comparison, TRUE comparisons return 1 and FALSE comparisons return 0. Also jsfx seems to run fastest the fewer times you reference vars.
So the above line first increments delayIdx; then, if the new delayIdx < DelayLength, it multiplies the new delayIdx by 1 (no change), otherwise it multiplies the new delayIdx by 0, resetting the pointer to the bottom of the buffer.

Xcode gives me an "undetermined" compiler warning for delayIdx with this code. It seems to run OK, but I'm not sure it's a safe way to do this operation given that warning. Just FYI.
- KVRAF
- 2237 posts since 25 Sep, 2014 from Specific Northwest
Fender19 wrote: ↑Fri Nov 22, 2019 9:40 pm
Xcode gives me an "undetermined" compiler warning for delayIdx with this code. It seems to run OK but I'm not sure it's safe way to do this operation given that warning. Just FYI.

Using Booleans as real values is generally frowned upon. Depending on the language, "false"/"true" can be 0/-1, 0/1, negative/positive, 0 or unreal/everything else (positive or negative), etc. I've seen flavors of the same language do it differently. However, it's a great way to use a compare to avoid a branch. So, if you do use it, document what you're doing and use casts liberally, if needed. Plus, I'd code it like this, assuming false=0 and true=1:
Code: Select all
++delayIdx;
delayIdx *= (delayIdx < DelayLength);
Code: Select all
delayIdx = ++delayIdx & 0xFF; // 0 - 255
I started on Logic 5 with a PowerBook G4 550Mhz. I now have a MacBook Air M1 and it's ~165x faster! So, why is my music not proportionally better?
-
- KVRian
- Topic Starter
- 626 posts since 30 Aug, 2012
JCJR wrote: ↑Wed Nov 20, 2019 2:52 am
delayIdx *= ((delayIdx += 1) < DelayLength);

I misspoke above - Xcode gives me an "unsequenced modification" compiler warning for this code.
-
- KVRian
- 631 posts since 21 Jun, 2013
Fender19 wrote: ↑Fri Nov 22, 2019 11:11 pm
I misspoke above - Xcode gives me an "unsequenced modification" compiler warning for this code.

This expression is UB. Nice catch by the compiler.
- KVRist
- 347 posts since 20 Apr, 2005 from Moscow, Russian Federation
"This expression is UB."

In C/C++, yes. But JCJR posted his code as jsfx. Obviously, in C one is supposed to replace this expression with:
Code: Select all
++delayIdx;
delayIdx *= (delayIdx < DelayLength);
The same goes for `delayIdx = ++delayIdx & 0xFF;` etc. (that one was no longer posted as jsfx).
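As a rough illustration of the point above, here is one way both wraps might be written so that the index is never modified and read within the same unsequenced expression. The function names `wrapByMultiply` and `wrapByMask` are hypothetical, and the mask version assumes a power-of-two length of 256, matching the `0xFF` in the thread.

```cpp
#include <cstddef>

// Hypothetical helper: compare-and-multiply wrap, split into two full
// expressions so the increment completes before the comparison reads idx.
inline std::size_t wrapByMultiply(std::size_t idx, std::size_t length) {
    ++idx;
    idx *= static_cast<std::size_t>(idx < length);  // bool converts to 0 or 1
    return idx;
}

// Hypothetical helper: mask wrap for a buffer of 256 entries. The new value
// is computed from idx + 1, so idx itself is only read, never modified here.
inline std::size_t wrapByMask(std::size_t idx) {
    return (idx + 1) & 0xFF;  // result stays in 0..255
}
```

Writing `(idx + 1) & 0xFF` instead of `++idx & 0xFF` sidesteps the sequencing question entirely, because the variable is assigned exactly once.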
-
- KVRian
- Topic Starter
- 626 posts since 30 Aug, 2012
Max M. wrote: ↑Sat Nov 23, 2019 3:11 am
"This expression is UB."
In C/C++, yes. But JCJR posted his code as jsfx. Obviously, in C one is supposed to replace this expression with:
Code: Select all
++delayIdx;
delayIdx *= (delayIdx < DelayLength);
The same goes for `delayIdx = ++delayIdx & 0xFF;` etc. (that one was no longer posted as jsfx).

Yes, my bad re the jsfx reference. The code shown above, on two lines, works fine in Xcode.
- KVRAF
- 7890 posts since 12 Feb, 2006 from Helsinki, Finland
syntonica wrote: ↑Fri Nov 22, 2019 10:49 pm
Using Booleans as real values is generally frowned upon. Depending on the language, "false"/"true" can be 0/-1, 0/1, negative/positive, 0 or unreal/everything else (positive or negative), etc. I've seen flavors of the same language do it differently. However, it's a great way to use a compare to avoid a branch.

Actually, in C++ booleans are either true or false and the compiler is free to represent them in whatever way it finds the most convenient (e.g. CPU flags are pretty common). Conversion of a boolean into numeric types does give you either 1 (for true) or 0 (for false), but this is potentially an actual conversion that in theory might even involve a branch (well, not really, except as far as it would be standard compliant).
The code we are discussing is a pessimisation.
Just use "if(condition) index=0;" and you'll get either a well-predicted branch or a conditional move; either of these is faster than doing multiplications.
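A small sketch of the plain conditional wrap suggested above; the function name `nextIndex` is hypothetical, and which instruction sequence a compiler emits for it is not guaranteed and would need checking per toolchain.

```cpp
#include <cstddef>

// Hypothetical helper: advance a circular-buffer index with a plain `if`.
// Compilers may lower this to a predictable branch or to a conditional move.
inline std::size_t nextIndex(std::size_t idx, std::size_t length) {
    ++idx;
    if (idx >= length) idx = 0;  // wrap back to the start of the buffer
    return idx;
}
```

Compared with the multiply-by-comparison form, this states the intent directly and leaves the choice between a branch and a conditional move to the optimiser.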
- KVRAF
- 2237 posts since 25 Sep, 2014 from Specific Northwest
I generally start with the naive solution, as you suggest, and then go from there to see if there are speedier alternatives, because the code profiler might disagree in the end! That's why I always set up tests to see what's fastest and avoid following the wisdom of the herd.
- KVRAF
- 7890 posts since 12 Feb, 2006 from Helsinki, Finland
Doing a bit of testing with Godbolt, clang seems to prefer CMOVcc while GCC opts for a branch.
MSVC seems to insist on keeping the index variable in memory (either directly operating on memory, or loading/storing on per-iteration basis), no matter what (ie. apparently even with modern MSVC one should still cache the delay index in a local; in other compilers this used to be a thing some 20 years ago). If the state is in globals, it uses CMOVcc, but with a struct it seems to go for a branch instead.
ICC seems to unroll the loop, then use CMOVcc in the unrolled part and branch in the remainder. When told not to unroll, it uses CMOVcc.
conclusions: don't use MSVC.
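To illustrate the remark about caching the delay index in a local, here is a hedged sketch of a delay line whose state lives in a struct. The `DelayLine` type, its members, and `process` are hypothetical names, not code from the thread, and the buffer length is assumed to be at least 1. The member index is copied into a local before the loop and written back once afterwards, so the loop body does not have to touch the struct on every iteration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical delay line; length must be at least 1.
struct DelayLine {
    std::vector<float> buffer;
    std::size_t index = 0;

    explicit DelayLine(std::size_t length) : buffer(length, 0.0f) {}

    // Replaces each sample with the one written buffer.size() samples earlier.
    void process(float* samples, std::size_t count) {
        std::size_t idx = index;                   // cache the member in a local
        const std::size_t length = buffer.size();
        float* const buf = buffer.data();
        for (std::size_t n = 0; n < count; ++n) {
            const float delayed = buf[idx];
            buf[idx] = samples[n];
            samples[n] = delayed;
            if (++idx >= length) idx = 0;          // wrap in its own statement
        }
        index = idx;                               // write the state back once
    }
};
```

Whether this changes the generated code depends on the compiler; inspecting the output in a tool such as Godbolt, as done above, is the way to confirm it.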
- KVRist
- 347 posts since 20 Apr, 2005 from Moscow, Russian Federation
The weird thing about MSVC is that it did optimize code to CMOV* somewhere around VC2002 or so. Then suddenly one day (in VC2003) they just totally forgot about this kind of instruction (the rumor was that this was because of some bugs in the optimizer). And yet after almost 20 years it's still as if they have no idea such instructions exist at all. Doh!
---
Though here they declare that at least after VS2017 CMOV may appear under certain conditions... Good morning!
-
- KVRAF
- 7400 posts since 17 Feb, 2005
Max M. wrote: ↑Sat Nov 23, 2019 3:11 am
"This expression is UB."
In C/C++, yes. But JCJR posted his code as jsfx. Obviously, in C one is supposed to replace this expression with:
Code: Select all
++delayIdx;
delayIdx *= (delayIdx < DelayLength);
The same goes for `delayIdx = ++delayIdx & 0xFF;` etc. (that one was no longer posted as jsfx).

To add, C languages use sequence points to ensure the order of execution. C++ is way more flexible than C with regard to how an assignment operator can be used; return-by-reference is one example. Operations with an implicit sequence point, like pre- or post-increment, should not be written in a compound statement, regardless of whether it works, simply because keeping them separate is easier to figure out and avoids mistakes. When in doubt, use parentheses.
https://stackoverflow.com/questions/357 ... oints-in-c
- KVRAF
- 7890 posts since 12 Feb, 2006 from Helsinki, Finland
There are no "implicit sequence points" with pre/post increments, and this is why mixing such operations with other accesses to the same variable without an explicit sequence point is undefined behaviour. Parentheses don't help either, because those just change how the parse tree is built.