Random cheap sigmoid

Post

mystran wrote: Wed May 05, 2021 10:06 am
While min/max is a no-brainer here, it got me wondering: is AVX the minimum to actually do generalized branches?

Basically, with AVX one can do _mm_cmp_ps (or the wider equivalents) to get a mask for conditionally selecting values, then follow up with PTEST (_mm_test_all_zeros, etc.; SSE4.1) to get CF or ZF and only compute the branches where at least one value is actually needed. So Turing-equivalent computation is possible without ever unpacking the vectors. But is there some trick to do this with just SSE2 and a sensible amount of shuffling?
_mm_movemask_ps might do the trick.

Post

2DaT wrote: Wed May 05, 2021 9:27 pm
mystran wrote: Wed May 05, 2021 10:06 am
While min/max is a no-brainer here, it got me wondering: is AVX the minimum to actually do generalized branches?

Basically, with AVX one can do _mm_cmp_ps (or the wider equivalents) to get a mask for conditionally selecting values, then follow up with PTEST (_mm_test_all_zeros, etc.; SSE4.1) to get CF or ZF and only compute the branches where at least one value is actually needed. So Turing-equivalent computation is possible without ever unpacking the vectors. But is there some trick to do this with just SSE2 and a sensible amount of shuffling?
_mm_movemask_ps might do the trick.
Oh, I just realized the SSE versions of the comparisons are _mm_cmpXX_ps, which map to CMPPS, whereas _mm_cmp_ps maps to VCMPPS. That is basically the same thing, except the intrinsic syntax is different. It's just a joy that Intel is so consistent with these?!?

You're right though, _mm_movemask_ps will do the job at the cost of an extra integer TEST to set the flags.
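For anyone following along, here is a minimal sketch of the SSE2-only approach discussed above: compare to get a lane mask, collapse it to an integer with _mm_movemask_ps, and skip whichever branch no lane needs. The function name and the two branch bodies (sqrt above a limit of 1, square otherwise) are made up for illustration and are not from this thread.

```c
#include <emmintrin.h>  /* SSE2 intrinsics (also pulls in the SSE ones) */

/* Hypothetical per-lane branch: y = (x > 1) ? sqrt(x) : x*x.
   Each branch is only computed if at least one lane selects it. */
static __m128 branchy_select(__m128 x)
{
    const __m128 limit = _mm_set1_ps(1.0f);
    __m128 mask = _mm_cmpgt_ps(x, limit);  /* all-ones in lanes where x > 1 */
    int bits = _mm_movemask_ps(mask);      /* one bit per lane, 0x0..0xF */

    __m128 result = _mm_setzero_ps();
    if (bits != 0) {
        /* at least one lane takes the "then" branch */
        __m128 a = _mm_sqrt_ps(x);
        result = _mm_and_ps(mask, a);      /* unselected lanes become 0 */
    }
    if (bits != 0xF) {
        /* at least one lane takes the "else" branch */
        __m128 b = _mm_mul_ps(x, x);
        result = _mm_or_ps(result, _mm_andnot_ps(mask, b));
    }
    return result;
}
```

The integer compare on `bits` is the extra TEST mentioned above; the blend itself is the usual and/andnot/or sequence, since SSE2 has no blend instruction.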

Post

Thanks for the suggestions. Looks like I got the branch working now ... hmmm... maybe the (xmax) position needs some adjusting (CE).

Post

martinvicanek wrote: Fri Mar 22, 2019 12:51 am
You don't even need to worry about division if you rewrite
(sqrt(1 + x^2) - sqrt(1+ x1^2))/(x - x1) = (x + x1)/(sqrt(1 + x^2) + sqrt(1 + x1^2))
:wink:
Excuse my extremely rusty math. What happened here?

Post

rafa1981 wrote: Wed Jun 16, 2021 7:00 am
martinvicanek wrote: Fri Mar 22, 2019 12:51 am
You don't even need to worry about division if you rewrite
(sqrt(1 + x^2) - sqrt(1+ x1^2))/(x - x1) = (x + x1)/(sqrt(1 + x^2) + sqrt(1 + x1^2))
:wink:
Excuse my extremely rusty math. What happened here?
It should actually read "You don't even need to worry about division by zero". You still have a division, but the denominator is always >= 2.
To prove the above equality, multiply both the numerator and the denominator on the left-hand side by (sqrt(1 + x^2) + sqrt(1 + x1^2)), then use the identity x^2 - x1^2 = (x + x1)(x - x1).
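Written out step by step, the proof sketched above looks like this (with x_1 standing for x1; the last step cancels x - x_1 and so assumes x ≠ x_1, while the right-hand side stays well defined when x = x_1):

```latex
\begin{aligned}
\frac{\sqrt{1+x^2}-\sqrt{1+x_1^2}}{x-x_1}
  &= \frac{\left(\sqrt{1+x^2}-\sqrt{1+x_1^2}\right)\left(\sqrt{1+x^2}+\sqrt{1+x_1^2}\right)}
          {(x-x_1)\left(\sqrt{1+x^2}+\sqrt{1+x_1^2}\right)} \\
  &= \frac{(1+x^2)-(1+x_1^2)}{(x-x_1)\left(\sqrt{1+x^2}+\sqrt{1+x_1^2}\right)} \\
  &= \frac{(x+x_1)(x-x_1)}{(x-x_1)\left(\sqrt{1+x^2}+\sqrt{1+x_1^2}\right)} \\
  &= \frac{x+x_1}{\sqrt{1+x^2}+\sqrt{1+x_1^2}}
\end{aligned}
```

Each square root is at least 1, so the final denominator is at least 2, which is why the division by zero goes away.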

Post

martinvicanek wrote: Wed Jun 16, 2021 10:44 am
To prove the above equality, multiply both the numerator and the denominator on the left-hand side by (sqrt(1 + x^2) + sqrt(1 + x1^2)), then use the identity x^2 - x1^2 = (x + x1)(x - x1).
I see. Clever use of "the basics".

Post

martinvicanek wrote: Wed Jun 16, 2021 10:44 am
rafa1981 wrote: Wed Jun 16, 2021 7:00 am
martinvicanek wrote: Fri Mar 22, 2019 12:51 am
You don't even need to worry about division if you rewrite
(sqrt(1 + x^2) - sqrt(1+ x1^2))/(x - x1) = (x + x1)/(sqrt(1 + x^2) + sqrt(1 + x1^2))
:wink:
Excuse my extremely rusty math. What happened here?
It should actually read "You don't even need to worry about division by zero". You still have a division, but the denominator is always >= 2.
To prove the above equality, multiply both the numerator and the denominator on the left-hand side by (sqrt(1 + x^2) + sqrt(1 + x1^2)), then use the identity x^2 - x1^2 = (x + x1)(x - x1).
Two square roots and a division make it quite expensive to actually compute though.

Post

Actually you can save the previous "sqrt" result along with the previous sample input (x1), so one "sqrt" can go away at the expense of one extra state variable.

This is doing antialiasing at very low latency (0.5 samples? 1 sample?) and almost no memory usage, so the definition of cheap is relative. I'm very bad at math and DSP and I might be missing some better ways.
