KVR Audio

mystran · Post by **mystran** » Fri Aug 09, 2013 8:36 pm

Regarding branch-less clipping: if you can assume the signal is predictable (which audio often is, unless it's very heavy in high frequencies), then branching might be faster than branch-less saturation.

But if you do that, make sure you don't do something like this:

Code:

// just some example waveshaper (a cubic soft-clip)
static inline double internalClip(double x)
{
   return (0.5*x)*(3 - x*x);
}

// this will almost certainly perform badly
double badBranchClip(double x)
{
  if(x > 1) x = 1;
  if(x < -1) x = -1;
  return internalClip(x);
}

Instead do it like this:

Code:

// this is likely to perform better, especially
// if you'd do more math with the unclipped x
double betterClip(double x)
{
  if(x > 1) return internalClip(1);
  if(x < -1) return internalClip(-1);
  return internalClip(x);
}

So what's the deal? With the first version, the value passed to the approximation depends on which branches are taken. In the second version it doesn't. Another benefit is that if you clip heavily, you also save some math, since the clipped paths (internalClip(1) and internalClip(-1)) take constant arguments and can be evaluated at compile time.

I'm not sure how large the difference is on modern CPUs, but a couple of years back it was pretty obvious.
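Before timing the two variants against each other, it is worth confirming that they really compute the same thing. The following is a minimal sketch of such a check, not a benchmark; `maxClipDifference` is a hypothetical helper that does not appear in the post, and the sweep range is an arbitrary assumption.

```cpp
#include <cmath>

// Cubic soft-clip from the post; only meaningful for |x| <= 1.
static inline double internalClip(double x)
{
    return (0.5 * x) * (3 - x * x);
}

// Clamp first, then always run the polynomial on the clamped value.
static double badBranchClip(double x)
{
    if (x > 1) x = 1;
    if (x < -1) x = -1;
    return internalClip(x);
}

// Return early on the clipped paths; the arguments there are constants.
static double betterClip(double x)
{
    if (x > 1) return internalClip(1);
    if (x < -1) return internalClip(-1);
    return internalClip(x);
}

// Hypothetical helper: largest output difference between the two
// variants over an evenly spaced sweep of inputs in [lo, hi].
static double maxClipDifference(double lo, double hi, int steps)
{
    double worst = 0.0;
    for (int i = 0; i <= steps; ++i) {
        const double x = lo + (hi - lo) * i / steps;
        const double d = std::fabs(badBranchClip(x) - betterClip(x));
        if (d > worst) worst = d;
    }
    return worst;
}
```

Calling something like `maxClipDifference(-3.0, 3.0, 6000)` should report no meaningful difference, so any speed difference between the two comes purely from how the branches and the math are arranged.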

mystran · Post by **mystran** » Fri Aug 09, 2013 9:11 pm

Aleksey Vaneev wrote: Wanted to correct myself: of course, step function has a low-pass response. But I'm not getting how waveshaper function's derivative relates to step function. A formula would be useful.

The step function has a 6 dB/oct decay, and each additional integration gives another 6 dB/oct of decay, so for a discontinuity in the Nth derivative one would expect the noise from the discontinuity to have a spectral decay of (N+1)*6 dB/oct.

I don't know if this is correct, but it seems plausible.
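Written out as a formula, the argument goes roughly as follows. This is a sketch under the assumption that the function f has a single jump of size Δ in its Nth derivative at t_0 and is smooth everywhere else; the symbols Δ, t_0 and u (unit step) are introduced here for illustration only.

```latex
% Assumption: one jump of size \Delta in the N-th derivative at t_0.
\begin{align*}
f^{(N)}(t) &= \Delta\, u(t - t_0) + (\text{smooth part})
  &&\Rightarrow\quad f^{(N+1)}(t) = \Delta\, \delta(t - t_0) + \dots \\
% The impulse has a flat spectrum; each integration divides by j\omega.
|F(\omega)| &\approx \frac{|\Delta|}{|\omega|^{N+1}}
  &&(\omega \to \infty) \\
% Doubling the frequency (one octave) scales the magnitude by 2^{-(N+1)}.
20 \log_{10} \frac{|F(2\omega)|}{|F(\omega)|}
  &= -20\,(N+1) \log_{10} 2 \approx -6.02\,(N+1)\ \text{dB/oct}
\end{align*}
```

For N = 0 this is the plain step with 6 dB/oct, and a kink (jump in the 1st derivative) gives 12 dB/oct.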

FastTriggerFish · Post by **FastTriggerFish** » Fri Aug 09, 2013 9:27 pm

@mystran: yes, exactly.
I started from the Dirac impulse, which has a full-band spectrum (amplitude = 1 from -infinity to +infinity in frequency),
and said the step function is its integral and thus has a 6 dB/oct decay (because integrating is dividing by s or jw in the Laplace/Fourier domain, which corresponds to a one-pole filter with its cutoff at DC, hence 6 dB/oct), which is the same as what you're saying.
And as you say, each successive integration multiplies by 1/s in the frequency domain, i.e. filters by a one-pole, i.e. adds 6 dB/oct of spectral decay.

So any discontinuity in a derivative is bad, but the higher the order, the more it will be filtered and so the less perceptible the aliasing will be.

By the way, great tip about the branching. I always forget about this, so it's very good to be reminded!

mystran · Post by **mystran** » Fri Aug 09, 2013 9:44 pm

Well, whether discontinuities matter depends on the amplitude of the distortion components. If the discontinuity is small enough that even the first harmonic sinks below whatever other noise floor you have (say, the aliasing noise floor from having a non-linearity in the first place), then it hardly matters whether the approximation is smooth or not.

FastTriggerFish · Post by **FastTriggerFish** » Fri Aug 09, 2013 10:29 pm

Yes of course, agreed.

Aleksey Vaneev · Post by **Aleksey Vaneev** » Sat Aug 10, 2013 9:26 am

I was lucky today, found a very good tanh() approximation (public domain):

Code:

inline double vox_fasttanh2( const double x )
{
	const double ax = fabs( x );
	const double x2 = x * x;

	return( x * ( 2.45550750702956 + 2.45550750702956 * ax +
		( 0.893229853513558 + 0.821226666969744 * ax ) * x2 ) /
		( 2.44506634652299 + ( 2.44506634652299 + x2 ) *
		fabs( x + 0.814642734961073 * x * ax )));
}

Just 0.427% peak relative error, with smooth 1st and 2nd derivatives. The 3rd derivative's magnitude relative to the 1st derivative at the discontinuity point is around -191 dB.
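A simple way to check an error figure like this is to sweep the approximation against the library tanh() and record the worst relative deviation. The sketch below does that; `peakRelativeError` is a hypothetical helper, and the sweep range is an assumption, since the post does not say over which input range the figure was measured. The result depends on the range chosen.

```cpp
#include <cmath>

// Approximation from the post, unchanged apart from std:: qualifiers.
inline double vox_fasttanh2(const double x)
{
    const double ax = std::fabs(x);
    const double x2 = x * x;

    return (x * (2.45550750702956 + 2.45550750702956 * ax +
        (0.893229853513558 + 0.821226666969744 * ax) * x2) /
        (2.44506634652299 + (2.44506634652299 + x2) *
        std::fabs(x + 0.814642734961073 * x * ax)));
}

// Hypothetical helper: peak |approx / tanh - 1| over [-range, range],
// sampled at 2 * steps evenly spaced points. x == 0 is skipped because
// the relative error is 0/0 there.
static double peakRelativeError(double range, int steps)
{
    double worst = 0.0;
    for (int i = -steps; i <= steps; ++i) {
        if (i == 0) continue;
        const double x = range * i / steps;
        const double err = std::fabs(vox_fasttanh2(x) / std::tanh(x) - 1.0);
        if (err > worst) worst = err;
    }
    return worst;
}
```

For example, `peakRelativeError(5.0, 5000)` sweeps [-5, 5] in steps of 0.001. Note that the small-signal gain of the approximation is 2.45550750702956 / 2.44506634652299, which is about 1.00427, so the quoted 0.427% already shows up for inputs close to zero.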

duncanparsons · Post by **duncanparsons** » Sat Aug 10, 2013 10:07 am

mystran wrote: Among the silly things, I've even found cases where unrolling really short loops can end up with significant performance hits... but then you might change the total loop count a bit (shorter or larger), and suddenly unrolling is faster again. None of it really makes any sense most of the time.

I've always put that down to RAW (read-after-write) stalls. In a loop, the branch back takes sufficient time for the memory to settle before the next operation; unrolled, the reads may come up too soon. I'll often try to order things so there's an operation or two between significant value sets/gets in an algorithm. With sufficiently unrolled loops the compiler may be programmed to pick up a pattern which isn't obvious with a couple of unrolled iterations.

But, yes, it is a curiosity.

FastTriggerFish · Post by **FastTriggerFish** » Tue Aug 13, 2013 10:41 pm

Alright so I made a small test to compare the tanh approximations above to the real thing.
http://nbviewer.ipython.org/6226363
As you can see, at the chosen frequency and amplitude the real tanh has little aliasing, just one peak folded back below the first harmonic.
The first "fake" tanh that was posted on page 2 looks, as I suspected, pretty horrible, whereas the latest one is pretty good and I would expect it to sound much nicer.
As you increase the amplitude, things become less clear, as the real tanh starts aliasing a lot too.
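To see where a folded-back peak like that comes from, you can work out where each harmonic lands after sampling. This is a minimal sketch; `foldedFrequency` and `foldedHarmonic` are hypothetical helpers, and the frequencies used in the example afterwards are assumptions, not the settings from the notebook.

```cpp
#include <cmath>

// Hypothetical helper: where a component at frequency f (Hz) ends up
// after sampling at rate fs (Hz). The result lies in [0, fs / 2].
static double foldedFrequency(double f, double fs)
{
    return std::fabs(f - fs * std::round(f / fs));
}

// Hypothetical helper: folded frequency of the k-th harmonic of f0.
// tanh is odd-symmetric, so a pure sine input only produces odd
// harmonics (k = 1, 3, 5, ...).
static double foldedHarmonic(double f0, int k, double fs)
{
    return foldedFrequency(k * f0, fs);
}
```

With an assumed 5 kHz sine at a 44.1 kHz sample rate, the 9th harmonic sits at 45 kHz and `foldedHarmonic(5000.0, 9, 44100.0)` puts it at 900 Hz, i.e. below the fundamental. How audible that is depends on how strong the 9th harmonic is at the given drive level.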

Fast TANH aproximation