Fast TANH approximation

DSP, Plugin and Host development discussion.

Post

Regarding branch-less clipping: if you can assume the signal is predictable (which audio often is, unless it's very high-frequency heavy) then branching might be faster than a branch-less saturation..

..but if you do that, make sure you don't do something like this:

Code:

// just something: a cubic soft-clip polynomial, meant for x in [-1, 1]
static inline double internalClip(double x)
{
   return (0.5*x)*(3 - x*x);
}

// this will almost certainly perform badly
double badBranchClip(double x)
{
  if(x > 1) x = 1;
  if(x < -1) x = -1;
  return internalClip(x);
}
Instead do it like this:

Code:

// this is likely to perform better, especially
// if you'd do more math with the unclipped x
double betterClip(double x)
{
  if(x > 1) return internalClip(1);
  if(x < -1) return internalClip(-1);
  return internalClip(x);
}
So what's the deal? With the first version, the value fed to the approximation depends on which branches are taken. In the second version it doesn't: each return evaluates internalClip on an input that is fixed for that path. Another benefit is that if you clip heavily you also save some math, since the clipped paths have constant arguments and can be evaluated at compile time.

I'm not sure how large the difference is on modern CPUs, but a couple of years back it was pretty obvious. :)
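
To make the "evaluated at compile time" point concrete: the polynomial is exactly 1 at x = 1 and exactly -1 at x = -1, so the clipped paths reduce to plain constants. Here is a minimal sketch in C, assuming the same internalClip as above. The names betterClipFolded and maxClipDifference are made up for illustration, and this says nothing about actual timing on any given CPU.

```c
#include <math.h>

/* Same cubic soft-clip polynomial as in the post above. */
static inline double internalClip(double x)
{
    return (0.5 * x) * (3.0 - x * x);
}

/* Clamp first, then evaluate: the polynomial's input depends on the branches. */
static double badBranchClip(double x)
{
    if (x > 1.0) x = 1.0;
    if (x < -1.0) x = -1.0;
    return internalClip(x);
}

/* Hypothetical variant with the constants folded by hand:
   internalClip(1) == 1 and internalClip(-1) == -1. */
static double betterClipFolded(double x)
{
    if (x > 1.0) return 1.0;
    if (x < -1.0) return -1.0;
    return internalClip(x);
}

/* Hypothetical helper: largest absolute difference between the two
   variants on an evenly spaced grid over [lo, hi]. */
static double maxClipDifference(double lo, double hi, int steps)
{
    double worst = 0.0;
    for (int i = 0; i < steps; ++i) {
        const double x = lo + (hi - lo) * (double)i / (double)(steps - 1);
        const double diff = fabs(badBranchClip(x) - betterClipFolded(x));
        if (diff > worst)
            worst = diff;
    }
    return worst;
}
```

Calling maxClipDifference(-2.0, 2.0, 4001) should give a value at or extremely close to zero, so the rewrite only changes how the work is arranged, not the output.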

Post

Aleksey Vaneev wrote: Wanted to correct myself: of course, step function has a low-pass response. But I'm not getting how waveshaper function's derivative relates to step function. A formula would be useful.
A step function's spectrum decays at 6 dB/oct, and each additional integration adds another 6 dB/oct, so for a discontinuity in the Nth derivative one would expect the noise from the discontinuity to have a spectral decay of (N+1)*6 dB/oct.

I don't know if this is correct, but it seems plausible.

Post

@Mystran: yes, exactly.
I started from the Dirac impulse, which has a full-band spectrum (amplitude = 1 from -infinity to +infinity in frequency),
and said that the step function is its integral and thus has 6 dB/oct decay (integrating means dividing by s or jw in the Laplace / Fourier domain, which corresponds to a one-pole filter with its cutoff at DC, hence 6 dB/oct decay), which is the same as what you're saying.
And as you say, each successive integration multiplies by 1/s in the frequency domain, i.e. filters by another one-pole, i.e. 6 additional dB/oct of spectral decay.

So any discontinuity in a derivative is bad, but the higher the order the more it will be filtered and so the less the aliasing will be perceptible.
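
Written out, the argument goes roughly like this. It is a sketch of the reasoning in the posts above, not a rigorous derivation: it ignores the DC term of the step's transform and only describes the high-frequency asymptote. The jump size \Delta and its location t_0 are symbols introduced here for illustration.

```latex
\begin{align*}
\delta(t) &\;\longleftrightarrow\; 1 \\
u(t) = \int_{-\infty}^{t} \delta(\tau)\,d\tau
  &\;\longleftrightarrow\; \frac{1}{j\omega} \qquad (\omega \neq 0) \\
\left|\frac{1}{j\omega}\right| = \frac{1}{\omega}
  &\;\Rightarrow\; 20\log_{10}\tfrac{1}{2} \approx -6.02\ \text{dB per octave} \\
f^{(N)} \text{ jumps by } \Delta \text{ at } t_0
  &\;\Rightarrow\; |F(\omega)| \sim \frac{|\Delta|}{\omega^{N+1}}
  \;\Rightarrow\; -6.02\,(N+1)\ \text{dB per octave}
\end{align*}
```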

By the way, great tip about the branching. I always forget about this, so it's very good to be reminded!

Post

Well, whether discontinuities matter depends on the amplitude of the distortion components. If the discontinuity is small enough that even the first harmonic sinks below whatever other noise floor you have (say the aliasing noise floor from having a non-linearity in the first place), then it hardly matters whether the approximation is smooth or not.

Post

Yes of course, agreed.

Post

I was lucky today, found a very good tanh() approximation (public domain):

Code:

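#include <math.h> // for fabs()

// rational approximation of tanh(x); odd-symmetric through the use of fabs()
inline double vox_fasttanh2( const double x )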
{
	const double ax = fabs( x );
	const double x2 = x * x;

	return( x * ( 2.45550750702956 + 2.45550750702956 * ax +
		( 0.893229853513558 + 0.821226666969744 * ax ) * x2 ) /
		( 2.44506634652299 + ( 2.44506634652299 + x2 ) *
		fabs( x + 0.814642734961073 * x * ax )));
}
Just 0.427% peak relative error, smooth 1st and 2nd derivatives. 3rd derivative's magnitude relative to the 1st derivative at discontinuity point is around -191 dB.

Post

mystran wrote: Among the silly things, I've even found cases where unrolling really short loops can end up with significant performance hits... but then you might change the total loop count a bit (shorter or larger), and suddenly unrolling is faster again. None of it really makes any sense most of the time.
I've always put that down to RAW (read-after-write) stalls. In a loop, the branch back takes enough time for the memory to settle before the next operation; unrolled, the reads may come up too soon. I'll often try to order things so there's an operation or two between writing a significant value and reading it back in an algorithm. With sufficiently unrolled loops the compiler may be programmed to pick up a pattern which isn't obvious with a couple of unrolled iterations.

But, yes, it is a curiosity.

Post

Alright so I made a small test to compare the tanh approximations above to the real thing.
http://nbviewer.ipython.org/6226363
As you can see, at the chosen frequency and amplitude the real tanh has little aliasing, just one peak folded back before the first harmonic.
The first "fake" tanh that was posted on page 2 looks, as I suspected, pretty horrible, whereas the latest one is pretty good and I would expect it to sound much nicer.
As you increase the amplitude things become less clear as the real tanh starts aliasing a lot too.
