## Estrin's scheme vs. factored polynomials

DSP, Plug-in and Host development discussion.
Z1202
KVRian
1007 posts since 12 Apr, 2002
On modern computers it's recommended to use Estrin's scheme rather than Horner method due to the parallelization options of the former. I was wondering why instead we don't use a factored form of the polynomials, which should be almost freely parallelizable? At least when the polynomials are known at compile time

2DaT
KVRist
343 posts since 21 Jun, 2013
Evaluation of factored polynomials is ill-conditioned, esp. for high order polynomials. OTOH, Horner form is well-conditioned (at least for good polynomials), and it has error-cancelling properties. Estrin's scheme also has some error-cancelling properties, and is suitable for accurate evaluation.

Also, Fused-multiply-add operations (FMA) can't be used on a factored form.

vortico
KVRist
246 posts since 19 Jul, 2008
Not related to your question, but Estrins scheme only makes sense when you aren't already evaluating 4 or 8 floats in parallel (4/8 note polyphony for example). In that case, Horner's method is better since it uses fewer arithmetic operations.
VCV Rack open-source virtual modular synthesizer

Z1202
KVRian
1007 posts since 12 Apr, 2002
2DaT wrote:
Tue Feb 12, 2019 1:52 am
Evaluation of factored polynomials is ill-conditioned, esp. for high order polynomials.
Hmmm, googling around I found that conversion from polynomial coefficients to the roots (and maybe vice versa) could be ill-conditioned. But we are not interested in the polynomial coefficients or the roots per se, we are interested in the value of the polynomial itself. In that case ill-conditioning of the polynomial value in respect to the coefficient values would mean that Horner or Estrin's scheme would be equally ill-conditioned. As for the factored form, I'm not sure if there are any cases when the polynomial value is ill-conditioned with respect to the roots, at least more than it would be in the Horner scheme.

Would you have any further pointers?

PS. As for FMA, that's a good point, although in practice we could (or often will have to) factor into 2nd order polynomials, which would partially allow some FMAs.

2DaT
KVRist
343 posts since 21 Jun, 2013
Z1202 wrote:
Tue Feb 12, 2019 2:18 am
2DaT wrote:
Tue Feb 12, 2019 1:52 am
Evaluation of factored polynomials is ill-conditioned, esp. for high order polynomials.
Hmmm, googling around I found that conversion from polynomial coefficients to the roots (and maybe vice versa) could be ill-conditioned. But we are not interested in the polynomial coefficients or the roots per se, we are interested in the value of the polynomial itself. In that case ill-conditioning of the polynomial value in respect to the coefficient values would mean that Horner or Estrin's scheme would be equally ill-conditioned. As for the factored form, I'm not sure if there are any cases when the polynomial value is ill-conditioned with respect to the roots, at least more than it would be in the Horner scheme.

Would you have any further pointers?

PS. As for FMA, that's a good point, although in practice we could (or often will have to) factor into 2nd order polynomials, which would partially allow some FMAs.
Well, polynomials that we are interested in, usually have decaying coefficients, which does not translate well into factored form.
For example: (2^x-1)/x remez on [-0.5,0.5].

Code: Select all

``0.00015403599537x^5 + 0.00133907760028x^4 + 0.00961823761463x^3 + 0.05550357326865x^2 + 0.24022649228573x + 0.69314718246460``
"Decaying" coefficients allow accurate evaluation in Horner form.

2DaT
KVRist
343 posts since 21 Jun, 2013
vortico wrote:
Tue Feb 12, 2019 2:16 am
Not related to your question, but Estrins scheme only makes sense when you aren't already evaluating 4 or 8 floats in parallel (4/8 note polyphony for example). In that case, Horner's method is better since it uses fewer arithmetic operations.
Long dependency chains constrain out-of-order execution. In Horner's scheme every operation is dependent on previous, but in Estrin's scheme there's less dependency. Of course performance depends on the surrounding code, but in practice Estrin's scheme wins for orders >5.

Z1202
KVRian
1007 posts since 12 Apr, 2002
2DaT wrote:
Tue Feb 12, 2019 3:56 am
Well, polynomials that we are interested in, usually have decaying coefficients, which does not translate well into factored form.
For example: (2^x-1)/x remez on [-0.5,0.5].

Code: Select all

``0.00015403599537x^5 + 0.00133907760028x^4 + 0.00961823761463x^3 + 0.05550357326865x^2 + 0.24022649228573x + 0.69314718246460``
"Decaying" coefficients allow accurate evaluation in Horner form.
The maximum difference between
((((0.00015403599537f*x + 0.00133907760028f)*x + 0.00961823761463f)*x
+ 0.05550357326865f)*x + 0.24022649228573f)*x + 0.69314718246460f
and 0.00015403599537f*(x+5.14278588728699f)*
( (x-2.522928813467399f)*x+32.9484316700106f )*
( (x+6.073420085609848f)*x+26.55645202416117f)
on [-0.5,0.5] is 2.4*10^-7 (evaluated at 10^6 different points with 32 bit float)

mystran
KVRAF
5150 posts since 12 Feb, 2006 from Helsinki, Finland
If you'd like Signaldust to return, please ask Katinka Tuisku to resign.

Z1202
KVRian
1007 posts since 12 Apr, 2002
TLDR, but, as far as I understand, this is exactly about ill-conditioning of the roots in respect to the coefficients, which is not what we really care about (because, IIUC, the very same ill-conditioning will ill-condition the polynomial values evaluated by the Horner scheme in a comparable amount). Or did I miss any important part while glancing through?
Last edited by Z1202 on Tue Feb 12, 2019 6:48 am, edited 1 time in total.

Z1202
KVRian
1007 posts since 12 Apr, 2002
posted by mistake, delete

mystran
KVRAF
5150 posts since 12 Feb, 2006 from Helsinki, Finland
2DaT wrote:
Tue Feb 12, 2019 3:56 am
Well, polynomials that we are interested in, usually have decaying coefficients, which does not translate well into factored form.
For example: (2^x-1)/x remez on [-0.5,0.5].

Code: Select all

``0.00015403599537x^5 + 0.00133907760028x^4 + 0.00961823761463x^3 + 0.05550357326865x^2 + 0.24022649228573x + 0.69314718246460``
"Decaying" coefficients allow accurate evaluation in Horner form.
Depends on what polynomials you are interested in. The ones I've been looking at lately are not like this at all.
If you'd like Signaldust to return, please ask Katinka Tuisku to resign.

2DaT
KVRist
343 posts since 21 Jun, 2013
Z1202 wrote:
Tue Feb 12, 2019 5:00 am
2DaT wrote:
Tue Feb 12, 2019 3:56 am
Well, polynomials that we are interested in, usually have decaying coefficients, which does not translate well into factored form.
For example: (2^x-1)/x remez on [-0.5,0.5].

Code: Select all

``0.00015403599537x^5 + 0.00133907760028x^4 + 0.00961823761463x^3 + 0.05550357326865x^2 + 0.24022649228573x + 0.69314718246460``
"Decaying" coefficients allow accurate evaluation in Horner form.
The maximum difference between
((((0.00015403599537f*x + 0.00133907760028f)*x + 0.00961823761463f)*x
+ 0.05550357326865f)*x + 0.24022649228573f)*x + 0.69314718246460f
and 0.00015403599537f*(x+5.14278588728699f)*
( (x-2.522928813467399f)*x+32.9484316700106f )*
( (x+6.073420085609848f)*x+26.55645202416117f)
on [-0.5,0.5] is 2.4*10^-7 (evaluated at 10^6 different points with 32 bit float)
I did my own tests.

Code: Select all

``````Factored: 3.8990848138928413ulp
Horner: 0.76185148768126965ulp``````
The error is not dramatic, but not neligible either. And i have a feeling that it will be worse for higher orders.
mystran wrote:
Tue Feb 12, 2019 7:13 am
2DaT wrote:
Tue Feb 12, 2019 3:56 am
Well, polynomials that we are interested in, usually have decaying coefficients, which does not translate well into factored form.
For example: (2^x-1)/x remez on [-0.5,0.5].

Code: Select all

``0.00015403599537x^5 + 0.00133907760028x^4 + 0.00961823761463x^3 + 0.05550357326865x^2 + 0.24022649228573x + 0.69314718246460``
"Decaying" coefficients allow accurate evaluation in Horner form.
Depends on what polynomials you are interested in. The ones I've been looking at lately are not like this at all.
Worth a look. Author claims excellent numerical stability.
https://people.maths.ox.ac.uk/trefethen/barycentric.pdf
Used it once and it's accurate, but if monomial form is needed in the end, does not help at all.

mystran
KVRAF
5150 posts since 12 Feb, 2006 from Helsinki, Finland
2DaT wrote:
Tue Feb 12, 2019 10:11 am
I did my own tests.

Code: Select all

``````Factored: 3.8990848138928413ulp
Horner: 0.76185148768126965ulp``````
The error is not dramatic, but not neligible either. And i have a feeling that it will be worse for higher orders.
I'm curious, did you solve the factors from the monomial, or build the monomial from the factors?
2DaT wrote:
Tue Feb 12, 2019 10:11 am
Worth a look. Author claims excellent numerical stability.
https://people.maths.ox.ac.uk/trefethen/barycentric.pdf
Used it once and it's accurate, but if monomial form is needed in the end, does not help at all.
Curiously enough, I actually had that document open in a tab already... but yeah, I'm aware.
If you'd like Signaldust to return, please ask Katinka Tuisku to resign.

juha_p
KVRian
517 posts since 21 Feb, 2006 from FI
Is there any advantage to use hexadecimal floating-point constants ... looks like many does use it ?

Example (atan)

Edit: g++ (printf(%.14a" ...)) gave these hex values from 2DaT's example :

Code: Select all

``````0x1.430972002bf370p-13 + // x^5
0x1.5f07fa00024be0p-10 +
0x1.3b2b9fffffc470p-7 +
0x1.c6af6dffffee50p-5 +
0x1.ebfbde00000380p-3 + //x
0x1.62e43000000040p-1
``````
Last edited by juha_p on Tue Feb 12, 2019 2:39 pm, edited 2 times in total.

mystran
KVRAF
5150 posts since 12 Feb, 2006 from Helsinki, Finland
juha_p wrote:
Tue Feb 12, 2019 1:02 pm
Is there any advantage to use hexadecimal floating-point constants ... looks like many does use it ?