It really depends on what you are trying to achieve. Do you want to oversample the direct audio signal or just the control voltage? Or maybe just the "wet" part? These details can relax the design constraints and make a huge performance difference.
Generally, both the interpolation and the decimation stage between 44.1kHz and 88.2kHz are particularly demanding, especially if you want to deliver a near full-band result.
IMHO, a clever FIR half-band and SSE combination will nuke both the polynomial and the allpass polyphase approach, performance- and quality-wise: polynomials mess with the passband and have low aliasing rejection, while allpass-based half-bands come with strong phase distortion and aren't that fast.
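To make the half-band point concrete, here is a minimal scalar sketch of a 2x FIR half-band interpolator. Everything in it is an assumption for illustration: the names `designHalfbandOddTaps` and `HalfbandUpsampler2x` are made up, and the taps come from a plain Blackman-windowed sinc rather than an optimized design. What it shows is why half-bands are cheap: the center tap is 1 and every other even tap is 0, so one output phase is just a delayed copy of the input and only the odd taps need multiplies. That inner loop is the part you would hand to SSE.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Odd-phase taps of a Blackman-windowed sinc half-band (interpolation gain 2).
// Tap j sits at offset n = 2j - (numTaps - 1) from the filter center.
// Illustrative design only, not an optimized (e.g. equiripple) filter.
std::vector<double> designHalfbandOddTaps(std::size_t numTaps) {
    assert(numTaps >= 2 && numTaps % 2 == 0);
    const double pi = 3.14159265358979323846;
    std::vector<double> taps(numTaps);
    double sum = 0.0;
    for (std::size_t j = 0; j < numTaps; ++j) {
        const double n = 2.0 * static_cast<double>(j) - static_cast<double>(numTaps - 1);
        const double x = 0.5 * pi * n;                      // n is odd, so x is never 0
        const double r = n / static_cast<double>(numTaps);  // normalized to (-1, 1)
        const double window = 0.42 + 0.5 * std::cos(pi * r) + 0.08 * std::cos(2.0 * pi * r);
        taps[j] = (std::sin(x) / x) * window;
        sum += taps[j];
    }
    for (double& t : taps) t /= sum;  // unity DC gain on the interpolated phase
    return taps;
}

// Streaming 2x interpolator: one input sample in, two output samples out.
class HalfbandUpsampler2x {
public:
    explicit HalfbandUpsampler2x(std::size_t numTaps)
        : taps_(designHalfbandOddTaps(numTaps)), history_(numTaps, 0.0), pos_(0) {}

    void process(double in, double& out0, double& out1) {
        const std::size_t n = taps_.size();
        history_[pos_] = in;
        // Even phase: the only non-zero even tap is the center tap (1.0),
        // so this output is the input delayed by n/2 samples.
        out0 = history_[(pos_ + n - n / 2) % n];
        // Odd phase: the only real FIR work, and the loop worth vectorizing.
        double acc = 0.0;
        for (std::size_t j = 0; j < n; ++j)
            acc += taps_[j] * history_[(pos_ + n - j) % n];
        out1 = acc;
        pos_ = (pos_ + 1) % n;
    }

    // Latency in output-rate samples (position of the impulse response center).
    std::size_t latency() const { return taps_.size(); }

private:
    std::vector<double> taps_;
    std::vector<double> history_;
    std::size_t pos_;
};
```

With `numTaps` odd-phase taps the full filter spans `2 * numTaps - 1` samples at the high rate, yet costs only `numTaps` multiplies per input sample. The impulse response is symmetric, so the filter is linear phase with a constant latency of `numTaps` output samples.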
The problem with the allpass approach is that one can usually avoid oversampling the direct signal with a differential processing approach instead (i.e. only oversample the actual "wet/processed" part), but these tricks often only work with linear phase filters. In practice, linear phase filters often make it easier to implement a faster overall processing structure, even if the required FIR may seem slower in direct comparison.
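Here is how I read the differential idea, as a rough offline sketch rather than a definitive implementation. All names (`halfbandKernel`, `filterZeroPhase`, `upsample2x`, `downsample2x`, `processDifferential`) are hypothetical, the kernel is a simple Blackman-windowed sinc, and the filtering is done zero-phase on whole buffers to keep the alignment trivial. Only the difference between the processed and the unprocessed oversampled signal goes back through the decimator; the dry signal is added at the base rate and never touches a resampling filter. In a streaming version the dry path would instead be delayed by the combined latency of the two filters, which is a fixed whole number of samples only because they are linear phase.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Symmetric half-band low-pass, 2*half+1 taps, center at index `half`.
// `half` should be odd so the outermost taps are non-zero. Illustrative design only.
std::vector<double> halfbandKernel(int half) {
    const double pi = 3.14159265358979323846;
    std::vector<double> h(2 * half + 1, 0.0);
    double sum = 0.0;
    for (int i = 0; i <= 2 * half; ++i) {
        const int n = i - half;
        if (n != 0 && n % 2 == 0) continue;  // half-band property: even taps are zero
        const double x = 0.5 * pi * n;
        const double sinc = (n == 0) ? 1.0 : std::sin(x) / x;
        const double r = static_cast<double>(n) / (half + 1);
        const double window = 0.42 + 0.5 * std::cos(pi * r) + 0.08 * std::cos(2.0 * pi * r);
        h[i] = 0.5 * sinc * window;
        sum += h[i];
    }
    for (double& t : h) t /= sum;  // unity DC gain
    return h;
}

// Centered (zero-phase) convolution; samples outside the buffer count as zero.
std::vector<double> filterZeroPhase(const std::vector<double>& x, const std::vector<double>& h) {
    const long half = static_cast<long>(h.size() / 2);
    const long len = static_cast<long>(x.size());
    std::vector<double> y(x.size(), 0.0);
    for (long i = 0; i < len; ++i)
        for (long k = 0; k < static_cast<long>(h.size()); ++k) {
            const long idx = i + half - k;
            if (idx >= 0 && idx < len) y[i] += h[k] * x[idx];
        }
    return y;
}

std::vector<double> upsample2x(const std::vector<double>& x, const std::vector<double>& h) {
    std::vector<double> stuffed(2 * x.size(), 0.0);
    for (std::size_t m = 0; m < x.size(); ++m) stuffed[2 * m] = 2.0 * x[m];  // gain 2 offsets the zero stuffing
    return filterZeroPhase(stuffed, h);
}

std::vector<double> downsample2x(const std::vector<double>& x, const std::vector<double>& h) {
    const std::vector<double> filtered = filterZeroPhase(x, h);
    std::vector<double> y(x.size() / 2);
    for (std::size_t m = 0; m < y.size(); ++m) y[m] = filtered[2 * m];
    return y;
}

// out = dry + decimate(effect(up) - up): only the change made by the effect is resampled.
std::vector<double> processDifferential(const std::vector<double>& dry,
                                        const std::function<double(double)>& effect,
                                        const std::vector<double>& h) {
    std::vector<double> up = upsample2x(dry, h);
    for (double& s : up) s = effect(s) - s;  // keep only what the effect changed
    const std::vector<double> wetDelta = downsample2x(up, h);
    std::vector<double> out = dry;
    for (std::size_t m = 0; m < out.size(); ++m) out[m] += wetDelta[m];
    return out;
}
```

A quick sanity check of the structure: with an effect that changes nothing, the difference is exactly zero and the output equals the input sample for sample, whatever the passband ripple of the resampling filters is. With a non-linear-phase filter pair there is no single delay that lines the dry path up with the resampled difference, which is why this trick leans on linear phase.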
BTW, don't make the mistake of overlooking that Laurent de Soras assumes an already oversampled input in his polynomial paper! (The paper is easy to read, but it's also easy to miss this important sentence.) From my observations, polynomials are particularly bad oversampling filters, no matter the situation. It's easy to design very fast FIRs too if we apply the same restrictions, and we can easily avoid the inherently problematic passband behavior of polynomial interpolators.
Here's a nice introduction to resampling and FIR structures (you can also find pretty good C++ FIR implementations packaged with their ScopeFIR software):