Phase Modulation synth development

DSP, Plug-in and Host development discussion.
KVRer
4 posts since 27 Oct, 2021
EDIT: SOLVED! Here's the solution for others who might encounter a similar issue. I implemented linear interpolation for all the sine wavetables and instead of modulating each operator's phase directly, summed all the modulation results and added that to the carrier phase. The result sounds like nice clean phase modulation with all 4 operators working as intended.

Original post:
Hey all. So a little disclaimer, I don't have a formal education in advanced math, DSP, or programming so go easy on me. I'm very passionate about Phase Modulation synthesis and made some great progress on a synthesizer of my own in the C programming language. That being said, I'm stuck. I have 2 operator PM working great, with results that sound similar to my Yamaha Reface DX synth. But as soon as I cascade more than 2 operators some very unusual behavior occurs.

The problem: It sounds as if the 3rd operator changes frequency as it's amplitude is increased even though it's frequency is set to a 1:1 ratio (all 3 operators are set to 220 hz). I notice no such thing when only using 2 operators, modulator and carrier.

This is my modulation function, I'm aware that my approach is different than others, I'm still learning.

void modulate(struct Operator *op_a, struct Operator *op_b, int16_t *LUT, int LUT_SIZE) {

//Convert float to int
int op_a_phase_i = (int) op_a->phase;

//Modulation happens here
op_b->phase += (float) LUT[op_a_phase_i] * op_a->amplitude;

//Increment phase accumulator
op_a->phase += op_a->delta_phi;

//LUT wrapping
if (op_a->phase >= LUT_SIZE)
op_a->phase -= LUT_SIZE;
if (op_a->phase < 0)
op_a->phase += LUT_SIZE;
if (op_b->phase >= LUT_SIZE)
op_b->phase -= LUT_SIZE;
if (op_b->phase < 0)
op_b->phase += LUT_SIZE;

return;

Operator A modulates Operator B's phase with it's current sample's amplitude. This function is called in an algorithm iterator function,which will go through at present a 4 operator cascade in series.

So 4 -> 3 -> 2 -> 1 -> output. Operators 4 and 3 become op_a and op_b on the first modulate(), ops 3 and 2 become op_a and op_b on the 2nd, 2 and 1 become op_a and op_b on the 3rd, finally a separate function gets the sample from the carrier, which goes to the audio output buffer. Operators that are disabled are skipped. There is little else happening in between.

KVRian
786 posts since 6 Aug, 2005 from England
Glad you solved the problem. BTW I've used that kind of phase wrapping before, and it bit me later. So you might want to make sure it works with any value with something like this:-

Code: Select all

``````		double wrapPhase(double angle, double wrap)
{
angle = std::fmod(angle, wrap);
return angle >= 0 ? angle : (angle + wrap);
}
``````

KVRer

Topic Starter

4 posts since 27 Oct, 2021
quikquak wrote:
Thu Oct 28, 2021 1:09 pm
Glad you solved the problem. BTW I've used that kind of phase wrapping before, and it bit me later. So you might want to make sure it works with any value with something like this:-

Code: Select all (#)

``````		double wrapPhase(double angle, double wrap)
{
angle = std::fmod(angle, wrap);
return angle >= 0 ? angle : (angle + wrap);
}
``````
Thanks for sharing.

Actually it bit me too especially when bringing up modulation strength, as it would easily lead to segmentation faults. I solved this by changing the if statements to while loops, which covers values that go absurdly out of range. However since I'm not very familiar with methods of phase wrapping and since optimization will probably be critical later on down the road, I'm open to trying alternatives.

KVRAF
6578 posts since 12 Feb, 2006 from Helsinki, Finland
LeviathaninWaves wrote:
Fri Oct 29, 2021 2:45 pm
Actually it bit me too especially when bringing up modulation strength, as it would easily lead to segmentation faults. I solved this by changing the if statements to while loops, which covers values that go absurdly out of range. However since I'm not very familiar with methods of phase wrapping and since optimization will probably be critical later on down the road, I'm open to trying alternatives.
With high modulation strength, using a branch here (let alone a loop) is going to be inefficient, because you'll take a lot of branch-prediction misses when the branches go one way or another more or less randomly. Unfortunately at least clang compiles fmod() into an actual function call, which is less than ideal as well.

What I would probably do is first scale the actual phase so that it's range is [0,1] which then means you can wrap large positive values into range like this:

Code: Select all

``````float wrapped = phase - (int) phase;
``````
That should (on x86) compile into CVTTSS2SI + CVTSI2SS + SUBSS (and all of these have vector variants), which is reasonably efficient. Integer cast truncates towards zero though, where as we really want the floor() here, so you might want to do something like:

Code: Select all

``````int32_t iphase = (int) phase;
iphase += iphase >> 31;
wrapped = phase - iphase;
``````
That's not totally portable, because >> is not really guaranteed to be arithmetic even for signed, but that's what all popular compilers do (casting to unsigned, shifting and then subtracting might be theoretically more portable)... but it's branchless and still reasonably efficient. You'll then need to multiply by LUT_SIZE for lookup, but that's a small price to pay.

If (when) you want to do linear interpolation on the lookup table, another classic trick is to add a few (eg. 2 is enough typically) extra samples to the array that wrap around from the beginning. This means that you can access slightly out of bounds (eg. i and i+1) without having to rewrap the second index.

That said.. this wrapping around and float->int->float->int conversion thing does still cost a few cycles and the truly efficient approach to classic FM (or well, usually PM) is to (1) rewrite the whole thing in fixed-point and (2) use power-of-two lookup sizes. Your fixed-point keeps only the decimals, the wrap-around comes from natural wrap-around of machine arithmetics and to index into the lookup table you just shift-right to pick as many bits as you need. The LUTs then store fixed point too and you only ever convert to floats at the very end when you write the output.
Preferred pronouns would be "it/it" because according to this country, I'm a piece of human trash.

KVRian
786 posts since 6 Aug, 2005 from England
And 'fmod' probably also uses a divide - oh the horror! I use mystran's 0-1 fraction removal example above, a lot.

Fixed point is cool. The first time I used it was maybe doing 3D stuff on the BBC Micro
It's a fun rabbit hole to explore, back then we had no choice.
I wonder what the Reface DX uses?

KVRAF
6578 posts since 12 Feb, 2006 from Helsinki, Finland
quikquak wrote:
Sun Oct 31, 2021 4:35 am
And 'fmod' probably also uses a divide - oh the horror! I use mystran's 0-1 fraction removal example above, a lot.
It's hard to do much better when the int-cast and sub totals somewhere around 11 cycles latency (eg. division alone is about twice that?) and about half that on throughput.. not to mention it's trivially SIMD-compatible (just replace all ops with vector versions).

As for fixed point, I'm usually not much of a fan (for audio anyway), but this is one of those situations where it definitely does speed things up, because you're doing very little actual math and just a lot of table lookups and the few extra shifts and bitmasks end up being much faster than converting back and forth into floats all the time.
Preferred pronouns would be "it/it" because according to this country, I'm a piece of human trash.

KVRer

Topic Starter

4 posts since 27 Oct, 2021
mystran wrote:
Sat Oct 30, 2021 6:27 pm

That said.. this wrapping around and float->int->float->int conversion thing does still cost a few cycles and the truly efficient approach to classic FM (or well, usually PM) is to (1) rewrite the whole thing in fixed-point and (2) use power-of-two lookup sizes. Your fixed-point keeps only the decimals, the wrap-around comes from natural wrap-around of machine arithmetics and to index into the lookup table you just shift-right to pick as many bits as you need. The LUTs then store fixed point too and you only ever convert to floats at the very end when you write the output.
EDIT: Figured it out. The fixed point pm algorithm now works. Now to just rewrite wraparound and maybe rewrite the linear interpolation.

I've been trying to rewrite this code to use fixed point the past couple of days and what's really tripping me up is:

1) The phase step value calculation which is frequency / sample rate * LUT_SIZE

2) Controlling amplitude of the sine signal, which was basically something like LUT[phase] * amplitude.

It's the multiplication and the division. Addition, subtraction, <<, >>, & and | is trivial. Oh and the sine LUT was originally int16_t. For the fixed point I was attempting to go for a 1.16.15 signed 32 bit format.

KVRAF
6578 posts since 12 Feb, 2006 from Helsinki, Finland
LeviathaninWaves wrote:
Tue Nov 02, 2021 10:52 am
It's the multiplication and the division. Addition, subtraction, <<, >>, & and | is trivial. Oh and the sine LUT was originally int16_t. For the fixed point I was attempting to go for a 1.16.15 signed 32 bit format.
First of all, when doing fixed point (well, really any "low-level bit-twiddling"), always use unsigned numbers. Even if that stuff represents signed numbers, you probably want to do the math in unsigned anyway (addition, subtraction and multiplication don't care and you shouldn't need any divisions). Wrap-around in particular is undefined behaviour (read optimizers assume it doesn't happen) for signed, but it's fine for unsigned.

Second.. your phase accumulator fixed point format should usually be 0:32, assuming you can get 64-bit results for multiplication (since generally you always want double-wide temporary before right-shifting back to range), but that's not a problem these days. You don't need an integer part (or a sign bit) at all... that's the whole point! You just let it overflow, so your values are always in the [0,1[ range.

And just in case... for multiplication your fixed point number is A*2^b where b is the fixed-point exponent. When multiplying A*2^b with C*2^d (where the exponents can be different), the result is A*C*2^(b+d). This is how floating point works as well, except with fixed-point you keep track of the exponents yourself. In algorithms like this, you usually do not want to "standardize" on a single fixed-point exponent, as you might want some to be 0:32 and some to be 8:24 or something.

A practical tip for situations where you have a bunch of different exponents is to put them in the variable names, eg. phase_0p32 or modDepth_8p24 or something. This way it's trivial to keep track of what the representation of every variable is, which is great when you need to edit your code a few years from now.
Preferred pronouns would be "it/it" because according to this country, I'm a piece of human trash.

KVRer

Topic Starter

4 posts since 27 Oct, 2021
Thanks mystran. It took me a bit to wrap my head around it but once I did I replaced the loop wrapping and odd signed fixed point, with fraction only unsigned 32 bit fixed point. It works very well and the code is so much cleaner now. Feedback also seems to behave correctly.